This repository was archived by the owner on Jun 14, 2026. It is now read-only.

Commit 9c8dd11

Initial microplex-us release
0 parents  commit 9c8dd11

75 files changed

Lines changed: 33576 additions & 0 deletions


.gitignore

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
.venv/
.pytest_cache/
.ruff_cache/
artifacts/
.DS_Store
__pycache__/
*.pyc

.python-version

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
3.14

README.md

Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
# microplex-us

US-specific survey adapters, calibration targets, pipelines, and PolicyEngine integration
built on top of the generic `microplex` engine.

## Docs

- [Docs index](./docs/README.md)
- [Architecture](./docs/architecture.md)
- [Source semantics](./docs/source-semantics.md)
- [Benchmarking](./docs/benchmarking.md)

## Current focus

`microplex-us` is being built as a library-first replacement path for
`policyengine-us-data`:

- canonical source and target metadata
- PE-US-compatible export
- full-target benchmarking against the active targets DB
- run registry and DuckDB index for frontier analysis

The architecture is still evolving, so the docs are deliberately technical and
operational rather than paper-like.

docs/README.md

Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
# microplex-us docs

- [Architecture](./architecture.md)
- [Source semantics](./source-semantics.md)
- [Benchmarking](./benchmarking.md)

This doc set is intentionally technical. It is meant to answer three questions:

1. What is the current architecture?
2. How do source semantics and variable semantics drive donor integration?
3. How do we measure progress against `policyengine-us-data` on real targets?

The docs describe the code that exists today. They do not try to freeze a final
paper narrative while the architecture is still moving.

docs/architecture.md

Lines changed: 92 additions & 0 deletions
@@ -0,0 +1,92 @@
# Architecture

`microplex-us` is the US-specific country package built on top of the generic
`microplex` engine.

## Package split

- `microplex`: generic engine pieces
  - source descriptors and observation frames
  - fusion planning
  - synthesis and calibration
  - canonical target spec and provider protocol
  - generic geography and entity abstractions
- `microplex-us`: US-specific implementations
  - CPS, PUF, and other source providers
  - PE-US target import and compilation
  - PE-US export and evaluation
  - US experiment, registry, and artifact layers

## Current build flow

Main entrypoint:

- `microplex_us.pipelines.USMicroplexPipeline`

Current broad flow:

1. Load one or more `SourceProvider`s into `ObservationFrame`s.
2. Build a `FusionPlan` from the source descriptors.
3. Choose a public structured scaffold source.
4. Prepare canonical seed data from the scaffold.
5. Integrate donor-only variables from other sources using source and variable
   capability metadata, with donor-block-specific automatic condition selection,
   declared condition-entity policy, and native-entity projection when entity
   IDs are available.
6. Synthesize a new population.
7. Build PolicyEngine-style entity tables.
8. Materialize PE-derived features needed by targets.
9. Calibrate against PE-US DB targets.
10. Export a PE-ingestable H5 and evaluate against the full active target set.

Important files:

- `src/microplex_us/pipelines/us.py`
- `src/microplex_us/policyengine/us.py`
- `src/microplex_us/policyengine/comparison.py`
- `src/microplex_us/pipelines/artifacts.py`
- `src/microplex_us/pipelines/index_db.py`

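The ten-step flow above is essentially an ordered chain of stages, each of which depends on outputs of earlier ones. The following is a minimal sketch of that ordering only. It does not use the real `USMicroplexPipeline` API: `BuildState`, `make_stage`, `STAGES`, and `run_build` are hypothetical names invented for illustration, and each stage body is a placeholder.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class BuildState:
    """Hypothetical build state; NOT the real pipeline's internal state."""
    completed: list = field(default_factory=list)
    outputs: dict = field(default_factory=dict)


Stage = Callable[[BuildState], None]


def make_stage(name: str, requires: tuple = ()) -> Stage:
    """Create a placeholder stage that checks its prerequisites ran first."""
    def run(state: BuildState) -> None:
        missing = [r for r in requires if r not in state.completed]
        if missing:
            raise RuntimeError(f"{name} needs {missing} to run first")
        state.outputs[name] = f"<output of {name}>"  # placeholder result
        state.completed.append(name)
    return run


# Stage names paraphrase the ten documented steps; they are assumptions,
# not identifiers from the package.
STAGES = [
    make_stage("load_sources"),
    make_stage("build_fusion_plan", ("load_sources",)),
    make_stage("choose_scaffold", ("build_fusion_plan",)),
    make_stage("prepare_seed", ("choose_scaffold",)),
    make_stage("integrate_donors", ("prepare_seed", "build_fusion_plan")),
    make_stage("synthesize", ("integrate_donors",)),
    make_stage("build_entity_tables", ("synthesize",)),
    make_stage("materialize_features", ("build_entity_tables",)),
    make_stage("calibrate", ("materialize_features",)),
    make_stage("export_and_evaluate", ("calibrate",)),
]


def run_build(stages=STAGES) -> BuildState:
    """Run the stages in order and return the accumulated state."""
    state = BuildState()
    for stage in stages:
        stage(state)
    return state
```

Calling `run_build()` walks the stages in the documented order; passing the stages in a different order raises `RuntimeError`, which is the property the numbered list implies (for example, calibration cannot precede feature materialization).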
## What is already true

- The package is library-first. The core build, artifact saving, experiment
  running, and frontier tracking all live in importable APIs.
- PolicyEngine evaluation uses the real `policyengine-us-data` targets DB as
  truth targets.
- Saved runs persist:
  - artifact bundle
  - `policyengine_harness.json`
  - `run_registry.jsonl`
  - `run_index.duckdb`

## What is not final yet

- Broad PE-US parity is not stable yet.
- The current US path is still scaffold-plus-donors rather than a fully
  symmetric multientity latent-population model.
- Held-out target evaluation is not the default loop yet.
- Local-area production replacement is still future work.

## Design direction

The intended long-run shape is:

- canonical source metadata
- canonical variable semantics
- multientity fusion
- derived-variable materialization after atomic modeling
- target compilation as a generic feature/filter/aggregation problem

The current implementation is already moving in that direction:

- canonical target spec
- source capability registry
- variable semantic registry
- donor block specs with declared match strategies
- donor block specs with declared condition-entity policy
- variable semantics with declared projection aggregation for group-level donor fits
- automatic donor condition selection from source overlap plus data signal
- native-entity donor execution for tax-unit-native blocks when IDs are present
- full-target PE-US harness

But it is still an actively evolving system, not a finished paper architecture.
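To make "automatic donor condition selection from source overlap plus data signal" more concrete, here is a small sketch of one way such a step could work. It is an illustration under assumptions, not the package's method: overlap is taken to mean "variables present in both sources", signal is approximated by a Pearson correlation threshold, and the donor value is copied from the nearest donor record (a plain nearest-neighbour match on unscaled distances). All function names, variable names, and records are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either side has no variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return 0.0 if vx == 0 or vy == 0 else cov / sqrt(vx * vy)


def select_conditions(scaffold, donor, target, min_abs_corr=0.3):
    """Keep variables shared by both sources (overlap) that also correlate
    with the donor-only target inside the donor data (signal)."""
    shared = (set(scaffold[0]) & set(donor[0])) - {target}
    ys = [row[target] for row in donor]
    return sorted(
        v for v in shared
        if abs(pearson([row[v] for row in donor], ys)) >= min_abs_corr
    )


def integrate_donor_variable(scaffold, donor, target):
    """Copy `target` from the nearest donor record onto each scaffold record.
    Distances are unscaled squared differences, a deliberate simplification."""
    conds = select_conditions(scaffold, donor, target)
    filled = []
    for rec in scaffold:
        nearest = min(
            donor, key=lambda d: sum((d[v] - rec[v]) ** 2 for v in conds)
        )
        filled.append({**rec, target: nearest[target]})
    return filled, conds


# Toy records (made up): `region` is shared but constant, so it carries no signal.
scaffold = [
    {"age": 30, "wages": 40_000, "region": 1, "household_size": 2},
    {"age": 55, "wages": 120_000, "region": 1, "household_size": 4},
]
donor = [
    {"age": 28, "wages": 38_000, "region": 1, "capital_gains": 0},
    {"age": 41, "wages": 75_000, "region": 1, "capital_gains": 2_000},
    {"age": 60, "wages": 130_000, "region": 1, "capital_gains": 15_000},
]
filled, conditions = integrate_donor_variable(scaffold, donor, "capital_gains")
```

In this toy case `age` and `wages` are selected as conditions, `region` is dropped for lack of signal, and `household_size` is ignored because the donor source does not observe it. The scaffold keeps its own variables and only gains the donor-only one.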

docs/benchmarking.md

Lines changed: 123 additions & 0 deletions
@@ -0,0 +1,123 @@
# Benchmarking

The benchmark question is:

> Is Microplex closer to the real target DB than `policyengine-us-data` is?

## What is truth

Truth is the active target set loaded from the PE-US targets DB.

Main provider:

- `microplex_us.policyengine.PolicyEngineUSDBTargetProvider`

The baseline dataset is not truth. It is only the incumbent comparator.

## What PolicyEngine does

`policyengine-us` is the shared measurement operator.

Both:

- the Microplex candidate dataset
- the `policyengine-us-data` baseline dataset

are run through the same PE-US variable materialization and the same target
compiler before being compared to the same targets.

So the benchmark shape is:

`dataset -> policyengine-us -> implied aggregates -> compare to target DB`

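The benchmark shape above can be sketched in a few lines. This is a toy illustration, not the real harness: `materialize` stands in for `policyengine-us` variable materialization, `TargetSpec` is a hypothetical feature/filter/aggregation spec, and every record and target value is made up. The point it shows is that candidate and baseline go through the same operator and the same target compiler.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class TargetSpec:
    """Hypothetical compiled target: weighted sum of a feature over a filter."""
    name: str
    feature: str
    where: Callable[[dict], bool]
    value: float  # the number stored in the target DB


def materialize(dataset):
    """Stand-in for PE-US materialization: derive `agi` from inputs."""
    return [{**r, "agi": r["wages"] + r["interest"]} for r in dataset]


def implied_aggregate(dataset, spec):
    """Weighted aggregate the dataset implies for one target."""
    return sum(r["weight"] * r[spec.feature] for r in dataset if spec.where(r))


def measure(dataset, targets):
    """dataset -> materialize -> implied aggregates -> relative errors."""
    rows = materialize(dataset)
    return {
        t.name: (implied_aggregate(rows, t) - t.value) / abs(t.value)
        for t in targets
    }


targets = [
    TargetSpec("agi/national", "agi", lambda r: True, 10_000_000.0),
    TargetSpec("agi/CA", "agi", lambda r: r["state"] == "CA", 6_000_000.0),
]
candidate = [
    {"weight": 100, "wages": 50_000, "interest": 1_000, "state": "CA"},
    {"weight": 50, "wages": 80_000, "interest": 0, "state": "NY"},
]
baseline = [
    {"weight": 120, "wages": 50_000, "interest": 0, "state": "CA"},
    {"weight": 70, "wages": 80_000, "interest": 0, "state": "NY"},
]

candidate_errors = measure(candidate, targets)  # same operator ...
baseline_errors = measure(baseline, targets)    # ... for both datasets
```

Because both datasets share `measure` and `targets`, any difference in the resulting errors comes from the data, not from the measurement path. In this toy case the candidate is closer on the national target and the baseline is closer on the California one.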
## Current default harness

Default saved-build evaluation now uses:

- the full active PE-US target estate
- one `all_targets` slice

Main files:

- `src/microplex_us/policyengine/harness.py`
- `src/microplex_us/policyengine/comparison.py`

## Main metrics

Per run:

- `candidate_composite_parity_loss`
- `baseline_composite_parity_loss`
- `candidate_mean_abs_relative_error`
- `baseline_mean_abs_relative_error`
- `target_win_rate`
- `supported_target_rate`

The frontier metric is currently:

- `candidate_composite_parity_loss`

This is a diversity-aware outer loss over the target set rather than a raw
target-count-weighted mean alone.

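The exact metric definitions live in the harness code and are not reproduced here. As a rough sketch under stated assumptions, the per-run metrics could be computed along these lines: "supported" is assumed to mean both datasets produced a value for the target, a "win" is assumed to mean a strictly smaller absolute relative error, and the family-balanced loss (mean of per-family means) is only one possible way to be diversity-aware. It is not claimed to be the formula behind `candidate_composite_parity_loss`. All rows are made up.

```python
from collections import defaultdict
from statistics import fmean


def summarize(rows):
    """rows: dicts with 'family', 'candidate_error', 'baseline_error'.
    Errors are relative errors; None means the target was not computable."""
    supported = [
        r for r in rows
        if r["candidate_error"] is not None and r["baseline_error"] is not None
    ]

    def mare(key):
        # Plain mean absolute relative error: every target counts equally.
        return fmean(abs(r[key]) for r in supported)

    def family_balanced(key):
        # Average within each family first, so large families do not dominate.
        by_family = defaultdict(list)
        for r in supported:
            by_family[r["family"]].append(abs(r[key]))
        return fmean(fmean(errs) for errs in by_family.values())

    wins = sum(
        abs(r["candidate_error"]) < abs(r["baseline_error"]) for r in supported
    )
    return {
        "supported_target_rate": len(supported) / len(rows),
        "target_win_rate": wins / len(supported),
        "candidate_mean_abs_relative_error": mare("candidate_error"),
        "baseline_mean_abs_relative_error": mare("baseline_error"),
        "candidate_family_balanced_loss": family_balanced("candidate_error"),
        "baseline_family_balanced_loss": family_balanced("baseline_error"),
    }


rows = [
    {"family": "income", "candidate_error": 0.10, "baseline_error": 0.05},
    {"family": "income", "candidate_error": -0.10, "baseline_error": 0.05},
    {"family": "income", "candidate_error": 0.10, "baseline_error": -0.05},
    {"family": "population", "candidate_error": 0.10, "baseline_error": 2.00},
    {"family": "income", "candidate_error": None, "baseline_error": 0.05},
]
summary = summarize(rows)
```

With these toy rows the candidate wins only one of four supported targets, yet both its mean error and its family-balanced loss are lower than the baseline's, because the baseline is badly off in the small `population` family. That is the kind of gap between win rate and aggregate loss a diversity-aware metric is meant to surface.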
## Saved outputs

Every serious saved run can write:

- artifact bundle directory
- `policyengine_harness.json`
- `run_registry.jsonl`
- `run_index.duckdb`

These live under the selected artifact root.

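Since `run_registry.jsonl` is a JSON Lines file, it can be read with the standard library alone. The sketch below assumes one JSON object per line with an `artifact_id` field and a `candidate_composite_parity_loss` field. That schema is an assumption made for illustration, and the demo writes its own made-up registry to a temporary directory rather than reading a real artifact root.

```python
import json
import tempfile
from pathlib import Path


def read_registry(path):
    """Parse a JSON Lines registry, skipping blank lines."""
    with Path(path).open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


def frontier_entry(entries, metric="candidate_composite_parity_loss"):
    """Return the entry with the lowest metric, ignoring entries without it."""
    scored = [e for e in entries if e.get(metric) is not None]
    return min(scored, key=lambda e: e[metric]) if scored else None


# Demo with a throwaway registry; ids and losses are invented.
with tempfile.TemporaryDirectory() as root:
    registry = Path(root) / "run_registry.jsonl"
    demo_runs = [
        {"artifact_id": "run_a", "candidate_composite_parity_loss": 1.25},
        {"artifact_id": "run_b", "candidate_composite_parity_loss": 0.75},
        {"artifact_id": "run_c", "candidate_composite_parity_loss": None},
    ]
    registry.write_text(
        "\n".join(json.dumps(run) for run in demo_runs) + "\n", encoding="utf-8"
    )
    entries = read_registry(registry)
    best = frontier_entry(entries)
```

Here `best` is the `run_b` entry. The package's own frontier helpers are the supported route for real runs; this only shows that the registry format is easy to inspect by hand.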
## Inspecting runs

Useful Python APIs:

- `select_us_microplex_frontier_entry(...)`
- `select_us_microplex_frontier_index_row(...)`
- `list_us_microplex_target_delta_rows(...)`
- `compare_us_microplex_target_delta_rows(...)`

The last helper is meant for questions like:

- what changed between two broad runs?
- which targets improved under a source-policy change?
- which target families regressed even when overall loss improved?

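The signatures of the helpers above are elided in this doc, so the following does not call them. It is a standalone sketch of the kind of comparison the three questions describe, assuming each run can be reduced to a mapping from target name to relative error and that the family is the part of the name before the first `/`. Both are assumptions, and the numbers are invented.

```python
from collections import defaultdict
from statistics import fmean


def compare_runs(before, after):
    """Per-target change in absolute relative error between two runs."""
    rows = []
    for name in sorted(before.keys() & after.keys()):
        delta = abs(after[name]) - abs(before[name])
        status = (
            "improved" if delta < 0 else "regressed" if delta > 0 else "unchanged"
        )
        rows.append(
            {
                "target": name,
                "family": name.split("/")[0],  # assumed naming convention
                "delta": delta,
                "status": status,
            }
        )
    return rows


def regressed_families(rows):
    """Families whose mean absolute error got worse."""
    by_family = defaultdict(list)
    for row in rows:
        by_family[row["family"]].append(row["delta"])
    return sorted(f for f, deltas in by_family.items() if fmean(deltas) > 0)


before = {"income/wages": 0.30, "income/interest": 0.20, "population/age_0_17": 0.05}
after = {"income/wages": 0.10, "income/interest": 0.15, "population/age_0_17": 0.08}

delta_rows = compare_runs(before, after)
worse = regressed_families(delta_rows)
overall_improved = fmean(abs(v) for v in after.values()) < fmean(
    abs(v) for v in before.values()
)
```

In this toy comparison the overall mean error improves while the `population` family regresses, which is exactly the situation the third question is asking about.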
## Current broad reference point

As of March 27, 2026, the best recorded broad `national + state` `CPS+PUF`
frontier in the main artifact root was:

- artifact id: `cps_puf_500_native_wages`
- candidate composite parity loss: `0.8906`
- baseline composite parity loss: `4.5412`
- candidate mean absolute relative error: `0.9928`
- baseline mean absolute relative error: `1.1920`

That does **not** mean Microplex is already better on most targets. The same run
had a low `target_win_rate`, meaning the gain comes from improving the overall
loss surface rather than beating the incumbent on a majority of individual
targets.

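A quick arithmetic check on the four numbers above helps put them in proportion. The snippet only divides the reported values. It assumes nothing about how the losses are computed beyond "lower is better".

```python
# Reported values for `cps_puf_500_native_wages` (copied from the list above).
candidate_loss, baseline_loss = 0.8906, 4.5412
candidate_mare, baseline_mare = 0.9928, 1.1920

# Candidate composite loss as a fraction of the baseline's.
loss_ratio = candidate_loss / baseline_loss

# Relative reduction in mean absolute relative error.
mare_reduction = 1 - candidate_mare / baseline_mare

print(f"composite loss ratio: {loss_ratio:.3f}")
print(f"MARE reduction:       {mare_reduction:.1%}")
```

The composite loss is roughly a fifth of the baseline's, while the mean absolute relative error drops by only about 17%. If that metric is a plain mean of absolute relative errors, a value near 0.99 also means the average target is still missed by close to 100% for the candidate, so the composite-loss gap should not be read as near-parity with the targets.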
## Important caveats

- This is parity evaluation, not held-out evaluation.
- Calibration and evaluation still overlap unless explicitly separated in build
  config.
- A broad win on the composite loss is not the same thing as a majority-target
  win.
- Local-area production parity is not finished yet.

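The overlap caveat is easiest to see with a sketch of what separation could look like. The build config keys that control this are not shown in this doc, so the following is a generic illustration rather than the package's mechanism: targets are split deterministically by hashing their names, so that the held-out set is stable across runs and never used for calibration. `is_held_out`, `split_targets`, the salt, and the target names are all hypothetical.

```python
import hashlib


def is_held_out(target_name, holdout_fraction=0.2, salt="split-v1"):
    """Stable pseudo-random assignment based on a hash of the target name."""
    digest = hashlib.sha256(f"{salt}:{target_name}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big")  # integer in [0, 2**64)
    return bucket < holdout_fraction * 2**64


def split_targets(names, holdout_fraction=0.2):
    """Return (calibration_targets, held_out_targets) with no overlap."""
    calibration, held_out = [], []
    for name in names:
        if is_held_out(name, holdout_fraction):
            held_out.append(name)
        else:
            calibration.append(name)
    return calibration, held_out


# Invented target names, for illustration only.
names = [f"family_{i % 5}/target_{i}" for i in range(50)]
calibration, held_out = split_targets(names)
```

Calibrating on `calibration` and reporting errors on `held_out` would turn the parity number into a held-out number. A name-hash split is only one option; splitting by whole target family would be a stricter test.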
## Repro pattern

Broad versioned builds use:

- `build_and_save_versioned_us_microplex(...)`
- `build_and_save_versioned_us_microplex_from_source_provider(...)`
- `build_and_save_versioned_us_microplex_from_source_providers(...)`

The resulting run can then be inspected through the JSON artifacts or via the
DuckDB index.
