This repository was archived by the owner on Jun 14, 2026. It is now read-only.
Commit af62615
Add ScaleUpRunner harness for synthesizer scale-up benchmark
Implements the stage-1/2/3 protocol from docs/synthesizer-benchmark-scale-up.md
as a real runnable harness.
Components:
- src/microplex_us/bakeoff/scale_up.py
  * ScaleUpStageConfig: frozen dataclass with curated 50-column default
    (14 demographics + 36 income/wealth/benefit targets)
  * ScaleUpRunner: load_frame, split, fit_and_generate, run
  * _load_enhanced_cps: entity-aware loader that broadcasts
    household / SPM-unit / tax-unit / family / marital-unit variables
    down to person level via person_<entity>_id -> <entity>_id lookups
  * Per-method metrics: PRDC precision/density/coverage (via prdc
    library), wall time, peak RSS, rare-cell preservation ratios
    (elderly self-employed, young dividend, disabled SSDI,
    top-1% employment), zero-rate MAE
  * CLI: python -m microplex_us.bakeoff.scale_up --stage stage1 ...
  * Stage configs: stage1 (~77k from ECPS), stage2 (1M, needs larger
    source), stage3 (v6 seed-ready 3.4M x 155)
- tests/bakeoff/test_scale_up.py
  * Smoke tests on a 500-row, 5-column, ZI-QRF-only slice
  * Entity-broadcast verification via real ECPS loading
  * Column-missing error path
  * Default column-set sanity check
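
The commit message names the rare-cell preservation ratio and zero-rate MAE metrics but does not define them. A minimal sketch of one plausible reading follows, assuming the ratio is the synthetic share of a cell divided by its real share, and zero-rate MAE is the mean absolute gap in per-column zero fractions. The function names, column names, and the age-65 cutoff are hypothetical and are not taken from scale_up.py.

```python
import pandas as pd


def zero_rate_mae(real: pd.DataFrame, synth: pd.DataFrame, columns: list[str]) -> float:
    """Mean absolute difference in the fraction of exact zeros per column (assumed definition)."""
    real_zero = (real[columns] == 0).mean()
    synth_zero = (synth[columns] == 0).mean()
    return float((real_zero - synth_zero).abs().mean())


def rare_cell_ratio(real: pd.DataFrame, synth: pd.DataFrame, cell) -> float:
    """Synthetic share of a rare cell divided by its real share (assumed definition).

    `cell` is a callable that returns a boolean mask for a frame.
    A value of 1.0 means the cell is as frequent in the synthetic data as in the real data.
    """
    real_share = cell(real).mean()
    synth_share = cell(synth).mean()
    if real_share == 0:
        return float("nan")  # the cell does not occur in the real data
    return float(synth_share / real_share)


# Hypothetical toy frames; the column names are illustrative only.
real = pd.DataFrame({
    "age": [70, 30, 68, 45],
    "self_employment_income": [5000.0, 0.0, 0.0, 0.0],
})
synth = pd.DataFrame({
    "age": [72, 29, 66, 41],
    "self_employment_income": [3000.0, 0.0, 0.0, 1200.0],
})


def elderly_self_employed(df: pd.DataFrame) -> pd.Series:
    # Hypothetical cell definition: aged 65+ with positive self-employment income.
    return (df["age"] >= 65) & (df["self_employment_income"] > 0)


mae = zero_rate_mae(real, synth, ["self_employment_income"])  # 0.75 vs 0.50 zeros -> 0.25
ratio = rare_cell_ratio(real, synth, elderly_self_employed)   # 0.25 / 0.25 -> 1.0
```

Under this reading a ratio below 1.0 means the synthesizer under-represents the rare cell, and a ratio above 1.0 means it over-represents it.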
Notable limitations recorded for follow-up:
- state_fips / snap_reported / net_worth / housing_assistance and other
  non-person entity variables are now correctly broadcast to person
  level via ID lookup. Their absence at person level was the blocker
  for building a flat DataFrame.
- enhanced_cps_2024 has 77k persons, not the 100k stage-1 target.
  n_rows=None now uses all available rows.
- is_household_head is not in ECPS; replaced with is_separated.
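
The person_<entity>_id -> <entity>_id broadcast mentioned above can be sketched in pandas as follows. This is an illustration under stated assumptions, not the code in _load_enhanced_cps: the helper name broadcast_to_person and the toy frames are hypothetical, and only the ID column naming pattern comes from the commit message.

```python
import pandas as pd


def broadcast_to_person(
    person: pd.DataFrame,
    entity: pd.DataFrame,
    entity_name: str,
    variable: str,
) -> pd.Series:
    """Return `variable` from an entity-level table aligned to person rows.

    Assumes the person table has a `person_<entity>_id` column and the
    entity table has a matching `<entity>_id` column.
    """
    link_col = f"person_{entity_name}_id"
    id_col = f"{entity_name}_id"

    lookup = entity.set_index(id_col)[variable]
    if not lookup.index.is_unique:
        raise ValueError(f"{id_col} must be unique in the {entity_name} table")

    # Fail loudly on persons whose entity ID has no match, rather than
    # silently producing NaN in the flat frame.
    unmatched = ~person[link_col].isin(lookup.index)
    if unmatched.any():
        raise KeyError(f"{int(unmatched.sum())} person rows have no matching {id_col}")

    return person[link_col].map(lookup).rename(variable)


# Hypothetical toy data: three persons in two households.
person = pd.DataFrame({"person_id": [1, 2, 3], "person_household_id": [10, 10, 20]})
household = pd.DataFrame({"household_id": [10, 20], "state_fips": [6, 36]})

person["state_fips"] = broadcast_to_person(person, household, "household", "state_fips")
# person["state_fips"] is now [6, 6, 36]
```

A lookup via Series.map keeps the person table's row order and length, which is the property a flat person-level DataFrame needs.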
Not in this commit (deliberate):
- No execution of stage1 / stage2 / stage3 runs yet
- No CTGAN / TVAE support (present in registry, not in default method set)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent a408fb4, commit af62615
4 files changed
Lines changed: 861 additions & 0 deletions