|
| 1 | +# B2 downstream validation (v11-per-stage-lambda) |
| 2 | + |
| 3 | +Run date: 2026-04-22 |
| 4 | +Artifact: `artifacts/live_pe_us_data_rebuild_checkpoint_20260421_v11_per_stage_lambda/v11-per-stage-lambda/policyengine_us.h5` |
| 5 | +Period: 2024 |
| 6 | +Method: `scripts/run_b2_batched.py` with batch_size=50_000 for income_tax, 100_000 for aca_ptc, full-dataset for the rest. |
| 7 | +Comparison framework: `microplex_us.validation.downstream.DOWNSTREAM_BENCHMARKS_2024`. |
| 8 | + |
| 9 | +## Results |
| 10 | + |
| 11 | +| Variable | Computed | Benchmark | Rel error | Source | |
| 12 | +|----------|---------:|----------:|---------:|--------| |
| 13 | +| income_tax | $2,089.7B | $2,400.0B | −12.9% | IRS SOI 2022 ~$2.22T; CBO 2024 projection ~$2.4T | |
| 14 | +| eitc | $64.2B | $64.0B | +0.3% | IRS SOI 2023 (Table 2.5) | |
| 15 | +| snap | $101.8B | $100.0B | +1.8% | USDA FNS FY2024 | |
| 16 | +| ctc | $151.9B | $115.0B | +32.1% | IRS SOI 2023 (pre-OBBBA $2,000/qc) | |
| 17 | +| ssi | $108.2B | $66.0B | +64.0% | SSA SSI Annual Statistical Report 2024 | |
| 18 | +| aca_ptc | $14.1B | $60.0B | −76.4% | CMS/IRS ACA PTC 2024 (IRA-enhanced) | |
| 19 | + |
| 20 | +## Reading |
| 21 | + |
| 22 | +- **Within ±15%** of benchmark: income_tax (−12.9%), eitc (+0.3%), snap (+1.8%). The tax-mechanics chain and the two largest means-tested programs reconcile to published totals once calibrated weights are applied. |
| 23 | +- **Elevated +30% to +65%**: ctc and ssi. ctc = 32% above IRS SOI suggests either more qualifying children per household than IRS counts, or the synthesis pulled CTC-eligible families with higher frequency than the population-level CTC claim rate; ssi at +64% is the cleanest outlier and points to either over-representation of the aged / disabled low-income subpopulation or a missed means-test gate in the synthesis-then-materialize step. |
| 24 | +- **Under at −76%**: aca_ptc. The `has_marketplace_health_coverage` flag is in the synthesis target set, but the reconciled PTC depends on a policy-output chain (MAGI, federal poverty line, premium contribution). Either marketplace enrollment is under-represented at the income bands where PTC is largest, or the IRA-enhanced subsidy schedule isn't firing as it does in production IRS data. |
| 25 | + |
| 26 | +## Interpretation for the paper's B2 section |
| 27 | + |
| 28 | +Three headline aggregates reconcile within single-digit or low-teens relative error. The three that don't (ctc, ssi, aca_ptc) are individually diagnosable — each points to a specific shortfall in the synthesis step rather than a structural problem in the calibration framework. A follow-up calibration pass can add direct targets on these aggregates (CTC disbursed, SSI disbursed, ACA PTC disbursed) to drive them in. |
| 29 | + |
| 30 | +The income_tax reconciliation at −12.9% is the most important single number: it's the paper's headline claim that the calibrated synthesis produces a PolicyEngine-US-readable frame whose downstream tax-output reconciles to IRS administrative totals within a credible tolerance. |
| 31 | + |
| 32 | +## Reproduction |
| 33 | + |
| 34 | +```bash |
| 35 | +# All variables except income_tax and aca_ptc fit in the full-dataset path: |
| 36 | +for var in ssi snap eitc ctc; do |
| 37 | + .venv/bin/python -u scripts/run_b2_validation_single_var.py \ |
| 38 | + --dataset <h5> --output <json_path> --variable "$var" --period 2024 |
| 39 | +done |
| 40 | + |
| 41 | +# income_tax and aca_ptc need batching to avoid 30+ GB peak RSS: |
| 42 | +.venv/bin/python -u scripts/run_b2_batched.py \ |
| 43 | + --dataset <h5> --output <json_path> --variable income_tax \ |
| 44 | + --period 2024 --batch-size 50000 |
| 45 | + |
| 46 | +.venv/bin/python -u scripts/run_b2_batched.py \ |
| 47 | + --dataset <h5> --output <json_path> --variable aca_ptc \ |
| 48 | + --period 2024 --batch-size 100000 |
| 49 | +``` |
0 commit comments