Add saved-H5 rescoring and household-metrics workflow #71

Open
MaxGhenis wants to merge 6 commits into main from codex/saved-h5-main

@MaxGhenis
Contributor

@MaxGhenis MaxGhenis commented Apr 3, 2026

Summary

  • add a saved-H5 rescoring workflow for rerunning reforms on locally built long-run calibrated datasets
  • split scenario simulation from weighted aggregation via a household-metrics-first path
  • add local parallel fanout tooling for year x scenario materialization
  • track representative and all-reforms long-run findings from the saved-H5 checks
  • document the methodology caveats around dependency behavior in the late tail
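The year x scenario fanout mentioned above can be pictured with a minimal sketch. This is not the code in scripts/materialize_household_metrics_grid.py; `materialize_cell` and `run_grid` are hypothetical names, the job body is a placeholder, and a thread pool is used only to keep the example self-contained (the real tooling may well use processes or a different scheduler).

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def materialize_cell(year, scenario):
    """Hypothetical stand-in for one year x scenario materialization job."""
    # A real job would load the saved H5 for `year`, apply `scenario`,
    # and write household-level metrics to disk.
    return {"year": year, "scenario": scenario, "status": "done"}


def run_grid(years, scenarios, max_workers=4):
    """Fan out every (year, scenario) pair across a local worker pool."""
    cells = list(product(years, scenarios))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as `cells`.
        return list(pool.map(lambda cell: materialize_cell(*cell), cells))


results = run_grid([2026, 2027], ["baseline", "reform_a"])
```

Because each cell depends only on its own year and scenario, cells can run in any order and be retried independently.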

Scope

  • extend src/year_runner.py with household-level scenario metrics and explicit weighted aggregation helpers
  • add scripts/materialize_household_metrics.py
  • add scripts/materialize_household_metrics_grid.py
  • add scripts/aggregate_household_metrics.py
  • update scripts/score_saved_h5_reforms.py, batch/compute_year.py, and modal_batch/compute.py to use the new aggregation path
  • update README.md for the saved-H5 and household-metrics workflows
  • add focused regression coverage in tests/test_year_runner_household_metrics.py

Method

  • this PR is designed to work from locally built calibrated H5 datasets rather than waiting on policyengine-us-data CI
  • it validates saved-H5 metadata before scoring and composes the intended baseline tax reform during rescoring
  • baseline is treated as just another scenario in the new household-metrics path
  • scenario simulation can now run ahead of final weighted aggregation, as long as the year support dataset exists
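The household-metrics-first split described above can be sketched as follows. This is an illustration under stated assumptions, not the helpers added to src/year_runner.py: the function names, the `net_income` metric, and all numbers are hypothetical. It shows baseline handled like any other scenario, with household weights applied only at the final aggregation step.

```python
def household_metrics(scenario, net_income_by_household):
    """Step 1 (hypothetical): per-household, unweighted metrics for one scenario."""
    return [
        {"household_id": hid, "scenario": scenario, "net_income": value}
        for hid, value in net_income_by_household.items()
    ]


def aggregate(metrics, weights):
    """Step 2 (hypothetical): apply household weights to get a population total."""
    return sum(row["net_income"] * weights[row["household_id"]] for row in metrics)


# Illustrative inputs only; real weights would come from the calibrated H5.
weights = {1: 100.0, 2: 250.0}
scenarios = {
    "baseline": {1: 50_000.0, 2: 30_000.0},
    "reform_a": {1: 50_500.0, 2: 30_200.0},
}

# Simulation can finish for every scenario before any weights are applied.
metrics = {name: household_metrics(name, data) for name, data in scenarios.items()}
totals = {name: aggregate(rows, weights) for name, rows in metrics.items()}
impact = totals["reform_a"] - totals["baseline"]
```

Keeping the two steps separate means stored household metrics can be re-aggregated against updated weights without rerunning the simulation.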

Validation

  • uv run pytest tests/test_saved_h5_rescoring.py tests/test_year_runner_household_metrics.py -q
  • uv run ruff check src/year_runner.py scripts/materialize_household_metrics.py scripts/materialize_household_metrics_grid.py scripts/aggregate_household_metrics.py scripts/score_saved_h5_reforms.py batch/compute_year.py modal_batch/compute.py tests/test_saved_h5_rescoring.py tests/test_year_runner_household_metrics.py
  • python3 -m py_compile src/year_runner.py scripts/materialize_household_metrics.py scripts/materialize_household_metrics_grid.py scripts/aggregate_household_metrics.py scripts/score_saved_h5_reforms.py batch/compute_year.py modal_batch/compute.py

@github-actions

github-actions Bot commented Apr 3, 2026

@MaxGhenis MaxGhenis marked this pull request as ready for review April 3, 2026 12:32
@MaxGhenis MaxGhenis changed the title from "Add saved-H5 rescoring and long-run findings" to "Add saved-H5 rescoring and household-metrics workflow" Apr 4, 2026