Move all randomness to data package for deterministic country package #442
Closed
MaxGhenis wants to merge 2 commits into
This change moves ALL random number generation from policyengine-us into the dataset generation in policyengine-us-data. The country package is now a purely deterministic rules engine.

## Key Changes

### policyengine-us-data:
- Add take-up rate YAML parameter files in `parameters/take_up/`
- Generate all stochastic boolean take-up decisions in CPS dataset
- Use seeded RNG (seed=100) for full reproducibility

### Stochastic variables generated:

**Take-up decisions (boolean):**
- takes_up_snap_if_eligible
- takes_up_aca_if_eligible
- takes_up_medicaid_if_eligible
- takes_up_eitc (already boolean)
- takes_up_dc_ptc (already boolean)

All random generation now uses `np.random.default_rng(seed=100)` for full reproducibility across dataset builds.

## Trade-offs

**IMPORTANT**: Take-up rates can no longer be adjusted dynamically via policy reforms or in the web app. They are fixed in the microdata. This is an acceptable trade-off for the cleaner architecture of keeping the country package purely deterministic.

To adjust take-up rates, the microdata must be regenerated.

Related: policyengine-us PR (must be merged after this)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
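As a rough illustration of the seeded generation described in this commit message, the sketch below draws one boolean flag per unit by comparing uniform draws against a take-up rate. The function name `generate_take_up`, the rate values, and the unit count are hypothetical; the actual rates live in the YAML parameter files and are not stated here, and the real dataset code may draw the flags differently.

```python
import numpy as np

# Hypothetical rates for illustration only; the real values come from the
# YAML files under parameters/take_up/ and are not given in this PR.
ILLUSTRATIVE_RATES = {
    "takes_up_snap_if_eligible": 0.8,
    "takes_up_aca_if_eligible": 0.7,
    "takes_up_medicaid_if_eligible": 0.9,
}


def generate_take_up(n_units, rates, seed=100):
    """Draw one boolean take-up flag per unit for each program."""
    rng = np.random.default_rng(seed=seed)
    # Iterating in dict insertion order keeps the sequence of draws stable,
    # so two builds with the same seed produce identical flags.
    return {name: rng.random(n_units) < rate for name, rate in rates.items()}


first_build = generate_take_up(1_000, ILLUSTRATIVE_RATES)
second_build = generate_take_up(1_000, ILLUSTRATIVE_RATES)
```

Because both calls start from the same seed and draw in the same order, `first_build` and `second_build` hold identical arrays, which is the reproducibility property the commit relies on.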
- Create takeup parameter files with rates from NIEER report
  - Head Start: 40% (pre-pandemic), 30% (pandemic 2020-2021)
  - Early Head Start: 9%
- Generate stochastic takeup in CPS dataset using same pattern as SNAP/Medicaid
- Coordinates with policyengine-us PR adding takeup variables
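A minimal sketch of how the Head Start rates listed in this commit could drive the stochastic draw. The function and variable names are hypothetical, and reusing seed 100 here is an assumption based on "same pattern as SNAP/Medicaid". The commit gives no rate for years after 2021, so the sketch raises an error rather than guessing one.

```python
import numpy as np

EARLY_HEAD_START_TAKE_UP_RATE = 0.09  # 9%, as stated in the commit


def head_start_take_up_rate(year):
    """Return the Head Start take-up rate stated in the commit for a year."""
    if year in (2020, 2021):
        return 0.30  # pandemic rate
    if year < 2020:
        return 0.40  # pre-pandemic rate
    # The commit does not state a post-2021 rate, so none is assumed here.
    raise ValueError(f"No take-up rate stated for {year}")


def draw_take_up(n_children, rate, rng):
    """Compare uniform draws against the rate to get boolean flags."""
    return rng.random(n_children) < rate


rng = np.random.default_rng(seed=100)  # seed reuse is an assumption
takes_up_head_start = draw_take_up(10_000, head_start_take_up_rate(2019), rng)
takes_up_early_head_start = draw_take_up(
    10_000, EARLY_HEAD_START_TAKE_UP_RATE, rng
)
```

With 10,000 draws, the share of `True` values in each array should land close to the corresponding rate.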
Closing fork PR - recreating from upstream branch to enable CI
Summary
This PR moves ALL random number generation from policyengine-us into the dataset generation in policyengine-us-data. The country package is now a purely deterministic rules engine.
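To make "purely deterministic rules engine" concrete, here is a hedged sketch of how a rules-side formula might consume a pre-generated take-up flag without drawing any random numbers itself. The function `snap_received` and the sample arrays are hypothetical and do not reproduce the actual policyengine-us formula.

```python
import numpy as np


def snap_received(eligible, takes_up, benefit_if_taken_up):
    """Pay the benefit only to eligible units whose stored flag is True."""
    # No RNG call here: the take-up flag is read from the dataset as input.
    return np.where(eligible & takes_up, benefit_if_taken_up, 0.0)


eligible = np.array([True, True, False, True])
takes_up = np.array([True, False, True, True])  # generated in the data package
benefit = np.array([200.0, 150.0, 300.0, 120.0])

result = snap_received(eligible, takes_up, benefit)  # -> [200., 0., 0., 120.]
```

Given the same inputs, this function always returns the same output, so any run-to-run variation can only come from the dataset build.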
Changes
New take-up rate parameters
Added YAML parameter files in policyengine_us_data/parameters/take_up/:
CPS dataset generation
Stochastic variables generated
Take-up decisions (boolean):
Trade-offs
IMPORTANT: Take-up rates can no longer be adjusted dynamically via policy reforms or in the web app. They are fixed in the microdata at generation time. This is an acceptable trade-off for the cleaner architecture of keeping the country package purely deterministic.
To adjust take-up rates for analysis, the microdata must be regenerated with updated parameter values.
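The sketch below illustrates what regenerating with an updated rate could look like, assuming flags are produced by comparing seeded uniform draws against the rate. The function `regenerate_flags` and the 0.80 and 0.90 rates are hypothetical; the actual build may structure its draws differently.

```python
import numpy as np


def regenerate_flags(n_units, rate, seed=100):
    """Rebuild the take-up flags from scratch for a given rate."""
    rng = np.random.default_rng(seed=seed)
    return rng.random(n_units) < rate


baseline = regenerate_flags(5_000, rate=0.80)  # hypothetical current rate
higher = regenerate_flags(5_000, rate=0.90)    # hypothetical updated rate

# The same seed yields the same uniform draws, so every unit that takes up
# at the lower rate also takes up at the higher rate.
newly_taking_up = higher & ~baseline
```

Under this assumption, raising a rate only adds take-up units and never flips an existing one to `False`, which keeps comparisons between dataset builds easy to interpret.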
Test Plan
Related PRs
🤖 Generated with Claude Code
Co-Authored-By: Claude noreply@anthropic.com