Make country package purely deterministic - read stochastic variables from dataset #1355
Closed
MaxGhenis wants to merge 3 commits into
This change moves all randomness generation from policyengine-uk to policyengine-uk-data, following the pattern established in policyengine-us. Each independent random decision now has its own seed variable to avoid artificial correlations between unrelated stochastic processes.

Changes:
- Add 11 new seed variables (4 person-level, 4 benunit-level, 3 household-level):
  - is_disabled_for_benefits_seed
  - marriage_allowance_take_up_seed
  - is_higher_earner_seed
  - attends_private_school_seed
  - child_benefit_take_up_seed
  - child_benefit_opts_out_seed
  - pension_credit_take_up_seed
  - universal_credit_take_up_seed
  - first_home_purchase_seed
  - household_owns_tv_seed
  - tv_licence_evasion_seed
- Update all variables using random() to use their specific seed variable

This ensures reproducible simulations and allows the dataset to control all stochastic elements of the model.

Related: policyengine-uk-data PR (must be merged first)
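To illustrate why each decision gets its own seed variable, here is a minimal sketch using only the Python standard library. It assumes a seed variable behaves like a per-entity uniform draw in [0, 1) stored in the dataset; the helper `seed_column`, the 0.5 rate, and the sample size are hypothetical and are not taken from policyengine-uk or policyengine-uk-data.

```python
import random


def seed_column(name: str, n: int) -> list[float]:
    """Hypothetical data-side helper: one uniform draw per entity.

    Seeding the generator with the variable name makes the column
    reproducible and distinct from every other seed column.
    """
    rng = random.Random(name)
    return [rng.random() for _ in range(n)]


n = 1000
pc_seed = seed_column("pension_credit_take_up_seed", n)
uc_seed = seed_column("universal_credit_take_up_seed", n)

ILLUSTRATIVE_RATE = 0.5  # assumption for the example, not a real take-up rate

# Separate seed columns: the two decisions vary independently.
claims_pc = [s < ILLUSTRATIVE_RATE for s in pc_seed]
claims_uc = [s < ILLUSTRATIVE_RATE for s in uc_seed]

# Reusing one seed column for both decisions makes them identical,
# which is the artificial correlation the commit message warns about.
claims_uc_shared = [s < ILLUSTRATIVE_RATE for s in pc_seed]
```

With a shared column, everyone who claims one benefit also claims the other; with separate columns the two decisions are unrelated.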
… from dataset

This change removes all random number generation from policyengine-uk. All stochastic variables are now generated in policyengine-uk-data and read from the dataset. The country package is now a purely deterministic rules engine.

## Key Changes

- Remove all take-up rate parameters (moved to policyengine-uk-data)
- Remove all random seed/draw variables for take-up decisions
- Simplify would_claim variables to dataset-only (no formula)
- Keep formulas with fallback to random() for policy calculator (non-microsim)
- Add would_claim_marriage_allowance variable
- Add random draw variables for tie-breaking and conditional probabilities

### Variables now sourced from dataset:

**Take-up decisions (boolean):**
- would_claim_child_benefit
- child_benefit_opts_out
- would_claim_pc
- would_claim_uc
- would_claim_marriage_allowance
- would_claim_tfc
- would_claim_extended_childcare
- would_claim_universal_childcare
- would_claim_targeted_childcare

**Other stochastic variables:**
- household_owns_tv
- would_evade_tv_licence_fee
- main_residential_property_purchased_is_first_home

**Random draws (for formulas):**
- is_higher_earner_random_draw (tie-breaking)
- attends_private_school_random_draw (income-conditional)

### Formulas preserved for policy calculator:

- attends_private_school (complex income percentile logic)
- is_disabled_for_benefits (conditional on qualifying benefits)
- is_higher_earner (uses random draw for tie-breaking)

These fall back to random() when not in a microsimulation context.

## Trade-offs

**IMPORTANT**: Take-up rates can no longer be adjusted via policy reforms. They are fixed in the microdata. This is an acceptable trade-off for the cleaner architecture. To adjust take-up rates, regenerate the microdata.

Related: policyengine-uk-data PR #[TBD] (must be merged FIRST)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
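The tie-breaking role of a stored random draw can be sketched as follows. This is an illustration only: the function below is hypothetical plain Python rather than the actual policyengine-uk formula, and it assumes the draw is a per-person value read from the dataset.

```python
def is_higher_earner(incomes: list[float], draws: list[float]) -> list[bool]:
    """Flag exactly one person in a benefit unit as the higher earner.

    Income decides first; when incomes are equal, the stored draw
    (assumed to come from the dataset) breaks the tie, so no random
    number is generated at calculation time.
    """
    best = max(range(len(incomes)), key=lambda i: (incomes[i], draws[i]))
    return [i == best for i in range(len(incomes))]


# Equal incomes: the person with the larger stored draw is selected.
tied = is_higher_earner([30_000.0, 30_000.0], [0.2, 0.7])

# Unequal incomes: the draw is irrelevant.
clear = is_higher_earner([40_000.0, 30_000.0], [0.1, 0.9])
```

Because the draw is data rather than a call to a random number generator, repeated runs over the same dataset always pick the same person.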
Superseded by #1439 - fresh PR after resolving conflicts
Summary
This PR removes all random number generation from policyengine-uk. All stochastic variables are now generated in policyengine-uk-data and read from the dataset. The country package is now a purely deterministic rules engine.
Changes
Removed
Simplified
All would_claim variables now use dataset values with deterministic fallbacks:
Other stochastic variables simplified to dataset-only:
Added
Preserved formulas (fully deterministic)
These variables keep their formulas but with NO random() calls:
Test updates
Updated expected fiscal impacts in reforms_config.yaml to reflect the new stochastic simulation method.
Trade-offs
IMPORTANT: Take-up rates can no longer be adjusted dynamically via policy reforms or in the web app. They are fixed in the microdata. This is an acceptable trade-off for the cleaner architecture of keeping the country package purely deterministic.
To adjust take-up rates for analysis, the microdata must be regenerated with updated parameter values in policyengine-uk-data.
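The split of responsibilities described above can be sketched roughly as below. Everything here is a simplified assumption for illustration: `generate_would_claim` stands in for the data-side step in policyengine-uk-data, `benefit_paid` stands in for a deterministic rule in the country package, and the rates and seed are made up.

```python
import random


def generate_would_claim(n: int, take_up_rate: float, seed: int = 0) -> list[bool]:
    """Hypothetical data-side step: fix take-up decisions in the microdata."""
    rng = random.Random(seed)
    return [rng.random() < take_up_rate for _ in range(n)]


def benefit_paid(entitlement: list[float], would_claim: list[bool]) -> list[float]:
    """Hypothetical rules-side step: purely deterministic given the dataset."""
    return [e if c else 0.0 for e, c in zip(entitlement, would_claim)]


# Changing the take-up rate means regenerating the column, not reforming policy.
low = generate_would_claim(200, take_up_rate=0.3, seed=42)
high = generate_would_claim(200, take_up_rate=0.6, seed=42)

paid = benefit_paid([100.0, 50.0], [True, False])
```

In this sketch the rules step never sees the rate, only the resulting boolean column, which is why a take-up change requires new microdata.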
Test Plan
Related PRs
🤖 Generated with Claude Code
Co-Authored-By: Claude noreply@anthropic.com