This repository was archived by the owner on Jun 19, 2026. It is now read-only.
Commit 843293c
Make the FRS dataset build deterministic (seed all RNG draws) (#425)
* Make property_purchased assignment deterministic (seeded RNG)

The data build set property_purchased via unseeded np.random.random(),
so every build drew a different vector of purchasers. That made the
dataset non-reproducible and intermittently spiked the first income
decile's effective tax rate (the draw occasionally marked too many
high-property, low-income households as purchasers), failing
test_first_decile_tax_rate_reasonable and blocking releases.

Draw from a seeded numpy Generator (default_rng(0)) instead of the
global RNG, whose state depends on whatever ran earlier in the build.
Same FRS input now always yields the same ~3.85% purchaser assignment.

Pairs with the policyengine-uk fix flipping property_purchased's
default to False, which fail-safes any household this build does not
explicitly set.
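The seeded draw described above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the function name `assign_property_purchased`, the constant `PURCHASE_RATE`, and the household count are assumptions. Only `default_rng(0)` and the approximate 3.85% rate come from the commit message.

```python
import numpy as np

# Approximate purchaser share quoted in the commit message (assumed constant here).
PURCHASE_RATE = 0.0385


def assign_property_purchased(n_households: int, seed: int = 0) -> np.ndarray:
    """Return a boolean vector marking which households are purchasers.

    A fresh Generator is created from the seed on every call, so the result
    does not depend on the state of the global np.random RNG.
    """
    rng = np.random.default_rng(seed)
    return rng.random(n_households) < PURCHASE_RATE


first = assign_property_purchased(10_000)

# Disturb the global RNG, as earlier build steps might.
np.random.seed(123)
np.random.random(5)

second = assign_property_purchased(10_000)

# The assignment is identical regardless of what ran in between.
assert np.array_equal(first, second)
```

With an unseeded `np.random.random()` call, the two vectors would generally differ, which is the non-reproducibility the commit message describes.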

* Fix review findings: seed capital gains and BRMA sampling too

Independent review found the property_purchased seed was necessary but
not sufficient for a reproducible build: two more assignments drew from
the unseeded global numpy RNG.

- imputations/capital_gains.py: quantile draws for the capital gains
  amount imputation now come from a seeded default_rng(0), so capital
  gains (and CGT revenue) are reproducible.
- frs.py BRMA assignment: both pandas .sample() calls (region/category
  rent sampling and the household-level pick) now take a seeded
  random_state generator instead of the global RNG.
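The two fixes listed above might look roughly like the sketch below. The function names, column names, and toy data are hypothetical and are not taken from capital_gains.py or frs.py. It assumes pandas 1.4 or later, where `DataFrame.sample` accepts a `np.random.Generator` as `random_state`.

```python
import numpy as np
import pandas as pd


def impute_gains(observed: np.ndarray, n: int, seed: int = 0) -> np.ndarray:
    """Impute n amounts by drawing seeded quantiles of an observed distribution."""
    rng = np.random.default_rng(seed)
    quantiles = rng.random(n)  # uniform draws in [0, 1)
    return np.quantile(observed, quantiles)


def sample_brma(rents: pd.DataFrame, n_candidates: int, seed: int = 0) -> pd.DataFrame:
    """Two-stage sampling where both .sample() calls share one seeded Generator."""
    rng = np.random.default_rng(seed)
    # Stage 1: sample candidate rent rows (hypothetical stand-in for the
    # region/category rent sampling).
    candidates = rents.sample(n=n_candidates, replace=True, random_state=rng)
    # Stage 2: pick a single row (stand-in for the household-level pick).
    return candidates.sample(n=1, random_state=rng)


# Toy inputs, invented purely for illustration.
observed_gains = np.array([0.0, 500.0, 1_200.0, 8_000.0, 40_000.0])
rents = pd.DataFrame(
    {
        "brma": ["A", "B", "C", "D"],
        "rent": [650.0, 720.0, 810.0, 990.0],
    }
)

gains_1 = impute_gains(observed_gains, 5)
gains_2 = impute_gains(observed_gains, 5)
pick_1 = sample_brma(rents, 3)
pick_2 = sample_brma(rents, 3)

# Same inputs and seed give the same outputs on every run.
assert np.array_equal(gains_1, gains_2)
assert pick_1.equals(pick_2)
```

Passing a single Generator through both `.sample()` calls keeps the second draw dependent only on the seed and the first draw, not on any global state.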

The SPI synthetic sampling (income.py) was already seeded. The only
remaining unseeded np.random call is in childcare/takeup_rate.py, which
is not reached by the dataset build (test-only); left for separate cleanup.

Broadened the changelog to reflect that the whole FRS build is now
deterministic.

1 parent e5c7f84 commit 843293c
3 files changed
Lines changed: 29 additions & 10 deletions
File tree
- changelog.d
- policyengine_uk_data/datasets
- imputations