This repository was archived by the owner on Jun 19, 2026. It is now read-only.
The dataset calibration process is inflating the UK population significantly above ONS targets. The base FRS 2025 dataset has ~69M people, but after calibration this jumps to ~74M - about 6% above the ONS mid-2024 actual estimate of 69.3M.
Evidence

| Dataset | 2025 Population |
| --- | --- |
| Base dataset (frs_2025_with_ss.h5) | 68.97M |
| Calibrated dataset (frs_2025_calibrated_v3.h5) | 73.57M |
| ONS mid-2024 actual estimate | 69.3M |
| ONS 2022-based projection for 2025 | ~70M |
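To make the size of the gap explicit, the arithmetic behind the "about 6%" figure can be reproduced from the numbers above. This is a small sketch; the variable and function names are illustrative and not taken from the repository.

```python
# Population figures in millions, copied from the Evidence section.
base_m = 68.97          # frs_2025_with_ss.h5
calibrated_m = 73.57    # frs_2025_calibrated_v3.h5
ons_mid_2024_m = 69.3   # ONS mid-2024 estimate


def pct_above(value: float, reference: float) -> float:
    """Return the percentage by which `value` exceeds `reference`."""
    return (value / reference - 1.0) * 100.0


calibrated_vs_ons = pct_above(calibrated_m, ons_mid_2024_m)
calibrated_vs_base = pct_above(calibrated_m, base_m)
base_vs_ons = pct_above(base_m, ons_mid_2024_m)

print(f"Calibrated vs ONS mid-2024: {calibrated_vs_ons:+.1f}%")
print(f"Calibrated vs base dataset: {calibrated_vs_base:+.1f}%")
print(f"Base dataset vs ONS mid-2024: {base_vs_ons:+.1f}%")
```

The base dataset sits within about half a percent of the ONS estimate, so the excess is introduced by calibration rather than inherited from the input data.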
Root Cause Investigation
The uk_population target IS included in the calibration loss function (loss.py line 318)
The population index in policyengine-uk (ons.population) uses reasonable growth rates
BUT the calibration is not constraining population properly - other targets are pulling weights in a direction that inflates total population
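One way a population target can be present in the loss and still be overridden is illustrated by the toy model below. It assumes a weighted sum of squared relative errors and a single uniform scale factor applied to all household weights; the actual form of the loss in loss.py is not shown in this issue, so both are assumptions made only for illustration.

```python
# Toy model (assumption): loss(s) = sum_k w_k * (s * r_k - 1)^2, where
#   s   = uniform scale factor applied to every household weight
#   r_k = (unscaled aggregate for target k) / (target value for k)
#   w_k = weight of target k in the loss
# Setting d(loss)/ds = 0 gives s* = sum(w_k * r_k) / sum(w_k * r_k^2).


def optimal_scale(ratios: list[float], loss_weights: list[float]) -> float:
    """Closed-form minimiser of the toy weighted squared relative error."""
    numerator = sum(w * r for w, r in zip(loss_weights, ratios))
    denominator = sum(w * r * r for w, r in zip(loss_weights, ratios))
    return numerator / denominator


# Hypothetical ratios: population already matches its target (1.0), while
# three other targets are each 10% under-represented (0.9).
ratios = [1.0, 0.9, 0.9, 0.9]

scale_equal_weight = optimal_scale(ratios, [1.0, 1.0, 1.0, 1.0])
scale_heavy_population = optimal_scale(ratios, [100.0, 1.0, 1.0, 1.0])

print(f"Equal weights:           s = {scale_equal_weight:.4f}")
print(f"Population weight x100:  s = {scale_heavy_population:.4f}")
```

With equal weights the three under-represented targets outvote the single population term and push the scale factor well above 1, inflating total population. Raising the population weight pulls the factor back towards 1.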
Potential Solutions
Increase the weight on population targets in the calibration loss function
Add a hard constraint that total population must match the target
Review conflicting targets that may be inflating population (e.g., regional age bands sum to more than national total)
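Two of the options above can be sketched in a few lines: a post-calibration rescale that forces the weighted population to match the target exactly, and a consistency check on regional targets. The function names and toy numbers are hypothetical and do not reflect the repository's actual code or data.

```python
def rescale_to_population(
    weights: list[float],
    people_per_household: list[int],
    target_population: float,
) -> list[float]:
    """Uniformly rescale household weights so the weighted person count
    equals `target_population` exactly (a simple hard constraint)."""
    total = sum(w * n for w, n in zip(weights, people_per_household))
    if total <= 0:
        raise ValueError("Weighted population must be positive.")
    factor = target_population / total
    return [w * factor for w in weights]


def regional_excess(regional_targets: dict[str, float], national_target: float) -> float:
    """Return how far the regional targets sum above the national target."""
    return sum(regional_targets.values()) - national_target


# Toy data: three households with 1, 3 and 2 people.
weights = [1000.0, 2000.0, 1500.0]
people = [1, 3, 2]
rescaled = rescale_to_population(weights, people, target_population=9400.0)
rescaled_total = sum(w * n for w, n in zip(rescaled, people))

# Toy regional targets (millions) that overshoot a national target of 70.
excess = regional_excess({"north": 30.0, "south": 42.0}, national_target=70.0)

print(f"Rescaled population: {rescaled_total:.0f}")
print(f"Regional targets exceed national target by {excess:.1f}M")
```

A uniform rescale shifts every other calibrated aggregate by the same factor, so it trades accuracy on those targets for an exact population match. A positive excess from the second check would indicate that the targets themselves are inconsistent and cannot all be met at once.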
Impact
This is causing CI test failures in PR #216:
Data Sources