Improve calibration weight initialization with country-aware divisors #262

Merged: MaxGhenis merged 2 commits into main from fix-calibration-initialization on Jan 17, 2026

@MaxGhenis

Summary

  • Fix calibration weight initialization to use country-aware divisors
  • Previously, every household weight was divided by 650 (the total number of constituencies)
  • Now each weight is divided by the number of areas in that household's country (e.g., 59 for Scotland)

Problem

At epoch 0, Scottish targets started at only ~10% of their target values because:

  • Scottish households had their weights divided by 650
  • But only the 59 Scottish constituencies could use those weights
  • Net effect: Scotland started at 59/650 ≈ 9% of target

This caused slow convergence and potentially suboptimal final results for devolved nation targets.
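
The arithmetic behind those starting shares can be reproduced with a short sketch. The Scotland, Wales, and NI constituency counts are the ones quoted in this PR; the England count is assumed here to be the remainder of the 650 total, and `starting_share` is a hypothetical helper, not a function from the codebase.

```python
# Constituency counts per country. Scotland, Wales and NI are quoted in this PR;
# England is ASSUMED to be the remainder of the 650 total.
TOTAL_AREAS = 650
areas_by_country = {"Scotland": 59, "Wales": 40, "Northern Ireland": 18}
areas_by_country["England"] = TOTAL_AREAS - sum(areas_by_country.values())


def starting_share(country: str) -> float:
    """Share of target reached at epoch 0 when every weight is divided by 650.

    Hypothetical helper for illustration only.
    """
    return areas_by_country[country] / TOTAL_AREAS


for country in areas_by_country:
    print(f"{country}: {starting_share(country):.1%}")
```

Under these assumptions the Scottish share comes out at about 9%, in line with the 59/650 figure above.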

Solution

Use the country mask r to compute areas per household:

areas_per_household = r.sum(axis=0)  # areas each household can contribute to
original_weights = np.log(household_weight / areas_per_household + noise)
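
For readers who want to run the idea in isolation, here is a minimal self-contained sketch on toy data. The array names `area_country` and `household_country`, the toy values, and the shape and scale of `noise` are all assumptions made for illustration; only `r`, `areas_per_household`, `household_weight`, and `original_weights` mirror names used in the snippet above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup: 6 areas (4 in country 0, 2 in country 1), 5 households.
area_country = np.array([0, 0, 0, 0, 1, 1])
household_country = np.array([0, 0, 1, 1, 0])
household_weight = np.array([1200.0, 800.0, 500.0, 300.0, 950.0])

# r[i, j] = 1 when household j may contribute to area i (same country), else 0.
r = (area_country[:, None] == household_country[None, :]).astype(float)

# Number of areas each household can contribute to (4 or 2 in this toy case).
areas_per_household = r.sum(axis=0)

# ASSUMPTION: small positive per-(area, household) noise to break symmetry.
noise = rng.random(r.shape) * 0.01

# Country-aware initialization in log space, as in the snippet above.
original_weights = np.log(household_weight / areas_per_household + noise)

# Masked weights at epoch 0: each household's weight, split across its own areas.
initial = np.exp(original_weights) * r
print(initial.sum(axis=0))  # close to household_weight, up to the added noise
```

Summing the masked weights over areas recovers each household's original weight (plus a little noise), which is the property the old divisor of 650 did not have for Scottish, Welsh, and NI households.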

Expected Results

| Target | Before (epoch 0) | After (epoch 0) |
|---|---|---|
| Scotland age 0-9 | 11% of target | ~100% of target |
| Scottish Child Payment | 5% of target | ~100% of target |
| Wales targets | ~6% of target | ~100% of target |
| NI targets | ~3% of target | ~100% of target |

This should result in faster convergence and better final accuracy, especially for Scotland, Wales, and Northern Ireland specific targets.

Test plan

  • CI passes (lint + full calibration test)
  • Compare calibration logs to verify improved epoch 0 values
  • Check final SCP accuracy improves

🤖 Generated with Claude Code

MaxGhenis and others added 2 commits January 17, 2026 07:27
Use country-aware initialization for calibration weights. Previously,
each household's weight was divided by total area count (e.g., 650 for
constituencies), causing Scottish households to start at ~9% of their
target (59/650). Now weights are divided by areas in that household's
country, so all countries start at ~100% of their targets.

This should improve convergence speed and final accuracy for Scotland,
Wales, and Northern Ireland specific targets like Scottish Child Payment.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@MaxGhenis changed the title from "Improve calibration weight initialization for devolved nations" to "Improve calibration weight initialization with country-aware divisors" on Jan 17, 2026
@MaxGhenis

Experiment results

Compared calibration logs from CI runs with 30 epochs.

Overall accuracy (targets within 10% of goal)

| Category | Before | After |
|---|---|---|
| Scotland targets (22) | 0 within 10% (0%) | 13 within 10% (59%) |
| Non-Scotland targets (516) | 267 within 10% (52%) | 335 within 10% (65%) |
| Total (538) | 267 within 10% (50%) | 348 within 10% (65%) |
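
The total row can be cross-checked from the two category rows. This is a small sketch using only the counts reported above; `summarize` is a hypothetical helper.

```python
# (targets within 10%, number of targets) per category, copied from the table.
before = {"Scotland": (0, 22), "Non-Scotland": (267, 516)}
after = {"Scotland": (13, 22), "Non-Scotland": (335, 516)}


def summarize(rows):
    """Return (hits, total, share) pooled across categories."""
    hits = sum(h for h, _ in rows.values())
    total = sum(n for _, n in rows.values())
    return hits, total, hits / total


for label, rows in (("Before", before), ("After", after)):
    hits, total, share = summarize(rows)
    print(f"{label}: {hits}/{total} within 10% ({share:.0%})")
```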

Scottish Child Payment convergence

| Epoch | Before | After |
|---|---|---|
| 0 | £25m (5% of £471m target) | £291m (62%) |
| 10 | £66m (14%) | £465m (99%) |
| 30 | £434m (92%) | £536m (114%) |

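With the new initialization, Scottish Child Payment reaches 99% of target by epoch 10; with the old initialization it was still at 92% after 30 epochs. The 14% overshoot at epoch 30 is expected to resolve over production's 512 epochs.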
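
The percentages in the convergence table follow directly from the £471m target. The sketch below recomputes them and also checks each value against a 10% band; applying the same band as the accuracy table to this single target is an assumption made here for illustration, and both helper names are hypothetical.

```python
TARGET_M = 471  # Scottish Child Payment target in £m, from the table

# epoch -> (before, after) in £m, copied from the table
scp_by_epoch = {0: (25, 291), 10: (66, 465), 30: (434, 536)}


def share_of_target(value: float, target: float = TARGET_M) -> float:
    """Value as a fraction of the target."""
    return value / target


def within_tolerance(value: float, target: float = TARGET_M, tolerance: float = 0.10) -> bool:
    """True when the relative error is at most `tolerance` (ASSUMED 10% band)."""
    return abs(value - target) / target <= tolerance


for epoch, (before, after) in scp_by_epoch.items():
    print(
        f"epoch {epoch}: "
        f"before {share_of_target(before):.0%} (within band: {within_tolerance(before)}), "
        f"after {share_of_target(after):.0%} (within band: {within_tolerance(after)})"
    )
```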

Why it helps

The country mask r restricts which households contribute to which constituencies. Before, all weights were divided by 650 (total constituencies), but:

  • Scottish households can only contribute to 59 Scottish constituencies
  • Welsh households can only contribute to 40 Welsh constituencies
  • NI households can only contribute to 18 NI constituencies

This caused Scotland to start at 59/650 ≈ 9% of target. Now we divide by the actual number of constituencies in each household's country, so all countries start at ~100%.
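
The effect of the divisor can be seen on a toy geography. This is a sketch under stated assumptions, not the repository's code: the two-country layout, the array names, and `epoch0_share` are hypothetical, and each country's target is assumed to equal the sum of its household weights.

```python
import numpy as np

# Hypothetical toy geography: 10 areas, 7 in country 0 and 3 in country 1.
area_country = np.array([0] * 7 + [1] * 3)
household_country = np.array([0, 0, 0, 1, 1])
household_weight = np.array([100.0, 200.0, 300.0, 150.0, 50.0])

# Country mask: r[i, j] = 1 when household j may contribute to area i.
r = (area_country[:, None] == household_country[None, :]).astype(float)


def epoch0_share(divisor):
    """Per-country total at epoch 0 as a share of the summed household weights.

    ASSUMPTION: the country target equals the sum of its household weights.
    """
    per_area = (household_weight / divisor) * r  # shape (areas, households)
    per_household = per_area.sum(axis=0)         # total across allowed areas
    shares = []
    for country in (0, 1):
        in_country = household_country == country
        shares.append(per_household[in_country].sum() / household_weight[in_country].sum())
    return shares


old = epoch0_share(len(area_country))  # one global divisor: about [0.7, 0.3]
new = epoch0_share(r.sum(axis=0))      # country-aware divisor: about [1.0, 1.0]
print(old, new)
```

With a single global divisor each country starts at its share of the total area count (7/10 and 3/10 here, analogous to 59/650 for Scotland); with the per-household divisor both start at roughly 100%.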

@MaxGhenis merged commit be3e06c into main on Jan 17, 2026
3 checks passed
@MaxGhenis deleted the fix-calibration-initialization branch on January 17, 2026 at 16:55