Problem
On the corrected model (#21), the weight optimiser places more mass just above £85k than below it (local step up ≈ +35% across the threshold; visible in turnover_distribution_85k.png and the Section 5 figure). This is economically backwards as a picture of the real world: administrative data (OBR EFO Mar-2023 Chart C; Liu et al. 2021) show excess mass below the threshold and a hole above.
Both the old step-down (produced by the mis-scaled liabilities) and the new step-up are optimiser equilibria, not behaviour; the paper says so, and no costing reads the local shape behaviourally. Shape-sensitivity is small for the headline numbers: the [85k,90k) anchor band is £235.7m (direct) vs £233.5m (smooth counterfactual ignoring local shape), 0.9% apart; [85k,100k) is −698.2 vs −686.1 (1.8% apart). But the figure invites misreading, and within-band shape should not be an optimiser side-effect at all.
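For reference, the two shape-sensitivity percentages can be reproduced with a few lines of arithmetic. This is a minimal sketch that assumes the gap is measured relative to the smooth counterfactual (that denominator reproduces both quoted figures); `relative_gap` is a hypothetical helper, not a function in the repo.

```python
def relative_gap(direct: float, smooth: float) -> float:
    """Percentage gap between the direct figure and the smooth counterfactual.

    Assumption: the smooth counterfactual is the denominator.
    """
    return abs(direct - smooth) / abs(smooth) * 100


# Figures as quoted in the issue text.
anchor_gap = relative_gap(235.7, 233.5)   # [85k,90k) anchor band
wide_gap = relative_gap(-698.2, -686.1)   # [85k,100k)

print(f"{anchor_gap:.1f}% {wide_gap:.1f}%")  # prints "0.9% 1.8%"
```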
Proposed fix: calibrate the near-threshold region to published £1k-band counts
The OBR March-2023 EFO Chart C underlying data (HMRC £1,000-band counts of businesses, £65k–£90k, outturn years + projections) is already in this repo (data/processed/obr_vat_bunching.csv, plotted by scripts/plot_obr_bunching.py). Add these fine near-threshold band counts as calibration targets (2023-24-appropriate series; rescale to the coarse-band totals to avoid double counting), so:
- the synthetic file reproduces the administratively observed bunching shape as an explicit, cited target — consistent with the paper's transparency framework (target-inherited shape, no behavioural claims, placebo still applies);
- the [85k,90k) mass — the anchor band — is disciplined by real data rather than an optimizer equilibrium;
- the Section 5 story sharpens: the estimator applied to the calibrated file recovers the target-inherited step, and the placebo (remove the OBR targets) collapses it, exactly the pattern the paper documents.
Universe caveat to handle explicitly: the OBR/HMRC chart counts are VAT-population based (registered traders incl. voluntary), narrower than the ONS registered-business frame below the threshold; use the series as a within-band shape target (relative £1k-band densities), not as level targets.
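The rescaling step could look roughly like the sketch below: the £1k-band counts supply only relative shape within each coarse band, and the existing coarse-band totals supply the level, so the fine targets sum to the coarse totals and nothing is double counted. The function name, the dictionary layout, the coarse band edges and all numbers here are illustrative assumptions; the real column layout of data/processed/obr_vat_bunching.csv and the model's actual coarse bands are not assumed.

```python
def shape_targets(fine_counts, coarse_totals, band_width=1_000):
    """Rescale £1k-band counts so they sum to the model's coarse-band totals.

    fine_counts:   {lower_edge_of_1k_band: count from the OBR/HMRC series}
    coarse_totals: {(lo, hi): existing calibration total for [lo, hi)}
    Returns {lower_edge: target}. Only the relative shape of fine_counts
    is used; levels come entirely from coarse_totals.
    """
    targets = {}
    for (lo, hi), total in coarse_totals.items():
        edges = [e for e in range(lo, hi, band_width) if e in fine_counts]
        if len(edges) != (hi - lo) // band_width:
            continue  # fine series does not fully cover this band; leave it alone
        mass = sum(fine_counts[e] for e in edges)
        if mass == 0:
            continue  # no shape information to distribute
        for e in edges:
            targets[e] = total * fine_counts[e] / mass
    return targets


# Placeholder numbers for illustration only (not OBR or HMRC data).
fine = {80_000: 10, 81_000: 11, 82_000: 12, 83_000: 14, 84_000: 23,
        85_000: 8, 86_000: 7, 87_000: 7, 88_000: 7, 89_000: 6}
coarse = {(80_000, 85_000): 700.0, (85_000, 90_000): 350.0}

targets = shape_targets(fine, coarse)
```

Because each coarse band is normalised separately, the difference in universe between the VAT population and the ONS frame affects only the within-band shares, not the band levels.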
Alternative (weaker)
If the OBR-target route stalls: impose a smooth monotone within-band density prior near band edges (penalise weight-density curvature within ±£10k of any calibration band edge), so within-band shape is a stated modelling assumption rather than an optimizer artifact. Doesn't reproduce real bunching, but removes the misleading spike.
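One way the curvature penalty might be written is as a sum of squared second differences of the binned weight density, restricted to bins within the window around a calibration band edge. This is a sketch under assumptions: the function name, the £1k binning and the `strength` parameter are hypothetical, and it addresses smoothness only, so the monotonicity part of the prior would need a separate term.

```python
def curvature_penalty(density, bin_lower_edges, band_edges,
                      window=10_000, strength=1.0):
    """Sum of squared second differences of the weight density near band edges.

    density:         weighted mass per bin, ordered by turnover
    bin_lower_edges: lower turnover edge of each bin (same length as density)
    band_edges:      calibration band edges, e.g. [85_000]
    Only interior bins whose lower edge lies within `window` of an edge count.
    """
    penalty = 0.0
    for i in range(1, len(density) - 1):
        near_edge = any(abs(bin_lower_edges[i] - edge) <= window
                        for edge in band_edges)
        if near_edge:
            second_diff = density[i - 1] - 2 * density[i] + density[i + 1]
            penalty += second_diff ** 2
    return strength * penalty


# Illustrative densities on £1k bins from £83k to £87k (made-up values).
bins = [83_000, 84_000, 85_000, 86_000, 87_000]
smooth = [10.0, 9.0, 8.0, 7.0, 6.0]    # linear decline: zero curvature
spiked = [10.0, 9.0, 12.0, 7.0, 6.0]   # step up just above the threshold

print(curvature_penalty(smooth, bins, [85_000]))  # 0.0
print(curvature_penalty(spiked, bins, [85_000]))  # positive
```

Added to the optimiser's loss, a term like this makes any sharp within-band step costly unless a calibration target demands it.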
Refs: #15, #21 (paper rewrite), the two-vintage spurious-signal exhibit in Section 5.
🤖 Generated with Claude Code