In the subnational setting addressed here, the constraint vector $\mathbf{T}$ contains targets at multiple geographic levels---congressional districts, states, and the nation---requiring that district totals sum to state totals, which sum to national totals. This hierarchical structure produces $m \approx37{,}800$ simultaneous constraints with near-collinearity across levels, placing the problem beyond the regime where classical closed-form calibration methods operate reliably.
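The dependence between geographic levels can be illustrated with a small Python sketch. It assumes a single count variable and a hypothetical toy geography (six households, four districts, two states); none of the names or numbers come from the paper's pipeline. In this idealized case the state and national rows are exact sums of district rows, so stacking all three levels adds constraints without adding rank.

```python
import numpy as np

# Hypothetical toy geography (illustrative only, not the paper's data).
district_of_household = np.array([0, 0, 1, 2, 3, 3])
state_of_district = np.array([0, 0, 1, 1])
n_districts, n_states = 4, 2

# One row per constraint, one column per household;
# an entry of 1 means the household counts toward that constraint.
district_rows = (
    np.arange(n_districts)[:, None] == district_of_household[None, :]
).astype(float)
state_of_household = state_of_district[district_of_household]
state_rows = (
    np.arange(n_states)[:, None] == state_of_household[None, :]
).astype(float)
national_row = np.ones((1, district_of_household.size))

# Stack district, state, and national constraints into one matrix.
constraints = np.vstack([district_rows, state_rows, national_row])
rank = np.linalg.matrix_rank(constraints)

print(constraints.shape)  # 7 constraints over 6 households
print(rank)               # rank equals the number of districts
```

Any closed-form method that inverts a matrix built from such stacked rows has to cope with this redundancy.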
+
Subnational microsimulation also sits within the broader spatial microsimulation literature. Reviews of static spatial microsimulation often distinguish between \emph{reweighting} methods, which begin from survey microdata and adjust weights to match small-area constraints, and \emph{synthetic reconstruction} methods, which construct new small-area populations from aggregate tables \citep{tanton2014review, odonoghue2014review}. This distinction matters for the present paper. Our method remains, at core, a calibration-weighting approach over observed CPS households, so GREG and IPF are the closest classical benchmarks. At the same time, by cloning households, assigning them to new geographies, and assembling area-specific output files, the pipeline also produces derived spatial microdata for small-area policy analysis, making it relevant to some synthetic-population use cases even though the empirical comparison in this paper focuses on calibration methods.
The GREG estimator minimizes the chi-squared distance $\sum_i (w_i - d_i)^2 / d_i$ subject to Equation~\ref{eq:calibration_constraint}, yielding the closed-form solution:
Iterative proportional fitting \citep[IPF;][]{deming1940, ireland1968} adjusts cell counts in a contingency table to match given marginal totals. The algorithm cycles through dimensions, scaling each dimension's cells so that their marginal matches the target, then repeating until convergence. IPF converges to the maximum entropy solution subject to the marginal constraints \citep{ireland1968}.
+
Iterative proportional fitting \citep[IPF;][]{deming1940, ireland1968} adjusts cell counts in a contingency table to match given marginal totals. The algorithm cycles through dimensions, scaling each dimension's cells so that their marginal matches the target, then repeating until convergence. IPF converges to the maximum entropy solution subject to the marginal constraints \citep{ireland1968}. In the spatial microsimulation literature, it appears in both synthetic reconstruction and reweighting forms; the discussion here concerns the reweighting form that starts from survey microdata and updates weights rather than building a synthetic joint distribution from scratch \citep{tanton2014review}.
In the microsimulation context, IPF adjusts household weights to match cross-classified population counts---for example, persons by age group within each congressional district. IPF has several practical advantages: it preserves non-negativity by construction (weights are scaled multiplicatively, so positive weights remain positive), it requires no matrix inversion, and it scales well to high-dimensional contingency tables. EUROMOD, the EU-wide tax-benefit microsimulation model, uses IPF-based calibration to reweight national surveys to demographic benchmarks across member states.
However, IPF has three limitations relevant to subnational calibration. First, IPF does not naturally enforce hierarchical consistency: district-level totals produced by IPF do not automatically sum to the correct state totals, requiring post-hoc reconciliation that may introduce new inconsistencies. Second, IPF handles only count targets organized as contingency table margins; incorporating continuous-valued targets (e.g., aggregate income or benefit spending) requires auxiliary procedures outside the IPF framework. Third, IPF scales cell counts multiplicatively, which can produce extreme weights when initial cells are small and leaves empty cells at zero, and convergence slows or fails when marginal constraints are mutually inconsistent.
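As a rough illustration of the reweighting form of IPF, the following Python sketch rakes household weights to two margins. The function name `rake_weights`, the toy households, and the target totals are hypothetical and are not taken from the paper's pipeline. The sketch assumes every category has a positive weighted total, since a category with no weight cannot be rescaled.

```python
import numpy as np

def rake_weights(weights, indicators, targets, n_iter=100, tol=1e-10):
    """Rake household weights so weighted category totals match targets.

    indicators: list of (n_households, n_categories) 0/1 arrays, one per
        margin; each household falls in exactly one category per margin.
    targets: list of 1-D arrays of target totals, one per margin.
    """
    w = np.asarray(weights, dtype=float).copy()
    for _ in range(n_iter):
        max_gap = 0.0
        for A, t in zip(indicators, targets):
            current = A.T @ w        # weighted total in each category
            factor = t / current     # multiplicative adjustment per category
            w = w * (A @ factor)     # each household takes its category's factor
            max_gap = max(max_gap, float(np.max(np.abs(factor - 1.0))))
        if max_gap < tol:            # all margins already matched
            break
    return w

# Toy data: four households crossed by age group and district.
age = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)       # young, old
district = np.array([[1, 0], [0, 1], [1, 0], [0, 1]], dtype=float)  # A, B
w = rake_weights(
    np.ones(4),
    [age, district],
    [np.array([30.0, 70.0]), np.array([40.0, 60.0])],
)
print(w)
```

Because every update is a multiplication by a positive factor, positive starting weights stay positive, which is the non-negativity property noted above. Nothing in the loop ties district totals to state totals, and a household with zero starting weight would remain at zero.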
\subsection{Spatial microsimulation}
-
Spatial microsimulation constructs small-area populations by selecting or reweighting survey records to match local-area census constraints. \citet{williamson1998} introduced a combinatorial optimization approach that selects a subset of survey records for each small area using simulated annealing to minimize the difference between weighted survey totals and census benchmarks. \citet{huang2001} extended this with a deterministic algorithm based on systematic record selection. \citet{tanton2011} applied generalized regression reweighting to create small-area estimates of poverty and housing stress in Australia.
+
Spatial microsimulation constructs small-area populations either by reweighting existing microdata or by synthesizing new unit-record populations from aggregate constraints \citep{tanton2014review, odonoghue2014review}. \citet{williamson1998} introduced a combinatorial optimization approach that selects a subset of survey records for each small area using simulated annealing to minimize the difference between weighted survey totals and census benchmarks. \citet{huang2001} extended this with a deterministic algorithm based on systematic record selection. \citet{tanton2011} applied generalized regression reweighting to create small-area estimates of poverty and housing stress in Australia.
\citet{harland2012} developed methods for creating realistic synthetic populations at fine geographic scales using iterative proportional fitting combined with Monte Carlo sampling. \citet{lovelace2016} provided an accessible implementation in R with the \texttt{spatial-microsim-book} framework.
-
These methods typically operate at a single geographic level---producing estimates for each small area independently. Joint calibration across multiple geographic levels (district, state, national) with a single set of weights is uncommon in the spatial microsimulation literature, as it requires simultaneously satisfying tens of thousands of constraints that span different administrative geographies. Other operational models avoid the problem entirely: TAXSIM (NBER) operates at the national level without geographic calibration, while state-level models maintained by individual state revenue departments calibrate only within their own jurisdiction.
+
Within this literature, combinatorial optimization and especially simulated annealing occupy an important place as methods for generating synthetic spatial microdata from observed survey records \citep{tanton2014review, odonoghue2014review}. Their main advantage is that they work directly with real microdata, can be flexible about household structure, and can accommodate settings where the unit of analysis in the constraints and the microdata do not align neatly. Their main disadvantage is computational intensity: they are usually run area by area and search a large discrete space of candidate record combinations \citep{harland2012, odonoghue2014review}.
+
+
These methods typically operate at a single geographic level---producing estimates for each small area independently. Joint calibration across multiple geographic levels (district, state, national) with a single set of weights is uncommon in the spatial microsimulation literature, as it requires simultaneously satisfying tens of thousands of constraints that span different administrative geographies. Other operational models avoid the problem entirely: TAXSIM (NBER) operates at the national level without geographic calibration, while state-level models maintained by individual state revenue departments calibrate only within their own jurisdiction. For this reason, simulated annealing is an important reference point in the broader spatial microsimulation literature, but it is not the closest like-for-like empirical comparator to our setting. The benchmark design in this paper therefore focuses on GREG and IPF as the classical calibration baselines most closely aligned with a shared weighted-microdata formulation.
\subsection{$L_0$ regularization and the Hard Concrete distribution}
This trade-off does not exist in classical calibration methods. IPF and GREG produce a single set of weights without sparsity control. To reduce dataset size, researchers must discard records post hoc or apply ad hoc thresholding---neither of which jointly optimizes accuracy and sparsity. The Hard Concrete gate provides a principled mechanism for this joint optimization, with $\lambda_{L_0}$ serving as the researcher's preference parameter over the Pareto frontier.
+
This feature also broadens the method's practical role. The local preset yields area-specific unit-record files that resemble synthetic spatial microdata, while remaining anchored in observed survey households, model-based enhancements, and calibrated administrative totals. The paper's empirical evaluation remains focused on calibration baselines, but the resulting datasets are relevant to some of the same downstream use cases as synthetic-population workflows.
+
\subsection{Computational cost}
The pipeline runs on Modal, a cloud compute platform, using T4 GPUs for the optimization step. Stage 1 (clone creation and imputation) requires approximately 2--3 hours of CPU time. Stage 2 (matrix construction) requires approximately 2--3 hours across parallel workers, dominated by running \policyengine{} simulations for each of the 51 state-level configurations. Stage 3 (optimization) requires approximately 30--60 minutes of GPU time for the national preset (4,000 epochs) and 5--15 minutes for the local preset (1,000 epochs). Stage 4 (H5 assembly) requires approximately 4--5 hours across parallel workers for all 488 H5 builds (436 congressional districts, 50 states plus DC, New York City, and one national build).
paper-l0/sections/introduction.tex
@@ -7,11 +7,11 @@ \section{Introduction}
Existing calibration methods scale poorly to this setting. Iterative proportional fitting \citep[IPF;][]{deming1940, ireland1968} adjusts weights along one dimension at a time, cycling through marginal constraints until convergence. IPF handles cross-classified tables but does not naturally accommodate hierarchical geographic constraints---district targets must sum to state targets, which must sum to national targets---without ad hoc post-processing. Generalized regression (GREG) estimators \citep{deville1992, sarndal2007} solve a constrained optimization problem that minimizes distance from initial weights subject to exact calibration constraints. GREG produces a closed-form solution for moderate numbers of constraints but becomes computationally intractable and numerically unstable as the constraint count approaches the tens of thousands.
-
Spatial microsimulation methods take a different approach, constructing synthetic populations for small areas by combinatorial optimization \citep{williamson1998, huang2001}, simulated annealing \citep{harland2012}, or deterministic reweighting \citep{tanton2011, lovelace2016}. These methods typically operate at a single geographic level and require separate calibration runs for each area, making joint multi-level calibration difficult.
+
Spatial microsimulation takes a different approach. The literature often distinguishes between reweighting methods and synthetic reconstruction methods for constructing small-area microdata \citep{tanton2014review}. Within this broader literature, researchers have used combinatorial optimization and simulated annealing \citep{williamson1998, huang2001, harland2012} as well as deterministic reweighting \citep{tanton2011, lovelace2016}. These methods typically operate at a single geographic level and require separate calibration runs for each area, making joint multi-level calibration difficult.
This paper presents a method that addresses these limitations by jointly optimizing weight magnitudes and sparsity in a single gradient-based framework. We adapt the Hard Concrete distribution \citep{louizos2018}, originally developed for neural network pruning, to the survey calibration setting. Each household-geography combination receives a continuous weight and a stochastic binary gate. The gate is parameterized by a learnable logit and trained via gradient descent to minimize a loss function that combines relative calibration error across all 37,800 targets with an $L_0$ penalty on the expected number of active records. At inference time, the stochastic gates collapse to deterministic zeros and ones, producing a sparse dataset in which most household-geography combinations are dropped while the retained records carry calibrated positive weights.
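The gating mechanism described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the stretch limits and temperature are the defaults suggested by \citet{louizos2018}, the calibration term is taken to be mean squared relative error, weights are parameterized as exponentials of free parameters, and all function names are hypothetical.

```python
import numpy as np

# Assumed Hard Concrete constants (defaults from Louizos et al.):
# stretch interval (GAMMA, ZETA) and temperature BETA.
GAMMA, ZETA, BETA = -0.1, 1.1, 2.0 / 3.0

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample_gate(log_alpha, rng):
    """Training-time gate: stretched, clipped binary-concrete sample in [0, 1]."""
    u = rng.uniform(1e-6, 1.0 - 1e-6, size=log_alpha.shape)
    s = sigmoid((np.log(u) - np.log(1.0 - u) + log_alpha) / BETA)
    return np.clip(s * (ZETA - GAMMA) + GAMMA, 0.0, 1.0)

def deterministic_gate(log_alpha):
    """Inference-time gate: no noise, so strongly signed logits give exact 0 or 1."""
    return np.clip(sigmoid(log_alpha) * (ZETA - GAMMA) + GAMMA, 0.0, 1.0)

def expected_active(log_alpha):
    """Probability that each gate is non-zero; its sum is the expected L0 norm."""
    return sigmoid(log_alpha - BETA * np.log(-GAMMA / ZETA))

def loss(log_w, log_alpha, M, targets, lam, rng):
    """Assumed loss form: mean squared relative error plus the L0 penalty.

    M has one row per target and one column per household-geography record;
    targets are assumed to be non-zero.
    """
    w = np.exp(log_w) * sample_gate(log_alpha, rng)  # gated positive weights
    rel_err = (M @ w - targets) / targets
    return np.mean(rel_err ** 2) + lam * np.sum(expected_active(log_alpha))

# Gates with strongly negative or positive logits collapse to 0 or 1.
print(deterministic_gate(np.array([-10.0, 10.0])))
```

In practice the logits and weights would be trained with an automatic-differentiation framework; NumPy is used here only to keep the sketch self-contained.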
-
The approach builds on \citet{woodruff2024}, who developed a two-stage methodology for constructing enhanced national microsimulation datasets from the Current Population Survey (CPS) and the IRS Public Use File (PUF). Their method uses quantile regression forests (QRF) to impute 72 tax variables from the PUF onto CPS records, then applies dropout-regularized gradient descent to reweight the combined dataset against approximately 7,000 national targets. The present paper extends this framework from a single national dataset to subnational coverage by introducing three new components: (a) a clone-and-assign procedure that replicates each CPS household across multiple geographic locations, (b) $L_0$ Hard Concrete gates that replace dropout regularization and enable exact sparsity, and (c) a hierarchical uprating scheme that reconciles targets from different administrative sources at district, state, and national levels.
+
The approach builds on \citet{woodruff2024}, who developed a two-stage methodology for constructing enhanced national microsimulation datasets from the Current Population Survey (CPS) and the IRS Public Use File (PUF). Their method uses quantile regression forests (QRF) to impute 72 tax variables from the PUF onto CPS records, then applies dropout-regularized gradient descent to reweight the combined dataset against approximately 7,000 national targets. The present paper extends this framework from a single national dataset to subnational coverage by introducing three new components: (a) a clone-and-assign procedure that replicates each CPS household across multiple geographic locations, (b) $L_0$ Hard Concrete gates that replace dropout regularization and enable exact sparsity, and (c) a hierarchical uprating scheme that reconciles targets from different administrative sources at district, state, and national levels. Because the method still solves a calibration-weighting problem over survey-based microdata, GREG and IPF are the closest classical empirical comparators; at the same time, the clone-and-assign pipeline produces derived spatial microdata files that are also relevant to synthetic-population use cases.
The configurable sparsity penalty produces datasets of different sizes for different use cases. A high penalty ($\lambda_{L_0} = 10^{-4}$) retains approximately 50,000 records, suitable for national-level web-based simulation where download size and computation time matter. A low penalty ($\lambda_{L_0} = 10^{-8}$) retains approximately 3--4 million records, preserving geographic resolution for all 436 congressional districts.