Commit 4ab5fef

Reorganization and feedback integration -- second draft
1 parent c5fe83f commit 4ab5fef

16 files changed

Lines changed: 700 additions & 373 deletions

paper-l0/bibliography/references.bib

Lines changed: 10 additions & 0 deletions
@@ -115,6 +115,16 @@ @article{anderson2013
115115
year = {2013}
116116
}
117117

118+
@article{imai2014,
119+
title = {Covariate Balancing Propensity Score},
120+
author = {Imai, Kosuke and Ratkovic, Marc},
121+
journal = {Journal of the Royal Statistical Society: Series B},
122+
volume = {76},
123+
number = {1},
124+
pages = {243--263},
125+
year = {2014}
126+
}
127+
118128
@article{tanton2014review,
119129
title = {A Review of Spatial Microsimulation Methods},
120130
author = {Tanton, Robert},

paper-l0/main.pdf

-21.7 KB
Binary file not shown.

paper-l0/sections/abstract.tex

Lines changed: 17 additions & 5 deletions
@@ -1,9 +1,21 @@
11
\begin{abstract}
2-
Tax-benefit microsimulation models typically operate at the national level, using household survey weights calibrated to aggregate population targets. Subnational analysis---at the level of states, congressional districts, or local authorities---requires datasets that simultaneously satisfy geographic distributional constraints while preserving household-level detail. We present a method based on $L_0$ regularization that jointly optimizes survey weight magnitudes and sparsity to produce calibrated subnational microsimulation datasets.
2+
Subnational microsimulation requires survey microdata that reproduce administrative totals across
3+
nested geographies while remaining usable in a policy model. In the United States, that means
4+
calibrating mixed count and dollar targets for district-level units, states, and the nation from a
5+
single microdata pipeline. Classical calibration methods provide important reference points, but
6+
they do not naturally cover the full production problem: generalized regression (GREG) can
7+
produce negative weights and becomes difficult to use in very large, collinear systems, while
8+
iterative proportional fitting (IPF, or raking) is most natural for count-style margins.
39

4-
Our approach builds on the Hard Concrete distribution \citep{louizos2018}, which induces exact sparsity by multiplying each household's weight by a learned stochastic gate that collapses to a deterministic zero or one at inference time. We parameterize each gate with a log-alpha and temperature parameter, and jointly optimize these alongside log-transformed weight magnitudes using a single loss function combining scale-invariant relative calibration error, an $L_0$ sparsity penalty on the expected count of active households, and a light $L_2$ regularizer on weight magnitudes.
5-
6-
The pipeline begins with the US Current Population Survey. Each household record is cloned multiple times and assigned to random census blocks drawn from a population-weighted distribution. Program participation indicators are re-randomized per geographic assignment using local take-up rates. Each clone is then run through \policyengine{}'s tax-benefit microsimulation engine to generate geography-specific outputs. The $L_0$ optimizer selects which household-geography combinations to retain, calibrating simultaneously against approximately 37,800 targets across three geographic levels. The sparsity penalty is configurable: a higher penalty produces a compact national dataset of approximately 50,000 records, while a lower penalty yields a larger dataset of approximately 3--4 million records covering all 436 congressional districts and 50 states individually. The method is implemented as the open-source \texttt{l0-python} PyTorch package.
10+
We present an $L_0$-regularized calibration pipeline built in PolicyEngine's US data workflow. The
11+
pipeline clones CPS households across sampled geographies, constructs a sparse calibration matrix
12+
from tax-benefit simulations, and jointly optimizes positive weights and Hard Concrete gates. The
13+
gates make sparsity explicit, so the same framework can support compact national datasets and
14+
larger subnational datasets. The empirical sections benchmark $L_0$ against GREG and IPF on shared
15+
exported calibration packages, moving from a tractable comparison tier to a scaling frontier and a
16+
production-feasibility case. The benchmark reports calibration error, runtime, memory use, failure
17+
modes, and retained-record counts.
718
\end{abstract}
819

9-
\noindent\textbf{Keywords:} microsimulation, survey calibration, $L_0$ regularization, subnational analysis
20+
\noindent\textbf{Keywords:} microsimulation, survey calibration, $L_0$ regularization, subnational
21+
analysis
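The gating mechanism named in the abstract can be illustrated with a minimal, standard-library sketch of a Hard Concrete gate in the style of \citep{louizos2018}. It is not the `l0-python` implementation: the function names, the stretch limits `GAMMA` and `ZETA`, and the temperature value are assumptions chosen for illustration.

```python
import math
import random

# Stretch interval for the Hard Concrete distribution (assumed defaults).
GAMMA, ZETA = -0.1, 1.1


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def sample_gate(log_alpha, beta, rng):
    """Draw one stochastic gate in [0, 1] (training-time behaviour)."""
    u = rng.uniform(1e-6, 1.0 - 1e-6)
    s = sigmoid((math.log(u) - math.log(1.0 - u) + log_alpha) / beta)
    stretched = s * (ZETA - GAMMA) + GAMMA
    return min(1.0, max(0.0, stretched))


def deterministic_gate(log_alpha):
    """Gate used at inference time: no noise, clipped to [0, 1]."""
    stretched = sigmoid(log_alpha) * (ZETA - GAMMA) + GAMMA
    return min(1.0, max(0.0, stretched))


def prob_active(log_alpha, beta):
    """Probability that the gate is non-zero; summing this over
    records gives the expected-L0 penalty term."""
    return sigmoid(log_alpha - beta * math.log(-GAMMA / ZETA))


def gated_total(weights, log_alphas, values):
    """Weighted total of one target variable after gating."""
    return sum(
        w * deterministic_gate(a) * x
        for w, a, x in zip(weights, log_alphas, values)
    )
```

A record whose `log_alpha` is strongly negative receives a gate of exactly zero and drops out of the dataset, while a strongly positive value leaves its positive weight untouched. Raising the penalty on the sum of `prob_active` pushes more records to zero, which is how a single framework can yield either a compact or a large dataset.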

paper-l0/sections/appendix.tex

Lines changed: 50 additions & 25 deletions
@@ -3,7 +3,8 @@
33
\section{Optimization hyperparameters}
44
\label{app:hyperparameters}
55

6-
Table~\ref{tab:hyperparameters_full} lists all hyperparameters used in the $L_0$ optimization with their default values, code-level names, and roles.
6+
Table~\ref{tab:hyperparameters_full} lists the default hyperparameters used in the $L_0$
7+
optimization, together with their code-level names and roles.
78

89
\begin{table}[ht]
910
\centering
@@ -35,9 +36,12 @@ \section{Optimization hyperparameters}
3536
\section{Calibration target variables}
3637
\label{app:targets}
3738

38-
The calibration pipeline draws targets from seven administrative sources across three geographic levels. The database contains 37,758 active targets in total. The following tables list every target domain included in the \texttt{policy\_data.db} database, grouped by geographic level.
39+
The calibration pipeline draws targets from seven administrative sources across three geographic
40+
levels. The pipeline stores these targets in a target database, \texttt{policy\_data.db}, which
41+
contains 37,758 active targets in total. The following tables list every target domain included in
42+
that database, grouped by geographic level.
3943

40-
\subsection{Congressional district targets (33,572)}
44+
\subsection{District-level targets (33,572)}
4145

4246
\begin{table}[H]
4347
\centering
@@ -47,24 +51,26 @@ \subsection{Congressional district targets (33,572)}
4751
Target domain & Type & Count \\
4852
\midrule
4953
\multicolumn{3}{l}{\textit{Census ACS S0101}} \\
50-
Person count by age band (18 bands $\times$ 436 CDs) & count & 7{,}848 \\
54+
Person count by age band (18 bands $\times$ 436 district-level units) & count & 7{,}848 \\
5155
\midrule
5256
\multicolumn{3}{l}{\textit{IRS SOI}} \\
53-
Person count by AGI bracket (9 bins $\times$ 436 CDs) & count & 3{,}924 \\
54-
EITC dollars by qualifying children (4 bins $\times$ 436 CDs) & \$ & 1{,}744 \\
55-
Tax unit count by qualifying children (4 bins $\times$ 436 CDs) & count & 1{,}744 \\
56-
Aggregate AGI (unconditional, $\times$ 436 CDs) & \$ & 436 \\
57-
21 income/deduction dollar totals, each with domain $> 0$ ($\times$ 436 CDs) & \$ & 9{,}156 \\
58-
Tax unit count for each of the 21 domains ($\times$ 436 CDs) & count & 9{,}156 \\
57+
Person count by AGI bracket (9 bins $\times$ 436 district-level units) & count & 3{,}924 \\
58+
EITC dollars by qualifying children (4 bins $\times$ 436 district-level units) & \$ & 1{,}744 \\
59+
Tax unit count by qualifying children (4 bins $\times$ 436 district-level units) & count & 1{,}744 \\
60+
Aggregate AGI (unconditional, $\times$ 436 district-level units) & \$ & 436 \\
61+
20 income/deduction dollar totals, each with domain $> 0$ ($\times$ 436 district-level units) & \$ & 8{,}720 \\
62+
Tax unit count for each of the 20 domains ($\times$ 436 district-level units) & count & 8{,}720 \\
5963
\midrule
6064
\multicolumn{3}{l}{\textit{Census ACS S2201}} \\
61-
SNAP household count ($\times$ 436 CDs) & count & 436 \\
65+
SNAP household count ($\times$ 436 district-level units) & count & 436 \\
6266
\midrule
6367
& & \textbf{33{,}572} \\
6468
\bottomrule
6569
\end{tabular}
6670
}
67-
\caption{Congressional district calibration targets (436 CDs). Each row is replicated across all 436 districts. IRS SOI provides paired dollar and count targets for each income/deduction domain.}
71+
\caption{District-level calibration targets. The 436 district-level units correspond to the 435
72+
congressional districts plus the District of Columbia. IRS SOI provides paired dollar and count
73+
targets for each income and deduction domain.}
6874
\label{tab:cd_targets}
6975
\end{table}
7076

@@ -78,31 +84,33 @@ \subsection{State targets (4,080)}
7884
Target domain & Type & Count \\
7985
\midrule
8086
\multicolumn{3}{l}{\textit{Census ACS S0101}} \\
81-
Person count by age band (18 bands $\times$ 51 states) & count & 918 \\
87+
Person count by age band (18 bands $\times$ 50 states + DC) & count & 918 \\
8288
\midrule
8389
\multicolumn{3}{l}{\textit{IRS SOI}} \\
84-
Person count by AGI bracket (9 bins $\times$ 51 states) & count & 459 \\
85-
EITC dollars by qualifying children (4 bins $\times$ 51) & \$ & 204 \\
86-
Tax unit count by qualifying children (4 bins $\times$ 51) & count & 204 \\
87-
Aggregate AGI (unconditional, $\times$ 51 states) & \$ & 51 \\
88-
20 income/deduction dollar totals (domain $> 0$, $\times$ 51) & \$ & 1{,}020 \\
89-
Tax unit count for each of the 21 domains ($\times$ 51) & count & 1{,}071 \\
90+
Person count by AGI bracket (9 bins $\times$ 50 states + DC) & count & 459 \\
91+
EITC dollars by qualifying children (4 bins $\times$ 50 states + DC) & \$ & 204 \\
92+
Tax unit count by qualifying children (4 bins $\times$ 50 states + DC) & count & 204 \\
93+
Aggregate AGI (unconditional, $\times$ 50 states + DC) & \$ & 51 \\
94+
20 income/deduction dollar totals (domain $> 0$, $\times$ 50 states + DC) & \$ & 1{,}020 \\
95+
Tax unit count for each of the 20 domains ($\times$ 50 states + DC) & count & 1{,}020 \\
9096
\midrule
9197
\multicolumn{3}{l}{\textit{USDA FNS SNAP}} \\
92-
SNAP spending ($\times$ 51 states) & \$ & 51 \\
93-
SNAP household count ($\times$ 51 states) & count & 51 \\
98+
SNAP spending ($\times$ 50 states + DC) & \$ & 51 \\
99+
SNAP household count ($\times$ 50 states + DC) & count & 51 \\
94100
\midrule
95101
\multicolumn{3}{l}{\textit{CMS Medicaid}} \\
96-
Medicaid enrollment ($\times$ 51 states) & count & 51 \\
102+
Medicaid enrollment ($\times$ 50 states + DC) & count & 51 \\
97103
\midrule
98104
\multicolumn{3}{l}{\textit{Census STC}} \\
99-
State income tax collections ($\times$ 51 states) & \$ & 51 \\
105+
State income tax collections ($\times$ 50 states + DC) & \$ & 51 \\
100106
\midrule
101107
& & \textbf{4{,}080} \\
102108
\bottomrule
103109
\end{tabular}
104110
}
105-
\caption{State-level calibration targets (50 states + DC). IRS SOI variables mirror the district structure. USDA provides both SNAP spending and household counts; CMS provides Medicaid enrollment.}
111+
\caption{State-level calibration targets for the 50 states plus the District of Columbia. IRS SOI
112+
variables mirror the district-level structure. USDA provides both SNAP spending and household
113+
counts, and CMS provides Medicaid enrollment.}
106114
\label{tab:state_targets}
107115
\end{table}
108116

@@ -150,7 +158,9 @@ \subsection{National targets (106)}
150158
\bottomrule
151159
\end{tabular}
152160
}
153-
\caption{National-level calibration targets. CBO, JCT, SSA, CMS, and Census values are curated from the cited administrative sources and stored in the ETL pipeline. Dollar values are inflation-adjusted to the calibration year.}
161+
\caption{National-level calibration targets. CBO, JCT, SSA, CMS, and Census values are curated from
162+
the cited administrative sources and stored in the ETL pipeline. Dollar values are
163+
inflation-adjusted to the calibration year.}
154164
\label{tab:national_targets}
155165
\end{table}
156166

@@ -197,3 +207,18 @@ \section{Algorithm pseudocode}
197207
\State \Return $\hat{\mathbf{w}}$
198208
\end{algorithmic}
199209
\end{algorithm}
210+
211+
\section{Dataset assembly details}
212+
\label{app:assembly}
213+
214+
After optimization, the fitted weight vector is reshaped back to clone-by-household form and used
215+
to build deployable H5 datasets. The assembly step keeps only active cloned units, reconstructs the
216+
associated person and tax-unit memberships, derives geography from the stored census block
217+
assignment, and recomputes geography-dependent quantities such as Supplemental Poverty Measure
218+
threshold adjustments.
219+
220+
Two implementation details matter for fidelity. First, geography is derived from the same cloned
221+
block assignments that were used to build the calibration matrix, so the assembled dataset matches
222+
the unit universe that the optimizer saw. Second, take-up draws are regenerated with the same
223+
deterministic seeds used during matrix construction, which keeps take-up-dependent targets
224+
consistent between the calibration package and the published dataset.
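
The assembly steps described above can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not the pipeline's code: the clone-major layout of the flat weight vector, the function names, and the string-based seed scheme are all hypothetical.

```python
import random


def reshape_weights(flat, n_clones, n_households):
    """Reshape a flat weight vector to clone-by-household form.

    Assumes clone-major order: all households for clone 0 first,
    then clone 1, and so on (an assumption for this sketch).
    """
    if len(flat) != n_clones * n_households:
        raise ValueError("weight vector length does not match the grid")
    return [
        flat[c * n_households:(c + 1) * n_households]
        for c in range(n_clones)
    ]


def active_units(weights):
    """Keep only clone-household pairs with a strictly positive weight."""
    return [
        (clone, household, w)
        for clone, row in enumerate(weights)
        for household, w in enumerate(row)
        if w > 0
    ]


def takeup_draw(base_seed, clone, household, rate):
    """Deterministic take-up draw for one cloned unit.

    Seeding from (base_seed, clone, household) means the draw made
    during matrix construction can be regenerated at assembly time.
    """
    rng = random.Random(f"{base_seed}:{clone}:{household}")
    return rng.random() < rate
```

Because each draw depends only on the seed and the unit's indices, dropping inactive units before assembly does not change the take-up status of the units that remain.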
