Commit d3eccbb

MaxGhenis and claude committed
Add rules-based tax-unit construction engine from policyengine-us-data
Extract the rules-based tax-unit / filing-status construction engine from policyengine-us-data into microunit (roadmap item 2). The engine is copied verbatim (no logic changes) and made source-agnostic and self-contained. This integrates additively with the existing unit-assignment scaffold.

New modules:
- src/microunit/tax_unit_construction.py: core engine. Public entry construct_tax_units(person, year, mode) with "policyengine" (default) and "census_documented" modes; HEAD/SPOUSE/DEPENDENT role constants. The only change vs. the source is the internal import (now microunit.rule_helpers); zero non-import edits to the logic.
- src/microunit/rule_helpers.py: dependency/filing rule helpers (renamed from tax_unit_rule_helpers). The optional policyengine_us import shim is dropped; the qualifying-relative gross income limit now loads from packaged data, so the engine no longer depends on policyengine-us.
- src/microunit/data/dependent_gross_income_limit.yaml: vendored IRC 151(d) exemption-amount values (through 2026), loaded via importlib.resources.

Integration:
- __init__.py: additively export construct_tax_units, the role constants, modes, CPSRelationshipCode, and the rule helpers (existing API unchanged).
- units/tax.py: add construct_tax_partition(), a UnitPartition adapter over construct_tax_units, fulfilling the prior "port rules-based tax-unit construction here" TODO. assign_tax_partition still preserves native IDs.
- units/__init__.py: export construct_tax_partition.
- pyproject.toml: add numpy and pyyaml deps; ship the YAML as wheel/sdist data.
- uv.lock: regenerated for the new direct dependencies.
- README.md: document the engine, the two modes, the input contract, and the ACS-column-mapping boundary.

Tests (60 passing total): test_tax_unit_construction.py ports the full CPS suite to the microunit namespace; test_tax_partition_adapter.py covers the new adapter; test_import.py checks the public API and packaged-data resolution.

ACS boundary: acs_to_cps_columns.py (ACS-PUMS-specific RELSHIPP/RELP and spouse/parent inference) is intentionally NOT included. microunit takes already-normalized CPS-like person frames; ACS column mapping and the ACS-specific tests remain in policyengine-us-data.

Extracted from PolicyEngine/policyengine-us-data@f7458313.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
1 parent d08532a commit d3eccbb

12 files changed

Lines changed: 1933 additions & 5 deletions

README.md

Lines changed: 81 additions & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -55,6 +55,86 @@ partition = assign_spm_partition(persons)
5555
print(partition.to_frame())
5656
```
5757

58+
## Rules-based tax-unit construction
59+
60+
`microunit` includes the rules-based tax-unit / filing-status construction
61+
engine extracted from
62+
[`policyengine-us-data`](https://github.com/PolicyEngine/policyengine-us-data).
63+
It applies federal filing and dependency rules to assign people into tax
64+
units, infer each person's role (head / spouse / dependent), and infer a
65+
filing status per unit. It is the same engine reused across the CPS and ACS
66+
pipelines there, and is **source-agnostic**: it operates on
67+
already-normalized, CPS-like person frames. It is consumed by
68+
`policyengine-us-data` and `microplex-us`.
69+
70+
```python
71+
import pandas as pd
72+
from microunit import construct_tax_units
73+
74+
# person uses CPS-like column names (see "Input contract" below).
75+
person_assignments, tax_unit = construct_tax_units(person, year=2024)
76+
```
77+
78+
`construct_tax_units(person, year, mode="policyengine")` returns:
79+
80+
- **`person_assignments`** (indexed like the input): `TAX_ID` (`int64`,
81+
dense 1-based id), `tax_unit_role_input` (bytes: `HEAD` / `SPOUSE` /
82+
`DEPENDENT`), `is_related_to_head_or_spouse` (bool).
83+
- **`tax_unit`** (one row per `TAX_ID`): `filing_status_input` (bytes:
84+
`JOINT` / `HEAD_OF_HOUSEHOLD` / `SURVIVING_SPOUSE` / `SEPARATE` /
85+
`SINGLE`).
86+
87+
The string columns are byte strings (the HDF5-friendly encoding used by the
88+
source pipeline); decode with `.decode()`.
89+
90+
A `UnitPartition` adapter is also provided:
91+
92+
```python
93+
from microunit.units import construct_tax_partition
94+
95+
partition = construct_tax_partition(person, year=2024) # UnitPartition(unit_type="tax")
96+
```
97+
98+
### Modes
99+
100+
- **`"policyengine"`** (default, `microunit.POLICYENGINE_MODE`): PolicyEngine's
101+
dependency/filing-rule flow.
102+
- **`"census_documented"`** (`microunit.CENSUS_DOCUMENTED_MODE`): the publicly
103+
documented Census tax-model flow.
104+
105+
### Input contract
106+
107+
Required CPS columns (raises `KeyError` if missing): `PH_SEQ`, `A_LINENO`,
108+
`A_AGE`, `A_MARITL`, `A_SPOUSE`, `PEPAR1`, `PEPAR2`, `A_EXPRRP`.
109+
110+
Optional evidence columns (used when present, safely defaulted otherwise):
111+
income components (`WSAL_VAL`, `SEMP_VAL`, `FRSE_VAL`, `INT_VAL`, `DIV_VAL`,
112+
`RNT_VAL`, `CAP_VAL`, `UC_VAL`, `OI_VAL`, `ANN_VAL`, `PNSN_VAL`, `SS_VAL`),
113+
total money income (`PTOTVAL`), enrollment (`A_ENRLW`, `A_FTPT`, `A_HSCOL`),
114+
and disability flags (`PEDISDRS`, `PEDISEAR`, `PEDISEYE`, `PEDISOUT`,
115+
`PEDISPHY`, `PEDISREM`). Relationship codes follow the CPS ASEC `A_EXPRRP`
116+
recode, exposed as `microunit.CPSRelationshipCode`.
117+
118+
### ACS column mapping is the consumer's responsibility
119+
120+
The ACS PUMS -> CPS column mapping (`acs_to_cps_columns.py` in
121+
`policyengine-us-data`) is **not** part of `microunit`. That ~500-line module
122+
is ACS-PUMS-specific (`RELSHIPP`/`RELP` translation, marital-status recoding,
123+
and heuristic spouse/parent-pointer inference, since ACS provides no universal
124+
spouse or parent pointers) and belongs with the ACS reader. Consumers reading
125+
ACS should map their PUMS columns onto the CPS-like contract above and then
126+
call `construct_tax_units`. Accordingly, the ACS-specific tests from
127+
`policyengine-us-data` remain there; the full CPS construction test suite is
128+
ported here.
129+
130+
### Packaged data
131+
132+
The qualifying-relative gross income limit (the personal/dependent exemption
133+
amount under IRC 151(d), used by the IRC 152(d)(1)(B) gross income test) ships
134+
as package data at `microunit/data/dependent_gross_income_limit.yaml` and is
135+
loaded via `importlib.resources`, so the engine does not depend on
136+
`policyengine-us` being installed.
137+
58138
## Scope
59139

60140
This package should construct unit assignments and explain them. It should not
@@ -65,7 +145,7 @@ Near-term roadmap:
65145

66146
1. Move reusable SPM unit assignment out of `spm-calculator`.
67147
2. Move reusable tax-unit construction out of `policyengine-us-data` /
68-
`policyengine-us`.
148+
`policyengine-us`. (Done -- see "Rules-based tax-unit construction" above.)
69149
3. Add CPS and ACS source adapters for Microplex.
70150
4. Use SPM units as the temporary simplification for SNAP, Medicaid/MAGI, and
71151
other program units.
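
The README changes above describe two things a caller has to handle: the required CPS-like input columns and the byte-string output columns. The following is a minimal sketch of both, using plain pandas. The helper names `missing_required_columns` and `decode_bytes_column` are hypothetical and not part of `microunit`, and the small frames are illustrative stand-ins, not real engine output.

```python
import pandas as pd

# Required CPS-like columns, copied from the README's "Input contract" section.
REQUIRED_COLUMNS = [
    "PH_SEQ",
    "A_LINENO",
    "A_AGE",
    "A_MARITL",
    "A_SPOUSE",
    "PEPAR1",
    "PEPAR2",
    "A_EXPRRP",
]


def missing_required_columns(person: pd.DataFrame) -> list[str]:
    """Hypothetical helper: list the required columns absent from `person`."""
    return [col for col in REQUIRED_COLUMNS if col not in person.columns]


def decode_bytes_column(column: pd.Series) -> pd.Series:
    """Hypothetical helper: decode a column of byte strings into str values."""
    return column.str.decode("utf-8")


# Illustrative stand-in for the person-level output described in the README.
person_assignments = pd.DataFrame(
    {
        "TAX_ID": [1, 1, 1],
        "tax_unit_role_input": [b"HEAD", b"SPOUSE", b"DEPENDENT"],
    }
)
roles = decode_bytes_column(person_assignments["tax_unit_role_input"])

# A deliberately incomplete input frame, to show the pre-flight check.
incomplete = pd.DataFrame({"PH_SEQ": [1], "A_LINENO": [1], "A_AGE": [40]})
missing = missing_required_columns(incomplete)
```

Checking for missing columns before calling the engine lets a caller report all gaps at once, rather than handling the `KeyError` the README says the engine raises.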

pyproject.toml

Lines changed: 12 additions & 0 deletions
@@ -32,7 +32,9 @@ classifiers = [
3232
]
3333
requires-python = ">=3.11"
3434
dependencies = [
35+
"numpy>=1.24",
3536
"pandas>=2.0",
37+
"pyyaml>=6.0",
3638
]
3739

3840
[project.optional-dependencies]
@@ -46,6 +48,16 @@ Repository = "https://github.com/PolicyEngine/microunit"
4648

4749
[tool.hatch.build.targets.wheel]
4850
packages = ["src/microunit"]
51+
# Ship packaged rule data (the qualifying-relative gross income limit YAML)
52+
# alongside the Python modules.
53+
artifacts = ["src/microunit/data/*.yaml"]
54+
55+
[tool.hatch.build.targets.sdist]
56+
include = [
57+
"src/microunit",
58+
"tests",
59+
"README.md",
60+
]
4961

5062
[tool.pytest.ini_options]
5163
testpaths = ["tests"]

src/microunit/__init__.py

Lines changed: 45 additions & 0 deletions
@@ -3,8 +3,34 @@
33
from microunit.core import EgoUnitMembership, UnitPartition
44
from microunit.diagnostics import PartitionMatchReport, partition_match_report
55
from microunit.registry import UnitKind, UnitScheme, get_scheme, list_schemes
6+
from microunit.rule_helpers import (
7+
REFERENCE_PERSON_CODES,
8+
REFERENCE_QUALIFYING_CHILD_CODES,
9+
REFERENCE_QUALIFYING_RELATIVE_CODES,
10+
REFERENCE_SPOUSE_CODES,
11+
CPSRelationshipCode,
12+
dependent_gross_income_limit,
13+
qualifying_child_age_test,
14+
reference_relationship_allows_qualifying_child,
15+
reference_relationship_allows_qualifying_relative,
16+
related_to_head_or_spouse,
17+
)
18+
from microunit.tax_unit_construction import (
19+
CENSUS_DOCUMENTED_MODE,
20+
DEPENDENT,
21+
HEAD,
22+
POLICYENGINE_MODE,
23+
SPOUSE,
24+
SUPPORTED_TAX_UNIT_CONSTRUCTION_MODES,
25+
construct_tax_units,
26+
estimate_dependent_gross_income,
27+
)
28+
29+
__version__ = "0.1.0"
630

731
__all__ = [
32+
"__version__",
33+
# Core containers
834
"EgoUnitMembership",
935
"PartitionMatchReport",
1036
"UnitKind",
@@ -13,4 +39,23 @@
1339
"get_scheme",
1440
"list_schemes",
1541
"partition_match_report",
42+
# Rules-based tax-unit construction engine
43+
"construct_tax_units",
44+
"estimate_dependent_gross_income",
45+
"HEAD",
46+
"SPOUSE",
47+
"DEPENDENT",
48+
"POLICYENGINE_MODE",
49+
"CENSUS_DOCUMENTED_MODE",
50+
"SUPPORTED_TAX_UNIT_CONSTRUCTION_MODES",
51+
"CPSRelationshipCode",
52+
"REFERENCE_PERSON_CODES",
53+
"REFERENCE_SPOUSE_CODES",
54+
"REFERENCE_QUALIFYING_CHILD_CODES",
55+
"REFERENCE_QUALIFYING_RELATIVE_CODES",
56+
"dependent_gross_income_limit",
57+
"qualifying_child_age_test",
58+
"reference_relationship_allows_qualifying_child",
59+
"reference_relationship_allows_qualifying_relative",
60+
"related_to_head_or_spouse",
1661
]

src/microunit/data/dependent_gross_income_limit.yaml

Lines changed: 88 additions & 0 deletions
@@ -0,0 +1,88 @@
1+
description: >-
2+
Personal and dependent exemption amount under IRC 151(d). TCJA set the
3+
deduction to $0 from 2018 (made permanent by OBBB), but the underlying
4+
amount continues to be inflation-adjusted and published in annual Rev. Proc.
5+
for other provisions that reference it, such as the qualifying relative
6+
gross income test under IRC 152(d)(1)(B). The deduction suspension is
7+
represented separately in gov.irs.income.exemption.suspended.
8+
metadata:
9+
unit: currency-USD
10+
uprating: gov.irs.uprating
11+
period: year
12+
reference:
13+
- title: 26 U.S. Code § 151(d)(1) - Exemption amount
14+
href: https://www.law.cornell.edu/uscode/text/26/151#d_1
15+
- title: IRS Notice 2018-70 - Guidance on qualifying relative exemption amount
16+
href: https://www.irs.gov/pub/irs-drop/n-18-70.pdf
17+
values:
18+
2013-01-01:
19+
value: 3_900
20+
reference:
21+
- title: Rev. Proc. 2013-15
22+
href: https://www.irs.gov/pub/irs-drop/rp-13-15.pdf
23+
2014-01-01:
24+
value: 3_950
25+
reference:
26+
- title: Rev. Proc. 2013-35
27+
href: https://www.irs.gov/pub/irs-drop/rp-13-35.pdf
28+
2015-01-01:
29+
value: 4_000
30+
reference:
31+
- title: Rev. Proc. 2014-61
32+
href: https://www.irs.gov/pub/irs-drop/rp-14-61.pdf
33+
2016-01-01:
34+
value: 4_050
35+
reference:
36+
- title: Rev. Proc. 2015-53
37+
href: https://www.irs.gov/pub/irs-drop/rp-15-53.pdf
38+
2017-01-01:
39+
value: 4_050
40+
reference:
41+
- title: Rev. Proc. 2016-55
42+
href: https://www.irs.gov/pub/irs-drop/rp-16-55.pdf
43+
2018-01-01:
44+
value: 4_150
45+
reference:
46+
- title: Rev. Proc. 2017-58
47+
href: https://www.irs.gov/pub/irs-drop/rp-17-58.pdf
48+
2019-01-01:
49+
value: 4_200
50+
reference:
51+
- title: Rev. Proc. 2018-57
52+
href: https://www.irs.gov/pub/irs-drop/rp-18-57.pdf
53+
2020-01-01:
54+
value: 4_300
55+
reference:
56+
- title: Rev. Proc. 2019-44
57+
href: https://www.irs.gov/pub/irs-drop/rp-19-44.pdf
58+
2021-01-01:
59+
value: 4_300
60+
reference:
61+
- title: Rev. Proc. 2020-45
62+
href: https://www.irs.gov/pub/irs-drop/rp-20-45.pdf
63+
2022-01-01:
64+
value: 4_400
65+
reference:
66+
- title: Rev. Proc. 2021-45
67+
href: https://www.irs.gov/pub/irs-drop/rp-21-45.pdf
68+
2023-01-01:
69+
value: 4_700
70+
reference:
71+
- title: Rev. Proc. 2022-38
72+
href: https://www.irs.gov/pub/irs-drop/rp-22-38.pdf
73+
2024-01-01:
74+
value: 5_050
75+
reference:
76+
- title: Rev. Proc. 2023-34
77+
href: https://www.irs.gov/pub/irs-drop/rp-23-34.pdf
78+
2025-01-01:
79+
value: 5_200
80+
reference:
81+
- title: Rev. Proc. 2024-40
82+
href: https://www.irs.gov/pub/irs-drop/rp-24-40.pdf
83+
2026-01-01:
84+
value: 5_300
85+
reference:
86+
- title: Rev. Proc. 2025-32
87+
href: https://www.irs.gov/pub/irs-drop/rp-25-32.pdf
88+
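
The dated values above feed the qualifying-relative gross income test under IRC 152(d)(1)(B), which requires the person's gross income to be less than the exemption amount. Below is a minimal sketch of a dated-value lookup and that comparison, using only the standard library. The function names are hypothetical and do not reflect the signature of `dependent_gross_income_limit` in `microunit`. Carrying the last value forward past 2026 is an assumption made for illustration; the metadata names an uprating parameter, so the real engine may behave differently.

```python
from bisect import bisect_right
from datetime import date

# Values copied from the packaged YAML above (2023 onward only, for brevity).
EXEMPTION_AMOUNTS = {
    date(2023, 1, 1): 4_700,
    date(2024, 1, 1): 5_050,
    date(2025, 1, 1): 5_200,
    date(2026, 1, 1): 5_300,
}


def exemption_amount(year: int) -> int:
    """Hypothetical lookup: latest value effective on or before Jan 1 of `year`.

    Assumption: years after the last entry reuse the last value (no uprating).
    """
    effective_dates = sorted(EXEMPTION_AMOUNTS)
    index = bisect_right(effective_dates, date(year, 1, 1))
    if index == 0:
        raise ValueError(f"No value in this excerpt is effective in {year}")
    return EXEMPTION_AMOUNTS[effective_dates[index - 1]]


def passes_gross_income_test(gross_income: float, year: int) -> bool:
    """Gross income must be strictly less than the exemption amount."""
    return gross_income < exemption_amount(year)
```

For example, under these assumptions a person with 5,000 dollars of gross income passes the test for 2024 (limit 5,050), while a person with exactly 5,050 dollars does not.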
