Commit 0745788

Update package version
1 parent 0afced2 commit 0745788

4 files changed

Lines changed: 50 additions & 44 deletions

CHANGELOG.md

Lines changed: 48 additions & 0 deletions
@@ -1,3 +1,51 @@
+## [1.79.3] - 2026-04-16
+
+### Changed
+
+- Add a chunked mixed-geography matrix builder for memory-bounded national
+calibration (`--chunked-matrix`) that streams matrix columns in clone-household
+chunks with resumable per-chunk COO shards, progress logging (running average,
+elapsed, ETA), and a shared `entity_clone` module for household-subset
+materialization.
+
+Fix three target-input integrity bugs surfaced by a new
+`analyze_target_consistency` diagnostic that flags cross-level and
+AGI-bucket-coverage inconsistencies:
+
+- Drop the IRS workbook override for `total_self_employment_income`,
+`tax_unit_partnership_s_corp_income`, and `net_capital_gains`. The workbook
+columns `business_net_profits` / `partnership_and_s_corp_income` /
+`capital_gains_gross` are gross-only, while the geography-file line codes
+00900 / 26270 / 01000 already report net-of-loss. The override inflated
+these national targets by +40.7% / +26.1% / +3.1% at 2023 values. After
+the fix, all three reconcile to the penny across national, state, and
+district levels.
+- Remove the self-employment QRF winsor in `puf_impute.py`. QRF predictions
+are already bounded by training support; the 0.5/99.5 percentile clip
+was discarding the top 0.5% of legitimate signal and truncating imputed
+self-employment income at ~$1.1M vs the PUF training max of $74.6M.
+- Replace percentile-based top selection in `create_stratified_cps` with
+per-bracket caps (400/400/400/300/300 for the $500k-$1M through $10M+
+bands). Stops PUF templates from piling up above $10M and starving the
+middle-high $1M-$10M range.
+
+Split calibration checkpoint signature validation into fatal structural
+mismatches and soft hyperparameter mismatches, letting callers tune
+`lambda_l0`, `beta`, `lambda_l2`, and `learning_rate` across resume phases.
+
+Add `income_tax` national and state SOI targets, drop the unachievable
+JCT `deductible_mortgage_interest` target, and preserve positive mortgage
+interest inputs through structural conversion.
+
+Retune the national Modal calibration to `lambda_l0=2e-2` at 1000 epochs
+and align `modal_app/pipeline.py` `log_freq` to 100.
+
+Harden `make clean` so its ignored-CSV cleanup skips local environment and
+dependency directories such as `.venv/`, `venv/`, `env/`, `.tox/`, `.nox/`,
+and `node_modules/`, avoiding accidental deletion of package data inside local
+virtual environments.
+
+
 ## [1.79.2] - 2026-04-14
 
 ### Fixed
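
The 1.79.3 changelog entry describes splitting checkpoint signature validation into fatal structural mismatches and soft hyperparameter mismatches. The sketch below is one possible reading of that split, not the package's actual code: the function name `compare_signatures`, the exception class, the flat dict layout of a signature, and the structural keys `n_targets` and `n_households` are all hypothetical. Only the four hyperparameter names come from the changelog.

```python
# Hypothetical sketch; NOT the actual policyengine_us_data implementation.
# Only these four hyperparameter names are taken from the changelog entry.
SOFT_KEYS = frozenset({"lambda_l0", "beta", "lambda_l2", "learning_rate"})


class CheckpointMismatchError(ValueError):
    """Raised when a checkpoint differs structurally from the current run."""


def compare_signatures(saved: dict, current: dict) -> list[str]:
    """Return messages for soft mismatches; raise on structural ones.

    Any differing key outside SOFT_KEYS is treated as structural (fatal).
    A key missing on one side is compared as None.
    """
    fatal, soft = [], []
    for key in sorted(set(saved) | set(current)):
        old, new = saved.get(key), current.get(key)
        if old == new:
            continue
        message = f"{key}: checkpoint={old!r}, current={new!r}"
        (soft if key in SOFT_KEYS else fatal).append(message)
    if fatal:
        raise CheckpointMismatchError("; ".join(fatal))
    return soft


# Usage: a changed hyperparameter is reported but does not block the resume.
saved = {"n_targets": 3, "n_households": 10, "lambda_l0": 1e-2}
current = {"n_targets": 3, "n_households": 10, "lambda_l0": 2e-2}
soft_warnings = compare_signatures(saved, current)
```

Under these assumptions, a caller could log `soft_warnings` and continue training, while a change to something like the (hypothetical) `n_targets` would stop the resume before any incompatible state is loaded.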

changelog.d/753.changed.md

Lines changed: 0 additions & 42 deletions
This file was deleted.

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@ build-backend = "setuptools.build_meta"
 
 [project]
 name = "policyengine_us_data"
-version = "1.79.2"
+version = "1.79.3"
 description = "A package to create representative microdata for the US."
 readme = "README.md"
 authors = [

uv.lock

Lines changed: 1 addition & 1 deletion
Some generated files are not rendered by default.
