Commit 28508d3

juaristi22, nikhilwoodruff, baogorek, and MaxGhenis authored
Cleaning ACS age, SOI agi, hardcoded, and SNAP targets (#373)
* Use normal runner in PR tests
* added the 3.11.12 pin
* cps.py
* adding diagnostics
* lint
* taking out bad targets
* fixing workflow arg passthrough
* deps and defaults
* wrong pipeline for manual test
* trying again to get the manual test to work
* reverting to older workflow code
* cleaning up enhanced_cps.py
* Update package version
* removing github download option. Switching to hugging face downloads
* changelog entry
* reverting the old code changes workflow
* Update package version
* start cleaning calibration targets
* add us package to dependencies
* update csv paths in tests too
* manual test
* pr
* updates
* trying to get the right workflow to run
* taking out the token
* ready for review
* Update package version
* adding diagnostics
* taking out bad targets
* fixing workflow arg passthrough
* wrong pipeline for manual test
* Update package version
* removing github download option. Switching to hugging face downloads
* reverting the old code changes workflow
* remove districting file
* remove duplications from merge with main
* add changelog_entry
* Add L0 Regularization, make a better small ECPS (#364)
* initial commit of L0 branch
* Add HardConcrete L0 regularization
* l0 example completed
* removing commented code
* pre lint cleanup
* post-lint cleanup
* Refactor reweighting diagnostics
* removed _clean from names in the reweighting function
* modifying print function and test
* Convert diagnostics prints to logging
* removing unused variable
* setting high tolerance for ssn test just to pass
* linting
* fixed data set creation logic. Modified parameters
* docs. more epochs
* Update package version
* Pin microdf
* adding diagnostics
* taking out bad targets
* Update package version
* start cleaning calibration targets
* trying to get the right workflow to run
* ready for review
* taking out bad targets
* restore changes lost when merging with main
* more cleanup
* even more cleanup
* fix file paths in new sparse ecps test
* lint
* fixing merge

---------

Co-authored-by: Nikhil Woodruff <35577657+nikhilwoodruff@users.noreply.github.com>
Co-authored-by: baogorek <baogorek@gmail.com>
Co-authored-by: MaxGhenis <MaxGhenis@users.noreply.github.com>
Co-authored-by: baogorek <baogorek@users.noreply.github.com>
Co-authored-by: nikhilwoodruff <nikhilwoodruff@users.noreply.github.com>
1 parent 6aedac1 commit 28508d3

29 files changed

Lines changed: 9040 additions & 421 deletions

.gitignore

Lines changed: 10 additions & 3 deletions
@@ -3,13 +3,20 @@
 **/.DS_STORE
 **/*.h5
 **/*.csv
+**/_build
+**/*.pkl
+venv
+
+## old (not clean) targets
 !healthcare_spending.csv
 !medicaid_enrollment_2024.csv
 !eitc.csv
 !spm_threshold_agi.csv
-**/_build
 !population_by_state.csv
 !aca_spending_and_enrollment_2024.csv
-**/*.pkl
-venv
 !real_estate_taxes_by_state_acs.csv
+!np2023_d5_mid.csv
+!snap_state.csv
+!age_state.csv
+!agi_state.csv
+!soi_targets.csv

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -5,6 +5,7 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+
 ## [1.39.0] - 2025-07-18 12:46:15
 
 ### Added

Makefile

Lines changed: 6 additions & 3 deletions
@@ -20,11 +20,14 @@ changelog:
 	touch changelog_entry.yaml
 
 download:
-	python policyengine_us_data/storage/pull_age_targets.py
-	python policyengine_us_data/storage/pull_soi_state_targets.py
-	python policyengine_us_data/storage/pull_snap_state_targets.py
 	python policyengine_us_data/storage/download_private_prerequisites.py
 
+targets:
+	python policyengine_us_data/storage/calibration_targets/pull_hardcoded_targets.py
+	python policyengine_us_data/storage/calibration_targets/pull_age_targets.py
+	python policyengine_us_data/storage/calibration_targets/pull_soi_targets.py
+	python policyengine_us_data/storage/calibration_targets/pull_snap_targets.py
+
 upload:
 	python policyengine_us_data/storage/upload_completed_datasets.py
 

changelog_entry.yaml

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+- bump: patch
+  changes:
+    fixed:
+    - Edit and create files that pull SOI agi, ACS age, hardcoded and SNAP targets to follow the same clean csv format.
+    - Track all csv files used by loss.py for backwards compatibility.

Lines changed: 1 addition & 0 deletions
@@ -1,3 +1,4 @@
 from pathlib import Path
 
 STORAGE_FOLDER = Path(__file__).parent
+CALIBRATION_FOLDER = STORAGE_FOLDER / "calibration_targets"

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+## Directory for storing calibration targets
+
+This directory contains all data sources of the targets that will be calibrated for by the Enhanced CPS. Currently it stores all raw, or unprocessed targets as tracked csv files (for backward compatibility). Soon it will store scripts to pull data from each data source (one script per source) into long-formatted csv files that follow the column structure:
+
+DATA_SOURCE,GEO_ID,GEO_NAME,VARIABLE,VALUE,IS_COUNT,BREAKDOWN_VARIABLE,LOWER_BOUND,UPPER_BOUND
+
+To see the newly formatted target files run `make targets`.
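
The column structure named in the README can be checked mechanically. The following is a minimal sketch, not code from this commit: the helper name `has_target_header` is hypothetical, and it only compares the header row against the nine columns listed above, using the standard-library `csv` module.

```python
import csv
import io

# Column order copied from the README hunk above.
TARGET_COLUMNS = [
    "DATA_SOURCE",
    "GEO_ID",
    "GEO_NAME",
    "VARIABLE",
    "VALUE",
    "IS_COUNT",
    "BREAKDOWN_VARIABLE",
    "LOWER_BOUND",
    "UPPER_BOUND",
]


def has_target_header(csv_text: str) -> bool:
    """Return True if the first CSV row matches TARGET_COLUMNS exactly.

    Hypothetical helper for illustration; it checks names and order only,
    not the types or contents of the data rows.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, None)  # None when the text is empty
    return header == TARGET_COLUMNS
```

A wide file such as the age table later in this diff (header `GEO_ID,GEO_NAME,0-4,...`) would fail this check, which is consistent with the README describing the raw tracked files as not yet in the long format.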

policyengine_us_data/storage/aca_spending_and_enrollment_2024.csv renamed to policyengine_us_data/storage/calibration_targets/aca_spending_and_enrollment_2024.csv

File renamed without changes.
Lines changed: 51 additions & 0 deletions
@@ -0,0 +1,51 @@
+GEO_ID,GEO_NAME,0-4,5-9,10-14,15-19,20-24,25-29,30-34,35-39,40-44,45-49,50-54,55-59,60-64,65-69,70-74,75-79,80-84,85+
+0400000US01,AL,288019,305731,331262,350694,333795,313883,330634,320939,328313,300020,320058,308895,344106,308878,245130,182506,106693,88912
+0400000US02,AK,45211,48763,51926,43880,49734,55327,56523,57892,52239,41875,40307,39971,46760,36699,29939,20212,8911,7237
+0400000US04,AZ,391142,435723,454506,501122,498597,506283,520705,475102,467226,424460,438405,412511,467831,423759,384473,305244,185157,139098
+0400000US05,AR,176908,196519,197772,214564,203715,191200,201994,194099,205080,178181,181169,175535,200090,179624,141606,107556,67237,54883
+0400000US06,CA,2086820,2243195,2535289,2614940,2515300,2692007,2978622,2823039,2637271,2399749,2446253,2341362,2339427,2033126,1629227,1185754,749307,714505
+0400000US08,CO,303775,316028,364851,378487,379346,448664,479339,457235,417376,356557,356915,317170,357325,316669,262477,180799,98806,85791
+0400000US09,CT,180561,196179,207266,241031,232217,219887,230580,230935,238479,208809,232830,242945,265426,219040,175883,130823,83430,80855
+0400000US10,DE,54398,55381,63821,65460,61403,57870,67091,63384,65305,54110,61204,65582,76924,69935,62935,40604,24952,21531
+0400000US12,FL,1122270,1186629,1266208,1328142,1285607,1342235,1472398,1466735,1434863,1332086,1428440,1429660,1597671,1428538,1252984,1031541,657889,546830
+0400000US13,GA,621750,688835,746320,779849,744354,735418,781199,739394,747522,683951,722758,667422,671205,564304,466760,335485,186379,146322
+0400000US15,HI,77420,80863,88182,78043,85733,86607,99146,94217,98299,82134,85390,84903,90849,87734,78472,62326,38869,35951
+0400000US16,ID,110908,127279,139235,148318,133282,123741,130577,126765,132044,115448,109445,104600,120825,110277,97418,62572,41878,30114
+0400000US17,IL,661026,753268,779833,827493,817073,822654,859570,832327,846077,764366,790442,763751,825979,717382,578222,394724,268934,246568
+0400000US18,IN,401558,439600,446554,485931,473586,437183,455677,439138,439641,396479,422024,407227,436033,391056,313265,215955,142005,119287
+0400000US19,IA,182063,200296,210274,225470,222476,197296,201330,198873,211540,181433,182933,182323,215809,185799,159424,105554,73086,71025
+0400000US20,KS,169830,182863,206744,210613,214245,189943,187885,190222,197579,161993,163667,157918,188680,165477,141513,94937,60109,56329
+0400000US21,KY,264633,278400,288926,286721,300199,291097,304571,291450,278309,271388,284520,282757,298415,261189,223389,154249,94893,71048
+0400000US22,LA,275636,296496,303219,302119,293386,280701,308764,303626,318907,260323,268843,266286,300852,261779,222084,150817,87199,72712
+0400000US23,ME,59898,66379,72592,79986,77407,79804,88461,85520,90237,76935,91167,95023,112193,103911,85896,60760,39057,30496
+0400000US24,MD,346836,375815,390863,401595,367705,370945,421207,429897,420579,375125,393086,405301,414661,340924,285152,206040,125217,109305
+0400000US25,MA,342145,363038,384162,457558,479344,472523,493923,480321,447066,404523,440630,454158,489124,413455,339624,246788,152434,140583
+0400000US26,MI,529459,581280,611729,647022,659915,633703,680649,617475,611978,567058,627278,640195,697295,637646,522354,350815,226160,195250
+0400000US27,MN,326995,348596,385909,386002,348838,359264,380411,403217,381557,330827,335035,338498,386838,335437,271024,184265,121372,113830
+0400000US28,MS,167015,177103,198632,223891,192902,173595,177133,180703,208325,180013,177681,170004,196201,171623,140030,98222,59033,47584
+0400000US29,MO,348416,379249,391705,414976,406000,391461,415312,404433,395861,354941,365291,367470,422544,369037,302650,210514,138057,118239
+0400000US30,MT,55363,68047,69920,68365,75272,69333,75302,75613,75096,63420,62391,64814,77926,74684,66447,42585,27023,21211
+0400000US31,NE,120499,130577,141003,138455,140599,125649,128558,129056,133065,112010,108450,102776,126757,110984,90662,63014,40083,36182
+0400000US32,NV,171163,190758,196263,199389,183350,215003,240589,226041,220694,195940,205130,192508,202489,177646,154772,112201,66710,43530
+0400000US33,NH,62779,64759,75485,84264,83802,83766,92389,96143,83663,76916,91601,97772,118757,97653,77495,55058,34298,25454
+0400000US34,NJ,518528,538462,587449,582651,552670,567536,615144,628533,610081,582851,610980,616330,632849,537270,416989,311338,202963,178217
+0400000US35,NM,104293,117174,142197,140142,139508,133804,143021,140800,135285,118643,121271,111763,144698,126362,122671,81493,49398,41848
+0400000US36,NY,1035708,1060854,1153297,1203892,1232310,1330595,1403693,1298001,1254526,1141208,1225855,1280576,1315200,1126930,956146,691879,442243,418303
+0400000US37,NC,594739,641236,658597,729451,723916,700558,743213,702717,702189,652884,700335,675234,695066,624894,516826,375002,220658,177976
+0400000US38,ND,46488,52058,51737,54372,59265,59807,55024,56787,49918,39193,37670,39245,48407,42634,35005,25200,14172,16944
+0400000US39,OH,654683,708725,739497,775522,734816,754392,796394,746804,735050,673370,731647,721789,799249,738319,580004,397752,259273,238649
+0400000US40,OK,239611,274081,278672,293303,283736,260398,279729,275621,263647,231199,229644,218507,248691,219316,181697,128152,82831,64989
+0400000US41,OR,198150,222642,252517,256017,253357,278480,313412,293939,306809,261664,261304,239220,267973,255940,236559,160775,93550,81050
+0400000US42,PA,663339,706358,774701,847344,798935,785892,865513,860853,799306,726879,797213,835881,901515,823450,687134,481963,307085,298322
+0400000US44,RI,52718,58466,55872,71235,71178,73447,80339,71278,71652,60298,66240,73372,77491,73431,50330,40047,25180,23388
+0400000US45,SC,285830,314825,328008,361188,338310,328099,349358,343362,345541,307405,330087,333118,370330,331034,290462,212365,118298,85935
+0400000US46,SD,54886,61176,64030,65802,57907,56897,58090,61874,56381,50955,47699,55157,58878,55645,47913,28630,18179,19219
+0400000US47,TN,411032,428094,442218,452889,460289,481495,496983,468014,467361,421747,448094,434374,467303,407804,341541,241466,142756,113029
+0400000US48,TX,1913591,2066208,2205759,2198388,2089908,2141194,2239258,2204042,2147241,1897232,1859874,1651024,1694592,1402804,1145242,792763,466540,387641
+0400000US49,UT,229881,256131,273145,282085,291473,262174,244416,233775,234419,208506,175370,154185,156425,139755,114857,73869,45664,41604
+0400000US50,VT,27168,29952,35665,42725,40699,36520,39653,41552,40328,36068,40669,41956,51088,46136,39674,26432,18387,12792
+0400000US51,VA,476744,512565,546266,580019,562765,564757,597415,606153,601873,521323,544417,540862,561608,484054,396914,293416,177990,146557
+0400000US53,WA,417322,460067,480293,468878,475123,553933,624403,594771,546897,466808,466341,435575,483339,438907,363162,256015,149077,131969
+0400000US54,WV,87453,92813,106431,112145,112010,98811,107442,102307,111309,108208,114165,115584,121201,115806,109611,71224,48491,35060
+0400000US55,WI,307874,347068,354836,389433,395528,364729,376995,379586,371818,333758,354763,381987,419185,378204,306481,200957,130416,117337
+0400000US56,WY,30651,36438,38128,40417,34570,36374,40084,41636,40462,30725,30327,31373,40775,37809,32453,17778,12420,11637
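
The table above is in wide format (one column per age bracket), while the README in this commit describes a long format with one row per target. A rough sketch of such a reshaping follows. It is not the repository's `pull_age_targets.py`, whose contents are not shown on this page. The function names, the `"acs"` source label, the `"person_count"` variable name, and the choice to put the bracket edges into `LOWER_BOUND` and `UPPER_BOUND` are all assumptions made for illustration.

```python
import csv
import io
import math


def parse_age_bracket(label):
    """Turn "0-4" into (0, 4) and an open-ended "85+" into (85, inf)."""
    if label.endswith("+"):
        return int(label[:-1]), math.inf
    low, high = label.split("-")
    return int(low), int(high)


def wide_age_to_long(csv_text, data_source="acs"):
    """Reshape a wide age table into one dict per (geography, age bracket).

    Assumed mapping (hypothetical, not confirmed by the commit):
    VARIABLE is a person count, BREAKDOWN_VARIABLE is age, and the
    bracket edges fill LOWER_BOUND and UPPER_BOUND.
    """
    rows = []
    for record in csv.DictReader(io.StringIO(csv_text)):
        for label, value in record.items():
            if label in ("GEO_ID", "GEO_NAME"):
                continue  # identifier columns, not age brackets
            low, high = parse_age_bracket(label)
            rows.append(
                {
                    "DATA_SOURCE": data_source,
                    "GEO_ID": record["GEO_ID"],
                    "GEO_NAME": record["GEO_NAME"],
                    "VARIABLE": "person_count",
                    "VALUE": int(value),
                    "IS_COUNT": 1,
                    "BREAKDOWN_VARIABLE": "age",
                    "LOWER_BOUND": low,
                    "UPPER_BOUND": high,
                }
            )
    return rows


# Three columns of the Wyoming row from the table above.
sample = "GEO_ID,GEO_NAME,0-4,5-9,85+\n0400000US56,WY,30651,36438,11637\n"
long_rows = wide_age_to_long(sample)
```

Each wide row with 18 age brackets would yield 18 long rows under this mapping, so the 50 geographies in the table would expand to 900 targets.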
File renamed without changes.
File renamed without changes.
