|
1 | 1 | About cps_data |
2 | 2 | ============== |
3 | 3 |
|
4 | | -This directory contains the following script: |
| 4 | +This directory contains the Python scripts used to create `cps.csv.gz`. You
| 5 | +can run all of the scripts with the command `python create.py`. By default, |
| 6 | +you will get a CPS file composed of the 2013, 2014, and 2015 March CPS Supplemental |
| 7 | +files. If you would like to use another combination of the 2013, 2014, 2015, |
| 8 | +2016, 2017, and 2018 files, there are two ways to do so. |
5 | 9 |
|
6 | | -* Python script **finalprep.py**, which reads/writes: |
| 10 | +1. You can modify `create.py` by adding the `cps_files` argument to the `create()` |
| 11 | +function call at the bottom of the file to specify which files you would like to |
| 12 | +use. For example, to use the 2016, 2017, and 2018 files, the function call would
| 13 | +become:
| 14 | +```python |
| 15 | +if __name__ == "__main__": |
| 16 | +    create(
| 17 | +        exportcsv=False, exportpkl=True, exportraw=False, validate=False,
| 18 | +        benefits=True, verbose=True, cps_files=[2016, 2017, 2018]
| 19 | +    )
| 20 | +``` |
7 | 21 |
|
8 | | - Input files: |
9 | | - - cps_raw.csv.gz |
10 | | - - adjustment_targets.csv |
11 | | - - benefitprograms.csv |
| 22 | +2. You can write a separate Python file that imports the `create()` function
| 23 | +and calls it in the same way as above.
12 | 24 |
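As an illustration of the second approach, here is a minimal sketch of such a separate file. The helper names `build_create_kwargs` and `run_create` are hypothetical and not part of the repository. The keyword arguments are copied from the `create()` call shown above, and the year check is an added convenience that assumes only the six years listed in this README are usable.

```python
# Years this README lists as available March CPS Supplement files.
AVAILABLE_YEARS = (2013, 2014, 2015, 2016, 2017, 2018)


def build_create_kwargs(cps_files):
    """Return keyword arguments for create() after checking the years.

    Hypothetical helper: it only validates the requested years and
    reuses the flags from the create.py example in this README.
    """
    years = list(cps_files)
    if not years:
        raise ValueError("cps_files must name at least one year")
    unknown = [y for y in years if y not in AVAILABLE_YEARS]
    if unknown:
        raise ValueError(f"no CPS file listed for year(s): {unknown}")
    return dict(
        exportcsv=False, exportpkl=True, exportraw=False, validate=False,
        benefits=True, verbose=True, cps_files=years,
    )


def run_create(create_fn, cps_files):
    """Call create_fn (expected to be the create() function) with checked arguments."""
    return create_fn(**build_create_kwargs(cps_files))
```

From a script saved alongside `create.py`, this could be used as `from create import create` followed by `run_create(create, [2016, 2017, 2018])`. The import path is an assumption about where the script lives. Passing the function in as an argument keeps the helper importable without the CPS input files present.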
|
13 | | - Output files: |
14 | | - - cps.csv |
| 25 | +## Input Files
| 26 | +With the exception of the CPS March Supplements, all input files can be found |
| 27 | +in the `pycps/data` directory. |
| 28 | + |
| 29 | +### CPS March Supplements |
| 30 | +* asec2013_pubuse.dat |
| 31 | +* asec2014_pubuse_tax_fix_5x8_2017.dat |
| 32 | +* asec2015_pubuse.dat |
| 33 | +* asec2016_pubuse.dat |
| 34 | +* asec2017_pubuse.dat |
| 35 | +* asec2018_pubuse.dat |
| 36 | + |
| 37 | +### C-TAM Benefit Imputations |
| 38 | + |
| 39 | +Note that C-TAM imputations are only available for the 2013, 2014, and 2015
| 40 | +files. For other years, we use the benefit program information in the CPS.
| 41 | +* Housing_Imputation_logreg_2013.csv |
| 42 | +* Housing_Imputation_logreg_2014.csv |
| 43 | +* Housing_Imputation_logreg_2015.csv |
| 44 | +* medicaid2013.csv |
| 45 | +* medicaid2014.csv |
| 46 | +* medicaid2015.csv |
| 47 | +* medicare2013.csv |
| 48 | +* medicare2014.csv |
| 49 | +* medicare2015.csv |
| 50 | +* otherbenefitprograms.csv |
| 51 | +* SNAP_Imputation_2013.csv |
| 52 | +* SNAP_Imputation_2014.csv |
| 53 | +* SNAP_Imputation_2015.csv |
| 54 | +* SS_augmentation_2013.csv |
| 55 | +* SS_augmentation_2014.csv |
| 56 | +* SS_augmentation_2015.csv |
| 57 | +* SSI_Imputation2013.csv |
| 58 | +* SSI_Imputation2014.csv |
| 59 | +* SSI_Imputation2015.csv |
| 60 | +* TANF_Imputation_2013.csv |
| 61 | +* TANF_Imputation_2014.csv |
| 62 | +* TANF_Imputation_2015.csv |
| 63 | +* UI_imputation_logreg_2013.csv |
| 64 | +* UI_imputation_logreg_2014.csv |
| 65 | +* UI_imputation_logreg_2015.csv |
| 66 | +* VB_Imputation2013.csv |
| 67 | +* VB_Imputation2014.csv |
| 68 | +* VB_Imputation2015.csv |
| 69 | +* WIC_imputation_children_logreg_2013.csv |
| 70 | +* WIC_imputation_children_logreg_2014.csv |
| 71 | +* WIC_imputation_children_logreg_2015.csv |
| 72 | +* WIC_imputation_infants_logreg_2013.csv |
| 73 | +* WIC_imputation_infants_logreg_2014.csv |
| 74 | +* WIC_imputation_infants_logreg_2015.csv |
| 75 | +* WIC_imputation_women_logreg_2013.csv |
| 76 | +* WIC_imputation_women_logreg_2014.csv |
| 77 | +* WIC_imputation_women_logreg_2015.csv |
| 78 | + |
| 79 | +### Imputation Parameters |
| 80 | + |
| 81 | +These parameters are used in the imputations found in `pycps/impute.py`:
| 82 | +* logit_beta.csv |
| 83 | +* ols_betas.csv |
| 84 | + |
| 85 | +## Output Files |
| 86 | + |
| 87 | +Only `cps.csv.gz` is included in the repository; `cps_raw.csv.gz` is too large to include.
| 88 | +* cps.csv.gz |
| 89 | +* cps_raw.csv.gz |
15 | 90 |
|
16 | 91 |
|
17 | 92 | Documentation |
|