| 1 | +# dms_datastore Command Reference |
| 2 | + |
| 3 | +This document lists the `--help` invocation and workflow-based usage examples for every command defined in `pyproject.toml` under `[project.scripts]`. |
| 4 | + |
| 5 | +Path arguments are intentionally generic and OS-agnostic. Replace placeholders like `<raw_dir>`, `<staging_dir>`, and `<repo_dir>` with paths for your environment. |
| 6 | + |
| 7 | +## Main Entrypoint |
| 8 | + |
| 9 | +Use `dms` as a grouped CLI, or call each command directly by its script name. |
| 10 | + |
| 11 | +```bash |
| 12 | +# grouped help |
| 13 | +dms --help |
| 14 | + |
| 15 | +# subcommand help (example) |
| 16 | +dms download_ncro --help |
| 17 | +``` |
| 18 | + |
| 19 | +## Help for Every Command |
| 20 | + |
| 21 | +```bash |
| 22 | +dms --help |
| 23 | +download_noaa --help |
| 24 | +download_hycom --help |
| 25 | +download_hrrr --help |
| 26 | +download_cdec --help |
| 27 | +download_wdl --help |
| 28 | +download_nwis --help |
| 29 | +download_des --help |
| 30 | +download_ncro --help |
| 31 | +download_mokelumne --help |
| 32 | +download_ucdipm --help |
| 33 | +download_cimis --help |
| 34 | +download_dcc --help |
| 35 | +download_montezuma_gates --help |
| 36 | +download_smscg --help |
| 37 | +compare_directories --help |
| 38 | +populate_repo --help |
| 39 | +station_info --help |
| 40 | +reformat --help |
| 41 | +auto_screen --help |
| 42 | +inventory --help |
| 43 | +usgs_multi --help |
| 44 | +delete_from_filelist --help |
| 45 | +data_cache --help |
| 46 | +merge_files --help |
| 47 | +dropbox --help |
| 48 | +coarsen --help |
| 49 | +update_repo --help |
| 50 | +update_flagged_data --help |
| 51 | +rationalize_time_partitions --help |
| 52 | +``` |
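Before running the help commands above, it can be useful to confirm which entry points are actually on `PATH`. The following is a minimal sketch, not part of dms_datastore: `check_commands` is a hypothetical helper name, and the command names passed to it are taken from the list above.

```bash
# Sketch: report which entry points are installed.
# check_commands is an illustrative helper, not a dms_datastore command.
check_commands() {
  local cmd missing=0
  for cmd in "$@"; do
    if command -v "$cmd" >/dev/null 2>&1; then
      echo "ok       $cmd"
    else
      echo "missing  $cmd"
      missing=1
    fi
  done
  # Non-zero status when at least one command was not found.
  return "$missing"
}

check_commands dms download_noaa reformat auto_screen update_repo \
  || echo "some commands are not installed in this environment"
```

Any of the script names listed above can be passed as arguments; the helper only looks them up and never executes them.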
| 53 | + |
| 54 | +## Workflow A: Repository Build Pipeline (Download -> Reformat -> Auto Screen) |
| 55 | + |
| 56 | +This order matches the operational flow used in `populate_tasks.bat` and core scripts. |
| 57 | + |
| 58 | +### Stage 1: Download into raw/staging |
| 59 | + |
| 60 | +```bash |
| 61 | +# help (all downloaders follow this pattern) |
| 62 | +download_noaa --help |
| 63 | +download_nwis --help |
| 64 | +download_des --help |
| 65 | +download_ncro --help |
| 66 | +``` |
| 67 | + |
| 68 | +```bash |
| 69 | +# NOAA |
| 70 | +download_noaa --start 2024-01-01 --end 2024-01-31 --param water_level --stations ccc --dest <raw_dir> |
| 71 | + |
| 72 | +# NWIS |
| 73 | +download_nwis --start 2024-01-01 --end 2024-01-31 --stations sjj --param 00060 --dest <raw_dir> |
| 74 | + |
| 75 | +# DES |
| 76 | +download_des --start 2024-01-01 --end 2024-01-31 --stations cll --param flow --dest <raw_dir> |
| 77 | + |
| 78 | +# NCRO timeseries |
| 79 | +download_ncro --start 2024-01-01 --end 2024-12-31 --stations orm --param elev --dest <raw_dir> |
| 80 | + |
| 81 | +# NCRO inventory only |
| 82 | +download_ncro --inventory-only |
| 83 | + |
| 84 | +# CDEC |
| 85 | +download_cdec --start 2024-01-01 --end 2024-01-31 --stations cse --param elev --dest <raw_dir> |
| 86 | + |
| 87 | +# WDL (water years) |
| 88 | +download_wdl --syear 2020 --eyear 2024 --param flow --stations orm --dest <raw_dir> |
| 89 | + |
| 90 | +# HYCOM |
| 91 | +download_hycom --sdate 2024-01-01 --edate 2024-01-31 --raw_dest <hycom_raw_dir> --processed_dest <hycom_processed_dir> |
| 92 | + |
| 93 | +# HRRR |
| 94 | +download_hrrr --sdate 2024-01-01 --edate 2024-01-03 --dest <hrrr_raw_dir> |
| 95 | + |
| 96 | +# UCD IPM (positional dates) |
| 97 | +download_ucdipm 2024-01-01 2024-01-31 --stnkey 281 |
| 98 | + |
| 99 | +# CIMIS |
| 100 | +download_cimis --hourly --download --existing-dir <formatted_dir> |
| 101 | + |
| 102 | +# DCC gates |
| 103 | +download_dcc --base-dir <dcc_raw_dir> |
| 104 | + |
| 105 | +# Montezuma gates |
| 106 | +download_montezuma_gates --base-dir <montezuma_raw_dir> |
| 107 | + |
| 108 | +# SMSCG gates |
| 109 | +download_smscg --base-dir smscg --outfile dms_smscg_gate.csv |
| 110 | + |
| 111 | +# Mokelumne report conversion |
| 112 | +download_mokelumne --fname mokelumne_flow.csv --raw-dir <mokelumne_raw_dir> --converted-dir <formatted_dir> |
| 113 | +``` |
| 114 | + |
| 115 | +### Stage 2: Reformat raw -> formatted |
| 116 | + |
| 117 | +```bash |
| 118 | +# help |
| 119 | +reformat --help |
| 120 | + |
| 121 | +# from populate_tasks.bat-style flow |
| 122 | +reformat --inpath <raw_dir> --outpath <formatted_dir> |
| 123 | + |
| 124 | +# agency-limited run |
| 125 | +reformat --inpath <raw_dir> --outpath <formatted_dir> --agencies usgs --agencies noaa |
| 126 | +``` |
| 127 | + |
| 128 | +### Stage 2b: USGS multivariate cleanup on formatted |
| 129 | + |
| 130 | +```bash |
| 131 | +# help |
| 132 | +usgs_multi --help |
| 133 | + |
| 134 | +# from populate_tasks.bat-style flow |
| 135 | +usgs_multi --fpath <formatted_dir> |
| 136 | +``` |
| 137 | + |
| 138 | +### Stage 3: Auto screen formatted -> screened |
| 139 | + |
| 140 | +```bash |
| 141 | +# help |
| 142 | +auto_screen --help |
| 143 | + |
| 144 | +# full repo-style run |
| 145 | +auto_screen --fpath <formatted_dir> --dest <screened_dir> |
| 146 | + |
| 147 | +# targeted run |
| 148 | +auto_screen --fpath <formatted_dir> --dest <screened_dir> --stations sjj --params flow --plot-dest interactive |
| 149 | +``` |
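The stages of Workflow A can be chained in a single script. The following is a minimal sketch that uses only commands and flags shown above; `run`, `build_pipeline`, `DRY_RUN`, and the three directory variables are hypothetical names introduced for illustration. With `DRY_RUN=1` (the default in this sketch) it only prints the commands it would execute.

```bash
#!/usr/bin/env bash
# Sketch: chain download -> reformat -> usgs_multi -> auto_screen.
# RAW_DIR, FORMATTED_DIR, SCREENED_DIR, DRY_RUN, run() and build_pipeline()
# are illustrative names, not part of dms_datastore.
set -eu

RAW_DIR="${RAW_DIR:-raw}"
FORMATTED_DIR="${FORMATTED_DIR:-formatted}"
SCREENED_DIR="${SCREENED_DIR:-screened}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command; execute it only when DRY_RUN is 0.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

build_pipeline() {
  # Stage 1: download (one example downloader; add others as needed)
  run download_nwis --start 2024-01-01 --end 2024-01-31 --stations sjj --param 00060 --dest "$RAW_DIR"
  # Stage 2: reformat raw -> formatted
  run reformat --inpath "$RAW_DIR" --outpath "$FORMATTED_DIR"
  # Stage 2b: USGS multivariate cleanup
  run usgs_multi --fpath "$FORMATTED_DIR"
  # Stage 3: auto screen formatted -> screened
  run auto_screen --fpath "$FORMATTED_DIR" --dest "$SCREENED_DIR"
}

build_pipeline
```

Setting `DRY_RUN=0` executes the commands; because of `set -e`, the script then stops at the first stage that fails, so later stages never run on incomplete input.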
| 150 | + |
| 151 | +## Workflow B: Dropbox Ingest (separate workflow) |
| 152 | + |
| 153 | +```bash |
| 154 | +# help |
| 155 | +dropbox --help |
| 156 | + |
| 157 | +# run from YAML spec |
| 158 | +dropbox --input dms_datastore/config_data/dropbox_spec.yaml |
| 159 | +``` |
| 160 | + |
| 161 | +## Workflow C: Staging -> Repo Update and Utilities |
| 162 | + |
| 163 | +These commands handle comparisons, planned updates, and maintenance between staging and repository directories. |
| 164 | + |
| 165 | +```bash |
| 166 | +# help |
| 167 | +populate_repo --help |
| 168 | +compare_directories --help |
| 169 | +update_repo --help |
| 170 | +update_flagged_data --help |
| 171 | +``` |
| 172 | + |
| 173 | +```bash |
| 174 | +# populate staged raw data (as used in populate_tasks.bat) |
| 175 | +populate_repo --dest <raw_dir> |
| 176 | + |
| 177 | +# inventory for formatted set (as used in populate_tasks.bat) |
| 178 | +inventory --repo <formatted_dir> |
| 179 | + |
| 180 | +# compare staging vs repo (as used in populate_tasks.bat) |
| 181 | +compare_directories --base <repo_raw_dir> --compare <staging_raw_dir> --outfile compare_raw.txt |
| 182 | + |
| 183 | +# plan repo reconciliation |
| 184 | +update_repo <staging_formatted_dir> <repo_formatted_dir> --plan --out-actions update_plan.csv |
| 185 | + |
| 186 | +# apply repo reconciliation |
| 187 | +update_repo <staging_formatted_dir> <repo_formatted_dir> --apply |
| 188 | + |
| 189 | +# plan screened flag-aware update |
| 190 | +update_flagged_data <staging_screened_dir> <repo_screened_dir> --plan --out-actions flagged_plan.csv |
| 191 | + |
| 192 | +# apply screened flag-aware update |
| 193 | +update_flagged_data <staging_screened_dir> <repo_screened_dir> --apply |
| 194 | +``` |
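The plan and apply calls above pair naturally into a review-first routine. The following is a sketch only: `plan_then_apply` and the `APPLY` variable are hypothetical names, and it assumes that `--plan` writes the action list named by `--out-actions` without modifying the repository, which the examples above suggest but this document does not state.

```bash
# Sketch: always write a plan; apply only when APPLY=yes is set explicitly.
# plan_then_apply and APPLY are illustrative, not part of dms_datastore.
# Arguments: TOOL STAGING_DIR REPO_DIR PLAN_FILE
plan_then_apply() {
  local tool="$1" staging="$2" repo="$3" plan_file="$4"

  # Plan step (assumed to be read-only with respect to the repository).
  "$tool" "$staging" "$repo" --plan --out-actions "$plan_file"
  echo "Plan written to $plan_file; review it before applying."

  if [ "${APPLY:-no}" = "yes" ]; then
    "$tool" "$staging" "$repo" --apply
  fi
}

# Usage with the commands shown above:
#   plan_then_apply update_repo <staging_formatted_dir> <repo_formatted_dir> update_plan.csv
#   APPLY=yes plan_then_apply update_flagged_data <staging_screened_dir> <repo_screened_dir> flagged_plan.csv
```

Note that a call with `APPLY=yes` runs the plan step again before applying, so the plan file on disk is rewritten by that same run.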
| 195 | + |
| 196 | +## Additional Utilities (with help + concrete usage) |
| 197 | + |
| 198 | +```bash |
| 199 | +# station lookup |
| 200 | +station_info --help |
| 201 | +station_info jersey |
| 202 | +station_info --config |
| 203 | + |
| 204 | +# delete files from list |
| 205 | +delete_from_filelist --help |
| 206 | +delete_from_filelist --dpath <raw_dir> --filelist files_to_delete.txt |
| 207 | + |
| 208 | +# cache management |
| 209 | +data_cache --help |
| 210 | +data_cache --to-csv |
| 211 | +data_cache --clear |
| 212 | + |
| 213 | +# merge/splice timeseries |
| 214 | +merge_files --help |
| 215 | +merge_files --merge-type merge --order last --pattern "<formatted_dir>/usgs_*.csv" --pattern "<formatted_dir>/cdec_*.csv" --output merged.csv |
| 216 | + |
| 217 | +# coarsen a CSV time series |
| 218 | +coarsen --help |
| 219 | +coarsen input.csv output.csv --grid 15min --qwidth 0.05 --heartbeat-freq 120min |
| 220 | + |
| 221 | +# rationalize time partitions |
| 222 | +rationalize_time_partitions --help |
| 223 | +rationalize_time_partitions "<formatted_dir>/*.csv" --dry-run |
| 224 | +rationalize_time_partitions "<formatted_dir>/*.csv" --yaml dms_datastore/config_data/rationalize_time_partitions.yaml --root-dir <project_root> |
| 225 | +``` |