Commit c499431

Eli authored and committed
Revise copilot instructions to point to AGENTS.md
1 parent 707796f commit c499431

1 file changed

Lines changed: 6 additions & 80 deletions

.github/copilot-instructions.md

@@ -1,10 +1,13 @@
 # dms_datastore — Workspace Instructions
 
-## Project Overview
+Follow organization standards from BayDeltaSCHISM https://raw.githubusercontent.com/CADWRDeltaModeling/BayDeltaSCHISM/refs/heads/master/AGENTS.md
 
-`dms_datastore` is a Python library and CLI toolkit for the Delta Modeling Section (DMS) that downloads, formats, screens, and manages continuous time-series data from water-quality and hydrological agencies (USGS, CDEC, NOAA, NCRO, DES, etc.). Data flows through four stages: **raw → formatted → screened → processed**.
+Follow local project rules in AGENTS.md.
 
-## Build and Test
+Local project rules override organization defaults.
+
+
+# Build and Test
 
 The `dms_datastore` conda environment is assumed to exist. Always activate it before running any tests or install commands.
 
@@ -13,84 +16,7 @@ The `dms_datastore` conda environment is assumed to exist. Always activate it be
 conda activate dms_datastore
 pip install --no-deps -e .
 
-# Unit/integration tests (no real repo required)
-conda activate dms_datastore && pytest
-
-# Integration tests against a real repository
-conda activate dms_datastore && pytest test_repo/ --repo=<path_to_repo>
-
-# Single file
-conda activate dms_datastore && pytest tests/test_filename.py
 ```
 
 pytest is configured in `pyproject.toml` (`[tool.pytest.ini_options]`): strict markers, JUnit XML output, ignores `setup.py` and `build/`.
 
-## Architecture
-
-| Layer | Modules | Purpose |
-|---|---|---|
-| Public API | `__init__.py` | Re-exports `read_ts_repo`, `read_ts`, `write_ts_csv` |
-| CLI | `__main__.py` | Click group `dms` aggregating all subcommands |
-| Config | `dstore_config.py`, `config_data/dstore_config.yaml` | Repo roots, station DBs, variable/source mappings |
-| File naming | `filename.py` | Parse/render filenames via `interpret_fname` / `meta_to_filename` |
-| I/O | `read_ts.py`, `write_ts.py` | Low-level CSV read/write with YAML front-matter |
-| Multi-file read | `read_multi.py` | `read_ts_repo` — resolves source priority, merges year-sharded files |
-| Download | `download_*.py` | One module per agency (CDEC, NWIS, NOAA, NCRO, DES, HRRR, HYCOM, …) |
-| Pipeline | `populate_repo.py`, `update_repo.py` | Orchestrate download → format → screen |
-| QA/QC | `auto_screen.py`, `screeners.py` | YAML-driven screening; flags stored as `user_flag` column |
-| Utilities | `inventory.py`, `merge_files.py`, `coarsen_file.py`, `rationalize_time_partitions.py`, `reconcile_data.py` | Repo maintenance |
-
-## File Naming Convention
-
-Pattern: `{agency}_{station_id@subloc}_{agency_id}_{variable}_{syear}_{eyear}.csv`
-
-- `@subloc` is omitted when subloc is `default`/`None`
-- End year `9999` means open-ended (actively updated)
-- `variable@modifier` encodes e.g. `ec@daily`
-
-Examples:
-- `usgs_anh@north_11303500_flow_2024.csv`
-- `cdec_sac_11447650_flow_2020_9999.csv`
-
-See [dms_datastore/filename.py](../dms_datastore/filename.py) for `meta_to_filename` / `interpret_fname`.
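Review note: the naming pattern removed above can be sanity-checked with a minimal Python sketch. This is not the `interpret_fname` implementation in `dms_datastore/filename.py`; the function name `parse_name`, the regular expression, and the rule that no field contains an underscore are all assumptions made for illustration. The trailing end year is treated as optional only so that both listed examples parse.

```python
import re
from typing import Dict

# Hypothetical pattern: assumes no field contains an underscore.
_NAME_RE = re.compile(
    r"^(?P<agency>[^_]+)_"
    r"(?P<station_id>[^_@]+)(?:@(?P<subloc>[^_]+))?_"
    r"(?P<agency_id>[^_]+)_"
    r"(?P<variable>[^_@]+)(?:@(?P<modifier>[^_]+))?_"
    r"(?P<syear>\d{4})(?:_(?P<eyear>\d{4}))?\.csv$"
)


def parse_name(fname: str) -> Dict[str, object]:
    """Split a repository file name into its fields (illustrative only)."""
    match = _NAME_RE.match(fname)
    if match is None:
        raise ValueError(f"not a recognized file name: {fname!r}")
    meta: Dict[str, object] = dict(match.groupdict())
    meta["syear"] = int(match.group("syear"))
    # Assumption: a missing end year is reported as None.
    eyear = match.group("eyear")
    meta["eyear"] = int(eyear) if eyear else None
    # End year 9999 marks an open-ended, actively updated series.
    meta["open_ended"] = meta["eyear"] == 9999
    return meta


if __name__ == "__main__":
    print(parse_name("cdec_sac_11447650_flow_2020_9999.csv"))
    print(parse_name("usgs_anh@north_11303500_flow_2024.csv"))
```

Optional parts (`@subloc`, `@modifier`) come back as `None` when absent, which mirrors the rule that `@subloc` is omitted for the default sublocation.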
-
-## Data File Format
-
-CSV files with `#`-commented YAML front-matter:
-
-```csv
-# format: dwr-dms-1.0
-# date_formatted: 2024-01-15T12:00:00
-# source_info:
-# siteName: MOKELUMNE R A ANDRUS ISLAND
-datetime,value,user_flag
-2020-01-01 00:00:00,1.5,0
-```
-
-- Index column: `datetime`
-- Always two data columns: `value` (float) and `user_flag` (`Int64`, nullable)
-- `user_flag != 0` → anomalous; masked by `read_ts` by default (`read_flagged=True`)
-- Files are year-sharded; wildcards handled automatically by `read_ts`
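Review note: the file layout removed above (commented header, then a `datetime,value,user_flag` table) can be illustrated with a standard-library sketch. It is not the project's `read_ts`; the names `split_front_matter`, `read_rows`, and `mask_flagged` are hypothetical, and the second data row in the sample is invented to show masking. The header is returned as plain text, which a YAML parser could then load.

```python
import csv
import io
from typing import List, Optional, Tuple

Row = Tuple[str, Optional[float], Optional[int]]


def split_front_matter(text: str) -> Tuple[str, str]:
    """Separate leading '#' comment lines (the header) from the CSV body."""
    lines = text.splitlines()
    n = 0
    while n < len(lines) and lines[n].startswith("#"):
        n += 1
    # Drop the '#' and one following space, keeping any deeper indentation.
    header = "\n".join(
        line[2:] if line.startswith("# ") else line[1:] for line in lines[:n]
    )
    return header, "\n".join(lines[n:])


def read_rows(text: str, mask_flagged: bool = True) -> Tuple[str, List[Row]]:
    """Return (header_text, rows); values with a nonzero flag become None."""
    header, body = split_front_matter(text)
    rows: List[Row] = []
    for rec in csv.DictReader(io.StringIO(body)):
        # Assumption: an empty user_flag cell means "no flag recorded".
        flag = int(rec["user_flag"]) if rec["user_flag"] else None
        value = float(rec["value"]) if rec["value"] else None
        if mask_flagged and flag:
            value = None
        rows.append((rec["datetime"], value, flag))
    return header, rows


# The last row is hypothetical, added only to demonstrate masking.
SAMPLE = """\
# format: dwr-dms-1.0
# date_formatted: 2024-01-15T12:00:00
datetime,value,user_flag
2020-01-01 00:00:00,1.5,0
2020-01-01 00:15:00,9.9,1
"""

if __name__ == "__main__":
    header_text, data = read_rows(SAMPLE)
    print(header_text)
    print(data)
```

Passing `mask_flagged=False` keeps flagged values so they can be inspected alongside their flags.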
-
-## Key Conventions
-
-- **Station IDs with sublocation**: `station_id@subloc` (e.g. `anh@north`, `msd@bottom`)
-- **Variables with modifier**: `param@modifier` (e.g. `ec@daily`)
-- **Units**: SI for most variables; stage/flow in ft / cfs; salinity as specific conductivity at 25°C (µS/cm)
-- **Source priority** is declared per agency in `dstore_config.yaml` and resolved by `read_ts_repo` — do not hard-code provider preferences in code
-- **Config paths** are resolved by `dstore_config.config_file(label)` — checks cwd first, then `config_data/`
-- New download modules must register as a Click command in `__main__.py` and add an entry point in `pyproject.toml`
-
-## Tests
-
-- `tests/` — unit and integration tests with monkeypatched config; no real repo needed
-- `test_repo/` — integration tests; pass `--repo=<path>` to pytest
-- Use `tmp_path` and `monkeypatch` for config isolation
-- Do not couple unit tests to the shared repo path
-
-## Key Reference Files
-
-- [README.md](../README.md) — full data model, flags, units, configuration system
-- [README-dropbox.md](../README-dropbox.md) — Dropbox data ingestion via `dropbox_spec.yaml`
-- [README-commands.md](../README-commands.md) — CLI command reference
-- [dms_datastore/config_data/dstore_config.yaml](../dms_datastore/config_data/dstore_config.yaml) — central config
