Commit 906c859

thodson-usgs and claude authored
chore: add AGENTS.md and switch pre-commit to ruff-only (DOI-USGS#293)
* chore(lint): replace flake8/prettier configs with ruff-only pre-commit

CI already lints with ruff exclusively (python-package.yml), so .flake8 and .prettierrc.toml were dead config. Rewrite .pre-commit-config.yaml to drop black, blackdoc, flake8, isort, prettier, pyupgrade, and double-quote-string-fixer (all superseded by ruff check + ruff format) and add astral-sh/ruff-pre-commit so local pre-commit matches CI.

Exclude tests/data/ from trailing-whitespace, end-of-file-fixer, and mixed-line-ending; those fixtures are byte-exact API response captures (RDB/TSV with significant trailing tabs) and must not be normalized.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: add AGENTS.md with repo scope, notebook index, and waterdata notes

Brief guide for agents working in this repo: scope and excluded paths, example notebook index (demos/*.ipynb plus demos/hydroshare/*.ipynb), environment/lint/test/docs commands, testing gotchas (pytest-httpx fixture, Python <3.10 skips), and waterdata implementation notes (httpx client, kwarg-to-API spelling translation, OGC byte-limit auto-chunking in dataretrieval/waterdata/chunking.py).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: fix latent trailing whitespace and missing EOF newlines

Auto-fixes from the new pre-commit hooks (trailing-whitespace, end-of-file-fixer). Touched files are docs (.rst, .nblink), the README code snippets, .gitignore, the nwqn Dockerfile, and the py.typed marker. No behavior changes; this clears the slate so future commits don't trip the hook on day one.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 569ff38 commit 906c859

32 files changed

Lines changed: 77 additions & 69 deletions

.flake8

Lines changed: 0 additions & 3 deletions
This file was deleted.

.gitignore

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -111,4 +111,4 @@ ENV/
 .mypy_cache/
 
 # macOS
-*.DS_Store
+*.DS_Store

.pre-commit-config.yaml

Lines changed: 12 additions & 33 deletions
@@ -6,44 +6,23 @@ repos:
   - repo: https://github.com/pre-commit/pre-commit-hooks
     rev: v4.4.0
     hooks:
+      # tests/data/ holds byte-exact API response captures (TSV/RDB with
+      # significant trailing tabs, plus pinned JSON). Skip whitespace and
+      # EOL normalization there so we don't silently change fixture bytes.
       - id: trailing-whitespace
+        exclude: ^tests/data/
       - id: end-of-file-fixer
+        exclude: ^tests/data/
+      - id: mixed-line-ending
+        exclude: ^tests/data/
       - id: check-docstring-first
       - id: check-json
       - id: check-yaml
-      - id: double-quote-string-fixer
       - id: debug-statements
-      - id: mixed-line-ending
-
-  - repo: https://github.com/asottile/pyupgrade
-    rev: v3.3.1
-    hooks:
-      - id: pyupgrade
-        args:
-          - '--py38-plus'
-
-  - repo: https://github.com/psf/black
-    rev: 23.3.0
-    hooks:
-      - id: black
-      - id: black-jupyter
-
-  - repo: https://github.com/keewis/blackdoc
-    rev: v0.3.8
-    hooks:
-      - id: blackdoc
-
-  - repo: https://github.com/PyCQA/flake8
-    rev: 6.0.0
-    hooks:
-      - id: flake8
-
-  - repo: https://github.com/PyCQA/isort
-    rev: 5.12.0
-    hooks:
-      - id: isort
 
-  - repo: https://github.com/pre-commit/mirrors-prettier
-    rev: v3.0.0-alpha.6
+  - repo: https://github.com/astral-sh/ruff-pre-commit
+    rev: v0.15.15
     hooks:
-      - id: prettier
+      - id: ruff-check
+        args: [--fix]
+      - id: ruff-format
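
The `exclude: ^tests/data/` entries in this hunk are regular expressions that pre-commit applies to repo-relative file paths. A minimal sketch of how such a pattern behaves is shown here, assuming search-style matching against forward-slash paths; the helper name `is_excluded` and the sample paths are hypothetical and are not part of the commit.

```python
import re

# Pattern copied from the `exclude:` entries in the hunk; everything else
# in this sketch (function name, sample paths) is illustrative only.
EXCLUDE = re.compile(r"^tests/data/")


def is_excluded(path: str) -> bool:
    """Return True if a repo-relative path would be skipped by the hook."""
    return EXCLUDE.search(path) is not None


if __name__ == "__main__":
    for sample in (
        "tests/data/example_fixture.txt",   # under tests/data/: skipped
        "tests/waterdata_test.py",          # test module: still checked
        "docs/tests/data/notes.rst",        # anchor ^ keeps nested dirs in scope
    ):
        print(f"{sample}: excluded={is_excluded(sample)}")
```

Because the pattern is anchored with `^`, only the top-level `tests/data/` directory is exempt; a similarly named directory elsewhere in the tree would still be normalized.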

.prettierrc.toml

Lines changed: 0 additions & 2 deletions
This file was deleted.

AGENTS.md

Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
+# AGENTS.md
+
+## Scope
+- Python code is in `dataretrieval/`; `dataretrieval/waterdata/` is the modern USGS Water Data API, `dataretrieval/nwis.py` is legacy/deprecated.
+- `R/dataRetrieval/` is the R project copy; leave it alone unless the task asks for R work.
+- Exclude `.claude/worktrees/` from searches and edits; it contains stale worktrees that pollute results.
+
+## Example Notebooks
+- `demos/*.ipynb` — top-level Water Data tour: `USGS_WaterData_Introduction_Examples.ipynb` is the entry point; `_ContinuousData_`, `_DailyStatistics_`, `_DiscreteSamples_`, `_ReferenceLists_` cover individual collections; `WaterData_demo.ipynb`, `peak_streamflow_trends.ipynb`, and `R Python Vignette equivalents.ipynb` are standalone walkthroughs.
+- `demos/hydroshare/*.ipynb` — per-service HydroShare examples (NLDI, NWIS WaterUse, and Water Data DailyValues / GroundwaterLevels / Measurements / ParameterCodes / Peaks / Ratings / Samples / SiteInfo / SiteInventory / Statistics / UnitValues). Mirror these when adding examples for a new collection.
+- `demos/nwqn_data_pull/` — non-notebook example: a lithops/Docker batch pipeline (`retrieve_nwqn_samples.py`, `retrieve_nwqn_streamflow.py`) with its own `README.md`.
+- Any `Untitled*.ipynb`, `*_test.ipynb`, or notebooks not listed here are untracked local scratch; ignore them.
+
+## Environment
+- Use `pip install .[test,nldi]` (CI uses pip, not uv despite `uv.lock`). Docs: `pip install .[doc,nldi]`.
+
+## Commands
+- Lint: `ruff check .` and `ruff format --check .`.
+- Tests: `coverage run -m pytest tests/ && coverage report -m`, or focused like `pytest tests/waterdata_test.py::test_mock_get_samples`.
+- Docs: install docs deps, `ipython kernel install --name "python3" --user`, then `make html` from `docs/`. `make docs` adds doctest+linkcheck (network-dependent).
+
+## Testing Gotchas
+- Tests mock HTTP with `pytest-httpx`'s `httpx_mock` fixture and fixtures under `tests/data/`; keep new API tests offline. `tests/conftest.py` relaxes the fixture's strict-mode defaults (unused mocks and unmocked requests are tolerated) so rerun-on-failure works.
+- `tests/nwis_test.py::test_nwis_service_live` hits live NWIS.
+- `tests/nadp_test.py` is module-skipped (NADP deprecated).
+- `tests/waterdata_test.py` and `tests/waterdata_ratings_test.py` skip on Python <3.10, so a 3.9 run does not cover them.
+
+## Implementation Notes
+- HTTP client is `httpx` (migrated from `requests` in #289); new code should use `httpx` and tests should mock with `httpx_mock`.
+- Public download helpers return `(DataFrame, metadata)`.
+- `dataretrieval/__init__.py` star-imports service modules; `dataretrieval/waterdata/__init__.py` controls Water Data exports via `__all__`.
+- `dataretrieval.waterdata.utils._default_headers()` adds `X-Api-Key` from `API_USGS_PAT`; never hard-code tokens in examples or tests.
+- Water Data request builders translate Python kwargs to API spellings (`skip_geometry` -> `skipGeometry`, `filter_lang` -> `filter-lang`); tests assert exact URLs/query params.
+- Multi-value OGC params are comma-joined GETs, except `monitoring-locations` which POSTs CQL2 JSON. The OGC edge WAF caps total request bytes (URL + body) at ~8200, so `dataretrieval/waterdata/chunking.py` auto-splits oversized queries across sub-requests (both GET and POST paths); preserve this when adding new list-shaped kwargs.
+- NLDI requires `geopandas` at import time (`pip install .[nldi]`); other modules fall back to pandas when geopandas is absent.
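
The Implementation Notes in AGENTS.md describe two behaviors that a small sketch may make easier to picture: renaming Python kwargs to API spellings, and splitting a long comma-joined list so each sub-request stays under a byte budget. The code below is an illustration only and is not the code in `dataretrieval/waterdata/chunking.py`. The names `API_SPELLINGS`, `to_api_params`, and `chunk_values` are hypothetical, the lowercase rendering of booleans is an assumption, and `budget` here covers only the joined value, so a caller would first subtract the rest of the URL and body from the ~8200-byte cap.

```python
from typing import Iterator

# Hypothetical lookup table; only these two spellings are named in AGENTS.md.
API_SPELLINGS = {"skip_geometry": "skipGeometry", "filter_lang": "filter-lang"}


def to_api_params(**kwargs) -> dict[str, str]:
    """Rename kwargs to API spellings and comma-join list-shaped values."""
    params: dict[str, str] = {}
    for name, value in kwargs.items():
        if value is None:
            continue  # unset kwargs are left out of the query
        key = API_SPELLINGS.get(name, name)  # unknown names pass through
        if isinstance(value, bool):
            value = "true" if value else "false"  # assumed spelling
        elif isinstance(value, (list, tuple)):
            value = ",".join(str(v) for v in value)
        params[key] = str(value)
    return params


def chunk_values(values: list[str], budget: int) -> Iterator[list[str]]:
    """Greedily group values so each comma-joined chunk fits in `budget` bytes.

    A single value larger than the budget is emitted alone rather than dropped.
    """
    chunk: list[str] = []
    size = 0
    for value in values:
        n = len(value.encode("utf-8"))
        extra = n + (1 if chunk else 0)  # one byte for the joining comma
        if chunk and size + extra > budget:
            yield chunk
            chunk, size = [value], n
        else:
            chunk.append(value)
            size += extra
    if chunk:
        yield chunk


if __name__ == "__main__":
    print(to_api_params(skip_geometry=True, filter_lang="cql-text"))
    sites = ["USGS-01646500", "USGS-01646501", "USGS-01646502"]
    for part in chunk_values(sites, budget=30):
        print(",".join(part))
```

A greedy split keeps sub-request counts low without reordering values, so concatenating the per-chunk results in order preserves the order the caller asked for.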

README.md

Lines changed: 4 additions & 4 deletions
@@ -59,7 +59,7 @@ from dataretrieval import waterdata
 
 # Get daily streamflow data (returns DataFrame and metadata)
 df, metadata = waterdata.get_daily(
-    monitoring_location_id='USGS-01646500',
+    monitoring_location_id='USGS-01646500',
     parameter_code='00060', # Discharge
     time='2024-10-01/2025-09-30'
 )
@@ -98,7 +98,7 @@ windows to avoid timeouts and other issues:
 ```python
 # Get continuous data for a single monitoring location and water year
 df, metadata = waterdata.get_continuous(
-    monitoring_location_id='USGS-01646500',
+    monitoring_location_id='USGS-01646500',
     parameter_code='00065', # Gage height
     time='2024-10-01/2025-09-30'
 )
@@ -152,7 +152,7 @@ from dataretrieval import nldi
 # Get watershed basin for a stream reach
 basin = nldi.get_basin(
     feature_source='comid',
-    feature_id='13293474' # NHD reach identifier
+    feature_id='13293474' # NHD reach identifier
 )
 
 print(f"Basin contains {len(basin)} feature(s)")
@@ -184,7 +184,7 @@ print(f"Found {len(flowlines)} upstream tributaries within 50km")
 ### Legacy NWIS Services (Deprecated)
 - **Daily values (dv)**: Legacy daily statistical data
 - **Instantaneous values (iv)**: Legacy continuous data
-- **Site info (site)**: Basic site information
+- **Site info (site)**: Basic site information
 - **Statistics (stat)**: Statistical summaries
 - **Discharge peaks (peaks)**: Annual peak discharge events
 

dataretrieval/py.typed

Lines changed: 0 additions & 1 deletion
@@ -1 +0,0 @@
-

demos/nwqn_data_pull/Dockerfile_dataretrieval

Lines changed: 1 addition & 1 deletion
@@ -56,4 +56,4 @@ COPY requirements.txt requirements.txt
 RUN pip install --no-cache-dir -r requirements.txt
 
 ENTRYPOINT [ "/usr/local/bin/python", "-m", "awslambdaric" ]
-CMD [ "handler.entry_point.lambda_handler" ]
+CMD [ "handler.entry_point.lambda_handler" ]
Lines changed: 1 addition & 1 deletion
@@ -1,3 +1,3 @@
 {
   "path": "../../../demos/hydroshare/USGS_WaterData_DailyValues_Examples.ipynb"
-}
+}
Lines changed: 1 addition & 1 deletion
@@ -1,3 +1,3 @@
 {
   "path": "../../../demos/hydroshare/USGS_WaterData_GroundwaterLevels_Examples.ipynb"
-}
+}
