docs: copy-edit guides and fix demo-notebook narrative/code inconsistencies #321

Merged
thodson-usgs merged 4 commits into DOI-USGS:main from thodson-usgs:docs/notebook-and-doc-copyedit on Jun 10, 2026

@thodson-usgs

Collaborator

What

A documentation copy-edit pass plus fixes for narrative/code inconsistencies in the demo notebooks. All 20 docs notebooks were executed end-to-end against the live USGS Water Data API; every fix below was verified against live output before editing, and the edited notebooks were re-run to confirm they pass.

Notebook fixes (demos/)

One notebook failed to execute before this PR; the rest had narrative that contradicted their own output:

  • DiscreteSamples (was crashing): get_codes returns a (df, md) tuple, but five cells indexed the tuple with a column list (TypeError), and the prose claimed it "returns a plain DataFrame". Unpack the tuple; correct the claim.
  • SiteInfo: state_code="UT" returned an empty frame under an "all locations in a state" heading. state_code is a two-digit ANSI code → use "49" (Utah).
  • UnitValues — two notes said returned timestamps are "in local time"; they are tz-aware UTC. Removed a dead duplicate of Example 5.
  • Samples — "181 fields" → the default profile returns 187 columns; replaced six references to non-existent *_lookup() helpers with the real get_codes(code_service=...).
  • GroundwaterLevels — stale comment claimed partial dates "show up as NaT" in the index; the index is a plain RangeIndex and dates live in a normalized UTC time column. Relabeled a y-axis that mixed depth-below-surface with NGVD29/NAVD88 elevations.
  • Introduction: get_combined_metadata joins monitoring-location + time-series metadata, not "field-measurement metadata".
  • R vignette: nwis.get_water_use() is defunct (raises NameError); noted instead of shown as runnable.
  • SiteInventory — Example 3 duplicated Example 2 verbatim → repurposed as a skip_geometry=True demo.
  • peak_streamflow_trends — the live-data migration (ad4e980f) removed the CSV-load cell but left narrative cells describing it; rewrote them to match the live final_df, and corrected a chunker comment (RI's 350 gages return in a single request — verified).
  • Disambiguated two identical get_field_measurements() notebook titles (Surface-Water vs Groundwater-Level).
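
The DiscreteSamples failure mode can be reproduced without touching the API. What follows is a minimal sketch, not the library's code: `fake_get_codes` is a hypothetical stand-in that only mimics the `(df, md)` return shape described in the list, with a list of dicts in place of a DataFrame and made-up rows.

```python
def fake_get_codes():
    """Hypothetical stand-in: returns (table, metadata) like the (df, md) tuple."""
    table = [
        {"code": "00060", "name": "Discharge"},
        {"code": "00065", "name": "Gage height"},
    ]
    metadata = {"source": "stand-in, not a real API response"}
    return table, metadata


# Buggy pattern: indexing the returned tuple with a column list.
result = fake_get_codes()
try:
    result[["code", "name"]]
    error_type = None
except TypeError as exc:
    error_type = type(exc).__name__

# Fixed pattern: unpack the tuple first, then work with the table.
table, metadata = fake_get_codes()
codes = [row["code"] for row in table]

print(error_type)
print(codes)
```

A tuple accepts only integer or slice indices, so the column-list lookup fails before the table is ever reached; unpacking first avoids that.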

Prose docs

  • installing.rst: fixed the conda command (conda install -c conda-forge dataretrieval).
  • contributing.rst: version is derived from Git tags via setuptools_scm, so the "edit setup.py / conf.py" steps are obsolete → tag-based release note; Python "3.6, 3.7, 3.8" → "3.9 and later".
  • CONTRIBUTING.md: stale USGS-python/dataretrieval URLs → DOI-USGS/dataretrieval-python; blob/master → blob/main; Python versions; "interace" typo.
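
Because setuptools_scm derives the version from Git tags, there is no version string to edit by hand; at runtime the version can be read from the installed package metadata. A minimal sketch using only the standard library follows. `installed_version` is a hypothetical helper name, and the distribution name `dataretrieval` is assumed from the conda command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints a version string if the package is installed, otherwise None.
print(installed_version("dataretrieval"))
```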

Cleanup

  • Removed docs/source/examples/datasets/peak_discharge_trends.csv (535 KB) — leftover from the ad4e980f live-data migration, referenced nowhere in the repo.

Verification

  • 20/20 docs notebooks execute against the live API (1 previously failed → now passes); all edited notebooks re-executed clean.
  • Embedded .rst doctests (readme_examples, siteinfo_examples, timeconventions) verified exactly against live output.
  • pre-commit (ruff, nbstripout, eof) passes on all changed files.

🤖 Generated with Claude Code

thodson-usgs and others added 4 commits June 8, 2026 19:01
- installing.rst: fix the conda command (`conda install -c conda-forge
  dataretrieval`, not `conda -c conda-forge install ...`).
- contributing.rst: the package version is derived from Git tags by
  setuptools_scm, so the "edit the version in setup.py / conf.py" steps
  are obsolete — replace them with the tag-based release note. Bump the
  supported Python from "3.6, 3.7, 3.8" to "3.9 and later" (matches
  pyproject `requires-python` and the CI matrix: 3.9/3.13/3.14).
- CONTRIBUTING.md: update stale `USGS-python/dataretrieval` issue URLs to
  `DOI-USGS/dataretrieval-python`, `blob/master` -> `blob/main`, the
  Python version list to "3.9 and later", and fix an "interace" typo.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Every docs notebook now executes end-to-end against the live USGS Water
Data API; each fix below was verified against live output before editing.

- DiscreteSamples: `get_codes` returns a `(df, md)` tuple, but five cells
  indexed the tuple with a column list (raising TypeError) and the prose
  claimed it "returns a plain DataFrame". Unpack the tuple and correct the
  claim. (This notebook previously failed to execute.)
- SiteInfo: `state_code="UT"` returned an empty frame under an "all
  locations in a state" heading; `state_code` is a two-digit ANSI code, so
  use "49" (Utah).
- UnitValues: two notes claimed returned timestamps are "in local time" --
  they are tz-aware UTC. Removed a dead duplicate of Example 5.
- Samples: "181 fields" -> the default profile returns 187 columns;
  replaced six references to nonexistent `*_lookup()` helpers with the
  real `get_codes(code_service=...)`.
- GroundwaterLevels: stale comment said partial dates "show up as NaT" in
  the index -- the index is a plain RangeIndex and dates live in a
  normalized UTC `time` column; print that instead. Relabel a y-axis that
  mixed depth-below-surface with NGVD29/NAVD88 elevations.
- Introduction: `get_combined_metadata` joins monitoring-location and
  time-series metadata, not "field-measurement metadata".
- R vignette: `nwis.get_water_use()` is defunct (raises NameError); note
  that instead of presenting it as runnable.
- SiteInventory: Example 3 duplicated Example 2 verbatim -- repurpose it as
  a `skip_geometry=True` demonstration.
- peak_streamflow_trends: the live-data migration (ad4e980) removed the
  CSV-load cell but left narrative cells describing it; rewrite them to
  describe the live `final_df`, and correct the chunker comment (Rhode
  Island's 350 gages return in a single request).
- Disambiguate the two identical `get_field_measurements()` notebook
  titles (Surface-Water vs Groundwater-Level).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
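
The UnitValues correction (timestamps are tz-aware UTC, not local time) matters when readers convert times themselves. A minimal standard-library sketch follows; the timestamp and the fixed UTC-6 offset are illustrative assumptions, not values from the notebooks.

```python
from datetime import datetime, timedelta, timezone

# A tz-aware UTC timestamp, like those the corrected notes describe.
utc_stamp = datetime(2026, 6, 1, 18, 0, tzinfo=timezone.utc)

# Assumed fixed offset for illustration only (UTC-6); real local time
# would need a time-zone database to handle daylight saving.
local_zone = timezone(timedelta(hours=-6))
local_stamp = utc_stamp.astimezone(local_zone)

print(utc_stamp.isoformat())    # 2026-06-01T18:00:00+00:00
print(local_stamp.isoformat())  # 2026-06-01T12:00:00-06:00
```

Both values name the same instant; only the displayed offset differs, which is why a tz-aware UTC value should not be labeled as local time.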
Commit ad4e980 switched the peak-streamflow-trends demo to pull data
live and deleted the `demos/datasets/` copy of this cached CSV, but a
second copy under docs/ was missed. It is referenced nowhere in the repo,
so remove it.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Follow-up to the narrative fix: instead of leaving a commented basemap
cell that referenced legacy NWIS coordinate columns (`dec_lat_va` /
`dec_long_va`) from the retired CSV, keep the monitoring-location
`geometry` (drop `skip_geometry=True`) so `final_df` is a GeoDataFrame,
and map the Rhode Island results directly with geopandas + matplotlib
(increasing trends red, decreasing blue). No basemap dependency, and the
map renders live in the doc build. Verified end-to-end against the API.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
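
The color rule in the map commit (increasing trends red, decreasing blue) can be sketched on its own, without geopandas or matplotlib. `trend_color` is a hypothetical helper, the slope values are made up, and the gray fallback for a zero slope is an assumption, since the commit message does not say how such points are drawn.

```python
def trend_color(slope):
    """Map a trend slope to a marker color: increasing red, decreasing blue."""
    if slope > 0:
        return "red"
    if slope < 0:
        return "blue"
    return "gray"  # assumption: the source does not say how zero is drawn


slopes = [0.8, -1.2, 0.0]  # made-up values, not real results
colors = [trend_color(s) for s in slopes]
print(colors)
```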
@thodson-usgs thodson-usgs marked this pull request as ready for review June 10, 2026 14:06
@thodson-usgs thodson-usgs merged commit 4b7464f into DOI-USGS:main Jun 10, 2026
9 checks passed
@thodson-usgs thodson-usgs deleted the docs/notebook-and-doc-copyedit branch June 10, 2026 14:06