Commit 4b7464f

thodson-usgs and claude authored
docs: copy-edit guides and fix demo-notebook narrative/code inconsistencies (#321)
- installing.rst: fix the conda command (`conda install -c conda-forge dataretrieval`, not `conda -c conda-forge install ...`).
- contributing.rst: the package version is derived from Git tags by setuptools_scm, so the "edit the version in setup.py / conf.py" steps are obsolete -- replace them with the tag-based release note. Bump the supported Python from "3.6, 3.7, 3.8" to "3.9 and later" (matches pyproject `requires-python` and the CI matrix: 3.9/3.13/3.14).
- CONTRIBUTING.md: update stale `USGS-python/dataretrieval` issue URLs to `DOI-USGS/dataretrieval-python`, `blob/master` -> `blob/main`, the Python version list to "3.9 and later", and fix an "interace" typo.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs(demos): fix notebook narrative and code inconsistencies

Every docs notebook now executes end-to-end against the live USGS Water Data API; each fix below was verified against live output before editing.

- DiscreteSamples: `get_codes` returns a `(df, md)` tuple, but five cells indexed the tuple with a column list (raising TypeError) and the prose claimed it "returns a plain DataFrame". Unpack the tuple and correct the claim. (This notebook previously failed to execute.)
- SiteInfo: `state_code="UT"` returned an empty frame under an "all locations in a state" heading; `state_code` is a two-digit ANSI code, so use "49" (Utah).
- UnitValues: two notes claimed returned timestamps are "in local time" -- they are tz-aware UTC. Removed a dead duplicate of Example 5.
- Samples: "181 fields" -> the default profile returns 187 columns; replaced six references to nonexistent `*_lookup()` helpers with the real `get_codes(code_service=...)`.
- GroundwaterLevels: stale comment said partial dates "show up as NaT" in the index -- the index is a plain RangeIndex and dates live in a normalized UTC `time` column; print that instead. Relabel a y-axis that mixed depth-below-surface with NGVD29/NAVD88 elevations.
- Introduction: `get_combined_metadata` joins monitoring-location and time-series metadata, not "field-measurement metadata".
- R vignette: `nwis.get_water_use()` is defunct (raises NameError); note that instead of presenting it as runnable.
- SiteInventory: Example 3 duplicated Example 2 verbatim -- repurpose it as a `skip_geometry=True` demonstration.
- peak_streamflow_trends: the live-data migration (ad4e980) removed the CSV-load cell but left narrative cells describing it; rewrite them to describe the live `final_df`, and correct the chunker comment (Rhode Island's 350 gages return in a single request).
- Disambiguate the two identical `get_field_measurements()` notebook titles (Surface-Water vs Groundwater-Level).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
1 parent be0f072 commit 4b7464f

14 files changed

Lines changed: 104 additions & 2989 deletions

CONTRIBUTING.md

Lines changed: 8 additions & 8 deletions
@@ -36,7 +36,7 @@ however writing code is not the only way to contribute.

 ### Reporting Bugs

-Report bugs at https://github.com/USGS-python/dataretrieval/issues.
+Report bugs at https://github.com/DOI-USGS/dataretrieval-python/issues.

 When reporting a bug, please include:

@@ -49,7 +49,7 @@ When reporting a bug, please include:

 ### Fixing Bugs

-Look through the GitHub [issues](https://github.com/USGS-python/dataretrieval/issues)
+Look through the GitHub [issues](https://github.com/DOI-USGS/dataretrieval-python/issues)
 for known and unresolved bugs. Any issues labeled "bug" that are unassigned,
 are open for resolution. You are welcome to comment in the relevant issue to
 state your intention to resolve the bug, which will help ensure there is no
@@ -68,7 +68,7 @@ your fork, to the original upstream repository.

 ### Implementing Features

-Look through the GitHub [issues](https://github.com/USGS-python/dataretrieval/issues)
+Look through the GitHub [issues](https://github.com/DOI-USGS/dataretrieval-python/issues)
 for outstanding feature requests. Anything tagged with "enhancement"
 and "please-help" is open to whomever wants to implement it.

@@ -83,8 +83,8 @@ Before you submit a pull request, check that it meets these guidelines:
 2. If the pull request adds or modifies functionality, the documentation should
 be updated. To do so, either add or modify a functions docstring which will
 automatically become part of the API documentation
-3. The pull request should work for Python 3.7, 3.8, 3.9, 3.10 - refer to the
-[python-package.yml file](https://github.com/USGS-python/dataretrieval/blob/master/.github/workflows/python-package.yml)
+3. The pull request should work for Python 3.9 and later - refer to the
+[python-package.yml file](https://github.com/DOI-USGS/dataretrieval-python/blob/main/.github/workflows/python-package.yml)
 for the latest versions of Python being tested by the continuous integration
 pipelines. This will be checked automatically by the CI pipelines once the
 pull request is opened.
@@ -98,7 +98,7 @@ via any automated processes or pipelines.
 #### Style

 * Attempt to write code following the [PEP8 style guidelines](https://peps.python.org/pep-0008/) as much as possible
-* The public interace should emphasize functions over classes; however, classes
+* The public interface should emphasize functions over classes; however, classes
 can and should be used internally and in tests
 * Functions for downloading data from a specific web portal must be grouped
 within their own submodule
@@ -202,7 +202,7 @@ will need to do the following (in a separate branch of the repository):
 ### Submitting Feedback

 The best way to send feedback is to open an issue at
-https://github.com/USGS-python/dataretrieval/issues.
+https://github.com/DOI-USGS/dataretrieval-python/issues.

 Please be as clear as possible in your feedback, if you are reporting a bug
 refer to [Reporting Bugs](#reporting-bugs).
@@ -211,7 +211,7 @@ refer to [Reporting Bugs](#reporting-bugs).
 ### Feature Requests

 To request or propose a new feature, open an issue at
-https://github.com/USGS-python/dataretrieval/issues.
+https://github.com/DOI-USGS/dataretrieval-python/issues.

 Please be sure to:
 * Explain in detail how it would work, possibly with pseudo-code or an example

demos/R Python Vignette equivalents.ipynb

Lines changed: 1 addition & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -307,17 +307,8 @@
307307
" header # the response headers\n",
308308
"```\n",
309309
"\n",
310-
"Note: USGS *water use* data has no Water Data API equivalent yet, so it remains available only through the deprecated `nwis` module:\n",
311-
"\n",
312-
"```\n",
313-
"national, md = nwis.get_water_use()\n",
314-
"```"
310+
"Note: USGS *water use* data has no Water Data API equivalent yet. The legacy `nwis.get_water_use()` service has been decommissioned and now raises a \"defunct\" error, so there is currently no runnable way to retrieve water-use data through `dataretrieval`."
315311
]
316-
},
317-
{
318-
"cell_type": "markdown",
319-
"metadata": {},
320-
"source": []
321312
}
322313
],
323314
"metadata": {

demos/USGS_WaterData_DiscreteSamples_Examples.ipynb

Lines changed: 11 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -331,8 +331,9 @@
331331
"## Additional query parameters\n",
332332
"\n",
333333
"Several parameters narrow the results further. The allowable values for the\n",
334-
"categorical ones come from `get_codes`. Note that `get_codes` returns a plain\n",
335-
"`DataFrame` (no metadata tuple).\n",
334+
"categorical ones come from `get_codes`, which — like the other `waterdata`\n",
335+
"functions — returns a `(DataFrame, metadata)` tuple; we unpack it and keep the\n",
336+
"DataFrame.\n",
336337
"\n",
337338
"### `siteTypeCode` / `siteTypeName`"
338339
]
@@ -344,7 +345,7 @@
344345
"metadata": {},
345346
"outputs": [],
346347
"source": [
347-
"site_type_info = waterdata.get_codes(code_service=\"sitetype\")\n",
348+
"site_type_info, _ = waterdata.get_codes(code_service=\"sitetype\")\n",
348349
"site_type_info[[\"typeCode\", \"typeLongName\"]].head(10)"
349350
]
350351
},
@@ -365,7 +366,8 @@
365366
"metadata": {},
366367
"outputs": [],
367368
"source": [
368-
"waterdata.get_codes(code_service=\"samplemedia\")[\"activityMedia\"].tolist()"
369+
"media, _ = waterdata.get_codes(code_service=\"samplemedia\")\n",
370+
"media[\"activityMedia\"].tolist()"
369371
]
370372
},
371373
{
@@ -386,7 +388,8 @@
386388
"metadata": {},
387389
"outputs": [],
388390
"source": [
389-
"waterdata.get_codes(code_service=\"characteristicgroup\")[\"characteristicGroup\"].tolist()"
391+
"char_groups, _ = waterdata.get_codes(code_service=\"characteristicgroup\")\n",
392+
"char_groups[\"characteristicGroup\"].tolist()"
390393
]
391394
},
392395
{
@@ -407,7 +410,7 @@
407410
"metadata": {},
408411
"outputs": [],
409412
"source": [
410-
"characteristic_info = waterdata.get_codes(code_service=\"characteristics\")\n",
413+
"characteristic_info, _ = waterdata.get_codes(code_service=\"characteristics\")\n",
411414
"print(\"unique characteristic names:\")\n",
412415
"print(characteristic_info[\"characteristicName\"].drop_duplicates().head().tolist())\n",
413416
"print(\"\\nexample USGS parameter codes:\")\n",
@@ -432,7 +435,8 @@
432435
"metadata": {},
433436
"outputs": [],
434437
"source": [
435-
"waterdata.get_codes(code_service=\"observedproperty\")[\"observedProperty\"].head().tolist()"
438+
"observed, _ = waterdata.get_codes(code_service=\"observedproperty\")\n",
439+
"observed[\"observedProperty\"].head().tolist()"
436440
]
437441
},
438442
{
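
Review note: every `get_codes` fix in this file follows one pattern. The call returns a two-item tuple, so indexing the return value with a column list fails at the tuple, before any DataFrame is involved. A minimal sketch of the failure and the fix, under stated assumptions: `fake_get_codes` is a hypothetical stand-in (the real `waterdata.get_codes` needs network access), and its table is a plain dict with invented rows rather than a DataFrame.

```python
# Hypothetical stand-in for waterdata.get_codes(); it only mimics the
# (table, metadata) return shape and does not contact any service.
def fake_get_codes(code_service):
    table = {"typeCode": ["GW", "ST"], "typeLongName": ["Well", "Stream"]}
    metadata = {"code_service": code_service}
    return table, metadata


result = fake_get_codes(code_service="sitetype")

# Old pattern: index the returned tuple directly with a list of columns.
# Tuples accept only integers or slices, so this raises TypeError.
try:
    result[["typeCode", "typeLongName"]]
    old_pattern_error = None
except TypeError as exc:
    old_pattern_error = exc

# Fixed pattern: unpack the tuple first, then index the table.
site_type_info, _ = fake_get_codes(code_service="sitetype")
type_codes = site_type_info["typeCode"]

print(type(old_pattern_error).__name__)
print(type_codes)
```

Unpacking into `name, _` keeps the cell short while making it explicit that the metadata half of the tuple is being discarded.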

demos/USGS_WaterData_Introduction_Examples.ipynb

Lines changed: 5 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -237,9 +237,11 @@
237237
"source": [
238238
"### Time series & combined metadata\n",
239239
"\n",
240-
"`get_combined_metadata` merges time-series metadata\n",
241-
"(`get_time_series_metadata`) and field-measurement metadata by site, telling you\n",
242-
"which time series a site offers and the span of each."
240+
"`get_combined_metadata` joins the monitoring-location catalog\n",
241+
"(`get_monitoring_locations`) with the time-series metadata\n",
242+
"(`get_time_series_metadata`), returning one row per available time series with\n",
243+
"both the site attributes and the series' period of record — a convenient \"what\n",
244+
"data is available\" view."
243245
]
244246
},
245247
{

demos/hydroshare/USGS_WaterData_GroundwaterLevels_Examples.ipynb

Lines changed: 9 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -4,7 +4,7 @@
44
"cell_type": "markdown",
55
"metadata": {},
66
"source": [
7-
"# USGS dataretrieval Python Package `get_field_measurements()` Examples\n",
7+
"# USGS dataretrieval Python Package Groundwater-Level `get_field_measurements()` Examples\n",
88
"\n",
99
"This notebook provides examples of using the Python dataretrieval package to retrieve groundwater level field measurements for a United States Geological Survey (USGS) monitoring location. The dataretrieval package provides a collection of functions to get data from the USGS Water Data API and other online sources of hydrology and water quality data."
1010
]
@@ -152,9 +152,11 @@
152152
"metadata": {},
153153
"outputs": [],
154154
"source": [
155+
"# This site reports several quantities (depth below land surface as well as\n",
156+
"# water-surface elevations above NGVD29/NAVD88), so use a datum-neutral label.\n",
155157
"ax = data[0][[\"time\", \"value\"]].plot(x=\"time\", y=\"value\", style=\".\")\n",
156158
"ax.set_xlabel(\"Date\")\n",
157-
"ax.set_ylabel(\"Water Level (feet below land surface)\")"
159+
"ax.set_ylabel(\"Water level (ft)\")"
158160
]
159161
},
160162
{
@@ -233,9 +235,11 @@
233235
"data3 = waterdata.get_field_measurements(monitoring_location_id=\"USGS-425957088141001\")\n",
234236
"print(\"Retrieved \" + str(len(data3[0])) + \" data values.\")\n",
235237
"\n",
236-
"# Print the date/time index values, which show up as NaT because\n",
237-
"# the dates can't be converted to a date/time data type\n",
238-
"print(data3[0].index)"
238+
"# Observation dates live in the 'time' column (the data frame uses a plain\n",
239+
"# integer index). Where the original record gave only a year or a year and\n",
240+
"# month, the Water Data API normalizes the value to a UTC timestamp with the\n",
241+
"# missing day/time defaulted, so these appear as ordinary timestamps.\n",
242+
"print(data3[0][\"time\"].head(10))"
239243
]
240244
},
241245
{

demos/hydroshare/USGS_WaterData_Measurements_Examples.ipynb

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -4,7 +4,7 @@
44
"cell_type": "markdown",
55
"metadata": {},
66
"source": [
7-
"# USGS dataretrieval Python Package `get_field_measurements()` Examples\n",
7+
"# USGS dataretrieval Python Package Surface-Water `get_field_measurements()` Examples\n",
88
"\n",
99
"This notebook provides examples of using the Python dataretrieval package to retrieve surface water field measurement data for a United States Geological Survey (USGS) monitoring location. The dataretrieval package provides a collection of functions to get data from the USGS Water Data API and other online sources of hydrology and water quality data."
1010
]

demos/hydroshare/USGS_WaterData_Samples_Examples.ipynb

Lines changed: 13 additions & 19 deletions
Original file line numberDiff line numberDiff line change
@@ -51,7 +51,7 @@
5151
"source": [
5252
"### Basic Usage\n",
5353
"\n",
54-
"The dataretrieval package has several functions that allow you to retrieve data from different web services. This example uses the `get_samples()` function to retrieve water quality sample data for USGS monitoring locations from Samples. The following arguments are supported:\n",
54+
"The dataretrieval package has several functions that allow you to retrieve data from different web services. This example uses the `get_samples()` function to retrieve water quality sample data for USGS monitoring locations from Samples. The allowable values for the categorical arguments below come from `waterdata.get_codes()` (see the *Discrete water-quality samples* notebook). The following arguments are supported:\n",
5555
"\n",
5656
"* **ssl_check** : boolean, optional\n",
5757
" Check the SSL certificate.\n",
@@ -72,8 +72,7 @@
7272
" organizations - \"organization\", \"count\"\n",
7373
"* **activityMediaName** : string or list of strings, optional\n",
7474
" Name or code indicating environmental medium in which sample was taken.\n",
75-
" Check the `activityMediaName_lookup()` function in this module for all\n",
76-
" possible inputs.\n",
75+
" Use `get_codes(code_service=\"samplemedia\")` for all possible inputs.\n",
7776
" Example: \"Water\".\n",
7877
"* **activityStartDateLower** : string, optional\n",
7978
" The start date if using a date range. Takes the format YYYY-MM-DD.\n",
@@ -90,16 +89,16 @@
9089
" Example: \"Sample-Routine, regular\".\n",
9190
"* **characteristicGroup** : string or list of strings, optional\n",
9291
" Characteristic group is a broad category of characteristics\n",
93-
" describing one or more results. Check the `characteristicGroup_lookup()`\n",
94-
" function in this module for all possible inputs.\n",
92+
" describing one or more results. Use\n",
93+
" `get_codes(code_service=\"characteristicgroup\")` for all possible inputs.\n",
9594
" Example: \"Organics, PFAS\"\n",
9695
"* **characteristic** : string or list of strings, optional\n",
9796
" Characteristic is a specific category describing one or more results.\n",
98-
" Check the `characteristic_lookup()` function in this module for all\n",
99-
" possible inputs.\n",
97+
" Use `get_codes(code_service=\"characteristics\")` for all possible inputs.\n",
10098
" Example: \"Suspended Sediment Discharge\"\n",
10199
"* **characteristicUserSupplied** : string or list of strings, optional\n",
102100
" A user supplied characteristic name describing one or more results.\n",
101+
" Use `get_codes(code_service=\"observedproperty\")` for all possible inputs.\n",
103102
"* **boundingBox**: list of four floats, optional\n",
104103
" Filters on the the associated monitoring location's point location\n",
105104
" by checking if it is located within the specified geographic area. \n",
@@ -116,27 +115,22 @@
116115
"* **countryFips** : string or list of strings, optional\n",
117116
" Example: \"US\" (United States)\n",
118117
"* **stateFips** : string or list of strings, optional\n",
119-
" Check the `stateFips_lookup()` function in this module for all\n",
120-
" possible inputs.\n",
121118
" Example: \"US:15\" (United States: Hawaii)\n",
122119
"* **countyFips** : string or list of strings, optional\n",
123-
" Check the `countyFips_lookup()` function in this module for all\n",
124-
" possible inputs.\n",
125120
" Example: \"US:15:001\" (United States: Hawaii, Hawaii County)\n",
126121
"* **siteTypeCode** : string or list of strings, optional\n",
127-
" An abbreviation for a certain site type. Check the `siteType_lookup()`\n",
128-
" function in this module for all possible inputs.\n",
122+
" An abbreviation for a certain site type. Use\n",
123+
" `get_codes(code_service=\"sitetype\")` for all possible inputs.\n",
129124
" Example: \"GW\" (Groundwater site)\n",
130125
"* **siteTypeName** : string or list of strings, optional\n",
131-
" A full name for a certain site type. Check the `siteType_lookup()`\n",
132-
" function in this module for all possible inputs.\n",
126+
" A full name for a certain site type. Use\n",
127+
" `get_codes(code_service=\"sitetype\")` for all possible inputs.\n",
133128
" Example: \"Well\"\n",
134129
"* **usgsPCode** : string or list of strings, optional\n",
135130
" 5-digit number used in the US Geological Survey computerized\n",
136131
" data system, National Water Information System (NWIS), to\n",
137-
" uniquely identify a specific constituent. Check the \n",
138-
" `characteristic_lookup()` function in this module for all possible\n",
139-
" inputs.\n",
132+
" uniquely identify a specific constituent. Use\n",
133+
" `get_codes(code_service=\"characteristics\")` for all possible inputs.\n",
140134
" Example: \"00060\" (Discharge, cubic feet per second)\n",
141135
"* **hydrologicUnit** : string or list of strings, optional\n",
142136
" Max 12-digit number used to describe a hydrologic unit.\n",
@@ -300,7 +294,7 @@
300294
"source": [
301295
"#### Example 4: Retrieve water quality sample data for one site and convert to a wide format\n",
302296
"\n",
303-
"Note that the USGS Samples database returns multiple parameters in a \"long\" format: each row in the resulting table represents a single observation of a single parameter. Furthermore, every observation has 181 fields of metadata. However, if you wanted to place your water quality data into a \"wide\" format, where each column represents a water quality parameter code, the code below details one solution."
297+
"Note that the USGS Samples database returns multiple parameters in a \"long\" format: each row in the resulting table represents a single observation of a single parameter. Furthermore, every observation comes with more than 180 fields of metadata (the default `fullphyschem` profile returns 187 columns). However, if you wanted to place your water quality data into a \"wide\" format, where each column represents a water quality parameter code, the code below details one solution."
304298
]
305299
},
306300
{
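
Review note: the Example 4 narrative contrasts "long" rows (one observation of one parameter per row) with a "wide" table (one column per parameter code). The notebook's own pivot cell is not part of this diff, so the following is only a standard-library sketch of the idea. The rows, the key names `date`, `pcode`, and `value`, and the helper `to_wide` are all hypothetical and are not the Samples column names.

```python
from collections import defaultdict

# Hypothetical long-format rows: one observation of one parameter per row.
long_rows = [
    {"date": "2020-01-01", "pcode": "00010", "value": 4.5},
    {"date": "2020-01-01", "pcode": "00060", "value": 120.0},
    {"date": "2020-02-01", "pcode": "00010", "value": 5.0},
]


def to_wide(rows, index_key, column_key, value_key):
    """Pivot long rows into {index: {column: value}}.

    If two rows share the same index and column, the later row wins.
    """
    wide = defaultdict(dict)
    for row in rows:
        wide[row[index_key]][row[column_key]] = row[value_key]
    return dict(wide)


wide = to_wide(long_rows, "date", "pcode", "value")
for date, values in wide.items():
    print(date, values)
```

With a real DataFrame the same reshaping is usually done with `DataFrame.pivot_table`, which also forces an explicit choice of how duplicate observations are aggregated.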

demos/hydroshare/USGS_WaterData_SiteInfo_Examples.ipynb

Lines changed: 3 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -172,8 +172,9 @@
172172
"metadata": {},
173173
"outputs": [],
174174
"source": [
175-
"# Get the site information for a state\n",
176-
"siteINFO_state = waterdata.get_monitoring_locations(state_code=\"UT\")\n",
175+
"# Get the site information for a state. state_code is a two-digit ANSI code;\n",
176+
"# 49 is Utah. (The postal abbreviation \"UT\" returns no results.)\n",
177+
"siteINFO_state = waterdata.get_monitoring_locations(state_code=\"49\")\n",
177178
"display(siteINFO_state[0])"
178179
]
179180
},
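
Review note: because a postal abbreviation silently returns an empty frame, a caller may want to normalize the value before passing it as `state_code`. The helper below is a hypothetical sketch and not part of `dataretrieval`; its lookup table is a deliberately tiny subset (Utah 49, Hawaii 15, Rhode Island 44) and would need the full state list in practice.

```python
# Illustrative subset of postal abbreviation -> two-digit state code.
STATE_CODES = {"UT": "49", "HI": "15", "RI": "44"}


def to_state_code(value):
    """Return a two-digit state code for a postal abbreviation or a number.

    Numeric input is zero-padded to two digits; it is not checked against
    the list of valid codes.
    """
    text = str(value).strip().upper()
    if text.isdigit():
        return text.zfill(2)
    try:
        return STATE_CODES[text]
    except KeyError:
        raise ValueError(f"no state code known for {value!r}") from None


print(to_state_code("ut"))
print(to_state_code(6))
```

Raising on an unknown abbreviation is a deliberate choice here: it surfaces the mistake at the call site instead of producing the empty result that motivated this fix.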

demos/hydroshare/USGS_WaterData_SiteInventory_Examples.ipynb

Lines changed: 8 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -148,7 +148,9 @@
148148
"cell_type": "markdown",
149149
"metadata": {},
150150
"source": [
151-
"#### Example 3: Retrieve information for a single monitoring location"
151+
"#### Example 3: Retrieve a single monitoring location without geometry\n",
152+
"\n",
153+
"Pass `skip_geometry=True` to get a plain `pandas.DataFrame` (no `geometry` column) instead of a `geopandas.GeoDataFrame`."
152154
]
153155
},
154156
{
@@ -157,8 +159,11 @@
157159
"metadata": {},
158160
"outputs": [],
159161
"source": [
160-
"oneSite = waterdata.get_monitoring_locations(monitoring_location_id=\"USGS-05114000\")\n",
161-
"display(oneSite[0])"
162+
"oneSite_nogeom = waterdata.get_monitoring_locations(\n",
163+
" monitoring_location_id=\"USGS-05114000\", skip_geometry=True\n",
164+
")\n",
165+
"print(\"geometry column present:\", \"geometry\" in oneSite_nogeom[0].columns)\n",
166+
"display(oneSite_nogeom[0])"
162167
]
163168
},
164169
{

demos/hydroshare/USGS_WaterData_UnitValues_Examples.ipynb

Lines changed: 2 additions & 25 deletions
Original file line numberDiff line numberDiff line change
@@ -181,7 +181,7 @@
181181
"\n",
182182
"#### Example 2: Get unit values for an individual monitoring location and parameter between a start and end date.\n",
183183
"\n",
184-
"NOTE: By default, start and end date are evaluated as local time, and the result is returned with the timestamps in the local time of the monitoring location."
184+
"NOTE: By default, the start and end dates are interpreted in the monitoring location's local time. Regardless of the input time zone, the returned `time` column is tz-aware UTC (see Example 4 for supplying UTC input explicitly)."
185185
]
186186
},
187187
{
@@ -228,7 +228,7 @@
228228
"source": [
229229
"#### Example 4: Retrieve data using UTC times\n",
230230
"\n",
231-
"NOTE: Adding 'Z' to the input time parameters indicates that they are in UTC rather than local time. The time stamps associated with the data returned are still in the local time of the USGS monitoring location."
231+
"NOTE: Adding 'Z' to the input time parameters indicates that they are in UTC rather than local time. The returned timestamps are tz-aware UTC in either case."
232232
]
233233
},
234234
{
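
Review note: since the corrected notes say the returned timestamps are tz-aware UTC, readers who want local time have to convert explicitly. A minimal standard-library sketch follows. The UTC-5 offset is an arbitrary assumption chosen for illustration, not the time zone of any site in the notebook, and a real conversion should use the site's named zone so daylight saving time is handled.

```python
from datetime import datetime, timedelta, timezone


def parse_utc(stamp):
    """Parse an ISO 8601 string ending in 'Z' into a tz-aware UTC datetime."""
    # Older Python versions do not accept a trailing 'Z' in fromisoformat().
    return datetime.fromisoformat(stamp.replace("Z", "+00:00"))


# Hypothetical fixed offset (UTC-5), used only for illustration.
ASSUMED_LOCAL = timezone(timedelta(hours=-5))

utc_time = parse_utc("2013-10-01T05:00:00Z")
local_time = utc_time.astimezone(ASSUMED_LOCAL)

print(utc_time.isoformat())
print(local_time.isoformat())
```

Both values describe the same instant; only the displayed wall-clock time and offset differ, which is why comparisons between tz-aware timestamps stay correct after conversion.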
@@ -267,29 +267,6 @@
 "print(\"Retrieved \" + str(len(discharge_multisite[0])) + \" data values.\")\n",
 "display(discharge_multisite[0])"
 ]
-},
-{
-"cell_type": "markdown",
-"metadata": {},
-"source": [
-"The following example requests the same two-location data as the previous example."
-]
-},
-{
-"cell_type": "code",
-"execution_count": null,
-"metadata": {},
-"outputs": [],
-"source": [
-"discharge_multisite = waterdata.get_continuous(\n",
-" monitoring_location_id=[\"USGS-04024430\", \"USGS-04024000\"],\n",
-" parameter_code=parameterCode,\n",
-" time=\"2013-10-01/2013-10-01\",\n",
-" \n",
-")\n",
-"print(\"Retrieved \" + str(len(discharge_multisite[0])) + \" data values.\")\n",
-"display(discharge_multisite[0])"
-]
 }
 ],
 "metadata": {
