
Commit 61d51f2

thodson-usgs and claude committed
docs: migrate documentation and demo notebooks to the Water Data API
Move the user guide, example pages, and demo notebooks off the deprecated `nwis` module onto the `waterdata` module (USGS Water Data API):

- userguide/timeconventions: rewritten for the Water Data API datetime model (`time` is a column; tz-aware UTC for continuous data, tz-naive for daily), using the `.dt.tz_convert` idiom.
- examples/readme_examples + siteinfo_examples: ported to get_continuous, get_monitoring_locations, and get_time_series_metadata.
- demo notebooks: migrated to the waterdata API (USGS-prefixed monitoring location ids; get_daily, get_continuous, get_field_measurements, get_peaks, get_ratings, get_stats_*, get_samples), with titles, narratives, and argument lists rewritten to match the code. WaterUse stays on the legacy nwis.get_water_use with a note (no Water Data API equivalent yet).
- renamed the demo notebooks to lead with their module (USGS_WaterData_*, USGS_NLDI, USGS_NWIS_WaterUse) and renamed the peak-flow trends demo to peak_streamflow_trends.
- nldi: mark the nationwide get_features_by_data_source example `# doctest: +SKIP` (it streams every nwissite feature and hangs the build).
- fixed pre-existing Sphinx build warnings: get_channel bullet-list RST, the duplicated get_monitoring_locations link, the NWIS_Metadata duplicate object description, the deprecated display_version theme option, the missing _static directory, and a WaterData_demo heading level.

Notebooks are kept output-free (nbsphinx executes them at build time). `sphinx-build -b doctest` passes and every notebook executes against the live API with no cell errors.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
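For readers unfamiliar with the datetime model named in the commit message, the following is a minimal sketch of the `.dt.tz_convert` idiom. It runs on synthetic frames rather than a live API response: the column name `time` comes from the commit message, while the values, the `time_local` column, and the `America/New_York` zone are assumptions chosen for illustration.

```python
import pandas as pd

# Synthetic stand-in for continuous data: "time" is an ordinary column
# holding tz-aware UTC timestamps (the values are made up).
continuous = pd.DataFrame(
    {
        "time": pd.to_datetime(
            ["2012-05-12T04:00:00Z", "2012-05-12T04:15:00Z"], utc=True
        ),
        "value": [101.0, 99.5],
    }
)

# Convert the UTC column to a local zone through the .dt accessor.
# "time_local" is a hypothetical column name used only in this sketch.
continuous["time_local"] = continuous["time"].dt.tz_convert("America/New_York")

# Synthetic stand-in for daily data: tz-naive dates, so there is no zone
# to convert from.
daily = pd.DataFrame(
    {"time": pd.to_datetime(["2012-05-12", "2012-05-13"]), "value": [100.2, 98.7]}
)

print(continuous[["time", "time_local"]])
print(daily["time"].dt.tz)
```

Because the daily column carries no zone, calling `.dt.tz_convert` on it would raise a `TypeError`; only the tz-aware continuous column can be converted this way.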
1 parent c450c40 commit 61d51f2

58 files changed

Lines changed: 2091 additions & 2456 deletions


dataretrieval/nldi.py

Lines changed: 3 additions & 2 deletions
@@ -296,8 +296,9 @@ def get_features_by_data_source(data_source: str) -> gpd.GeoDataFrame:
 --------
 .. doctest::
 
->>> # Get features for a feature wqp and feature_id USGS-01031500
->>> gdf = dataretrieval.nldi.get_features_by_data_source(
+>>> # "nwissite" returns every NWIS site nationwide, so this example is
+>>> # skipped in the doctest build to avoid the (very large) download.
+>>> gdf = dataretrieval.nldi.get_features_by_data_source( # doctest: +SKIP
 ... data_source="nwissite"
 ... )
 """

dataretrieval/nwis.py

Lines changed: 11 additions & 7 deletions
@@ -1125,12 +1125,11 @@ class NWIS_Metadata(BaseMetadata):
 Response headers
 comments: str | None
 Metadata comments, if any
-site_info: tuple[pd.DataFrame, NWIS_Metadata] | None
-Site information if the query included `site_no`, `sites`, `stateCd`,
-`huc`, `countyCd` or `bBox`. `site_no` is preferred over `sites` if
-both are present.
-variable_info: None
-Deprecated. Accessing variable_info via NWIS_Metadata is deprecated.
+
+Notes
+-----
+``site_info`` and ``variable_info`` are exposed as properties (documented
+below) rather than plain attributes.
 
 """
 

@@ -1164,7 +1163,12 @@ def __init__(self, response, **parameters) -> None:
 
 @property
 def site_info(self) -> tuple[pd.DataFrame, BaseMetadata] | None:
-"""
+"""Site information for the query.
+
+Populated when the query included ``site_no``, ``sites``, ``stateCd``,
+``huc``, ``countyCd`` or ``bBox`` (``site_no`` is preferred over
+``sites`` if both are present); ``None`` otherwise.
+
 Return
 ------
 df: ``pandas.DataFrame``

dataretrieval/waterdata/api.py

Lines changed: 8 additions & 7 deletions
@@ -554,11 +554,11 @@ def get_monitoring_locations(
 county_code : string or iterable of strings, optional
 The code for the county or county equivalent (parish, borough, etc.) in which
 the monitoring location is located. A `list of codes
-<https://help.waterdata.usgs.gov/code/county_query?fmt=html>`_ is available.
+<https://help.waterdata.usgs.gov/code/county_query?fmt=html>`__ is available.
 county_name : string or iterable of strings, optional
 The name of the county or county equivalent (parish, borough, etc.) in which
 the monitoring location is located. A `list of codes
-<https://help.waterdata.usgs.gov/code/county_query?fmt=html>`_ is available.
+<https://help.waterdata.usgs.gov/code/county_query?fmt=html>`__ is available.
 minor_civil_division_code : string or iterable of strings, optional
 Codes for primary governmental or administrative divisions of the county or
 county equivalent in which the monitoring location is located.
@@ -2751,9 +2751,10 @@ def get_channel(
 * A date-time: "2018-02-12T23:20:50Z"
 * A bounded interval: "2018-02-12T00:00:00Z/2018-03-18T12:31:12Z"
 * Half-bounded intervals: "2018-02-12T00:00:00Z/.." or
-"../2018-03-18T12:31:12Z"
-* Duration objects: "P1M" for data from the past month or "PT36H" for
-the last 36 hours
+"../2018-03-18T12:31:12Z"
+* Duration objects: "P1M" for data from the past month or "PT36H"
+for the last 36 hours
+
 channel_name : string or iterable of strings, optional
 The channel name.
 channel_flow : string or iterable of strings, optional
@@ -2799,9 +2800,9 @@ def get_channel(
 * A date-time: "2018-02-12T23:20:50Z"
 * A bounded interval: "2018-02-12T00:00:00Z/2018-03-18T12:31:12Z"
 * Half-bounded intervals: "2018-02-12T00:00:00Z/.." or
-"../2018-03-18T12:31:12Z"
+"../2018-03-18T12:31:12Z"
 * Duration objects: "P1M" for data from the past month or "PT36H" for the
-last 36 hours
+last 36 hours
 
 Only features that have a last_modified that intersects the value of
 datetime are selected.
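The four accepted shapes of the datetime filter listed in the `get_channel` docstring can be told apart with a few string checks. The sketch below is illustrative only: `classify_datetime_filter` is a hypothetical helper, not part of `dataretrieval`, and it does not validate the timestamps themselves or reflect how the library or the service parses the value.

```python
def classify_datetime_filter(value: str) -> str:
    """Label a datetime filter string by the form it takes.

    Hypothetical helper for illustration; it inspects only the overall
    shape of the string, not whether the timestamps are valid.
    """
    if "/" in value:
        start, end = value.split("/", 1)
        # ".." on either side marks an open (half-bounded) interval.
        if start == ".." or end == "..":
            return "half-bounded interval"
        return "bounded interval"
    # ISO 8601 durations begin with "P", e.g. "P1M" or "PT36H".
    if value.startswith("P"):
        return "duration"
    return "date-time"


for example in (
    "2018-02-12T23:20:50Z",
    "2018-02-12T00:00:00Z/2018-03-18T12:31:12Z",
    "2018-02-12T00:00:00Z/..",
    "../2018-03-18T12:31:12Z",
    "PT36H",
):
    print(f"{example!r}: {classify_datetime_filter(example)}")
```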

demos/R Python Vignette equivalents.ipynb

Lines changed: 69 additions & 65 deletions
@@ -36,8 +36,10 @@
 "siteNumbers <- c(\"01491000\",\"01645000\")\n",
 "siteINFO <- readNWISsite(siteNumbers)\n",
 "\"\"\"\n",
-"siteNumbers = [\"01491000\", \"01645000\"]\n",
-"siteINFO, md = nwis.get_iv(sites=siteNumbers)"
+"siteNumbers = [\"USGS-01491000\", \"USGS-01645000\"]\n",
+"siteINFO, md = waterdata.get_monitoring_locations(\n",
+" monitoring_location_id=siteNumbers, skip_geometry=True\n",
+")"
 ]
 },
 {
@@ -52,8 +54,9 @@
 "dailyDataAvailable <- whatNWISdata(siteNumbers,\n",
 " service=\"dv\", statCd=\"00003\")\n",
 "\"\"\"\n",
-"\n",
-"dailyDataAvailable, md = nwis.get_dv(sites=siteNumbers, statCd=\"00003\")"
+"dailyDataAvailable, md = waterdata.get_time_series_metadata(\n",
+" monitoring_location_id=siteNumbers, statistic_id=\"00003\", skip_geometry=True\n",
+")"
 ]
 },
 {
@@ -66,20 +69,19 @@
 "# Choptank River near Greensboro, MD:\n",
 "siteNumber <- \"01491000\"\n",
 "parameterCd <- \"00060\" # Discharge\n",
-"startDate <- \"2009-10-01\" \n",
-"endDate <- \"2012-09-30\" \n",
+"startDate <- \"2009-10-01\"\n",
+"endDate <- \"2012-09-30\"\n",
 "\n",
-"discharge <- readNWISdv(siteNumber, \n",
-" parameterCd, startDate, endDate)\n",
+"discharge <- readNWISdv(siteNumber, parameterCd, startDate, endDate)\n",
 "\"\"\"\n",
 "# Choptank River near Greensboro, MD:\n",
-"siteNumber = \"01491000\"\n",
+"siteNumber = \"USGS-01491000\"\n",
 "parameterCd = \"00060\" # Discharge\n",
-"startDate = \"2009-10-01\"\n",
-"endDate = \"2012-09-30\"\n",
 "\n",
-"discharge, md = nwis.get_dv(\n",
-" sites=siteNumber, parameterCd=parameterCd, start=startDate, end=endDate\n",
+"discharge, md = waterdata.get_daily(\n",
+" monitoring_location_id=siteNumber,\n",
+" parameter_code=parameterCd,\n",
+" time=\"2009-10-01/2012-09-30\",\n",
 ")"
 ]
 },
@@ -92,25 +94,21 @@
 "\"\"\"\n",
 "siteNumber <- \"01491000\"\n",
 "parameterCd <- c(\"00010\",\"00060\") # Temperature and discharge\n",
-"statCd <- c(\"00001\",\"00003\") # Mean and maximum\n",
+"statCd <- c(\"00001\",\"00003\") # Maximum and mean\n",
 "startDate <- \"2012-01-01\"\n",
 "endDate <- \"2012-05-01\"\n",
 "\n",
-"temperatureAndFlow <- readNWISdv(siteNumber, parameterCd, \n",
-" startDate, endDate, statCd=statCd)\n",
+"temperatureAndFlow <- readNWISdv(siteNumber, parameterCd, startDate, endDate, statCd=statCd)\n",
 "\"\"\"\n",
-"siteNumber = \"01491000\"\n",
+"siteNumber = \"USGS-01491000\"\n",
 "parameterCd = [\"00010\", \"00060\"] # Temperature and discharge\n",
-"statCd = [\"00001\", \"00003\"] # Mean and maximum\n",
-"startDate = \"2012-01-01\"\n",
-"endDate = \"2012-05-01\"\n",
+"statisticId = [\"00001\", \"00003\"] # Maximum and mean\n",
 "\n",
-"temperatureAndFlow, md = nwis.get_dv(\n",
-" sites=siteNumber,\n",
-" parameterCd=parameterCd,\n",
-" start=startDate,\n",
-" end=endDate,\n",
-" statCd=statCd,\n",
+"temperatureAndFlow, md = waterdata.get_daily(\n",
+" monitoring_location_id=siteNumber,\n",
+" parameter_code=parameterCd,\n",
+" statistic_id=statisticId,\n",
+" time=\"2012-01-01/2012-05-01\",\n",
 ")"
 ]
 },
@@ -122,17 +120,17 @@
 "source": [
 "\"\"\"\n",
 "parameterCd <- \"00060\" # Discharge\n",
-"startDate <- \"2012-05-12\" \n",
-"endDate <- \"2012-05-13\" \n",
-"dischargeUnit <- readNWISuv(siteNumber, parameterCd, \n",
-" startDate, endDate)\n",
+"startDate <- \"2012-05-12\"\n",
+"endDate <- \"2012-05-13\"\n",
+"dischargeUnit <- readNWISuv(siteNumber, parameterCd, startDate, endDate)\n",
 "\"\"\"\n",
-"siteNumber = \"01491000\"\n",
+"siteNumber = \"USGS-01491000\"\n",
 "parameterCd = \"00060\" # Discharge\n",
-"startDate = \"2012-05-12\"\n",
-"endDate = \"2012-05-13\"\n",
-"dischargeUnit, md = nwis.get_iv(\n",
-" sites=siteNumber, parameterCd=parameterCd, start=startDate, end=endDate\n",
+"\n",
+"dischargeUnit, md = waterdata.get_continuous(\n",
+" monitoring_location_id=siteNumber,\n",
+" parameter_code=parameterCd,\n",
+" time=\"2012-05-12/2012-05-13\",\n",
 ")"
 ]
 },
@@ -147,18 +145,17 @@
 "parameterCd <- c(\"00618\",\"71851\")\n",
 "startDate <- \"1985-10-01\"\n",
 "endDate <- \"2012-09-30\"\n",
-"dfLong <- read_USGS_samples(monitoringLocationIdentifier=sprintf(\"USGS-%s\", siteNumber), usgsPCode=parameterCd, \n",
-" activityStartDateLower=startDate, activityStartDateUpper=endDate)\n",
+"dfLong <- read_USGS_samples(monitoringLocationIdentifier=sprintf(\"USGS-%s\", siteNumber),\n",
+" usgsPCode=parameterCd, activityStartDateLower=startDate, activityStartDateUpper=endDate)\n",
 "\"\"\"\n",
-"siteNumber = \"01491000\"\n",
+"siteNumber = \"USGS-01491000\"\n",
 "parameterCd = [\"00618\", \"71851\"]\n",
-"startDate = \"1985-10-01\"\n",
-"endDate = \"2012-09-30\"\n",
+"\n",
 "dfLong, md = waterdata.get_samples(\n",
-" monitoringLocationIdentifier=f\"USGS-{siteNumber}\",\n",
+" monitoringLocationIdentifier=siteNumber,\n",
 " usgsPCode=parameterCd,\n",
-" activityStartDateLower=startDate,\n",
-" activityStartDateUpper=endDate,\n",
+" activityStartDateLower=\"1985-10-01\",\n",
+" activityStartDateUpper=\"2012-09-30\",\n",
 ")"
 ]
 },
@@ -172,8 +169,9 @@
 "siteNumber <- '01594440'\n",
 "peakData <- readNWISpeak(siteNumber)\n",
 "\"\"\"\n",
-"siteNumber = \"01594440\"\n",
-"peakData, md = nwis.get_discharge_peaks(sites=siteNumber)"
+"peakData, md = waterdata.get_peaks(\n",
+" monitoring_location_id=\"USGS-01594440\", parameter_code=\"00060\"\n",
+")"
 ]
 },
 {
@@ -186,7 +184,11 @@
 "ratingData <- readNWISrating(siteNumber, \"base\")\n",
 "attr(ratingData, \"RATING\")\n",
 "\"\"\"\n",
-"ratings_data, md = nwis.get_ratings(site=\"01594440\", file_type=\"base\")"
+"# get_ratings returns a dict keyed by \"<id>.<file_type>.rdb\"\n",
+"ratings_data = waterdata.get_ratings(\n",
+" monitoring_location_id=\"USGS-01594440\", file_type=\"base\"\n",
+")\n",
+"list(ratings_data.keys())"
 ]
 },
 {
@@ -198,10 +200,12 @@
 "\"\"\"\n",
 "discharge_stats <- readNWISstat(siteNumbers=c(\"02319394\"),\n",
 " parameterCd=c(\"00060\"),\n",
-" statReportType=\"annual\") \n",
+" statReportType=\"annual\")\n",
 "\"\"\"\n",
-"discharge_stats, md = nwis.get_stats(\n",
-" sites=\"02319394\", parameterCd=\"00060\", statReportType=\"annual\", statTypeCd=\"all\"\n",
+"discharge_stats, md = waterdata.get_stats_date_range(\n",
+" monitoring_location_id=\"USGS-02319394\",\n",
+" parameter_code=\"00060\",\n",
+" computation_type=\"arithmetic_mean\",\n",
 ")"
 ]
 },
@@ -211,14 +215,14 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"# '''\n",
-"# dischargeWI <- readNWISdata(service=\"dv\",\n",
-"# stateCd=\"WI\",\n",
-"# parameterCd=\"00060\",\n",
-"# drainAreaMin=\"50\",\n",
-"# statCd=\"00003\")\n",
-"# '''\n",
-"# dischargeWI, md = nwis.get_dv(stateCd=\"WI\", parameterCd=\"00060\", drainAreaMin=\"50\", statCd=\"00003\")"
+"# R: readNWISdata(service=\"dv\", stateCd=\"WI\", parameterCd=\"00060\",\n",
+"# drainAreaMin=\"50\", statCd=\"00003\")\n",
+"#\n",
+"# The Water Data API serves daily values per monitoring location. To assemble a\n",
+"# state-wide set, first find the locations (optionally filtering by drainage\n",
+"# area) with waterdata.get_monitoring_locations(state_name=\"Wisconsin\", ...),\n",
+"# then pass their ids to waterdata.get_daily(parameter_code=\"00060\",\n",
+"# statistic_id=\"00003\")."
 ]
 },
 {
@@ -292,21 +296,21 @@
 "source": [
 "# Embedded Metadata\n",
 "\n",
-"All service methods return the DataFrame containing requested data and Metadata as a tuple. Note, a call using get_record will only return the DataFrame to remain compatible with previous usage.\n",
+"Most `waterdata` and `wqp` service methods return a tuple of the requested data (a pandas DataFrame) and a metadata object.\n",
 "\n",
+"`md` is an object with the following attributes:\n",
 "\n",
 "```\n",
-"national, md = nwis.get_water_use()\n",
+"Metadata\n",
+" url # the URL used to query the service\n",
+" query_time # how long the query took\n",
+" header # the response headers\n",
 "```\n",
 "\n",
-"md is an object with the following attributes\n",
+"Note: USGS *water use* data has no Water Data API equivalent yet, so it remains available only through the deprecated `nwis` module:\n",
 "\n",
 "```\n",
-"Metadata\n",
-" url # the resulting url to query usgs\n",
-" query_time # the time it took to query usgs\n",
-" site_info # a method to call site_info with the site parameters supplied\n",
-" header # any headers attached to the response object\n",
+"national, md = nwis.get_water_use()\n",
 "```"
 ]
 },
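The notebook changes above repeat two mechanical translations: bare NWIS site numbers gain a `USGS-` prefix, and separate start and end dates collapse into a single `start/end` interval string for the `time` argument. A minimal sketch of both, assuming nothing beyond what the diff shows; `to_monitoring_location_id` and `to_time_interval` are hypothetical helper names, not functions provided by `dataretrieval`.

```python
def to_monitoring_location_id(site_no: str, agency: str = "USGS") -> str:
    """Prefix a legacy site number with its agency code.

    Hypothetical helper: already-prefixed ids are returned unchanged so
    the function is safe to call twice.
    """
    prefix = f"{agency}-"
    return site_no if site_no.startswith(prefix) else prefix + site_no


def to_time_interval(start: str, end: str) -> str:
    """Join legacy start/end dates into one "start/end" interval string."""
    return f"{start}/{end}"


# Legacy-style arguments, taken from the old notebook cells in the diff.
legacy_sites = ["01491000", "01645000"]

# Keyword names below mirror the new cells in the diff; the dict is only
# assembled and printed here, no request is made.
kwargs = {
    "monitoring_location_id": [to_monitoring_location_id(s) for s in legacy_sites],
    "parameter_code": "00060",
    "time": to_time_interval("2009-10-01", "2012-09-30"),
}
print(kwargs)
```

The resulting dictionary has the shape of the arguments passed to `waterdata.get_daily` in the migrated cells, so it could be unpacked with `**kwargs` in a notebook that has network access.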

demos/WaterData_demo.ipynb

Lines changed: 1 addition & 3 deletions
@@ -123,9 +123,7 @@
 "cell_type": "markdown",
 "id": "406762ab",
 "metadata": {},
-"source": [
-"#### Reference tables"
-]
+"source": "### Reference tables"
 },
 {
 "cell_type": "code",
File renamed without changes.

demos/hydroshare/USGS_dataretrieval_WaterUse_Examples.ipynb renamed to demos/hydroshare/USGS_NWIS_WaterUse_Examples.ipynb

Lines changed: 5 additions & 5 deletions
@@ -4,9 +4,11 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"# USGS dataretrieval Python Package `get_water_use()` Examples\n",
+"# USGS dataretrieval Python Package Water Use Examples\n",
 "\n",
-"This notebook provides examples of using the Python dataretrieval package to retrieve water use data. The dataretrieval package provides a collection of functions to get data from the USGS National Water Information System (NWIS) and other online sources of hydrology and water quality data, including the United States Environmental Protection Agency (USEPA)."
+"> **Note:** USGS water-use data has **no USGS Water Data API equivalent**. The legacy `nwis.get_water_use()` service has been decommissioned and now raises a \"defunct\" error, so the examples below are retained for historical reference only — they document the former interface and are not runnable. There is currently no `waterdata` replacement for water-use data.\n",
+"\n",
+"This notebook formerly retrieved water use data through the USGS National Water Information System (NWIS)."
 ]
 },
 {
@@ -42,9 +44,7 @@
 "source": [
 "from IPython.display import display\n",
 "\n",
-"from dataretrieval import nwis\n",
-"from dataretrieval import waterdata\n",
-"import dataretrieval.waterdata as waterdata\n"
+"from dataretrieval import nwis\n"
 ]
 },
 {
