|
4 | 4 | "cell_type": "markdown", |
5 | 5 | "metadata": {}, |
6 | 6 | "source": [ |
7 | | - "# USGS dataretrieval Python Package `get_dv()` Examples\n", |
| 7 | + "# USGS dataretrieval Python Package `get_daily()` Examples\n", |
8 | 8 | "\n", |
9 | | - "This notebook provides examples of using the Python dataretrieval package to retrieve daily streamflow data for a United States Geological Survey (USGS) monitoring site. The dataretrieval package provides a collection of functions to get data from the USGS National Water Information System (NWIS) and other online sources of hydrology and water quality data, including the United States Environmental Protection Agency (USEPA)." |
| 9 | + "This notebook provides examples of using the Python dataretrieval package to retrieve daily streamflow data for a United States Geological Survey (USGS) monitoring location. The dataretrieval package provides a collection of functions to get data from the USGS Water Data API and other online sources of hydrology and water quality data." |
10 | 10 | ] |
11 | 11 | }, |
12 | 12 | { |
|
39 | 39 | "execution_count": null, |
40 | 40 | "metadata": {}, |
41 | 41 | "outputs": [], |
42 | | - "source": [ |
43 | | - "from IPython.display import display\n", |
44 | | - "\n", |
45 | | - "from dataretrieval import nwis\n", |
46 | | - "import dataretrieval.waterdata as waterdata" |
47 | | - ] |
| 42 | + "source": "from IPython.display import display\n\nimport dataretrieval.waterdata as waterdata" |
48 | 43 | }, |
49 | 44 | { |
50 | 45 | "cell_type": "markdown", |
51 | 46 | "metadata": {}, |
52 | 47 | "source": [ |
53 | 48 | "### Basic Usage\n", |
54 | 49 | "\n", |
55 | | - "The dataretrieval package has several functions that allow you to retrieve data from different web services. This examples uses the `get_dv()` function to retrieve daily streamflow data for a USGS monitoring site from NWIS. The following arguments are supported:\n", |
| 50 | + "The dataretrieval package has several functions that allow you to retrieve data from different web services. This example uses the `get_daily()` function to retrieve daily streamflow data for a USGS monitoring location from the USGS Water Data API. The following arguments are supported:\n", |
56 | 51 | "\n", |
57 | 52 | "Arguments (Additional arguments, if supplied, will be used as query parameters)\n", |
58 | 53 | "\n", |
59 | | - "* **sites** (string or list of strings): A list of USGS site identifiers for which to retrieve data.\n", |
60 | | - "* **parameterCd** (list of strings): A list of USGS parameter codes for which to retrieve data.\n", |
61 | | - "* **statCd** (list of strings): A list of USGS statistic codes for which to retrieve data.\n", |
62 | | - "* **start** (string): The beginning date for a period for which to retrieve data. If the waterdata parameter startDT is supplied, it will overwrite the start parameter.\n", |
63 | | - "* **end** (string): The ending date for a period for which to retrieve data. If the waterdata parameter endDT is supplied, it will overwrite the end parameter." |
| 54 | + "* **monitoring_location_id** (string or iterable of strings): A unique identifier representing a single monitoring location, formed by combining the agency code with the site number (e.g. `USGS-10109000`). Accepts a single ID or a list of IDs.\n", |
| 55 | + "* **parameter_code** (string or iterable of strings): One or more 5-digit USGS parameter codes identifying the constituent measured and its units of measure (e.g. `00060` for discharge).\n", |
| 56 | + "* **statistic_id** (string or iterable of strings): One or more codes corresponding to the statistic an observation represents (e.g. `00001` for maximum, `00003` for mean).\n", |
| 57 | + "* **time** (string): The date or interval an observation represents, following RFC 3339. May be a single date, a bounded or half-bounded interval (e.g. `2020-10-01/2021-09-30`), or an ISO 8601 duration (e.g. `P7D` for the past seven days).\n", |
| 58 | + "* **skip_geometry** (boolean): If `True`, response geometries are omitted and the returned data frame contains no spatial information." |
64 | 59 | ] |
65 | 60 | }, |
66 | 61 | { |
67 | 62 | "cell_type": "markdown", |
68 | 63 | "metadata": {}, |
69 | 64 | "source": [ |
70 | | - "Example 1: Get daily value data for a specific parameter at a single USGS NWIS monitoring site between a begin and end date." |
| 65 | + "Example 1: Get daily value data for a specific parameter at a single USGS monitoring location between a begin and end date." |
71 | 66 | ] |
72 | 67 | }, |
73 | 68 | { |
|
83 | 78 | "source": [ |
84 | 79 | "### Interpreting the Result\n", |
85 | 80 | "\n", |
86 | | - "The result of calling the `get_dv()` function is an object that contains a Pandas data frame object and an associated metadata object. The Pandas data frame contains the daily values for the observed variable and time period requested. The data frame is indexed by the dates associated with the data values.\n", |
| 81 | + "The `get_daily()` function returns a tuple containing a pandas data frame and an associated metadata object. The data frame contains the daily values for the observed variable and time period requested. It is a flat table with a default integer index; the dates associated with each observation are held in a `time` column rather than in the index.\n", |
87 | 82 | "\n", |
88 | 83 | "Once you've got the data frame, there are several useful things you can do to explore the data."
89 | 84 | ] |
|
148 | 143 | "cell_type": "markdown", |
149 | 144 | "metadata": {}, |
150 | 145 | "source": [ |
151 | | - "The other part of the result returned from the `get_dv()` function is a metadata object that contains information about the query that was executed to return the data. For example, you can access the URL that was assembled to retrieve the requested data from the USGS web service. The USGS web service responses contain a descriptive header that defines and can be helpful in interpreting the contents of the response." |
| 146 | + "The other part of the result returned from the `get_daily()` function is a metadata object that contains information about the query that was executed to return the data. For example, you can access the URL that was assembled to retrieve the requested data from the USGS Water Data API." |
152 | 147 | ] |
153 | 148 | }, |
154 | 149 | { |
155 | 150 | "cell_type": "code", |
156 | 151 | "execution_count": null, |
157 | 152 | "metadata": {}, |
158 | 153 | "outputs": [], |
159 | | - "source": [ |
160 | | - "print(\n", |
161 | | - " \"The query URL used to retrieve the data from NWIS was: \" + dailyStreamflow[1].url\n", |
162 | | - ")" |
163 | | - ] |
| 154 | + "source": "print(\n \"The query URL used to retrieve the data from the Water Data API was: \" + dailyStreamflow[1].url\n)" |
164 | 155 | }, |
165 | 156 | { |
166 | 157 | "cell_type": "markdown", |
167 | 158 | "metadata": {}, |
168 | 159 | "source": [ |
169 | 160 | "### Additional Examples\n", |
170 | 161 | "\n", |
171 | | - "Example 2: Get daily mean and max discharge and temperature values for a site between a begin and end date.\n", |
| 162 | + "Example 2: Get daily mean and max discharge and temperature values for a location between a begin and end date.\n", |
172 | 163 | "\n", |
173 | 164 | "Parameter Code: 00010 = temperature, 00060 = discharge\n", |
174 | 165 | "See https://help.waterdata.usgs.gov/codes-and-parameters/parameters\n", |
175 | 166 | "\n", |
176 | 167 | "Statistic Code: 00001 = Maximum, 00003 = Mean\n", |
177 | 168 | "See https://help.waterdata.usgs.gov/stat_code\n", |
178 | 169 | "\n", |
179 | | - "NOTE: There's not full overlap in the availability of data for temperature and discharge for both statistics at this site. When data for one statistic is not available, a \"NaN\" value is returned in the data frame." |
| 170 | + "NOTE: The periods for which temperature and discharge data are available do not fully overlap for both statistics at this location."
180 | 171 | ] |
181 | 172 | }, |
182 | 173 | { |
|
190 | 181 | "cell_type": "markdown", |
191 | 182 | "metadata": {}, |
192 | 183 | "source": [ |
193 | | - "Example 3: Get daily mean and max discharge and temperature values for multiple sites between a begin and end date" |
| 184 | + "Example 3: Get daily mean and max discharge and temperature values for multiple monitoring locations between a begin and end date." |
194 | 185 | ] |
195 | 186 | }, |
196 | 187 | { |
|
204 | 195 | "cell_type": "markdown", |
205 | 196 | "metadata": {}, |
206 | 197 | "source": [ |
207 | | - "The following example is the same as the previous example but with multi index turned off (multi_index=False)" |
| 198 | + "Like all of the waterdata getters, `get_daily()` returns a flat data frame with a default integer index regardless of how many monitoring locations are requested. Each row carries its own `monitoring_location_id`, `parameter_code`, `statistic_id`, and `time`, so multi-location results can be filtered or pivoted as needed." |
208 | 199 | ] |
209 | 200 | }, |
210 | 201 | { |
|
218 | 209 | "cell_type": "markdown", |
219 | 210 | "metadata": {}, |
220 | 211 | "source": [ |
221 | | - "Example 4: Test for a site that is not active - returns an empty DataFrame." |
| 212 | + "Example 4: Query a location that has no matching data for the requested period - returns an empty DataFrame." |
222 | 213 | ] |
223 | 214 | }, |
224 | 215 | { |
|
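The argument formats described under Basic Usage can be sketched without contacting the service. The following is a minimal illustration only: `location_id` and `time_interval` are hypothetical helpers that are not part of dataretrieval, and the use of `..` to mark the open end of a half-bounded interval is an assumption based on the usual OGC API convention.

```python
from datetime import date
from typing import Optional


def location_id(agency_code: str, site_number: str) -> str:
    """Hypothetical helper: join an agency code and a site number."""
    return f"{agency_code}-{site_number}"


def time_interval(start: Optional[date] = None, end: Optional[date] = None) -> str:
    """Hypothetical helper: format a bounded or half-bounded date interval.

    Assumption: an open end is written as '..'.
    """
    if start is None and end is None:
        raise ValueError("at least one of start or end is required")
    left = start.isoformat() if start else ".."
    right = end.isoformat() if end else ".."
    return f"{left}/{right}"


# Keyword arguments named as in the argument list for get_daily().
query_args = {
    "monitoring_location_id": location_id("USGS", "10109000"),
    "parameter_code": "00060",  # discharge
    "statistic_id": "00003",  # mean
    "time": time_interval(date(2020, 10, 1), date(2021, 9, 30)),
    "skip_geometry": True,
}

for name, value in query_args.items():
    print(f"{name} = {value!r}")

# With network access, these could be passed on as
# waterdata.get_daily(**query_args); that call is not made here.
```

Building the arguments as a dictionary keeps the identifier and interval formatting in one place, which can help when the same query is repeated for several locations or periods.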