refactor(errors)!: a lean, idiomatic DataRetrievalError taxonomy
Every request failure raises a subclass of DataRetrievalError, so a caller can
handle any of them with a single `except dataretrieval.DataRetrievalError`. The
taxonomy stays small -- it adds only what the underlying httpx exceptions can't
express:
    DataRetrievalError(Exception)
    |- HTTPError                  # .status_code -- the server returned an error status
    |  '- TransientError          # .retry_after -- retryable (429 / 5xx)
    |     |- RateLimited          # 429
    |     '- ServiceUnavailable   # 5xx
    |- RequestTooLarge            # the request can't fit
    |  |- URLTooLong              # 414 / client-side over-long URL
    |  '- Unchunkable             # the Water Data chunker can't split the call
    '- NoDataError                # a 200 response with no data
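For illustration, the tree above could be written as plain Python classes roughly as follows. This is a sketch built from stand-in definitions, not the library's source: the constructor signatures and the helper `caught_by_root` are assumptions made for the example.

```python
# Stand-in classes mirroring the tree above; constructor signatures are assumed.
class DataRetrievalError(Exception):
    """Root of every request failure."""


class HTTPError(DataRetrievalError):
    """The server returned an error status."""

    def __init__(self, message, status_code=None):
        super().__init__(message)
        self.status_code = status_code


class TransientError(HTTPError):
    """Retryable status (429 / 5xx)."""

    def __init__(self, message, status_code=None, retry_after=None):
        super().__init__(message, status_code)
        self.retry_after = retry_after


class RateLimited(TransientError):
    """HTTP 429."""


class ServiceUnavailable(TransientError):
    """HTTP 5xx."""


class RequestTooLarge(DataRetrievalError):
    """The request can't fit."""


class URLTooLong(RequestTooLarge):
    """414, or a URL found to be too long on the client side."""


class Unchunkable(RequestTooLarge):
    """The Water Data chunker can't split the call."""


class NoDataError(DataRetrievalError):
    """A 200 response with no data."""


def caught_by_root(exc):
    """Hypothetical helper: report whether one except clause on the root catches exc."""
    try:
        raise exc
    except DataRetrievalError:
        return True
    except Exception:
        return False
```

Note that in this layout `URLTooLong` and `NoDataError` are not `HTTPError` subclasses, so a handler that only catches `HTTPError` would miss them; catching the root covers all branches.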
One factory -- error_for_status(status, message, *, retry_after) -- maps a
status to its type, and every request path routes through it (the legacy
`query` path, the Water Data chunker, nldi, nadp, streamstats), so a given
status surfaces as the same type everywhere. A fatal 4xx is a generic HTTPError
carrying .status_code (branch on the code; there is no class per code). The
chunker keys retry/resume on TransientError; connection-level failures
(timeouts, DNS) surface as httpx exceptions on the single-shot paths. The typed
errors are picklable, so they survive a pickle / deepcopy back from a
multiprocessing / lithops worker (a chunk-interruption error sheds its live
resume handle to make the trip).
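A minimal sketch of what such a factory might look like, assuming it returns the exception for the caller to raise (the commit text only gives the signature). The classes are stand-ins, the 413 / 414 handling is left out because the text does not say where it is decided, and `round_trip` is a hypothetical helper that uses `copy.deepcopy`, which rebuilds the object through the same `__reduce__` protocol that `pickle` relies on.

```python
import copy


# Stand-in classes; the real ones live in dataretrieval and may differ.
class DataRetrievalError(Exception):
    pass


class HTTPError(DataRetrievalError):
    def __init__(self, message, status_code=None):
        # Only the message goes into .args; extra fields live in __dict__.
        # Because the extra parameters have defaults, the copy/pickle
        # protocol can rebuild the object as cls(message) and then restore
        # the remaining attributes from the saved state.
        super().__init__(message)
        self.status_code = status_code


class TransientError(HTTPError):
    def __init__(self, message, status_code=None, retry_after=None):
        super().__init__(message, status_code)
        self.retry_after = retry_after


class RateLimited(TransientError):
    pass


class ServiceUnavailable(TransientError):
    pass


def error_for_status(status, message, *, retry_after=None):
    """Map an HTTP error status to a typed exception (assumed to return, not raise)."""
    if status == 429:
        return RateLimited(message, status, retry_after)
    if 500 <= status <= 599:
        return ServiceUnavailable(message, status, retry_after)
    # Any other error status: a generic HTTPError carrying the code.
    return HTTPError(message, status)


def round_trip(exc):
    """Hypothetical helper: copy an exception the way a worker hand-off would."""
    return copy.deepcopy(exc)
```

Routing every request path through one function like this is what would make a 429 surface as the same type regardless of which module issued the request.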
A too-long-URL status (413 / 414) on the legacy `query` path keeps the
actionable "split your query" remediation message (the same one the client-side
over-long-URL case raises), rather than degrading to a bare HTTP-status line.
BREAKING CHANGES
- Request failures raise typed DataRetrievalError subclasses instead of bare
ValueError / RuntimeError / httpx.HTTPStatusError. The exceptions are rooted only
at DataRetrievalError(Exception) and no longer also inherit from ValueError /
RuntimeError -- catch DataRetrievalError (or a subclass), not the builtins.
- A fatal 4xx raises HTTPError (read .status_code); there are no per-code types.
- The empty-result error is renamed NoSitesError -> NoDataError (it is raised
from the shared query path for any module, not just NWIS "sites"). NoSitesError
stays as a deprecated alias and will be removed in a future release.
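As a rough migration sketch, a caller that used to wrap requests in `except ValueError` or `except RuntimeError` might move to handlers keyed on the typed errors. Everything below is illustrative: the classes are stand-ins, and `fetch_with_retry`, its parameters, and the choice to return `None` for an empty result are assumptions, not part of the library.

```python
import time


# Stand-in classes; in real code these would be imported from dataretrieval.
class DataRetrievalError(Exception):
    pass


class HTTPError(DataRetrievalError):
    def __init__(self, message, status_code=None):
        super().__init__(message)
        self.status_code = status_code


class TransientError(HTTPError):
    def __init__(self, message, status_code=None, retry_after=None):
        super().__init__(message, status_code)
        self.retry_after = retry_after


class NoDataError(DataRetrievalError):
    pass


def fetch_with_retry(fetch, *, attempts=3, default_wait=1.0, sleep=time.sleep):
    """Call fetch(), retrying only on TransientError.

    An empty result (NoDataError) is treated as "nothing to return".
    Any other DataRetrievalError, such as a fatal 4xx HTTPError,
    propagates to the caller unchanged.
    """
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except NoDataError:
            return None
        except TransientError as err:
            if attempt == attempts:
                raise  # out of attempts: surface the last transient error
            # Prefer the server's hint when one was provided.
            wait = err.retry_after if err.retry_after is not None else default_wait
            sleep(wait)
    return None
```

Because the typed errors no longer inherit from the builtins, an old `except ValueError` around the same call would let all three cases escape.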
Also adds a dataretrieval.exceptions API docs page and a NEWS.md changelog entry.
mypy --strict clean; ruff clean; full suite green (487 passed, 2 skipped); the
Water Data chunker's resume tests pass unchanged.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
NEWS.md (+2 lines changed: 2 additions & 0 deletions)
@@ -1,3 +1,5 @@
+**06/03/2026:** The request-error hierarchy is now unified. Every module (`nwis`, `wqp`, `nldi`, `waterdata`, `nadp`, `streamstats`) raises a subclass of `dataretrieval.DataRetrievalError` on a failed request, so a single `except dataretrieval.DataRetrievalError` spans them all. An HTTP error status surfaces as an `HTTPError` carrying `.status_code` (inspect it to branch on a specific code); the retryable 429/5xx subset is `TransientError` (`RateLimited` / `ServiceUnavailable`, carrying `.retry_after`); and a request too large to satisfy is a `RequestTooLarge` (`URLTooLong` for an over-long single request, `Unchunkable` when the Water Data chunker cannot split a call small enough). Connection-level failures (timeouts, DNS) still surface as `httpx` exceptions on the single-shot paths. **Breaking change:** these exceptions no longer multiply-inherit a built-in; code that caught request failures with `except ValueError` or `except RuntimeError` should switch to `except dataretrieval.DataRetrievalError` (or a specific subclass). The error raised on a 200-but-empty result, formerly `NoSitesError`, is renamed `NoDataError` (the old name leaked NWIS-era "sites" terminology and the condition is general); `NoSitesError` remains as a deprecated alias and will be removed in a future release.
+
**05/17/2026:** The OGC `waterdata` getters (`get_daily`, `get_continuous`, `get_field_measurements`, and the rest of the multi-value-capable functions) now transparently chunk requests whose URLs would otherwise exceed the server's ~8 KB byte limit.
**05/16/2026:** Fixed silent truncation in the paginated `waterdata` request loops (`_walk_pages` and `get_stats_data`). Mid-pagination failures (HTTP 429, 5xx, network error) were previously swallowed — pagination would quietly stop and the function would return whatever rows it had collected, leaving callers with truncated DataFrames they had no way to detect. The loops now status-check every page like the initial request and raise `RuntimeError` on any failure, with the upstream exception chained as `__cause__` and a short menu of recovery actions (wait and retry, reduce the request, or obtain an API token) in the message. **Behavior change**: callers that previously consumed partial DataFrames on transient upstream blips will now see an exception; retry the call (possibly with a smaller `limit` or narrower query).