Commit 01c734b

thodson-usgs and claude committed
docs(waterdata): correct interruption claims to include transport errors
Module docstring, ChunkedCall.resume() Raises, and ChunkedCall._run all listed only 429/5xx as the failures that raise ChunkInterrupted, but _classify_chunk_error also wraps bare httpx.HTTPError (ConnectError, TimeoutException, RemoteProtocolError, ...) and httpx.InvalidURL as ServiceInterrupted (chunking.py:1098). So callers who only caught the 429/5xx case per the docs could miss the transport-error path.

Fix: list transport errors alongside 429/5xx in all three docstrings, and name QuotaExhausted vs ServiceInterrupted by which case maps where.

Surfaced by a docs-vs-code audit; no functional change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent ead1c09 commit 01c734b

1 file changed

Lines changed: 12 additions & 10 deletions

dataretrieval/waterdata/chunking.py

@@ -30,9 +30,10 @@
 the resumable interruption below so a multi-minute quota-window
 reset doesn't block the call.
 
-Interruption: any mid-stream transient failure (429, 5xx) surfaces
-as a ``ChunkInterrupted`` subclass — ``QuotaExhausted`` for 429,
-``ServiceInterrupted`` for 5xx. The exception carries ``.call``, a
+Interruption: any mid-stream transient failure — 429, 5xx, or a bare
+transport error (connect/read timeout, oversize follow-up URL) — surfaces
+as a ``ChunkInterrupted`` subclass: ``QuotaExhausted`` for 429,
+``ServiceInterrupted`` for the rest. The exception carries ``.call``, a
 ``ChunkedCall`` handle that owns the already-completed sub-request
 state (sparse-indexed, since gathered sub-requests complete out of
 order). Call ``.call.resume()`` once the underlying condition clears;
@@ -1596,11 +1597,11 @@ def resume(self) -> tuple[pd.DataFrame, Any]:
         Raises
         ------
         ChunkInterrupted
-            On a mid-stream transient failure
-            (:class:`QuotaExhausted` for 429,
-            :class:`ServiceInterrupted` for 5xx). The resumable handle
-            is on ``exc.call`` — wait for the underlying condition to
-            clear and call ``exc.call.resume()`` again.
+            On a mid-stream transient failure — 429, 5xx, or a bare
+            transport error: :class:`QuotaExhausted` for 429,
+            :class:`ServiceInterrupted` for the rest. The resumable
+            handle is on ``exc.call`` — wait for the underlying
+            condition to clear and call ``exc.call.resume()`` again.
         """
         concurrency = _read_concurrency_env()
         with start_blocking_portal() as portal:
@@ -1613,8 +1614,9 @@ async def _run(self, max_concurrent: int | None) -> tuple[pd.DataFrame, Any]:
 
         Pending sub-requests (:meth:`_pending`) fan out under
         ``asyncio.gather`` with ``return_exceptions=True`` so completed
-        sub-requests survive a sibling's transient failure. On a recognized
-        transient (:class:`RateLimited`, :class:`ServiceUnavailable`) a
+        sub-requests survive a sibling's transient failure. On a
+        recognized transient (:class:`RateLimited`, :class:`ServiceUnavailable`,
+        or a bare ``httpx.HTTPError`` / ``httpx.InvalidURL``) a
         :class:`ChunkInterrupted` subclass is raised carrying ``self`` on
         ``.call``; ``exc.call.resume()`` then re-issues only the unfinished
         indices through this same runner.
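
Illustrative caller-side sketch (not part of this commit): the corrected docstrings imply that a caller should catch the ``ChunkInterrupted`` base class rather than only the 429/5xx cases, since transport errors also arrive as ``ServiceInterrupted``. The three exception classes below are local stand-ins named after the ones in the diff; their constructor signature, the helper ``run_until_complete``, and the wait durations are assumptions made for this example, not the library's API.

```python
import time
from typing import Any, Callable


# Stand-ins named after the classes in chunking.py. The real constructors
# may differ; only the ``.call`` attribute is taken from the docstrings.
class ChunkInterrupted(Exception):
    def __init__(self, message: str, call: Any) -> None:
        super().__init__(message)
        self.call = call  # resumable handle exposing .resume()


class QuotaExhausted(ChunkInterrupted):
    """Raised for HTTP 429."""


class ServiceInterrupted(ChunkInterrupted):
    """Raised for 5xx and for bare transport errors."""


def run_until_complete(
    start: Callable[[], Any],
    max_resumes: int = 3,
    quota_wait: float = 60.0,    # hypothetical wait after a 429
    service_wait: float = 5.0,   # hypothetical wait after 5xx/transport errors
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call ``start``; on any interruption, wait and resume via ``exc.call``."""
    attempt = start
    for resumes_used in range(max_resumes + 1):
        try:
            return attempt()
        except ChunkInterrupted as exc:  # base class covers every subclass
            if resumes_used == max_resumes:
                raise  # resume budget spent; let the caller decide
            wait = quota_wait if isinstance(exc, QuotaExhausted) else service_wait
            sleep(wait)
            # Per the docstrings, resume() re-issues only unfinished sub-requests.
            attempt = exc.call.resume
    raise AssertionError("unreachable")
```

Catching the base class keeps the handler correct if further transient cases are mapped onto either subclass later; the ``isinstance`` check is only used to pick a wait time.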
