feat: Migrate to httpx and add async parallel chunker
Replace ``requests`` with ``httpx`` package-wide and add an opt-in
async parallel fan-out for the multi-value chunker, gated on the
``API_USGS_CONCURRENT`` env var.
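A minimal sketch of how the env-var gate could be read. Only the
variable name, the ``unbounded`` keyword, and the default of 16 come
from this commit; ``read_concurrency`` and its handling of invalid
values are hypothetical.

```python
import os

ENV_VAR = "API_USGS_CONCURRENT"


def read_concurrency(default=16):
    """Return 1 (serial), N > 1 (bounded fan-out), or None (unbounded).

    Hypothetical helper: the real parser may differ, especially in how
    it treats malformed values.
    """
    raw = os.environ.get(ENV_VAR, "").strip().lower()
    if not raw:
        return default
    if raw == "unbounded":
        return None
    value = int(raw)  # a non-numeric value raises ValueError
    if value < 1:
        raise ValueError(f"{ENV_VAR} must be >= 1 or 'unbounded', got {raw!r}")
    return value
```

A result of ``1`` would select the serial path, any larger integer
would size the semaphore, and ``None`` would skip the semaphore.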
* ``httpx`` ships sync and async clients on a unified API, so the
same request shape powers both the synchronous getters callers
use today and the new ``_fan_out_async`` parallel path;
``requests`` has no async client.
* ``API_USGS_CONCURRENT=1`` (default in tests) keeps the serial
``ChunkedCall.resume()`` path over one shared ``httpx.Client``.
``API_USGS_CONCURRENT=N`` (N > 1; default 16 in production) or
``unbounded`` fans the plan out through ``_fan_out_async`` over
one shared ``httpx.AsyncClient``, bounded by
``asyncio.Semaphore(N)``.
* Both paths publish their client on a ``ContextVar``
(``_chunked_session`` / ``_chunked_async_session``) so paginated
helpers downstream reuse the connection pool across every
sub-request of a chunked call.
* The parallel path preserves the same safety contracts as the
serial path. It probes the first sub-request alone to read
``x-ratelimit-remaining`` before fanning out the rest (the
``RequestExceedsQuota`` check), and it uses
``asyncio.gather(..., return_exceptions=True)`` so a transient
failure surfaces as a ``ChunkInterrupted`` whose ``.call`` is a
``ChunkedCall`` holding the sparse-indexed completed sub-requests;
``exc.call.resume()`` re-issues only the unfinished ones via the
sync path.
* The wrapper falls back to the serial path (with a
``UserWarning``) in two cases: when ``asyncio.get_running_loop()``
returns a loop, i.e. one is already running, so Jupyter / IPython
kernels and async apps don't see a confusing ``RuntimeError``;
and when the decorator was set up without a ``fetch_async=``
sibling.
* Three defensive helpers smooth over httpx behaviours that
``requests`` didn't have: ``_safe_request_bytes`` swallows
``httpx.InvalidURL`` so the planner's halving loop keeps
shrinking past httpx's 64 KB URL cap; ``_safe_elapsed`` falls
back to ``timedelta(0)`` when ``.elapsed`` is missing (mock
transports); ``_set_response_url`` rewrites the URL via the
bound request, since httpx makes ``Response.url`` read-only.
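The probe-first, semaphore-bounded fan-out described in the bullets
above can be sketched with the standard library alone. This is an
illustration under assumptions, not the code in this commit:
``fan_out``, ``ChunkInterruptedSketch``, and the ``fetch`` callable
are hypothetical stand-ins for ``_fan_out_async``,
``ChunkInterrupted``, and the real httpx request, and the quota check
on the probe response is omitted.

```python
import asyncio


class ChunkInterruptedSketch(Exception):
    """Hypothetical stand-in for ChunkInterrupted."""

    def __init__(self, completed, pending, cause):
        super().__init__(f"{len(pending)} sub-request(s) unfinished")
        self.completed = completed  # sparse {index: result} of finished chunks
        self.pending = pending      # indices a resume would re-issue
        self.cause = cause          # first exception seen during fan-out


async def fan_out(chunks, fetch, limit=None):
    """Run fetch(chunk) for every chunk, at most `limit` at a time."""
    if not chunks:
        return []

    # Probe: the first sub-request runs alone, before any fan-out.
    completed = {0: await fetch(chunks[0])}

    sem = asyncio.Semaphore(limit) if limit else None  # None = unbounded

    async def guarded(chunk):
        if sem is None:
            return await fetch(chunk)
        async with sem:
            return await fetch(chunk)

    # return_exceptions=True keeps one failure from discarding the rest.
    outcomes = await asyncio.gather(
        *(guarded(chunk) for chunk in chunks[1:]), return_exceptions=True
    )

    first_error = None
    for index, outcome in enumerate(outcomes, start=1):
        if isinstance(outcome, BaseException):
            if first_error is None:
                first_error = outcome
        else:
            completed[index] = outcome

    if first_error is not None:
        pending = [i for i in range(len(chunks)) if i not in completed]
        raise ChunkInterruptedSketch(completed, pending, first_error)
    return [completed[i] for i in range(len(chunks))]
```

Keeping results keyed by index is what lets a resume step re-issue
only the missing positions and still return rows in plan order.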
Tests: ``pyproject.toml`` switches ``requests``/``requests-mock``
to ``httpx``/``pytest-httpx``; ``tests/conftest.py`` adds a
``requests_mock``-shaped shim over ``httpx_mock`` and an autouse
fixture pinning ``API_USGS_CONCURRENT=1`` so historical tests
stay on the deterministic serial path. New async-mode tests cover
the parallel fan-out, the probe-first quota check, the resumable
``ChunkInterrupted.call`` after a mid-fan-out failure, the
running-event-loop fallback, and the missing-``fetch_async``
warning.
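The two fallback cases those tests exercise (a running event loop and
a missing ``fetch_async``) can be illustrated as follows. This is a
sketch only: ``loop_is_running`` and ``run_chunks`` are hypothetical
names, and the warning texts are placeholders.

```python
import asyncio
import warnings


def loop_is_running():
    """True when called from inside a running event loop."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:  # raised when no loop is running in this thread
        return False
    return True


def run_chunks(serial, parallel=None):
    """Pick the serial or parallel path.

    serial:   zero-argument callable running the plan synchronously.
    parallel: zero-argument coroutine function, or None when no async
              fetcher was configured.
    """
    if parallel is None:
        warnings.warn(
            "no async fetcher configured; using the serial path",
            UserWarning,
            stacklevel=2,
        )
        return serial()
    if loop_is_running():
        # asyncio.run() would raise RuntimeError here, so fall back.
        warnings.warn(
            "event loop already running; using the serial path",
            UserWarning,
            stacklevel=2,
        )
        return serial()
    return asyncio.run(parallel())
```

Checking for a loop up front trades parallelism for predictability in
notebooks, where the kernel already owns the event loop.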
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>