Commit c107fc9

thodson-usgs and claude authored
fix(waterdata): materialize numpy/Series numeric params instead of str()-ing them (#308)
A numeric (_NO_NORMALIZE_PARAMS) param (water_year, year, month, day, thresholds, …) passed as a numpy array or pandas Series fell into the `args[k] = v` passthrough in _get_args without being materialized to a list. Downstream, the GET comma-join and the chunker both test `list`/`tuple`, so an ndarray/Series was neither comma-joined nor chunked: e.g. get_peaks(water_year=np.array([2020, 2021])) produced `water_year=%5B2020+2021%5D` (the array's `str()` output, URL-encoded), which the API rejects with HTTP 400, instead of `water_year=2020,2021`. Plain lists already worked.

Split the branch so _NO_NORMALIZE_PARAMS values keep their element types (no string-normalization) but a non-string iterable is still materialized to a list of native Python scalars (`.tolist()` for numpy/pandas, `list()` for generators and other iterables), so the values comma-join in the URL, chunk, and stay JSON-serializable (no numpy reprs in args).

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
1 parent 316af70 commit c107fc9
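The failure mode described in the commit message can be reproduced without numpy. The sketch below is illustrative only: `encode_param` and `FakeArray` are hypothetical stand-ins (not part of dataretrieval) for a GET builder that comma-joins only `list`/`tuple` values and for an array-like whose `str()` looks like a numpy array's.

```python
from urllib.parse import urlencode


def encode_param(name, value):
    """Hypothetical GET builder: comma-joins list/tuple, str()s anything else."""
    if isinstance(value, (list, tuple)):
        value = ",".join(str(x) for x in value)
    return urlencode({name: str(value)})


class FakeArray:
    """Hypothetical array-like: not a list/tuple, prints like '[2020 2021]'."""

    def __init__(self, items):
        self._items = list(items)

    def __str__(self):
        return "[" + " ".join(str(x) for x in self._items) + "]"

    def tolist(self):
        return list(self._items)


years = FakeArray([2020, 2021])

# Passed through untouched, the whole array is str()-ed into the query string.
broken = encode_param("water_year", years)
print(broken)  # water_year=%5B2020+2021%5D

# Materialized to a list first, the values are comma-joined as intended.
fixed = encode_param("water_year", years.tolist())
print(fixed)  # water_year=2020%2C2021
```

The comma appears as `%2C` because `urlencode` percent-encodes it; this matches the form asserted in the commit's tests.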

2 files changed

Lines changed: 33 additions & 2 deletions

dataretrieval/waterdata/utils.py

Lines changed: 11 additions & 2 deletions
@@ -2024,9 +2024,18 @@ def _get_args(
             args[k] = _as_str_list(v, k)
         elif (
             k in _NO_NORMALIZE_PARAMS
-            or isinstance(v, str)
-            or not isinstance(v, Iterable)
+            and isinstance(v, Iterable)
+            and not isinstance(v, str)
         ):
+            # Numeric params (water_year, bbox, thresholds, …) keep their
+            # element types — no string-normalization — but a non-string
+            # iterable (numpy array, pandas Series, generator) is materialized
+            # to a list so the GET comma-join and the chunker, which test
+            # ``list``/``tuple``, handle it instead of str()-ing the whole
+            # array. ``.tolist()`` yields native int/float; ``list()`` covers
+            # generators and other iterables. Scalars/strings fall through.
+            args[k] = v.tolist() if hasattr(v, "tolist") else list(v)
+        elif isinstance(v, str) or not isinstance(v, Iterable):
             args[k] = v
         else:
             args[k] = _normalize_str_iterable(v, k)
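As a rough illustration of the new branch in isolation, the sketch below pulls the materialization rule into a standalone function. The name `materialize` is hypothetical, and the stdlib `array.array` (which also exposes `.tolist()`) stands in for a numpy array or pandas Series so the example runs without third-party packages.

```python
from array import array
from collections.abc import Iterable


def materialize(v):
    """Hypothetical helper mirroring the rule in the diff above.

    Non-string iterables become a plain list; strings and scalars pass through.
    """
    if isinstance(v, Iterable) and not isinstance(v, str):
        # Prefer .tolist() when available (numpy, pandas, array.array);
        # fall back to list() for generators and other iterables.
        return v.tolist() if hasattr(v, "tolist") else list(v)
    return v


print(materialize(array("i", [2020, 2021])))      # [2020, 2021]
print(materialize(y for y in (2020, 2021)))       # [2020, 2021]
print(materialize((2020, 2021)))                  # [2020, 2021]
print(materialize("2020"))                        # 2020
print(materialize(2020))                          # 2020
```

Checking `hasattr(v, "tolist")` rather than importing numpy or pandas keeps the rule duck-typed, so any array-like with a `.tolist()` method is handled the same way.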

tests/waterdata_test.py

Lines changed: 22 additions & 0 deletions
@@ -35,6 +35,7 @@
     _check_profiles,
     _construct_api_requests,
     _construct_cql_request,
+    _get_args,
     _normalize_str_iterable,
 )

@@ -294,6 +295,27 @@ def test_construct_api_requests_numeric_list_joins_with_str():
     assert "water_year=2020%2C2021" in str(req.url)


+def test_get_args_materializes_numpy_and_series_numeric_params():
+    """Regression: numeric (_NO_NORMALIZE_PARAMS) params given as a numpy array
+    or pandas Series must be materialized to a list of native Python scalars so
+    they comma-join in the URL (and stay JSON-serializable) — previously the
+    array/Series repr leaked into the query string."""
+    for value in (np.array([2020, 2021]), pd.Series([2020, 2021])):
+        args = _get_args({"water_year": value})
+        assert args["water_year"] == [2020, 2021]
+        # native Python ints, not numpy scalars (JSON-serializable, no np reprs)
+        assert [type(x) for x in args["water_year"]] == [int, int]
+        req = _construct_api_requests("peaks", **args)
+        assert "water_year=2020%2C2021" in str(req.url)
+
+    # float coordinate arrays (e.g. bbox) likewise materialize to native floats
+    args = _get_args({"bbox": np.array([-92.8, 44.2, -88.9, 46.0])})
+    assert args["bbox"] == [-92.8, 44.2, -88.9, 46.0]
+    assert all(type(x) is float for x in args["bbox"])
+    req = _construct_api_requests("daily", **args)
+    assert "bbox=-92.8%2C44.2%2C-88.9%2C46.0" in str(req.url)
+
+
 def test_construct_api_requests_two_element_date_list_becomes_interval():
     """A two-element date list is interpreted as start/end of an OGC datetime
     interval (joined with '/'), NOT as two discrete dates. The OGC `datetime`
