Skip to content

Commit c05d082

Browse files
thodson-usgsclaude
andcommitted
Polish from /simplify review
Three small follow-ups to c206c0c addressing review findings: 1. Restore the AGENCY-ID hint in _check_monitoring_location_id's TypeError. The refactor through _normalize_str_iterable accidentally dropped the trailing "Expected 'AGENCY-ID' format, e.g., 'USGS-01646500'." that the original wrapper carried. Catch+re-raise so monitoring_location_id keeps its helpful error while the generic helper stays generic. Pinned with a new assertion in test_integer_raises_type_error. 2. Document the date-range exclusion in _normalize_str_iterable's docstring. `time`, `last_modified`, `begin`, `end`, `datetime` bypass this helper because _format_api_dates handles their single-string-or-range semantics inside _construct_api_requests; the exclusion lived only in the prior commit message. 3. Single-pass validation. The helper previously did `values = list(value)` followed by a separate `for v in values` isinstance loop. Folded into one loop that builds the list while validating per-element. Plus tests: hoist `from unittest import mock` to module level (matches the rest of the repo's test files) instead of importing it inside the test body. No behavior change for valid inputs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent c206c0c commit c05d082

2 files changed

Lines changed: 29 additions & 13 deletions

File tree

dataretrieval/waterdata/utils.py

Lines changed: 21 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -1165,12 +1165,17 @@ def _normalize_str_iterable(
11651165
Used by every public waterdata getter for multi-value string parameters
11661166
(``parameter_code``, ``statistic_id``, ``state_name``, ...) so any
11671167
sequence-like input — ``list``, ``tuple``, ``pandas.Series``,
1168-
``pandas.Index``, ``numpy.ndarray``, generators, sets — works at the
1169-
public boundary. The downstream ``_construct_api_requests`` branches
1170-
on ``isinstance(v, (list, tuple))``, so iterables are materialized to
1171-
a ``list`` here. ``Mapping`` types are rejected because iterating a
1168+
``pandas.Index``, ``numpy.ndarray``, generators — works at the public
1169+
boundary. The downstream ``_construct_api_requests`` branches on
1170+
``isinstance(v, (list, tuple))``, so iterables are materialized to a
1171+
``list`` here. ``Mapping`` types are rejected because iterating a
11721172
mapping yields keys, which would be a footgun.
11731173
1174+
Date-range params (``time``, ``last_modified``, ``begin``, ``end``,
1175+
``datetime``) deliberately bypass this helper; their single-string-or-
1176+
two-element-range semantics are handled by ``_format_api_dates`` inside
1177+
``_construct_api_requests``.
1178+
11741179
Parameters
11751180
----------
11761181
value : None, str, or iterable of str
@@ -1199,13 +1204,14 @@ def _normalize_str_iterable(
11991204
f"{param_name} must be a string or iterable of strings, "
12001205
f"not {type(value).__name__} (got {value!r})."
12011206
)
1202-
values = list(value)
1203-
for v in values:
1207+
values: list[str] = []
1208+
for v in value:
12041209
if not isinstance(v, str):
12051210
raise TypeError(
12061211
f"{param_name} elements must be strings, "
12071212
f"not {type(v).__name__} (got {v!r})."
12081213
)
1214+
values.append(v)
12091215
return values
12101216

12111217

@@ -1237,7 +1243,15 @@ def _check_monitoring_location_id(
12371243
If any identifier doesn't contain a hyphen separator
12381244
(per the OGC API spec: AGENCY-ID format, e.g. ``USGS-01646500``).
12391245
"""
1240-
value = _normalize_str_iterable(monitoring_location_id, "monitoring_location_id")
1246+
try:
1247+
value = _normalize_str_iterable(
1248+
monitoring_location_id, "monitoring_location_id"
1249+
)
1250+
except TypeError as exc:
1251+
# Re-raise with the AGENCY-ID hint the generic helper doesn't carry.
1252+
raise TypeError(
1253+
f"{exc} Expected 'AGENCY-ID' format, e.g., 'USGS-01646500'."
1254+
) from None
12411255
if value is None:
12421256
return None
12431257
if isinstance(value, str):

tests/waterdata_test.py

Lines changed: 8 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,6 @@
11
import datetime
22
import sys
3+
from unittest import mock
34

45
import pandas as pd
56
import pytest
@@ -407,9 +408,12 @@ def test_none_passes(self):
407408
assert _check_monitoring_location_id(None) is None
408409

409410
def test_integer_raises_type_error(self):
410-
"""An integer ID raises TypeError with a helpful message."""
411-
with pytest.raises(TypeError, match="not int"):
411+
"""An integer ID raises TypeError with a helpful AGENCY-ID hint."""
412+
with pytest.raises(TypeError, match="not int") as exc_info:
412413
_check_monitoring_location_id(5129115)
414+
# The wrapper appends the AGENCY-ID format hint that the generic
415+
# helper alone doesn't carry.
416+
assert "USGS-01646500" in str(exc_info.value)
413417

414418
def test_integer_in_list_raises_type_error(self):
415419
"""An integer inside a list raises TypeError."""
@@ -541,10 +545,8 @@ def test_get_daily_parameter_code_as_series(self):
541545
URL (or POST body). Post-fix, ``_normalize_str_iterable`` materializes
542546
it to ``list`` at the function boundary.
543547
"""
544-
from unittest import mock as _mock
545-
546-
with _mock.patch("dataretrieval.waterdata.api.get_ogc_data") as fake:
547-
fake.return_value = (pd.DataFrame(), _mock.MagicMock(spec=[]))
548+
with mock.patch("dataretrieval.waterdata.api.get_ogc_data") as fake:
549+
fake.return_value = (pd.DataFrame(), mock.MagicMock(spec=[]))
548550
get_daily(
549551
monitoring_location_id="USGS-05427718",
550552
parameter_code=pd.Series(["00060", "00010"]),

0 commit comments

Comments
 (0)