Skip to content

Commit e8c8e88

Browse files
thodson-usgsclaude
andcommitted
Polish from /simplify review
Three small follow-ups to c206c0c addressing review findings: 1. Restore the AGENCY-ID hint in _check_monitoring_location_id's TypeError. The refactor through _normalize_str_iterable accidentally dropped the trailing "Expected 'AGENCY-ID' format, e.g., 'USGS-01646500'." that the original wrapper carried. Catch+re-raise so monitoring_location_id keeps its helpful error while the generic helper stays generic. Pinned with a new assertion in test_integer_raises_type_error. 2. Document the date-range exclusion in _normalize_str_iterable's docstring. `time`, `last_modified`, `begin`, `end`, `datetime` bypass this helper because _format_api_dates handles their single-string-or-range semantics inside _construct_api_requests; the exclusion lived only in the prior commit message. 3. Single-pass validation. The helper previously did `values = list(value)` followed by a separate `for v in values` isinstance loop. Folded into one loop that builds the list while validating per-element. Plus tests: hoist `from unittest import mock` to module level (matches the rest of the repo's test files) instead of importing it inside the test body. No behavior change for valid inputs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent df3c108 commit e8c8e88

2 files changed

Lines changed: 29 additions & 13 deletions

File tree

dataretrieval/waterdata/utils.py

Lines changed: 21 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -1181,12 +1181,17 @@ def _normalize_str_iterable(
11811181
Used by every public waterdata getter for multi-value string parameters
11821182
(``parameter_code``, ``statistic_id``, ``state_name``, ...) so any
11831183
sequence-like input — ``list``, ``tuple``, ``pandas.Series``,
1184-
``pandas.Index``, ``numpy.ndarray``, generators, sets — works at the
1185-
public boundary. The downstream ``_construct_api_requests`` branches
1186-
on ``isinstance(v, (list, tuple))``, so iterables are materialized to
1187-
a ``list`` here. ``Mapping`` types are rejected because iterating a
1184+
``pandas.Index``, ``numpy.ndarray``, generators — works at the public
1185+
boundary. The downstream ``_construct_api_requests`` branches on
1186+
``isinstance(v, (list, tuple))``, so iterables are materialized to a
1187+
``list`` here. ``Mapping`` types are rejected because iterating a
11881188
mapping yields keys, which would be a footgun.
11891189
1190+
Date-range params (``time``, ``last_modified``, ``begin``, ``end``,
1191+
``datetime``) deliberately bypass this helper; their single-string-or-
1192+
two-element-range semantics are handled by ``_format_api_dates`` inside
1193+
``_construct_api_requests``.
1194+
11901195
Parameters
11911196
----------
11921197
value : None, str, or iterable of str
@@ -1215,13 +1220,14 @@ def _normalize_str_iterable(
12151220
f"{param_name} must be a string or iterable of strings, "
12161221
f"not {type(value).__name__} (got {value!r})."
12171222
)
1218-
values = list(value)
1219-
for v in values:
1223+
values: list[str] = []
1224+
for v in value:
12201225
if not isinstance(v, str):
12211226
raise TypeError(
12221227
f"{param_name} elements must be strings, "
12231228
f"not {type(v).__name__} (got {v!r})."
12241229
)
1230+
values.append(v)
12251231
return values
12261232

12271233

@@ -1253,7 +1259,15 @@ def _check_monitoring_location_id(
12531259
If any identifier doesn't contain a hyphen separator
12541260
(per the OGC API spec: AGENCY-ID format, e.g. ``USGS-01646500``).
12551261
"""
1256-
value = _normalize_str_iterable(monitoring_location_id, "monitoring_location_id")
1262+
try:
1263+
value = _normalize_str_iterable(
1264+
monitoring_location_id, "monitoring_location_id"
1265+
)
1266+
except TypeError as exc:
1267+
# Re-raise with the AGENCY-ID hint the generic helper doesn't carry.
1268+
raise TypeError(
1269+
f"{exc} Expected 'AGENCY-ID' format, e.g., 'USGS-01646500'."
1270+
) from None
12571271
if value is None:
12581272
return None
12591273
if isinstance(value, str):

tests/waterdata_test.py

Lines changed: 8 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,6 @@
11
import datetime
22
import sys
3+
from unittest import mock
34

45
import pandas as pd
56
import pytest
@@ -531,9 +532,12 @@ def test_none_passes(self):
531532
assert _check_monitoring_location_id(None) is None
532533

533534
def test_integer_raises_type_error(self):
534-
"""An integer ID raises TypeError with a helpful message."""
535-
with pytest.raises(TypeError, match="not int"):
535+
"""An integer ID raises TypeError with a helpful AGENCY-ID hint."""
536+
with pytest.raises(TypeError, match="not int") as exc_info:
536537
_check_monitoring_location_id(5129115)
538+
# The wrapper appends the AGENCY-ID format hint that the generic
539+
# helper alone doesn't carry.
540+
assert "USGS-01646500" in str(exc_info.value)
537541

538542
def test_integer_in_list_raises_type_error(self):
539543
"""An integer inside a list raises TypeError."""
@@ -665,10 +669,8 @@ def test_get_daily_parameter_code_as_series(self):
665669
URL (or POST body). Post-fix, ``_normalize_str_iterable`` materializes
666670
it to ``list`` at the function boundary.
667671
"""
668-
from unittest import mock as _mock
669-
670-
with _mock.patch("dataretrieval.waterdata.api.get_ogc_data") as fake:
671-
fake.return_value = (pd.DataFrame(), _mock.MagicMock(spec=[]))
672+
with mock.patch("dataretrieval.waterdata.api.get_ogc_data") as fake:
673+
fake.return_value = (pd.DataFrame(), mock.MagicMock(spec=[]))
672674
get_daily(
673675
monitoring_location_id="USGS-05427718",
674676
parameter_code=pd.Series(["00060", "00010"]),

0 commit comments

Comments
 (0)