Commit 247fa30

fix(models): make deprecation warnings visible under default filters
andreatgretel (PR #594): the YAML-default warning in `get_default_provider_name` and the registry-default warning emitted from inside DataDesigner helpers were attributing to data_designer library frames, not user code. Python's default filter chain includes `ignore::DeprecationWarning`, so library-attributed entries are silenced. As a result, a normal `DataDesigner()` call with a YAML `default:` set showed nothing, and `resolve_model_provider_registry` warnings were similarly invisible.

Two related changes:

1. `warn_at_caller`: extend the default skip-list from `("pydantic",)` to `("pydantic", "pydantic_core", "data_designer")` so the walk escapes both pydantic's validator-dispatch frames and data_designer helper frames before attributing. Also tighten the prefix predicate to exact-or-dotted-prefix matching (`name == p or name.startswith(p + ".")`) so e.g. `pydantic_helpers` is not falsely matched as part of `pydantic` (johnnygreco nit). Allow callers to pass a custom `skip_prefixes` for flexibility. Drop the "skip frame 0+1 unconditionally" guard now that prefix matching covers it.

2. `get_default_provider_name`: switch from `warnings.warn(stacklevel=2)` to `warn_at_caller`. The previous stacklevel pointed into `default_model_settings.py`, which is a library file and therefore silenced under default filters.

Verified the fix empirically with `python -W default`: the warning is now attributed to the user's call site and rendered.

johnnygreco (PR #594): add the missing `test_explicit_default_none_does_not_emit_deprecation_warning` regression for the `self.default is not None` predicate landed in the prior round.

Tests:

- New `test_warning_helpers.py` pins prefix-matching precision (rejects `pydantic_helpers` / `data_designer_other`), default skip-list contents, attribution past skip-prefix frames, and per-call-site dedup behavior.
- `test_get_default_provider_name_warning_attributes_to_user_frame` pins andreatgretel's repro for the YAML-default site.
- `test_explicit_default_warning_attributes_to_user_frame` pins the multi-frame case: construction goes through `resolve_model_provider_registry`, so the walk has to escape both pydantic and data_designer before landing on the test file.
- `test_explicit_default_none_does_not_emit_deprecation_warning` pins johnnygreco's predicate-tightening regression.

3,124 tests pass (540 config + 1,923 engine + 653 interface; +10 net from this round).

Refs #589
Made-with: Cursor
1 parent 481643f commit 247fa30
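
The visibility problem described in the commit message can be reproduced without the library. The following is a minimal sketch, not part of the commit: it approximates the two default `DeprecationWarning` filters inside a `catch_warnings` block and checks whether a warning attributed to a given module would be shown. The helper name `visible_under_default_like_filters` and the file name `example.py` are illustrative assumptions.

```python
import warnings


def visible_under_default_like_filters(module_name: str) -> bool:
    """Return True if a DeprecationWarning attributed to ``module_name`` is shown."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.resetwarnings()
        # Approximation of the default chain: the most recently added filter is
        # consulted first, so the ``__main__`` rule takes precedence over the
        # blanket ``ignore``.
        warnings.simplefilter("ignore", DeprecationWarning)
        warnings.filterwarnings("default", category=DeprecationWarning, module="__main__")
        warnings.warn_explicit(
            "old API",
            DeprecationWarning,
            filename="example.py",  # hypothetical file name
            lineno=1,
            module=module_name,
            registry={},
        )
    return len(caught) == 1


if __name__ == "__main__":
    print(visible_under_default_like_filters("__main__"))
    print(visible_under_default_like_filters("data_designer.config.default_model_settings"))
```

Under these assumed filters, only the warning attributed to `__main__` survives, which is why attribution to a library module makes the deprecation invisible to a user running a plain script.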

5 files changed

Lines changed: 229 additions & 27 deletions

packages/data-designer-config/src/data_designer/config/default_model_settings.py

Lines changed: 9 additions & 3 deletions
@@ -5,7 +5,6 @@
 
 import logging
 import os
-import warnings
 from functools import lru_cache
 from pathlib import Path
 from typing import Any, Literal
@@ -25,6 +24,7 @@
     PREDEFINED_PROVIDERS_MODEL_MAP,
 )
 from data_designer.config.utils.io_helpers import load_config_file, save_config_file
+from data_designer.config.utils.warning_helpers import warn_at_caller
 
 logger = logging.getLogger(__name__)
 
@@ -104,12 +104,18 @@ def get_default_provider_name() -> str | None:
     """
     default = _get_default_providers_file_content(MODEL_PROVIDERS_FILE_PATH).get("default")
     if default is not None:
-        warnings.warn(
+        # ``warn_at_caller`` (rather than ``warnings.warn(stacklevel=2)``) so the
+        # warning attributes to the user's call site rather than this library
+        # module. The only real call path is ``DataDesigner.__init__``, which
+        # is itself a ``data_designer`` frame; under default Python filters,
+        # library-attributed ``DeprecationWarning`` entries are silenced
+        # (``ignore::DeprecationWarning``), so library attribution = invisible
+        # warning. See PR #594 review.
+        warn_at_caller(
             f"The 'default:' key in {MODEL_PROVIDERS_FILE_PATH} is deprecated and will "
             "be removed in a future release. Remove it and specify provider= explicitly "
             "on each ModelConfig instead. See issue #589.",
             DeprecationWarning,
-            stacklevel=2,
         )
     return default

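The reason `stacklevel=2` was not enough can be shown in isolation. The sketch below is an illustration under stated assumptions, not the library's code: it compiles a tiny stand-in "library" (hypothetical module name `fake_lib.settings`, hypothetical file `fake_lib/settings.py`) in which one library function calls another that warns with `stacklevel=2`. The warning lands on the intermediate library frame, not on the user's call.

```python
import warnings

# Stand-in for a library module: get_default() is only ever reached through
# another library function, mirroring the DataDesigner.__init__ call path.
LIBRARY_SOURCE = '''
import warnings

def get_default():
    warnings.warn("'default:' is deprecated", DeprecationWarning, stacklevel=2)

def library_entry_point():
    return get_default()
'''


def attributed_filename() -> str:
    """Call the fake library from 'user' code and report where the warning points."""
    lib_globals = {"__name__": "fake_lib.settings", "__file__": "fake_lib/settings.py"}
    exec(compile(LIBRARY_SOURCE, "fake_lib/settings.py", "exec"), lib_globals)
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always", DeprecationWarning)
        lib_globals["library_entry_point"]()
    return caught[0].filename


if __name__ == "__main__":
    print(attributed_filename())
```

Because `stacklevel` counts a fixed number of frames, it cannot adapt to how many library frames sit between the warning and the user, which is the gap a stack walk closes.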
packages/data-designer-config/src/data_designer/config/utils/warning_helpers.py

Lines changed: 62 additions & 24 deletions
@@ -3,42 +3,81 @@
 
 """Helpers for emitting warnings that attribute correctly to user code.
 
-Pydantic v2 dispatches ``@model_validator`` / ``@field_validator`` callbacks
-through several internal frames. ``warnings.warn(stacklevel=N)`` from inside a
-validator therefore tends to land inside pydantic's machinery rather than at
-the user's ``ModelConfig(...)`` / ``ModelProviderRegistry(...)`` call site.
+Library-internal warnings (typically emitted from a pydantic ``@model_validator``
+or from a helper function) need to be attributed to the *user's* call site, not
+to the library frame that happened to fire them. Two reasons:
 
-That breaks two things:
+1. Attribution — a source location pointing at library internals isn't
+   actionable.
+2. Visibility under default filters — Python's default ``DeprecationWarning``
+   filter is::
 
-1. Attribution — the displayed source location is unhelpful.
-2. Deduplication — Python's default once-per-location dedup key is
+       default::DeprecationWarning:__main__
+       ignore::DeprecationWarning
+
+   Library-attributed ``DeprecationWarning`` entries fall under the second
+   filter and are silenced. Attributing to user code is what gets the warning
+   actually shown.
+
+3. Deduplication — Python's once-per-location dedup key is
    ``(category, module, lineno)``. When every call resolves to the same
-   pydantic-internal line, every warning after the first is silently
-   suppressed regardless of which user file triggered it.
+   library-internal line, every warning after the first is silently suppressed
+   regardless of which user file triggered it.
 
-``warn_at_caller`` walks the stack to the first frame outside pydantic (and
-outside this helper / the calling validator) and uses
-``warnings.warn_explicit`` to attribute the warning there.
+``warn_at_caller`` walks the stack past frames whose module belongs to a known
+internal package (pydantic, data_designer) and uses ``warnings.warn_explicit``
+to attribute the warning at the first user frame.
 """
 
 from __future__ import annotations
 
 import sys
 import warnings
 
+DEFAULT_INTERNAL_PREFIXES: tuple[str, ...] = ("pydantic", "pydantic_core", "data_designer")
+"""Modules whose frames are skipped when walking up to the user's call site.
+
+Matching is exact-or-dotted-prefix (see ``_module_in_prefixes``), so
+``pydantic_helpers`` is *not* treated as part of ``pydantic``."""
+
+
+def _module_in_prefixes(module_name: str, prefixes: tuple[str, ...]) -> bool:
+    """Return True if ``module_name`` belongs to one of the prefix-rooted packages.
+
+    Uses exact-equality plus dotted-prefix matching so that, e.g.,
+    ``pydantic_helpers`` is NOT treated as part of the ``pydantic`` package
+    while ``pydantic.fields`` is. Same for ``data_designer`` vs. a hypothetical
+    ``data_designer_other``.
+    """
+    return any(module_name == prefix or module_name.startswith(prefix + ".") for prefix in prefixes)
+
+
+def warn_at_caller(
+    message: str,
+    category: type[Warning],
+    *,
+    skip_prefixes: tuple[str, ...] = DEFAULT_INTERNAL_PREFIXES,
+) -> None:
+    """Emit ``message`` attributed to the first frame outside ``skip_prefixes``.
+
+    Intended for warnings whose root cause is the user's call site but whose
+    emission point is library code (a pydantic validator, an internal helper,
+    etc.). The walk starts above this helper's own frame and skips every frame
+    whose module belongs to a package in ``skip_prefixes`` until it reaches a
+    user frame.
 
-def warn_at_caller(message: str, category: type[Warning]) -> None:
-    """Emit ``message`` attributed to the first non-pydantic frame above the caller.
+    The default skip set covers:
 
-    Intended to be invoked from inside a pydantic validator. The walk skips this
-    helper's own frame and the calling validator's frame, then walks past any
-    pydantic-internal frames until it finds the user's call site.
+    * ``pydantic`` / ``pydantic_core`` — so warnings emitted from
+      ``@model_validator`` callbacks escape pydantic's dispatch frames.
+    * ``data_designer`` — so warnings emitted from a registry / model-config
+      built deep inside a DataDesigner helper still attribute to the outermost
+      user call. Without this, attribution lands on a library file and Python's
+      default ``DeprecationWarning`` filter silences the warning entirely.
 
     The user frame's ``__warningregistry__`` is passed to
     ``warnings.warn_explicit`` so Python's built-in once-per-location dedup keys
-    on the *user's* (filename, lineno) rather than an internal pydantic frame.
-    That matches how ``warnings.warn`` would dedup if ``stacklevel`` could
-    correctly point at the user.
+    on the *user's* (filename, lineno) rather than an internal frame.
 
     We deliberately do *not* pass ``module_globals`` — it's only used for
     ``linecache`` source-line display, and for scripts run with ``python -c``
@@ -47,11 +86,10 @@ def warn_at_caller(message: str, category: type[Warning]) -> None:
     module")``. Skipping ``module_globals`` keeps the warning path robust at
     the cost of an empty source line in the formatted output.
     """
-    # Skip frame 0 (warn_at_caller) and frame 1 (the validator that called us).
-    frame = sys._getframe(2) if hasattr(sys, "_getframe") else None
+    frame = sys._getframe(1) if hasattr(sys, "_getframe") else None
     while frame is not None:
         module_name = frame.f_globals.get("__name__", "")
-        if not module_name.startswith("pydantic"):
+        if not _module_in_prefixes(module_name, skip_prefixes):
             warnings.warn_explicit(
                 message,
                 category,
@@ -63,5 +101,5 @@ def warn_at_caller(message: str, category: type[Warning]) -> None:
             return
         frame = frame.f_back
 
-    # Fallback: never escaped pydantic frames (or no frame access). Use stacklevel.
+    # Fallback: never escaped library frames (or no frame access). Use stacklevel.
     warnings.warn(message, category, stacklevel=3)
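
Since the diff elides the middle of the `warnings.warn_explicit(...)` call, the following self-contained sketch may help when reading the hunks above. It is a reconstruction under assumptions, not the committed implementation: the keyword arguments passed to `warn_explicit` (`filename`, `lineno`, `module`, `registry`) are inferred from the docstring, and the names `in_prefixes`, `warn_at_caller_sketch`, `demo`, and `fake_lib` are hypothetical.

```python
from __future__ import annotations

import sys
import warnings


def in_prefixes(module_name: str, prefixes: tuple[str, ...]) -> bool:
    # Exact-or-dotted-prefix match: "pydantic.fields" matches "pydantic",
    # while "pydantic_helpers" does not.
    return any(module_name == p or module_name.startswith(p + ".") for p in prefixes)


def warn_at_caller_sketch(message: str, category: type[Warning], *, skip_prefixes: tuple[str, ...]) -> None:
    frame = sys._getframe(1)  # start at the caller of this helper
    while frame is not None:
        module_name = frame.f_globals.get("__name__", "")
        if not in_prefixes(module_name, skip_prefixes):
            # Assumed arguments: attribute to this frame's file and line, and
            # use its module-level registry for dedup.
            warnings.warn_explicit(
                message,
                category,
                filename=frame.f_code.co_filename,
                lineno=frame.f_lineno,
                module=module_name,
                registry=frame.f_globals.setdefault("__warningregistry__", {}),
            )
            return
        frame = frame.f_back
    warnings.warn(message, category, stacklevel=3)


def demo() -> list[warnings.WarningMessage]:
    """Emit from two nested fake-library frames and return what was recorded."""
    lib_globals = {"__name__": "fake_lib.dispatch", "warn_at_caller_sketch": warn_at_caller_sketch}
    source = (
        "def helper():\n"
        "    warn_at_caller_sketch('from-library', DeprecationWarning, skip_prefixes=('fake_lib',))\n"
        "def entry_point():\n"
        "    helper()\n"
    )
    exec(compile(source, "fake_lib/dispatch.py", "exec"), lib_globals)
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        lib_globals["entry_point"]()
    return caught
```

In this sketch the walk passes both `helper` and `entry_point` (module `fake_lib.dispatch`) and attributes the warning to `demo`, the first frame outside the skip list.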

packages/data-designer-config/tests/config/test_default_model_settings.py

Lines changed: 26 additions & 0 deletions
@@ -167,6 +167,32 @@ def test_get_default_provider_name_without_default_key(tmp_path: Path):
         assert get_default_provider_name() is None
 
 
+def test_get_default_provider_name_warning_attributes_to_user_frame(tmp_path: Path):
+    """Regression for PR #594 review (andreatgretel): the YAML-default warning
+    must attribute to the user's call site, not to ``default_model_settings.py``.
+    Python's default filter ignores library-attributed ``DeprecationWarning``
+    entries, so the previous ``stacklevel=2`` attribution rendered the warning
+    invisible under default filters on the only real call path
+    (``DataDesigner.__init__``). See issue #589.
+    """
+    providers_file_path = tmp_path / "providers.yaml"
+    providers_file_path.write_text(
+        json.dumps(dict(providers=[p.model_dump() for p in get_builtin_model_providers()], default="nvidia"))
+    )
+    with patch("data_designer.config.default_model_settings.MODEL_PROVIDERS_FILE_PATH", new=providers_file_path):
+        with warnings.catch_warnings(record=True) as caught:
+            warnings.simplefilter("always", DeprecationWarning)
+            assert get_default_provider_name() == "nvidia"
+
+    matches = [w for w in caught if "'default:' key" in str(w.message)]
+    assert len(matches) == 1, [str(w.message) for w in caught]
+    assert matches[0].filename == __file__, (
+        f"Warning attributed to {matches[0].filename!r} (line {matches[0].lineno}) "
+        f"instead of the test file. Library-attributed DeprecationWarnings are "
+        f"silenced under default filters."
+    )
+
+
 def test_get_default_provider_name_path_does_not_exist():
     with patch("data_designer.config.default_model_settings.MODEL_PROVIDERS_FILE_PATH", new=Path("non_existent_path")):
         with pytest.raises(FileNotFoundError, match=r"Default model providers file not found at 'non_existent_path'"):

test_warning_helpers.py (new file; full path not shown on the page)

Lines changed: 95 additions & 0 deletions
@@ -0,0 +1,95 @@
+# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+
+from __future__ import annotations
+
+import warnings
+
+from data_designer.config.utils.warning_helpers import _module_in_prefixes, warn_at_caller
+
+
+def test_module_in_prefixes_exact_match():
+    assert _module_in_prefixes("pydantic", ("pydantic",)) is True
+
+
+def test_module_in_prefixes_dotted_submodule():
+    assert _module_in_prefixes("pydantic.fields", ("pydantic",)) is True
+    assert _module_in_prefixes("data_designer.config.models", ("data_designer",)) is True
+
+
+def test_module_in_prefixes_rejects_prefix_collision():
+    """Regression for PR #594 review (johnnygreco): ``startswith`` matching
+    naively on the prefix would silently treat ``pydantic_helpers`` as part of
+    the ``pydantic`` package. Anchor on exact-or-dotted-prefix instead.
+    """
+    assert _module_in_prefixes("pydantic_helpers", ("pydantic",)) is False
+    assert _module_in_prefixes("pydanticfoo", ("pydantic",)) is False
+    assert _module_in_prefixes("data_designer_other", ("data_designer",)) is False
+
+
+def test_warn_at_caller_attributes_to_direct_caller():
+    """When called from a non-skipped module, the warning attributes to the
+    caller's frame.
+    """
+    with warnings.catch_warnings(record=True) as caught:
+        warnings.simplefilter("always")
+        warn_at_caller("hello", DeprecationWarning)  # line anchored below
+
+    assert len(caught) == 1
+    assert caught[0].filename == __file__
+    assert "hello" in str(caught[0].message)
+
+
+def test_warn_at_caller_skips_skip_prefix_frames():
+    """The walk should escape any frame whose module is listed in
+    ``skip_prefixes`` and attribute to the first frame outside them. We
+    simulate a library frame by ``exec``-ing a helper with a fake module name
+    in its globals; calling that helper produces a frame whose
+    ``f_globals["__name__"]`` is the fake name, mirroring how a real library
+    frame would appear during the walk.
+    """
+    library_globals: dict[str, object] = {
+        "__name__": "fake_library.dispatch",
+        "warn_at_caller": warn_at_caller,
+        "DeprecationWarning": DeprecationWarning,
+    }
+    exec(
+        "def emit():\n warn_at_caller('from-library', DeprecationWarning, skip_prefixes=('fake_library',))\n",
+        library_globals,
+    )
+    emit = library_globals["emit"]
+
+    with warnings.catch_warnings(record=True) as caught:
+        warnings.simplefilter("always")
+        emit()
+
+    assert len(caught) == 1
+    assert caught[0].filename == __file__, f"Expected attribution at {__file__!r}, got {caught[0].filename!r}"
+
+
+def test_warn_at_caller_default_skips_pydantic_and_data_designer():
+    """Default ``skip_prefixes`` should cover both pydantic and data_designer
+    so warnings emitted from validators inside DataDesigner internals attribute
+    to the user, not to either library.
+    """
+    from data_designer.config.utils.warning_helpers import DEFAULT_INTERNAL_PREFIXES
+
+    assert "pydantic" in DEFAULT_INTERNAL_PREFIXES
+    assert "data_designer" in DEFAULT_INTERNAL_PREFIXES
+
+
+def test_warn_at_caller_dedup_keys_on_user_call_site():
+    """Python's once-per-location dedup keys on (text, category, lineno) inside
+    the *attributing* frame's ``__warningregistry__``. With proper user
+    attribution, two distinct call sites in the user's file each emit a
+    warning under ``default`` filtering, while a repeated call at the same
+    site emits only the first.
+    """
+    with warnings.catch_warnings(record=True) as caught:
+        warnings.simplefilter("default", DeprecationWarning)
+        warn_at_caller("dedup-test", DeprecationWarning)  # site A
+        warn_at_caller("dedup-test", DeprecationWarning)  # site B
+
+    linenos = {w.lineno for w in caught}
+    assert len(caught) == 2, [str(w.message) for w in caught]
+    assert len(linenos) == 2, "Each call site should produce a distinct dedup key"
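
The dedup behavior pinned by the last test can also be observed directly with `warnings.warn_explicit`. This is a small illustration rather than library code: a plain dict stands in for the attributed module's `__warningregistry__`, and the names `count_emitted`, `user_script.py`, and `user_script` are hypothetical.

```python
import warnings


def count_emitted(linenos: list[int]) -> int:
    """Emit the same message once per entry in ``linenos`` and count what is shown."""
    registry: dict = {}  # stand-in for the attributed module's __warningregistry__
    with warnings.catch_warnings(record=True) as caught:
        # "default" shows a warning once per (text, category, lineno) key
        # within a given registry.
        warnings.simplefilter("default", DeprecationWarning)
        for lineno in linenos:
            warnings.warn_explicit(
                "dedup-demo",
                DeprecationWarning,
                filename="user_script.py",  # hypothetical file name
                lineno=lineno,
                module="user_script",
                registry=registry,
            )
    return len(caught)


if __name__ == "__main__":
    print(count_emitted([10, 10]))  # same location twice
    print(count_emitted([10, 20]))  # two distinct locations
```

When every emission resolves to one library-internal line, all of them share a single key, which is the suppression the user-frame attribution avoids.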

packages/data-designer-engine/tests/engine/test_model_provider.py

Lines changed: 37 additions & 0 deletions
@@ -124,6 +124,43 @@ def test_no_default_does_not_emit_deprecation_warning(stub_foo_provider: ModelPr
         ModelProviderRegistry(providers=[stub_foo_provider])
 
 
+def test_explicit_default_none_does_not_emit_deprecation_warning(stub_foo_provider: ModelProvider) -> None:
+    """Pin the predicate tightening from PR #594 review: passing ``default=None``
+    explicitly is semantically equivalent to omitting it (caller is opting *out*
+    of a registry-level default), so the deprecation must NOT fire.
+    """
+    with warnings.catch_warnings():
+        warnings.simplefilter("error", DeprecationWarning)
+        ModelProviderRegistry(providers=[stub_foo_provider], default=None)
+
+
+def test_explicit_default_warning_attributes_to_user_frame(
+    stub_foo_provider: ModelProvider, stub_bar_provider: ModelProvider
+) -> None:
+    """Regression for PR #594 review (andreatgretel): the ``default=`` deprecation
+    warning must attribute to the *user's* call site, not the pydantic-internal
+    or ``data_designer`` library frame that emits it. Library-attributed
+    ``DeprecationWarning`` entries are silenced under Python's default
+    ``ignore::DeprecationWarning`` filter, so attribution determines whether
+    the warning is actually visible.
+
+    Construction goes through ``resolve_model_provider_registry`` so the walk
+    has to escape both pydantic (validator dispatch) and ``data_designer``
+    (the helper that builds the registry) before landing on the test frame.
+    """
+    with warnings.catch_warnings(record=True) as caught:
+        warnings.simplefilter("always", DeprecationWarning)
+        resolve_model_provider_registry([stub_foo_provider, stub_bar_provider], default_provider_name="bar")
+
+    matches = [w for w in caught if "ModelProviderRegistry.default is deprecated" in str(w.message)]
+    assert len(matches) == 1, [str(w.message) for w in caught]
+    assert matches[0].filename == __file__, (
+        f"Warning attributed to {matches[0].filename!r} (line {matches[0].lineno}) "
+        f"instead of the test file. Library-attributed DeprecationWarnings are "
+        f"silenced under default filters."
+    )
+
+
 def test_resolve_single_provider_quiet_under_deprecation(stub_foo_provider: ModelProvider) -> None:
     """Pin the q3 tweak: ``resolve_model_provider_registry`` skips ``default=``
     in the single-provider case so common construction paths stay quiet under
