Commit 5e88645

Merge remote-tracking branch 'origin/main' into andreatgretel/chore/deprecate-mkdocs
Signed-off-by: Andre Manoel <amanoel@nvidia.com>

# Conflicts:
#   docs/concepts/models/configure-model-settings-with-the-cli.md
#   docs/concepts/models/custom-model-settings.md
#   docs/concepts/models/default-model-settings.md
#   docs/concepts/models/model-configs.md
#   docs/concepts/models/model-providers.md
2 parents 1f0f9be + 241b0f8 commit 5e88645

53 files changed

Lines changed: 1677 additions & 1099 deletions

fern/versions/latest/pages/concepts/models/configure-model-settings-with-the-cli.mdx

Lines changed: 0 additions & 7 deletions
@@ -74,12 +74,6 @@ data-designer config providers
 
 **Delete all providers**: Remove all providers and their associated models.
 
-**Change default provider**: Set which provider is used by default. This option is only available when multiple providers are configured.
-
-<Warning title="Deprecated: 'Change default provider' workflow">
-The "Change default provider" workflow is **deprecated** and will be removed in a future release alongside the registry-level default. Specify `provider=` explicitly on each `ModelConfig` instead — the workflow now emits a `DeprecationWarning` when entered. See [issue #589](https://github.com/NVIDIA-NeMo/DataDesigner/issues/589).
-</Warning>
-
 ## Managing Model Configurations
 
 Run the interactive model configuration command:
@@ -128,7 +122,6 @@ data-designer config list
 This command displays:
 
 - **Model Providers**: All configured providers with their endpoints (API keys are masked)
-- **Default Provider**: The currently selected default provider _(deprecated; see [issue #589](https://github.com/NVIDIA-NeMo/DataDesigner/issues/589))_
 - **Model Configurations**: All configured models with their settings
 
 ## Resetting Configurations

fern/versions/latest/pages/concepts/models/custom-model-settings.mdx

Lines changed: 3 additions & 3 deletions
@@ -94,9 +94,9 @@ preview_result.display_sample_record()
 When you only specify `model_configs`, the default model providers (NVIDIA, OpenAI, and OpenRouter) are still available. You only need to create custom providers if you want to connect to different endpoints or modify provider settings.
 </Note>
 
-<Warning title="Always specify `provider=` on `ModelConfig`">
-Leaving `provider` unset (or passing `provider=None`) on `ModelConfig` is **deprecated**. The legacy "implicit default provider" routing — used when `provider` is omitted — emits a `DeprecationWarning` and will be removed in a future release. Always reference the intended provider by name, as the examples below do. See [issue #589](https://github.com/NVIDIA-NeMo/DataDesigner/issues/589).
-</Warning>
+<Note title="Provider is required">
+Every custom `ModelConfig` must reference the intended provider by name. The examples below use the built-in `nvidia` provider.
+</Note>
 
 <Tip title="Mixing Custom and Default Models">
 When you provide custom `model_configs` to `DataDesignerConfigBuilder`, they **replace** the defaults entirely. To use custom model configs in addition to the default configs, use the add_model_config method:

fern/versions/latest/pages/concepts/models/default-model-settings.mdx

Lines changed: 0 additions & 4 deletions
@@ -120,10 +120,6 @@ Both methods operate on the same files, ensuring consistency across your entire
 The default model providers call hosted endpoints operated by NVIDIA, OpenAI, OpenRouter, or their upstream providers. Provider terms and privacy practices apply independently of Data Designer, and free or trial endpoints may log request data for security, operations, or product improvement. Do not submit confidential information or personal data, including faces, voices, screenshots, regulated data, or other sensitive content, unless the selected provider and endpoint are approved for your use case.
 </Warning>
 
-<Warning title="Deprecated: implicit default provider routing">
-The `default:` key in `~/.data-designer/model_providers.yaml` and the registry-level "default provider" concept are **deprecated** and will be removed in a future release. Specify `provider=` explicitly on every `ModelConfig` instead — the built-in defaults above already do this, and a `DeprecationWarning` is now emitted whenever the legacy routing is exercised. See [issue #589](https://github.com/NVIDIA-NeMo/DataDesigner/issues/589).
-</Warning>
-
 <Tip title="Environment Variables">
 Store your API keys in environment variables rather than hardcoding them in your scripts:
 

fern/versions/latest/pages/concepts/models/model-configs.mdx

Lines changed: 5 additions & 1 deletion
@@ -20,9 +20,13 @@ The `ModelConfig` class has the following fields:
 | `alias` | `str` | Yes | Unique identifier for this model configuration (e.g., `"my-text-model"`, `"reasoning-model"`) |
 | `model` | `str` | Yes | Model identifier as recognized by the provider (e.g., `"nvidia/nemotron-3-nano-30b-a3b"`, `"gpt-4"`) |
 | `inference_parameters` | `InferenceParamsT` | No | Controls model behavior during generation. Use `ChatCompletionInferenceParams` for text/code/structured generation or `EmbeddingInferenceParams` for embeddings. Defaults to `ChatCompletionInferenceParams()` if not provided. The generation type is automatically determined by the inference parameters type. See [Inference Parameters](/concepts/models/inference-parameters) for details. |
-| `provider` | `str` | No | Reference to the name of the Provider to use (e.g., `"nvidia"`, `"openai"`, `"openrouter"`). If not specified, one set as the default provider, which may resolve to the first provider if there are more than one |
+| `provider` | `str` | Yes | Reference to the name of the Provider to use (e.g., `"nvidia"`, `"openai"`, `"openrouter"`). |
 | `skip_health_check` | `bool` | No | Whether to skip the health check for this model. Defaults to `False`. Set to `True` to skip health checks when you know the model is accessible or want to defer validation. |
 
+<Warning title="Upgrade note">
+Every `ModelConfig` must now specify `provider`. Existing `model_configs.yaml` entries from older releases that omit `provider` or set it to `null` must be updated with an explicit provider name before loading. Agent tooling that parses `data-designer agent context` should read each model alias item's `provider` field; the top-level `default_provider` and per-item `configured_provider` / `effective_provider` fields are no longer emitted.
+</Warning>
+
 
 ## Examples
 
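
Reviewer note: the upgrade note in this hunk implies a pre-flight check over existing `model_configs.yaml` entries. A minimal sketch of such a check, assuming the file has already been parsed into a list of plain dicts; the helper name `find_entries_missing_provider` and the sample entries are hypothetical and not part of Data Designer:

```python
from typing import Any


def find_entries_missing_provider(model_configs: list[dict[str, Any]]) -> list[str]:
    """Return the aliases of entries that omit `provider` or set it to null."""
    missing = []
    for entry in model_configs:
        # A missing key and an explicit null (None after parsing) both need fixing.
        if entry.get("provider") is None:
            missing.append(entry.get("alias", "<no alias>"))
    return missing


# Hypothetical entries, shaped like the `model_configs` lists in the test fixtures.
entries = [
    {"alias": "test-model", "model": "openai/meta/llama-3.3-70b-instruct", "provider": "openai"},
    {"alias": "legacy-model", "model": "gpt-4"},
    {"alias": "null-provider", "model": "gpt-4", "provider": None},
]
stale = find_entries_missing_provider(entries)
```

Any alias returned here would need an explicit provider name added before the file is loaded by a release containing this commit.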

fern/versions/latest/pages/concepts/models/model-providers.mdx

Lines changed: 0 additions & 4 deletions
@@ -9,10 +9,6 @@ Model providers are external services that host and serve models. Data Designer
 
 A `ModelProvider` defines how Data Designer connects to a provider's API endpoint. When you create a `ModelConfig`, you reference a provider by name, and Data Designer uses that provider's settings to make API calls to the appropriate endpoint.
 
-<Warning title="Deprecated: implicit default provider routing">
-Earlier versions of Data Designer let you omit `provider=` on `ModelConfig` and fall back to a registry-level default — including the `default:` key in `~/.data-designer/model_providers.yaml`. That implicit routing is **deprecated** and will be removed in a future release. Always reference a provider by name on every `ModelConfig`. A `DeprecationWarning` is now emitted when the legacy path is exercised. See [issue #589](https://github.com/NVIDIA-NeMo/DataDesigner/issues/589).
-</Warning>
-
 ## ModelProvider Configuration
 
 The `ModelProvider` class has the following fields:

packages/data-designer-config/src/data_designer/config/default_model_settings.py

Lines changed: 0 additions & 26 deletions
@@ -24,7 +24,6 @@
     PREDEFINED_PROVIDERS_MODEL_MAP,
 )
 from data_designer.config.utils.io_helpers import load_config_file, save_config_file
-from data_designer.config.utils.warning_helpers import warn_at_caller
 
 logger = logging.getLogger(__name__)
 
@@ -95,31 +94,6 @@ def get_default_providers() -> list[ModelProvider]:
     return []
 
 
-def get_default_provider_name() -> str | None:
-    """Return the YAML's ``default:`` provider name, if set.
-
-    Deprecated: this function and the underlying YAML key are deprecated and
-    will be removed in a future release. Specify ``provider=`` explicitly on
-    each ``ModelConfig`` instead. See issue #589.
-    """
-    default = _get_default_providers_file_content(MODEL_PROVIDERS_FILE_PATH).get("default")
-    if default is not None:
-        # ``warn_at_caller`` (rather than ``warnings.warn(stacklevel=2)``) so the
-        # warning attributes to the user's call site rather than this library
-        # module. The only real call path is ``DataDesigner.__init__``, which
-        # is itself a ``data_designer`` frame; under default Python filters,
-        # library-attributed ``DeprecationWarning`` entries are silenced
-        # (``ignore::DeprecationWarning``), so library attribution = invisible
-        # warning. See PR #594 review.
-        warn_at_caller(
-            f"The 'default:' key in {MODEL_PROVIDERS_FILE_PATH} is deprecated and will "
-            "be removed in a future release. Remove it and specify provider= explicitly "
-            "on each ModelConfig instead. See issue #589.",
-            DeprecationWarning,
-        )
-    return default
-
-
 def resolve_seed_default_model_settings() -> None:
     if not MODEL_CONFIGS_FILE_PATH.exists():
         logger.debug(
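
Reviewer note: the comment removed in this hunk turns on how Python attributes a warning to a frame. A minimal standard-library sketch of that attribution, using `warnings.warn` with `stacklevel` rather than the project's `warn_at_caller` helper (whose implementation is not shown in this diff); all function names below are hypothetical:

```python
import warnings


def warn_attributed_to_library() -> None:
    # stacklevel=1 (the default) attributes the warning to this line.
    warnings.warn("legacy path used", DeprecationWarning, stacklevel=1)


def warn_attributed_to_caller() -> None:
    # stacklevel=2 attributes the warning to whoever called this function.
    warnings.warn("legacy path used", DeprecationWarning, stacklevel=2)


def capture(fn) -> warnings.WarningMessage:
    """Call fn and return the single warning it emits."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always", DeprecationWarning)
        fn()  # warnings raised with stacklevel=2 are attributed to this line
    return caught[0]


library_side = capture(warn_attributed_to_library)
caller_side = capture(warn_attributed_to_caller)
```

Comparing `library_side.lineno` with `caller_side.lineno` shows the two warnings landing on different frames. Under Python's default filters a `DeprecationWarning` is displayed only when attributed to `__main__`, which is why a warning attributed to a library module goes unseen.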

packages/data-designer-config/src/data_designer/config/models.py

Lines changed: 3 additions & 22 deletions
@@ -44,7 +44,6 @@
     video_format_from_mime_type,
     video_mime_type,
 )
-from data_designer.config.utils.warning_helpers import warn_at_caller
 
 logger = logging.getLogger(__name__)
 
@@ -642,17 +641,15 @@ class ModelConfig(ConfigBase):
         model: Model identifier (e.g., from build.nvidia.com or other providers).
         inference_parameters: Inference parameters for the model (temperature, top_p, max_tokens, etc.).
             The generation_type is determined by the type of inference_parameters.
-        provider: Name of the model provider. Required in a future release. Leaving
-            ``provider`` unset (or ``None``) currently routes through the registry's
-            implicit default and is **deprecated**; specify ``provider=`` explicitly.
-            See issue #589.
+        provider: Name of the model provider. Must match the ``name`` field of a
+            ``ModelProvider`` registered with the surrounding ``DataDesigner`` instance.
         skip_health_check: Whether to skip the health check for this model. Defaults to False.
     """
 
     alias: str
     model: str
     inference_parameters: InferenceParamsT = Field(default_factory=ChatCompletionInferenceParams)
-    provider: str | None = None
+    provider: str
     skip_health_check: bool = False
 
     @property
@@ -677,22 +674,6 @@ def _convert_inference_parameters(cls, value: Any) -> Any:
             return ChatCompletionInferenceParams(**value)
         return value
 
-    @model_validator(mode="after")
-    def _warn_on_implicit_provider(self) -> Self:
-        if self.provider is None:
-            # Use ``warn_at_caller`` so the warning is attributed to the user's
-            # ``ModelConfig(...)`` / ``model_validate(...)`` call rather than a
-            # pydantic-internal frame. Without this, every call dedupes to the
-            # same pydantic line and only the first emission is shown. See
-            # PR #594 review.
-            warn_at_caller(
-                f"ModelConfig.provider=None is deprecated and will be required in a future release. "
-                f"Specify provider= explicitly on ModelConfig(alias={self.alias!r}, ...). "
-                "See issue #589.",
-                DeprecationWarning,
-            )
-        return self
-
 
 class ModelProvider(ConfigBase):
     """Configuration for a custom model provider.
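
Reviewer note: the new docstring says `provider` must match the `name` of a registered `ModelProvider`. A minimal sketch of that cross-check over plain dicts, not the library's actual validation; `unregistered_provider_refs` and the sample data are hypothetical:

```python
def unregistered_provider_refs(
    model_configs: list[dict[str, str]],
    provider_names: set[str],
) -> dict[str, str]:
    """Map each alias to its provider when that provider is not registered."""
    return {
        cfg["alias"]: cfg["provider"]
        for cfg in model_configs
        if cfg["provider"] not in provider_names
    }


# Provider names taken from the examples in the docs touched by this commit.
registered = {"nvidia", "openai", "openrouter"}
configs = [
    {"alias": "my-text-model", "model": "nvidia/nemotron-3-nano-30b-a3b", "provider": "nvidia"},
    {"alias": "typo-model", "model": "gpt-4", "provider": "opnai"},
]
problems = unregistered_provider_refs(configs, registered)
```

With `provider` now a required `str`, a misspelled name is the remaining failure mode this kind of check would surface.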

packages/data-designer-config/src/data_designer/config/testing/fixtures.py

Lines changed: 1 addition & 0 deletions
@@ -32,6 +32,7 @@ def stub_data_designer_config_str() -> str:
 model_configs:
   - alias: my_own_code_model
     model: openai/meta/llama-3.3-70b-instruct
+    provider: openai
     inference_parameters:
       temperature:
         distribution_type: uniform

packages/data-designer-config/tests/config/test_config_builder.py

Lines changed: 3 additions & 0 deletions
@@ -196,6 +196,7 @@ def test_from_config_auto_wraps_bare_dict() -> None:
             {
                 "alias": "test-model",
                 "model": "openai/meta/llama-3.3-70b-instruct",
+                "provider": "openai",
             }
         ],
         "columns": [
@@ -219,6 +220,7 @@ def test_from_config_passthrough_when_already_wrapped() -> None:
             {
                 "alias": "test-model",
                 "model": "openai/meta/llama-3.3-70b-instruct",
+                "provider": "openai",
             }
         ],
         "columns": [
@@ -253,6 +255,7 @@ def test_from_config_auto_wraps_bare_json_file() -> None:
             {
                 "alias": "test-model",
                 "model": "openai/meta/llama-3.3-70b-instruct",
+                "provider": "openai",
             }
         ],
         "columns": [

packages/data-designer-config/tests/config/test_default_model_settings.py

Lines changed: 0 additions & 59 deletions
@@ -2,7 +2,6 @@
 # SPDX-License-Identifier: Apache-2.0
 
 import json
-import warnings
 from pathlib import Path
 from unittest.mock import patch
 
@@ -14,7 +13,6 @@
     get_builtin_model_providers,
     get_default_inference_parameters,
     get_default_model_configs,
-    get_default_provider_name,
     get_default_providers,
     get_providers_with_missing_api_keys,
     resolve_seed_default_model_settings,
@@ -146,63 +144,6 @@ def test_get_default_providers_path_does_not_exist():
             get_default_providers()
 
 
-def test_get_default_provider_name_with_default_key(tmp_path: Path):
-    """When the YAML carries a non-None ``default:``, the function must
-    return that value AND emit a ``DeprecationWarning`` (regression for #589).
-    """
-    providers_file_path = tmp_path / "providers.yaml"
-    providers_file_path.write_text(
-        json.dumps(dict(providers=[p.model_dump() for p in get_builtin_model_providers()], default="nvidia"))
-    )
-    with patch("data_designer.config.default_model_settings.MODEL_PROVIDERS_FILE_PATH", new=providers_file_path):
-        with pytest.warns(DeprecationWarning, match="'default:' key.*is deprecated"):
-            assert get_default_provider_name() == "nvidia"
-
-
-def test_get_default_provider_name_without_default_key(tmp_path: Path):
-    """Pin the post-deprecation happy path: a YAML without ``default:`` must
-    return ``None`` and NOT emit a ``DeprecationWarning``.
-    """
-    providers_file_path = tmp_path / "providers.yaml"
-    providers_file_path.write_text(json.dumps({"providers": [p.model_dump() for p in get_builtin_model_providers()]}))
-    with patch("data_designer.config.default_model_settings.MODEL_PROVIDERS_FILE_PATH", new=providers_file_path):
-        with warnings.catch_warnings():
-            warnings.simplefilter("error", DeprecationWarning)
-            assert get_default_provider_name() is None
-
-
-def test_get_default_provider_name_warning_attributes_to_user_frame(tmp_path: Path):
-    """Regression for PR #594 review (andreatgretel): the YAML-default warning
-    must attribute to the user's call site, not to ``default_model_settings.py``.
-    Python's default filter ignores library-attributed ``DeprecationWarning``
-    entries, so the previous ``stacklevel=2`` attribution rendered the warning
-    invisible under default filters on the only real call path
-    (``DataDesigner.__init__``). See issue #589.
-    """
-    providers_file_path = tmp_path / "providers.yaml"
-    providers_file_path.write_text(
-        json.dumps(dict(providers=[p.model_dump() for p in get_builtin_model_providers()], default="nvidia"))
-    )
-    with patch("data_designer.config.default_model_settings.MODEL_PROVIDERS_FILE_PATH", new=providers_file_path):
-        with warnings.catch_warnings(record=True) as caught:
-            warnings.simplefilter("always", DeprecationWarning)
-            assert get_default_provider_name() == "nvidia"
-
-    matches = [w for w in caught if "'default:' key" in str(w.message)]
-    assert len(matches) == 1, [str(w.message) for w in caught]
-    assert matches[0].filename == __file__, (
-        f"Warning attributed to {matches[0].filename!r} (line {matches[0].lineno}) "
-        f"instead of the test file. Library-attributed DeprecationWarnings are "
-        f"silenced under default filters."
-    )
-
-
-def test_get_default_provider_name_path_does_not_exist():
-    with patch("data_designer.config.default_model_settings.MODEL_PROVIDERS_FILE_PATH", new=Path("non_existent_path")):
-        with pytest.raises(FileNotFoundError, match=r"Default model providers file not found at 'non_existent_path'"):
-            get_default_provider_name()
-
-
 def test_get_nvidia_api_key():
     with patch("data_designer.config.utils.visualization.os.getenv", return_value="nvidia_api_key"):
         assert get_nvidia_api_key() == "nvidia_api_key"
