
Commit 4c6823c

fix: Updated default reasoning model for nvidia (#568)
* Updated default reasoning model for nvidia
* Updated inference params for super
* Add reasoning_effort to Nemotron Super params, update stale docs
  - Add extra_body.reasoning_effort=medium to NEMOTRON_3_SUPER_120B_A12B_INFERENCE_PARAMS (mirrors GPT-5 config)
  - Update README telemetry example and model-configs.md to use nvidia/nemotron-3-super-120b-a12b instead of openai/gpt-oss-20b
  - Broaden inference-parameters.md reasoning effort tip to cover Nemotron Super
* Remove build-time README accidentally tracked

Co-authored-by: Andre Manoel <amanoel@nvidia.com>
Co-authored-by: Andre Manoel <165937436+andreatgretel@users.noreply.github.com>
1 parent f612822

6 files changed: 23 additions & 14 deletions

README.md

Lines changed: 4 additions & 4 deletions
````diff
@@ -156,17 +156,17 @@ Specifically, a model name that is defined a `ModelConfig` object, is what will
 ```python
 ModelConfig(
     alias="nv-reasoning",
-    model="openai/gpt-oss-20b",
+    model="nvidia/nemotron-3-super-120b-a12b",
     provider="nvidia",
     inference_parameters=ChatCompletionInferenceParams(
-        temperature=0.3,
-        top_p=0.9,
+        temperature=1.0,
+        top_p=0.95,
         max_tokens=4096,
     ),
 )
 ```
 
-The value `openai/gpt-oss-20b` would be collected.
+The value `nvidia/nemotron-3-super-120b-a12b` would be collected.
 
 To disable telemetry capture, set `NEMO_TELEMETRY_ENABLED=false`.
 
````

docs/concepts/models/default-model-settings.md

Lines changed: 1 addition & 1 deletion
````diff
@@ -44,7 +44,7 @@ The following model configurations are automatically available when `NVIDIA_API_
 | Alias | Model | Use Case | Inference Parameters |
 |-------|-------|----------|---------------------|
 | `nvidia-text` | `nvidia/nemotron-3-nano-30b-a3b` | General text generation | `temperature=1.0, top_p=1.0` |
-| `nvidia-reasoning` | `openai/gpt-oss-20b` | Reasoning and analysis tasks | `temperature=0.35, top_p=0.95` |
+| `nvidia-reasoning` | `nvidia/nemotron-3-super-120b-a12b` | Reasoning and analysis tasks | `temperature=1.0, top_p=0.95, extra_body={"reasoning_effort": "medium"}` |
 | `nvidia-vision` | `nvidia/nemotron-nano-12b-v2-vl` | Vision and image understanding | `temperature=0.85, top_p=0.95` |
 | `nvidia-embedding` | `nvidia/llama-3.2-nv-embedqa-1b-v2` | Text embeddings | `encoding_format="float", extra_body={"input_type": "query"}` |
````

docs/concepts/models/inference-parameters.md

Lines changed: 2 additions & 2 deletions
````diff
@@ -24,8 +24,8 @@ The `ChatCompletionInferenceParams` class controls how models generate text comp
 !!! note "Default Values"
     If `temperature`, `top_p`, or `max_tokens` are not provided, the model provider's default values will be used. Different providers and models may have different defaults.
 
-!!! tip "Controlling Reasoning Effort for GPT-OSS Models"
-    For gpt-oss models like `gpt-oss-20b` and `gpt-oss-120b`, you can control the reasoning effort using the `extra_body` parameter:
+!!! tip "Controlling Reasoning Effort for Reasoning Models"
+    For reasoning models like Nemotron 3 Super (`nvidia/nemotron-3-super-120b-a12b`) and GPT-OSS (`gpt-oss-20b`, `gpt-oss-120b`), you can control the reasoning effort using the `extra_body` parameter:
 
     ```python
     import data_designer.config as dd
````

docs/concepts/models/model-configs.md

Lines changed: 3 additions & 3 deletions
````diff
@@ -70,11 +70,11 @@ model_configs = [
     # Reasoning and structured tasks
     dd.ModelConfig(
         alias="reasoning-model",
-        model="openai/gpt-oss-20b",
+        model="nvidia/nemotron-3-super-120b-a12b",
         provider="nvidia",
         inference_parameters=dd.ChatCompletionInferenceParams(
-            temperature=0.3,
-            top_p=0.9,
+            temperature=1.0,
+            top_p=0.95,
             max_tokens=4096,
         ),
     ),
````

packages/data-designer-config/src/data_designer/config/utils/constants.py

Lines changed: 9 additions & 1 deletion
````diff
@@ -336,6 +336,11 @@ class NordColor(Enum):
 DEFAULT_VISION_INFERENCE_PARAMS = {"temperature": 0.85, "top_p": 0.95}
 DEFAULT_EMBEDDING_INFERENCE_PARAMS = {"encoding_format": "float"}
 NEMOTRON_3_NANO_30B_A3B_INFERENCE_PARAMS = {"temperature": 1.0, "top_p": 1.0}
+NEMOTRON_3_SUPER_120B_A12B_INFERENCE_PARAMS = {
+    "temperature": 1.0,
+    "top_p": 0.95,
+    "extra_body": {"reasoning_effort": "medium"},
+}
 GPT5_INFERENCE_PARAMS = {"extra_body": {"reasoning_effort": "medium"}}
 
 PREDEFINED_PROVIDERS_MODEL_MAP = {
@@ -344,7 +349,10 @@ class NordColor(Enum):
         "model": "nvidia/nemotron-3-nano-30b-a3b",
         "inference_parameters": NEMOTRON_3_NANO_30B_A3B_INFERENCE_PARAMS,
     },
-    "reasoning": {"model": "openai/gpt-oss-20b", "inference_parameters": DEFAULT_REASONING_INFERENCE_PARAMS},
+    "reasoning": {
+        "model": "nvidia/nemotron-3-super-120b-a12b",
+        "inference_parameters": NEMOTRON_3_SUPER_120B_A12B_INFERENCE_PARAMS,
+    },
     "vision": {"model": "nvidia/nemotron-nano-12b-v2-vl", "inference_parameters": DEFAULT_VISION_INFERENCE_PARAMS},
     "embedding": {
         "model": "nvidia/llama-3.2-nv-embedqa-1b-v2",
````

packages/data-designer-config/tests/config/test_default_model_settings.py

Lines changed: 4 additions & 3 deletions
````diff
@@ -30,10 +30,11 @@ def test_get_default_inference_parameters():
         top_p=0.95,
     )
     assert get_default_inference_parameters(
-        "reasoning", {"temperature": 0.35, "top_p": 0.95}
+        "reasoning", {"temperature": 1.0, "top_p": 0.95, "extra_body": {"reasoning_effort": "medium"}}
     ) == ChatCompletionInferenceParams(
-        temperature=0.35,
+        temperature=1.0,
         top_p=0.95,
+        extra_body={"reasoning_effort": "medium"},
     )
     assert get_default_inference_parameters(
         "vision", {"temperature": 0.85, "top_p": 0.95}
@@ -59,7 +60,7 @@ def test_get_builtin_model_configs():
     assert builtin_model_configs[0].model == "nvidia/nemotron-3-nano-30b-a3b"
     assert builtin_model_configs[0].provider == "nvidia"
     assert builtin_model_configs[1].alias == "nvidia-reasoning"
-    assert builtin_model_configs[1].model == "openai/gpt-oss-20b"
+    assert builtin_model_configs[1].model == "nvidia/nemotron-3-super-120b-a12b"
     assert builtin_model_configs[1].provider == "nvidia"
     assert builtin_model_configs[2].alias == "nvidia-vision"
     assert builtin_model_configs[2].model == "nvidia/nemotron-nano-12b-v2-vl"
````
