fix: Updated default reasoning model for nvidia#568
Conversation
Greptile Summary

This PR replaces the default NVIDIA reasoning model, `openai/gpt-oss-20b`, with `nvidia/nemotron-3-super-120b-a12b`.
| Filename | Overview |
|---|---|
| packages/data-designer-config/src/data_designer/config/utils/constants.py | Adds NEMOTRON_3_SUPER_120B_A12B_INFERENCE_PARAMS constant and updates the nvidia reasoning entry in PREDEFINED_PROVIDERS_MODEL_MAP to use the new model and params; DEFAULT_REASONING_INFERENCE_PARAMS is retained and still used by the openrouter provider. |
| packages/data-designer-config/tests/config/test_default_model_settings.py | Test expectations updated to match the new model identifier and inference parameters (temperature=1.0, extra_body reasoning_effort) for the nvidia reasoning slot. |
| docs/concepts/models/default-model-settings.md | Documentation table updated to reflect the new nvidia-reasoning model and its inference parameters. |
| docs/concepts/models/inference-parameters.md | Tip section updated to include Nemotron 3 Super alongside GPT-OSS models when describing reasoning effort control via extra_body. |
| README.md | Example ModelConfig in the README updated to show the new model name and adjusted temperature/top_p values. |
| docs/concepts/models/model-configs.md | Example ModelConfig updated to use the new nvidia reasoning model with revised inference parameters. |
Flowchart
```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Provider: nvidia] --> B{Model Role}
    B --> C["text<br/>nvidia/nemotron-3-nano-30b-a3b<br/>temp=1.0, top_p=1.0"]
    B --> D["reasoning<br/>nvidia/nemotron-3-super-120b-a12b<br/>temp=1.0, top_p=0.95<br/>reasoning_effort=medium"]
    B --> E["vision<br/>nvidia/nemotron-nano-12b-v2-vl<br/>temp=0.85, top_p=0.95"]
    B --> F["embedding<br/>nvidia/llama-3.2-nv-embedqa-1b-v2<br/>encoding_format=float"]
    style D fill:#d4f1d4,stroke:#2d8a2d
```
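The role-to-model mapping in the flowchart can be sketched as a plain Python dict. This is a hypothetical reconstruction of the `nvidia` entry in `PREDEFINED_PROVIDERS_MODEL_MAP`; the variable name and exact shape are assumptions, not the actual `constants.py` source.

```python
# Hypothetical reconstruction of the nvidia provider map entry;
# key names and structure are assumptions for illustration.
NVIDIA_MODEL_MAP = {
    "text": {
        "model": "nvidia/nemotron-3-nano-30b-a3b",
        "inference_parameters": {"temperature": 1.0, "top_p": 1.0},
    },
    "reasoning": {
        "model": "nvidia/nemotron-3-super-120b-a12b",
        "inference_parameters": {
            "temperature": 1.0,
            "top_p": 0.95,
            "extra_body": {"reasoning_effort": "medium"},
        },
    },
    "vision": {
        "model": "nvidia/nemotron-nano-12b-v2-vl",
        "inference_parameters": {"temperature": 0.85, "top_p": 0.95},
    },
    "embedding": {
        "model": "nvidia/llama-3.2-nv-embedqa-1b-v2",
        "inference_parameters": {"encoding_format": "float"},
    },
}
```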
Reviews (3): Last reviewed commit: "Remove build-time README accidentally tr..."
Docs preview: https://8c9078e8.dd-docs-preview.pages.dev
Code Review: PR #568
nabinchha left a comment
Thanks for swapping in the new model, @kirit93!
Summary
Replaces the default NVIDIA reasoning model from openai/gpt-oss-20b with nvidia/nemotron-3-super-120b-a12b in the NVIDIA provider map, updates the documentation table to match, and updates the corresponding test assertion. The model name change is correct and the changeset is clean.
Findings
Critical — Let's fix these before merge
packages/data-designer-config/src/data_designer/config/utils/constants.py:349 — Inference parameters don't match model card recommendations

- What: The new model (`nvidia/nemotron-3-super-120b-a12b`) reuses `DEFAULT_REASONING_INFERENCE_PARAMS`, which is `temperature=0.35, top_p=0.95`. The HuggingFace model card explicitly states: "Use `temperature=1.0` and `top_p=0.95` across all tasks and serving backends — reasoning, tool calling, and general chat alike." Every code example in the Quick Start Guide uses `temperature=1.0, top_p=0.95`.
- Why: Running Nemotron 3 Super at `temperature=0.35` is well outside the vendor-recommended operating range. The model was likely tuned and evaluated with `temperature=1.0`, so a much lower temperature could degrade output quality (especially for reasoning traces, where diversity in the thinking process matters). This would silently affect every user relying on the `nvidia-reasoning` default.
- Suggestion: Create a dedicated inference params constant for Nemotron 3 Super:

  ```python
  NEMOTRON_3_SUPER_120B_A12B_INFERENCE_PARAMS = {"temperature": 1.0, "top_p": 0.95}
  ```

  Then reference it in the provider map:

  ```python
  "reasoning": {
      "model": "nvidia/nemotron-3-super-120b-a12b",
      "inference_parameters": NEMOTRON_3_SUPER_120B_A12B_INFERENCE_PARAMS,
  },
  ```

  This follows the same pattern used for `NEMOTRON_3_NANO_30B_A3B_INFERENCE_PARAMS` in the text slot. It also keeps `DEFAULT_REASONING_INFERENCE_PARAMS` unchanged for other providers (OpenRouter still uses `gpt-oss-20b`).
docs/concepts/models/default-model-settings.md:47 — Documentation table shows wrong inference parameters

- What: The docs table for `nvidia-reasoning` shows `temperature=0.35, top_p=0.95` — matching the code, but mismatched with the model card.
- Why: Once the code is updated per the above, the docs need to reflect the actual parameters (`temperature=1.0, top_p=0.95`).
- Suggestion: Update the table row:

  ```markdown
  | `nvidia-reasoning` | `nvidia/nemotron-3-super-120b-a12b` | Reasoning and analysis tasks | `temperature=1.0, top_p=0.95` |
  ```
Suggestions — Take it or leave it
packages/data-designer-config/tests/config/test_default_model_settings.py:33 — Test hardcodes old inference params for reasoning

- What: `test_get_default_inference_parameters` passes `temperature=0.35` for the `"reasoning"` role. This test verifies the parsing function (not the provider map), so it won't break — but once you introduce a Nemotron-specific constant, you might want an additional test or parametrize case that exercises `temperature=1.0, top_p=0.95` through the same path, so it's validated end-to-end.
- Why: Extra confidence that the new params round-trip correctly through `ChatCompletionInferenceParams`.
- Suggestion: Add a parametrize case or a standalone assertion for the Nemotron 3 Super params.
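A hedged sketch of what that added parametrize case could look like. The parsing helper below is a minimal stand-in for the real path through `ChatCompletionInferenceParams`; the function name and signature are assumptions, not the repo's actual test module.

```python
import pytest

NEMOTRON_3_SUPER_120B_A12B_INFERENCE_PARAMS = {"temperature": 1.0, "top_p": 0.95}

def parse_inference_parameters(raw):
    # Stand-in for the real parsing path (ChatCompletionInferenceParams);
    # here it only checks for known keys and returns the params unchanged.
    assert set(raw) <= {"temperature", "top_p", "extra_body"}
    return dict(raw)

@pytest.mark.parametrize("params", [NEMOTRON_3_SUPER_120B_A12B_INFERENCE_PARAMS])
def test_nemotron_super_params_round_trip(params):
    # Round-trip check: the new defaults should survive parsing unchanged.
    parsed = parse_inference_parameters(params)
    assert parsed["temperature"] == 1.0
    assert parsed["top_p"] == 0.95
```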
What Looks Good
- Clean, minimal changeset — only three files touched, all directly related. No stray changes.
- Docs updated alongside code — the default model settings page was updated in the same PR, which keeps them in sync.
- Test assertion updated — the model name assertion in `test_get_builtin_model_configs` was correctly updated to match the new model.
Verdict
Needs changes — The inference parameters are the main issue. The model card is very explicit about `temperature=1.0` being required across all tasks. This should be straightforward to fix by adding a model-specific constant (following the existing `NEMOTRON_3_NANO_30B_A3B_INFERENCE_PARAMS` pattern) and updating the docs table.
This review was generated by an AI assistant.
```python
DEFAULT_VISION_INFERENCE_PARAMS = {"temperature": 0.85, "top_p": 0.95}
DEFAULT_EMBEDDING_INFERENCE_PARAMS = {"encoding_format": "float"}
NEMOTRON_3_NANO_30B_A3B_INFERENCE_PARAMS = {"temperature": 1.0, "top_p": 1.0}
NEMOTRON_3_SUPER_120B_A12B_INFERENCE_PARAMS = {"temperature": 1.0, "top_p": 0.95}
```
Nemotron Super supports `reasoning_effort` as a request param (same as gpt-oss did). Might be worth setting a default here — without it the model runs full chain-of-thought on every call, which can add up for bulk generation. We don't set it for gpt-oss either, so not blocking, but since we're touching this anyway:

```python
NEMOTRON_3_SUPER_120B_A12B_INFERENCE_PARAMS = {"temperature": 1.0, "top_p": 0.95, "extra_body": {"reasoning_effort": "medium"}}
```

Mirrors what we do for GPT-5 on line 340.
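For context, here is a minimal sketch of how an `extra_body` entry merges into the JSON payload of an OpenAI-compatible chat request. The merge helper is illustrative only, not the SDK's or this repo's actual implementation.

```python
def build_request_body(model, messages, inference_parameters):
    """Merge inference parameters into a chat request body; any
    'extra_body' fields land at the top level of the JSON payload."""
    params = dict(inference_parameters)
    extra_body = params.pop("extra_body", {})
    return {"model": model, "messages": messages, **params, **extra_body}

body = build_request_body(
    "nvidia/nemotron-3-super-120b-a12b",
    [{"role": "user", "content": "hi"}],
    {"temperature": 1.0, "top_p": 0.95, "extra_body": {"reasoning_effort": "medium"}},
)
# body now carries reasoning_effort as a top-level request field,
# which is how non-standard params reach the serving backend
```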
A few docs still show `openai/gpt-oss-20b`.

(The OpenRouter references are intentional, since OpenRouter still uses `gpt-oss-20b`.)
Not a blocker for this PR, but worth flagging: a build-time README appears to have been accidentally tracked. Easiest fix is probably removing the file.
- Add `extra_body.reasoning_effort=medium` to `NEMOTRON_3_SUPER_120B_A12B_INFERENCE_PARAMS` (mirrors GPT-5 config)
- Update README telemetry example and model-configs.md to use `nvidia/nemotron-3-super-120b-a12b` instead of `openai/gpt-oss-20b`
- Broaden inference-parameters.md reasoning effort tip to cover Nemotron Super
Updated default model for NVIDIA model provider. Swapped GPT-OSS 20B with Nemotron 3 Super.