
fix: Updated default reasoning model for nvidia #568

Merged
andreatgretel merged 5 commits into main from kirit93/update-default-models
Apr 22, 2026

Conversation

@kirit93
Contributor

@kirit93 kirit93 commented Apr 22, 2026

Updated default model for NVIDIA model provider. Swapped GPT-OSS 120B with Nemotron 3 Super.

@kirit93 kirit93 requested a review from a team as a code owner April 22, 2026 15:56
@kirit93 kirit93 deployed to agentic-ci April 22, 2026 15:57 — with GitHub Actions Active
@kirit93 kirit93 self-assigned this Apr 22, 2026
@greptile-apps
Contributor

greptile-apps Bot commented Apr 22, 2026

Greptile Summary

This PR changes the default NVIDIA reasoning model from openai/gpt-oss-20b to nvidia/nemotron-3-super-120b-a12b and updates its inference parameters (temperature → 1.0, adds extra_body={"reasoning_effort": "medium"}). All changes are consistent across constants, tests, and documentation; the previously used DEFAULT_REASONING_INFERENCE_PARAMS constant is correctly retained for the openrouter provider.

Confidence Score: 5/5

This PR is safe to merge — it is a well-scoped constant swap with consistent updates across code, tests, and docs.

All changed files are internally consistent: the new model identifier and inference parameters are correctly reflected in constants.py, the test assertions, and all documentation pages. No logic errors or missing updates were found.

No files require special attention.

Important Files Changed

Filename Overview
packages/data-designer-config/src/data_designer/config/utils/constants.py Adds NEMOTRON_3_SUPER_120B_A12B_INFERENCE_PARAMS constant and updates the nvidia reasoning entry in PREDEFINED_PROVIDERS_MODEL_MAP to use the new model and params; DEFAULT_REASONING_INFERENCE_PARAMS is retained and still used by the openrouter provider.
packages/data-designer-config/tests/config/test_default_model_settings.py Test expectations updated to match the new model identifier and inference parameters (temperature=1.0, extra_body reasoning_effort) for the nvidia reasoning slot.
docs/concepts/models/default-model-settings.md Documentation table updated to reflect the new nvidia-reasoning model and its inference parameters.
docs/concepts/models/inference-parameters.md Tip section updated to include Nemotron 3 Super alongside GPT-OSS models when describing reasoning effort control via extra_body.
README.md Example ModelConfig in the README updated to show the new model name and adjusted temperature/top_p values.
docs/concepts/models/model-configs.md Example ModelConfig updated to use the new nvidia reasoning model with revised inference parameters.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Provider: nvidia] --> B{Model Role}
    B --> C[text
nvidia/nemotron-3-nano-30b-a3b
temp=1.0, top_p=1.0]
    B --> D[reasoning
nvidia/nemotron-3-super-120b-a12b
temp=1.0, top_p=0.95
reasoning_effort=medium]
    B --> E[vision
nvidia/nemotron-nano-12b-v2-vl
temp=0.85, top_p=0.95]
    B --> F[embedding
nvidia/llama-3.2-nv-embedqa-1b-v2
encoding_format=float]

    style D fill:#d4f1d4,stroke:#2d8a2d


@github-actions
Contributor

github-actions Bot commented Apr 22, 2026

Docs preview: https://8c9078e8.dd-docs-preview.pages.dev

Notebook tutorials are placeholder-only in previews.

@github-actions
Contributor

Code Review: PR #568 - fix: Updated default reasoning model for nvidia

Summary

Swaps the default nvidia-reasoning built-in model from openai/gpt-oss-20b to nvidia/nemotron-3-super-120b-a12b in three synchronized places:

  • packages/data-designer-config/src/data_designer/config/utils/constants.py (source of truth)
  • packages/data-designer-config/tests/config/test_default_model_settings.py (assertion updated in lockstep)
  • docs/concepts/models/default-model-settings.md (user-facing table)

Inference parameters (temperature=0.35, top_p=0.95) and the nvidia-reasoning alias are unchanged. Only the underlying model ID changes.

Net diff: +6 / −3.

Findings

Correctness

  • The swap is made consistently in the config, the corresponding test assertion, and the documented table — all three stay in sync.
  • The openrouter provider's reasoning model at constants.py:368 remains openai/gpt-oss-20b, which is intentional and correct (the scope of this PR is the NVIDIA provider only). Test assertion at test_default_model_settings.py:86 for the openrouter entry still matches.
  • The new model name nvidia/nemotron-3-super-120b-a12b matches the naming convention used elsewhere in the repo (e.g., docs/assets/recipes/trace_ingestion/agent_rollout_distillation.py:259, docs/devnotes/posts/owning-the-model-stack.md).

Style / Conventions

  • The single-line dict in the original was reformatted to a multi-line dict when the model string grew longer — matches the formatting style already used for neighboring entries (text, embedding) in the same map. Consistent with project conventions.
  • No import or typing concerns; data-only change.

Documentation

  • docs/concepts/models/default-model-settings.md is updated to reflect the new model. Good.
  • The PR description says "Swapped GPT-OSS 120B with Nemotron 3 Super," but the previous model was openai/gpt-oss-20b (20B, not 120B). Minor — the PR description is wrong but the diff is right. Worth correcting in the PR body for historical clarity, but not blocking.
  • docs/concepts/models/inference-parameters.md:28 still references gpt-oss models generically as examples of reasoning-effort tuning. That guidance is still accurate for the openrouter provider's default, so no change needed.

Test Coverage

  • The positional-index assertion (builtin_model_configs[1].model == ...) is updated. Note that this test is brittle to ordering changes in PREDEFINED_PROVIDERS_MODEL_MAP, but that is a pre-existing property of the test, not introduced here.
  • No new behavior is introduced, so no new tests are required.
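The brittleness noted above could be avoided by looking configs up by alias rather than position. A minimal sketch of that idea, using an illustrative ModelConfig stand-in (the repo's actual config class, aliases, and accessor names are assumptions here, not its real API):

```python
# Hypothetical sketch of an alias-based lookup that would make the test
# robust to reordering in PREDEFINED_PROVIDERS_MODEL_MAP. The ModelConfig
# shape and aliases below are illustrative stand-ins, not the repo's API.
from dataclasses import dataclass

@dataclass
class ModelConfig:
    alias: str
    model: str

# Stand-in for the list the test currently indexes positionally.
builtin_model_configs = [
    ModelConfig(alias="nvidia-text", model="nvidia/nemotron-3-nano-30b-a3b"),
    ModelConfig(alias="nvidia-reasoning", model="nvidia/nemotron-3-super-120b-a12b"),
]

def by_alias(configs: list[ModelConfig], alias: str) -> ModelConfig:
    # next() raises StopIteration if the alias is missing, so a renamed or
    # reordered entry fails loudly instead of matching the wrong index.
    return next(c for c in configs if c.alias == alias)

assert by_alias(builtin_model_configs, "nvidia-reasoning").model == (
    "nvidia/nemotron-3-super-120b-a12b"
)
```

An assertion written this way keeps passing no matter where the nvidia-reasoning entry sits in the map.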

Risk / Compatibility

  • User-visible behavior change: anyone relying on nvidia-reasoning implicitly will now get a different (and much larger: 120B vs 20B) model on the next run. Cost, latency, and rate-limit characteristics will differ. This is the intent of the PR, but downstream users should be made aware — worth a line in release notes / changelog if the project maintains one.
  • The chosen inference parameters (temperature=0.35, top_p=0.95) were tuned against GPT-OSS-20B. They may not be optimal for Nemotron-3-Super-120B, but keeping them constant is a reasonable default — no signal in the diff that re-tuning was considered.
  • No migration or deprecation path is provided for users who wanted the old default; they can still configure the model explicitly, so this is acceptable.

Security

  • No security implications. Config-only change with no new code paths, no secrets, no network behavior change.

Verdict

Looks good. Minimal, focused, well-scoped config update with matching test and docs changes. The only nit is the PR description's "120B" → should read "20B" to accurately describe what was swapped out. Safe to merge.

Contributor

@nabinchha nabinchha left a comment


Thanks for swapping in the new model, @kirit93!

Summary

Replaces openai/gpt-oss-20b with nvidia/nemotron-3-super-120b-a12b as the default NVIDIA reasoning model in the NVIDIA provider map, updates the documentation table to match, and updates the corresponding test assertion. The model name change is correct and the changeset is clean.

Findings

Critical — Let's fix these before merge

packages/data-designer-config/src/data_designer/config/utils/constants.py:349 — Inference parameters don't match model card recommendations

  • What: The new model (nvidia/nemotron-3-super-120b-a12b) reuses DEFAULT_REASONING_INFERENCE_PARAMS which is temperature=0.35, top_p=0.95. The HuggingFace model card explicitly states: "Use temperature=1.0 and top_p=0.95 across all tasks and serving backends — reasoning, tool calling, and general chat alike." Every code example in the Quick Start Guide uses temperature=1.0, top_p=0.95.
  • Why: Running Nemotron 3 Super at temperature=0.35 is well outside the vendor-recommended operating range. The model was likely tuned and evaluated with temperature=1.0, so a much lower temperature could degrade output quality (especially for reasoning traces where diversity in the thinking process matters). This would silently affect every user relying on the nvidia-reasoning default.
  • Suggestion: Create a dedicated inference params constant for Nemotron 3 Super and use it in the provider map:
    NEMOTRON_3_SUPER_120B_A12B_INFERENCE_PARAMS = {"temperature": 1.0, "top_p": 0.95}
    Then reference it:
    "reasoning": {
        "model": "nvidia/nemotron-3-super-120b-a12b",
        "inference_parameters": NEMOTRON_3_SUPER_120B_A12B_INFERENCE_PARAMS,
    },
    This follows the same pattern used for NEMOTRON_3_NANO_30B_A3B_INFERENCE_PARAMS in the text slot. It also keeps DEFAULT_REASONING_INFERENCE_PARAMS unchanged for other providers (OpenRouter still uses gpt-oss-20b).

docs/concepts/models/default-model-settings.md:47 — Documentation table shows wrong inference parameters

  • What: The docs table for nvidia-reasoning shows temperature=0.35, top_p=0.95 — matching the code, but mismatched with the model card.
  • Why: Once the code is updated per the above, the docs need to reflect the actual parameters (temperature=1.0, top_p=0.95).
  • Suggestion: Update the table row:
    | `nvidia-reasoning` | `nvidia/nemotron-3-super-120b-a12b` | Reasoning and analysis tasks | `temperature=1.0, top_p=0.95` |
    

Suggestions — Take it or leave it

packages/data-designer-config/tests/config/test_default_model_settings.py:33 — Test hardcodes old inference params for reasoning

  • What: test_get_default_inference_parameters passes temperature=0.35 for the "reasoning" role. This test verifies the parsing function (not the provider map), so it won't break — but once you introduce a Nemotron-specific constant, you might want an additional test or parametrize case that exercises temperature=1.0, top_p=0.95 through the same path, so it's validated end-to-end.
  • Why: Extra confidence that the new params round-trip correctly through ChatCompletionInferenceParams.
  • Suggestion: Add a parametrize case or a standalone assertion for the Nemotron 3 Super params.
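The suggested round-trip check could look like the sketch below, with a dataclass standing in for ChatCompletionInferenceParams (its real fields and validation are assumptions here):

```python
# Hypothetical sketch: round-trip the suggested Nemotron 3 Super defaults
# through a params model. "InferenceParams" is a stand-in for the repo's
# ChatCompletionInferenceParams; its fields are assumed, not verified.
from dataclasses import dataclass

NEMOTRON_3_SUPER_120B_A12B_INFERENCE_PARAMS = {"temperature": 1.0, "top_p": 0.95}

@dataclass
class InferenceParams:
    temperature: float
    top_p: float

# Parametrize-style cases: each (role, raw_params) pair goes through the
# same construction path, mirroring test_get_default_inference_parameters.
cases = [
    ("reasoning", NEMOTRON_3_SUPER_120B_A12B_INFERENCE_PARAMS),
    ("text", {"temperature": 1.0, "top_p": 1.0}),
]

for role, raw in cases:
    parsed = InferenceParams(**raw)
    assert parsed.temperature == raw["temperature"]
    assert parsed.top_p == raw["top_p"]
```

In the real suite this would be a pytest.mark.parametrize over the same cases rather than a loop.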

What Looks Good

  • Clean, minimal changeset — only three files touched, all directly related. No stray changes.
  • Docs updated alongside code — the default model settings page was updated in the same PR, which keeps them in sync.
  • Test assertion updated — the model name assertion in test_get_builtin_model_configs was correctly updated to match the new model.

Verdict

Needs changes — The inference parameters are the main issue. The model card is very explicit about temperature=1.0 being required across all tasks. This should be straightforward to fix by adding a model-specific constant (following the existing NEMOTRON_3_NANO_30B_A3B_INFERENCE_PARAMS pattern) and updating the docs table.


This review was generated by an AI assistant.

nabinchha
nabinchha previously approved these changes Apr 22, 2026
DEFAULT_VISION_INFERENCE_PARAMS = {"temperature": 0.85, "top_p": 0.95}
DEFAULT_EMBEDDING_INFERENCE_PARAMS = {"encoding_format": "float"}
NEMOTRON_3_NANO_30B_A3B_INFERENCE_PARAMS = {"temperature": 1.0, "top_p": 1.0}
NEMOTRON_3_SUPER_120B_A12B_INFERENCE_PARAMS = {"temperature": 1.0, "top_p": 0.95}
Contributor


Nemotron Super supports reasoning_effort as a request param (same as gpt-oss did). Might be worth setting a default here - without it the model runs full chain-of-thought on every call, which can add up for bulk generation. We don't set it for gpt-oss either so not blocking, but since we're touching this anyway:

NEMOTRON_3_SUPER_120B_A12B_INFERENCE_PARAMS = {"temperature": 1.0, "top_p": 0.95, "extra_body": {"reasoning_effort": "medium"}}

Mirrors what we do for GPT-5 on line 340.

@andreatgretel
Contributor

A few docs still show gpt-oss-20b as the nvidia reasoning model:

  • README.md:159,169 and packages/data-designer/README.md:159,169 - telemetry example
  • docs/concepts/models/model-configs.md:73 - example code snippet
  • docs/concepts/models/inference-parameters.md:27-28 - "Controlling Reasoning Effort for GPT-OSS Models" tip could mention Nemotron Super now that it's the nvidia default

(Openrouter refs in constants.py:369, default-model-settings.md:70, and test:86 are correct - different provider.)

@andreatgretel
Contributor

Not a blocker for this PR, but worth flagging: ~/.data-designer/model_configs.yaml caches the builtin defaults on first run and never refreshes them (resolve_seed_default_model_settings only writes if the file doesn't exist). Existing installs will silently keep using gpt-oss-20b even after upgrading to a version with this change.

Easiest fix is probably removing the if not exists guard so it always writes current builtins on startup. Could be a follow-up PR since it affects all defaults, not just this model swap.
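One possible shape of that follow-up, sketched with assumed names (resolve_seed_default_model_settings is taken from the comment above; the file path, signature, and serializer are illustrative stand-ins):

```python
# Hypothetical sketch of the "always refresh" variant: drop the existence
# guard so current builtins are written on every startup. The function
# name comes from the discussion above; everything else is assumed.
from pathlib import Path

def render_yaml(settings: dict) -> str:
    # Minimal stand-in serializer; the real code would use yaml.safe_dump.
    return "\n".join(f"{key}: {value}" for key, value in settings.items())

def resolve_seed_default_model_settings(config_path: Path, builtins: dict) -> dict:
    # Previously guarded by "if not config_path.exists()". Writing
    # unconditionally means upgrades pick up new defaults, at the cost of
    # clobbering user edits to the file; a fuller fix would merge or
    # version the seeded section instead of overwriting it.
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(render_yaml(builtins))
    return builtins
```

The clobbering tradeoff is why this probably belongs in its own PR, as the comment suggests.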

- Add extra_body.reasoning_effort=medium to
  NEMOTRON_3_SUPER_120B_A12B_INFERENCE_PARAMS (mirrors GPT-5 config)
- Update README telemetry example and model-configs.md to use
  nvidia/nemotron-3-super-120b-a12b instead of openai/gpt-oss-20b
- Broaden inference-parameters.md reasoning effort tip to cover
  Nemotron Super
@andreatgretel andreatgretel merged commit 4c6823c into main Apr 22, 2026
50 checks passed