Commit 998d535
fix: guard nim_langchain against ChatNVIDIA model lookup crash (#1843)
Background: https://forums.developer.nvidia.com/t/nim-api-duplicate-model-issue/365698
Remove this workaround once nemo-agent-toolkit depends on a release containing the upstreamed fix: langchain-ai/langchain-nvidia#282
## Problem
ChatNVIDIA can crash with an AssertionError when a model is not present in the static MODEL_TABLE and appears more than once in the NIM /v1/models API response. In _NVIDIAClient.__init__(), the code asserts that there is at most one matching candidate, but duplicate API entries can produce two:
```
candidates = [model for model in self.available_models if model.id == self.mdl_name]
assert len(candidates) <= 1
```
This currently affects nvidia/nemotron-3-super-120b-a12b and could affect any future model that meets both conditions. The error propagates unhandled through nim_langchain(), surfacing as an opaque crash.
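The failure mode can be reproduced offline by simulating a duplicated /v1/models response. The SimpleNamespace objects and the filter below are illustrative stand-ins for the library's internal model objects, not its actual types:

```python
from types import SimpleNamespace

# Stand-in for a /v1/models response that contains a duplicate entry.
available_models = [
    SimpleNamespace(id="nvidia/nemotron-3-super-120b-a12b"),
    SimpleNamespace(id="nvidia/nemotron-3-super-120b-a12b"),  # duplicate
    SimpleNamespace(id="meta/llama-3.3-70b-instruct"),
]

mdl_name = "nvidia/nemotron-3-super-120b-a12b"
candidates = [m for m in available_models if m.id == mdl_name]

try:
    # Same uniqueness assertion as _NVIDIAClient.__init__()
    assert len(candidates) <= 1
except AssertionError:
    print(f"AssertionError: {len(candidates)} candidates for {mdl_name!r}")
```

With two duplicate entries the assertion fails, which is exactly the crash that surfaces through nim_langchain().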
## Root cause
The NIM API at integrate.api.nvidia.com/v1/models can return duplicate entries for some models. langchain-nvidia-ai-endpoints does not deduplicate this response and instead asserts uniqueness.
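The upstream remedy is to deduplicate the response before asserting uniqueness. An order-preserving dedup by model id is straightforward; this sketch uses SimpleNamespace stand-ins and is not the upstream patch itself:

```python
from types import SimpleNamespace

def dedup_models(models):
    """Order-preserving dedup by model id: keep the first occurrence of each id."""
    seen = set()
    unique = []
    for m in models:
        if m.id not in seen:
            seen.add(m.id)
            unique.append(m)
    return unique

models = [SimpleNamespace(id="a"), SimpleNamespace(id="a"), SimpleNamespace(id="b")]
assert [m.id for m in dedup_models(models)] == ["a", "b"]
```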
An upstream fix has already been prepared in langchain-ai/langchain-nvidia on branch bb/fix-chat-model-dedup, but NAT should not depend on an unreleased upstream change to avoid crashing.
## Fix
Pre-register unknown models in ChatNVIDIA's static MODEL_TABLE before constructing the client. This allows the static determine_model() lookup to succeed and avoids the /v1/models API call entirely, bypassing the problematic assertion.
```
# MODEL_TABLE and Model live in the package's private _statics module
from langchain_nvidia_ai_endpoints._statics import MODEL_TABLE, Model

if llm_config.model_name not in MODEL_TABLE:
    MODEL_TABLE[llm_config.model_name] = Model(
        id=llm_config.model_name,
        model_type="chat",
        client="ChatNVIDIA",
    )
```
Models already present in MODEL_TABLE are unaffected because the guard skips them.
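Both properties of the guard, leaving known models untouched and registering unknown ones, can be checked with a plain dict standing in for MODEL_TABLE (the table contents and the ensure_registered helper below are illustrative, not the plugin's actual code):

```python
# Dict stand-in for ChatNVIDIA's static MODEL_TABLE; real entries are Model objects.
MODEL_TABLE = {
    "meta/llama-3.3-70b-instruct": {"model_type": "chat", "client": "ChatNVIDIA"},
}

def ensure_registered(model_name: str) -> None:
    """Pre-register unknown models so the static lookup succeeds; no-op for known ones."""
    if model_name not in MODEL_TABLE:
        MODEL_TABLE[model_name] = {
            "id": model_name,
            "model_type": "chat",
            "client": "ChatNVIDIA",
        }

existing = MODEL_TABLE["meta/llama-3.3-70b-instruct"]
ensure_registered("meta/llama-3.3-70b-instruct")        # known model: untouched
ensure_registered("nvidia/nemotron-3-super-120b-a12b")  # unknown model: added

assert MODEL_TABLE["meta/llama-3.3-70b-instruct"] is existing
assert "nvidia/nemotron-3-super-120b-a12b" in MODEL_TABLE
```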
## Trade-offs
Pre-registered entries default to supports_tools=False, supports_structured_output=False, and supports_thinking=False.
As a result, bind_tools() on an unknown model may emit a warning such as “not known to support tools,” even though tool calling still works.
This is still strictly better than the current behavior, which is a hard crash.
The fix accesses MODEL_TABLE directly, which is not a public API of langchain-nvidia-ai-endpoints.
The public register_model() helper requires an endpoint parameter that is not available for standard hosted models.
Given that this is a temporary guard until the upstream deduplication fix is released, this trade-off is acceptable.
## Testing
Verified that ChatNVIDIA(model="nvidia/nemotron-3-super-120b-a12b") no longer crashes.
Verified that models already present in MODEL_TABLE, such as meta/llama-3.3-70b-instruct, are unaffected by the guard.
## Summary by CodeRabbit
* **Bug Fixes**
* Improved LLM startup so configured models are more reliably recognized during initialization, reducing cases where a configured model wasn't available.
* Keeps existing fallback behavior intact to avoid disruptions if advanced model registration isn't available.
Authors:
- Bryan Bednarski (https://github.com/bbednarski9)
Approvers:
- Will Killian (https://github.com/willkill07)
URL: #18431 · parent 24f948d · commit 998d535
File tree: 1 file changed (+22, −0): packages/nvidia_nat_langchain/src/nat/plugins/langchain