
Commit 998d535

fix: guard nim_langchain against ChatNVIDIA model lookup crash (#1843)
Background: https://forums.developer.nvidia.com/t/nim-api-duplicate-model-issue/365698

Remove after nemo-agent-toolkit uses a version with the upstreamed fix: langchain-ai/langchain-nvidia#282

## Problem

ChatNVIDIA can crash with an AssertionError when a model is not present in the static MODEL_TABLE and appears more than once in the NIM `/v1/models` API response. In `_NVIDIAClient.__init__()`, the code asserts that there is at most one matching candidate, but duplicate API entries can produce two:

```python
candidates = [model for model in self.available_models if model.id == self.mdl_name]
assert len(candidates) <= 1
```

This currently affects `nvidia/nemotron-3-super-120b-a12b` and could affect any future model that meets both conditions. The error propagates unhandled through `nim_langchain()`, surfacing as an opaque crash.

## Root cause

The NIM API at integrate.api.nvidia.com/v1/models can return duplicate entries for some models. langchain-nvidia-ai-endpoints does not deduplicate this response and instead asserts uniqueness. An upstream fix has already been prepared in langchain-ai/langchain-nvidia on branch `bb/fix-chat-model-dedup`, but NAT should not depend on an unreleased upstream change to avoid crashing.

## Fix

Pre-register unknown models in ChatNVIDIA's static MODEL_TABLE before constructing the client. This allows the static `determine_model()` lookup to succeed and avoids the `/v1/models` API call entirely, bypassing the problematic assertion.

```python
if llm_config.model_name not in MODEL_TABLE:
    MODEL_TABLE[llm_config.model_name] = Model(
        id=llm_config.model_name,
        model_type="chat",
        client="ChatNVIDIA",
    )
```

Models already present in MODEL_TABLE are unaffected because the guard skips them.

## Trade-offs

Pre-registered entries default to `supports_tools=False`, `supports_structured_output=False`, and `supports_thinking=False`. As a result, `bind_tools()` on an unknown model may emit a warning such as "not known to support tools," even though tool calling still works. This is still strictly better than the current behavior, which is a hard crash.

The fix accesses MODEL_TABLE directly, which is not a public API of langchain-nvidia-ai-endpoints. The public `register_model()` helper requires an endpoint parameter that is not available for standard hosted models. Given that this is a temporary guard until the upstream deduplication fix is released, this trade-off is acceptable.

## Testing

* Verified that `ChatNVIDIA(model="nvidia/nemotron-3-super-120b-a12b")` no longer crashes.
* Verified that models already present in MODEL_TABLE, such as `meta/llama-3.3-70b-instruct`, are unaffected by the guard.

## Summary by CodeRabbit

* **Bug Fixes**
  * Improved LLM startup so configured models are more reliably recognized during initialization, reducing cases where a configured model wasn't available.
  * Keeps existing fallback behavior intact to avoid disruptions if advanced model registration isn't available.

Authors:
  - Bryan Bednarski (https://github.com/bbednarski9)

Approvers:
  - Will Killian (https://github.com/willkill07)

URL: #1843
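For illustration, the duplicate-candidate failure mode can be reproduced without any API access. This is a minimal sketch, assuming a duplicated `/v1/models` response: `SimpleNamespace` stands in for the model records the API would return, and `available_models`, `mdl_name`, and `candidates` mirror the snippet quoted in the Problem section.

```python
from types import SimpleNamespace

# Stand-in for a duplicated /v1/models response (no NVIDIA API access needed).
available_models = [
    SimpleNamespace(id="nvidia/nemotron-3-super-120b-a12b"),
    SimpleNamespace(id="nvidia/nemotron-3-super-120b-a12b"),
]
mdl_name = "nvidia/nemotron-3-super-120b-a12b"

# Mirrors the candidate filter and assertion from _NVIDIAClient.__init__().
candidates = [m for m in available_models if m.id == mdl_name]
try:
    assert len(candidates) <= 1
except AssertionError:
    print(f"AssertionError path hit: {len(candidates)} candidates for {mdl_name}")
```

With a single entry in `available_models`, the assertion passes; the crash only occurs on the duplicated response.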
1 parent 24f948d commit 998d535

File tree

1 file changed: +22 additions, −0 deletions
  • packages/nvidia_nat_langchain/src/nat/plugins/langchain


packages/nvidia_nat_langchain/src/nat/plugins/langchain/llm.py

Lines changed: 22 additions & 0 deletions
```diff
@@ -169,9 +169,31 @@ async def azure_openai_langchain(llm_config: AzureOpenAIModelConfig, _builder: B
 async def nim_langchain(llm_config: NIMModelConfig, _builder: Builder):

     from langchain_nvidia_ai_endpoints import ChatNVIDIA
+    from langchain_nvidia_ai_endpoints import Model

     validate_no_responses_api(llm_config, LLMFrameworkEnum.LANGCHAIN)

+    # TODO: Remove after upgrading to a langchain-nvidia-ai-endpoints release
+    # that includes https://github.com/langchain-ai/langchain-nvidia/pull/282.
+    #
+    # Pre-register unknown models so ChatNVIDIA skips the /v1/models API
+    # call. This guards against upstream issues such as duplicate entries
+    # in the API response that cause ChatNVIDIA to crash with AssertionError.
+    # Uses internal MODEL_TABLE with fallback — if the private module
+    # changes between langchain-nvidia-ai-endpoints versions, we skip
+    # pre-registration and let ChatNVIDIA discover the model via /v1/models.
+    try:
+        from langchain_nvidia_ai_endpoints._statics import MODEL_TABLE
+
+        if llm_config.model_name not in MODEL_TABLE:
+            MODEL_TABLE[llm_config.model_name] = Model(
+                id=llm_config.model_name,
+                model_type="chat",
+                client="ChatNVIDIA",
+            )
+    except (ImportError, AttributeError):
+        pass
+
     # prefer max_completion_tokens over max_tokens
     # verify_ssl is a supported keyword parameter for the ChatNVIDIA client
     client = ChatNVIDIA(
```
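The guard's behavior can be exercised in isolation with local stand-ins. In this sketch, `Model` and `MODEL_TABLE` are re-created locally (the real objects live in `langchain_nvidia_ai_endpoints._statics`) so it runs without the package installed; the hypothetical `preregister()` helper wraps the guard logic from the diff.

```python
from dataclasses import dataclass

# Local stand-ins for langchain_nvidia_ai_endpoints' Model and MODEL_TABLE,
# re-created here so the sketch runs without the package installed.
@dataclass
class Model:
    id: str
    model_type: str
    client: str

MODEL_TABLE = {
    "meta/llama-3.3-70b-instruct": Model(
        id="meta/llama-3.3-70b-instruct", model_type="chat", client="ChatNVIDIA"
    ),
}

def preregister(model_name: str) -> bool:
    """Apply the guard; return True if the model was newly registered."""
    if model_name not in MODEL_TABLE:
        MODEL_TABLE[model_name] = Model(
            id=model_name, model_type="chat", client="ChatNVIDIA"
        )
        return True
    return False

assert preregister("nvidia/nemotron-3-super-120b-a12b")  # unknown model: added
assert not preregister("meta/llama-3.3-70b-instruct")    # known model: skipped
```

Because the guard only fills in missing entries, repeated calls are idempotent and existing MODEL_TABLE metadata is never overwritten.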
