Commit 998d535
fix: guard nim_langchain against ChatNVIDIA model lookup crash (#1843)
Background: https://forums.developer.nvidia.com/t/nim-api-duplicate-model-issue/365698
Remove this workaround once nemo-agent-toolkit depends on a release containing the upstreamed fix: langchain-ai/langchain-nvidia#282
## Problem
ChatNVIDIA can crash with an AssertionError when a model is not present in the static MODEL_TABLE and appears more than once in the NIM /v1/models API response. In _NVIDIAClient.__init__(), the code asserts that there is at most one matching candidate, but duplicate API entries can produce two:
```
candidates = [model for model in self.available_models if model.id == self.mdl_name]
assert len(candidates) <= 1
```
This currently affects nvidia/nemotron-3-super-120b-a12b and could affect any future model that meets both conditions. The error propagates unhandled through nim_langchain(), surfacing as an opaque crash.
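The failure mode can be reproduced offline by simulating a duplicated /v1/models response. The SimpleNamespace objects and the filter below are illustrative stand-ins for the library's internal model objects, not its actual types:

```python
from types import SimpleNamespace

# Stand-in for a /v1/models response that contains a duplicate entry.
available_models = [
    SimpleNamespace(id="nvidia/nemotron-3-super-120b-a12b"),
    SimpleNamespace(id="nvidia/nemotron-3-super-120b-a12b"),  # duplicate
    SimpleNamespace(id="meta/llama-3.3-70b-instruct"),
]

mdl_name = "nvidia/nemotron-3-super-120b-a12b"
candidates = [m for m in available_models if m.id == mdl_name]

try:
    # Same uniqueness assertion as _NVIDIAClient.__init__()
    assert len(candidates) <= 1
except AssertionError:
    print(f"AssertionError: {len(candidates)} candidates for {mdl_name!r}")
```

With two duplicate entries the assertion fails, which is exactly the crash that surfaces through nim_langchain().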
## Root cause
The NIM API at integrate.api.nvidia.com/v1/models can return duplicate entries for some models. langchain-nvidia-ai-endpoints does not deduplicate this response and instead asserts uniqueness.
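The upstream remedy is to deduplicate the response before asserting uniqueness. An order-preserving dedup by model id is straightforward; this sketch uses SimpleNamespace stand-ins and is not the upstream patch itself:

```python
from types import SimpleNamespace

def dedup_models(models):
    """Order-preserving dedup by model id: keep the first occurrence of each id."""
    seen = set()
    unique = []
    for m in models:
        if m.id not in seen:
            seen.add(m.id)
            unique.append(m)
    return unique

models = [SimpleNamespace(id="a"), SimpleNamespace(id="a"), SimpleNamespace(id="b")]
assert [m.id for m in dedup_models(models)] == ["a", "b"]
```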
An upstream fix has already been prepared in langchain-ai/langchain-nvidia on branch bb/fix-chat-model-dedup, but NAT should not depend on an unreleased upstream change to avoid crashing.
## Fix
Pre-register unknown models in ChatNVIDIA's static MODEL_TABLE before constructing the client. This allows the static determine_model() lookup to succeed and avoids the /v1/models API call entirely, bypassing the problematic assertion.
```
# MODEL_TABLE and Model live in the package's private _statics module
from langchain_nvidia_ai_endpoints._statics import MODEL_TABLE, Model

if llm_config.model_name not in MODEL_TABLE:
    MODEL_TABLE[llm_config.model_name] = Model(
        id=llm_config.model_name,
        model_type="chat",
        client="ChatNVIDIA",
    )
```
Models already present in MODEL_TABLE are unaffected because the guard skips them.
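Both properties of the guard, leaving known models untouched and registering unknown ones, can be checked with a plain dict standing in for MODEL_TABLE (the table contents and the ensure_registered helper below are illustrative, not the plugin's actual code):

```python
# Dict stand-in for ChatNVIDIA's static MODEL_TABLE; real entries are Model objects.
MODEL_TABLE = {
    "meta/llama-3.3-70b-instruct": {"model_type": "chat", "client": "ChatNVIDIA"},
}

def ensure_registered(model_name: str) -> None:
    """Pre-register unknown models so the static lookup succeeds; no-op for known ones."""
    if model_name not in MODEL_TABLE:
        MODEL_TABLE[model_name] = {
            "id": model_name,
            "model_type": "chat",
            "client": "ChatNVIDIA",
        }

existing = MODEL_TABLE["meta/llama-3.3-70b-instruct"]
ensure_registered("meta/llama-3.3-70b-instruct")        # known model: untouched
ensure_registered("nvidia/nemotron-3-super-120b-a12b")  # unknown model: added

assert MODEL_TABLE["meta/llama-3.3-70b-instruct"] is existing
assert "nvidia/nemotron-3-super-120b-a12b" in MODEL_TABLE
```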
## Trade-offs
Pre-registered entries default to supports_tools=False, supports_structured_output=False, and supports_thinking=False.
As a result, bind_tools() on an unknown model may emit a warning such as “not known to support tools,” even though tool calling still works.
This is still strictly better than the current behavior, which is a hard crash.
The fix accesses MODEL_TABLE directly, which is not a public API of langchain-nvidia-ai-endpoints.
The public register_model() helper requires an endpoint parameter that is not available for standard hosted models.
Given that this is a temporary guard until the upstream deduplication fix is released, this trade-off is acceptable.
## Testing
Verified that ChatNVIDIA(model="nvidia/nemotron-3-super-120b-a12b") no longer crashes.
Verified that models already present in MODEL_TABLE, such as meta/llama-3.3-70b-instruct, are unaffected by the guard.
## Summary by CodeRabbit
* **Bug Fixes**
* Improved LLM startup so configured models are more reliably recognized during initialization, reducing cases where a configured model wasn't available.
* Keeps existing fallback behavior intact to avoid disruptions if advanced model registration isn't available.
Authors:
- Bryan Bednarski (https://github.com/bbednarski9)
Approvers:
- Will Killian (https://github.com/willkill07)
URL: #18431 · parent 24f948d · commit 998d535
File tree: 1 file changed (+22, −0): packages/nvidia_nat_langchain/src/nat/plugins/langchain