Commit 3daa5eb

Add OCI LangChain support for hosted Nemotron workflows
Signed-off-by: Federico Kamelhar <federico.kamelhar@oracle.com>
1 parent f51c41c commit 3daa5eb

File tree

19 files changed, +1521 −818 lines


docs/source/build-workflows/llms/index.md

Lines changed: 29 additions & 0 deletions

@@ -28,6 +28,7 @@ NVIDIA NeMo Agent Toolkit supports the following LLM providers:
 | [OpenAI](https://openai.com) | `openai` | OpenAI API |
 | [AWS Bedrock](https://aws.amazon.com/bedrock/) | `aws_bedrock` | AWS Bedrock API |
 | [Azure OpenAI](https://learn.microsoft.com/en-us/azure/ai-foundry/openai/quickstart) | `azure_openai` | Azure OpenAI API |
+| [OCI Hosted OpenAI-Compatible](https://docs.oracle.com/en-us/iaas/Content/generative-ai/home.htm) | `oci` | OCI-hosted OpenAI-compatible API, including OCI Generative AI or OKE-hosted gateways |
 | [LiteLLM](https://github.com/BerriAI/litellm) | `litellm` | LiteLLM API |
 | [Hugging Face](https://huggingface.co) | `huggingface` | Hugging Face API |
 | [Hugging Face Inference](https://huggingface.co/docs/api-inference) | `huggingface_inference` | Hugging Face Inference API, Endpoints, and TGI |
@@ -52,6 +53,10 @@ llms:
   azure_openai_llm:
     _type: azure_openai
     azure_deployment: gpt-4o-mini
+  oci_llm:
+    _type: oci
+    model_name: nvidia/Llama-3.1-Nemotron-Nano-8B-v1
+    endpoint: https://inference.generativeai.us-chicago-1.oci.oraclecloud.com/20231130/actions/chat/completions/openai/v1
   litellm_llm:
     _type: litellm
     model_name: gpt-4o
@@ -118,6 +123,30 @@ The AWS Bedrock LLM provider is defined by the {py:class}`~nat.llm.aws_bedrock_l
 * `credentials_profile_name` - The credentials profile name to use for the model
 * `max_retries` - The maximum number of retries for the request

+### OCI Hosted OpenAI-Compatible
+
+You can use the following environment variables to configure the OCI Generative AI LLM provider:
+
+* `OCI_GENAI_API_KEY` - The API key or bearer token to access the OCI-hosted endpoint
+* `OCI_GENAI_BASE_URL` - The OCI OpenAI-compatible endpoint base URL
+* `OCI_GENAI_ENDPOINT` - Alternate OCI Generative AI endpoint variable
+
+The OCI OpenAI-compatible LLM provider is defined by the {py:class}`~nat.llm.oci_llm.OCIModelConfig` class.
+
+* `model_name` - The name of the model to use
+* `endpoint` - The OCI OpenAI-compatible endpoint base URL
+* `temperature` - The temperature to use for the model
+* `top_p` - The top-p value to use for the model
+* `max_tokens` - The maximum number of tokens to generate
+* `seed` - The seed to use for the model
+* `api_key` - The API key to use for the model
+* `max_retries` - The maximum number of retries for the request
+* `request_timeout` - HTTP request timeout in seconds
+
+:::{note}
+This provider targets OCI-hosted OpenAI-compatible chat-completions endpoints and does not enable the Responses API.
+:::
 ### Azure OpenAI

 You can use the following environment variables to configure the Azure OpenAI LLM provider:

docs/source/components/integrations/index.md

Lines changed: 2 additions & 1 deletion

@@ -23,4 +23,5 @@ limitations under the License.
 ./frameworks.md
 ./a2a.md
 AWS Bedrock <./integrating-aws-bedrock-models.md>
-```
+OCI Generative AI <./integrating-oci-generative-ai-models.md>
+```
docs/source/components/integrations/integrating-oci-generative-ai-models.md (new file)

Lines changed: 98 additions & 0 deletions

@@ -0,0 +1,98 @@
<!--
SPDX-FileCopyrightText: Copyright (c) 2025-2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
SPDX-License-Identifier: Apache-2.0

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->

# NVIDIA NeMo Agent Toolkit OCI Integration

The NeMo Agent Toolkit supports integration with multiple [LLM](../../build-workflows/llms/index.md) providers, including OCI Generative AI. The `oci` provider uses OCI SDK authentication and is designed for OCI Generative AI model and endpoint access. For workflow parity with the AWS Bedrock path, the toolkit also includes a LangChain wrapper built on `langchain-oci`.

To view the full list of supported LLM providers, run `nat info components -t llm_provider`.

## Configuration

### Prerequisites
Before integrating OCI, ensure you have:

- access to OCI Generative AI in the target region
- a valid OCI auth method such as `API_KEY`, `SECURITY_TOKEN`, `INSTANCE_PRINCIPAL`, or `RESOURCE_PRINCIPAL`
- the target compartment OCID
- the Generative AI service endpoint for the region or a custom endpoint URL

Common deployment patterns include:

- OCI Generative AI regional endpoints
- custom OCI Generative AI endpoints
- OCI-hosted inference for NVIDIA Nemotron used as a live integration target

### Example Configuration
Add the OCI LLM configuration to your workflow config file:

```yaml
llms:
  oci_llm:
    _type: oci
    model_name: nvidia/Llama-3.1-Nemotron-Nano-8B-v1
    endpoint: https://inference.generativeai.us-chicago-1.oci.oraclecloud.com
    compartment_id: ocid1.compartment.oc1..example
    auth_type: API_KEY
    auth_profile: API_KEY_AUTH
    temperature: 0.0
    max_tokens: 1024
    top_p: 1.0
    request_timeout: 60
```
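The `auth_type` values listed in the prerequisites split into file-backed and principal-based methods. A hypothetical pre-flight helper (not part of the toolkit) illustrating which types need `auth_file_location` and `auth_profile`:

```python
# Hypothetical helper, not part of the toolkit: file-backed auth types read
# the OCI config file and profile, while principal-based types authenticate
# via the runtime environment (instance or resource principal).
FILE_BACKED = {"API_KEY", "SECURITY_TOKEN"}
PRINCIPAL_BASED = {"INSTANCE_PRINCIPAL", "RESOURCE_PRINCIPAL"}


def needs_config_file(auth_type: str) -> bool:
    if auth_type not in FILE_BACKED | PRINCIPAL_BASED:
        raise ValueError(f"unsupported auth_type: {auth_type}")
    return auth_type in FILE_BACKED


print(needs_config_file("API_KEY"))             # True: reads ~/.oci/config
print(needs_config_file("INSTANCE_PRINCIPAL"))  # False: no config file needed
```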
### Configurable Options

* `model_name`: The name of the OCI-hosted model to use (required)
* `endpoint`: The OCI Generative AI service endpoint or custom endpoint URL
* `compartment_id`: OCI compartment OCID
* `auth_type`: OCI SDK auth type
* `auth_profile`: OCI profile name for file-backed auth
* `auth_file_location`: Path to the OCI config file
* `provider`: Optional OCI provider override such as `meta`, `google`, `cohere`, or `openai`
* `temperature`: Controls randomness in the output (0.0 to 1.0)
* `max_tokens`: Maximum number of tokens to generate
* `top_p`: Top-p sampling parameter (0.0 to 1.0)
* `seed`: Optional random seed
* `max_retries`: Maximum number of retries for the request
* `request_timeout`: HTTP request timeout in seconds
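A hedged sketch of the kind of retry loop that a `max_retries` setting drives; the toolkit's actual retry behavior comes from `RetryMixin` and may use different backoff, jitter, and exception types:

```python
import time


# Generic retry-loop sketch; illustrative only, not the toolkit's RetryMixin.
def call_with_retries(send, max_retries: int = 10, base_delay: float = 0.0):
    last_err = None
    for attempt in range(max_retries + 1):
        try:
            return send()
        except ConnectionError as err:  # stand-in for transient HTTP errors
            last_err = err
            time.sleep(base_delay * (2 ** attempt))  # exponential backoff
    raise last_err


calls = {"n": 0}


def flaky():
    # Fails twice, then succeeds, to exercise the retry path.
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient")
    return "ok"


print(call_with_retries(flaky))  # succeeds on the third attempt
```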
### Limitations
* This provider targets OCI Generative AI through the OCI SDK-backed `langchain-oci` path.
* The Responses API is not enabled for this provider in the current release.

## Nemotron on OCI

A common OCI deployment pattern is NVIDIA Nemotron hosted on OCI and exposed through an OpenAI-compatible route. In that setup, the toolkit can validate live integration behavior against the OCI-hosted Nemotron endpoint while the official provider and LangChain wrapper cover the OCI Generative AI path.

## Usage
Reference the OCI LLM in your configuration:

```yaml
llms:
  oci_llm:
    _type: oci
    model_name: nvidia/Llama-3.1-Nemotron-Nano-8B-v1
    endpoint: https://inference.generativeai.us-chicago-1.oci.oraclecloud.com
    compartment_id: ocid1.compartment.oc1..example
    auth_profile: API_KEY_AUTH
```

## Troubleshooting
* `401 Unauthorized`: verify the OCI profile, signer, and IAM permissions for Generative AI.
* `404 Not Found`: confirm the regional endpoint or custom endpoint URL is correct.
* Connection errors: verify OCI networking and regional endpoint reachability.
* Tool calling issues: verify the served model supports tool calling and that the serving stack is configured for it.
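The status-code entries in the troubleshooting list above can be condensed into a lookup table; a hypothetical helper, not part of the toolkit:

```python
# Hypothetical mapping from HTTP status code to the remediation hints above.
HINTS = {
    401: "verify the OCI profile, signer, and IAM permissions for Generative AI",
    404: "confirm the regional endpoint or custom endpoint URL is correct",
}


def hint_for(status: int) -> str:
    # Fall back to the generic connectivity advice for unlisted codes.
    return HINTS.get(status, "verify OCI networking and regional endpoint reachability")


print(hint_for(404))
```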

docs/source/conf.py

Lines changed: 2 additions & 0 deletions

@@ -379,6 +379,8 @@ def _build_api_tree() -> Path:
     '/extend/custom-components/gated-fields.html',
     'extend/integrating-aws-bedrock-models':
     '/components/integrations/integrating-aws-bedrock-models.html',
+    'extend/integrating-oci-generative-ai-models':
+    '/components/integrations/integrating-oci-generative-ai-models.html',
     'extend/memory':
     '/extend/custom-components/memory.html',
     'extend/object-store':

docs/source/get-started/installation.md

Lines changed: 1 addition & 0 deletions

@@ -27,6 +27,7 @@ The following [LLM](../build-workflows/llms/index.md) API providers are supporte
 - OpenAI
 - AWS Bedrock
 - Azure OpenAI
+- OCI Generative AI

 ## Packages

examples/frameworks/agno_personal_finance/pyproject.toml

Lines changed: 1 addition & 1 deletion

@@ -34,7 +34,7 @@ classifiers = ["Programming Language :: Python"]
 [tool.setuptools_dynamic_dependencies]
 dependencies = [
     "nvidia-nat[agno,test] == {version}",
-    "openai~=1.106",
+    "openai>=1.106,<3.0.0",
 ]

 [tool.uv.sources]

examples/frameworks/multi_frameworks/pyproject.toml

Lines changed: 1 addition & 1 deletion

@@ -38,7 +38,7 @@ dependencies = [
     "beautifulsoup4~=4.13",
     "markdown-it-py~=3.0",
     "nvidia-haystack~=0.3.0",
-    "openai~=1.106",
+    "openai>=1.106,<3.0.0",
 ]

 [tool.uv.sources]

packages/nvidia_nat_agno/pyproject.toml

Lines changed: 1 addition & 1 deletion

@@ -57,7 +57,7 @@ dependencies = [
     "agno>=1.2.3,<2.0.0",
     "google-search-results>=2.4.2,<3.0.0",
     "litellm~=1.74",
-    "openai~=1.106",
+    "openai>=1.106,<3.0.0",
 ]

 [tool.setuptools_dynamic_dependencies.optional-dependencies]
packages/nvidia_nat_core/src/nat/llm/oci_llm.py (new file)

Lines changed: 78 additions & 0 deletions

@@ -0,0 +1,78 @@
# SPDX-FileCopyrightText: Copyright (c) 2024-2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from pydantic import AliasChoices
from pydantic import ConfigDict
from pydantic import Field

from nat.builder.builder import Builder
from nat.builder.llm import LLMProviderInfo
from nat.cli.register_workflow import register_llm_provider
from nat.data_models.llm import LLMBaseConfig
from nat.data_models.optimizable import OptimizableField
from nat.data_models.optimizable import OptimizableMixin
from nat.data_models.optimizable import SearchSpace
from nat.data_models.retry_mixin import RetryMixin
from nat.data_models.ssl_verification_mixin import SSLVerificationMixin
from nat.data_models.thinking_mixin import ThinkingMixin


class OCIModelConfig(LLMBaseConfig, RetryMixin, OptimizableMixin, ThinkingMixin, SSLVerificationMixin, name="oci"):
    """OCI Generative AI LLM provider."""

    model_config = ConfigDict(protected_namespaces=(), extra="allow")

    endpoint: str | None = Field(
        default=None,
        validation_alias=AliasChoices("endpoint", "service_endpoint", "base_url"),
        description="OCI Generative AI service endpoint URL.",
    )
    compartment_id: str | None = Field(default=None, description="OCI compartment OCID for Generative AI requests.")
    auth_type: str = Field(default="API_KEY",
                           description="OCI SDK authentication type: API_KEY, SECURITY_TOKEN, INSTANCE_PRINCIPAL, "
                           "or RESOURCE_PRINCIPAL.")
    auth_profile: str = Field(default="DEFAULT",
                              description="OCI config profile to use for API_KEY or SECURITY_TOKEN auth.")
    auth_file_location: str = Field(default="~/.oci/config",
                                    description="Path to the OCI config file used for SDK authentication.")
    model_name: str = OptimizableField(validation_alias=AliasChoices("model_name", "model"),
                                       serialization_alias="model",
                                       description="The OCI Generative AI model ID.")
    provider: str | None = Field(default=None,
                                 description="Optional OCI provider override such as cohere, google, meta, or openai.")
    context_size: int | None = Field(
        default=1024,
        gt=0,
        description="The maximum number of tokens available for input.",
    )
    seed: int | None = Field(default=None, description="Random seed to set for generation.")
    max_retries: int = Field(default=10, description="The max number of retries for the request.")
    max_tokens: int | None = Field(default=None, gt=0, description="Maximum number of output tokens.")
    temperature: float | None = OptimizableField(
        default=None,
        ge=0.0,
        description="Sampling temperature to control randomness in the output.",
        space=SearchSpace(high=0.9, low=0.1, step=0.2))
    top_p: float | None = OptimizableField(default=None,
                                           ge=0.0,
                                           le=1.0,
                                           description="Top-p for distribution sampling.",
                                           space=SearchSpace(high=1.0, low=0.5, step=0.1))
    request_timeout: float | None = Field(default=None, gt=0.0, description="HTTP request timeout in seconds.")


@register_llm_provider(config_type=OCIModelConfig)
async def oci_llm(config: OCIModelConfig, _builder: Builder):
    yield LLMProviderInfo(config=config, description="An OCI Generative AI model for use with an LLM client.")
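The `AliasChoices` on `endpoint` and `model_name` mean a workflow config may spell those keys several ways (`model`, `service_endpoint`, `base_url`, and so on). A stdlib-only sketch of the first-alias-wins normalization, a simplification of what pydantic performs, not the toolkit's own code:

```python
# Simplified stand-in for pydantic's AliasChoices resolution on
# OCIModelConfig: the first alias present in the raw config wins.
ALIAS_CHOICES = {
    "endpoint": ("endpoint", "service_endpoint", "base_url"),
    "model_name": ("model_name", "model"),
}


def normalize(raw: dict) -> dict:
    out = dict(raw)
    for field, choices in ALIAS_CHOICES.items():
        for alias in choices:
            if alias in out:
                if alias != field:
                    out[field] = out.pop(alias)
                break  # first alias present wins, as with AliasChoices
    return out


print(normalize({"model": "nvidia/Llama-3.1-Nemotron-Nano-8B-v1", "base_url": "https://example"}))
```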

packages/nvidia_nat_core/src/nat/llm/register.py

Lines changed: 1 addition & 0 deletions

@@ -27,4 +27,5 @@
 from . import huggingface_llm
 from . import litellm_llm
 from . import nim_llm
+from . import oci_llm
 from . import openai_llm
