Skip to content

Commit 343161c

Browse files
authored
Merge pull request #2893 from oracle-devrel/tutorial/cline-oci-generative-ai-openai-api-update
tutorial updated with DAC use for coding-assistant
2 parents 8be3ff4 + a2f466a commit 343161c

1 file changed

Lines changed: 102 additions & 24 deletions

File tree

  • ai/generative-ai-service/coding-assistant

ai/generative-ai-service/coding-assistant/README.md

Lines changed: 102 additions & 24 deletions
Original file line numberDiff line numberDiff line change
@@ -2,17 +2,22 @@
22

33
A practical guide for configuring Cline to call OCI Generative AI models through OCI's OpenAI-compatible API.
44

5+
Cline is an AI coding assistant that works inside your IDE and can help with day-to-day development tasks such as explaining unfamiliar code, generating new files, refactoring existing logic, writing tests, debugging errors, and summarizing repository structure. By connecting Cline to OCI Generative AI, developers can use OCI-hosted models directly from their coding environment while keeping model access, deployment choices, and enterprise controls within Oracle Cloud Infrastructure.
6+
7+
This setup is useful when teams want AI-assisted development workflows that can use either on-demand OCI Generative AI models for quick setup or Dedicated AI Cluster (DAC)-hosted models for production-grade isolation, performance, and customization.
8+
59
## Overview
610

711
This tutorial walks through configuring Cline to use OCI Generative AI models through the OCI OpenAI-compatible API.
812

913
The process involves:
1014

1115
1. Selecting an OCI Generative AI model
12-
2. Understanding the OCI OpenAI-compatible API endpoint URL
13-
3. Creating an OCI Generative AI API key
14-
4. Configuring Cline with the OCI OpenAI-compatible base URL, API key, and model ID
15-
5. Testing the setup with an example prompt
16+
2. Choosing whether to use the model on demand or from a Dedicated AI Cluster (DAC)
17+
3. Understanding the OCI OpenAI-compatible API endpoint URL
18+
4. Creating an OCI Generative AI API key
19+
5. Configuring Cline with the OCI OpenAI-compatible base URL, API key, and model ID
20+
6. Testing the setup with an example prompt
1621

1722
OCI provides OpenAI-compatible APIs for model inference, including Chat Completions and Responses. This tutorial uses the OCI Generative AI OpenAI-compatible API documented here:
1823

@@ -42,21 +47,30 @@ Example regions include:
4247
- `uk-london-1`
4348
- `eu-frankfurt-1`
4449

45-
The region is used in the OpenAI-compatible base URL.
50+
The region is used in the OpenAI-compatible URL.
4651

47-
Example:
52+
On-demand example:
4853

4954
```text
5055
https://inference.generativeai.us-chicago-1.oci.oraclecloud.com/openai/v1
5156
```
5257

53-
**Important:** Create the OCI Generative AI API key in the same region where you plan to use the model.
58+
DAC Chat Completions example:
59+
60+
```text
61+
https://inference.generativeai.us-chicago-1.oci.oraclecloud.com/openai/v1/chat/completions
62+
```
63+
5464

5565
### 2. Choose a Model
5666

57-
Select the OCI Generative AI model that you want Cline to use.
67+
Select the OCI Generative AI model that you want Cline to use. OCI Generative AI models can be used either on demand or from a Dedicated AI Cluster (DAC). The Cline setup is different for each option, so choose the deployment path first.
68+
69+
#### On-Demand Models
70+
71+
On-demand models are shared, OCI-hosted models that are ready to call directly through the OpenAI-compatible API. This is the simplest setup for testing, prototyping, and lighter usage patterns.
5872

59-
Example model IDs:
73+
Example OCI Model Names:
6074

6175
```text
6276
xai.grok-code-fast-1
@@ -66,20 +80,54 @@ openai.gpt-oss-120b
6680
google.gemini-2.5-flash
6781
```
6882

83+
For on-demand models, the Cline **Model ID** is the OCI Model Name.
84+
85+
#### Dedicated AI Cluster (DAC)-Hosted Models
86+
87+
DAC-hosted models run on dedicated infrastructure in your tenancy. Use a DAC-hosted model when you need production-grade control over model hosting and inference. DACs provide several advantages:
88+
89+
- **Flexibility:** Import supported Hugging Face-format models from Hugging Face or Object Storage, test imported models with shorter commitments, choose fine-tuned or quantized versions, and right-size based on visible hardware specifications.
90+
- **Isolation:** Run workloads on dedicated GPU resources inside your tenancy, which helps protect sensitive data, avoids shared-resource contention, and supports regulated workloads.
91+
- **Predictable latency:** Dedicated infrastructure can provide more stable time-to-first-token and inference response times than shared model endpoints, especially for scaling production applications.
92+
- **Fine-tuning support:** Host fine-tuned models alongside base models, run multiple fine-tuned models on a single cluster, and control model lifecycle and upgrade cadence.
93+
- **Cost efficiency at scale:** For inference-heavy workloads, DACs can reduce effective price per token by keeping dedicated resources highly utilized and hosting multiple models on one cluster.
94+
- **Deployment near data:** Deploy in supported OCI regions, including regulated regions where available, to support data residency, lower latency, and simpler security reviews.
95+
- **Simplified management:** OCI manages the infrastructure while you manage model deployment, scaling, fine-tuning, and application integration.
96+
97+
Before configuring Cline for a DAC-hosted model, make sure the model endpoint is already created and active.
98+
99+
1. Open the OCI Console
100+
2. Navigate to **Analytics & AI -> Generative AI**
101+
3. Go to **Endpoints**
102+
4. Confirm that the endpoint for your DAC-hosted model is **Active**
103+
5. Keep note of the endpoint region and DAC endpoint OCID
104+
105+
For DAC-hosted models, the Cline **Model ID** is the DAC endpoint OCID.
106+
107+
```text
108+
ocid1.generativeaiendpoint.<region>..<unique_id>
109+
```
110+
69111
### 3. Understand the OCI OpenAI-Compatible API Endpoint URL
70112

71-
All calls from Cline go through the OCI OpenAI-compatible API endpoint URL:
113+
Cline uses a different OCI OpenAI-compatible endpoint format depending on whether the model is on demand or DAC-hosted.
114+
115+
For on-demand models, configure the base URL **without** `/chat/completions`:
72116

73117
```text
74118
https://inference.generativeai.<region>.oci.oraclecloud.com/openai/v1
75119
```
76120

77-
This base URL is the endpoint you configure in Cline.
121+
For DAC-hosted models, configure the full Chat Completions URL **with** `/chat/completions`:
122+
123+
```text
124+
https://inference.generativeai.<region>.oci.oraclecloud.com/openai/v1/chat/completions
125+
```
78126

79127
Keep note of the following values:
80128

81129
- Region
82-
- Model ID
130+
- OCI Model Name or DAC endpoint OCID
83131
- Compartment
84132

85133
### 4. Create an OCI Generative AI API Key
@@ -94,18 +142,18 @@ Keep note of the following values:
94142
cline-genai-key
95143
```
96144

97-
6. Optionally add a description
145+
6. Add a description if needed
98146
7. Configure key names and expiration dates
99147
8. Click **Create**
100-
9. Copy one of the generated key values immediately
148+
9. Copy one of the generated key values
101149

102150
OCI Generative AI API keys are service-specific credentials and are different from OCI IAM API keys.
103151

104-
**Important:** Store the key securely. Do not commit it to GitHub or place it in source code.
152+
⚠️ Store the key securely. Do not commit it to GitHub or place it in source code.
105153

106-
### 5. Build the OCI OpenAI-Compatible Base URL
154+
### 5. Build the OCI OpenAI-Compatible URL
107155

108-
Use the following base URL format for Cline:
156+
For on-demand models, use the following base URL format for Cline:
109157

110158
```text
111159
https://inference.generativeai.<region>.oci.oraclecloud.com/openai/v1
@@ -117,6 +165,14 @@ Example for US Midwest Chicago:
117165
https://inference.generativeai.us-chicago-1.oci.oraclecloud.com/openai/v1
118166
```
119167

168+
For DAC-hosted models, use the full Chat Completions URL shown in the DAC section.
169+
170+
Example for UK South London:
171+
172+
```text
173+
https://inference.generativeai.uk-london-1.oci.oraclecloud.com/openai/v1/chat/completions
174+
```
175+
120176

121177
### 6. Configure Cline
122178

@@ -130,29 +186,43 @@ Open VS Code or PyCharm and configure Cline:
130186
OpenAI Compatible
131187
```
132188

133-
4. Set **Base URL** to your OCI base URL:
189+
4. Set **Base URL** to your OCI URL.
190+
191+
For on-demand models:
134192

135193
```text
136194
https://inference.generativeai.us-chicago-1.oci.oraclecloud.com/openai/v1
137195
```
138196

197+
For DAC-hosted models:
198+
199+
```text
200+
https://inference.generativeai.uk-london-1.oci.oraclecloud.com/openai/v1/chat/completions
201+
```
202+
139203
5. Paste your OCI Generative AI API key into the **API Key** field
140-
6. Set **Model ID** to your OCI model ID
204+
6. Set **Model ID** to your OCI Model Name or DAC endpoint OCID
141205

142206
Example:
143207

144208
```text
145209
xai.grok-code-fast-1
146210
```
147211

212+
For a DAC-hosted model, use the DAC endpoint OCID instead:
213+
214+
```text
215+
ocid1.generativeaiendpoint.<region>..<unique_id>
216+
```
217+
148218
7. Save the configuration
149219

150220
### 7. Test the Connection in Cline
151221

152222
Use a simple prompt first:
153223

154224
```text
155-
Write a one-sentence commit message for a change that adds OCI Generative AI support to a Cline tutorial.
225+
Hello. Reply with one sentence confirming that the connection works.
156226
```
157227

158228
If the setup is correct, Cline should return a normal response from the OCI-hosted model.
@@ -186,9 +256,9 @@ Check that:
186256

187257
Check that:
188258

189-
- The model ID is correct
190259
- The model is available in the selected region
191260
- The model supports OpenAI-compatible chat completion requests
261+
- If using a DAC-hosted model, the OCI Generative AI endpoint is active and the Model ID field uses the DAC endpoint OCID
192262

193263
### Authorization Error
194264

@@ -204,19 +274,25 @@ Check that:
204274
Check that:
205275

206276
- The base URL uses the correct region
207-
- The URL ends with:
277+
- For on-demand models, the URL ends with:
208278

209279
```text
210280
/openai/v1
211281
```
212282

283+
- For DAC-hosted models, the URL ends with:
284+
285+
```text
286+
/openai/v1/chat/completions
287+
```
288+
213289
- Your network can reach OCI public endpoints
290+
- Private endpoints are reachable from your machine if using private networking
214291

215292
## Recommended Tests
216293

217294
Run the following tests in Cline:
218295

219-
- Simple math prompt
220296
- Short coding task
221297
- Code explanation prompt
222298
- Refactoring prompt
@@ -229,10 +305,12 @@ If this was only a test setup:
229305

230306
1. Remove the API key from Cline
231307
2. Revoke or deactivate the OCI Generative AI API key
308+
3. If using a DAC-hosted model only for testing, delete the endpoint before deleting the Dedicated AI Cluster
232309

233310
## OCI Services Used
234311

235-
- **OCI Generative AI** - Model inference
312+
- **OCI Generative AI** - Model inference and endpoints
313+
- **OCI Generative AI Dedicated AI Clusters** - Dedicated hosting for deployed models
236314
- **OCI IAM** - Policies and authorization
237315
- **OCI Generative AI API Keys** - API key authentication
238316
- **Cline** - AI coding assistant in VS Code or PyCharm

0 commit comments

Comments
 (0)