
Commit c1db1ae

Merge branch 'main' into NicolaCimitile-patch-3
2 parents 4c55e0d + 09eee4c commit c1db1ae

70 files changed

Lines changed: 3516 additions & 36 deletions

Note: this is a large commit, and some content is hidden by default, so parts of the diff below are omitted.

ai/generative-ai-service/coding-assistant/README.md

Lines changed: 102 additions & 24 deletions
A practical guide for configuring Cline to call OCI Generative AI models through OCI's OpenAI-compatible API.

Cline is an AI coding assistant that works inside your IDE and can help with day-to-day development tasks such as explaining unfamiliar code, generating new files, refactoring existing logic, writing tests, debugging errors, and summarizing repository structure. By connecting Cline to OCI Generative AI, developers can use OCI-hosted models directly from their coding environment while keeping model access, deployment choices, and enterprise controls within Oracle Cloud Infrastructure.

This setup is useful when teams want AI-assisted development workflows that can use either on-demand OCI Generative AI models for quick setup or Dedicated AI Cluster (DAC)-hosted models for production-grade isolation, performance, and customization.

## Overview

This tutorial walks through configuring Cline to use OCI Generative AI models through the OCI OpenAI-compatible API.

The process involves:

1. Selecting an OCI Generative AI model
2. Choosing whether to use the model on demand or from a Dedicated AI Cluster (DAC)
3. Understanding the OCI OpenAI-compatible API endpoint URL
4. Creating an OCI Generative AI API key
5. Configuring Cline with the OCI OpenAI-compatible base URL, API key, and model ID
6. Testing the setup with an example prompt

OCI provides OpenAI-compatible APIs for model inference, including Chat Completions and Responses. This tutorial uses the OCI Generative AI OpenAI-compatible API documented here:

Example regions include:

- `uk-london-1`
- `eu-frankfurt-1`

The region is used in the OpenAI-compatible URL.

On-demand example:

```text
https://inference.generativeai.us-chicago-1.oci.oraclecloud.com/openai/v1
```

DAC Chat Completions example:

```text
https://inference.generativeai.us-chicago-1.oci.oraclecloud.com/openai/v1/chat/completions
```

### 2. Choose a Model

Select the OCI Generative AI model that you want Cline to use. OCI Generative AI models can be used either on demand or from a Dedicated AI Cluster (DAC). The Cline setup differs for each option, so choose the deployment path first.

#### On-Demand Models

On-demand models are shared, OCI-hosted models that are ready to call directly through the OpenAI-compatible API. This is the simplest setup for testing, prototyping, and lighter usage patterns.

Example OCI Model Names:

```text
xai.grok-code-fast-1
openai.gpt-oss-120b
google.gemini-2.5-flash
```

For on-demand models, the Cline **Model ID** is the OCI Model Name.

#### Dedicated AI Cluster (DAC)-Hosted Models

DAC-hosted models run on dedicated infrastructure in your tenancy. Use a DAC-hosted model when you need production-grade control over model hosting and inference. DACs provide several advantages:

- **Flexibility:** Import supported Hugging Face-format models from Hugging Face or Object Storage, test imported models with shorter commitments, choose fine-tuned or quantized versions, and right-size based on visible hardware specifications.
- **Isolation:** Run workloads on dedicated GPU resources inside your tenancy, which helps protect sensitive data, avoids shared-resource contention, and supports regulated workloads.
- **Predictable latency:** Dedicated infrastructure can provide more stable time-to-first-token and inference response times than shared model endpoints, especially for scaling production applications.
- **Fine-tuning support:** Host fine-tuned models alongside base models, run multiple fine-tuned models on a single cluster, and control model lifecycle and upgrade cadence.
- **Cost efficiency at scale:** For inference-heavy workloads, DACs can reduce the effective price per token by keeping dedicated resources highly utilized and hosting multiple models on one cluster.
- **Deployment near data:** Deploy in supported OCI regions, including regulated regions where available, to support data residency, lower latency, and simpler security reviews.
- **Simplified management:** OCI manages the infrastructure while you manage model deployment, scaling, fine-tuning, and application integration.

Before configuring Cline for a DAC-hosted model, make sure the model endpoint is already created and active:

1. Open the OCI Console
2. Navigate to **Analytics & AI -> Generative AI**
3. Go to **Endpoints**
4. Confirm that the endpoint for your DAC-hosted model is **Active**
5. Keep note of the endpoint region and DAC endpoint OCID

For DAC-hosted models, the Cline **Model ID** is the DAC endpoint OCID:

```text
ocid1.generativeaiendpoint.<region>..<unique_id>
```
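Because the two deployment paths use differently shaped Model ID values, a small check can catch a mixed-up configuration early. This is an illustrative sketch, not part of Cline or OCI; the helper name and the string test are assumptions based only on the two Model ID shapes shown above:

```python
def classify_model_id(model_id: str) -> str:
    """Guess which Cline setup a Model ID value belongs to.

    DAC endpoint OCIDs start with 'ocid1.generativeaiendpoint.';
    anything else is treated as an on-demand OCI Model Name,
    such as 'xai.grok-code-fast-1'.
    """
    if model_id.startswith("ocid1.generativeaiendpoint."):
        return "dac"
    return "on-demand"


print(classify_model_id("xai.grok-code-fast-1"))                        # on-demand
print(classify_model_id("ocid1.generativeaiendpoint.<region>..<unique_id>"))  # dac
```

A check like this is handy because the deployment path also decides which URL shape to configure in the next step.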
### 3. Understand the OCI OpenAI-Compatible API Endpoint URL

Cline uses a different OCI OpenAI-compatible endpoint format depending on whether the model is on demand or DAC-hosted.

For on-demand models, configure the base URL **without** `/chat/completions`:

```text
https://inference.generativeai.<region>.oci.oraclecloud.com/openai/v1
```

For DAC-hosted models, configure the full Chat Completions URL **with** `/chat/completions`:

```text
https://inference.generativeai.<region>.oci.oraclecloud.com/openai/v1/chat/completions
```

Keep note of the following values:

- Region
- OCI Model Name or DAC endpoint OCID
- Compartment
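The two URL shapes above differ only in the region and the trailing `/chat/completions` segment, so they can be captured in one small helper. A minimal sketch (the function is illustrative; the host format comes directly from the examples above):

```python
def oci_openai_url(region: str, dac: bool = False) -> str:
    """Build the OCI OpenAI-compatible URL for a region.

    On-demand models use the bare /openai/v1 base URL; DAC-hosted
    models use the full Chat Completions URL.
    """
    base = f"https://inference.generativeai.{region}.oci.oraclecloud.com/openai/v1"
    return (base + "/chat/completions") if dac else base


print(oci_openai_url("us-chicago-1"))
# https://inference.generativeai.us-chicago-1.oci.oraclecloud.com/openai/v1
print(oci_openai_url("uk-london-1", dac=True))
# https://inference.generativeai.uk-london-1.oci.oraclecloud.com/openai/v1/chat/completions
```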
### 4. Create an OCI Generative AI API Key

```text
cline-genai-key
```

6. Add a description if needed
7. Configure key names and expiration dates
8. Click **Create**
9. Copy one of the generated key values

OCI Generative AI API keys are service-specific credentials and are different from OCI IAM API keys.

⚠️ Store the key securely. Do not commit it to GitHub or place it in source code.

### 5. Build the OCI OpenAI-Compatible URL

For on-demand models, use the following base URL format for Cline:

```text
https://inference.generativeai.<region>.oci.oraclecloud.com/openai/v1
```

Example for US Midwest Chicago:

```text
https://inference.generativeai.us-chicago-1.oci.oraclecloud.com/openai/v1
```

For DAC-hosted models, use the full Chat Completions URL shown in the DAC section.

Example for UK South London:

```text
https://inference.generativeai.uk-london-1.oci.oraclecloud.com/openai/v1/chat/completions
```
### 6. Configure Cline

Open VS Code or PyCharm and configure Cline:

```text
OpenAI Compatible
```

4. Set **Base URL** to your OCI URL.

For on-demand models:

```text
https://inference.generativeai.us-chicago-1.oci.oraclecloud.com/openai/v1
```

For DAC-hosted models:

```text
https://inference.generativeai.uk-london-1.oci.oraclecloud.com/openai/v1/chat/completions
```

5. Paste your OCI Generative AI API key into the **API Key** field
6. Set **Model ID** to your OCI Model Name or DAC endpoint OCID

Example:

```text
xai.grok-code-fast-1
```

For a DAC-hosted model, use the DAC endpoint OCID instead:

```text
ocid1.generativeaiendpoint.<region>..<unique_id>
```

7. Save the configuration
### 7. Test the Connection in Cline

Use a simple prompt first:

```text
Hello. Reply with one sentence confirming that the connection works.
```

If the setup is correct, Cline should return a normal response from the OCI-hosted model.
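The same setup can also be exercised outside Cline against the OpenAI-compatible API. The sketch below only constructs the request and prints it (no network call is made); it assumes an on-demand model in `us-chicago-1` and assumes the API key is sent as a standard Bearer token, as is typical for OpenAI-compatible endpoints:

```python
import json

# Full Chat Completions URL for a direct request (assumed on-demand
# model in us-chicago-1; clients append /chat/completions to the base URL).
url = "https://inference.generativeai.us-chicago-1.oci.oraclecloud.com/openai/v1/chat/completions"

headers = {
    # Placeholder value for illustration: never hardcode a real key.
    "Authorization": "Bearer <your-oci-genai-api-key>",
    "Content-Type": "application/json",
}

payload = {
    "model": "xai.grok-code-fast-1",
    "messages": [
        {
            "role": "user",
            "content": "Hello. Reply with one sentence confirming that the connection works.",
        }
    ],
}

# In real use, send it with any HTTP client, for example:
#   requests.post(url, headers=headers, data=json.dumps(payload))
print(json.dumps(payload, indent=2))
```

Seeing a normal chat completion from a direct request like this helps separate endpoint or key problems from Cline configuration problems.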
Check that:

- The model is available in the selected region
- The model supports OpenAI-compatible chat completion requests
- If using a DAC-hosted model, the OCI Generative AI endpoint is active and the **Model ID** field uses the DAC endpoint OCID

### Authorization Error

Check that:

- The base URL uses the correct region
- For on-demand models, the URL ends with:

```text
/openai/v1
```

- For DAC-hosted models, the URL ends with:

```text
/openai/v1/chat/completions
```

- Your network can reach OCI public endpoints
- Private endpoints are reachable from your machine if using private networking

## Recommended Tests

Run the following tests in Cline:

- Short coding task
- Code explanation prompt
- Refactoring prompt

If this was only a test setup:

1. Remove the API key from Cline
2. Revoke or deactivate the OCI Generative AI API key
3. If using a DAC-hosted model only for testing, delete the endpoint before deleting the Dedicated AI Cluster

## OCI Services Used

- **OCI Generative AI** - Model inference and endpoints
- **OCI Generative AI Dedicated AI Clusters** - Dedicated hosting for deployed models
- **OCI IAM** - Policies and authorization
- **OCI Generative AI API Keys** - API key authentication
- **Cline** - AI coding assistant in VS Code or PyCharm
Lines changed: 16 additions & 0 deletions

```dockerfile
FROM ubuntu:22.04

RUN apt-get update && \
    apt-get install -y ffmpeg curl python3 python3-pip unzip && \
    pip3 install oci-cli && \
    rm -rf /var/lib/apt/lists/*

RUN mkdir -p /video && chmod 777 /video

WORKDIR /video

COPY entrypoint.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh

ENTRYPOINT ["/entrypoint.sh"]
```
Lines changed: 35 additions & 0 deletions

Copyright (c) 2026 Oracle and/or its affiliates.

The Universal Permissive License (UPL), Version 1.0

Subject to the condition set forth below, permission is hereby granted to any
person obtaining a copy of this software, associated documentation and/or data
(collectively the "Software"), free of charge and under any and all copyright
rights in the Software, and any and all patent rights owned or freely
licensable by each licensor hereunder covering either (i) the unmodified
Software as contributed to or provided by such licensor, or (ii) the Larger
Works (as defined below), to deal in both

(a) the Software, and
(b) any piece of software and/or hardware listed in the lrgrwrks.txt file if
one is included with the Software (each a "Larger Work" to which the Software
is contributed by such licensors),

without restriction, including without limitation the rights to copy, create
derivative works of, display, perform, and distribute the Software and make,
use, sell, offer for sale, import, export, have made, and have sold the
Software and the Larger Work(s), and to sublicense the foregoing rights on
either these or other terms.

This license is subject to the following condition:
The above copyright notice and either this complete permission notice or at
a minimum a reference to the UPL must be included in all copies or
substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
