Commit df27754

Authored by alexsin368, pre-commit-ci[bot], and ZePan110
AgentQnA - add support for remote server (opea-project#1900)
Signed-off-by: alexsin368 <alex.sin@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: ZePan110 <ze.pan@intel.com>
Signed-off-by: cogniware-devops <ambarish.desai@cogniware.ai>
1 parent 0f4db06 commit df27754

2 files changed

Lines changed: 71 additions & 12 deletions

AgentQnA/README.md

Lines changed: 53 additions & 12 deletions
@@ -99,7 +99,7 @@ flowchart LR
9999

100100
#### First, clone the `GenAIExamples` repo.
101101

102-
```
102+
```bash
103103
export WORKDIR=<your-work-directory>
104104
cd $WORKDIR
105105
git clone https://github.com/opea-project/GenAIExamples.git
@@ -109,7 +109,7 @@ git clone https://github.com/opea-project/GenAIExamples.git
109109

110110
##### For proxy environments only
111111

112-
```
112+
```bash
113113
export http_proxy="Your_HTTP_Proxy"
114114
export https_proxy="Your_HTTPs_Proxy"
115115
# Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1"
@@ -118,31 +118,43 @@ export no_proxy="Your_No_Proxy"
118118

119119
##### For using open-source llms
120120

121-
```
121+
Set up a [HuggingFace](https://huggingface.co/) account and generate a [user access token](https://huggingface.co/docs/transformers.js/en/guides/private#step-1-generating-a-user-access-token).
122+
123+
Then set an environment variable with the token and another for a directory to download the models:
124+
125+
```bash
122126
export HUGGINGFACEHUB_API_TOKEN=<your-HF-token>
123-
export HF_CACHE_DIR=<directory-where-llms-are-downloaded> #so that no need to redownload every time
127+
export HF_CACHE_DIR=<directory-where-llms-are-downloaded> # to avoid redownloading models
124128
```
125129

126-
##### [Optional] OPANAI_API_KEY to use OpenAI models
130+
##### [Optional] OPENAI_API_KEY to use OpenAI models or Intel® AI for Enterprise Inference
127131

128-
```
132+
To use OpenAI models, generate a key following these [instructions](https://platform.openai.com/api-keys).
133+
134+
To use a remote server running Intel® AI for Enterprise Inference, contact the cloud service provider or owner of the on-prem machine for a key to access the desired model on the server.
135+
136+
Then set the environment variable `OPENAI_API_KEY` with the key contents:
137+
138+
```bash
129139
export OPENAI_API_KEY=<your-openai-key>
130140
```
131141

132142
#### Third, set up environment variables for the selected hardware using the corresponding `set_env.sh`
133143

134144
##### Gaudi
135145

136-
```
146+
```bash
137147
source $WORKDIR/GenAIExamples/AgentQnA/docker_compose/intel/hpu/gaudi/set_env.sh
138148
```
139149

140150
##### Xeon
141151

142-
```
152+
```bash
143153
source $WORKDIR/GenAIExamples/AgentQnA/docker_compose/intel/cpu/xeon/set_env.sh
144154
```
145155

156+
For running
157+
146158
### 2. Launch the multi-agent system. </br>
147159

148160
We make it convenient to launch the whole system with docker compose, which includes microservices for LLM, agents, UI, retrieval tool, vector database, dataprep, and telemetry. There are 3 docker compose files, which make it easy for users to pick and choose. Users can choose a different retrieval tool other than the `DocIndexRetriever` example provided in our GenAIExamples repo. Users can choose not to launch the telemetry containers.
@@ -184,14 +196,37 @@ docker compose -f $WORKDIR/GenAIExamples/DocIndexRetriever/docker_compose/intel/
184196

185197
#### Launch on Xeon
186198

187-
On Xeon, only OpenAI models are supported. The command below will launch the multi-agent system with the `DocIndexRetriever` as the retrieval tool for the Worker RAG agent.
199+
On Xeon, OpenAI models and models deployed on a remote server are supported. Both methods require an API key.
188200

189201
```bash
190202
export OPENAI_API_KEY=<your-openai-key>
191203
cd $WORKDIR/GenAIExamples/AgentQnA/docker_compose/intel/cpu/xeon
204+
```
205+
206+
##### OpenAI Models
207+
208+
The command below will launch the multi-agent system with the `DocIndexRetriever` as the retrieval tool for the Worker RAG agent.
209+
210+
```bash
192211
docker compose -f $WORKDIR/GenAIExamples/DocIndexRetriever/docker_compose/intel/cpu/xeon/compose.yaml -f compose_openai.yaml up -d
193212
```
194213

214+
##### Models on Remote Server
215+
216+
When models are deployed on a remote server with Intel® AI for Enterprise Inference, a base URL and an API key are required to access them. To run the Agent microservice on Xeon while using models deployed on a remote server, add `compose_remote.yaml` to the `docker compose` command and set additional environment variables.
217+
218+
###### Notes
219+
220+
- `OPENAI_API_KEY` is already set in a previous step.
221+
- `model` overrides the value that `set_env.sh` sets for this environment variable.
222+
- `LLM_ENDPOINT_URL` is the base URL provided by the owner of the on-prem machine or the cloud service provider. It follows the format "https://<DNS>", for example "https://api.inference.example.com".
223+
224+
```bash
225+
export model=<name-of-model-card>
226+
export LLM_ENDPOINT_URL=<http-endpoint-of-remote-server>
227+
docker compose -f $WORKDIR/GenAIExamples/DocIndexRetriever/docker_compose/intel/cpu/xeon/compose.yaml -f compose_openai.yaml -f compose_remote.yaml up -d
228+
```
229+
195230
### 3. Ingest Data into the vector database
196231

197232
The `run_ingest_data.sh` script will use an example jsonl file to ingest example documents into a vector database. Other ways to ingest data and other types of documents supported can be found in the OPEA dataprep microservice located in the opea-project/GenAIComps repo.
@@ -208,12 +243,18 @@ bash run_ingest_data.sh
208243
The UI microservice is launched in the previous step with the other microservices.
209244
To access the UI, open `http://${ip_address}:5173` in a web browser. Note that `ip_address` here is the host IP of the UI microservice.
210245

211-
1. `create Admin Account` with a random value
212-
2. add opea agent endpoint `http://$ip_address:9090/v1` which is a openai compatible api
246+
1. Click on the arrow above `Get started`. Create an admin account with a name, email, and password.
247+
2. Add an OpenAI-compatible API endpoint. In the upper right, click on the circle button with the user's initial, go to `Admin Settings`->`Connections`. Under `Manage OpenAI API Connections`, click on the `+` to add a connection. Fill in these fields:
248+
249+
- **URL**: `http://${ip_address}:9090/v1` (do not forget the `v1`)
250+
- **Key**: any value
251+
- **Model IDs**: any name, e.g. `opea-agent`, then press `+` to add it
252+
253+
Click "Save".
213254

214255
![opea-agent-setting](assets/img/opea-agent-setting.png)
215256

216-
3. test opea agent with ui
257+
3. Test the OPEA agent with the UI. Return to `New Chat` and ensure the model (e.g. `opea-agent`) is selected near the upper left. Enter any prompt to interact with the agent.
217258

218259
![opea-agent-test](assets/img/opea-agent-test.png)
219260

Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
1+
# Copyright (C) 2025 Intel Corporation
2+
# SPDX-License-Identifier: Apache-2.0
3+
4+
services:
5+
worker-rag-agent:
6+
environment:
7+
llm_endpoint_url: ${LLM_ENDPOINT_URL}
8+
api_key: ${OPENAI_API_KEY}
9+
10+
worker-sql-agent:
11+
environment:
12+
llm_endpoint_url: ${LLM_ENDPOINT_URL}
13+
api_key: ${OPENAI_API_KEY}
14+
15+
supervisor-react-agent:
16+
environment:
17+
llm_endpoint_url: ${LLM_ENDPOINT_URL}
18+
api_key: ${OPENAI_API_KEY}
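
The three services in the new Compose file all interpolate `LLM_ENDPOINT_URL` and `OPENAI_API_KEY` from the shell. Docker Compose substitutes an empty string (with a warning) for an unset variable, so it may be worth checking the environment before launching. The following is a minimal sketch of such a check and is not part of this commit: the function name `check_remote_env` is hypothetical, and the `https://` test simply mirrors the URL format described in the README notes.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight check (an illustration, not part of the commit).
# Verifies the variables that the remote-server setup relies on.
check_remote_env() {
  local rc=0

  # OPENAI_API_KEY and LLM_ENDPOINT_URL are interpolated by the remote Compose file.
  [ -n "${OPENAI_API_KEY:-}" ] || { echo "OPENAI_API_KEY is not set" >&2; rc=1; }

  # model is the variable the README says to override after sourcing set_env.sh.
  [ -n "${model:-}" ] || { echo "model is not set" >&2; rc=1; }

  # The README describes the base URL format as https://<DNS>.
  case "${LLM_ENDPOINT_URL:-}" in
    https://?*) ;;
    *) echo 'LLM_ENDPOINT_URL should look like https://<DNS>' >&2; rc=1 ;;
  esac

  return "$rc"
}
```

One way to use it would be to run `check_remote_env` after the `export` lines and start `docker compose ... up -d` only if it returns 0. The check looks only at the local environment and does not contact the remote server.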
