You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Set up a [HuggingFace](https://huggingface.co/) account and generate a [user access token](https://huggingface.co/docs/transformers.js/en/guides/private#step-1-generating-a-user-access-token).
122
+
123
+
Then set an environment variable with the token and another for a directory to download the models:
124
+
125
+
```bash
122
126
export HUGGINGFACEHUB_API_TOKEN=<your-HF-token>
123
-
export HF_CACHE_DIR=<directory-where-llms-are-downloaded> #so that no need to redownload every time
##### [Optional]OPANAI_API_KEY to use OpenAI models
130
+
##### [Optional]OPENAI_API_KEY to use OpenAI models or Intel® AI for Enterprise Inference
127
131
128
-
```
132
+
To use OpenAI models, generate a key following these [instructions](https://platform.openai.com/api-keys).
133
+
134
+
To use a remote server running Intel® AI for Enterprise Inference, contact the cloud service provider or owner of the on-prem machine for a key to access the desired model on the server.
135
+
136
+
Then set the environment variable `OPENAI_API_KEY` with the key contents:
137
+
138
+
```bash
129
139
export OPENAI_API_KEY=<your-openai-key>
130
140
```
131
141
132
142
#### Third, set up environment variables for the selected hardware using the corresponding `set_env.sh`
We make it convenient to launch the whole system with docker compose, which includes microservices for LLM, agents, UI, retrieval tool, vector database, dataprep, and telemetry. There are 3 docker compose files, which make it easy for users to pick and choose. Users can choose a different retrieval tool other than the `DocIndexRetriever` example provided in our GenAIExamples repo. Users can choose not to launch the telemetry containers.
On Xeon, only OpenAI models are supported. The command below will launch the multi-agent system with the `DocIndexRetriever` as the retrieval tool for the Worker RAG agent.
199
+
On Xeon, OpenAI models and models deployed on a remote server are supported. Both methods require an API key.
The command below will launch the multi-agent system with the `DocIndexRetriever` as the retrieval tool for the Worker RAG agent.
209
+
210
+
```bash
192
211
docker compose -f $WORKDIR/GenAIExamples/DocIndexRetriever/docker_compose/intel/cpu/xeon/compose.yaml -f compose_openai.yaml up -d
193
212
```
194
213
214
+
##### Models on Remote Server
215
+
216
+
When models are deployed on a remote server with Intel® AI for Enterprise Inference, a base URL and an API key are required to access them. To run the Agent microservice on Xeon while using models deployed on a remote server, add `compose_remote.yaml` to the `docker compose` command and set additional environment variables.
217
+
218
+
###### Notes
219
+
220
+
-`OPENAI_API_KEY` is already set in a previous step.
221
+
-`model` is used to overwrite the value set for this environment variable in `set_env.sh`.
222
+
-`LLM_ENDPOINT_URL` is the base URL given from the owner of the on-prem machine or cloud service provider. It will follow this format: "https://<DNS>". Here is an example: "https://api.inference.example.com".
docker compose -f $WORKDIR/GenAIExamples/DocIndexRetriever/docker_compose/intel/cpu/xeon/compose.yaml -f compose_openai.yaml -f compose_remote.yaml up -d
228
+
```
229
+
195
230
### 3. Ingest Data into the vector database
196
231
197
232
The `run_ingest_data.sh` script will use an example jsonl file to ingest example documents into a vector database. Other ways to ingest data and other types of documents supported can be found in the OPEA dataprep microservice located in the opea-project/GenAIComps repo.
@@ -208,12 +243,18 @@ bash run_ingest_data.sh
208
243
The UI microservice is launched in the previous step with the other microservices.
209
244
To see the UI, open a web browser to `http://${ip_address}:5173` to access the UI. Note the `ip_address` here is the host IP of the UI microservice.
210
245
211
-
1.`create Admin Account` with a random value
212
-
2. add opea agent endpoint `http://$ip_address:9090/v1` which is a openai compatible api
246
+
1. Click on the arrow above `Get started`. Create an admin account with a name, email, and password.
247
+
2. Add an OpenAI-compatible API endpoint. In the upper right, click on the circle button with the user's initial, go to `Admin Settings`->`Connections`. Under `Manage OpenAI API Connections`, click on the `+` to add a connection. Fill in these fields:
248
+
249
+
-**URL**: `http://${ip_address}:9090/v1`, do not forget the `v1`
250
+
-**Key**: any value
251
+
-**Model IDs**: any name i.e. `opea-agent`, then press `+` to add it
3.Test OPEA agent with UI. Return to `New Chat` and ensure the model (i.e. `opea-agent`) is selected near the upper left. Enter in any prompt to interact with the agent.
0 commit comments