Commit 7edd44e

Merge branch 'main' into ze-ci/size
2 parents aaa6e78 + 7eb64d3 commit 7edd44e

23 files changed

Lines changed: 1272 additions & 636 deletions

.github/workflows/pr-link-path-scan.yml

Lines changed: 3 additions & 3 deletions
@@ -50,7 +50,7 @@ jobs:
 if [ "$response_retry" -eq 200 ]; then
 echo "*****Retry successfully*****"
 else
-echo "Invalid link ($response_retry) from ${{github.workspace}}/$(echo "$url_line"|cut -d':' -f1): $url"
+echo -e "::error:: Invalid link ($response_retry) from ${{github.workspace}}/$(echo "$url_line"|cut -d':' -f1): $url"
 fail="TRUE"
 fi
 fi
@@ -123,7 +123,7 @@ jobs:
 if [ "$response_retry" -eq 200 ]; then
 echo "*****Retry successfully*****"
 else
-echo "Invalid path ($response_retry) from ${{github.workspace}}/$refer_path: $png_path"
+echo -e "::error:: Invalid path ($response_retry) from ${{github.workspace}}/$refer_path: $png_path"
 fail="TRUE"
 fi
 else
@@ -132,7 +132,7 @@ jobs:
 fi
 fi
 else
-echo "${{github.workspace}}/$refer_path:$png_path does not exist"
+echo -e "::error:: ${{github.workspace}}/$refer_path:$png_path does not exist"
 fail="TRUE"
 fi
 done
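
The three changed lines in this workflow prefix each failure message with `::error::`, which GitHub Actions reads as a workflow command and surfaces as an error annotation rather than plain log output. A minimal sketch of the same pattern outside the workflow follows. The helper name `report_invalid_link`, the status code, the file path, and the URL are all assumptions made for illustration and do not appear in the commit.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the workflow): formats a failure message
# as a GitHub Actions error annotation, mirroring the echo lines in the diff.
report_invalid_link() {
  local status="$1" source_file="$2" url="$3"
  # A line starting with "::error::" is treated as an error annotation
  # when the script runs inside a GitHub Actions step.
  printf '::error:: Invalid link (%s) from %s: %s\n' "$status" "$source_file" "$url"
}

fail="FALSE"
response_retry=404   # assumed status code, for illustration only
if [ "$response_retry" -eq 200 ]; then
  echo "*****Retry successfully*****"
else
  report_invalid_link "$response_retry" "docs/README.md" "https://example.com/missing"
  fail="TRUE"
fi
echo "fail=$fail"
```

Run locally, the script only prints the `::error::` line as text; the annotation behaviour applies when the output is produced inside an Actions runner.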

AgentQnA/README.md

Lines changed: 22 additions & 284 deletions
@@ -1,90 +1,6 @@
-# Agents for Question Answering
-
-## Table of contents
-
-1. [Overview](#overview)
-2. [Deploy with Docker](#deploy-with-docker)
-3. [How to interact with the agent system with UI](#how-to-interact-with-the-agent-system-with-ui)
-4. [Validate Services](#validate-services)
-5. [Register Tools](#how-to-register-other-tools-with-the-ai-agent)
-6. [Monitoring and Tracing](#monitor-and-tracing)
-
-## Overview
-
-This example showcases a hierarchical multi-agent system for question-answering applications. The architecture diagram below shows a supervisor agent that interfaces with the user and dispatches tasks to two worker agents to gather information and come up with answers. The worker RAG agent uses the retrieval tool to retrieve relevant documents from a knowledge base - a vector database. The worker SQL agent retrieves relevant data from a SQL database. Although not included in this example by default, other tools such as a web search tool or a knowledge graph query tool can be used by the supervisor agent to gather information from additional sources.
-![Architecture Overview](assets/img/agent_qna_arch.png)
-
-The AgentQnA example is implemented using the component-level microservices defined in [GenAIComps](https://github.com/opea-project/GenAIComps). The flow chart below shows the information flow between different microservices for this example.
-
-```mermaid
----
-config:
-flowchart:
-nodeSpacing: 400
-rankSpacing: 100
-curve: linear
-themeVariables:
-fontSize: 50px
----
-flowchart LR
-%% Colors %%
-classDef blue fill:#ADD8E6,stroke:#ADD8E6,stroke-width:2px,fill-opacity:0.5
-classDef orange fill:#FBAA60,stroke:#ADD8E6,stroke-width:2px,fill-opacity:0.5
-classDef orchid fill:#C26DBC,stroke:#ADD8E6,stroke-width:2px,fill-opacity:0.5
-classDef invisible fill:transparent,stroke:transparent;
-
-%% Subgraphs %%
-subgraph DocIndexRetriever-MegaService["DocIndexRetriever MegaService "]
-direction LR
-EM([Embedding MicroService]):::blue
-RET([Retrieval MicroService]):::blue
-RER([Rerank MicroService]):::blue
-end
-subgraph UserInput[" User Input "]
-direction LR
-a([User Input Query]):::orchid
-Ingest([Ingest data]):::orchid
-end
-AG_REACT([Agent MicroService - react]):::blue
-AG_RAG([Agent MicroService - rag]):::blue
-AG_SQL([Agent MicroService - sql]):::blue
-LLM_gen{{LLM Service <br>}}
-DP([Data Preparation MicroService]):::blue
-TEI_RER{{Reranking service<br>}}
-TEI_EM{{Embedding service <br>}}
-VDB{{Vector DB<br><br>}}
-R_RET{{Retriever service <br>}}
-
+# Agents for Question and Answering Application

-
-%% Questions interaction
-direction LR
-a[User Input Query] --> AG_REACT
-AG_REACT --> AG_RAG
-AG_REACT --> AG_SQL
-AG_RAG --> DocIndexRetriever-MegaService
-EM ==> RET
-RET ==> RER
-Ingest[Ingest data] --> DP
-
-%% Embedding service flow
-direction LR
-AG_RAG <-.-> LLM_gen
-AG_SQL <-.-> LLM_gen
-AG_REACT <-.-> LLM_gen
-EM <-.-> TEI_EM
-RET <-.-> R_RET
-RER <-.-> TEI_RER
-
-direction TB
-%% Vector DB interaction
-R_RET <-.-> VDB
-DP <-.-> VDB
-
-
-```
-
-### Why should AI Agents be used for question-answering?
+AI agents bring unique advantages to question-answering tasks, as demonstrated in the AgentQnA application:

 1. **Improve relevancy of retrieved context.**
 RAG agents can rephrase user queries, decompose user queries, and iterate to get the most relevant context for answering a user's question. Compared to conventional RAG, RAG agents significantly improve the correctness and relevancy of the answer because of the iterations it goes through.
@@ -93,213 +9,35 @@ flowchart LR
 3. **Hierarchical multi-agents improve performance.**
 Expert worker agents, such as RAG agents and SQL agents, can provide high-quality output for different aspects of a complex query, and the supervisor agent can aggregate the information to provide a comprehensive answer. If only one agent is used and all tools are provided to this single agent, it can lead to large overhead or not use the best tool to provide accurate answers.

-## Deploy with docker
-
-### 1. Set up environment </br>
-
-#### First, clone the `GenAIExamples` repo.
-
-```bash
-export WORKDIR=<your-work-directory>
-cd $WORKDIR
-git clone https://github.com/opea-project/GenAIExamples.git
-```
-
-#### Second, set up environment variables.
-
-##### For proxy environments only
-
-```bash
-export http_proxy="Your_HTTP_Proxy"
-export https_proxy="Your_HTTPs_Proxy"
-# Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1"
-export no_proxy="Your_No_Proxy"
-```
-
-##### For using open-source llms
-
-Set up a [HuggingFace](https://huggingface.co/) account and generate a [user access token](https://huggingface.co/docs/transformers.js/en/guides/private#step-1-generating-a-user-access-token).
-
-Then set an environment variable with the token and another for a directory to download the models:
-
-```bash
-export HF_TOKEN=<your-HF-token>
-export HF_CACHE_DIR=<directory-where-llms-are-downloaded> # to avoid redownloading models
-```
-
-##### [Optional] OPENAI_API_KEY to use OpenAI models or Intel® AI for Enterprise Inference
-
-To use OpenAI models, generate a key following these [instructions](https://platform.openai.com/api-keys).
-
-To use a remote server running Intel® AI for Enterprise Inference, contact the cloud service provider or owner of the on-prem machine for a key to access the desired model on the server.
-
-Then set the environment variable `OPENAI_API_KEY` with the key contents:
-
-```bash
-export OPENAI_API_KEY=<your-openai-key>
-```
-
-#### Third, set up environment variables for the selected hardware using the corresponding `set_env.sh`
-
-##### Gaudi
-
-```bash
-source $WORKDIR/GenAIExamples/AgentQnA/docker_compose/intel/hpu/gaudi/set_env.sh
-```
-
-##### Xeon
-
-```bash
-source $WORKDIR/GenAIExamples/AgentQnA/docker_compose/intel/cpu/xeon/set_env.sh
-```
-
-For running
-
-### 2. Launch the multi-agent system. </br>
-
-We make it convenient to launch the whole system with docker compose, which includes microservices for LLM, agents, UI, retrieval tool, vector database, dataprep, and telemetry. There are 3 docker compose files, which make it easy for users to pick and choose. Users can choose a different retrieval tool other than the `DocIndexRetriever` example provided in our GenAIExamples repo. Users can choose not to launch the telemetry containers.
-
-#### Launch on Gaudi
-
-On Gaudi, `meta-llama/Meta-Llama-3.3-70B-Instruct` will be served using vllm. The command below will launch the multi-agent system with the `DocIndexRetriever` as the retrieval tool for the Worker RAG agent.
-
-```bash
-cd $WORKDIR/GenAIExamples/AgentQnA/docker_compose/intel/hpu/gaudi/
-docker compose -f $WORKDIR/GenAIExamples/DocIndexRetriever/docker_compose/intel/cpu/xeon/compose.yaml -f compose.yaml up -d
-```
-
-> **Note**: To enable the web search tool, skip this step and proceed to the "[Optional] Web Search Tool Support" section.
-
-To enable Open Telemetry Tracing, compose.telemetry.yaml file need to be merged along with default compose.yaml file.
-Gaudi example with Open Telemetry feature:
-
-```bash
-cd $WORKDIR/GenAIExamples/AgentQnA/docker_compose/intel/hpu/gaudi/
-docker compose -f $WORKDIR/GenAIExamples/DocIndexRetriever/docker_compose/intel/cpu/xeon/compose.yaml -f compose.yaml -f compose.telemetry.yaml up -d
-```
-
-##### [Optional] Web Search Tool Support
-
-<details>
-<summary> Instructions </summary>
-A web search tool is supported in this example and can be enabled by running docker compose with the `compose.webtool.yaml` file.
-The Google Search API is used. Follow the [instructions](https://python.langchain.com/docs/integrations/tools/google_search) to create an API key and enable the Custom Search API on a Google account. The environment variables `GOOGLE_CSE_ID` and `GOOGLE_API_KEY` need to be set.
-
-```bash
-cd $WORKDIR/GenAIExamples/AgentQnA/docker_compose/intel/hpu/gaudi/
-export GOOGLE_CSE_ID="YOUR_ID"
-export GOOGLE_API_KEY="YOUR_API_KEY"
-docker compose -f $WORKDIR/GenAIExamples/DocIndexRetriever/docker_compose/intel/cpu/xeon/compose.yaml -f compose.yaml -f compose.webtool.yaml up -d
-```
-
-</details>
-
-#### Launch on Xeon
-
-On Xeon, OpenAI models and models deployed on a remote server are supported. Both methods require an API key.
-
-```bash
-export OPENAI_API_KEY=<your-openai-key>
-cd $WORKDIR/GenAIExamples/AgentQnA/docker_compose/intel/cpu/xeon
-```
-
-##### OpenAI Models
-
-The command below will launch the multi-agent system with the `DocIndexRetriever` as the retrieval tool for the Worker RAG agent.
-
-```bash
-docker compose -f $WORKDIR/GenAIExamples/DocIndexRetriever/docker_compose/intel/cpu/xeon/compose.yaml -f compose_openai.yaml up -d
-```
-
-##### Models on Remote Server
-
-When models are deployed on a remote server with Intel® AI for Enterprise Inference, a base URL and an API key are required to access them. To run the Agent microservice on Xeon while using models deployed on a remote server, add `compose_remote.yaml` to the `docker compose` command and set additional environment variables.
-
-###### Notes
-
-- `OPENAI_API_KEY` is already set in a previous step.
-- `model` is used to overwrite the value set for this environment variable in `set_env.sh`.
-- `LLM_ENDPOINT_URL` is the base URL given from the owner of the on-prem machine or cloud service provider. It will follow this format: "https://<DNS>". Here is an example: "https://api.inference.example.com".
-
-```bash
-export model=<name-of-model-card>
-export LLM_ENDPOINT_URL=<http-endpoint-of-remote-server>
-docker compose -f $WORKDIR/GenAIExamples/DocIndexRetriever/docker_compose/intel/cpu/xeon/compose.yaml -f compose_openai.yaml -f compose_remote.yaml up -d
-```
-
-### 3. Ingest Data into the vector database
-
-The `run_ingest_data.sh` script will use an example jsonl file to ingest example documents into a vector database. Other ways to ingest data and other types of documents supported can be found in the OPEA dataprep microservice located in the opea-project/GenAIComps repo.
-
-```bash
-cd $WORKDIR/GenAIExamples/AgentQnA/retrieval_tool/
-bash run_ingest_data.sh
-```
-
-> **Note**: This is a one-time operation.
-
-## How to interact with the agent system with UI
-
-The UI microservice is launched in the previous step with the other microservices.
-To see the UI, open a web browser to `http://${ip_address}:5173` to access the UI. Note the `ip_address` here is the host IP of the UI microservice.
-
-1. Click on the arrow above `Get started`. Create an admin account with a name, email, and password.
-2. Add an OpenAI-compatible API endpoint. In the upper right, click on the circle button with the user's initial, go to `Admin Settings`->`Connections`. Under `Manage OpenAI API Connections`, click on the `+` to add a connection. Fill in these fields:
-
-- **URL**: `http://${ip_address}:9090/v1`, do not forget the `v1`
-- **Key**: any value
-- **Model IDs**: any name i.e. `opea-agent`, then press `+` to add it
-
-Click "Save".
-
-![opea-agent-setting](assets/img/opea-agent-setting.png)
-
-3. Test OPEA agent with UI. Return to `New Chat` and ensure the model (i.e. `opea-agent`) is selected near the upper left. Enter in any prompt to interact with the agent.
-
-![opea-agent-test](assets/img/opea-agent-test.png)
-
-## [Optional] Deploy using Helm Charts
-
-Refer to the [AgentQnA helm chart](./kubernetes/helm/README.md) for instructions on deploying AgentQnA on Kubernetes.
-
-## Validate Services
-
-1. First look at logs for each of the agent docker containers:
-
-```bash
-# worker RAG agent
-docker logs rag-agent-endpoint
-
-# worker SQL agent
-docker logs sql-agent-endpoint
-
-# supervisor agent
-docker logs react-agent-endpoint
-```
+## Table of contents

-Look for the message "HTTP server setup successful" to confirm the agent docker container has started successfully.</p>
+1. [Architecture](#architecture)
+2. [deployment Options](#deployment-options)
+3. [Validated Configurations](#validated-configurations)
+4. [Monitoring and Tracing](./README_miscellaneous.md)

-2. Use python to validate each agent is working properly:
+## Architecture

-```bash
-# RAG worker agent
-python $WORKDIR/GenAIExamples/AgentQnA/tests/test.py --prompt "Tell me about Michael Jackson song Thriller" --agent_role "worker" --ext_port 9095
+This example showcases a hierarchical multi-agent system for question-answering applications. The architecture diagram below shows a supervisor agent that interfaces with the user and dispatches tasks to two worker agents to gather information and come up with answers. The worker RAG agent uses the retrieval tool to retrieve relevant documents from a knowledge base - a vector database. The worker SQL agent retrieves relevant data from a SQL database. Although not included in this example by default, other tools such as a web search tool or a knowledge graph query tool can be used by the supervisor agent to gather information from additional sources.

-# SQL agent
-python $WORKDIR/GenAIExamples/AgentQnA/tests/test.py --prompt "How many employees in company" --agent_role "worker" --ext_port 9096
+The AgentQnA application is an end-to-end workflow that leverages the capability of agents and tools. The workflow falls into the following architecture:

-# supervisor agent: this will test a two-turn conversation
-python $WORKDIR/GenAIExamples/AgentQnA/tests/test.py --agent_role "supervisor" --ext_port 9090
-```
+![Architecture Overview](assets/img/agent_qna_arch.png)

-## How to register other tools with the AI agent
+The AgentQnA example is implemented using the component-level microservices defined in [GenAIComps](https://github.com/opea-project/GenAIComps).

-The [tools](./tools) folder contains YAML and Python files for additional tools for the supervisor and worker agents. Refer to the "Provide your own tools" section in the instructions [here](https://github.com/opea-project/GenAIComps/tree/main/comps/agent/src/README.md) to add tools and customize the AI agents.
+## Deployment Options

-## Monitor and Tracing
+The table below lists currently available deployment options. They outline in detail the implementation of this example on selected hardware.

-Follow [OpenTelemetry OPEA Guide](https://opea-project.github.io/latest/tutorial/OpenTelemetry/OpenTelemetry_OPEA_Guide.html) to understand how to use OpenTelemetry tracing and metrics in OPEA.
-For AgentQnA specific tracing and metrics monitoring, follow [OpenTelemetry on AgentQnA](https://opea-project.github.io/latest/tutorial/OpenTelemetry/deploy/AgentQnA.html) section.
+| Category | Deployment Option | Description |
+| ---------------------- | -------------------- | ---------------------------------------------------------------- |
+| On-premise Deployments | Docker compose | [AgentQnA deployment on Xeon](./docker_compose/intel/cpu/xeon) |
+| | | [AgentQnA deployment on Gaudi](./docker_compose/intel/hpu/gaudi) |
+| | | [AgentQnA deployment on AMD ROCm](./docker_compose/amd/gpu/rocm) |
+| | Kubernetes | [Helm Charts](./kubernetes/helm) |
+| | Azure | Work-in-progress |
+| | Intel Tiber AI Cloud | Work-in-progress |

 ## Validated Configurations
