Commit d75b260

Consolidate all the changes
Signed-off-by: sunshuang1866 <sunshuang1866@outlook.com>
1 parent a02931e commit d75b260

4 files changed

Lines changed: 722 additions & 0 deletions

Lines changed: 244 additions & 0 deletions
@@ -0,0 +1,244 @@
# Deploying CodeGen with openGauss on Intel Xeon Processors

This document outlines the deployment process for a CodeGen application utilizing the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline on Intel Xeon servers. The pipeline integrates **openGauss** as the vector database (VectorDB) and includes microservices such as `embedding`, `retriever`, and `llm`.

---

## Table of Contents

1. [Quick Start](#quick-start)
2. [Build Docker Images](#build-docker-images)
3. [Validate Microservices](#validate-microservices)
4. [Launch the UI](#launch-the-ui)

---

## Quick Start

### 1. Set up Environment Variables

To set up environment variables for deploying CodeGen services, follow these steps:

1. Set the required environment variables:

```bash
# Example: host_ip="192.168.1.1"
export host_ip="External_Public_IP"
export HOST_IP=$host_ip
export HF_TOKEN="Your_Huggingface_API_Token"
export GS_USER="gaussdb"
export GS_PASSWORD="openGauss@123"
export GS_DB="postgres"
export GS_CONNECTION_STRING="opengauss+psycopg2://${GS_USER}:${GS_PASSWORD}@${host_ip}:5432/${GS_DB}"
```
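
Before moving on, it can help to confirm that none of the variables above is unset or empty. The following is a minimal sketch of such a check; `check_required_vars` is a hypothetical helper and is not part of the CodeGen repository.

```bash
# Hypothetical helper (not part of the repository): report every variable
# from the list above that is unset or empty in the current shell.
check_required_vars() {
  local name missing=0
  for name in host_ip HF_TOKEN GS_USER GS_PASSWORD GS_DB GS_CONNECTION_STRING; do
    # ${!name} is bash indirect expansion: the value of the variable named by $name.
    if [ -z "${!name:-}" ]; then
      echo "missing: ${name}" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Running `check_required_vars && echo "environment looks complete"` after the exports prints the confirmation only when all six variables are set.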

2. If you are in a proxy environment, also set the proxy-related environment variables:

```bash
export http_proxy="Your_HTTP_Proxy"
export https_proxy="Your_HTTPS_Proxy"
# Example: no_proxy="localhost,127.0.0.1,192.168.1.1"
export no_proxy="Your_No_Proxy",codegen-xeon-ui-server,codegen-xeon-backend-server,dataprep-opengauss-server,tei-embedding-serving,retriever-opengauss-server,vllm-server
```
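
If `no_proxy` is already set in your shell, the service names can be appended to the existing value instead of replacing it. The sketch below shows one way to do that without creating duplicate entries; `add_no_proxy` is a hypothetical helper, not something the repository provides.

```bash
# Hypothetical helper: append hosts to no_proxy, skipping any already present.
add_no_proxy() {
  local entry
  for entry in "$@"; do
    case ",${no_proxy:-}," in
      *",${entry},"*) ;;                                  # already listed
      *) no_proxy="${no_proxy:+${no_proxy},}${entry}" ;;  # append, comma-separated
    esac
  done
  export no_proxy
}
```

For example, `add_no_proxy vllm-server tei-embedding-serving` leaves existing entries untouched and adds only the names that are missing.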

3. Set up other environment variables:

```bash
source ../set_env_opengauss.sh
```

### 2. Run Docker Compose

```bash
docker compose -f compose_opengauss.yaml up -d
```

Docker Compose automatically downloads the Docker images from Docker Hub, for example:

```bash
docker pull opea/codegen:latest
docker pull opea/codegen-ui:latest
```

Note: Build the Docker images from source yourself if:

- You are developing off the git main branch (the container ports in the repo may differ from those in the published Docker image).
- You can't download the Docker image.
- You want to use a specific version of the Docker image.

Please refer to [Build Docker Images](#build-docker-images) below.

### 3. Consume the CodeGen Service

```bash
curl http://${host_ip}:7778/v1/codegen \
  -H "Content-Type: application/json" \
  -d '{
    "messages": "Write a Python function to calculate fibonacci numbers"
  }'
```

---

## Build Docker Images

First, clone the GenAIComps repository so that the Docker images can be built locally.

```bash
git clone https://github.com/opea-project/GenAIComps.git
cd GenAIComps
```

### 1. Build Retriever Image

```bash
docker build --no-cache -t opea/retriever:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/retrievers/src/Dockerfile .
```

### 2. Build Dataprep Image

```bash
docker build --no-cache -t opea/dataprep:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/dataprep/src/Dockerfile .
cd ..
```

### 3. Build MegaService Docker Image

```bash
git clone https://github.com/opea-project/GenAIExamples.git
cd GenAIExamples/CodeGen
docker build --no-cache -t opea/codegen:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f Dockerfile .
```

### 4. Build UI Docker Image

```bash
# Run from GenAIExamples/CodeGen, where the previous step left off
cd ui
docker build --no-cache -t opea/codegen-ui:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f docker/Dockerfile .
```

After the builds finish, run `docker images`. The following Docker images should be listed:

1. `opea/dataprep:latest`
2. `opea/retriever:latest`
3. `opea/codegen:latest`
4. `opea/codegen-ui:latest`
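
The same check can be scripted. The sketch below reads `repository:tag` lines on standard input and prints any of the four images that are absent; `missing_images` and `EXPECTED_IMAGES` are hypothetical names introduced only for this illustration. One way to use it is `docker images --format '{{.Repository}}:{{.Tag}}' | missing_images`, where empty output means all four images are present.

```bash
# Images this guide expects after the four builds above.
EXPECTED_IMAGES="opea/dataprep:latest opea/retriever:latest opea/codegen:latest opea/codegen-ui:latest"

# Hypothetical helper: read "repository:tag" lines on stdin and print
# every expected image that does not appear in that list.
missing_images() {
  local have img
  have="$(cat)"
  for img in $EXPECTED_IMAGES; do
    # -x matches whole lines, -F treats the image name as a fixed string.
    printf '%s\n' "$have" | grep -qxF -- "$img" || printf '%s\n' "$img"
  done
}
```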

---

## Required Models

The embedding and LLM services use the following models by default:

| Service   | Model                          |
| --------- | ------------------------------ |
| Embedding | BAAI/bge-base-en-v1.5          |
| LLM       | Qwen/Qwen2.5-Coder-7B-Instruct |

To use a different model, change the corresponding `xxx_MODEL_ID` environment variable.

---

## Validate Microservices

Note: When verifying the microservices with curl or an API call from a remote client, make sure the **ports** of the microservices are open in the firewall of the cloud node.

### 1. TEI Embedding Service

```bash
curl ${host_ip}:8090/embed \
  -X POST \
  -d '{"inputs":"What is Deep Learning?"}' \
  -H 'Content-Type: application/json'
```
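
The response can also be used to confirm the vector size that the retriever expects. The sketch below assumes the response for a single input is a JSON array holding one non-empty array of numbers (for example `[[0.1,0.2,0.3]]`); that shape is an assumption here, not something this document states. `count_dims` is a hypothetical helper that counts commas rather than parsing JSON.

```bash
# Hypothetical helper: count the numbers in a response shaped like [[a,b,c]].
# It counts commas and adds one, so it is only valid for a single,
# non-empty embedding.
count_dims() {
  local commas
  commas="$(tr -cd ',' | wc -c)"
  echo $(( commas + 1 ))
}
```

Piping the `curl` command above (with `-s` added) into `count_dims` would be expected to print 768 for `BAAI/bge-base-en-v1.5` under that assumption.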

### 2. Retriever Microservice

To consume the retriever microservice, generate a mock embedding vector with a Python script. The length of the embedding vector is determined by the embedding model. Here we use the model `EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5"`, whose vector size is 768.

```bash
export your_embedding=$(python3 -c "import random; embedding = [random.uniform(-1, 1) for _ in range(768)]; print(embedding)")
curl http://${host_ip}:7000/v1/retrieval \
  -X POST \
  -d "{\"text\":\"test\",\"embedding\":${your_embedding}}" \
  -H 'Content-Type: application/json'
```
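
On a host without Python, a mock vector of the same length can be produced with `awk` instead. This is a sketch of an alternative, not the method used above; `mock_embedding` is a hypothetical helper, and the values are pseudo-random numbers of roughly the same range.

```bash
# Hypothetical helper: print a JSON array of N pseudo-random floats
# between -1 and 1 (N defaults to 768, the size used above).
mock_embedding() {
  awk -v n="${1:-768}" 'BEGIN {
    srand()
    printf "["
    for (i = 0; i < n; i++) printf "%s%.6f", (i ? "," : ""), rand() * 2 - 1
    printf "]\n"
  }'
}
```

With this helper, the first line of the command above could be replaced by `export your_embedding=$(mock_embedding 768)`.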

### 3. LLM Backend Service

On first startup, this service takes longer because it has to download, load, and warm up the model. Once that finishes, the service is ready.

Use the command below to check whether the LLM serving is ready:

```bash
docker logs vllm-server 2>&1 | grep complete
```

If the service is ready, the output looks like this:

```text
INFO: Application startup complete.
```
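
Because the first startup can take a while, the readiness check above is often wrapped in a polling loop. The sketch below is one way to do that; `wait_until` and `vllm_ready` are hypothetical helpers, and the attempt count and delay are arbitrary values to adjust for your environment.

```bash
# Hypothetical helper: run a command until it succeeds, at most ATTEMPTS
# times, sleeping DELAY seconds between tries. Returns non-zero if the
# command never succeeds.
wait_until() {
  local attempts="$1" delay="$2" i
  shift 2
  for (( i = 1; i <= attempts; i++ )); do
    if "$@"; then
      return 0
    fi
    if (( i < attempts )); then
      sleep "$delay"
    fi
  done
  return 1
}

# Readiness probe based on the log line shown above.
vllm_ready() {
  docker logs vllm-server 2>&1 | grep -q "Application startup complete"
}
```

For example, `wait_until 60 10 vllm_ready && echo "LLM service is ready"` polls every 10 seconds for up to roughly ten minutes.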

Then use the `curl` command below to validate the service:

```bash
curl http://${host_ip}:8028/v1/chat/completions \
  -X POST \
  -d '{"model": "Qwen/Qwen2.5-Coder-7B-Instruct", "messages": [{"role": "user", "content": "Write a hello world in Python"}], "max_tokens":50}' \
  -H 'Content-Type: application/json'
```

### 4. MegaService

```bash
curl http://${host_ip}:7778/v1/codegen \
  -H "Content-Type: application/json" \
  -d '{
    "messages": "Write a Python function to sort a list"
  }'
```
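
The request body above is a JSON object with a single `messages` string. When the prompt comes from a shell variable, any quotes or newlines inside it would produce invalid JSON unless they are escaped first. A minimal sketch of that escaping follows; `json_escape` and `codegen_payload` are hypothetical helpers, and only backslashes, double quotes, newlines, and tabs are handled.

```bash
# Hypothetical helper: escape a string for use inside a JSON string literal.
# Covers backslash, double quote, newline, and tab only.
json_escape() {
  local s="$1"
  s=${s//\\/\\\\}      # backslash    -> two backslashes
  s=${s//\"/\\\"}      # double quote -> backslash + quote
  s=${s//$'\n'/\\n}    # newline      -> backslash + n
  s=${s//$'\t'/\\t}    # tab          -> backslash + t
  printf '%s' "$s"
}

# Hypothetical helper: build the request body used above from a prompt.
codegen_payload() {
  printf '{"messages": "%s"}' "$(json_escape "$1")"
}
```

The body can then be passed to the same endpoint with `-d "$(codegen_payload "$prompt")"`, where `prompt` is a variable of your own.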

### 5. Dataprep Microservice (Optional)

To update the default knowledge base, use the following commands:

**Upload a file:**

```bash
curl -X POST "http://${host_ip}:6007/v1/dataprep/ingest" \
  -H "Content-Type: multipart/form-data" \
  -F "files=@./your_code_file.py"
```

**Add Knowledge Base via HTTP Links:**

```bash
curl -X POST "http://${host_ip}:6007/v1/dataprep/ingest" \
  -H "Content-Type: multipart/form-data" \
  -F 'link_list=["https://example.com/code"]'
```

**Delete uploaded files:**

```bash
curl -X POST "http://${host_ip}:6007/v1/dataprep/delete" \
  -d '{"file_path": "all"}' \
  -H "Content-Type: application/json"
```

---

## Launch the UI

To access the frontend, open the following URL in your browser: `http://{host_ip}:5173`

By default, the UI runs on port 5173 internally. To access the frontend through a different host port, modify the port mapping in the `compose_opengauss.yaml` file as shown below:

```yaml
codegen-xeon-ui-server:
  image: opea/codegen-ui:latest
  ...
  ports:
    - "80:5173"
```
