This document outlines the deployment process for Edge Craft Retrieval-Augmented Generation (EC-RAG) on the Intel® Arc® platform.

This section describes how to quickly deploy and test the EdgeCraftRAG service manually on Intel® Arc® platform. The basic steps are:

1. [Prerequisites](#1-prerequisites)
2. [Access the Code](#2-access-the-code)
3. [Prepare models](#3-prepare-models)
4. [Prepare env variables and configurations](#4-prepare-env-variables-and-configurations)
5. [Deploy the Service on Arc A770 Using Docker Compose](#5-deploy-the-service-on-intel-gpu-using-docker-compose)
6. [Access UI](#6-access-ui)
7. [Cleanup the Deployment](#7-cleanup-the-deployment)

### 1. Prerequisites

EC-RAG supports vLLM deployment (the default method) and local OpenVINO deployment for Intel Arc GPUs. The prerequisites are listed below:

- Hardware: Intel Arc A770
  - OS: Ubuntu Server 22.04.1 or newer (at least a 6.2 LTS kernel)
  - Driver & libraries: please refer to [Installing GPU Drivers](https://dgpu-docs.intel.com/driver/installation-rolling.html#installing-gpu-drivers) for detailed driver & library setup
- Hardware: Intel Arc B60
  - Please refer to [Install Native Environment](https://github.com/intel/llm-scaler/tree/main/vllm#11-install-native-environment) for detailed setup
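The 6.2-kernel minimum above can be sanity-checked before installing the driver stack. A minimal sketch, assuming nothing beyond POSIX sh and GNU `sort -V` (the `version_ge` helper is hypothetical, not part of EC-RAG):

```shell
# Hypothetical helper: true when dotted version $1 >= version $2, using version sort
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# Strip any "-generic"-style suffix from the running kernel, then compare
kernel=$(uname -r | cut -d- -f1)
if version_ge "$kernel" "6.2"; then
  echo "kernel $kernel meets the 6.2 minimum"
else
  echo "kernel $kernel is older than 6.2; upgrade before installing the Arc driver stack"
fi
```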
The steps below use **vLLM** as the inference engine; if you want to use **OpenVINO** instead, please refer to [OpenVINO Local Inference](../../../../docs/Advanced_Setup.md#openvino-local-inference).

### 2. Access the Code

Clone the GenAIExamples repository and access the EdgeCraftRAG Intel® Arc® platform Docker Compose files and supporting scripts:

Check out a released version, such as v1.3:
The steps below cover single Intel Arc GPU inference; to set up inference across multiple Intel Arc GPUs, please refer to [Multi-ARC Setup](../../../../docs/Advanced_Setup.md#multi-arc-setup).
```bash
export HOST_IP=$ip_address # Your host ip
export VIDEOGROUPID=$(getent group video | cut -d: -f3)
export RENDERGROUPID=$(getent group render | cut -d: -f3)

# If you have a proxy configured, uncomment below line
# If you have a HF mirror configured, it will be imported to the container
# export HF_ENDPOINT=https://hf-mirror.com # your HF mirror endpoint

# Make sure all 3 folders have 1000:1000 permission, otherwise
chown 1000:1000 ${MODEL_PATH} ${PWD} # the default value of DOC_PATH and TMPFILE_PATH is PWD, so here we give permission to ${PWD}
# In addition, also make sure the .cache folder has 1000:1000 permission, otherwise
chown 1000:1000 -R $HOME/.cache
```
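The 1000:1000 ownership expectation above can be made explicit with a small check. A hypothetical sketch (the `needs_chown` helper is not part of EC-RAG; `stat -c '%u:%g'` is the GNU coreutils way to print `uid:gid`):

```shell
# Hypothetical helper: report whether an owner string matches the 1000:1000 the containers expect
needs_chown() {
  if [ "$1" = "1000:1000" ]; then
    echo "ownership OK"
  else
    echo "owned by $1; run: sudo chown 1000:1000 <folder>"
  fi
}

# Real usage would pass the actual owner, e.g.:
# needs_chown "$(stat -c '%u:%g' ${MODEL_PATH})"
needs_chown "1000:1000"   # prints: ownership OK
needs_chown "0:0"         # prints: owned by 0:0; run: sudo chown 1000:1000 <folder>
```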

For more advanced environment variables and configurations, please refer to [Prepare env variables for vLLM deployment](../../../../docs/Advanced_Setup.md#prepare-env-variables-for-vllm-deployment).
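The `VIDEOGROUPID`/`RENDERGROUPID` exports above work because `getent group` prints `name:password:gid:members` and `cut -d: -f3` keeps the numeric gid. A minimal illustration (the sample line and gid are made up):

```shell
# A hypothetical getent output line, and the field-3 extraction the exports use
sample="render:x:110:someuser"
gid=$(printf '%s' "$sample" | cut -d: -f3)
echo "$gid"   # prints: 110
```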

#### option a. Deploy the Service on Arc A770 Using Docker Compose

```bash
docker compose -f docker_compose/intel/gpu/arc/compose_vllm.yaml up -d
```

#### option b. Deploy the Service on Arc B60 Using Docker Compose

```bash
# Besides MILVUS_ENABLED and CHAT_HISTORY_ROUND, below environments are exposed for vLLM config, you can change them to your preference:
# export VLLM_SERVICE_PORT_B60=8086
# export DTYPE=float16
# export TP=1 # for multi GPU, you can change TP value
# export DP=1
# export ZE_AFFINITY_MASK=0 # for multi GPU, you can export ZE_AFFINITY_MASK=0,1,2...
# export ENFORCE_EAGER=1
# export TRUST_REMOTE_CODE=1
# export DISABLE_SLIDING_WINDOW=1
# export GPU_MEMORY_UTIL=0.8
# export NO_ENABLE_PREFIX_CACHING=1
# export MAX_NUM_BATCHED_TOKENS=8192
# export DISABLE_LOG_REQUESTS=1
# export MAX_MODEL_LEN=49152
# export BLOCK_SIZE=64
# export QUANTIZATION=fp8
docker compose -f docker_compose/intel/gpu/arc/compose_vllm_b60.yaml up -d
```

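The knobs listed above (`TP`, `DP`, `ZE_AFFINITY_MASK`, …) are ordinary environment variables read at `docker compose up` time. A hypothetical two-GPU override might look like this (values illustrative, not a recommendation):

```shell
# Illustrative multi-GPU override: split the model across 2 GPUs and expose both devices
export TP=2                   # tensor parallelism across 2 GPUs
export ZE_AFFINITY_MASK=0,1   # make GPU 0 and GPU 1 visible to the container
echo "TP=${TP} ZE_AFFINITY_MASK=${ZE_AFFINITY_MASK}"
```

Set these in the same shell session before running the `docker compose -f ... compose_vllm_b60.yaml up -d` command above.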
### 6. Access UI
Open your browser and access http://${HOST_IP}:8082
Below is the UI front page. For detailed operations on the UI and EC-RAG settings, please refer to [Explore_Edge_Craft_RAG](../../../../docs/Explore_Edge_Craft_RAG.md).