Commit 82dbc3e

add helm deploy for EC-RAG
Signed-off-by: Yongbozzz <yongbo.zhu@intel.com>
1 parent a02931e commit 82dbc3e

13 files changed

Lines changed: 539 additions & 0 deletions
Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
apiVersion: v2
name: edgecraftrag
description: Helm chart for EdgeCraftRAG stack
type: application
version: 0.1.0
appVersion: "25.11"
Lines changed: 94 additions & 0 deletions
@@ -0,0 +1,94 @@
# EdgeCraft RAG Helm Chart

This document introduces the Helm chart for deploying EdgeCraft RAG (ecrag) on a Kubernetes cluster.

## Prerequisites

- A running Kubernetes cluster.
- Helm installed.
- Required Docker images available in your registry or locally.

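
As a quick pre-flight check for the prerequisites above, the sketch below reports which command-line tools are missing from `PATH`. The helper name `missing_tools` is hypothetical, and checking for `kubectl` alongside `helm` is an assumption: the chart itself only asks for a reachable cluster and Helm.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print each named tool that is not found on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Example: missing_tools helm kubectl
# An empty result means every listed tool was found.
```

Once both tools are present, `kubectl get nodes` and `helm version` are a simple way to confirm that the cluster is reachable and that Helm runs.
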
## Configuration

Before installing, configure the `edgecraftrag/values.yaml` file for your environment.

### Key Configurations

1. **Images**: Set the registry and tag for `ecrag` and `vllm`.
   ```yaml
   image:
     ecrag:
       registry: <your-registry>
       tag: <your-tag>
     vllm:
       registry: <your-registry>
       tag: <your-tag>
   ```

2. **Environment Variables**: Configure proxies and host IP.
   ```yaml
   env:
     http_proxy: "http://proxy:port"
     https_proxy: "http://proxy:port"
     HOST_IP: "<node-ip>"
   ```

3. **LLM Settings**: Adjust LLM model paths and parameters.
   ```yaml
   llm:
     LLM_MODEL: "/path/to/model/inside/container" # Path as seen inside the container; it must lie under the mount of paths.model
   ```

4. **Persistent Paths**: Ensure the host paths exist for mounting.
   ```yaml
   paths:
     model: /home/user/models
     docs: /home/user/docs
   ```

## Installation
50+
51+
To install the chart, please use below command (`edgecraftrag` as an example)
52+
53+
```bash
54+
cd kubernetes/helm
55+
helm install edgecraftrag ./edgecraftrag
56+
```
57+
58+
If there're different clusters avaliable, please install the chart with specific kube config, e.g. :
59+
60+
```bash
61+
helm install edgecraftrag ./edgecraftrag --kubeconfig /home/user/.kube/nas.yaml
62+
```
63+
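
To keep install and uninstall pointed at the same cluster, the optional `--kubeconfig` flag can be assembled in one place. The sketch below only builds and prints the Helm command line so it can be reviewed before it is run; `helm_cmd` and the `ECRAG_KUBECONFIG` variable are hypothetical names, not part of the chart.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print a helm command line, appending --kubeconfig
# only when ECRAG_KUBECONFIG (an assumed variable name) is set and non-empty.
helm_cmd() {
  if [ -n "${ECRAG_KUBECONFIG:-}" ]; then
    echo "helm $* --kubeconfig $ECRAG_KUBECONFIG"
  else
    echo "helm $*"
  fi
}

# Example:
#   ECRAG_KUBECONFIG=/home/user/.kube/nas.yaml helm_cmd install edgecraftrag ./edgecraftrag
```

Before a real install, `helm lint ./edgecraftrag` and `helm template edgecraftrag ./edgecraftrag` render the chart locally without touching the cluster, which helps catch mistakes in the values early.
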
## Verification

### Accessing the Web UI

Once the service is running, you can access the UI from your browser.

1. **Identify the Port**:
   Check the `nodePort` configured in the `edgecraftrag/values.yaml` file. This is the external access port.

2. **Identify the IP**:
   Use the IP address of the Kubernetes node where the deployment is running.
   * If running on your local machine (e.g., MicroK8s), use `localhost` or your machine's LAN IP.
   * If running on a remote cluster, use that node's IP.

3. **Open in Browser**:
   Navigate to `http://<NodeIP>:<NodePort>`
   > Example: `http://192.168.1.5:31234`

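
The URL from step 3 can also be built and probed from a shell. This is a minimal sketch: `ui_url` is a hypothetical helper, and the IP and port are the example values from this README, not chart defaults.

```bash
#!/usr/bin/env bash
# Hypothetical helper: build the UI URL from a node IP and a nodePort.
ui_url() {
  printf 'http://%s:%s\n' "$1" "$2"
}

# Example (builds the URL, then asks curl for the HTTP status code):
#   url="$(ui_url 192.168.1.5 31234)"
#   curl -s -o /dev/null -w '%{http_code}\n' "$url"
```

If the page does not load, `kubectl get pods -l app=ecrag` (and likewise `app=edgecraftrag-server` or `app=llm-serving-xpu`) shows whether the pods are running; these labels come from the chart's own templates.
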
## Uninstallation

To uninstall (delete) the `edgecraftrag` release:

```bash
helm uninstall edgecraftrag
```

If multiple clusters are available, uninstall the chart with a specific kubeconfig, e.g.:

```bash
helm uninstall edgecraftrag --kubeconfig /home/user/.kube/nas.yaml
```
Lines changed: 94 additions & 0 deletions
@@ -0,0 +1,94 @@
# EdgeCraft RAG Helm Chart

This document describes how to deploy EdgeCraft RAG (ecrag) on a Kubernetes cluster using the Helm chart.

## Prerequisites

- You need a running Kubernetes cluster.
- You need Helm installed.
- The required Docker images are available in your image registry or locally.

## Configuration

Before installing, configure the `edgecraftrag/values.yaml` file for your environment.

### Key Configurations

1. **Images**: Set the image registry and tag for `ecrag` and `vllm`.
   ```yaml
   image:
     ecrag:
       registry: <your-registry>
       tag: <your-tag>
     vllm:
       registry: <your-registry>
       tag: <your-tag>
   ```

2. **Environment Variables**: Configure the proxies and host IP.
   ```yaml
   env:
     http_proxy: "http://proxy:port"
     https_proxy: "http://proxy:port"
     HOST_IP: "<node-ip>"
   ```

3. **LLM Settings**: Adjust the LLM model path and parameters.
   ```yaml
   llm:
     LLM_MODEL: "/path/to/model/inside/container" # Ensure this path maps to paths.model
   ```

4. **Persistent Paths**: Ensure the host mount paths exist.
   ```yaml
   paths:
     model: /home/user/models
     docs: /home/user/docs
   ```

## Installation

Use the following commands to install the chart (with `edgecraftrag` as the example release name):

```bash
cd kubernetes/helm
helm install edgecraftrag ./edgecraftrag
```

If multiple clusters are available, install the chart with the specified kubeconfig, for example:

```bash
helm install edgecraftrag ./edgecraftrag --kubeconfig /home/user/.kube/nas.yaml
```

## Verification

### Accessing the Web UI

Once the service is running, you can access the UI through a browser.

1. **Identify the Port**:
   Check the `nodePort` configured in the `edgecraftrag/values.yaml` file. This is the external access port.

2. **Identify the IP**:
   Use the IP address of the Kubernetes node where the deployment runs.
   * If running on a local machine (e.g., MicroK8s), use `localhost` or your machine's LAN IP.
   * If running on a remote cluster, use that node's IP.

3. **Open in Browser**:
   Go to `http://<NodeIP>:<NodePort>`
   > Example: `http://192.168.1.5:31234`

## Uninstallation

To uninstall (delete) the deployed `edgecraftrag` release:

```bash
helm uninstall edgecraftrag
```

If multiple clusters are available, uninstall the chart with the specified kubeconfig, for example:

```bash
helm uninstall edgecraftrag --kubeconfig /home/user/.kube/nas.yaml
```
Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
apiVersion: v1
kind: ConfigMap
metadata:
  name: edgecraftrag-env
data:
  # Common environment variables
  no_proxy: "{{ .Values.env.no_proxy }}"
  http_proxy: "{{ .Values.env.http_proxy }}"
  https_proxy: "{{ .Values.env.https_proxy }}"
  HOST_IP: "{{ .Values.env.HOST_IP }}"
  ENABLE_BENCHMARK: "{{ .Values.env.ENABLE_BENCHMARK }}"
  CHAT_HISTORY_ROUND: "{{ .Values.env.CHAT_HISTORY_ROUND }}"
  METADATA_DATABASE_URL: "{{ .Values.env.METADATA_DATABASE_URL }}"
  MEGA_SERVICE_PORT: "{{ .Values.ports.mega }}"
  PIPELINE_SERVICE_HOST_IP: edgecraftrag-server
  PIPELINE_SERVICE_PORT: "{{ .Values.ports.pipeline }}"
  UI_SERVICE_PORT: "{{ .Values.ports.ui.port }}"
  VLLM_SERVICE_PORT_B60: "{{ .Values.ports.vllm }}"

  # llm-serving-xpu specific environment variables
  LLM_MODEL: "{{ .Values.llm.LLM_MODEL }}"
  DTYPE: "{{ .Values.llm.DTYPE }}"
  ZE_AFFINITY_MASK: "{{ .Values.llm.ZE_AFFINITY_MASK }}"
  ENFORCE_EAGER: "{{ .Values.llm.ENFORCE_EAGER }}"
  TRUST_REMOTE_CODE: "{{ .Values.llm.TRUST_REMOTE_CODE }}"
  DISABLE_SLIDING_WINDOW: "{{ .Values.llm.DISABLE_SLIDING_WINDOW }}"
  GPU_MEMORY_UTIL: "{{ .Values.llm.GPU_MEMORY_UTIL }}"
  NO_ENABLE_PREFIX_CACHING: "{{ .Values.llm.NO_ENABLE_PREFIX_CACHING }}"
  MAX_NUM_BATCHED_TOKENS: "{{ .Values.llm.MAX_NUM_BATCHED_TOKENS }}"
  MAX_MODEL_LEN: "{{ .Values.llm.MAX_MODEL_LEN }}"
  DISABLE_LOG_REQUESTS: "{{ .Values.llm.DISABLE_LOG_REQUESTS }}"
  BLOCK_SIZE: "{{ .Values.llm.BLOCK_SIZE }}"
  QUANTIZATION: "{{ .Values.llm.QUANTIZATION }}"
  TP: "{{ .Values.llm.TP }}"
  DP: "{{ .Values.llm.DP }}"
Lines changed: 58 additions & 0 deletions
@@ -0,0 +1,58 @@
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: edgecraftrag-server
spec:
  selector:
    matchLabels:
      app: edgecraftrag-server
  template:
    metadata:
      labels:
        app: edgecraftrag-server
    spec:
      securityContext:
        runAsUser: 1000
        runAsGroup: 1000
        supplementalGroups:
          - {{ .Values.gpu.groups.video }}
          - {{ .Values.gpu.groups.render }}
      containers:
        - name: edgecraftrag-server
          image: "{{ .Values.image.ecrag.registry }}/edgecraftrag-server:{{ .Values.image.ecrag.tag }}"
          imagePullPolicy: IfNotPresent
          envFrom:
            - configMapRef:
                name: edgecraftrag-env
          env:
            - name: PIPELINE_SERVICE_HOST_IP
              value: "0.0.0.0"
          ports:
            - containerPort: {{ .Values.ports.pipeline }}
          volumeMounts:
            - name: model-path
              mountPath: /home/user/models
            - name: docs-path
              mountPath: /home/user/docs
            - name: tmpfile-path
              mountPath: /home/user/ui_cache
            - name: prompt-path
              mountPath: /templates/custom
            - name: dri-device
              mountPath: /dev/dri
      volumes:
        - name: model-path
          hostPath:
            path: "{{ .Values.paths.model }}"
        - name: docs-path
          hostPath:
            path: "{{ .Values.paths.docs }}"
        - name: tmpfile-path
          hostPath:
            path: "{{ .Values.paths.tmpfile }}"
        - name: prompt-path
          hostPath:
            path: "{{ .Values.paths.prompt }}"
        - name: dri-device
          hostPath:
            path: /dev/dri
Lines changed: 58 additions & 0 deletions
@@ -0,0 +1,58 @@
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: llm-serving-xpu
spec:
  selector:
    matchLabels:
      app: llm-serving-xpu
  template:
    metadata:
      labels:
        app: llm-serving-xpu
    spec:
      securityContext:
        runAsUser: 1000
        runAsGroup: 1000
        supplementalGroups:
          - {{ .Values.gpu.groups.video }}
          - {{ .Values.gpu.groups.render }}
      containers:
        - name: llm-serving-xpu
          image: "{{ .Values.image.vllm.registry }}/llm-scaler-vllm:{{ .Values.image.vllm.tag }}"
          imagePullPolicy: IfNotPresent
          command:
            - "/bin/bash"
            - "-c"
            - "cd /workspace/vllm/models && source /opt/intel/oneapi/setvars.sh --force && \
              VLLM_OFFLOAD_WEIGHTS_BEFORE_QUANT=1 TORCH_LLM_ALLREDUCE=1 VLLM_USE_V1=1 \
              CCL_ZE_IPC_EXCHANGE=pidfd VLLM_ALLOW_LONG_MAX_MODEL_LEN=1 VLLM_WORKER_MULTIPROC_METHOD=spawn \
              python3 -m vllm.entrypoints.openai.api_server \
              --model $LLM_MODEL --dtype $DTYPE --enforce-eager --port $VLLM_SERVICE_PORT_B60 \
              --trust-remote-code --disable-sliding-window --gpu-memory-util $GPU_MEMORY_UTIL \
              --no-enable-prefix-caching --max-num-batched-tokens $MAX_NUM_BATCHED_TOKENS \
              --disable-log-requests --max-model-len $MAX_MODEL_LEN --block-size $BLOCK_SIZE \
              --quantization $QUANTIZATION -tp=$TP -dp=$DP"
          envFrom:
            - configMapRef:
                name: edgecraftrag-env
          ports:
            - containerPort: {{ .Values.ports.vllm }}
          securityContext:
            privileged: true
          volumeMounts:
            - name: model-path
              mountPath: /workspace/vllm/models
            - name: dri-device
              mountPath: /dev/dri
      volumes:
        - name: model-path
          hostPath:
            path: "{{ .Values.paths.model }}"
        - name: dri-device
          hostPath:
            path: /dev/dri
      tolerations:
        - key: "gpu"
          operator: "Exists"
          effect: "NoSchedule"
Lines changed: 45 additions & 0 deletions
@@ -0,0 +1,45 @@
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ecrag
spec:
  replicas: {{ .Values.replica.ecrag }}
  selector:
    matchLabels:
      app: ecrag
  template:
    metadata:
      labels:
        app: ecrag
    spec:
      containers:
        - name: ecrag
          image: "{{ .Values.image.ecrag.registry }}/edgecraftrag:{{ .Values.image.ecrag.tag }}"
          imagePullPolicy: IfNotPresent
          envFrom:
            - configMapRef:
                name: edgecraftrag-env
          ports:
            - containerPort: {{ .Values.ports.mega }}
          volumeMounts:
            - name: model-path
              mountPath: /home/user/models
            - name: docs-path
              mountPath: /home/user/docs
            - name: tmpfile-path
              mountPath: /home/user/ui_cache
            - name: prompt-path
              mountPath: /templates/custom
      volumes:
        - name: model-path
          hostPath:
            path: "{{ .Values.paths.model }}"
        - name: docs-path
          hostPath:
            path: "{{ .Values.paths.docs }}"
        - name: tmpfile-path
          hostPath:
            path: "{{ .Values.paths.tmpfile }}"
        - name: prompt-path
          hostPath:
            path: "{{ .Values.paths.prompt }}"
