opea-project · Yongbozzz · Mar 12, 2026 · Mar 11, 2026 · Mar 12, 2026 · Mar 12, 2026
@@ -6,7 +6,7 @@ RUN apt-get remove -y libze-intel-gpu1 libigc1 libigdfcl1 libze-dev || true; \
     apt-get update; \
     apt-get install -y curl
 RUN curl -sL 'https://keyserver.ubuntu.com/pks/lookup?fingerprint=on&op=get&search=0x0C0E6AF955CE463C03FC51574D098D70AFBE5E1F' | tee /etc/apt/trusted.gpg.d/driver.asc
-RUN echo -e "Types: deb\nURIs: https://ppa.launchpadcontent.net/kobuk-team/intel-graphics/ubuntu/\nSuites: plucky\nComponents: main\nSigned-By: /etc/apt/trusted.gpg.d/driver.asc" > /etc/apt/sources.list.d/driver.sources
+RUN echo -e "Types: deb\nURIs: https://ppa.launchpadcontent.net/kobuk-team/intel-graphics/ubuntu/\nSuites: questing\nComponents: main\nSigned-By: /etc/apt/trusted.gpg.d/driver.asc" > /etc/apt/sources.list.d/driver.sources
 RUN apt-get update && apt-get install -y libze-intel-gpu1 libze1 intel-metrics-discovery intel-opencl-icd clinfo intel-gsc && apt-get install -y libze-intel-gpu1 libze1 intel-metrics-discovery intel-opencl-icd clinfo intel-gsc && apt-get install -y libze-dev intel-ocloc libze-intel-gpu-raytracing
 
 RUN useradd -m -s /bin/bash user && \
@@ -18,11 +18,13 @@ RUN mkdir /templates && \
 COPY ./edgecraftrag/prompt_template/default_prompt.txt /templates/
 RUN chown -R user /templates/default_prompt.txt
 
-COPY ./edgecraftrag /home/user/edgecraftrag
-
-RUN mkdir -p /home/user/ui_cache
+RUN mkdir -p /home/user/ui_cache /home/user/edgecraftrag
 ENV UI_UPLOAD_PATH=/home/user/ui_cache
 
+# Copy requirements first so pip install is cached independently from source changes
+COPY ./edgecraftrag/requirements.txt /home/user/edgecraftrag/requirements.txt
+RUN chown -R user /home/user/edgecraftrag
+
 USER user
 
 WORKDIR /home/user/edgecraftrag
@@ -37,4 +39,7 @@ ENV PYTHONPATH="$PYTHONPATH:/home/user/genai/tools/llm_bench"
 
 RUN python3 -m nltk.downloader -d /home/user/nltk_data punkt_tab averaged_perceptron_tagger_eng
 
+# Copy the full source last so that source changes no longer invalidate the pip cache layers above
+COPY ./edgecraftrag /home/user/edgecraftrag
+
 ENTRYPOINT ["python3", "-m", "edgecraftrag.server"]
@@ -7,10 +7,9 @@ quality and performance.
 
 ## What's New
 
-1. Support Agent component and enable deep_search agent
-2. Optimize pipeline execution performance with asynchronous api
-3. Support session list display in UI
-4. Support vllm-based embedding service
+1. Support decoupled operation of the pipeline and knowledge base
+2. Optimize the agentic workflow user experience
+3. Enhance the User Guide
 
 ## Table of contents
 

@@ -1,8 +1,10 @@
 # Example Edge Craft Retrieval-Augmented Generation Deployment on Intel® Arc® Platform
 
-This document outlines the deployment process for Edge Craft Retrieval-Augmented Generation service on Intel Arc server. This example includes the following sections:
+[中文版](README_zh.md)
 
-- [EdgeCraftRAG Quick Start Deployment](#edgecraftrag-quick-start-deployment): Demonstrates how to quickly deploy a Edge Craft Retrieval-Augmented Generation service/pipeline on Intel® Arc® platform.
+This document outlines the deployment process for the Edge Craft Retrieval-Augmented Generation service on the Intel® Arc® Platform. This example includes the following sections:
+
+- [EdgeCraftRAG Quick Start Deployment](#edgecraftrag-quick-start-deployment): Demonstrates how to quickly deploy an Edge Craft Retrieval-Augmented Generation service/pipeline on the Intel® Arc® Platform.
 - [EdgeCraftRAG Docker Compose Files](#edgecraftrag-docker-compose-files): Describes some example deployments and their docker compose files.
 - [EdgeCraftRAG Service Configuration](#edgecraftrag-service-configuration): Describes the service and possible configuration changes.
 
@@ -12,23 +14,31 @@ This section describes how to quickly deploy and test the EdgeCraftRAG service m
 
 1. [Prerequisites](#1-prerequisites)
 2. [Access the Code](#2-access-the-code)
-3. [Prepare models](#3-prepare-models)
-4. [Prepare env variables and configurations](#4-prepare-env-variables-and-configurations)
-5. [Deploy the Service on Arc GPU Using Docker Compose](#5-deploy-the-service-on-intel-gpu-using-docker-compose)
-6. [Access UI](#6-access-ui)
-7. [Cleanup the Deployment](#7-cleanup-the-deployment)
+3. [Run quick_start.sh](#3-run-quick_startsh)
+4. [Access UI](#4-access-ui)
+5. [Cleanup the Deployment](#5-cleanup-the-deployment)
 
 ### 1. Prerequisites
 
-EC-RAG supports vLLM deployment(default method) and local OpenVINO deployment for Intel Arc GPU. Prerequisites are shown as below:  
-Hardware: Intel Arc A770  
-OS: Ubuntu Server 22.04.1 or newer (at least 6.2 LTS kernel)  
-Driver & libraries: please to [Installing GPUs Drivers](https://dgpu-docs.intel.com/driver/installation-rolling.html#installing-gpu-drivers) for detailed driver & libraries setup
+EC-RAG supports vLLM deployment (default method) and local OpenVINO deployment on Intel Arc GPUs and the Core Ultra platform. The prerequisites for each platform are listed below:
+
+#### Core Ultra
+
+**OS**: Ubuntu 24.04 or newer  
+**Driver & libraries**: Please refer to [Installing Client GPUs on Ubuntu Desktop](https://dgpu-docs.intel.com/driver/client/overview.html#installing-client-gpus-on-ubuntu-desktop)  
+**Available Inferencing Framework**: OpenVINO
+
+#### Intel Arc B60
 
-Hardware: Intel Arc B60  
-please to [Install Native Environment](https://github.com/intel/llm-scaler/tree/main/vllm#11-install-native-environment) for detailed setup
+**OS**: Ubuntu 25.04 Desktop (for Core Ultra and Xeon-W), Ubuntu 25.04 Server (for Xeon-SP).  
+**Driver & libraries**: Please refer to [Install Bare Metal Environment](https://github.com/intel/llm-scaler/tree/main/vllm#11-install-bare-metal-environment) for detailed setup  
+**Available Inferencing Framework**: OpenVINO, vLLM
 
-Below steps are based on **vLLM** as inference engine, if you want to choose **OpenVINO**, please refer to [OpenVINO Local Inference](../../../../docs/Advanced_Setup.md#openvino-local-inference)
+#### Intel Arc A770
+
+**OS**: Ubuntu Server 22.04.1 or newer (at least 6.2 LTS kernel)  
+**Driver & libraries**: Please refer to [Installing GPUs Drivers](https://dgpu-docs.intel.com/driver/installation-rolling.html#installing-gpu-drivers) for detailed driver & libraries setup  
+**Available Inferencing Framework**: OpenVINO, vLLM
 
 ### 2. Access the Code
 
@@ -39,123 +49,54 @@ git clone https://github.com/opea-project/GenAIExamples.git
 cd GenAIExamples/EdgeCraftRAG
 ```
 
-Checkout a released version, such as v1.5:
-
-```
-git checkout v1.5
-```
-
-### 3. Prepare models
+> **NOTE**: To check out a released version, such as v1.5, run:
+>
+> ```
+> git checkout v1.5
+> ```
 
-```bash
-# Prepare models for embedding, reranking:
-export MODEL_PATH="${PWD}/models" # Your model path for embedding, reranking and LLM models
-mkdir -p $MODEL_PATH
-pip install --upgrade --upgrade-strategy eager "optimum[openvino]"
-optimum-cli export openvino -m BAAI/bge-small-en-v1.5 ${MODEL_PATH}/BAAI/bge-small-en-v1.5 --task sentence-similarity
-optimum-cli export openvino -m BAAI/bge-reranker-large ${MODEL_PATH}/BAAI/bge-reranker-large --task text-classification
-
-# Prepare LLM model
-export LLM_MODEL="Qwen/Qwen3-8B" # Your model id
-pip install modelscope
-modelscope download --model $LLM_MODEL --local_dir "${MODEL_PATH}/${LLM_MODEL}"
-# Optionally, you can also download models with huggingface:
-# pip install -U huggingface_hub
-# huggingface-cli download $LLM_MODEL --local-dir "${MODEL_PATH}/${LLM_MODEL}"
-```
+### 3. Run quick_start.sh
 
-### 4. Prepare env variables and configurations
-
-#### Prepare env variables for vLLM deployment
+Run the quick start script from the `EdgeCraftRAG` root directory:
 
 ```bash
-ip_address=$(hostname -I | awk '{print $1}')
-# Use `ip a` to check your active ip
-export HOST_IP=$ip_address # Your host ip
-
-# Check group id of video and render
-export VIDEOGROUPID=$(getent group video | cut -d: -f3)
-export RENDERGROUPID=$(getent group render | cut -d: -f3)
-
-# If you have a proxy configured, execute below line
-export no_proxy=${no_proxy},${HOST_IP},edgecraftrag,edgecraftrag-server
-export NO_PROXY=${NO_PROXY},${HOST_IP},edgecraftrag,edgecraftrag-server
-# If you have a HF mirror configured, it will be imported to the container
-# export HF_ENDPOINT=https://hf-mirror.com # your HF mirror endpoint"
-
-# Make sure all 3 folders have 1000:1000 permission, otherwise
-export DOC_PATH=${PWD}/tests
-export TMPFILE_PATH=${PWD}/tests
-chown 1000:1000 ${MODEL_PATH} ${DOC_PATH} ${TMPFILE_PATH}
-# In addition, also make sure the .cache folder has 1000:1000 permission, otherwise
-chown 1000:1000 -R $HOME/.cache
+./tools/quick_start.sh
 ```
 
-For more advanced env variables and configurations, please refer to [Prepare env variables for vLLM deployment](../../../../docs/Advanced_Setup.md#prepare-env-variables-for-vllm-deployment)
+The script is located in the `tools` directory. For detailed usage of `quick_start.sh` and `build_images.sh`, please refer to [tools/README.md](../../../../tools/README.md).
 
-### 5. Deploy the Service on Intel GPU Using Docker Compose
+When no environment variables are configured, the script starts a local OpenVINO deployment by default.
 
-set Milvus DB and chat history round for inference:
+If you prefer to prepare the models, set the environment variables, and run docker compose manually, please refer to [Manual deployment details in Advanced Setup](../../../../docs/Advanced_Setup.md#manual-deployment-details-for-arc-platform).
 
-```bash
-# EC-RAG support Milvus as persistent database, by default milvus is disabled, you can choose to set MILVUS_ENABLED=1 to enable it
-export MILVUS_ENABLED=0
-# If you enable Milvus, the default storage path is PWD, uncomment if you want to change:
-# export DOCKER_VOLUME_DIRECTORY= # change to your preference
+### 4. Access UI
 
-# EC-RAG support chat history round setting, by default chat history is disabled, you can set CHAT_HISTORY_ROUND to control it
-# export CHAT_HISTORY_ROUND= # change to your preference
-```
-
-#### option a. Deploy the Service on Arc A770 Using Docker Compose
-
-```bash
-export VLLM_SERVICE_PORT_A770=8086 # You can set your own port for vllm service
-
-# Launch EC-RAG service with compose
-docker compose --profile a770 -f docker_compose/intel/gpu/arc/compose.yaml up -d
-```
+Open your browser and go to http://${HOST_IP}:8082
 
-#### option b. Deploy the Service on Arc B60 Using Docker Compose
+After startup completes, `quick_start.sh` will print:
 
-```bash
-# Besides MILVUS_ENABLED and CHAT_HISTORY_ROUND, below environments are exposed for vLLM config, you can change them to your preference:
-# export VLLM_SERVICE_PORT_B60=8086
-# export DTYPE=float16
-# export TP=1 # for multi GPU, you can change TP value
-# export DP=1
-# export ZE_AFFINITY_MASK=0 # for multi GPU, you can export ZE_AFFINITY_MASK=0,1,2...
-# export ENFORCE_EAGER=1
-# export TRUST_REMOTE_CODE=1
-# export DISABLE_SLIDING_WINDOW=1
-# export GPU_MEMORY_UTIL=0.8
-# export NO_ENABLE_PREFIX_CACHING=1
-# export MAX_NUM_BATCHED_TOKENS=8192
-# export DISABLE_LOG_REQUESTS=1
-# export MAX_MODEL_LEN=49152
-# export BLOCK_SIZE=64
-# export QUANTIZATION=fp8
-docker compose --profile b60 -f docker_compose/intel/gpu/arc/compose.yaml up -d
+```text
+Service launched successfully.
+UI access URL: http://${HOST_IP}:8082
+If you are accessing from another machine, replace ${HOST_IP} with your server's reachable IP or hostname.
 ```
 
-### 6. Access UI
-
-Open your browser, access http://${HOST_IP}:8082
-
 > Your browser should be running on the same host of your console, otherwise you will need to access UI with your host domain name instead of ${HOST_IP}.
 
 Below is the UI front page, for detailed operations on UI and EC-RAG settings, please refer to [Explore_Edge_Craft_RAG](../../../../docs/Explore_Edge_Craft_RAG.md)
 ![front_page](../../../../assets/img/front_page.png)
 
-### 7. Cleanup the Deployment
+### 5. Cleanup the Deployment
 
-To stop the containers associated with the deployment, execute the following command:
+To stop the containers associated with the deployment, run the helper script with the `cleanup` argument:
 
+```bash
+./tools/quick_start.sh cleanup
 ```
-docker compose -f docker_compose/intel/gpu/arc/compose.yaml down
-```
 
-All the EdgeCraftRAG containers will be stopped and then removed on completion of the "down" command.
+All the EdgeCraftRAG containers will be stopped and then removed on completion.
+
+If you prefer the manual docker compose cleanup command, please refer to [Manual cleanup details in Advanced Setup](../../../../docs/Advanced_Setup.md#6-cleanup-the-deployment-manual).
 
 ## EdgeCraftRAG Docker Compose Files
 

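The "Access UI" step in the README change above relies on `${HOST_IP}`, which the removed manual instructions derived with `hostname -I`. The following is a minimal sketch of how that URL could be assembled by hand. It is not taken from `quick_start.sh`; the `ui_url` function and the `UI_PORT` override are hypothetical names introduced only for illustration, and `hostname -I` is assumed to be available (Linux).

```bash
#!/bin/sh
# Hypothetical helper: build the UI URL in the form the README describes,
# http://${HOST_IP}:8082.
ui_url() {
  # Prefer HOST_IP from the environment; otherwise take the first address
  # reported by `hostname -I` (assumption: Linux host).
  host="${HOST_IP:-$(hostname -I 2>/dev/null | awk '{print $1}')}"
  # Fall back to localhost if no address could be determined.
  host="${host:-localhost}"
  # UI_PORT is a hypothetical override; 8082 is the port named in the README.
  echo "http://${host}:${UI_PORT:-8082}"
}

ui_url
```

When browsing from another machine, setting `HOST_IP` to the server's reachable IP or hostname before calling the function mirrors the advice printed by the script.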
@@ -0,0 +1,125 @@
+# 在 Intel® Arc® 平台上部署 Edge Craft 检索增强生成（EC-RAG）示例
+
+[English](README.md)
+
+本文档介绍了在 Intel® Arc® 平台上部署 Edge Craft 检索增强生成服务的流程。该示例包含以下部分：
+
+- [EdgeCraftRAG 快速开始部署](#edgecraftrag-快速开始部署)：演示如何在 Intel® Arc® 平台上快速部署 Edge Craft 检索增强生成服务/流水线。
+- [EdgeCraftRAG Docker Compose 文件](#edgecraftrag-docker-compose-文件)：说明一些示例部署及其 docker compose 文件。
+- [EdgeCraftRAG 服务配置](#edgecraftrag-服务配置)：说明服务以及可进行的配置变更。
+
+## EdgeCraftRAG 快速开始部署
+
+本节介绍如何在 Intel® Arc® 平台上手动快速部署并测试 EdgeCraftRAG 服务。基本步骤如下：
+
+1. [前置条件](#1-前置条件)
+2. [获取代码](#2-获取代码)
+3. [运行 quick_start.sh](#3-运行-quick_startsh)
+4. [访问 UI](#4-访问-ui)
+5. [清理部署](#5-清理部署)
+
+### 1. 前置条件
+
+EC-RAG 支持 vLLM 部署（默认方式）以及面向 Intel Arc GPU 和 Core Ultra 平台的本地 OpenVINO 部署。前置条件如下：
+
+#### Core Ultra
+
+**操作系统**：Ubuntu 24.04 或更高版本  
+**驱动与库**：请参考 [Installing Client GPUs on Ubuntu Desktop](https://dgpu-docs.intel.com/driver/client/overview.html#installing-client-gpus-on-ubuntu-desktop)  
+**可用推理框架**：OpenVINO
+
+#### Intel Arc B60
+
+**操作系统**：Ubuntu 25.04 Desktop（适用于 Core Ultra 和 Xeon-W），Ubuntu 25.04 Server（适用于 Xeon-SP）。  
+**驱动与库**：详细安装请参考 [Install Bare Metal Environment](https://github.com/intel/llm-scaler/tree/main/vllm#11-install-bare-metal-environment)  
+**可用推理框架**：OpenVINO、vLLM
+
+#### Intel Arc A770
+
+**操作系统**：Ubuntu Server 22.04.1 或更高版本（至少 6.2 LTS 内核）  
+**驱动与库**：详细驱动与库安装请参考 [Installing GPUs Drivers](https://dgpu-docs.intel.com/driver/installation-rolling.html#installing-gpu-drivers)  
+**可用推理框架**：OpenVINO、vLLM
+
+### 2. 获取代码
+
+克隆 GenAIExamples 仓库，并进入 EdgeCraftRAG 在 Intel® Arc® 平台上的 Docker Compose 文件与配套脚本目录：
+
+```
+git clone https://github.com/opea-project/GenAIExamples.git
+cd GenAIExamples/EdgeCraftRAG
+```
+
+> **注意**：如果你想切换到某个发布版本，例如 v1.5：
+>
+> ```
+> git checkout v1.5
+> ```
+
+### 3. 运行 quick_start.sh
+
+在 `EdgeCraftRAG` 根目录下运行快速启动脚本：
+
+```bash
+./tools/quick_start.sh
+```
+
+该脚本位于 `tools` 目录。有关 `quick_start.sh` 和 `build_images.sh` 的详细用法，请参考 [tools/README_zh.md](../../../../tools/README_zh.md)。
+
+在不配置任何环境变量时，脚本默认启动本地 OpenVINO 部署。
+
+如果你希望使用手动方式（模型准备、环境变量配置、Docker Compose 启动），请参考 [Advanced Setup 中的手动部署说明](../../../../docs/Advanced_Setup_zh.md#arc-平台手动部署详细说明)。
+
+### 4. 访问 UI
+
+打开浏览器访问 http://${HOST_IP}:8082
+
+启动完成后，`quick_start.sh` 会输出：
+
+```text
+Service launched successfully.
+UI access URL: http://${HOST_IP}:8082
+If you are accessing from another machine, replace ${HOST_IP} with your server's reachable IP or hostname.
+```
+
+> 浏览器应运行在与控制台相同的主机上；否则你需要使用主机域名而不是 ${HOST_IP} 来访问 UI。
+
+下图为 UI 首页。有关 UI 操作和 EC-RAG 设置的详细说明，请参考 [Explore_Edge_Craft_RAG](../../../../docs/Explore_Edge_Craft_RAG_zh.md)
+![front_page](../../../../assets/img/front_page.png)
+
+### 5. 清理部署
+
+若要停止与本次部署关联的容器，请执行脚本命令：
+
+```bash
+./tools/quick_start.sh cleanup
+```
+
+执行完成后，所有 EdgeCraftRAG 容器都会停止并被移除。
+
+如果你希望使用手动 docker compose 清理命令，请参考 [Advanced Setup 中的手动清理说明](../../../../docs/Advanced_Setup_zh.md#6-清理部署手动)。
+
+## EdgeCraftRAG Docker Compose 文件
+
+`compose.yaml` 是默认的 compose 文件，使用 tgi 作为服务框架。
+
+| 服务名称            | 镜像名称                                 |
+| ------------------- | ---------------------------------------- |
+| etcd                | quay.io/coreos/etcd:v3.5.5               |
+| minio               | minio/minio:RELEASE.2023-03-20T20-16-18Z |
+| milvus-standalone   | milvusdb/milvus:v2.4.6                   |
+| edgecraftrag-server | opea/edgecraftrag-server:latest          |
+| edgecraftrag-ui     | opea/edgecraftrag-ui:latest              |
+| ecrag               | opea/edgecraftrag:latest                 |
+
+## EdgeCraftRAG 服务配置
+
+下表全面概述了示例 Docker Compose 文件中各类部署所使用的 EdgeCraftRAG 服务。表中每一行代表一个独立服务，详细说明了可用镜像及其在部署架构中的功能描述。
+
+| 服务名称            | 可选镜像名称                             | 可选 | 描述                                                       |
+| ------------------- | ---------------------------------------- | ---- | ---------------------------------------------------------- |
+| etcd                | quay.io/coreos/etcd:v3.5.5               | 否   | 提供分布式键值存储，用于服务发现和配置管理。               |
+| minio               | minio/minio:RELEASE.2023-03-20T20-16-18Z | 否   | 提供对象存储服务，用于存储文档和模型文件。                 |
+| milvus-standalone   | milvusdb/milvus:v2.4.6                   | 否   | 提供向量数据库能力，用于管理 embedding 和相似度检索。      |
+| edgecraftrag-server | opea/edgecraftrag-server:latest          | 否   | 作为 EdgeCraftRAG 服务后端，具体形态随部署方式不同而变化。 |
+| edgecraftrag-ui     | opea/edgecraftrag-ui:latest              | 否   | 提供 EdgeCraftRAG 服务的用户界面。                         |
+| ecrag               | opea/edgecraftrag:latest                 | 否   | 作为反向代理，管理 UI 与后端服务之间的流量。               |
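
The compose.yaml hunk that follows switches three settings to shorter host variables (`TP`, `QUANTIZATION`, `ZE_AFFINITY_MASK`) while keeping the `${VAR:-default}` fallback form. As a rough illustration of how those fallbacks resolve, the sketch below reproduces the same expansions in plain shell. The `resolve_vllm_env` function is a hypothetical name used only here; the variable names and defaults are copied from the hunk.

```bash
#!/bin/sh
# Hypothetical helper: print the values the vLLM service would receive,
# using the same ${VAR:-default} fallbacks as the compose file.
resolve_vllm_env() {
  echo "TENSOR_PARALLEL_SIZE=${TP:-1}"
  echo "LOAD_IN_LOW_BIT=${QUANTIZATION:-fp8}"
  echo "ZE_AFFINITY_MASK=${ZE_AFFINITY_MASK:-0}"
}

# Nothing set: every value falls back to its default.
( unset TP QUANTIZATION ZE_AFFINITY_MASK; resolve_vllm_env )

# Two GPUs with tensor parallelism; quantization keeps its default.
( TP=2; ZE_AFFINITY_MASK=0,1; unset QUANTIZATION; resolve_vllm_env )
```

With the `:-` form, a variable that is set but empty also falls back to the default, so `export TP=` behaves the same as leaving `TP` unset. Anyone who previously exported `TENSOR_PARALLEL_SIZE`, `LOAD_IN_LOW_BIT`, or `SELECTED_XPU_0` on the host would presumably need to switch to the new names for the values to take effect.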
@@ -207,14 +207,14 @@ services:
       https_proxy: ${https_proxy}
       MODEL_PATH: "/llm/models"
       SERVED_MODEL_NAME: ${LLM_MODEL}
-      TENSOR_PARALLEL_SIZE: ${TENSOR_PARALLEL_SIZE:-1}
+      TENSOR_PARALLEL_SIZE: ${TP:-1}
       MAX_NUM_SEQS: ${MAX_NUM_SEQS:-64}
       MAX_NUM_BATCHED_TOKENS: ${MAX_NUM_BATCHED_TOKENS:-10240}
       MAX_MODEL_LEN: ${MAX_MODEL_LEN:-10240}
-      LOAD_IN_LOW_BIT: ${LOAD_IN_LOW_BIT:-fp8}
+      LOAD_IN_LOW_BIT: ${QUANTIZATION:-fp8}
       CCL_DG2_USM: ${CCL_DG2_USM:-""}
       PORT: ${VLLM_SERVICE_PORT_A770:-8086}
-      ZE_AFFINITY_MASK: ${SELECTED_XPU_0:-0}
+      ZE_AFFINITY_MASK: ${ZE_AFFINITY_MASK:-0}
     shm_size: '32g'
     entrypoint: /bin/bash -c "\
       cd /llm && \