Merged
24 commits
4490094
update readme and configs
daniel-de-leon-user293 May 7, 2025
95505cf
Merge branch 'main' into daniel/docsum-dr
daniel-de-leon-user293 May 7, 2025
ab6482e
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 7, 2025
a644b90
Merge branch 'main' into daniel/docsum-dr
daniel-de-leon-user293 May 8, 2025
e1e0e46
add updates to cpu and fix typo
daniel-de-leon-user293 May 8, 2025
c49cf1c
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 8, 2025
2b220f0
Merge branch 'main' into daniel/docsum-dr
ashahba May 9, 2025
49df965
update OPEA version
daniel-de-leon-user293 May 9, 2025
53bf578
Merge branch 'main' into daniel/docsum-dr
ashahba May 11, 2025
ba1fbc4
Merge branch 'main' into daniel/docsum-dr
daniel-de-leon-user293 May 12, 2025
01a2cc2
default to run warmup
daniel-de-leon-user293 May 12, 2025
8ea4366
gaudi env vars
daniel-de-leon-user293 May 12, 2025
4434c89
Merge branch 'main' into daniel/docsum-dr
chensuyue May 13, 2025
1b14116
bring back gaudi env vars since it was the test issue
daniel-de-leon-user293 May 13, 2025
fce44ac
add more detail to vLLM warmup description
daniel-de-leon-user293 May 13, 2025
be1ca64
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 13, 2025
258ef72
add vLLM detail to gaudi README as well
daniel-de-leon-user293 May 13, 2025
b450c1e
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 13, 2025
92f7a48
get host_ip before set no_proxy
Zhenzhong1 May 14, 2025
b2e0225
Merge branch 'main' into daniel/docsum-dr
ashahba May 14, 2025
884f9e2
Merge branch 'main' into daniel/docsum-dr
ashahba May 14, 2025
3b8a536
Merge branch 'main' into daniel/docsum-dr
ashahba May 14, 2025
dcbf8fe
Merge branch 'main' into daniel/docsum-dr
ashahba May 15, 2025
4e0c2c8
remove xeon skip warmup
daniel-de-leon-user293 May 15, 2025
32 changes: 13 additions & 19 deletions DocSum/docker_compose/intel/cpu/xeon/README.md
@@ -21,35 +21,29 @@ This section describes how to quickly deploy and test the DocSum service manuall
6. [Test the Pipeline](#test-the-pipeline)
7. [Cleanup the Deployment](#cleanup-the-deployment)

### Access the Code
### Access the Code and Set Up Environment

Clone the GenAIExample repository and access the ChatQnA Intel Xeon platform Docker Compose files and supporting scripts:

```
```bash
git clone https://github.com/opea-project/GenAIExamples.git
cd GenAIExamples/DocSum/docker_compose/intel/cpu/xeon/
cd GenAIExamples/DocSum/docker_compose
source set_env.sh
cd intel/cpu/xeon/
```

Checkout a released version, such as v1.2:
NOTE: by default vLLM does "warmup" at start, to optimize its performance for the specified model and the underlying platform, which can take long time. For development (and e.g. autoscaling) it can be skipped with `export VLLM_SKIP_WARMUP=true`.

```
git checkout v1.2
Checkout a released version, such as v1.3:

```bash
git checkout v1.3
```

### Generate a HuggingFace Access Token

Some HuggingFace resources, such as some models, are only accessible if you have an access token. If you do not already have a HuggingFace access token, you can create one by first creating an account by following the steps provided at [HuggingFace](https://huggingface.co/) and then generating a [user access token](https://huggingface.co/docs/transformers.js/en/guides/private#step-1-generating-a-user-access-token).

### Configure the Deployment Environment

To set up environment variables for deploying DocSum services, source the _set_env.sh_ script in this directory:

```
source ./set_env.sh
```

The _set_env.sh_ script will prompt for required and optional environment variables used to configure the DocSum services. If a value is not entered, the script will use a default value for the same. It will also generate a _.env_ file defining the desired configuration. Consult the section on [DocSum Service configuration](#docsum-service-configuration) for information on how service specific configuration parameters affect deployments.

### Deploy the Services Using Docker Compose

To deploy the DocSum services, execute the `docker compose up` command with the appropriate arguments. For a default deployment, execute:
@@ -78,13 +72,13 @@ Please refer to the table below to build different microservices from source:

After running docker compose, check if all the containers launched via docker compose have started:

```
```bash
docker ps -a
```

For the default deployment, the following 5 containers should have started:

```
```bash
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
748f577b3c78 opea/whisper:latest "python whisper_s…" 5 minutes ago Up About a minute 0.0.0.0:7066->7066/tcp, :::7066->7066/tcp docsum-xeon-whisper-server
4eq8b7034fd9 opea/docsum-gradio-ui:latest "docker-entrypoint.s…" 5 minutes ago Up About a minute 0.0.0.0:5173->5173/tcp, :::5173->5173/tcp docsum-xeon-ui-server
@@ -109,7 +103,7 @@ curl -X POST http://${host_ip}:8888/v1/docsum \

To stop the containers associated with the deployment, execute the following command:

```
```bash
docker compose -f compose.yaml down
```

34 changes: 14 additions & 20 deletions DocSum/docker_compose/intel/hpu/gaudi/README.md
@@ -23,35 +23,29 @@ This section describes how to quickly deploy and test the DocSum service manuall
6. [Test the Pipeline](#test-the-pipeline)
7. [Cleanup the Deployment](#cleanup-the-deployment)

### Access the Code
### Access the Code and Set Up Environment

Clone the GenAIExample repository and access the ChatQnA Intel® Gaudi® platform Docker Compose files and supporting scripts:
Clone the GenAIExample repository and access the DocSum Intel® Gaudi® platform Docker Compose files and supporting scripts:

```
```bash
git clone https://github.com/opea-project/GenAIExamples.git
cd GenAIExamples/DocSum/docker_compose/intel/hpu/gaudi/
cd GenAIExamples/DocSum/docker_compose
source set_env.sh
cd intel/hpu/gaudi/
```

Checkout a released version, such as v1.2:
NOTE: by default vLLM does "warmup" at start, to optimize its performance for the specified model and the underlying platform, which can take long time. For development (and e.g. autoscaling) it can be skipped with `export VLLM_SKIP_WARMUP=true`.

```
git checkout v1.2
Checkout a released version, such as v1.3:

```bash
git checkout v1.3
```

### Generate a HuggingFace Access Token

Some HuggingFace resources, such as some models, are only accessible if you have an access token. If you do not already have a HuggingFace access token, you can create one by first creating an account by following the steps provided at [HuggingFace](https://huggingface.co/) and then generating a [user access token](https://huggingface.co/docs/transformers.js/en/guides/private#step-1-generating-a-user-access-token).

### Configure the Deployment Environment

To set up environment variables for deploying DocSum services, source the _set_env.sh_ script in this directory:

```
source ./set_env.sh
```

The _set_env.sh_ script will prompt for required and optional environment variables used to configure the DocSum services. If a value is not entered, the script will use a default value for the same. It will also generate a _.env_ file defining the desired configuration. Consult the section on [DocSum Service configuration](#docsum-service-configuration) for information on how service specific configuration parameters affect deployments.

### Deploy the Services Using Docker Compose

To deploy the DocSum services, execute the `docker compose up` command with the appropriate arguments. For a default deployment, execute:
@@ -80,13 +74,13 @@ Please refer to the table below to build different microservices from source:

After running docker compose, check if all the containers launched via docker compose have started:

```
```bash
docker ps -a
```

For the default deployment, the following 5 containers should have started:

```
```bash
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
748f577b3c78 opea/whisper:latest "python whisper_s…" 5 minutes ago Up About a minute 0.0.0.0:7066->7066/tcp, :::7066->7066/tcp docsum-gaudi-whisper-server
4eq8b7034fd9 opea/docsum-gradio-ui:latest "docker-entrypoint.s…" 5 minutes ago Up About a minute 0.0.0.0:5173->5173/tcp, :::5173->5173/tcp docsum-gaudi-ui-server
@@ -111,7 +105,7 @@ curl -X POST http://${host_ip}:8888/v1/docsum \

To stop the containers associated with the deployment, execute the following command:

```
```bash
docker compose -f compose.yaml down
```

1 change: 1 addition & 0 deletions DocSum/docker_compose/intel/hpu/gaudi/compose.yaml
@@ -18,6 +18,7 @@ services:
OMPI_MCA_btl_vader_single_copy_mechanism: none
LLM_MODEL_ID: ${LLM_MODEL_ID}
NUM_CARDS: ${NUM_CARDS}
VLLM_SKIP_WARMUP: ${VLLM_SKIP_WARMUP:-false}
VLLM_TORCH_PROFILER_DIR: "/mnt"
healthcheck:
test: ["CMD-SHELL", "curl -f http://localhost:80/health || exit 1"]
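
The added `VLLM_SKIP_WARMUP: ${VLLM_SKIP_WARMUP:-false}` entry relies on default-value interpolation. The sketch below illustrates that rule in plain bash, on the assumption that Compose resolves the `:-` form the same way a POSIX shell does (unset or empty falls back to the default). `resolve_skip_warmup` is a hypothetical helper used only for illustration; it is not part of the repository.

```bash
#!/usr/bin/env bash
# Sketch: how ${VLLM_SKIP_WARMUP:-false} resolves in three situations.
# resolve_skip_warmup is a hypothetical helper, not a repository script.

resolve_skip_warmup() {
  # Prints the value that would be substituted for ${VLLM_SKIP_WARMUP:-false}.
  echo "${VLLM_SKIP_WARMUP:-false}"
}

# Case 1: variable is unset, so the default applies and warmup runs.
unset VLLM_SKIP_WARMUP
unset_result=$(resolve_skip_warmup)

# Case 2: variable is set but empty; the ":-" form still uses the default.
export VLLM_SKIP_WARMUP=""
empty_result=$(resolve_skip_warmup)

# Case 3: variable is explicitly set, as the README note suggests for development.
export VLLM_SKIP_WARMUP=true
set_result=$(resolve_skip_warmup)

echo "unset=${unset_result} empty=${empty_result} set=${set_result}"
```

In other words, warmup is skipped only when the variable is exported with a non-empty value before `docker compose up` is run; otherwise the container receives `false`.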
7 changes: 6 additions & 1 deletion DocSum/docker_compose/set_env.sh
@@ -6,10 +6,10 @@ pushd "../../" > /dev/null
source .set_env.sh
popd > /dev/null

export host_ip=$(hostname -I | awk '{print $1}') # Example: host_ip="192.168.1.1"
export no_proxy="${no_proxy},${host_ip}" # Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1"
export http_proxy=$http_proxy
export https_proxy=$https_proxy
export host_ip=$(hostname -I | awk '{print $1}') # Example: host_ip="192.168.1.1"
export HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}

export LLM_ENDPOINT_PORT=8008
@@ -29,3 +29,8 @@ export BACKEND_SERVICE_PORT=8888
export BACKEND_SERVICE_ENDPOINT="http://${host_ip}:${BACKEND_SERVICE_PORT}/v1/docsum"

export LOGFLAG=True

export NUM_CARDS=1
export BLOCK_SIZE=128
export MAX_NUM_SEQS=256
export MAX_SEQ_LEN_TO_CAPTURE=2048
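
The `set_env.sh` hunk moves the `host_ip` assignment ahead of the `no_proxy` line (commit "get host_ip before set no_proxy"). The sketch below shows why the order matters, assuming `host_ip` is not already set in the calling shell. The proxy list and IP address are placeholder values, and `append_to_no_proxy` is a hypothetical helper that mirrors the `"${no_proxy},${host_ip}"` expression.

```bash
#!/usr/bin/env bash
# Sketch: effect of assigning host_ip before vs. after extending no_proxy.
# All values here are placeholders, not repository defaults.

append_to_no_proxy() {
  # $1: existing no_proxy list, $2: address to append
  echo "${1},${2}"
}

base_list="localhost,127.0.0.1"

# Old order: host_ip is still unset when the list is built,
# so an empty entry is appended and the host is not exempted from the proxy.
unset host_ip
old_order=$(append_to_no_proxy "$base_list" "${host_ip:-}")

# New order: host_ip is resolved first (set_env.sh uses hostname -I; fixed here).
host_ip="192.168.1.1"
new_order=$(append_to_no_proxy "$base_list" "$host_ip")

echo "old order: ${old_order}"
echo "new order: ${new_order}"
```

With the old order, requests to `http://${host_ip}:8888/v1/docsum` could be routed through the configured proxy, which is consistent with the test issue mentioned in the commit history.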