Commit ae30e1e

Add monitoring for AudioQnA deployed by Docker compose
Signed-off-by: Yi Yao <yi.a.yao@intel.com>
1 parent 4f620f0 commit ae30e1e

19 files changed

Lines changed: 414 additions & 24 deletions

AudioQnA/docker_compose/intel/cpu/xeon/README.md

Lines changed: 29 additions & 8 deletions
Original file line number | Diff line number | Diff line change
@@ -15,12 +15,19 @@ Note: The default LLM is `meta-llama/Meta-Llama-3-8B-Instruct`. Before deploying
1515

1616
This section describes how to quickly deploy and test the AudioQnA service manually on an Intel® Xeon® processor. The basic steps are:
1717

18-
1. [Access the Code](#access-the-code)
19-
2. [Configure the Deployment Environment](#configure-the-deployment-environment)
20-
3. [Deploy the Services Using Docker Compose](#deploy-the-services-using-docker-compose)
21-
4. [Check the Deployment Status](#check-the-deployment-status)
22-
5. [Validate the Pipeline](#validate-the-pipeline)
23-
6. [Cleanup the Deployment](#cleanup-the-deployment)
18+
- [Deploying AudioQnA on Intel® Xeon® Processors](#deploying-audioqna-on-intel-xeon-processors)
19+
- [Table of Contents](#table-of-contents)
20+
- [AudioQnA Quick Start Deployment](#audioqna-quick-start-deployment)
21+
- [Access the Code](#access-the-code)
22+
- [Configure the Deployment Environment](#configure-the-deployment-environment)
23+
- [Deploy the Services Using Docker Compose](#deploy-the-services-using-docker-compose)
24+
- [Check the Deployment Status](#check-the-deployment-status)
25+
- [Validate the Pipeline](#validate-the-pipeline)
26+
- [Cleanup the Deployment](#cleanup-the-deployment)
27+
- [AudioQnA Docker Compose Files](#audioqna-docker-compose-files)
28+
- [Running LLM models with remote endpoints](#running-llm-models-with-remote-endpoints)
29+
- [Validate MicroServices](#validate-microservices)
30+
- [Conclusion](#conclusion)
2431

2532
### Access the Code
2633

@@ -59,7 +66,7 @@ To deploy the AudioQnA services, execute the `docker compose up` command with th
5966

6067
```bash
6168
cd docker_compose/intel/cpu/xeon
62-
docker compose -f compose.yaml up -d
69+
docker compose -f compose_tgi.yaml up -d
6370
```
6471

6572
> **Note**: developers should build docker image from source when:
@@ -80,6 +87,13 @@ Please refer to the table below to build different microservices from source:
8087
| MegaService | [MegaService build guide](../../../../README_miscellaneous.md#build-megaservice-docker-image) |
8188
| UI | [Basic UI build guide](../../../../README_miscellaneous.md#build-ui-docker-image) |
8289

90+
(Optional) Enable monitoring with the following command:
91+
92+
```bash
93+
cd docker_compose/intel/cpu/xeon
94+
docker compose -f compose_tgi.yaml -f compose.monitoring.yaml up -d
95+
```
96+
8397
### Check the Deployment Status
8498

8599
After running docker compose, check if all the containers launched via docker compose have started:
@@ -127,7 +141,13 @@ curl http://${host_ip}:3008/v1/audioqna \
127141
To stop the containers associated with the deployment, execute the following command:
128142

129143
```bash
130-
docker compose -f compose.yaml down
144+
docker compose -f compose_tgi.yaml down
145+
```
146+
147+
If monitoring is enabled, stop the containers using the following command:
148+
149+
```bash
150+
docker compose -f compose_tgi.yaml -f compose.monitoring.yaml down
131151
```
132152

133153
## AudioQnA Docker Compose Files
@@ -140,6 +160,7 @@ In the context of deploying an AudioQnA pipeline on an Intel® Xeon® platform,
140160
| [compose_tgi.yaml](./compose_tgi.yaml) | The LLM serving framework is TGI. All other configurations remain the same as the default |
141161
| [compose_multilang.yaml](./compose_multilang.yaml) | The TTS component is GPT-SoVITS. All other configurations remain the same as the default |
142162
| [compose_remote.yaml](./compose_remote.yaml) | The LLM used is hosted on a remote server and an endpoint is used to access this model. Additional environment variables need to be set before running. See [instructions](#running-llm-models-with-remote-endpoints) below. |
163+
| [compose.monitoring.yaml](./compose.monitoring.yaml) | Helper file that adds the monitoring services (Prometheus, Grafana, node-exporter). Can be combined with any of the compose files above |
143164

144165
### Running LLM models with remote endpoints
145166

AudioQnA/docker_compose/intel/cpu/xeon/README_vllm.md

Lines changed: 8 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -74,7 +74,7 @@ export HF_TOKEN='your_huggingfacehub_token'
7474
### Setting variables in the file set_env_vllm.sh
7575

7676
```bash
77-
cd cd cd ~/searchqna-test/GenAIExamples/SearchQnA/docker_compose/amd/gpu/rocm
77+
cd ~/searchqna-test/GenAIExamples/SearchQnA/docker_compose/amd/gpu/rocm
7878
### The example uses the Nano text editor. You can use any convenient text editor
7979
nano set_env_vllm.sh
8080
```
@@ -107,7 +107,7 @@ export https_proxy="Your_HTTPs_Proxy"
107107

108108
```bash
109109
cd ~/audioqna-test/GenAIExamples/AudioQnA/docker_compose/amd/gpu/rocm/
110-
docker compose -f compose_vllm up -d
110+
docker compose up -d
111111
```
112112

113113
After starting the containers, you need to view their status with the command:
@@ -126,6 +126,12 @@ The following containers should be running:
126126

127127
Containers should not restart.
128128

129+
(Optional) Enable monitoring with the following command:
130+
```bash
131+
docker compose -f compose.yaml -f compose.monitoring.yaml up -d
132+
```
133+
134+
129135
#### 3.1.1. Configuring GPU forwarding
130136

131137
By default, in the Docker Compose file, compose_vllm.yaml is configured to forward all GPUs to the audioqna-vllm-service container.
Lines changed: 59 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,59 @@
1+
# Copyright (C) 2024 Intel Corporation
2+
# SPDX-License-Identifier: Apache-2.0
3+
4+
services:
5+
prometheus:
6+
image: prom/prometheus:v2.52.0
7+
container_name: opea_prometheus
8+
user: root
9+
volumes:
10+
- ./prometheus.yaml:/etc/prometheus/prometheus.yaml
11+
- ./prometheus_data:/prometheus
12+
command:
13+
- '--config.file=/etc/prometheus/prometheus.yaml'
14+
ports:
15+
- '9090:9090'
16+
ipc: host
17+
restart: unless-stopped
18+
19+
grafana:
20+
image: grafana/grafana:11.0.0
21+
container_name: grafana
22+
volumes:
23+
- ./grafana_data:/var/lib/grafana
24+
- ./grafana/dashboards:/var/lib/grafana/dashboards
25+
- ./grafana/provisioning:/etc/grafana/provisioning
26+
user: root
27+
environment:
28+
GF_SECURITY_ADMIN_PASSWORD: admin
29+
GF_RENDERING_CALLBACK_URL: http://grafana:3000/
30+
GF_LOG_FILTERS: rendering:debug
31+
no_proxy: ${no_proxy}
32+
host_ip: ${host_ip}
33+
depends_on:
34+
- prometheus
35+
ports:
36+
- '3000:3000'
37+
ipc: host
38+
restart: unless-stopped
39+
40+
node-exporter:
41+
image: prom/node-exporter
42+
container_name: node-exporter
43+
volumes:
44+
- /proc:/host/proc:ro
45+
- /sys:/host/sys:ro
46+
- /:/rootfs:ro
47+
command:
48+
- '--path.procfs=/host/proc'
49+
- '--path.sysfs=/host/sys'
50+
- --collector.filesystem.ignored-mount-points
51+
- "^/(sys|proc|dev|host|etc|rootfs/var/lib/docker/containers|rootfs/var/lib/docker/overlay2|rootfs/run/docker/netns|rootfs/var/lib/docker/aufs)($$|/)"
52+
environment:
53+
no_proxy: ${no_proxy}
54+
ports:
55+
- 9100:9100
56+
ipc: host
57+
restart: always
58+
deploy:
59+
mode: global
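
The monitoring file above is layered on top of a base compose file by repeating the `-f` flag, as the README commands in this commit do. The sketch below shows one way to build that flag list in a script. The function name `compose_file_args` and its second argument are hypothetical and are not part of the commit.

```bash
# Hypothetical helper (not in the commit): print the -f arguments for
# docker compose. $1 is the base compose file; $2 is "1" to also layer
# compose.monitoring.yaml on top of it (defaults to "0").
compose_file_args() {
  printf '%s' "-f $1"
  if [ "${2:-0}" = "1" ]; then
    printf ' %s' "-f compose.monitoring.yaml"
  fi
  printf '\n'
}
```

A possible use, assuming the file names contain no spaces so that unquoted word splitting is safe: `docker compose $(compose_file_args compose_tgi.yaml 1) up -d`. Passing the same arguments to `down` stops the monitoring containers together with the rest.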
Lines changed: 12 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,12 @@
1+
#!/bin/bash
2+
# Copyright (C) 2025 Intel Corporation
3+
# SPDX-License-Identifier: Apache-2.0
4+
5+
if ls *.json 1> /dev/null 2>&1; then
6+
rm *.json
7+
fi
8+
9+
wget https://raw.githubusercontent.com/opea-project/GenAIEval/refs/heads/main/evals/benchmark/grafana/vllm_grafana.json
10+
wget https://raw.githubusercontent.com/opea-project/GenAIEval/refs/heads/main/evals/benchmark/grafana/tgi_grafana.json
11+
wget https://raw.githubusercontent.com/opea-project/GenAIEval/refs/heads/main/evals/benchmark/grafana/audioqna_megaservice_grafana.json
12+
wget https://raw.githubusercontent.com/opea-project/GenAIEval/refs/heads/main/evals/benchmark/grafana/node_grafana.json
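
The script above deletes every `*.json` file before downloading, so a failed download (for example, behind a proxy) can leave the dashboards directory empty. The following is a minimal sketch of a more defensive variant, assuming the same four dashboard URLs. The names `DASHBOARD_BASE_URL`, `dashboard_urls`, and `fetch_dashboards` are illustrative and do not appear in the commit.

```bash
# Base URL taken from the wget lines in the script above.
DASHBOARD_BASE_URL="https://raw.githubusercontent.com/opea-project/GenAIEval/refs/heads/main/evals/benchmark/grafana"

# Print one dashboard URL per line.
dashboard_urls() {
  for name in vllm tgi audioqna_megaservice node; do
    printf '%s/%s_grafana.json\n' "$DASHBOARD_BASE_URL" "$name"
  done
}

# Download each dashboard to a temporary name first and replace the
# existing file only when the download succeeds.
fetch_dashboards() {
  dashboard_urls | while IFS= read -r url; do
    file_name="${url##*/}"
    if wget -q -O "$file_name.tmp" "$url"; then
      mv -- "$file_name.tmp" "$file_name"
    else
      rm -f -- "$file_name.tmp"
      echo "warning: could not download $url" >&2
    fi
  done
}
```

With this approach an existing dashboard is kept when its replacement cannot be fetched, at the cost of possibly serving a stale copy.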
Lines changed: 14 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,14 @@
1+
# Copyright (C) 2025 Intel Corporation
2+
# SPDX-License-Identifier: Apache-2.0
3+
4+
apiVersion: 1
5+
6+
providers:
7+
- name: 'default'
8+
orgId: 1
9+
folder: ''
10+
type: file
11+
disableDeletion: false
12+
updateIntervalSeconds: 10 #how often Grafana will scan for changed dashboards
13+
options:
14+
path: /var/lib/grafana/dashboards
Lines changed: 54 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,54 @@
1+
# Copyright (C) 2025 Intel Corporation
2+
# SPDX-License-Identifier: Apache-2.0
3+
4+
# config file version
5+
apiVersion: 1
6+
7+
# list of datasources that should be deleted from the database
8+
deleteDatasources:
9+
- name: Prometheus
10+
orgId: 1
11+
12+
# list of datasources to insert/update depending
13+
# what's available in the database
14+
datasources:
15+
# <string, required> name of the datasource. Required
16+
- name: Prometheus
17+
# <string, required> datasource type. Required
18+
type: prometheus
19+
# <string, required> access mode. direct or proxy. Required
20+
access: proxy
21+
# <int> org id. will default to orgId 1 if not specified
22+
orgId: 1
23+
# <string> url
24+
url: http://$host_ip:9090
25+
# <string> database password, if used
26+
password:
27+
# <string> database user, if used
28+
user:
29+
# <string> database name, if used
30+
database:
31+
# <bool> enable/disable basic auth
32+
basicAuth: false
33+
# <string> basic auth username, if used
34+
basicAuthUser:
35+
# <string> basic auth password, if used
36+
basicAuthPassword:
37+
# <bool> enable/disable with credentials headers
38+
withCredentials:
39+
# <bool> mark as default datasource. Max one per org
40+
isDefault: true
41+
# <map> fields that will be converted to json and stored in json_data
42+
jsonData:
43+
httpMethod: GET
44+
graphiteVersion: "1.1"
45+
tlsAuth: false
46+
tlsAuthWithCACert: false
47+
# <string> json object of data that will be encrypted.
48+
secureJsonData:
49+
tlsCACert: "..."
50+
tlsClientCert: "..."
51+
tlsClientKey: "..."
52+
version: 1
53+
# <bool> allow users to edit datasources from the UI.
54+
editable: true
Lines changed: 27 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,27 @@
1+
# Copyright (C) 2025 Intel Corporation
2+
# SPDX-License-Identifier: Apache-2.0
3+
# [IP_ADDR]:{PORT_OUTSIDE_CONTAINER} -> {PORT_INSIDE_CONTAINER} / {PROTOCOL}
4+
global:
5+
scrape_interval: 5s
6+
external_labels:
7+
monitor: "my-monitor"
8+
scrape_configs:
9+
- job_name: "prometheus"
10+
static_configs:
11+
- targets: ["opea_prometheus:9090"]
12+
- job_name: "vllm"
13+
metrics_path: /metrics
14+
static_configs:
15+
- targets: ["vllm-service:80"]
16+
- job_name: "tgi"
17+
metrics_path: /metrics
18+
static_configs:
19+
- targets: ["tgi-service:80"]
20+
- job_name: "docsum-backend-server"
21+
metrics_path: /metrics
22+
static_configs:
23+
- targets: ["audioqna-xeon-backend-server:8888"]
24+
- job_name: "prometheus-node-exporter"
25+
metrics_path: /metrics
26+
static_configs:
27+
- targets: ["node-exporter:9100"]
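
Each scrape target above is addressed by a service or container name on the compose network, so a target whose service is not part of the chosen compose file (for example `vllm-service` when deploying with `compose_tgi.yaml`) will presumably show as down in Prometheus. The sketch below lists the configured targets so they can be compared with the output of `docker compose ps`. The function name `list_scrape_targets` is hypothetical, and the pattern assumes the single-entry `targets: ["host:port"]` form used in this file.

```bash
# Hypothetical helper (not in the commit): print every "host:port"
# scrape target found in a Prometheus config file, one per line.
# Only handles the single-entry form: targets: ["host:port"]
list_scrape_targets() {
  grep -oE 'targets: \["[^"]+"\]' "$1" |
    sed -E 's/targets: \["([^"]+)"\]/\1/'
}
```

For example, `list_scrape_targets prometheus.yaml` prints one line per target in the file above.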

AudioQnA/docker_compose/intel/cpu/xeon/set_env.sh

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -3,6 +3,8 @@
33
# Copyright (C) 2024 Intel Corporation
44
# SPDX-License-Identifier: Apache-2.0
55

6+
SCRIPT_DIR=$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" &> /dev/null && pwd)
7+
68
# export host_ip=<your External Public IP>
79
export host_ip=$(hostname -I | awk '{print $1}')
810
export HF_TOKEN=${HF_TOKEN}
@@ -21,3 +23,7 @@ export SPEECHT5_SERVER_PORT=7055
2123
export LLM_SERVER_PORT=3006
2224

2325
export BACKEND_SERVICE_ENDPOINT=http://${host_ip}:3008/v1/audioqna
26+
27+
pushd "${SCRIPT_DIR}/grafana/dashboards" > /dev/null
28+
source download_opea_dashboard.sh
29+
popd > /dev/null
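
The `pushd`/`popd` pair above changes into the dashboards directory only for the duration of the download. If that directory is missing, `pushd` fails but the following `source` and `popd` lines still run. A minimal sketch of a guarded alternative follows. It swaps `pushd`/`popd` for a subshell, which is a different technique from the one the commit uses, and the function name `run_in_dir` is hypothetical.

```bash
# Hypothetical helper (not in the commit): run a command inside a
# directory without changing the caller's working directory.
# $1 is the directory; the remaining arguments are the command.
run_in_dir() {
  target_dir="$1"
  shift
  if [ ! -d "$target_dir" ]; then
    echo "skip: $target_dir not found" >&2
    return 1
  fi
  # The subshell keeps the directory change local to this call.
  ( cd "$target_dir" && "$@" )
}
```

A possible use in `set_env.sh` would be `run_in_dir "${SCRIPT_DIR}/grafana/dashboards" bash download_opea_dashboard.sh`. Because the command runs in a subshell, any variables it exports are not visible to the caller. The download script only fetches files, so nothing should be lost here.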

AudioQnA/docker_compose/intel/hpu/gaudi/README.md

Lines changed: 25 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -15,12 +15,18 @@ Note: The default LLM is `meta-llama/Meta-Llama-3-8B-Instruct`. Before deploying
1515

1616
This section describes how to quickly deploy and test the AudioQnA service manually on an Intel® Gaudi® processor. The basic steps are:
1717

18-
1. [Access the Code](#access-the-code)
19-
2. [Configure the Deployment Environment](#configure-the-deployment-environment)
20-
3. [Deploy the Services Using Docker Compose](#deploy-the-services-using-docker-compose)
21-
4. [Check the Deployment Status](#check-the-deployment-status)
22-
5. [Validate the Pipeline](#validate-the-pipeline)
23-
6. [Cleanup the Deployment](#cleanup-the-deployment)
18+
- [Deploying AudioQnA on Intel® Gaudi® Processors](#deploying-audioqna-on-intel-gaudi-processors)
19+
- [Table of Contents](#table-of-contents)
20+
- [AudioQnA Quick Start Deployment](#audioqna-quick-start-deployment)
21+
- [Access the Code](#access-the-code)
22+
- [Configure the Deployment Environment](#configure-the-deployment-environment)
23+
- [Deploy the Services Using Docker Compose](#deploy-the-services-using-docker-compose)
24+
- [Check the Deployment Status](#check-the-deployment-status)
25+
- [Validate the Pipeline](#validate-the-pipeline)
26+
- [Cleanup the Deployment](#cleanup-the-deployment)
27+
- [AudioQnA Docker Compose Files](#audioqna-docker-compose-files)
28+
- [Validate MicroServices](#validate-microservices)
29+
- [Conclusion](#conclusion)
2430

2531
### Access the Code
2632

@@ -79,6 +85,13 @@ Please refer to the table below to build different microservices from source:
7985
| MegaService | [MegaService build guide](../../../../README_miscellaneous.md#build-megaservice-docker-image) |
8086
| UI | [Basic UI build guide](../../../../README_miscellaneous.md#build-ui-docker-image) |
8187

88+
(Optional) Enable monitoring with the following command:
89+
90+
```bash
91+
cd docker_compose/intel/hpu/gaudi
92+
docker compose -f compose.yaml -f compose.monitoring.yaml up -d
93+
```
94+
8295
### Check the Deployment Status
8396

8497
After running docker compose, check if all the containers launched via docker compose have started:
@@ -128,6 +141,12 @@ To stop the containers associated with the deployment, execute the following com
128141
docker compose -f compose.yaml down
129142
```
130143

144+
If monitoring is enabled, stop the containers using the following command:
145+
146+
```bash
147+
docker compose -f compose.yaml -f compose.monitoring.yaml down
148+
```
149+
131150
## AudioQnA Docker Compose Files
132151

133152
In the context of deploying an AudioQnA pipeline on an Intel® Gaudi® platform, we can pick and choose different large language model serving frameworks. The table below outlines the various configurations that are available as part of the application. These configurations can be used as templates and can be extended to different components available in [GenAIComps](https://github.com/opea-project/GenAIComps.git).
