
Introduction

This demo application uses a large video-language model (VLM) for live video-stream captioning in a retail loss-prevention scenario, together with Visual RAG (text-to-image retrieval) backed by the Intel VDMS vector database.


Pre-requisites:

  • Ubuntu 24.04 LTS
  • Docker (https://docs.docker.com/engine/install/ubuntu/)
  • Miniforge Conda (https://conda-forge.org/download/)
  • Python 3.10+
  • tmux
  • Target hardware:
    • Intel® Core™ processor platforms (Alder Lake, Raptor Lake, Arrow Lake, etc.)
    • Intel® Arc™ A-series or B-series graphics card (A770/B580)
    • Intel® Core™ Ultra processor platforms (Lunar Lake, Arrow Lake-H)
  • For an additional installation guide, see link.

Environment Setup

  1. Refer to the pre-requisites section and follow the linked instructions to install Docker and Miniforge Conda.
  2. Install the other Ubuntu dependencies:

     sudo apt update
     sudo apt install tmux ffmpeg

  3. Pull the code:

     mkdir -p $HOME/work
     cd $HOME/work
     git clone https://github.com/intel/edge-developer-kit-reference-scripts edge-ai-devkit
     cp -rf edge-ai-devkit/usecases/ai/video_summarization ./video_summarization
     cd $HOME/work/video_summarization

  4. Create a conda environment and install the required packages:

     conda create -n openvino-env python=3.11
     conda activate openvino-env
     conda update --all
     conda install -c conda-forge openvino=2026.0.0
     pip install -r requirements.txt

Model Preparation

Convert and quantize MiniCPM-V-2_6 to OpenVINO IR format (INT8):

mkdir -p $HOME/work
cd $HOME/work
optimum-cli export openvino -m openbmb/MiniCPM-V-2_6 --trust-remote-code --weight-format int8 MiniCPM_INT8

NOTE: If the command reports an access issue for the model, follow the link shown in the terminal and request access. Then set your Hugging Face token before re-running: export HF_TOKEN=<token>
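After the export finishes, you can sanity-check the output directory. A minimal sketch; the directory name MiniCPM_INT8 comes from the command above, but the helper function itself is a hypothetical convenience, not part of the demo:

```shell
#!/usr/bin/env bash
# Hypothetical helper: verify that an OpenVINO export directory contains IR files.
# An optimum-cli export writes one or more .xml/.bin pairs into the target folder.
check_ir_dir() {
    local dir="$1"
    local xml_count
    xml_count=$(find "$dir" -maxdepth 1 -name '*.xml' 2>/dev/null | wc -l)
    if [ "$xml_count" -gt 0 ]; then
        echo "OK: $xml_count IR model file(s) found in $dir"
        return 0
    fi
    echo "ERROR: no .xml IR files found in $dir" >&2
    return 1
}

# usage: check_ir_dir "$HOME/work/MiniCPM_INT8"
```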

Starting Demo App

This demo app comprises the following modules/components:

  • VLM API service (port 8000)

  • Retriever API service (port 8001)

  • Live Summarizer UI (port 8888)

  • Video RAG UI (port 9999)

  • Simple-RTSP-server (port 8554)

  • VDMS Vector DB (port 55555)
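Before launching, you may want to confirm that none of these ports is already taken. A minimal sketch using bash's built-in /dev/tcp pseudo-device (the helper name port_in_use is made up for illustration and is not part of the demo):

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether something is already listening on a local TCP port.
# Bash's /dev/tcp redirection attempts a connect; success means the port is occupied.
port_in_use() {
    (echo > "/dev/tcp/127.0.0.1/$1") 2>/dev/null
}

for port in 8000 8001 8888 9999 8554 55555; do
    if port_in_use "$port"; then
        echo "port $port: IN USE"
    else
        echo "port $port: free"
    fi
done
```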

Start using docker

  1. The table below lists the Docker containers and their exposed ports. Make sure these ports are not used by any other locally hosted services.

     Container Name    Exposed Ports
     vlm_api_service   8000
     rag_api_service   8001
     vector_store      55555
     rtmp_server       8554
     summarizer_ui     8888
     retriever_ui      9999
  2. Create a chunks folder (to hold the video chunk files) and change its permissions.

    mkdir -p ../chunks
    chmod 777 ../chunks
    
  3. Launch a Linux terminal from the desktop and cd to the project directory:

    cd $HOME/work/video_summarization
    
  4. Run the commands below:

    docker compose build
    docker compose up -d
    
  5. Start a virtual camera stream. You may use the utility script below to do that.

     ./start_virtual_rtsp_cam0.sh /path/to/video.mp4   # replace /path/to/video.mp4 with the absolute path to your own video file

     Note: The utility script only creates one video stream; copy and edit the script to create additional streams. Make sure each video is published to the URL shown in the table below:

     Camera Name   URL
     CAM0          rtsp://localhost:8554/live
     CAM1          rtsp://localhost:8554/live1
     CAM2          rtsp://localhost:8554/live2

     • clear_database.sh - use this script to clear the VDMS vector store.

     • start_virtual_rtsp_cam0.sh - use this script to create a virtual camera stream. Expected input video format - resolution: 1920x1080, framerate: 15 fps.
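Since the utility script only publishes one stream, a copy for CAM1 could look like the sketch below. The ffmpeg flags are standard, but the script name, the command-builder helper, and the assumption that the original script works the same way are mine; adjust to match the actual contents of start_virtual_rtsp_cam0.sh:

```shell
#!/usr/bin/env bash
# Hypothetical start_virtual_rtsp_cam1.sh sketch: loop a local video file and
# publish it to the CAM1 RTSP endpoint.

# Build the ffmpeg command: -re reads at native framerate, -stream_loop -1 loops
# the file forever, -c copy avoids re-encoding, -f rtsp pushes to the RTSP server.
build_stream_cmd() {
    local video="$1" url="$2"
    printf 'ffmpeg -re -stream_loop -1 -i %s -c copy -f rtsp %s' "$video" "$url"
}

# usage (uncomment to run inside a detached tmux session):
# tmux new-session -d -s virtual_cam1 "$(build_stream_cmd /path/to/video.mp4 rtsp://localhost:8554/live1)"
```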

  6. Open http://127.0.0.1:8888 in a web browser to view the summarizer UI.

  7. Open http://127.0.0.1:9999 in a web browser to view the retriever UI.

  8. To stop the demo:

    docker compose down
    
  9. Other useful commands:

     tmux ls                               # check whether the virtual camera stream is running
     docker compose ls                     # check whether all demo services are running
     docker compose top                    # list the container names
     docker compose logs [container_name]  # retrieve the runtime logs of a specific container
    
  10. The docker-compose.yml uses environment variables to pass additional configuration parameters to the containers. Change them in docker-compose.yml as needed:

Environment Variables                               Containers                                                      Default Value   Sample Value
Proxy settings: HTTP_PROXY, HTTPS_PROXY, NO_PROXY   vlm_api_service, rag_api_service, summarizer_ui, retriever_ui   None            http://proxy.domain.com:8080
AI backend: DEVICE *                                vlm_api_service                                                 GPU             GPU.1, GPU.0, GPU, CPU, NPU

*Note:

  1. DEVICE=NPU is not supported yet.
  2. You may also use this command to list all AI accelerators available on your hardware: python -c "import openvino as ov; print(ov.Core().available_devices)"
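One way to change these values without editing docker-compose.yml directly is a docker-compose.override.yml, which Docker Compose merges automatically with the base file. This is only a sketch; it assumes the service is named vlm_api_service in the project's compose file, which you should verify first:

```yaml
# docker-compose.override.yml (sketch) - run the VLM backend on the second GPU
services:
  vlm_api_service:
    environment:
      - DEVICE=GPU.1
```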

Performance

Pre-requisites

  1. Install the build dependencies for qmassa:

     sudo apt install cargo pkg-config libudev-dev

  2. Install qmassa. Follow the instructions at https://github.com/ulissesf/qmassa.

  3. Run qmassa on the command line to view the GPU utilization:

     sudo $HOME/.cargo/bin/qmassa
    

    Note: If qmassa does not correctly show the GPU utilization for your Intel GPU device, pass the parameter "-d bus:device:func" to qmassa. You can look up the BDF (bus:device:func) of your Intel GPU card using the command 'lspci'.
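To find the BDF without scanning the full lspci output by eye, a small sketch that filters for Intel display devices. The helper name is hypothetical, and the parsing assumes lspci's default short format (e.g. "00:02.0 VGA compatible controller: Intel Corporation ..."):

```shell
#!/usr/bin/env bash
# Hypothetical helper: extract the bus:device.func of Intel GPU(s) from lspci output.
intel_gpu_bdf() {
    # match VGA/Display/3D controller lines mentioning Intel, print the leading BDF
    grep -Ei '(vga|display|3d) .*intel' | awk '{print $1}'
}

# usage: lspci | intel_gpu_bdf
```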