38 changes: 37 additions & 1 deletion .github/workflows/pre_commit.yml
@@ -98,20 +98,56 @@ jobs:
- name: Run python unit tests
run: uv run pytest tests/unit --cov

- name: Prepare test data
- &prepare-test-data
name: Prepare test data
run: |
uv run python tests/accuracy/download_models.py -d data -j tests/precommit/public_scope.json -l

- name: Run test
run: |
uv run pytest --data=./data tests/functional

serving_api-tests:
strategy:
fail-fast: false
matrix:
os:
- "ubuntu-latest"
python-version:
- "3.11"
- "3.12"
- "3.13"
- "3.14"
runs-on: ${{ matrix.os }}
steps:
- name: Set up docker for macOS
if: startsWith(matrix.os, 'macos-1')
run: |
brew install colima docker
colima start

- *checkout

- *matrix-setup-uv

- name: Install dependencies
run: uv sync --locked --extra tests --extra ovms --extra-index-url https://download.pytorch.org/whl/cpu

- *prepare-test-data

- name: serving_api
run: |
uv run python -c "from model_api.models import DetectionModel; DetectionModel.create_model('./data/otx_models/detection_model_with_xai_head.xml').save('ovms_models/ssd_mobilenet_v1_fpn_coco/1/ssd_mobilenet_v1_fpn_coco.xml')"
docker run -d --rm -v $GITHUB_WORKSPACE/ovms_models/:/models -p 8000:8000 openvino/model_server:latest --model_path /models/ssd_mobilenet_v1_fpn_coco/ --model_name ssd_mobilenet_v1_fpn_coco --rest_port 8000 --log_level DEBUG --target_device CPU
uv run python examples/serving_api/run.py data/coco128/images/train2017/000000000009.jpg # detects 4 objects

pre-commit-result:
runs-on: ubuntu-latest
needs:
- accuracy-tests
- code_quality_checks
- unit-functional-tests
- serving_api-tests
if: always()
steps:
- name: All tests ok
1 change: 1 addition & 0 deletions .gitignore
@@ -145,3 +145,4 @@ docs/source/_build/
.vscode/

data/
ovms_models/
6 changes: 4 additions & 2 deletions README.md
@@ -11,7 +11,7 @@

## Introduction

Model API is a set of wrapper classes for particular tasks and model architectures, simplifying data preprocess and postprocess as well as routine procedures (model loading, asynchronous execution, etc.). It is aimed at simplifying end-to-end model inference. The Model API is based on the OpenVINO inference API.
Model API is a set of wrapper classes for particular tasks and model architectures. It simplifies data preprocessing and postprocessing as well as routine procedures (model loading, asynchronous execution, etc.), with the aim of simplifying end-to-end model inference for different deployment scenarios, including local execution and serving. Model API is based on the OpenVINO inference API.

## How it works

@@ -29,6 +29,7 @@ Training Extensions embed all the metadata required for inference into model fil

- Python API
- Synchronous and asynchronous inference
- Local inference and serving through the REST API
- Model preprocessing embedding for faster inference

## Installation
@@ -41,6 +42,7 @@ Training Extensions embed all the metadata required for inference into model fil
from model_api.models import Model

# Create a model wrapper from a compatible model generated by OpenVINO Training Extensions
# To work with an OVMS-served model, pass its endpoint instead of a file path, e.g. "localhost:8000/v2/models/ssdlite_mobilenet_v2"
model = Model.create_model("model.xml")

# Run synchronous inference locally
@@ -52,7 +54,7 @@ print(f"Inference result: {result}")

## Prepare a model for `InferenceAdapter`

There are usecases when it is not possible to modify an internal `ov::Model` and it is hidden behind `InferenceAdapter`. `create_model()` can construct a model from a given `InferenceAdapter`. That approach assumes that the model in `InferenceAdapter` was already configured by `create_model()` called with a string (a path or a model name). It is possible to prepare such model:
There are use cases in which the internal `ov::Model` cannot be modified because it is hidden behind an `InferenceAdapter`. For example, the model can be served using [OVMS](https://github.com/openvinotoolkit/model_server). `create_model()` can construct a model from a given `InferenceAdapter`. This approach assumes that the model in the `InferenceAdapter` was already configured by `create_model()` called with a string (a path or a model name). Such a model can be prepared as follows:

```python
model = DetectionModel.create_model("~/.cache/omz/public/ssdlite_mobilenet_v2/FP16/ssdlite_mobilenet_v2.xml")
5 changes: 5 additions & 0 deletions docs/source/adapters/index.md
@@ -11,6 +11,10 @@
[todo]
:::

:::{grid-item-card} Ovms Adapter
:link: ./ovms_adapter
:link-type: doc

[todo]
:::
:::{grid-item-card} Onnx Adapter
@@ -41,5 +45,6 @@
./inference_adapter
./onnx_adapter
./openvino_adapter
./ovms_adapter
./utils
```
8 changes: 8 additions & 0 deletions docs/source/adapters/ovms_adapter.md
@@ -0,0 +1,8 @@
# OVMS Adapter

```{eval-rst}
.. automodule:: model_api.adapters.ovms_adapter
:members:
:undoc-members:
:show-inheritance:
```
40 changes: 40 additions & 0 deletions examples/serving_api/README.md
@@ -0,0 +1,40 @@
# Serving API example

This example demonstrates how to use the Python API of OpenVINO Model API for remote inference of models hosted with [OpenVINO Model Server](https://docs.openvino.ai/latest/ovms_what_is_openvino_model_server.html). This tutorial assumes that you are familiar with Docker and includes the following steps:

- Run the Docker image with OpenVINO Model Server
- Instantiate a model
- Run inference
- Process results

## Prerequisites

- Install Model API from source. Please refer to the main [README](../../README.md) for details.
- Install Docker. Please refer to the [official documentation](https://docs.docker.com/get-docker/) for details.
- Install Triton HTTP client (used by the OVMS adapter) into the Python environment:

```bash
pip install 'tritonclient[http]'
```

- Download a model by running Python code with Model API (see the Python [example](../synchronous_api/README.md)) and save the configured model in an OVMS-friendly folder layout:

```python
from model_api.models import DetectionModel

DetectionModel.create_model("ssd_mobilenet_v1_fpn_coco").save("/home/user/models/ssd_mobilenet_v1_fpn_coco/1/ssd_mobilenet_v1_fpn_coco.xml")
```

- Run the OVMS Docker container:

```bash
docker run -d -v /home/user/models:/models -p 8000:8000 openvino/model_server:latest --model_path /models/ssd_mobilenet_v1_fpn_coco --model_name ssd_mobilenet_v1_fpn_coco --rest_port 8000 --nireq 4 --target_device CPU
```
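
The folder layout used in the `save()` call above can be sketched with `pathlib`. This is only an illustration, assuming the layout `<models_root>/<model_name>/<version>/<model_name>.xml` and positive integer versions; the helper name `ovms_model_path` is hypothetical and not part of Model API.

```python
from pathlib import Path


def ovms_model_path(models_root: str, model_name: str, version: int = 1) -> Path:
    """Build <models_root>/<model_name>/<version>/<model_name>.xml.

    Hypothetical helper: it only composes the path and does not touch
    the file system or Model API.
    """
    if version < 1:
        raise ValueError("version is assumed to be a positive integer")
    return Path(models_root) / model_name / str(version) / f"{model_name}.xml"


path = ovms_model_path("/home/user/models", "ssd_mobilenet_v1_fpn_coco")
print(path.as_posix())
```

The resulting path is what would be passed to `save()`, while the directory two levels above the file corresponds to the folder that `--model_path` points to inside the container.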

## Run example

To run the example, please execute the following command:

```bash
python run.py <path_to_image>
```
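
A container started with `docker run -d` may need a moment to load the model, so running the example immediately can fail. Below is a minimal sketch of waiting for the server, assuming it exposes a KServe-style `GET /v2/health/ready` endpoint on the REST port. The helper name `wait_until_ready` is hypothetical and the code uses only the standard library.

```python
import http.client
import time
import urllib.error
import urllib.request


def wait_until_ready(url: str, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Poll `url` until it answers with HTTP 200 or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with urllib.request.urlopen(url, timeout=interval) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError, http.client.HTTPException):
            # Server not reachable yet, or it answered with an error status.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

For instance, `wait_until_ready("http://localhost:8000/v2/health/ready")` could be called before `run.py`; the host, port, and path are assumptions and should be adjusted to the actual deployment.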
34 changes: 34 additions & 0 deletions examples/serving_api/run.py
@@ -0,0 +1,34 @@
#!/usr/bin/env python3
#
# Copyright (C) 2020-2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
#

import sys

import cv2

from model_api.models import DetectionModel


def main():
if len(sys.argv) != 2:
usage_message = f"Usage: {sys.argv[0]} <path_to_image>"
raise RuntimeError(usage_message)

image = cv2.imread(sys.argv[1])
if image is None:
error_message = f"Failed to read the image: {sys.argv[1]}"
raise RuntimeError(error_message)
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# Create Object Detection model specifying the OVMS server URL
model = DetectionModel.create_model(
"localhost:8000/v2/models/ssd_mobilenet_v1_fpn_coco",
model_type="ssd",
)
detections = model(image)
print(f"Detection results: {detections}")


if __name__ == "__main__":
main()
3 changes: 3 additions & 0 deletions pyproject.toml
@@ -35,6 +35,9 @@ dependencies = [
]

[project.optional-dependencies]
ovms = [
"tritonclient[http]<2.59",
]
tests = [
"httpx",
"pytest",
14 changes: 13 additions & 1 deletion src/README.md
@@ -79,13 +79,25 @@ The following tasks can be solved with wrappers usage:

Model API wrappers are executor-agnostic: they do not implement model inference or model loading themselves. Instead, they can be used with different executors, each of which implements the common interface methods in its own adapter class.

Currently, `OpenvinoAdapter` and `ONNXRuntimeAdapter` are supported.
Currently, `OpenvinoAdapter`, `OVMSAdapter`, and `ONNXRuntimeAdapter` are supported.

### OpenVINO Adapter

`OpenvinoAdapter` hides the OpenVINO™ toolkit API, which allows Model API wrappers to run with models represented in Intermediate Representation (IR) format.
It accepts a path to either an `xml` model file or an `onnx` model file.

### OpenVINO Model Server Adapter

`OVMSAdapter` hides the OpenVINO Model Server Python client API, which allows Model API wrappers to run with models served by OVMS.

Refer to **[`OVMSAdapter`](adapters/ovms_adapter.md)** to learn about running demos with OVMS.

To use the OpenVINO Model Server adapter, install the package with the `ovms` extra:

```sh
pip install <omz_dir>/demos/common/python[ovms]
```

### ONNXRuntime Adapter

`ONNXRuntimeAdapter` hides the ONNX Runtime API, which allows Model API wrappers to run with models represented in ONNX format.
2 changes: 2 additions & 0 deletions src/model_api/adapters/__init__.py
@@ -5,13 +5,15 @@

from .onnx_adapter import ONNXRuntimeAdapter
from .openvino_adapter import OpenvinoAdapter, create_core, get_user_config
from .ovms_adapter import OVMSAdapter
from .utils import INTERPOLATION_TYPES, RESIZE_TYPES, InputTransform, Layout

__all__ = [
"create_core",
"get_user_config",
"Layout",
"OpenvinoAdapter",
"OVMSAdapter",
"ONNXRuntimeAdapter",
"RESIZE_TYPES",
"InputTransform",
59 changes: 59 additions & 0 deletions src/model_api/adapters/ovms_adapter.md
@@ -0,0 +1,59 @@
# OpenVINO Model Server Adapter

The `OVMSAdapter` implements the `InferenceAdapter` interface. It makes it possible to use Model API with models hosted in OpenVINO Model Server.

## Prerequisites

`OVMSAdapter` enables inference via calls to OpenVINO Model Server, so in order to use it you need two things:

- OpenVINO Model Server that serves your model
- [`tritonclient[http]`](https://pypi.org/project/tritonclient/) package installed to enable communication with the model server: `python3 -m pip install 'tritonclient[http]'`

### Deploy OpenVINO Model Server

OpenVINO Model Server is distributed as a Docker image available on Docker Hub, so you can start it with the `docker run` command. See the [model server documentation](https://github.com/openvinotoolkit/model_server/blob/main/docs/starting_server.md) to learn how to deploy OpenVINO optimized models with OpenVINO Model Server.

## Model configuration

When OpenVINO Model Server is used, the model cannot be accessed directly from the client application. Therefore, any configuration must be done on the model server side or before the server is started: see [Prepare a model for `InferenceAdapter`](../../../README.md#prepare-a-model-for-inferenceadapter).

### Input reshaping

For some use cases you may want the model to be reshaped to match an input of a certain size. In that case, provide the `--shape auto` parameter in the model server startup command. With that option, the model server reshapes the model input on demand to match the input data.

### Inference options

Inference-related options for the model can be configured in OpenVINO Model Server with the following startup parameters:

- `--target_device` - name of the device to load the model to
- `--nireq` - number of InferRequests
- `--plugin_config` - configuration of the device plugin

See [model server configuration parameters](https://github.com/openvinotoolkit/model_server/blob/main/docs/starting_server.md#serving-a-single-model) for more details.

### Example OVMS startup command

```bash
docker run -d --rm -v /home/user/models:/models -p 8000:8000 openvino/model_server:latest --model_path /models/model1 --model_name model1 --rest_port 8000 --shape auto --nireq 32 --target_device CPU --plugin_config "{\"CPU_THROUGHPUT_STREAMS\": \"CPU_THROUGHPUT_AUTO\"}"
```
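
Because the `--plugin_config` value is a JSON string, quoting it by hand is error-prone. The sketch below assembles a startup command from the options listed in this section and lets `json.dumps` and `shlex.join` handle the quoting. The helper name `build_ovms_command` is hypothetical, and the choice of `--rest_port` assumes the adapter talks to the server over HTTP, as the `tritonclient[http]` prerequisite suggests.

```python
import json
import shlex
from typing import Optional


def build_ovms_command(
    models_dir: str,
    model_name: str,
    rest_port: int = 8000,
    nireq: int = 4,
    target_device: str = "CPU",
    shape: Optional[str] = None,
    plugin_config: Optional[dict] = None,
) -> list:
    """Return the `docker run` argument list for a single served model."""
    command = [
        "docker", "run", "-d", "--rm",
        "-v", f"{models_dir}:/models",
        "-p", f"{rest_port}:{rest_port}",
        "openvino/model_server:latest",
        "--model_path", f"/models/{model_name}",
        "--model_name", model_name,
        "--rest_port", str(rest_port),
        "--nireq", str(nireq),
        "--target_device", target_device,
    ]
    if shape is not None:
        command += ["--shape", shape]
    if plugin_config:
        # json.dumps produces the JSON string passed to --plugin_config.
        command += ["--plugin_config", json.dumps(plugin_config)]
    return command


command = build_ovms_command(
    "/home/user/models",
    "model1",
    nireq=32,
    shape="auto",
    plugin_config={"CPU_THROUGHPUT_STREAMS": "CPU_THROUGHPUT_AUTO"},
)
# shlex.join quotes each argument so the line can be pasted into a POSIX shell.
print(shlex.join(command))
```

The argument list can also be passed directly to `subprocess.run`, which avoids shell quoting altogether.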

> **Note**: In demos, when `--adapter ovms` is used, inference options such as `-nireq`, `-nstreams`, and `-nthreads`, as well as device specification with `-d`, are ignored.

## Running demos with OVMSAdapter

To run a demo with a model served by OpenVINO Model Server, provide the `--adapter ovms` option and set the `-m` parameter to the model inference service instead of the model files. The model parameter for `OVMSAdapter` follows this schema:

`<service_address>/v2/models/<model_name>[/versions/<model_version>[/]]`

- `<service_address>` - OVMS service address in form `<address>:<port>`
- `<model_name>` - name of the target model (the one specified by `model_name` parameter in the model server startup command)
- `<model_version>` _(optional)_ - version of the target model specified in the `/versions/<model_version>` path segment (default: latest)

Assuming that the model server runs on the same machine as the demo, exposes the service on port 8000, and serves a model called `model1`, the value of the `-m` parameter would be:

- `localhost:8000/v2/models/model1` - requesting latest model version
- `localhost:8000/v2/models/model1/versions/2` - requesting model version number 2 (an optional trailing slash, e.g. `/versions/2/`, is also accepted)
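
The schema above can be illustrated with a small parser. This is a sketch under stated assumptions, not the parsing code used by `OVMSAdapter`: the names `ModelTarget` and `parse_model_target` are hypothetical, the version is assumed to be numeric, and the address is assumed to carry no URL scheme.

```python
import re
from typing import NamedTuple, Optional

# <address>:<port>/v2/models/<model_name>[/versions/<model_version>[/]]
_TARGET_RE = re.compile(
    r"^(?P<address>[^/:]+):(?P<port>\d+)"
    r"/v2/models/(?P<name>[^/]+)"
    r"(?:/versions/(?P<version>\d+)/?)?$"
)


class ModelTarget(NamedTuple):
    service_address: str
    model_name: str
    model_version: Optional[int]  # None stands for "latest"


def parse_model_target(target: str) -> ModelTarget:
    """Split a model target string into address, model name, and version."""
    match = _TARGET_RE.match(target)
    if match is None:
        raise ValueError(f"Unexpected OVMS model target: {target!r}")
    version = match.group("version")
    return ModelTarget(
        service_address=f"{match.group('address')}:{match.group('port')}",
        model_name=match.group("name"),
        model_version=int(version) if version is not None else None,
    )


print(parse_model_target("localhost:8000/v2/models/model1"))
print(parse_model_target("localhost:8000/v2/models/model1/versions/2/"))
```

Representing the missing version as `None` keeps "latest" distinct from any concrete version number.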

## See Also

- [OpenVINO Model Server](https://github.com/openvinotoolkit/model_server)