Commit ce59d9a

feat: Add PP-OCRv5 ML backend for text detection and recognition (#859)
1 parent ddf5cd1 commit ce59d9a

14 files changed

Lines changed: 1290 additions & 1 deletion

.github/docker-build-config.yml

Lines changed: 2 additions & 0 deletions
@@ -23,6 +23,8 @@
   - backend_dir_name: mmdetection-3
     backend_tag_prefix: mmdetection3-
     runs_on: ubuntu-latest-4c-16gb
+  - backend_dir_name: ppocr
+    backend_tag_prefix: ppocr-
   - backend_dir_name: nemo_asr
     backend_tag_prefix: nemoasr-
     runs_on: ubuntu-latest-4c-16gb

README.md

Lines changed: 2 additions & 1 deletion
@@ -56,7 +56,8 @@ Check the **Required parameters** column to see if you need to set any additiona
 | [langchain_search_agent](/label_studio_ml/examples/langchain_search_agent) | RAG pipeline with Google Search and [Langchain](https://langchain.com/) |||| OPENAI_API_KEY, GOOGLE_CSE_ID, GOOGLE_API_KEY | Arbitrary |
 | [llm_interactive](/label_studio_ml/examples/llm_interactive) | Prompt engineering with [OpenAI](https://platform.openai.com/), Azure LLMs. |||| OPENAI_API_KEY | Arbitrary |
 | [mmdetection](/label_studio_ml/examples/mmdetection-3) | Object Detection with [OpenMMLab](https://github.com/open-mmlab/mmdetection) |||| None | Arbitrary |
-| [nemo_asr](/label_studio_ml/examples/nemo_asr) | Speech ASR by [NVIDIA NeMo](https://github.com/NVIDIA/NeMo) |||| None | Set (vocabulary and characters) |
+| [nemo_asr](/label_studio_ml/examples/nemo_asr) | Speech ASR by [NVIDIA NeMo](https://github.com/NVIDIA/NeMo) |||| None | Set (vocabulary and characters) |
+| [paddleocr](/label_studio_ml/examples/paddleocr) | OCR with [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR) (PP-OCRv5) |||| None | Set (characters) |
 | [segment_anything_2_image](/label_studio_ml/examples/segment_anything_2_image) | Image segmentation with [SAM 2](https://github.com/facebookresearch/segment-anything-2) |||| None| Arbitrary|
 | [segment_anything_model](/label_studio_ml/examples/segment_anything_model) | Image segmentation by [Meta](https://segment-anything.com/) |||| None | Arbitrary |
 | [sklearn_text_classifier](/label_studio_ml/examples/sklearn_text_classifier) | Text classification with [scikit-learn](https://scikit-learn.org/stable/) |||| None | Arbitrary |
Lines changed: 52 additions & 0 deletions
@@ -0,0 +1,52 @@
+# Git
+.git
+.gitignore
+
+# Python
+__pycache__
+*.py[cod]
+*$py.class
+*.so
+.Python
+build/
+develop-eggs/
+dist/
+downloads/
+eggs/
+.eggs/
+lib/
+lib64/
+parts/
+sdist/
+var/
+wheels/
+*.egg-info/
+.installed.cfg
+*.egg
+
+# Virtual environments
+venv/
+ENV/
+env/
+.venv/
+
+# IDE
+.idea/
+.vscode/
+*.swp
+*.swo
+
+# Test files
+.pytest_cache/
+.coverage
+htmlcov/
+
+# Data directories (should be mounted as volumes)
+data/
+
+# Logs
+*.log
+
+# Local config
+config.json
+.env
Lines changed: 51 additions & 0 deletions
@@ -0,0 +1,51 @@
+# syntax=docker/dockerfile:1
+# PP-OCR ML Backend for Label Studio (CPU version)
+#
+# Uses the official PaddleX Docker image which includes
+# PaddlePaddle and PaddleX pre-installed.
+#
+# Build arguments:
+#   TEST_ENV: Set to "true" to install test dependencies
+
+ARG TEST_ENV
+
+FROM ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlex/paddlex:paddlex3.1.2-paddlepaddle3.0.0-cpu
+
+WORKDIR /app
+
+ENV PYTHONUNBUFFERED=1 \
+    PYTHONDONTWRITEBYTECODE=1 \
+    PORT=9090 \
+    PIP_CACHE_DIR=/.cache \
+    WORKERS=1 \
+    THREADS=8 \
+    DEVICE=cpu
+
+# Install base requirements (label-studio-ml and gunicorn)
+COPY requirements-base.txt .
+RUN --mount=type=cache,target=${PIP_CACHE_DIR},sharing=locked \
+    pip install -r requirements-base.txt
+
+# Install PaddleX OCR extra dependencies
+RUN --mount=type=cache,target=${PIP_CACHE_DIR},sharing=locked \
+    pip install "paddlex[ocr]"
+
+# Install custom requirements (boto3, opencv, etc.)
+COPY requirements.txt .
+RUN --mount=type=cache,target=${PIP_CACHE_DIR},sharing=locked \
+    pip install -r requirements.txt
+
+# Install test requirements if needed
+ARG TEST_ENV
+COPY requirements-test.txt .
+RUN --mount=type=cache,target=${PIP_CACHE_DIR},sharing=locked \
+    if [ "$TEST_ENV" = "true" ]; then \
+      pip install -r requirements-test.txt; \
+    fi
+
+# Copy application code
+COPY . .
+
+EXPOSE 9090
+
+CMD gunicorn --preload --bind :$PORT --workers $WORKERS --threads $THREADS --timeout 0 _wsgi:app
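
Note on the CMD line: it uses shell form, so `$PORT`, `$WORKERS`, and `$THREADS` are expanded at container start from the `ENV` defaults unless overridden at run time. The Python sketch below only illustrates that expansion. `build_gunicorn_command` and `DEFAULTS` are hypothetical names and are not part of this commit.

```python
import shlex

# Defaults copied from the ENV instruction in the CPU Dockerfile.
DEFAULTS = {"PORT": "9090", "WORKERS": "1", "THREADS": "8"}


def build_gunicorn_command(overrides=None):
    """Return the argv that the shell-form CMD would expand to.

    Hypothetical helper for illustration only; `overrides` plays the role
    of variables passed with `docker run -e NAME=value`.
    """
    env = {**DEFAULTS, **(overrides or {})}
    command = (
        f"gunicorn --preload --bind :{env['PORT']} "
        f"--workers {env['WORKERS']} --threads {env['THREADS']} "
        "--timeout 0 _wsgi:app"
    )
    # shlex.split tokenizes the string the way a POSIX shell would.
    return shlex.split(command)


if __name__ == "__main__":
    # Override a single default, leaving PORT and THREADS untouched.
    print(build_gunicorn_command({"WORKERS": "2"}))
```

With no overrides the result binds to port 9090 with one worker and eight threads. In gunicorn, `--timeout 0` disables the worker timeout, and `--preload` loads the application before workers are forked.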
Lines changed: 59 additions & 0 deletions
@@ -0,0 +1,59 @@
+# syntax=docker/dockerfile:1
+# PP-OCR ML Backend for Label Studio (GPU version)
+#
+# Uses the official PaddleX Docker image which includes
+# PaddlePaddle and PaddleX pre-installed.
+#
+# Build arguments:
+#   CUDA_VERSION: cuda11.8 (default) or cuda12.6
+#   TEST_ENV: Set to "true" to install test dependencies
+
+ARG CUDA_VERSION=cuda11.8
+
+# GPU with CUDA 11.8 (default)
+FROM ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlex/paddlex:paddlex3.1.2-paddlepaddle3.0.0-gpu-cuda11.8-cudnn8.9-trt8.6 AS gpu-cuda11.8
+
+# GPU with CUDA 12.6
+FROM ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlex/paddlex:paddlex3.1.2-paddlepaddle3.0.0-gpu-cuda12.6-cudnn9.5-trt10.5 AS gpu-cuda12.6
+
+# Select the appropriate base image
+FROM gpu-${CUDA_VERSION} AS base
+
+ARG TEST_ENV
+
+WORKDIR /app
+
+ENV PYTHONUNBUFFERED=1 \
+    PYTHONDONTWRITEBYTECODE=1 \
+    PORT=9090 \
+    PIP_CACHE_DIR=/.cache \
+    WORKERS=1 \
+    THREADS=8
+
+# Install base requirements (label-studio-ml and gunicorn)
+COPY requirements-base.txt .
+RUN --mount=type=cache,target=${PIP_CACHE_DIR},sharing=locked \
+    pip install -r requirements-base.txt
+
+# Install PaddleX OCR extra dependencies
+RUN --mount=type=cache,target=${PIP_CACHE_DIR},sharing=locked \
+    pip install "paddlex[ocr]"
+
+# Install custom requirements (boto3, opencv, etc.)
+COPY requirements.txt .
+RUN --mount=type=cache,target=${PIP_CACHE_DIR},sharing=locked \
+    pip install -r requirements.txt
+
+# Install test requirements if needed
+COPY requirements-test.txt .
+RUN --mount=type=cache,target=${PIP_CACHE_DIR},sharing=locked \
+    if [ "$TEST_ENV" = "true" ]; then \
+      pip install -r requirements-test.txt; \
+    fi
+
+# Copy application code
+COPY . .
+
+EXPOSE 9090
+
+CMD gunicorn --preload --bind :$PORT --workers $WORKERS --threads $THREADS --timeout 0 _wsgi:app
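
Note on base image selection: the GPU Dockerfile declares `ARG CUDA_VERSION` before the first `FROM`, names one stage per CUDA variant, and then picks one with `FROM gpu-${CUDA_VERSION} AS base`. The Python sketch below mirrors that lookup to show which image each supported value resolves to. `resolve_base_image` and `GPU_IMAGES` are hypothetical names used only for illustration, and the error branch is an assumption about how an unsupported value should be treated rather than a description of Docker's exact behavior.

```python
# Registry path and tags copied from the two FROM lines in the GPU Dockerfile.
REGISTRY = "ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlex/paddlex"
GPU_IMAGES = {
    "cuda11.8": f"{REGISTRY}:paddlex3.1.2-paddlepaddle3.0.0-gpu-cuda11.8-cudnn8.9-trt8.6",
    "cuda12.6": f"{REGISTRY}:paddlex3.1.2-paddlepaddle3.0.0-gpu-cuda12.6-cudnn9.5-trt10.5",
}


def resolve_base_image(cuda_version="cuda11.8"):
    """Return the image behind the stage named gpu-<cuda_version>.

    Hypothetical helper; the default matches `ARG CUDA_VERSION=cuda11.8`.
    """
    try:
        return GPU_IMAGES[cuda_version]
    except KeyError:
        supported = ", ".join(sorted(GPU_IMAGES))
        raise ValueError(
            f"unsupported CUDA_VERSION {cuda_version!r}; expected one of: {supported}"
        ) from None


if __name__ == "__main__":
    # Corresponds to building with --build-arg CUDA_VERSION=cuda12.6.
    print(resolve_base_image("cuda12.6"))
```

Only the two values listed in the Dockerfile header comment have a matching stage, so any other `CUDA_VERSION` would not be expected to build.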