[Feat] Add PP-OCRv6 iOS Demo by Bobholamovic · Pull Request #17933 · PaddlePaddle/PaddleOCR

Bobholamovic · 2026-04-16T15:44:42Z

No description provided.

- Implement ClipperOffset class with addPath/execute API
- Support JT_ROUND + ET_CLOSEDPOLYGON for DB postprocessing
- Add static offsetPolygon() convenience for DBPostProcess
- Arc tolerance calculation matches pyclipper's Clipper 6.x
- Pure Swift, no external dependencies
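For context on how the offset is typically used in DB postprocessing: the Python DBPostProcess derives the expansion distance from the polygon's area and perimeter (`area * unclip_ratio / perimeter`) and passes it to the offsetter. The sketch below shows only that distance computation. `Vertex`, `polygonArea`, `polygonPerimeter`, and `unclipDistance` are hypothetical names for illustration and are not taken from this PR.

```swift
import Foundation

/// A 2D vertex. Hypothetical type used only for this sketch.
struct Vertex {
    var x: Double
    var y: Double
}

/// Absolute polygon area via the shoelace formula.
func polygonArea(_ polygon: [Vertex]) -> Double {
    guard polygon.count >= 3 else { return 0 }
    var twiceArea = 0.0
    for i in polygon.indices {
        let a = polygon[i]
        let b = polygon[(i + 1) % polygon.count]
        twiceArea += a.x * b.y - b.x * a.y
    }
    return abs(twiceArea) / 2
}

/// Sum of edge lengths of the closed polygon.
func polygonPerimeter(_ polygon: [Vertex]) -> Double {
    guard polygon.count >= 2 else { return 0 }
    var total = 0.0
    for i in polygon.indices {
        let a = polygon[i]
        let b = polygon[(i + 1) % polygon.count]
        total += hypot(b.x - a.x, b.y - a.y)
    }
    return total
}

/// Offset distance used to expand ("unclip") a shrunken text polygon.
/// Returns 0 for degenerate input instead of dividing by zero.
func unclipDistance(_ polygon: [Vertex], unclipRatio: Double) -> Double {
    let perimeter = polygonPerimeter(polygon)
    guard perimeter > 0 else { return 0 }
    return polygonArea(polygon) * unclipRatio / perimeter
}
```

For a 10 x 10 square and an unclip ratio of 1.5, the distance is 100 * 1.5 / 40 = 3.75. The resulting value would be the delta handed to a round-join polygon offset.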

- Add Yams ~> 5.0 pod to Podfile for inference.yml parsing
- Create InferenceConfig.swift with typed parsing of inference.yml
- Support TransformOp enum: DetResizeForTest, NormalizeImage, ToCHWImage, RecResizeImg
- PostProcessConfig handles both det (DBPostProcess) and rec (CTCLabelDecode) configs
- Python-style scale string '1./255.' parsed via string splitting (not eval)
- Register InferenceConfig.swift in Xcode project
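The scale-string item can be illustrated with a minimal sketch. It assumes the value is either a plain number or a single `numerator/denominator` pair; `parseScale` is a hypothetical name, and the PR's actual parser may differ.

```swift
import Foundation

/// Parses a Python-style scale literal such as "1./255." or "0.5"
/// by splitting on "/" instead of evaluating it as code.
/// Returns nil for anything that is not a number or a single fraction.
func parseScale(_ text: String) -> Float? {
    // Convert one numeric token; "1." is padded to "1.0" so the
    // conversion does not depend on how trailing dots are handled.
    func number(_ token: String) -> Float? {
        let trimmed = token.trimmingCharacters(in: .whitespaces)
        guard !trimmed.isEmpty else { return nil }
        return Float(trimmed.hasSuffix(".") ? trimmed + "0" : trimmed)
    }

    let parts = text.split(separator: "/", omittingEmptySubsequences: false).map(String.init)
    switch parts.count {
    case 1:
        return number(parts[0])
    case 2:
        guard let numerator = number(parts[0]),
              let denominator = number(parts[1]),
              denominator != 0 else { return nil }
        return numerator / denominator
    default:
        return nil
    }
}
```

With this sketch, `parseScale("1./255.")` yields approximately 0.0039216, and malformed input such as `"1/2/3"` or `"abc"` yields `nil` rather than crashing.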

- Create Preprocessing.swift with DetPreprocessor and PreprocessResult
- Port DetResizeForTest: resize longest side to resize_long, ceil to 128 stride
- Port NormalizeImage: config-driven scale/mean/std normalization
- Port ToCHWImage: HWC-to-CHW layout conversion producing [1,3,H,W] tensor
- Image padding for tiny images (h+w < 64) matching Python reference
- Pure Swift using CoreGraphics + Accelerate (no OpenCV dependency)
- All transform parameters read from InferenceConfig (zero hardcoded values)
- Register Preprocessing.swift in Xcode project
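A rough sketch of two of the steps listed above: computing the resize target (longest side scaled to `resize_long`, each side rounded up to a multiple of the stride) and the combined normalize plus HWC-to-CHW conversion. It uses plain loops rather than Accelerate for brevity, and `detTargetSize` and `normalizeToCHW` are hypothetical names, not the PR's API. The truncate-then-round-up order is an assumption about the reference behavior.

```swift
import Foundation

/// Target size for detection input: scale so the longest side equals
/// `resizeLong`, then round each side up to a multiple of `stride`.
func detTargetSize(height: Int, width: Int, resizeLong: Int,
                   stride: Int = 128) -> (height: Int, width: Int) {
    let ratio = Double(resizeLong) / Double(max(height, width))
    func roundUp(_ value: Int) -> Int {
        // Integer ceiling to the next multiple of stride, at least one stride.
        max(stride, (value + stride - 1) / stride * stride)
    }
    return (roundUp(Int(Double(height) * ratio)), roundUp(Int(Double(width) * ratio)))
}

/// Applies (pixel * scale - mean) / std per channel and converts a packed
/// RGB buffer in HWC order into a planar CHW Float buffer.
func normalizeToCHW(hwc: [UInt8], height: Int, width: Int,
                    scale: Float, mean: [Float], std: [Float]) -> [Float] {
    let channels = 3
    precondition(hwc.count == height * width * channels, "expected packed RGB data")
    precondition(mean.count == channels && std.count == channels)
    var chw = [Float](repeating: 0, count: hwc.count)
    for c in 0..<channels {
        for y in 0..<height {
            for x in 0..<width {
                let value = Float(hwc[(y * width + x) * channels + c])
                chw[(c * height + y) * width + x] = (value * scale - mean[c]) / std[c]
            }
        }
    }
    return chw
}
```

For a 720 x 1280 image with `resizeLong` 960, the scaled size 540 x 960 rounds up to 640 x 1024. The CHW buffer corresponds to a `[1, 3, H, W]` tensor once a batch dimension of 1 is attached.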

- Add DBPostProcessor with full pipeline: threshold -> contours -> minAreaRect -> score -> expand -> scale
- Implement Suzuki-Abe contour finding with CHAIN_APPROX_SIMPLE compression
- Implement rotating calipers minAreaRect + convex hull (Andrew's monotone chain)
- Add scanline polygon fill for box_score_fast computation
- Integrate ClipperOffset for polygon expansion (unclip)
- All parameters configurable via DBPostProcessConfigurable protocol
- Pure Swift, no OpenCV dependency
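Of the steps listed above, the convex hull is the easiest to show in isolation. The following is a generic sketch of Andrew's monotone chain, which is the usual input to a rotating-calipers minimum-area rectangle. `HullPoint`, `cross`, and `convexHull` are hypothetical names, and the PR's implementation may be organized differently.

```swift
import Foundation

/// A 2D point. Hypothetical type used only for this sketch.
struct HullPoint: Equatable {
    var x: Double
    var y: Double
}

/// Z component of the cross product (a - o) x (b - o).
/// Positive when o -> a -> b turns counter-clockwise.
func cross(_ o: HullPoint, _ a: HullPoint, _ b: HullPoint) -> Double {
    (a.x - o.x) * (b.y - o.y) - (a.y - o.y) * (b.x - o.x)
}

/// Andrew's monotone chain. Returns hull vertices in counter-clockwise
/// order (in a y-up coordinate system), dropping collinear points.
func convexHull(_ points: [HullPoint]) -> [HullPoint] {
    let sorted = points.sorted { $0.x != $1.x ? $0.x < $1.x : $0.y < $1.y }
    guard sorted.count > 2 else { return sorted }

    // Builds one half of the hull, popping points that do not turn left.
    func halfHull<S: Sequence>(_ sequence: S) -> [HullPoint] where S.Element == HullPoint {
        var chain: [HullPoint] = []
        for p in sequence {
            while chain.count >= 2,
                  cross(chain[chain.count - 2], chain[chain.count - 1], p) <= 0 {
                chain.removeLast()
            }
            chain.append(p)
        }
        chain.removeLast() // the last point is the first point of the other half
        return chain
    }

    return halfHull(sorted) + halfHull(sorted.reversed())
}
```

For the corners of a 2 x 2 square plus an interior point and an edge midpoint, only the four corners remain.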

…xproj UUID conflicts

- Add runDetection(inputData:shape:) for real inference with preprocessed data
- Returns output tensors as [String: (data: [Float], shape: [Int])] dictionary
- Includes NaN validation on output tensors
- All existing methods (loadModels, validateDetModel, validateRecModel) preserved unchanged
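As an illustration of the kind of check the NaN-validation item describes, here is a minimal sketch over the `(data, shape)` tuple layout named above. `isValidTensor` and `firstInvalidOutput` are hypothetical helpers; the element-count check is an extra assumption and not something the commit message states.

```swift
import Foundation

/// Returns true when the tensor has positive dimensions, an element count
/// matching its shape, and no NaN values.
func isValidTensor(data: [Float], shape: [Int]) -> Bool {
    guard !shape.isEmpty, shape.allSatisfy({ $0 > 0 }) else { return false }
    let expectedCount = shape.reduce(1, *)
    guard data.count == expectedCount else { return false }
    return !data.contains(where: { $0.isNaN })
}

/// Name of the first output that fails validation, or nil if all pass.
/// Keys are visited in sorted order so the result is deterministic.
func firstInvalidOutput(_ outputs: [String: (data: [Float], shape: [Int])]) -> String? {
    for name in outputs.keys.sorted() {
        if let tensor = outputs[name], !isValidTensor(data: tensor.data, shape: tensor.shape) {
            return name
        }
    }
    return nil
}
```

A caller could run `firstInvalidOutput` on the dictionary returned by inference and raise an error that names the offending tensor.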

…line

- DetectionEngine wires DetPreprocessor -> ORTSessionManager.runDetection -> DBPostProcessor
- DetectionResult struct with boxes and per-stage timing metrics (preprocess/inference/postprocess)
- All parameters loaded from inference.yml via InferenceConfig.load()
- PostProcessConfig conforms to DBPostProcessConfigurable for type-safe init bridging
- DetectionEngineError for noOutputTensor and unexpectedOutputShape cases
- Registered in Xcode project pbxproj

- SUMMARY.md documenting DetectionEngine pipeline integration
- STATE.md updated with position, decisions, metrics
- REQUIREMENTS.md: POST-01, POST-02 marked complete

…rical exactness

…rence method

- Extract private runInference() from runDetection to eliminate code duplication
- Add public runRecognition() method that uses recSession for recognition model inference
- Both runDetection and runRecognition guard their respective sessions and delegate to runInference
- Recognition model supports dynamic-width input tensors [1, 3, 48, W]

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Implement OCRResizeNormImg algorithm matching PaddleX text_recognition/processors.py
- Read imgC/imgH/imgW from inference.yml RecResizeImg.image_shape (config-driven, not hardcoded)
- Aspect-ratio-aware resize with ceil() width computation matching Python math.ceil()
- Recognition normalization: pixel/127.5 - 1.0 mapping [0,255] to [-1,1] (not ImageNet mean/std)
- HWC-to-CHW transpose and right-pad with zeros to target width
- Bilinear interpolation via CGContext (pure Swift, no OpenCV)
- Register RecPreprocessor.swift in Xcode project (PBXBuildFile, PBXFileReference, PBXGroup, Sources)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
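The width computation and the normalization with right-padding described above can be sketched as follows. It assumes the image has already been resized and transposed to CHW, and that the resized width is capped at a maximum width. `recResizedWidth` and `normalizeAndPad` are hypothetical names rather than the PR's API.

```swift
import Foundation

/// Width after resizing to `targetHeight` while keeping the aspect ratio,
/// rounded up and capped at `maxWidth`.
func recResizedWidth(srcWidth: Int, srcHeight: Int,
                     targetHeight: Int, maxWidth: Int) -> Int {
    let ratio = Double(srcWidth) / Double(srcHeight)
    let scaled = Int(ceil(Double(targetHeight) * ratio))
    return min(max(scaled, 1), maxWidth)
}

/// Maps pixels from [0, 255] to [-1, 1] via pixel / 127.5 - 1 and copies
/// each row into a zero-filled buffer of width `paddedWidth`.
/// Input and output are planar CHW.
func normalizeAndPad(chw: [UInt8], channels: Int, height: Int,
                     width: Int, paddedWidth: Int) -> [Float] {
    precondition(chw.count == channels * height * width, "unexpected input size")
    precondition(paddedWidth >= width, "padded width must not be smaller than width")
    var output = [Float](repeating: 0, count: channels * height * paddedWidth)
    for c in 0..<channels {
        for y in 0..<height {
            for x in 0..<width {
                let source = (c * height + y) * width + x
                let destination = (c * height + y) * paddedWidth + x
                output[destination] = Float(chw[source]) / 127.5 - 1.0
            }
        }
    }
    return output
}
```

A 100 x 32 crop resized to height 48 gets width ceil(48 * 3.125) = 150. Padding positions stay at 0, which is the midpoint of the normalized range rather than black.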

- Create 03-01-SUMMARY.md documenting plan execution
- Update STATE.md: advance to Phase 3 Plan 1 complete, add decisions
- Update ROADMAP.md: Phase 3 progress 1/2
- Update REQUIREMENTS.md: mark PREP-04 and PREP-05 as complete

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…racter dictionary

- CTCDecoder struct ported from ppocr/postprocess/rec_postprocess.py CTCLabelDecode
- Reads character_dict from inference.yml PostProcess config
- Prepends blank token at index 0 (CTC convention)
- Decoding: argmax + consecutive duplicate removal + blank filtering + char mapping
- Confidence: mean probability of selected timesteps
- Registered in Xcode project (Engine group)
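The decoding steps listed above amount to greedy CTC decoding. A minimal sketch, assuming the model output has already been reshaped into one probability row per timestep; `ctcGreedyDecode` is a hypothetical name, and returning 0 confidence for empty output is an assumption.

```swift
import Foundation

/// Greedy CTC decoding: per-timestep argmax, collapse consecutive repeats,
/// drop the blank (index 0), and map the remaining indices to characters.
/// `characters` is the dictionary without the blank; it is prepended here.
func ctcGreedyDecode(probabilities: [[Float]],
                     characters: [String]) -> (text: String, confidence: Float) {
    let vocabulary = [""] + characters // index 0 is the CTC blank
    var text = ""
    var scores: [Float] = []
    var previous = -1

    for step in probabilities {
        guard let best = step.indices.max(by: { step[$0] < step[$1] }) else { continue }
        defer { previous = best } // always remember the last argmax, even when skipped
        if best == 0 || best == previous { continue }
        guard best < vocabulary.count else { continue }
        text += vocabulary[best]
        scores.append(step[best])
    }

    // Mean probability of the timesteps that produced a character.
    let confidence = scores.isEmpty ? 0 : scores.reduce(0, +) / Float(scores.count)
    return (text, confidence)
}
```

Because a blank resets the duplicate check, the sequence `a, a, blank, a, b` decodes to "aab", not "ab".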

…pipeline

- RecognitionEngine class mirrors DetectionEngine pattern
- Composes RecPreprocessor + ORTSessionManager.runRecognition + CTCDecoder
- Returns RecognitionEngineResult with text, confidence, and per-stage timing
- Config-driven: reads inference.yml for preprocessing dims and character dictionary
- Registered in Xcode project (Engine group)

timminator · 2026-04-19T23:24:51Z

Hi! Sorry to bother you. I saw the upcoming PRs for a PP-OCRv6 model, but no changes have been made to the general OCR pipeline yet. Will the pipeline also get an update, or does PP-OCRv6 have a different use case from PP-OCRv5?
Thank you for your time!

Bobholamovic · 2026-04-20T02:02:19Z

> Hi! Sorry to bother you. I saw the upcoming PRs for a PP-OCRv6 model, but no changes have been made to the general OCR pipeline yet. Will the pipeline also get an update, or does PP-OCRv6 have a different use case from PP-OCRv5? Thank you for your time!

PP-OCRv6 is still under development. It is expected to be used in the same way as PP-OCRv5, namely through the general OCR pipeline.

…into feat/ios

Per review feedback from changdazhou on PR PaddlePaddle#17820 (L26), update the CUDA 12.6 Docker GPU line to require driver >= 550.54.14, matching the pip section already at L61 (both ZH and EN).

Signed-off-by: Bvicii <yizhanhuang2002@gmail.com>

Per follow-up review on PR PaddlePaddle#17820: from a completeness standpoint, None belongs in the "Supports ..." enumeration rather than only in the trailing clarification sentence. Move None into the list as the default value and tighten the follow-on sentence accordingly.

- EN: "Supports None (the default), paddle, paddle_static, paddle_dynamic, and transformers. When left as None, PaddleOCR preserves the behavior of earlier versions..."
- ZH: "支持 None（默认值）、paddle、paddle_static、paddle_dynamic、transformers。保持为默认值 None 时..."

Applied to all three supported-value variants across the module_usage and pipeline_usage pages (the same 48 files / 66 rows as the previous clarification commit).

Signed-off-by: Bvicii <yizhanhuang2002@gmail.com>

Resolves conflict in docs/version3.x/pipeline_usage/PaddleOCR-VL.md:

- Accept upstream refactor of CLI and Python instantiation parameter tables from HTML to markdown pipe-table format.
- Preserve the {#流程导览} anchor on the "流程导览" heading (needed for mkdocs bilingual link check).
- Re-apply the engine-row clarification (None as default + legacy behavior note) to the two engine rows in the new pipe-tables.

Incoming commits:

- a874bcb Optimize docs
- 85275d4 Update docs

Signed-off-by: Bvicii <yizhanhuang2002@gmail.com>

docs: release-review fixes for 3.5 docs

… into feat/ios

… feat/ios

…ary (PaddlePaddle#17954) (PaddlePaddle#17961)

* ci,docs: align PaddleX install branch and document py3.8 extras boundary

- CI: derive the PaddleX install branch from the paddlex constraint in pyproject.toml (release/X.Y) so PR/GPU tests stay in sync as the paddleocr series advances; apply to both tests.yml and test_gpu.yml
- CI: install only paddleocr[doc2md] on py3.8 since several paddlex transitive deps require py3.9+; add a py38_incompatible pytest marker and gate affected tests behind it
- CI: pin paddlepaddle==3.0.0 on py3.8 / 3.1.0 on py3.9+ (match GPU CI)
- CI: standardize workflow filenames (.yaml -> .yml, dashes -> underscores)
- deps: pin lmdb<1.5 on py3.8 (newer lmdb wheels reference Py_SET_REFCNT, a py3.9 stdlib C API)
- docs: note Python 3.8+ for base paddleocr/doc2md; py3.9+ for doc-parser/ie/trans/all extras (safetensors>=0.7 dropped py3.8)
- docs: update PaddleOCR-VL manual-install Python range to 3.9-3.13 across all hardware variants (VL pipelines use doc-parser)
- tests: tag tests that require py3.9+ extras/deps with py38_incompatible

* ci: mark /workspace as git safe.directory in GPU runner

setuptools_scm runs git to derive the package version during pip install -e .; inside the GPU CI container /workspace is owned by the host user, which trips git's dubious-ownership check and aborts the paddleocr install.

---------

(cherry picked from commit 09e8700)

Signed-off-by: Bvicii <yizhanhuang2002@gmail.com>

…ib .npy

Made-with: Cursor

Bobholamovic and others added 30 commits March 16, 2026 19:13

Update for paddleocr 3.5

0722c5f

Fix doc

154294d

Add doc to nav

0949998

Refine docs

d8f0bea

Fix PaddleOCR-VL doc

60be983

Polish PaddleOCR-VL docs

d38826e

Polish docs

80b6a70

Add notice

ec9ff9d

Polish doc

11531a3

Merge branch 'main' into feat/engine

0deb436

Merge branch 'main' into feat/engine

ab6ddfd

docs: initialize project

a5092db

Init

f6c3fcd

merge: wave 1 plan 02-02 (ClipperOffset + DBPostProcess) — resolve pb…

8b5b75d

…xproj UUID conflicts

docs(02-02): create SUMMARY.md for ClipperOffset + DBPostProcess plan

9b46d77

docs(02-03): complete detection engine integration plan

a690e75

- SUMMARY.md documenting DetectionEngine pipeline integration
- STATE.md updated with position, decisions, metrics
- REQUIREMENTS.md: POST-01, POST-02 marked complete

docs(phase-02): verification complete — human_needed for runtime nume…

cac1575

…rical exactness

docs(phase-02): complete phase execution

b237e81

docs(phase-02): evolve PROJECT.md after phase completion

774adc8

Bobholamovic and others added 28 commits April 20, 2026 10:04

Remove planning files

4e1dcc6

Merge branch 'main' into feat/ios

4121743

Fix code style

4268187

Merge branch 'feat/ios' of https://github.com/Bobholamovic/PaddleOCR …

1b1deb2

…into feat/ios

Optimize docs

a874bcb

Update docs

85275d4

Merge pull request #15 from scyyh11/docs-fix-17820

8de3204

docs: release-review fixes for 3.5 docs

Merge branch 'feat/engine' of https://github.com/Bobholamovic/PaddleOCR…

039b807

… into feat/ios

Fix benchmark

9feee8d

Merge branch 'main' of https://github.com/PaddlePaddle/PaddleOCR into…

4695592

… feat/ios

Add icon and fix validation

654946b

Add quantization

b3b7580

chore: carry release/3.5 tag lineage

3efa726

docs(ios_demo): add build_onnx_calib_npy.py and README for static cal…

3bd8174

…ib .npy

Made-with: Cursor

Enhance quantization and support ort format

6ac0c02

Fix skills

aa53cc0

Merge branch 'main' into feat/ios

61a326d

Fix installation docs

57418a4

Enhance benchmark

b247fc4

Use release config

362c084

By default use CPU

a793126

Support setting session options

c08cef0

Enhance UI

74defeb

Update benchmark docs

9f2885e

[Feat] Add PP-OCRv6 iOS Demo #17933
Bobholamovic wants to merge 126 commits into PaddlePaddle:main from Bobholamovic:feat/ios

4 participants
