mem: reduce PaddleOCR rec_batch_num from 6 to 1

KRRT7 · KRRT7 · commit 0f2e7c05494b · 2026-03-27T13:41:30.000-05:00
Paddle's native inference engine allocates its memory arena in 500 MiB
chunks during text recognition, and the number of chunks grows with
batch size. With the default rec_batch_num=6, four 500 MiB chunks are
allocated simultaneously.

Setting rec_batch_num=1 reduces this to a single chunk, cutting peak
memory on the PaddleOCR code path by ~1,265 MiB (-42.6%).

Latency benchmark (55 text regions, CPU, 5 runs):
- rec_batch_num=6: 39.1s +/- 3.5s
- rec_batch_num=1: 37.0s +/- 2.0s
No throughput regression: on CPU, batch processing is sequential.
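
The latency figures above are a mean and spread over 5 runs. A minimal sketch of how such a comparison could be timed follows. It is not the harness used for this commit: `benchmark` and `fake_recognition` are hypothetical names, and the stand-in workload would need to be replaced with a real OCR call on the 55-region input.

```python
import statistics
import time
from typing import Callable, Tuple


def benchmark(fn: Callable[[], object], runs: int = 5) -> Tuple[float, float]:
    """Return (mean, sample standard deviation) of wall-clock seconds over `runs` calls."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    # The sample standard deviation needs at least two data points.
    spread = statistics.stdev(durations) if runs > 1 else 0.0
    return statistics.mean(durations), spread


def fake_recognition() -> None:
    """Hypothetical stand-in for one OCR pass; swap in the real call."""
    time.sleep(0.01)


if __name__ == "__main__":
    mean, spread = benchmark(fake_recognition, runs=5)
    print(f"{mean:.3f}s +/- {spread:.3f}s")
```

To compare the two settings, run the same function once against an agent built with rec_batch_num=6 and once with rec_batch_num=1, on the same input.
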
diff --git a/unstructured/partition/utils/ocr_models/paddle_ocr.py b/unstructured/partition/utils/ocr_models/paddle_ocr.py
@@ -48,6 +48,7 @@ def load_agent(self, language: str):
                 lang=language,
                 enable_mkldnn=True,
                 show_log=False,
+                rec_batch_num=1,
             )
         except AttributeError:
             paddle_ocr = PaddleOCR(
@@ -56,6 +57,7 @@ def load_agent(self, language: str):
                 lang=language,
                 enable_mkldnn=False,
                 show_log=False,
+                rec_batch_num=1,
             )
         return paddle_ocr
 
