Commit 71c5762
Dhanush Varma
fix: resolve TESSDATA_PREFIX path correctly for all Tesseract versions
Two bugs in init_ocr() in ocr.c:
1. The Tesseract 4/5 branch unconditionally appended '/tessdata' to the
path returned by probe_tessdata_location(). If TESSDATA_PREFIX was
already set to a path ending in 'tessdata/', this caused a double
append, e.g. '/usr/share/tessdata/tessdata'.
2. The legacy Tesseract <4 branch passed tessdata_path raw to
TessBaseAPIInit4 without appending 'tessdata' at all, causing
Tesseract to look for eng.traineddata directly in e.g. '/usr/share/'
instead of '/usr/share/tessdata/'.
Fix: normalize the path once before both branches. Detect whether the
returned path already ends with 'tessdata' or 'tessdata/', handle
Windows backslash separators, and use the resolved path in both
Tesseract version branches. Add mprint diagnostic for the resolved path.
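
The normalization described above could look roughly like the sketch below. This is an illustration, not the code from the commit: the helper names ends_with_tessdata() and resolve_tessdata_dir() are hypothetical, and the choices to return the path without a trailing separator and to always join with '/' are assumptions.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the last path component is exactly
 * "tessdata", ignoring trailing '/' or '\\' separators. */
static int ends_with_tessdata(const char *path)
{
    size_t len = strlen(path);
    const size_t n = strlen("tessdata");

    /* Skip trailing separators (POSIX or Windows style). */
    while (len > 0 && (path[len - 1] == '/' || path[len - 1] == '\\'))
        len--;
    if (len < n || strncmp(path + len - n, "tessdata", n) != 0)
        return 0;
    /* Require a whole component, so "/opt/mytessdata" does not match. */
    return len == n || path[len - n - 1] == '/' || path[len - n - 1] == '\\';
}

/* Hypothetical helper: writes the directory that should contain the
 * *.traineddata files into out. Returns 0 on success, -1 on bad input
 * or truncation. */
int resolve_tessdata_dir(const char *base, char *out, size_t out_size)
{
    size_t len;
    int written;

    if (!base || !out || out_size == 0 || base[0] == '\0')
        return -1;

    /* Drop trailing separators so the join below never doubles them. */
    len = strlen(base);
    while (len > 0 && (base[len - 1] == '/' || base[len - 1] == '\\'))
        len--;

    if (ends_with_tessdata(base))
        written = snprintf(out, out_size, "%.*s", (int)len, base);
    else
        /* Assumption: '/' is accepted as a separator on Windows too. */
        written = snprintf(out, out_size, "%.*s/tessdata", (int)len, base);

    return (written < 0 || (size_t)written >= out_size) ? -1 : 0;
}
```

With a helper of this shape, both version branches can pass the same resolved buffer to TessBaseAPIInit4, so neither the double append nor the missing append can occur.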
Fixes #14921
Parent: 395f9b3
1 file changed
Lines changed: 29 additions & 3 deletions