Description
useOCR().forward(image) returns detected text boxes sorted top-to-bottom, which is correct for the line order. But within a single row, boxes are not sorted left-to-right — they come out in roughly descending-side-length order, which (because longer detections tend to be the cleaner ones) reads as "sorted by probability."
Where it happens
packages/react-native-executorch/common/rnexecutorch/models/ocr/utils/DetectorUtils.cpp — groupTextBoxes:
The merge pass at line 544 sorts boxes descending by maxSideLength:
    std::ranges::sort(boxes, [](const types::DetectorBBox &lhs,
                                const types::DetectorBBox &rhs) {
      return maxSideLength(lhs.bbox) > maxSideLength(rhs.bbox);
    });
The final sort at line 605 then only compares the top-Y coordinate:
    std::ranges::sort(mergedVec, [](const auto &obj1, const auto &obj2) {
      const float minY1 = minimumYFromBox(obj1.bbox);
      const float minY2 = minimumYFromBox(obj2.bbox);
      return minY1 < minY2;
    });
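The comparator has no horizontal tie-breaker. For boxes with equal minY (i.e. same visual row), the order falls back to whatever the sort leaves behind; std::ranges::sort is not stable, but in practice this follows mergedVec insertion order, which is the side-length order from earlier. Boxes whose minY differs only slightly are ordered by that small Y difference rather than by X. Result: longer boxes tend to appear first within a row.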
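The interaction of the two sorts can be modelled in isolation. This is only a sketch: Detection and currentOrder are hypothetical names, the merging itself is left out, and std::stable_sort stands in for the second pass purely to make the tie behaviour deterministic.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for a detection; not the real types::DetectorBBox.
struct Detection {
  std::string text;
  float minX;
  float minY;
  float maxSide;
};

// Models the two passes described above: order by descending side length,
// then order by top-Y only.
inline std::vector<Detection> currentOrder(std::vector<Detection> boxes) {
  std::sort(boxes.begin(), boxes.end(),
            [](const Detection &a, const Detection &b) {
              return a.maxSide > b.maxSide;
            });
  // stable_sort is used only to make ties deterministic in this model;
  // the real code calls std::ranges::sort, which gives no such guarantee.
  std::stable_sort(boxes.begin(), boxes.end(),
                   [](const Detection &a, const Detection &b) {
                     return a.minY < b.minY;
                   });
  return boxes;
}
```

For a row that reads ID, A-1234567, OK with identical minY, this model returns A-1234567, OK, ID: the side-length order survives the top-Y sort untouched.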
Reproduction
- Open the computer-vision demo → OCR screen.
- Pick a photo with multiple short tokens on the same line (e.g. a label with Price 12.99 USD).
- Inspect the results list: boxes within the row come out re-ordered, not left-to-right.
Suggested fix
Row-bucket then sort by X. Two viable shapes:
- Soft compare: in the final sort comparator, treat |minY1 - minY2| < yThresh as the same row and fall back to comparing minX. yThresh can be a fraction of the average detected box height.
- Explicit grouping: group boxes whose Y-extents overlap by ≥ N% into rows, sort rows by top-Y, then sort each row by min-X.
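One caveat on the soft compare: a tolerance-based comparator is not transitive (A and B can share a row, B and C can share a row, while A and C do not), so it is not a strict weak ordering and should not be passed directly to std::ranges::sort. The explicit grouping avoids that. Below is a minimal sketch of it, assuming a hypothetical axis-aligned Box struct rather than the real types::DetectorBBox. The names sortReadingOrder, verticalOverlapRatio and minOverlap are illustrative, and measuring overlap against the first (topmost) box of each row is one possible choice, not necessarily the best one.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical axis-aligned summary of a detection; the real
// types::DetectorBBox layout is not assumed here.
struct Box {
  float minX, minY, maxX, maxY;
};

// Fraction of the shorter box's height covered by the vertical overlap.
inline float verticalOverlapRatio(const Box &a, const Box &b) {
  const float overlap = std::min(a.maxY, b.maxY) - std::max(a.minY, b.minY);
  const float shorter = std::min(a.maxY - a.minY, b.maxY - b.minY);
  return shorter > 0.0f ? std::max(overlap, 0.0f) / shorter : 0.0f;
}

// Reading order: rows top-to-bottom, boxes left-to-right within a row.
// minOverlap is an assumed tuning parameter (the "N%" above), e.g. 0.5f.
inline std::vector<Box> sortReadingOrder(std::vector<Box> boxes,
                                         float minOverlap) {
  // 1. Plain top-Y sort; this comparator is a valid strict weak ordering.
  std::sort(boxes.begin(), boxes.end(),
            [](const Box &a, const Box &b) { return a.minY < b.minY; });

  // 2. Sweep once; a row ends when the next box no longer overlaps the
  //    row's first box enough, or when the input is exhausted.
  std::size_t rowStart = 0;
  for (std::size_t i = 1; i <= boxes.size(); ++i) {
    const bool rowEnds =
        i == boxes.size() ||
        verticalOverlapRatio(boxes[rowStart], boxes[i]) < minOverlap;
    if (!rowEnds) {
      continue;
    }
    // 3. Sort the finished row by its left edge.
    std::sort(boxes.begin() + static_cast<std::ptrdiff_t>(rowStart),
              boxes.begin() + static_cast<std::ptrdiff_t>(i),
              [](const Box &a, const Box &b) { return a.minX < b.minX; });
    rowStart = i;
  }
  return boxes;
}
```

Because every comparator here looks at a single coordinate, each sort stays well-defined; the tolerance only decides where one row ends and the next begins.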
Notes
main— unrelated to PR feat(constants)!: switch URLs to v0.9.0 layout + add MODEL_REGISTRY #1148 (model registry).