OCR: boxes within a row are not sorted left-to-right #1159

@msluszniak

Description

useOCR().forward(image) returns detected text boxes sorted top-to-bottom, which is correct for line order. Within a single row, however, boxes are not sorted left-to-right: they come out in roughly descending side-length order. Because longer detections tend to be the cleaner ones, this reads as "sorted by probability."

Where it happens

packages/react-native-executorch/common/rnexecutorch/models/ocr/utils/DetectorUtils.cpp, in groupTextBoxes:

The merge pass at line 544 sorts boxes descending by maxSideLength:

std::ranges::sort(boxes, [](const types::DetectorBBox &lhs,
                            const types::DetectorBBox &rhs) {
  return maxSideLength(lhs.bbox) > maxSideLength(rhs.bbox);
});

The final sort at line 605 then only compares the top-Y coordinate:

std::ranges::sort(mergedVec, [](const auto &obj1, const auto &obj2) {
  const float minY1 = minimumYFromBox(obj1.bbox);
  const float minY2 = minimumYFromBox(obj2.bbox);
  return minY1 < minY2;
});

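For boxes with near-equal minY (i.e. the same visual row), the comparator has no horizontal tie-break, so the order is decided by small Y differences or, for exact ties, by whatever order the sort leaves behind (std::ranges::sort is not guaranteed to be stable). In practice this follows mergedVec insertion order, which is the side-length order from earlier. Result: longer boxes appear first within a row.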

Reproduction

  1. Open the computer-vision demo → OCR screen.
  2. Pick a photo with multiple short tokens on the same line (e.g. a label with Price 12.99 USD).
  3. Inspect the results list — boxes within the row come out re-ordered, not left-to-right.

Suggested fix

Row-bucket then sort by X. Two viable shapes:

  • Soft compare: in the final sort comparator, treat |minY1 - minY2| < yThresh as same row and fall back to comparing minX. yThresh can be a fraction of the average detected box height.
  • Explicit grouping: group boxes whose Y-extents overlap by ≥ N% into rows, sort rows by top-Y, then sort each row by min-X.
