Summary
Currently, navigating between detection boxes in PPOCRLabel requires individual mouse clicks on each box or list entry. For annotation-heavy workflows (correcting OCR output on dense documents), this creates significant friction.
Proposed shortcuts:
| Key | Action |
| --- | --- |
| Tab | Move focus to the next detection box |
| Shift+Tab | Move focus to the previous detection box |
| F2 or Enter | Open inline text editor for the focused box |
| Escape | Confirm edit and return to navigation mode |
Motivation
When correcting OCR results on a page with 50–200 detection boxes, the current mouse-only workflow means:
1. Click a box in the image or the list panel
2. Click into the text field
3. Edit
4. Click the next box
5. Repeat
With keyboard navigation, steps 1–2 and 4 collapse into Tab followed by F2 or Enter, so a whole page can be corrected without reaching for the mouse.
This pattern is standard in annotation tools: LabelImg, Label Studio, and CVAT all support some form of keyboard-first navigation. It also improves accessibility for users with repetitive strain concerns.
Expected behavior
- Tab cycles through detected boxes in order (top-to-bottom, left-to-right, or list order, whichever is consistent with the current sort)
- The focused box is visually highlighted (this is already done on click; the same style would work)
- F2 or Enter activates the text input field for the focused box without needing the mouse
- Escape confirms the current edit and returns focus to the box (ready for the next Tab)
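The behavior above can be sketched as a small state machine. This is only an illustration of the intended semantics, not PPOCRLabel code: `Box`, `BoxNavigator`, `handle_key`, and the string key names are hypothetical, and the top-to-bottom, left-to-right sort is just one of the orderings mentioned above. A real implementation would presumably hook into the Qt key events of the existing canvas and list widgets instead.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Box:
    """Hypothetical stand-in for a detection box: top-left corner plus its text."""
    x: int
    y: int
    text: str


class BoxNavigator:
    """Tracks which box has focus and whether its text is being edited."""

    def __init__(self, boxes: List[Box]):
        # Assumed order: top-to-bottom, then left-to-right.
        self.boxes = sorted(boxes, key=lambda b: (b.y, b.x))
        self.focus: Optional[int] = None  # index of the focused box, if any
        self.editing = False

    def handle_key(self, key: str) -> None:
        if not self.boxes:
            return
        if self.editing:
            # While editing, only Escape is handled here: it ends the edit
            # and leaves focus on the same box, ready for the next Tab.
            if key == "Escape":
                self.editing = False
            return
        count = len(self.boxes)
        if key == "Tab":
            # Move forward, wrapping from the last box to the first.
            self.focus = 0 if self.focus is None else (self.focus + 1) % count
        elif key == "Shift+Tab":
            # Move backward, wrapping from the first box to the last.
            self.focus = count - 1 if self.focus is None else (self.focus - 1) % count
        elif key in ("F2", "Enter") and self.focus is not None:
            self.editing = True


nav = BoxNavigator([Box(200, 10, "b"), Box(10, 10, "a"), Box(10, 50, "c")])
for key in ["Tab", "Tab", "Enter", "Escape", "Tab"]:
    nav.handle_key(key)
print(nav.boxes[nav.focus].text, nav.editing)
```

In this sketch, Tab is ignored while the editor is open, so a stray keypress cannot move focus mid-edit. Whether Tab should instead confirm the edit and advance in one step is an open design question for the actual feature.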
Workaround
None currently. Every box requires a mouse click.
Environment
- PPOCRLabel v3
- Windows 10 / Python 3.9
- PP-OCRv5 mobile model