
Commit 17a3c7f

Over several days, OCR failed on around 20 files; all of the errors those files produced are now resolved.
1 parent e6a09c3 commit 17a3c7f

121 files changed

Lines changed: 84118 additions & 89682 deletions


README.md

Lines changed: 42 additions & 38 deletions
@@ -5,52 +5,48 @@
 **The complete OCR toolkit for Linux — turn scanned PDFs and images into searchable, editable documents.**

 [![License: GPL-3.0](https://img.shields.io/badge/License-GPL%203.0-blue.svg)](LICENSE)
+[![Version: 3.0.0](https://img.shields.io/badge/Version-3.0.0-green.svg)](pyproject.toml)
 [![Python 3.10+](https://img.shields.io/badge/Python-3.10%2B-3776AB.svg)](https://python.org)
 [![GTK4 + Libadwaita](https://img.shields.io/badge/GTK4-Libadwaita-4A86CF.svg)](https://gnome.org)
+[![Tests: 311](https://img.shields.io/badge/Tests-311%20passing-brightgreen.svg)](tests/)

 </div>

 ---

 BigOcrPDF is a powerful, all-in-one OCR application that adds searchable text layers to scanned PDFs, extracts text from images, and provides a full-featured PDF editor — all from a modern, native Linux interface.

-## Three Interfaces, One Toolkit
+## Why BigOcrPDF?

-BigOcrPDF offers three independent interfaces that cover every stage of document work:
+- **AI-Powered OCR** — Uses **RapidOCR PP-OCRv5** with OpenVINO hardware acceleration for fast, accurate text recognition across **130+ languages**
+- **Edit, Merge & Organize PDFs** — Reorder pages, rotate, delete, and combine multiple PDFs and images into a single document
+- **Smart Preprocessing** — Automatic perspective correction, deskew, dewarping, and illumination normalization — even photos of documents come out clean
+- **Multiple Export Formats** — Searchable PDF, PDF/A-2b archival, plain text, and ODF/ODT with layout-aware formatting
+- **Screen Capture OCR** — Select any region on screen and instantly extract text
+- **Batch Processing** — Process dozens of files at once with checkpoint/resume support
+- **File Manager Integration** — Right-click any PDF or image to OCR it directly

-### 1. PDF OCR (`bigocrpdf`)
-
-The main interface. Drop your scanned PDFs, choose your settings, and get searchable documents back. Ideal for:
-
-- Turning scanned paperwork, contracts, and books into searchable PDFs
-- Archiving documents as PDF/A-2b for long-term preservation
-- Batch-processing dozens of files with checkpoint/resume
-- Re-OCR'ing documents that already have a poor text layer
-- Exporting extracted text as TXT or ODF/ODT with layout detection
-
-### 2. PDF Editor (`bigocrpdf --edit` or `bigocrpdf -e`)
-
-A standalone page editor that runs independently of the OCR window. Use it to organize your PDFs before or after OCR:
-
-- Reorder, rotate, flip, and delete pages with drag-and-drop
-- Merge multiple PDFs and images into a single document
-- Import photos (JPEG, PNG, TIFF, WebP, RAW) with automatic EXIF rotation
-- Split large PDFs by page count or target file size
-- Compress PDFs with configurable quality and DPI
-- Save individual pages as images or separate PDFs
-
-### 3. Image OCR (`bigocrimage`)
+---

-A lightweight window for quick text extraction from images and screenshots:
+## Key Features

-- Open any image — JPEG, PNG, WebP, TIFF, or RAW (CR2, DNG, NEF, ARW…)
-- Capture a screen region and extract the text instantly
-- Copy results to clipboard with one click
-- Works with Spectacle (KDE), GNOME Screenshot, and Flameshot
+### PDF Editor

----
+Manage your documents before and after OCR — no need for a separate tool.

-## Key Features
+- **Drag-and-drop page reordering** with thumbnail previews
+- **Rotate & flip pages** — left, right, horizontal, and vertical
+- **Delete pages** you don't need
+- **Merge files** — combine pages from multiple PDFs and images into one document
+- **Create PDFs from images** — import JPEG, PNG, TIFF, WebP, RAW photos, and more
+- **EXIF-aware import** — automatically applies correct orientation from camera metadata
+- **Zoom control** — 50% to 200% thumbnail scaling with keyboard shortcuts
+- **Select pages for OCR** — choose exactly which pages to process
+- **Context menu** — right-click any page to save as image or PDF
+- **Compress PDF** — reduce file size with configurable quality and DPI
+- **Split PDF** — by page count or target file size
+- **Undo support** — revert page operations with Ctrl+Z
+- **Window size persistence** — remembers your preferred dimensions

 ### OCR Engine

@@ -90,10 +86,20 @@ Get your text out in the format you need.
 | **Custom Quality PDF** | Choose JPEG quality: 30%, 50%, 70%, 85%, or 95% |
 | **Black & White (JBIG2)** | Pure black-and-white output using JBIG2 — the most compact format for text-only documents |
 | **Plain Text (.txt)** | Extracted text from all pages |
-| **ODF/ODT** | Formatted text with optional embedded images *(experimental — formatting quality may vary)* |
+| **ODF/ODT** ⚠️ | 4 modes: formatted + images, images + simple text, formatted text only, or plain text *(experimental — formatting quality may vary)* |

 ODF export includes **layout analysis**: automatic paragraph/heading detection, table detection, image embedding, and proper page breaks. Note: ODF/ODT export is experimental and formatting results may not always be accurate.

+### Screen Capture & Image OCR
+
+Extract text from anything on your screen.
+
+- **Region capture** — select an area and get the text instantly
+- **Works with**: Spectacle (KDE), GNOME Screenshot, Flameshot
+- **Open any image** — JPEG, PNG, WebP, TIFF, RAW formats (CR2, DNG, NEF, ARW, and more)
+- **Copy to clipboard** with one click
+- **Standalone mode** — run `bigocrimage` for a dedicated image OCR window
+
 ### Batch Processing & Session Management

 Handle large workloads efficiently.
@@ -141,10 +147,8 @@ pip install -e .
 ### GUI

 ```bash
-bigocrpdf # PDF OCR — main interface
-bigocrpdf --edit file.pdf # PDF Editor — standalone page editor
-bigocrpdf -e file.pdf # (short form)
-bigocrimage # Image OCR — quick text extraction
+bigocrpdf # PDF OCR interface
+bigocrimage # Image OCR window
 ```

 ### Command Line
@@ -153,7 +157,6 @@ bigocrimage # Image OCR — quick text extraction
 bigocrpdf [OPTIONS] [FILES...]

 Options:
--e, --edit Open the PDF editor instead of the OCR interface
 -v, --version Show version and exit
 -d, --debug Enable debug logging
 --verbose Verbose output
@@ -232,7 +235,8 @@ graph TD

 ## Quality & Testing

-- **303 automated tests** covering OCR pipeline, PDF operations, export, preprocessing, editor logic, and utilities
+- **311 automated tests** covering OCR pipeline, PDF operations, export, preprocessing, editor logic, and utilities
+- **Tested with Python 3.10 through 3.14** — supports the latest Python release
 - **100% i18n coverage** — all 28 languages fully translated (604 strings each)
 - **Ruff-enforced** code style and linting
 - **WCAG 2.1 Level AA** accessibility considerations
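
Reviewer note on the Command Line hunk: the README no longer documents `-e, --edit`. The sketch below illustrates what that would mean for callers if the parser matches the documented options. It is not the project's code: it assumes an `argparse`-based CLI, covers only the three options visible in the hunk (the real option list may be longer), and `build_parser` and `accepts` are hypothetical names.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring only the options visible in the README hunk."""
    parser = argparse.ArgumentParser(prog="bigocrpdf")
    # Version string taken from the badge this commit adds to the README.
    parser.add_argument("-v", "--version", action="version", version="%(prog)s 3.0.0")
    parser.add_argument("-d", "--debug", action="store_true", help="Enable debug logging")
    parser.add_argument("--verbose", action="store_true", help="Verbose output")
    parser.add_argument("files", nargs="*", metavar="FILES", help="Files to open")
    return parser


def accepts(argv: list[str]) -> bool:
    """Return True if argv parses cleanly or exits with status 0 (e.g. --version)."""
    try:
        build_parser().parse_args(argv)
        return True
    except SystemExit as exc:
        # argparse exits with a non-zero status on unrecognized arguments.
        return exc.code == 0
```

Under these assumptions, `accepts(["-d", "scan.pdf"])` succeeds while `accepts(["--edit", "scan.pdf"])` and `accepts(["-e", "scan.pdf"])` are rejected, so any script or desktop launcher that still passes the old flag would need updating.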
