---
title: "AI Comic Panel Extractor"
subtitle: "High-speed computer vision for mobile-optimized comic reading."
page-layout: article
---
::: {.grid .align-items-center}
::: {.g-col-12 .g-col-md-8}
The **AI Comic Panel Extractor** is a specialized computer vision tool that automatically processes `.cbz` and `.pdf` files.
By detecting individual panel boundaries, it transforms static, hard-to-read pages into a panel-by-panel experience perfectly suited for mobile readers like **CDisplayEx**, **Perfect Viewer**, or **Panels**.
<br>
[Download Win64 .exe](https://github.com/ehsanx/Comic-Panel-Extractor/raw/main/dist/Comic-Panel-Extractor_Win_64.exe){.btn .btn-primary .btn-lg role="button"}
:::
::: {.g-col-12 .g-col-md-4 style="text-align: center;"}
{width=100% style="border-radius: 12px; box-shadow: 0 8px 16px rgba(0,0,0,0.15); margin-bottom: 20px;"}
{width=60% style="border-radius: 20px; box-shadow: 0 16px 16px rgba(0,0,0,0.15);"}
:::
:::
---
## Core Engine Logic
The extraction engine uses **OpenCV 4.9.0.80** to locate panel borders through a sequence of visual heuristics.
**The Process**:
1. **Grayscale & Binarization:** Strips color and applies Otsu thresholding. If the page is heavily yellowed or unevenly lit, it automatically falls back to Adaptive Gaussian thresholding to isolate ink from paper.
2. **Morphological Closing:** Applies a scale-aware mathematical closing step to bridge small gaps in hand-drawn or faded panel borders without over-thickening lines.
3. **Contour Detection:** Employs `cv2.findContours` (using `RETR_EXTERNAL`) to draw bounding boxes around outer panel regions, intentionally ignoring internal dialogue balloons.
4. **Filtering & Sorting:** Deduplicates overlapping boxes using IoU-based Non-Maximum Suppression (NMS) and dynamically sorts panels into a proper reading order based on median panel heights.
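
The deduplication in step 4 can be sketched in plain Python. This is a minimal illustration, not the project's actual code: the function names, the `(x, y, w, h)` box layout (the shape `cv2.boundingRect` returns), and the 0.5 threshold are all assumptions. Contours carry no confidence score, so this sketch also assumes that larger boxes take priority when two candidates overlap.

```python
def iou(a, b):
    """Intersection-over-union of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    ix = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union else 0.0


def nms(boxes, iou_threshold=0.5):
    """Greedy non-maximum suppression over candidate panel boxes.

    Assumption: boxes are ranked by area (largest first) because
    contour-derived boxes have no detection score to rank by.
    """
    kept = []
    for box in sorted(boxes, key=lambda b: b[2] * b[3], reverse=True):
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(box, k) <= iou_threshold for k in kept):
            kept.append(box)
    return kept


if __name__ == "__main__":
    candidates = [(0, 0, 100, 100), (5, 5, 100, 100), (200, 0, 100, 100)]
    print(nms(candidates))
```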
---
## Deployment Options
::: {.panel-tabset}
### Windows Executable
The easiest way to run the tool locally. No Python environment, dependencies, or installation required.
[Download Win64 .exe](https://github.com/ehsanx/Comic-Panel-Extractor/raw/main/dist/Comic-Panel-Extractor_Win_64.exe){.btn .btn-success role="button"}
### Python CLI / GUI
For developers or users with an existing Python environment. Run the application directly from source:
```bash
pip install PyMuPDF opencv-python==4.9.0.80 "numpy<2"
python desktop_app.py
```
### Local Web Server
Deploy the extractor as a browser-based application using React and FastAPI. Both servers must be running at the same time:
* **Backend:** `python backend/main.py`
* **Frontend:** `cd frontend && npm install && npm run dev`
:::
---
::: {.callout-important}
## Strict Environment Requirements
To avoid compatibility conflicts with **NumPy 2.0**, make sure your Python environment matches the following pinned versions:
* **Python:** 3.12.x
* **OpenCV:** `opencv-python==4.9.0.80`
* **NumPy:** `numpy<2`
* **PyMuPDF:** Latest stable
:::
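
A quick way to confirm the pins above before launching the app is a small standard-library check. This is a hedged sketch: `numpy_ok` and `check_environment` are hypothetical helper names, not part of the project, and the script assumes OpenCV was installed under the distribution name `opencv-python` (a `opencv-python-headless` install would be reported as missing).

```python
import sys
from importlib.metadata import PackageNotFoundError, version


def numpy_ok(version_string):
    """True when the NumPy major version is below 2 (the `numpy<2` pin)."""
    return int(version_string.split(".")[0]) < 2


def check_environment():
    """Return a list of problems; an empty list means every pin matches."""
    problems = []
    if sys.version_info[:2] != (3, 12):
        problems.append(f"Python 3.12.x expected, found {sys.version.split()[0]}")
    # Each entry maps a distribution name to a predicate on its version string.
    expected = {
        "opencv-python": lambda v: v == "4.9.0.80",
        "numpy": numpy_ok,
    }
    for package, is_ok in expected.items():
        try:
            found = version(package)
        except PackageNotFoundError:
            problems.append(f"{package} is not installed")
            continue
        if not is_ok(found):
            problems.append(f"{package} {found} does not match the pinned version")
    return problems


if __name__ == "__main__":
    for line in check_environment() or ["Environment matches the pinned versions."]:
        print(line)
```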
---
## Testing & Archives
The bounding box algorithm is specifically tuned for the narrow gutters and ink-bleed typical of **Golden Age** and **Silver Age** comic scans. You can find excellent public domain material for testing at the following archives:
* [Digital Comic Museum](https://digitalcomicmuseum.com/)
* [Comic Book Plus](https://comicbookplus.com/)
* [Internet Archive](https://archive.org/details/comics)
---
## Technical Details & Known Limitations
**How it works:**
The engine uses OpenCV to convert each page to grayscale and applies an Otsu/Adaptive binary threshold to isolate the ink. It then uses morphological closing to bridge gaps in panel borders, and contour detection (`cv2.findContours`) to draw candidate bounding boxes. Finally, it filters out noise based on aspect ratios, discards overlapping regions via IoU Non-Maximum Suppression (NMS), and calculates reading order using DPI-independent row clustering. See [technical notes](technical.html) for further details.
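
The noise filter and reading-order step described above might look roughly like the following. It is a sketch under stated assumptions rather than the engine's real implementation: the aspect-ratio bounds, the `row_tolerance` factor, and both function names are illustrative, boxes are `(x, y, w, h)` tuples, and rows are read left to right. Because the row tolerance is a fraction of the median panel height instead of a fixed pixel count, the grouping behaves the same way at any scan resolution.

```python
from statistics import median


def filter_by_aspect(boxes, min_ratio=0.2, max_ratio=5.0):
    """Drop boxes whose width/height ratio looks like a stray line or speck."""
    return [b for b in boxes if b[3] > 0 and min_ratio <= b[2] / b[3] <= max_ratio]


def reading_order(boxes, row_tolerance=0.5):
    """Group boxes into rows by their top edge, then read each row left to right.

    Two boxes share a row when their top edges differ by no more than
    row_tolerance * (median panel height).
    """
    if not boxes:
        return []
    tol = row_tolerance * median(b[3] for b in boxes)
    rows = []
    for box in sorted(boxes, key=lambda b: b[1]):
        # Compare against the top edge of the first (highest) box in the current row.
        if rows and box[1] - rows[-1][0][1] <= tol:
            rows[-1].append(box)
        else:
            rows.append([box])
    return [b for row in rows for b in sorted(row, key=lambda b: b[0])]


if __name__ == "__main__":
    page = [(220, 10, 200, 300), (10, 12, 200, 300), (10, 330, 410, 280), (0, 0, 400, 5)]
    print(reading_order(filter_by_aspect(page)))
```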
**Where it might fail (Please read):**
Because the tool relies on visual heuristics rather than publisher-provided metadata, it can produce incorrect crops or miss panels under the following conditions:
* **Overlapping Panels / Broken Gutters:** If a character's arm reaches outside a panel and breaks the gutter line, the algorithm may merge two panels together.
* **Non-Rectangular / Splash Pages:** Circular panels, extreme diagonals, or chaotic splash pages without clear borders will likely confuse the bounding box logic.
* **Zero-Gutter / Full Bleed:** If the page has no gutters at all (the art goes entirely to the edge of the page), the algorithm cannot slice it.
* **Floating Text:** Text or dialogue balloons that float in the white space outside a bordered panel may be ignored by the cropper.
---
## Citation
```bibtex
@software{LocalComic-Panel-Extractor2026,
author = {Karim, ME.},
title = {AI Comic Panel Extractor},
year = {2026},
publisher = {GitHub},
url = {https://github.com/ehsanx/Comic-Panel-Extractor}
}
```
<br>
::: {.callout-warning}
## Disclaimer
This software is provided "AS IS", without warranty of any kind. This tool is intended solely for use with files you legally own or those explicitly in the public domain. Please respect publisher copyrights.
:::