Commit afa4272

add badges
1 parent 101658b commit afa4272

2 files changed

Lines changed: 17 additions & 15 deletions

README.md

Lines changed: 15 additions & 13 deletions
@@ -1,19 +1,21 @@
 <div align="center">
 
 [![logo](public/wide.webp)](https://github.com/ipitio/ocr-pdf)
-Link to presentation -> https://docs.google.com/presentation/d/1o0IAz4Awh5Gfb3jGGnsWIuF53o3sAy9TA1a3grr7Lf0/edit?usp=sharing
 
 # ocr2pdf
 
 **OCRmyPDF and Merge it**
 
 ---
 
-[![downloads](https://img.shields.io/badge/dynamic/json?url=https%3A%2F%2Fipitio.github.io%2Fbackage%2Fipitio%2Focr-pdf%2Focr-pdf.json&query=%24.downloads&logo=github&logoColor=959da5&labelColor=333a41&label=pulls)](https://github.com/ipitio/ocr-pdf/pkgs/container/ocr-pdf) [![build](https://github.com/ipitio/ocr-pdf/actions/workflows/publish.yml/badge.svg)](https://github.com/ipitio/ocr-pdf/actions/workflows/publish.yml)
+[![build](https://github.com/ipitio/ocr-pdf/actions/workflows/publish.yml/badge.svg)](https://github.com/ipitio/ocr-pdf/actions/workflows/publish.yml) [![downloads](https://img.shields.io/badge/dynamic/json?url=https%3A%2F%2Fipitio.github.io%2Fbackage%2Fipitio%2Focr-pdf%2Focr-pdf.json&query=%24.downloads&logo=github&logoColor=959da5&labelColor=333a41&label=pulls)](https://github.com/ipitio/ocr-pdf/pkgs/container/ocr-pdf) [![size](https://img.shields.io/badge/dynamic/json?url=https%3A%2F%2Fipitio.github.io%2Fbackage%2Fipitio%2Focr-pdf%2Focr-pdf.json&query=%24.size&logo=github&logoColor=959da5&label=size&labelColor=333a41&color=indigo)](https://github.com/arevindh/backage/pkgs/container/backage) [![latest](https://img.shields.io/badge/dynamic/xml?url=https%3A%2F%2Fipitio.github.io%2Fbackage%2Fipitio%2Focr-pdf%2Focr-pdf.xml&query=%2Fbkg%2Fversion%5B.%2Flatest%5B.%3D%22true%22%5D%5D%2Ftags%5B.!%3D%22latest%22%5D&logo=github&logoColor=959da5&label=latest&labelColor=333a41&color=darkgreen)](https://github.com/arevindh/backage/pkgs/container/backage)
 
 </div>
 
-Convert images and scans to searchable and selectable (and merged) PDFs! The core logic resides in a Python script that you could run yourself, if you really wanted to. It extracts all the files from `todo`, transforms them with Tesseract via [OCRmyPDF](https://github.com/ocrmypdf/OCRmyPDF), and loads them into `done`. Files in subfolders will be merged in alphabetical order, but will still be available individually.
+Convert images and scans to searchable and selectable (and merged) PDFs! The core logic resides in a Python script that extracts all the files from `todo`, transforms them with Tesseract via [OCRmyPDF](https://github.com/ocrmypdf/OCRmyPDF), and loads them into `done`.
+
+> [!NOTE]
+> Files in subfolders will be merged in alphabetical order, but will still be available individually.
 
 I recommend you use either:
 
@@ -37,19 +39,19 @@ It's as easy as 1, 2, 3! Get up and going in no time with these options:
 
 Are you on mobile or simply want an easy and seamless experience?
 
-1. Open [Colab](https://colab.research.google.com/github/ipitio/ocr-pdf/blob/master/colab.ipynb) in your browser
-2. Follow the instructions in the notebook
+1. Run the [Colab](https://colab.research.google.com/github/ipitio/ocr-pdf/blob/master/colab.ipynb) cell in your browser
+2. Follow the prompts to upload your files
 3. Find the OCR'd files in your [Drive](https://drive.google.com/drive/my-drive)`/ocr-pdf`
 
-To add OCRmyPDF options, append them to the `run` command in the code cell.
+To add OCRmyPDF options, append them to the `run` command.
 
 ### Self-hosted: Prebuilt Docker Image
 
 If you want to skip building an image, just use mine:
 
 1. Install Docker, such as with Docker Desktop
 2. Make a new `pdf` folder and put your files in `pdf/todo`
-3. Run the following command from `pdf/..` to convert the files and move them into `pdf/done`
+3. Run the following command from the parent of `pdf` to convert the files and move them into `pdf/done`
 
 ```bash
 docker run --rm \
@@ -62,13 +64,13 @@ docker run --rm \
 
 It's still easy as 1, 2, 3! You'll find the OCR'd files in `pdf/done`.
 
-1. First (fork and) clone this repo
+1. Fork and clone this repo
 2. `cd` into it and put your files in `pdf/todo`
 3. Complete one of the following:
 
 ### Cloud: GitHub Actions Workflow
 
-If you made a fork and cloned it, Git is your best friend!
+Enable Actions and push your files:
 
 ```bash
 git add .
@@ -82,19 +84,19 @@ To add OCRmyPDF options, edit the command in the `predict.yml` file before commi
 
 ### Self-hosted
 
-#### Build Docker Image
+#### Docker Compose Service
 
-If you aren't on Linux, or want to avoid polluting your system, use Docker Compose (which is included with Docker Desktop):
+If you want to avoid polluting your system, use Docker Compose (which is included with Docker Desktop):
 
 ```bash
 docker compose up
 ```
 
 To add OCRmyPDF options, edit the command in the `compose.yml` file.
 
-#### Use Bare Metal
+#### Bash Install Script
 
-Are you on Linux and want to make the most out of it?
+Do want to make the most out of your hardware?
 
 ```bash
 bash src/predict.sh pdf [OCRmyPDF options]

colab.ipynb

Lines changed: 2 additions & 2 deletions
@@ -18,7 +18,7 @@
 "\n",
 "---\n",
 "\n",
-"[![downloads](https://img.shields.io/badge/dynamic/json?url=https%3A%2F%2Fipitio.github.io%2Fbackage%2Fipitio%2Focr-pdf%2Focr-pdf.json&query=%24.downloads&logo=github&logoColor=959da5&labelColor=333a41&label=pulls)](https://github.com/arevindh/pihole-speedtest/pkgs/container/pihole-speedtest) [![build](https://github.com/ipitio/ocr-pdf/actions/workflows/publish.yml/badge.svg)](https://github.com/ipitio/ocr-pdf/actions/workflows/publish.yml)\n",
+"[![build](https://github.com/ipitio/ocr-pdf/actions/workflows/publish.yml/badge.svg)](https://github.com/ipitio/ocr-pdf/actions/workflows/publish.yml) [![downloads](https://img.shields.io/badge/dynamic/json?url=https%3A%2F%2Fipitio.github.io%2Fbackage%2Fipitio%2Focr-pdf%2Focr-pdf.json&query=%24.downloads&logo=github&logoColor=959da5&labelColor=333a41&label=pulls)](https://github.com/ipitio/ocr-pdf/pkgs/container/ocr-pdf) [![size](https://img.shields.io/badge/dynamic/json?url=https%3A%2F%2Fipitio.github.io%2Fbackage%2Fipitio%2Focr-pdf%2Focr-pdf.json&query=%24.size&logo=github&logoColor=959da5&label=size&labelColor=333a41&color=indigo)](https://github.com/arevindh/backage/pkgs/container/backage) [![latest](https://img.shields.io/badge/dynamic/xml?url=https%3A%2F%2Fipitio.github.io%2Fbackage%2Fipitio%2Focr-pdf%2Focr-pdf.xml&query=%2Fbkg%2Fversion%5B.%2Flatest%5B.%3D%22true%22%5D%5D%2Ftags%5B.!%3D%22latest%22%5D&logo=github&logoColor=959da5&label=latest&labelColor=333a41&color=darkgreen)](https://github.com/arevindh/backage/pkgs/container/backage)\n",
 "\n",
 "</div>\n",
 "\n",
@@ -32,7 +32,7 @@
 " - The files in each zip will be merged in alphabetical order\n",
 "- If you'd like to add any options for [OCRmyPDF](https://ocrmypdf.readthedocs.io/en/latest), append them to line 23 in the cell below\n",
 "- The upload button will appear below the cell after running it\n",
-"- At the end, you'll be offered a zip of the converted (and merged) files to download locally, whether or not Drive was connected\n",
+"- Depending on your browser's settings, the resulting files will either be automatically downloaded or you will be prompted to save them, whether or not Drive was connected\n",
 "\n",
 "\n",
 "## Steps\n",
