This Python script splits a PDF into multiple smaller PDFs based on an index.
The index defines the title and start page of each section.
The script automatically calculates the end page for each section using the next section's start page.
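The end-page rule described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the `title` and `start_page` keys, the sample entries, the function name `section_ranges`, and the 1-based inclusive page numbering are all assumptions, since the real `index.json` schema is not shown here.

```python
import json

# Hypothetical index layout; the real index.json schema may differ.
SAMPLE_INDEX = """
[
  {"title": "Introduction", "start_page": 1},
  {"title": "Methods", "start_page": 5},
  {"title": "Results", "start_page": 12}
]
"""


def section_ranges(index, total_pages):
    """Return (title, start_page, end_page) tuples, 1-based and inclusive."""
    # Sort by start page so each section's successor is the next entry.
    entries = sorted(index, key=lambda entry: entry["start_page"])
    ranges = []
    for i, entry in enumerate(entries):
        if i + 1 < len(entries):
            # A section ends on the page before the next section starts.
            end_page = entries[i + 1]["start_page"] - 1
        else:
            # The last section runs to the final page of the PDF.
            end_page = total_pages
        ranges.append((entry["title"], entry["start_page"], end_page))
    return ranges


if __name__ == "__main__":
    index = json.loads(SAMPLE_INDEX)
    for title, start, end in section_ranges(index, total_pages=20):
        print(f"{title}: pages {start}-{end}")
```

Sorting the entries first means the calculation does not depend on the order in which sections appear in the index.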
- Python 3.8+
Use the provided setup script to automatically create a virtual environment and install dependencies:
chmod +x setup.sh
./setup.sh
This will:
- Create a Python virtual environment (.venv)
- Activate the virtual environment
- Upgrade pip to the latest version
- Install all required dependencies from requirements.txt
After setup, activate the virtual environment with:
source .venv/bin/activate
- Put your PDF file in the root directory (named document.pdf)
- Run the script with:
python main.py
It will:
- Read document.pdf
- Read index.json
- Create one PDF per section inside the output/ directory
- split_pdf.py → the main script
- setup.sh → automated setup script for virtual environment and dependencies
- requirements.txt → Python dependencies
- logo.svg → project logo
- index.json → defines the index
- document.pdf → the original PDF to split
- output/ → folder where the split PDFs will be stored
- Filenames are generated from the section titles (spaces replaced with _).
- The last section automatically goes until the last page of the PDF.
- Works with any PDF and any custom index JSON.
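
As an illustration of the filename rule in the notes above, here is a minimal sketch. The function name `section_filename` and the `.pdf` suffix are assumptions; only the space-to-underscore replacement comes from this README, and the actual script may sanitize titles further.

```python
def section_filename(title: str) -> str:
    """Build an output filename from a section title."""
    # Assumption: only spaces are replaced and ".pdf" is appended;
    # other characters in the title are left untouched.
    return title.replace(" ", "_") + ".pdf"


if __name__ == "__main__":
    print(section_filename("Chapter 1 Overview"))
```

Titles containing characters that are invalid in filenames on your platform (such as `/`) would need extra handling under this sketch.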