title	Quick Start: Python
description	Install EdgeParse and extract your first PDF in under a minute.

Installation

pip install edgeparse

Requirements: Python 3.9+ · No additional system dependencies.

Parse a PDF

import edgeparse

# Get Markdown string
markdown = edgeparse.convert("document.pdf", format="markdown")
print(markdown)

# Get structured JSON string
json_str = edgeparse.convert("document.pdf", format="json")

# Get HTML string
html = edgeparse.convert("document.pdf", format="html")
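
Because the `"json"` format comes back as a string, a common next step is to decode it with the standard `json` module. The sketch below is one way to do that; `convert_to_dict` is a hypothetical helper name, not part of EdgeParse, and it takes the convert function as an argument so it can be exercised without the library installed. It makes no assumptions about which fields the decoded document contains.

```python
import json
from typing import Any, Callable


def convert_to_dict(pdf_path: str, convert: Callable[..., str]) -> Any:
    """Run a convert-style callable with format="json" and decode the string it returns.

    `convert` is expected to behave like `edgeparse.convert` as shown above:
    it accepts a path plus a `format` keyword and returns a string.
    """
    json_str = convert(pdf_path, format="json")
    return json.loads(json_str)
```

With the library installed, this would be called as `convert_to_dict("document.pdf", edgeparse.convert)`. The JSON Schema Reference describes the fields of the resulting structure.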

Write to File

import edgeparse

# Save to output directory (returns path of saved file)
out_path = edgeparse.convert_file("document.pdf", "output/", format="json")
print(f"Saved to: {out_path}")
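
The same call can be applied to a whole folder of PDFs. The following is a minimal sketch, assuming `convert_file` accepts an input path, an output directory, and a `format` keyword, and returns the saved path, as in the example above. `convert_directory` is a hypothetical helper, not an EdgeParse function. It creates the output directory first as a precaution, since this page does not say whether the library does so itself.

```python
from pathlib import Path
from typing import Callable, List


def convert_directory(
    src_dir: str,
    out_dir: str,
    convert_file: Callable[..., str],
    fmt: str = "markdown",
) -> List[str]:
    """Convert every *.pdf directly inside src_dir and return the saved paths."""
    # Create the output directory up front; not assumed to be done by the library.
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    saved = []
    # Sort for a deterministic processing order.
    for pdf in sorted(Path(src_dir).glob("*.pdf")):
        saved.append(convert_file(str(pdf), out_dir, format=fmt))
    return saved
```

With the library installed, a call might look like `convert_directory("pdfs/", "output/", edgeparse.convert_file, fmt="json")`.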

Output Formats

Format	Description
`"markdown"`	Clean Markdown with table support
`"json"`	Structured JSON with full metadata
`"html"`	Semantic HTML
`"text"`	Plain text (reading order)
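
To compare the formats side by side, one PDF can be exported in all four. This sketch loops over the format names from the table; the file extensions are this example's own choice, and it assumes `"text"` also returns a string, as the other three formats do in the examples above. `export_all_formats` is a hypothetical helper, not part of EdgeParse.

```python
from pathlib import Path
from typing import Callable, Dict

# Format names come from the table above; the extensions are this sketch's own choice.
EXTENSIONS = {"markdown": ".md", "json": ".json", "html": ".html", "text": ".txt"}


def export_all_formats(
    pdf_path: str, out_dir: str, convert: Callable[..., str]
) -> Dict[str, Path]:
    """Write one output file per format and return a mapping of format to file path."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    stem = Path(pdf_path).stem
    written = {}
    for fmt, ext in EXTENSIONS.items():
        target = out / f"{stem}{ext}"
        # Assumes convert(...) returns a string for every format listed.
        target.write_text(convert(pdf_path, format=fmt), encoding="utf-8")
        written[fmt] = target
    return written
```

With the library installed, `export_all_formats("document.pdf", "output/", edgeparse.convert)` would leave four files named after the input PDF in `output/`.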

Next Steps

JSON Schema Reference — understand every field
Benchmark Results — accuracy & speed data
Try Live Demo — parse PDFs in your browser with WebAssembly
Enterprise — self-hosted deployment, priority support by Elitizon
