Commit 8724cd7

Improve Documentation (#72)
* Add getting started entries
* Script to generate cli and configuration reference
* Separate merge and split pdf tutorial
* Add frontmatter to every page

---------

Co-authored-by: avvertix <5672748+avvertix@users.noreply.github.com>
1 parent eb4f9fd commit 8724cd7

25 files changed

Lines changed: 1370 additions & 404 deletions

.github/workflows/update-docs.yml

Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@
+name: Update reference docs
+
+on:
+  pull_request:
+    paths:
+      - "src/parxy_cli/commands/**"
+      - "src/parxy_cli/cli.py"
+      - "src/parxy_core/models/config.py"
+      - "scripts/generate_docs.py"
+
+jobs:
+  update-docs:
+    name: Regenerate reference docs
+    runs-on: ubuntu-latest
+    permissions:
+      contents: write
+
+    steps:
+      - uses: actions/checkout@v6
+        with:
+          fetch-depth: 1
+
+      - name: Install uv
+        uses: astral-sh/setup-uv@v7.3.1
+        with:
+          enable-cache: true
+
+      - name: Install dependencies
+        run: uv sync
+
+      - name: Generate reference docs
+        run: uv run python scripts/generate_docs.py
+
+      - name: Commit if changed
+        uses: stefanzweifel/git-auto-commit-action@v7.1.0
+        with:
+          commit_message: "docs: sync CLI and configuration reference"
+          file_pattern: "docs/reference/*.md"
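
Review note: the `paths` filter above determines which pull requests regenerate the reference docs. As a rough way to reason about it, the filter can be approximated in Python. This is only a sketch under stated assumptions: `triggers_docs_update` is a hypothetical helper that does not exist in the repository, and `fnmatch` only approximates GitHub's glob semantics (its `*` also matches `/`, which happens to agree with `**` for the patterns used here).

```python
from fnmatch import fnmatchcase

# Path filters copied from the workflow's `on.pull_request.paths` list.
WATCHED_PATHS = [
    "src/parxy_cli/commands/**",
    "src/parxy_cli/cli.py",
    "src/parxy_core/models/config.py",
    "scripts/generate_docs.py",
]


def triggers_docs_update(changed_files):
    """Return True if any changed file matches a watched pattern.

    Hypothetical helper for illustration only. fnmatch lets `*` cross
    directory separators, so this approximates GitHub's matching rather
    than reproducing it exactly.
    """
    return any(
        fnmatchcase(path, pattern)
        for path in changed_files
        for pattern in WATCHED_PATHS
    )
```

For example, `triggers_docs_update(["src/parxy_cli/commands/parse.py"])` (a hypothetical file name) evaluates to `True`, while a pull request touching only `README.md` evaluates to `False`, so the workflow would not run for it.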

docs/howto/add_new_parser.md

Lines changed: 5 additions & 0 deletions
@@ -1,3 +1,8 @@
+---
+title: Add a new parser
+description: How to implement a custom driver, register it with Parxy at runtime, and make it available alongside the built-in parsers.
+---
+
# How to Add a New Parser to Parxy

Parxy is designed to be **extensible** — you can integrate new parsing backends (drivers) or create custom variants of existing ones directly from your Python code, without modifying the core library.

docs/howto/batch_processing.md

Lines changed: 5 additions & 0 deletions
@@ -1,3 +1,8 @@
+---
+title: Process multiple documents in parallel
+description: How to use Parxy's batch API to parse many documents concurrently, control worker count, handle per-file errors, and collect structured results.
+---
+
# How to Process Multiple Documents in Parallel

Parxy provides a `batch` method for processing multiple documents in parallel, with support for per-file configuration. This is useful when you need to parse many documents efficiently or when different documents require different parsing strategies.

docs/howto/configure_landingai.md

Lines changed: 5 additions & 0 deletions
@@ -1,3 +1,8 @@
+---
+title: Configure LandingAI ADE
+description: How to set up the LandingAI Agentic Document Extraction driver, configure the API key and environment, and override parsing options per document.
+---
+
# How to Configure LandingAI ADE

This guide shows you how to configure the LandingAI ADE (Agentic Document Extraction) driver for document processing, including setting default options and overriding them on a per-document basis.

docs/howto/configure_llamaparse.md

Lines changed: 5 additions & 0 deletions
@@ -1,3 +1,8 @@
+---
+title: Configure LlamaParse
+description: How to set up the LlamaParse driver, configure the API key and parsing mode, and override options on a per-document basis for better extraction results.
+---
+
# How to Configure LlamaParse

This guide shows you how to configure the LlamaParse driver for document processing, including setting default options and overriding them on a per-document basis.

docs/howto/configure_llmwhisperer.md

Lines changed: 5 additions & 0 deletions
@@ -1,3 +1,8 @@
+---
+title: Configure LLMWhisperer
+description: How to set up the LLMWhisperer driver, configure the API key and parsing mode, and override options on a per-document basis for better extraction results.
+---
+
# How to Configure LLMWhisperer

This guide shows you how to configure the LLMWhisperer driver for document processing, including setting default options and overriding them on a per-document basis.

docs/howto/configure_observability.md

Lines changed: 5 additions & 0 deletions
@@ -1,3 +1,8 @@
+---
+title: Configure observability
+description: How to enable OpenTelemetry tracing and metrics in Parxy, connect to an OTLP collector, and monitor document processing operations in your observability stack.
+---
+
# How to Configure Observability

This guide shows you how to enable and configure OpenTelemetry-based observability in Parxy to monitor document processing operations.

docs/howto/configure_pdfact.md

Lines changed: 5 additions & 0 deletions
@@ -1,3 +1,8 @@
+---
+title: Configure PdfAct
+description: How to set up the PdfAct driver against a self-hosted or remote service instance, configure the base URL and API key, and run PdfAct locally with Docker.
+---
+
# How to Configure PdfAct

This guide shows you how to configure the PdfAct driver for document processing using a self-hosted or remote PdfAct service.

docs/howto/configure_pymupdf.md

Lines changed: 5 additions & 0 deletions
@@ -1,3 +1,8 @@
+---
+title: Configure PyMuPDF
+description: How to use Parxy's default PyMuPDF driver, choose the right extraction level for your use case, and adjust the output when working with local PDF files.
+---
+
# How to Configure PyMuPDF

This guide shows you how to use the PyMuPDF driver for document processing. PyMuPDF is the default driver in Parxy and requires no external services or API keys.

docs/howto/configure_unstructured_local.md

Lines changed: 5 additions & 0 deletions
@@ -1,3 +1,8 @@
+---
+title: Configure Unstructured library
+description: How to install and configure the Unstructured local driver for offline PDF parsing without external APIs, including extraction levels and output options.
+---
+
# How to Configure Unstructured Local

This guide shows you how to configure the Unstructured Local driver for document processing. This driver uses the open-source `unstructured` library for local document parsing without requiring external services.
