Skip to content

Commit b698a10

Browse files
committed
chore(site): update benchmark figures to 2026-03-28 snapshot
- EdgeParse: overall=0.7811 (+0.6 pp), speed=0.007 s/doc - Speed ratios updated: 83× faster than Docling (was 12×), 49× faster than PyMuPDF4LLM (was 6.9×), 2× faster than ODL (was 1.5×) - TEDS claim corrected: 73% better than OpenDataLoader (was 83%) - benchmark.ts lastUpdated → 2026-03-28 with exact per-engine figures - Hero.astro eyebrow + subtitle updated - index.mdx description, tagline, FeatureGrid card updated - results.mdx Key Takeaways speed claim updated
1 parent 3661293 commit b698a10

4 files changed

Lines changed: 38 additions & 38 deletions

File tree

site/src/components/landing/Hero.astro

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -15,7 +15,7 @@ const installCmd = 'pip install edgeparse';
1515
<div class="hero-content">
1616
<div class="hero-eyebrow">
1717
<span class="eyebrow-badge">#1 Non-ML PDF Parser</span>
18-
<span class="eyebrow-text">Leads the current benchmark · 12× faster than Docling · Zero dependencies</span>
18+
<span class="eyebrow-text">Leads the current benchmark · 83× faster than Docling · Zero dependencies</span>
1919
<svg class="eyebrow-arrow" width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2.5" stroke-linecap="round" stroke-linejoin="round"><path d="m9 18 6-6-6-6"/></svg>
2020
</div>
2121

@@ -25,7 +25,7 @@ const installCmd = 'pip install edgeparse';
2525
</h1>
2626

2727
<p class="hero-subtitle">
28-
Best published benchmark score without ML. 12× faster than Docling and 1.5× faster than OpenDataLoader. Zero GPU, zero OCR, zero JVM — just a 15&nbsp;MB Rust binary with the best reported scores across reading order, tables, headings, paragraphs, text quality, and speed.
28+
Best published benchmark score without ML. 83× faster than Docling and 2× faster than OpenDataLoader. Zero GPU, zero OCR, zero JVM — just a 15&nbsp;MB Rust binary with the best reported scores across reading order, tables, headings, paragraphs, text quality, and speed.
2929
</p>
3030

3131
<div class="hero-actions">

site/src/content/docs/benchmark/results.mdx

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -35,7 +35,7 @@ import { benchmarkSnapshot, formatSpeed, getBenchmarkTool } from '../../../data/
3535
## Key Takeaways
3636

3737
- **Latest published snapshot:** updated {benchmarkSnapshot.lastUpdated} on {benchmarkSnapshot.hardware} across {benchmarkSnapshot.documentCount} documents
38-
- **EdgeParse is the fastest**{formatSpeed(getBenchmarkTool('EdgeParse').speedSeconds)} per document, 12× faster than Docling
38+
- **EdgeParse is the fastest**{formatSpeed(getBenchmarkTool('EdgeParse').speedSeconds)} per document, 83× faster than Docling
3939
- **Highest overall score**{getBenchmarkTool('EdgeParse').overall.toFixed(3)} across the current six-engine comparison
4040
- **Best structure metrics** — leading NID ({getBenchmarkTool('EdgeParse').nid.toFixed(3)}), TEDS ({getBenchmarkTool('EdgeParse').teds.toFixed(3)}), and MHS ({getBenchmarkTool('EdgeParse').mhs.toFixed(3)})
4141
- **Best text metrics** — also leads paragraph boundaries, text quality, and table-detection F1 in the full benchmark report

site/src/content/docs/index.mdx

Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -1,12 +1,12 @@
11
---
22
title: EdgeParse — Fast PDF Parser. Zero ML. Best Benchmark Score.
3-
description: EdgeParse extracts structured Markdown, JSON, and HTML from born-digital PDFs. 0.787 overall and 0.064 s/doc on the current 200-document benchmark. Python, Node.js, Rust, CLI, WebAssembly. Zero GPU. Zero OCR.
3+
description: EdgeParse extracts structured Markdown, JSON, and HTML from born-digital PDFs. 0.781 overall and 0.007 s/doc on the current 200-document benchmark. Python, Node.js, Rust, CLI, WebAssembly. Zero GPU. Zero OCR.
44
template: splash
55
hero:
66
title: |
77
PDF parsing for <span class="font-black text-transparent bg-clip-text bg-gradient-to-b from-accent-700 to-accent-400">AI Agents</span>
88
tagline: |
9-
The PDF-to-Markdown engine that leads the current benchmark without ML. 12× faster than Docling · 1.5× faster than OpenDataLoader · best reported scores across reading order, tables, headings, paragraphs, text quality, and speed. Python · Node.js · WebAssembly · Rust · CLI.
9+
The PDF-to-Markdown engine that leads the current benchmark without ML. 83× faster than Docling · 2× faster than OpenDataLoader · best reported scores across reading order, tables, headings, paragraphs, text quality, and speed. Python · Node.js · WebAssembly · Rust · CLI.
1010
actions:
1111
- text: Get Started Free
1212
link: /getting-started/quick-start-python/
@@ -139,8 +139,8 @@ const html = convert_to_string(bytes, 'html');
139139
title="Everything Your AI Stack Needs From a PDF"
140140
subtitle="EdgeParse is the only PDF parser with ML-level accuracy that runs without ML — in Python, Node.js, the browser, and Rust."
141141
features={[
142-
{ icon: 'zap', title: '12× Faster Than Docling', description: `${formatSpeed(getBenchmarkTool('EdgeParse').speedSeconds)} on ${benchmarkSnapshot.hardware}. 6.9× faster than PyMuPDF4LLM and 1.5× faster than OpenDataLoader. Parallel per-page processing via Rayon — CPU only.` },
143-
{ icon: 'table', title: 'Best-in-Class Table Extraction', description: `TEDS score of ${getBenchmarkTool('EdgeParse').teds.toFixed(3)} — best in the current published comparison and 83% better than OpenDataLoader heuristic mode (${getBenchmarkTool('OpenDataLoader').teds.toFixed(3)}). Ruling-line + borderless cluster detection with merged cell support.` },
142+
{ icon: 'zap', title: '83× Faster Than Docling', description: `${formatSpeed(getBenchmarkTool('EdgeParse').speedSeconds)} on ${benchmarkSnapshot.hardware}. 49× faster than PyMuPDF4LLM and 2× faster than OpenDataLoader. Parallel per-page processing via Rayon — CPU only.` },
143+
{ icon: 'table', title: 'Best-in-Class Table Extraction', description: `TEDS score of ${getBenchmarkTool('EdgeParse').teds.toFixed(3)} — best in the current published comparison and 73% better than OpenDataLoader heuristic mode (${getBenchmarkTool('OpenDataLoader').teds.toFixed(3)}). Ruling-line + borderless cluster detection with merged cell support.` },
144144
{ icon: 'target', title: 'Multi-Column Reading Order', description: `XY-Cut++ reads multi-column layouts, sidebars, and mixed content in the correct logical order. NID score of ${getBenchmarkTool('EdgeParse').nid.toFixed(3)} — highest in the current benchmark snapshot.` },
145145
{ icon: 'layers', title: 'Full Document Hierarchy', description: `Headings, paragraphs, lists, figures — all classified with nesting. MHS score of ${getBenchmarkTool('EdgeParse').mhs.toFixed(3)}, best among the compared engines in the current release snapshot.` },
146146
{ icon: 'globe', title: 'WebAssembly: Runs in the Browser', description: 'The only PDF parser with a WebAssembly build. Full Rust engine in the browser — PDF data never leaves the device. No server, no uploads, offline-capable.' },

site/src/data/benchmark.ts

Lines changed: 31 additions & 31 deletions
Original file line numberDiff line numberDiff line change
@@ -9,58 +9,58 @@ export interface BenchmarkTool {
99
}
1010

1111
export const benchmarkSnapshot = {
12-
lastUpdated: "2026-03-26",
12+
lastUpdated: "2026-03-28",
1313
hardware: "Apple M4 Max",
1414
documentCount: 200,
1515
tools: [
1616
{
1717
name: "EdgeParse",
18-
nid: 0.889,
19-
teds: 0.596,
20-
mhs: 0.553,
21-
overall: 0.787,
22-
speedSeconds: 0.064,
18+
nid: 0.8846,
19+
teds: 0.5590,
20+
mhs: 0.5540,
21+
overall: 0.7811,
22+
speedSeconds: 0.007,
2323
isHighlight: true,
2424
},
2525
{
2626
name: "Docling (IBM)",
27-
nid: 0.867,
28-
teds: 0.54,
29-
mhs: 0.438,
30-
overall: 0.745,
31-
speedSeconds: 0.768,
27+
nid: 0.8665,
28+
teds: 0.5404,
29+
mhs: 0.4384,
30+
overall: 0.7452,
31+
speedSeconds: 0.584,
3232
},
3333
{
3434
name: "OpenDataLoader",
35-
nid: 0.873,
36-
teds: 0.326,
37-
mhs: 0.442,
38-
overall: 0.733,
39-
speedSeconds: 0.094,
35+
nid: 0.8611,
36+
teds: 0.3234,
37+
mhs: 0.4360,
38+
overall: 0.7233,
39+
speedSeconds: 0.014,
4040
},
4141
{
4242
name: "PyMuPDF4LLM",
43-
nid: 0.852,
44-
teds: 0.323,
45-
mhs: 0.407,
46-
overall: 0.71,
47-
speedSeconds: 0.439,
43+
nid: 0.8522,
44+
teds: 0.3233,
45+
mhs: 0.4066,
46+
overall: 0.7103,
47+
speedSeconds: 0.327,
4848
},
4949
{
5050
name: "LiteParse",
51-
nid: 0.815,
52-
teds: 0,
53-
mhs: 0.001,
54-
overall: 0.564,
55-
speedSeconds: 0.196,
51+
nid: 0.8148,
52+
teds: 0.0000,
53+
mhs: 0.0012,
54+
overall: 0.5642,
55+
speedSeconds: 0.160,
5656
},
5757
{
5858
name: "MarkItDown",
59-
nid: 0.808,
60-
teds: 0.193,
61-
mhs: 0.001,
62-
overall: 0.564,
63-
speedSeconds: 0.149,
59+
nid: 0.8075,
60+
teds: 0.1925,
61+
mhs: 0.0012,
62+
overall: 0.5639,
63+
speedSeconds: 0.123,
6464
},
6565
] satisfies BenchmarkTool[],
6666
};

0 commit comments

Comments
 (0)