Commit 9292dee

jensens and claude committed
Add benchmark generate subcommand with Wikipedia seed data
Adds a `generate` subcommand to bench.py that creates a reproducible FileStorage from Wikipedia seed data (1,062 multilingual articles). Body text truncated with exponential skew (500-10,000 chars) for realistic ZODB record sizes.

Updates FileStorage benchmark section in BENCHMARKS.md and README.md with fresh numbers.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 7d50a6b commit 9292dee
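
The commit message describes truncating article bodies to 500-10,000 characters with an exponential skew toward shorter texts. A minimal sketch of one way such lengths could be drawn is shown here; the names `skewed_length` and `truncate_body`, the `scale` parameter, and the use of `random.expovariate` are assumptions for illustration, not the code in `bench.py`.

```python
import random

MIN_CHARS = 500      # lower bound stated in the commit message
MAX_CHARS = 10_000   # upper bound stated in the commit message


def skewed_length(rng: random.Random, scale: float = 2_000.0) -> int:
    """Draw a target length in [MIN_CHARS, MAX_CHARS], skewed toward short texts.

    `scale` is a hypothetical knob: the mean of the exponential offset
    added on top of MIN_CHARS.
    """
    offset = rng.expovariate(1.0 / scale)  # exponential draw with mean `scale`
    return min(MAX_CHARS, MIN_CHARS + int(offset))


def truncate_body(text: str, rng: random.Random) -> str:
    """Cut `text` to a skewed target length; shorter texts pass through unchanged."""
    return text[: skewed_length(rng)]


# A fixed seed keeps the drawn lengths reproducible across runs.
rng = random.Random(42)
lengths = [skewed_length(rng) for _ in range(1_000)]
```

Clamping at `MAX_CHARS` piles the rare long draws onto the upper bound; a real generator might resample instead, depending on the size distribution it wants.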

4 files changed

Lines changed: 331 additions & 32 deletions

.gitignore

Lines changed: 3 additions & 0 deletions
@@ -23,3 +23,6 @@ dist/
 # Test/build artifacts
 .pytest_cache/
 *.prof
+
+# Generated benchmark data
+benchmarks/bench_data/

BENCHMARKS.md

Lines changed: 51 additions & 25 deletions
@@ -88,40 +88,60 @@ JSON is typically smaller than pickle for string-heavy data (wide_dict: 42%
 smaller). It is larger for binary data (base64 overhead) and deeply nested
 structures (marker overhead).
 
-## FileStorage Scan (Real Plone 6 Database)
+## FileStorage Scan (Generated Wikipedia Database)
 
-8,422 records, 182 distinct types, 0 errors.
+1,692 records, 6 distinct types, 0 errors. Generated from 1,062 multilingual
+Wikipedia articles (en/de/zh) with body text truncated to 500-10,000 chars
+(exponential skew toward shorter texts), enriched type-diverse fields
+(datetime, date, timedelta, Decimal, UUID, frozenset, set, tuple, bytes)
+plus OOBTree containers, group summaries, and edge-case objects.
+
+Generate with: `python benchmarks/bench.py generate`
 
 | Metric | Codec | Python | Speedup |
 |---|---|---|---|
-| Decode mean | 5.3 us | 100.1 us | **18.7x faster** |
-| Decode median | 3.6 us | 4.6 us | **1.3x faster** |
-| Decode P95 | 11.6 us | 10.1 us | 1.1x slower |
-| Encode mean | 1.1 us | 3.8 us | **3.5x faster** |
-| Encode median | 0.7 us | 2.9 us | **4.1x faster** |
-| Encode P95 | 2.7 us | 7.0 us | **2.6x faster** |
-| Total pickle | 3.1 MB |||
-| Total JSON | 4.1 MB || 1.30x |
-
-The codec's mean decode speedup (18.7x) far exceeds median (1.3x) because
-Python pickle has extreme outliers (max 365 ms) that the Rust codec avoids
-(max 2.4 ms). This matters for tail latency in web applications.
+| Decode mean | 30.5 us | 24.2 us | 1.3x slower |
+| Decode median | 26.1 us | 23.4 us | 1.1x slower |
+| Decode P95 | 43.2 us | 36.1 us | 1.2x slower |
+| Encode mean | 7.5 us | 19.3 us | **2.6x faster** |
+| Encode median | 6.8 us | 20.9 us | **3.1x faster** |
+| Encode P95 | 13.2 us | 31.9 us | **2.4x faster** |
+| Total pickle | 5.1 MB |||
+| Total JSON | 7.2 MB || 1.41x |
+
+The codec is slightly slower on decode (1.1x median) because it does
+fundamentally more work than CPython's C-extension pickle: two conversions
+(pickle bytes → Rust AST → Python objects) plus type-aware transformation.
+The gap narrows on metadata-heavy records (small dicts with mixed types).
+
+Encode is consistently **2.4-3.1x faster** because the Rust encoder writes
+pickle opcodes directly from Python objects, bypassing intermediate
+allocations that CPython's pickle module incurs.
+
+| Record type | Count | % |
+|---|---|---|
+| persistent.mapping.PersistentMapping | 1,188 | 70.2% |
+| BTrees.OOBTree.OOBucket | 342 | 20.2% |
+| persistent.list.PersistentList | 100 | 5.9% |
+| BTrees.OOBTree.OOBTree | 55 | 3.3% |
+| BTrees.Length.Length | 5 | 0.3% |
+| BTrees.OIBTree.OIBTree | 2 | 0.1% |
 
 ## Analysis
 
 The codec **beats CPython pickle** on decode for 8 of 10 synthetic categories,
-and on encode for **all 10 categories**. On real Plone data, both decode and
-encode are faster across all statistical measures.
-
-The remaining decode-parity cases:
-
-- **btree_small decode**: at parity (1.0x) — small payload, minimal work
-- **deep_nesting decode**: recursive marker prefix scanning on nested dicts
+and on encode for **all 10 categories**. On the generated FileStorage data,
+decode is near parity (1.1x median) while encode is **2.4-3.1x faster**.
 
 The sweet spot is typical ZODB objects (5-50 keys, mixed types, datetime
 fields, persistent refs) where the codec is **1.3-1.7x faster** decode and
 **3-7x faster** encode while also producing queryable JSONB output.
 
+Decode overhead comes from the codec's two-pass conversion plus type
+transformation. On string-dominated payloads this matters more; on
+metadata-rich records with mixed types (the typical ZODB case) the codec
+is competitive or faster.
+
 ## Optimizations Applied
 
 1. **Direct PickleValue <-> PyObject** (`src/pyconv.rs`) — bypasses the
@@ -219,9 +239,15 @@ maturin develop --release
 # Synthetic micro-benchmarks
 python benchmarks/bench.py synthetic --iterations 1000
 
-# Scan a real FileStorage
-python benchmarks/bench.py filestorage /path/to/Data.fs
+# Generate a reproducible benchmark FileStorage (requires ZODB + BTrees)
+python benchmarks/bench.py generate
+# Custom paths:
+python benchmarks/bench.py generate --output /tmp/bench.fs \
+--seed-data path/to/seed_data.json.gz
+
+# Scan the generated (or any) FileStorage
+python benchmarks/bench.py filestorage benchmarks/bench_data/Data.fs
 
-# Both, with JSON export for tracking
-python benchmarks/bench.py all --filestorage /path/to/Data.fs --output results.json
+# Both synthetic + filestorage, with JSON export
+python benchmarks/bench.py all --filestorage benchmarks/bench_data/Data.fs --output results.json
 ```

README.md

Lines changed: 7 additions & 7 deletions
@@ -86,13 +86,12 @@ categories:
 
 | Operation | Best | Worst | Typical ZODB |
 |---|---|---|---|
-| Decode | **1.8x faster** | 1.1x slower | 1.3x faster |
-| Encode | **7.0x faster** | 1.4x faster | 4.0x faster |
-| Roundtrip | **2.9x faster** | 1.3x slower | 2.0x faster |
+| Decode | **1.7x faster** | 1.0x slower | 1.3x faster |
+| Encode | **7.0x faster** | 1.3x faster | 4.0x faster |
+| Roundtrip | **2.7x faster** | 1.0x | 2.0x faster |
 
-On a real Plone 6 database (8,400+ records, 182 distinct types, 0 errors):
-decode is **1.3x faster** (median), **18.7x faster** mean; encode is
-**3.5x faster** (median). Python pickle's extreme outliers are eliminated.
+On a generated Wikipedia database (1,692 records, 6 types, 0 errors):
+decode is near parity (1.1x median), encode is **3.1x faster** (median).
 
 For detailed numbers and optimization history, see [BENCHMARKS.md](BENCHMARKS.md).
 
@@ -121,7 +120,8 @@ maturin develop --release
 
 # Run benchmarks
 python benchmarks/bench.py synthetic --iterations 1000
-python benchmarks/bench.py filestorage /path/to/Data.fs
+python benchmarks/bench.py generate # create benchmark FileStorage
+python benchmarks/bench.py filestorage benchmarks/bench_data/Data.fs
 ```
 
 ### Project Structure
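
The Speedup column in the updated FileStorage table appears to be the ratio of the two timings, labelled "faster" when the codec takes less time and "slower" otherwise. The sketch below reproduces that reading and shows one way to summarize per-record timings into mean, median, and P95. The helper names `speedup_label` and `summarize` are hypothetical, and `bench.py` may compute its labels from unrounded measurements, so recomputing from the rounded table values will not always match.

```python
import statistics


def speedup_label(codec_us: float, python_us: float) -> str:
    """Format a codec-vs-Python timing pair in the style of the Speedup column."""
    if codec_us <= python_us:
        return f"{python_us / codec_us:.1f}x faster"
    return f"{codec_us / python_us:.1f}x slower"


def summarize(samples_us: list[float]) -> dict[str, float]:
    """Mean, median, and 95th percentile of per-record timings (microseconds)."""
    # quantiles(n=20) yields 19 cut points; the last is the 95th percentile.
    # method="inclusive" keeps the estimate within the observed range.
    p95 = statistics.quantiles(samples_us, n=20, method="inclusive")[-1]
    return {
        "mean": statistics.fmean(samples_us),
        "median": statistics.median(samples_us),
        "p95": p95,
    }


# Rows taken from the updated FileStorage table:
print(speedup_label(30.5, 24.2))  # decode mean   -> 1.3x slower
print(speedup_label(6.8, 20.9))   # encode median -> 3.1x faster
```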
