Comparison of `zodb-json-codec` (Rust + PyO3) vs CPython's `pickle` module
for ZODB record encoding/decoding.

- Measured on: 2026-02-23
+ Measured on: 2026-02-24
Python: 3.13.9, PyO3: 0.28, 500 iterations, 100 warmup
- Build: `maturin develop --release` (optimized)
+ Build: `maturin develop --release` (optimized, LTO + codegen-units=1)

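The "500 iterations, 100 warmup" setting in the header can be pictured with a small timing loop. This is only a sketch of that measurement style, not the project's actual `bench.py`: the `bench` helper and the sample `state` dict are hypothetical, and only CPython's `pickle` side is timed, since the codec's Python API is not shown in this document.

```python
import pickle
import statistics
import time


def bench(fn, arg, iterations=500, warmup=100):
    """Call fn(arg) repeatedly and return per-call durations in microseconds."""
    for _ in range(warmup):  # untimed calls so caches are warm before measuring
        fn(arg)
    samples = []
    for _ in range(iterations):
        start = time.perf_counter_ns()
        fn(arg)
        samples.append((time.perf_counter_ns() - start) / 1_000)
    return samples


# Hypothetical stand-in for a ZODB record state, not one of the real fixtures.
state = {"title": "Example", "count": 42, "tags": ["a", "b"]}
data = pickle.dumps(state, protocol=3)

decode_us = bench(pickle.loads, data)
encode_us = bench(pickle.dumps, state)
print(f"decode median: {statistics.median(decode_us):.2f} us")
print(f"encode median: {statistics.median(encode_us):.2f} us")
```

Timing each call separately keeps the individual samples, which is what makes median and percentile figures possible later.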
## Context

@@ -28,46 +28,46 @@ are 3-8x slower due to missing optimizations and inlining.

| Category | Python | Codec | Ratio |
| --- | --- | --- | --- |
- | simple_flat_dict (120 B) | 1.9 us | 1.3 us | **1.4x faster** |
- | nested_dict (187 B) | 2.6 us | 2.0 us | **1.3x faster** |
- | large_flat_dict (2.5 KB) | 23.4 us | 20.7 us | **1.1x faster** |
- | bytes_in_state (1 KB) | 2.1 us | 2.0 us | 1.0x |
- | special_types (314 B) | 6.9 us | 5.3 us | **1.3x faster** |
- | btree_small (112 B) | 1.7 us | 1.8 us | 1.0x |
- | btree_length (44 B) | 1.0 us | 0.6 us | **1.7x faster** |
- | scalar_string (72 B) | 1.1 us | 0.7 us | **1.6x faster** |
- | wide_dict (27 KB) | 268 us | 260 us | 1.0x |
- | deep_nesting (379 B) | 7.1 us | 7.4 us | 1.0x slower |
+ | simple_flat_dict (120 B) | 1.9 us | 1.1 us | **1.8x faster** |
+ | nested_dict (187 B) | 2.9 us | 1.8 us | **1.6x faster** |
+ | large_flat_dict (2.5 KB) | 22.8 us | 19.7 us | **1.2x faster** |
+ | bytes_in_state (1 KB) | 1.8 us | 1.9 us | 1.1x slower |
+ | special_types (314 B) | 6.8 us | 4.7 us | **1.5x faster** |
+ | btree_small (112 B) | 1.9 us | 1.8 us | 1.1x faster |
+ | btree_length (44 B) | 1.0 us | 0.5 us | **2.0x faster** |
+ | scalar_string (72 B) | 1.1 us | 0.5 us | **2.1x faster** |
+ | wide_dict (27 KB) | 264 us | 279 us | 1.1x slower |
+ | deep_nesting (379 B) | 7.2 us | 7.3 us | 1.0x |

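The Ratio column reads as the slower time divided by the faster one, labeled from the codec's point of view. A minimal sketch of that convention follows; `ratio_label` is a hypothetical helper and the parity rule is inferred from the updated rows, not taken from the benchmark script. Because the table shows rounded timings, recomputing from the displayed values can differ from the published ratio by 0.1x.

```python
def ratio_label(python_us, codec_us):
    """Format a speed ratio in the style of the tables' Ratio column."""
    if codec_us <= python_us:
        ratio, word = python_us / codec_us, "faster"
    else:
        ratio, word = codec_us / python_us, "slower"
    text = f"{ratio:.1f}x"
    # A ratio that rounds to 1.0x is reported as parity, without a direction.
    return text if text == "1.0x" else f"{text} {word}"


print(ratio_label(22.8, 19.7))  # large_flat_dict decode
print(ratio_label(264, 279))    # wide_dict decode
print(ratio_label(7.2, 7.3))    # deep_nesting decode
```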
### Encode (Python dict -> pickle bytes)

| Category | Python | Codec | Ratio |
| --- | --- | --- | --- |
- | simple_flat_dict | 1.4 us | 0.3 us | **4.7x faster** |
- | nested_dict | 1.5 us | 0.4 us | **3.9x faster** |
- | large_flat_dict | 5.6 us | 1.9 us | **2.9x faster** |
- | bytes_in_state | 1.4 us | 1.1 us | **1.3x faster** |
- | special_types | 4.9 us | 1.1 us | **4.6x faster** |
- | btree_small | 1.3 us | 0.2 us | **5.1x faster** |
- | btree_length | 1.0 us | 0.2 us | **6.0x faster** |
- | scalar_string | 1.0 us | 0.1 us | **7.0x faster** |
- | wide_dict | 59.6 us | 20.6 us | **2.9x faster** |
- | deep_nesting | 2.7 us | 1.6 us | **1.7x faster** |
+ | simple_flat_dict | 1.3 us | 0.2 us | **5.3x faster** |
+ | nested_dict | 1.6 us | 0.4 us | **4.5x faster** |
+ | large_flat_dict | 5.9 us | 1.7 us | **3.8x faster** |
+ | bytes_in_state | 1.4 us | 0.9 us | **1.7x faster** |
+ | special_types | 4.6 us | 0.9 us | **5.0x faster** |
+ | btree_small | 1.3 us | 0.2 us | **5.8x faster** |
+ | btree_length | 1.1 us | 0.1 us | **7.5x faster** |
+ | scalar_string | 1.0 us | 0.1 us | **6.6x faster** |
+ | wide_dict | 59.2 us | 15.7 us | **3.7x faster** |
+ | deep_nesting | 2.7 us | 1.4 us | **1.9x faster** |

### Full Roundtrip (decode + encode)

| Category | Python | Codec | Ratio |
| --- | --- | --- | --- |
- | simple_flat_dict | 3.3 us | 1.5 us | **2.1x faster** |
- | nested_dict | 4.5 us | 2.6 us | **1.7x faster** |
- | large_flat_dict | 28.7 us | 24.3 us | **1.2x faster** |
- | bytes_in_state | 3.3 us | 3.2 us | 1.0x |
- | special_types | 12.4 us | 6.1 us | **2.0x faster** |
- | btree_small | 3.2 us | 2.3 us | **1.4x faster** |
- | btree_length | 2.1 us | 0.8 us | **2.7x faster** |
- | scalar_string | 2.1 us | 0.9 us | **2.4x faster** |
- | wide_dict | 345 us | 293 us | **1.2x faster** |
- | deep_nesting | 10.6 us | 10.2 us | 1.0x |
+ | simple_flat_dict | 3.2 us | 1.5 us | **2.1x faster** |
+ | nested_dict | 4.5 us | 2.2 us | **2.0x faster** |
+ | large_flat_dict | 29.7 us | 21.8 us | **1.4x faster** |
+ | bytes_in_state | 3.3 us | 3.0 us | 1.1x faster |
+ | special_types | 11.7 us | 6.0 us | **2.0x faster** |
+ | btree_small | 5.8 us | 2.1 us | **2.8x faster** |
+ | btree_length | 2.1 us | 0.7 us | **3.2x faster** |
+ | scalar_string | 2.3 us | 0.8 us | **3.1x faster** |
+ | wide_dict | 316 us | 232 us | **1.4x faster** |
+ | deep_nesting | 10.3 us | 9.2 us | 1.1x faster |

### Size Comparison (pickle bytes vs JSON)

@@ -100,12 +100,12 @@ Generate with: `python benchmarks/bench.py generate`

| Metric | Codec | Python | Speedup |
| --- | --- | --- | --- |
- | Decode mean | 30.5 us | 24.2 us | 1.3x slower |
- | Decode median | 26.1 us | 23.4 us | 1.1x slower |
- | Decode P95 | 43.2 us | 36.1 us | 1.2x slower |
- | Encode mean | 7.5 us | 19.3 us | **2.6x faster** |
- | Encode median | 6.8 us | 20.9 us | **3.1x faster** |
- | Encode P95 | 13.2 us | 31.9 us | **2.4x faster** |
+ | Decode mean | 28.7 us | 23.7 us | 1.2x slower |
+ | Decode median | 24.7 us | 22.6 us | 1.1x slower |
+ | Decode P95 | 42.3 us | 36.3 us | 1.2x slower |
+ | Encode mean | 7.0 us | 18.8 us | **2.7x faster** |
+ | Encode median | 6.2 us | 20.4 us | **3.3x faster** |
+ | Encode P95 | 12.8 us | 31.5 us | **2.5x faster** |
| Total pickle | 5.1 MB | n/a | n/a |
| Total JSON | 7.2 MB | n/a | 1.41x |

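The mean, median, and P95 rows above summarize a list of per-record timings. One way such a summary could be computed is sketched below. The `summarize` helper and the sample values are hypothetical, and the benchmark script may define P95 differently (several percentile conventions exist).

```python
import statistics


def summarize(samples_us):
    """Return mean, median, and 95th percentile of timings in microseconds."""
    # quantiles(n=20) yields 19 cut points; the last one is the 95th percentile.
    p95 = statistics.quantiles(samples_us, n=20)[-1]
    return {
        "mean": statistics.fmean(samples_us),
        "median": statistics.median(samples_us),
        "p95": p95,
    }


# Hypothetical timings, for illustration only.
summary = summarize([float(v) for v in range(1, 101)])
print(summary)
```

Reporting the median alongside the mean is useful here because a few slow records pull the mean up, which matches the mean sitting above the median in every row of the table.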
@@ -114,7 +114,7 @@ fundamentally more work than CPython's C-extension pickle: two conversions
(pickle bytes → Rust AST → Python objects) plus type-aware transformation.
The gap narrows on metadata-heavy records (small dicts with mixed types).

- Encode is consistently **2.4-3.1x faster** because the Rust encoder writes
+ Encode is consistently **2.5-3.3x faster** because the Rust encoder writes
pickle opcodes directly from Python objects, bypassing intermediate
allocations that CPython's pickle module incurs.

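To make "pickle opcodes" concrete, the standard library's `pickletools` can list the opcode stream that any encoder has to produce for a record. This sketch uses CPython's own `pickle` on a hypothetical one-key state; it shows the target format only and says nothing about how the Rust encoder is implemented internally.

```python
import pickle
import pickletools

# Hypothetical record state; protocol 3 is assumed here for illustration.
record = {"title": "Example"}
data = pickle.dumps(record, protocol=3)

# genops yields (opcode, argument, byte position) for each opcode in the stream.
opcodes = [op.name for op, arg, pos in pickletools.genops(data)]
print(opcodes)
```

Running `pickletools.dis(data)` instead prints the same stream with arguments and byte offsets, which is handy when comparing the output of two encoders.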
@@ -131,11 +131,11 @@ allocations that CPython's pickle module incurs.

The codec **beats CPython pickle** on decode for 8 of 10 synthetic categories,
and on encode for **all 10 categories**. On the generated FileStorage data,
- decode is near parity (1.1x median) while encode is **2.4-3.1x faster**.
+ decode is near parity (1.1x median) while encode is **2.5-3.3x faster**.

The sweet spot is typical ZODB objects (5-50 keys, mixed types, datetime
- fields, persistent refs) where the codec is **1.3-1.7x faster** decode and
- **3-7x faster** encode while also producing queryable JSONB output.
+ fields, persistent refs) where the codec is **1.5-2.0x faster** decode and
+ **4-7x faster** encode while also producing queryable JSONB output.

Decode overhead comes from the codec's two-pass conversion plus type
transformation. On string-dominated payloads this matters more; on
@@ -198,8 +198,30 @@ is competitive or faster.
    `PickleValue` enum from 56 to 48 bytes, improving cache utilization
    across the entire decode/encode pipeline (-13% weighted average).

+ 15. **Thin LTO + single codegen unit**: `lto = "thin"` + `codegen-units = 1`
+     in the release profile enables cross-crate inlining and whole-crate
+     optimization. A 5-9% improvement across decode and encode on the
+     FileStorage benchmark, with no code changes.
+
## Changelog

+ ### 1.3.1 (2026-02-24): LTO release profile optimization
+
+ Enabled thin LTO (`lto = "thin"`) and single codegen unit (`codegen-units = 1`)
+ in the Cargo release profile. This allows LLVM to inline across crate boundaries
+ and optimize the entire crate as a single compilation unit.
+
+ Impact on FileStorage benchmark (1,692 records):
+
+ | Metric | Before | After | Improvement |
+ | --- | --- | --- | --- |
+ | Decode median | 26.1 us | 24.7 us | **-5.4%** |
+ | Decode mean | 30.5 us | 28.7 us | **-5.9%** |
+ | Encode median | 6.8 us | 6.2 us | **-8.8%** |
+ | Encode mean | 7.5 us | 7.0 us | **-6.7%** |
+
+ Zero code changes: purely a build configuration improvement.
+
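The Improvement column in the 1.3.1 table is the relative change from Before to After, so a negative value means less time per record. A short check of that arithmetic, using the four rows of the table (the `percent_change` helper is named here for illustration only):

```python
def percent_change(before_us, after_us):
    """Relative change in percent; negative means the new build is faster."""
    return (after_us - before_us) / before_us * 100


# (before, after) timings in microseconds, copied from the 1.3.1 table.
rows = {
    "Decode median": (26.1, 24.7),
    "Decode mean": (30.5, 28.7),
    "Encode median": (6.8, 6.2),
    "Encode mean": (7.5, 7.0),
}

changes = {name: round(percent_change(b, a), 1) for name, (b, a) in rows.items()}
for name, change in changes.items():
    print(f"{name}: {change:+.1f}%")
```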
### 2026-02-23: Dict/list subclass support + PickleValue boxing optimization

Added support for pickle SETITEMS/SETITEM/APPENDS/APPEND on Reduce and