Commit 15569bb
OnPair decoder: combined (offset|length) table + skip canonicalize double-copy
Two production improvements, each backed by measured benchmarks. A
side-by-side microbenchmark compared four candidate decoders on the same
compressed array; only the winning variant was kept (numbers below).
Combined `(offset << 16) | length` table
----------------------------------------
`OwnedDecodeInputs::collect` now packs the offset and length of each
dictionary entry (from `dict_offsets`) into a single `Buffer<u64>` table at
materialise time. The hot decode loop loads one u64 per token instead of two
adjacent u32s (`entry = *table_ptr.add(c); off = entry >> 16;
len = entry & 0xffff`), matching the strategy
`onpair_cpp/include/onpair/decoding/decoder.h` uses on its hot path. The
table costs `dict_size * 8` bytes (32 KiB at dict-12, i.e. 4096 entries),
which is amortised over every row decode and trivially small next to the
row payload.
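
As an illustration of the packing scheme, here is a minimal safe-Rust sketch. It is not the code from this commit: `build_packed_table` and `decode` are hypothetical names, the sketch assumes `dict_offsets` is a prefix-sum array with `dict_size + 1` entries, and it uses bounds-checked slice indexing where the real hot loop uses raw pointers.

```rust
/// Pack each dictionary entry into one u64: byte offset in the high bits,
/// length in the low 16 bits. Assumes `dict_offsets` has one more element
/// than there are entries (prefix-sum layout); that layout is an assumption.
fn build_packed_table(dict_offsets: &[u32]) -> Vec<u64> {
    dict_offsets
        .windows(2)
        .map(|w| {
            let off = w[0] as u64;
            let len = (w[1] - w[0]) as u64;
            // The 16-bit length field cannot represent longer tokens.
            assert!(len <= 0xffff, "token length must fit in 16 bits");
            (off << 16) | len
        })
        .collect()
}

/// Decode a stream of token codes by doing one table load per token and
/// splitting the loaded entry back into (offset, length).
fn decode(codes: &[u16], table: &[u64], dict_bytes: &[u8], out: &mut Vec<u8>) {
    for &c in codes {
        let entry = table[c as usize];
        let off = (entry >> 16) as usize;
        let len = (entry & 0xffff) as usize;
        out.extend_from_slice(&dict_bytes[off..off + len]);
    }
}
```

With a dictionary holding `http`, `www.` and `com` back to back (offsets `[0, 4, 8, 11]`), entry 1 packs to `(4 << 16) | 4`, and decoding codes `[0, 1, 2]` reproduces `httpwww.com`.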
Drop double-copy in `canonicalize_onpair`
-----------------------------------------
Previously the canonical buffer was assembled as:
    let mut buf: Vec<u8> = Vec::with_capacity(total + MAX_TOKEN_SIZE);
    dv.decode_rows_into_with_size(0, n, total, &mut buf);
    let mut out_bytes = ByteBufferMut::with_capacity(buf.len());
    out_bytes.extend_from_slice(&buf); // ← second memcpy
Now we decode straight into `ByteBufferMut::spare_capacity_mut()`, so the
entire decoded payload is written exactly once.
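
The single-write pattern can be sketched with the standard library alone. In this sketch `Vec<u8>` stands in for `ByteBufferMut`, and `decode_into` and `canonicalize` are hypothetical helpers, not the functions touched by this commit. It also allocates exactly `total` bytes, whereas the earlier code reserved `total + MAX_TOKEN_SIZE` of headroom.

```rust
use std::mem::MaybeUninit;

/// Hypothetical stand-in for the decoder: writes every token into `dst`
/// back to back and returns the number of bytes it initialised.
fn decode_into(tokens: &[&[u8]], dst: &mut [MaybeUninit<u8>]) -> usize {
    let mut cursor = 0;
    for tok in tokens {
        let window = &mut dst[cursor..cursor + tok.len()];
        for (slot, &b) in window.iter_mut().zip(tok.iter()) {
            *slot = MaybeUninit::new(b);
        }
        cursor += tok.len();
    }
    cursor
}

/// Decode straight into the output buffer's spare capacity, so each byte
/// of the payload is written once and never copied again.
fn canonicalize(tokens: &[&[u8]]) -> Vec<u8> {
    let total: usize = tokens.iter().map(|t| t.len()).sum();
    let mut out: Vec<u8> = Vec::with_capacity(total);
    let written = decode_into(tokens, out.spare_capacity_mut());
    assert_eq!(written, total);
    // SAFETY: `decode_into` initialised the first `written` bytes of the
    // spare capacity, and `written == total <= out.capacity()`.
    unsafe { out.set_len(written) };
    out
}
```

The only unsafe step is `set_len`, which is sound here because the decoder reports exactly how many bytes it initialised.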
Strategies that lost the bench (see git history for the full
benchmark + experimental variants):
* Padding every dict entry to 16 B (no `dict_offsets`, straight `c * 16`
  lookup): 25 % faster on 10 K and 100 K rows but **3.6× slower on 1 M
  rows**, because the extra working set spilled out of L2 cache.
* Non-temporal stores (`_mm_stream_si128`): catastrophic. The
  `cursor % 16` realign branch + `sfence` per token made it 17× slower.
Final numbers (release, URL/log corpus, dict-12, 30 samples)
------------------------------------------------------------
                            before     after    speedup
    raw decode     10 K      60 µs     56 µs    1.07×
    raw decode    100 K     693 µs    635 µs    1.09×
    raw decode      1 M     9.5 ms    9.6 ms    ≈ 1×
    canonicalize   10 K     190 µs    171 µs    1.11×
    canonicalize  100 K    2.35 ms   1.85 ms    1.27×
    canonicalize    1 M      55 ms   29.7 ms    **1.85×**
The raw-decode-only speedup is modest (the inner loop is already
memory-bound at 1 M), but the canonicalize end-to-end win is dominated
by the dropped second memcpy.
Verified
* `cargo test -p vortex-onpair -p vortex-btrblocks` — all green.
* `cargo test -p vortex-file --features onpair,tokio
--test test_onpair_string_roundtrip` — all 5 green.
Signed-off-by: Claude <noreply@anthropic.com>
1 parent d9a6c8c commit 15569bb
3 files changed
Lines changed: 122 additions & 28 deletions