Commit 8c00da0
apacheGH-3503: Optimize ByteStreamSplitValuesWriter with batched scatter writes
The current ByteStreamSplitValuesWriter.writeFloat/writeDouble/writeInteger/
writeLong path allocates a new byte[4] or byte[8] per value via
BytesUtils.intToBytes / BytesUtils.longToBytes, then dispatches one
single-byte CapacityByteArrayOutputStream.write(int) call per byte per
value (4 calls per float/int, 8 per double/long). For a 100k-value page
that is up to 800k single-byte virtual dispatches plus 100k short-lived
byte[] allocations.
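
To make the cost described above concrete, here is a minimal sketch of a per-value write path of that shape. It is not the parquet-java source: `PerValueSplitSketch`, `intToBytesLE`, and the `singleByteWrites` counter are hypothetical names, and `java.io.ByteArrayOutputStream` stands in for `CapacityByteArrayOutputStream`.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical stand-in for the pre-change write path; not the parquet-java source. */
final class PerValueSplitSketch {
    // One output stream per byte position of a 4-byte value.
    private final ByteArrayOutputStream[] streams = new ByteArrayOutputStream[4];

    // Counts write(int) calls, for illustration only.
    int singleByteWrites = 0;

    PerValueSplitSketch() {
        for (int i = 0; i < streams.length; i++) {
            streams[i] = new ByteArrayOutputStream();
        }
    }

    // Plays the role of BytesUtils.intToBytes: a fresh little-endian byte[4] per call.
    static byte[] intToBytesLE(int v) {
        return new byte[] {(byte) v, (byte) (v >>> 8), (byte) (v >>> 16), (byte) (v >>> 24)};
    }

    void writeInteger(int v) {
        byte[] bytes = intToBytesLE(v); // one short-lived allocation per value
        for (int i = 0; i < bytes.length; i++) {
            streams[i].write(bytes[i]); // one single-byte call per byte
            singleByteWrites++;
        }
    }

    // Returns the bytes collected so far for byte position i.
    byte[] stream(int i) {
        return streams[i].toByteArray();
    }
}
```

Writing 100,000 ints through this sketch performs 400,000 single-byte writes and 100,000 array allocations, which matches the per-float/int figure quoted above; the 8-byte types double the write count.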
This change collapses that hot path in two stacked steps:
1. Eliminate the per-value byte[] allocation by inlining the
little-endian decomposition with bit shifts into helper methods
bufferInt(int) / bufferLong(long), instead of going through
BytesUtils.intToBytes / BytesUtils.longToBytes which allocate
byte[4] / byte[8] on every call.
2. Batch values into a small per-instance scratch buffer (BATCH_SIZE = 128)
and flush them with one bulk write(byte[], off, len) call per stream per
flush. For a batch of N values this replaces N * elementSizeInBytes
single-byte virtual dispatches with elementSizeInBytes bulk writes. The
batch is flushed automatically when full and on getBytes(), and pending
values are included in getBufferedSize() so page sizing decisions remain
correct. reset() and close() clear the pending batch.
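
The two steps can be sketched together as follows. This is an illustration under stated assumptions, not the committed code: only `BATCH_SIZE` and `bufferInt` are names taken from the message above. The class name, the `scratch` layout (one array per byte position), the `bulkWrites` counter, and the use of `java.io.ByteArrayOutputStream` in place of `CapacityByteArrayOutputStream` are all hypothetical.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical sketch of batched scatter writes for 4-byte values. */
final class BatchedSplitSketch {
    static final int BATCH_SIZE = 128;
    static final int ELEMENT_SIZE = 4;

    private final ByteArrayOutputStream[] streams = new ByteArrayOutputStream[ELEMENT_SIZE];
    // scratch[k] collects byte k of every pending value (assumed layout).
    private final byte[][] scratch = new byte[ELEMENT_SIZE][BATCH_SIZE];
    private int pending = 0;

    // Counts bulk write calls, for illustration only.
    int bulkWrites = 0;

    BatchedSplitSketch() {
        for (int k = 0; k < ELEMENT_SIZE; k++) {
            streams[k] = new ByteArrayOutputStream();
        }
    }

    // Step 1: little-endian decomposition with shifts, no byte[] allocation.
    void bufferInt(int v) {
        scratch[0][pending] = (byte) v;
        scratch[1][pending] = (byte) (v >>> 8);
        scratch[2][pending] = (byte) (v >>> 16);
        scratch[3][pending] = (byte) (v >>> 24);
        if (++pending == BATCH_SIZE) {
            flush();
        }
    }

    // Step 2: one bulk write per stream per flush.
    private void flush() {
        if (pending == 0) {
            return;
        }
        for (int k = 0; k < ELEMENT_SIZE; k++) {
            streams[k].write(scratch[k], 0, pending);
            bulkWrites++;
        }
        pending = 0;
    }

    // Pending values count toward the buffered size so sizing stays accurate.
    long getBufferedSize() {
        long size = (long) pending * ELEMENT_SIZE;
        for (ByteArrayOutputStream s : streams) {
            size += s.size();
        }
        return size;
    }

    // Flushes the pending batch, then concatenates the streams in order.
    byte[] getBytes() {
        flush();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (ByteArrayOutputStream s : streams) {
            out.write(s.toByteArray(), 0, s.size());
        }
        return out.toByteArray();
    }

    // Discards both the pending batch and the flushed bytes.
    void reset() {
        pending = 0;
        for (ByteArrayOutputStream s : streams) {
            s.reset();
        }
    }
}
```

In this sketch, buffering 300 ints triggers two automatic flushes (8 bulk writes) and leaves 44 values pending; `getBufferedSize()` still reports 1,200 bytes, and `getBytes()` performs the final 4 bulk writes.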
Benchmark (ByteStreamSplitEncodingBenchmark, 100k values per invocation,
JMH -wi 3 -i 5 -f 1):

Type    Before (ops/s)  After (ops/s)  Improvement
Float       99,333,148    536,838,625  +440% (5.4x)
Double      49,754,756    411,012,257  +726% (8.3x)
Int         97,458,782    534,894,208  +449% (5.5x)
Long        50,862,770    423,182,754  +732% (8.3x)
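
For readers checking the Improvement column, the arithmetic can be reproduced with a small helper. This is only a sketch; `SpeedupSketch` and its method names are hypothetical and not part of the benchmark.

```java
/** Hypothetical helper for reproducing the Improvement column. */
final class SpeedupSketch {
    // Ratio of throughputs, e.g. 5.4 means "5.4x".
    static double speedup(double beforeOpsPerSec, double afterOpsPerSec) {
        return afterOpsPerSec / beforeOpsPerSec;
    }

    // Relative gain in percent, e.g. 440 means "+440%".
    static double improvementPercent(double beforeOpsPerSec, double afterOpsPerSec) {
        return (afterOpsPerSec - beforeOpsPerSec) / beforeOpsPerSec * 100.0;
    }
}
```

For the Float row, `improvementPercent(99_333_148, 536_838_625)` rounds to 440 and `speedup` rounds to 5.4, in line with the table.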
All 573 parquet-column tests pass.
1 parent 53d7842 commit 8c00da0
1 file changed
Lines changed: 88 additions & 6 deletions
File tree
- parquet-column/src/main/java/org/apache/parquet/column/values/bytestreamsplit