Commit 39fc18c
GH-3495: Optimize PlainValuesWriter with direct ByteBuffer slab writes
PlainValuesWriter previously wrote values through a two-layer abstraction:
PlainValuesWriter -> LittleEndianDataOutputStream -> CapacityByteArrayOutputStream.
Each writeInt() decomposed the int into 4 bytes in a temp writeBuffer[8]
array, then dispatched through the OutputStream chain. Since
CapacityByteArrayOutputStream already uses ByteBuffer slabs internally, we
can write directly to the slab with putInt()/putLong() using LITTLE_ENDIAN
byte order -- a single JVM intrinsic on x86/ARM -- eliminating the byte
decomposition, temp array, and virtual dispatch.
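
As an illustration of why the two paths produce the same bytes, here is a minimal sketch. The class and method names are hypothetical and not part of the Parquet code base. It compares hand-rolled little-endian byte decomposition with a single putInt() on a LITTLE_ENDIAN ByteBuffer.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Hypothetical demo class, not part of Parquet.
public class LittleEndianWriteDemo {

    // Roughly what a byte-at-a-time stream writer does:
    // split the int by hand, least significant byte first.
    static byte[] viaByteDecomposition(int v) {
        byte[] out = new byte[4];
        out[0] = (byte) v;
        out[1] = (byte) (v >>> 8);
        out[2] = (byte) (v >>> 16);
        out[3] = (byte) (v >>> 24);
        return out;
    }

    // The direct approach: one putInt() on a little-endian buffer.
    static byte[] viaByteBuffer(int v) {
        ByteBuffer buf = ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN);
        buf.putInt(v);
        return buf.array();
    }

    public static void main(String[] args) {
        int v = 0x12345678;
        // Both paths yield the bytes 0x78, 0x56, 0x34, 0x12 in that order.
        System.out.println(Arrays.toString(viaByteDecomposition(v)));
        System.out.println(Arrays.equals(viaByteDecomposition(v), viaByteBuffer(v)));
    }
}
```

The sketch only demonstrates byte-level equivalence; it says nothing about performance, which depends on how the JIT compiles putInt() on the target platform.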
Changes:
- CapacityByteArrayOutputStream: set ByteOrder.LITTLE_ENDIAN on newly
  allocated slabs in addSlab(); add writeInt(int) and writeLong(long)
  methods that use currentSlab.putInt(v) / currentSlab.putLong(v) directly.
- PlainValuesWriter: remove the LittleEndianDataOutputStream field; route
  writeInteger/writeLong/writeFloat/writeDouble/writeBytes through the
  underlying CapacityByteArrayOutputStream directly. writeFloat and
  writeDouble use Float.floatToIntBits / Double.doubleToLongBits + the new
  writeInt/writeLong methods. getBytes() no longer needs to flush a
  buffering layer; close() no longer closes the defunct stream.
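
The changes above can be sketched roughly as follows. This is a simplified stand-in, not the actual CapacityByteArrayOutputStream: the class name SlabWriterSketch, the fixed slab size, and the overflow policy (start a fresh slab whenever the value does not fit) are all assumptions made for illustration. The real class manages slab growth and boundaries in its own way.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified slab writer; not the Parquet implementation.
public class SlabWriterSketch {
    private final int slabSize;
    private final List<ByteBuffer> slabs = new ArrayList<>();
    private ByteBuffer currentSlab;

    public SlabWriterSketch(int slabSize) {
        if (slabSize < Long.BYTES) {
            throw new IllegalArgumentException("slabSize must hold at least one long");
        }
        this.slabSize = slabSize;
        addSlab();
    }

    // Every new slab is little-endian, so putInt/putLong emit PLAIN-encoded bytes.
    private void addSlab() {
        currentSlab = ByteBuffer.allocate(slabSize).order(ByteOrder.LITTLE_ENDIAN);
        slabs.add(currentSlab);
    }

    public void writeInt(int v) {
        // Assumed policy: open a new slab rather than split a value across slabs.
        if (currentSlab.remaining() < Integer.BYTES) addSlab();
        currentSlab.putInt(v);
    }

    public void writeLong(long v) {
        if (currentSlab.remaining() < Long.BYTES) addSlab();
        currentSlab.putLong(v);
    }

    // Floats and doubles reuse the integer paths via their IEEE 754 bit patterns.
    public void writeFloat(float v) {
        writeInt(Float.floatToIntBits(v));
    }

    public void writeDouble(double v) {
        writeLong(Double.doubleToLongBits(v));
    }

    // Concatenate the written portion of each slab.
    public byte[] toByteArray() {
        int total = 0;
        for (ByteBuffer s : slabs) total += s.position();
        byte[] out = new byte[total];
        int off = 0;
        for (ByteBuffer s : slabs) {
            int n = s.position();
            System.arraycopy(s.array(), 0, out, off, n);
            off += n;
        }
        return out;
    }

    public static void main(String[] args) {
        SlabWriterSketch w = new SlabWriterSketch(8);
        w.writeInt(1);
        w.writeLong(2L);
        w.writeFloat(1.0f);
        System.out.println(w.toByteArray().length); // 4 + 8 + 4 = 16 bytes written
    }
}
```

Because there is no intermediate buffering layer in this design, reading the bytes back needs no flush step, which mirrors the getBytes() simplification described in the list.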
Benchmark (IntEncodingBenchmark.encodePlain, 100k INT32 values per
invocation, JMH -wi 3 -i 5 -f 1):
Pattern            Before (ops/s)   After (ops/s)   Improvement
SEQUENTIAL             26,817,451      52,953,193   +97.5% (2.0x)
RANDOM                 28,517,312      37,774,036   +32.5%
LOW_CARDINALITY        28,705,158      52,819,678   +84.0%
HIGH_CARDINALITY       28,595,519      37,862,571   +32.4%
The same code path also benefits writeLong, writeFloat, writeDouble, and
the length prefix written by writeBytes(Binary).
1 parent 53d7842 commit 39fc18c
2 files changed
Lines changed: 37 additions & 36 deletions
File tree
- parquet-column/src/main/java/org/apache/parquet/column/values/plain
- parquet-common/src/main/java/org/apache/parquet/bytes
Lines changed: 7 additions & 36 deletions
Lines changed: 30 additions & 0 deletions