Commit 0165e61
GH-3495: Deprecate LittleEndianDataOutputStream and remove remaining wrapper usages
This is an API cleanup commit with no measurable performance impact;
it removes the last two production usages of LittleEndianDataOutputStream
so the class can be deprecated.
After the previous commit removed LittleEndianDataOutputStream from
PlainValuesWriter, two production usages remained:
- FixedLenByteArrayPlainValuesWriter wrapped its CapacityByteArrayOutputStream
in a LittleEndianDataOutputStream solely to call Binary.writeTo(out) for the
fixed-length payload. The fixed-length encoding has no length prefix and the
wrapper exposed no LE-specific behavior used here -- Binary.writeTo() only
invokes OutputStream.write(byte[], int, int), which the wrapper passes
through unchanged. The wrapper has been removed and the writer now writes
the binary payload directly to the underlying CapacityByteArrayOutputStream.
The wrapper-specific flush() in getBytes() is also gone
(CapacityByteArrayOutputStream does not buffer).
- DeltaLengthByteArrayValuesWriter had the same pattern: a wrapper used only
for v.writeTo(out) on the concatenated byte-array payload, with lengths
written through a separate DeltaBinaryPackingValuesWriterForInteger. The
wrapper has been removed for the same reasons.
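The pass-through argument above can be illustrated with a minimal sketch. It
uses only java.io and is not the parquet-java code: PassThroughWrapper is a
hypothetical stand-in for the removed wrapper, and ByteArrayOutputStream
stands in for CapacityByteArrayOutputStream. Under those assumptions, a bulk
write(byte[], int, int) yields the same bytes with or without the wrapper.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

public class WrapperPassThroughSketch {

  // Hypothetical stand-in for a wrapper whose bulk write simply delegates.
  static final class PassThroughWrapper extends FilterOutputStream {
    PassThroughWrapper(OutputStream out) {
      super(out);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
      out.write(b, off, len); // no byte reordering, no length prefix
    }
  }

  // Old shape: payload goes through the wrapper, then the wrapper is flushed.
  static byte[] writeThroughWrapper(byte[] payload) {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    try {
      OutputStream wrapped = new PassThroughWrapper(sink);
      wrapped.write(payload, 0, payload.length);
      wrapped.flush();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return sink.toByteArray();
  }

  // New shape: payload goes straight to the underlying stream.
  static byte[] writeDirect(byte[] payload) {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    sink.write(payload, 0, payload.length);
    return sink.toByteArray();
  }

  public static void main(String[] args) {
    byte[] payload = {0x01, 0x02, 0x03, 0x04};
    // prints "true"
    System.out.println(Arrays.equals(writeThroughWrapper(payload), writeDirect(payload)));
  }
}
```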
With no remaining production usages, LittleEndianDataOutputStream is marked
@deprecated. The class is retained for binary compatibility (it is part of
the public parquet-common API) and will be removed in a future major release.
The javadoc directs producers of PLAIN-encoded data to write little-endian
values directly into a ByteBuffer with ByteOrder.LITTLE_ENDIAN, which
compiles to a single intrinsic store on little-endian architectures and
avoids the per-call byte decomposition and virtual dispatch performed by
this class.
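As a rough illustration of the two styles contrasted in the javadoc, the
sketch below encodes one int both ways using only the JDK. The class and
method names are hypothetical and this is not the parquet-java
implementation; it only shows that the two approaches produce the same
little-endian bytes, not how either performs.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

public class LittleEndianWriteSketch {

  // Per-call byte decomposition, the style a stream wrapper typically uses.
  static byte[] encodeByShifting(int v) {
    return new byte[] {
      (byte) v, (byte) (v >>> 8), (byte) (v >>> 16), (byte) (v >>> 24)
    };
  }

  // Direct write into a ByteBuffer configured for little-endian order.
  static byte[] encodeWithByteBuffer(int v) {
    ByteBuffer buf = ByteBuffer.allocate(Integer.BYTES).order(ByteOrder.LITTLE_ENDIAN);
    buf.putInt(v);
    return buf.array();
  }

  public static void main(String[] args) {
    int value = 0x01020304;
    // Both lines print "[4, 3, 2, 1]": least significant byte first.
    System.out.println(Arrays.toString(encodeByShifting(value)));
    System.out.println(Arrays.toString(encodeWithByteBuffer(value)));
  }
}
```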
Benchmarks (BinaryEncodingBenchmark, JMH -wi 5 -i 10 -f 3, 30 samples per
row, 100k values per invocation):
encodeDeltaLengthByteArray (touched by this commit):

  Param          Before (ops/s)   After (ops/s)   Delta
  LOW / 10           21,695,420      22,106,165   +1.9%
  LOW / 100           6,803,834       6,798,992   -0.1%
  LOW / 1000            865,171         866,820   +0.2%
  HIGH / 10          19,985,164      20,225,337   +1.2%
  HIGH / 100          5,677,955       5,600,746   -1.4%
  HIGH / 1000           695,246         704,673   +1.4%

encodeDeltaByteArray (uses DeltaLengthByteArrayValuesWriter for suffixes):

  Param          Before (ops/s)   After (ops/s)   Delta
  LOW / 10           11,423,436      10,854,196   -5.0%
  LOW / 100           4,810,962       4,821,321   +0.2%
  LOW / 1000            646,415         667,650   +3.3%
  HIGH / 10          10,047,817       9,963,667   -0.8%
  HIGH / 100          4,337,800       4,103,937   -5.4%
  HIGH / 1000           574,769         580,489   +1.0%

All deltas fall within roughly +/- 5% (the largest is -5.4%), with
allocation rates per operation unchanged within 2%. This is consistent with
measurement noise rather than a systematic effect in either direction. The
motivation for this change is code health (one fewer wrapper layer in the
writer call chain, ability to deprecate an internal-shaped class), not
performance.
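For reference, the Delta column appears to be the relative change in
throughput, (after - before) / before. The sketch below reproduces two of the
reported figures under that assumption; the class and method names are
hypothetical and are not part of the benchmark harness.

```java
import java.util.Locale;

public class BenchmarkDeltaSketch {

  // Relative change in throughput, in percent.
  static double percentDelta(double before, double after) {
    return (after - before) / before * 100.0;
  }

  // Signed, one decimal place, matching the style of the tables above.
  static String format(double percent) {
    return String.format(Locale.ROOT, "%+.1f%%", percent);
  }

  public static void main(String[] args) {
    // encodeDeltaLengthByteArray, LOW / 10: prints "+1.9%"
    System.out.println(format(percentDelta(21_695_420, 22_106_165)));
    // encodeDeltaByteArray, HIGH / 100: prints "-5.4%"
    System.out.println(format(percentDelta(4_337_800, 4_103_937)));
  }
}
```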
All 573 parquet-column tests and 308 parquet-common tests pass.

1 parent b27c80e commit 0165e61
3 files changed
Lines changed: 13 additions & 19 deletions
File tree
- parquet-column/src/main/java/org/apache/parquet/column/values
- deltalengthbytearray
- plain
- parquet-common/src/main/java/org/apache/parquet/bytes
Lines changed: 1 addition & 9 deletions
Lines changed: 1 addition & 9 deletions
Lines changed: 11 additions & 1 deletion