Commit 2fd3508

apacheGH-3522: Optimize IntList.size() from O(slabs) to O(1) with running counter
IntList.size() iterated over all slabs to sum their lengths on every call. Replace the loop with a totalSize counter that is incremented on each add(). This removes the O(slabs) overhead from dictionary-encoding hot paths where size() is called frequently.
1 parent 53d7842 commit 2fd3508

1 file changed

Lines changed: 3 additions & 6 deletions

parquet-column/src/main/java/org/apache/parquet/column/values/dictionary/IntList.java

@@ -99,6 +99,7 @@ private void incrementPosition() {
   // not be added
   private int[] currentSlab;
   private int currentSlabPos;
+  private int totalSize;

   private void allocateSlab() {
     currentSlab = new int[currentSlabSize];
@@ -129,6 +130,7 @@ public void add(int i) {

     currentSlab[currentSlabPos] = i;
     ++currentSlabPos;
+    ++totalSize;
   }

   /**
@@ -150,11 +152,6 @@ public IntIterator iterator() {
    * @return the current size of the list
    */
   public int size() {
-    int size = currentSlabPos;
-    for (int[] slab : slabs) {
-      size += slab.length;
-    }
-
-    return size;
+    return totalSize;
   }
 }
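
The change can be illustrated outside the project with a minimal sketch. The class below is a hypothetical stand-in, not the Parquet `IntList`: the name `SlabIntList`, the fixed slab size, and the `sizeByScan()` helper are assumptions made for illustration. It keeps the old slab-scanning computation next to the running counter so the two can be compared after every `add()`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified slab-backed int list (not the Parquet class).
class SlabIntList {
  private final int slabSize;                        // assumption: every slab has the same size
  private final List<int[]> fullSlabs = new ArrayList<>();
  private int[] currentSlab;                         // allocated lazily on the first add()
  private int currentSlabPos;
  private int totalSize;                             // running counter, incremented once per add()

  SlabIntList(int slabSize) {
    if (slabSize <= 0) {
      throw new IllegalArgumentException("slabSize must be positive");
    }
    this.slabSize = slabSize;
  }

  void add(int value) {
    if (currentSlab == null) {
      currentSlab = new int[slabSize];
    } else if (currentSlabPos == currentSlab.length) {
      // Current slab is full: retire it and start a fresh one.
      fullSlabs.add(currentSlab);
      currentSlab = new int[slabSize];
      currentSlabPos = 0;
    }
    currentSlab[currentSlabPos] = value;
    ++currentSlabPos;
    ++totalSize;
  }

  // O(1): returns the running counter.
  int size() {
    return totalSize;
  }

  // O(slabs): the pre-change style of computation, kept only for comparison.
  int sizeByScan() {
    int size = currentSlabPos;
    for (int[] slab : fullSlabs) {
      size += slab.length;
    }
    return size;
  }

  public static void main(String[] args) {
    SlabIntList list = new SlabIntList(4);
    for (int i = 0; i < 10; i++) {
      list.add(i);
      if (list.size() != list.sizeByScan()) {
        throw new AssertionError("counter and scan disagree after " + (i + 1) + " adds");
      }
    }
    System.out.println(list.size()); // prints 10
  }
}
```

The counter stays correct here only because `add()` is the sole mutating method in the sketch. If a class like this also offered a way to remove or reset elements, that path would have to update the counter as well.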
