[improve][broker] Reduce PendingAcksMap memory by replacing IntIntPair values with packed longs#26030
[improve][broker] Reduce PendingAcksMap memory by replacing IntIntPair values with packed longs#26030void-ptr974 wants to merge 9 commits into
Conversation
|
After this PR is merged, the next PR will switch the per-ledger entry map in That follow-up will keep the packed-value shape from this PR and focus only on removing the boxed inner-map key/value overhead. The BitSet prefix index for same-ledger |
…alue-part2 # Conflicts: # pulsar-broker/src/main/java/org/apache/pulsar/broker/service/PendingAcksMap.java # pulsar-broker/src/test/java/org/apache/pulsar/broker/service/PendingAcksMapTest.java
lhotari
left a comment
There was a problem hiding this comment.
Related to #26028 (review). I have the fastutil minification working now and this change can be implemented on top of fastutil once the restoration of fastutil has been merged.
…alue-part2 # Conflicts: # pulsar-broker/src/main/java/org/apache/pulsar/broker/service/PendingAcksMap.java
lhotari
left a comment
There was a problem hiding this comment.
Please check the review comment about separation of concerns.
f7c44f8 to
6751ad9
Compare
|
Regarding the current title "[improve][broker] Part 2: Pack PendingAcksMap values into long", please revisit it. |
6751ad9 to
a905166
Compare
|
@void-ptr974 Please avoid force pushes to a PR branch when the review of the PR has started. It makes easier for reviewers to track the actual changes since the last time the reviewer performed a review. |
|
Sorry about that. I'll avoid force-pushing after review has started next time. I updated the title and PR description based on your suggestion, including the self-contained context, memory estimate, and JOL layout details. |
|
@void-ptr974 for the |
|
@merlimat I updated the JMH benchmark and reran it on a Linux server with NUMA pinning. The previous numbers were from a local macOS run and may have more noise, so please refer to the latest PR description for the updated benchmark data. For
The previous |
Main Issue: #26027
Motivation
PendingAcksMaptracks entries that have been dispatched to Shared and Key_Shared consumers but are not fully acknowledged yet. Each pending ack is keyed by ledger id and entry id, and stores two values:remainingUnackedandstickyKeyHash.Before this PR, each value was stored as an
IntIntPair, which adds one heap object per pending ack entry. Most consumer hot paths only need the remaining unacked count after lookup or removal, so this extra object increases retained heap and creates avoidable allocations on update/remove paths.This PR only changes the value representation. It keeps the existing ordered ledger/entry map layout.
Modifications
remainingUnackedandstickyKeyHashinto one primitivelong.remainingUnackedstickyKeyHashPendingAckValueshelper methods for packing, unpacking, and sentinel handling.remainingUnacked == -1as the not-found sentinel and reject negative packed runtime values.Long2ObjectSortedMap<IntIntPair>toLong2LongSortedMap.getRemainingUnackedandremoveAndGetRemainingUnacked.PendingAcksMap#get,PendingAcksMap#removeAndGet, andConsumer#removePendingAckAndGet) so production code no longer materializesIntIntPairvalues.Consumercall sites and add focused tests/benchmarks.Memory estimate
JOL on OpenJDK 25.0.2 reports these object sizes:
IntIntImmutablePairLong2ObjectRBTreeMap$EntryLong2LongRBTreeMap$EntryLong2ObjectRBTreeMapLong2LongRBTreeMapnodePatharray per tree mapdirPatharray per tree mapLet
Ebe pending ack entries andLbe active ledgers.Before:
After:
Estimated saving:
The
24Esaving comes from removing oneIntIntPairobject per pending ack entry. The8Loffset comes from each innerLong2LongRBTreeMapbeing 8 bytes larger thanLong2ObjectRBTreeMap.For
50,000pending ack entries across20ledgers:The object-count reduction is
E, so the same case removes50,000heap objects.JOL layout check
The full JOL CLI repro is available here:
https://gist.github.com/void-ptr974/97f38f62eebab58aa953dad84f6deb3f
The gist uses JOL
internalsto inspect the base object layouts and JOLfootprintto inspect the populated before/after map graphs. The summarized footprint numbers match the estimate:Benchmark
The benchmark compares the former
PendingAcksMapvalue layout with the current packed-value layout. It uses an inline copy of the former implementation and delegates the current implementation directly toPendingAcksMap.The benchmark methods are named as JMH scenarios. Most of them call real
PendingAcksMapproduction methods. Some scenarios add a refill step only to keep the benchmark state stable across iterations:addOrReplaceaddPendingAckIfAllowedcontainsHitcontainsgetRemainingUnackedHitgetRemainingUnackedupdateRemainingUnackedupdateRemainingUnackedremoveAndGetRemainingAndAddremoveAndGetRemainingUnackedremoveWithValueAndAddremove(ledgerId, entryId, batchSize, stickyKeyHash)removeAllUpToSmallPrefixAndRefillremoveAllUpToremoveAllUpToBeforeFirstEntryremoveAllUpToforEachScanforEachtraversalTest environment:
-Xms8g -Xmx8gnumactl --physcpubind=0-19 --membind=0Cnt=24from3 forks * 8 measurement iterations50000Command:
numactl --physcpubind=0-19 --membind=0 \ java -Xms8g -Xmx8g \ -jar microbench/build/libs/microbench-5.0.0-M1-SNAPSHOT-benchmarks.jar \ 'PendingAcksMapPackingBenchmark\..*' \ -p entries=50000 \ -p ledgers=1,20 \ -wi 5 -i 8 -w 2s -r 2s -f 3 \ -bm avgt -tu ns \ -prof gcResults,
ledgers=1addOrReplace121.321 +/- 1.149 ns/op110.660 +/- 2.409 ns/op-8.8%24.000 B/op -> ~0 B/opcontainsHit100.627 +/- 0.717 ns/op77.440 +/- 0.412 ns/op-23.0%~0 B/op -> ~0 B/opgetRemainingUnackedHit109.043 +/- 1.105 ns/op79.413 +/- 0.793 ns/op-27.2%~0 B/op -> ~0 B/opupdateRemainingUnacked172.174 +/- 3.638 ns/op161.994 +/- 1.363 ns/op-5.9%24.001 B/op -> ~0 B/opremoveAndGetRemainingAndAdd546.718 +/- 17.202 ns/op366.131 +/- 7.189 ns/op-33.0%64.002 B/op -> 40.001 B/opremoveWithValueAndAdd442.518 +/- 3.147 ns/op385.052 +/- 5.674 ns/op-13.0%88.002 B/op -> 40.001 B/opremoveAllUpToBeforeFirstEntry37.919 +/- 1.697 ns/op37.639 +/- 0.497 ns/op-0.7%144.000 B/op -> 144.000 B/opforEachScan501.804 +/- 20.568 us/op283.846 +/- 10.216 us/op-43.4%1.723 B/op -> 40.974 B/opResults,
ledgers=20addOrReplace106.911 +/- 0.129 ns/op93.786 +/- 0.357 ns/op-12.3%24.000 B/op -> ~0 B/opcontainsHit80.041 +/- 0.930 ns/op73.458 +/- 1.011 ns/op-8.2%~0 B/op -> ~0 B/opgetRemainingUnackedHit77.907 +/- 0.669 ns/op67.452 +/- 1.152 ns/op-13.4%~0 B/op -> ~0 B/opupdateRemainingUnacked153.931 +/- 1.352 ns/op138.780 +/- 0.425 ns/op-9.8%24.001 B/op -> ~0 B/opremoveAndGetRemainingAndAdd326.264 +/- 3.313 ns/op311.748 +/- 5.072 ns/op-4.4%64.001 B/op -> 40.001 B/opremoveWithValueAndAdd367.315 +/- 2.858 ns/op322.185 +/- 5.512 ns/op-12.3%88.001 B/op -> 40.001 B/opremoveAllUpToBeforeFirstEntry37.142 +/- 0.813 ns/op36.764 +/- 0.593 ns/op-1.0%144.000 B/op -> 144.000 B/opforEachScan507.101 +/- 35.542 us/op296.794 +/- 8.809 us/op-41.5%1.742 B/op -> 41.021 B/opPrefix removal
removeAllUpToSmallPrefixAndRefilluses a single-ledger state and refills the removed prefix after each invocation.removeAllUpToSmallPrefixAndRefill320.064 +/- 5.213 us/op302.746 +/- 8.342 us/op-5.4%64289.102 B/op -> 40369.042 B/opThe allocation changes correspond to replacing one per-entry
IntIntPairheap object with one packed primitivelongvalue. The measured allocation change is about-24 B/opin add/update paths,-24 B/opto-48 B/opin remove/add paths, and-23920 B/opin the small-prefix remove/refill benchmark.Verifying this change
This change added tests and can be verified as follows:
PendingAckValuesTestfor packing round trips, signed sticky key hashes, sentinel behavior, and negativeremainingUnackedrejection.PendingAcksMapTestfor primitive accessors, value-based removal,updateRemainingUnacked,removeAllUpTo, missing-entry lookups, and boundary values.PendingAcksMapPackingBenchmarkfor before/after allocation and timing comparison.Local verification:
./gradlew :pulsar-broker:test --tests org.apache.pulsar.broker.service.PendingAckValuesTest --tests org.apache.pulsar.broker.service.PendingAcksMapTest --max-workers=1 -PtestRetryCount=0./gradlew :pulsar-broker:checkstyleMain :pulsar-broker:checkstyleTest :pulsar-broker:spotlessCheck --max-workers=1./gradlew :microbench:compileJava --max-workers=1./gradlew :microbench:checkstyleMain --max-workers=1./gradlew :microbench:shadowJar --max-workers=1Does this pull request potentially affect one of the following parts: