Skip to content

[improve][broker] Reduce PendingAcksMap memory by replacing IntIntPair values with packed longs#26030

Open
void-ptr974 wants to merge 9 commits into
apache:masterfrom
void-ptr974:pending-acks-pack-value-part2
Open

[improve][broker] Reduce PendingAcksMap memory by replacing IntIntPair values with packed longs#26030
void-ptr974 wants to merge 9 commits into
apache:masterfrom
void-ptr974:pending-acks-pack-value-part2

Conversation

@void-ptr974

@void-ptr974 void-ptr974 commented Jun 14, 2026

Copy link
Copy Markdown
Contributor

Main Issue: #26027

Motivation

PendingAcksMap tracks entries that have been dispatched to Shared and Key_Shared consumers but are not fully acknowledged yet. Each pending ack is keyed by ledger id and entry id, and stores two values: remainingUnacked and stickyKeyHash.

Before this PR, each value was stored as an IntIntPair, which adds one heap object per pending ack entry. Most consumer hot paths only need the remaining unacked count after lookup or removal, so this extra object increases retained heap and creates avoidable allocations on update/remove paths.

This PR only changes the value representation. It keeps the existing ordered ledger/entry map layout.

Modifications

  • Pack remainingUnacked and stickyKeyHash into one primitive long.
    • high 32 bits: remainingUnacked
    • low 32 bits: stickyKeyHash
  • Add package-private PendingAckValues helper methods for packing, unpacking, and sentinel handling.
  • Reserve remainingUnacked == -1 as the not-found sentinel and reject negative packed runtime values.
  • Change the per-ledger value map from Long2ObjectSortedMap<IntIntPair> to Long2LongSortedMap.
  • Add primitive hot-path accessors: getRemainingUnacked and removeAndGetRemainingUnacked.
  • Remove the now-unused pair-returning methods (PendingAcksMap#get, PendingAcksMap#removeAndGet, and Consumer#removePendingAckAndGet) so production code no longer materializes IntIntPair values.
  • Update Consumer call sites and add focused tests/benchmarks.

Memory estimate

JOL on OpenJDK 25.0.2 reports these object sizes:

Object Size
IntIntImmutablePair 24 B
Long2ObjectRBTreeMap$Entry 40 B
Long2LongRBTreeMap$Entry 40 B
Long2ObjectRBTreeMap 64 B
Long2LongRBTreeMap 72 B
nodePath array per tree map 272 B
dirPath array per tree map 80 B

Let E be pending ack entries and L be active ledgers.

Before:

64E + 456L + 416 bytes

After:

40E + 464L + 416 bytes

Estimated saving:

24E - 8L bytes

The 24E saving comes from removing one IntIntPair object per pending ack entry. The 8L offset comes from each inner Long2LongRBTreeMap being 8 bytes larger than Long2ObjectRBTreeMap.

For 50,000 pending ack entries across 20 ledgers:

24 * 50,000 - 8 * 20 = 1,199,840 bytes

The object-count reduction is E, so the same case removes 50,000 heap objects.

JOL layout check

The full JOL CLI repro is available here:

https://gist.github.com/void-ptr974/97f38f62eebab58aa953dad84f6deb3f

The gist uses JOL internals to inspect the base object layouts and JOL footprint to inspect the populated before/after map graphs. The summarized footprint numbers match the estimate:

Entries / ledgers Estimated saving JOL before JOL after JOL saving
1,000 / 1 23,992 B 64,872 B / 2,007 objects 40,880 B / 1,007 objects 23,992 B / 1,000 objects
1,000 / 20 23,840 B 73,536 B / 2,083 objects 49,696 B / 1,083 objects 23,840 B / 1,000 objects
50,000 / 1 1,199,992 B 3,200,872 B / 100,007 objects 2,000,880 B / 50,007 objects 1,199,992 B / 50,000 objects
50,000 / 20 1,199,840 B 3,209,536 B / 100,083 objects 2,009,696 B / 50,083 objects 1,199,840 B / 50,000 objects

Benchmark

The benchmark compares the former PendingAcksMap value layout with the current packed-value layout. It uses an inline copy of the former implementation and delegates the current implementation directly to PendingAcksMap.

The benchmark methods are named as JMH scenarios. Most of them call real PendingAcksMap production methods. Some scenarios add a refill step only to keep the benchmark state stable across iterations:

Benchmark scenario Production method being measured Note
addOrReplace addPendingAckIfAllowed Benchmark adapter name only
containsHit contains Existing lookup path
getRemainingUnackedHit getRemainingUnacked Primitive hot path
updateRemainingUnacked updateRemainingUnacked Partial batch ack update path
removeAndGetRemainingAndAdd removeAndGetRemainingUnacked The trailing add only restores benchmark state
removeWithValueAndAdd remove(ledgerId, entryId, batchSize, stickyKeyHash) The trailing add only restores benchmark state
removeAllUpToSmallPrefixAndRefill removeAllUpTo The refill only restores benchmark state
removeAllUpToBeforeFirstEntry removeAllUpTo Control case with no actual entry removed
forEachScan forEach traversal Uses a sum to prevent dead-code elimination

Test environment:

  • OS: Ubuntu 22.04.5 LTS, Linux 5.15.0-164-generic, x86_64
  • CPU: Intel Xeon E5-2673 v4 @ 2.30GHz
  • Topology: 2 sockets, 2 NUMA nodes, 20 physical cores per socket, SMT enabled
  • JDK: OpenJDK 21.0.11
  • JMH: 1.37
  • JVM heap: -Xms8g -Xmx8g
  • Binding: numactl --physcpubind=0-19 --membind=0
  • JMH samples per result: Cnt=24 from 3 forks * 8 measurement iterations
  • Entries: 50000

Command:

numactl --physcpubind=0-19 --membind=0 \
  java -Xms8g -Xmx8g \
  -jar microbench/build/libs/microbench-5.0.0-M1-SNAPSHOT-benchmarks.jar \
  'PendingAcksMapPackingBenchmark\..*' \
  -p entries=50000 \
  -p ledgers=1,20 \
  -wi 5 -i 8 -w 2s -r 2s -f 3 \
  -bm avgt -tu ns \
  -prof gc

Results, ledgers=1

Benchmark Former Current Time change Allocation change
addOrReplace 121.321 +/- 1.149 ns/op 110.660 +/- 2.409 ns/op -8.8% 24.000 B/op -> ~0 B/op
containsHit 100.627 +/- 0.717 ns/op 77.440 +/- 0.412 ns/op -23.0% ~0 B/op -> ~0 B/op
getRemainingUnackedHit 109.043 +/- 1.105 ns/op 79.413 +/- 0.793 ns/op -27.2% ~0 B/op -> ~0 B/op
updateRemainingUnacked 172.174 +/- 3.638 ns/op 161.994 +/- 1.363 ns/op -5.9% 24.001 B/op -> ~0 B/op
removeAndGetRemainingAndAdd 546.718 +/- 17.202 ns/op 366.131 +/- 7.189 ns/op -33.0% 64.002 B/op -> 40.001 B/op
removeWithValueAndAdd 442.518 +/- 3.147 ns/op 385.052 +/- 5.674 ns/op -13.0% 88.002 B/op -> 40.001 B/op
removeAllUpToBeforeFirstEntry 37.919 +/- 1.697 ns/op 37.639 +/- 0.497 ns/op -0.7% 144.000 B/op -> 144.000 B/op
forEachScan 501.804 +/- 20.568 us/op 283.846 +/- 10.216 us/op -43.4% 1.723 B/op -> 40.974 B/op

Results, ledgers=20

Benchmark Former Current Time change Allocation change
addOrReplace 106.911 +/- 0.129 ns/op 93.786 +/- 0.357 ns/op -12.3% 24.000 B/op -> ~0 B/op
containsHit 80.041 +/- 0.930 ns/op 73.458 +/- 1.011 ns/op -8.2% ~0 B/op -> ~0 B/op
getRemainingUnackedHit 77.907 +/- 0.669 ns/op 67.452 +/- 1.152 ns/op -13.4% ~0 B/op -> ~0 B/op
updateRemainingUnacked 153.931 +/- 1.352 ns/op 138.780 +/- 0.425 ns/op -9.8% 24.001 B/op -> ~0 B/op
removeAndGetRemainingAndAdd 326.264 +/- 3.313 ns/op 311.748 +/- 5.072 ns/op -4.4% 64.001 B/op -> 40.001 B/op
removeWithValueAndAdd 367.315 +/- 2.858 ns/op 322.185 +/- 5.512 ns/op -12.3% 88.001 B/op -> 40.001 B/op
removeAllUpToBeforeFirstEntry 37.142 +/- 0.813 ns/op 36.764 +/- 0.593 ns/op -1.0% 144.000 B/op -> 144.000 B/op
forEachScan 507.101 +/- 35.542 us/op 296.794 +/- 8.809 us/op -41.5% 1.742 B/op -> 41.021 B/op

Prefix removal

removeAllUpToSmallPrefixAndRefill uses a single-ledger state and refills the removed prefix after each invocation.

Benchmark Former Current Time change Allocation change
removeAllUpToSmallPrefixAndRefill 320.064 +/- 5.213 us/op 302.746 +/- 8.342 us/op -5.4% 64289.102 B/op -> 40369.042 B/op

The allocation changes correspond to replacing one per-entry IntIntPair heap object with one packed primitive long value. The measured allocation change is about -24 B/op in add/update paths, -24 B/op to -48 B/op in remove/add paths, and -23920 B/op in the small-prefix remove/refill benchmark.

Verifying this change

  • Make sure that the change passes the CI checks.

This change added tests and can be verified as follows:

  • Added PendingAckValuesTest for packing round trips, signed sticky key hashes, sentinel behavior, and negative remainingUnacked rejection.
  • Extended PendingAcksMapTest for primitive accessors, value-based removal, updateRemainingUnacked, removeAllUpTo, missing-entry lookups, and boundary values.
  • Added PendingAcksMapPackingBenchmark for before/after allocation and timing comparison.

Local verification:

  • ./gradlew :pulsar-broker:test --tests org.apache.pulsar.broker.service.PendingAckValuesTest --tests org.apache.pulsar.broker.service.PendingAcksMapTest --max-workers=1 -PtestRetryCount=0
  • ./gradlew :pulsar-broker:checkstyleMain :pulsar-broker:checkstyleTest :pulsar-broker:spotlessCheck --max-workers=1
  • ./gradlew :microbench:compileJava --max-workers=1
  • ./gradlew :microbench:checkstyleMain --max-workers=1
  • ./gradlew :microbench:shadowJar --max-workers=1

Does this pull request potentially affect one of the following parts:

  • Dependencies (add or upgrade a dependency)
  • The public API
  • The schema
  • The default values of configurations
  • The threading model
  • The binary protocol
  • The REST endpoints
  • The admin CLI options
  • The metrics
  • Anything that affects deployment

@void-ptr974

void-ptr974 commented Jun 14, 2026

Copy link
Copy Markdown
Contributor Author

After this PR is merged, the next PR will switch the per-ledger entry map in PendingAcksMap to the primitive Long2LongOpenHashMap added in #26028.

That follow-up will keep the packed-value shape from this PR and focus only on removing the boxed inner-map key/value overhead. The BitSet prefix index for same-ledger removeAllUpTo cleanup will be a separate PR after that.

…alue-part2

# Conflicts:
#	pulsar-broker/src/main/java/org/apache/pulsar/broker/service/PendingAcksMap.java
#	pulsar-broker/src/test/java/org/apache/pulsar/broker/service/PendingAcksMapTest.java

@lhotari lhotari left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related to #26028 (review). I have the fastutil minification working now and this change can be implemented on top of fastutil once the restoration of fastutil has been merged.

@void-ptr974 void-ptr974 requested a review from lhotari June 18, 2026 02:59
@void-ptr974

Copy link
Copy Markdown
Contributor Author

Thanks @lhotari. I updated this on top of #26032.

The latest version uses fastutil and keeps the scope to the value-layout change. I also updated the benchmark/JOL numbers in the PR description.

@lhotari lhotari left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please check the review comment about separation of concerns.

Comment thread pulsar-broker/src/main/java/org/apache/pulsar/broker/service/PendingAcksMap.java Outdated
Comment thread pulsar-broker/src/main/java/org/apache/pulsar/broker/service/PendingAcksMap.java Outdated
@void-ptr974 void-ptr974 force-pushed the pending-acks-pack-value-part2 branch from f7c44f8 to 6751ad9 Compare June 30, 2026 13:17
Comment thread pulsar-broker/src/main/java/org/apache/pulsar/broker/service/PendingAckValue.java Outdated
Comment thread pulsar-broker/src/main/java/org/apache/pulsar/broker/service/PendingAcksMap.java Outdated
@lhotari

lhotari commented Jun 30, 2026

Copy link
Copy Markdown
Member

Regarding the current title "[improve][broker] Part 2: Pack PendingAcksMap values into long", please revisit it.
Something like "Optimize PendingAcksMap by using a primitive long2long map for values".
In the PR description, the calculation of memory consumption can be shown for before and after. The calculation could be based on theoretical analysis (for example when there's no IntIntPair values, there's no object reference overhead etc.). In PIP-379, I made this type of estimation. It's obviously not the exact amount that would be consumed, but a good estimate. Something similar could be explained in the justification for this PR. It could be better to make the description contain the justification since referring to "part 2" in #26027 is outdated.

@void-ptr974 void-ptr974 force-pushed the pending-acks-pack-value-part2 branch from 6751ad9 to a905166 Compare June 30, 2026 14:39
@void-ptr974 void-ptr974 changed the title [improve][broker] Part 2: Pack PendingAcksMap values into long [improve][broker] Optimize PendingAcksMap value storage with primitive longs Jun 30, 2026
@lhotari

lhotari commented Jun 30, 2026

Copy link
Copy Markdown
Member

@void-ptr974 Please avoid force pushes to a PR branch when the review of the PR has started. It makes easier for reviewers to track the actual changes since the last time the reviewer performed a review.

@void-ptr974 void-ptr974 changed the title [improve][broker] Optimize PendingAcksMap value storage with primitive longs [improve][broker] Reduce PendingAcksMap memory by replacing IntIntPair values with packed longs Jun 30, 2026
@void-ptr974

void-ptr974 commented Jun 30, 2026

Copy link
Copy Markdown
Contributor Author

Sorry about that. I'll avoid force-pushing after review has started next time.

I updated the title and PR description based on your suggestion, including the self-contained context, memory estimate, and JOL layout details.

@merlimat

Copy link
Copy Markdown
Contributor

@void-ptr974 for the forEachScan, do we know why we still have 277.6 us/op, 815.4 B/op? Could the allocations be removed / reduced further?

@void-ptr974

Copy link
Copy Markdown
Contributor Author

@merlimat I updated the JMH benchmark and reran it on a Linux server with NUMA pinning. The previous numbers were from a local macOS run and may have more noise, so please refer to the latest PR description for the updated benchmark data.

For forEachScan, the latest numbers are:

  • former layout: 563.022 +/- 41.151 us/op, 1.932 B/op
  • packed layout: 354.509 +/- 16.877 us/op, 2.946 B/op

The previous 815.4 B/op result does not reproduce with the updated benchmark.

Comment thread pulsar-broker/src/main/java/org/apache/pulsar/broker/service/PendingAcksMap.java Outdated
Comment thread pulsar-broker/src/main/java/org/apache/pulsar/broker/service/PendingAcksMap.java Outdated
Comment thread pulsar-broker/src/main/java/org/apache/pulsar/broker/service/Consumer.java Outdated

@lhotari lhotari left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a question/comment

@lhotari lhotari left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, good work @void-ptr974

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants