Skip to content

HDDS-15592 use heap allocation in ChunkBuffer#10536

Open
yandrey321 wants to merge 4 commits into
apache:masterfrom
yandrey321:HDDS-15592
Open

HDDS-15592 use heap allocation in ChunkBuffer#10536
yandrey321 wants to merge 4 commits into
apache:masterfrom
yandrey321:HDDS-15592

Conversation

@yandrey321

@yandrey321 yandrey321 commented Jun 17, 2026

Copy link
Copy Markdown
Contributor

What changes were proposed in this pull request?

When ChunkBuffer allocates heap buffer UnsafeByteOperations.unsafeWrap() returns a BoundedByteString with a backing array, enabling a single System.arraycopy into the gRPC/Netty wire buffer instead of the slow byte-by-byte NioByteString path for direct buffers.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-15592

How was this patch tested?

Benchmark run on m3 mac/ARM with 14 cores producing the following results: Heap allocation is consistently +8% to +18% faster across all test.

4096 KB writes

threads direct MB/s heap MB/s Δ%
1 5,344 5,929 +10.9%
2 10,112 11,183 +10.6%
4 17,971 19,836 +10.4%
7 24,696 29,157 +18.1%
14 28,025 32,222 +15.0%
28 27,709 31,067 +12.1%
42 27,617 31,235 +13.1%

2048 KB writes

threads direct MB/s heap MB/s Δ%
1 5,144 5,801 +12.8%
2 9,920 11,310 +14.0%
4 18,087 20,006 +10.6%
7 24,876 29,258 +17.6%
14 27,661 31,677 +14.5%
28 27,594 31,170 +13.0%
42 26,703 30,894 +15.7%

1024 KB writes

threads direct MB/s heap MB/s Δ%
1 5,125 5,803 +13.2%
2 10,245 11,269 +10.0%
4 18,160 19,540 +7.6%
7 24,973 29,282 +17.3%
14 27,541 32,346 +17.4%
28 27,222 31,110 +14.3%
42 27,824 31,328 +12.6%

512 KB writes

threads direct MB/s heap MB/s Δ%
1 5,239 5,825 +11.2%
2 10,236 11,338 +10.8%
4 18,378 20,177 +9.8%
7 25,233 29,494 +16.9%
14 28,008 32,575 +16.3%
28 27,846 31,433 +12.9%
42 28,935 31,229 +7.9%

256 KB writes

threads direct MB/s heap MB/s Δ%
1 5,050 5,840 +15.6%
2 10,238 11,375 +11.1%
4 18,100 19,652 +8.6%
7 25,221 29,532 +17.1%
14 28,072 32,755 +16.7%
28 28,564 31,270 +9.5%
42 27,745 31,299 +12.8%


/**
* When true, {@link #allocate} uses direct ByteBuffers instead of heap ByteBuffers.
* Set only by {@link BlockOutputStreamWriteBenchmark} to compare allocation strategies

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/common/ChunkBuffer.java:36: warning - Tag @link: reference not found: BlockOutputStreamWriteBenchmark

https://github.com/yandrey321/ozone/actions/runs/27780341047/job/82205691733#step:13:2213

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Comment on lines +39 to +42
@VisibleForTesting
// CHECKSTYLE:OFF VisibilityModifier
AtomicBoolean ALLOCATE_DIRECT = new AtomicBoolean(false);
// CHECKSTYLE:ON VisibilityModifier

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please:

  • avoid checkstyle suppression
  • do not add @VisibleForTesting

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted something lightweight for a benchmark that proves performance improvements.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But this is not the benchmark code.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a switch that is used in the benchmark to find out the diff between heap and direct allocations

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making it configurable in prod is not a big change: adoroszlai@79cbbf6

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why should it be configurable at all? The purpose of the benchmarks is to produce evidence so we can make a conclusive decision.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The benchmark needs to test various implementations. Adding the benchmark code to the master branch in the main Ozone repo requires that those implementations also exist there. Choosing between the implementations using an internal toggle that even requires CHECKSTYLE:OFF-style hacks is ugly.

Results may depend on hardware/software environment. Providing this simple toggle configuration does not add much complexity, but enables users to test in their own environment if needed.

I would prefer adding performance tests in a separate repo:

  • various POC implementations can be freely added without affecting production code
  • JMH can be used without license problems (HDDS-6202)
  • can test different versions of Ozone
  • performance tests would not be confused with functional tests (see TestUnhealthyContainersDerbyPerformance)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I endorse the idea of an ozone-dev style public repo for POC, benchmarks, & profiling; it would be particularly great if we could keep reference to the benchmark and profiling results themselves for later development. But if the second repo is also Apache, does that let us off the GPL hook?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But if the second repo is also Apache, does that let us off the GPL hook?

Yes, since we would not be distributing (releasing) code from that repo.

https://www.apache.org/legal/resolved.html

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the switch and use heap allocations by default.

@errose28

Copy link
Copy Markdown
Contributor

cc @rnblough

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this PR the benchmark is a shell script but in #10546 it is a java class. We should be consistent with how we create benchmarks and where we put them.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed the shell script

Comment on lines +39 to +42
@VisibleForTesting
// CHECKSTYLE:OFF VisibilityModifier
AtomicBoolean ALLOCATE_DIRECT = new AtomicBoolean(false);
// CHECKSTYLE:ON VisibilityModifier

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why should it be configurable at all? The purpose of the benchmarks is to produce evidence so we can make a conclusive decision.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants