HDDS-15592 use heap allocation in ChunkBuffer#10536
Conversation
|
|
||
| /** | ||
| * When true, {@link #allocate} uses direct ByteBuffers instead of heap ByteBuffers. | ||
| * Set only by {@link BlockOutputStreamWriteBenchmark} to compare allocation strategies |
There was a problem hiding this comment.
hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/common/ChunkBuffer.java:36: warning - Tag @link: reference not found: BlockOutputStreamWriteBenchmark
https://github.com/yandrey321/ozone/actions/runs/27780341047/job/82205691733#step:13:2213
| @VisibleForTesting | ||
| // CHECKSTYLE:OFF VisibilityModifier | ||
| AtomicBoolean ALLOCATE_DIRECT = new AtomicBoolean(false); | ||
| // CHECKSTYLE:ON VisibilityModifier |
There was a problem hiding this comment.
Please:
- avoid checkstyle suppression
- do not add
@VisibleForTesting
There was a problem hiding this comment.
I wanted something lightweight for a benchmark that proves performance improvements.
There was a problem hiding this comment.
But this is not the benchmark code.
There was a problem hiding this comment.
this is a switch that is used in the benchmark to find out the diff between heap and direct allocations
There was a problem hiding this comment.
Making it configurable in prod is not a big change: adoroszlai@79cbbf6
There was a problem hiding this comment.
Why should it be configurable at all? The purpose of the benchmarks is to produce evidence so we can make a conclusive decision.
There was a problem hiding this comment.
The benchmark needs to test various implementations. Adding the benchmark code to the master branch in the main Ozone repo requires that those implementations also exist there. Choosing between the implementations using an internal toggle that even requires CHECKSTYLE:OFF-style hacks is ugly.
Results may depend on hardware/software environment. Providing this simple toggle configuration does not add much complexity, but enables users to test in their own environment if needed.
I would prefer adding performance tests in a separate repo:
- various POC implementations can be freely added without affecting production code
- JMH can be used without license problems (HDDS-6202)
- can test different versions of Ozone
- performance tests would not be confused with functional tests (see
TestUnhealthyContainersDerbyPerformance)
There was a problem hiding this comment.
I endorse the idea of an ozone-dev style public repo for POC, benchmarks, & profiling; it would be particularly great if we could keep reference to the benchmark and profiling results themselves for later development. But if the second repo is also Apache, does that let us off the GPL hook?
There was a problem hiding this comment.
But if the second repo is also Apache, does that let us off the GPL hook?
Yes, since we would not be distributing (releasing) code from that repo.
There was a problem hiding this comment.
Removed the switch and use heap allocations by default.
|
cc @rnblough |
There was a problem hiding this comment.
In this PR the benchmark is a shell script but in #10546 it is a java class. We should be consistent with how we create benchmarks and where we put them.
There was a problem hiding this comment.
removed the shell script
| @VisibleForTesting | ||
| // CHECKSTYLE:OFF VisibilityModifier | ||
| AtomicBoolean ALLOCATE_DIRECT = new AtomicBoolean(false); | ||
| // CHECKSTYLE:ON VisibilityModifier |
There was a problem hiding this comment.
Why should it be configurable at all? The purpose of the benchmarks is to produce evidence so we can make a conclusive decision.
What changes were proposed in this pull request?
When ChunkBuffer allocates heap buffer UnsafeByteOperations.unsafeWrap() returns a BoundedByteString with a backing array, enabling a single System.arraycopy into the gRPC/Netty wire buffer instead of the slow byte-by-byte NioByteString path for direct buffers.
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-15592
How was this patch tested?
Benchmark run on m3 mac/ARM with 14 cores producing the following results: Heap allocation is consistently +8% to +18% faster across all test.
4096 KB writes
2048 KB writes
1024 KB writes
512 KB writes
256 KB writes