quic: make multiple improvements to packet #62589
Open
jasnell wants to merge 4 commits into nodejs:main from
Conversation
Previously, Packets were ReqWrap objects with a shared free list. This commit changes them to a per-Endpoint arena with no V8 involvement. This is the design I originally had in mind, but I initially went with the simpler free-list approach to get something working. There's too much overhead in the ReqWrap/free-list approach, and individual packets do not really need to be observable via async hooks. This design should eliminate the risk of memory fragmentation and remove a significant bottleneck in the hot path. Signed-off-by: James M Snell <jasnell@gmail.com> Assisted-by: Opencode:Opus 4.6
A handful of additional improvements to the Packet class. Signed-off-by: James M Snell <jasnell@gmail.com> Assisted-by: Opencode:Opus 4.6
Force-pushed from ead61aa to 66d349a
Signed-off-by: James M Snell <jasnell@gmail.com>
Force-pushed from 66d349a to da1e78a
Signed-off-by: James M Snell <jasnell@gmail.com>
Qard (Collaborator) approved these changes on Apr 5, 2026
Previously, Packets were ReqWrap objects with a shared free list. This commit changes them to a per-Endpoint arena with no V8 involvement. This is the design I originally had in mind, but I initially went with the simpler free-list approach to get something working. There's too much overhead in the ReqWrap/free-list approach, and individual packets do not really need to be observable via async hooks.
This design should eliminate the risk of memory fragmentation and eliminate a significant bottleneck in the hot path.
Summary of improvements:
Memory Comparison
Fragmentation
Before: Each packet reuse from the free list still called std::make_shared<Data>(length, label) — a fresh heap allocation for the Data object and its shared_ptr control block, plus the std::string diagnostic label. These are small, variably-sized allocations scattered across the heap.
After: All slots are identical 1,712-byte regions within contiguous 214 KB blocks. Zero per-packet heap allocations during steady-state operation; the only allocations happen when the pool grows a new block.
Performance Comparison
Acquire (hot path — called up to 32× per SendPendingData)
Before (freelist hit):
Before (freelist miss):
After (always):
The new acquire is essentially 2 pointer operations + a placement new. No heap allocation, no V8 involvement, no atomic operations (shared_ptr control block had atomics).
Release (send callback — every completed packet)
Before:
After:
The new release is pointer arithmetic + 2 pointer operations + 2 decrements. No atomic operations, no heap free, no V8 interaction.
Send path (UDP::Send)
Before: ClearWeak() + Dispatched() + uv_udp_send() + on error: Done() + MakeWeak()
After: Ptr::release() (1 pointer swap) + uv_udp_send() + on error: ArenaPool::Release()
SendPendingData loop (up to 32 packets per call)
Before: Each iteration potentially triggered JS_NEW_INSTANCE_OR_RETURN (V8 object allocation) on freelist miss, plus std::make_shared on every iteration.
After: Each iteration is just a free list pop + placement new. For a full 32-packet burst from a warm pool, this is ~32 × (2 pointer ops + a memset/memcpy for the Packet fields) — essentially zero allocation cost.
GC pressure
Before: Each Packet had a persistent V8 JS object. When the freelist was full (>100 packets), excess packets were destroyed, leaving their V8 objects for the garbage collector. Under high throughput, this created ongoing GC pressure proportional to packet churn.
After: Zero V8 objects. Zero GC pressure from packets. The ArenaPool::MaybeGC() only runs when >50% of total slots are free and only frees entire blocks — a rare bulk operation, not per-packet work.
Summary
The biggest wins are eliminating the per-packet V8 object allocation (which could trigger GC) and the shared_ptr atomic operations on every acquire/release. For a high-throughput QUIC session sending 32 packets per SendPendingData call, the new path is essentially allocation-free after the first block is populated.
Signed-off-by: James M Snell <jasnell@gmail.com>
Assisted-by: Opencode:Opus 4.6