io_uring: bound injected completions per dispatcher tick#44665
io_uring: bound injected completions per dispatcher tick#44665aburan28 wants to merge 4 commits into
Conversation
The injected-completion drain loop in IoUringImpl::forEveryCompletion was unbounded. A steady stream of injected completions, or completion callbacks that inject more completions, could stall the dispatcher thread indefinitely and starve other event sources sharing the same loop. The TODO at io_uring_impl.cc:78 called this out explicitly. Cap the per-tick drain at a configurable bound (default 1024). When the cap is hit with completions still queued, write to the registered eventfd to re-arm the file event so processing resumes on the next dispatcher tick. The eventfd is non-blocking and is fully drained at the top of the method, so the self-poke is safe. Add a unit test that injects more completions than the cap, asserts the drain is split across multiple ticks, and verifies all completions still fire (proving the eventfd re-arm path works without an explicit activate()). Signed-off-by: Adam Buran <aburan28@gmail.com>
|
Hi @aburan28, welcome and thank you for your contribution. We will try to review your Pull Request as quickly as possible. In the meantime, please take a look at the contribution guidelines if you have not done so already. |
ravenblackx
left a comment
There was a problem hiding this comment.
Thanks, I like this improvement. One nitty request about the tests, but I'm marking it approved by me early since it's going to need a review from a senior maintainer anyway.
| // Each tick after the first must have been driven by the eventfd self-poke since we never | ||
| // called activate() again — proving the re-arm path works. | ||
| EXPECT_EQ(7, total_completions); | ||
| EXPECT_GE(event_callback_invocations, 3u); |
There was a problem hiding this comment.
Why isn't this EXPECT_EQ? I thought for a moment it might be because other unrelated events could increment it, but it's only incremented by this specific event, so it seems like it would always be exactly 3.
If there is a reason, please add a comment explaining it; if there is not, please make it EXPECT_EQ.
(Also for consistency either total_completions should be uint32_t or event_callback_invocations should be int32_t, they're both just counting up from zero so it's weird that they're different types.)
|
/assign-from @envoyproxy/senior-maintainers |
|
@envoyproxy/senior-maintainers assignee is @zuercher |
|
could you merge main please @aburan28 |
…ound-injected-completions # Conflicts: # changelogs/current.yaml
|
DCO needs to be fixed now. Also can you address the comments in ravenblackx's review? |
|
/wait |
Per ravenblackx review: tighten EXPECT_GE to EXPECT_EQ since the eventfd is only written by IoUring's self-poke, so the callback fires exactly 3 times for 3+3+1. Also unify event_callback_invocations to int32_t for consistency with the other counters. Signed-off-by: Adam Buran <aburan28@gmail.com>
Signed-off-by: Adam Buran <aburan28@gmail.com> # Conflicts: # source/common/io/io_uring_impl.cc
548de3d to
39c3a9a
Compare
Commit Message:
io_uring: bound injected completions per dispatcher tick
IoUringImpl::forEveryCompletion's injected-completion drain loop is unbounded:a steady stream of injected completions — or callbacks that themselves inject
more completions — can stall the dispatcher thread indefinitely and starve
other event sources sharing the same loop. The TODO at io_uring_impl.cc:78
called this out explicitly.
This PR caps the per-tick drain at a configurable bound (default 1024). When
the cap is reached with completions still queued,
forEveryCompletionwritesto the registered eventfd to re-arm the file event so processing resumes on
the next dispatcher tick. The eventfd is non-blocking and is fully drained at
the top of
forEveryCompletion, so the self-poke is safe and idempotent.Additional Description:
realistic traffic; in steady state the loop drains the queue and never
re-arms.
(
IoUringWorkerImpl) are unchanged.can be a follow-up if anyone needs it.
AI usage disclosure: Portions of the code and/or PR description were drafted
with the assistance of Claude (Anthropic). I reviewed and understand all
submitted code.
Risk Level: Low
(Caps an existing unbounded loop with a generous default. Existing callers
are unchanged. In steady state this is a no-op.)
Testing:
IoUringImplTest.BoundedInjectedCompletionsPerEvent: injects 7completions with cap=3 and asserts the drain splits 3/3/1 across three
dispatcher ticks, and that all 7 fire without any explicit
activate()beyond the first — verifying the eventfd self-poke path.
io_uringunit and integration tests on Linux CI.Docs Changes: N/A
Release Notes: Added a
bug_fixesentry inchangelogs/current.yamlunderio_uring.Platform Specific Features: N/A. io_uring is Linux-only and the affected
code already lives behind the existing io_uring build gating; this PR does
not change platform support.
Runtime guard: N/A. The new bound takes effect unconditionally with a
generous default; happy to add a guard if maintainers prefer.