perf(event_dispatcher): lock-free emit via atomic shared_ptr snapshot#70
Conversation
Replace shared_mutex reader lock with std::atomic<std::shared_ptr<const HandlerList>> copy-on-write snapshot for emit()/emit_safe(), plus an atomic handler counter fast path for the zero-subscriber case. Add DetourModKit_bench microbench (DMK_BUILD_BENCHMARKS) and three new tests covering the fast path, snapshot stability, and reclamation. Include loader-lock destructor hardening in HookManager to keep move-only VmtHookEntry values out of container copy fallbacks. CI: add MSVC verify job to coverage-pages.yml, gate pages deploy on both MinGW and MSVC passing.
|
Warning Rate limit exceeded
Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 27 minutes and 7 seconds. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: ⛔ Files ignored due to path filters (10)
📒 Files selected for processing (7)
📝 WalkthroughWalkthroughThis PR performs a comprehensive architectural refactor of Changes
Sequence Diagram(s)sequenceDiagram
participant Pub as Publisher<br/>(emit/emit_safe)
participant Atom as Atomic Snapshot<br/>(shared_ptr)
participant WritMut as Writer Mutex
participant Sub as Subscriber
participant Hand as Handlers<br/>(vector snapshot)
rect rgba(0, 100, 150, 0.5)
Note over Pub,Hand: Emit Path (Lock-Free)
Pub->>Atom: Load snapshot (atomic, no lock)
Atom-->>Pub: Immutable handler vector
Pub->>Hand: Iterate handlers
Hand->>Sub: Invoke each handler
end
rect rgba(150, 100, 0, 0.5)
Note over WritMut,Hand: Subscribe/Unsubscribe Path (Copy-on-Write)
Sub->>WritMut: Lock writer mutex
WritMut->>Hand: Clone current vector
Hand->>Hand: Insert/remove handler
Hand->>Atom: Atomically publish new snapshot
Atom-->>WritMut: Old snapshot released
WritMut->>Sub: Unlock mutex
end
rect rgba(100, 150, 0, 0.5)
Note over Pub,Atom: Zero-Subscriber Fast Path
Pub->>Atom: Check atomic counter == 0
Atom-->>Pub: Return (skip snapshot load)
Pub->>Pub: Skip emit entirely
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
.github/workflows/coverage-pages.yml (1)
77-90:⚠️ Potential issue | 🟠 MajorThe 80% coverage gate is no longer enforced.
This
gcovrinvocation only prints and uploads the report. Without--fail-under-line 80(or an equivalent explicit check), the workflow stays green below the documented threshold anddeploy-pageswill still run.Suggested fix
- name: Generate HTML Coverage Report run: | mkdir -p coverage-report gcovr \ --root . \ --filter "src/" \ --filter "include/" \ --exclude "external/" \ --exclude "build/" \ --exclude "tests/" \ --gcov-ignore-parse-errors=negative_hits.warn_once_per_file \ + --fail-under-line 80 \ --print-summary \ --html-details coverage-report/index.html \ build/mingw-debug shell: bash🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/coverage-pages.yml around lines 77 - 90, The gcovr step currently only generates and uploads HTML and does not enforce the 80% gate; update the gcovr invocation in the "Generate HTML Coverage Report" step (the gcovr command that writes coverage-report/index.html) to include a fail threshold flag such as --fail-under-line 80 (or an equivalent explicit check) so the job fails when line coverage is under 80% and prevents the subsequent deploy-pages step from running.tests/test_hook_manager.cpp (1)
2453-2468:⚠️ Potential issue | 🟡 MinorSynchronize the shutdown racer before asserting it is blocked.
EXPECT_FALSE(shutdown_returned)also passes whenkillerjust has not reachedhm.shutdown()yet, so this test can miss a regression where shutdown returns immediately. Add a started handoff before the sleep/assertion.Suggested test hardening
- std::atomic<bool> shutdown_returned{false}; + std::atomic<bool> shutdown_started{false}; + std::atomic<bool> shutdown_returned{false}; std::thread killer([&]() { + shutdown_started.store(true, std::memory_order_release); hm.shutdown(); shutdown_returned.store(true, std::memory_order_release); }); + while (!shutdown_started.load(std::memory_order_acquire)) + { + std::this_thread::yield(); + } + // Give the killer a chance to start; it must not return while the // reader holds shared_lock (shutdown path acquires exclusive on // m_hooks_mutex after the shared disable pass). Assert that🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/test_hook_manager.cpp` around lines 2453 - 2468, The test races because EXPECT_FALSE(shutdown_returned) can be true simply because the killer thread hasn't reached hm.shutdown() yet; add a "started" handoff so the main thread waits until the killer has entered the shutdown call before asserting it's blocked. Introduce a synchronization flag/future (e.g., atomic<bool> killer_started or std::promise<void>/future) set inside the killer thread immediately before calling hm.shutdown(), have the main thread wait on that handoff (instead of just sleeping) and only then sleep briefly and assert shutdown_returned.load(... ) is false; reference the existing killer thread, shutdown_returned, and hm.shutdown() to locate where to add the handoff and wait. Ensure the handoff is set before hm.shutdown() and cleared/checked after to preserve the original ordering semantics involving m_mutator_gate and m_hooks_mutex.
🧹 Nitpick comments (1)
tests/bench_event_dispatcher.cpp (1)
51-73: Median helper is only well-defined for oddsamples.
per_op[per_op.size() / 2]returns the upper-middle value for even-length samples rather than(lo+hi)/2. All current callers pass11, so this is fine in practice — but a future caller with an even sample count would get a biased "median" silently. Consider anassert(samples % 2 == 1)or interpolating the two middle values.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/bench_event_dispatcher.cpp` around lines 51 - 73, The median_ns_per_op helper currently returns per_op[per_op.size() / 2], which yields the upper-middle value for even sample counts; update median_ns_per_op to either assert that samples is odd (e.g., assert(samples % 2 == 1)) or compute the true median by averaging the two middle values when per_op.size() is even (take lo = per_op[n/2 - 1], hi = per_op[n/2], return (lo + hi) / 2.0); modify the logic in median_ns_per_op to sort per_op as it does and then branch on per_op.size() % 2 to return the correct median value so future callers with even samples aren’t biased.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/analysis/event_dispatcher_bench_v3.2.0/README.md`:
- Around line 94-97: The README's suggested restore command uses "git checkout
HEAD -- include/DetourModKit/event_dispatcher.hpp", which re-checks out the PR
branch's header instead of the pre-change baseline; update the instruction to
check out the base branch or a specific pre-change commit (e.g., reference the
base branch name or commit SHA) when restoring
include/DetourModKit/event_dispatcher.hpp so the true baseline implementation is
used for the "before" benchmark.
In `@include/DetourModKit/event_dispatcher.hpp`:
- Around line 348-357: The clear() and unsubscribe() methods are marked noexcept
but perform allocations via std::make_shared and vector::reserve/push_back
(risking std::bad_alloc which will call std::terminate); update these writer
paths to avoid implicit termination by either removing noexcept from
unsubscribe() (and document behavior) or by catching std::bad_alloc inside
unsubscribe() and clear(), leaving handlers_/handler_count_ unchanged on
allocation failure and returning a failure status (e.g., make unsubscribe()
return bool) so callers like Subscription::reset() and the stored lambda can
retry; also add a clear class-level note that mutation may allocate and either
throws or returns false on OOM and reference using emit_safe() from handler
callbacks as per project guidance.
- Around line 13-33: Update the documentation wording that currently calls the
emit path "lock-free": replace claims like "lock-free on the hot path" with
language such as "wait-free zero-subscriber fast path; short-critical-section
snapshot load" or similar that notes the snapshot uses an internal bit-lock;
update the phrases around emit()/emit_safe(), handlers_.load(), and
handler_count_.load() mentions (locations noted) so readers understand the fast
path is real but relies on a short internal critical section rather than true
lock-freedom.
In `@README.md`:
- Around line 108-109: Replace the misleading "Lock-free `emit()` /
`emit_safe()`" phrasing with wording that accurately reflects the design: state
that `emit()` / `emit_safe()` are "reader-lock-free" or "no shared_mutex on the
hot path" (they perform an atomic acquire-load of a `std::shared_ptr<const
vector>` snapshot and iterate without a shared_mutex), and note that
`std::atomic<std::shared_ptr<...>>` may use internal locking on some toolchains;
similarly update the zero-subscriber fast path description to keep `emit()` /
`emit_safe()` and the copy-on-write `subscribe()` / `unsubscribe()` behavior but
avoid claiming toolchain-independent lock-free semantics.
In `@src/hook_manager.cpp`:
- Around line 176-179: Add compile-time guards to ensure the maps' move
constructors are noexcept: insert static_asserts that check
std::is_nothrow_move_constructible_v<HookMap> and
std::is_nothrow_move_constructible_v<VmtHookMap> immediately before the new
expressions that create leaked_hooks and leaked_vmt_hooks (which move m_hooks
and m_vmt_hooks). This enforces at compile time that HookMap and VmtHookMap meet
the noexcept move requirement used in the noexcept destructor fallback.
In `@tests/bench_event_dispatcher.cpp`:
- Around line 41-46: The global volatile g_sink used by noop_handler causes a
data race when bench_concurrent_emit calls noop_handler from multiple threads;
replace g_sink with std::atomic<std::uint64_t> and change noop_handler to
perform a relaxed fetch_add of e.value (or equivalent relaxed RMW) to remove the
UB and reduce cross-core contention, and update the final print/inspection site
to load the atomic value (with relaxed or appropriate ordering) instead of
reading the volatile variable.
---
Outside diff comments:
In @.github/workflows/coverage-pages.yml:
- Around line 77-90: The gcovr step currently only generates and uploads HTML
and does not enforce the 80% gate; update the gcovr invocation in the "Generate
HTML Coverage Report" step (the gcovr command that writes
coverage-report/index.html) to include a fail threshold flag such as
--fail-under-line 80 (or an equivalent explicit check) so the job fails when
line coverage is under 80% and prevents the subsequent deploy-pages step from
running.
In `@tests/test_hook_manager.cpp`:
- Around line 2453-2468: The test races because EXPECT_FALSE(shutdown_returned)
can be true simply because the killer thread hasn't reached hm.shutdown() yet;
add a "started" handoff so the main thread waits until the killer has entered
the shutdown call before asserting it's blocked. Introduce a synchronization
flag/future (e.g., atomic<bool> killer_started or std::promise<void>/future) set
inside the killer thread immediately before calling hm.shutdown(), have the main
thread wait on that handoff (instead of just sleeping) and only then sleep
briefly and assert shutdown_returned.load(... ) is false; reference the existing
killer thread, shutdown_returned, and hm.shutdown() to locate where to add the
handoff and wait. Ensure the handoff is set before hm.shutdown() and
cleared/checked after to preserve the original ordering semantics involving
m_mutator_gate and m_hooks_mutex.
---
Nitpick comments:
In `@tests/bench_event_dispatcher.cpp`:
- Around line 51-73: The median_ns_per_op helper currently returns
per_op[per_op.size() / 2], which yields the upper-middle value for even sample
counts; update median_ns_per_op to either assert that samples is odd (e.g.,
assert(samples % 2 == 1)) or compute the true median by averaging the two middle
values when per_op.size() is even (take lo = per_op[n/2 - 1], hi = per_op[n/2],
return (lo + hi) / 2.0); modify the logic in median_ns_per_op to sort per_op as
it does and then branch on per_op.size() % 2 to return the correct median value
so future callers with even samples aren’t biased.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 25774a5a-a5ef-4081-9792-c8c755ffd712
⛔ Files ignored due to path filters (2)
docs/analysis/event_dispatcher_bench_v3.2.0/after.tsvis excluded by!**/*.tsvdocs/analysis/event_dispatcher_bench_v3.2.0/before.tsvis excluded by!**/*.tsv
📒 Files selected for processing (12)
.github/workflows/coverage-pages.ymlAGENTS.mdCMakeLists.txtREADME.mddocs/analysis/event_dispatcher_bench_v3.2.0/README.mddocs/tests/README.mdinclude/DetourModKit/event_dispatcher.hppsrc/hook_manager.cpptests/CMakeLists.txttests/bench_event_dispatcher.cpptests/test_event_dispatcher.cpptests/test_hook_manager.cpp
Archive 5 runs per side under runs/, report medians with run-to-run spread as the noise floor. The prior single-run 64-subscriber regression was a statistical outlier (within 1% across 5 runs per side). Rename the directory to match the v3.1.0 release label.
- event_dispatcher: catch bad_alloc in noexcept unsubscribe()/clear(), leave state unchanged, return false for retry via Subscription::reset() - event_dispatcher + README: qualify "lock-free" claims to note that std::atomic<std::shared_ptr> may use an implementation-internal bit lock on toolchains without DWCAS (e.g. MSVC STL); the wait-free zero-subscriber fast path is unaffected - hook_manager: static_assert HookMap/VmtHookMap are nothrow-move- constructible so the noexcept loader-lock leak path cannot terminate - bench: replace volatile g_sink with std::atomic; make median_ns_per_op correct for even sample counts - test_hook_manager: tighten shutdown-blocks-killer race via an explicit killer-entered handoff before the EXPECT_FALSE assertion - coverage-pages.yml: enforce 80% line coverage gate via gcovr --fail-under-line 80 - bench docs: restore command references main, drop stale single-run before.tsv/after.tsv in favor of runs/ median
Summary
Replaces
shared_mutex-guardedemit()with anstd::atomic<std::shared_ptr<const HandlerList>>copy-on-write snapshot, and adds an atomic handler-count fast path for the zero-subscriber case.emit()andemit_safe()are now lock-free on the hot path;subscribe()/unsubscribe()serialize writers on a smallstd::mutexand publish a new immutable snapshot.Perf (mingw-release, median of 5 runs per side)
emitemitemitemitemit_concurrent_4_threads(8 subs)subscribe_unsub_roundtripVariance-backed. An earlier single-run measurement suggested an 18% regression at 64 subscribers; across 5 runs per side the delta drops to within noise. Full table, per-run TSVs, and methodology in
docs/analysis/event_dispatcher_bench_v3.1.0/README.md.Changes
include/DetourModKit/event_dispatcher.hpp: atomic snapshot + handler counter; fast path inemit/emit_safe; writer mutex for COW mutations.tests/test_event_dispatcher.cpp: 3 new tests (EmptyFastPath_SkipsLock,SnapshotStability_DuringEmit,SnapshotReclamation_NoLeak) viaDMK_EVENT_DISPATCHER_INTERNAL_TESTING.tests/bench_event_dispatcher.cpp+ CMake wiring: opt-inDMK_BUILD_BENCHMARKS=ONbuilds a standaloneDetourModKit_bench(no gtest dep).src/hook_manager.cpp+tests/test_hook_manager.cpp: loader-lock destructor now heap-allocates the leaked maps directly to avoid container copy fallbacks on move-onlyVmtHookEntry;static_assertguards the ownership contract..github/workflows/coverage-pages.yml: parallelmsvc-verifyjob; pages deploy now requires both MinGW coverage and MSVC verify to pass.README.md,AGENTS.md,docs/tests/README.mdupdated for the new model + bench harness.Test plan
mingw-debug.EventDispatcherTest.*29 / 29 pass, including the three new tests.DetourModKit_benchbuilds and runs clean onmingw-release.Summary by CodeRabbit
New Features
DMK_BUILD_BENCHMARKSbuild option for standalone performance testingBug Fixes
Documentation