Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 6 additions & 4 deletions AGENTS.md
Original file line number Diff line number Diff line change
Expand Up @@ -133,6 +133,7 @@ include/DetourModKit/ # Public headers -- one per module
config_watcher.hpp # Filesystem watcher (ReadDirectoryChangesW) for INI hot-reload
input.hpp # Input polling (keyboard/mouse/XInput) + opt-in mouse-wheel capture and gamepad passthrough suppression
input_codes.hpp # Unified InputCode type and named key tables (incl. WheelUp/Down/Left/Right)
srw_shared_mutex.hpp # Opaque Windows SRWLOCK SharedLockable wrapper; use for reader/writer locks that need native Windows semantics without exposing Windows headers
memory.hpp # Memory read/write, sharded region cache, seh_read<T>, seh_resolve_chain/seh_read_chain<T>, plausible_userspace_ptr, PE module range
rtti.hpp # MSVC RTTI walker (type_name_of, vtable_is_type, find_in_pointer_table) + reverse name-to-vtable (vtable_for_type, vtables_for_type, TypeIdentity)
rtti_dissect.hpp # Reverse RTTI dissection + self-healing offsets (identify_pointee_type, reverse_scan_block, heal_landmark/heal_offset, solve_fingerprint) + drift-telemetry report (heal_report, DriftEntry)
Expand Down Expand Up @@ -310,6 +311,7 @@ The raw `Scanner::find_pattern(start_address, region_size, pattern)` overloads d
- **Decoder tests:** `tests/test_x86_decode.cpp` tests the internal header `src/x86_decode.hpp` (RIP-relative E9 / EB / FF25 resolvers consumed by `Scanner`). The test file adds `src/` to its include path and drives each decoder with hand-crafted byte buffers.
- **Input interception tests:** `tests/test_input_intercept.cpp` tests the internal header `src/input_intercept.hpp` (same `src/`-on-include-path pattern as the decoder tests). It covers the two pure state machines that back the active-input layer: the wheel-pulse stepper (one notch maps to one Press edge with a forced low cycle) and the gamepad consume-until-release latch (modifier-released-before-trigger does not leak, release grace expiry, re-press during grace). The live hooks are exercised by integration tests against a throwaway top-level window the test process creates: the window-procedure subclass install/uninstall, per-direction wheel-notch capture, the consume-swallow versus forward decision (observed through a recording predecessor procedure), and the `WM_NCDESTROY` self-heal that lets a recreated window be re-subclassed. The XInput inline hook is exercised against a loaded `xinput` runtime (install, trampoline publish, idempotent re-install, teardown), and a standalone `InputPoller` test drives the full hot-reload disarm path (a consume wheel binding arms the swallow flag; `clear_bindings(false)` lets the poll loop disarm it on a later cycle). These live-hook tests skip themselves when the host has no window station or no XInput runtime, so a headless runner stays green. The one path still validated manually is the gamepad button masking itself, which requires a physically connected controller.
- **Worker tests:** `tests/test_worker.cpp` covers the `StoppableWorker` RAII `std::jthread` wrapper, including the empty-body early return, swallowed `std::exception` and unknown-exception paths, and idempotent `request_stop()` / `shutdown()`. The loader-lock detach arm is untestable from user code (only reached under DllMain) and is accepted as such.
- **SRW shared mutex tests:** `tests/test_srw_shared_mutex.cpp` covers the opaque Windows SRWLOCK SharedLockable wrapper used by DetourModKit reader/writer locks: shared-reader overlap, exclusive try-lock failure while readers are held, shared try-lock failure while a writer is held, a single-threaded try-lock/unlock sequence that proves both unlock paths actually release, and a blocked-writer handoff (a writer blocked in `lock()` cannot complete while a shared owner is live and acquires once the reader releases). File-scope static_asserts pin the deleted copy/move members (SRWLOCK owners and waiters reference the lock by address).
- **Pointer-chain tests:** `tests/test_memory_chain.cpp` is a deliberate second suite for the public `memory.hpp` surface, kept separate from `test_memory.cpp` because the single-fault-frame pointer-chain primitives (`plausible_userspace_ptr`, `seh_resolve_chain`, `seh_read_chain`, `seh_read_chain_bytes`) walk in-process pointer chains and need no cache or game-memory state, whereas `test_memory.cpp` drives the sharded cache, read/write, and module-range paths. Both suites bind to the same header; this is the only same-module split and is intentional for state isolation.
- **Reverse-RTTI / self-heal tests:** `tests/test_rtti_dissect.cpp` covers the `rtti_dissect.hpp` surface (L1 `identify_pointee_type` through L4 `solve_fingerprint`). It rebuilds the `SyntheticVtable` COL/TD/vtable fixture in the test exe's PE range (so the shared prelude's module bound-check accepts it) and adds a `syn_heap_object` helper that `VirtualAlloc`s an object outside every module range, exercising the pointer-to-object branch and the cross-DLL resolvability case. The suite replaces the throwing global `operator new`/`delete` with a malloc/free pair plus a counter so the `_AllocatesNothing` tests can assert the heal and fingerprint paths are allocation-free (each warms the module-range cache first, then measures a second call; the heal case is measured on the window-scan path, not just the nominal short-circuit); the aligned `operator new` forms are left at their defaults so over-aligned allocations never cross-free. This replacement is process-wide for the test binary but harmless to other suites (malloc/free is a valid implementation and only this suite reads the counter).
- **Reverse-RTTI by-name tests:** `tests/test_rtti_reverse.cpp` covers `rtti.hpp`'s reverse resolvers (`vtable_for_type`, `vtables_for_type`, `TypeIdentity`). It builds synthetic COL/TypeDescriptor/vtable fixtures in the test exe's data segment (so the prelude's module bound-check accepts them) and resolves them through a tight pool range for speed, with one case driving the real PE-section walk via `host_module_range()`. It exercises single- and multiple-inheritance (one name -> many sub-object vtables), the `COL.offset == 0` primary selection, ambiguous-name fail-closed, and the warm-cache identity path.
Expand Down Expand Up @@ -349,11 +351,11 @@ PATH="/c/msys64/mingw64/bin:$PATH" ./build/mingw-debug/tests/DetourModKit_tests.
| Module | Thread safety | Hot-path mechanism |
|--------|--------------|-------------------|
| Scanner | Stateless -- inherently safe | N/A (startup only) |
| HookManager | `shared_mutex` (readers) / `unique_lock` (writers); two-phase shutdown (disable under shared lock, clear under exclusive lock); `m_mutator_gate` (shared_mutex) blocks new mutators (including all VMT operations) during teardown; CAS on `m_shutdown_called` serializes shutdown/remove_all_hooks; double-checked fast-fail on `m_shutdown_called` in all mutators; destructor fallback (when `DMK_Shutdown()` was not called) acquires `m_mutator_gate` exclusively, flips `m_shutdown_called`, drains readers via exclusive `m_hooks_mutex`, then clears the maps -- under loader lock it pins the module and swaps each map's contents into heap storage allocated via `new (std::nothrow)` so the storage outlives the destructor without ever draining, mirroring the leak-on-loader-lock discipline used in `Logger::shutdown_internal` and `ConfigWatcher::~ConfigWatcher` | `shared_lock` for `with_inline_hook()` |
| HookManager | SRWLOCK-backed reader/writer locks; two-phase shutdown (disable under shared lock, clear under exclusive lock); `m_mutator_gate` blocks new mutators (including all VMT operations) during teardown; CAS on `m_shutdown_called` serializes shutdown/remove_all_hooks; double-checked fast-fail on `m_shutdown_called` in all mutators; destructor fallback (when `DMK_Shutdown()` was not called) acquires `m_mutator_gate` exclusively, flips `m_shutdown_called`, drains readers via exclusive `m_hooks_mutex`, then clears the maps -- under loader lock it pins the module and swaps each map's contents into heap storage allocated via `new (std::nothrow)` so the storage outlives the destructor without ever draining, mirroring the leak-on-loader-lock discipline used in `Logger::shutdown_internal` and `ConfigWatcher::~ConfigWatcher` | `shared_lock` SRWLOCK reader for `with_inline_hook()` |
| Logger | `atomic<shared_ptr>` for lock-free async reads; `shutdown_internal` is safe across repeated shutdown / enable_async_mode cycles: when the writer thread has to be detached under loader lock, the module is pinned and the `shared_ptr<AsyncLogger>` is moved into a per-call heap cell allocated via `new (std::nothrow)` rather than appended to a static `std::vector`, so the leak path keeps the noexcept destructor honest under OOM (returns nullptr instead of throwing `bad_alloc`) and prior handles are never dropped while their writer threads may still be running | Single atomic load on log level check |
| AsyncLogger | Lock-free MPMC queue (Vyukov-style); post-join drain on shutdown (at most one message per producer can be lost in the nanosecond race between drain and force-zero -- accepted trade-off to avoid atomic overhead on every enqueue); timestamp caching in write batches | Atomic sequence numbers per slot |
| InputPoller | Atomic `m_active_states[]` array | `shared_lock` (uncontended SRWLOCK reader) guarding the `m_active_states` pointer swap + `memory_order_relaxed` load per binding |
| InputManager | `mutex` for lifecycle, `atomic<InputPoller*>` for reads | `atomic<InputPoller*>` acquire-load, then the poller's `shared_lock` + relaxed load (not lock-free) |
| InputManager | `mutex` for lifecycle, `atomic<shared_ptr<InputPoller>>` for reads | `atomic<shared_ptr<InputPoller>>` acquire-load, then the poller's `shared_lock` + relaxed load (not lock-free) |
| InputIntercept (internal `src/input_intercept.*`) | File-scope atomics shared between the poll thread and the game's threads (XInput callers, window message thread); owns its safetyhook InlineHooks directly (not via HookManager) because the poll thread reads the trampoline and the hook lifetime is coupled to the poll thread; consume-until-release latch and wheel-pulse state are poll-thread-private; teardown skipped under loader lock (detours left installed against the pinned module) | Lock-free atomic loads in each detour; allocation-free, non-throwing detour bodies |
| Memory cache | Sharded `SRWLOCK` + epoch-based shutdown | Shared reader locks per shard |
| Config | `mutex` for registration; deferred setter invocation outside lock (no reentrancy guard needed -- setters may call back into Config); `reload()` re-runs the registered items against the stashed INI path using the same deferred pattern and short-circuits on FNV-1a 64 hash match of the on-disk bytes to skip no-op reloads; bytes are read once per load/reload and fed to `CSimpleIniA::LoadData`, so the cached hash and the parsed INI state are guaranteed to reflect the same file snapshot (no TOCTOU between hash and parse); `enable_auto_reload()` owns a `ConfigWatcher` behind a separate `std::mutex` so start/stop transitions do not contend with registration traffic; setters invoked by the watcher run on the watcher thread, setters invoked by the reload hotkey run on a dedicated `ReloadServicer` thread (lazily started on first `register_reload_hotkey`, torn down in `clear_registered_items()`) so the `InputManager` poll thread never blocks on INI parsing; the servicer's press-request path takes its internal `m_mutex` around the predicate store before `cv.notify_one` to close the lost-wakeup window; all setters must be reentrant and thread-safe | N/A (startup only) |
Expand All @@ -367,7 +369,7 @@ These are called at 60+ fps from game hook callbacks. Never add allocations, exc

- `InputPoller::is_binding_active(index)` -- `shared_lock` (uncontended SRWLOCK reader, guards the `m_active_states` pointer swap on reshape) + single `memory_order_relaxed` load
- `InputPoller::is_binding_active(name)` -- `shared_lock` + hash lookup + `memory_order_relaxed` load per matching binding (typically 1-3)
- `HookManager::with_inline_hook()` -- shared_lock read
- `HookManager::with_inline_hook()` -- `shared_lock` SRWLOCK reader
- `Logger::log()` level check -- single atomic load
- `Logger::log()` async enqueue -- atomic shared_ptr load + lock-free queue push
- `Memory::is_readable()` -- sharded SRWLOCK reader + cache lookup
Expand Down Expand Up @@ -410,7 +412,7 @@ These are called at 60+ fps from game hook callbacks. Never add allocations, exc
- **Do not use** `EventDispatcher::emit()` from hook callbacks -- use `emit_safe()` instead to prevent unhandled handler exceptions from crashing the host process.
- **Do not return** from a memory-writing helper before its post-write cache maintenance (instruction-cache flush and cache-range invalidation) once bytes have been modified -- run the cleanup on every exit path, even when a later step such as restoring page protection fails.
- **Do not let** a public doc comment describe behavior the implementation no longer has -- lifecycle and ordering claims must match the code, and are best pinned by a test.
- **Do not take** a shared lock in a const query/accessor that can be called from inside a `with_*`/`try_with_*` callback without first making it reentrancy-guard-aware (skip the lock when the per-thread guard is non-zero, since the callback already holds it). Recursive shared acquisition of the non-recursive `shared_mutex`/SRWLOCK on one thread is undefined behavior and deadlocks if a writer is queued between the two acquisitions (see `HookManager::lock_hooks_shared_reentrant`).
- **Do not take** a shared lock in a const query/accessor that can be called from inside a `with_*`/`try_with_*` callback without first making it reentrancy-guard-aware (skip the lock when the per-thread guard is non-zero, since the callback already holds it). Recursive shared acquisition of a non-recursive reader/writer lock on one thread is undefined behavior and deadlocks if a writer is queued between the two acquisitions (see `HookManager::lock_hooks_shared_reentrant`).
- **Do not key** a cache store and its invalidation/eviction by different addresses or shard-selection functions. Eviction must use the same canonical key and the same containment lookup as insertion and read, or entries silently survive invalidation (see `Memory::invalidate_range`, which scans every shard because storage is sharded by query address, not region base).
- **Do not let** a queue or backlog fed at an external event rate grow without bound -- clamp the pending count to a documented ceiling (see `Input` wheel-notch `MAX_WHEEL_PENDING`).
- **Do not let** an async/deferred sink diverge from its synchronous counterpart's configuration (timestamp format, level, etc.) -- carry the same settings through, as `Logger::enable_async_mode` does for the timestamp format.
17 changes: 11 additions & 6 deletions CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -527,15 +527,20 @@ if(DMK_ENABLE_SANITIZERS)
string(REGEX REPLACE "/RTC[1csu]+" "" CMAKE_CXX_FLAGS_DEBUG "${CMAKE_CXX_FLAGS_DEBUG}")

# The vendored SafetyHook is not instrumented, so the MSVC STL ASan container
# annotations (annotate_string / annotate_vector) would differ across the link
# and fail with LNK2038. Disable those annotations on every instrumented target
# so all objects agree (value 0); the core ASan checks (heap/stack/use-after-
# free) stay on. The library is instrumented explicitly because it was created
# annotations (annotate_string / annotate_vector / annotate_optional) would
# differ across the link and fail with LNK2038. Disable those annotations on
# every instrumented target so all objects agree (value 0); the core ASan
# checks (heap/stack/use-after-free) stay on. _DISABLE_STL_ANNOTATION is the
# STL's umbrella switch and auto-defines every per-family disable the toolset
# knows, so annotation families added in future STL versions stay covered;
# the explicit string/vector macros remain for VS 2022 toolsets that predate
# the umbrella. The library is instrumented explicitly because it was created
# before this block; the tests (added afterwards) pick up the directory options.
target_compile_options(DetourModKit PRIVATE /fsanitize=address)
target_compile_definitions(DetourModKit PRIVATE _DISABLE_STRING_ANNOTATION=1 _DISABLE_VECTOR_ANNOTATION=1)
target_compile_definitions(DetourModKit PRIVATE
_DISABLE_STL_ANNOTATION=1 _DISABLE_STRING_ANNOTATION=1 _DISABLE_VECTOR_ANNOTATION=1)
add_compile_options(/fsanitize=address)
add_compile_definitions(_DISABLE_STRING_ANNOTATION=1 _DISABLE_VECTOR_ANNOTATION=1)
add_compile_definitions(_DISABLE_STL_ANNOTATION=1 _DISABLE_STRING_ANNOTATION=1 _DISABLE_VECTOR_ANNOTATION=1)
add_link_options(/INCREMENTAL:NO)
message(STATUS "Sanitizers enabled: ASan (MSVC)")
elseif(WIN32)
Expand Down
1 change: 1 addition & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -255,6 +255,7 @@ See the [Config Hot-Reload Guide](docs/config-hot-reload/README.md) for the thre
**Performance:**

- Hash-map-backed `is_binding_active()` query for low-overhead cross-thread state reads (e.g., from render hooks at 60+ fps)
- SRWLOCK-backed reader/writer synchronization for live binding reshapes, using the same native Windows lock primitive as hook registries and memory caches
- Multiple bindings per name for multi-combo hotkeys
- Lock-free `is_running()` via atomic flag
- O(1) reverse name lookup for `input_code_to_name()`
Expand Down
Loading
Loading