fix(scanner): correct offset semantics docs and harden find_pattern#67
Conversation
📝 WalkthroughWalkthroughRefactors scanner internals to apply CompiledPattern::offset exactly once on return, adds raw-match helpers and stricter input validation, restricts process-wide scanning to readable+executable pages, fixes RIP-relative arithmetic, expands tests, and adds comprehensive AOB-signature documentation and README/AGENTS links. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Caller
participant Scanner
participant MemEnum as MemoryRegionEnumerator
participant Logger
Caller->>Scanner: find_pattern / scan_executable_regions(pattern)
Scanner->>Scanner: validate_find_pattern_inputs(pattern)
Scanner->>Logger: warn if all-wildcards (if applicable)
alt scan_executable_regions
Scanner->>MemEnum: enumerate executable regions (filter READABLE+EXEC)
loop per region
Scanner->>Scanner: find_pattern_raw(region, pattern)
alt match found
Scanner->>Scanner: apply pattern.offset once to match
alt RIP-relative pattern
Scanner->>Scanner: resolve_rip_relative(match)
Scanner->>Logger: log resolve errors/warnings
end
Scanner->>Caller: return match (offset applied)
end
end
else direct find_pattern
Scanner->>Scanner: find_pattern_raw(range, pattern)
Scanner->>Scanner: apply pattern.offset once to match
Scanner->>Caller: return match (offset applied)
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 2✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (3)
tests/test_scanner.cpp (1)
1868-1882: Test name overstates what is verified.
ResolveRipRelative_UnsignedWraparoundContractasserts a normal-range negative disp32 on a heap address — not an actual wraparound nearUINTPTR_MAX. The signed and unsigned formulations both produce the same result here, so this test would have passed even before the unsigned-arithmetic fix. Consider either renaming to something likeNegativeDisp32_ProducesExpectedTarget(it's already effectively covered byScannerRipResolveTest.resolve_rip_relative_negative_displacement), or adding a second assertion that exercises a syntheticuintptr_tbase near the signed boundary (e.g. via a constructed non-dereferenced address path) to actually stress modular wraparound.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/test_scanner.cpp` around lines 1868 - 1882, The test name ResolveRipRelative_UnsignedWraparoundContract is misleading because Scanner::resolve_rip_relative is being exercised with a normal heap base and a negative disp32 (no unsigned wraparound); either rename the test to reflect it verifies negative disp32 behavior (e.g., NegativeDisp32_ProducesExpectedTarget) or add a second assertion that actually forces modular wraparound by constructing a synthetic uintptr_t base near UINTPTR_MAX (without dereferencing that address) and calling Scanner::resolve_rip_relative with the same disp to verify the result wraps (using the constructed base + instr_len + disp modulo address space) so the test truly validates the unsigned-arithmetic fix.src/scanner.cpp (2)
582-655: Skipping pure-execute pages removes a real crash vector.
READABLE_EXEC_FLAGScorrectly excludes barePAGE_EXECUTE(which grants exec without read on processors that honor NX), sofind_pattern_raw'smemchrcan no longer be handed a page that faults on load. Theprotection_unsafecheck retains the guard/noaccess exclusion. The trace-level diagnostic for the skipped regions is gated byis_enabled(Trace)so it's free when tracing is off.Two minor notes — both optional:
- The trace message at line 619-621 passes
mbi.BaseAddress(avoid*) directly to{}formatting. If the project'sstd::formatformatter forvoid*doesn't render a hex address cleanly, considerFormat::format_address(reinterpret_cast<uintptr_t>(mbi.BaseAddress))for consistency with other scanner logs.- The
execute_only && !protection_unsafe && MEM_COMMIT && is_enabled(Trace)condition at 616 is a pure diagnostic branch; consider hoisting it behind a singleif (logger.is_enabled(LogLevel::Trace))outer scope to make it visually obvious this block is trace-only. Non-blocking.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/scanner.cpp` around lines 582 - 655, The code intentionally skips pure-execute pages to avoid read faults in DetourModKit::Scanner::scan_executable_regions; keep the READABLE_EXEC_FLAGS logic and protection_unsafe check but adjust the diagnostic only: move the logger.is_enabled(LogLevel::Trace) check outward so the entire trace-only block is gated, and change the logger.trace argument for mbi.BaseAddress to use a formatted address (e.g., via Format::format_address(reinterpret_cast<uintptr_t>(mbi.BaseAddress))) to ensure consistent hex rendering; leave matching logic (find_pattern_raw, pattern.offset) unchanged.
289-339: Nice refactor — thefind_pattern_rawsplit fixes the offset double-apply on the Nth-occurrence walk.Extracting the raw primitive is the key structural change: it keeps the
match + 1continuation anchored at the true match start (independent ofpattern.offset), and it gives each public entry point a single, well-defined place to apply the offset and emit the all-wildcards warning. The Nth-occurrence overload previously would have misbehaved if the publicfind_patternhad already addedpattern.offsetto its return value (the cursor advance would have been wrong). The new structure makes this robust by construction.One small observation on the public wrappers: the guards in
find_pattern(start, size, pattern)andfind_pattern(start, size, pattern, occurrence)are now identical. Consider extracting a smallvalidate_pattern_and_start(...)helper (or folding it into thepattern_has_literal_bytehelper) to keep the two entry points in lockstep if the validation ever grows.Also applies to: 463-512
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/scanner.cpp` around lines 289 - 339, Extract the duplicated input/guard logic from DetourModKit::Scanner::find_pattern(...) (both the two-argument and Nth-occurrence overloads) into a small helper like validate_pattern_and_start(const std::byte *start_address, size_t region_size, const CompiledPattern &pattern, Logger& logger) (or fold it into the existing pattern_has_literal_byte helper), move the empty-pattern, null-start and all-wildcards checks into that helper while preserving the same logger.error/warning messages and return semantics, and call that helper from both public find_pattern overloads before calling find_pattern_raw so the two entry points stay in sync; keep references to pattern_has_literal_byte and find_pattern_raw unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/misc/aob-signatures.md`:
- Around line 465-473: The table row for "find_pattern hits the wrong site" is
malformed (only two cells); edit the markdown row that starts with "|
`find_pattern` hits the wrong site" so it has three pipe-separated cells: first
cell the Symptom text, second cell the Likely cause ("Signature not unique; pick
a tighter neighbour; or use the Nth-occurrence overload with a confirmed N"),
and third cell the Remedy (for example "Pick a tighter neighbour or use the
Nth-occurrence overload with a confirmed N" or a succinct actionable fix).
Ensure the row follows the same "| A | B | C |" format as the other rows
(matching the header "| Symptom | Likely cause | Remedy |").
In `@include/DetourModKit/scanner.hpp`:
- Around line 61-70: Replace the multi-line trailing "///<" comments on the
member named `offset` with a proper Doxygen block comment placed immediately
above the `offset` declaration: remove the continued `///<` lines after
`offset`, add a /** `@brief` ... */ block that summarizes the field, include an
`@details` section for the longer explanation (mention how `|` sets the value, it
may equal bytes.size(), and why the type is signed for
pointer-arithmetic/future-proofing), and keep the brief single-line `///<` style
only for `bytes` and `mask` if they remain trivial; ensure the symbol `offset`
is documented with `@brief` and `@details` rather than multi-line trailing `///<`
comments.
---
Nitpick comments:
In `@src/scanner.cpp`:
- Around line 582-655: The code intentionally skips pure-execute pages to avoid
read faults in DetourModKit::Scanner::scan_executable_regions; keep the
READABLE_EXEC_FLAGS logic and protection_unsafe check but adjust the diagnostic
only: move the logger.is_enabled(LogLevel::Trace) check outward so the entire
trace-only block is gated, and change the logger.trace argument for
mbi.BaseAddress to use a formatted address (e.g., via
Format::format_address(reinterpret_cast<uintptr_t>(mbi.BaseAddress))) to ensure
consistent hex rendering; leave matching logic (find_pattern_raw,
pattern.offset) unchanged.
- Around line 289-339: Extract the duplicated input/guard logic from
DetourModKit::Scanner::find_pattern(...) (both the two-argument and
Nth-occurrence overloads) into a small helper like
validate_pattern_and_start(const std::byte *start_address, size_t region_size,
const CompiledPattern &pattern, Logger& logger) (or fold it into the existing
pattern_has_literal_byte helper), move the empty-pattern, null-start and
all-wildcards checks into that helper while preserving the same
logger.error/warning messages and return semantics, and call that helper from
both public find_pattern overloads before calling find_pattern_raw so the two
entry points stay in sync; keep references to pattern_has_literal_byte and
find_pattern_raw unchanged.
In `@tests/test_scanner.cpp`:
- Around line 1868-1882: The test name
ResolveRipRelative_UnsignedWraparoundContract is misleading because
Scanner::resolve_rip_relative is being exercised with a normal heap base and a
negative disp32 (no unsigned wraparound); either rename the test to reflect it
verifies negative disp32 behavior (e.g., NegativeDisp32_ProducesExpectedTarget)
or add a second assertion that actually forces modular wraparound by
constructing a synthetic uintptr_t base near UINTPTR_MAX (without dereferencing
that address) and calling Scanner::resolve_rip_relative with the same disp to
verify the result wraps (using the constructed base + instr_len + disp modulo
address space) so the test truly validates the unsigned-arithmetic fix.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 3c86a7ec-64d8-45b3-bf3f-ece63bec0c63
📒 Files selected for processing (6)
AGENTS.mdREADME.mddocs/misc/aob-signatures.mdinclude/DetourModKit/scanner.hppsrc/scanner.cpptests/test_scanner.cpp
221be63 to
28fa8ba
Compare
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/misc/aob-signatures.md`:
- Line 86: The hard-coded header line number in the markdown sentence should be
removed to avoid drift: update the link that currently points to
Scanner::parse_aob(std::string_view) ([include/DetourModKit/scanner.hpp:89]) by
dropping the ":89" so it links to the header file only (e.g.
[include/DetourModKit/scanner.hpp]) and keep the text referencing
Scanner::parse_aob(std::string_view) intact; this ensures the reference remains
correct even if line numbers change.
- Line 93: The table entry for the offset marker `\|` currently says it "records
the byte position of the next token" which implies it cannot appear at end;
update the wording to explicitly allow the end case by mentioning that the
marker records the byte position of the next token or, if placed at the very
end, the position equal to the pattern length (i.e. offset == bytes.size()), and
keep the note that this value is stored on CompiledPattern::offset and that the
marker cannot appear more than once.
- Around line 348-360: The example uses nonexistent members (candidate.compiled
and candidate.source->disp_offset); either add those members to AddrCandidate
or, more simply, use the fields already declared — call sc::find_pattern with
candidate.pattern (or add a compiled pattern field if that API requires it) and
read the offset from candidate.disp_offset (not candidate.source->disp_offset);
also ensure any use of instr_end_offset matches the declared instr_end_offset
field. Update the snippet so the struct and its use are consistent (e.g., use
candidate.pattern and candidate.disp_offset, or extend AddrCandidate with
compiled/source members if you prefer).
In `@include/DetourModKit/scanner.hpp`:
- Around line 120-130: Update the return-value documentation for the function
that returns a "const std::byte*" (the block mentioning pattern.offset and
adjusted pointer) to explicitly state that if pattern.offset equals the matched
length (e.g., a trailing '|' in the AOB), the returned pointer may be one-past
the matched range — and in edge cases one-past the scanned region — and callers
must not blindly dereference it; change the wording wherever similar return docs
appear (the other blocks referenced) to warn about one-past pointers and
recommend checking bounds before dereferencing when using pattern.offset.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 9b2344b0-f987-48c4-a06b-ce24213212ea
📒 Files selected for processing (6)
AGENTS.mdREADME.mddocs/misc/aob-signatures.mdinclude/DetourModKit/scanner.hppsrc/scanner.cpptests/test_scanner.cpp
🚧 Files skipped from review as they are similar to previous changes (3)
- README.md
- tests/test_scanner.cpp
- src/scanner.cpp
- Hoist empty/all-wildcard validation out of internal primitive so warnings log once per public call, not per region or per occurrence. - Use unsigned modular arithmetic in resolve_rip_relative to avoid signed pointer overflow UB. - Correct docs/misc/aob-signatures.md: find_pattern applies pattern.offset internally; remove misleading manual-add guidance. - Add regression tests for offset-at-end, all-wildcard Nth occurrence, zero/empty/null guards, and RIP-relative wraparound contract.
28fa8ba to
3f6bf6a
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (2)
tests/test_scanner.cpp (1)
1565-1584: Nit: VirtualAlloc region leaks ifVirtualProtectfails.If the
ASSERT_TRUE(::VirtualProtect(...))on line 1570 fires, the test exits without the::VirtualFree(region, 0, MEM_RELEASE)on line 1583. Low-probability in practice, but a tiny RAII guard (or swapping the order: setPAGE_NOACCESSfirst viaVirtualAllocwithMEM_RESERVE+ selectiveMEM_COMMIT, or usingScopeGuard-style cleanup) avoids the leak on the failure path.♻️ Proposed minimal fix
auto *region = static_cast<std::byte *>(::VirtualAlloc( nullptr, page_size * 2, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE)); ASSERT_NE(region, nullptr); + auto region_cleanup = [&]() { ::VirtualFree(region, 0, MEM_RELEASE); }; + DWORD old_protect = 0; - ASSERT_TRUE(::VirtualProtect(region + page_size, page_size, PAGE_NOACCESS, &old_protect)); + if (!::VirtualProtect(region + page_size, page_size, PAGE_NOACCESS, &old_protect)) + { + region_cleanup(); + FAIL() << "VirtualProtect(PAGE_NOACCESS) failed"; + } @@ - ::VirtualFree(region, 0, MEM_RELEASE); + region_cleanup();🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/test_scanner.cpp` around lines 1565 - 1584, The test leaks the VirtualAlloc'd region if ASSERT_TRUE(::VirtualProtect(...)) aborts before VirtualFree is called; wrap the allocation in an RAII cleanup (e.g., a small scope guard or std::unique_ptr with a custom deleter) that calls ::VirtualFree(region, 0, MEM_RELEASE) so the region is always freed, then use that RAII handle while calling ::VirtualProtect and Scanner::resolve_rip_relative; reference the existing symbols region, ::VirtualProtect, ::VirtualFree and Scanner::resolve_rip_relative when making the change.docs/misc/aob-signatures.md (1)
1-488: New AOB guide reads well and prior review threads are resolved.
- The troubleshooting table now has proper 3-cell rows (previous row-malformation fix).
parse_aoblink no longer hard-codes a header line number.- The
|offset-marker row correctly covers the end-marker case (offset == bytes.size()).- The §6.4 negative-offset example now consistently uses
candidate.disp_offsetagainst the declared struct fields.- §4.4, §4.5, §7, and §8 consistently drum in the "don't add
pattern->offsetmanually" contract, matching the scanner implementation and the new regression tests.One soft suggestion (non-blocking): §6.1 introduces a user-managed
offset_to_hookalongside patterns that may also carry a|marker — readers could conflate the two. A single sentence clarifying thatoffset_to_hookis applied on top of the already-offset-adjusted pointer (and is typically zero when|is used) would prevent the exact double-apply confusion §8 warns against.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/misc/aob-signatures.md` around lines 1 - 488, Add a brief clarifying sentence to §6.1 explaining how per-candidate offsets interact with the Scanner's built-in offset: state that AddrCandidate::offset_to_hook (or similar per-candidate signed offsets) is applied on top of the pointer returned by find_pattern/scan_executable_regions (which have already applied CompiledPattern::offset), and that offset_to_hook is typically zero when the pattern includes a '|' marker to avoid double-applying the same offset.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@docs/misc/aob-signatures.md`:
- Around line 1-488: Add a brief clarifying sentence to §6.1 explaining how
per-candidate offsets interact with the Scanner's built-in offset: state that
AddrCandidate::offset_to_hook (or similar per-candidate signed offsets) is
applied on top of the pointer returned by find_pattern/scan_executable_regions
(which have already applied CompiledPattern::offset), and that offset_to_hook is
typically zero when the pattern includes a '|' marker to avoid double-applying
the same offset.
In `@tests/test_scanner.cpp`:
- Around line 1565-1584: The test leaks the VirtualAlloc'd region if
ASSERT_TRUE(::VirtualProtect(...)) aborts before VirtualFree is called; wrap the
allocation in an RAII cleanup (e.g., a small scope guard or std::unique_ptr with
a custom deleter) that calls ::VirtualFree(region, 0, MEM_RELEASE) so the region
is always freed, then use that RAII handle while calling ::VirtualProtect and
Scanner::resolve_rip_relative; reference the existing symbols region,
::VirtualProtect, ::VirtualFree and Scanner::resolve_rip_relative when making
the change.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: f7e42e8a-9ed2-4bc3-a3e4-9d2ad44a0b71
📒 Files selected for processing (6)
AGENTS.mdREADME.mddocs/misc/aob-signatures.mdinclude/DetourModKit/scanner.hppsrc/scanner.cpptests/test_scanner.cpp
✅ Files skipped from review due to trivial changes (1)
- README.md
🚧 Files skipped from review as they are similar to previous changes (1)
- src/scanner.cpp
Summary
find_pattern_rawso warnings log once per public call, not per region or per occurrence iteration.resolve_rip_relativetarget computation to unsigned modular arithmetic to avoid signed pointer overflow UB.docs/misc/aob-signatures.md:find_patternalready appliespattern.offsetinternally; prior docs instructed callers to add it again.README.md,AGENTS.md, andscanner.hpp.Test plan
tests/test_scanner.cppcovering offset-at-end, all-wildcard Nth occurrence, zero/empty/null guards, and RIP-relative wraparound.DMK_BUILD_TESTS=ON.Summary by CodeRabbit
Documentation
Bug Fixes
Tests