Skip to content

Commit bc1c7fa

Browse files
kalwaltclaude
andcommitted
doc: add §3.4 documenting the NONMAX_CHECK macro replacement
The C++ DoG_scale_invariant_detector.cpp uses an #define NONMAX_CHECK(OPERATOR, VALUE) preprocessor macro that expands to a chain of 26 short-circuit && comparisons. The macro is parameterized by an operator token (> or <) so a single macro covers both maxima and minima detection. Document how this is replaced in Rust: - enum NonMaxOp { Greater, Less } passed as a parameter - Three private #[inline] helpers (one per dimension pattern) build a cmp closure with `match op` and chain the 26 comparisons with && - Call site mirrors C++ `if (NONMAX_CHECK(>, value)) ... else if (NONMAX_CHECK(<, value)) ...` Also document the alternatives considered (trait-generic closure, macro_rules! macro, inline expansion) and why the enum-with-closure form was chosen. Renumbers the FFI shim section from 3.4 to 3.5; no content lost. refs #125 #128 Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
1 parent e7357a5 commit bc1c7fa

1 file changed

Lines changed: 28 additions & 1 deletion

File tree

docs/design/m8-3-dog-detector.md

Lines changed: 28 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -133,7 +133,34 @@ impl DoGScaleInvariantDetector {
133133
6. **Orientation assignment** (when `find_orientation == true`): precompute one gradient image per pyramid level; for each refined point query `OrientationAssignment::compute(...)`; emit one copy per dominant peak (zero peaks ⇒ keypoint dropped, matches C++).
134134
7. **Bucket pruning**: distribute keypoints across `num_buckets_x × num_buckets_y` fine-image spatial buckets; take best-by-|score| per bucket round-robin until `max_num_feature_points` reached.
135135

136-
### 3.4 FFI shim (`kpm_c_api.h/.cpp`)
136+
### 3.4 Replacing the C++ `NONMAX_CHECK` preprocessor macro
137+
138+
The C++ `extractFeatures` uses an `#define NONMAX_CHECK(OPERATOR, VALUE)` macro that expands to a chain of 26 short-circuit `&&` comparisons (9 from `im0`, 8 from `im1` excluding the center, 9 from `im2`). It is parameterized by an operator token (`>` or `<`) so a single macro covers both maxima and minima detection, and a single chain runs at full speed because each comparison is inlined by the preprocessor.
139+
140+
Rust doesn't let you pass operator tokens as macro parameters as cleanly, so the macro is replaced by **an `enum NonMaxOp { Greater, Less }` plus three private inlined helper functions**, one per dimension pattern:
141+
142+
| Function | C++ macro instance |
143+
|---|---|
144+
| `nonmax_same_octave(op, val, im0, im1, im2, w, row, col)` | SameOctave (all three `im*` same dims) |
145+
| `nonmax_fine_octave(op, val, im0, im1, im2, w, row, col, ds_x, ds_y)` | FineOctavePair (`im2` half-sized → 9 `bilinear_interpolate_f32` lookups) |
146+
| `nonmax_coarse_octave(op, val, im0, im1, im2, w, row, col, us_x, us_y)` | CoarseOctavePair (`im0` double-sized → 9 `bilinear_interpolate_f32` lookups) |
147+
148+
Each helper builds a `cmp` closure with a `match op { Greater => a > b, Less => a < b }` and chains the 26 `cmp(val, neighbor)` calls with `&&`. The call site mirrors the C++ `if/else if`:
149+
150+
```rust
151+
let extrema = nonmax_same_octave(NonMaxOp::Greater, value, d0, d1, d2, w, row, col)
152+
|| nonmax_same_octave(NonMaxOp::Less, value, d0, d1, d2, w, row, col);
153+
```
154+
155+
**Trade-offs considered:**
156+
157+
- **Trait-generic `fn nonmax<C: Fn(f32, f32) -> bool>(cmp: C, ...)`** — would force monomorphization twice per call site and bloat the call graph; rejected.
158+
- **`macro_rules!` macro** — would match the C++ structure most literally, but Rust hygiene requires explicit captures of every variable, making the macro definition harder to read than the function form; rejected.
159+
- **Inline expansion** of all 156 comparisons (26 × 2 operators × 3 patterns) — rejected as repetitive.
160+
161+
The enum-with-closure form is shorter than the alternatives and the `#[inline]` annotation gives the compiler the same opportunity to specialize as the C++ macro got at preprocess time. In optimized builds the `match op` is constant-folded per call site because the caller passes a literal `NonMaxOp::Greater` or `NonMaxOp::Less`.
162+
163+
### 3.5 FFI shim (`kpm_c_api.h/.cpp`)
137164

138165
```c
139166
int webarkit_cpp_dog_detect_count(

0 commit comments

Comments
 (0)