Commit 6ff6f48

Merge pull request #194 from AdaWorldAPI/claude/continue-ndarray-x0Oaw
refactor(substrate): graduate 5 modules from hpc/ to crate root (bitwise, heel_f64x8, distance, byte_scan, spatial_hash)
2 parents a0f8fb3 + 17d0eae commit 6ff6f48

13 files changed

Lines changed: 268 additions & 74 deletions

.claude/board/AGENT_LOG.md

Lines changed: 54 additions & 0 deletions
@@ -28,6 +28,60 @@
 ## Entries (append below; newest first)


+## 2026-05-21T16:00 — substrate-graduation batch 3 (opus 4.7)
+
+**Branch:** `claude/continue-ndarray-x0Oaw`
+**Continues:** PR #194 batch of 5 (`bitwise`/`heel_f64x8`/`distance`/`byte_scan`/`spatial_hash`) + #193 (`simd_caps`).
+**Verdict:** SHIP — `cargo check`, `cargo clippy --features approx,serde,rayon -- -D warnings`, doctest suite (15 graduated-module doctests pass), and unit tests (104 lib tests pass) all green.
+
+**Modules graduated (4):**
+
+| Module | Old path | New path | Internal hpc/ deps? |
+|---|---|---|---|
+| `aabb` | `src/hpc/aabb.rs` | `src/aabb.rs` | None — only `super::simd_caps` (now resolves via crate root) |
+| `nibble` | `src/hpc/nibble.rs` | `src/nibble.rs` | None — only `super::simd_caps` |
+| `palette_codec` | `src/hpc/palette_codec.rs` | `src/palette_codec.rs` | None — pure logic |
+| `property_mask` | `src/hpc/property_mask.rs` | `src/property_mask.rs` | None — only `super::simd_caps` |
+
+**Why these four, why now (criteria carried over from #194 wrap-up):**
+1. No internal `hpc/` dependencies. All four only reach into `crate::simd::*` (the polyfill surface) and `super::simd_caps` (itself at crate root post-#192).
+2. Already polyfill-clean — no raw-intrinsic refactor required before the move.
+3. Single in-tree downstream caller (`hpc::framebuffer` imports `palette_codec`) → the `pub use crate::palette_codec;` back-compat shim in `hpc/mod.rs` keeps that resolution working zero-touch.
+
+**Changes:**
+- `git mv src/hpc/{aabb,nibble,palette_codec,property_mask}.rs src/`
+- Added `pub mod {aabb, nibble, palette_codec, property_mask};` to `src/lib.rs` (with `# Example` rustdoc blocks per CLAUDE.md hard rule "all public APIs need /// doc comments with examples").
+- Replaced the four `pub mod` declarations in `src/hpc/mod.rs` with `pub use crate::{aabb, nibble, palette_codec, property_mask};` back-compat re-exports.
+
+**Lint follow-ups (graduated modules lose the `#![allow(clippy::all, …)]` umbrella that `hpc/mod.rs` carries):**
+
+17 clippy errors surfaced under `-D warnings`. All fixed at the canonical Rust idiom rather than re-applying the umbrella, per the #194 cleanup precedent (417131bc):
+
+- **`manual_div_ceil` (6 sites)**: `(n + d - 1) / d` → `n.div_ceil(d)` in `nibble.rs` (×2), `palette_codec.rs` (×3), `property_mask.rs` (×1).
+- **`needless_range_loop` (10 sites)**: `for i in start..vec.len() { vec[i] }` → `for x in &vec[start..]` or `for (i, &x) in iter().enumerate()` depending on whether the index is used. Sites in `aabb.rs` (×4), `nibble.rs` (×3), `palette_codec.rs` (×1), `property_mask.rs` (×2).
+- **`missing_docs` (4 sites)**: Added field doc comments on `pub struct Aabb { min, max }` and `pub struct Ray { origin, inv_dir }` — these were previously caught by the `hpc/mod.rs` umbrella's `#![allow(missing_docs)]`.
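
The two mechanical rewrites named in the lint follow-ups can be sketched in isolation as follows. This is a minimal illustration only: every function name here (`words_needed`, `tail_sum`, `positions_of`, and their "old idiom" counterparts) is hypothetical and does not come from the graduated modules. The only real API used is the standard-library `usize::div_ceil`.

```rust
/// Hypothetical helper, old idiom: 64-bit words needed to hold `bits` bits.
fn words_needed_manual(bits: usize) -> usize {
    (bits + 64 - 1) / 64
}

/// Same computation using the standard-library `div_ceil`.
fn words_needed(bits: usize) -> usize {
    bits.div_ceil(64)
}

/// Old idiom: index-driven loop over the tail of a slice.
fn tail_sum_indexed(values: &[u32], start: usize) -> u32 {
    let mut sum = 0;
    for i in start..values.len() {
        sum += values[i];
    }
    sum
}

/// Rewritten form when the index itself is not needed.
fn tail_sum(values: &[u32], start: usize) -> u32 {
    let mut sum = 0;
    for &v in &values[start..] {
        sum += v;
    }
    sum
}

/// Rewritten form when the absolute index is still needed:
/// enumerate the sub-slice and add the starting offset back.
fn positions_of(haystack: &[u8], needle: u8, start: usize) -> Vec<usize> {
    let mut out = Vec::new();
    for (offset, &byte) in haystack[start..].iter().enumerate() {
        if byte == needle {
            out.push(start + offset);
        }
    }
    out
}

fn main() {
    assert_eq!(words_needed(65), words_needed_manual(65));
    assert_eq!(tail_sum(&[1, 2, 3, 4], 2), tail_sum_indexed(&[1, 2, 3, 4], 2));
    println!("{:?}", positions_of(b"abcabc", b'c', 3));
}
```

One behavioural difference is worth keeping in mind when applying this rewrite: `&values[start..]` panics if `start > values.len()`, whereas the range loop `start..values.len()` simply runs zero times. The rewrite is only a drop-in replacement where `start` is known not to exceed the length, as with a `chunks * 16` tail offset.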
+
+**Doctest fix:** Initial `bits_for_palette_size(1) → 1` in the `lib.rs` `# Example` block was wrong — the actual impl returns 0 for `palette_size <= 1` (trivial-palette special case; the bits/indices table in `palette_codec.rs`'s module docstring overpromises). Changed example to `bits_for_palette_size(2) → 1`.
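
To make the special case concrete, here is a hedged sketch of a bit-width function that matches the two data points stated in the log entry (0 bits for `palette_size <= 1`, 1 bit for a palette of 2). The name `bits_for_palette_size_sketch`, the `u32` return type, and the ceil-log2 rule for larger palettes are all assumptions; the real `palette_codec::bits_for_palette_size` may use different types or clamp to particular widths.

```rust
/// Hypothetical stand-in, NOT the crate's implementation.
/// Assumes the width is ceil(log2(palette_size)), with the
/// trivial-palette special case returning 0.
fn bits_for_palette_size_sketch(palette_size: usize) -> u32 {
    if palette_size <= 1 {
        // A palette with zero or one entry needs no index bits at all.
        return 0;
    }
    // Number of bits required to represent the largest index, palette_size - 1.
    usize::BITS - (palette_size - 1).leading_zeros()
}

fn main() {
    for n in [0usize, 1, 2, 3, 4, 5, 16, 17] {
        println!("{n} entries -> {} bits", bits_for_palette_size_sketch(n));
    }
}
```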
+
+**Verification:**
+
+```
+cargo check --lib → clean
+cargo clippy --lib -- -D warnings → clean
+cargo clippy --lib --features rayon -- -D warnings → clean
+cargo clippy --features approx,serde,rayon -- -D warnings → clean
+cargo test --doc (filtered: graduated modules) → 15 doctests pass
+cargo test --lib aabb::tests nibble::tests palette_codec::tests property_mask::tests → 104 unit tests pass
+```
+
+**No back-compat break:** every existing `use ndarray::hpc::{aabb, nibble, palette_codec, property_mask}::*` continues to resolve via the `pub use crate::*` shims in `hpc/mod.rs`. Verified via `cargo check` of the full workspace — `framebuffer.rs:29` (the one in-tree downstream consumer of `palette_codec`) compiles unchanged.
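
The back-compat shim pattern can be shown with a self-contained, single-file sketch. The module names mirror the ones in this commit, but the module body and the function `codec_name` are hypothetical placeholders. The sketch writes `pub use super::palette_codec;` only so that it compiles wherever it is pasted; the commit itself uses `pub use crate::palette_codec;`.

```rust
// Stand-in for a module that now lives at the crate root.
pub mod palette_codec {
    /// Hypothetical placeholder so there is something to call.
    pub fn codec_name() -> &'static str {
        "palette_codec"
    }
}

// Stand-in for the old parent module that used to own `palette_codec`.
pub mod hpc {
    // Re-export instead of `pub mod palette_codec;`. Old paths of the form
    // `hpc::palette_codec::...` keep resolving to the same items.
    pub use super::palette_codec;
}

fn main() {
    // New path and old path name the same function.
    assert_eq!(palette_codec::codec_name(), hpc::palette_codec::codec_name());
    println!("both paths resolve to {}", hpc::palette_codec::codec_name());
}
```

Because the re-export aliases the module rather than copying it, types reached through either path are identical, so callers that mix old and new paths should not see type mismatches.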
+
+**Remaining hpc/ inventory after this batch:** ~55 → ~51 modules at crate root path `crate::hpc::*`. Next-batch candidates (still low-hanging by the same criteria) — to be audited in a separate pass before move: `framebuffer` (depends on `palette_codec` shim, otherwise pure crate-root), `ocr_simd`/`ocr_felt` (need dep audit), `audio` (depends on `crate::simd`).
+
+**Commit:** TBD (pending push).
+
+---
+
 ## 2026-05-13T00:00 — agent #3 polyfill-ops (sonnet)

 **File:** `src/simd_ops.rs` (288 lines)

src/hpc/aabb.rs renamed to src/aabb.rs

Lines changed: 11 additions & 8 deletions
@@ -17,7 +17,9 @@
 #[derive(Debug, Clone, Copy, PartialEq)]
 #[repr(C)]
 pub struct Aabb {
+    /// Minimum corner of the bounding box (x, y, z).
     pub min: [f32; 3],
+    /// Maximum corner of the bounding box (x, y, z).
     pub max: [f32; 3],
 }

@@ -97,7 +99,10 @@ impl Aabb {
 #[derive(Debug, Clone, Copy, PartialEq)]
 #[repr(C)]
 pub struct Ray {
+    /// Ray origin point (x, y, z).
     pub origin: [f32; 3],
+    /// Per-axis reciprocal of the ray direction (1 / dx, 1 / dy, 1 / dz);
+    /// `inf` is valid (encodes a zero-component direction, slab test skips it).
     pub inv_dir: [f32; 3],
 }

@@ -122,8 +127,7 @@ impl Ray {
 #[inline]
 fn sq_dist_point_aabb(point: [f32; 3], aabb: &Aabb) -> f32 {
     let mut dist_sq = 0.0f32;
-    for axis in 0..3 {
-        let v = point[axis];
+    for (axis, &v) in point.iter().enumerate() {
         if v < aabb.min[axis] {
             let d = aabb.min[axis] - v;
             dist_sq += d * d;
@@ -230,8 +234,8 @@ unsafe fn aabb_intersect_batch_avx512(query: &Aabb, candidates: &[Aabb]) -> Vec<
     }

     // Scalar tail
-    for i in (chunks * 16)..candidates.len() {
-        result.push(query.intersects(&candidates[i]));
+    for cand in &candidates[chunks * 16..] {
+        result.push(query.intersects(cand));
     }

     result
@@ -403,16 +407,15 @@ unsafe fn ray_aabb_slab_test_avx512(ray: &Ray, aabbs: &[Aabb]) -> (Vec<bool>, Ve
         let t_enter_clamped = t_enter.simd_max(zero);
         let t_arr = t_enter_clamped.to_array();

-        for i in 0..16 {
+        for (i, &t) in t_arr.iter().enumerate() {
             let hit = (hit_mask >> i) & 1 != 0;
             hits.push(hit);
-            t_values.push(if hit { t_arr[i] } else { f32::MAX });
+            t_values.push(if hit { t } else { f32::MAX });
         }
     }

     // Scalar tail for remainder
-    for i in (chunks * 16)..aabbs.len() {
-        let aabb = &aabbs[i];
+    for aabb in &aabbs[chunks * 16..] {
         let mut t_enter = f32::NEG_INFINITY;
         let mut t_exit = f32::INFINITY;
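
The doc comment on `Ray::inv_dir` states that `inf` is a valid entry for a zero direction component. A scalar slab test, sketched below under stated assumptions, shows why that works. The function `slab_hit` and its flat-array signature are hypothetical and are not the crate's API; only the `t_enter`/`t_exit` initial values and the clamp of the entry distance to zero are taken from the hunks above.

```rust
/// Hypothetical scalar slab test, not the crate's implementation.
/// Returns the entry distance (clamped to 0) on a hit, `None` on a miss.
fn slab_hit(origin: [f32; 3], inv_dir: [f32; 3], min: [f32; 3], max: [f32; 3]) -> Option<f32> {
    let mut t_enter = f32::NEG_INFINITY;
    let mut t_exit = f32::INFINITY;
    for (axis, &inv) in inv_dir.iter().enumerate() {
        // Distances along the ray to the two planes bounding this axis.
        let t0 = (min[axis] - origin[axis]) * inv;
        let t1 = (max[axis] - origin[axis]) * inv;
        // With inv == inf and the origin strictly inside the slab, t0 is -inf
        // and t1 is +inf, so this axis leaves the interval unchanged.
        t_enter = t_enter.max(t0.min(t1));
        t_exit = t_exit.min(t0.max(t1));
    }
    let t_enter = t_enter.max(0.0);
    if t_enter <= t_exit {
        Some(t_enter)
    } else {
        None
    }
}

fn main() {
    let inf = f32::INFINITY;
    // Ray travelling along +x only: direction (1, 0, 0), so inv_dir = (1, inf, inf).
    let hit = slab_hit([-1.0, 0.5, 0.5], [1.0, inf, inf], [0.0; 3], [1.0; 3]);
    let miss = slab_hit([-1.0, 2.0, 0.5], [1.0, inf, inf], [0.0; 3], [1.0; 3]);
    println!("hit = {hit:?}, miss = {miss:?}");
}
```

In this sketch the infinite axis is not skipped by an explicit branch; it drops out arithmetically. An origin lying exactly on a slab plane of such an axis produces `0 * inf = NaN`, which the sketch does not handle specially, so how the real kernels treat that edge case should be checked in `aabb.rs` itself.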
Lines changed: 2 additions & 2 deletions
@@ -107,7 +107,7 @@ unsafe fn hamming_avx512bw(a: &[u8], b: &[u8]) -> u64 {
             let hi = xor.shr_epi16(4) & low_mask;
             let popcnt_lo = lookup.shuffle_bytes(lo);
             let popcnt_hi = lookup.shuffle_bytes(hi);
-            acc = acc + (popcnt_lo + popcnt_hi);
+            acc += popcnt_lo + popcnt_hi;

             i += 64;
             inner_count += 1;
@@ -152,7 +152,7 @@ unsafe fn popcount_avx512bw(a: &[u8]) -> u64 {
             let hi = va.shr_epi16(4) & low_mask;
             let popcnt_lo = lookup.shuffle_bytes(lo);
             let popcnt_hi = lookup.shuffle_bytes(hi);
-            acc = acc + (popcnt_lo + popcnt_hi);
+            acc += popcnt_lo + popcnt_hi;

             i += 64;
             inner_count += 1;
Lines changed: 10 additions & 10 deletions
@@ -35,9 +35,9 @@ pub(crate) mod simd_impl {
             i += 32;
         }
         // Scalar tail
-        for j in i..n {
-            if haystack[j] == needle {
-                result.push(j);
+        for (offset, &byte) in haystack[i..n].iter().enumerate() {
+            if byte == needle {
+                result.push(i + offset);
             }
         }
         result
@@ -68,9 +68,9 @@ pub(crate) mod simd_impl {
             i += 64;
         }
         // Scalar tail
-        for j in i..n {
-            if haystack[j] == needle {
-                result.push(j);
+        for (offset, &byte) in haystack[i..n].iter().enumerate() {
+            if byte == needle {
+                result.push(i + offset);
             }
         }
         result
@@ -98,8 +98,8 @@ pub(crate) mod simd_impl {
             }
             i += 32;
         }
-        for j in i..n {
-            if haystack[j] == needle {
+        for &byte in &haystack[i..n] {
+            if byte == needle {
                 total += 1;
             }
         }
@@ -126,8 +126,8 @@ pub(crate) mod simd_impl {
             total += mask.count_ones() as usize;
             i += 64;
         }
-        for j in i..n {
-            if haystack[j] == needle {
+        for &byte in &haystack[i..n] {
+            if byte == needle {
                 total += 1;
             }
         }
Lines changed: 5 additions & 5 deletions
@@ -96,10 +96,10 @@ pub(crate) mod simd_impl {
         }

         // Scalar tail
-        for j in i..n {
-            let dx = query[0] - points[j][0];
-            let dy = query[1] - points[j][1];
-            let dz = query[2] - points[j][2];
+        for p in &points[i..n] {
+            let dx = query[0] - p[0];
+            let dy = query[1] - p[1];
+            let dz = query[2] - p[2];
             out.push(dx * dx + dy * dy + dz * dz);
         }
     }
@@ -211,7 +211,7 @@ pub fn l1_f64_simd(a: &[f64], b: &[f64]) -> f64 {
     for i in 0..chunks {
         let va = F64x8::from_slice(&a[i * 8..]);
         let vb = F64x8::from_slice(&b[i * 8..]);
-        acc = acc + (va - vb).abs();
+        acc += (va - vb).abs();
     }
     let mut sum = acc.reduce_sum();
     let offset = chunks * 8;
File renamed without changes.

src/hpc/linalg/mod.rs

Lines changed: 2 additions & 1 deletion
@@ -40,7 +40,8 @@
 //!
 //! - **No SIMD primitives** — use `crate::simd::{F32x16, …}` directly.
 //! - **No `#[target_feature]` annotations** — those live in `simd_avx512.rs`.
-//! - **No distance metrics** — those live in `crate::hpc::distance`.
+//! - **No distance metrics** — those live in `crate::distance` (graduated
+//! from `crate::hpc::distance`; back-compat re-export in `crate::hpc::*`).

 mod matrix;
 pub use matrix::{Mat2, Mat3, Mat4, MatN, Spd2, Spd3};

src/hpc/mod.rs

Lines changed: 19 additions & 19 deletions
@@ -27,7 +27,8 @@ pub mod reductions;
 pub mod statistics;
 pub mod activations;
 pub mod hdc;
-pub mod bitwise;
+// Bitwise SIMD primitives — graduated to crate root. Back-compat re-export.
+pub use crate::bitwise;
 pub mod projection;
 pub mod cogrecord;
 pub mod graph;
@@ -56,8 +57,8 @@ pub mod soa;
 pub mod node;
 #[allow(missing_docs)]
 pub mod cascade;
-#[allow(missing_docs)]
-pub mod heel_f64x8;
+// HEEL F64x8 distance kernels — graduated to crate root. Back-compat re-export.
+pub use crate::heel_f64x8;
 // AMX is an x86_64-only ISA (Intel Sapphire Rapids+); both modules use
 // `asm!` with `rcx`/`rax` register names that don't exist on other
 // architectures (rejected at parse time on s390x / aarch64 / wasm32).
@@ -169,22 +170,21 @@ pub mod parallel_search;
 // ZeckF64 progressive edge encoding + batch/top-k
 pub mod zeck;

-// SIMD-accelerated spatial / byte-scan / hash utilities
-pub mod distance;
-pub mod byte_scan;
-pub mod spatial_hash;
-
-// Variable-width palette index codec (Minecraft-style bit packing)
-#[allow(missing_docs)]
-pub mod palette_codec;
-
-// SIMD-accelerated HPC modules (block properties, nibble light data, AABB collision)
-#[allow(missing_docs)]
-pub mod property_mask;
-#[allow(missing_docs)]
-pub mod nibble;
-#[allow(missing_docs)]
-pub mod aabb;
+// SIMD-accelerated spatial / byte-scan / hash utilities — graduated to crate root.
+// Back-compat re-exports for existing `use ndarray::hpc::{distance,byte_scan,spatial_hash}::*`.
+pub use crate::byte_scan;
+pub use crate::distance;
+pub use crate::spatial_hash;
+
+// Variable-width palette index codec — graduated to crate root.
+// Back-compat re-export for existing `use ndarray::hpc::palette_codec::*`.
+pub use crate::palette_codec;
+
+// SIMD-accelerated HPC modules (block properties, nibble light data, AABB
+// collision) — all three graduated to crate root. Back-compat re-exports.
+pub use crate::aabb;
+pub use crate::nibble;
+pub use crate::property_mask;

 // Holographic phase-space operations (ported from rustynum-holo)
 #[allow(missing_docs)]
