Commit 9e96459
splat3d/PR7: end-to-end demo + PLY loader + e2e integration test (PR 7)
Closes the splat3d sprint's "Definition of done" — the full PR 1-6
pipeline now runs end-to-end on the CPU with a real binary that takes
a .ply scene as input and produces image output.
## Shipped
### src/hpc/splat3d/ply.rs (~370 LoC, 4 unit tests)
Minimal Inria 3DGS PLY reader. Parses ASCII header up to `end_header`,
validates the canonical 62-property vertex layout (x/y/z, normals,
SH DC + 45 rest, opacity, scale × 3, quat × 4), reads the binary
little-endian body, applies the canonical activations inline
(sigmoid opacity, exp scale, normalize quat), and reorders SH into
the gaussian-major channel-major layout `sh_eval_deg3` expects.
Rejects ASCII bodies, big-endian, unexpected properties, and
truncated files with typed `PlyError` variants. No new top-level
deps — single-file hand-rolled binary parser.
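The inline activation step described above can be sketched as follows. This is a minimal illustration, not the code in `ply.rs`: the `RawParams` and `Activated` structs, the function names, the w-first quaternion order, and the identity fallback for a degenerate quaternion are all assumptions made for the example.

```rust
/// Raw per-gaussian values as stored in the PLY body, before activation.
/// (Hypothetical struct for illustration only.)
#[derive(Debug, Clone, Copy)]
pub struct RawParams {
    pub opacity_logit: f32,
    pub log_scale: [f32; 3],
    pub quat: [f32; 4], // assumed order: (w, x, y, z)
}

/// Values after the canonical 3DGS activations have been applied.
#[derive(Debug, Clone, Copy)]
pub struct Activated {
    pub opacity: f32,
    pub scale: [f32; 3],
    pub quat: [f32; 4],
}

/// Logistic sigmoid: maps an opacity logit into (0, 1).
pub fn sigmoid(x: f32) -> f32 {
    1.0 / (1.0 + (-x).exp())
}

/// Scales a quaternion to unit length.
/// Assumption: a zero or non-finite norm falls back to the identity rotation.
pub fn normalize_quat(q: [f32; 4]) -> [f32; 4] {
    let norm = q.iter().map(|c| c * c).sum::<f32>().sqrt();
    if norm > 0.0 && norm.is_finite() {
        [q[0] / norm, q[1] / norm, q[2] / norm, q[3] / norm]
    } else {
        [1.0, 0.0, 0.0, 0.0]
    }
}

/// Applies sigmoid to opacity, exp to each scale, and normalizes the quaternion.
pub fn activate(raw: RawParams) -> Activated {
    Activated {
        opacity: sigmoid(raw.opacity_logit),
        scale: raw.log_scale.map(f32::exp),
        quat: normalize_quat(raw.quat),
    }
}
```

Applying the activations at load time means downstream stages only ever see valid opacities, positive scales, and unit quaternions.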
### tests/splat3d_correctness.rs (5 e2e integration tests)
Walks the full PR 1-6 pipeline against a synthetic 1000-gaussian
cube scene (10×10×10 grid spanning [-2,2]³, colored by position via
SH DC term).
- `end_to_end_synthetic_cube_renders_without_panic` — pipeline
produces non-trivial pixel variance (>100 lit pixels, <50%
saturated) on a 256×256 render.
- `end_to_end_double_buffer_swap_preserves_consistency` — SplatRenderer
is ticked twice; `front_frame_id` advances 1, then 2, across both buffers.
- `end_to_end_camera_translation_changes_render` — two cameras at
different world positions produce different framebuffers (sum of squared differences, SSD > 1).
- `end_to_end_empty_scene_yields_pure_background` — zero gaussians ⇒
pixel-exact background fill.
- `end_to_end_three_consecutive_ticks_preserve_invariants` — 3 ticks,
frame_id monotonic 1/2/3, all pixels finite (no NaN bleed).
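A scene like the synthetic cube these tests use could be generated along the following lines. The sketch assumes the grid includes both endpoints of [-2, 2] on each axis and that color is recovered with the common 3DGS convention `rgb = 0.5 + SH_C0 * dc`. `CubePoint` and `synthetic_cube` are hypothetical names, not the test file's actual helpers.

```rust
/// Zeroth-order real SH basis constant, 1 / (2 * sqrt(pi)).
pub const SH_C0: f32 = 0.282_094_8;

/// One gaussian center plus its SH DC color term (illustrative struct).
#[derive(Debug, Clone, Copy)]
pub struct CubePoint {
    pub position: [f32; 3],
    pub sh_dc: [f32; 3],
}

/// Builds an n x n x n grid spanning [-half_extent, half_extent] on each axis,
/// colored by position.
pub fn synthetic_cube(n: usize, half_extent: f32) -> Vec<CubePoint> {
    assert!(n >= 2, "need at least two samples per axis");
    // Evenly spaced coordinate, endpoints included (an assumption).
    let coord = |i: usize| -half_extent + 2.0 * half_extent * (i as f32) / ((n - 1) as f32);
    let mut points = Vec::with_capacity(n * n * n);
    for ix in 0..n {
        for iy in 0..n {
            for iz in 0..n {
                let position = [coord(ix), coord(iy), coord(iz)];
                // Map each coordinate from [-h, h] to an RGB value in [0, 1].
                let rgb = position.map(|p| (p + half_extent) / (2.0 * half_extent));
                // Invert rgb = 0.5 + SH_C0 * dc to get the stored DC term.
                let sh_dc = rgb.map(|c| (c - 0.5) / SH_C0);
                points.push(CubePoint { position, sh_dc });
            }
        }
    }
    points
}
```

Calling `synthetic_cube(10, 2.0)` yields the 1000 gaussians of a 10×10×10 grid.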
### examples/splat3d_flex.rs (~200 LoC, runnable demo)
CLI binary that loads a `.ply` scene (or falls back to the synthetic
cube), bakes a circular camera path around the origin, renders N
frames, writes PPM output, reports p50/p95/p99 frame timing + fps.
PPM over PNG: the sprint's "no new top-level deps" invariant rules
out the flate2 / png crates. PPM is a short ASCII header (roughly
15 bytes at 256×256) plus raw RGB bytes, viewable in most image
tools, and `splat3d_flex.rs` documents the choice and the deferred
PNG follow-up option.
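For reference, a binary PPM (P6) writer needs only the standard library. The sketch below is an assumption about what such a writer can look like, not the code in `splat3d_flex.rs`; `write_ppm` is a hypothetical name and the buffer is assumed to be tightly packed 8-bit RGB.

```rust
use std::io::{self, Write};

/// Writes a binary PPM (P6): an ASCII header followed by raw RGB bytes.
/// `rgb` must contain exactly width * height * 3 bytes, row-major.
pub fn write_ppm<W: Write>(
    mut out: W,
    width: usize,
    height: usize,
    rgb: &[u8],
) -> io::Result<()> {
    if rgb.len() != width * height * 3 {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "rgb buffer must hold width * height * 3 bytes",
        ));
    }
    // Header: magic number, dimensions, maximum channel value.
    write!(out, "P6\n{} {}\n255\n", width, height)?;
    out.write_all(rgb)
}
```

Because the function is generic over `Write`, the same code can target a `File`, a `BufWriter`, or an in-memory `Vec<u8>` in tests.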
Smoke test (5 frames × 256² synthetic cube on AVX2-emulated build):
p50=133.63 ms, p95=146.57 ms, p99=146.57 ms, 7.5 fps
The 1080p × 500K-gaussian acceptance target awaits the Inria
bicycle .ply asset and a benchmarking-only session.
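The frame-timing percentiles can be computed in several ways. The sketch below uses the nearest-rank definition, which is an assumption: the demo's actual method is not stated here, and `percentile_nearest_rank` is a hypothetical name. Under nearest-rank, a 5-sample run places both p95 and p99 on the slowest frame, which would be consistent with the identical p95 and p99 values in the smoke test.

```rust
/// Nearest-rank percentile over a set of frame times in milliseconds.
/// Returns None for an empty sample set or a percentile outside 0..=100.
pub fn percentile_nearest_rank(samples_ms: &[f64], p: f64) -> Option<f64> {
    if samples_ms.is_empty() || !(0.0..=100.0).contains(&p) {
        return None;
    }
    let mut sorted = samples_ms.to_vec();
    // total_cmp gives a total order over f64, so sorting cannot panic.
    sorted.sort_by(|a, b| a.total_cmp(b));
    // Nearest rank: the smallest 1-based rank covering p percent of the samples.
    let rank = ((p / 100.0) * sorted.len() as f64).ceil() as usize;
    let index = rank.clamp(1, sorted.len()) - 1;
    Some(sorted[index])
}
```

With so few samples the tail percentiles carry little information, so a longer run is needed before p95 and p99 become meaningful.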
### benches/RESULTS.md (real measured numbers)
Baselined the four PR 1 microbenches under both default (AVX2-
emulated F32x16) and `target-cpu=native` (AVX-512F) builds. Honest
findings:
- `sandwich_simd_x16` on AVX-512 native: 1.83× over scalar loop
(below the spec's 10× aspiration; the AoS↔SoA transpose at 6
fields × 16 lanes dominates the inner-loop savings for this
microbench). Filed as TECH_DEBT for the performance sprint.
- `sandwich_simd_x16` on AVX2-emulated default: 0.17× (about 5.9× slower than scalar).
Documented as the polyfill's two-`__m256`-per-`F32x16` cost.
TECH_DEBT: add runtime tier dispatch so AVX2 builds prefer the
scalar loop, or restructure to take SoA inputs directly.
- `from_scale_quat`: 9 ns on AVX-512 native (the 3DGS canonical
Σ builder; GaussianBatch::covariance_x16 SIMD-batches it).
- `eig_smith_1961`: 126 ns (acos dominates; diagonal fast-path
bypasses the trig).
Also documented the per-PR follow-up bench rows, to be populated
once the rasterizer-driven full-pipeline bench lands.
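The runtime tier dispatch filed as TECH_DEBT above could take roughly the following shape. This is a sketch under stated assumptions, not a proposed patch: `SimdTier`, `Kernel`, `detect_tier`, and `pick_kernel` are hypothetical names, and the selection policy simply mirrors the measured results (wide path only where AVX-512F is available, scalar loop otherwise).

```rust
/// CPU capability tiers relevant to the F32x16 kernels (illustrative).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SimdTier {
    Avx512,
    Avx2,
    Scalar,
}

/// Which implementation to run for a batch (illustrative).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Kernel {
    WideX16,
    ScalarLoop,
}

/// Detects the best available tier at runtime using the standard library's
/// feature-detection macro. Non-x86 targets report Scalar.
pub fn detect_tier() -> SimdTier {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if std::arch::is_x86_feature_detected!("avx512f") {
            return SimdTier::Avx512;
        }
        if std::arch::is_x86_feature_detected!("avx2") {
            return SimdTier::Avx2;
        }
    }
    SimdTier::Scalar
}

/// Policy assumed from the benchmark notes: the 16-lane path only pays off
/// with native AVX-512, so every other tier uses the scalar loop.
pub fn pick_kernel(tier: SimdTier) -> Kernel {
    match tier {
        SimdTier::Avx512 => Kernel::WideX16,
        SimdTier::Avx2 | SimdTier::Scalar => Kernel::ScalarLoop,
    }
}
```

Detecting the tier once at startup and caching the chosen `Kernel` keeps the check out of the hot loop.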
## Sprint state (Definition of done)
- [x] 7 PRs merged to splat3d branch
- [x] `cargo test --features splat3d -p ndarray` green
(1859 prior tests + 90 splat3d lib tests + 5 e2e + 4 PLY = 1958)
- [x] `cargo bench --features splat3d` baselined in RESULTS.md
- [x] `cargo run --features splat3d --example splat3d_flex` runs
end-to-end (synthetic fallback OR a .ply scene)
- [x] No regression in existing ndarray benches
- [x] Pillar-7 probe certified in lance-graph jc (PR #403 + the
rotated-axisymmetric fix in claude/jc-pillar-7-eigvec-duplicate-fix-MAOO0)
## Deferred to follow-up sprint
- Inria bicycle .ply SSIM comparison vs reference CUDA (asset
download required; not in this remote container).
- 1080p × 500K real-data benchmark (same).
- PNG output via `image`/`png` crate (gated on the no-new-deps
invariant; PPM works for the v1 demo deliverable).
- Performance: AVX2-tier SIMD path optimization; tile-binner radix
sort; rayon-parallel rasterize_frame.
- Backward pass / training pipeline (separate sprint per the
sprint prompt's "After the sprint" section).
https://claude.ai/code/session_017GFLBnDy23AWBqvkbHHC411
parent 5ea62e0, commit 9e96459
6 files changed: 945 additions and 24 deletions
File tree
- benches
- examples
- src/hpc/splat3d
- tests