Loosely-coupled VIO pipeline (Stage 0b + 0c): IMU preintegration BA residual, accel-bias estimation, rotation coupling, VI initialization #3

Open
rsasaki0109 wants to merge 59 commits into master from vio-integration

Summary

  • Promotes the 0b.{a,b,c,d} IMU scaffolding into a working loosely-coupled visual-inertial pipeline: `Tracking::predictVelocityFromImu()` + `reconcileVelocityWithVisual()` write IMU-derived world-frame velocity onto each Frame, and `populateKeyframeImuSpan()` attaches a frozen `ImuPreintegrationSpan` to every new KF for BA consumption.
  • Adds the local-BA residuals: `VelocityPreintegrationError` (Forster 9-DoF pos + vel + rot between consecutive KFs, with first-order accel-bias Jacobian), `VelocityDeltaPriorError` (loose fallback when a span is missing), plus per-KF velocity / accel-bias / gyro-bias parameter blocks shaped by `BiasAnchorError` and `BiasRandomWalkError`.
  • Adds Visual-Inertial Initialization (`src/tracking/visual_inertial_initializer.{h,cc}`): two-stage linear solve for gyro bias (capped at 0.05 rad/s to reject short-window runaway) + {scale, gravity, velocities}, then rescales/rotates the map and applies `applyGyroBiasCorrectionToSpans` so the BA rotation residual stays consistent with the new bias reference.
  • I/O: EuRoC `cam0 T_BS` extrinsic is now parsed (multi-line YAML supported) and threaded into Tracking. Gravity alignment during mono/depth init now transforms IMU-frame gravity into the camera frame before building the world-aligned pose — previously invisible on TUM (IMU ≈ camera) but catastrophic on EuRoC's ~90° offset.

Validated ATE (evo_ape --align --correct_scale --t_max_diff 0.05)

| Dataset | Visual only | +VIO (this PR) | Delta |
|---|---|---|---|
| MH_01_easy (3683 frames) | 3.44 m | 1.41 m | −59% |
| V1_01_easy (2912 frames) | 1.25 m | 1.31 m | +5% (best intermediate value) |

MH_01 rejects VI init (visual rotations too noisy for the 0.08 rad threshold) and benefits from the 9-DoF BA residual alone. V1_01 accepts VI init and picks up additional scale + gravity refinement.

Env knobs (all optional)

  • `SVSLAM_BA_VELOCITY_PRIOR_SIGMA_M` / `SVSLAM_BA_VELOCITY_PRIOR_VEL_SIGMA` — pos/vel sigmas for the preintegration + loose-delta residuals (defaults 0.3 m, 0.3 m/s). Swept on MH_01: σ=0.3 optimal.
  • `SVSLAM_BA_PREINT_ROT_SIGMA_RAD` — rotation residual sigma (default 0.05 rad ≈ 2.9°).
  • `SVSLAM_BA_BIAS_{ACCEL,GYRO}_{ANCHOR,RW}_SIGMA` — bias anchor / random-walk prior sigmas.
  • `SVSLAM_VIO_MIN_INIT_KEYFRAMES` — min KFs before VI init attempts (default 15).
  • `SVSLAM_VIO_GATE_PREINT` — restore the original "gate preint until VI init" behavior (default: preint always on — safer on sequences where VI init rejects).
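
As a rough illustration of how an optional sigma knob like the ones above could be read, here is a minimal sketch. The helper name `envSigma` and its validation rules are assumptions for illustration, not the project's actual parsing code; only the variable name and the 0.05 rad default come from the list above.

```cpp
#include <cassert>
#include <cmath>
#include <cstdlib>

// Hypothetical helper (not from the PR): read a positive, finite sigma from
// the environment and fall back to the built-in default otherwise.
double envSigma(const char* name, double fallback) {
  assert(name != nullptr && fallback > 0.0);
  const char* raw = std::getenv(name);
  if (raw == nullptr || *raw == '\0') return fallback;  // unset or empty
  char* end = nullptr;
  const double value = std::strtod(raw, &end);
  // Reject partial parses such as "0.3abc", plus NaN, inf, and values <= 0.
  const bool fully_parsed = (end != raw) && (*end == '\0');
  if (!fully_parsed || !std::isfinite(value) || value <= 0.0) return fallback;
  return value;
}
```

A call such as `envSigma("SVSLAM_BA_PREINT_ROT_SIGMA_RAD", 0.05)` would then return 0.05 unless the variable holds a valid override. Falling back silently on bad input is one design choice; logging a warning may be preferable in practice.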

Test plan

  • 77/77 unit tests pass (10 new tests across OptimizerTest, EurocDatasetTest, VisualInertialInitializerTest).
  • All 7 TUM regression gates pass with bitwise-reproducible trajectories (VIO path is dormant without IMU).
  • EuRoC MH_01_easy full sequence: 1.41 m mean ATE (down from 3.44 m visual-only).
  • EuRoC V1_01_easy full sequence: 1.31 m mean ATE.
  • CI green.

Not in this PR

  • ORB-SLAM3-style MAP refinement of the VI init (larger window + full ceres over scale/gravity/biases/velocities). Tried a sketch; current linear solve + BA bias blocks already cover most of the gain.
  • Gyro-bias first-order Jacobian inside the BA rotation residual. Tried and reverted — without a reliable VI init the Jacobian lets BA absorb visual rotation noise into the bias, regressing MH_01 by 2–3×.
  • EuRoC stereo depth has a pre-existing rectification issue (stereo-only visual is ~2.77 m on MH_01) unrelated to the VIO work.

Loop closing: add second-loop overlap decay to downweight constraints
whose KF ranges overlap with existing loops. Relocalization now
prefers local candidates (2.5m radius) during pending loop correction
and recovery window (4.0m radius), preventing far-field relocalization
chains. applyPendingLoopCorrection returns bool for control flow.

stella_vslam comparison: fair head-250 evaluation on 4 TUM scenarios
(xyz_depth/mono, room_depth/mono) with identical evo_ape flags.
Results in eval/stella_comparison_results.md and .json.

Tests: 48 -> 51 (relocalization candidate policy, overlap decay).
Regression gates: 5/5 pass, bitwise determinism preserved.
…son.

Tracking: cap depth-seeded landmarks to best 600 per keyframe ranked
by octave/response/depth, reducing low-value map growth. xyz_depth
ATE improved 0.01137 -> 0.01090 (regression gates 5/5 pass).
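
The per-keyframe cap can be pictured as a partial sort over seed candidates. The sketch below is only illustrative: the `SeedCandidate` struct and the exact ordering (finer octave first, then stronger response, then nearer depth) are assumptions, since the commit only names the three ranking keys.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <tuple>
#include <vector>

// Hypothetical candidate record; field names are illustrative only.
struct SeedCandidate {
  int octave;      // pyramid level of the keypoint (0 = finest)
  float response;  // detector response
  float depth_m;   // measured depth in meters
};

// Assumed ordering: finer octave, then stronger response, then nearer depth.
inline bool betterSeed(const SeedCandidate& a, const SeedCandidate& b) {
  return std::make_tuple(a.octave, -a.response, a.depth_m) <
         std::make_tuple(b.octave, -b.response, b.depth_m);
}

// Keep at most `cap` candidates, best first, and drop the rest.
void capSeeds(std::vector<SeedCandidate>& seeds, std::size_t cap) {
  assert(cap > 0);
  const std::size_t keep = std::min(cap, seeds.size());
  const auto middle = seeds.begin() + static_cast<std::ptrdiff_t>(keep);
  std::partial_sort(seeds.begin(), middle, seeds.end(), betterSeed);
  seeds.erase(middle, seeds.end());
}
```

With `cap = 600` this would bound the number of depth-seeded landmarks per keyframe. `std::partial_sort` avoids fully sorting candidates that are discarded anyway.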

Metric depth: add MetricDepthEstimator for metric ONNX models
(Metric3D v2, UniDepth). New --metric-depth-model CLI flag.
Resolves tensor shapes from model metadata, skips relative-depth
rescaling. 55 tests pass with USE_DEPTH_DL=ON.

stella comparison: add loop-enabled head-250 results (4 scenarios
x 3 runs each) and 600-frame room_depth validation (3 runs, median
0.617m). Loop closing narrows room_depth gap slightly but does not
yet close it.

Initializer: select decomposition solution by median parallax instead
of just point count. Filters degenerate low-parallax solutions that
produce noisy initial maps. xyz_mono ATE 0.048 -> 0.036 (-26%).
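
A minimal sketch of median-parallax selection, under stated assumptions: each candidate solution is represented only by the parallax angles (in degrees) of its triangulated points, and the minimum-median and minimum-point thresholds are hypothetical parameters, not values taken from the commit.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Median of a non-empty sample; takes a copy so the caller's order is kept.
double medianOf(std::vector<double> values) {
  assert(!values.empty());
  const auto mid =
      values.begin() + static_cast<std::ptrdiff_t>(values.size() / 2);
  std::nth_element(values.begin(), mid, values.end());
  if (values.size() % 2 == 1) return *mid;
  // Even count: average the two central order statistics.
  const double lower = *std::max_element(values.begin(), mid);
  return 0.5 * (lower + *mid);
}

// Returns the index of the solution with the largest median parallax, or -1
// if none has at least `min_points` points and a median >= `min_median_deg`.
int selectByMedianParallax(
    const std::vector<std::vector<double>>& parallax_deg,
    double min_median_deg, std::size_t min_points) {
  int best = -1;
  double best_median = 0.0;
  for (std::size_t i = 0; i < parallax_deg.size(); ++i) {
    const std::vector<double>& sample = parallax_deg[i];
    if (sample.empty() || sample.size() < min_points) continue;
    const double median = medianOf(sample);
    if (median < min_median_deg) continue;  // degenerate, low parallax
    if (best < 0 || median > best_median) {
      best = static_cast<int>(i);
      best_median = median;
    }
  }
  return best;
}
```

Note how a solution with many points but a tiny median parallax loses to a smaller solution with healthy parallax, which is the failure mode a pure point-count rule cannot detect.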

Loop closing: increase cooldown from 120 to 200 KF, reducing
over-correction on long runs. 600-frame room_depth: 0.124m and
0.109m (both < 0.20m, previously median 0.617m).

README: add stella_vslam comparison tables (repro-eval and
loop-enabled), update accuracy numbers, document --metric-depth-model,
update test count to 51/55.

Regression gates: 5/5 pass, bitwise determinism preserved.
Baselines: xyz_depth 0.020->0.016, xyz_mono 0.040->0.030,
room_depth 0.200->0.165, room_depth_accel 0.170->0.145.
All gates pass with ~20-30% headroom.

plan.md: update Section 1.1 to 2026-04-13 state (4 commits,
55 tests, mono improvement, 600-frame stabilization, metric depth,
stella comparison). Add Section 1.8 with room gap closing design:
Ceres sparse solver, BA window expansion, covisibility weighting.

Tested all BA window/iteration combinations: 15/10 (baseline 0.0731), 20/15 (0.0937),
20/10 (0.0769), 15/15 (0.0695). The extra iterations without
widening the KF window gave the best room_depth result.
Regression gates 5/5 pass, bitwise determinism preserved.

Pose graph: weight covisibility edges by sqrt(shared_landmarks /
max_shared_landmarks). Strongest edge = 1.0, weaker edges get
proportionally lower weight. Loop constraint edges unchanged.
room_depth ATE improved 0.1289 -> 0.0695 (-46%).
Regression gates 5/5 pass.
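
The weighting rule above is small enough to write out directly. This is a sketch of the stated formula only; the function name is hypothetical and how the weight enters the pose-graph cost is not shown.

```cpp
#include <cassert>
#include <cmath>

// Weight for a covisibility edge: sqrt(shared / max_shared), so the
// strongest edge gets 1.0 and weaker edges get proportionally less.
double covisibilityEdgeWeight(int shared_landmarks, int max_shared_landmarks) {
  assert(max_shared_landmarks > 0);
  assert(shared_landmarks >= 0 && shared_landmarks <= max_shared_landmarks);
  return std::sqrt(static_cast<double>(shared_landmarks) /
                   static_cast<double>(max_shared_landmarks));
}
```

For example, an edge sharing 25 landmarks when the strongest edge shares 100 gets weight 0.5. The square root compresses the range, so weak edges are downweighted less aggressively than a linear ratio would.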

Metric depth test: relative depth pipeline works (ATE 0.00465).
Metric depth pipeline crashes with std::length_error on the
current model - tensor shape handling needs fixing.
Results in eval/metric_depth_test_results.md.

Handle symbolic/dynamic ONNX tensor dimensions (-1) in output shape
resolution by falling back to input image size. Allow depth output
candidates with dynamic shapes if their name matches (e.g.
predicted_depth). Add non-tensor input/output type guards.
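
One way to picture the fallback, assuming the depth output's last two dimensions are height and width (an assumption, as is every name below; no ONNX Runtime calls are used here):

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical result type for the resolved depth-map size.
struct DepthMapSize {
  std::int64_t height;
  std::int64_t width;
};

// Resolve the output H x W from a reported tensor shape. Dynamic or symbolic
// dimensions are reported as -1; for those, fall back to the input image size.
DepthMapSize resolveDepthOutputSize(const std::vector<std::int64_t>& shape,
                                    std::int64_t input_h,
                                    std::int64_t input_w) {
  assert(input_h > 0 && input_w > 0);
  DepthMapSize out{input_h, input_w};
  if (shape.size() >= 2) {
    const std::int64_t h = shape[shape.size() - 2];
    const std::int64_t w = shape[shape.size() - 1];
    if (h > 0) out.height = h;  // keep static dimensions as reported
    if (w > 0) out.width = w;
  }
  return out;
}
```

The point of resolving sizes up front is that a negative dimension never reaches a buffer allocation or a container size computation.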

Tested with Depth Anything V2 small: --metric-depth-model now runs
successfully (xyz ATE 0.00913m, 50 frames). 55/55 tests pass.
…e plan.

Tracking: tighten Lowe ratio 0.75->0.70 for reference keyframe matching.
Local mapping: mono landmarks culled faster if < 3 observations by next KF,
mono triangulation rejects baselines < 0.02m (was 0.01m for all modes).
room_mono ATE 0.269 -> 0.223 (-17%). Regression gates 5/5 pass.

Verification: covisibility-weighted pose graph with loop closing gives
room_depth median 0.086m (neutral vs 0.083m without weighting).
Metric depth on room: 50f ATE 0.016m, 250f ATE 0.384m (no loop correction).

plan.md: update to reflect all 8 commits from 2026-04-13.

EuRoC: add --euroc-camera-config <json> for external calibration,
stereo image pair loading (cam0+cam1), --stereo flag, undistortion
support. Example config: config/examples/euroc_mh01.json.
New test_euroc_dataset.cc (53 tests total).

CI: add synthetic test data generator (10-frame TUM-style dataset),
smoke regression step in ci.yml (continue-on-error). Verifies
trajectory production and loose ATE ceiling (0.5m).

Metric depth: verified with true indoor metric ONNX model.
xyz 50f: 0.011m, xyz 250f: 0.065m, room 50f: 0.017m.
Results in eval/metric_depth_test_results.md.

Stereo: implement StereoDepthEstimator using OpenCV StereoSGBM.
Computes metric depth from stereo disparity (depth = baseline * fx /
disparity). Integrated with --euroc --stereo pipeline. Filters
invalid depth outside 0.1-20m range. Tests: 55/55 pass.
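
The disparity-to-depth relation and the range filter can be sketched per pixel as follows. The function name and the NaN-for-invalid convention are assumptions. The disparity is taken to be already in pixels (OpenCV's StereoSGBM returns fixed-point disparities with 4 fractional bits, so raw values need dividing by 16 first).

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Metric depth from stereo disparity: depth = baseline * fx / disparity.
// Returns NaN for non-positive disparity or depth outside [min, max] meters.
double disparityToDepth(double disparity_px, double fx_px, double baseline_m,
                        double min_depth_m = 0.1, double max_depth_m = 20.0) {
  assert(fx_px > 0.0 && baseline_m > 0.0);
  const double invalid = std::numeric_limits<double>::quiet_NaN();
  if (!(disparity_px > 0.0)) return invalid;  // also rejects NaN disparity
  const double depth_m = baseline_m * fx_px / disparity_px;
  if (depth_m < min_depth_m || depth_m > max_depth_m) return invalid;
  return depth_m;
}
```

With illustrative values `baseline_m = 0.11` and `fx_px = 400`, a 22-pixel disparity maps to 2 m, while very large disparities fall below the 0.1 m floor and are rejected.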

ROS2: add ros2/ package with ament_cmake build, slam_node.cc
subscribing to Image/CameraInfo/Depth and publishing
Odometry/Path/PointCloud2/TF. Launch file with configurable topics.

CI: fix CXSparse dependency (libsuitesparse-dev already present,
documented explicitly). Add py_compile for new scripts. Remove
continue-on-error from smoke regression.

Version: 0.1.0 -> 0.2.0. CHANGELOG and CITATION.cff updated.

ROS2: fix package naming, cv_bridge include, linker signature.
colcon build + node startup + launch file all verified.

EuRoC: mono and stereo pipelines verified with synthetic dataset
(MH01 download URL is dead). Stereo produced 4x more landmarks
than mono (1223 vs 277) with baseline 0.110m.

Optimizer: add IRLS reweighting for pose graph (2-pass with
Cauchy downweighting of high-residual edges), information matrix
scaling based on edge covisibility strength, and increased pose
graph robustness. Regression gates 5/5 pass.
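
For readers unfamiliar with IRLS, the second-pass reweighting can be sketched as below. This uses the standard Cauchy weight w(r) = 1 / (1 + (r/c)^2); the scale `c`, the function names, and the way base weights are combined are assumptions, not the optimizer's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Standard Cauchy IRLS weight: 1 at zero residual, decaying for outliers.
double cauchyWeight(double residual, double scale) {
  assert(scale > 0.0);
  const double u = residual / scale;
  return 1.0 / (1.0 + u * u);
}

// Second IRLS pass: scale each edge's base weight (e.g. its covisibility
// strength) by the Cauchy weight of its first-pass residual.
std::vector<double> reweightEdges(const std::vector<double>& first_pass_residuals,
                                  const std::vector<double>& base_weights,
                                  double scale) {
  assert(first_pass_residuals.size() == base_weights.size());
  std::vector<double> weights(base_weights.size());
  for (std::size_t i = 0; i < weights.size(); ++i) {
    weights[i] = base_weights[i] * cauchyWeight(first_pass_residuals[i], scale);
  }
  return weights;
}
```

An edge whose residual equals the scale keeps half its weight, and one at three times the scale keeps a tenth, so a few bad edges cannot dominate the second solve.
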
Stereo pipeline fully working: StereoSGBM disparity -> metric depth,
single-frame depth initialization (369 3D points), 80 frames tracked
with 27 KFs and 9134 landmarks. ETH EuRoC server is down so used
synthetic stereo pairs with simulated forward motion.

README: add compelling subtitle, feature highlights, copy-paste Quick
Start, Mermaid thread model diagram, supported modes table, cleaner
accuracy section with stella_vslam comparison. Shorter and more
scannable for GitHub visitors.

Demo images: trajectory + map visualization for xyz and room (1200x800),
groundtruth comparison overlay for xyz.

GitHub metadata: description, 13 topics (slam, visual-slam, ros2, etc.),
homepage set to GitHub Pages.

Full rewrite with 14 sections: vision, current state (v0.2.0),
commit history (16 commits), file inventory with line counts,
thread model, thread safety, constants, known issues, roadmap,
stella_vslam gap analysis with code-anchored improvement proposals,
priorities, agent instructions, build commands, non-goals.
1106 lines, self-contained for Codex/Claude/Cursor handoff.

The 0.70 ratio from d7b4657 helped room_mono but broke xyz_mono
(0.022 -> 0.041, exceeding the 0.030 ceiling). Reverting to 0.75
restores xyz_mono to 0.022. room_mono stays at 0.269 (within
ceiling 0.340). All 5 regression gates pass.

Depth prior sigma 0.02->0.015 (stronger trust in sensor depth).
Local BA covisibility window 15->20 KF, iterations 15->20.
xyz_depth ATE improved 0.01214 -> 0.01104 (-9%).
Still behind stella_vslam (0.00889) but closing the gap.
Regression gates 5/5 pass, bitwise determinism preserved.

Add Section 14: complete stella_vslam battle log with every parameter
change tried, measured ATE, and verdict (kept/reverted). Includes
cross-scenario effects, lessons learned, and ordered experiments
for the next agent to try. Update HEAD to ef8e04e.

- Tracking: mono min_frames_since_last_kf floor 4 (RGB-D stays 3); mono
  bootstrap visible-pool floor; BootstrapStats and RoomFocusTrace (frames
  100-125); relocalize candidate tie-break by temporal proximity; keyframe
  trace plumbing.
- Backend: skip sparse triangulation fallback in local mapping when
  appropriate; add sequential keyframe edges to pose graph when covis is
  sparse.
- Reference policy: late sparse mono refresh rule adjustments; tests.
- Tools: segment_ate_tum.py for segment ATE splits; plan and stella comparison
  metadata refreshed for reproducible gates (5/5, room_mono ~0.197 m).

Made-with: Cursor
…re descriptor dist

Validated with: python3 -u scripts/check_regression_gate.py --build build_codex --all-gates --quiet

Gates: room_depth_accel_head_repro=0.057702 sha=a4f036f7ca3059ea; room_depth_head_repro=0.079914 sha=ebdd323dbc378992; room_mono_head_repro=0.197177 sha=6583177e40532f6a; xyz_depth_head_repro=0.011042 sha=4b21294168165ee9; xyz_mono_head_repro=0.028136 sha=583e07bf2ba4ecaf.

ctest --test-dir build_codex --output-on-failure: 62/62 passed.
…nliers

Trace-only verification:

ctest --test-dir build_codex --output-on-failure: 62/62 passed.

python3 -u scripts/check_regression_gate.py --build build_codex --gate room_mono_head_repro --quiet: room_mono_head_repro=0.197177, two-run SHA prefix=6583177e40532f6a.

python3 -u scripts/check_regression_gate.py --build build_codex --gate xyz_mono_head_repro --quiet: xyz_mono_head_repro=0.028136, two-run SHA prefix=583e07bf2ba4ecaf.

Focused trace: ./build_codex/run_mono --tum data/tum/rgbd_dataset_freiburg1_room --max-frames 250 --repro-eval --no-viz --reference-policy heuristic > log/room_mono_trace_phaseB.log 2>&1.

Diagnosis appended to eval/room_mono_frame199_diag.txt; source bucket did not distinguish near-equal coarse-error candidates, so Phase C was skipped.
…localization

Cooldown: kPostRelocEmergencyKfCooldownFrames = 3.

Validation:

- cmake --build build_codex -j20

- ctest --test-dir build_codex --output-on-failure: 62/62 passed

- python3 -u scripts/check_regression_gate.py --build build_codex --gate xyz_mono_head_repro --quiet: 0.028136 m PASS, SHA 583e07bf2ba4ecaf

- python3 -u scripts/check_regression_gate.py --build build_codex --gate room_mono_head_repro --quiet: 0.176340 m PASS, SHA 70fff60ac7874535

- python3 -u scripts/check_regression_gate.py --build build_codex --gate room_depth_head_repro --quiet: 0.079914 m PASS

- python3 -u scripts/check_regression_gate.py --build build_codex --gate xyz_depth_head_repro --quiet: 0.011042 m PASS

- python3 -u scripts/check_regression_gate.py --build build_codex --gate room_depth_accel_head_repro --quiet: 0.057702 m PASS

Diagnostics:

- ./build_codex/run_mono --tum data/tum/rgbd_dataset_freiburg1_room --max-frames 250 --repro-eval --no-viz --reference-policy heuristic > log/room_mono_trace_phaseA2.log 2>&1

- KF insertions in frames 195-210: 5 baseline -> 4; cooldown stdout firings: 2.

The coarse-reprojection-error tiebreak inserted between the coarse_ok
flag and descriptor distance was documented (plan.md §8.3 step 7) as
a near-noise gain (0.197374 -> 0.197177). Confirmed empirically here:
after removal, room_mono drifts 0.176340 -> 0.176506 (+0.000166 m,
still well under the 0.340 ceiling). xyz_mono stays at 0.028136 with
identical SHA. All three depth gates keep their locked numbers and
SHAs byte-for-byte.

Drop ~11 lines (constexpr + finite_coarse_err lambda + the err-compare
block inside the sort comparator) and revert the trace label to
coarse_dist_bucket_tie. The trace of coarse_err_px on each candidate
is still collected for diagnostics; only the ordering effect is gone.

Validated with:
  cmake --build build_codex -j$(nproc)
  ctest --test-dir build_codex --output-on-failure  # 62/62
  python3 -u scripts/check_regression_gate.py --build build_codex \
    --gate xyz_mono_head_repro --quiet              # 0.028136 PASS
  python3 -u scripts/check_regression_gate.py --build build_codex \
    --gate room_mono_head_repro --quiet             # 0.176506 PASS
  python3 -u scripts/check_regression_gate.py --build build_codex \
    --gate room_depth_head_repro --quiet            # 0.079914 PASS
  python3 -u scripts/check_regression_gate.py --build build_codex \
    --gate xyz_depth_head_repro --quiet             # 0.011042 PASS
  python3 -u scripts/check_regression_gate.py --build build_codex \
    --gate room_depth_accel_head_repro --quiet      # 0.057702 PASS

The experimental ScoreReferenceKeyframePolicy and PipelineReferenceKeyframePolicy
variants lived under src/experiments/reference_keyframe/ with a dedicated CMake
library, an experiments CLI (tools/reference_policy_experiments.cc), a 200-line
evaluation shell script, a 959-line doc-generator Python script, four doc pages
under docs/, and a scenario corpus under experiments/reference_keyframe/. None
of them were ever promoted to the runtime default, which stayed HeuristicReferenceKeyframePolicy.
plan.md §13 Non-goals explicitly flagged the reference-policy research track as
non-default.

Keep the ReferenceKeyframePolicy contract in src/core/ so a future experiment
can plug back in without a refactor, but drop everything that existed solely to
compare the three variants. Net deletion: ~1900 tracked lines across 17 files
plus ~100 lines of wiring updates elsewhere.

Behavioral check: cmake reconfigure + build clean; ctest 59/59 (was 62/62,
the three Score/Pipeline policy tests are gone); all five regression gates
pass with byte-identical trajectory SHAs and ATE numbers versus HEAD~1:

  room_depth_accel_head_repro: 0.057702 sha=a4f036f7ca3059ea
  room_depth_head_repro:       0.079914 sha=ebdd323dbc378992
  room_mono_head_repro:        0.176506 sha=60383555ba272c17
  xyz_depth_head_repro:        0.011042 sha=4b21294168165ee9
  xyz_mono_head_repro:         0.028136 sha=583e07bf2ba4ecaf

Validated with:
  cmake -S . -B build_codex -G Ninja -DBUILD_TESTS=ON -DUSE_DEPTH_DL=ON
  cmake --build build_codex -j$(nproc)
  ctest --test-dir build_codex --output-on-failure
  python3 -u scripts/check_regression_gate.py --build build_codex --all-gates --quiet

The per-frame FallbackInlier and FallbackSummary stdout lines (commit
3c2257f) were trace-only diagnostics added to inspect the late sparse
mono fallback. Their findings are already written up in
eval/room_mono_frame199_diag.txt ("Phase B trace-only fallback
diagnostics" and "Cascade diagnosis" sections), so the live plumbing
is no longer paying rent.

Delete:
- struct FallbackTraceMatch and FallbackSummaryTrace (fields + default
  values + the member vectors fallback_trace_matches /
  fallback_summary_trace)
- the trace_percentile lambda
- the two blocks inside trackLocalMap that populated and printed them

Kept: the candidate-level FallbackCandidateTrace top-N dump (still
useful when live-investigating a single frame) and the MatchCandidate
coarse_ok/coarse_err_px fields it reads from.

Net deletion is about 160 lines. Behavior is unchanged: ctest 59/59,
all five gates pass with byte-identical trajectory SHAs and ATEs
(room_mono 0.176506, xyz_mono 0.028136, room_depth 0.079914,
xyz_depth 0.011042, room_depth_accel 0.057702).

Validated with:
  cmake --build build_codex -j$(nproc)
  ctest --test-dir build_codex --output-on-failure
  python3 -u scripts/check_regression_gate.py --build build_codex --all-gates --quiet

Remove stdout traces that no external consumer reads and that were
added to help past investigations rather than the runtime pipeline:

- RoomFocusTrace: mono-only, freiburg1_room-specific frame-range
  (100..125) print. The dataset-specific scaffolding is exactly the
  kind of hack simple-like-Tesla cleanup targets.
- TrackLocalMap::LocalMapVisibility: per-bucket nf/depth/oob/vis/match
  counters. The headline "Visible: N, Matches: M" line right above it
  remains.
- FallbackCandidateTrace top-N dump: 20-line per-frame candidate table
  during late_sparse_mono_bootstrap.
- BootstrapStats: 20+ field key=value dump after the descriptor
  fallback path. The single "Fallback global matches: N" line right
  after is kept.

No member variables or control flow were touched; only stdout
plumbing is gone. Downstream PnP, pose filtering, and KF insertion
are unchanged.

Validated: build OK, ctest 59/59, all five regression gates pass
with byte-identical trajectory SHAs and ATEs (room_mono 0.176506,
xyz_mono 0.028136, room_depth 0.079914, xyz_depth 0.011042,
room_depth_accel 0.057702).

The BootstrapStats and FallbackCandidateTrace stdout blocks removed
in d2676a3 and f2d7111 were the only readers of several local
variables inside the trackLocalMap fallback. With no consumers left,
they only produced noise:

- Counters incremented but never read: fallback_two_nn,
  fallback_reject_distance, fallback_reject_ratio,
  fallback_reject_index, fallback_reject_used.
- Per-frame diagnostics with no remaining readers:
  visible_pool_before_fallback, bootstrap_added_pre_pose,
  bootstrap_added_post_pose, retried_relaxed_pose_filter.
- MatchCandidate fields that no branch reads anymore: ratio_margin,
  coarse_err_px. MatchCandidate now only carries what the sort and
  the PnP feed need: lm_idx, kp_idx, dist, octave, source_bucket,
  coarse_ok.
- bootstrap_coarse_ok_and_err lambda had an out-param err_px used
  only by the removed trace; rename to bootstrap_coarse_ok and drop
  the out-param.

No behavior change. All five gates pass with byte-identical
trajectory SHAs and ATEs versus HEAD~1 (room_mono 0.176506,
xyz_mono 0.028136, room_depth 0.079914, xyz_depth 0.011042,
room_depth_accel 0.057702). ctest still 59/59.

Both structs collected many counters that only the deleted trace
prints (LocalMapVisibility in f2d7111, BootstrapStats in 1af225d)
consumed. Keep just the fields that still affect behavior and remove
the rest of the increments.

PoseFilterStats: kept only focus_reject_reprojection, which is read
by the late sparse mono fallback retry decision.

LocalMapSourceStats: kept only pool_added (the one array that the
surviving TrackLocalMap::LocalMapSources line prints); dropped
rejected_nonfinite, rejected_depth, rejected_oob, visible, matched
and their increment sites. The short-circuit `continue` logic that
used to own those increments is unchanged.

No behavior change. All five gates pass with byte-identical
trajectory SHAs and ATEs (room_mono 0.176506, xyz_mono 0.028136,
room_depth 0.079914, xyz_depth 0.011042, room_depth_accel 0.057702).
ctest still 59/59.

The --keyframe-trace-csv CLI flag piped a CSV of needNewKeyframe()
inputs and decisions into a file. No script, test, doc, or other
consumer under src/, apps/, tests/, scripts/, eval/, or ros2/ reads
the output (only plan.md mentioned it in prose). When the flag is
unset, the nine traceKeyframeDecision() call sites early-return, so
removal has zero behavioral impact.

Delete across three files:
- apps/run_mono.cc: --keyframe-trace-csv help text, CLI parsing,
  keyframe_trace_csv_path / keyframe_trace_file locals, and the
  setKeyframeDecisionTraceSink call. The --reference-policy /
  --run-summary-json / --skip-frames / --max-frames parsing sibling
  arguments are untouched.
- src/tracking/tracking.cc: setKeyframeDecisionTraceSink and
  traceKeyframeDecision method bodies, and the nine call sites
  inside needNewKeyframe() that passed per-decision rows. The
  surrounding stdout lines (e.g. "Low tracked features", "Max
  frames reached") are kept - they are still useful.
- src/tracking/tracking.h: method declarations plus the two member
  variables keyframe_decision_trace_sink_ and
  keyframe_decision_trace_header_written_.

plan.md: drop the --keyframe-trace-csv CLI list entry and the
"`--keyframe-trace-csv` plumbing" bullet under §8.2.

Validated: ctest 59/59, all five gates pass with byte-identical
trajectory SHAs and ATEs (room_mono 0.176506, xyz_mono 0.028136,
room_depth 0.079914, xyz_depth 0.011042, room_depth_accel 0.057702).

A re-read of the Phase A2 trace shows reference-KF landmark propagation
collapsing 13→21→13→9→5→2→0 across frames 188-194 in fr1_room.
The V2 post-reloc cooldown defers a refresh KF at frame 193
(counter 1/3 after a successful reloc at 192), after which the
tracker goes lost and the pose freezes through frame 202. A V3 attempt
to skip ultra-sparse emergency KFs regressed room_mono 0.176→0.296
(reverted), confirming that isolated sparse KFs still act as needed
scaffolding. The real next lever is reference-KF switching when the
propagated-landmark budget drops, not the emergency-KF path.

plan.md:
- §8.3: add step 12 covering the d2676a3/f2d7111/1af225d/776fd05
  simplification commits (~2325 net deleted lines after the
  9fbf2de experimental-policy removal).
- §8.4: add two new ruled-out entries. Skipping ultra-sparse
  emergency KFs (V3) and lowering kMinTrackLocalMapInliers from 12
  to 10 both regressed room_mono ~65%; xyz_mono holds both times.
- §11: refresh the recent-commits table to include the simplification
  sweep.

eval/room_mono_frame199_diag.txt:
- Append a "Simplification pass and failed threshold attack" section
  summarizing the ~2325-line deletion pass (byte-identical SHAs
  across all five gates) and recording why kMinTrackLocalMapInliers
  12 is load-bearing (10-inlier PnP solutions poison
  current_frame_->landmarks_ and cascade). Note the consistent
  pattern: two independent speculative attacks (V3 ultra-sparse skip,
  this 12->10 threshold) regressed room_mono 65-68% each, so the next
  concrete lever is structural (reference-KF refresh under a dying
  propagation chain), not a threshold knob.

rsasaki0109 and others added 27 commits April 18, 2026 08:27

The BA depth-prior sigma was a single 0.015 m (15 mm) for anything
flagged depth_is_metric_, which was fine for TUM sensor RGB-D but
punished the ONNX indoor metric path: Depth-Anything-V2 metric-indoor
small on fr1_room @ 250 frames drove room ATE to 0.624 m (Sim3
aligned with evo_ape) - a 3.5x regression vs the mono-only baseline
of 0.176506 m. With the 1/sigma weight the BA was effectively
treating per-pixel network outputs as if they were millimeter-accurate
sensor returns, so network noise baked straight into the pose.

Add a separate depth_is_learned_ flag on Frame / Keyframe that
is only set when run_mono filled the depth image from dl_depth_estimator.
In Optimizer::addDepthPriorResiduals the sigma is now:

  - non-metric (relative DL): 0.2          (unchanged)
  - metric, learned (DL ONNX): 0.15        (new)
  - metric, sensor / stereo:   0.015       (unchanged)
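
The three branches above amount to a small selection function. The sketch below mirrors the listed values; the free-function form and names are hypothetical and do not reproduce `Optimizer::addDepthPriorResiduals` itself.

```cpp
#include <cassert>

// Depth-prior sigma per depth source, using the values listed above.
double depthPriorSigma(bool depth_is_metric, bool depth_is_learned) {
  if (!depth_is_metric) return 0.2;             // relative DL depth
  return depth_is_learned ? 0.15 : 0.015;       // DL ONNX vs sensor / stereo
}

// The residual is weighted by 1/sigma, so a larger sigma means less trust.
double depthPriorWeight(double sigma) {
  assert(sigma > 0.0);
  return 1.0 / sigma;
}
```

Under this scheme a learned metric depth sample carries one tenth the weight of a sensor depth sample (sigma 0.15 versus 0.015), which is the intended loosening for network outputs.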

Sweep on fr1_room mono @ 250 with indoor_small metric model
(evo_ape --align --correct_scale --t_max_diff 0.05):

  sigma=0.015 (old): 0.624 m
  sigma=0.15:        0.225 m
  sigma=0.5:         0.441 m

0.15 is the empirical sweet spot. Even the improved learned-depth run
at 0.225 is still worse than the mono-only gate of 0.176506 m, so the
metric path is kept out of the regression gates for now; this commit
is infrastructure that lets future DL-depth experiments pick a
reasonable starting trust level without silently abusing the sensor
sigma.

Behavior on all five gates is unchanged: sensor depth frames set
depth_is_learned_=false and still hit the 0.015 branch. ctest 59/59;
all five gates pass with byte-identical trajectory SHAs:

  room_depth_accel_head_repro 0.057702  a4f036f7ca3059ea
  room_depth_head_repro       0.079914  ebdd323dbc378992
  room_mono_head_repro        0.176506  60383555ba272c17
  xyz_depth_head_repro        0.011042  4b21294168165ee9
  xyz_mono_head_repro         0.028136  583e07bf2ba4ecaf

Re-ran all four fair-window presets with bash scripts/verify_comparison_benchmark.sh
BUILD=build_codex on HEAD df644d2. The stella_vslam head-250 baselines
are unchanged (provided artifacts).

Deltas vs the 2026-04-15 snapshot (commit 2ac7ffa):

  xyz_depth:  0.01104221 -> 0.01104221   unchanged
  xyz_mono:   0.02702567 -> 0.02813558   +4.1% (within run-to-run noise)
  room_depth: 0.07991444 -> 0.07991444   unchanged
  room_mono:  0.22049743 -> 0.17650551   -19.95%
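
The percentage deltas in the table above follow from a plain relative-change formula; a minimal sketch for re-deriving them (the helper name is hypothetical):

```cpp
#include <cassert>
#include <cmath>

// Relative change in percent: negative means the ATE went down.
double relativeChangePercent(double before, double after) {
  assert(before != 0.0);
  return 100.0 * (after - before) / before;
}
```

Plugging in the room_mono pair (0.22049743, 0.17650551) gives about -19.95%, and the xyz_mono pair (0.02702567, 0.02813558) gives about +4.1%, matching the listed values.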

room_mono narrowed from ~8x to ~6.4x of stella_vslam head-250 after
the post-relocalization emergency-KF cooldown (commit 0220ea7) landed.
xyz_depth and room_depth stay byte-identical because the cooldown
only fires when a successful relocalization is followed by a low
tracked-features frame, which does not happen inside those head-250
windows.

Update the Fair Head-250 comparison table to add an explicit "Ratio
vs stella" column (1.24x / 1.99x / 3.79x / 6.43x) so the remaining
gap is visible at a glance, and refresh the room_mono loss_hypothesis
in stella_comparison.json to note the drop.

Re-ran the 600-frame room_depth loop-enabled validation at HEAD
3d7b5f1 with the same command and evo_ape flags as the earlier
2026-04-14 snapshot:

  ./build_codex/run_mono --tum data/tum/rgbd_dataset_freiburg1_room \
    --depth --max-frames 600 --no-viz data/ORBvoc.txt
  evo_ape ... --align --correct_scale --t_max_diff 0.05

3-run results (m):

                 2026-04-14     2026-04-18
  rep1          0.13731232     0.125949
  rep2          0.61716193     0.146260
  rep3          0.87081246     0.110414
  median        0.617          0.126
  worst case    0.871          0.146

All three runs now land in the 0.11-0.15 m band. Versus the
pre-cooldown snapshot, the median dropped ~80% (0.617 -> 0.126), the
worst case ~83% (0.871 -> 0.146), and the run-to-run spread shrank
from 0.73 m to 0.04 m. A single rep is now roughly representative
instead of a lottery.

Keep the earlier snapshot in the same section as "pre-cooldown
(2026-04-14)" for provenance, with the inference about stabilization
rejections it captured at the time.

initializeWithDepth() already does a gravity alignment step when
--accel has filled accel_buffer_: it estimates gravity, builds
R_align via AccelerometerProcessor, rotates the initial frame, and
flips gravity_aligned_ to true. The two-frame mono init path never
did any of this, so gravity_aligned_ stayed false for any pure-mono
run. That in turn made setKeyframeGravity() early-exit on every KF
and GravityPriorError silently no-op in the BA -- the --accel flag
was on paper but dead in mono.

Add the same gravity alignment to the mono init success branch.
Because the initializer operates in c1-frame coordinates (independent
of world), we can apply the alignment after initialize() returns and
then rewrite poses and triangulated-point positions:

  T_c1_w_new   = T_align
  T_c2_w_new   = T_c2_c1 * T_align
  p_world_new  = T_align^{-1} * p_c1   (for every triangulated point)
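
As a consistency check on the three rewrites above (a worked derivation using only the symbols already defined, with T_c2_c1 mapping c1-frame points into the c2 frame), the camera-frame coordinates of every triangulated point are unchanged by the alignment, so reprojection residuals stay the same:

```latex
\begin{aligned}
T_{c_1 w}^{\text{new}} \, p_{w}^{\text{new}}
  &= T_{\text{align}} \, T_{\text{align}}^{-1} \, p_{c_1} = p_{c_1}, \\
T_{c_2 w}^{\text{new}} \, p_{w}^{\text{new}}
  &= T_{c_2 c_1} \, T_{\text{align}} \, T_{\text{align}}^{-1} \, p_{c_1}
   = T_{c_2 c_1} \, p_{c_1}.
\end{aligned}
```

Only the choice of world frame changes, which is why the alignment can be applied after initialize() returns without re-triangulating.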

setKeyframeGravity() is now called *after* gravity_aligned_ may flip
to true, so kf_init and kf_cur pick up gravity_in_camera_ on frames
where the accel window is close to stationary. The new keyframes'
T_cw_ is also synced to the updated frame poses.

When --accel is not used (the common case, including all five current
regression gates), accel_buffer_ stays empty, T_align collapses to
identity, and the landmark transform is a no-op. All five gates hold
byte-identical trajectory SHAs:

  room_depth_accel_head_repro 0.057702 sha=a4f036f7ca3059ea
  room_depth_head_repro       0.079914 sha=ebdd323dbc378992
  room_mono_head_repro        0.176506 sha=60383555ba272c17
  xyz_depth_head_repro        0.011042 sha=4b21294168165ee9
  xyz_mono_head_repro         0.028136 sha=583e07bf2ba4ecaf

ctest 59/59.

Opt-in measurements with --accel --max-frames 250 --repro-eval
--no-viz --reference-policy heuristic (evo_ape --align --correct_scale
--t_max_diff 0.05):

  fr1_xyz  mono           0.028136 -> 0.024931  (-11.4%)
  fr1_room mono           0.176506 -> 0.164001  (-7.1%)

Both runs now log "Applied gravity alignment (mono init)" and "BA:
Added N gravity prior residuals", with N growing with keyframe count,
confirming the prior is live in the BA instead of being dropped.

Next VIO-ward steps (not in this commit) can add velocity estimation
from the accel buffer, static-motion detection for scale-drift
suppression, and online accel bias estimation.

Commit cbea161 made --accel actually work in mono init, which dropped
fr1_room mono from 0.176506 to 0.164001 and fr1_xyz mono from 0.028136
to 0.024931 with everything else held constant. Without a dedicated
gate, that improvement is easy to regress silently -- the existing
room_mono_head_repro / xyz_mono_head_repro gates do not pass --accel,
so they measure the non-gravity path only.

Add two new gates mirroring the existing mono gates but with
use_accel=true:

  room_mono_accel_head_repro   ceiling 0.170  measured 0.164001
                               two-run SHA 25ba31699be19bcd
  xyz_mono_accel_head_repro    ceiling 0.027  measured 0.024931
                               two-run SHA 1fdb49cb60f71841

Ceilings leave enough headroom for normal run-to-run drift (~1-2 mm)
without being so loose that a significant regression in the gravity
prior path would sneak through. Both gates are deterministic
(--repro-eval) and reproduce the two-run SHA, same as the existing
five gates.

The original five gates still pass byte-identically; this commit
only adds two new gates, it does not change any existing gate's
flags or ceiling.
Three runs of 600-frame room_mono with the default async threads
produced one core dump at ~frame 195 during a relocalize cascade,
while three runs under --repro-eval completed with identical
deterministic ATE 0.468671 m. An AddressSanitizer build ran clean on
async 600-frame and found zero memory errors; that is consistent with
a timing-sensitive race that ASan's slowdown hides rather than a
heap / out-of-bounds bug.

Fixing this needs ThreadSanitizer plus enough run reps to catch the
race, then a mutex / atomic audit on the Map API (getAllKeyframes /
getAllLandmarks return refs without locking, flagged in plan.md §6.1)
and the LoopClosing pose-graph handoff. Out of scope for this session.

Append the investigation note to eval/room_mono_frame199_diag.txt so
future sessions see it alongside the existing cascade diagnosis.
ThreadSanitizer run on 250-frame fr1_room mono with loop closing
enabled flagged Landmark::is_bad_ as the top data race in the async
pipeline. LocalMapping writes the flag from
Optimizer::bundleAdjustment -> Landmark::setBad() while the main
Tracking thread concurrently reads it from needNewKeyframe() /
trackLocalMap() via Landmark::isBad(). The bool had no mutex and no
atomic, so the publish was unsynchronized; that matches the 600-frame
async intermittent core dump in eval/room_mono_frame199_diag.txt
(crash absent under --repro-eval, absent under ASan which slows the
race below trigger).

Change is_bad_ from bool to std::atomic<bool> with release-store /
acquire-load on the setter and getter. This gives a lock-free
happens-before between "marked bad" and "read as bad" without taking
Landmark::mutex_ on the hot readers (needNewKeyframe walks every
reference-KF landmark on every frame).
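
A minimal sketch of the flag change described above. `LandmarkFlagSketch` is a hypothetical stand-in for `Landmark`, which carries many more members; only the release-store / acquire-load pairing is modeled here.

```cpp
#include <atomic>

// Hypothetical stand-in for Landmark; only the bad-flag is modeled.
class LandmarkFlagSketch {
public:
    // Writer side (the LocalMapping thread in the description above).
    void setBad() { is_bad_.store(true, std::memory_order_release); }

    // Reader side (the Tracking thread); no mutex is taken on this path.
    bool isBad() const { return is_bad_.load(std::memory_order_acquire); }

private:
    std::atomic<bool> is_bad_{false};
};
```

The acquire-load means a reader that observes `true` also observes every write the setter's thread made before the release-store, without the hot readers paying for a lock.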

Behavior on all seven gates is unchanged: ctest 59/59 and all gates
pass with byte-identical trajectory SHAs:

  room_depth_accel_head_repro 0.057702  a4f036f7ca3059ea
  room_depth_head_repro       0.079914  ebdd323dbc378992
  room_mono_accel_head_repro  0.164001  25ba31699be19bcd
  room_mono_head_repro        0.176506  60383555ba272c17
  xyz_depth_head_repro        0.011042  4b21294168165ee9
  xyz_mono_accel_head_repro   0.024931  1fdb49cb60f71841
  xyz_mono_head_repro         0.028136  583e07bf2ba4ecaf

Other TSan-flagged races remain (current_frame_ shared_ptr swap
between addFrame and onBACompleted callback; a handful in
local_mapping.cc hot paths). This commit only fixes the highest-hit
race; the rest stay as follow-ups.
Post-commit-ff74140 TSan rerun on 250-frame fr1_room async mono
confirms `Landmark::is_bad_` race is gone and catalogs the remaining
34 races across two structural buckets:

1. shared_ptr<Landmark> access on shared containers (Keyframe
   landmarks_) -- needs a container-level lock or snapshot getter,
   not a one-line fix.
2. Tracking::current_frame_ shared_ptr swap between addFrame() and
   the LocalMapping onBACompleted callback -- pose_mutex_ already
   exists; holding it across the swap and the callback dereference
   is the minimal fix.

Both are plausibly linked to the intermittent 600-frame async core
dump around frame 195. Fixing them needs the same scope-and-gate
discipline as the is_bad_ atomic (one race per commit, TSan
before / after, all seven gates byte-identical), so they stay as
follow-ups for a dedicated session rather than being rushed in
alongside the rest of this branch.

Also record the `setarch $(uname -m) -R` workaround for TSan's
"unexpected memory mapping" FATAL on recent kernels (higher
mmap_rnd_bits than TSan expects).
ThreadSanitizer flagged the shared_ptr<Frame> assignment in addFrame
(main thread) as racing with the read inside onBACompleted ->
recomputeCurrentPose (LocalMapping thread). onBACompleted already
takes pose_mutex_ around its current_frame_ access, but addFrame
wrote current_frame_ without taking any lock, so the pair was
one-sided and the swap was observable to the callback mid-assignment.

Wrap only the two shared_ptr swaps at the top and bottom of addFrame
in a lock_guard<pose_mutex_>. The heavy middle (initialize / track)
stays outside the critical section: addFrame is only called from the
main thread, so its internal reads of current_frame_ do not race with
other writers, and Frame / Keyframe / Landmark members carry their
own locks where needed.
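
The locking shape can be sketched roughly as follows. `TrackingSketch`, `FrameSketch`, and `currentFrameIdFromCallback` are hypothetical names for illustration; only `pose_mutex_`, `current_frame_`, and the "lock the swaps, not the heavy middle" structure come from the description above.

```cpp
#include <memory>
#include <mutex>
#include <utility>

struct FrameSketch { int id = 0; };  // hypothetical stand-in for Frame

class TrackingSketch {
public:
    // Called from the main thread only.
    void addFrame(std::shared_ptr<FrameSketch> frame) {
        {
            // Top swap: publish the new frame under pose_mutex_.
            std::lock_guard<std::mutex> lock(pose_mutex_);
            current_frame_ = std::move(frame);
        }

        // Heavy middle (initialize / track) would run here, unlocked.

        {
            // Bottom swap: roll the current frame into last_frame_.
            std::lock_guard<std::mutex> lock(pose_mutex_);
            last_frame_ = current_frame_;
        }
    }

    // Stand-in for the callback-side read; it takes the same mutex,
    // so it can never observe a half-assigned shared_ptr.
    int currentFrameIdFromCallback() const {
        std::lock_guard<std::mutex> lock(pose_mutex_);
        return current_frame_ ? current_frame_->id : -1;
    }

private:
    mutable std::mutex pose_mutex_;
    std::shared_ptr<FrameSketch> current_frame_;
    std::shared_ptr<FrameSketch> last_frame_;
};
```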

Post-fix TSan rerun on 250-frame fr1_room async mono shows zero
remaining `current_frame_` matches in the race report (previously
every run listed tracking.cc:283 as a hot race site). Other races in
the report are unrelated -- the addFrame frames that still show up
on stack traces are parent frames for downstream races in track() /
trackLocalMap(), not the shared_ptr swap itself.

All seven regression gates pass with byte-identical trajectory SHAs:

  room_depth_accel_head_repro 0.057702  a4f036f7ca3059ea
  room_depth_head_repro       0.079914  ebdd323dbc378992
  room_mono_accel_head_repro  0.164001  25ba31699be19bcd
  room_mono_head_repro        0.176506  60383555ba272c17
  xyz_depth_head_repro        0.011042  4b21294168165ee9
  xyz_mono_accel_head_repro   0.024931  1fdb49cb60f71841
  xyz_mono_head_repro         0.028136  583e07bf2ba4ecaf

ctest still 59/59.
Post-ff74140 / a2f65f1, the remaining TSan hot races were all of the
form "shared_ptr<Landmark>::operator bool() / shared_ptr_base.h:1670"
flagged between:

- Main thread iterating kf->landmarks_ inside
  Tracking::trackLocalMap's add_landmarks_from_kf lambda (tracking.cc
  lines 942-948 before the fix), and
- LocalMapping thread assigning kf->landmarks_[idx] = lm in
  LocalMapping::createNewMapPoints (local_mapping.cc 376-377).

The vector elements are shared_ptr values; a torn read on one element
is exactly the "control block + raw ptr" race TSan reports.

Fix both sides with Keyframe::mutex_ (already declared on Keyframe):

- In tracking.cc, snapshot kf->landmarks_ into a local std::vector
  under kf->mutex_ before iteration, then walk the snapshot. The
  shared_ptr copies increment refcounts safely on their own; we only
  need the lock for the vector copy.
- In local_mapping.cc createNewMapPoints, take current_processed_kf_
  and neighbor mutex_ separately (not nested) around the assignments.
  Sequential single-keyframe locks avoid any lock-order inversion.
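
A rough sketch of the snapshot pattern, assuming a simplified keyframe. `KeyframeSketch`, `LandmarkSketch`, `setLandmark`, and `countValid` are hypothetical names; the real `Keyframe` exposes `landmarks_` differently.

```cpp
#include <cstddef>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

struct LandmarkSketch { int id = 0; };  // hypothetical stand-in

class KeyframeSketch {
public:
    explicit KeyframeSketch(std::size_t n) : landmarks_(n) {}

    // Writer (LocalMapping side): one keyframe lock per assignment,
    // never nested with another keyframe's lock.
    void setLandmark(std::size_t idx, std::shared_ptr<LandmarkSketch> lm) {
        std::lock_guard<std::mutex> lock(mutex_);
        landmarks_[idx] = std::move(lm);
    }

    // Reader (Tracking side): copy the vector under the lock and
    // let the caller iterate the copy after the lock is released.
    std::vector<std::shared_ptr<LandmarkSketch>> snapshotLandmarks() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return landmarks_;
    }

private:
    mutable std::mutex mutex_;
    std::vector<std::shared_ptr<LandmarkSketch>> landmarks_;
};

// Example reader: walks the snapshot, not the live container.
inline std::size_t countValid(const KeyframeSketch& kf) {
    std::size_t n = 0;
    for (const auto& lm : kf.snapshotLandmarks()) {
        if (lm) ++n;
    }
    return n;
}
```

The copy costs one refcount increment per element, which is the price paid for keeping the critical section short.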

All seven regression gates pass with byte-identical trajectory SHAs:

  room_depth_accel_head_repro 0.057702  a4f036f7ca3059ea
  room_depth_head_repro       0.079914  ebdd323dbc378992
  room_mono_accel_head_repro  0.164001  25ba31699be19bcd
  room_mono_head_repro        0.176506  60383555ba272c17
  xyz_depth_head_repro        0.011042  4b21294168165ee9
  xyz_mono_accel_head_repro   0.024931  1fdb49cb60f71841
  xyz_mono_head_repro         0.028136  583e07bf2ba4ecaf

Other kf->landmarks_ write sites (initialization, keyframe insertion)
are on the main thread only and do not race; this patch only touches
the main-thread-reads-vs-LocalMapping-writes pair that TSan flagged.
Record the TSan race-count trajectory for the three targeted fixes
that landed on this branch (is_bad_ atomic, current_frame_ pose_mutex_
pair, Keyframe::landmarks_ kf->mutex_ snapshot), and the two remaining
race families that each need a dedicated session:

- Frame::landmarks_ concurrent access from recomputeCurrentPose
  (LocalMapping thread) vs many main-thread write sites. Frame has
  mutex_ declared but nothing takes it today.
- Map / LocalMapping::mapPointCulling removing landmarks while the
  Tracking thread reads via getAllKeyframes / getAllLandmarks
  (plan.md §6.1 already flags this as Map API hygiene).

The diag records the net-useful TSan deltas for each fix commit,
following the same before/after discipline future commits should use.
3-rep 600-frame fr1_room mono async rerun at HEAD 9b7f854 completed
all three reps with exit=0 and no core dump, versus the pre-race-fix
baseline that had 1/3 crashing during a relocalize cascade around
frame 195. The ATE spread also tightened (0.57-0.76 m vs 0.48-0.93 m).
The three commits that landed the targeted fixes - ff74140, a2f65f1,
996aefb - are the credited cause; absolute 600-frame ATE remains high
and that falls under the separate scale-drift / reference-KF collapse
problem, not a threading bug.
Flesh out the [Unreleased] section with the net-useful deltas on this
branch:

Added
- Mono init gravity alignment, so --accel actually fires the BA
  gravity prior on mono (fr1_xyz -11.4%, fr1_room -7.1% on head-250).
- room_mono_accel_head_repro / xyz_mono_accel_head_repro regression
  gates (7-gate suite).
- Post-relocalization emergency-KF cooldown (kPostRelocEmergencyKf
  CooldownFrames = 3), room_mono_head_repro 0.197 -> 0.177.
- depth_is_learned_ flag + 0.15 m sigma for learned metric depth in
  BA (vs 0.015 m for sensor).

Changed
- Three ThreadSanitizer-flagged races fixed (Landmark::is_bad_ atomic,
  Tracking::current_frame_ pose_mutex_ pair, Keyframe::landmarks_
  snapshot / per-kf lock); 3-rep 600-frame async fr1_room mono no
  longer core-dumps (was 1/3 crash rate).
- 600-frame loop-enabled room_depth median 0.617 -> 0.126 m.
- stella_comparison refreshed: room_mono 8x -> 6.4x of stella
  head-250.

Removed
- Experimental score / pipeline reference policies + harness (~2000
  lines), --keyframe-trace-csv (unused), several trace-only stdout
  lines, dead locals, and the V1 coarse-err tiebreak that plan.md
  had already flagged as noise.
The master-branch CI has been failing because Ceres 2.1.0 (fetched
via FetchContent when no system Ceres is present) tries to link
against CXSparse::CXSparse. Recent Ubuntu runners ship a
libsuitesparse-dev that renames the CMake-exported target to
SuiteSparse::CXSparse,
leaving Ceres 2.1.0's hard-coded

  target_link_libraries(ceres ... CXSparse::CXSparse)

at internal/ceres/CMakeLists.txt:329 unable to find the target and
killing the Configure step. Reproduced on both origin/master 2c3cbd5
and on this PR branch; neither our code nor this PR introduced it.

Set CXSPARSE=OFF for Ceres's FetchContent-driven build. The solvers
we actually use are DENSE_SCHUR (local BA) and SPARSE_NORMAL_CHOLESKY
via Eigen (pose graph) -- both come from Ceres's built-in/Eigen-backed
paths and do not need CXSparse. Scope is limited to the fetched
Ceres, so developers with a system Ceres install see no change.

Local builds already use a fetched Ceres that was configured long
ago and is cached in build_codex/; the configure step there is not
re-run, so this change takes effect only on CI and on clean rebuilds.
…lver

Follow-up to 7ed6402 (Ceres CXSPARSE=OFF for CI). With CXSparse off,
the Ceres build no longer enables SuiteSparse support either, so the
pose-graph call site failed at runtime with

  solver.cc:508 Terminating: Can't use SPARSE_NORMAL_CHOLESKY with
  Solver::Options::sparse_linear_algebra_library_type = SUITE_SPARSE,
  because support was not enabled when Ceres Solver was built.

on the CI runner.

Switch the hard-coded backend to EIGEN_SPARSE, which ships with
Ceres/Eigen and does not require a system SuiteSparse install. This
keeps both local builds (with or without a system Ceres) and the CI
runner on the same solver, avoiding the "works here, fails there"
split.

All seven regression gates pass with byte-identical trajectory SHAs:

  room_depth_accel_head_repro 0.057702  a4f036f7ca3059ea
  room_depth_head_repro       0.079914  ebdd323dbc378992
  room_mono_accel_head_repro  0.164001  25ba31699be19bcd
  room_mono_head_repro        0.176506  60383555ba272c17
  xyz_depth_head_repro        0.011042  4b21294168165ee9
  xyz_mono_accel_head_repro   0.024931  1fdb49cb60f71841
  xyz_mono_head_repro         0.028136  583e07bf2ba4ecaf

i.e., head-250 windows either do not exercise the pose-graph sparse
solver at all, or EIGEN_SPARSE produces the same factorization shape
that SuiteSparse was producing.
Tesla-style simplification + post-reloc KF cooldown for room_mono
…rks_ races

Two TSan-flagged data race classes in the async pipeline:

1. Frame::landmarks_ vs onBACompleted
   - Writers on tracking thread (assignFrameLandmarksFromInliers,
     initializeWithDepth, mono init, trackReferenceKeyframe, relocalize,
     reinitialize) now hold current_frame_->mutex_ around landmarks_
     writes.
   - onBACompleted path readers (countValidFrameLandmarks,
     recomputeCurrentPose) snapshot under the same mutex_.
   - Keyframe(Frame::Ptr) ctor uses frame->snapshotLandmarks() helper.
   - Frame::mutex_ is now mutable so snapshotLandmarks() can be const.

2. Keyframe::landmarks_ vs LocalMapping::createNewMapPoints writes
   - needNewKeyframe snapshots reference_keyframe_->landmarks_.
   - relocalize candidate loop and best-candidate assign phase snapshot
     cand.kf->landmarks_ / best_candidate.kf->landmarks_ under each
     kf->mutex_ before iterating; avoids holding two container mutexes
     at once (matches local_mapping.cc convention).

Validation (build_codex, local):
- ctest: 59/59 passed
- check_regression_gate xyz_mono_head_repro: 0.028136 m (same SHA)
- check_regression_gate room_mono_head_repro: 0.176506 m (same SHA)
- check_regression_gate xyz_depth_head_repro: 0.011042 m (same SHA)
- check_regression_gate room_depth_head_repro: 0.079914 m (same SHA)
- TSan room_mono 250f: Frame::landmarks_ race gone; Keyframe::landmarks_
  races in relocalize / needNewKeyframe gone. Residual TSan warnings are
  pre-existing orthogonal issues (Map container API, Pose SSE races,
  LocalMapping control flags).
Adds a neutral IMU measurement record (svslam::ImuEntry, accel + gyro +
timestamp) in src/sensors/imu.h, and extends EurocDataset to load
mav0/imu0/data.csv when present.

- Silently ignored when imu0 is absent (synthetic test_seq has none).
- New accessors: hasImu(), allImu(), getImuBetween(t0, t1).
- Parses the EuRoC 7-column CSV (ts_ns, wx..wz, ax..az) and sorts by ts.

Tests:
- tests/test_euroc_dataset.cc: LoadsImuDataWhenPresent,
  SilentlySkipsMissingImu (2 new tests, 61/61 total passing).

This is scaffolding for VIO Stage 0b.a (preintegrator), 0b.b (velocity
state), and 0b.c (accel -> velocity model); no runtime behavior change
yet.
…e 0b.b)

Adds velocity_ (world-frame, m/s), accel_bias_, gyro_bias_ and
has_velocity_ to Frame and Keyframe. Keyframe ctor copies the current
Frame state at construction time.

Purely scaffolding: no writer or BA residual touches these fields yet.
Prepares Frame::Ptr / Keyframe::Ptr as the carrier for IMU preintegration
results (0b.a) and for a future velocity-prior residual.

Tests: 61/61 passing (no behavior change).
Minimal header-only IMU preintegration class:
- integrate(accel, gyro, dt) accumulates delta_R, delta_v, delta_p over
  an interval using Euler integration in the start-of-interval frame.
- predict(R_i, v_i, p_i, g) -> (R_j, v_j, p_j) propagates a keyframe
  state across the accumulated interval, applying the world-frame
  gravity vector.
- reset(accel_bias, gyro_bias) re-anchors biases for the next interval.

Intentionally omits bias Jacobians and noise covariance — those belong
with a real VIO residual in BA (future work). This header-only skeleton
is callable from Tracking for state prediction and can be upgraded
later without touching call sites beyond the constructor.
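
The Euler accumulation can be sketched as below, assuming plain arrays instead of the project's math types. `PreintegratorSketch` and its helpers are hypothetical; only `integrate`, `reset`, and the delta_R / delta_v / delta_p quantities come from the description above, and `predict` is left out.

```cpp
#include <array>
#include <cmath>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<Vec3, 3>;  // row-major

inline Mat3 identity3() {
    Mat3 m{};
    m[0][0] = m[1][1] = m[2][2] = 1.0;
    return m;
}

inline Vec3 mul(const Mat3& m, const Vec3& v) {
    Vec3 r{};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j) r[i] += m[i][j] * v[j];
    return r;
}

inline Mat3 mul(const Mat3& a, const Mat3& b) {
    Mat3 r{};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            for (int k = 0; k < 3; ++k) r[i][j] += a[i][k] * b[k][j];
    return r;
}

// Rodrigues formula: rotation matrix for rotation vector phi.
inline Mat3 expSO3(const Vec3& phi) {
    const double theta =
        std::sqrt(phi[0] * phi[0] + phi[1] * phi[1] + phi[2] * phi[2]);
    Mat3 K{};  // skew-symmetric matrix of phi
    K[0][1] = -phi[2]; K[0][2] = phi[1];
    K[1][0] = phi[2];  K[1][2] = -phi[0];
    K[2][0] = -phi[1]; K[2][1] = phi[0];
    const double a = theta < 1e-8 ? 1.0 : std::sin(theta) / theta;
    const double b =
        theta < 1e-8 ? 0.5 : (1.0 - std::cos(theta)) / (theta * theta);
    const Mat3 K2 = mul(K, K);
    Mat3 R = identity3();
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j) R[i][j] += a * K[i][j] + b * K2[i][j];
    return R;
}

struct PreintegratorSketch {
    Mat3 delta_R = identity3();
    Vec3 delta_v{};
    Vec3 delta_p{};
    Vec3 accel_bias{};
    Vec3 gyro_bias{};
    double dt_total = 0.0;

    // One Euler step, expressed in the start-of-interval frame.
    void integrate(const Vec3& accel, const Vec3& gyro, double dt) {
        Vec3 a{}, w{};
        for (int i = 0; i < 3; ++i) {
            a[i] = accel[i] - accel_bias[i];
            w[i] = (gyro[i] - gyro_bias[i]) * dt;
        }
        const Vec3 a_start = mul(delta_R, a);
        for (int i = 0; i < 3; ++i) {
            delta_p[i] += delta_v[i] * dt + 0.5 * a_start[i] * dt * dt;
            delta_v[i] += a_start[i] * dt;
        }
        delta_R = mul(delta_R, expSO3(w));
        dt_total += dt;
    }

    // Clears the accumulated deltas and re-anchors the biases.
    void reset(const Vec3& ba, const Vec3& bg) {
        delta_R = identity3();
        delta_v = Vec3{};
        delta_p = Vec3{};
        accel_bias = ba;
        gyro_bias = bg;
        dt_total = 0.0;
    }
};
```

Gravity is deliberately absent from `integrate`; in this formulation it only enters when the deltas are applied to a world-frame state.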

Tests: 5 new in tests/test_imu_preintegrator.cc covering zero motion,
constant accel delta_v/delta_p, predict including gravity, reset, and
bias subtraction. 66/66 total passing.
…ge 0b.c)

Adds std::vector<ImuEntry> Tracking::imu_buffer_ and populates it from
EurocDataset::allImu() when --accel is requested on an EuRoC sequence.
The IMU accel channel is also mirrored into the existing accel_buffer_
so gravity alignment / stationary detection paths work unchanged.

Net effect on the runtime:
- EuRoC + --accel now enables gravity prior via IMU accel (previously
  the --accel flag was silently disabled on EuRoC because it only
  consulted TUM accelerometer.txt).
- Full IMU (accel + gyro) is retained on tracker->imu_buffer_ for a
  future VIO path that plugs ImuPreintegrator (0b.a) into the motion
  model (0b.c future work).

Validation:
- ctest: 66/66 passing, same SHAs for xyz_mono_head_repro (0.028136) and
  room_mono_head_repro (0.176506), no regression.
… 0b.e / 0b.f / 0c.b)

Promotes the 0b.{a,b,c,d} scaffolding into a Tracking loop that actually
consumes IMU data on every frame and carries the per-keyframe preintegration
forward for a future BA consumer (the backend residual lands in a follow-up
commit so this one is Tracking-only).

Tracking:
- predictVelocityFromImu() preintegrates imu_buffer_ between last_frame_ and
  current_frame_ via the Forster integrator. Converts last_frame_'s camera
  world pose to the IMU body frame via T_wb = T_wc * T_cam_imu_ before the
  integrator runs, so the gravity subtraction and body->world rotation are
  done in the right frame. Result goes to current_frame_->velocity_ +
  has_velocity_.
- reconcileVelocityWithVisual() blends the IMU prediction with the visual
  post-tracking pose delta so Keyframe::velocity_ doesn't accumulate raw
  open-loop IMU drift. Blend weight tunable via SVSLAM_VIO_VELOCITY_IMU_ALPHA;
  visual-only runs can opt in with SVSLAM_VIO_ENABLE_VISUAL_VELOCITY.
- populateKeyframeImuSpan() attaches a frozen preintegration span between
  consecutive KFs (delta_R/v/p, dt, ref biases, from_kf_id, T_cam_imu) to
  the new Keyframe::prev_imu_span_, readied for a future BA residual.
- setImuToCameraExtrinsic() + T_cam_imu_ state. Identity by default
  preserves existing TUM behavior.
- Gravity-alignment in initializeWithDepth() / initialize() now transforms
  the IMU-frame gravity estimate into the camera frame before building the
  world-align rotation. Previously this used IMU-frame gravity directly as
  if it were in camera frame — invisible on TUM where IMU ≈ camera, but
  catastrophic on EuRoC's ~90° offset (world Z-up ended up wrong).

I/O:
- EurocDataset parses cam0 T_BS into cam0_from_imu_ (SE3) with SVD
  re-orthonormalization; the YAML reader now folds multi-line "[ ... ]"
  arrays so the real EuRoC sensor.yaml with wrapped 16-element matrices
  loads correctly.
- apps/run_mono.cc plumbs cam0FromImuExtrinsic() into the tracker when
  --accel is active.

Data models:
- ImuPreintegrationSpan holds frozen per-KF-pair preintegration deltas +
  the reference biases + T_cam_imu snapshot. Stored on Keyframe as a
  unique_ptr; null until populated.

Tests: 2 new EurocDatasetTest cases (T_BS parse from single-line + multi-line
sensor.yaml). TUM regression gates unaffected.
…ion (VIO Stage 0c.c / 0c.a)

Consumes the Keyframe::prev_imu_span_ attached by Tracking and turns it
into actual BA constraints. With T_cam_imu carried by the span, the residual
operates in the correct IMU-body frame without needing extra plumbing from
the BA caller.

Residuals:
- VelocityPreintegrationError: Forster-style 6-DoF (position + velocity)
  residual between consecutive KFs. Parameter blocks: pose_i, pose_j, vel_i,
  vel_j, bias_accel_i. Computes p_wb = p_wc - R_wb * t_bc and
  q_wb = q_wc * q_cb internally so camera-frame BA poses end up in the
  IMU-body frame where the preintegrated deltas live. First-order accel-bias
  Jacobians (J_p_ba = -0.5*dt^2*I, J_v_ba = -dt*I) let BA re-estimate the
  bias without re-integrating.
- VelocityDeltaPriorError: kept as a loose fallback for consecutive KF
  pairs that don't have a valid span (e.g. first few KFs after init).
  Uses KF::velocity_ as a frozen "position delta ≈ v*dt" prior.
- BiasAnchorError: zero-anchor per-bias pull.
- BiasRandomWalkError: slow-drift coupling between consecutive KFs.

bundleAdjustment:
- Per-KF velocity (3d) and accel/gyro bias (3d each) parameter blocks are
  added when has_velocity_ is set. After the solve they are written back to
  the corresponding KF fields.
- For each consecutive pair (sorted by id), prefers the preintegration
  residual when kf_j->prev_imu_span_ is valid + from_kf_id matches +
  dt agrees; falls back to the loose VelocityDeltaPriorError otherwise.
- Bias anchor and random-walk residuals shape the bias BA params.
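
The per-pair selection rule can be sketched as a small predicate. `ImuSpanSketch`, `PairResidual`, `chooseResidual`, and the `dt_tol` tolerance are all hypothetical; the description above only states the three conditions (valid span, matching from_kf_id, agreeing dt), not how "agrees" is measured.

```cpp
#include <cmath>

// Hypothetical, reduced view of a keyframe's preintegration span.
struct ImuSpanSketch {
    bool valid = false;
    long from_kf_id = -1;
    double dt = 0.0;  // seconds covered by the span
};

enum class PairResidual { Preintegration, VelocityDeltaPrior };

// Picks the residual type for the pair (kf_i, kf_j), where 'span' is
// kf_j's span (may be null) and kf_dt is the keyframe timestamp gap.
inline PairResidual chooseResidual(const ImuSpanSketch* span, long kf_i_id,
                                   double kf_dt, double dt_tol = 1e-3) {
    const bool usable = span != nullptr && span->valid &&
                        span->from_kf_id == kf_i_id &&
                        std::fabs(span->dt - kf_dt) <= dt_tol;
    return usable ? PairResidual::Preintegration
                  : PairResidual::VelocityDeltaPrior;
}
```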

Env knobs (all optional):
- SVSLAM_BA_VELOCITY_PRIOR_SIGMA_M (pos sigma, default 0.3 m; <=0 disables)
- SVSLAM_BA_VELOCITY_PRIOR_VEL_SIGMA (vel sigma, default 0.3 m/s)
- SVSLAM_BA_BIAS_ACCEL_ANCHOR_SIGMA (default 0.5 m/s^2)
- SVSLAM_BA_BIAS_GYRO_ANCHOR_SIGMA (default 0.1 rad/s)
- SVSLAM_BA_BIAS_ACCEL_RW_SIGMA (default 0.05 m/s^2)
- SVSLAM_BA_BIAS_GYRO_RW_SIGMA (default 0.005 rad/s)

Validation:
- All 7 TUM regression gates still pass with bitwise repro (VIO path is
  dormant on TUM since has_velocity_ stays false).
- EuRoC MH_01_easy mono (1500 frames, --accel): ATE mean 0.278 m
  (visual-only baseline) -> 0.187 m (visual + IMU), median 0.125 m
  -> 0.069 m after Sim3 alignment with evo_ape.
- 71/71 unit tests pass including 5 new residual tests (VelocityPreintegration
  match + gravity + bias correction, BiasAnchor + BiasRandomWalk).
…ge 0c.d)

Extends VelocityPreintegrationError from 6-DoF (position + velocity) to
full 9-DoF by adding the Forster-style rotation residual
  r_rot = log((q_wb_i * delta_R)^{-1} * q_wb_j)
linearized as 2 * (q_err.vec) for small errors. delta_R is sourced from the
Keyframe's ImuPreintegrationSpan and rotated into the body frame via the
same q_cb = q_cam_imu the 6-DoF version already carries, so the rotation
and translation residuals live in one consistent body-in-world frame.

No gyro-bias first-order correction yet — the preintegrated delta_R stays
frozen at the integration-time bias. Gyro bias remains shaped only by its
anchor + random-walk priors.

Env knob: SVSLAM_BA_PREINT_ROT_SIGMA_RAD (default 0.05 rad ≈ 2.9° per
KF-gap). <=0 zeros the rotation weight and restores 6-DoF behavior
without other plumbing changes.

Validation (mono + --accel, Sim3 ATE vs EuRoC ground truth):
- MH_01_easy (3683 frames): 1.728 m -> 1.409 m mean, 1.477 m -> 1.175 m
  median, 2.023 m -> 1.779 m rmse (vs prior 6-DoF).
- V1_01_easy (2912 frames): 1.589 m -> 1.466 m mean, 1.584 m -> 1.322 m
  median (vs prior 6-DoF). Still +17% over visual-only; closes with VI
  Init, which is the next step.

Tests: 2 new OptimizerTest cases (rotation match = zero residual, 10°
mismatch = expected 2*sin(5°) on the Z residual). Residuals array
widened from 6 to 9 across all existing VelocityPreintegrationError
tests. 23/23 VIO-related tests pass.
…c.e)

Adds a VisualInertialInitializer that bootstraps a VIO estimate from a
window of the first N keyframes, then refines their state in-place so
the BA preintegration residual (Stage 0c.d) consumes bias-consistent
spans. The initializer is careful to reject its own output when the
visual window is too noisy for a meaningful solve, so MH_01-style
mono sequences fall back cleanly to the pre-Stage-0c.e behaviour.

VisualInertialInitializer (src/tracking/visual_inertial_initializer.{h,cc}):
- Stage 1 — closed-form gyro bias from visual-vs-preintegrated rotation
  residuals, with a hard magnitude cap (0.05 rad/s by default). Short
  EuRoC windows leave the solve underdetermined, so unregularized output
  regularly lands 10× above reality; the cap rejects the runaway and
  falls back to zero bias, letting BA's per-KF gyro-bias block take
  over. Optional Tikhonov σ exposed but defaults off.
- Stage 2 — LSQ over {scale, gravity_world, per-KF velocities} from the
  Forster position + velocity equations. Accel bias is pinned to zero
  for this pass (BA's accel-bias parameter + BiasRandomWalkError refine
  it later). Scale prior + scale bound guard the degenerate cases.
- Result::applyGyroBiasCorrectionToSpans applies the first-order Forster
  rotation correction (delta_R_new = delta_R_ref * Exp(-dbg * dt)) to
  every span in the window so BA's Stage 0c.d rotation residual sees a
  delta_R consistent with the new reference bias.
- Acceptance gates tightened: rotation_residual_rms_max 0.15 → 0.08 rad
  (observed 0.13 rad on noisy MH_01 mono rejects, 0.05 rad on V1_01
  passes). Prevents applying a scale/gravity correction when the visual
  rotations diverge from the IMU's — that combination produces worse
  ATE than doing nothing.
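
The Stage 1 magnitude cap can be sketched as a reject-to-zero guard. `capGyroBias` is a hypothetical helper; the 0.05 rad/s default and the fall-back-to-zero behaviour are taken from the description above, but the real initializer may expose the cap differently.

```cpp
#include <array>
#include <cmath>

using Vec3 = std::array<double, 3>;

// Returns the solved bias unchanged when its norm is within the cap,
// otherwise returns zero so a later stage (BA) can estimate it instead.
inline Vec3 capGyroBias(const Vec3& solved_bias, double cap_rad_s = 0.05) {
    const double norm = std::sqrt(solved_bias[0] * solved_bias[0] +
                                  solved_bias[1] * solved_bias[1] +
                                  solved_bias[2] * solved_bias[2]);
    if (norm > cap_rad_s) {
        return Vec3{};  // reject the runaway solve
    }
    return solved_bias;
}
```

Rejecting to zero rather than clamping to the cap avoids keeping the direction of a solve that is already known to be unreliable.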

Tracking integration:
- tryVisualInertialInit() runs after each new KF. On success it locks
  Map::mutex_ + sets loop_correcting_ to pause Local Mapping, rescales
  every KF + landmark + frame pose by the learned scale, rotates the
  map so gravity lands on world -Z, writes biases/velocities back to
  the window KFs, and calls applyGyroBiasCorrectionToSpans so the BA
  rotation residual stays consistent.
- vi_init_done_ flag prevents retry after success.
- On rejection, Tracking falls back silently to the pre-init behavior.

BA gate:
- Optimizer::setPreintegrationResidualEnabled(bool) is kept as a
  dataset-level opt-out but flipped to default-on. Gating preint off
  until VI init succeeds helps V1_01 (1.31 m → 1.15 m) but punishes
  MH_01 where VI init never converges (1.41 m → 3.46 m). With preint
  always on, MH_01 matches master and V1_01 still picks up the VI-init
  gain. Env knob SVSLAM_VIO_GATE_PREINT=1 restores the gated behavior
  for experiments.

Tests: 4 new gtest cases in test_visual_inertial_initializer.cc cover
scale + gravity recovery, capped closed-form gyro bias (with the new
default cap), metric-scale mode, and the missing-span rejection path.
77/77 tests pass.

ATE on EuRoC mono + --accel (Sim3, evo_ape --align --correct_scale):

  Dataset     visual-only   master (0c.d)   +VI init (this commit)
  MH_01_easy  3.44 m        1.41 m          1.41 m  (VI rejected → fallback)
  V1_01_easy  1.25 m        1.47 m          1.31 m  (VI succeeded, -11%)

No regression on either dataset. V1_01 is the sequence VI init was
designed for; MH_01's mono-init rotations are too noisy for a reliable
linear VI solve, so Stage 0c.e correctly opts out rather than making
things worse.

Follow-ups (not in this commit):
- Re-enable the preint gate only when a dataset-chooser can tell
  whether VI init will succeed.
- ORB-SLAM3 Appendix-A style 3-step refinement to make VI init work
  on noisier windows (MH_01 recovery).
- First-order gyro-bias Jacobian in the BA rotation residual itself,
  so span re-integration isn't the only coupling.
The roadmap section's "IMU tight coupling if ever prioritized" bullet is
stale — Stage 0b + 0c landed and is empirically validated on EuRoC MH_01
and V1_01. Update the feature-expansion table, add a VIO status subsection
with the commit range, ATE improvements, and the env-knob defaults we
converged on, plus a note about the gyro-bias Jacobian experiment that was
tried and reverted (regressed MH_01 2–3× because BA absorbs visual
rotation noise into the bias when biases are uncalibrated).
std::rand() isn't seeded, so every fresh process starts at the same
sequence (1804289383, ...). When ctest -j launches the test binary in
parallel with different --gtest_filters, the processes can land on the
same /tmp/svslam_euroc_test_<N> directory, and one of them fails with
"no entries in data.csv" or a filesystem race on the PNG writes.

Swap the counter for (PID + process-local atomic counter + nanosecond
timestamp). Ran ctest -R EurocDatasetTest -j4 five times in a row with
no failures after the change.
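
A minimal sketch of such a suffix, assuming a POSIX system for `getpid()`. `uniqueTestDir` is a hypothetical name, and the separator layout is illustrative; only the `/tmp/svslam_euroc_test_` prefix and the three ingredients come from the description above.

```cpp
#include <atomic>
#include <chrono>
#include <string>
#include <unistd.h>  // getpid (POSIX)

// Builds a directory name that differs across processes (PID),
// across calls within one process (atomic counter), and across
// process restarts that reuse a PID (nanosecond timestamp).
inline std::string uniqueTestDir() {
    static std::atomic<unsigned long> counter{0};
    const auto ns = std::chrono::duration_cast<std::chrono::nanoseconds>(
                        std::chrono::steady_clock::now().time_since_epoch())
                        .count();
    return std::string("/tmp/svslam_euroc_test_") +
           std::to_string(getpid()) + "_" +
           std::to_string(counter.fetch_add(1)) + "_" + std::to_string(ns);
}
```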
@rsasaki0109
Owner Author

Cross-sequence validation (2026-04-20)

Validated on five EuRoC mono + --accel sequences downloaded via Wayback (evo_ape --align --correct_scale --t_max_diff 0.05):

  Sequence      Frames  Visual only [m]  +VIO [m]  Δ mean
  MH_01_easy    3683    3.44             1.41      −59%
  MH_02_easy    3041    2.64             1.72      −35%
  MH_03_medium  2701    2.48             2.91      +17%
  V1_01_easy    2912    1.25             1.31      +5% (max 7.16 → 2.95 m)
  V2_01_easy    2281    0.69             0.63      −9% (max 4.94 → 2.30 m)
  Average               2.10             1.60      −24%

Observations:

  • Machine Hall (MH_01, MH_02): strong gains — long-range flight is where IMU drift correction earns its keep.
  • Vicon Room (V1_01, V2_01): mean barely moves but max improves by 50–60%, i.e. IMU kills the occasional tracking excursion.
  • MH_03_medium regresses on mean; VI init rejects (rot_rms above threshold, same failure mode as MH_01 but with no fallback win on this sequence). Open follow-up: ORB-SLAM3-style MAP refinement with a longer window should recover this.

No TUM regression: 7/7 TUM gates still pass with bitwise-identical trajectories.

…2.0)

Add SVSLAM_BA_GRAVITY_PRIOR_WEIGHT so aggressive-motion sequences can
disable the per-KF gravity prior when the ±50 ms accel window is
dominated by motion rather than gravity. Default 2.0 preserves the
Stage 0c behavior for MH_01 / MH_02 where the prior demonstrably helps.

MH_03_medium empirically neither gains nor loses from the prior
(2.91 m with, 2.90 m without) — the knob mostly exists to cut the
experimentation loop short when probing a new dataset.
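
Reading such a knob can be sketched as below. `envDouble` and `gravityPriorWeight` are hypothetical helpers; the variable name and the 2.0 default come from the description above, while the fallback on unparsable input is an assumption.

```cpp
#include <cstdlib>

// Reads a floating-point environment variable, falling back to the
// given default when it is unset, empty, or not a number.
inline double envDouble(const char* name, double fallback) {
    const char* raw = std::getenv(name);
    if (raw == nullptr || *raw == '\0') return fallback;
    char* end = nullptr;
    const double value = std::strtod(raw, &end);
    return (end == raw) ? fallback : value;
}

inline double gravityPriorWeight() {
    return envDouble("SVSLAM_BA_GRAVITY_PRIOR_WEIGHT", 2.0);
}
```

Under this sketch, exporting `SVSLAM_BA_GRAVITY_PRIOR_WEIGHT=0` before a run would zero the weight, which is presumably how the prior gets disabled for aggressive-motion sequences.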
Rewrites the stale chunks of plan.md for a clean handoff to the next
agent — specifically so a Codex pickup has an accurate current-state
picture and a VIO-aware guidance path.

Major changes:

- Header updates date to 2026-04-21 and explicitly calls out the VIO
  Stage 0b / 0c landing on master.
- §2.1 Snapshot: replaces stale HEAD / test-count / LOC numbers with
  current values (HEAD 46a726d on master, 77/77 tests, 12 090 LOC,
  18 test files). Adds PR #3 branch pointer.
- §2.2 Feature Matrix: adds the loosely-coupled VIO row, flags the
  known-degraded EuRoC stereo path, notes ROS2 has no VIO hookup yet.
- §2.4 Regression Gates: lists all 7 gates (the 2 _accel_head_repro
  variants were missing) with fresh numbers and thin-margin call-outs.
- §2.7: replaces the synthetic-dataset section with real-dataset
  verification status (5 EuRoC sequences on disk via Wayback, the GT
  TUM conversion one-liner, evo_ape recipe).
- §2.8: new — EuRoC mono --accel VIO sweep table (MH_01/02/03, V1_01,
  V2_01) with Δ mean / Δ max / VI-init accept status and the σ sweep
  finding.
- §3: adds the 10 VIO commits (Stage 0b.a through 0c.e) with per-stage
  labels, plus the 2 extra vio-integration-branch commits.
- §4.2: updates file-inventory entries for src/tracking,
  src/backend, src/sensors, src/io, tests/ to reflect the new VIO
  sources and updated LOC estimates.
- §7.8: new — complete VIO env-knob table (12 knobs) with defaults
  and tuning philosophy.
- §8 Known Issues: rewrites the list — marks "real EuRoC missing"
  done, adds MH_03 regression (§8.11), the gyro-bias-Jacobian attempt
  that reverted (§8.10), the ORB 3000 regression that proved
  plan.md §10.4 is partly stale (§8.12), and the EuRoC stereo
  rectification issue (§8.13).
- §9 Phase D: expanded status with the VIO sub-table and
  ORB-SLAM3-follow-up pointers.
- §10.4 #1: flags as tested-and-regressed with the measured deltas.
- §11: new priority order — land PR #3, diagnose MH_03, write an
  eval/ artifact for the EuRoC sweep, resume §10 probes.
- §12 AI Agent Instructions: expands with VIO-aware reading order,
  the "narrowest gate first" rule per-topic, and explicit warnings
  about the gyro-bias Jacobian + plan.md §10 staleness.
- §15: marks the Mono Room Handoff as historical with a header note.
- §16: new — VIO Handoff section with pipeline diagram, key
  invariants, MH_03 bisection plan, Stage 0c.f follow-ups, and
  minimum reference-file list for any VIO change.
- §17: Non-Goals renumbered from §16.

Net result: 1705 lines (previously 1418). Preserves §14 battle log
and §15 mono-room diagnostic context since the guidance inside them
is still relevant for any mono front-end change.