perf(patchwork): TBB parallel_for over patches in classic Patchwork

The classic Patchwork main loop in cpp/patchwork/src/patchwork.cpp is
now parallelised with a single tbb::parallel_for over all
(zone, ring, sector) patches, mirroring the upstream ~/git/patchwork
pattern. Per-patch work (sort + plane fit + GLE) runs in worker
threads; a serial reduction then accumulates ground / nonground in
deterministic order.
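The two-phase shape described above (independent per-patch work, then a serial merge in index order) can be sketched as follows. This is a minimal illustration, not the code in patchwork.cpp: `PatchResult`, `process_patch`, and `segment` are hypothetical names, the per-patch work is a toy sign test, and plain std::thread with a strided split stands in for tbb::parallel_for so the sketch has no TBB dependency.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for the per-patch output.
struct PatchResult {
    std::vector<int> ground;
    std::vector<int> nonground;
};

// Hypothetical per-patch work. It reads only its input and writes only its
// own result, so it is safe to run for many patches concurrently.
PatchResult process_patch(const std::vector<int>& patch) {
    PatchResult r;
    std::vector<int> sorted = patch;
    std::sort(sorted.begin(), sorted.end());  // stands in for sort + plane fit + GLE
    for (int z : sorted) (z < 0 ? r.ground : r.nonground).push_back(z);
    return r;
}

// Phase 1: workers fill results[i] for disjoint indices i.
// Phase 2: one thread concatenates results in index order, so the output
// does not depend on how the workers were scheduled.
PatchResult segment(const std::vector<std::vector<int>>& patches,
                    unsigned num_threads) {
    assert(num_threads > 0);
    std::vector<PatchResult> results(patches.size());
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t) {
        workers.emplace_back([&, t] {
            // Strided split of the flat patch index (assumption: a real
            // tbb::parallel_for would pick its own ranges).
            for (std::size_t i = t; i < patches.size(); i += num_threads)
                results[i] = process_patch(patches[i]);
        });
    }
    for (auto& w : workers) w.join();

    PatchResult out;
    for (const PatchResult& r : results) {
        out.ground.insert(out.ground.end(), r.ground.begin(), r.ground.end());
        out.nonground.insert(out.nonground.end(), r.nonground.begin(),
                             r.nonground.end());
    }
    return out;
}
```

Because each worker writes only results[i] for its own indices and the merge walks results in index order, segment(patches, 1) and segment(patches, 4) return identical vectors whatever the scheduling.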

Median per-frame time on KITTI seq 00 (i7-12700, 24 logical cores):
  single-thread (taskset -c 0)       8.31 ms  (120.4 Hz)
  parallel (default TBB scheduler)   4.81 ms  (207.8 Hz)
  → 1.73x speedup
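For readers who want to reproduce figures of this kind, here is a rough sketch of how per-frame times in milliseconds could be reduced to summary statistics, a rate in Hz, and a speedup ratio. It is not the project's harness: `TimingSummary`, `percentile`, `summarise`, and `speedup` are hypothetical names, and the nearest-rank percentile definition is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

struct TimingSummary {
    double median_ms;
    double mean_ms;
    double p95_ms;
    double p99_ms;
    double median_hz;  // 1000 / median_ms
};

// Nearest-rank percentile on an already sorted sample (assumed definition).
double percentile(const std::vector<double>& sorted, double p) {
    assert(!sorted.empty() && p > 0.0 && p <= 100.0);
    const double n = static_cast<double>(sorted.size());
    const std::size_t rank = static_cast<std::size_t>(std::ceil(p / 100.0 * n));
    return sorted[rank - 1];
}

// Takes the per-frame times by value so the caller's order is untouched.
TimingSummary summarise(std::vector<double> ms) {
    assert(!ms.empty());
    std::sort(ms.begin(), ms.end());
    const std::size_t n = ms.size();
    const double median =
        (n % 2 == 1) ? ms[n / 2] : 0.5 * (ms[n / 2 - 1] + ms[n / 2]);
    const double mean =
        std::accumulate(ms.begin(), ms.end(), 0.0) / static_cast<double>(n);
    return {median, mean, percentile(ms, 95.0), percentile(ms, 99.0),
            1000.0 / median};
}

// Speedup of a new configuration relative to a baseline, both in ms.
double speedup(double baseline_ms, double new_ms) {
    assert(new_ms > 0.0);
    return baseline_ms / new_ms;
}
```

Dividing the 8.31 ms baseline by the 4.81 ms parallel time gives about 1.73, which matches the speedup quoted above.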

Patchwork++ (cpp/patchworkpp/src/patchworkpp.cpp) was also benchmarked
under the same TBB pattern at 1 / 2 / 4 / 8 / 16 / 24 threads. Every
multi-thread configuration was SLOWER than single-thread on KITTI:
   1 thread   → 111 Hz (baseline)
   2 threads  →  93 Hz
   4 threads  →  91 Hz
   8 threads  →  91 Hz
  16 threads  →  85 Hz
  24 threads  →  69 Hz

The per-patch work in Patchwork++ is small (~14 µs avg) and dominated
by short-lived std::vector / Eigen::Matrix allocations inside R-VPF
and R-GPF. Concurrent malloc calls serialise on the heap allocator, and
TBB scheduler overhead exceeds the parallelisation benefit at every
thread count tested. Single-threaded Patchwork++ already runs at about
twice the 55 Hz the paper reports on an i7-7700K, so there is no
real-time motivation to parallelise. Patchwork++ remains
single-threaded; the estimateGround loop has a long-form comment
explaining why.

Numerical equivalence verified on KITTI 00-10 full sweep (23,201
frames), both methods, Patchwork++ paper protocol:
  patchwork   x pp protocol   pre: 96.0172   post: 96.0172   (byte-identical)
  patchwork++ x pp protocol   pre: 96.2918   post: 96.2919   (Δ +0.0001)
Both within the ±0.05 budget set in the refactor plan.

Build:
- Adds find_package(TBB CONFIG/MODULE REQUIRED) to cpp/CMakeLists.txt
  with a helpful error message listing the install command for
  Ubuntu / macOS / Windows.
- cpp/patchwork/CMakeLists.txt links TBB::tbb; cpp/patchworkpp/ does
  not (since it does not use TBB).

Also adds:
- python/examples/bench_hz.py: small per-frame timing harness that
  reports median / mean / p95 / p99 ms and Hz from getTimeTaken().
- A `const` qualifier on PatchWork::extract_initial_seeds and
  PatchWork::perform_regionwise_segmentation since neither writes to
  *this any more; needed so the TBB worker can call them.
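To illustrate why the `const` qualifier matters for the worker, the sketch below shows a member function that is callable through a const reference and one that is not. The `Segmenter` class, its members, and `count_seeds` are hypothetical and only mirror the const-ness pattern, not the real PatchWork class.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical class; only the const pattern mirrors the commit.
class Segmenter {
public:
    explicit Segmenter(double threshold) : threshold_(threshold) {}

    // const: reads threshold_ but never writes to *this, so concurrent calls
    // on the same object cannot race on member state.
    std::vector<double> extract_seeds(const std::vector<double>& sorted_z) const {
        assert(std::is_sorted(sorted_z.begin(), sorted_z.end()));
        std::vector<double> seeds;
        for (double z : sorted_z)
            if (z < threshold_) seeds.push_back(z);
        return seeds;
    }

    // Non-const for contrast: it mutates a member, so it cannot be called
    // through a const reference.
    void set_threshold(double t) { threshold_ = t; }

private:
    double threshold_;
};

// A worker body that only holds a const reference to the shared object.
std::size_t count_seeds(const Segmenter& seg, const std::vector<double>& patch) {
    return seg.extract_seeds(patch).size();
}

// Compile-time checks (C++17): the const method is usable on a const object,
// the mutating one is not.
static_assert(std::is_invocable_v<decltype(&Segmenter::extract_seeds),
                                  const Segmenter&, const std::vector<double>&>);
static_assert(!std::is_invocable_v<decltype(&Segmenter::set_threshold),
                                   const Segmenter&, double>);
```

Giving the worker only a const view of the shared object lets the compiler reject any accidental write to shared state from inside the parallel loop.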