# Changelog

## v1.4.0

### Refactor — shared `common` library + optional TBB parallelisation

`cpp/common/` is a new tiny static library holding the parts of the
codebase that are independent of the Patchwork / Patchwork++
pipelines:

- `cpp/common/include/patchwork/types.h` — `PointXYZ`, `PCAFeature`
  (now carries the `principal_` field for parity with the original
  Patchwork repo), `PatchStatus`.
- `cpp/common/include/patchwork/plane_fit.h` +
  `cpp/common/src/plane_fit.cpp` — the SVD-based `estimate_plane`,
  plus `xy2theta`, `xy2radius`, `point_z_cmp`.

Both `cpp/patchwork/` and `cpp/patchworkpp/` now link this library
and the three drifted copies of the plane-fit math are collapsed to
one canonical implementation. Fix 2 in #90 was a concrete example of
that drift causing a real bug.
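For readers unfamiliar with the small geometry helpers named above, the following Python sketch shows what helpers with these names commonly compute in polar-grid ground segmentation. The bodies are assumptions for illustration only: they are not the code in `plane_fit.cpp`, and the `[0, 2π)` angle convention and the `(x, y, z)` tuple layout are hypothetical choices.

```python
import math


def xy2theta(x, y):
    """Azimuth of (x, y), wrapped to [0, 2*pi). Convention is an assumption."""
    theta = math.atan2(y, x)
    return theta if theta >= 0.0 else theta + 2.0 * math.pi


def xy2radius(x, y):
    """Planar distance of (x, y) from the sensor origin."""
    return math.hypot(x, y)


def point_z_key(point):
    """Sort key playing the role of a z-comparator: orders points by height."""
    return point[2]


# Points as (x, y, z) tuples; sorting by z puts the lowest points first,
# which is where ground seed candidates are usually taken from.
points = [(1.0, 0.0, 0.3), (0.0, -2.0, -1.7), (3.0, 4.0, -1.5)]
lowest_first = sorted(points, key=point_z_key)
```

Keeping helpers like these in one place is what removes the risk of each pipeline carrying a slightly different angle or radius convention.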

### Perf — `pypatchworkpp.patchwork` is now multi-threaded

Classic Patchwork's main loop uses `tbb::parallel_for` over all
`(zone, ring, sector)` patches when TBB is available. A serial
reduction afterwards walks the outcome buffer in deterministic
order, so numerical results are byte-identical to the sequential path.

Measured on KITTI seq 00 (i7-12700, 24 logical cores):

| Configuration | Median ms/frame | Median Hz |
| --- | ---: | ---: |
| `--method patchwork` single-thread (`taskset -c 0`) | 8.31 | 120.4 |
| `--method patchwork` parallel (TBB default scheduler) | **4.81** | **207.8** |

**1.73× speedup** in median frame time (8.31 ms → 4.81 ms). TBB is an
**optional** build dependency: when TBB is missing, CMake prints a
STATUS message (not a FATAL_ERROR) and the build falls back to a
sequential loop, so existing CI / wheel builds continue to
work even when `libtbb-dev` is not installed.
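The pattern described in this section (parallel per-patch work, then a serial reduction in a fixed order) can be sketched in a few lines. This is a minimal illustration, not the C++ implementation: Python's `concurrent.futures` stands in for `tbb::parallel_for`, and `process_patch`, `run_patches`, and the patch data are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def process_patch(patch):
    """Hypothetical stand-in for per-patch work: mean height of the patch."""
    return sum(patch) / len(patch)


def run_patches(patches, parallel=True, workers=4):
    # One outcome slot per patch, addressed by patch index, so workers
    # never touch a shared accumulator.
    outcomes = [None] * len(patches)

    def work(i):
        outcomes[i] = process_patch(patches[i])

    if parallel:
        with ThreadPoolExecutor(max_workers=workers) as pool:
            # list() drains the iterator and re-raises any worker exception.
            list(pool.map(work, range(len(patches))))
    else:
        # Sequential fallback, analogous to a build without TBB.
        for i in range(len(patches)):
            work(i)

    # Serial reduction in index order. Floating-point addition is not
    # associative, so a fixed order keeps the result identical no matter
    # which worker finished first.
    total = 0.0
    for value in outcomes:
        total += value
    return total
```

Because the only order-sensitive step (the accumulation) runs serially over the indexed buffer, the parallel and sequential paths produce the same floating-point result.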

### Perf — `pypatchworkpp.patchworkpp` stays sequential (intentional)

The same TBB pattern was applied to Patchwork++'s main loop and
benchmarked at 1 / 2 / 4 / 8 / 16 / 24 threads. **Every multi-thread
configuration was slower** than single-thread (111 Hz at 1 thread,
93 Hz at 2 threads, 69 Hz at 24 threads). Root cause: per-patch work
is small (~14 µs avg) and dominated by short-lived `std::vector` /
`Eigen` allocations inside R-VPF + R-GPF, so concurrent `malloc`
calls serialise on the heap allocator. Patchwork++ remains
sequential. Issue #96 documents the measurement and the conditions
under which we'd revisit: a thread-aware allocator, slab-allocated
per-worker scratch, or a concrete user report of a CPU bottleneck.

### Adds

- `python/examples/bench_hz.py` — per-frame timing harness reporting
  median / mean / p95 / p99 ms and Hz from `getTimeTaken()`. Useful
  for future perf work.
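The statistics reported by such a harness can be computed with the standard library alone. The sketch below is not the contents of `bench_hz.py`; `summarize_frame_times` is a hypothetical helper, and it assumes the caller has already converted each `getTimeTaken()` reading to milliseconds (the unit of the raw value is not stated here).

```python
import statistics


def summarize_frame_times(times_ms):
    """Summarise per-frame times given in milliseconds (assumed unit)."""
    if len(times_ms) < 2:
        raise ValueError("need at least two frames to compute percentiles")
    # 99 cut points splitting the data into 100 groups; index 94 is p95
    # and index 98 is p99. "inclusive" treats the data as the full
    # population of measured frames.
    cuts = statistics.quantiles(times_ms, n=100, method="inclusive")
    median_ms = statistics.median(times_ms)
    return {
        "median_ms": median_ms,
        "mean_ms": statistics.fmean(times_ms),
        "p95_ms": cuts[94],
        "p99_ms": cuts[98],
        # Throughput derived from the median frame time.
        "median_hz": 1000.0 / median_ms,
    }
```

Reporting the median alongside p95 / p99 keeps a few slow frames (for example, warm-up or page faults) from hiding in, or dominating, a single mean value.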

### Numerical equivalence

KITTI 00-10 full sweep (23,201 frames), Patchwork++ paper protocol,
v1.3.1 → v1.4.0:

| Method | F1 v1.3.1 | F1 v1.4.0 | Δ |
| --- | --- | --- | --- |
| `--method patchwork` | 96.0172 | 96.0172 | 0 (byte-identical) |
| `--method patchworkpp` | 96.2918 | 96.2919 | +0.0001 (float noise) |

Both deltas are well within the ±0.05 F1 budget set in the refactor plan.
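The budget check is simple arithmetic on the table above. As a sketch, using the F1 values from the table and hypothetical helper names (`f1_deltas`, `within_budget`):

```python
F1_BUDGET = 0.05  # the ± budget quoted above

# (F1 v1.3.1, F1 v1.4.0) per method, copied from the table.
results = {
    "patchwork": (96.0172, 96.0172),
    "patchworkpp": (96.2918, 96.2919),
}


def f1_deltas(results):
    """Signed F1 change per method (new minus old)."""
    return {name: new - old for name, (old, new) in results.items()}


def within_budget(results, budget=F1_BUDGET):
    """True if every method's absolute F1 change is inside the budget."""
    return all(abs(delta) <= budget for delta in f1_deltas(results).values())
```

A check of this shape could gate future refactors: any change that moves F1 by more than the budget would fail it.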

### References

- #94 — PR (refactor: extract common library)
- #95 — PR (perf: TBB on classic Patchwork)
- #96 — Issue (why Patchwork++ has no TBB)

## v1.3.1

### Bug fix — `pypatchworkpp.patchworkpp` (Patchwork++)