> SDP is stronger than FRCNN — expect single-sequence defaults around 60–65 HOTA on MOT17-04, but the 7-sequence Optuna average is lower because the benchmark includes harder sequences that pull the mean down.
@@ -241,6 +241,68 @@ uv run python optimize_tracking.py ocsort yoloworld
Full iteration log: `_optimizations/state/20260402-233730/experiments.jsonl`. The table below shows **kept** experiments only — all 20 iterations including regressions are in the JSONL.
#### Positive experiments
| Iter | Change | HOTA before → after | Δ pts | Δ % | IDSW |
- **ORU velocity re-estimation** (i1) — on re-detection after occlusion, replay virtual predict+update cycles along an interpolated trajectory to reconstruct velocity. Borrowed from OC-SORT.
- **Separate stage-2 IoU threshold** (i2b) — stage-2 low-confidence recovery uses an independent, lower threshold than stage-1; original ByteTrack shares one threshold for both stages. Codex contribution.
- **Stage-1 age discount** (i5, `iou_age_weight`) — IoU in the Hungarian cost matrix is multiplied by `1 / (1 + w × lost_frames)` for stale tracks, biasing assignment toward recently-seen ones without affecting the threshold gate.
- **1st calibration wave** (i10) — first Optuna pass over all 9 parameters under the new code. Key shifts: `lost_track_buffer` 30→62, `track_activation_threshold` 0.70→0.31, `q_scale` 0.01→0.0025, `velocity_decay` 0.95→0.82.
- **oru_threshold 2 → 5** (i11) — ORU reserved for longer occlusions (≥5 frames); shorter gaps handled by velocity decay alone, reducing noisy replays on brief miss frames.
- **3rd calibration wave** (i16) — 1000-trial joint search over all 19 params after 3 new knobs added in i13–i15 (`conf_cost_weight`, `stage2_min_updates`, `giou_blend`). Key shifts: `oru_threshold` 14→0 (ORU disabled), Kalman loosened ~14×, `high_conf_det_threshold` 0.61→0.80, `stage2_min_updates` activated at 5, `giou_blend` 0.396, `conf_cost_weight` 0.170. Note: i13–i15 added these three features at `default=0` (disabled); no HOTA change at default params until i16 activated them jointly.
- **stage2_min_updates search cap fix** (i17) — search space was [0, 5]; Optuna reported 5 as optimal (hitting the cap). Manual scan revealed true peak at 12 with a cliff at 14+. Cap widened to [0, 15].
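
The stage-1 age discount from i5 can be illustrated with a small sketch. This is not the repository's code: the function names `age_discounted_iou` and `passes_gate`, the list-of-lists layout, and the gate threshold of 0.3 are assumptions made for illustration. Only the factor `1 / (1 + w × lost_frames)`, the parameter name `iou_age_weight`, and the rule that the threshold gate is unaffected come from the description above.

```python
def age_discounted_iou(iou, lost_frames, iou_age_weight):
    """Scale each track's IoU row by 1 / (1 + w * lost_frames).

    iou[t][d] is the raw IoU between track t and detection d.
    lost_frames[t] is the number of frames track t has gone unmatched.
    """
    if len(iou) != len(lost_frames):
        raise ValueError("one lost_frames entry is required per track row")
    return [
        [value / (1.0 + iou_age_weight * lost) for value in row]
        for row, lost in zip(iou, lost_frames)
    ]


def passes_gate(iou, threshold):
    """Threshold gate evaluated on the raw IoU, so the discount cannot change it."""
    return [[value >= threshold for value in row] for row in iou]


# Two tracks compete for detection 0 with the same raw IoU.
raw_iou = [[0.60, 0.10],   # track 0: seen in the previous frame
           [0.60, 0.20]]   # track 1: lost for 10 frames
lost = [0, 10]

discounted = age_discounted_iou(raw_iou, lost, iou_age_weight=0.07197)
gate = passes_gate(raw_iou, threshold=0.3)  # hypothetical gate value

# Both tracks pass the gate for detection 0, but the fresh track now has the
# higher similarity, so an assignment step built on `discounted` prefers it.
print(discounted[0][0], discounted[1][0], gate[0][0], gate[1][0])
```

With equal raw IoU, the stale track's score is divided by `1 + 0.07197 × 10`, so the tie breaks toward the recently seen track while both candidates remain eligible.
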
| `iou_age_weight` | 0.07197 | Age discount factor for stage-1 Hungarian assignment cost |
| `conf_cost_weight` | 0.1696 | Confidence boost multiplier in stage-1 assignment, biases toward certain detections |
| `stage2_min_updates` | 12 | Minimum track age (successful updates) required to enter stage-2 |
| `giou_blend` | 0.42 | Blend weight between standard IoU and GIoU in stage-1 cost matrix |
| `oru_threshold` | 0 | Minimum occlusion length (frames) to trigger ORU velocity replay |
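
As a rough sketch of what `giou_blend` could control, the snippet below mixes standard IoU with GIoU for a pair of boxes. The linear form `(1 - giou_blend) * IoU + giou_blend * GIoU`, the function names, and the `(x1, y1, x2, y2)` box layout are assumptions for illustration. The source only states that the parameter is a blend weight between the two measures in the stage-1 cost matrix.

```python
def iou_and_giou(box_a, box_b):
    """Return (IoU, GIoU) for two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Intersection area (zero when the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter

    # Smallest axis-aligned box enclosing both inputs.
    enclose = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))

    iou = inter / union if union > 0 else 0.0
    giou = iou - (enclose - union) / enclose if enclose > 0 else iou
    return iou, giou


def blended_similarity(box_a, box_b, giou_blend):
    """Assumed linear blend: 0 gives pure IoU, 1 gives pure GIoU."""
    iou, giou = iou_and_giou(box_a, box_b)
    return (1.0 - giou_blend) * iou + giou_blend * giou


track = (0.0, 0.0, 2.0, 2.0)
near = (3.0, 0.0, 5.0, 2.0)    # no overlap, small gap
far = (8.0, 0.0, 10.0, 2.0)    # no overlap, large gap

near_score = blended_similarity(track, near, giou_blend=0.42)
far_score = blended_similarity(track, far, giou_blend=0.42)
print(near_score, far_score)
```

Both candidate boxes have IoU 0 against the track, so pure IoU cannot rank them. The GIoU term is less negative for the nearer box, which gives the assignment step a usable ordering for non-overlapping pairs.
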
#### What failed (reverted)
xcycsr 7D Kalman state (−1.2%), anisotropic Q matrix (−0.5%), EMA position blend (−0.2%), CMC detection-centroid (−27%), conf-based R scaling (−0.6%), NMS pre-filter (neutral), cascade matching (−1.2%), birth suppression (neutral), split velocity decay (−0.6%), adaptive confirmation (−0.2%), q_miss_start (neutral).
#### Key lesson
**Calibration waves account for ~95% of the total gain**: the three Optuna passes (i10 +1.13%, i12 +1.62%, i16 +0.19%) delivered +2.94 of the +3.09 total HOTA improvement. Joint optimisation over all parameters simultaneously reaches basins that sequential per-parameter tuning cannot. Algorithmic additions (ORU, stage-2 threshold, age discount, etc.) created new tunable structure that the calibration waves then exploited.

---
## Target analysis
The ByteTrack Phase 2 campaign target of HOTA = 68.0 requires real architectural improvements, not parameter search — Optuna alone on FRCNN detections plateaus around 52–53.