roadmap: reflect 2026-06-24/25 session (survival model, finder fix, build log)

SnoopLawg · claude · SnoopLawg · commit 16d6f90a4dbf · 2026-06-25T00:20:46.000-06:00
- M1: correct the Void-phase entry (capture shows the boss is the day's affinity
  from 100% HP, with no Void first half); add the survival-model over-prediction
  as a named gap; add per-run BUILD snapshot logging; add a dated session status note.
- M6: speed-tune finder fixed (game-truth buff cadence, validated vs the real
  Force wipe, correctly rejects Force-failing tunes; no daily-robust Force tune
  exists for the MEN roster); contention-free equip; gear-optimizer LoS
  hypothetical-mode over-credit fix.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
diff --git a/docs/roadmap.md b/docs/roadmap.md
@@ -34,9 +34,11 @@ and verifies leaderboard credit.
 | Sim — Force-day MEN tune | ✅ | **2026-06-23**: -3.4% @ BT24 + -1.8% @ BT19 (2 fixtures within ±5%). DWJ-parity scheduler ported via TM-reset fix. |
 | Sim — Magic-day MEN tune | ✅ | **2026-06-23**: -1.4% / +3.2% / +0.8% @ BT49 (3 BT49 fixtures within ±5%). Was -55% under pre-session. |
 | Sim — Spirit-day MEN tune | 🟡 REGRESSED | Was ±5% (4 fixtures) earlier 2026-06-23, but the game-truth debuff fixes (commit 95472fa: Demy A2 scope + debuff `<=0` expiry) shipped after and regressed it. Fresh capture 20260623_162050 = **-12.4%** (boss-HP) / -5.6% (hero-sum). Diagnosed per-source via `cb_attribution_diff.py`: Geo deflect -24.5% (Stoneguard formula under-models) + Venom poison 50 vs 74 ticks. Cadence is PERFECT (sim casts == real). Fix needs a Magic fixture to avoid over-fitting one affinity. See `project_spirit_fixture_attribution_20260623`. |
-| Sim — Void-day MEN tune | 🟡 | Per user 2026-06-23: there is no "Void day" — boss is Void for first 50% HP, then today's affinity. Validating pure-Void calibration requires solo-attack at fresh boss reset. See `project_cb_void_first_half_mechanic`. |
+| Sim — Void-day MEN tune | 🟡 | **CORRECTION 2026-06-24**: the "boss is Void for the first 50% HP" model is NOT supported by capture. `battle_logs_cb_20260624_152252.json` (UNM Force day) logged boss `element=2` (Force) on EVERY entry while at >98% HP, and the Magic heroes (Ninja/Venom) glanced — so for a stall comp the boss is the DAY'S affinity from 100% HP, there is no Void first-half. No pure-Void capture exists. (Supersedes the earlier `project_cb_void_first_half_mechanic` note.) |
+| Sim — survival model (UK/BD coverage) | 🟡 **NEW GAP 2026-06-24** | The real Force run (T32 wipe) exposed that cb_sim OVER-PREDICTS *survival* (distinct from damage): it keeps Unkillable coverage gapless, so fragile DPS never die — predicted T50 vs real T32. Damage calibration is separate/unaffected. Root cause: `bugfix_buff_tick=True` default over-preserves UK for fast heroes. Fix scoped to the finder (see M6); global sim default left True to preserve the locked damage tests. A full survival recal needs more fixtures (baseline+Force, a stable-day capture). |
 | Battle log — per-tick state | ✅ | Mod's `/tick-log` captures TM, HP, buffs, debuffs, damage events with intermediates |
 | Battle log — post-battle deltas | 🟡 | Damage attribution per-hero works; quest/leaderboard credit verification added to `cb_daily.py`; item drops not yet structured |
+| Battle log — per-run BUILD snapshot | ✅ **NEW 2026-06-24** | `cb_run.py` writes `build_cb_<ts>.json` next to each battle log: per team hero game-computed stats (CB totals + full column breakdown), gear, sets, masteries, blessing, AND effective speed from turn counts. Closes the "what build/speed produced this run" gap — old runs had `s_spd=0` (broken pre-2026-06-16) and lost their build entirely. |
 | Death watcher (key conservation) | ✅ | `tools/cb_watcher.py` validated end-to-end. Trigger fires on `hp_cur<=0`; per-poll JSONL trace lands at `cb_watcher_<tag>_<ts>.poll.jsonl`. Skill: `.claude/skills/cb-key-conservation/` |
 | Fixture library + replay | ✅ | `tools/fixture_archive.py` catalogs (tick + battle + poll + **presets** as of 2026-06-23) triples into `data/fixtures/manifest.json`. **New 2026-06-23**: preset snapshot bundled at capture time so historical fixtures replay against the correct preset. |
 | Per-hero turn cadence diagnostic | ✅ | `tools/turn_cadence_diff.py` (NEW 2026-06-23) compares per-hero turn counts between sim and real per BT. Cb_sim cadence now matches DWJ-parity exactly on the MEN tune. |
@@ -58,6 +60,21 @@ sanity-check-only until the deflect + poison fixes land together (needs a
 Magic fixture next key to avoid over-fitting Spirit). Team-composition
 recommendations (synergy-based, not sim-based) are unaffected.
 
+**Update 2026-06-24/25**: a live UNM **Force** run (real, key spent) drove
+several fixes — (1) the speed-tune **finder was over-predicting Force survival**
+(reported "Force 100%" for a tune that wiped at T32); root-caused to the sim's
+**survival model over-covering UK/BD** (gapless coverage → fragile DPS never
+die) and fixed in the finder by switching to the game-truth buff cadence (see M6); (2) **definitive roster
+result**: no daily-robust Force tune exists for the MEN 5 (Ninja/Venom are
+Magic — coverage-bound, not bulk-bound); (3) `loadouts.apply()` made
+contention-free; (4) the gear optimizer's hypothetical **LoS over-credit**
+fixed (exact-tune fielding in one shot); (5) **per-run build snapshot**
+logging added to `cb_run.py`. The Void-phase model was disproved by the Force
+capture (boss = day's affinity from 100% HP). Net: survival-model accuracy is
+now the named frontier alongside the Spirit damage regression; the ±5% *damage*
+gate status is unchanged (Magic/Force ✅, Spirit regressed). All shipped to
+`main`.
+
 **Headline fix this session**: cb_sim per-hero cadence was systematically
 8-14% slower than DWJ-parity (which matches real game). Root cause was
 `champ.tm -= TM_THRESHOLD` (preserve overflow) instead of DWJ's
@@ -157,11 +174,11 @@ Researched 2026-06-23 (see chat: HH optimiser 3.0, DWJ calc guide).
 
 | Sub-goal | Status | Notes |
 |---|---|---|
-| **Generalized gear optimizer (per-champion stat targets)** | ✅ 2026-06-23 — `tools/gear_target_optimizer.py`. Per-stat min/max/importance + modes (balanced/damage/survivability), any champion, location-agnostic. Lexicographic min-satisfaction + **set-aware stacking seeding**. **Stat oracle now GAME-EXACT** (0.0 max error on all 8 stats × all 94 roster heroes after mod 8-stat `flat_bonus` + accessory-ascension + LoS-double-count fixes). **+ per-slot primary constraints** (`--primary "Ring=CD,Banner=ACC"`). **+ CR soft-cap** (damage mode stops wasting CR past 100% → routes to CD). |
+| **Generalized gear optimizer (per-champion stat targets)** | ✅ 2026-06-23 — `tools/gear_target_optimizer.py`. Per-stat min/max/importance + modes (balanced/damage/survivability), any champion, location-agnostic. Lexicographic min-satisfaction + **set-aware stacking seeding**. **Stat oracle now GAME-EXACT** (0.0 max error on all 8 stats × all 94 roster heroes after mod 8-stat `flat_bonus` + accessory-ascension + LoS-double-count fixes). **+ per-slot primary constraints** (`--primary "Ring=CD,Banner=ACC"`). **+ CR soft-cap** (damage mode stops wasting CR past 100% → routes to CD). **2026-06-24 — hypothetical-mode LoS fix**: the oracle was exact for a hero's CURRENT gear but over-credited Lore-of-Steel when scoring a DIFFERENT candidate build (the mod's `mastery_bonus` bakes in +15% of the *current* gear's sets), so the optimizer hit its calc SPD target but landed ~2-4 SPD short live. Added `_set_bonuses_from_gear` + a gated baseline-vs-candidate LoS correction (`current_artifacts=` param); the optimizer then hit the exact tune live in one shot. Sim path (hypothetical=False) byte-identical; 22 tests green. |
 | **Team-wide gear optimizer (cross-hero locking)** | ✅ 2026-06-23 — `tools/team_gear_optimizer.py`. Shared-vault constraint (each artifact → ≤1 hero); priority-greedy + coordinate-ascent rounds (release+re-optimize = coordinate ascent on total team score → converges, resolves contention). `--spec team.json` (per-hero min/max/weight/mode/lock/primary + priority) or quick `--heroes/--mode`; `--json`. Validated: 5-hero CB team competing for SPD/ACC/CD → all met, 45 distinct pieces, 0 dupes; impossible demand → graceful priority-respecting degradation. Wired into `m5_build_recommender --optimize`. |
 | Damage-mode gear optimization (sim-driven) | 🔴 | Optimize gear to maximize a hero's real skill damage, using OUR calibrated sim as the objective (HH uses a formula estimate; ours is game-truth-validated). Gated on M1 ±5%. |
-| Speed-tune solver / finder | ✅ 2026-06-23 — `tools/speed_tune_finder.py`. Searches per-hero SPD ranges (`--vary "Maneater=278..296:2"`), reports combos that hold (survive T50, 0 UK/BD gaps) via the VERIFIED cb_sim (full survival, NOT a coverage sim). Uses real gear + flagship preset (opener+priority via `_build_team_setup`), overrides only SPD. Built on the unified `turn_meter.TurnMeterEngine`. Verified: real MEN holds; Maneater holds 288-296, breaks below 288. |
-| Gear-cleanse + 1-click equip in-game | 🟡 | Pieces exist (`sell.py`, `sell_rules.py`, `loadouts.py`, `/equip`); integrate into the polished filter→sell and equip-full-build flow. Run pillar. |
+| Speed-tune solver / finder | ✅ 2026-06-23; **FIXED 2026-06-24** — `tools/speed_tune_finder.py`. Searches per-hero SPD ranges, reports combos that hold (survive T50, 0 UK/BD gaps), real gear + flagship preset, overrides only SPD. **Fix**: it was OVER-PREDICTING Force survival — reported a tune "Force 100%" that wiped live at boss turn 32 — because it used cb_sim's default buff cadence (`bugfix_buff_tick=True`, which over-preserves UK for fast heroes → gapless coverage). Now runs survival sims with game-truth `bugfix_buff_tick=False`, reproduces the real wipe (sim T29 / 15.26M vs real T32 / 15.77M), and correctly REJECTS Force-failing tunes while still passing stable affinities. **Definitive finding**: NO daily-robust Force tune exists for the MEN roster — Ninja/Venom are Magic (glance only on Force), and DPS bulk doesn't help (survival is coverage-bound, not bulk-bound). Reliable Force needs a coverage upgrade or per-affinity swap, not speed/gear tuning. |
+| Gear-cleanse + 1-click equip in-game | 🟡 | Pieces exist (`sell.py`, `sell_rules.py`, `loadouts.py`, `/equip`); integrate into the polished filter→sell and equip-full-build flow. Run pillar. **2026-06-24**: `loadouts.apply()` made CONTENTION-FREE (release-then-pure-activate) — a multi-hero re-gear from one shared pool no longer starves the low-priority heroes; verified live fielding all 5 team heroes to exact target gear. "THE WALL" (mod /bulk-equip can't move pieces) disproved — all equip paths commit; equips silently no-op only while the client is wedged at the "Empty"/WebView scene (relaunch to Village first). |
 | Interactive visual timeline | 🟡 | Data exists (cast timeline, per-tick); finish the color-coded buff/TM dashboard render (DWJ-style). Surface/UX. |
 
 **What we already do that neither tool does** (don't regress these): actually