roadmap: scope M7 — survival engine + survival-first comp finder

The next milestone toward autonomous novel survive-to-T50 comp discovery
(UK-chain / BD+buff-extension / counter-wall / ally-protect / heal-tank),
derived game-truth without copying DWJ/HH. Honest gate: the survival model
currently OVER-predicts (called Force "T50", real wipe T32) and is validated
on ~1 team, with non-UK mechanics (ally-protect redirect, counterattack
damage) unmodeled.
Phase A (gate): survival-accuracy bar, diverse fixture battery, survival-diff
harness, un-stack survival compensating wrongs together, model missing
mechanics. Phase B: coverage-chain enumerator + sim validation + auto/manual
classifier + ranked output (drop HH from the recommendation path). Phase C:
damage-survive archetypes + speed-tune integration. First no-key step:
productize cb_survival_diff.py from the Force fixture we have.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
#### Phase A — Survival-model hardening (the gate)
| Sub-goal | Status | Notes |
|---|---|---|
| A1 — Define survival accuracy bar | 🔴 | Separate from the damage ±5% bar: predicted death-turn within ±2-3 BT of real AND correct survive/wipe classification, across the whole fixture battery. |
| A2 — Capture a DIVERSE fixture battery | 🔴 (key-gated) | Need real runs (build snapshot now auto-saved per run): MEN stable-day survive-T50 (HAVE none clean+recent); MEN Force wipe-T32 (✅ HAVE `20260624_152252`); ≥1 non-UK archetype (counter / ally-protect / heal-tank) the roster can field; a buff-extension-dependent comp. Use `cb_watcher.py` to grab wipe cases without burning keys. **Rate limiter: 2 keys/day + must own/gear the archetype.**|
| A3 — Survival-diff harness | 🟡 | Productize the per-CB-turn sim-vs-real HP + coverage(UK/BD/shield/counter/protect) + death-turn diff we hand-rolled on the Force fixture → `tools/cb_survival_diff.py`. |
| A4 — Un-stack survival compensating wrongs TOGETHER | 🔴 | Against the full battery (mission rule): promote game-truth buff cadence (`bugfix_buff_tick=False`) from finder-only to global + re-derive damage calibration + re-baseline the 2 locked damage tests once. |
|**GATE A**| 🔴 | Survival sim reproduces EVERY battery fixture within the bar. Until met, survival recs are sanity-check only. |
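The A1 bar can be written down as a small predicate. The following is a minimal sketch, not the project's implementation: `SurvivalResult`, `within_bar`, and `gate_a` are hypothetical names, and the tolerance of 3 assumes the loose end of the "±2-3 BT" range.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

DEATH_TURN_TOL = 3  # assumption: loose end of the "±2-3 BT" bar from A1


@dataclass
class SurvivalResult:
    # Hypothetical record: death_turn is None when the team survives to T50.
    death_turn: Optional[int]

    @property
    def survived(self) -> bool:
        return self.death_turn is None


def within_bar(pred: SurvivalResult, real: SurvivalResult,
               tol: int = DEATH_TURN_TOL) -> bool:
    """True when survive/wipe classification matches and, for wipes,
    the predicted death turn is within `tol` of the real one."""
    if pred.survived != real.survived:
        return False
    if real.survived:
        return True
    return abs(pred.death_turn - real.death_turn) <= tol


def gate_a(battery: Dict[str, Tuple[SurvivalResult, SurvivalResult]]) -> bool:
    """Gate A holds only if every (predicted, real) fixture pair is within the bar."""
    return all(within_bar(pred, real) for pred, real in battery.values())


# The Force case from the commit message: sim said "T50", real run wiped at T32.
force = (SurvivalResult(death_turn=None), SurvivalResult(death_turn=32))
print(within_bar(*force))        # misclassified, so outside the bar
print(gate_a({"force": force}))  # one failing fixture fails the gate
```

Under these assumptions the Force fixture fails on classification alone, before the death-turn tolerance is even consulted.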
#### Phase B — Survival-first comp finder (depends on Gate A)
| Sub-goal | Status | Notes |
|---|---|---|
| B1 — Coverage-chain enumerator | 🔴 | From the synergy graph + owned roster, enumerate comps that CAN form a survival pattern (UK-chain / BD+extension / counter-wall / ally-protect / heal-tank); tag archetype + providers. |
| B2 — Validate per comp via hardened sim | 🔴 | Survive-to-50 per affinity via the finder's MC; **drop HellHades from the scoring path** (`cb_team_explorer` currently scores with `tierlist.json` — mission violation to fix). |
| B3 — Auto-vs-manual classifier | 🔴 | Mark "automatic" only if the required skill order is preset-expressible; flag "manual" separately, never auto-recommended. |
| B4 — Rank + output best runs | 🔴 | Output: archetype, providers, per-affinity survival turn, damage, auto/manual. DWJ tunes referenced ONLY as a "we independently found this too" divergence flag. New tool `tools/cb_comp_finder.py`. |
|**GATE B**| 🔴 | Finder independently re-derives ≥N known-good comps the roster can run AND surfaces ≥1 novel comp a real run confirms survives. |
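As an illustration of the B1 enumeration step, the sketch below brute-forces five-champion comps from a roster and tags each with the archetype it can form and the champions providing each role. Everything here is hypothetical: the roster, the role names, and the per-archetype requirements are placeholders, not game-truth, and a real enumerator would draw them from the synergy graph.

```python
from itertools import combinations

# Hypothetical roster: champion -> survival roles it can provide.
ROSTER = {
    "champ_a": {"UK"},
    "champ_b": {"UK"},
    "champ_c": {"BD", "extend"},
    "champ_d": {"counter"},
    "champ_e": {"heal"},
    "champ_f": set(),
}

# Assumed minimum provider count per role for each archetype (illustrative only).
ARCHETYPES = {
    "UK-chain": {"UK": 2},
    "BD+extension": {"BD": 1, "extend": 1},
}


def enumerate_comps(roster, archetypes, size=5):
    """Yield every comp of `size` champions that meets an archetype's role needs,
    tagged with the archetype name and the providers of each required role."""
    for comp in combinations(sorted(roster), size):
        for name, needs in archetypes.items():
            found = {role: [c for c in comp if role in roster[c]] for role in needs}
            if all(len(found[role]) >= n for role, n in needs.items()):
                yield {"comp": comp, "archetype": name, "providers": found}


candidates = list(enumerate_comps(ROSTER, ARCHETYPES))
for cand in candidates:
    print(cand["archetype"], cand["comp"], cand["providers"])
```

This only answers "CAN this comp form the pattern"; whether it actually survives is left to the sim validation in B2. Note that the sketch lets one champion satisfy two roles at once (here `champ_c` covers both BD and extension), which may or may not match the real cooldown constraints.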
#### Phase C — Damage-survive comps + speed-tune integration
| Sub-goal | Status | Notes |
|---|---|---|
| C1 — High-damage-and-survive archetype | 🔴 | Extend finder to counter/protect/heal comps that also push damage; needs the damage ±5% gate too (so Spirit regression must be closed). |
| C2 — Speed-tune integration | ✅ (ready) | `speed_tune_finder.py` already finds the SPD tuple that holds a chosen comp; trustworthy once Gate A lands. |

**Critical dependencies / risks:**
- **Key-gated & roster-gated:** Phase A needs ~5-8 diverse real captures; 2 keys/day, and we can only capture archetypes the user actually owns/gears. Mitigation: capture what the roster supports now; for unowned archetypes, model game-truth from skill data and verify on the closest available fixture.
- **Re-baselining the damage tests** when `bugfix_buff_tick` (`bbt`) goes global (A4) is calibration-sensitive: do it once, carefully, with the whole battery, never piecemeal.
- **Don't reintroduce external inputs:** B2 must remove HH from the recommendation path to stay mission-compliant.

**First concrete step (no key needed):** A3 — productize `cb_survival_diff.py` from the Force fixture we already have, so every future capture instantly yields a survival diff. Then A2 captures begin filling the battery.
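A rough sketch of what the per-CB-turn diff in A3 might compute is shown below. It is not the contents of `cb_survival_diff.py`; `TurnState`, `death_turn`, and `survival_diff` are hypothetical names, and the sketch assumes each capture can be reduced to one total-team-HP value plus a set of active coverage tags per boss turn.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional

# Coverage kinds named in A3.
COVERAGE_KEYS = ("UK", "BD", "shield", "counter", "protect")


@dataclass
class TurnState:
    # Hypothetical per-boss-turn snapshot.
    turn: int
    team_hp: float               # assumed: total team HP after the boss turn
    coverage: FrozenSet[str]     # subset of COVERAGE_KEYS active on this turn


def death_turn(states: List[TurnState]) -> Optional[int]:
    """First turn where team HP is at or below zero, or None if the team never wipes."""
    for s in states:
        if s.team_hp <= 0:
            return s.turn
    return None


def survival_diff(sim: List[TurnState], real: List[TurnState]) -> dict:
    """Compare sim and real turn by turn: HP delta plus coverage seen on one side only."""
    real_by_turn = {s.turn: s for s in real}
    rows = []
    for s in sim:
        r = real_by_turn.get(s.turn)
        if r is None:
            continue  # turn missing from the real capture; nothing to compare
        rows.append({
            "turn": s.turn,
            "hp_delta": s.team_hp - r.team_hp,
            "sim_only": sorted(s.coverage - r.coverage),
            "real_only": sorted(r.coverage - s.coverage),
        })
    return {"rows": rows, "sim_death": death_turn(sim), "real_death": death_turn(real)}


# Toy data, not a real fixture: the sim keeps UK up on turn 2, the real run does not.
sim = [TurnState(1, 100.0, frozenset({"UK"})), TurnState(2, 50.0, frozenset({"UK"}))]
real = [TurnState(1, 90.0, frozenset({"UK"})), TurnState(2, 0.0, frozenset())]
diff = survival_diff(sim, real)
print(diff["sim_death"], diff["real_death"])
for row in diff["rows"]:
    print(row)
```

The first turn where `sim_only` is non-empty is a natural place to start looking for an over-prediction, since it marks coverage the sim believes in that the real run did not have.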