Commit 23deb36

SnoopLawg and claude committed
roadmap: scope M7 — survival engine + survival-first comp finder
The next milestone toward autonomous novel survive-to-T50 comp discovery (UK-chain / BD+buff-extension / counter-wall / ally-protect / heal-tank), derived game-truth without copying DWJ/HH.

Honest gate: the survival model currently OVER-predicts (called Force "T50", real wipe T32) and is validated on ~1 team, with non-UK mechanics (ally-protect redirect, counterattack damage) unmodeled.

Phase A (gate): survival-accuracy bar, diverse fixture battery, survival-diff harness, un-stack survival compensating wrongs together, model missing mechanics.
Phase B: coverage-chain enumerator + sim validation + auto/manual classifier + ranked output (drop HH from the recommendation path).
Phase C: damage-survive archetypes + speed-tune integration.

First no-key step: productize cb_survival_diff.py from the Force fixture we have.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
1 parent 16d6f90 commit 23deb36

1 file changed

Lines changed: 61 additions & 0 deletions

docs/roadmap.md

@@ -186,6 +186,67 @@ run battles; game-truth IL2CPP extraction (vs HH screen-scrape / DWJ community
data); calibrated damage sim (DWJ doesn't simulate damage at all); per-tick
logging; cross-location recommender + build + farm guidance.

### M7: Survival engine + survival-first comp finder (NEXT — scoped 2026-06-25)

**The vision (user, 2026-06-25):** the engine should *mathematically* take our
modeled heroes + abilities and, without copying DWJ/HH, output novel team comps
that survive to turn 50 — ranked best-first — and we trust the survive/wipe
verdict *before* spending a key. Archetypes to cover:
- **Unkillable comps** — keep a death-preventing buff up the whole fight via
  Unkillable, Block-Damage, and/or buff-extension chains.
  - *Automatic*: the saved preset's skill order makes it work from battle start.
  - *Manual*: the preset can't express the needed stall (e.g. delay A4 N turns),
    so it needs mid-fight play — must be flagged, never auto-recommended.
- **Non-Unkillable survive+damage comps** — survive via counterattack walls,
  Ally-Protect, heal-tank sustain, etc., while pushing high damage.

**Why this is its own milestone (the honest gate):** as of 2026-06-24 the sim's
*survival* model OVER-PREDICTS (it called the Force tune "T50"; real wipe T32).
It's validated against ~1 team. Non-UK survival is barely modeled (Ally-Protect
redirect is OFF; counterattack damage magnitude unwired). So we can enumerate
comps and know *who provides what* (synergy graph), but we cannot yet *trust* a
survive-to-50 verdict on a novel comp. M7 earns that trust, then builds the
finder on top. **Nothing downstream ships prescriptive until Gate A passes.**

> **Definition of done**: given the owned roster, the engine outputs a ranked
> list of survival comps with a per-affinity survive-to-50 verdict that matches
> reality within **±2-3 boss turns** (and never mis-classifies survive-vs-wipe)
> on a *held-out* real fixture — with **zero external-source inputs** in the
> recommendation path (game-truth synergy + our sim only; DWJ/HH = divergence
> flags, not inputs).

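The bar in the definition of done has two halves: the survive/wipe classification must match, and for wipes the predicted death turn must land within the tolerance. A minimal sketch of that check follows. The names `Outcome`, `within_bar`, and `gate_passes` are hypothetical and do not come from the repo, and the default tolerance of 3 boss turns is an assumption that picks the upper end of the stated ±2-3 range.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class Outcome:
    """One run's result; death_turn is None when the team survived to turn 50."""
    death_turn: Optional[int] = None

    @property
    def survived(self) -> bool:
        return self.death_turn is None


def within_bar(predicted: Outcome, real: Outcome, tolerance_bt: int = 3) -> bool:
    """True when a prediction meets both halves of the survival bar."""
    # Half 1: survive-vs-wipe classification must never be wrong.
    if predicted.survived != real.survived:
        return False
    # Both survived: there is no death turn to compare.
    if real.survived:
        return True
    # Half 2: both wiped, so the death turns must agree within the tolerance.
    return abs(predicted.death_turn - real.death_turn) <= tolerance_bt


def gate_passes(pairs: Iterable[Tuple[Outcome, Outcome]], tolerance_bt: int = 3) -> bool:
    """A gate in this sketch passes only if every (predicted, real) pair is in bar."""
    return all(within_bar(p, r, tolerance_bt) for p, r in pairs)
```

Under these assumptions the Force case described above (sim said T50, real wipe at T32) fails on the classification half alone: `within_bar(Outcome(None), Outcome(32))` returns `False`.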
#### Phase A — Survival-model hardening (the gate)
| Sub-goal | Status | Notes |
|---|---|---|
| A1 — Define survival accuracy bar | 🔴 | Separate from the damage ±5% bar: predicted death-turn within ±2-3 BT of real AND correct survive/wipe classification, across the whole fixture battery. |
| A2 — Capture a DIVERSE fixture battery | 🔴 (key-gated) | Need real runs (build snapshot now auto-saved per run): MEN stable-day survive-T50 (HAVE none clean+recent); MEN Force wipe-T32 (✅ HAVE `20260624_152252`); ≥1 non-UK archetype (counter / ally-protect / heal-tank) the roster can field; a buff-extension-dependent comp. Use `cb_watcher.py` to grab wipe cases without burning keys. **Rate limiter: 2 keys/day + must own/gear the archetype.** |
| A3 — Survival-diff harness | 🟡 | Productize the per-CB-turn sim-vs-real HP + coverage(UK/BD/shield/counter/protect) + death-turn diff we hand-rolled on the Force fixture → `tools/cb_survival_diff.py`. |
| A4 — Un-stack survival compensating wrongs TOGETHER | 🔴 | Against the full battery (mission rule): promote game-truth buff cadence (`bugfix_buff_tick=False`) from finder-only to global + re-derive damage calibration + re-baseline the 2 locked damage tests once. |
| A5 — Model missing survival mechanics (game-truth) | 🔴 | Ally-Protect redirect (currently OFF), counterattack damage magnitude+uptime (`CounterattackModifier −0.25` unwired), generalized buff-extension cadence, heal-vs-ramp sustain + heal caps. |
| **GATE A** | 🔴 | Survival sim reproduces EVERY battery fixture within the bar. Until met, survival recs are sanity-check only. |

#### Phase B — Survival-first comp finder (depends on Gate A)
| Sub-goal | Status | Notes |
|---|---|---|
| B1 — Coverage-chain enumerator | 🔴 | From the synergy graph + owned roster, enumerate comps that CAN form a survival pattern (UK-chain / BD+extension / counter-wall / ally-protect / heal-tank); tag archetype + providers. |
| B2 — Validate per comp via hardened sim | 🔴 | Survive-to-50 per affinity via the finder's MC; **drop HellHades from the scoring path** (`cb_team_explorer` currently scores with `tierlist.json` — mission violation to fix). |
| B3 — Auto-vs-manual classifier | 🔴 | Mark "automatic" only if the required skill order is preset-expressible; flag "manual" separately, never auto-recommended. |
| B4 — Rank + output best runs | 🔴 | Output: archetype, providers, per-affinity survival turn, damage, auto/manual. DWJ tunes referenced ONLY as "we independently found this too" divergence flag. New tool `tools/cb_comp_finder.py`. |
| **GATE B** | 🔴 | Finder independently re-derives ≥N known-good comps the roster can run AND surfaces ≥1 novel comp a real run confirms survives. |

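One way B1 and B4 could fit together is sketched below: enumerate teams whose combined roles cover an archetype, tag the providers, then rank scored results while keeping manual comps out of the recommended list. Everything here is illustrative. The hero names, the `PROVIDES` and `ARCHETYPES` tables, and the `Candidate`, `Scored`, `enumerate_comps`, and `rank` names are hypothetical stand-ins, not the real synergy graph or the planned `tools/cb_comp_finder.py`. Ranking by worst-affinity survival turn first and damage second is also an assumption.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Dict, Iterable, Iterator, List, Set, Tuple

# Hypothetical slice of a synergy graph: hero -> survival roles it provides.
PROVIDES: Dict[str, Set[str]] = {
    "hero_a": {"unkillable"},
    "hero_b": {"buff_extension"},
    "hero_c": {"block_damage"},
    "hero_d": {"counterattack"},
    "hero_e": {"ally_protect", "heal"},
    "hero_f": set(),
}

# Hypothetical archetype rules: every listed role must be covered by the team.
ARCHETYPES: Dict[str, Set[str]] = {
    "uk_chain": {"unkillable", "buff_extension"},
    "bd_extension": {"block_damage", "buff_extension"},
    "counter_wall": {"counterattack", "ally_protect"},
}


@dataclass(frozen=True)
class Candidate:
    team: Tuple[str, ...]
    archetype: str
    providers: Dict[str, List[str]]  # role -> heroes on the team providing it


def enumerate_comps(roster: Iterable[str], team_size: int = 5) -> Iterator[Candidate]:
    """Yield every team from the roster whose roles cover some archetype."""
    for team in combinations(sorted(roster), team_size):
        covered = set().union(*(PROVIDES.get(h, set()) for h in team))
        for archetype, needed in ARCHETYPES.items():
            if needed <= covered:
                providers = {
                    role: [h for h in team if role in PROVIDES.get(h, set())]
                    for role in sorted(needed)
                }
                yield Candidate(team, archetype, providers)


@dataclass(frozen=True)
class Scored:
    candidate: Candidate
    survival_turn: Dict[str, int]  # affinity -> survival turn from the sim
    damage: float
    manual: bool  # True when the skill order is not preset-expressible


def rank(scored: Iterable[Scored]) -> Tuple[List[Scored], List[Scored]]:
    """Return (recommended, manual_only), each sorted best-first."""
    def key(s: Scored) -> Tuple[int, float]:
        return (min(s.survival_turn.values()), s.damage)

    items = list(scored)
    auto = sorted((s for s in items if not s.manual), key=key, reverse=True)
    manual = sorted((s for s in items if s.manual), key=key, reverse=True)
    return auto, manual
```

Returning manual comps as a separate list, rather than interleaving them with a flag, is one simple way to guarantee they are never auto-recommended.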
#### Phase C — Damage-survive comps + speed-tune integration
| Sub-goal | Status | Notes |
|---|---|---|
| C1 — High-damage-and-survive archetype | 🔴 | Extend finder to counter/protect/heal comps that also push damage; needs the damage ±5% gate too (so Spirit regression must be closed). |
| C2 — Speed-tune integration | ✅ (ready) | `speed_tune_finder.py` already finds the SPD tuple that holds a chosen comp; trustworthy once Gate A lands. |

**Critical dependencies / risks:**
- **Key-gated & roster-gated:** Phase A needs ~5-8 diverse real captures; 2 keys/day, and we can only capture archetypes the user actually owns/gears. Mitigation: capture what the roster supports now; for unowned archetypes, model game-truth from skill data and verify on the closest available fixture.
- **Re-baselining the damage tests** when `bugfix_buff_tick` (`bbt`) goes global (A4) is calibration-sensitive — do it once, carefully, with the whole battery, never piecemeal.
- **Don't reintroduce external inputs:** B2 must remove HH from the recommendation path to stay mission-compliant.

**First concrete step (no key needed):** A3 — productize `cb_survival_diff.py` from the Force fixture we already have, so every future capture instantly yields a survival diff. Then A2 captures begin filling the battery.

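As a rough illustration of what a per-turn survival diff might compute, the sketch below lines up sim and real traces by boss turn and reports the HP delta, the coverage mismatches, and both death turns. The `TurnState` shape, the field names, and the coverage labels are assumptions made for this example. They do not describe the hand-rolled Force diff or the eventual `tools/cb_survival_diff.py`.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional


@dataclass(frozen=True)
class TurnState:
    """Hypothetical per-boss-turn snapshot of a run (sim or real)."""
    turn: int
    team_hp: float                           # summed team HP after the boss turn
    coverage: FrozenSet[str] = frozenset()   # active protections, e.g. {"UK", "BD"}


def death_turn(states: List[TurnState]) -> Optional[int]:
    """First turn with no HP left, or None if the team never wiped."""
    for state in states:
        if state.team_hp <= 0:
            return state.turn
    return None


def survival_diff(sim: List[TurnState], real: List[TurnState]) -> Dict[str, object]:
    """Compare sim and real traces turn by turn, over turns present in both."""
    real_by_turn = {state.turn: state for state in real}
    rows = []
    for s in sim:
        r = real_by_turn.get(s.turn)
        if r is None:
            continue  # real run has no sample for this turn
        rows.append({
            "turn": s.turn,
            "hp_delta": s.team_hp - r.team_hp,  # positive: sim is too optimistic
            "coverage_only_sim": sorted(s.coverage - r.coverage),
            "coverage_only_real": sorted(r.coverage - s.coverage),
        })
    return {
        "rows": rows,
        "sim_death_turn": death_turn(sim),
        "real_death_turn": death_turn(real),
    }
```

In this shape, the first row where `coverage_only_sim` is non-empty marks the turn at which the sim believed a protection was up that the real run did not have, which is the kind of over-prediction the Force fixture exposed.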
---

## Overall progress snapshot (last updated 2026-06-21)
