Commit 8c8a67d

ADR 0008 v0.4 amendment: dLM K/V Restoration architecture; reject 0010 + 0011

Reorganises the project's architectural record around a single canonical ADR (0008) per user directive 2026-06-08. Rejects the two parallel-track drafts (ADR 0010 NF4 KV quant; ADR 0011 cross-attention bridge) inline rather than as separate documents.

Changes:

* Title amended: now covers v0.3 GA scope (sections 1-10, unchanged) AND v0.4 GA architecture (new section 11).
* Status block: split v0.3 GA acceptance from v0.4 amendment acceptance. Decision drivers for the v0.4 amendment listed with empirical evidence pointers (sink_window_quality_ab_1780714635.json, R1e-gamma decisive datum).
* Rejects block: ADR 0010 and ADR 0011 drafts explicitly rejected with reasoning. Their tombstone branches and research record stay accessible but do not multiply into separate authoritative ADRs.
* Section 2.3: forward-pointer added making it clear that 'sink+window only' is the v0.3 GA scope. Section 11 supersedes the cache strategy clause for v0.4; everything else (session model, byte-exact contract, INV invariants, SessionStore) carries over unchanged.

New section 11 (391 lines):

  11.1 Scope of this amendment
  11.2 Reasoning chain leading here (sink+window measured loss + R1c-e falsification)
  11.3 The dLM proposer's no-cache property is the load-bearing fact. Documents that diffusion language models compute K/V transiently per forward (no persistent cache), making the proposer a constant-memory K/V reconstruction source.
  11.4 Five hard constraints v0.4 must satisfy: constant memory, zero intelligence regression, SD correctness contract preserved, no cross-attention bridge, fits Mac mini 24 GB.
  11.5 v0.4 architecture: dLM K/V Restoration. Verifier maintains minimal sink+window cache plus accepts transient K/V at evicted positions reconstructed from proposer's parallel forward. Diagram included.
  11.6 Cross-model K/V projection f_theta. Same-model trivially is identity; cross-model needs learned per-layer per-head projection. Tabulates structural advantages over R1c-e bridge (injection point, downstream consumption, source representation, training objective).
  11.7 Implementation phases K1-K5 with Linux CI gates and empirical gates on Mac M4 and vast NVIDIA. K1 (same-model toy) doable on Mac M4 alone; K3 (production) requires real long-context corpus.
  11.8 v0.4 GA validation criteria: NIAH recall >= 95% at 100k context, constant kv_live_bytes, INV-3 preserved, cross-platform parity, 4 h stability bench.
  11.9 Five open questions (layer alignment, sink+window aggressiveness, reject path amortisation, KakeyaLattice composition, multi-tenant scheduling).
  11.10 Why ADR 0010 and ADR 0011 drafts were specifically wrong (lessons recorded for future contributors).

Sibling closures: PR #65 (R1c), PR #66 (ADR 0010), PR #67 (R1d), PR #68 (R1e) already updated with [CLOSE] / [OBSOLETE] title prefixes; user clicks Close in GitHub UI.

Co-authored-by: FluffyAIcode <FluffyAIcode@users.noreply.github.com>
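As an illustration of the idea summarised in 11.5 and 11.6, the following is a minimal toy sketch, not the ADR's implementation. It assumes a cache that keeps only sink and window positions, a `reconstruct` callable standing in for the proposer's parallel forward, and a `project` callable standing in for f_theta (identity in the same-model case). All names (`SinkWindowCache`, `assemble_kv`, `toy_kv`) and the scalar K/V pairs are hypothetical stand-ins chosen for readability.

```python
from typing import Callable, Dict, List, Sequence, Tuple

KV = Tuple[float, float]  # toy stand-in for one position's (key, value) tensors


class SinkWindowCache:
    """Hypothetical cache: keeps the first n_sink and the last `window` positions."""

    def __init__(self, n_sink: int, window: int) -> None:
        self.n_sink = n_sink
        self.window = window
        self.live: Dict[int, KV] = {}  # persistent entries only
        self.length = 0                # total positions seen so far

    def append(self, kv: KV) -> None:
        pos = self.length
        self.live[pos] = kv
        self.length += 1
        # Evict the position that just left the window, unless it is a sink.
        stale = pos - self.window
        if stale >= self.n_sink:
            self.live.pop(stale, None)

    def evicted_positions(self) -> List[int]:
        return [p for p in range(self.length) if p not in self.live]


def assemble_kv(
    cache: SinkWindowCache,
    reconstruct: Callable[[Sequence[int]], List[KV]],
    project: Callable[[KV], KV] = lambda kv: kv,  # identity: same-model case
) -> List[KV]:
    """Build the full K/V sequence for one forward pass without storing it."""
    evicted = cache.evicted_positions()
    transient = reconstruct(evicted)  # recomputed on demand, never cached
    restored = {p: project(kv) for p, kv in zip(evicted, transient)}
    return [cache.live[p] if p in cache.live else restored[p]
            for p in range(cache.length)]


def toy_kv(pos: int) -> KV:
    """Deterministic toy K/V so that reconstruction can be checked exactly."""
    return (float(pos), 2.0 * pos)


cache = SinkWindowCache(n_sink=2, window=3)
for pos in range(10):
    cache.append(toy_kv(pos))

full = assemble_kv(cache, lambda positions: [toy_kv(p) for p in positions])
print(sorted(cache.live))          # [0, 1, 7, 8, 9]
print(len(full), len(cache.live))  # 10 5
```

In this toy the number of live entries never exceeds `n_sink + window`, regardless of sequence length, while `assemble_kv` still yields an entry for every position. A cross-model setup would pass a non-identity `project`, which in the ADR's terms would be a learned per-layer, per-head mapping rather than the plain callable assumed here.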

1 parent b6fdec4 commit 8c8a67d

1 file changed

docs/adr
- 0008-session-bound-runtime-and-grpc-protocol.md
