Commit 8c8a67d
ADR 0008 v0.4 amendment: dLM K/V Restoration architecture; reject 0010 + 0011
Reorganises the project's architectural record around a single canonical ADR
(0008) per user directive 2026-06-08. Rejects the two parallel-track drafts
(ADR 0010 NF4 KV quant; ADR 0011 cross-attention bridge) inline rather than
as separate documents.
Changes:
* Title amended: now covers v0.3 GA scope (sections 1-10, unchanged) AND v0.4
GA architecture (new section 11).
* Status block: split v0.3 GA acceptance from v0.4 amendment acceptance.
Decision drivers for the v0.4 amendment listed with empirical evidence
pointers (sink_window_quality_ab_1780714635.json, R1e-gamma decisive datum).
* Rejects block: ADR 0010 and ADR 0011 drafts explicitly rejected with
reasoning. Their tombstone branches and research record stay accessible but
do not multiply into separate authoritative ADRs.
* Section 2.3: forward-pointer added making it clear that 'sink+window only'
is the v0.3 GA scope. Section 11 supersedes the cache strategy clause for
v0.4; everything else (session model, byte-exact contract, INV invariants,
SessionStore) carries over unchanged.
New section 11 (391 lines):
11.1 Scope of this amendment
11.2 Reasoning chain leading here (sink+window measured loss + R1c-e
falsification)
11.3 The dLM proposer's no-cache property is the load-bearing fact.
Documents that diffusion language models compute K/V transiently
per forward (no persistent cache), making the proposer a constant-
memory K/V reconstruction source.
11.4 Five hard constraints v0.4 must satisfy: constant memory, zero
intelligence regression, SD correctness contract preserved, no
cross-attention bridge, fits Mac mini 24 GB.
11.5 v0.4 architecture: dLM K/V Restoration. The verifier maintains a
minimal sink+window cache and accepts transient K/V at evicted positions,
reconstructed from the proposer's parallel forward pass. Diagram included.
11.6 Cross-model K/V projection f_theta. In the same-model case f_theta is
trivially the identity; the cross-model case needs a learned per-layer,
per-head projection.
Tabulates structural advantages over R1c-e bridge (injection point,
downstream consumption, source representation, training objective).
11.7 Implementation phases K1-K5 with Linux CI gates and empirical gates
on Mac M4 and vast NVIDIA. K1 (same-model toy) doable on Mac M4
alone; K3 (production) requires real long-context corpus.
11.8 v0.4 GA validation criteria: NIAH recall >= 95% at 100k context,
constant kv_live_bytes, INV-3 preserved, cross-platform parity,
4 h stability bench.
11.9 Five open questions (layer alignment, sink+window aggressiveness,
reject path amortisation, KakeyaLattice composition, multi-tenant
scheduling).
11.10 Why ADR 0010 and ADR 0011 drafts were specifically wrong
(lessons recorded for future contributors).
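For orientation, the split described in 11.5 between cached positions and
positions that need transient K/V can be sketched as below. This is a minimal
illustration and not the ADR's implementation: the function name
split_positions, the parameters n_sink and n_window, and the example sizes are
all hypothetical.

```python
from typing import List, Tuple


def split_positions(seq_len: int, n_sink: int, n_window: int) -> Tuple[List[int], List[int]]:
    """Split token positions into (kept, evicted) for a sink+window cache.

    Hypothetical helper: 'kept' positions stay in the verifier's persistent
    cache; 'evicted' positions are the ones whose K/V would have to be
    supplied transiently (e.g. reconstructed from the proposer's forward).
    """
    if seq_len < 0 or n_sink < 0 or n_window < 0:
        raise ValueError("seq_len, n_sink and n_window must be non-negative")

    # The sink is the first n_sink positions (or fewer for short sequences).
    sink_end = min(n_sink, seq_len)
    # The window is the last n_window positions, never overlapping the sink.
    window_start = max(sink_end, seq_len - n_window)

    kept = list(range(sink_end)) + list(range(window_start, seq_len))
    evicted = list(range(sink_end, window_start))
    return kept, evicted


if __name__ == "__main__":
    kept, evicted = split_positions(seq_len=10, n_sink=2, n_window=3)
    print("kept:", kept)
    print("evicted:", evicted)
```

The number of kept positions never exceeds n_sink + n_window, whatever the
sequence length, which is the sense in which the persistent cache stays
constant in size while the evicted range grows with context.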
Sibling closures: PR #65 (R1c), PR #66 (ADR 0010), PR #67 (R1d), PR #68 (R1e)
already updated with [CLOSE] / [OBSOLETE] title prefixes; user clicks Close
in GitHub UI.
Co-authored-by: FluffyAIcode <FluffyAIcode@users.noreply.github.com>
1 parent b6fdec4
1 file changed
Lines changed: 385 additions & 6 deletions