Commit 8401e76

ADR 0010: replace draft with WITHDRAWN tombstone — superseded by KakeyaLattice

Original draft proposed NF4/INT8 KV cache quantization as the v0.4 GA path. Withdrawn the same day because KakeyaLattice (github.com/FluffyAIcode/LLM-KV--Cache-compress, v1.4 D4 / v1.5 E8) already exists, is strictly better on every metric this draft cared about, beats Google TurboQuant 12/12 on real vLLM+H200, and ships as 'pip install kakeyalattice' with a transformers.DynamicCache drop-in.

Recommending NF4 as v0.4 GA while KakeyaLattice already existed was a regression from project strength to literature baseline. This tombstone makes the obsolete status visible at file level.

Original draft text is recoverable from git history (parent commit) for reference; not preserved inline because lingering rejected designs create direction ambiguity.

PR #66 should be CLOSED on GitHub UI (gh CLI is read-only in this agent environment, so the close action requires manual click).

Replaced by: ADR 0012 (planned), KakeyaLattice KV codec integration.

Co-authored-by: FluffyAIcode <FluffyAIcode@users.noreply.github.com>

1 parent 9407044 commit 8401e76

1 file changed

docs/adr
- 0010-full-attention-low-precision-kv.md
