Commit 8401e76
ADR 0010: replace draft with WITHDRAWN tombstone — superseded by KakeyaLattice
Original draft proposed NF4/INT8 KV cache quantization as the v0.4 GA
path. Withdrawn the same day because KakeyaLattice
(github.com/FluffyAIcode/LLM-KV--Cache-compress, v1.4 D4 / v1.5 E8)
already exists, is strictly better on every metric this draft cared
about, beats Google TurboQuant 12/12 on real vLLM+H200, and ships as
'pip install kakeyalattice' with a transformers.DynamicCache drop-in.

Recommending NF4 as v0.4 GA while KakeyaLattice already existed was a
regression from project strength to literature baseline. This tombstone
makes the obsolete status visible at file level. Original draft text is
recoverable from git history (parent commit) for reference; it is not
preserved inline because lingering rejected designs create direction
ambiguity.

PR #66 should be CLOSED in the GitHub UI (gh CLI is read-only in this
agent environment, so the close action requires a manual click).
Replaced by: ADR 0012 (planned) — KakeyaLattice KV codec integration.
Co-authored-by: FluffyAIcode <FluffyAIcode@users.noreply.github.com>
1 parent 9407044 commit 8401e76
1 file changed
Lines changed: 61 additions & 351 deletions