@@ -1346,18 +1346,22 @@ \section{Conclusion}
13461346
13471347\bibitem {kakeya-v14-release}
13481348Li, A.
1349- \newblock \textsc {KakeyaLattice}~v1.4: the canonical implementation.
1350- \newblock Open-source release, \emph {LLM-KV--Cache-compress }, tag \texttt {v1.4 }, April 2026.
1351- \newblock Python class \texttt {kakeyaturbo\_ py.V14KakeyaZamirLatticeGPU }; multi-model
1352- measurement harness \texttt {benchmarks/multimodel\_ v14\_ vs\_ tq.py }; four
1353- per-architecture snapshot hooks in
1354- \texttt {vllm\_ backend/kakeya\_ v1\_ 3\_ ppl/snapshot\_ hook.py }.
1349+ \newblock \textsc {KakeyaLattice}: the canonical implementation.
1350+ \newblock Open-source release, \emph {LLM-KV--Cache-compress },
1351+ April 2026.
1352+ \newblock Python package \texttt {kakeyalattice } (classes
1353+ \texttt {V14KakeyaZamirLatticeGPU } and
1354+ \texttt {V15KakeyaZamirE8GPU }); multi-model measurement harness
1355+ \texttt {benchmarks/rigorous\_ eval.py }; four per-architecture
1356+ attention-module patches in
1357+ \texttt {vllm\_ backend/kakeya\_ v1\_ 4\_ snapshot/snapshot\_ hook.py }.
1358+ Release tags: \texttt {v1.4 } (first $ D_4 $ release, commit
1359+ \texttt {6b02711 }) and \texttt {v1.5 } (first $ E_8 $ release).
13551360
13561361\bibitem {kakeya-v13-paper}
13571362Li, A.
13581363\newblock {Randomized Kakeya Skeletons for LLM KV Cache Compression: Algorithm and Rate--Distortion Boundary.}
1359- \newblock \emph {LLM-KV--Cache-compress v1.3 paper }, April 2026.
1360- \newblock (Superseded by this paper at tag \texttt {v1.4 }.)
1364+ \newblock Prior unpublished draft, superseded by this paper.
13611365
13621366\bibitem {turboquant-vllm}
13631367vibhavagarwal5 \emph {et al. }
@@ -1457,23 +1461,27 @@ \section{Conclusion}
14571461\section {Reproducibility manifest }
14581462\label {app:repro }
14591463
1460- The full multi-model benchmark for \emph {both codecs } is
1461- reproducible via the rigorous evaluation harness introduced with
1462- the v1.5 release:
1464+ The full multi-model benchmark for both lattice variants uses the
1465+ in-forward rigorous evaluation harness
1466+ (\S \ref {sec:methodology-rigorous }). The snapshot-protocol $ D_4 $
1467+ tables in \S \ref {sec:benchmarks } are reproducible by the same
1468+ harness with \texttt {--mode snapshot --boundary-size 2 --n-passages 4 }:
14631469\begin {verbatim }
14641470cd LLM-KV--Cache-compress
14651471pip install -e kakeyalattice
14661472pip install -e vllm_backend
14671473export VLLM_ENABLE_V1_MULTIPROCESSING=0 KAKEYA_SNAPSHOT_QWEN3=1
14681474
1469- # PPL / MSE / CR (v1.4 + v1.5 + TurboQuant at matched Q / b):
1475+ # In-forward rigorous (n=32, 95% CI): D4, E8, and TurboQuant, same run.
1476+ # --q-values selects D4 operating points; --v15-q-values selects E8
1477+ # operating points; --tq-b-values selects TurboQuant bit widths.
14701478python benchmarks/rigorous_eval.py \
14711479 --model-path <HF-id> --model-name <short>_nobdry \
14721480 --mode inforward --no-boundary \
14731481 --q-values 4,10 --v15-q-values 4,10 --tq-b-values 3 \
14741482 --kv-modes KV \
14751483 --ctx-len 2048 --n-eval 64 --n-passages 32 \
1476- --out-dir reports/v1_5_release
1484+ --out-dir reports/rigorous_eval
14771485
14781486# TurboQuant b=2 guardrail (requires boundary=2 to boot):
14791487python benchmarks/rigorous_eval.py \
@@ -1482,7 +1490,7 @@ \section{Reproducibility manifest}
14821490 --q-values "" --v15-q-values "" --tq-b-values 2 \
14831491 --kv-modes KV \
14841492 --ctx-len 2048 --n-eval 64 --n-passages 32 \
1485- --out-dir reports/v1_5_release
1493+ --out-dir reports/rigorous_eval
14861494
14871495# Pure codec latency (no model needed):
14881496python benchmarks/e8_latency_benchmark.py --n-iters 500
@@ -1493,7 +1501,7 @@ \section{Reproducibility manifest}
14931501 --mode inforward --boundary-size 2 --n-trials 3 \
14941502 --ctx-lengths 4096,8192,16384 --depths 0.1,0.5,0.9 \
14951503 --q-values 4,10 --v15-q-values 4,10 --tq-b-values 2,3 \
1496- --out-dir reports/v1_5_release/niah
1504+ --out-dir reports/rigorous_eval/niah
14971505
14981506# Frozen sha256 parity (bit-level regression gate):
14991507python benchmarks/e8_parity_and_smoke.py
@@ -1503,35 +1511,43 @@ \section{Reproducibility manifest}
15031511\texttt {deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B },
15041512\texttt {google/gemma-4-E4B }, or
15051513\texttt {zai-org/GLM-4-9B-Chat } (add \texttt {--trust-remote-code }
1506- for the last). Raw per-passage JSON and full stdout logs are
1507- committed under
1508- \texttt {reports/v1\_ 5\_ release/ } (v1.5 data) and
1509- \texttt {reports/v1\_ 4\_ release/ } (v1.4 frozen data).
1510-
1511- \section {Canonical naming }
1514+ for the last). Raw per-passage JSON, full stdout logs, and frozen
1515+ codec-output hashes are committed under
1516+ \texttt {reports/ } at the release tags listed in
1517+ Appendix~\ref {app:naming }. The CLI flag name
1518+ \texttt {--v15-q-values } is a verbatim repository identifier of the
1519+ $ E_8 $ -variant codec registered at release tag \texttt {v1.5 };
1520+ similarly, directory names \texttt {reports/v1\_ 4\_ release/ } and
1521+ \texttt {reports/v1\_ 5\_ release/ } are on-disk paths that match the
1522+ corresponding release tags.
1523+
1524+ \section {Implementation identifiers }
15121525\label {app:naming }
15131526
1527+ The paper is agnostic to version labelling: the two codec variants
1528+ are the \emph {$ D_4 $ nested lattice } and the \emph {$ E_8 $ nested
1529+ lattice }, and every result table cites its protocol. This appendix
1530+ lists only the repository-level identifiers needed to reproduce the
1531+ bit-identical codec output.
1532+
15141533\begin {itemize }[leftmargin=*]
1515- \item \textbf {Project name }: \textsc {KakeyaLattice}
1516- \item \textbf {Python package }: \texttt {kakeyalattice }
1517- \item \textbf {v1.4 codec ($ D_4 $ ) }:
1518- \begin {itemize }
1519- \item spoken / written: `` v1.4 kakeya zamir lattice GPU''
1520- \item with parameter: e.g.\ `` v1.4 $ Q=152 $ ''
1521- \item class: \texttt {V14KakeyaZamirLatticeGPU }
1522- \item module: \texttt {kakeyalattice.v1\_ 4\_ kakeya\_ zamir\_ lattice\_ gpu }
1523- \item release tag: \texttt {v1.4 } (commit \texttt {6b02711 })
1524- \end {itemize }
1525- \item \textbf {v1.5 codec ($ E_8 $ ) }:
1526- \begin {itemize }
1527- \item spoken / written: `` v1.5 kakeya zamir E8 GPU''
1528- \item with parameter: e.g.\ `` v1.5 $ Q=10 $ ''
1529- \item class: \texttt {V15KakeyaZamirE8GPU }
1530- \item module: \texttt {kakeyalattice.v1\_ 5\_ kakeya\_ zamir\_ e8\_ gpu }
1531- \item release tag: \texttt {v1.5 } (commit at the time of paper release)
1532- \end {itemize }
1534+ \item Project name: \textsc {KakeyaLattice}.
1535+ \item Python package: \texttt {kakeyalattice }.
1536+ \item $ D_4 $ variant: class
1537+ \texttt {V14KakeyaZamirLatticeGPU }, module
1538+ \texttt {kakeyalattice.v1\_ 4\_ kakeya\_ zamir\_ lattice\_ gpu },
1539+ release tag \texttt {v1.4 } (commit \texttt {6b02711 }).
1540+ \item $ E_8 $ variant: class
1541+ \texttt {V15KakeyaZamirE8GPU }, module
1542+ \texttt {kakeyalattice.v1\_ 5\_ kakeya\_ zamir\_ e8\_ gpu },
1543+ release tag \texttt {v1.5 }.
15331544\end {itemize }
15341545
1546+ The \texttt {v1.4 } / \texttt {v1.5 } strings appear only as git release
1547+ tags and as the \texttt {V14 }/\texttt {V15 } class-name prefixes chosen
1548+ by the repository authors; the paper itself references the codec
1549+ variants exclusively by their lattice names.
1550+
15351551\section {Canonical operating points }
15361552\label {app:ops }
15371553