Commit 4787334

Merge PR #44: paper — pass-2 consistency (purge version labels from App A/B/bib)
paper: pass-2 consistency — purge version labels from App A/B/bib
2 parents 9d4e9a2 + ea8a2f6 commit 4787334

2 files changed

Lines changed: 55 additions & 39 deletions

reports/paper/kakeyalattice.pdf

932 Bytes
Binary file not shown.

reports/paper/kakeyalattice.tex

Lines changed: 55 additions & 39 deletions
@@ -1346,18 +1346,22 @@ \section{Conclusion}
13461346

13471347
\bibitem{kakeya-v14-release}
13481348
Li, A.
1349-
\newblock \textsc{KakeyaLattice}~v1.4: the canonical implementation.
1350-
\newblock Open-source release, \emph{LLM-KV--Cache-compress}, tag \texttt{v1.4}, April 2026.
1351-
\newblock Python class \texttt{kakeyaturbo\_py.V14KakeyaZamirLatticeGPU}; multi-model
1352-
measurement harness \texttt{benchmarks/multimodel\_v14\_vs\_tq.py}; four
1353-
per-architecture snapshot hooks in
1354-
\texttt{vllm\_backend/kakeya\_v1\_3\_ppl/snapshot\_hook.py}.
1349+
\newblock \textsc{KakeyaLattice}: the canonical implementation.
1350+
\newblock Open-source release, \emph{LLM-KV--Cache-compress},
1351+
April 2026.
1352+
\newblock Python package \texttt{kakeyalattice} (classes
1353+
\texttt{V14KakeyaZamirLatticeGPU} and
1354+
\texttt{V15KakeyaZamirE8GPU}); multi-model measurement harness
1355+
\texttt{benchmarks/rigorous\_eval.py}; four per-architecture
1356+
attention-module patches in
1357+
\texttt{vllm\_backend/kakeya\_v1\_4\_snapshot/snapshot\_hook.py}.
1358+
Release tags: \texttt{v1.4} (first $D_4$ release, commit
1359+
\texttt{6b02711}) and \texttt{v1.5} (first $E_8$ release).
13551360

13561361
\bibitem{kakeya-v13-paper}
13571362
Li, A.
13581363
\newblock {Randomized Kakeya Skeletons for LLM KV Cache Compression: Algorithm and Rate--Distortion Boundary.}
1359-
\newblock \emph{LLM-KV--Cache-compress v1.3 paper}, April 2026.
1360-
\newblock (Superseded by this paper at tag \texttt{v1.4}.)
1364+
\newblock Prior unpublished draft, superseded by this paper.
13611365

13621366
\bibitem{turboquant-vllm}
13631367
vibhavagarwal5 \emph{et al.}
@@ -1457,23 +1461,27 @@ \section{Conclusion}
14571461
\section{Reproducibility manifest}
14581462
\label{app:repro}
14591463

1460-
The full multi-model benchmark for \emph{both codecs} is
1461-
reproducible via the rigorous evaluation harness introduced with
1462-
the v1.5 release:
1464+
The full multi-model benchmark for both lattice variants uses the
1465+
in-forward rigorous evaluation harness
1466+
(\S\ref{sec:methodology-rigorous}). The snapshot-protocol $D_4$
1467+
tables in \S\ref{sec:benchmarks} are reproducible by the same
1468+
harness with \texttt{--mode snapshot --boundary-size 2 --n-passages 4}:
14631469
\begin{verbatim}
14641470
cd LLM-KV--Cache-compress
14651471
pip install -e kakeyalattice
14661472
pip install -e vllm_backend
14671473
export VLLM_ENABLE_V1_MULTIPROCESSING=0 KAKEYA_SNAPSHOT_QWEN3=1
14681474
1469-
# PPL / MSE / CR (v1.4 + v1.5 + TurboQuant at matched Q / b):
1475+
# In-forward rigorous (n=32, 95% CI): D4, E8, and TurboQuant, same run.
1476+
# --q-values selects D4 operating points; --v15-q-values selects E8
1477+
# operating points; --tq-b-values selects TurboQuant bit widths.
14701478
python benchmarks/rigorous_eval.py \
14711479
--model-path <HF-id> --model-name <short>_nobdry \
14721480
--mode inforward --no-boundary \
14731481
--q-values 4,10 --v15-q-values 4,10 --tq-b-values 3 \
14741482
--kv-modes KV \
14751483
--ctx-len 2048 --n-eval 64 --n-passages 32 \
1476-
--out-dir reports/v1_5_release
1484+
--out-dir reports/rigorous_eval
14771485
14781486
# TurboQuant b=2 guardrail (requires boundary=2 to boot):
14791487
python benchmarks/rigorous_eval.py \
@@ -1482,7 +1490,7 @@ \section{Reproducibility manifest}
14821490
--q-values "" --v15-q-values "" --tq-b-values 2 \
14831491
--kv-modes KV \
14841492
--ctx-len 2048 --n-eval 64 --n-passages 32 \
1485-
--out-dir reports/v1_5_release
1493+
--out-dir reports/rigorous_eval
14861494
14871495
# Pure codec latency (no model needed):
14881496
python benchmarks/e8_latency_benchmark.py --n-iters 500
@@ -1493,7 +1501,7 @@ \section{Reproducibility manifest}
14931501
--mode inforward --boundary-size 2 --n-trials 3 \
14941502
--ctx-lengths 4096,8192,16384 --depths 0.1,0.5,0.9 \
14951503
--q-values 4,10 --v15-q-values 4,10 --tq-b-values 2,3 \
1496-
--out-dir reports/v1_5_release/niah
1504+
--out-dir reports/rigorous_eval/niah
14971505
14981506
# Frozen sha256 parity (bit-level regression gate):
14991507
python benchmarks/e8_parity_and_smoke.py
@@ -1503,35 +1511,43 @@ \section{Reproducibility manifest}
15031511
\texttt{deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B},
15041512
\texttt{google/gemma-4-E4B}, or
15051513
\texttt{zai-org/GLM-4-9B-Chat} (add \texttt{--trust-remote-code}
1506-
for the last). Raw per-passage JSON and full stdout logs are
1507-
committed under
1508-
\texttt{reports/v1\_5\_release/} (v1.5 data) and
1509-
\texttt{reports/v1\_4\_release/} (v1.4 frozen data).
1510-
1511-
\section{Canonical naming}
1514+
for the last). Raw per-passage JSON, full stdout logs, and frozen
1515+
codec-output hashes are committed under
1516+
\texttt{reports/} at the release tags listed in
1517+
Appendix~\ref{app:naming}. The CLI flag name
1518+
\texttt{--v15-q-values} is a verbatim repository identifier of the
1519+
$E_8$-variant codec registered at release tag \texttt{v1.5};
1520+
similarly, directory names \texttt{reports/v1\_4\_release/} and
1521+
\texttt{reports/v1\_5\_release/} are on-disk paths that match the
1522+
corresponding release tags.
1523+
1524+
\section{Implementation identifiers}
15121525
\label{app:naming}
15131526

1527+
The paper is agnostic to version labelling: the two codec variants
1528+
are the \emph{$D_4$ nested lattice} and the \emph{$E_8$ nested
1529+
lattice}, and every result table cites its protocol. This appendix
1530+
lists only the repository-level identifiers needed to reproduce the
1531+
bit-identical codec output.
1532+
15141533
\begin{itemize}[leftmargin=*]
1515-
\item \textbf{Project name}: \textsc{KakeyaLattice}
1516-
\item \textbf{Python package}: \texttt{kakeyalattice}
1517-
\item \textbf{v1.4 codec ($D_4$)}:
1518-
\begin{itemize}
1519-
\item spoken / written: ``v1.4 kakeya zamir lattice GPU''
1520-
\item with parameter: e.g.\ ``v1.4 $Q=152$''
1521-
\item class: \texttt{V14KakeyaZamirLatticeGPU}
1522-
\item module: \texttt{kakeyalattice.v1\_4\_kakeya\_zamir\_lattice\_gpu}
1523-
\item release tag: \texttt{v1.4} (commit \texttt{6b02711})
1524-
\end{itemize}
1525-
\item \textbf{v1.5 codec ($E_8$)}:
1526-
\begin{itemize}
1527-
\item spoken / written: ``v1.5 kakeya zamir E8 GPU''
1528-
\item with parameter: e.g.\ ``v1.5 $Q=10$''
1529-
\item class: \texttt{V15KakeyaZamirE8GPU}
1530-
\item module: \texttt{kakeyalattice.v1\_5\_kakeya\_zamir\_e8\_gpu}
1531-
\item release tag: \texttt{v1.5} (commit at the time of paper release)
1532-
\end{itemize}
1534+
\item Project name: \textsc{KakeyaLattice}.
1535+
\item Python package: \texttt{kakeyalattice}.
1536+
\item $D_4$ variant: class
1537+
\texttt{V14KakeyaZamirLatticeGPU}, module
1538+
\texttt{kakeyalattice.v1\_4\_kakeya\_zamir\_lattice\_gpu},
1539+
release tag \texttt{v1.4} (commit \texttt{6b02711}).
1540+
\item $E_8$ variant: class
1541+
\texttt{V15KakeyaZamirE8GPU}, module
1542+
\texttt{kakeyalattice.v1\_5\_kakeya\_zamir\_e8\_gpu},
1543+
release tag \texttt{v1.5}.
15331544
\end{itemize}
15341545

1546+
The \texttt{v1.4} / \texttt{v1.5} strings appear only as git release
1547+
tags and as the \texttt{V14}/\texttt{V15} class-name prefixes chosen
1548+
by the repository authors; the paper itself references the codec
1549+
variants exclusively by their lattice names.
1550+
15351551
\section{Canonical operating points}
15361552
\label{app:ops}
15371553
