Preserve the K1.E Mac postscript already on main and keep the K2 KakeyaLattice amendment from this branch.
Co-authored-by: Cursor <cursoragent@cursor.com>
The v0.4 GA gate (a) of §11.8, "NIAH mid-context recall ≥ 95 % at 100k-token context", has been **empirically verified at the K1 same-model identity scope** on a Mac M4 with 24 GB, using `google/gemma-3-1b-it`. The 100k-token claim itself is still pending a multi-context scan on vast.ai (feasible only on a GPU, because the full-attention oracle's KV cache alone needs ~10 GB at 100k tokens); this Mac result establishes that the architecture works end-to-end in the 1-2k-token context regime.
Configuration: `n_samples=20`, `haystack_min_lines=60`, `haystack_max_lines=80`, `seed=42`. Prompt token length distribution: min 1234, max 1634, mean 1428 (≈ 1.4 k tokens).
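For readers unfamiliar with how such a configuration maps onto data, the following is a minimal sketch of a seeded needle-in-a-haystack generator. It is not the repository's actual dataset builder: the function names, the filler text, the question wording, and the middle-third placement rule are all assumptions made for illustration. Only the parameter values (`n_samples=20`, 60 to 80 haystack lines, `seed=42`) and the `WORD-dddd` code shape are taken from this document.

```python
import random

# Hypothetical code-word pool; the words mirror prefixes seen in the result logs.
CODE_WORDS = ["ALPHA", "BETA", "ETA", "IOTA", "KAPPA", "OAK"]


def make_niah_sample(rng, haystack_min_lines=60, haystack_max_lines=80):
    """Build one sample: filler lines with a single needle placed mid-context."""
    n_lines = rng.randint(haystack_min_lines, haystack_max_lines)
    answer = f"{rng.choice(CODE_WORDS)}-{rng.randint(1000, 9999)}"
    lines = [
        f"Filler sentence number {i} carries no useful information."
        for i in range(n_lines)
    ]
    # Assumed placement rule: insert the needle somewhere in the middle third.
    pos = rng.randint(n_lines // 3, 2 * n_lines // 3)
    lines.insert(pos, f"The secret code is {answer}.")
    prompt = "\n".join(lines) + "\n\nWhat is the secret code?"
    return {"prompt": prompt, "answer": answer, "needle_line": pos}


def make_dataset(n_samples=20, seed=42, **kwargs):
    """Generate a reproducible dataset from a single seeded RNG."""
    rng = random.Random(seed)
    return [make_niah_sample(rng, **kwargs) for _ in range(n_samples)]


dataset = make_dataset()
```

Driving every random choice from one `random.Random(seed)` instance is what makes two runs with the same seed produce identical samples, which matters when two evaluations are compared against each other.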
Gate predicates all `True`:
- `v04_vs_oracle_delta = 0.0` (v0.4 matches the oracle exactly on these 20 samples)
Evidence: [`results/research/k1e_niah_1780909617.json`](../../results/research/k1e_niah_1780909617.json) and accompanying log under `results/research/logs/`. Reproducible from main via `bash scripts/review_pr_k1e_on_mac.sh`.
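As a rough illustration of how a recall gate and the oracle delta can be computed from raw generations, here is a minimal sketch. It assumes substring matching of the expected code against each output; the actual scoring rule in the evaluation script may differ. The function names and every dictionary key other than `v04_vs_oracle_delta` are hypothetical.

```python
def niah_recall(outputs, answers):
    """Fraction of generations that contain the expected needle as a substring."""
    hits = sum(1 for out, ans in zip(outputs, answers) if ans in out)
    return hits / len(answers)


def gate_report(v04_outputs, oracle_outputs, answers, threshold=0.95):
    """Compare v0.4 recall with the full-attention oracle and check the gate."""
    v04 = niah_recall(v04_outputs, answers)
    oracle = niah_recall(oracle_outputs, answers)
    return {
        "v04_recall": v04,                    # hypothetical key
        "oracle_recall": oracle,              # hypothetical key
        "v04_vs_oracle_delta": v04 - oracle,  # key name taken from this document
        "v04_meets_threshold": v04 >= threshold,  # hypothetical key
    }


# Illustrative strings only; trailing junk after the code does not affect a substring match.
answers = ["KAPPA-5333", "IOTA-3045"]
generations = ["The secret code is KAPPA-5333.jero.", "IOTA-3045\n iStock\n"]
report = gate_report(generations, generations, answers)
```

Under substring matching, a generation such as `"The secret code is: **SOS**"` would count as a miss for any `WORD-dddd` answer, while an output that states the code and then degenerates would still count as a hit.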
### Why v0.3 went to 0.000 here vs 0.167 in the 2026-06-06 A/B benchmark
The two evaluations disagree on the v0.3 baseline (16.7 % vs 0 %). They are not contradictory; they differ in dataset construction:
"The secret code is KAPPA-5333.jero.\u0bae\u0ba3\u0bbf\u0bae\u0ba3\u0bbf\u0bae\u0ba3\u0bbf\u0bae\u0ba3\u0bbf",
"IOTA-3045\n\u06cc iStock\n\n\n\n\n",
"IOTA-5803kanzaki.\n\n\n\n\n",
"The secret code is KAPPA-6925.Users are encouraged to keep this secret.Apex",
"The secret code is ALPHA-4733.Users are given a series of paragraphs and asked to identify the",
"BETA-4814\n\u06cc understand. Nehru\n\n\n\n",
"ETA-5554\n\u06ccuriApex Climber-",
"The secret code is OAK-6977.jero.\u0bae\u0ba3\u0bbf.\u0ba8\u0bbf\u0bb2\u0bc8\u0baf\u0bbf\u0bb2\u0bcd.\u0ba8\u0bbf\u0bb2\u0bc8\u0baf\u0bbf\u0bb2\u0bcd",
"Okay, let's analyze the image.\n\nThe secret code is: **\"Hello, World!\"**\n\nLet",
"Okay, let's crack this code!\n\nThe secret code is: **\"The quick brown fox jumps over the",
"Okay, let's crack this!\n\nThe secret code is: **SOS**\n\nLet me know if you'",
"Okay, let's analyze the image and figure out the secret code.\n\nThe image shows a series of dots and",
"Okay, let's play a game!\n\nThe secret code is: **741**\n\nLet me know",
"Okay, let's break down the image and figure out the secret code.\n\nThe image shows a series of dots",
"Okay, let's analyze the image and try to decipher the secret code.\n\nThe image shows a series of dots",
"I cannot provide you with a secret code. My purpose is to be helpful and harmless, and that includes protecting people from",
"Okay, let's analyze the image and try to decipher the secret code.\n\nThe image shows a series of dots",
"The secret code is: **SOS**\n\u0938\u0941\u0928\u0947\u0442\u043e, \u044f\u043d\u0435\u0437\u043d\u0430\u044e, \u0447\u0442\u043e\u044d\u0442\u043e\u0437\u043d\u0430\u0447\u0438\u0442.\n",
"Okay, let\u2019s play a game!\n\nThe secret code is: **741**\n\nLet me know",
"The secret code is \u201cSOS\u201d.\nyer.",
"The secret code is \u201cSOS\u201d.IDO",
"I cannot provide you with a secret code. My purpose is to be helpful and harmless, and that includes protecting people from",
"Okay, let's analyze the image.\n\nThe secret code is: **\"Hello, World!\"**\n\nLet",
"Okay, let\u2019s play a game!\n\nThe secret code is: **741**\n\nLet me know",
"Okay, let's crack this code!\n\nThe secret code is: **\"The quick brown fox jumps over the",
"Okay, let's crack this!\n\nThe secret code is: **\"The quick brown fox jumps over the lazy",
"Okay, let's analyze the image.\n\nThe secret code is: **\"Hello, World!\"**\n\nLet",
"The secret code is KAPPA-5333.jero.\u0bae\u0ba3\u0bbf\u0bae\u0ba3\u0bbf\u0bae\u0ba3\u0bbf\u0bae\u0ba3\u0bbf",
"IOTA-3045\n\u06cc iStock\n\n\n\n\n",
"IOTA-5803kanzaki.\n\n\n\n\n",
"The secret code is KAPPA-6925.Users are encouraged to keep this secret.Apex",
"The secret code is ALPHA-4733.Users are given a series of paragraphs and asked to identify the",
"BETA-4814\n\u06cc understand. Nehru\n\n\n\n",
"ETA-5554\n\u06ccuriApex Climber-",
"The secret code is OAK-6977.jero.\u0bae\u0ba3\u0bbf.\u0ba8\u0bbf\u0bb2\u0bc8\u0baf\u0bbf\u0bb2\u0bcd.\u0ba8\u0bbf\u0bb2\u0bc8\u0baf\u0bbf\u0bb2\u0bcd",