Commit 82adc64
fix(kvcache): trace-faithful PositionalKVCache.update (SKaiNET#763)

PositionalKVCache.update copied the incoming K/V tensor *data* into a heap FloatArray
(writeAt) and read it back via ctx.fromData (sliceView), bypassing ctx.ops. Under tracing
the symbolic K/V carry no data, so toComputeGraph(embedConstants=true) baked an all-zero
KV buffer as stablehlo.constant and disconnected the computed k_proj/v_proj — the exported
decoder then attended over K=V=0. In eager inference the buffer holds real data, so the bug
was invisible there; in export it was masked in plain decoders by the unnormalized FFN
dominating the residual stream, and exposed by Gemma's sandwichNorms (post_ffw_norm
normalizes the FFN, so the lost attention becomes significant: ~1.4x/block logit error).

Fix: when ctx.isRecording, wire K/V functionally through ops.concat (the same history
AppendKVCache already uses) instead of the raw heap buffer, so the StableHLO export carries
the real projections. The eager fast-path (heap buffer) is unchanged. As a side effect the
traced graph no longer surfaces dangling KV-cache buffer leaves as graph outputs.

Verified end-to-end via skainet-iree-conformance gemma-decoder (real 2-block Gemma3 with
qk-norm + sandwichNorms + layer-output-scale): export -> iree-compile -> iree-run-module
matches the SKaiNET-CPU oracle at max_abs_err 3.8e-6 (was ~5.3). Adds KvCacheTraceFidelityTest.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

1 parent a49fe30 commit 82adc64
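
The failure mode described in the commit message can be illustrated with a toy tracer. This is a minimal sketch, not SKaiNET code: `Node`, `Input`, `Constant`, `Concat`, `BufferCache`, `ConcatCache`, and `reachableInputs` are all hypothetical names invented for the illustration. It assumes only what the message states: symbolic tensors carry no data under tracing, so a cache that copies data into a heap buffer produces a zero constant, while a cache that joins its history through a recorded concat op keeps the projection connected.

```kotlin
// Toy model of tracing. Every name here is hypothetical, not a SKaiNET API.
sealed interface Node
data class Input(val name: String) : Node            // symbolic tensor, carries no data
data class Constant(val values: List<Float>) : Node  // data baked into the graph
data class Concat(val parts: List<Node>) : Node      // functional op recorded in the graph

// Buffer-style cache: copies element data into a heap array.
// A symbolic input has no data to copy, so the zero-initialized slots are baked in.
class BufferCache(private val capacity: Int) {
    private val buffer = FloatArray(capacity)
    private var length = 0

    fun update(node: Node, size: Int): Node {
        if (node is Constant) node.values.forEachIndexed { i, v -> buffer[length + i] = v }
        length += size
        return Constant(buffer.take(length))
    }
}

// Functional cache: keeps the history as graph nodes joined by a concat op.
class ConcatCache {
    private val history = mutableListOf<Node>()

    fun update(node: Node): Node {
        history += node
        return Concat(history.toList())
    }
}

// Collects the names of all symbolic inputs reachable from a node.
fun reachableInputs(node: Node): Set<String> = when (node) {
    is Input -> setOf(node.name)
    is Constant -> emptySet()
    is Concat -> node.parts.flatMap { reachableInputs(it) }.toSet()
}

fun main() {
    val kProj = Input("k_proj")
    val viaBuffer = BufferCache(capacity = 4).update(kProj, size = 2)
    val viaConcat = ConcatCache().update(kProj)

    println(viaBuffer)                   // Constant(values=[0.0, 0.0])
    println(reachableInputs(viaBuffer))  // []
    println(reachableInputs(viaConcat))  // [k_proj]
}
```

In the buffer variant the exported graph no longer depends on `k_proj` at all, which mirrors the disconnected projections and all-zero constant the message describes. The concat variant keeps the dependency, at the cost of not reusing a preallocated buffer, which is presumably why the actual fix applies it only while recording.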
2 files changed
Lines changed: 80 additions & 0 deletions
File tree
- llm-inference/gemma/src/jvmTest/kotlin/sk/ainet/models/gemma
- transformer-core/src/commonMain/kotlin/sk/ainet/lang/nn/transformer
Lines changed: 54 additions & 0 deletions
Lines changed: 26 additions & 0 deletions
Added lines (new-file numbering): 218-221, 227-246, 348-349.