Commit a24f21d
feat(lang/ops): RowDequantSource in the engine + ops.gather row-dequant path
Generalizes the per-row dequant trick out of the model layer (SKaiNET-transformers
issue #184, hoist 1). Adds `RowDequantSource` (a `TensorData` marker:
`dequantRow(rowIdx): FloatArray`) to skainet-lang-core, and teaches
`DefaultCpuOps.gather` to use it: when the gathered table implements `RowDequantSource`,
gather dequantises only the touched rows (each unique row once, cached) instead of taking
the generic element path. That path calls `get()`, which is unsupported on such tensors,
so using it would require a full FP32 materialisation of the table.
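The row-dequant gather described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from this commit: the `RowDequantSource` interface below is a standalone stand-in (the real one is a `TensorData` marker in skainet-lang-core), and `gatherRows`, `PackedInt8Table`, and its one-scale-per-row int8 layout are hypothetical names invented for the example.

```kotlin
// Stand-in for the engine interface; the real RowDequantSource is a TensorData marker.
interface RowDequantSource {
    fun dequantRow(rowIdx: Int): FloatArray
}

// Hypothetical packed table: int8 values with one FP32 scale per row.
class PackedInt8Table(
    private val rows: Array<ByteArray>,
    private val scales: FloatArray,
) : RowDequantSource {
    // Counts dequantRow calls so the caching behaviour can be observed.
    var dequantCalls = 0
        private set

    override fun dequantRow(rowIdx: Int): FloatArray {
        dequantCalls++
        val packed = rows[rowIdx]
        val scale = scales[rowIdx]
        return FloatArray(packed.size) { i -> packed[i] * scale }
    }
}

// Hypothetical helper: dequantise each unique row once, cache it, and copy it
// into the output for every index that refers to it.
fun gatherRows(table: RowDequantSource, indices: IntArray, rowSize: Int): FloatArray {
    val cache = HashMap<Int, FloatArray>()
    val out = FloatArray(indices.size * rowSize)
    for ((pos, rowIdx) in indices.withIndex()) {
        val row = cache.getOrPut(rowIdx) { table.dequantRow(rowIdx) }
        row.copyInto(out, destinationOffset = pos * rowSize)
    }
    return out
}

fun main() {
    val table = PackedInt8Table(
        rows = arrayOf(byteArrayOf(1, 2), byteArrayOf(3, 4), byteArrayOf(5, 6)),
        scales = floatArrayOf(0.5f, 1.0f, 2.0f),
    )
    val gathered = gatherRows(table, intArrayOf(2, 0, 2), rowSize = 2)
    println(gathered.joinToString()) // prints 10.0, 12.0, 0.5, 1.0, 10.0, 12.0
    println(table.dequantCalls)      // prints 2: row 2 was dequantised only once
}
```

Row 1 is never touched in this run, so it is never dequantised; only the two unique rows that appear in the index list are expanded to FP32.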
A RowDequantSource table declares logical dtype FP32, so gather returns FP32 with no
typing change. This lets a packed/oversized embedding (e.g. a Q-quantised token_embd)
stay packed and be looked up via ops.gather directly — the basis for keeping Gemma's
~0.67 GB token_embd packed (#178's remaining board-fit item) and, later, whisper int8.
Verified: new GatherRowDequantTest — gather over a fake RowDequantSource table whose
get()/copyToFloatArray() throw returns the correct dequantised rows (so it provably
went through dequantRow). backend-cpu compiles + the test passes.
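The verification idea can be illustrated with a small stand-alone sketch. It does not reproduce `GatherRowDequantTest`: the `TensorData` interface here is a deliberately tiny stand-in (the engine's real API is assumed to be richer, and its `get()` signature may differ), and `ThrowingRowTable` and this `gather` function are hypothetical names used only to show why a throwing fake proves which path was taken.

```kotlin
// Stand-ins for the engine types; the real TensorData API is assumed to differ.
interface TensorData {
    val rowSize: Int
    operator fun get(row: Int, col: Int): Float
    fun copyToFloatArray(): FloatArray
}

interface RowDequantSource : TensorData {
    fun dequantRow(rowIdx: Int): FloatArray
}

// Fake table: element access and full materialisation both throw, so a correct
// result can only have come from dequantRow.
class ThrowingRowTable(override val rowSize: Int) : RowDequantSource {
    override fun get(row: Int, col: Int): Float =
        throw UnsupportedOperationException("get() must not be called")

    override fun copyToFloatArray(): FloatArray =
        throw UnsupportedOperationException("copyToFloatArray() must not be called")

    // Row r holds the values r*10, r*10 + 1, ...
    override fun dequantRow(rowIdx: Int): FloatArray =
        FloatArray(rowSize) { c -> rowIdx * 10f + c }
}

// Hypothetical gather: row-dequant path when available, element path otherwise.
fun gather(table: TensorData, indices: IntArray): FloatArray {
    val out = FloatArray(indices.size * table.rowSize)
    if (table is RowDequantSource) {
        val cache = HashMap<Int, FloatArray>()
        indices.forEachIndexed { pos, r ->
            cache.getOrPut(r) { table.dequantRow(r) }.copyInto(out, pos * table.rowSize)
        }
    } else {
        indices.forEachIndexed { pos, r ->
            for (c in 0 until table.rowSize) out[pos * table.rowSize + c] = table[r, c]
        }
    }
    return out
}

fun main() {
    val result = gather(ThrowingRowTable(rowSize = 3), intArrayOf(2, 0))
    // Passing without an exception means neither get() nor copyToFloatArray() ran.
    check(result.contentEquals(floatArrayOf(20f, 21f, 22f, 0f, 1f, 2f)))
    println("row-dequant path taken")
}
```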
Next (separate, release-coordinated): SKaiNET-transformers re-points gemma's
RowDequantSource to this engine interface (typealias) and routes token_embd through
ops.gather; the GemmaQ5KPackedParityTest is the end-to-end gate.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
1 parent fa3610d commit a24f21d
3 files changed
Lines changed: 85 additions & 5 deletions
File tree
- skainet-backends/skainet-backend-cpu/src
- commonMain/kotlin/sk/ainet/exec/tensor/ops
- commonTest/kotlin/sk/ainet/exec/tensor/ops
- skainet-lang/skainet-lang-core/src/commonMain/kotlin/sk/ainet/lang/tensor/data
Lines changed: 21 additions & 5 deletions
Lines changed: 45 additions & 0 deletions
skainet-lang/skainet-lang-core/src/commonMain/kotlin/sk/ainet/lang/tensor/data/RowDequantSource.kt
Lines changed: 19 additions & 0 deletions