Commit 89d16fd

michalharakal and claude committed
release: 0.31.2 — RowDequantSource + ops.gather row-dequant
- VERSION_NAME 0.31.0 -> 0.31.2.
- CHANGELOG [0.31.2]: RowDequantSource + DefaultCpuOps.gather row-dequant path (#741).
- README: BOM example bump + What's New in 0.31.2.
- docs: version refs bumped to 0.31.2.

Version note: skipped 0.31.1 for the engine to avoid a cross-repo number collision — SKaiNET-transformers 0.31.1 already shipped against engine 0.31.0, so engine 0.31.2 keeps lock-step clean (the next transformers release pins engine 0.31.2 and is 0.31.2 too).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
1 parent 2a4e5fb commit 89d16fd

8 files changed

Lines changed: 30 additions & 8 deletions

CHANGELOG.md

Lines changed: 13 additions & 0 deletions
@@ -2,6 +2,19 @@

 ## [Unreleased]

+## [0.31.2] - 2026-06-18
+
+### Added
+
+- **`RowDequantSource` + `ops.gather` row-dequant path.** Adds `RowDequantSource` (a `TensorData` marker,
+  `dequantRow(rowIdx): FloatArray`) to `skainet-lang-core`, and teaches `DefaultCpuOps.gather` to use it:
+  when the gathered table implements `RowDequantSource`, only the rows actually touched are dequantised
+  (each unique row once, cached) instead of the generic element path — which calls `get()` (unsupported on
+  such tensors) and would otherwise force a full FP32 materialise of the table. The table declares logical
+  dtype `FP32`, so `gather` returns FP32 with no typing change. This lets a packed/oversized embedding (a
+  Q-quantised `token_embd`) stay packed and be looked up via `ops.gather` directly — generalising the
+  per-row-dequant trick out of the model layer. Adds `GatherRowDequantTest` (commonTest). (PR #741)
+
 ## [0.31.0] - 2026-06-15

 ### Fixed
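
Reviewer note: the row-dequant gather described in the 0.31.2 entry can be pictured with the sketch below. It is not the engine's code. Only the name `RowDequantSource` and the `dequantRow(rowIdx): FloatArray` signature come from the changelog text. The `rowCount` and `rowWidth` properties, the toy `ToyQ8Table` (one scale per row, signed byte codes) and the free function `gatherRows` are hypothetical stand-ins, and the real `DefaultCpuOps.gather` may be structured quite differently.

```kotlin
// Hypothetical stand-in for the marker interface named in the changelog.
// Only dequantRow(rowIdx) is taken from the source; rowCount/rowWidth are assumed.
interface RowDequantSource {
    val rowCount: Int
    val rowWidth: Int
    fun dequantRow(rowIdx: Int): FloatArray
}

// Toy quantised table for illustration: signed byte codes plus one scale per row.
class ToyQ8Table(
    private val codes: Array<ByteArray>,
    private val scales: FloatArray,
) : RowDequantSource {
    // Counts dequantisations so the caching behaviour can be observed.
    var dequantCalls = 0
        private set

    override val rowCount: Int get() = codes.size
    override val rowWidth: Int get() = codes[0].size

    override fun dequantRow(rowIdx: Int): FloatArray {
        dequantCalls++
        val row = codes[rowIdx]
        return FloatArray(row.size) { i -> row[i].toFloat() * scales[rowIdx] }
    }
}

// Gathers rows into a flat FP32 buffer, dequantising each unique row once.
fun gatherRows(table: RowDequantSource, indices: IntArray): FloatArray {
    val width = table.rowWidth
    val out = FloatArray(indices.size * width)
    val cache = HashMap<Int, FloatArray>()
    for ((pos, rowIdx) in indices.withIndex()) {
        require(rowIdx in 0 until table.rowCount) { "row index $rowIdx out of range" }
        val row = cache.getOrPut(rowIdx) { table.dequantRow(rowIdx) }
        row.copyInto(out, destinationOffset = pos * width)
    }
    return out
}

fun main() {
    val table = ToyQ8Table(
        codes = arrayOf(byteArrayOf(1, 2), byteArrayOf(-4, 8), byteArrayOf(10, 20)),
        scales = floatArrayOf(0.5f, 0.25f, 0.1f),
    )
    val out = gatherRows(table, intArrayOf(1, 0, 1))
    println(out.toList())        // rows 1, 0, 1 laid out back to back
    println(table.dequantCalls)  // 2: row 1 is reused from the cache, row 2 is never touched
}
```

The per-call cache is the simplest way to get "each unique row once"; whether the engine caches per call or across calls is not stated in the entry.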

README.md

Lines changed: 10 additions & 1 deletion
@@ -36,7 +36,7 @@ Add the core dependencies (Gradle Kotlin DSL):
 ```kotlin
 dependencies {
     // Recommended: import the umbrella BOM and drop versions on the engine modules.
-    implementation(platform("sk.ainet:skainet-bom:0.31.0"))
+    implementation(platform("sk.ainet:skainet-bom:0.31.2"))

     implementation("sk.ainet.core:skainet-lang-core")
     implementation("sk.ainet.core:skainet-backend-cpu")
@@ -227,6 +227,15 @@ Runnable examples:

 ---

+## What's New in 0.31.2
+
+- **`RowDequantSource` + `ops.gather` row-dequant.** A `TensorData` can now mark itself `RowDequantSource`
+  (`dequantRow(rowIdx): FloatArray`); `ops.gather` then dequantises only the rows it touches instead of
+  materialising the whole table (and instead of the `get()` path, which such tensors don't support). The
+  table presents as logical FP32, so a packed/oversized embedding (a Q-quantised `token_embd`) can stay
+  packed and be looked up via `ops.gather` directly — moving the per-row-dequant trick out of model code
+  into the engine. (PR #741)
+
 ## What's New in 0.31.0

 - **`ops.transpose` lazily handles every packed matmul dtype.** The CPU backend rewraps packed bytes with a flipped shape (metadata-only "lazy transpose") so a packed weight survives `linearProject`'s `matmul(x, transpose(W))` instead of inflating to FP32 — but **Q8_0 and Q4_0** were missing and threw `Byte → Float ClassCastException`. Now the full dispatch set (Q4_K/Q5_K/Q6_K/Q5_0/Q5_1/Q8_0/Q4_0) transposes lazily, so a packed Q8_0/Q4_0 matmul weight (e.g. a tied Q8_0 `lm_head`) stays packed end-to-end on its NEON/SIMD kernel. Regression-tested across all seven packed types. (PRs #736, #737)

docs/modules/ROOT/pages/how-to/io-readers.adoc

Lines changed: 2 additions & 2 deletions
@@ -20,7 +20,7 @@ Add the following dependencies to your `build.gradle.kts`:
 [source,kotlin]
 ----
 dependencies {
-    implementation(platform("sk.ainet:skainet-bom:0.31.0"))
+    implementation(platform("sk.ainet:skainet-bom:0.31.2"))

     implementation("sk.ainet.core:skainet-io-gguf")
     implementation("org.jetbrains.kotlinx:kotlinx-io-core:0.8.2")
@@ -32,7 +32,7 @@ dependencies {
 [source,kotlin]
 ----
 dependencies {
-    implementation(platform("sk.ainet:skainet-bom:0.31.0"))
+    implementation(platform("sk.ainet:skainet-bom:0.31.2"))

     implementation("sk.ainet.core:skainet-io-onnx")
     implementation("org.jetbrains.kotlinx:kotlinx-io-core:0.8.2")

docs/modules/ROOT/pages/how-to/minerva-export.adoc

Lines changed: 1 addition & 1 deletion
@@ -38,7 +38,7 @@ For a published application, use the SKaiNET BOM and the Minerva artifact:
 [source,kotlin]
 ----
 dependencies {
-    implementation(platform("sk.ainet:skainet-bom:0.31.0"))
+    implementation(platform("sk.ainet:skainet-bom:0.31.2"))
     implementation("sk.ainet.core:skainet-compile-minerva")
 }
 ----

docs/modules/ROOT/pages/reference/kernel-support-matrix.adoc

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 = Kernel × platform support matrix
 :description: Which compute-kernel provider serves each weight format on each KMP target.

-Generated from `kernel-support.json` (version `0.31.0`) by `KernelSupportMatrixTest` — registry introspection of the registered `KernelProvider` implementations. Do not edit by hand; run `./gradlew generateKernelMatrix` to refresh.
+Generated from `kernel-support.json` (version `0.31.2`) by `KernelSupportMatrixTest` — registry introspection of the registered `KernelProvider` implementations. Do not edit by hand; run `./gradlew generateKernelMatrix` to refresh.

 Each cell is the best (highest-priority) provider that serves `Float32 × format` `matmul` on that platform: *native-ffm* (100) → *panama-vector* (50) → *scalar* (0). An empty cell (`—`) means no provider carries a kernel there (the format is dequant-to-FP32 only).
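
Reviewer note: the "best (highest-priority) provider" rule quoted in the matrix header amounts to a filter followed by a max. The sketch below is illustrative only. The provider names and priorities (100, 50, 0) come from the matrix text. The `ProviderEntry` type, the `bestProviderFor` function and the per-provider format sets are invented for the example and do not reflect the real `KernelProvider` registry or its actual coverage.

```kotlin
// Hypothetical model of a registered provider; not the real KernelProvider API.
data class ProviderEntry(val name: String, val priority: Int, val formats: Set<String>)

// Returns the highest-priority provider that carries a kernel for `format`,
// or null when none does (the matrix would show an empty cell).
fun bestProviderFor(format: String, providers: List<ProviderEntry>): String? =
    providers
        .filter { format in it.formats }
        .maxByOrNull { it.priority }
        ?.name

fun main() {
    // Priorities follow the matrix text; the format sets are made up for the example.
    val providers = listOf(
        ProviderEntry("scalar", 0, setOf("Q4_K", "Q8_0", "Q4_0")),
        ProviderEntry("panama-vector", 50, setOf("Q4_K", "Q8_0")),
        ProviderEntry("native-ffm", 100, setOf("Q4_K")),
    )
    for (format in listOf("Q4_K", "Q8_0", "Q4_0", "Q5_1")) {
        println("$format -> ${bestProviderFor(format, providers) ?: "(none)"}")
    }
}
```

With these made-up format sets, a format that no provider lists (here `Q5_1`) resolves to `null`, which corresponds to the empty cell described above.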

docs/modules/ROOT/pages/tutorials/image-data-getting-started.adoc

Lines changed: 1 addition & 1 deletion
@@ -32,7 +32,7 @@ For a JVM project, add the image/data modules alongside the CPU backend:
 [source,kotlin]
 ----
 dependencies {
-    implementation(platform("sk.ainet:skainet-bom:0.31.0"))
+    implementation(platform("sk.ainet:skainet-bom:0.31.2"))

     implementation("sk.ainet:skainet-backend-cpu-jvm")
     implementation("sk.ainet:skainet-io-image-jvm")

docs/modules/ROOT/pages/tutorials/java-getting-started.adoc

Lines changed: 1 addition & 1 deletion
@@ -144,7 +144,7 @@ repositories {

 dependencies {
     // Import BOM for version alignment
-    implementation(platform("sk.ainet:skainet-bom:0.31.0"))
+    implementation(platform("sk.ainet:skainet-bom:0.31.2"))

     // Core tensor library
     implementation("sk.ainet:skainet-lang-core-jvm")

gradle.properties

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
 GROUP=sk.ainet.core
-VERSION_NAME=0.31.0
+VERSION_NAME=0.31.2
 POM_DESCRIPTION=SKaiNET

 POM_URL=https://github.com/SKaiNET-developers/skainet/
