release: 0.31.2 — RowDequantSource + ops.gather row-dequant

michalharakal · claude · michalharakal · commit 89d16fddb52b · 2026-06-18T00:02:59.000+02:00
- VERSION_NAME 0.31.0 -> 0.31.2.
- CHANGELOG [0.31.2]: RowDequantSource + DefaultCpuOps.gather row-dequant path (#741).
- README: BOM example bump + What's New in 0.31.2.
- docs: version refs bumped to 0.31.2.

Version note: skipped 0.31.1 for the engine to avoid a cross-repo number collision. SKaiNET-transformers 0.31.1 already shipped against engine 0.31.0, so engine 0.31.2 keeps lock-step clean (the next transformers release pins engine 0.31.2 and is 0.31.2 too).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -2,6 +2,19 @@
 
 ## [Unreleased]
 
+## [0.31.2] - 2026-06-18
+
+### Added
+
+- **`RowDequantSource` + `ops.gather` row-dequant path.** Adds `RowDequantSource` (a `TensorData` marker,
+  `dequantRow(rowIdx): FloatArray`) to `skainet-lang-core`, and teaches `DefaultCpuOps.gather` to use it:
+  when the gathered table implements `RowDequantSource`, only the rows actually touched are dequantised
+  (each unique row once, cached) instead of the generic element path — which calls `get()` (unsupported on
+  such tensors) and would otherwise force a full FP32 materialise of the table. The table declares logical
+  dtype `FP32`, so `gather` returns FP32 with no typing change. This lets a packed/oversized embedding (a
+  Q-quantised `token_embd`) stay packed and be looked up via `ops.gather` directly — generalising the
+  per-row-dequant trick out of the model layer. Adds `GatherRowDequantTest` (commonTest). (PR #741)
+
 ## [0.31.0] - 2026-06-15
 
 ### Fixed
diff --git a/README.md b/README.md
@@ -36,7 +36,7 @@ Add the core dependencies (Gradle Kotlin DSL):
 ```kotlin
 dependencies {
     // Recommended: import the umbrella BOM and drop versions on the engine modules.
-    implementation(platform("sk.ainet:skainet-bom:0.31.0"))
+    implementation(platform("sk.ainet:skainet-bom:0.31.2"))
 
     implementation("sk.ainet.core:skainet-lang-core")
     implementation("sk.ainet.core:skainet-backend-cpu")
@@ -227,6 +227,15 @@ Runnable examples:
 
 ---
 
+## What's New in 0.31.2
+
+- **`RowDequantSource` + `ops.gather` row-dequant.** A `TensorData` can now mark itself `RowDequantSource`
+  (`dequantRow(rowIdx): FloatArray`); `ops.gather` then dequantises only the rows it touches instead of
+  materialising the whole table (and instead of the `get()` path, which such tensors don't support). The
+  table presents as logical FP32, so a packed/oversized embedding (a Q-quantised `token_embd`) can stay
+  packed and be looked up via `ops.gather` directly — moving the per-row-dequant trick out of model code
+  into the engine. (PR #741)
+
 ## What's New in 0.31.0
 
 - **`ops.transpose` lazily handles every packed matmul dtype.** The CPU backend rewraps packed bytes with a flipped shape (metadata-only "lazy transpose") so a packed weight survives `linearProject`'s `matmul(x, transpose(W))` instead of inflating to FP32 — but **Q8_0 and Q4_0** were missing and threw `Byte → Float ClassCastException`. Now the full dispatch set (Q4_K/Q5_K/Q6_K/Q5_0/Q5_1/Q8_0/Q4_0) transposes lazily, so a packed Q8_0/Q4_0 matmul weight (e.g. a tied Q8_0 `lm_head`) stays packed end-to-end on its NEON/SIMD kernel. Regression-tested across all seven packed types. (PRs #736, #737)
diff --git a/docs/modules/ROOT/pages/how-to/io-readers.adoc b/docs/modules/ROOT/pages/how-to/io-readers.adoc
@@ -20,7 +20,7 @@ Add the following dependencies to your `build.gradle.kts`:
 [source,kotlin]
 ----
 dependencies {
-    implementation(platform("sk.ainet:skainet-bom:0.31.0"))
+    implementation(platform("sk.ainet:skainet-bom:0.31.2"))
 
     implementation("sk.ainet.core:skainet-io-gguf")
     implementation("org.jetbrains.kotlinx:kotlinx-io-core:0.8.2")
@@ -32,7 +32,7 @@ dependencies {
 [source,kotlin]
 ----
 dependencies {
-    implementation(platform("sk.ainet:skainet-bom:0.31.0"))
+    implementation(platform("sk.ainet:skainet-bom:0.31.2"))
 
     implementation("sk.ainet.core:skainet-io-onnx")
     implementation("org.jetbrains.kotlinx:kotlinx-io-core:0.8.2")
diff --git a/docs/modules/ROOT/pages/how-to/minerva-export.adoc b/docs/modules/ROOT/pages/how-to/minerva-export.adoc
@@ -38,7 +38,7 @@ For a published application, use the SKaiNET BOM and the Minerva artifact:
 [source,kotlin]
 ----
 dependencies {
-    implementation(platform("sk.ainet:skainet-bom:0.31.0"))
+    implementation(platform("sk.ainet:skainet-bom:0.31.2"))
     implementation("sk.ainet.core:skainet-compile-minerva")
 }
 ----
diff --git a/docs/modules/ROOT/pages/reference/kernel-support-matrix.adoc b/docs/modules/ROOT/pages/reference/kernel-support-matrix.adoc
@@ -1,7 +1,7 @@
 = Kernel × platform support matrix
 :description: Which compute-kernel provider serves each weight format on each KMP target.
 
-Generated from `kernel-support.json` (version `0.31.0`) by `KernelSupportMatrixTest` — registry introspection of the registered `KernelProvider` implementations. Do not edit by hand; run `./gradlew generateKernelMatrix` to refresh.
+Generated from `kernel-support.json` (version `0.31.2`) by `KernelSupportMatrixTest` — registry introspection of the registered `KernelProvider` implementations. Do not edit by hand; run `./gradlew generateKernelMatrix` to refresh.
 
 Each cell is the best (highest-priority) provider that serves `Float32 × format` `matmul` on that platform: *native-ffm* (100) → *panama-vector* (50) → *scalar* (0). An empty cell (`—`) means no provider carries a kernel there (the format is dequant-to-FP32 only).
 
diff --git a/docs/modules/ROOT/pages/tutorials/image-data-getting-started.adoc b/docs/modules/ROOT/pages/tutorials/image-data-getting-started.adoc
@@ -32,7 +32,7 @@ For a JVM project, add the image/data modules alongside the CPU backend:
 [source,kotlin]
 ----
 dependencies {
-    implementation(platform("sk.ainet:skainet-bom:0.31.0"))
+    implementation(platform("sk.ainet:skainet-bom:0.31.2"))
 
     implementation("sk.ainet:skainet-backend-cpu-jvm")
     implementation("sk.ainet:skainet-io-image-jvm")
diff --git a/docs/modules/ROOT/pages/tutorials/java-getting-started.adoc b/docs/modules/ROOT/pages/tutorials/java-getting-started.adoc
@@ -144,7 +144,7 @@ repositories {
 
 dependencies {
     // Import BOM for version alignment
-    implementation(platform("sk.ainet:skainet-bom:0.31.0"))
+    implementation(platform("sk.ainet:skainet-bom:0.31.2"))
 
     // Core tensor library
     implementation("sk.ainet:skainet-lang-core-jvm")
diff --git a/gradle.properties b/gradle.properties
@@ -1,5 +1,5 @@
 GROUP=sk.ainet.core
-VERSION_NAME=0.31.0
+VERSION_NAME=0.31.2
 POM_DESCRIPTION=SKaiNET
 
 POM_URL=https://github.com/SKaiNET-developers/skainet/
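
The changelog entry in this commit describes a gather that dequantises only the rows it touches, each unique row once. The sketch below illustrates that idea in isolation and is not the SKaiNET implementation. Only the names `RowDequantSource` and `dequantRow(rowIdx): FloatArray` come from the changelog; the interface is redeclared here as a stand-in, and `ToyQ8Table`, `gatherRows`, the one-byte-per-element layout, and the per-row scale are all hypothetical.

```kotlin
// Stand-in for the marker described in the changelog. Assumption: the real
// interface lives in skainet-lang-core and may carry more than this one method.
interface RowDequantSource {
    fun dequantRow(rowIdx: Int): FloatArray
}

// Hypothetical packed table: one Byte per element plus one Float scale per row.
// This is a toy layout, not a real GGUF quantisation format.
class ToyQ8Table(
    private val packed: ByteArray,
    private val scales: FloatArray,
    private val cols: Int,
) : RowDequantSource {
    // Counts how many rows were actually dequantised.
    var dequantCalls = 0
        private set

    override fun dequantRow(rowIdx: Int): FloatArray {
        dequantCalls++
        val base = rowIdx * cols
        return FloatArray(cols) { c -> packed[base + c] * scales[rowIdx] }
    }
}

// Hypothetical gather over rows: dequantise each unique row once, cache it,
// and copy it into the FP32 output for every index that refers to it.
fun gatherRows(table: RowDequantSource, indices: IntArray, cols: Int): FloatArray {
    val cache = HashMap<Int, FloatArray>()
    val out = FloatArray(indices.size * cols)
    for ((i, rowIdx) in indices.withIndex()) {
        val row = cache.getOrPut(rowIdx) { table.dequantRow(rowIdx) }
        row.copyInto(out, destinationOffset = i * cols)
    }
    return out
}

fun main() {
    // 3 rows x 2 columns; row r holds bytes scaled by scales[r].
    val table = ToyQ8Table(
        packed = byteArrayOf(1, 2, 3, 4, 5, 6),
        scales = floatArrayOf(0.5f, 1.0f, 2.0f),
        cols = 2,
    )
    // Row 2 is requested twice and row 1 never, so only two rows get dequantised.
    val out = gatherRows(table, intArrayOf(2, 0, 2), cols = 2)
    println(out.joinToString())
    println("rows dequantised: ${table.dequantCalls}")
}
```

In this toy version the cache lives for a single `gatherRows` call, so memory stays proportional to the number of distinct rows touched rather than to the table size. Whether the engine scopes its cache the same way is not stated in the changelog.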
