README.md
+25 -10 (25 additions & 10 deletions)
@@ -103,21 +103,20 @@ Honest status — see the project-status note at the top of this README.
 
 ## Current release
 
-The current release is **0.30.0** — version-aligned with **SKaiNET 0.30.0**.
-Skips 0.29.x: SKaiNET-transformers tracked the engine internally across that
-window without a tagged release. The headline is that **Q5_K weights now stay
-packed in the eager Gemma runtime** (SKaiNET 0.30.0 ships a first-class Q5_K
-packed matmul) and the Gemma `NATIVE_OPTIMIZED` packed-weight path is now
-**Kotlin/Native–ready** — the board binary can keep K-quant weights packed
-without the JVM's `java.lang.foreign` MemSeg path. FunctionGemma-270M (`Q5_K_M`)
-decodes byte-identically across the FP32 baseline and both packed paths
-(`GemmaQ5KPackedParityTest`).
+The current release is **0.31.0** — version-aligned with **SKaiNET 0.31.0**.
+The headline is that the eager `NATIVE_OPTIMIZED` Gemma path now keeps the
+**tied Q8_0 lm_head packed** (paired with SKaiNET 0.31.0's `ops.transpose` fix
+for all packed dtypes), and `GemmaNetworkLoader.load()` takes an optional
+`maxInferenceLen` to cap the KV cache for constrained devices — together
+dropping FunctionGemma-270M's footprint enough to load eagerly on the 1.9 GB
+Astra Machina SL2610. FunctionGemma (`Q5_K_M`) still decodes byte-identically
+across the FP32 baseline and both packed paths (`GemmaQ5KPackedParityTest`).
 
 The recommended way to consume is via the BOM. It pins every published `skainet-transformers-*` artifact and re-exports the upstream `sk.ainet:skainet-bom`, so the engine-side `sk.ainet.core:skainet-*` artifacts get the matching version too — you only need to declare the BOM version in one place.
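
Reviewer note: the README text says `maxInferenceLen` caps the KV cache but does not show why that matters on a small board. The sketch below is only a back-of-the-envelope illustration, not the loader's code. The shape values, the 32,768-token context length, the helper names (`KvCacheShape`, `kvCacheBytes`, `effectiveLen`), and the assumption that the cap is applied as a simple minimum are all hypothetical; none of them are taken from FunctionGemma-270M or from SKaiNET.

```kotlin
// Hypothetical dimensions for illustration only; not FunctionGemma-270M's real config.
data class KvCacheShape(
    val layers: Int,
    val kvHeads: Int,
    val headDim: Int,
    val bytesPerElement: Int,
)

/** Bytes needed to store keys and values (hence the factor 2) for seqLen positions. */
fun kvCacheBytes(shape: KvCacheShape, seqLen: Int): Long =
    2L * shape.layers * shape.kvHeads * shape.headDim * shape.bytesPerElement * seqLen

/** Assumed behaviour: use the model's context length unless a smaller cap is supplied. */
fun effectiveLen(contextLen: Int, maxInferenceLen: Int?): Int =
    if (maxInferenceLen == null) contextLen else minOf(contextLen, maxInferenceLen)

fun main() {
    val shape = KvCacheShape(layers = 16, kvHeads = 2, headDim = 128, bytesPerElement = 4)
    val mib = 1024 * 1024
    val full = kvCacheBytes(shape, effectiveLen(contextLen = 32_768, maxInferenceLen = null))
    val capped = kvCacheBytes(shape, effectiveLen(contextLen = 32_768, maxInferenceLen = 2_048))
    println("full=${full / mib} MiB, capped=${capped / mib} MiB") // full=1024 MiB, capped=64 MiB
}
```

Because the cache grows linearly with sequence length, capping it at a sixteenth of the context length shrinks it by the same factor. With these made-up numbers, that is the difference between a cache that takes more than half of a 1.9 GB device and one that is a small fraction of it.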
llm-inference/gemma/api/jvm/gemma.api
+2 -1 (2 additions & 1 deletion)
@@ -862,7 +862,8 @@ public final class sk/ainet/models/gemma/GemmaNetworkLoader$WeightsProvider$Safe
 }
 
 public final class sk/ainet/models/gemma/GemmaNetworkLoaderKt {
-	public static final fun applyWeightsToNetworkNonReified (Lsk/ainet/context/ExecutionContext;Lsk/ainet/models/gemma/Gemma4Weights;Lkotlin/reflect/KClass;Z)Lsk/ainet/lang/nn/Module;
+	public static final fun applyWeightsToNetworkNonReified (Lsk/ainet/context/ExecutionContext;Lsk/ainet/models/gemma/Gemma4Weights;Lkotlin/reflect/KClass;ZLjava/lang/Integer;)Lsk/ainet/lang/nn/Module;
+	public static synthetic fun applyWeightsToNetworkNonReified$default (Lsk/ainet/context/ExecutionContext;Lsk/ainet/models/gemma/Gemma4Weights;Lkotlin/reflect/KClass;ZLjava/lang/Integer;ILjava/lang/Object;)Lsk/ainet/lang/nn/Module;
 }
 
 public final class sk/ainet/models/gemma/GemmaPackedWeightsKt {
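
Reviewer note: the descriptor change from `...;Z)` to `...;ZLjava/lang/Integer;)`, together with the new `$default` synthetic, is what the Kotlin compiler emits when a trailing nullable `Int?` parameter with a default value is appended to a top-level function. The sketch below reproduces that pattern on a hypothetical stand-in. The function name `applyWeights`, the parameter name, and its link to the README's `maxInferenceLen` are assumptions, since the API dump does not record parameter names.

```kotlin
// Hypothetical stand-in for the changed function; names are assumptions, not the real API.
// On the JVM, `Int?` is boxed as java.lang.Integer, and the default value makes the
// compiler generate an extra static `applyWeights$default(..., int mask, Object marker)`.
fun applyWeights(strict: Boolean, maxInferenceLen: Int? = null): String =
    "strict=$strict, cap=${maxInferenceLen ?: "none"}"

fun main() {
    println(applyWeights(true))      // existing Kotlin call sites still compile unchanged
    println(applyWeights(true, 512)) // new call sites can pass a cap
}
```

Source compatibility is preserved for Kotlin callers, but the old four-argument JVM descriptor no longer exists in the dump. Code compiled against the previous release would need recompiling unless an overload is kept, for example via `@JvmOverloads`.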