@@ -103,8 +103,13 @@ Honest status — see the project-status note at the top of this README.
 
 ## Current release
 
-The current release is **0.31.0** — version-aligned with **SKaiNET 0.31.0**.
-The headline is that the eager `NATIVE_OPTIMIZED` Gemma path now keeps the
+The current release is **0.31.1** (against **SKaiNET 0.31.0**). It adds
+**`transformer-core`** — the framework NN primitives (attention, KV-cache family,
+embedding, norms, RoPE, FFNs, linear projection) extracted out of `llm-core` so they
+build on the **full target matrix including `androidNative`** (32-bit + 64-bit ARM);
+`llm-core` re-exports it, so nothing changes for existing consumers, and ARM-native
+downstreams (e.g. on-device whisper) can reuse the primitives instead of reimplementing
+them. The 0.31.0 highlights still apply: the eager `NATIVE_OPTIMIZED` Gemma path keeps the
 **tied Q8_0 lm_head packed** (paired with SKaiNET 0.31.0's `ops.transpose` fix
 for all packed dtypes), and `GemmaNetworkLoader.load()` takes an optional
 `maxInferenceLen` to cap the KV cache for constrained devices — together
@@ -116,7 +121,7 @@ The recommended way to consume is via the BOM. It pins every published `skainet-
 
 ```kotlin
 dependencies {
-    implementation(platform("sk.ainet.transformers:skainet-transformers-bom:0.31.0"))
+    implementation(platform("sk.ainet.transformers:skainet-transformers-bom:0.31.1"))
 
     // Versions resolved from the BOM:
     implementation("sk.ainet.transformers:skainet-transformers-core")
@@ -141,6 +146,7 @@ dependencies {
 | Module | Purpose |
 | -------------------- | ----------------------------------------------------------------------- |
 | `llm-api` | Framework-neutral interfaces (`ChatModel`, `EmbeddingModel`, `ToolDefinition`) — Spring AI-shaped. |
+| `transformer-core` | Framework NN primitives (attention, KV-cache family, embedding, norms, RoPE, FFNs, linear projection). `lang-core`-only → **all targets incl. `androidNative`**; re-exported by `llm-core`. |
 | `llm-core` | `OptimizedLLMRuntime`, `ModelRegistry`, `UnifiedModelLoader`, shared abstractions. |
 | `llm-inference/<arch>` | Per-architecture network DSLs and weight loaders (`llama`, `gemma`, `qwen`, `apertus`, `bert`). |
 | `llm-runtime/<arch>` | Per-architecture runtime facades (`kllama`, `kgemma`, `kqwen`, `kapertus`). |
@@ -193,6 +199,15 @@ try (KLlamaSession session = KLlamaJava.loadGGUF(modelPath, /* systemPrompt */ n
 
 See `llm-test/llm-test-java/src/test/java/.../KLlamaJavaToolCallingTest.java` for a runnable reference.
 
+## What's new in 0.31.1
+
+- **`transformer-core` module — NN primitives reusable on all targets incl. `androidNative`.** The
+  attention / KV-cache / embedding / norm / RoPE / FFN / linear-projection primitives were trapped in
+  `llm-core` (whose io/compile/backend deps lack `androidNative`); they only need `skainet-lang-core`
+  (which has it), so they're extracted into `transformer-core` and `llm-core` re-exports them. Existing
+  consumers are unaffected; ARM-native downstreams (on-device whisper, future models) reuse them instead of
+  reimplementing. Ships against engine **0.31.0** (additive, no engine change). (#183)
+
 ## What's new in 0.31.0
 
 - **Tied Q8_0 lm_head stays packed (eager `NATIVE_OPTIMIZED`).** FunctionGemma's