
Prepare 0.22.0 #580

Merged
michalharakal merged 1 commit into develop from release/0.22.0
Apr 30, 2026

@michalharakal michalharakal commented Apr 30, 2026

Summary

Release prep for SKaiNET 0.22.0, mirroring the 0.21.0 pattern (5839da98 — release branch merged into develop, then tagged from develop). Closes the M5 milestone of the JVM inference performance roadmap with a priority-100 native (FFM) CPU kernel provider.

Headline numbers (Linux x86_64, JDK 21.0.10, gcc 13.3 -O3 -ffast-math)

| Kernel | Shapes | Speedup vs Panama |
| --- | --- | --- |
| Native Q4_K matmul | 1024² / 2048² / 4096² | 5.87× / 4.71× / 4.17× |
| Native FP32 SGEMM | 256³ / 512³ / 1024³ | 1.77× / 1.58× / 1.55× |
| MemSeg zero-copy Q4_K | 4096² | +20% vs heap-copy |

The native kernels are single-threaded scalar C compiled with -O3 -ffast-math; gcc/clang auto-vectorization emits AVX2 / NEON FMA instructions without hand-tuned intrinsics. An incidental finding: the single-threaded native path beats Panama's multi-threaded parallelChunks path on every measured Q4_K shape, which suggests that parallelChunks dispatch overhead dominates.

What's in this release

Multi-arch publishing — Windows AMD64 supported out of the box

publish.yml is now a two-phase flow:

  1. build-native matrix on ubuntu-latest, macos-14, windows-latest — each builds its host's libskainet_kernels and uploads as an artifact.
  2. publish on macOS — downloads all three artifacts, stages them into the native module's resources tree, runs ./gradlew publish. Resulting JAR ships native libs for linux-x86_64, macos-arm64, and windows-x86_64. Windows AMD64 devs get a working native path with no manual side-loading.
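
To illustrate how a consumer-side loader could pick one of the three bundled libraries, here is a minimal Java sketch that maps the JVM's `os.name` / `os.arch` properties to the classifiers named above. `NativeClassifier` and its methods are hypothetical names for illustration, not part of SKaiNET, and the actual lookup in the native module may work differently.

```java
import java.util.Locale;
import java.util.Optional;

/** Hypothetical helper (not SKaiNET API): maps os.name / os.arch to a bundled-library classifier. */
final class NativeClassifier {
    private NativeClassifier() {}

    /** Returns "linux-x86_64", "macos-arm64" or "windows-x86_64", or empty if nothing is bundled. */
    static Optional<String> classify(String osName, String osArch) {
        String os = osName.toLowerCase(Locale.ROOT);
        String arch = osArch.toLowerCase(Locale.ROOT);
        boolean x64 = arch.equals("amd64") || arch.equals("x86_64");
        boolean arm64 = arch.equals("aarch64") || arch.equals("arm64");

        if (os.contains("linux") && x64) return Optional.of("linux-x86_64");
        if (os.contains("mac") && arm64) return Optional.of("macos-arm64");
        if (os.contains("windows") && x64) return Optional.of("windows-x86_64");
        return Optional.empty(); // e.g. Linux ARM64: no library in the published JAR
    }

    /** Classifier for the JVM this code is running on. */
    static Optional<String> current() {
        return classify(System.getProperty("os.name"), System.getProperty("os.arch"));
    }

    public static void main(String[] args) {
        System.out.println(current().orElse("no bundled native library for this platform"));
    }
}
```

An empty result is the signal to skip the native path, which matches the Linux ARM64 limitation listed in the CHANGELOG.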

Limitations called out in CHANGELOG

  • Linux ARM64 native lib not in the published JAR — Kotlin/Native plugin 2.3.21 doesn't support linux aarch64 as a HOST target. Linux ARM64 consumers fall back cleanly to Panama priority-50.
  • Shadow-jar consumers on com.gradleup.shadow:9.4.x need the doLast workaround from SKaiNET-transformers PR #88 (mergeServiceFiles() bug). Spring Boot apps consuming via Maven are unaffected.
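
The priority-based fallback described above (native at priority 100, Panama at priority 50) can be sketched as "pick the highest-priority provider that reports itself available". The Java below is an illustration under that assumption only: `KernelProvider`, `SimpleProvider` and `ProviderSelector` are hypothetical names, and SKaiNET's real provider contract and discovery mechanism are not shown in this PR.

```java
import java.util.Comparator;
import java.util.List;

/** Hypothetical provider contract; not SKaiNET's actual interface. */
interface KernelProvider {
    String name();
    int priority();
    boolean isAvailable(); // e.g. false when no native library exists for this platform
}

/** Simple value implementation used only for this illustration. */
record SimpleProvider(String name, int priority, boolean isAvailable) implements KernelProvider {}

final class ProviderSelector {
    private ProviderSelector() {}

    /** Picks the available provider with the highest priority. */
    static KernelProvider select(List<KernelProvider> candidates) {
        return candidates.stream()
                .filter(KernelProvider::isAvailable)
                .max(Comparator.comparingInt(KernelProvider::priority))
                .orElseThrow(() -> new IllegalStateException("no kernel provider available"));
    }

    public static void main(String[] args) {
        // Simulates Linux ARM64: the native provider is present but unavailable.
        List<KernelProvider> providers = List.of(
                new SimpleProvider("native-ffm", 100, false),
                new SimpleProvider("panama", 50, true));
        System.out.println(select(providers).name());
    }
}
```

On a platform where the native library loads, the same call would return the priority-100 provider; no caller-side branching is needed.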

After merge

Following the same flow as 0.21.0:

  1. Tag 0.22.0 on develop at the merge commit → publish.yml fires on tag push, runs the matrix + publishes signed artifacts to Maven Central.
  2. Bump VERSION_NAME to 0.23.0-SNAPSHOT on develop (separate trivial commit, mirroring 4a3758af "Increment next version" after 0.21.0).
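
The version bump in step 2 follows a "next minor, patch reset to 0, append -SNAPSHOT" pattern, inferred here from the 0.21.0 → 0.22.0 → 0.23.0-SNAPSHOT sequence. A minimal Java sketch of that rule, with `VersionBump` as a hypothetical helper rather than anything in the repository:

```java
/** Hypothetical helper: derives the next development version from a release version. */
final class VersionBump {
    private VersionBump() {}

    /** Assumes MAJOR.MINOR.PATCH input and a minor-version bump, as in 0.22.0 -> 0.23.0-SNAPSHOT. */
    static String nextSnapshot(String release) {
        String[] parts = release.split("\\.");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected MAJOR.MINOR.PATCH: " + release);
        }
        int major = Integer.parseInt(parts[0]);
        int minor = Integer.parseInt(parts[1]);
        return major + "." + (minor + 1) + ".0-SNAPSHOT";
    }

    public static void main(String[] args) {
        System.out.println(nextSnapshot("0.22.0")); // prints 0.23.0-SNAPSHOT
    }
}
```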

🤖 Generated with Claude Code

- gradle.properties: drop -SNAPSHOT; RELEASE_SIGNING_ENABLED stays
  true.
- CHANGELOG: add 0.22.0 section covering the native (FFM) CPU
  kernel provider rollout (5-PR staged delivery #571–#575) plus
  publishing/CI/docs polish (#576, #577, #579). Native Q4_K matmul
  is 4.17–5.87× faster than Panama Vector at LLM-typical shapes;
  native FP32 SGEMM 1.55–1.77× faster; zero-copy MemSeg path saves
  +20% wall-clock at 4096². Closes M5 milestone metric.
- README: bump Quickstart coordinates to 0.22.0; rewrite "What's
  New" section around the native FFM provider.
- .github/workflows/publish.yml: split into matrix build-native
  + publish so the published JAR carries libskainet_kernels for
  linux-x86_64 + macos-arm64 + windows-x86_64. Downstream devs
  on any of those three arches get a working native path on tag
  push without manual side-loading. Linux ARM64 stays unsupported
  (Kotlin/Native host limitation tracked in #577).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@michalharakal michalharakal changed the base branch from main to develop April 30, 2026 09:45
@michalharakal michalharakal merged commit b25d7e2 into develop Apr 30, 2026
3 checks passed
@michalharakal michalharakal deleted the release/0.22.0 branch April 30, 2026 09:48
@michalharakal michalharakal mentioned this pull request Apr 30, 2026
3 tasks