hotfix/0.22.2: BOM coordinates + unsigned GGUF metadata #586
Merged
michalharakal merged 3 commits into `develop` on May 2, 2026
The umbrella BOM was being emitted as `sk.ainet.core:skainet-bom`
because vanniktech's auto-coordinates feature picks up `GROUP=sk.ainet.core`
from the root `gradle.properties`, clobbering the per-module
`group = "sk.ainet"` override. Downstream BOMs (e.g.
`sk.ainet.transformers:skainet-transformers-bom`) import this with
`<groupId>sk.ainet</groupId>`, so they were unresolvable from a
fresh `mavenCentral()`-only project.
- Use vanniktech's explicit `mavenPublishing { coordinates(...) }`
so the BOM lands at `sk.ainet:skainet-bom:0.22.2` regardless of
the engine-wide GROUP.
- Extend `validate-published-poms.sh` to assert the BOM exists at
`~/.m2/repository/sk/ainet/skainet-bom/` so the regression cannot
ship again silently.
- Bump VERSION_NAME to 0.22.2; update README, CHANGELOG, and Antora
docs samples (java-getting-started, java-model-training,
io-readers, architecture) to the new version and BOM coordinates.
Fixes #584
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
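
For readers unfamiliar with the explicit-coordinates form mentioned in the first bullet above, a minimal sketch of how it might look in the BOM module's `build.gradle.kts`. It assumes the vanniktech Maven Publish plugin is already applied to that module. The file location and the hard-coded version string are assumptions for illustration; only the coordinates themselves come from this commit message.

```kotlin
// build.gradle.kts of the BOM module (file location is an assumption)
mavenPublishing {
    // Explicit coordinates are used instead of the GROUP value that the
    // plugin would otherwise read from the root gradle.properties.
    // The version is hard-coded here for illustration only; the real build
    // may derive it from VERSION_NAME.
    coordinates("sk.ainet", "skainet-bom", "0.22.2")
}
```
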
GgufModelMetadata.from(), and any consumer using
`(value as? Number)?.toInt()` on `reader.fields`, silently dropped
UInt/ULong-typed values. Modern GGUFs store dimensions and counts as
uint32, but Kotlin's unsigned types do not extend kotlin.Number, so
the cast yielded null. As a result contextLength, embeddingLength,
layerCount, headCount, vocabSize fallback, bosTokenId, eosTokenId all
came back null on real-world GGUFs and the loader fell back to
defaults (e.g. blockCount=0 → zero-layer transformer).

Add public Map<String, Any?> extensions in GgufFieldAccessors.kt:
getInt / getLong / getString / getIntList / getStringList. The numeric
accessors handle Int / UInt / Long / ULong / Short / UShort / Byte /
UByte plus the matching primitive arrays for the list variant, and the
string-encoded numeric variant some GGUF metadata uses.

Route GgufModelMetadata.from() through the new public accessors and
remove the buggy private helpers.

Add a regression test covering uint32/uint64 scalars, uint-typed
lists, and every numeric type the accessor accepts.

Closes #585
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
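
The failure mode described in this commit can be reproduced in a few lines. The sketch below is not the code in `GgufFieldAccessors.kt`; `asIntOrNull` is a hypothetical helper and the map keys are illustrative. It only shows why the `as? Number` cast returns null for an unsigned value and what a type-by-type conversion might look like.

```kotlin
// Hypothetical helper for illustration; not the library's accessor.
// Note: toInt() on unsigned types reinterprets the bits, so values above
// Int.MAX_VALUE wrap to negative numbers.
fun Any?.asIntOrNull(): Int? = when (this) {
    is Number -> toInt()        // Int, Long, Short, Byte all extend kotlin.Number
    is UInt -> toInt()          // unsigned types do NOT extend kotlin.Number
    is ULong -> toInt()
    is UShort -> toInt()
    is UByte -> toInt()
    is String -> toIntOrNull()  // string-encoded numeric variant
    else -> null
}

fun main() {
    // Illustrative field map; keys and values are assumptions.
    val fields: Map<String, Any?> = mapOf(
        "llama.block_count" to 32u,       // UInt, as a uint32 field might decode
        "llama.context_length" to 4096,   // Int
    )

    // The old pattern: the cast fails because UInt is not a Number.
    val buggy = (fields["llama.block_count"] as? Number)?.toInt()
    // The type-by-type conversion recovers the value.
    val fixed = fields["llama.block_count"].asIntOrNull()

    println("buggy=$buggy fixed=$fixed") // prints "buggy=null fixed=32"
}
```
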
Summary
Two independent fixes targeted for 0.22.2, plus the version bump.
1. fix(bom): publish skainet-bom at `sk.ainet:skainet-bom` (351f727)

   Already on the branch from the prior commit. Out of scope for this description.

2. fix(gguf): handle unsigned numeric metadata fields (86cc067), closes #585

   `GgufModelMetadata.from()` and any consumer using `(value as? Number)?.toInt()` on `reader.fields` silently dropped `UInt`/`ULong`-typed values. Modern GGUFs store dimensions and counts as `uint32`, but Kotlin's unsigned types do not extend `kotlin.Number`, so the cast yielded `null`. Result: `contextLength`, `embeddingLength`, `layerCount`, `headCount`, `vocabSize` (fallback), `bosTokenId`, `eosTokenId` all came back `null` on real-world GGUFs and the loader fell back to defaults (e.g. `blockCount=0` → zero-layer transformer).

   Fix:

   - `GgufFieldAccessors.kt` exposing `Map<String, Any?>` extensions: `getInt` / `getLong` / `getString` / `getIntList` / `getStringList`. The numeric ones handle every signed and unsigned integer type the reader can emit (`Int` / `UInt` / `Long` / `ULong` / `Short` / `UShort` / `Byte` / `UByte`), the matching primitive arrays for the list variant, and the string-encoded numeric variant some GGUF metadata uses.
   - `GgufModelMetadata.from()` now routes through these public accessors; the buggy private helpers are deleted.
   - `GgufModelMetadataUnsignedTest` covering uint32 / uint64 scalars, uint-typed lists, every-numeric-type, and key-priority order.

   Non-breaking: only adds new public API and fixes existing methods to return correct values.
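
The list variant and the key-priority behaviour mentioned above could look roughly like the sketch below. Everything here is an assumption for illustration: `firstIntList`, `elementToInt`, `toIntList`, and the map keys are hypothetical, and the real `getIntList` signature is not shown in this PR description. Unsigned primitive arrays are left out to keep the sketch short.

```kotlin
// Hypothetical helpers; not the signatures in GgufFieldAccessors.kt.

// Converts a single decoded element, covering signed and unsigned integers.
private fun elementToInt(value: Any?): Int? = when (value) {
    is Number -> value.toInt()
    is UInt -> value.toInt()
    is ULong -> value.toInt()
    is UShort -> value.toInt()
    is UByte -> value.toInt()
    else -> null
}

// Converts a raw field value to a list of Ints, or null if any element fails.
private fun toIntList(raw: Any?): List<Int>? = when (raw) {
    is IntArray -> raw.toList()
    is LongArray -> raw.map { it.toInt() }
    is List<*> -> raw.map { elementToInt(it) }
        .takeIf { converted -> converted.none { it == null } }
        ?.filterNotNull()
    else -> null
}

// Key priority: the first key (in argument order) with a convertible value wins.
fun Map<String, Any?>.firstIntList(vararg keys: String): List<Int>? =
    keys.firstNotNullOfOrNull { toIntList(this[it]) }

fun main() {
    // Illustrative keys and values only.
    val fields: Map<String, Any?> = mapOf(
        "example.fallback_ids" to intArrayOf(1, 2),
        "example.ids" to listOf(7u, 8u, 9u),  // List<UInt>
    )
    println(fields.firstIntList("example.ids", "example.fallback_ids")) // [7, 8, 9]
    println(fields.firstIntList("example.missing", "example.fallback_ids")) // [1, 2]
}
```
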
Test plan
- `:skainet-io:skainet-io-gguf:jvmTest` green (existing tokenizer suite + 5 new unsigned cases)
- `:llm-core:jvmTest :llm-inference:apertus:jvmTest`: 96 tests, 0 failures
- `validate-published-poms.sh`
- `VERSION_NAME=0.22.2` propagates correctly to all published artifact coordinates

🤖 Generated with Claude Code