Commit 2d3cdd9

Browse files
michalharakal and claude committed
docs: note K/N pread random-access fix in 0.23.0; drop dead SKaiNET-LLM links
CHANGELOG: add 0.23.0 entries for PR #591 — `PosixPreadRandomAccessSource` under Added and the GGUF >2 GiB load failure under Fixed.

README: remove the SKaiNET-LLM ecosystem-table row and the matching Explore row; the SKaiNET-LLM repo is 404. Reroute the LLM-inference entry to SKaiNET-transformers, which is the active LLM application layer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 2c9642f commit 2d3cdd9

2 files changed

Lines changed: 3 additions & 2 deletions
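The positional-read contract that the new `PosixPreadRandomAccessSource` relies on can be illustrated on the JVM, where `FileChannel.read(buffer, position)` plays the same role as POSIX `pread(2)`: it reads at an absolute offset without touching any shared file position, so concurrent readers need no lock. The names below (`PositionalSource`, `openOrNull`, `readAt`) are illustrative, not the SKaiNET API:

```kotlin
import java.nio.ByteBuffer
import java.nio.channels.FileChannel
import java.nio.file.Path
import java.nio.file.StandardOpenOption

// Minimal JVM analogue of a pread(2)-backed random-access source.
// Positional FileChannel reads leave the channel's own position untouched,
// so concurrent readers at different offsets are safe without locking.
class PositionalSource private constructor(private val channel: FileChannel) : AutoCloseable {
    val size: Long get() = channel.size()

    /** Read up to [length] bytes at absolute [offset]; returns bytes read, or -1 at EOF. */
    fun readAt(offset: Long, dest: ByteArray, length: Int = dest.size): Int {
        require(offset >= 0 && length in 0..dest.size) { "bad offset/length" }
        return channel.read(ByteBuffer.wrap(dest, 0, length), offset)
    }

    override fun close() = channel.close()

    companion object {
        /** Null on open failure, so callers can fall back to a sequential reader. */
        fun openOrNull(path: Path): PositionalSource? = runCatching {
            PositionalSource(FileChannel.open(path, StandardOpenOption.READ))
        }.getOrNull()
    }
}
```

The null-returning `openOrNull` mirrors the contract the commit message describes for the companion `open(path)`: callers test for `null` and drop back to the legacy sequential reader instead of handling exceptions.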

File tree

CHANGELOG.md

Lines changed: 2 additions & 0 deletions
@@ -7,9 +7,11 @@
 ### Added

 - **`TensorDataFactory.placeholder(shape, dtype)`** — returns a `TensorData` whose underlying primitive array materializes lazily on first read, instead of allocating a `FloatArray(shape.volume)` eagerly. The default interface implementation falls back to `zeros`, preserving behavior for any custom factory; `DenseTensorDataFactory` overrides with `LazyZeroFloatArrayTensorData` / `LazyZeroIntArrayTensorData`. `ExecutionContext.placeholder(...)` exposes the same path at the `Tensor` level. (PR #588)
+- **`PosixPreadRandomAccessSource` for Kotlin/Native** — new public class in `skainet-io-core`'s `nativeMain` source set wrapping POSIX `pread(2)`. `pread` is positional and atomic, so concurrent reads from different positions are safe without locking. The companion `open(path)` returns `null` on open/stat failure to match the JVM `JvmRandomAccessSource.open(...)` behaviour, letting callers cleanly fall back to the legacy sequential reader if needed. Covers `macosArm64`, `linuxX64`, `linuxArm64`, `iosArm64`, `iosSimulatorArm64` — every target in the default `nativeMain` source set of this module. 11 `nativeTest` cases pin the contract (size, partial reads, offset/length variants, EOF/argument validation, idempotent close, missing-file null return). (PR #591)

 ### Fixed

+- **Kotlin/Native consumers couldn't load GGUFs larger than ~2 GiB** — `sk.ainet.io.gguf.createRandomAccessSource(filePath)` on the native target was a placeholder `actual fun … = null`, forcing every K/N caller (`StreamingGGUFReader.open(...)` via the gguf-specific factory, every `*NetworkLoader.fromGguf(...)` path, `LlamaWeightLoader`) to fall through to the legacy reader, which slurps the entire file into a single `ByteArray`. Kotlin arrays cap at `Int.MAX_VALUE` bytes (~2 GiB), so any GGUF over ~1.9 GiB threw `IllegalStateException: Can't create an array of size 2147483648`. Practical impact: macOS / Linux / iOS native builds couldn't open Q8 models above ~1B parameters or Q4 models above ~3B; the JVM target had no such cap because `JvmRandomAccessSource` was already implemented. The `skainet-io-gguf` factory's native actual now delegates to the new `PosixPreadRandomAccessSource` (see *Added* above) and returns the same `null` sentinel on open/stat failure, so existing fall-back code paths remain valid. Verified on macOS arm64 against `Qwen3-1.7B-Q8_0.gguf` (~1.8 GiB), which previously OOMed at construction time. (Issue #589, PR #591)
 - **DSL eagerly allocated zero tensors for every Linear / Conv1d / Conv2d, OOMing real-model loaders** — `NetworkBuilder.kt`'s `createLinear`, `DenseImpl`, `Conv1dImpl`, and `Conv2dImpl` paths called `tensorDataFactory.zeros<T, V>(shape, kClass)` eagerly to satisfy each module's constructor whenever the user had not provided initial weights or bias. Downstream loaders always build the network first and only then substitute weights via `WeightMapper.applyWeights`, so the eager zeros were always immediately discarded — but they determined the JVM's peak heap footprint. For `unsloth/Apertus-8B-Instruct-2509-GGUF` (Q4_K_S, 4.7 GB on disk) that was ~27 GB of FP32 zeros allocated and thrown away. Switched every eager-init call site to the new `placeholder(...)` API; the lazy allocation fires only if a caller actually reads the tensor, which never happens on the substitution path because `parameter.value =` swaps the entire `Tensor`. Verified against the real Apertus-8B Q4_K_S GGUF: `ApertusNetworkLoader.fromGguf().load<FP32, Float>(ctx)` now succeeds in a 12 GB heap (previously OOMed at 12 GB) and constructs all 35 top-level modules in 13 s. Same fix benefits the Gemma / Llama / Qwen / Voxtral DSL paths transparently. (Issue #587, PR #588)

 ## [0.22.2] - 2026-05-02
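The lazy-placeholder mechanism described in the Added and Fixed entries above can be sketched in a few lines. This is a minimal, hypothetical illustration of the idea (a zero-filled backing array allocated only on first element access), not the actual `LazyZeroFloatArrayTensorData` implementation:

```kotlin
// Sketch of the lazy-zero idea behind placeholder(...): the backing
// FloatArray is allocated only on first read or write, so tensors that
// are immediately replaced by real weights never touch the heap.
// Names are illustrative, not the SKaiNET API.
class LazyZeroFloatData(private val size: Int) {
    private var backing: FloatArray? = null
    val materialized: Boolean get() = backing != null

    // FloatArray(size) is zero-initialized, matching the zeros contract.
    private fun array(): FloatArray =
        backing ?: FloatArray(size).also { backing = it }

    operator fun get(i: Int): Float = array()[i]
    operator fun set(i: Int, v: Float) { array()[i] = v }
}

fun main() {
    val data = LazyZeroFloatData(1_000_000)
    println(data.materialized) // false: nothing allocated yet
    println(data[0])           // 0.0: first access materializes zeros
    println(data.materialized) // true
}
```

On the weight-substitution path the changelog describes, the whole tensor is replaced before anything reads it, so the allocation in `array()` never runs and the placeholder costs only a small object header instead of the full zero array.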

README.md

Lines changed: 1 addition & 2 deletions
@@ -78,7 +78,6 @@ SKaiNET is a modular ecosystem. While this repository contains the core engine,

 | Project | Description |
 |---|---|
-| [SKaiNET-LLM](https://github.com/SKaiNET-developers/SKaiNET-LLM) | Llama, Gemma, and BERT inference runtimes |
 | [SKaiNET-transformers](https://github.com/SKaiNET-developers/SKaiNET-transformers) | Pre-built transformer architectures and layers |
 | [SKaiNET-examples](https://github.com/SKaiNET-developers/SKaiNET-examples) | Sample projects and integration demos |

@@ -90,7 +89,7 @@
 |---|---|
 | Examples and sample projects | [SKaiNET-examples](https://github.com/SKaiNET-developers/SKaiNET-examples) |
 | Interactive notebooks | [SKaiNET-notebook](https://github.com/SKaiNET-developers/SKaiNET-notebook) |
-| LLM inference (Llama, Gemma) | [SKaiNET-LLM](https://github.com/SKaiNET-developers/SKaiNET-LLM) |
+| LLM inference (Llama, Gemma, Qwen) | [SKaiNET-transformers](https://github.com/SKaiNET-developers/SKaiNET-transformers) |

 ---
