Commit 5baae89

michalharakal and claude committed
docs: transformer-core README + landing notes (rebase onto develop, conflict assessment)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
1 parent 4042af0 commit 5baae89

1 file changed

Lines changed: 41 additions & 0 deletions

transformer-core/README.md

@@ -0,0 +1,41 @@
# transformer-core

Framework NN primitives (attention, the KV-cache family, embedding, norms, RoPE, SwiGLU/GeGLU FFN,
residual, linear projection), extracted from `llm-core` so they build on the **full Kotlin target matrix,
including `androidNativeArm32`/`androidNativeArm64`** (the on-device ARM path). The module depends only on
`skainet-lang-core` (which supports androidNative); it has no io/compile/backend deps.

`llm-core` `api`-depends on this module and **re-exports** it, so existing consumers are unaffected.
ARM-native consumers (e.g. `skainet-whisper-kmp`) depend on `transformer-core` directly and reuse the
KV-cache and attention primitives instead of reimplementing them.
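
The dependency wiring described above might look roughly like the following Gradle Kotlin DSL fragments.
This is a sketch, not the actual build files: the `:transformer-core` project path is taken from this
README, while the file layout, the source-set names, and the `<group>`/`<version>` placeholders are
assumptions. Plugin and repository setup is omitted.

```kotlin
// llm-core/build.gradle.kts (fragment, assumed layout)
kotlin {
    sourceSets {
        val commonMain by getting {
            dependencies {
                // `api` exposes transformer-core to llm-core's own consumers,
                // which is what keeps existing imports compiling unchanged.
                api(project(":transformer-core"))
            }
        }
    }
}

// build.gradle.kts of an ARM-native consumer (fragment, assumed layout)
kotlin {
    androidNativeArm32()
    androidNativeArm64()

    sourceSets {
        val commonMain by getting {
            dependencies {
                // Placeholder coordinates: substitute the published group and version.
                implementation("<group>:transformer-core:<version>")
            }
        }
    }
}
```

The consumer fragment never references `llm-core`, so none of the non-androidNative deps are pulled in.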

## Why
These primitives only need `lang-core`, but they were trapped in `llm-core`: its *other* deps
(`io-gguf`, `io-core`, `compile-*`, `backend-cpu`) lack androidNative targets, so ARM-native consumers
couldn't depend on it. The primitives are **dtype-agnostic** (they just call `ops.*`), so this target
generalization is orthogonal to the quant/dtype generalization (issue #178); the two meet cleanly at
these primitives.
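
To illustrate what "dtype-agnostic" means here, the sketch below writes a residual connection purely
against an operations interface. `Ops`, `ResidualAddSketch`, and `FloatArrayOps` are hypothetical names
invented for this example; the real `ops.*` surface in `lang-core` is not shown in this README and will
differ.

```kotlin
// Hypothetical stand-in for the `ops.*` surface; not the real lang-core API.
interface Ops<T> {
    fun add(a: T, b: T): T
}

// A residual connection written only against Ops<T>: it never inspects the element type,
// so the same code works for any tensor representation that supplies an Ops implementation.
class ResidualAddSketch<T>(private val ops: Ops<T>) {
    fun forward(x: T, sublayerOutput: T): T = ops.add(x, sublayerOutput)
}

// One possible backend for illustration: element-wise addition on plain FloatArray.
object FloatArrayOps : Ops<FloatArray> {
    override fun add(a: FloatArray, b: FloatArray): FloatArray {
        require(a.size == b.size) { "shape mismatch: ${a.size} vs ${b.size}" }
        return FloatArray(a.size) { i -> a[i] + b[i] }
    }
}

fun main() {
    val residual = ResidualAddSketch(FloatArrayOps)
    val out = residual.forward(floatArrayOf(1f, 2f), floatArrayOf(0.5f, 0.5f))
    println(out.joinToString())
}
```

Under this shape, a quantized representation would only need its own `Ops` implementation; the primitive
itself stays untouched, which is the sense in which the two generalizations are orthogonal.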

## What moved (15 files, lang-core-only)
`transformer/*` (KVCache, RoPE, ResidualAdd, MultiHeadAttention, GeGLUFFN, SwiGLUFFN, XIELUActivation,
LayerScalarMul, LinearProjection, VoidDense), `layers/*` (Embedding*), `normalization/RMSNormalization`,
`dsl/TransformerDsl`. **Kept in `llm-core`:** `dsl/decoder/*` (`DecoderTransformerNetwork` needs
`apps.llm.HybridTransformerBlock`, which is compile-opt-coupled).

One back-reference was decoupled: `MultiHeadAttention`'s diagnostic `dumpStats` call became a settable
`mhaStatSink` (default no-op), which `HybridTransformerBlock` wires to `llm-core`'s platform `dumpStats`,
so no behaviour is lost.
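
The decoupling can be pictured with the minimal sketch below. Only the names `mhaStatSink`,
`MultiHeadAttention`, and `dumpStats` come from this README; the sink's signature, the `forward` method,
and the statistics printed are assumptions made for illustration and are not the module's actual code.

```kotlin
// Assumed signature for illustration: a label plus the values to summarize.
typealias StatSink = (label: String, values: FloatArray) -> Unit

// Lives in the lower-level module. Default is a no-op, so that module
// needs no platform-specific diagnostics code.
var mhaStatSink: StatSink = { _, _ -> }

// Simplified stand-in for the attention primitive: it only reports through the sink.
class MultiHeadAttention {
    fun forward(scores: FloatArray): FloatArray {
        mhaStatSink("attn.scores", scores)
        return scores
    }
}

// Stand-in for the higher-level module's platform diagnostic.
fun dumpStats(label: String, values: FloatArray) {
    println("$label: n=${values.size} min=${values.minOrNull()} max=${values.maxOrNull()}")
}

fun main() {
    val mha = MultiHeadAttention()
    mha.forward(floatArrayOf(0.1f, 0.9f)) // default sink: nothing is reported

    // The higher-level module installs its diagnostic once, restoring the old behaviour.
    mhaStatSink = ::dumpStats
    mha.forward(floatArrayOf(0.1f, 0.9f))
}
```

The dependency now points one way only: the higher-level module knows about the sink, while the
primitive knows nothing about where its statistics end up.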

## Verified
`:transformer-core:` compiles for jvm, androidNativeArm32, and androidNativeArm64; `:llm-core:jvmTest`
is green (5/5) via the re-export.

## Landing (for the maintainer)
Branch `feature/transformer-core` was cut from `release/0.31.0`. To land on `develop` (which already
includes #179/#180, merged for #178):
1. `git fetch origin && git rebase origin/develop`. **No conflicts are expected on the moved files**:
#178's merged work is in the model layer (`GemmaPackedWeights`) and the engine (`ops.transpose`
Q8_0/Q4_0), not these primitives. (Verified against local refs; re-check against a fresh `develop`.)
2. Build the full target matrix and run the `:llm-core:` tests; open a PR; CI-publish; bump the
`skainet`/transformers pins.
3. **Note for future quant work:** the pre-transpose-marker (#178 "Solution C") will land in
`LinearProjection.kt`, which now lives **here**, not in `llm-core`. `RowDequantSource` and packed-weight
packing (today in `sk.ainet.models.gemma`) are the next candidates to hoist into a shared `quant` layer
or this module; that is what makes quant reusable across models *and* whisper.
