Commit 1bc57a2

Merge origin/develop into feature/data-loader-api
2 parents ce4640f + 6a10b1d commit 1bc57a2

32 files changed

Lines changed: 1873 additions & 275 deletions

CHANGELOG.md

Lines changed: 41 additions & 0 deletions
@@ -2,6 +2,47 @@

 ## [Unreleased]

+## [0.33.0] - 2026-06-29
+
+### Added
+
+- **GRU layer (`sk.ainet.lang.nn.Gru`).** SKaiNET's first recurrent layer (issue #217): single-layer,
+  unidirectional, batch-first `[B, S, D] -> [B, S, H]`, PyTorch gate order (reset, update, new). Built
+  by composing existing primitives (matmul/add/sigmoid/tanh/narrow/concat) **unrolled over the static
+  sequence length at trace time** — StableHLO has no loop construct, so any recurrence must unroll. It
+  runs eagerly, is trainable through the standard tape, and exports to StableHLO with no dedicated
+  converter. Also adds a `gru(hiddenSize) { … }` network-DSL builder. (PR #772)
+- **`upsample2d` Bilinear + StableHLO export.** Adds the Bilinear forward (PyTorch coord map, 4-neighbour
+  blend) and its autodiff backward, and a traceable StableHLO lowering for **both** Nearest and Bilinear
+  (scale is static at trace time, so everything lowers to fixed reshape/broadcast/`dot_general` — no
+  runtime index math, no `custom_call`). Unblocks export of resize/FPN-style paths. (PR #771)
+- **Seven newly-differentiable ops.** `cos`, `sin`, `tril`, `gather`, `indexSelect`, `unfold`,
+  `convTranspose1d` now carry `@Diff` and have backward rules (with finite-difference parity tests):
+  trig for RoPE, `gather` for embedding lookup, `tril` for causal masks, the rest structural. (PR #774)
+- **KSP-generated autodiff-coverage guard.** The tracing-wrapper processor now emits
+  `DifferentiableTensorOpsRules.ruleNames` (the authoritative `@Diff` op set); a unit test asserts the
+  execution tape's dispatch covers it, so a differentiable op can no longer ship with a backward rule
+  that is never wired. `operators.json` now records `isDifferentiable` (+ optional `diffRuleName`),
+  schema-validated. (PR #774)
+
+### Fixed
+
+- **Silent gradient drop for `elu`, `leakyRelu`, `permute`.** These were `@Diff` and had correct
+  backward formulas, but had no arm in the execution tape's trace dispatch, so their gradients fell
+  through to `null` and were silently discarded. Now wired (and guarded by the coverage test above);
+  `permuteBackward` also fixed to decode its `axes` attribute as the traced `List<Int>`. (PR #774)
+- **`layerNorm` / `rmsNorm` / `batchNorm` lower to real `stablehlo.reduce`.** The norm converters
+  previously emitted non-compilable `reduce_mean` / `reduce_variance` `custom_call`s (export-only); they
+  now decompose to real `stablehlo.reduce`, so all three compile and run on stock IREE (llvm-cpu). (PR #769)
+
+### Changed
+
+- **BREAKING: `TensorOps.sin`, `TensorOps.cos`, `TensorOps.convTranspose1d` are now abstract.** They
+  previously had default `throw NotImplementedError(...)` bodies; they are abstract so the tracing
+  wrapper records them (and they become differentiable/exportable). Any type implementing `TensorOps`
+  directly must now override them — both bundled backends (`DefaultCpuOpsBase`, `VoidTensorOps`) already
+  do. (PR #774)
+
 ## [0.32.4] - 2026-06-26

 ### Fixed
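
The GRU entry in the CHANGELOG.md hunk above describes a recurrence that is unrolled over a static sequence length, using PyTorch gate order (reset, update, new). As a rough illustration of that idea only, here is a minimal stdlib-Python sketch for a single example (the batch dimension is dropped). It is not SKaiNET's Kotlin implementation; the function names and the parameter dictionary layout (`W_ir`, `b_hn`, and so on) are hypothetical, while the gate equations follow PyTorch's documented GRU formulation.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(w, v):
    """Multiply matrix w (list of rows) by vector v."""
    return [sum(wij * vj for wij, vj in zip(row, v)) for row in w]


def zero_params(d, h):
    """Hypothetical parameter layout: input/hidden weights and biases per gate r, z, n."""
    p = {}
    for g in "rzn":
        p["W_i" + g] = [[0.0] * d for _ in range(h)]
        p["W_h" + g] = [[0.0] * h for _ in range(h)]
        p["b_i" + g] = [0.0] * h
        p["b_h" + g] = [0.0] * h
    return p


def gru_unrolled(xs, h0, p):
    """xs: S input vectors of length D; h0: initial hidden state of length H.

    The Python loop plays the role of trace-time unrolling: S is fixed, so the
    result is S copies of the same matmul/add/sigmoid/tanh step.
    """
    h = list(h0)
    outputs = []
    for x in xs:
        gi = {g: [a + b for a, b in zip(matvec(p["W_i" + g], x), p["b_i" + g])] for g in "rzn"}
        gh = {g: [a + b for a, b in zip(matvec(p["W_h" + g], h), p["b_h" + g])] for g in "rzn"}
        r = [sigmoid(a + b) for a, b in zip(gi["r"], gh["r"])]  # reset gate
        z = [sigmoid(a + b) for a, b in zip(gi["z"], gh["z"])]  # update gate
        n = [math.tanh(a + ri * b) for a, ri, b in zip(gi["n"], r, gh["n"])]  # new gate
        h = [(1.0 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]
        outputs.append(h)
    return outputs  # S hidden states, i.e. [S, H] for one example
```

With all-zero parameters both sigmoid gates evaluate to 0.5 and the new gate to 0, so each step simply halves the hidden state, which makes the unrolled recurrence easy to check by hand.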

CONTRIBUTING.md

Lines changed: 3 additions & 3 deletions
@@ -6,9 +6,9 @@ review.

 ## When to Write an SKEEP

-SKEEP stands for SKaiNET Enhancement and Evolution Proposal. It is the
-project's KEEP-style track for changes that need a durable design record before
-or alongside implementation.
+SKEEP stands for SKaiNET Evolution and Enhancement Process. It is the
+project's KEEP-style process for changes that need a durable design record
+before or alongside implementation.

 Write an SKEEP when a change affects:

README.md

Lines changed: 42 additions & 14 deletions
@@ -36,17 +36,13 @@ Add the core dependencies (Gradle Kotlin DSL):
 ```kotlin
 dependencies {
     // Recommended: import the umbrella BOM and drop versions on the engine modules.
-    implementation(platform("sk.ainet:skainet-bom:0.32.4"))
+    implementation(platform("sk.ainet:skainet-bom:0.33.0"))

     implementation("sk.ainet.core:skainet-lang-core")
     implementation("sk.ainet.core:skainet-backend-cpu")
 }
 ```

-> The BOM was first correctly published to Maven Central in 0.22.2 — earlier versions
-> shipped at the wrong coordinates and could not be imported. Pin versions directly if
-> you need an older release.
-
 ### Hello Neural Net

 ```kotlin
@@ -145,26 +141,41 @@ Quick local replay:
 ## Architecture goal

 SKaiNET is built around one path: **a model is defined once in the Kotlin DSL,
-then either compiled to native code or executed eagerly — without rewriting it.**
+then either compiled or executed eagerly — without rewriting it.**

 1. **Define** the model with the DSL (`nn { }` / `dag { }`).
-2. **Capture** it as a *tape* (traced execution) or a *DAG* (explicit graph).
+2. **Capture** it as a *tape* (traced execution) or a *DAG* (explicit graph) — a `ComputeGraph`.
 3. **Run** it one of two ways:
-   - **Compile** — lower the graph to MLIR / StableHLO (`HloGenerator`) and
-     compile to **native** code (IREE-compatible) for native / edge targets.
+   - **Compile** — lower the captured `ComputeGraph` through one of several
+     **sibling code-generation backends**, each emitting code for a different target
+     from the *same* graph:
+     - **StableHLO / MLIR** (`HloGenerator`) → IREE-compilable, for native / edge /
+       accelerator targets and the wider MLIR ecosystem.
+     - **Arduino / C99** → standalone, statically-allocated C for microcontrollers.
+     - **Minerva** → a secure-MCU bundle (weights + firmware skeleton + fingerprinted
+       manifest).
    - **Eager** — execute directly on an available backend. On the **JVM this is
      the primary, go-to path.**

+StableHLO/MLIR is therefore **one code-generation backend among siblings** — the
+IREE/native path next to the C99/Arduino and Minerva MCU paths — not a separate
+pipeline.
+
 ```mermaid
 flowchart LR
-    DSL["Model — Kotlin DSL"] --> Graph["Tape / DAG"]
-    Graph --> HLO["MLIR / StableHLO"]
+    DSL["Model — Kotlin DSL"] --> Graph["Tape / DAG (ComputeGraph)"]
     Graph --> Eager["Eager backend (JVM, …)"]
-    HLO --> Native["Native code"]
+    Graph -->|code generation| HLO["StableHLO / MLIR"]
+    Graph -->|code generation| C99["Arduino / C99"]
+    Graph -->|code generation| Minerva["Minerva"]
+    HLO --> Native["IREE → native / edge / accelerator"]
+    C99 --> MCU["Microcontroller"]
+    Minerva --> SecMCU["Secure-MCU bundle"]
 ```

-The same DSL model feeds both paths — eager execution for development and JVM
-deployment, the StableHLO path for native and edge targets.
+The same DSL model feeds every path: eager execution for development and JVM
+deployment, and the code-generation backends — StableHLO/MLIR (→ IREE), Arduino/C99,
+and Minerva — as **sibling alternatives** for native, edge, and secure-MCU targets.

 ---
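
The revised "Architecture goal" text in the hunk above frames eager execution and each code generator as consumers of one captured graph. The following stdlib-Python toy sketches that shape under heavy assumptions: `Node`, `GRAPH`, `run_eager`, and both emitters are hypothetical names, the graph works on scalars, and the emitted strings are made-up pseudo-syntax rather than real StableHLO or the project's generated C.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    op: str        # operation name
    inputs: tuple  # graph inputs ("x") or earlier results ("%0")


# Hypothetical captured graph for y = tanh(x * w + b) on scalars.
GRAPH = (
    Node("mul", ("x", "w")),
    Node("add", ("%0", "b")),
    Node("tanh", ("%1",)),
)

KERNELS = {"mul": lambda a, b: a * b, "add": lambda a, b: a + b, "tanh": math.tanh}


def run_eager(graph, env):
    """Eager path: interpret the graph directly with Python kernels."""
    values = dict(env)
    for i, node in enumerate(graph):
        args = [values[name] for name in node.inputs]
        values[f"%{i}"] = KERNELS[node.op](*args)
    return values[f"%{len(graph) - 1}"]


def emit_pseudo_hlo(graph):
    """Code-generation sibling 1: one pseudo-IR line per node (not real StableHLO)."""
    return [f"%{i} = pseudo.{n.op}({', '.join(n.inputs)})" for i, n in enumerate(graph)]


def emit_pseudo_c(graph):
    """Code-generation sibling 2: one pseudo-C statement per node."""
    def c_name(name):
        return "t" + name[1:] if name.startswith("%") else name

    return [
        f"float t{i} = {n.op}_f32({', '.join(c_name(a) for a in n.inputs)});"
        for i, n in enumerate(graph)
    ]


# Siblings are registered side by side; none of them is a special pipeline.
CODEGEN_BACKENDS = {"stablehlo": emit_pseudo_hlo, "c99": emit_pseudo_c}
```

Every entry in `CODEGEN_BACKENDS` takes the same `GRAPH` that `run_eager` interprets, which is the point of the "sibling backends" wording: adding a target means adding an emitter, not a second capture path.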

@@ -269,6 +280,23 @@ val withoutLabel = dataPipeline<RawDataset>()

 ---

+## What's New in 0.33.0
+
+- **GRU — the first recurrent layer.** `nn.Gru` (`[B,S,D]->[B,S,H]`, PyTorch gate order) composed from
+  existing primitives and unrolled over the static sequence at trace time, so it runs eagerly, trains
+  through the standard tape, and exports to StableHLO with no dedicated converter. Plus a `gru(…)`
+  network-DSL builder. (PR #772, issue #217)
+- **`upsample2d` Bilinear + StableHLO export** for both Nearest and Bilinear — everything lowers to fixed
+  reshape/broadcast/`dot_general` (no `custom_call`), unblocking resize/FPN-style export. (PR #771)
+- **Autodiff correctness + coverage.** Fixes a silent gradient-drop for `elu`/`leakyRelu`/`permute`
+  (backward rules existed but were never wired into the trace dispatch), makes `cos`/`sin`/`tril`/
+  `gather`/`indexSelect`/`unfold`/`convTranspose1d` differentiable, and adds a KSP-generated coverage
+  guard so a differentiable op can no longer ship without a wired backward. (PR #774)
+- **Norms compile on stock IREE.** `layerNorm`/`rmsNorm`/`batchNorm` now lower to real `stablehlo.reduce`
+  instead of export-only `custom_call`s. (PR #769)
+- **Breaking:** `TensorOps.sin`/`cos`/`convTranspose1d` are now abstract — backends implementing
+  `TensorOps` directly must override them (both bundled backends already do).
+
 ## What's New in 0.32.4

 - **Streaming detokenization keeps word spaces (`Tokenizer.decodeToken`).** Decoding generated tokens
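
The release notes in the hunk above mention finite-difference parity tests for the newly differentiable ops. The stdlib-Python sketch below shows the general technique for `sin` and `cos` on scalars: compare an analytic backward rule against a central difference. It is an illustration of the idea, not the project's Kotlin test code; `FORWARD`, `BACKWARD`, `check_parity`, the step size, and the tolerance are all assumptions.

```python
import math


def central_difference(f, x, eps=1e-6):
    """Numerical derivative of f at x using a symmetric step."""
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)


FORWARD = {"sin": math.sin, "cos": math.cos}

# Analytic backward rules: upstream gradient times the local derivative.
BACKWARD = {
    "sin": lambda x, upstream: upstream * math.cos(x),
    "cos": lambda x, upstream: -upstream * math.sin(x),
}


def check_parity(name, points, backward=BACKWARD, tol=1e-6):
    """Return True when the analytic rule matches the numerical derivative at every point."""
    for x in points:
        numeric = central_difference(FORWARD[name], x)
        analytic = backward[name](x, 1.0)
        if abs(numeric - analytic) > tol:
            return False
    return True
```

A rule with a sign error, such as a `cos` backward that forgets the minus sign, fails this check at any point where `sin(x)` is not close to zero, which is the class of mistake such parity tests are meant to catch.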

build-logic/convention/src/main/resources/schemas/operator-doc-schema-v1.json

Lines changed: 8 additions & 0 deletions
@@ -115,6 +115,14 @@
       },
       "description": "Array of notes associated with the function"
     },
+    "isDifferentiable": {
+      "type": "boolean",
+      "description": "Whether the op carries @Diff, i.e. has a generated backward-rule contract and must be wired into the autodiff dispatch. Sourced from the @Diff annotation."
+    },
+    "diffRuleName": {
+      "type": "string",
+      "description": "Custom adjoint rule name when @Diff(ruleName=...) is set; omitted for bare @Diff (rule name defaults to the op name)."
+    },
     "validated": {
       "type": "boolean",
       "description": "Whether the function's documentation has been DARC-validated by a reviewer. Sourced from the @DarcValidated annotation on the function."
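
The two schema fields added in the hunk above pair naturally with the coverage guard described in the changelog: collect the rule name of every differentiable op and check that the backward dispatch has an arm for each. Below is a minimal stdlib-Python sketch of that check. The `name` key, the sample operator entries, and the `gatherAdjoint` rule name are hypothetical; only `isDifferentiable`, `diffRuleName`, and the "defaults to the op name" behaviour come from the schema text.

```python
def differentiable_rule_names(operators):
    """Rule names for every op flagged isDifferentiable.

    A bare flag falls back to the op name; diffRuleName overrides it.
    """
    names = set()
    for op in operators:
        if op.get("isDifferentiable", False):
            names.add(op.get("diffRuleName", op["name"]))
    return names


def missing_backward_arms(operators, dispatch):
    """Rule names that are declared differentiable but have no dispatch entry."""
    return sorted(differentiable_rule_names(operators) - set(dispatch))


# Hypothetical operator records in the spirit of operators.json.
OPERATORS = [
    {"name": "sin", "isDifferentiable": True},
    {"name": "elu", "isDifferentiable": True},
    {"name": "permute", "isDifferentiable": True},
    {"name": "gather", "isDifferentiable": True, "diffRuleName": "gatherAdjoint"},
    {"name": "argmax", "isDifferentiable": False},
]

# Hypothetical dispatch table: rule name -> backward function.
DISPATCH = {
    "sin": lambda upstream: upstream,
    "gatherAdjoint": lambda upstream: upstream,
}
```

Here `missing_backward_arms(OPERATORS, DISPATCH)` reports `elu` and `permute`, mirroring the kind of unwired rule the changelog says was previously dropped silently. A unit test would assert that this list is empty.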

docs/.docker/Dockerfile

Lines changed: 78 additions & 31 deletions
@@ -1,56 +1,103 @@
+# markup-antora — one Antora image to use everywhere.
+#
+# Consolidates the best of the ~9 Antora images that grew across the
+# projects tree. Design priorities, in order:
+#   1. Mermaid rendered FULLY OFFLINE — no Kroki server, no kroki.io,
+#      no CDN at build time and none at view time. Diagrams are baked
+#      to inline SVG by mermaid-cli (Alpine Chromium + Puppeteer)
+#      through the local-mermaid-extension.js Asciidoctor block
+#      processor, with content-hash caching so repeated diagrams render
+#      once per build.
+#   2. Works under `--user $(id -u):$(id -g)` (rootless) without the
+#      Chromium crashpad / cosmiconfig EACCES failures.
+#   3. Offline extras available but not forced: lunr full-text search,
+#      a pre-baked Antora UI bundle, and MathJax es5 for LaTeX.
+#   4. asciidoctor-kroki installed-but-unused as an escape hatch.
 FROM node:20-alpine

-LABEL org.opencontainers.image.title="SKaiNET Antora" \
-      org.opencontainers.image.description="Antora site generator with direct local Mermaid rendering (no Kroki round trip)" \
+LABEL org.opencontainers.image.title="markup-antora" \
+      org.opencontainers.image.description="Universal Antora site generator with offline Mermaid (mermaid-cli), offline search (lunr), pre-baked UI bundle + MathJax. No Kroki, no CDN." \
       org.opencontainers.image.source="https://github.com/SKaiNET-developers/SKaiNET"

-# Chromium for mermaid-cli (puppeteer)
-RUN apk add --no-cache chromium font-noto
+# Chromium for mermaid-cli (Puppeteer). Full font set so diagram labels,
+# emoji and CJK render correctly (merged from the Daily-StandAPP image).
+RUN apk add --no-cache \
+    chromium \
+    nss \
+    freetype \
+    harfbuzz \
+    ttf-freefont \
+    font-noto \
+    font-noto-emoji \
+    ca-certificates \
+    git

-# HOME=/tmp: chromium's crashpad handler writes its database under $HOME and
-# aborts with `chrome_crashpad_handler: --database is required` when the
-# container runs as `--user $(id -u):$(id -g)` and $HOME falls back to `/`
-# (no passwd entry, not writable). Same motivation as runtime.cache_dir in
-# antora-playbook.yml.
+# HOME=/tmp: Chromium's crashpad handler writes its database under $HOME
+# and aborts with `--database is required` when the container runs as a
+# non-root --user and $HOME falls back to `/` (no passwd entry, not
+# writable). Same motivation as runtime.cache_dir in the playbook.
 ENV PUPPETEER_EXECUTABLE_PATH=/usr/bin/chromium-browser \
     PUPPETEER_SKIP_DOWNLOAD=true \
     HOME=/tmp

-# Install Antora + mermaid-cli into /opt/antora (not /antora which gets
-# volume-mounted at run time). asciidoctor-kroki is intentionally NOT
-# installed — it depends on a Kroki HTTP server (kroki.io or local)
-# which returns 400 for large diagrams when using GET and has no
-# offline fallback. We render mermaid directly via mermaid-cli through
-# the local-mermaid-extension.js asciidoctor block processor.
+# Install Antora + tooling into /opt/antora (NOT /antora, which is where
+# the project gets volume-mounted at run time).
+#   - @mermaid-js/mermaid-cli : offline diagram rendering (the point)
+#   - @antora/lunr-extension  : offline full-text search
+#   - asciidoctor-kroki       : escape hatch only; the playbook should
+#                               use the local mermaid extension instead.
 WORKDIR /opt/antora
 RUN npm init -y && npm i --save-exact \
     @antora/cli@3.1 \
     @antora/site-generator@3.1 \
+    @antora/lunr-extension@1.0.0-alpha.8 \
     @mermaid-js/mermaid-cli@11 \
+    asciidoctor-kroki@0.18 \
     && npm cache clean --force

-# Make installed modules visible when workdir is the mounted project
+# Make installed modules resolvable even when the workdir is the mounted
+# project (which has no node_modules of its own).
 ENV NODE_PATH=/opt/antora/node_modules

-# Mermaid-cli config — used by the local-mermaid-extension to drive
-# Puppeteer against the pre-installed Alpine Chromium.
-RUN echo '{ \
-  "executablePath": "/usr/bin/chromium-browser", \
-  "args": ["--no-sandbox", "--disable-gpu", "--disable-dev-shm-usage"] \
-}' > /opt/antora/puppeteer-config.json
-
-# Bake the local mermaid extension in at an absolute path so the
-# Antora playbook can reference it without any volume-mount gymnastics.
+# Mermaid-cli / Puppeteer config and the offline block processor, baked
+# in at absolute paths the playbook can reference without mount gymnastics.
+COPY puppeteer-config.json /opt/antora/puppeteer-config.json
 COPY local-mermaid-extension.js /opt/antora/local-mermaid-extension.js

-# Verify mermaid-cli works end to end at image build time. The cleanup
-# also removes mode-0700 root-owned dirs (e.g. /tmp/.config/puppeteer,
-# /tmp/.local/share/chromium) that puppeteer/chromium drop into $HOME
-# during this run — leaving them in place would make cosmiconfig EACCES
-# when the container is later launched with a non-root --user.
+# --- Offline assets (available, not forced) -------------------------------
+
+# Pre-download the default Antora UI bundle so sites build without hitting
+# gitlab.com. Reference it from a playbook with:
+#   ui:
+#     bundle:
+#       url: /opt/antora-ui/ui-bundle.zip
+#       snapshot: true
+RUN mkdir -p /opt/antora-ui \
+    && wget -q -O /opt/antora-ui/ui-bundle.zip \
+       "https://gitlab.com/antora/antora-ui-default/-/jobs/artifacts/HEAD/raw/build/ui-bundle.zip?job=bundle-stable"
+
+# Pre-download MathJax es5 for offline LaTeX. Copy /opt/mathjax/es5 into a
+# supplemental UI or reference it from your UI template for client-side math.
+RUN mkdir -p /opt/mathjax \
+    && npm pack mathjax@3 --pack-destination /tmp \
+    && tar -xzf /tmp/mathjax-*.tgz -C /tmp \
+    && cp -r /tmp/package/es5 /opt/mathjax/es5 \
+    && rm -rf /tmp/mathjax-* /tmp/package
+
+# --- Build-time smoke test + rootless cleanup -----------------------------
+
+# Verify mermaid-cli works end to end so a broken image fails the build,
+# not the user's first run. The cleanup also removes the mode-0700
+# root-owned dirs (/tmp/.config/puppeteer, /tmp/.local/share/chromium,
+# /tmp/.cache, /tmp/.npm) that Puppeteer/Chromium drop into $HOME during
+# this run — leaving them would make cosmiconfig EACCES when the container
+# is later launched with a non-root --user.
 RUN echo 'graph TD; A-->B;' > /tmp/test.mmd \
-    && npx mmdc -i /tmp/test.mmd -o /tmp/test.svg -p /opt/antora/puppeteer-config.json \
+    && /opt/antora/node_modules/.bin/mmdc \
+       -i /tmp/test.mmd -o /tmp/test.svg \
+       -p /opt/antora/puppeteer-config.json --quiet \
     && rm -rf /tmp/test.mmd /tmp/test.svg /tmp/.config /tmp/.local /tmp/.npm /tmp/.cache

+WORKDIR /antora
 ENTRYPOINT ["/opt/antora/node_modules/.bin/antora"]
 CMD ["--stacktrace", "antora-playbook.yml"]

docs/.docker/README.md

Lines changed: 72 additions & 0 deletions
@@ -0,0 +1,72 @@
+# Docs Antora image (`docs/.docker`)
+
+Self-contained Antora image used to build this repo's documentation site.
+It is the consolidated "markup-antora" image — one definition shared across
+the SKaiNET docs projects — vendored here until the public registry image
+is published (after which this `Dockerfile` collapses to a single `FROM`).
+
+## Features
+
+- **Offline Mermaid** — every `[mermaid]` block is rendered to **inline SVG**
+  at build time by `mermaid-cli` (Alpine Chromium + Puppeteer) via the baked-in
+  `local-mermaid-extension.js` Asciidoctor block processor. No Kroki server,
+  no `kroki.io`, no network — at build time *or* view time. Removes the
+  asciidoctor-kroki 4 KB GET-URL limit that rejected large diagrams.
+- **Diagram caching** — content-hash, in-memory + optional on-disk
+  (`MERMAID_CACHE_DIR`); identical diagrams render once.
+- **Rootless-safe** — runs under `--user $(id -u):$(id -g)` without the
+  Chromium crashpad / cosmiconfig `EACCES` failures (`HOME=/tmp`, build-time
+  cleanup of root-owned `/tmp` dirs).
+- **Build-time smoke test** — a broken image fails `docker build`, not your
+  first render.
+- **Offline extras** — `@antora/lunr-extension` (search), a pre-baked Antora
+  UI bundle, and MathJax es5 for LaTeX are available in the image.
+- **Kroki escape hatch** — `asciidoctor-kroki` is installed (unused here) for
+  other diagram types if ever needed.
+- Full Alpine font set (`font-noto`, `font-noto-emoji`, `ttf-freefont`, …) so
+  diagram labels, emoji and CJK render correctly.
+
+## Files
+
+| File | Purpose |
+|---|---|
+| `Dockerfile` | The consolidated image definition (build context = this dir). |
+| `local-mermaid-extension.js` | Offline Mermaid block processor; baked to `/opt/antora/`. |
+| `puppeteer-config.json` | Chromium flags for mermaid-cli; baked to `/opt/antora/`. |
+
+The playbook wires the extension via
+`asciidoc.extensions: [ /opt/antora/local-mermaid-extension.js ]`.
+
+## Usage
+
+Build the image (context is this directory):
+
+```bash
+docker build -t skainet-antora:local -f docs/.docker/Dockerfile docs/.docker/
+```
+
+Render the site (run from the repo root; mount the repo at `/antora`, run as
+your user so output isn't root-owned):
+
+```bash
+docker run --rm \
+  --user "$(id -u):$(id -g)" \
+  -v "$PWD:/antora" \
+  --workdir /antora/docs \
+  skainet-antora:local \
+  --stacktrace antora-playbook.yml
+
+# Output: docs/build/site/index.html
+```
+
+This is exactly what `.github/workflows/docs.yml` does in CI — it builds the
+image from this directory and runs the container the same way.
+
+Write diagrams as normal Asciidoctor blocks:
+
+```adoc
+[mermaid]
+----
+graph TD; A-->B; B-->C;
+----
+```
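
The "Diagram caching" feature listed in the new README above (content-hash keyed, in-memory plus optional on-disk) can be pictured with the following stdlib-Python sketch. The real processor is the JavaScript file `local-mermaid-extension.js`, whose code is not part of this diff, so everything here is an assumption made for illustration: the `DiagramCache` class, the injected `render` callable, the SHA-256 key, and the `.svg` file naming.

```python
import hashlib
import os


class DiagramCache:
    """Render each distinct diagram source once, keyed by a hash of its content."""

    def __init__(self, render, cache_dir=None):
        self._render = render        # callable: diagram source -> SVG text
        self._memory = {}            # in-memory layer, lives for one build
        self._cache_dir = cache_dir  # optional on-disk layer, survives builds
        if cache_dir:
            os.makedirs(cache_dir, exist_ok=True)

    def get(self, source):
        key = hashlib.sha256(source.encode("utf-8")).hexdigest()
        if key in self._memory:
            return self._memory[key]
        path = os.path.join(self._cache_dir, key + ".svg") if self._cache_dir else None
        if path and os.path.exists(path):
            # Disk hit: reuse the SVG written by an earlier build.
            with open(path, encoding="utf-8") as handle:
                svg = handle.read()
        else:
            # Miss on both layers: render, then persist if a directory is set.
            svg = self._render(source)
            if path:
                with open(path, "w", encoding="utf-8") as handle:
                    handle.write(svg)
        self._memory[key] = svg
        return svg
```

Keying on the diagram text rather than on a file name or block position means a diagram repeated across pages is rendered once, and any edit to its source produces a new key and therefore a fresh render.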
