.github/workflows/docs.yml (new file):

name: Docs

on:
  push:
    branches: [ main, develop ]
    paths:
      - 'docs/**'
      - '.github/workflows/docs.yml'
  pull_request:
    paths:
      - 'docs/**'
      - '.github/workflows/docs.yml'
  workflow_dispatch:

concurrency:
  group: docs-${{ github.ref }}
  cancel-in-progress: true

permissions:
  contents: read
  pages: write
  id-token: write

jobs:
  build-docs:
    runs-on: ubuntu-latest
    timeout-minutes: 10

    steps:
      - name: Checkout
        uses: actions/checkout@v6

      - name: Build Antora site
        run: |
          docker run --rm \
            -v "${{ github.workspace }}:/antora" \
            --workdir /antora/docs \
            --entrypoint sh \
            docker.io/antora/antora:3.1 \
            -c "npm i asciidoctor-kroki && antora --stacktrace antora-playbook.yml"

      - name: Upload artifact
        uses: actions/upload-pages-artifact@v3
        with:
          path: docs/build/site

  deploy-docs:
    if: github.ref == 'refs/heads/develop' && github.event_name == 'push'
    needs: build-docs
    runs-on: ubuntu-latest
    environment:
      name: github-pages
      url: ${{ steps.deployment.outputs.page_url }}

    steps:
      - name: Deploy to GitHub Pages
        id: deployment
        uses: actions/deploy-pages@v4
docs/antora-playbook.yml (new file):

site:
  title: SKaiNET Transformers
  start_page: skainet-transformers::index.adoc

content:
  sources:
    - url: .
      start_path: docs
      branches: HEAD

asciidoc:
  extensions:
    - asciidoctor-kroki
  attributes:
    kroki-server-url: https://kroki.io
    kroki-fetch-diagram: true

ui:
  bundle:
    url: https://gitlab.com/antora/antora-ui-default/-/jobs/artifacts/HEAD/raw/build/ui-bundle.zip?job=bundle-stable
    snapshot: true

output:
  dir: ./build/site
@@ -14,7 +14,29 @@ This led to:

 The pipeline is split into stages that are independently replaceable:

-[horizontal]
+[mermaid]
+....
+graph LR
+    subgraph "Model Format"
+        A[Weight Loading]
+    end
+    subgraph "Architecture"
+        B[Network Definition]
+    end
+    subgraph "Framework"
+        C[Graph Compilation]
+        D[Inference Runtime]
+    end
+    subgraph "I/O"
+        E[Tokenization]
+    end
+    subgraph "Application"
+        F[Chat Pipeline]
+    end
+
+    A --> B --> C --> D --> E --> F
+....
+
 Weight Loading:: Parse GGUF/SafeTensors into typed tensor maps. Model-format concern, not architecture concern.
 Network Definition:: Pure functions (`llamaNetwork()`, `apertusNetwork()`) that return a `Module` tree. Architecture concern only.
 Graph Compilation:: Trace the module tree into a DAG, apply optimization passes. Framework concern.
2828
2929== Dependency Graph
3030
31- ----
32- llm-apps/skainet-cli
33- -> llm-runtime/kllama -> llm-inference/llama -> llm-core
34- -> llm-agent -> llm-core
35- -> skainet-backend-cpu (SIMD tensor ops)
36- -> skainet-io-gguf (GGUF parsing)
31+ [mermaid]
32+ ....
33+ graph LR
34+ subgraph Apps
35+ skainet-cli
36+ kllama-cli
37+ end
3738
38- llm-agent
39- -> llm-core (InferenceRuntime, Tokenizer)
39+ subgraph Runtime
40+ kllama
41+ kqwen
42+ kgemma
43+ kapertus
44+ end
4045
41- llm-core
42- -> skainet-lang-core (tensor types, DSL)
43- -> skainet-compile-dag (compute graph)
44- -> skainet-compile-opt (optimization passes)
45- -> skainet-io-core (I/O abstractions)
46- -> skainet-io-gguf (GGUF reader)
47- ----
46+ subgraph Inference
47+ llama
48+ gemma
49+ apertus
50+ bert
51+ voxtral
52+ end
53+
54+ subgraph Core
55+ llm-core
56+ llm-agent
57+ end
58+
59+ subgraph SKaiNET
60+ skainet-lang-core
61+ skainet-compile-dag
62+ skainet-compile-opt
63+ skainet-io-gguf
64+ skainet-backend-cpu
65+ end
66+
67+ skainet-cli --> kllama
68+ skainet-cli --> llm-agent
69+ kllama-cli --> kllama
70+ kllama --> llama
71+ kllama --> llm-agent
72+ kqwen --> llama
73+ kgemma --> gemma
74+ kapertus --> apertus
75+ llama --> llm-core
76+ gemma --> llm-core
77+ apertus --> llm-core
78+ bert --> llm-core
79+ voxtral --> llm-core
80+ llm-agent --> llm-core
81+ llm-core --> skainet-lang-core
82+ llm-core --> skainet-compile-dag
83+ llm-core --> skainet-compile-opt
84+ llm-core --> skainet-io-gguf
85+ kllama --> skainet-backend-cpu
86+ ....
4887
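Editorial aside: the fan-out from `llm-core` in the graph above maps naturally onto a multi-project Gradle build. A hypothetical `llm-core/build.gradle.kts` fragment — the project paths are guessed from the module names in the graph, and the api/implementation split is an editorial suggestion, not taken from the repository:

```kotlin
// llm-core/build.gradle.kts — illustrative sketch, not the repository's
// actual build file. Project paths are assumptions derived from the
// dependency graph above.
dependencies {
    api(project(":skainet-lang-core"))              // tensor types, DSL
    api(project(":skainet-compile-dag"))            // compute graph
    implementation(project(":skainet-compile-opt")) // optimization passes
    implementation(project(":skainet-io-gguf"))     // GGUF reader
}
```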
 == Key Interfaces

@@ -3,37 +3,22 @@

 == Pipeline Stages

-[source]
-----
-GGUF/SafeTensors File
-         |
-         v
-[1] WeightLoader        Parse metadata + tensor data
-         |
-         v
-[2] DSL Network Def     llamaNetwork(), qwenNetwork(), apertusNetwork()
-         |
-         v
-[3] ComputeGraph (DAG)  Record forward pass into directed acyclic graph
-         |
-         v
-[4] Optimization        TransposeElim -> WeightDedup -> LLMFusion -> DCE
-         |
-         v
-[5] Executor            ComputeGraphExecutor with fused kernels
-         |
-         v
-[6] InferenceRuntime    forward(tokenId) -> logits, generate(), sample()
-         |
-         v
-[7] Tokenizer           encode(text) -> IntArray, decode(token) -> String
-         |
-         v
-[8] ChatPipeline        ChatTemplate + AgentLoop + ToolRegistry
-         |
-         v
-Generated text / Tool call results
-----
+[mermaid]
+....
+graph TD
+    A[GGUF / SafeTensors File] --> B["[1] WeightLoader<br/>Parse metadata + tensor data"]
+    B --> C["[2] DSL Network Def<br/>llamaNetwork(), qwenNetwork()"]
+    C --> D["[3] ComputeGraph (DAG)<br/>Record forward pass"]
+    D --> E["[4] Optimization<br/>TransposeElim → WeightDedup → LLMFusion → DCE"]
+    E --> F["[5] Executor<br/>ComputeGraphExecutor with fused kernels"]
+    F --> G["[6] InferenceRuntime<br/>forward(tokenId) → logits"]
+    G --> H["[7] Tokenizer<br/>encode(text) → IntArray"]
+    H --> I["[8] ChatPipeline<br/>ChatTemplate + AgentLoop + ToolRegistry"]
+    I --> J[Generated text / Tool call results]
+
+    style A fill:#f9f,stroke:#333
+    style J fill:#9f9,stroke:#333
+....

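Editorial aside: read end to end, the eight stages above compose like ordinary functions. A toy Kotlin skeleton of that hand-off — every type, function, and pass here is invented for illustration (only stage names such as TransposeElim come from the diagram; none of this is the real SKaiNET API):

```kotlin
// Toy skeleton of the pipeline as plain function hand-offs.
// All types and names are illustrative stand-ins, not the real API.
data class Weights(val tensors: Map<String, FloatArray>)  // [1] WeightLoader output
data class Graph(val ops: List<String>)                   // [3]/[4] compute graph

fun loadWeights(path: String) = Weights(mapOf("wte" to FloatArray(8)))
fun defineNetwork() = listOf("embed", "transpose", "transpose", "attn", "mlp", "logits")
fun trace(layers: List<String>) = Graph(layers)           // [3] record the forward pass
fun optimize(g: Graph) =                                  // [4] e.g. a TransposeElim-style pass
    Graph(g.ops.filterNot { it == "transpose" })

// [5]/[6] toy "executor + runtime": stand-in arithmetic, not real inference.
fun execute(g: Graph, w: Weights, tokenId: Int): Int = (tokenId + g.ops.size) % 100

fun main() {
    val w = loadWeights("model.gguf")
    val g = optimize(trace(defineNetwork()))
    println(g.ops)                 // transposes eliminated before execution
    println(execute(g, w, tokenId = 1))
}
```

The point of the sketch: because each stage consumes only the previous stage's output type, any one of them can be replaced without touching the others.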
 == Stage Details

@@ -7,17 +7,19 @@ This tutorial shows how to use `ChatSession` to add tool calling to any model ru

 The tool calling pipeline is decoupled from the model runtime:

-----
-InferenceRuntime + Tokenizer + ModelMetadata
-         |
-    ChatSession
-         |
-    AgentLoop (generate -> parse -> execute -> re-prompt)
-         |
-    ChatTemplate (format messages, parse tool calls)
-         |
-    ToolRegistry (execute tool functions)
-----
+[mermaid]
+....
+graph TD
+    A[InferenceRuntime + Tokenizer + ModelMetadata] --> B[ChatSession]
+    B --> C[AgentLoop]
+    C --> D{Tool calls?}
+    D -->|Yes| E[ToolRegistry<br/>execute tool functions]
+    E --> F[Append result to messages]
+    F --> C
+    D -->|No| G[Final response]
+    B --> H[ChatTemplate<br/>format messages, parse tool calls]
+    H --> C
+....

 Any model that implements `InferenceRuntime` and has a `Tokenizer` can use tool calling.

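Editorial aside: the Yes branch of the diagram above is the classic agent loop — generate, parse, execute, re-prompt until no tool call remains. A minimal Kotlin sketch of that control flow; every signature here is an assumption made for illustration, not the actual `ChatSession`/`AgentLoop` API:

```kotlin
// Minimal agent-loop sketch. All types and signatures are illustrative
// assumptions, not the repository's API.
data class ToolCall(val name: String, val arg: String)

// Toy "model": asks for a tool once, then produces a final answer.
class ToyRuntime {
    private var turn = 0
    fun generate(prompt: String): String =
        if (turn++ == 0) "CALL time()" else "It is $prompt"
}

// Stand-in for ChatTemplate's tool-call parsing.
fun parseToolCall(reply: String): ToolCall? =
    if (reply.startsWith("CALL "))
        ToolCall(reply.removePrefix("CALL ").substringBefore("("), "")
    else null

// Stand-in for ToolRegistry dispatch.
fun executeTool(call: ToolCall): String = when (call.name) {
    "time" -> "12:00"
    else -> "unknown tool"
}

fun agentLoop(runtime: ToyRuntime, userMessage: String, maxTurns: Int = 4): String {
    var prompt = userMessage
    repeat(maxTurns) {
        val reply = runtime.generate(prompt)
        val call = parseToolCall(reply) ?: return reply  // No tool call: final response
        prompt = executeTool(call)                       // Append result, re-prompt
    }
    return "max turns exceeded"
}

fun main() {
    println(agentLoop(ToyRuntime(), "What time is it?"))  // It is 12:00
}
```

Note the `maxTurns` guard: a real loop needs a cap so a model that keeps emitting tool calls cannot spin forever.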