Skip to content

Latest commit

 

History

History
683 lines (520 loc) · 36.6 KB

File metadata and controls

683 lines (520 loc) · 36.6 KB

usr/share/doc/mios/reference/sources.md -- References, Sub-Knowledge, and Iteration Pointers

This file consolidates every authoritative source consulted to build this knowledge base, plus pointers for further iteration on OpenAI API compliance and 'MiOS' upstream technologies. Every claim in usr/share/doc/mios/**/*.md should trace to one of these sources.


1. OpenAI Platform -- API Compliance Anchors

The 'MiOS' KB is authored against these specifications. Each link is the current (2026) reference -- re-fetch periodically; OpenAI iterates fast.

1.1 Responses API (recommended for new projects)

1.2 Chat Completions API (universal)

  • API reference: https://platform.openai.com/docs/api-reference/chat/create
  • Streaming: stream: true returns SSE chunks with delta shape
  • Tools: tools[].type = "function", then function: {name, description, parameters, strict?}
  • Structured outputs: response_format: {type: "json_schema", json_schema: {name, schema, strict}}
  • This is the form supported by every OpenAI-compatible local runtime (LocalAI, Ollama, vLLM, LM Studio, llama.cpp server, LiteLLM, OpenRouter)

1.3 Vector Stores / File Search

1.4 Function Calling (strict mode)

1.5 Batch API

1.6 Evals API

1.7 Fine-tuning -- SFT

1.8 Fine-tuning -- DPO (Direct Preference Optimization)

1.9 Embeddings

  • Models: text-embedding-3-small (default 1536 dims, configurable down to 512), text-embedding-3-large (default 3072, configurable down to 256)
  • Both have 8191-token context (cl100k_base tokenizer)
  • Recommended chunk sizing: 400-800 tokens for general docs, 200-500 for fact-dense reference material
  • Reference: https://vectorize.io/blog/openai-text-embedding-3-embedding-models-first-look

1.10 MCP Tool (Responses API only)

  • Tool object shape:
    {"type": "mcp",
     "server_label": "...",
     "server_url": "https://...",
     "require_approval": "never" | "always",
     "allowed_tools": [...],
     "headers": {"Authorization": "Bearer ..."}}
  • Security: OpenAI retains only schema/domain/subdomains of server_url between calls; auth headers must be re-sent every request
  • Not available via Chat Completions

1.11 Prompt engineering & reasoning effort

  • Prompt engineering guide: https://developers.openai.com/api/docs/guides/prompt-engineering
  • XML-style structuring works well; markdown supported; multi-section prompts (<role>, <task>, <output_contract>) recommended
  • For o-series reasoning models: reasoning.effort: "low" | "medium" | "high" and reasoning.summary are accepted in Responses API

2. 'MiOS' Repository -- File-Level Sources

Every chunk in usr/share/doc/mios/*.md (under this KB) traces to one or more of these 'MiOS' files. Re-fetch them via the mios_build_kb_refresh tool to refresh the KB.

2.1 Documentation files (root)

2.2 Agent-facing documentation

2.3 Build infrastructure

2.4 LLM ingestion entrypoints

2.5 Repository layout (FHS overlay)

The repo root is the system root. These directories ship 1:1 into the deployed image via the ctx scratch stage and the automation/08-system-files-overlay.sh overlay step:

  • usr/ -- read-only system content (binaries, libraries, vendor configs, kargs.d, systemd units, AI surface, SELinux modules)
  • etc/ -- host-overridable configs (Quadlets, repo files, AI overrides)
  • home/ -- bootstrap territory (per-user homes staged in Phase-3)
  • srv/ -- data served by the system (AI model weights, Ceph data -- declared via usr/lib/tmpfiles.d/)
  • v1/ -- versioned API surface artifacts
  • config/ -- build-time configs (notably config/artifacts/{bib,iso,qcow2,vhdx,wsl2}.toml for BIB)
  • tools/ -- build helpers (preflight.sh, mios-overlay.sh, mios-sysext-pack.sh, flight-control.sh, init-user-space.sh, log-to-bootstrap.sh); sourced helpers in tools/lib/ (userenv.sh)

2.6 CI

2.7 Bootstrap repo (separate)


3. Upstream Technologies

3.1 bootc (CNCF Sandbox)

3.2 ostree / libostree

3.3 composefs

3.4 Universal Blue / ucore / ucore-hci

3.5 Fedora bootc

3.6 dnf5

3.7 Podman / buildah / skopeo

3.8 bootc-image-builder (BIB)

3.9 rechunk

  • Project: https://github.com/hhd-dev/rechunk
  • Tool: bootc-base-imagectl rechunk --max-layers 67 <src> <dst>
  • Purpose: optimize OCI layer structure for 5-10× smaller bootc upgrade deltas

3.10 Cosign / Sigstore

3.11 syft (SBOM)

3.12 GitHub Container Registry (GHCR)

3.13 NVIDIA on Fedora bootc

3.14 Container Device Interface (CDI)

3.15 LocalAI (the 'MiOS' canonical local LLM endpoint)

  • Project: https://github.com/mudler/LocalAI
  • Docs: https://localai.io/
  • API surfaces (OpenAI-compatible): /v1/models, /v1/chat/completions (SSE + tools), /v1/embeddings, /v1/completions
  • 'MiOS' Quadlet: etc/containers/systemd/mios-ai.containerhttp://localhost:8080/v1
  • Backends: llama.cpp, vLLM-ish, transformers, gpt4all, exllama, etc.

3.16 Other local OpenAI-compatible runtimes (for Day-0 portability)

3.17 MCP (Model Context Protocol)

3.18 Looking Glass / KVMFR

  • Looking Glass: https://looking-glass.io/
  • KVMFR shared-memory module: built in-image via automation/52-bake-kvmfr.sh
  • Looking Glass B7 client: built in-image via automation/53-bake-lookingglass-client.sh

3.19 SecureBlue (security audit framework)

3.20 Defense-in-depth components

3.21 Cluster & remote access

3.22 FHS 3.0 specification

  • Spec: https://refspecs.linuxfoundation.org/FHS_3.0/
  • Key intent: /usr "shareable, read-only" (composefs/ostree enforce this at the kernel level), /etc host-specific config (3-way merged on bootc upgrade), /var mutable+persistent (never touched by upgrade), /srv data served by the system

3.23 Related immutable/atomic distros (comparison context)

3.24 llms.txt standard (LLM ingestion entrypoint)

  • Spec: https://llmstxt.org/ (Answer.AI proposal -- /llms.txt for LLM-friendly site indexing, /llms-full.txt for full-content variant). 'MiOS' publishes both at the repo root.

4. Sub-Knowledge -- For Iteration

4.1 Things this KB does NOT yet cover (next ingestion targets)

When you re-run the KB refresh, prioritize these files (they were referenced but not yet scraped to chunk-level detail):

  • usr/share/mios/PACKAGES.md -- actual fenced-block package contents
  • usr/share/mios/ai/system.md -- canonical agent prompt
  • usr/share/mios/ai/v1/models.json -- actual model catalog
  • usr/share/mios/ai/v1/mcp.json -- actual MCP server registry
  • automation/build.sh -- orchestrator entrypoint
  • automation/lib/{common,packages,masking}.sh -- shared lib functions
  • All automation/[0-9][0-9]-*.sh scripts (~48 files)
  • All etc/containers/systemd/mios-*.container Quadlet files
  • All usr/lib/bootc/kargs.d/*.toml (00-mios.toml is the entry point; later priority files exist)
  • usr/lib/sysctl.d/99-mios-hardening.conf
  • usr/lib/ostree/prepare-root.conf
  • usr/lib/tmpfiles.d/mios*.conf (notably mios-gpu.conf, mios.conf)
  • usr/share/selinux/packages/mios/*.te
  • config/artifacts/{bib,iso,qcow2,vhdx,wsl2}.toml -- actual BIB configs
  • etc/fapolicyd/fapolicyd.rules -- actual fapolicyd policy
  • .github/workflows/mios-ci.yml -- actual CI pipeline

4.2 Open OpenAI API surfaces to track

4.3 OpenAI API surfaces NOT supported by typical local runtimes

If your KB consumer is local-only (LocalAI, Ollama, vLLM, LM Studio, llama.cpp), these surfaces require either OpenAI cloud, Azure OpenAI, or a translation proxy (LiteLLM):

  • /v1/responses -- Responses API
  • /v1/vector_stores -- Vector Stores
  • /v1/batches -- Batch API
  • /v1/evals -- Evals API
  • /v1/fine_tuning/jobs -- Fine-tuning (local equivalent: axolotl, trl, llama-factory, MLX-LM, unsloth, all of which consume the same JSONL format 'MiOS' ships)
  • /v1/files with purpose: "assistants" -- file uploads for File Search

The KB ships local-compatible alternatives for each (see top-level README.md § "Day-0 local-model compatibility").

4.4 Tooling for KB development

4.5 Vector DBs for self-hosted RAG (consumes chunks.jsonl)


5. Citation Tier Reminder

Tier Definition Examples in this KB
Primary Official upstream documentation bootc.dev, osbuild.org, platform.openai.com, developers.openai.com, docs.redhat.com, the 'MiOS' repo itself
Secondary Official project repos (GitHub) bootc-dev/bootc, containers/composefs, ublue-os/ucore, sigstore/cosign, mios-dev/'MiOS'
Tertiary Vendor/community blogs corroborating primary sources Fedora Magazine, Microsoft Learn, OpenAI Cookbook
Validating Third-party format references for cross-checking OpenAI specs DeepWiki, Leanware, Vectorize, CodeFriends

When iterating on this KB, always cite Primary first; fall back to Secondary; cite Tertiary/Validating only when Primary doesn't yet document the surface (often the case for very recent OpenAI features).


6. Refresh Cadence

OpenAI surfaces change frequently. Recommended re-validation cadence:

  • Quarterly: re-fetch all OpenAI docs URLs in §1; verify field names, required fields, and limits haven't shifted.
  • On every 'MiOS' minor release: run the mios_build_kb_refresh tool to regenerate chunks from the live repo.
  • Immediately: when OpenAI announces a new GA model, fine-tuning technique, or eval grader type -- extend the relevant section here and in the KB chunks.

7. v2 Repo-Grounded Findings (live-fetched 2026-05-02)

These corrections supersede the corresponding parts of v1. Each is traceable to a specific 'MiOS' file fetched from github.com/mios-dev/'MiOS'@main.

7.1 Repo structure as fetched

.devcontainer/  .github/  automation/  config/  etc/  tools/  usr/  v1/
.clinerules .cursorrules .editorconfig .gitattributes .gitignore
AGENTS.md usr/share/doc/mios/concepts/architecture.md CLAUDE.md CONTRIBUTING.md Containerfile
usr/share/doc/mios/guides/deploy.md usr/share/doc/mios/guides/engineering.md GEMINI.md Get-MiOS.ps1 usr/share/mios/ai/INDEX.md Justfile
LICENSE usr/share/doc/mios/reference/licenses.md README.md SECURITY.md usr/share/doc/mios/guides/self-build.md VERSION
build-mios.sh image-versions.yml install.ps1 install.sh
llms-full.txt llms.txt mios-build-local.ps1
preflight.ps1 push-to-github.ps1 renovate.json system-prompt.md

The repo root is the system root (no system_files/ directory).

7.2 Architectural Laws -- verbatim from usr/share/mios/ai/INDEX.md §3

# Law Enforced by
1 USR-OVER-ETC -- static config in /usr/lib/<component>.d/; /etc/ is admin-override only. Exceptions: /etc/yum.repos.d/, /etc/nvidia-container-toolkit/. automation/, usr/lib/, etc/
2 NO-MKDIR-IN-VAR -- every /var/ path declared via usr/lib/tmpfiles.d/*.conf. usr/lib/tmpfiles.d/mios*.conf
3 BOUND-IMAGES -- every Quadlet image symlinked into /usr/lib/bootc/bound-images.d/. automation/08-system-files-overlay.sh:74-86
4 BOOTC-CONTAINER-LINT -- final RUN of Containerfile. Containerfile last RUN
5 UNIFIED-AI-REDIRECTS -- MIOS_AI_KEY/MODEL/ENDPOINThttp://localhost:8080/v1. No vendor URLs. usr/bin/mios, etc/mios/ai/
6 UNPRIVILEGED-QUADLETS -- User=, Group=, Delegate=yes on every Quadlet. Documented exceptions: mios-ceph, mios-k3s as User=root (Ceph/K3s require uid 0). etc/containers/systemd/, usr/share/containers/systemd/

7.3 Service gating table -- verbatim from usr/share/mios/ai/INDEX.md §5

Unit Condition Skips on
mios-ai ConditionPathIsDirectory=/etc/mios/ai bootstrap incomplete
mios-ceph ConditionPathExists=/etc/ceph/ceph.conf, !container Ceph not configured, nested
mios-k3s !wsl, !container WSL2, nested containers
crowdsec-dashboard ConditionPathExists=/etc/crowdsec/config.yaml CrowdSec not configured
cloudws-guacamole, guacd, guacamole-postgres !container nested containers
cloudws-pxe-hub !wsl, !container virtualized hosts without routable LAN
mios-gpu-{nvidia,amd,intel,status} ConditionPathExists=/dev/..., !container, !wsl (Intel) no matching GPU device
ollama none always runs (CPU fallback)

7.4 Pipeline phases (verbatim from usr/share/mios/ai/INDEX.md §6 and usr/share/doc/mios/guides/engineering.md)

Phase Owner Description
Phase-0 mios-bootstrap.git/install.sh Preflight + profile load + identity capture
Phase-1 mios-bootstrap.git/install.sh Total Root Merge of mios.git and mios-bootstrap.git to /
Phase-2 Containerfile/automation/build.sh Build the running system (~48 numbered phase scripts)
Phase-3 mios.git/install.sh + bootstrap profile staging systemd-sysusers/tmpfiles/daemon-reload + user create + per-user ~/.config/mios/{profile.toml,system-prompt.md}
Phase-4 mios-bootstrap.git/install.sh Reboot prompt

7.5 Build-mode summary (verbatim from usr/share/doc/mios/guides/self-build.md)

Mode Path Use
0 mios-bootstrap.git/install.sh curl one-liner initial install on fresh Linux
1 .github/workflows/mios-ci.yml production CI (build → rechunk on tag → cosign keyless → push GHCR)
2 mios-build-local.ps1 Windows local 5-phase orchestrator
3 Justfile recipes Linux local orchestrator
4 self-build (running 'MiOS' builds next 'MiOS') git clone && podman build && bootc switch --transport containers-storage localhost/mios:rechunked
5 config/ignition/ Butane configs → .ign fully automated builds on fresh Fedora CoreOS / Fedora Server

7.6 Justfile recipe inventory (verbatim from Justfile)

preflight, flight-status, init, deploy, live-init, lint, build, build-logged, build-verbose, embed-log, artifact, cloud-build, rechunk, raw, iso, qcow2, vhdx, wsl2, log-bootstrap, build-and-log, all-bootstrap, sbom, init-user-space, reinit-user-space, show-user-space, show-env, edit-env, edit-images, edit-build, edit-flatpaks.

7.7 Containerfile structure (verbatim from Containerfile)

  • Single-stage main build (FROM ${BASE_IMAGE}) plus a ctx scratch stage that COPYs automation/, usr/, etc/, usr/share/mios/PACKAGES.md, VERSION, config/artifacts/, tools/ into /ctx.
  • One large RUN block bind-mounts /ctx read-only and a writable /tmp/build copy, sources automation/lib/packages.sh, runs dnf clean metadata, install_packages_strict base, optionally writes /usr/share/mios/flatpak-list from MIOS_FLATPAKS, runs automation/08-system-files-overlay.sh pre-pipeline, then CTX=/tmp/build /tmp/build/automation/build.sh to iterate automation/[0-9][0-9]-*.sh.
  • Final two RUN instructions: ostree container commit, then bootc container lint (LAW 4, MUST be the final instruction).
  • OCI labels: containers.bootc=1, ostree.bootable=1, plus org.opencontainers.image.{title,description,licenses,source,version}.
  • CMD ["/sbin/init"].

7.8 SECURITY.md kargs corrections (verbatim)

Parameter Active in 'MiOS'? Rationale (per SECURITY.md)
slab_nomerge [ok] Heap isolation
init_on_alloc=1 Disabled -- CUDA memory init failures
init_on_free=1 Disabled -- same
page_alloc.shuffle=1 Disabled -- NVIDIA driver instability
randomize_kstack_offset=on [ok] Per-syscall stack randomization
pti=on [ok] Meltdown
vsyscall=none [ok] Legacy table off
iommu=pt [ok] VFIO passthrough
amd_iommu=on / intel_iommu=on [ok] IOMMU enable
nvidia-drm.modeset=1 [ok] GNOME Wayland
lockdown=integrity [ok] (NOT confidentiality -- chosen for kexec compatibility)
spectre_v2=on, spec_store_bypass_disable=on, l1tf=full,force, gather_data_sampling=force [ok] Side-channel mitigations

7.9 SELinux modules (verbatim from SECURITY.md §SELinux)

mios_portabled, mios_kvmfr, mios_cdi, mios_quadlet, mios_sysext in usr/share/selinux/packages/mios/. Booleans: container_use_cephfs, virt_use_samba. Fcontext: /var/home(/.*)?user_home_dir_t.

7.10 Composefs config (verbatim from usr/lib/ostree/prepare-root.conf)

[composefs]
enabled = true

[etc]
transient = true

[root]
transient-ro = true

7.11 Bootstrap repo (separate, owns Phase-0/1/4)

https://github.com/mios-dev/mios-bootstrap

Owns the user-facing installer, identity capture, Total Root Merge, and final reboot prompt. The mios-dev/'MiOS' repo (this KB's subject) is the system layer.


8. Day-0 Local-Model Compatibility Matrix (additive)

The KB is portable across every OpenAI-API-compatible runtime. The table below is the canonical compatibility surface -- keep it in sync with README.md §"Day-0 local-model compatibility".

Runtime Endpoint Chat Embed Tools Strict VStores Resp Batch Evals FT
OpenAI cloud https://api.openai.com/v1 [ok] [ok] [ok] [ok] [ok] [ok] [ok] [ok] [ok]
Azure OpenAI (your resource) [ok] [ok] [ok] [ok] [ok] [!] region [ok] [ok] [ok]
'MiOS' LocalAI (canonical, LAW 5) http://localhost:8080/v1 [ok] [ok] [ok] [!] ignored
Ollama http://localhost:11434/v1 [ok] [ok] [ok] [!] ignored external
vLLM http://localhost:8000/v1 [ok] [ok] [ok] [ok] via xgrammar partial external
LM Studio http://localhost:1234/v1 [ok] [ok] [ok] [!] ignored
llama.cpp server http://localhost:8080/v1 [ok] [ok] [ok] via grammars [!] ignored external
LiteLLM proxy http://localhost:4000/v1 [ok] [ok] [ok] proxied [ok] proxied proxied translates proxied proxied proxied
OpenRouter https://openrouter.ai/api/v1 [ok] partial [ok] per-model

[!] ignored = the runtime accepts strict: true as an unknown field and proceeds without enforcement (schema useful as documentation but runtime does not reject malformed JSON).

Local fine-tuning toolchains (consume the same sft.jsonl/dpo.jsonl)

Constrained-generation engines (enforce JSON Schema locally)

Vector DBs (consume chunks.jsonl)


9. KB Refresh Pointers (additive -- what to ingest next)

When mios_build_kb_refresh re-runs, prioritize these still-unscraped files that this v2 pass referenced but did not chunk to file-content detail:

  • automation/build.sh -- orchestrator entrypoint
  • automation/lib/{common,packages,masking}.sh -- shared lib functions
  • usr/share/mios/PACKAGES.md -- actual fenced-block contents
  • usr/share/mios/ai/system.md -- canonical agent prompt
  • usr/share/mios/ai/v1/{models,mcp}.json -- actual catalogs
  • All automation/[0-9][0-9]-*.sh (~48 files)
  • All etc/containers/systemd/mios-*.container Quadlets
  • All usr/lib/bootc/kargs.d/*.toml (00-mios.toml seen in references; 05-mios-plymouth.toml inferred; others may exist)
  • usr/lib/sysctl.d/99-mios-hardening.conf -- actual sysctl values
  • usr/lib/ostree/prepare-root.conf -- composefs config
  • usr/lib/tmpfiles.d/mios*.conf -- /var declarations (LAW 2)
  • usr/share/selinux/packages/mios/*.te -- five custom SELinux modules
  • config/artifacts/{bib,iso,qcow2,vhdx,wsl2}.toml -- BIB configs
  • etc/fapolicyd/fapolicyd.rules -- actual fapolicyd policy
  • .github/workflows/mios-ci.yml -- actual CI pipeline (build/rechunk/sign/push)
  • tools/{preflight,mios-overlay,mios-sysext-pack,flight-control,init-user-space,log-to-bootstrap}.sh, tools/lib/userenv.sh