feat(ai): bake default model set into the image (qwen2.5-coder:7b + nomic-embed-text)
Embed the global default models at build time and define the override
contract in the unified mios.toml dotfile. Tuned for the 12 GB system-
RAM baseline: CPU-only inference with ~8 GB available to the model.
Default model set (researched against the 8 GB resident envelope):
  - qwen2.5-coder:7b    chat / code primary
      Q4_K_M ~4.7 GB on disk, ~5-6 GB resident
      Apache 2.0; 128K-token context
      HumanEval ~88%; trained heavily on bash,
      PowerShell, Containerfiles, systemd units, TOML
      Strong filesystem-path reasoning
  - nomic-embed-text    embeddings (v1.5)
      ~270 MB Q4 GGUF; 768-dim; 8192-token context
      OpenAI /v1/embeddings-shaped via LocalAI
automation/37-ollama-prep.sh -- rewritten as a real build-time baker
(was a no-op stub from the prior commit). Pulls the model set into
/usr/share/ollama/models on the immutable composefs surface (FHS-
correct for "architecture-independent immutable data files"). The
final /var cleanup at the end of the Containerfile RUN does NOT
touch /usr/share, so the seed survives into the deployed image.
- MIOS_OLLAMA_BAKE_MODELS=<csv> at build time overrides the default
set; an empty value disables baking entirely, so CI builds that only
validate the pipeline can opt out.
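
The override contract above can be sketched as follows. This is a minimal illustration, not the actual contents of 37-ollama-prep.sh: the function names resolve_bake_models and bake_models and the PULL_CMD hook are hypothetical, and the default pull command assumes an ollama server is already running with OLLAMA_MODELS pointing at /usr/share/ollama/models.

```shell
#!/bin/sh
# Hypothetical sketch of the bake-list handling; names are illustrative.

# Default set named in this commit.
DEFAULT_BAKE_MODELS="qwen2.5-coder:7b,nomic-embed-text"

# Print one model per line.
# Unset variable     -> the default set.
# Set but empty      -> no output, which means baking is disabled.
resolve_bake_models() {
    csv="${MIOS_OLLAMA_BAKE_MODELS-$DEFAULT_BAKE_MODELS}"
    printf '%s\n' "$csv" \
        | tr ',' '\n' \
        | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' -e '/^$/d'
}

# Pull every resolved model. PULL_CMD is a hypothetical hook so the
# loop can be exercised without a running ollama server.
bake_models() {
    for model in $(resolve_bake_models); do
        ${PULL_CMD:-ollama pull} "$model" || return 1
    done
}
```

Using `${VAR-default}` rather than `${VAR:-default}` is what separates "unset" (use the defaults) from "set to empty" (skip baking). A dry run such as `PULL_CMD=echo bake_models` prints the models that would be pulled.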
automation/build.sh -- 37-ollama-prep.sh removed from
CONTAINERFILE_SCRIPTS so it runs as a regular pipeline step instead
of being orphaned.
usr/libexec/mios/ollama-firstboot.sh -- replaces the network-pull-
only first-boot path with a two-layer flow:
  1. Hardlink-copy the build-baked seed (/usr/share/ollama/models) into
     /var/lib/ollama/models (the writable runtime store ollama uses
     via OLLAMA_MODELS). Where hardlinks succeed, on-disk usage stays
     at a single copy until ollama mutates a manifest. If cp -al fails
     because the copy crosses a filesystem boundary (composefs /usr ->
     ext4 /var), the script falls back to plain cp -a.
  2. Network pull only as a fallback for any model the seed did not
     include (e.g. operator overrides via /etc/mios/install.env).
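
The two layers can be sketched roughly as below. This is not the real ollama-firstboot.sh: the function names and the PULL_CMD hook are hypothetical, and model_present assumes ollama's on-disk manifest layout (manifests/registry.ollama.ai/library/<name>/<tag>), which should be checked against the ollama version in the image.

```shell
#!/bin/sh
# Hypothetical sketch of the two-layer first-boot flow.

# seed_models SEED_DIR RUNTIME_DIR
# Layer 1: hardlink the seed into the runtime store; if hardlinking
# fails (for example across filesystems), fall back to a full copy.
# Assumes RUNTIME_DIR already exists.
seed_models() {
    [ -d "$1" ] || return 0   # nothing was baked into this image
    cp -al "$1/." "$2/" 2>/dev/null || cp -a "$1/." "$2/"
}

# model_present RUNTIME_DIR MODEL
# Assumed layout: manifests/registry.ollama.ai/library/<name>/<tag>,
# with "latest" as the implied tag. Namespaced models are not handled.
# The body runs in a subshell so name and tag stay local.
model_present() (
    name="${2%%:*}"
    tag="latest"
    case "$2" in *:*) tag="${2#*:}" ;; esac
    [ -f "$1/manifests/registry.ollama.ai/library/$name/$tag" ]
)

# pull_missing RUNTIME_DIR MODEL...
# Layer 2: network pull only for models the seed did not provide.
pull_missing() {
    runtime_dir="$1"; shift
    for model in "$@"; do
        model_present "$runtime_dir" "$model" && continue
        ${PULL_CMD:-ollama pull} "$model" || return 1
    done
}
```

A first-boot run would then look like `seed_models /usr/share/ollama/models /var/lib/ollama/models` followed by `pull_missing /var/lib/ollama/models` with the configured model list.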
usr/share/containers/systemd/ollama.container -- two volumes:
    /usr/share/ollama:/usr/share/ollama:ro,Z    immutable seed (RO)
    /var/lib/ollama:/var/lib/ollama:Z           runtime store (RW)
OLLAMA_MODELS is set to /var/lib/ollama/models so that ollama itself
reads and writes the runtime path. The seed is reachable by firstboot
through the RO mount.
usr/lib/tmpfiles.d/mios-ollama.conf (new) -- declares
/var/lib/ollama and /var/lib/ollama/models with
mios-ollama:mios-ollama ownership. Architectural Law 2 -- /var paths
are never imperatively mkdir'd at build time.
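
For illustration, a declarative fragment of this kind could be generated as shown below. The helper name emit_ollama_tmpfiles is hypothetical, and the 0755 mode is an assumption: the commit specifies only the paths and the owner and group.

```shell
#!/bin/sh
# Hypothetical helper that prints a tmpfiles.d fragment.
# Line format: type path mode user group age
# Mode 0755 is assumed; only owner and group come from the commit.
emit_ollama_tmpfiles() {
    for dir in /var/lib/ollama /var/lib/ollama/models; do
        printf 'd %s 0755 mios-ollama mios-ollama -\n' "$dir"
    done
}
```

The build ships the fragment under /usr and systemd-tmpfiles creates the directories at boot, which avoids any mkdir under /var in the image build.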
Unified user-overrides dotfile (usr/share/mios/mios.toml + env.defaults):
the [ai] section grows four new keys with explicit semantics:
    bake_models   -- comma-separated build-time bake list
                     (-> MIOS_OLLAMA_BAKE_MODELS)
    ram_floor_gb  -- recommended floor (informational; logged in
                     postcheck so operators see what they signed up for)
    seed_dir      -- /usr/share/ollama/models (immutable, build-baked)
    runtime_dir   -- /var/lib/ollama/models (writable, first-boot-seeded)
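
One way the bake_models -> MIOS_OLLAMA_BAKE_MODELS mapping might be wired up is sketched below. The helper toml_get is hypothetical and deliberately naive: it handles only single-line `key = "value"` entries under a plain `[section]` header and is not a full TOML parser. How the real pipeline reads mios.toml is not shown in this commit.

```shell
#!/bin/sh
# toml_get FILE SECTION KEY
# Hypothetical, line-oriented lookup. Prints the value of KEY inside
# [SECTION] with surrounding double quotes removed, or nothing if the
# key is absent. Inline comments and multi-line values are not handled.
toml_get() {
    awk -v section="[$2]" -v key="$3" '
        /^[[:space:]]*\[/ {
            header = $0
            gsub(/[[:space:]]/, "", header)
            in_section = (header == section)
            next
        }
        in_section {
            line = $0
            sub(/^[[:space:]]*/, "", line)
            if (index(line, key) != 1) next
            rest = substr(line, length(key) + 1)
            if (rest !~ /^[[:space:]]*=/) next
            sub(/^[[:space:]]*=[[:space:]]*/, "", rest)
            sub(/[[:space:]]*$/, "", rest)
            gsub(/^"|"$/, "", rest)
            print rest
            exit
        }
    ' "$1"
}
```

A build step could then export `MIOS_OLLAMA_BAKE_MODELS="$(toml_get usr/share/mios/mios.toml ai bake_models)"`. Because an empty value disables baking, a real script would need to distinguish a missing key from an intentionally empty one.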
The [ai] section comment block enumerates the alternates considered
(qwen2.5-coder:14b for 24 GB+ systems, llama3.2:3b for low-RAM /
fast-response profiles) and their resource cost, so operators can
swap deliberately rather than guessing.
INDEX.md -- new section 2a documents the default set with disk /
resident sizing, license, and the build-baked / first-boot-seeded
flow so the model surface is discoverable next to the OpenAI API
endpoint table.