  # SecAI OS

- A bootable, local-first AI appliance with defense-in-depth security for consumer RTX workstations and Apple Silicon.
+ A bootable, local-first AI appliance with defense-in-depth security. Supports NVIDIA, AMD, Intel, and Apple Silicon GPUs; all compute stays on-device.

  Built on [uBlue](https://universal-blue.org/) (Fedora Atomic / Silverblue) with an immutable OS, encrypted vault, and sealed runtime where sensitive data never leaves the device by default.

@@ -37,18 +37,41 @@ Built on [uBlue](https://universal-blue.org/) (Fedora Atomic / Silverblue) with
  | Tool Firewall | 8475 | Go | Policy-gated tool invocation gateway |
  | Web UI | 8480 | Python | Chat, image/video generation, model management |
  | Airlock | 8490 | Go | Sanitized egress proxy (disabled by default) |
- | Inference Worker | 8465 | llama.cpp | LLM inference (CUDA + Metal) |
- | Diffusion Worker | 8455 | Python | Image and video generation (Stable Diffusion) |
+ | Inference Worker | 8465 | llama.cpp | LLM inference (CUDA / ROCm / Vulkan / Metal / CPU) |
+ | Diffusion Worker | 8455 | Python | Image and video generation (CUDA / ROCm / XPU / MPS / CPU) |
  | Quarantine | -- | Python | 7-stage verify, scan, and promote pipeline |

  ## Hardware Support

- | Platform | GPU Acceleration | Notes |
- |----------|------------------|-------|
- | NVIDIA RTX 5080 | CUDA (full offload) | Primary target; uses nvidia-open drivers |
- | NVIDIA RTX 4090/4080/3090 | CUDA (full offload) | Any RTX card with sufficient VRAM |
- | Apple M4 / M3 / M2 / M1 | Metal (via llama.cpp) | CPU-only container, Metal on host |
- | Any x86_64 | CPU fallback | Slower but functional |
+ The GPU is **auto-detected at first boot**, so no manual configuration is needed. The `detect-gpu.sh` script identifies your hardware and writes the optimal settings.
+
+ ### Supported GPUs
+
+ | Vendor | GPUs | Backend | LLM (llama.cpp) | Diffusion (PyTorch) |
+ |--------|------|---------|-----------------|---------------------|
+ | **NVIDIA** | RTX 5090/5080/4090/4080/3090/3080, any CUDA GPU | CUDA | Full offload | Full offload |
+ | **AMD** | RX 7900 XTX/XT, RX 7800/7700, RX 6900/6800, any RDNA/CDNA | ROCm (HIP) | Full offload | Full offload |
+ | **Intel** | Arc A770/A750/A580, Arc B-series, Data Center Max | XPU (oneAPI) | Via Vulkan | Via IPEX |
+ | **Apple** | M4/M3/M2/M1 (Pro/Max/Ultra) | Metal / MPS | Full offload | MPS acceleration |
+ | **Any CPU** | x86_64 (AVX2/AVX-512), ARM64 (NEON) | CPU | Optimized | Functional |
+
+ ### Backend Priority
+
+ The system auto-selects the best available backend in this order:
+ 1. **CUDA** (NVIDIA): highest throughput for both LLM and diffusion
+ 2. **ROCm** (AMD): near-CUDA performance on RDNA3/CDNA
+ 3. **MPS** (Apple Silicon): Metal acceleration on macOS
+ 4. **XPU** (Intel Arc): oneAPI/SYCL for discrete Intel GPUs
+ 5. **Vulkan** (cross-vendor): universal GPU compute fallback for llama.cpp
+ 6. **CPU**: AVX2/AVX-512/NEON auto-vectorized, works on everything
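
The priority order listed in this hunk amounts to a first-match lookup. The sketch below is only an illustration of that idea, not the logic in `detect-gpu.sh`: the name `select_backend` and the assumption that detection has already produced a set of backend names are both hypothetical.

```python
# Hypothetical sketch: pick the first available backend in the documented order.
# The order is copied from the README text; everything else is an assumption.
BACKEND_PRIORITY = ["cuda", "rocm", "mps", "xpu", "vulkan", "cpu"]


def select_backend(available):
    """Return the highest-priority backend present in `available`.

    `available` is any iterable of backend names. How those names are
    detected is out of scope here. CPU is always treated as usable.
    """
    usable = {name.lower() for name in available} | {"cpu"}
    for backend in BACKEND_PRIORITY:
        if backend in usable:
            return backend
    return "cpu"  # not reachable, kept as an explicit fallback


if __name__ == "__main__":
    print(select_backend({"vulkan", "rocm"}))
    print(select_backend(set()))
```

Treating CPU as always available keeps the function total: it returns a usable backend even when detection finds nothing.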
+
+ ### Security Note
+
+ All GPU backends run locally with the same sandboxing:
+ - `PrivateNetwork=yes`: no network access regardless of GPU vendor
+ - `DeviceAllow` restricts access to only the specific GPU device nodes needed
+ - AMD ROCm uses `/dev/kfd` + `/dev/dri/*`; NVIDIA uses `/dev/nvidia*`; Intel uses `/dev/dri/*`
+ - No cloud compute, no driver telemetry endpoints (blocked by nftables default-deny)

  **Minimum requirements:**

@@ -202,6 +225,9 @@ cd services/tool-firewall && go build -o ../../bin/tool-firewall . && cd ../..
  cd services/airlock && go build -o ../../bin/airlock . && cd ../..

  # Install Python dependencies
+ # For NVIDIA: pip install torch --index-url https://download.pytorch.org/whl/cu124
+ # For AMD: pip install torch --index-url https://download.pytorch.org/whl/rocm6.1
+ # For CPU: pip install torch --index-url https://download.pytorch.org/whl/cpu
  pip install flask requests pyyaml diffusers transformers accelerate torch safetensors

  # Run the UI (Flask)
@@ -396,7 +422,7 @@ Every model — whether downloaded from the catalog or imported by the user —
  | **Tools** | Default-deny policy, path allowlisting, traversal protection, rate limiting |
  | **Egress** | Airlock disabled by default, PII/credential scanning, destination allowlist |
  | **Services** | Systemd sandboxing: ProtectSystem=strict, PrivateNetwork, syscall filters |
- | **GPU Isolation** | Diffusion worker sandboxed with explicit DeviceAllow for GPU access only |
+ | **GPU Isolation** | Vendor-specific DeviceAllow (NVIDIA `/dev/nvidia*`, AMD `/dev/kfd`, Intel `/dev/dri/*`), PrivateNetwork on all |
  | **Emergency** | Panic switch: instant network kill + route flush + service stop |

  ### Systemd Sandboxing
@@ -412,10 +438,13 @@ Every service runs with defense-in-depth sandboxing:
  - `SystemCallFilter=@system-service`: restricted syscalls
  - `MemoryDenyWriteExecute=yes`: no JIT/RWX memory

- The diffusion worker has additional GPU-specific sandboxing:
- - `DeviceAllow=/dev/nvidia* rw` and `DeviceAllow=/dev/dri/* rw`: explicit GPU access
+ Both inference and diffusion workers have GPU-specific sandboxing:
+ - `DeviceAllow=/dev/nvidia* rw`: NVIDIA CUDA access
+ - `DeviceAllow=/dev/kfd rw`: AMD ROCm compute access
+ - `DeviceAllow=/dev/dri/* rw`: AMD/Intel DRI render nodes
  - `ReadWritePaths=/var/lib/secure-ai/vault/outputs`: write only to outputs directory
  - `ReadOnlyPaths=/var/lib/secure-ai/registry`: read-only model access
+ - Unused GPU device nodes are harmless: systemd silently ignores DeviceAllow for non-existent devices
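
To make the vendor-to-device-node mapping concrete, here is a minimal sketch that renders the `DeviceAllow=` lines for one vendor. The device paths are copied from the text of this diff. The helper name `device_allow_lines` and the per-vendor split are assumptions for illustration only, since the unit files described here list all vendors' nodes together.

```python
# Hypothetical helper: render the DeviceAllow= lines listed in the README
# for a single GPU vendor. Paths come from the diff text, not from the
# actual unit files, which are not shown in this commit view.
VENDOR_DEVICE_NODES = {
    "nvidia": ["/dev/nvidia*"],
    "amd": ["/dev/kfd", "/dev/dri/*"],
    "intel": ["/dev/dri/*"],
}


def device_allow_lines(vendor):
    """Return DeviceAllow= directives (rw access) for the given vendor."""
    nodes = VENDOR_DEVICE_NODES.get(vendor.lower())
    if nodes is None:
        raise ValueError(f"unknown GPU vendor: {vendor!r}")
    return [f"DeviceAllow={node} rw" for node in nodes]


if __name__ == "__main__":
    for line in device_allow_lines("amd"):
        print(line)
```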

  ### Verify Image Signatures

@@ -440,6 +469,12 @@ All configuration lives in `/etc/secure-ai/` (baked into the image, read-only at

  ### Key Configuration Options

+ **GPU backend** (`config/appliance.yaml`):
+ ```yaml
+ gpu:
+   backend: "auto"  # auto | cuda | rocm | xpu | vulkan | mps | cpu
+ ```
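
A consumer of this setting would probably want to reject values outside the documented list. The sketch below assumes the YAML has already been parsed into a plain dict (for example with `yaml.safe_load`). The function name `validate_gpu_config` and the fallback to `"auto"` when the key is missing are assumptions, not behavior confirmed by the project.

```python
# Hypothetical validator for the gpu.backend option shown in this hunk.
# The allowed values are copied from the comment in the YAML snippet.
ALLOWED_BACKENDS = {"auto", "cuda", "rocm", "xpu", "vulkan", "mps", "cpu"}


def validate_gpu_config(config):
    """Return the configured backend name, normalized to lowercase.

    `config` is the parsed appliance.yaml as a dict. A missing `gpu`
    section or `backend` key falls back to "auto" (an assumption).
    """
    gpu = config.get("gpu") or {}
    backend = str(gpu.get("backend", "auto")).strip().lower()
    if backend not in ALLOWED_BACKENDS:
        allowed = ", ".join(sorted(ALLOWED_BACKENDS))
        raise ValueError(f"gpu.backend must be one of: {allowed}; got {backend!r}")
    return backend


if __name__ == "__main__":
    print(validate_gpu_config({"gpu": {"backend": "auto"}}))
```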
+
  **Inference settings** (`config/appliance.yaml`):
  ```yaml
  inference:
@@ -601,14 +636,28 @@ mount | grep secure-ai
  ### GPU not detected

  ```bash
- # Check NVIDIA driver
- nvidia-smi
+ # Re-run GPU detection
+ sudo /usr/libexec/secure-ai/detect-gpu.sh

- # If not loaded, check kernel modules
+ # Check what was detected
+ cat /var/lib/secure-ai/inference.env
+
+ # NVIDIA: check driver
+ nvidia-smi
  lsmod | grep nvidia

- # For Apple Silicon, GPU acceleration runs on the host (not in container)
- # Verify Metal support:
+ # AMD: check ROCm
+ rocminfo
+ ls -la /dev/kfd /dev/dri/renderD128
+
+ # Intel: check DRI
+ ls -la /dev/dri/renderD128
+ cat /sys/class/drm/card0/device/vendor  # should be 0x8086
+
+ # Vulkan (any vendor)
+ vulkaninfo --summary
+
+ # Apple Silicon (Metal runs on host, not in container)
  system_profiler SPDisplaysDataType
  ```
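
The `vendor` file checked in the Intel step holds a PCI vendor ID, and the same file identifies the other GPU vendors too. The sketch below maps the three well-known IDs (0x10de for NVIDIA, 0x1002 for AMD, 0x8086 for Intel) to names. The function name `vendor_from_sysfs` is hypothetical, and the use of `card0` mirrors the command in the troubleshooting block; on multi-GPU systems the card index may differ.

```python
# Hypothetical sketch: interpret /sys/class/drm/cardN/device/vendor.
from pathlib import Path

PCI_VENDORS = {0x10DE: "nvidia", 0x1002: "amd", 0x8086: "intel"}


def vendor_from_sysfs(text):
    """Map the contents of a sysfs vendor file (e.g. "0x8086\n") to a name."""
    try:
        vendor_id = int(text.strip(), 16)
    except ValueError:
        return "unknown"
    return PCI_VENDORS.get(vendor_id, "unknown")


if __name__ == "__main__":
    vendor_file = Path("/sys/class/drm/card0/device/vendor")
    if vendor_file.exists():
        print(vendor_from_sysfs(vendor_file.read_text()))
    else:
        print("no DRM card0 found")
```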