Commit 7b4b35b

Add additional functionality
1 parent 49918a4 commit 7b4b35b

11 files changed

Lines changed: 248 additions & 29 deletions

README.md

Lines changed: 66 additions & 17 deletions
@@ -1,6 +1,6 @@
 # SecAI OS

-A bootable, local-first AI appliance with defense-in-depth security for consumer RTX workstations and Apple Silicon.
+A bootable, local-first AI appliance with defense-in-depth security. Supports NVIDIA, AMD, Intel, and Apple Silicon GPUs; all compute stays on-device.

 Built on [uBlue](https://universal-blue.org/) (Fedora Atomic / Silverblue) with an immutable OS, encrypted vault, and sealed runtime where sensitive data never leaves the device by default.

@@ -37,18 +37,41 @@ Built on [uBlue](https://universal-blue.org/) (Fedora Atomic / Silverblue) with
 | Tool Firewall | 8475 | Go | Policy-gated tool invocation gateway |
 | Web UI | 8480 | Python | Chat, image/video generation, model management |
 | Airlock | 8490 | Go | Sanitized egress proxy (disabled by default) |
-| Inference Worker | 8465 | llama.cpp | LLM inference (CUDA + Metal) |
-| Diffusion Worker | 8455 | Python | Image and video generation (Stable Diffusion) |
+| Inference Worker | 8465 | llama.cpp | LLM inference (CUDA / ROCm / Vulkan / Metal / CPU) |
+| Diffusion Worker | 8455 | Python | Image and video generation (CUDA / ROCm / XPU / MPS / CPU) |
 | Quarantine | -- | Python | 7-stage verify, scan, and promote pipeline |

 ## Hardware Support

-| Platform | GPU Acceleration | Notes |
-|----------|-----------------|-------|
-| NVIDIA RTX 5080 | CUDA (full offload) | Primary target; uses nvidia-open drivers |
-| NVIDIA RTX 4090/4080/3090 | CUDA (full offload) | Any RTX card with sufficient VRAM |
-| Apple M4 / M3 / M2 / M1 | Metal (via llama.cpp) | CPU-only container, Metal on host |
-| Any x86_64 | CPU fallback | Slower but functional |
+The GPU is **auto-detected at first boot**; no manual configuration is needed. The `detect-gpu.sh` script identifies your hardware and writes the detected backend, GPU name, and layer-offload setting to `/var/lib/secure-ai/inference.env`.
+
+### Supported GPUs
+
+| Vendor | GPUs | Backend | LLM (llama.cpp) | Diffusion (PyTorch) |
+|--------|------|---------|-----------------|-------------------|
+| **NVIDIA** | RTX 5090/5080/4090/4080/3090/3080, any CUDA GPU | CUDA | Full offload | Full offload |
+| **AMD** | RX 7900 XTX/XT, RX 7800/7700, RX 6900/6800, any RDNA/CDNA | ROCm (HIP) | Full offload | Full offload |
+| **Intel** | Arc A770/A750/A580, Arc B-series, Data Center Max | XPU (oneAPI) | Via Vulkan | Via IPEX |
+| **Apple** | M4/M3/M2/M1 (Pro/Max/Ultra) | Metal / MPS | Full offload | MPS acceleration |
+| **Any CPU** | x86_64 (AVX2/AVX-512), ARM64 (NEON) | CPU | Optimized | Functional |
+
+### Backend Priority
+
+The system auto-selects the best available backend in this order:
+1. **CUDA** (NVIDIA): highest throughput for both LLM and diffusion
+2. **ROCm** (AMD): near-CUDA performance on RDNA3/CDNA
+3. **MPS** (Apple Silicon): Metal acceleration on macOS
+4. **XPU** (Intel Arc): oneAPI/SYCL for discrete Intel GPUs
+5. **Vulkan** (cross-vendor): universal GPU compute fallback for llama.cpp
+6. **CPU**: AVX2/AVX-512/NEON auto-vectorized, works on everything
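A minimal sketch of how the priority list above could be applied once the usable backends are known. `pick_backend` and its argument format are hypothetical names for illustration, not part of this commit; the actual detection lives in `detect-gpu.sh`, which probes CUDA, ROCm, Intel XPU, and then Vulkan, defaults to CPU, and has no MPS branch.

```bash
#!/bin/sh
# Hypothetical helper: pick the highest-priority backend from a
# space-separated list of backends that were found to be usable.
# The order mirrors the README list: cuda, rocm, mps, xpu, vulkan, cpu.
pick_backend() {
    available=" $* "
    for candidate in cuda rocm mps xpu vulkan cpu; do
        case "$available" in
            *" $candidate "*)
                echo "$candidate"
                return 0
                ;;
        esac
    done
    # Nothing matched: CPU always works, so use it as the final fallback.
    echo "cpu"
}

pick_backend vulkan rocm cpu   # prints "rocm"
```

Keeping the order in one loop means a new backend only needs to be added in one place.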
+
+### Security Note
+
+All GPU backends run locally with the same sandboxing:
+- `PrivateNetwork=yes`: no network access regardless of GPU vendor
+- `DeviceAllow` restricts access to only the specific GPU device nodes needed
+- AMD ROCm uses `/dev/kfd` + `/dev/dri/*`; NVIDIA uses `/dev/nvidia*`; Intel uses `/dev/dri/*`
+- No cloud compute, no driver telemetry endpoints (blocked by nftables default-deny)

 **Minimum requirements:**

@@ -202,6 +225,9 @@ cd services/tool-firewall && go build -o ../../bin/tool-firewall . && cd ../..
 cd services/airlock && go build -o ../../bin/airlock . && cd ../..

 # Install Python dependencies
+# For NVIDIA: pip install torch --index-url https://download.pytorch.org/whl/cu124
+# For AMD: pip install torch --index-url https://download.pytorch.org/whl/rocm6.1
+# For CPU: pip install torch --index-url https://download.pytorch.org/whl/cpu
 pip install flask requests pyyaml diffusers transformers accelerate torch safetensors

 # Run the UI (Flask)
@@ -396,7 +422,7 @@ Every model - whether downloaded from the catalog or imported by the user -
 | **Tools** | Default-deny policy, path allowlisting, traversal protection, rate limiting |
 | **Egress** | Airlock disabled by default, PII/credential scanning, destination allowlist |
 | **Services** | Systemd sandboxing: ProtectSystem=strict, PrivateNetwork, syscall filters |
-| **GPU Isolation** | Diffusion worker sandboxed with explicit DeviceAllow for GPU access only |
+| **GPU Isolation** | Vendor-specific DeviceAllow (NVIDIA `/dev/nvidia*`, AMD `/dev/kfd`, Intel `/dev/dri/*`), PrivateNetwork on all |
 | **Emergency** | Panic switch: instant network kill + route flush + service stop |

 ### Systemd Sandboxing
@@ -412,10 +438,13 @@ Every service runs with defense-in-depth sandboxing:
 - `SystemCallFilter=@system-service`: restricted syscalls
 - `MemoryDenyWriteExecute=yes`: no JIT/RWX memory

-The diffusion worker has additional GPU-specific sandboxing:
-- `DeviceAllow=/dev/nvidia* rw` and `DeviceAllow=/dev/dri/* rw`: explicit GPU access
+Both inference and diffusion workers have GPU-specific sandboxing:
+- `DeviceAllow=/dev/nvidia* rw`: NVIDIA CUDA access
+- `DeviceAllow=/dev/kfd rw`: AMD ROCm compute access
+- `DeviceAllow=/dev/dri/* rw`: AMD/Intel DRI render nodes
 - `ReadWritePaths=/var/lib/secure-ai/vault/outputs`: write only to outputs directory
 - `ReadOnlyPaths=/var/lib/secure-ai/registry`: read-only model access
+- Unused GPU device nodes are harmless: systemd silently ignores DeviceAllow for non-existent devices

 ### Verify Image Signatures

@@ -440,6 +469,12 @@ All configuration lives in `/etc/secure-ai/` (baked into the image, read-only at

 ### Key Configuration Options

+**GPU backend** (`config/appliance.yaml`):
+```yaml
+gpu:
+  backend: "auto" # auto | cuda | rocm | xpu | vulkan | mps | cpu
+```
+
 **Inference settings** (`config/appliance.yaml`):
 ```yaml
 inference:
@@ -601,14 +636,28 @@ mount | grep secure-ai
 ### GPU not detected

 ```bash
-# Check NVIDIA driver
-nvidia-smi
+# Re-run GPU detection
+sudo /usr/libexec/secure-ai/detect-gpu.sh

-# If not loaded, check kernel modules
+# Check what was detected
+cat /var/lib/secure-ai/inference.env
+
+# NVIDIA: check driver
+nvidia-smi
 lsmod | grep nvidia

-# For Apple Silicon, GPU acceleration runs on the host (not in container)
-# Verify Metal support:
+# AMD: check ROCm
+rocminfo
+ls -la /dev/kfd /dev/dri/renderD128
+
+# Intel: check DRI
+ls -la /dev/dri/renderD128
+cat /sys/class/drm/card0/device/vendor # should be 0x8086
+
+# Vulkan (any vendor)
+vulkaninfo --summary
+
+# Apple Silicon (Metal runs on host, not in container)
 system_profiler SPDisplaysDataType
 ```
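The troubleshooting commands above read the vendor ID from `card0` only, but on machines with more than one GPU the card of interest may have a different index. A small sketch that lists every DRM card with its vendor, using the well-known PCI vendor IDs; `vendor_name` is a hypothetical helper and not part of the appliance:

```bash
#!/bin/sh
# Map a PCI vendor ID (as exposed in sysfs) to a vendor name.
vendor_name() {
    case "$1" in
        0x10de) echo "NVIDIA" ;;
        0x1002) echo "AMD" ;;
        0x8086) echo "Intel" ;;
        *)      echo "unknown ($1)" ;;
    esac
}

# List every DRM card rather than assuming card0 is the GPU of interest.
for vendor_file in /sys/class/drm/card*/device/vendor; do
    [ -r "$vendor_file" ] || continue
    card=${vendor_file#/sys/class/drm/}
    card=${card%%/*}
    echo "$card: $(vendor_name "$(cat "$vendor_file")")"
done
```

On a system without DRM devices the loop prints nothing.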

files/scripts/build-services.sh

Lines changed: 7 additions & 0 deletions
@@ -56,6 +56,13 @@ WRAPPER
 chmod +x "${INSTALL_DIR}/ui"
 echo " -> ${INSTALL_DIR}/ui"

+# Diffusion worker
+echo "Installing: diffusion-worker"
+DIFFUSION_DIR="/opt/secure-ai/services/diffusion-worker"
+mkdir -p "$DIFFUSION_DIR"
+cp /tmp/services/diffusion-worker/app.py "$DIFFUSION_DIR/app.py"
+echo " -> ${DIFFUSION_DIR}/app.py"
+
 # Cleanup build artifacts
 rm -rf "$SRC_DIR"
 dnf remove -y golang 2>/dev/null || true

files/system/etc/secure-ai/config/appliance.yaml

Lines changed: 5 additions & 0 deletions
@@ -12,6 +12,11 @@ paths:
   tmpdir: "/run/secure-ai/tmp"
   outputs: "/var/lib/secure-ai/vault/outputs"

+gpu:
+  # Auto-detected at first boot by detect-gpu.sh. Override here if needed.
+  # backend: auto | cuda | rocm | xpu | vulkan | mps | cpu
+  backend: "auto"
+
 inference:
   engine: "llama-cpp"
   bind: "127.0.0.1:8465"
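The commit does not show how a manual `gpu.backend` override is validated. As a sketch under that caveat, a consumer of the setting could check the value against the set documented in the comment above before using it; `is_valid_backend` is a hypothetical name:

```bash
#!/bin/sh
# Hypothetical validator for the gpu.backend override value.
# Accepted values are the ones listed in appliance.yaml.
is_valid_backend() {
    case "$1" in
        auto|cuda|rocm|xpu|vulkan|mps|cpu) return 0 ;;
        *) return 1 ;;
    esac
}

for value in auto rocm opencl; do
    if is_valid_backend "$value"; then
        echo "$value: ok"
    else
        echo "$value: unsupported, falling back to auto-detection"
    fi
done
```

Rejecting unknown values early avoids passing a typo through to the inference and diffusion workers.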

files/system/usr/lib/systemd/system/secure-ai-diffusion.service

Lines changed: 6 additions & 1 deletion
@@ -15,6 +15,7 @@ Environment=OUTPUTS_DIR=/var/lib/secure-ai/vault/outputs
 Environment=APPLIANCE_CONFIG=/etc/secure-ai/config/appliance.yaml
 Environment=MAX_RESOLUTION=2048
 Environment=MAX_STEPS=100
+EnvironmentFile=-/var/lib/secure-ai/inference.env

 # Sandboxing
 ProtectSystem=strict
@@ -29,9 +30,13 @@ ProtectControlGroups=yes
 NoNewPrivileges=yes
 RestrictSUIDSGID=yes
 MemoryDenyWriteExecute=no
-# GPU access requires broader syscalls and device access
+# GPU access: NVIDIA (CUDA), AMD (ROCm), Intel (DRI/XPU)
 DeviceAllow=/dev/nvidia* rw
+DeviceAllow=/dev/nvidiactl rw
+DeviceAllow=/dev/nvidia-uvm rw
+DeviceAllow=/dev/nvidia-uvm-tools rw
 DeviceAllow=/dev/dri/* rw
+DeviceAllow=/dev/kfd rw
 SupplementaryGroups=video render

 [Install]

files/system/usr/lib/systemd/system/secure-ai-inference.service

Lines changed: 2 additions & 1 deletion
@@ -58,12 +58,13 @@ SystemCallFilter=~@privileged @mount @clock @debug @swap @reboot @module @cpu-em
 SystemCallArchitectures=native
 SystemCallErrorNumber=EPERM

-# GPU access: do NOT set PrivateDevices (need /dev/nvidia*, /dev/dri)
+# GPU access: NVIDIA (CUDA), AMD (ROCm via /dev/kfd + /dev/dri), Intel (DRI)
 DeviceAllow=/dev/nvidia* rw
 DeviceAllow=/dev/dri/* rw
 DeviceAllow=/dev/nvidiactl rw
 DeviceAllow=/dev/nvidia-uvm rw
 DeviceAllow=/dev/nvidia-uvm-tools rw
+DeviceAllow=/dev/kfd rw

 # Resource limits: generous for inference
 MemoryMax=32G
Lines changed: 77 additions & 0 deletions
@@ -0,0 +1,77 @@
+#!/bin/bash
+#
+# Detect available GPU compute backends and write results to inference.env.
+# Called by secure-ai-firstboot.service and can be re-run manually.
+# Writes: GPU_BACKEND, GPU_NAME, GPU_LAYERS to /var/lib/secure-ai/inference.env
+#
+set -euo pipefail
+
+ENV_FILE="/var/lib/secure-ai/inference.env"
+BACKEND="cpu"
+GPU_NAME="CPU (no GPU detected)"
+GPU_LAYERS="0"
+
+echo "=== SecAI GPU Detection ==="
+
+# --- NVIDIA (CUDA) ---
+if command -v nvidia-smi &>/dev/null && nvidia-smi &>/dev/null; then
+    BACKEND="cuda"
+    GPU_NAME=$(nvidia-smi --query-gpu=name --format=csv,noheader,nounits | head -1)
+    GPU_LAYERS="-1"
+    echo "Detected NVIDIA GPU: ${GPU_NAME}"
+
+# --- AMD (ROCm) ---
+elif [ -e /dev/kfd ] && [ -e /dev/dri/renderD128 ]; then
+    BACKEND="rocm"
+    # Try rocminfo first, fall back to DRI
+    if command -v rocminfo &>/dev/null; then
+        GPU_NAME=$(rocminfo 2>/dev/null | grep -m1 "Marketing Name" | sed 's/.*: *//' || echo "AMD GPU")
+    else
+        GPU_NAME=$(cat /sys/class/drm/card0/device/product_name 2>/dev/null || echo "AMD GPU")
+    fi
+    GPU_LAYERS="-1"
+    echo "Detected AMD GPU (ROCm): ${GPU_NAME}"
+
+# --- Intel (XPU / Arc / integrated) ---
+elif [ -e /dev/dri/renderD128 ]; then
+    # Check if it's an Intel GPU via sysfs
+    DRM_VENDOR=$(cat /sys/class/drm/card0/device/vendor 2>/dev/null || echo "")
+    if [ "$DRM_VENDOR" = "0x8086" ]; then
+        BACKEND="xpu"
+        GPU_NAME=$(cat /sys/class/drm/card0/device/product_name 2>/dev/null || echo "Intel GPU")
+        # Intel Arc discrete GPUs get full offload; integrated gets partial
+        if command -v intel_gpu_top &>/dev/null || [[ "$GPU_NAME" == *"Arc"* ]]; then
+            GPU_LAYERS="-1"
+        else
+            GPU_LAYERS="0" # integrated Intel: CPU inference is usually faster
+        fi
+        echo "Detected Intel GPU: ${GPU_NAME}"
+    else
+        echo "DRI device found but vendor ${DRM_VENDOR} not recognized for compute"
+    fi
+fi
+
+# --- Vulkan fallback check ---
+if [ "$BACKEND" = "cpu" ] && command -v vulkaninfo &>/dev/null; then
+    VULKAN_GPU=$(vulkaninfo --summary 2>/dev/null | grep -m1 "deviceName" | sed 's/.*= *//' || echo "")
+    if [ -n "$VULKAN_GPU" ]; then
+        BACKEND="vulkan"
+        GPU_NAME="$VULKAN_GPU (Vulkan)"
+        GPU_LAYERS="-1"
+        echo "Detected Vulkan-capable GPU: ${GPU_NAME}"
+    fi
+fi
+
+echo "Result: backend=${BACKEND} gpu=${GPU_NAME} layers=${GPU_LAYERS}"
+
+# Write environment file for inference and diffusion services
+mkdir -p "$(dirname "$ENV_FILE")"
+cat > "$ENV_FILE" <<EOF
+# Auto-detected by detect-gpu.sh; re-run to update
+GPU_BACKEND=${BACKEND}
+GPU_NAME=${GPU_NAME}
+GPU_LAYERS=${GPU_LAYERS}
+EOF
+
+echo "Written to ${ENV_FILE}"
+echo "=== GPU Detection Complete ==="
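The file written above can contain unquoted values with spaces and parentheses, such as `GPU_NAME=CPU (no GPU detected)`. The service units load it through `EnvironmentFile=`, but sourcing it from a shell script would fail with a syntax error. A sketch of reading a single key without sourcing; `read_env_value` is a hypothetical helper, not part of this commit:

```bash
#!/bin/sh
# Hypothetical helper: print the value of KEY from a KEY=VALUE file
# without sourcing it, so spaces and parentheses in values are safe.
read_env_value() {
    key="$1"
    file="$2"
    while IFS= read -r line || [ -n "$line" ]; do
        case "$line" in
            "$key="*)
                printf '%s\n' "${line#*=}"
                return 0
                ;;
        esac
    done < "$file"
    return 1
}

# Example with a temporary file shaped like inference.env
tmp_env=$(mktemp)
cat > "$tmp_env" <<'EOF'
# Auto-detected by detect-gpu.sh
GPU_BACKEND=cpu
GPU_NAME=CPU (no GPU detected)
GPU_LAYERS=0
EOF

read_env_value GPU_NAME "$tmp_env"   # prints "CPU (no GPU detected)"
rm -f "$tmp_env"
```

The helper returns non-zero when the key is absent, so callers can fall back to a default.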

files/system/usr/libexec/secure-ai/firstboot.sh

Lines changed: 11 additions & 0 deletions
@@ -79,6 +79,17 @@ EOF
     chmod 644 "${SECURE_AI_ROOT}/registry/manifest.json"
 fi

+# Detect GPU and write inference.env
+log "Running GPU detection..."
+/usr/libexec/secure-ai/detect-gpu.sh 2>&1 | while IFS= read -r line; do log "$line"; done || {
+    log "WARNING: GPU detection failed. Defaulting to CPU."
+    cat > "${SECURE_AI_ROOT}/inference.env" <<'GPUEOF'
+GPU_BACKEND=cpu
+GPU_NAME=CPU (detection failed)
+GPU_LAYERS=0
+GPUEOF
+}
+
 # Disable swap (belt-and-suspenders alongside kernel arg)
 log "Ensuring swap is disabled..."
 swapoff -a 2>/dev/null || true
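One caveat on the fallback in this hunk: in a shell pipeline, the status tested by `||` is that of the last command (here the `while` loop) unless `pipefail` is enabled. Whether the CPU fallback runs when `detect-gpu.sh` fails therefore depends on options set earlier in `firstboot.sh`, which this hunk does not show. A sketch of a pattern that does not depend on `pipefail`; `run_detection` and `log_line` are hypothetical stand-ins for the detector and the `log` function:

```bash
#!/bin/sh
# Hypothetical stand-ins for detect-gpu.sh and the log function.
run_detection() { echo "probing..."; return 1; }   # simulates a failure
log_line() { echo "[firstboot] $1"; }

detect_with_fallback() {
    # Capture output first so the detector's own exit status is what gets tested.
    if output=$(run_detection 2>&1); then
        status="ok"
    else
        status="failed"
    fi
    # Log whatever the detector printed, line by line.
    printf '%s\n' "$output" | while IFS= read -r line; do
        log_line "$line"
    done
    if [ "$status" = "failed" ]; then
        log_line "WARNING: GPU detection failed. Defaulting to CPU."
        return 1
    fi
}

detect_with_fallback || echo "fallback would write GPU_BACKEND=cpu here"
```

The trade-off is that output is logged after the detector finishes rather than streamed, which is usually acceptable for a short first-boot probe.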

recipes/recipe.yml

Lines changed: 16 additions & 3 deletions
@@ -7,7 +7,7 @@ base-image: ghcr.io/ublue-os/silverblue-main
 image-version: 42

 modules:
-  # 1) Install required packages.
+  # 1) Install required packages (includes GPU compute libraries).
   - type: rpm-ostree
     install:
       - nftables
@@ -20,6 +20,13 @@ modules:
       - python3-flask
       - python3-requests
       - golang
+      # GPU compute support
+      - mesa-dri-drivers # Intel/AMD OpenGL (DRI) drivers
+      - mesa-vulkan-drivers # Vulkan for Intel/AMD
+      - vulkan-loader # Vulkan ICD loader
+      - vulkan-tools # vulkaninfo for diagnostics
+      - libdrm # DRM library (all GPUs)
+      - clinfo # OpenCL diagnostics

   # 2) Copy appliance config, systemd units, firewall rules, sysctl into image.
   - type: files
@@ -32,18 +39,23 @@ modules:
     scripts:
       - build-services.sh

-  # 4) Install NVIDIA kernel modules (nvidia-open for RTX 5080).
+  # 4) NVIDIA kernel modules (nvidia-open for RTX 5080+).
+  # AMD and Intel use in-kernel drivers (amdgpu, i915/xe); no akmods needed.
   - type: akmods
     base: main
     install: []
     nvidia-driver: nvidia-open

-  # 5) Kernel args: NVIDIA setup + disable swap + security hardening.
+  # 5) Kernel args: GPU setup + disable swap + security hardening.
   - type: kargs
     append:
+      # NVIDIA
       - rd.driver.blacklist=nouveau
       - modprobe.blacklist=nouveau
       - nvidia-drm.modeset=1
+      # AMD: amdgpu is in-kernel; dc=1 enables its Display Core (DC) display path
+      - amdgpu.dc=1
+      # Security hardening
       - systemd.swap=0
       - slab_nomerge
       - init_on_alloc=1
@@ -65,6 +77,7 @@ modules:
       - secure-ai-ui.service
       - secure-ai-quarantine-watcher.service
       - secure-ai-inference.service
+      - secure-ai-diffusion.service
       - nftables.service
       - secure-ai-firstboot.service
       - secure-ai-tmpdir.mount

services/diffusion-worker/Containerfile

Lines changed: 12 additions & 2 deletions
@@ -1,5 +1,8 @@
 # Diffusion worker: image and video generation via diffusers library.
-# Build arg COMPUTE selects CUDA or CPU-only.
+# Build arg COMPUTE selects the GPU backend:
+# cuda - NVIDIA GPUs (CUDA 12.4)
+# rocm - AMD GPUs (ROCm 6.1)
+# cpu - CPU only (AVX2/AVX-512 optimized, works everywhere)
 ARG COMPUTE=cuda

 FROM python:3.12-slim AS base
@@ -10,9 +13,11 @@ RUN apt-get update && apt-get install -y --no-install-recommends \
     libgl1-mesa-glx libglib2.0-0 && \
     rm -rf /var/lib/apt/lists/*

-# Install PyTorch (CUDA or CPU)
+# Install PyTorch with the appropriate compute backend
 RUN if [ "$COMPUTE" = "cuda" ]; then \
     pip install --no-cache-dir torch torchvision --index-url https://download.pytorch.org/whl/cu124; \
+    elif [ "$COMPUTE" = "rocm" ]; then \
+    pip install --no-cache-dir torch torchvision --index-url https://download.pytorch.org/whl/rocm6.1; \
     else \
     pip install --no-cache-dir torch torchvision --index-url https://download.pytorch.org/whl/cpu; \
     fi
@@ -26,6 +31,11 @@ RUN pip install --no-cache-dir \
     pyyaml \
     Pillow

+# Intel XPU support (install IPEX when building for CPU; it auto-detects Intel GPUs)
+RUN if [ "$COMPUTE" = "cpu" ]; then \
+    pip install --no-cache-dir intel-extension-for-pytorch 2>/dev/null || true; \
+    fi
+
 COPY app.py /app/app.py

 WORKDIR /app
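The same `COMPUTE` to wheel-index mapping appears in both the README and the Containerfile. As an illustrative sketch, the selection can be expressed as one shell function, with any unrecognized value falling through to the CPU wheels as the Containerfile's `else` branch does. `torch_index_url` is a hypothetical name; the URLs are the ones used in this commit.

```bash
#!/bin/sh
# Hypothetical helper mirroring the Containerfile's if/elif/else:
# map the COMPUTE build argument to a PyTorch wheel index URL.
torch_index_url() {
    case "$1" in
        cuda) echo "https://download.pytorch.org/whl/cu124" ;;
        rocm) echo "https://download.pytorch.org/whl/rocm6.1" ;;
        *)    echo "https://download.pytorch.org/whl/cpu" ;;
    esac
}

COMPUTE="${COMPUTE:-cuda}"
echo "pip install --no-cache-dir torch torchvision --index-url $(torch_index_url "$COMPUTE")"
```

The script only prints the resulting command, so it is safe to run outside a build.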
