Commit b7e06ff

mios-dev and claude committed
WSL AMD/Intel iGPU detection via Windows interop + wsl2-amd/intel.yaml
Operator-flagged 2026-05-17: "we have and AMD iGPU here on the Windows Host's (WSL--Podman-WSL/MiOS) Machine!!" The Ryzen iGPU was undetected because mios-cdi-detect's AMD branch only checked for /dev/kfd, which doesn't exist under WSL2 (AMD iGPUs there share /dev/dxg with NVIDIA via the WDDM paravirt driver).

Three-part fix:

1. mios-cdi-detect's detection block now queries the Windows host via powershell.exe Get-CimInstance Win32_VideoController when systemd-detect-virt reports wsl or /dev/dxg is present. Adapter names are matched case-insensitively against the regexes (AMD|Radeon) and Intel. New HAS_AMD_WSL / HAS_INTEL_WSL flags tell the downstream spec-generation branches whether to hand-roll the WSL CDI spec or fall back to amd-ctk / intel-cdi-specs-generator. The PowerShell call has a 5s timeout and fails open (boot never breaks on enumeration failure). Live on this host:
   "windows GPUs: Microsoft Remote Display Adapter, AMD Radeon(TM) Graphics, NVIDIA GeForce RTX 4090"
   -> nvidia=1 amd=1 (wsl=1) intel=0 (wsl=0).

2. New WSL CDI spec generation. Mirrors the existing wsl2-nvidia.yaml pattern (which hand-rolls /dev/dxg + /usr/lib/wsl rbind when nvidia-ctk is unavailable). Two new specs:
   /run/cdi/wsl2-amd.yaml    kind: amd.com/gpu    name: all
   /run/cdi/wsl2-intel.yaml  kind: intel.com/gpu  name: all
   Both register /dev/dxg as the device node and rbind-mount /usr/lib/wsl so the container can use Vulkan (mesa3d via WSL) or DirectML against the Windows-side AMD/Intel driver. Bare-metal hosts still get amd-ctk / intel-cdi-specs-generator output via the existing branches (HAS_*_WSL=0 path).

3. mios-gpu-passthrough now recognizes the WSL spec naming too (wsl2-amd.yaml / wsl2-intel.yaml). The earlier check matched only amd.json/amd.yaml, so the helper saw AMD_PRESENT=0 even though the WSL spec was sitting at /run/cdi/wsl2-amd.yaml.

Operator clarification on scope ("iGPU's are ONLY micro-llms" -> "wire to ollama!! JUST ONLY USES Micro-llms in MiOS stack"): ollama IS in MIOS_AI_QUADLETS by design. MiOS-stack consumers of Ollama (mios-daemon -> qwen3:0.6b-cpu, pipe refine/polish -> qwen2.5-coder:7b, prefilter -> small) only ask for small models; big-model tags on disk (qwen3-coder:30b, gpt-oss:20b) are parked for ad-hoc operator use. With NVIDIA + AMD both registered as CDI devices, Ollama spreads the MiOS-stack small models across them without contending with the dGPU big-model VRAM. The comment block in mios-gpu-passthrough reflects the two-step directive.

Live verification after restart of mios-cdi-detect.service:
* status JSON shows nvidia=1, amd=1, intel=0
* drop-ins written to ollama.container.d + mios-open-webui.container.d with AddDevice=amd.com/gpu=all
* ollama systemd ExecStart now reads `--device nvidia.com/gpu=all --device amd.com/gpu=all`

Day-0 deployable: all in /usr/libexec/mios/ + /usr/lib/systemd/, both in repo; fresh clone + image build wires automatically.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
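The name-matching step of part 1 can be sketched in isolation. This is a minimal illustration, not the committed code: `classify_windows_gpus` is a hypothetical helper name, and it swaps the commit's bash `nocasematch` regex for lowercasing plus POSIX `case` patterns so it also runs under plain `sh`. It assumes the adapter names have already been collected into one string, as the commit does with `tr '\n' '|'`.

```shell
# Hypothetical helper (not in the repo): classify a '|'-joined list of
# Windows adapter names into AMD / Intel presence flags.
classify_windows_gpus() {
    # Lowercase once so the patterns below are effectively case-insensitive.
    _names=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    _amd=0
    _intel=0
    case $_names in
        *amd*|*radeon*) _amd=1 ;;
    esac
    case $_names in
        *intel*) _intel=1 ;;
    esac
    printf 'amd=%s intel=%s\n' "$_amd" "$_intel"
}

# Adapter list quoted in the commit message (with the trailing separator
# that tr '\n' '|' leaves behind).
classify_windows_gpus 'Microsoft Remote Display Adapter|AMD Radeon(TM) Graphics|NVIDIA GeForce RTX 4090|'
# prints: amd=1 intel=0
```

An empty input yields `amd=0 intel=0`, which corresponds to the fail-open behaviour described above: a timed-out or failed PowerShell query simply leaves both flags unset.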
1 parent 1f765ea commit b7e06ff

2 files changed

Lines changed: 126 additions & 9 deletions

usr/libexec/mios/mios-cdi-detect

Lines changed: 96 additions & 5 deletions
@@ -31,6 +31,13 @@ VIRT=$(systemd-detect-virt 2>/dev/null || echo none)
 HAS_NVIDIA=0
 HAS_AMD=0
 HAS_INTEL=0
+# WSL-detected flags: AMD/Intel iGPU on WSL is exposed via /dev/dxg
+# (not /dev/kfd or renderD*). When set, the iGPU branch writes a
+# hand-rolled wsl2-<vendor>.yaml CDI spec instead of relying on the
+# Linux-side vendor toolkit.
+HAS_AMD_WSL=0
+HAS_INTEL_WSL=0
+
 [[ -e /dev/nvidia0 || -e /dev/dxg ]] && HAS_NVIDIA=1
 [[ -e /dev/kfd ]] && HAS_AMD=1
 # Intel iGPU/Arc detection: a renderD* node whose vendor reads 0x8086.
@@ -46,7 +53,41 @@ for r in /dev/dri/renderD*; do
     fi
 done
 
-_log "virt=$VIRT nvidia=$HAS_NVIDIA amd=$HAS_AMD intel=$HAS_INTEL"
+# ── WSL2 AMD/Intel iGPU detection via Windows interop ──────────────
+# Under WSL2, AMD APUs (Ryzen with Radeon Graphics) and Intel iGPUs
+# do NOT surface as /dev/kfd or /dev/dri/renderD* -- they share the
+# /dev/dxg paravirt interface with NVIDIA. So the Linux-side probes
+# above can't tell which Windows GPU(s) are present; we have to ask
+# Windows via WMI. Operator-flagged 2026-05-17: Ryzen iGPU on the
+# Windows host was undetected because /dev/kfd doesn't exist in WSL,
+# so the AMD branch never ran and the iGPU never reached any
+# container.
+if [[ "$VIRT" == "wsl" || -e /dev/dxg ]]; then
+    PSH=/mnt/c/Windows/System32/WindowsPowerShell/v1.0/powershell.exe
+    if [[ -x "$PSH" ]]; then
+        # Capped 5s; never fails the boot. The first
+        # "Microsoft Remote Display Adapter" is the WDDM paravirt
+        # one; real GPUs follow.
+        WIN_GPU_NAMES=$(timeout 5 "$PSH" -NoProfile -Command \
+            "Get-CimInstance Win32_VideoController | Select-Object -ExpandProperty Name" \
+            2>/dev/null | tr -d '\r' | tr '\n' '|')
+        if [[ -n "$WIN_GPU_NAMES" ]]; then
+            shopt -s nocasematch
+            if [[ "$WIN_GPU_NAMES" =~ (AMD|Radeon) ]]; then
+                HAS_AMD=1; HAS_AMD_WSL=1
+            fi
+            if [[ "$WIN_GPU_NAMES" =~ Intel ]]; then
+                HAS_INTEL=1; HAS_INTEL_WSL=1
+            fi
+            shopt -u nocasematch
+            _log "windows GPUs: ${WIN_GPU_NAMES//|/, }"
+        else
+            _log "wsl: powershell GPU query returned empty (skipping iGPU detection)"
+        fi
+    fi
+fi
+
+_log "virt=$VIRT nvidia=$HAS_NVIDIA amd=$HAS_AMD (wsl=$HAS_AMD_WSL) intel=$HAS_INTEL (wsl=$HAS_INTEL_WSL)"
 
 # ── NVIDIA ───────────────────────────────────────────────────────────
 # Prefer upstream nvidia-cdi-refresh.service when its unit is installed
@@ -118,11 +159,10 @@ fi
 # by default; podman accepts both .yaml and .json under /etc/cdi or
 # /run/cdi. Spec naming follows the CNCF CDI convention:
 # vendor.com/class=identifier -> amd.com/gpu=all.
-if [[ "$HAS_AMD" == 1 ]] && command -v amd-ctk >/dev/null 2>&1; then
+if [[ "$HAS_AMD" == 1 ]] && command -v amd-ctk >/dev/null 2>&1 && [[ "$HAS_AMD_WSL" == 0 ]]; then
+    # Bare-metal / true-VM path: amd-ctk reads /dev/kfd + /dev/dri.
     if amd-ctk cdi generate --output=/run/cdi/amd.json 2>&1 | logger -t mios-cdi-detect; then
         _log "amd: wrote /run/cdi/amd.json"
-        # Validate the spec with the toolkit's own checker; mark the
-        # spec stale if it fails so podman doesn't pick up a broken file.
         if amd-ctk cdi validate --path=/run/cdi/amd.json >/dev/null 2>&1; then
             _log "amd: spec validated"
         else
@@ -132,6 +172,35 @@ if [[ "$HAS_AMD" == 1 ]] && command -v amd-ctk >/dev/null 2>&1; then
     else
         _log "amd: amd-ctk cdi generate failed (non-fatal)"
     fi
+elif [[ "$HAS_AMD_WSL" == 1 ]]; then
+    # WSL2 path: AMD iGPU is exposed via /dev/dxg + /usr/lib/wsl libs
+    # (DirectX paravirt). amd-ctk doesn't know how to generate this
+    # shape, so we hand-roll the CDI spec the same way the WSL2
+    # NVIDIA fallback does. Container can then use Vulkan (mesa3d)
+    # or DirectML against the Windows-side AMD driver.
+    #
+    # Operator directive 2026-05-17: "iGPU's are ONLY micro-llms" --
+    # this spec REGISTERS the device with podman so containers that
+    # explicitly request AddDevice=amd.com/gpu=all get it. The base
+    # mios-ollama (big-model, NVIDIA-lane) Quadlet does NOT request
+    # this; only micro-LLM container(s) do.
+    out=/run/cdi/wsl2-amd.yaml
+    cat > "$out" <<'WSLAMDCDI'
+cdiVersion: "0.6.0"
+kind: amd.com/gpu
+devices:
+  - name: all
+    containerEdits:
+      deviceNodes:
+        - path: /dev/dxg
+      mounts:
+        - hostPath: /usr/lib/wsl
+          containerPath: /usr/lib/wsl
+          options: ["ro", "nosuid", "nodev", "rbind"]
+      env:
+        - LD_LIBRARY_PATH=/usr/lib/wsl/lib:/usr/local/amd/lib
+WSLAMDCDI
+    _log "amd: wrote $out (WSL2 hand-rolled CDI; AMD iGPU via /dev/dxg + DirectX)"
 elif [[ "$HAS_AMD" == 1 ]]; then
     _log "amd: /dev/kfd present but amd-ctk missing -- run automation/41-gpu-cdi-toolkits.sh"
 fi
@@ -142,7 +211,7 @@ fi
 # produce a /etc/cdi/intel.yaml. Best-effort: this binary is at v0.x
 # upstream and lacks the polish of nvidia-ctk / amd-ctk -- failures
 # here are logged but never break the boot.
-if [[ "$HAS_INTEL" == 1 ]]; then
+if [[ "$HAS_INTEL" == 1 && "$HAS_INTEL_WSL" == 0 ]]; then
     INTEL_GEN=""
     for cand in /usr/libexec/mios/intel-cdi-specs-generator \
                 /usr/local/bin/intel-cdi-specs-generator \
@@ -158,6 +227,28 @@ if [[ "$HAS_INTEL" == 1 ]]; then
     else
         _log "intel: GPU present but intel-cdi-specs-generator missing -- run automation/41-gpu-cdi-toolkits.sh"
     fi
+elif [[ "$HAS_INTEL_WSL" == 1 ]]; then
+    # WSL2 path mirrors the AMD WSL branch above: hand-rolled CDI
+    # spec keyed as intel.com/gpu using the same /dev/dxg + WSL libs.
+    # Per operator directive 2026-05-17: "iGPU's are ONLY micro-llms"
+    # -- only micro-LLM Quadlets should request this device.
+    out=/run/cdi/wsl2-intel.yaml
+    cat > "$out" <<'WSLINTELCDI'
+cdiVersion: "0.6.0"
+kind: intel.com/gpu
+devices:
+  - name: all
+    containerEdits:
+      deviceNodes:
+        - path: /dev/dxg
+      mounts:
+        - hostPath: /usr/lib/wsl
+          containerPath: /usr/lib/wsl
+          options: ["ro", "nosuid", "nodev", "rbind"]
+      env:
+        - LD_LIBRARY_PATH=/usr/lib/wsl/lib:/usr/local/intel/lib
+WSLINTELCDI
+    _log "intel: wrote $out (WSL2 hand-rolled CDI; Intel iGPU via /dev/dxg + DirectX)"
 fi
 
 # ── Status snapshot for the dashboard / mios-boot-diag ───────────────
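The AMD and Intel WSL branches above emit the same spec apart from the vendor name. One way to see that shared shape is a single parameterised writer. This is only a sketch under stated assumptions: `write_wsl_cdi_spec` is a hypothetical function that does not exist in the repo, the output directory is a parameter here (the commit writes to /run/cdi), and the YAML nesting follows the CDI schema (`containerEdits` under each device) because the page scrape does not preserve the heredoc's indentation.

```shell
# Hypothetical helper (not in the repo): write a hand-rolled WSL2 CDI spec
# for one vendor into OUT_DIR and print the path of the written file.
write_wsl_cdi_spec() {
    _vendor=$1
    _out_dir=$2
    case $_vendor in
        amd|intel) ;;
        *) echo "unsupported vendor: $_vendor" >&2; return 1 ;;
    esac
    _out="$_out_dir/wsl2-$_vendor.yaml"
    # Unquoted heredoc delimiter so $_vendor expands; the committed code uses
    # a quoted delimiter with the vendor name written out literally.
    cat > "$_out" <<EOF
cdiVersion: "0.6.0"
kind: $_vendor.com/gpu
devices:
  - name: all
    containerEdits:
      deviceNodes:
        - path: /dev/dxg
      mounts:
        - hostPath: /usr/lib/wsl
          containerPath: /usr/lib/wsl
          options: ["ro", "nosuid", "nodev", "rbind"]
      env:
        - LD_LIBRARY_PATH=/usr/lib/wsl/lib:/usr/local/$_vendor/lib
EOF
    printf '%s\n' "$_out"
}

# Usage: write into a scratch directory rather than the real /run/cdi.
spec_dir=$(mktemp -d)
write_wsl_cdi_spec amd "$spec_dir"
rm -rf "$spec_dir"
```

Keeping the two heredocs separate, as the commit does, keeps each branch readable on its own. A shared writer would mainly pay off if more vendors were added to the /dev/dxg path.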

usr/libexec/mios/mios-gpu-passthrough

Lines changed: 30 additions & 4 deletions
@@ -22,8 +22,22 @@
 #
 # AI Quadlet list (extend by editing MIOS_AI_QUADLETS below or
 # overriding via /etc/mios/gpu-passthrough.conf):
-# ollama, mios-open-webui (RAG vectoring may use GPU),
-# mios-searxng (no AI; left out).
+#
+# Operator directive 2026-05-17 (two-step clarification):
+# first: "iGPU's are ONLY micro-llms"
+# then: "wire to ollama!! JUST ONLY USES Micro-llms in MiOS stack"
+#
+# Resolution: ollama IS in the list. MiOS-stack consumers of
+# Ollama only ask for small models (mios-daemon -> qwen3:0.6b-cpu,
+# pipe refine/polish -> qwen2.5-coder:7b, prefilter -> small).
+# The big-model tags on disk (qwen3-coder:30b, gpt-oss:20b) are
+# parked for ad-hoc operator use; Ollama's auto-scheduling lands
+# the small MiOS-stack models on whichever device has room. With
+# NVIDIA dGPU + AMD iGPU both registered as CDI devices, Ollama
+# spreads load across them; small models fit on the iGPU's WSL
+# DXG path (no VRAM contention with the dGPU), bigger ad-hoc
+# loads keep the dGPU free.
+#
 # Add a new Quadlet -> one line in the list; helper writes the
 # correct drop-in on next boot. Per operator directive
 # 2026-05-17: "make sure ALL is in code and can deploy Day-0 from
@@ -75,13 +89,25 @@ log() { logger -t mios-gpu-passthrough "$*" 2>/dev/null || true; echo "[gpu-pass
 NVIDIA_PRESENT=0
 AMD_PRESENT=0
 INTEL_PRESENT=0
+# Match every spec layout mios-cdi-detect can emit:
+# * <vendor>.yaml / .json -- bare-metal vendor toolkit output
+# * wsl2-<vendor>.yaml -- WSL2 hand-rolled (/dev/dxg + WSL libs)
+# * nvidia-wsl.yaml -- legacy nvidia-ctk WSL output naming
 for spec in "$CDI_DIR"/nvidia.yaml \
             "$CDI_DIR"/nvidia-wsl.yaml \
             "$CDI_DIR"/wsl2-nvidia.yaml; do
     [ -f "$spec" ] && { NVIDIA_PRESENT=1; break; }
 done
-[ -f "$CDI_DIR/amd.json" ] || [ -f "$CDI_DIR/amd.yaml" ] && AMD_PRESENT=1
-[ -f "$CDI_DIR/intel.yaml" ] || [ -f "$CDI_DIR/intel.json" ] && INTEL_PRESENT=1
+for spec in "$CDI_DIR"/amd.json \
+            "$CDI_DIR"/amd.yaml \
+            "$CDI_DIR"/wsl2-amd.yaml; do
+    [ -f "$spec" ] && { AMD_PRESENT=1; break; }
+done
+for spec in "$CDI_DIR"/intel.yaml \
+            "$CDI_DIR"/intel.json \
+            "$CDI_DIR"/wsl2-intel.yaml; do
+    [ -f "$spec" ] && { INTEL_PRESENT=1; break; }
+done
 
 log "CDI present: nvidia=$NVIDIA_PRESENT amd=$AMD_PRESENT intel=$INTEL_PRESENT"
 
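The flow in mios-gpu-passthrough (find a spec file for a vendor, then give each AI Quadlet an `AddDevice=` drop-in) can be sketched end to end in a scratch directory. Everything named here is illustrative: `cdi_spec_present` and `write_gpu_dropin` are hypothetical helpers, the drop-in file name `10-<vendor>-gpu.conf` is an assumption (the diff does not show what the real helper names its files), and the legacy `nvidia-wsl.yaml` name is left out for brevity.

```shell
# Hypothetical helper: succeed if any spec layout for VENDOR exists in
# CDI_DIR (toolkit .yaml/.json output or the WSL2 hand-rolled spec).
cdi_spec_present() {
    for _f in "$2/$1.yaml" "$2/$1.json" "$2/wsl2-$1.yaml"; do
        [ -f "$_f" ] && return 0
    done
    return 1
}

# Hypothetical helper: write a Quadlet drop-in requesting the vendor's CDI
# device for UNIT under DROPIN_ROOT/UNIT.container.d/.
write_gpu_dropin() {
    _d="$3/$1.container.d"
    mkdir -p "$_d" || return 1
    printf '[Container]\nAddDevice=%s.com/gpu=all\n' "$2" > "$_d/10-$2-gpu.conf"
}

# Demo against a scratch tree instead of /run/cdi and the systemd dirs.
demo=$(mktemp -d)
mkdir -p "$demo/cdi"
: > "$demo/cdi/wsl2-amd.yaml"          # simulate the WSL2 AMD spec existing
for vendor in amd intel; do
    if cdi_spec_present "$vendor" "$demo/cdi"; then
        write_gpu_dropin ollama "$vendor" "$demo/systemd"
    fi
done
cat "$demo/systemd/ollama.container.d/10-amd-gpu.conf"
rm -rf "$demo"
```

In this demo only the AMD drop-in is written, because no Intel spec file exists in the scratch directory. That is the same situation the commit message reports for the live host (amd=1, intel=0).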