feat(windows): DirectML GPU acceleration for Intel iGPU (Iris/UHD/Arc) #674

Open
RajeshKumar11 wants to merge 2 commits into jamiepine:main from RajeshKumar11:feature/directml-windows-igpu

Conversation

@RajeshKumar11 RajeshKumar11 commented May 17, 2026

Closes #628

Summary

Add torch-directml support so Windows users with Intel integrated graphics (Iris Xe, UHD Graphics, Arc) get hardware-accelerated TTS/STT without needing NVIDIA CUDA.

Changes

backend/backends/base.py

  • Add _detect_iris_igpu() using WMI to identify Intel iGPU by name
  • Add allow_directml parameter to get_torch_device() with priority chain: CUDA > XPU > DirectML > MPS > CPU
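
The name-based detection described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual `_detect_iris_igpu()`: the helper names, the regular expression, and the use of PowerShell's `Get-CimInstance Win32_VideoController` to read adapter names are all assumptions, since the PR text only says that WMI is used to identify the iGPU by name.

```python
import re
import subprocess
import sys
from typing import List, Optional

# Hypothetical pattern: product families named in the PR title (Iris/UHD/Arc).
_IGPU_PATTERN = re.compile(r"\b(iris|uhd|arc)\b", re.IGNORECASE)


def is_intel_igpu_name(name: str) -> bool:
    """Return True when an adapter name looks like an Intel Iris/UHD/Arc GPU."""
    return "intel" in name.lower() and _IGPU_PATTERN.search(name) is not None


def list_video_controller_names() -> List[str]:
    """Read adapter names from WMI via PowerShell; empty list off Windows or on failure."""
    if sys.platform != "win32":
        return []
    command = "Get-CimInstance Win32_VideoController | Select-Object -ExpandProperty Name"
    try:
        result = subprocess.run(
            ["powershell", "-NoProfile", "-Command", command],
            capture_output=True, text=True, timeout=10, check=True,
        )
    except (OSError, subprocess.SubprocessError):
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


def detect_intel_igpu(names: Optional[List[str]] = None) -> bool:
    """Check the given adapter names, or query the system when none are given."""
    if names is None:
        names = list_video_controller_names()
    return any(is_intel_igpu_name(name) for name in names)
```

Keeping the name matching separate from the WMI query lets the matching logic be tested on any platform by passing adapter names in directly.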

backend/backends/pytorch_backend.py

  • Pass allow_directml=True in TTS and STT backends so DirectML is used automatically on eligible hardware

backend/requirements.txt

  • Add torch-directml>=0.2.0 ; platform_system == "Windows" (Windows-only PEP 508 marker)

backend/tests/test_directml_iris.py (new)

  • Windows-only tests for DirectML import, device count, device creation, tensor ops, memory management, and Iris iGPU WMI detection
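
One way a Windows-only test module could decide whether to skip is sketched below. The function name and its parameters are hypothetical and exist only so the decision can be exercised on any platform; the PR's own test file may gate its tests differently.

```python
import importlib.util
import sys
from typing import Optional


def directml_skip_reason(
    platform: Optional[str] = None,
    module_present: Optional[bool] = None,
) -> Optional[str]:
    """Return a skip reason, or None when DirectML tests can run.

    Both arguments default to probing the current interpreter; they are
    injectable purely for illustration and testing.
    """
    if platform is None:
        platform = sys.platform
    if module_present is None:
        # find_spec returns None when the top-level module is not installed.
        module_present = importlib.util.find_spec("torch_directml") is not None
    if platform != "win32":
        return "DirectML tests only run on Windows"
    if not module_present:
        return "torch-directml is not installed"
    return None
```

With pytest, the result could feed a module-level `pytest.mark.skipif(directml_skip_reason() is not None, reason=...)` marker.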

Test plan

  • get_torch_device(allow_directml=True) returns a DirectML device on Windows with Intel iGPU
  • Server log shows "Using DirectML device" on Intel Iris/UHD/Arc hardware
  • Existing CUDA path unaffected (allow_directml only activates when CUDA/XPU are absent)
  • pytest backend/tests/test_directml_iris.py -v passes on Windows with torch-directml installed
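
The priority chain stated in this description (CUDA > XPU > DirectML > MPS > CPU) and the claim that the CUDA path is unaffected can be illustrated with a small, framework-free sketch. The function name, the probe dictionary, and the returned labels are assumptions made for illustration; the real `get_torch_device()` queries PyTorch directly.

```python
from typing import Callable, Dict

# Order taken from the PR description; "cpu" is the implicit final fallback.
PRIORITY = ("cuda", "xpu", "directml", "mps")


def choose_backend(
    probes: Dict[str, Callable[[], bool]],
    allow_directml: bool = False,
) -> str:
    """Return the first available backend label in priority order."""
    for name in PRIORITY:
        if name == "directml" and not allow_directml:
            continue  # DirectML is opt-in, mirroring the allow_directml flag
        probe = probes.get(name)
        if probe is not None and probe():
            return name
    return "cpu"
```

Because CUDA and XPU are probed first, enabling `allow_directml` changes the outcome only on machines where neither is available.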

Hardware tested

Intel 12th-gen i5 + Iris Xe Graphics, Windows 11, CPU-only PyTorch + torch-directml 0.2.5

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added Windows DirectML support with automatic detection of Intel integrated GPUs (Iris/UHD/Arc) and improved selection of GPU acceleration.
  • Documentation

    • Clarified device selection priority order: CUDA > XPU > DirectML > CPU.
  • Chores

    • Added optional Windows DirectML dependency.
  • Tests

    • Added Windows-only tests validating DirectML availability, Intel iGPU detection, device selection, basic tensor ops, and model load/unload on DirectML.


Contributor

coderabbitai Bot commented May 17, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

This PR adds Windows DirectML support with Intel Iris/iGPU detection. It introduces a WMI-based Iris detection helper, reworks DirectML device selection with device count validation and targeted logging, updates backend docstrings to clarify device priority, adds torch-directml as a Windows-only dependency, and provides a comprehensive test suite covering DirectML availability, device creation, tensor operations, and model loading.

Changes

Windows DirectML and Intel Iris iGPU Support

Layer / File(s) Summary
Iris Detection and DirectML Device Selection
backend/backends/base.py, backend/backends/pytorch_backend.py
_detect_iris_igpu() uses WMI to detect Intel Iris/UHD/Arc GPUs on Windows. get_torch_device() now checks torch_directml.device_count() before selecting device(0), logs whether Iris was detected versus generic DirectML acceleration, and improves error handling for missing or failing torch_directml. Backend _get_device() docstrings document priority order: CUDA > XPU > DirectML > CPU.
Windows DirectML Requirement and Test Suite
backend/requirements.txt, backend/tests/test_directml_iris.py
Adds torch-directml>=0.2.0 for Windows systems. New test module exercises DirectML import, device count/creation, Iris detection, tensor operations and memory management on DirectML, and async smoke tests for loading/unloading Whisper STT and Qwen TTS models; tests skip gracefully when DirectML is unavailable or when network/environment issues prevent model downloads.
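
The device-count check described in the table above might look roughly like this. It is a sketch under stated assumptions: `try_directml_device` and `FakeDml` are hypothetical names, and only `device_count()` and `device(0)` are taken from the walkthrough. The module is injectable so the guard can be exercised without torch-directml installed.

```python
import importlib
import logging
from typing import Any, Optional

logger = logging.getLogger(__name__)


def try_directml_device(dml: Optional[Any] = None) -> Optional[Any]:
    """Return DirectML device 0, or None so the caller can fall through to CPU."""
    if dml is None:
        try:
            dml = importlib.import_module("torch_directml")
        except (ImportError, OSError) as exc:
            logger.debug("torch_directml unavailable: %s", exc)
            return None
    try:
        # Validate the count first so device(0) is never requested on an empty system.
        if dml.device_count() < 1:
            logger.debug("DirectML reports no devices")
            return None
        return dml.device(0)
    except Exception as exc:  # driver or runtime failure: degrade instead of crashing
        logger.warning("DirectML initialisation failed: %s", exc)
        return None


class FakeDml:
    """Stand-in for the torch_directml module, for illustration only."""

    def __init__(self, count: int) -> None:
        self._count = count

    def device_count(self) -> int:
        return self._count

    def device(self, index: int) -> str:
        return f"privateuseone:{index}"
```

Returning `None` rather than raising keeps the fallback decision with the caller, which can then continue down the priority chain.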

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related issues

  • #628: Directly addresses the feature request for Windows DirectML support to expand iGPU acceleration for Intel Iris/Arc GPUs as an alternative to dedicated NVIDIA hardware.

Poem

🐰 Iris gleams on Windows' field tonight,
WMI listens, names the iGPU right,
DirectML wakes and shifts the tiny load,
Models load, then leap along the road,
Hooray — a rabbit hops where CPUs once toiled.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name | Status | Explanation | Resolution
Docstring Coverage | ⚠️ Warning | Docstring coverage is 79.17% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled.
Title check | ✅ Passed | The PR title clearly summarizes the main change: adding DirectML GPU acceleration for Intel iGPU on Windows, which aligns with the primary feature introduced in the changeset.
Linked Issues check | ✅ Passed | The PR successfully implements Intel iGPU acceleration via DirectML with detection logic, device priority handling, and comprehensive tests, directly addressing the core objectives from #628.
Out of Scope Changes check | ✅ Passed | All changes are directly related to adding DirectML support for Intel iGPU on Windows; no unrelated or out-of-scope modifications were introduced.


Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@backend/build_binary.py`:
- Around line 36-39: The build now always enables use_spec_file when
voicebox-server.spec exists which prevents the CUDA packaging branch from
running; update the logic that sets use_spec_file (and the later checks around
lines where CUDA handling occurs) to only use the spec file when it exists AND
the --cuda flag is not set (e.g., change use_spec_file = spec_path.exists() to
use_spec_file = spec_path.exists() and not args.cuda) so the CUDA branch in the
CUDA-handling block (the branch that checks args.cuda) executes when --cuda is
provided.

In `@backend/requirements.txt`:
- Around line 17-19: The requirements file exposes a Windows-only package
unconditionally: the dependency line for torch-directml should be guarded with a
PEP 508 environment marker so it only installs on Windows. Update the
torch-directml requirement (the entry "torch-directml>=0.2.0") to append the
platform marker ; platform_system == "Windows" so non-Windows installs skip it
(adjust marker if you later need to include WSL explicitly).

In `@backend/tests/test_directml_iris.py`:
- Around line 143-149: The test currently catches all Exceptions around
backend.load_model_async("base") (and the similar block at 173-179), which hides
genuine regressions; change the broad except Exception to only handle expected
environmental/network errors by either catching specific exception types (e.g.,
asyncio.TimeoutError, OSError, or your project's NetworkError) or by inspecting
the exception message for known transient indicators (timeout/network/offline),
and for any other exception re-raise it after logging; update the handlers
around backend.load_model_async, backend.is_loaded, and backend.unload_model
accordingly (apply same fix to the second block at lines 173-179).
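
The `build_binary.py` finding in the list above comes down to a single boolean. A minimal sketch follows; the helper names are hypothetical, and only `--cuda`, `spec_path.exists()`, and `args.cuda` are taken from the review text.

```python
import argparse
from pathlib import Path
from typing import List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse only the flag relevant to this sketch."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--cuda", action="store_true", help="build the CUDA variant")
    return parser.parse_args(argv)


def should_use_spec_file(spec_path: Path, cuda: bool) -> bool:
    """Use the checked-in spec file only for non-CUDA builds.

    A bare spec_path.exists() would be True whenever the spec file is
    present, so the CUDA packaging branch could never run.
    """
    return spec_path.exists() and not cuda
```

A caller would then write something like `use_spec_file = should_use_spec_file(spec_path, args.cuda)` before branching on `args.cuda`.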

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: adc8e8d7-ee8d-4dc3-873f-91441dab74c1

📥 Commits

Reviewing files that changed from the base of the PR and between b35b909 and fa88c0e.

📒 Files selected for processing (7)
  • backend/backends/base.py
  • backend/backends/pytorch_backend.py
  • backend/build_binary.py
  • backend/pyi_rth_torchaudio_compat.py
  • backend/requirements.txt
  • backend/tests/test_directml_iris.py
  • backend/voicebox-server.spec

Comment thread backend/build_binary.py Outdated
Comment thread backend/requirements.txt Outdated
Comment thread backend/tests/test_directml_iris.py Outdated
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (1)
backend/tests/test_directml_iris.py (1)

143-149: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don’t let model-load failures log-and-pass.

These smoke tests still go green on real DirectML/model init regressions because RuntimeError is swallowed and only logged. Skip only expected environmental failures; otherwise re-raise so the suite actually protects the DirectML path.

Suggested fix
-        except (OSError, RuntimeError, ImportError) as e:
-            logger.warning(f"Model load test failed (expected on slow/offline systems): {e}")
+        except (OSError, ImportError) as e:
+            pytest.skip(f"Environment limitation during model load: {e}")
-        except (OSError, RuntimeError, ImportError) as e:
-            logger.warning(f"Model load test failed (expected on slow/offline systems): {e}")
+        except (OSError, ImportError) as e:
+            pytest.skip(f"Environment limitation during model load: {e}")

Also applies to: 173-179
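
The intent of the suggested fix can be shown without pytest or a real model. In this sketch, `unittest.SkipTest` stands in for `pytest.skip` so that it runs with the standard library alone, and `smoke_load`, `outcome`, and the three sample coroutines are hypothetical names used only for illustration.

```python
import asyncio
import unittest
from typing import Awaitable, Callable

# Failures that plausibly come from the environment (offline, missing package).
ENVIRONMENT_ERRORS = (OSError, ImportError)


async def smoke_load(load: Callable[[], Awaitable[None]]) -> None:
    """Skip on environment limitations; let every other failure propagate."""
    try:
        await load()
    except ENVIRONMENT_ERRORS as exc:
        raise unittest.SkipTest(f"Environment limitation during model load: {exc}") from exc


def outcome(load: Callable[[], Awaitable[None]]) -> str:
    """Classify a run the way a test runner would report it."""
    try:
        asyncio.run(smoke_load(load))
    except unittest.SkipTest:
        return "skipped"
    except Exception:
        return "failed"
    return "passed"


async def load_offline() -> None:
    raise OSError("network unreachable")


async def load_regression() -> None:
    raise RuntimeError("DirectML init failed")


async def load_ok() -> None:
    return None
```

Under this handling a `RuntimeError` from model initialisation is reported as a failure instead of being logged and passed.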

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/tests/test_directml_iris.py` around lines 143 - 149, The test
currently swallows RuntimeError during model load; change the exception handling
so only expected environmental errors (e.g., OSError, ImportError) are caught
and logged, while unexpected failures (RuntimeError and others) are re-raised so
the test fails — update the try/except around backend.load_model_async("base")
(and the similar block around backend.load_model_async in the other test) to
catch only OSError and ImportError or, if catching a broader set, re-raise when
isinstance(e, RuntimeError) or not an expected env error; keep calls to
backend.is_loaded() and backend.unload_model() as-is.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@backend/tests/test_directml_iris.py`:
- Line 117: The logger.info call uses an unnecessary f-string prefix; update the
logger.info invocation in test_directml_iris.py (the line calling logger.info("✓
DirectML memory management OK")) to remove the stray f so it is a plain string
literal (i.e., replace logger.info(f"✓ DirectML memory management OK") with
logger.info("✓ DirectML memory management OK")) to satisfy the Ruff F541 lint
rule.


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d1b6c5af-b55e-488f-850d-51a1f6bd4779

📥 Commits

Reviewing files that changed from the base of the PR and between fa88c0e and 97c5fc5.

📒 Files selected for processing (3)
  • backend/build_binary.py
  • backend/requirements.txt
  • backend/tests/test_directml_iris.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • backend/requirements.txt
  • backend/build_binary.py

Comment thread backend/tests/test_directml_iris.py Outdated
Add torch-directml support so Windows users with Intel integrated graphics
get hardware-accelerated TTS/STT without needing NVIDIA CUDA.

- base.py: add _detect_iris_igpu() via WMI to identify Intel iGPU; add
  allow_directml parameter to get_torch_device() with priority chain
  CUDA > XPU > DirectML > MPS > CPU
- pytorch_backend.py: pass allow_directml=True in TTS and STT backends
- requirements.txt: add torch-directml>=0.2.0 (Windows-only platform marker)
- tests/test_directml_iris.py: Windows-only tests for device detection,
  tensor ops, and model load/unload on DirectML

Closes jamiepine#628

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@RajeshKumar11 RajeshKumar11 force-pushed the feature/directml-windows-igpu branch from 17c21fe to 3245fee Compare May 17, 2026 17:49
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
backend/backends/base.py (1)

110-117: ⚡ Quick win

Align get_torch_device() return contract with DirectML branch behavior.

The function is annotated as -> str, but DirectML returns a device object. That mismatch can break typed callers and string-based device checks.

Proposed contract-consistency patch
-from typing import Callable, List, Optional, Tuple
+from typing import Any, Callable, List, Optional, Tuple
@@
-def get_torch_device(
+def get_torch_device(
@@
-) -> str:
+) -> str | Any:

Also applies to: 147-159

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/backends/base.py` around lines 110 - 117, get_torch_device currently
declares -> str but the DirectML branch returns a torch.device object, causing a
type/behavior mismatch; update get_torch_device so its return contract is
consistent by either returning a string for all branches (e.g., "cpu", "cuda",
"mps", "dml" or formatted torch device strings) or by changing the annotated
return type to Union[str, torch.device] and normalizing callers accordingly;
locate the DirectML-specific branch inside get_torch_device and either convert
its torch.device to a string before returning or adjust the function annotation
and downstream uses to accept a torch.device.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@backend/backends/base.py`:
- Around line 156-158: The two logger.info calls in backend/backends/base.py use
unnecessary f-string prefixes for static messages (the lines calling
logger.info("Using DirectML device (Intel Iris iGPU detected)") and
logger.info("Using DirectML device (Windows GPU acceleration via DirectML)")).
Edit those calls to remove the leading f so they are plain string literals
instead of f-strings to resolve Ruff F541; keep the messages and surrounding
logic unchanged.
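
For readers unfamiliar with Ruff F541 (an f-string without any placeholders), the check can be approximated with the standard `ast` module. This is an illustrative sketch, not Ruff's implementation, and the function name is hypothetical.

```python
import ast
from typing import List


def find_placeholderless_fstrings(source: str) -> List[int]:
    """Return sorted line numbers of f-strings that contain no {placeholders}."""
    tree = ast.parse(source)
    # Format specs such as the ">10" in f"{x:>10}" are themselves JoinedStr
    # nodes; record them so they are not reported as separate f-strings.
    spec_ids = {
        id(node.format_spec)
        for node in ast.walk(tree)
        if isinstance(node, ast.FormattedValue) and node.format_spec is not None
    }
    hits = [
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.JoinedStr)
        and id(node) not in spec_ids
        and not any(isinstance(value, ast.FormattedValue) for value in node.values)
    ]
    return sorted(hits)


SAMPLE = (
    'logger.info(f"Using DirectML device")\n'
    'logger.info(f"{count} devices found")\n'
    'logger.info(f"{name:>10}")\n'
    'logger.info("plain string")\n'
)
```

Only the first line of `SAMPLE` is reported, which corresponds to the kind of call the review asks to change to a plain string literal.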


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 0fcb0637-6cdd-474c-a026-6eb138fe2e2c

📥 Commits

Reviewing files that changed from the base of the PR and between 17c21fe and 3245fee.

📒 Files selected for processing (4)
  • backend/backends/base.py
  • backend/backends/pytorch_backend.py
  • backend/requirements.txt
  • backend/tests/test_directml_iris.py
✅ Files skipped from review due to trivial changes (2)
  • backend/backends/pytorch_backend.py
  • backend/requirements.txt
🚧 Files skipped from review as they are similar to previous changes (1)
  • backend/tests/test_directml_iris.py

Comment thread backend/backends/base.py Outdated
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Contributor

@coderabbitai coderabbitai Bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
backend/backends/base.py (1)

110-116: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Keep get_torch_device() return type consistent in DirectML path.

get_torch_device() is declared -> str, but the DirectML branch returns torch_directml.device(0) (a device object). This breaks string comparisons throughout the codebase (device == "cpu", etc.) and violates the function's return contract.

Suggested fix
 def get_torch_device(
@@
 ) -> str:
@@
-                device = torch_directml.device(0)
+                device = str(torch_directml.device(0))
                 iris_detected = _detect_iris_igpu()
@@
                 else:
                     logger.info("Using DirectML device (Windows GPU acceleration via DirectML)")
                 return device
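
The contract issue can be demonstrated without PyTorch. In the sketch below, `device_to_str` and `FakeDevice` are hypothetical; `FakeDevice` merely imitates an object whose `str()` yields a `type:index` string, which is how `torch.device` objects render. The `privateuseone` type name is an assumption about what torch-directml reports and should be checked against the installed version.

```python
from typing import Any


def device_to_str(device: Any) -> str:
    """Normalise a device object or string to a plain string."""
    return device if isinstance(device, str) else str(device)


class FakeDevice:
    """Imitates a device object, for illustration only."""

    def __init__(self, type_: str, index: int) -> None:
        self.type = type_
        self.index = index

    def __str__(self) -> str:
        return f"{self.type}:{self.index}"


def is_cpu(device: Any) -> bool:
    """A string comparison of the kind the review says callers rely on."""
    return device_to_str(device) == "cpu"
```

Normalising at the return site keeps the `-> str` annotation honest, so comparisons such as `device == "cpu"` behave the same on every branch.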
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/backends/base.py` around lines 110 - 116, get_torch_device() is
declared to return a str but the DirectML branch currently returns a
torch_directml.device(0) object; change that branch to return a consistent
string (for example "directml" or "directml:0") instead of the device object so
it matches other branches ("cpu", "cuda", "mps") and preserves the -> str
contract used by callers; update the DirectML branch that references
torch_directml.device(0) to return the chosen string and ensure the fallback
behavior still returns "cpu" when DirectML isn't available.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 41248039-21d4-4cd4-b2d6-25dd2f2e68e8

📥 Commits

Reviewing files that changed from the base of the PR and between 3245fee and ab37969.

📒 Files selected for processing (1)
  • backend/backends/base.py



Development

Successfully merging this pull request may close these issues.

[Feature Request] Expand iGPU Support (Intel Iris, AMD Radeon) via OpenVINO or DirectML
