feat(windows): DirectML GPU acceleration for Intel iGPU (Iris/UHD/Arc)#674
RajeshKumar11 wants to merge 2 commits into
📝 Walkthrough
This PR adds Windows DirectML support with Intel Iris/iGPU detection. It introduces a WMI-based Iris detection helper, reworks DirectML device selection with device count validation and targeted logging, updates backend docstrings to clarify device priority, adds torch-directml as a Windows-only dependency, and provides a comprehensive test suite covering DirectML availability, device creation, tensor operations, and model loading.
Changes
Windows DirectML and Intel Iris iGPU Support
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@backend/build_binary.py`:
- Around line 36-39: The build now always enables use_spec_file when
voicebox-server.spec exists which prevents the CUDA packaging branch from
running; update the logic that sets use_spec_file (and the later checks around
lines where CUDA handling occurs) to only use the spec file when it exists AND
the --cuda flag is not set (e.g., change use_spec_file = spec_path.exists() to
use_spec_file = spec_path.exists() and not args.cuda) so the CUDA branch in the
CUDA-handling block (the branch that checks args.cuda) executes when --cuda is
provided.
In `@backend/requirements.txt`:
- Around line 17-19: The requirements file exposes a Windows-only package
unconditionally: the dependency line for torch-directml should be guarded with a
PEP 508 environment marker so it only installs on Windows. Update the
torch-directml requirement (the entry "torch-directml>=0.2.0") to append the
platform marker ; platform_system == "Windows" so non-Windows installs skip it
(adjust marker if you later need to include WSL explicitly).
In `@backend/tests/test_directml_iris.py`:
- Around line 143-149: The test currently catches all Exceptions around
backend.load_model_async("base") (and the similar block at 173-179), which hides
genuine regressions; change the broad except Exception to only handle expected
environmental/network errors by either catching specific exception types (e.g.,
asyncio.TimeoutError, OSError, or your project's NetworkError) or by inspecting
the exception message for known transient indicators (timeout/network/offline),
and for any other exception re-raise it after logging; update the handlers
around backend.load_model_async, backend.is_loaded, and backend.unload_model
accordingly (apply same fix to the second block at lines 173-179).
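The build_binary.py comment above comes down to a single boolean. A minimal sketch, assuming the script parses a `--cuda` flag with argparse and that `spec_path` is a `pathlib.Path`; `should_use_spec_file` and `parse_args` are hypothetical helper names used for illustration, not functions from the PR:

```python
import argparse
from pathlib import Path


def parse_args(argv=None):
    """Parse the one flag this sketch cares about (assumed name: --cuda)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--cuda", action="store_true")
    return parser.parse_args(argv)


def should_use_spec_file(spec_path: Path, cuda: bool) -> bool:
    """Use the spec file only when it exists and --cuda was not requested."""
    return spec_path.exists() and not cuda
```

With this shape, passing `--cuda` always falls through to the CUDA packaging branch, whether or not voicebox-server.spec is present.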
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: adc8e8d7-ee8d-4dc3-873f-91441dab74c1
📒 Files selected for processing (7)
- backend/backends/base.py
- backend/backends/pytorch_backend.py
- backend/build_binary.py
- backend/pyi_rth_torchaudio_compat.py
- backend/requirements.txt
- backend/tests/test_directml_iris.py
- backend/voicebox-server.spec
Actionable comments posted: 1
♻️ Duplicate comments (1)
backend/tests/test_directml_iris.py (1)
143-149: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Don’t let model-load failures log-and-pass.
These smoke tests still go green on real DirectML/model init regressions because RuntimeError is swallowed and only logged. Skip only expected environmental failures; otherwise re-raise so the suite actually protects the DirectML path.
Suggested fix
- except (OSError, RuntimeError, ImportError) as e:
-     logger.warning(f"Model load test failed (expected on slow/offline systems): {e}")
+ except (OSError, ImportError) as e:
+     pytest.skip(f"Environment limitation during model load: {e}")
@@
- except (OSError, RuntimeError, ImportError) as e:
-     logger.warning(f"Model load test failed (expected on slow/offline systems): {e}")
+ except (OSError, ImportError) as e:
+     pytest.skip(f"Environment limitation during model load: {e}")
Also applies to: 173-179
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/tests/test_directml_iris.py` around lines 143 - 149, The test currently swallows RuntimeError during model load; change the exception handling so only expected environmental errors (e.g., OSError, ImportError) are caught and logged, while unexpected failures (RuntimeError and others) are re-raised so the test fails — update the try/except around backend.load_model_async("base") (and the similar block around backend.load_model_async in the other test) to catch only OSError and ImportError or, if catching a broader set, re-raise when isinstance(e, RuntimeError) or not an expected env error; keep calls to backend.is_loaded() and backend.unload_model() as-is.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@backend/tests/test_directml_iris.py`:
- Line 117: The logger.info call uses an unnecessary f-string prefix; update the
logger.info invocation in test_directml_iris.py (the line calling logger.info("✓
DirectML memory management OK")) to remove the stray f so it is a plain string
literal (i.e., replace logger.info(f"✓ DirectML memory management OK") with
logger.info("✓ DirectML memory management OK")) to satisfy the Ruff F541 lint
rule.
---
Duplicate comments:
In `@backend/tests/test_directml_iris.py`:
- Around line 143-149: The test currently swallows RuntimeError during model
load; change the exception handling so only expected environmental errors (e.g.,
OSError, ImportError) are caught and logged, while unexpected failures
(RuntimeError and others) are re-raised so the test fails — update the
try/except around backend.load_model_async("base") (and the similar block around
backend.load_model_async in the other test) to catch only OSError and
ImportError or, if catching a broader set, re-raise when isinstance(e,
RuntimeError) or not an expected env error; keep calls to backend.is_loaded()
and backend.unload_model() as-is.
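The handling this comment asks for can be sketched as follows. This is only an illustration under stated assumptions: `load_model_async` is assumed to be awaitable, `FakeBackend` and `smoke_load` are hypothetical names that exist only in this sketch, and `unittest.SkipTest` from the standard library stands in for `pytest.skip` so the block has no third-party imports:

```python
import asyncio
import unittest

# Errors treated as environmental (offline machine, missing optional package).
EXPECTED_ENV_ERRORS = (OSError, ImportError)


async def smoke_load(backend, model_size="base"):
    """Load and unload a model; skip on environment limits, fail on anything else."""
    try:
        await backend.load_model_async(model_size)
    except EXPECTED_ENV_ERRORS as exc:
        # Stand-in for pytest.skip(...) in the real test module.
        raise unittest.SkipTest(f"Environment limitation during model load: {exc}")
    # RuntimeError and other unexpected errors are not caught, so they propagate
    # and the test fails instead of going green.
    assert backend.is_loaded()
    backend.unload_model()


class FakeBackend:
    """Hypothetical stand-in for the real backend, used only to exercise the sketch."""

    def __init__(self, error=None):
        self._error = error
        self._loaded = False

    async def load_model_async(self, model_size):
        if self._error is not None:
            raise self._error
        self._loaded = True

    def is_loaded(self):
        return self._loaded

    def unload_model(self):
        self._loaded = False
```

The point of the design is that only the errors listed in `EXPECTED_ENV_ERRORS` turn into a skip; a `RuntimeError` from model initialization surfaces as a failure.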
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: d1b6c5af-b55e-488f-850d-51a1f6bd4779
📒 Files selected for processing (3)
- backend/build_binary.py
- backend/requirements.txt
- backend/tests/test_directml_iris.py
🚧 Files skipped from review as they are similar to previous changes (2)
- backend/requirements.txt
- backend/build_binary.py
Add torch-directml support so Windows users with Intel integrated graphics get hardware-accelerated TTS/STT without needing NVIDIA CUDA.
- base.py: add _detect_iris_igpu() via WMI to identify Intel iGPU; add allow_directml parameter to get_torch_device() with priority chain CUDA > XPU > DirectML > MPS > CPU
- pytorch_backend.py: pass allow_directml=True in TTS and STT backends
- requirements.txt: add torch-directml>=0.2.0 (Windows-only platform marker)
- tests/test_directml_iris.py: Windows-only tests for device detection, tensor ops, and model load/unload on DirectML
Closes jamiepine#628
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
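The commit says the iGPU is identified by name. How `_detect_iris_igpu()` matches names is not shown in this thread, so the following is only a sketch of one plausible matching step. It assumes the adapter names have already been read (for example from WMI), and both `looks_like_intel_igpu` and the keyword list (taken from the families named in the PR title) are hypothetical:

```python
import re

# Assumed keywords: the hardware families named in the PR title (Iris/UHD/Arc).
_INTEL_IGPU_PATTERN = re.compile(r"\bintel\b.*\b(iris|uhd|arc)\b", re.IGNORECASE)


def looks_like_intel_igpu(adapter_names):
    """Return True if any adapter name mentions Intel together with Iris, UHD, or Arc."""
    return any(_INTEL_IGPU_PATTERN.search(name) for name in adapter_names)
```

Keeping the name check separate from the WMI query makes it testable on any platform, since it only needs a list of strings.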
17c21fe to 3245fee
Actionable comments posted: 1
🧹 Nitpick comments (1)
backend/backends/base.py (1)
110-117: ⚡ Quick win
Align get_torch_device() return contract with DirectML branch behavior.
The function is annotated as -> str, but DirectML returns a device object. That mismatch can break typed callers and string-based device checks.
Proposed contract-consistency patch
-from typing import Callable, List, Optional, Tuple
+from typing import Any, Callable, List, Optional, Tuple
@@
-def get_torch_device(
+def get_torch_device(
@@
-) -> str:
+) -> str | Any:
Also applies to: 147-159
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/backends/base.py` around lines 110 - 117, get_torch_device currently declares -> str but the DirectML branch returns a torch.device object, causing a type/behavior mismatch; update get_torch_device so its return contract is consistent by either returning a string for all branches (e.g., "cpu", "cuda", "mps", "dml" or formatted torch device strings) or by changing the annotated return type to Union[str, torch.device] and normalizing callers accordingly; locate the DirectML-specific branch inside get_torch_device and either convert its torch.device to a string before returning or adjust the function annotation and downstream uses to accept a torch.device.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@backend/backends/base.py`:
- Around line 156-158: The two logger.info calls in backend/backends/base.py use
unnecessary f-string prefixes for static messages (the lines calling
logger.info("Using DirectML device (Intel Iris iGPU detected)") and
logger.info("Using DirectML device (Windows GPU acceleration via DirectML)")).
Edit those calls to remove the leading f so they are plain string literals
instead of f-strings to resolve Ruff F541; keep the messages and surrounding
logic unchanged.
---
Nitpick comments:
In `@backend/backends/base.py`:
- Around line 110-117: get_torch_device currently declares -> str but the
DirectML branch returns a torch.device object, causing a type/behavior mismatch;
update get_torch_device so its return contract is consistent by either returning
a string for all branches (e.g., "cpu", "cuda", "mps", "dml" or formatted torch
device strings) or by changing the annotated return type to Union[str,
torch.device] and normalizing callers accordingly; locate the DirectML-specific
branch inside get_torch_device and either convert its torch.device to a string
before returning or adjust the function annotation and downstream uses to accept
a torch.device.
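To see why the mismatch matters, here is a minimal sketch of the "return a string for all branches" option. It does not import torch or torch_directml; `FakeDevice` is a hypothetical stand-in for a device object, `normalize_device` and `is_cpu` are illustrative names, and the string `"dml:0"` is a placeholder rather than what the real DirectML device prints:

```python
from typing import Any


class FakeDevice:
    """Hypothetical device-like object; only its str() form matters here."""

    def __init__(self, name: str):
        self._name = name

    def __str__(self) -> str:
        return self._name


def normalize_device(device: Any) -> str:
    """Return a plain string whether the input is a str or a device-like object."""
    return device if isinstance(device, str) else str(device)


def is_cpu(device: Any) -> bool:
    """String-based check that stays valid for both kinds of input."""
    return normalize_device(device) == "cpu"
```

A raw comparison such as `FakeDevice("cpu") == "cpu"` is False because the object is not a string, which is the kind of silent breakage the comment warns about for callers that assume `-> str`.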
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 0fcb0637-6cdd-474c-a026-6eb138fe2e2c
📒 Files selected for processing (4)
- backend/backends/base.py
- backend/backends/pytorch_backend.py
- backend/requirements.txt
- backend/tests/test_directml_iris.py
✅ Files skipped from review due to trivial changes (2)
- backend/backends/pytorch_backend.py
- backend/requirements.txt
🚧 Files skipped from review as they are similar to previous changes (1)
- backend/tests/test_directml_iris.py
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
backend/backends/base.py (1)
110-116: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win
Keep get_torch_device() return type consistent in DirectML path.
get_torch_device() is declared -> str, but the DirectML branch returns torch_directml.device(0) (a device object). This breaks string comparisons throughout the codebase (device == "cpu", etc.) and violates the function's return contract.
Suggested fix
 def get_torch_device(
@@
 ) -> str:
@@
-    device = torch_directml.device(0)
+    device = str(torch_directml.device(0))
     iris_detected = _detect_iris_igpu()
@@
     else:
         logger.info("Using DirectML device (Windows GPU acceleration via DirectML)")
     return device
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/backends/base.py` around lines 110 - 116, get_torch_device() is declared to return a str but the DirectML branch currently returns a torch_directml.device(0) object; change that branch to return a consistent string (for example "directml" or "directml:0") instead of the device object so it matches other branches ("cpu", "cuda", "mps") and preserves the -> str contract used by callers; update the DirectML branch that references torch_directml.device(0) to return the chosen string and ensure the fallback behavior still returns "cpu" when DirectML isn't available.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@backend/backends/base.py`:
- Around line 110-116: get_torch_device() is declared to return a str but the
DirectML branch currently returns a torch_directml.device(0) object; change that
branch to return a consistent string (for example "directml" or "directml:0")
instead of the device object so it matches other branches ("cpu", "cuda", "mps")
and preserves the -> str contract used by callers; update the DirectML branch
that references torch_directml.device(0) to return the chosen string and ensure
the fallback behavior still returns "cpu" when DirectML isn't available.
Closes #628
Summary
Add torch-directml support so Windows users with Intel integrated graphics (Iris Xe, UHD Graphics, Arc) get hardware-accelerated TTS/STT without needing NVIDIA CUDA.
Changes
backend/backends/base.py
- Add _detect_iris_igpu() using WMI to identify Intel iGPU by name
- Add allow_directml parameter to get_torch_device() with priority chain: CUDA > XPU > DirectML > MPS > CPU
backend/backends/pytorch_backend.py
- Pass allow_directml=True in TTS and STT backends so DirectML is used automatically on eligible hardware
backend/requirements.txt
- Add torch-directml>=0.2.0 ; platform_system == "Windows" (Windows-only PEP 508 marker)
backend/tests/test_directml_iris.py (new)
- Windows-only tests for device detection, tensor ops, and model load/unload on DirectML
Test plan
- get_torch_device(allow_directml=True) returns a DirectML device on Windows with Intel iGPU
- "Using DirectML device" on Intel Iris/UHD/Arc hardware
- (allow_directml only activates when CUDA/XPU are absent)
- pytest backend/tests/test_directml_iris.py -v passes on Windows with torch-directml installed
Hardware tested
Intel 12th-gen i5 + Iris Xe Graphics, Windows 11, CPU-only PyTorch + torch-directml 0.2.5
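The priority chain described above (CUDA > XPU > DirectML > MPS > CPU, with DirectML gated by allow_directml) can be sketched as a pure function. This is not the PR's get_torch_device(); `pick_device` is a hypothetical name, the availability flags are passed in rather than probed from torch, and the returned string `"directml"` is a placeholder:

```python
def pick_device(
    cuda: bool = False,
    xpu: bool = False,
    directml: bool = False,
    mps: bool = False,
    allow_directml: bool = False,
) -> str:
    """Pick a device name following CUDA > XPU > DirectML > MPS > CPU."""
    if cuda:
        return "cuda"
    if xpu:
        return "xpu"
    # DirectML is considered only when the caller opts in and it is available.
    if allow_directml and directml:
        return "directml"
    if mps:
        return "mps"
    return "cpu"
```

Because CUDA and XPU are checked first, opting in to DirectML cannot change the result on machines where either of them is available, which matches the no-regression item in the test plan.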
🤖 Generated with Claude Code