
feat: add Apple Silicon (MPS) support for macOS ARM64 #1869

Open
jasagiri wants to merge 1 commit into FunAudioLLM:main from jasagiri:feat/apple-silicon

Conversation

@jasagiri jasagiri commented Apr 5, 2026

Summary

  • Introduce a device abstraction layer (cosyvoice/utils/device.py) that unifies CUDA, MPS (Apple Silicon), and CPU device management
  • Replace all hardcoded CUDA-specific code paths in the inference pipeline with device-agnostic alternatives
  • Enable CosyVoice to run natively on Apple Silicon Macs (M1/M2/M3/M4) via PyTorch MPS backend
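
The unified device selection follows a fixed priority (cuda > mps > cpu, per the design decisions below). A minimal sketch of that logic; `pick_device` and its boolean parameters are illustrative names, not the PR's actual `get_device()` signature, and the availability flags are passed in here so the sketch runs without torch:

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Pick the best available backend: CUDA first, then MPS, then CPU.

    In the real device.py this would presumably consult
    torch.cuda.is_available() and torch.backends.mps.is_available();
    injecting the flags keeps the priority logic testable anywhere.
    """
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"
```

Because CUDA is checked first, existing CUDA deployments resolve to the same device as before the change.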

Changes

New files

  • cosyvoice/utils/device.py — Unified device detection (get_device()), stream context, autocast, cache management, and random seed utilities
  • requirements-cuda.txt — Separated CUDA-specific PyPI index URLs for Linux GPU environments
  • setup_macos.sh — One-command setup script for Apple Silicon
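
A common way to implement the stream-context utility listed above is to hand back a real CUDA stream context on CUDA and a do-nothing context everywhere else, so call sites stay identical. This is a hedged sketch of that pattern, not the PR's code; `make_stream` is a hypothetical factory parameter standing in for `torch.cuda.stream(torch.cuda.Stream())`:

```python
from contextlib import nullcontext

def get_stream_context(device_type: str, make_stream=None):
    """Return a context manager for side-stream execution.

    CUDA has a stream API; MPS and CPU do not, so a no-op
    nullcontext lets `with get_stream_context(...):` work uniformly.
    """
    if device_type == "cuda" and make_stream is not None:
        return make_stream()  # e.g. torch.cuda.stream(torch.cuda.Stream())
    return nullcontext()
```

On CUDA the behavior is a transparent passthrough; on MPS/CPU the `with` block simply runs on the default execution path.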

Modified files

  • cosyvoice/cli/model.py — Replace CUDA device init, streams (torch.cuda.stream), AMP (torch.cuda.amp.autocast), and cache clearing across CosyVoiceModel, CosyVoice2Model, CosyVoice3Model
  • cosyvoice/cli/cosyvoice.py — MPS-aware feature gates: TRT/vLLM require CUDA, JIT/fp16 work on any GPU including MPS
  • cosyvoice/cli/frontend.py — Add CoreMLExecutionProvider fallback for ONNX Runtime on Apple Silicon
  • cosyvoice/utils/common.py — Guard torch.cuda.manual_seed_all for non-CUDA environments
  • requirements.txt — Remove CUDA-only index URLs, loosen PyTorch version pin (>=2.3.1)
  • README.md — Add macOS Apple Silicon setup instructions
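
The common.py change guards the CUDA-only seeding call so seeding still works off-CUDA. A minimal sketch of the guard pattern (the function name mirrors the PR's `set_all_random_seed`, but the `cuda_available` flag is an illustrative stand-in for `torch.cuda.is_available()`):

```python
import random

def set_all_random_seed(seed: int, cuda_available: bool = False) -> None:
    """Seed only the RNGs that exist on this machine."""
    random.seed(seed)
    # torch.manual_seed(seed) would go here: it is device-agnostic
    # and safe on CUDA, MPS, and CPU alike.
    if cuda_available:
        # torch.cuda.manual_seed_all(seed) only belongs behind this
        # guard -- calling it without CUDA is what the PR fixes.
        pass
```

Seeding remains deterministic regardless of which branch is taken.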

Design decisions

  • Device priority: cuda > mps > cpu — CUDA environments are unaffected
  • TensorRT/vLLM: Remain CUDA-only (no ARM64 builds exist) — gracefully disabled with warning on MPS
  • JIT/fp16: Enabled on MPS since PyTorch MPS supports both
  • Training: Out of scope — DeepSpeed/DDP do not support MPS. This PR focuses on inference only
  • Zero behavioral change on CUDA: All abstractions are transparent passthrough when CUDA is available
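
The gating rules above (TRT/vLLM strictly CUDA; JIT/fp16 on any GPU including MPS) can be sketched as a small resolver. The function name and returned keys are illustrative, not the actual cosyvoice.py code:

```python
def resolve_feature_gates(device_type: str,
                          load_trt: bool = False,
                          load_vllm: bool = False,
                          load_jit: bool = False,
                          fp16: bool = False) -> dict:
    """Disable CUDA-only features off-CUDA; keep GPU features on MPS."""
    is_cuda = device_type == "cuda"
    is_gpu = device_type in ("cuda", "mps")
    return {
        # TensorRT and vLLM ship no ARM64/MPS builds: CUDA only.
        "trt": load_trt and is_cuda,
        "vllm": load_vllm and is_cuda,
        # PyTorch supports TorchScript and float16 on MPS as well.
        "jit": load_jit and is_gpu,
        "fp16": fp16 and is_gpu,
    }
```

On CUDA every requested feature passes through unchanged, matching the zero-behavioral-change guarantee.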

Platform support matrix

| Feature   | CUDA | MPS (Apple Silicon) | CPU |
|-----------|------|---------------------|-----|
| Inference | ✅   | ✅                  | ✅  |
| fp16      | ✅   | ✅                  | ❌  |
| JIT       | ✅   | ✅                  | ❌  |
| TensorRT  | ✅   | ❌                  | ❌  |
| vLLM      | ✅   | ❌                  | ❌  |
| Training  | ✅   | ❌                  | ❌  |

Test plan

  • device.py: All functions tested on MPS (device detection, stream context, autocast with float16, cache clear, seed)
  • common.py: set_all_random_seed() does not crash without CUDA; fade_in_out() works on MPS tensors
  • model.py: All 3 model classes import correctly; no hardcoded CUDA references remain (except intentional load_trt assert)
  • cosyvoice.py: Feature gates correctly disable TRT/vLLM on MPS while keeping JIT/fp16
  • frontend.py: Device abstraction and CoreML provider fallback verified
  • Clean clone test: All checks pass from fresh git clone
  • End-to-end inference with model weights (requires pretrained model download)

🤖 Generated with Claude Code

Introduce a device abstraction layer (cosyvoice/utils/device.py) that
unifies CUDA, MPS, and CPU device management. Replace all hardcoded
CUDA-specific code paths in the inference pipeline with device-agnostic
alternatives, enabling CosyVoice to run natively on Apple Silicon Macs.

Key changes:
- Device abstraction: get_device(), get_stream_context(),
  get_autocast_context(), empty_cache()
- model.py: Replace CUDA device init, streams, AMP, and cache clearing
  across CosyVoiceModel, CosyVoice2Model, CosyVoice3Model
- cosyvoice.py: MPS-aware feature gates (TRT/vLLM require CUDA,
  JIT/fp16 require any GPU)
- frontend.py: CoreMLExecutionProvider support for ONNX Runtime
- common.py: Guard torch.cuda.manual_seed_all for non-CUDA environments
- requirements.txt: Remove CUDA-only index URLs, loosen PyTorch version
- setup_macos.sh: One-command setup script for Apple Silicon

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
jasagiri force-pushed the feat/apple-silicon branch from 029f931 to fb21fd2 on April 5, 2026 at 13:44
