
feat: Apple Silicon (MPS) support and device-agnostic memory management #80

Open
fabriziosalmi wants to merge 4 commits into travisvn:main from fabriziosalmi:feature/apple-silicon-mps-support


fabriziosalmi commented Apr 29, 2026

Description

This PR enables Chatterbox-TTS to run natively on Apple Silicon (M1/M2/M3/M4/M5) devices using MPS acceleration while maintaining full compatibility with CUDA and CPU-only setups. It also improves the overall resilience and usability of the API.

Key Changes

  • Transparent Device Mapping: Added a torch.load monkey-patch in app/core/tts_model.py that automatically maps CUDA-serialized models to CPU when running on non-CUDA hardware.
  • Stable Model Loading: Implemented a 'load-to-CPU-then-move' strategy for MPS. Directly loading large models to MPS can be unstable; this PR loads them to CPU first and then manually moves components (t3, s3gen, ve) to the target device.
  • Automatic Language Detection: Enhanced the speech endpoint to recognize language codes (e.g., it, fr, es, de) in the voice parameter. This allows for correct pronunciation and accent even when a specific voice sample isn't provided in the library.
  • Unified GPU Cache Management: Added empty_gpu_cache() helper to support both CUDA and MPS cache clearing, ensuring efficient memory usage across platforms.
  • Startup Resilience: Added a mock for the resemble-perth watermarker to prevent startup crashes on platforms (like macOS with Python 3.12+) where the native package fails to initialize due to missing system dependencies.
  • Enhanced Observability: Updated status APIs and logs to accurately report MPS availability and activity.
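As a rough illustration of the language-detection item above, the sketch below resolves a `voice` value to either a library sample or a language code. It is not the PR's code: `resolve_voice`, `SUPPORTED_LANGUAGES`, `DEFAULT_LANGUAGE`, and the rule that a library voice takes precedence over a language code are all assumptions made for the example, and only `it`, `fr`, `es`, and `de` come from the description.

```python
from typing import Mapping, Optional, Tuple

# Hypothetical language set; the real list depends on what the model supports.
SUPPORTED_LANGUAGES = {"en", "it", "fr", "es", "de"}
# Assumed fallback when the voice value is neither a library voice nor a code.
DEFAULT_LANGUAGE = "en"


def resolve_voice(
    voice: Optional[str], library: Mapping[str, str]
) -> Tuple[Optional[str], str]:
    """Return (voice_sample_path_or_None, language_code) for a request."""
    name = (voice or "").strip()
    # Assumption: a named voice in the library wins over a language code.
    if name in library:
        return library[name], DEFAULT_LANGUAGE
    # Otherwise treat the value as a language code, case-insensitively.
    code = name.lower()
    if code in SUPPORTED_LANGUAGES:
        return None, code
    # Unknown value: no sample, default language.
    return None, DEFAULT_LANGUAGE
```

With this shape, a request with `voice="it"` and no matching library entry carries no voice sample but still selects Italian.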

These changes were verified on macOS with MPS enabled. They resolve the RuntimeError that PyTorch raises when a CUDA-serialized checkpoint is deserialized on a machine without CUDA, as happens on non-NVIDIA hardware.
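The device-handling items in Key Changes can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in `app/core/tts_model.py`: `pick_device`, `patch_torch_load`, and `move_components` are hypothetical names, and only `empty_gpu_cache` and the component names `t3`, `s3gen`, and `ve` appear in the description. The PyTorch calls used (`torch.load` with `map_location`, `torch.cuda.empty_cache`, `torch.mps.empty_cache`, `torch.backends.mps.is_available`) are real APIs in recent PyTorch releases.

```python
import functools

import torch


def pick_device() -> str:
    """Prefer CUDA, then MPS, then CPU."""
    if torch.cuda.is_available():
        return "cuda"
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


def empty_gpu_cache() -> None:
    """Release cached allocator memory on whichever GPU backend is active."""
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
    elif pick_device() == "mps":
        torch.mps.empty_cache()


def patch_torch_load() -> None:
    """Wrap torch.load so checkpoints default to CPU when CUDA is absent."""
    original = torch.load
    if getattr(original, "_cpu_fallback", False):
        return  # already patched, avoid double wrapping

    @functools.wraps(original)
    def load(*args, **kwargs):
        # map_location may also arrive as the second positional argument.
        if not torch.cuda.is_available() and len(args) < 2:
            kwargs.setdefault("map_location", torch.device("cpu"))
        return original(*args, **kwargs)

    load._cpu_fallback = True
    torch.load = load


def move_components(model, device, names=("t3", "s3gen", "ve")):
    """Move named submodules to the target device after a CPU load."""
    for name in names:
        part = getattr(model, name, None)
        if part is not None:
            setattr(model, name, part.to(device))
    return model
```

In this arrangement the patch would be applied once at startup, the model loaded on CPU, and `move_components(model, pick_device())` called afterwards, with `empty_gpu_cache()` invoked after generation or on unload.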
