27 changes: 27 additions & 0 deletions README.md
@@ -74,6 +74,33 @@ sources:
Each subtree has its own README / build flow / public C++ API. The
upstream whisper.cpp build below is unaffected by either.

### Platform & GPU-backend benchmark coverage

Backends each engine can build for, and where we currently have **CI benchmark
numbers** (2026-06; GPU runner = NVIDIA RTX 4000 SFF Ada Generation, Vulkan).
Legend: ✅ current CI numbers · ⚠️ maintainer-measured on Apple Silicon (not CI)
· ⏳ supported, benchmark pending · ⛔ GPU compute disabled (Adreno crash → CPU
fallback) · — n/a.

| Engine | CPU | Vulkan (Linux/Win) | Metal (macOS/iOS) | CUDA | Android (Vulkan/OpenCL) |
|--------|:---:|:------------------:|:-----------------:|:----:|:-----------------------:|
| whisper.cpp (ASR) | ⏳² | ⏳ | ⏳ | ⏳ | ⏳ |
| parakeet (ASR) | ✅ | ✅ | ⚠️ | ⏳ | ⛔ |
| Chatterbox (TTS) | ✅ | ✅ | ⚠️ | ⏳ | ⛔ |
| Supertonic (TTS) | ✅ | ⏳¹ | ⚠️ | ⏳ | ⛔ |

¹ Supertonic GPU (Vulkan/Metal) is re-landing in
[qvac#2506](https://github.com/tetherto/qvac/pull/2506); on the current package
it runs CPU-only.

**Android GPU** (Adreno Vulkan/OpenCL) is force-disabled for parakeet and
Supertonic because the ggml graph compute aborts, so `useGPU` falls back to CPU
there (parakeet: [qvac#2525](https://github.com/tetherto/qvac/pull/2525)).
Chatterbox is likewise CPU-only on Android, with GPU forced off at the engine
boundary.

² whisper-cli ships the upstream [`whisper-bench`](examples/bench) tool, with
community results in [ggml-org/whisper.cpp#89](https://github.com/ggml-org/whisper.cpp/issues/89);
refreshed QVAC CI speed coverage is still pending.

Per-engine detail: [`parakeet-cpp`](parakeet-cpp/README.md#backend--platform-coverage)
· [`tts-cpp`](tts-cpp/README.md#backend--platform-coverage).
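
The Android fallback described above can be pictured as a small backend-resolution step. The sketch below is illustrative only: `Backend`, `Platform`, and `resolve_backend` are hypothetical names, not the qvac or ggml API, and the Adreno check is assumed to be a simple platform flag.

```cpp
// Hypothetical sketch: none of these names come from the qvac or ggml API.
enum class Backend { Cpu, Vulkan, Metal, Cuda, OpenCl };

struct Platform {
    bool is_android;      // assumed to be known at build time or startup
    bool has_adreno_gpu;  // assumed to come from a device query
};

// Returns the backend that would actually run for a given request.
Backend resolve_backend(bool use_gpu, Backend preferred_gpu, const Platform& p) {
    if (!use_gpu) {
        return Backend::Cpu;
    }
    // Assumed policy mirroring the disabled-GPU cells in the table:
    // Adreno Vulkan/OpenCL is forced off, so the request silently lands on CPU.
    const bool adreno_path =
        preferred_gpu == Backend::Vulkan || preferred_gpu == Backend::OpenCl;
    if (p.is_android && p.has_adreno_gpu && adreno_path) {
        return Backend::Cpu;
    }
    return preferred_gpu;
}
```

With this shape, a caller that passes `use_gpu = true` on an Adreno device still gets a working CPU run rather than an abort, which is the behaviour the footnote describes.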

## Quick start

First clone the repository:
15 changes: 15 additions & 0 deletions parakeet-cpp/README.md
@@ -121,6 +121,21 @@ Default **`--quant`** is **`q8_0`**. Use **`f16`** for parity-calibrated harness

Small tensors and shapes not divisible by 32 may stay f16; see `PROGRESS.md` for quant sweep detail.

### Backend & platform coverage

Where parakeet runs and where we have **CI benchmark numbers** (legend: ✅ current
CI · ⚠️ maintainer-measured, not CI · ⏳ supported, benchmark pending · ⛔ GPU
compute disabled, CPU fallback).

| Platform | Backend(s) | Benchmark status |
|----------|------------|------------------|
| Linux x86-64 | CPU, Vulkan | ✅ CPU + Vulkan (CI — see table below) |
| Linux arm64 | CPU | ✅ CPU (CI) |
| Windows x64 | CPU, Vulkan | ✅ CPU (CI); Vulkan supported, ⏳ not yet benched |
| macOS x64/arm64, iOS | CPU, Metal | ✅ CPU (CI); ⚠️ Metal maintainer-measured (`RTF (Metal)` column above); ⏳ iOS |
| Android (arm64) | CPU | ✅ CPU; Adreno Vulkan/OpenCL ⛔ force-disabled — ggml compute aborts ([qvac#2525](https://github.com/tetherto/qvac/pull/2525)) |
| NVIDIA CUDA | CPU, CUDA | ⏳ supported, not benchmarked |

### CI benchmarks (latest `ggml-speech`, Linux x86-64)

End-to-end RTF measured in CI on the `tetherto/qvac` self-hosted runners, using
14 changes: 14 additions & 0 deletions tts-cpp/README.md
@@ -834,6 +834,20 @@ setup for the Apple rows:
- Reference voice: `test/reference-audio/jfk.wav` (11 s mono 16 kHz)
- Seed: 42, warm 3-run average, inference only (excludes model load)
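
A measurement like the one in this setup can be sketched as follows. This is a minimal illustration, not the project's harness: `rtf` and `warm_average_seconds` are hypothetical helpers, RTF is assumed to mean processing time divided by audio duration (lower is faster), and "warm 3-run average" is assumed to mean one untimed warm-up call followed by three timed calls.

```cpp
#include <chrono>
#include <functional>
#include <stdexcept>

// Real-time factor under the assumed convention: seconds of compute per
// second of audio. Values below 1.0 mean faster than real time.
double rtf(double processing_seconds, double audio_seconds) {
    if (audio_seconds <= 0.0) {
        throw std::invalid_argument("audio_seconds must be positive");
    }
    return processing_seconds / audio_seconds;
}

// Calls `infer` once untimed as a warm-up, then returns the mean wall time
// of `runs` timed calls. Model load is expected to happen before this call,
// so it is excluded from the measurement.
double warm_average_seconds(const std::function<void()>& infer, int runs = 3) {
    if (runs <= 0) {
        throw std::invalid_argument("runs must be positive");
    }
    infer();  // warm-up, not timed
    using clock = std::chrono::steady_clock;
    double total = 0.0;
    for (int i = 0; i < runs; ++i) {
        const auto start = clock::now();
        infer();
        total += std::chrono::duration<double>(clock::now() - start).count();
    }
    return total / runs;
}
```

Dividing the result of `warm_average_seconds` by the duration of the audio processed or generated gives a figure comparable in spirit to the RTF columns, under the assumptions stated above.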

### Backend & platform coverage

Where each TTS engine runs and where we have **CI benchmark numbers** (legend:
✅ current CI · ⚠️ maintainer-measured on Apple, not CI · ⏳ supported, benchmark
pending · ⛔ GPU compute disabled, CPU fallback).

| Platform | Chatterbox (Turbo / Multilingual) | Supertonic |
|----------|-----------------------------------|------------|
| Linux x86-64 | CPU ✅, Vulkan ✅ (CI) | CPU ✅ (CI); Vulkan ⏳ (re-land in [qvac#2506](https://github.com/tetherto/qvac/pull/2506)) |
| Windows x64 | CPU ✅, Vulkan ✅ (CI) | CPU ✅; Vulkan ⏳ |
| macOS / iOS | CPU ✅, Metal ⚠️ (maintainer) | CPU ✅, Metal ⚠️ (maintainer); iOS ⏳ |
| Android (arm64) | CPU only; GPU forced off at the engine boundary | CPU only; Adreno GPU aborts ([qvac#2506](https://github.com/tetherto/qvac/pull/2506)) |
| NVIDIA CUDA | ⏳ supported, not benchmarked | ⏳ supported, not benchmarked |

### CI benchmarks (latest `ggml-speech`, Linux x86-64)

End-to-end RTF measured in CI on the `tetherto/qvac` self-hosted runners, using