docs: update speech-to-text benchmarks (#1014)

IgorSwat · web-flow · commit 83a76f27f0f6 · 2026-03-25T16:31:22.000+01:00
## Description

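Updated the Whisper (speech-to-text) inference time benchmarks in the docs. The Whisper-tiny (30s) encoding and decoding rows now carry new timings, in both the current docs and the versioned 0.8.x docs.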

### Introduces a breaking change?

- [ ] Yes
- [x] No

### Type of change

- [ ] Bug fix (change which fixes an issue)
- [ ] New feature (change which adds functionality)
- [x] Documentation update (improves or adds clarity to existing
documentation)
- [ ] Other (chores, tests, code style improvements etc.)

### Tested on

- [ ] iOS
- [ ] Android

### Testing instructions

No functional changes - no need for testing.

### Screenshots

<!-- Add screenshots here, if applicable -->

### Related issues

<!-- Link related issues here using #issue-number -->

### Checklist

- [ ] I have performed a self-review of my code
- [ ] I have commented my code, particularly in hard-to-understand areas
- [x] I have updated the documentation accordingly
- [ ] My changes generate no new warnings

### Additional notes

<!-- Include any additional information, assumptions, or context that
reviewers might need to understand this PR. -->
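
For reviewers who want to gauge the size of the change, the sketch below computes the old/new ratio per device from the table rows in the diff. It is not part of the repository: the script, the `speedups` helper, and the dictionary names are hypothetical and exist only for illustration. The timings themselves are copied from the removed and added rows.

```python
# Hypothetical helper, not part of the repository.
# Timings in ms are copied from the removed ("old") and added ("new") diff rows.
DEVICES = [
    "iPhone 17 Pro",
    "iPhone 16 Pro",
    "iPhone SE 3",
    "Samsung Galaxy S24",
    "OnePlus 12",
]

ENCODING_MS = {"old": [248, 254, 1145, 435, 526], "new": [89, 93, 403, 277, 260]}
DECODING_MS = {"old": [23, 25, 121, 92, 115], "new": [6, 6, 40, 28, 25]}


def speedups(table):
    """Return the old/new time ratio per device, rounded to 2 decimals."""
    return {
        device: round(old / new, 2)
        for device, old, new in zip(DEVICES, table["old"], table["new"])
    }


if __name__ == "__main__":
    for name, table in (("encoding", ENCODING_MS), ("decoding", DECODING_MS)):
        for device, ratio in speedups(table).items():
            print(f"{name:8s} {device:20s} {ratio:.2f}x")
```

Every ratio comes out above 1, so the updated numbers are lower than the previous ones on all five devices for both encoding and decoding.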
diff --git a/docs/docs/02-benchmarks/inference-time.md b/docs/docs/02-benchmarks/inference-time.md
@@ -139,15 +139,15 @@ Average time for encoding audio of given length over 10 runs. For `Whisper` mode
 
 | Model              | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] |
 | ------------------ | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: |
-| Whisper-tiny (30s) |             248              |             254              |            1145            |                435                |            526            |
+| Whisper-tiny (30s) |              89              |              93              |            403             |                277                |            260            |
 
 ### Decoding
 
 Average time for decoding one token in sequence of approximately 100 tokens, with encoding context is obtained from audio of noted length.
 
 | Model              | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] |
 | ------------------ | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: |
-| Whisper-tiny (30s) |              23              |              25              |            121             |                92                 |            115            |
+| Whisper-tiny (30s) |              6               |              6               |             40             |                28                 |            25             |
 
 ## Text to Speech
 
diff --git a/docs/versioned_docs/version-0.8.x/02-benchmarks/inference-time.md b/docs/versioned_docs/version-0.8.x/02-benchmarks/inference-time.md
@@ -139,15 +139,15 @@ Average time for encoding audio of given length over 10 runs. For `Whisper` mode
 
 | Model              | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] |
 | ------------------ | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: |
-| Whisper-tiny (30s) |             248              |             254              |            1145            |                435                |            526            |
+| Whisper-tiny (30s) |              89              |              93              |            403             |                277                |            260            |
 
 ### Decoding
 
 Average time for decoding one token in sequence of approximately 100 tokens, with encoding context is obtained from audio of noted length.
 
 | Model              | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] |
 | ------------------ | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: |
-| Whisper-tiny (30s) |              23              |              25              |            121             |                92                 |            115            |
+| Whisper-tiny (30s) |              6               |              6               |             40             |                28                 |            25             |
 
 ## Text to Speech