Commit 6fc7c33
authored
whisper : expose internal VAD speech segments (#3916)
When transcribing with params.vad = true, whisper already computes the speech
segments and keeps them in the state. Expose them so callers can reuse those
boundaries (for example to align or clip subtitles to speech) instead of running
a second, separate VAD pass.
Times are on the original audio timeline in centiseconds; the count is 0 when VAD
was not used. test-vad-full.cpp checks the segments are ordered and non-empty.1 parent 167d225 commit 6fc7c33
3 files changed
Lines changed: 47 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
681 | 681 | | |
682 | 682 | | |
683 | 683 | | |
| 684 | + | |
| 685 | + | |
| 686 | + | |
| 687 | + | |
| 688 | + | |
| 689 | + | |
| 690 | + | |
| 691 | + | |
| 692 | + | |
| 693 | + | |
| 694 | + | |
684 | 695 | | |
685 | 696 | | |
686 | 697 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
8139 | 8139 | | |
8140 | 8140 | | |
8141 | 8141 | | |
| 8142 | + | |
| 8143 | + | |
| 8144 | + | |
| 8145 | + | |
| 8146 | + | |
| 8147 | + | |
| 8148 | + | |
| 8149 | + | |
| 8150 | + | |
| 8151 | + | |
| 8152 | + | |
| 8153 | + | |
| 8154 | + | |
| 8155 | + | |
| 8156 | + | |
| 8157 | + | |
| 8158 | + | |
| 8159 | + | |
| 8160 | + | |
| 8161 | + | |
| 8162 | + | |
| 8163 | + | |
| 8164 | + | |
| 8165 | + | |
8142 | 8166 | | |
8143 | 8167 | | |
8144 | 8168 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
62 | 62 | | |
63 | 63 | | |
64 | 64 | | |
| 65 | + | |
| 66 | + | |
| 67 | + | |
| 68 | + | |
| 69 | + | |
| 70 | + | |
| 71 | + | |
| 72 | + | |
| 73 | + | |
| 74 | + | |
| 75 | + | |
| 76 | + | |
65 | 77 | | |
66 | 78 | | |
67 | 79 | | |
| |||
0 commit comments