Commit c008fa5
linxiaodong
whisper : map token timestamps to original time when VAD is enabled
The segment timestamp getters already remap t0/t1 back to the original
audio timeline when VAD is used, but the per-token timestamps returned by
whisper_full_get_token_data() are left in VAD-processed time. That makes
word-level timing unusable with VAD, since tokens end up shifted by
however much silence was removed.
Add whisper_full_get_token_t0/t1 (and the _from_state variants) that run
the token times through the same vad_mapping_table the segment getters
use. With VAD off they just return the stored token times, so existing
callers are unaffected.
1 parent 43d78af commit c008fa5
2 files changed
Lines changed: 42 additions & 0 deletions
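The remapping described in the commit message can be illustrated with a small standalone sketch. It is not the code from this commit: `time_mapping` and `map_to_original` are hypothetical names, and the table layout (sorted pairs of processed time and original time, interpolated linearly and clamped at the ends) is an assumption about how a table such as `vad_mapping_table` could be used. An empty table stands in for "VAD off", in which case the stored time is returned unchanged.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one entry of a VAD mapping table: a time on the
// VAD-processed timeline and the matching time in the original audio.
struct time_mapping {
    int64_t processed; // time after silence was removed
    int64_t original;  // corresponding time in the original audio
};

// Map a time on the processed timeline back to the original timeline.
// Assumption: `table` is sorted by `processed`. An empty table means VAD is
// off, so the input is returned unchanged.
static int64_t map_to_original(const std::vector<time_mapping> & table, int64_t t) {
    if (table.empty()) {
        return t;
    }

    assert(std::is_sorted(table.begin(), table.end(),
        [](const time_mapping & a, const time_mapping & b) {
            return a.processed < b.processed;
        }));

    // First entry whose processed time is >= t.
    auto it = std::lower_bound(table.begin(), table.end(), t,
        [](const time_mapping & m, int64_t v) { return m.processed < v; });

    if (it == table.end()) {
        return table.back().original; // past the last entry: clamp
    }
    if (it == table.begin() || it->processed == t) {
        return it->original;          // exact hit, or before the first entry: clamp
    }

    // Linear interpolation between the two surrounding entries.
    const time_mapping & lo = *(it - 1);
    const time_mapping & hi = *it;
    const int64_t span = hi.processed - lo.processed; // > 0 by construction
    return lo.original + (t - lo.processed) * (hi.original - lo.original) / span;
}
```

With a table such as `{0,100}, {200,300}, {210,800}, {410,1000}` (two speech regions separated by removed silence), a token stored at processed time 310 maps back to 900 on the original timeline, while the same call with an empty table returns 310.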
File 1 of 2: 8 lines added (new lines 670-677); diff content not shown.
File 2 of 2: 34 lines added (new lines 8078-8111); diff content not shown.