@@ -57,7 +69,7 @@ The audio file to transcribe, in one of these formats: `flac`, `mp3`, `mp4`, `mp
 ```yaml
 Type: String
 Required: True
-Position: 1
+Position: 0
 Accept pipeline input: True (ByValue)
 ```
 
@@ -83,7 +95,7 @@ Position: Named
 ```
 
 ### -ResponseFormat
-The format of the transcript output, in one of these options: `json`, `text`, `srt`, `verbose_json`, or `vtt`.
+The format of the transcript output, in one of these options: `json`, `text`, `srt`, `verbose_json`, `vtt`, or `diarized_json`.
 The default value is `text`.
 
 ```yaml
@@ -114,6 +126,61 @@ Required: False
 Position: Named
 ```
 
+### -KnownSpeakerNames
+Optional list of speaker names that correspond to the audio samples provided in `-KnownSpeakerReferences`. Each entry should be a short identifier (for example, `customer` or `agent`). Up to 4 speakers are supported.
+
+```yaml
+Type: String[]
+Required: False
+Position: Named
+```
+
+### -KnownSpeakerReferences
+Optional list of audio samples that contain known speaker references matching `-KnownSpeakerNames`. Each sample must be between 2 and 10 seconds long and can use any of the input audio formats supported for the audio file to transcribe.
+
+```yaml
+Type: String[]
+Required: False
+Position: Named
+```
+
+### -ChunkingStrategy
+Controls how the audio is cut into chunks. Options are `auto` and `server_vad`.
+The default value is `auto`.
+
+```yaml
+Type: String
+Required: False
+Position: Named
+```
+
+### -ChunkingStrategyThreshold
+Sensitivity threshold (0.0 to 1.0) for voice activity detection (VAD).
+
+```yaml
+Type: Float
+Required: False
+Position: Named
+```
+
+### -ChunkingStrategyPrefixPadding
+Amount of audio to include before the VAD-detected speech (in milliseconds).
+
+```yaml
+Type: UInt16
+Required: False
+Position: Named
+```
+
+### -ChunkingStrategySilenceDuration
+Duration of silence that marks the end of speech (in milliseconds).
+
+```yaml
+Type: UInt16
+Required: False
+Position: Named
+```
+
 ### -TimestampGranularities
 The timestamp granularities to populate for this transcription. Any of these options: `word`, or `segment`. The default is `segment`.
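
The parameters added in this change are meant to be used together, so a combined example may help. The following is a minimal sketch, not the module's own code: `New-DiarizationParams` is a hypothetical helper, the file names and numeric values are illustrative, and the cmdlet name `Request-AudioTranscription` and the `File` parameter name are assumptions, since neither appears in this diff. The one-to-one pairing check between names and references is also an assumption inferred from the wording "correspond to".

```powershell
# Hypothetical helper (not part of the documented module): builds a splatting
# table from the parameters described above and checks the documented limits.
function New-DiarizationParams {
    param(
        [Parameter(Mandatory)][string]$File,
        [string[]]$KnownSpeakerNames = @(),
        [string[]]$KnownSpeakerReferences = @(),
        [Parameter(Mandatory)][ValidateRange(0.0, 1.0)][double]$Threshold,
        [Parameter(Mandatory)][uint16]$PrefixPaddingMs,
        [Parameter(Mandatory)][uint16]$SilenceDurationMs
    )

    # The help text states that up to 4 speakers are supported.
    if ($KnownSpeakerNames.Count -gt 4) {
        throw 'At most 4 known speakers are supported.'
    }
    # Assumption: each name pairs with exactly one reference sample.
    if ($KnownSpeakerNames.Count -ne $KnownSpeakerReferences.Count) {
        throw 'KnownSpeakerNames and KnownSpeakerReferences must have the same length.'
    }

    $params = @{
        File                            = $File   # assumed parameter name
        ResponseFormat                  = 'diarized_json'
        ChunkingStrategy                = 'server_vad'
        ChunkingStrategyThreshold       = $Threshold
        ChunkingStrategyPrefixPadding   = $PrefixPaddingMs
        ChunkingStrategySilenceDuration = $SilenceDurationMs
    }
    # Only add the speaker hints when some were supplied.
    if ($KnownSpeakerNames.Count -gt 0) {
        $params.KnownSpeakerNames      = $KnownSpeakerNames
        $params.KnownSpeakerReferences = $KnownSpeakerReferences
    }
    return $params
}

# Illustrative values only; they are not defaults taken from the documentation.
$params = New-DiarizationParams -File 'meeting.mp3' `
    -KnownSpeakerNames 'customer', 'agent' `
    -KnownSpeakerReferences 'customer.mp3', 'agent.mp3' `
    -Threshold 0.5 -PrefixPaddingMs 300 -SilenceDurationMs 500

# Assumed invocation; the cmdlet name is not shown in this diff:
# Request-AudioTranscription @params
```

Building the table first and splatting it keeps the validation separate from the API call, which makes the limits easy to check before any audio is uploaded.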