Commit fdbe0bc

feat: flux multilang (#1034)
* Incorporating flux-general-multi
* Stating languages supported by multilang flux
1 parent 8bf7334 commit fdbe0bc

3 files changed

Lines changed: 90 additions & 41 deletions

fern/customization/speech-configuration.mdx

Lines changed: 1 addition & 1 deletion
@@ -51,7 +51,7 @@ This plan defines the parameters for when the assistant begins speaking after th
 
 **Audio-text based providers:**
 
-- **Deepgram Flux**: Deepgram's latest transcriber model with built-in conversational speech recognition. Flux combines high-quality speech-to-text with native turn detection, while delivering ultra-low latency and Nova-3 level accuracy.
+- **Deepgram Flux**: Deepgram's latest transcriber model with built-in conversational speech recognition. Flux combines high-quality speech-to-text with native turn detection, while delivering ultra-low latency and Nova-3 level accuracy. Available in English (`flux-general-en`) and multilingual (`flux-general-multi`) variants. Supported languages for `flux-general-multi`: English (`en`), Spanish (`es`), French (`fr`), German (`de`), Hindi (`hi`), Russian (`ru`), Portuguese (`pt`), Japanese (`ja`), Italian (`it`), Dutch (`nl`).
 
 - **Assembly**: Transcriber that also reports end-of-turn detection. To use Assembly, choose it as your transcriber without setting a separate smart endpointing plan. As transcripts arrive, we consider the `end_of_turn` flag that Assembly sends to mark the end-of-turn, stream to the LLM, and generate a response.

fern/customization/transcriber-fallback-plan.mdx

Lines changed: 1 addition & 1 deletion
@@ -124,7 +124,7 @@ Each transcriber provider supports different configuration options. Expand the a
 
 <AccordionGroup>
 <Accordion title="Deepgram">
-- **model**: Model selection (`nova-3`, `nova-3-general`, `nova-3-medical`, `nova-2`, `flux-general-en`, etc.).
+- **model**: Model selection (`nova-3`, `nova-3-general`, `nova-3-medical`, `nova-2`, `flux-general-en`, `flux-general-multi`, etc.). Use `flux-general-en` for English-only conversations and `flux-general-multi` for multilingual conversations. Supported languages for `flux-general-multi`: `en`, `es`, `fr`, `de`, `hi`, `ru`, `pt`, `ja`, `it`, `nl`.
 - **language**: Language code for transcription.
 - **keywords**: Keywords with optional boost values for improved recognition (e.g., `["companyname", "productname:2"]`).
 - **keyterm**: Keyterm prompting for up to 90% keyword recall rate improvement.

fern/customization/voice-pipeline-configuration.mdx

Lines changed: 88 additions & 39 deletions
@@ -189,7 +189,7 @@ Uses AI models to analyze speech patterns, context, and audio cues to predict wh
 - **krisp**: Audio-based model analyzing prosodic features (intonation, pitch, rhythm)
 
 **Audio-text based providers:**
-- **deepgram-flux**: Deepgram's latest transcriber model with built-in conversational speech recognition. (English only)
+- **deepgram-flux**: Deepgram's latest transcriber model with built-in conversational speech recognition. Use `flux-general-en` for English-only conversations or `flux-general-multi` for multilingual conversations.
 - **assembly**: Transcriber with built-in end-of-turn detection (English only)
 
 <hr />
@@ -199,7 +199,7 @@ Uses AI models to analyze speech patterns, context, and audio cues to predict wh
 
 **When to use smart endpointing:**
 
-- **Deepgram Flux**: English conversations using Deepgram as a transcriber.
+- **Deepgram Flux**: English and Multi-lingual conversations using Deepgram as a transcriber.
 - **Assembly**: Best used when Assembly is already your transcriber provider for English conversations with integrated end-of-turn detection
 - **LiveKit**: English conversations where Deepgram is not the transcriber of choice.
 - **Vapi**: Non-English conversations with default stop speaking plan settings
@@ -221,19 +221,37 @@ Deepgram Flux's end-of-turn detection is configured at the transcriber level, al
 - **4000-6000:** Standard timeout (default: 5000) - natural conversation flow
 - **7000-10000:** Extended timeout for complex or thoughtful responses
 
-**Configuration example:**
+**Configuration examples:**
 
-```json
-{
-  "transcriber": {
-    "provider": "deepgram",
-    "model": "flux-general-en",
-    "language": "en",
-    "eotThreshold": 0.7,
-    "eotTimeoutMs": 5000
-  }
-}
-```
+<Tabs>
+<Tab title="English">
+```json
+{
+  "transcriber": {
+    "provider": "deepgram",
+    "model": "flux-general-en",
+    "language": "en",
+    "eotThreshold": 0.7,
+    "eotTimeoutMs": 5000
+  }
+}
+```
+</Tab>
+<Tab title="Multilingual">
+```json
+{
+  "transcriber": {
+    "provider": "deepgram",
+    "model": "flux-general-multi",
+    "eotThreshold": 0.7,
+    "eotTimeoutMs": 5000
+  }
+}
+```
+
+Supported languages: English (`en`), Spanish (`es`), French (`fr`), German (`de`), Hindi (`hi`), Russian (`ru`), Portuguese (`pt`), Japanese (`ja`), Italian (`it`), Dutch (`nl`). Set the `language` field to one of these codes, or omit it to enable automatic language detection.
+</Tab>
+</Tabs>
 
 ### LiveKit's Wait function

@@ -669,32 +687,63 @@ User Interrupts → Assistant Audio Stopped → backoffSeconds Blocks All Output
 
 ### Audio-text based endpointing (Deepgram Flux example)
 
-```json
-{
-  "transcriber": {
-    "provider": "deepgram",
-    "model": "flux-general-en",
-    "language": "en",
-    "eotThreshold": 0.7,
-    "eotTimeoutMs": 5000
-  },
-  "stopSpeakingPlan": {
-    "numWords": 2,
-    "voiceSeconds": 0.2,
-    "backoffSeconds": 1.0,
-    "acknowledgementPhrases": [
-      "okay",
-      "right",
-      "uh-huh",
-      "yeah",
-      "mm-hmm",
-      "got it"
-    ]
-  }
-}
-```
+<Tabs>
+<Tab title="English">
+```json
+{
+  "transcriber": {
+    "provider": "deepgram",
+    "model": "flux-general-en",
+    "language": "en",
+    "eotThreshold": 0.7,
+    "eotTimeoutMs": 5000
+  },
+  "stopSpeakingPlan": {
+    "numWords": 2,
+    "voiceSeconds": 0.2,
+    "backoffSeconds": 1.0,
+    "acknowledgementPhrases": [
+      "okay",
+      "right",
+      "uh-huh",
+      "yeah",
+      "mm-hmm",
+      "got it"
+    ]
+  }
+}
+```
+
+**Optimized for:** English conversations where Deepgram is set as transcriber.
+</Tab>
+<Tab title="Multilingual">
+```json
+{
+  "transcriber": {
+    "provider": "deepgram",
+    "model": "flux-general-multi",
+    "eotThreshold": 0.7,
+    "eotTimeoutMs": 5000
+  },
+  "stopSpeakingPlan": {
+    "numWords": 2,
+    "voiceSeconds": 0.2,
+    "backoffSeconds": 1.0,
+    "acknowledgementPhrases": [
+      "okay",
+      "right",
+      "uh-huh",
+      "yeah",
+      "mm-hmm",
+      "got it"
+    ]
+  }
+}
+```
 
-**Optimized for:** English conversations where Deepgram is set as transcriber.
+**Optimized for:** Multilingual conversations where Deepgram is set as transcriber. Supported languages: English (`en`), Spanish (`es`), French (`fr`), German (`de`), Hindi (`hi`), Russian (`ru`), Portuguese (`pt`), Japanese (`ja`), Italian (`it`), Dutch (`nl`). Omit `language` to enable automatic language detection.
+</Tab>
+</Tabs>
 
 ### Education and training
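
Review note: the model and language rules this commit documents could be checked before a config is sent. The sketch below is only an illustration under stated assumptions. `TranscriberConfig` and `checkFluxLanguage` are hypothetical names that do not come from any SDK, the interface covers only the fields that appear in the JSON examples in this diff, and treating a non-`en` language on `flux-general-en` as a problem is an interpretation of "English-only", not confirmed API behavior.

```typescript
// Language codes listed for flux-general-multi in this commit's docs.
const FLUX_MULTI_LANGUAGES: string[] = [
  "en", "es", "fr", "de", "hi", "ru", "pt", "ja", "it", "nl",
];

// Hypothetical shape: only the fields shown in the diff's JSON examples.
interface TranscriberConfig {
  provider: string;
  model: string;
  language?: string; // per the docs, omitted on the multilingual model for auto-detection
  eotThreshold?: number;
  eotTimeoutMs?: number;
}

// Hypothetical helper: returns a list of problems, empty if none were found.
function checkFluxLanguage(t: TranscriberConfig): string[] {
  const problems: string[] = [];
  const lang = t.language;
  if (t.model === "flux-general-en") {
    // Assumption: the English-only variant should not be given another language.
    if (lang !== undefined && lang !== "en") {
      problems.push("flux-general-en is documented as English-only, got: " + lang);
    }
  } else if (t.model === "flux-general-multi") {
    // An omitted language is fine: the docs say this enables auto-detection.
    if (lang !== undefined && FLUX_MULTI_LANGUAGES.indexOf(lang) === -1) {
      problems.push("language not listed for flux-general-multi: " + lang);
    }
  }
  return problems;
}

// Usage: the multilingual example from the diff, with no language set.
const multi: TranscriberConfig = {
  provider: "deepgram",
  model: "flux-general-multi",
  eotThreshold: 0.7,
  eotTimeoutMs: 5000,
};
console.log(checkFluxLanguage(multi).length === 0 ? "ok" : "has problems");
```

Models other than the two Flux variants pass through unchecked, since this commit states no language rules for them.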
