docs/docs/03-hooks/01-natural-language-processing/useSpeechToText.md
Lines changed: 1 addition & 1 deletion
@@ -66,7 +66,7 @@ Since speech-to-text models can only process audio segments up to 30 seconds lon
66
66
67
67
`useSpeechToText` takes [`SpeechToTextProps`](../../06-api-reference/interfaces/SpeechToTextProps.md) that consists of:
68
68
69
-
- `model` of type [`SpeechToTextConfig`](../../06-api-reference/interfaces/SpeechToTextModelConfig.md), containing the [`isMultilingual` flag](../../06-api-reference/interfaces/SpeechToTextModelConfig.md#ismultilingual), [tokenizer source](../../06-api-reference/interfaces/SpeechToTextModelConfig.md#tokenizersource), [encoder source](../../06-api-reference/interfaces/SpeechToTextModelConfig.md#encodersource), and [decoder source](../../06-api-reference/interfaces/SpeechToTextModelConfig.md#decodersource).
69
+
- `model` of type [`SpeechToTextConfig`](../../06-api-reference/interfaces/SpeechToTextModelConfig.md), containing the [`isMultilingual` flag](../../06-api-reference/interfaces/SpeechToTextModelConfig.md#ismultilingual), [tokenizer source](../../06-api-reference/interfaces/SpeechToTextModelConfig.md#tokenizersource), and [model source](../../06-api-reference/interfaces/SpeechToTextModelConfig.md#modelsource).
70
70
- An optional flag [`preventLoad`](../../06-api-reference/interfaces/SpeechToTextProps.md#preventload) which prevents auto-loading of the model.
71
71
72
72
You need more details? Check the following resources:
- [`isMultilingual`](../../06-api-reference/interfaces/SpeechToTextModelConfig.md#ismultilingual) - Flag indicating if model is multilingual.
47
47
48
-
- [`encoderSource`](../../06-api-reference/interfaces/SpeechToTextModelConfig.md#encodersource) - The location of the used encoder.
49
-
50
-
- [`decoderSource`](../../06-api-reference/interfaces/SpeechToTextModelConfig.md#decodersource) - The location of the used decoder.
48
+
- [`modelSource`](../../06-api-reference/interfaces/SpeechToTextModelConfig.md#modelsource) - The location of the used model (bundled encoder + decoder functionality).
51
49
52
50
- [`tokenizerSource`](../../06-api-reference/interfaces/SpeechToTextModelConfig.md#tokenizersource) - The location of the used tokenizer.
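
As a rough illustration of the config shape after this change, the sketch below mirrors the three documented fields in a local interface. The names `SttModelConfigSketch` and `listAssets`, as well as the example URLs, are placeholders assumed for this example; nothing here is imported from the library.

```typescript
// Local stand-in for the documented config fields; not the library's own type.
interface SttModelConfigSketch {
  isMultilingual: boolean; // whether the model supports multiple languages
  modelSource: string; // location of the single model file (bundled encoder + decoder)
  tokenizerSource: string; // location of the tokenizer
}

// Hypothetical example values; both URLs are placeholders.
const exampleConfig: SttModelConfigSketch = {
  isMultilingual: false,
  modelSource: 'https://example.com/models/model.pte',
  tokenizerSource: 'https://example.com/models/tokenizer.json',
};

// Lists the assets a loader would need to fetch for this config.
// With the bundled model there are two entries instead of the previous three.
function listAssets(config: SttModelConfigSketch): string[] {
  return [config.modelSource, config.tokenizerSource];
}
```

Compared with the earlier `encoderSource` / `decoderSource` pair, a config of this shape points at one model file plus the tokenizer.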
docs/docs/06-api-reference/classes/SpeechToTextModule.md
Lines changed: 9 additions & 9 deletions
@@ -1,6 +1,6 @@
1
1
# Class: SpeechToTextModule
2
2
3
-
Defined in: [modules/natural_language_processing/SpeechToTextModule.ts:16](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/SpeechToTextModule.ts#L16)
3
+
Defined in: [modules/natural_language_processing/SpeechToTextModule.ts:15](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/SpeechToTextModule.ts#L15)
4
4
5
5
Module for Speech to Text (STT) functionalities.
6
6
@@ -20,7 +20,7 @@ Module for Speech to Text (STT) functionalities.
Defined in: [modules/natural_language_processing/SpeechToTextModule.ts:91](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/SpeechToTextModule.ts#L91)
23
+
Defined in: [modules/natural_language_processing/SpeechToTextModule.ts:83](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/SpeechToTextModule.ts#L83)
24
24
25
25
Runs the decoder of the model.
26
26
@@ -50,7 +50,7 @@ Decoded output.
50
50
51
51
> **delete**(): `void`
52
52
53
-
Defined in: [modules/natural_language_processing/SpeechToTextModule.ts:69](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/SpeechToTextModule.ts#L69)
53
+
Defined in: [modules/natural_language_processing/SpeechToTextModule.ts:60](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/SpeechToTextModule.ts#L60)
Defined in: [modules/natural_language_processing/SpeechToTextModule.ts:80](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/SpeechToTextModule.ts#L80)
67
+
Defined in: [modules/natural_language_processing/SpeechToTextModule.ts:71](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/SpeechToTextModule.ts#L71)
68
68
69
69
Runs the encoding part of the model on the provided waveform.
Defined in: [modules/natural_language_processing/SpeechToTextModule.ts:27](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/SpeechToTextModule.ts#L27)
92
+
Defined in: [modules/natural_language_processing/SpeechToTextModule.ts:26](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/SpeechToTextModule.ts#L26)
93
93
94
94
Loads the model specified by the config object.
95
95
`onDownloadProgressCallback` allows you to monitor the current progress of the model download.
@@ -118,7 +118,7 @@ Optional callback to monitor download progress.
Defined in: [modules/natural_language_processing/SpeechToTextModule.ts:133](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/SpeechToTextModule.ts#L133)
121
+
Defined in: [modules/natural_language_processing/SpeechToTextModule.ts:124](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/SpeechToTextModule.ts#L124)
122
122
123
123
Starts a streaming transcription session.
124
124
Yields objects with `committed` and `nonCommitted` transcriptions.
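
A minimal sketch of consuming such a stream follows. It assumes that `committed` and `nonCommitted` are plain strings and that each update carries only newly committed text; both are assumptions made for illustration, and the real field types and semantics may differ. `StreamUpdateSketch`, `fakeStream`, and `collectTranscript` are hypothetical names, and the generator is a stand-in rather than the module's own method.

```typescript
// Assumed shape of one streaming update; the real field types may differ.
interface StreamUpdateSketch {
  committed: string; // text assumed to be final
  nonCommitted: string; // tentative text that may still change
}

// Stand-in generator that mimics a streaming session for demonstration only.
async function* fakeStream(): AsyncGenerator<StreamUpdateSketch> {
  yield { committed: '', nonCommitted: 'hel' };
  yield { committed: 'hello ', nonCommitted: 'wor' };
  yield { committed: 'world', nonCommitted: '' };
}

// Appends committed text as it arrives and keeps only the latest tentative text.
async function collectTranscript(
  updates: AsyncIterable<StreamUpdateSketch>
): Promise<{ finalText: string; pending: string }> {
  let finalText = '';
  let pending = '';
  for await (const update of updates) {
    finalText += update.committed;
    pending = update.nonCommitted;
  }
  return { finalText, pending };
}
```

In a UI, the committed text would typically be rendered as stable and the pending text as provisional.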
@@ -148,7 +148,7 @@ An async generator yielding transcription updates.
148
148
149
149
> **streamInsert**(`waveform`): `void`
150
150
151
-
Defined in: [modules/natural_language_processing/SpeechToTextModule.ts:206](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/SpeechToTextModule.ts#L206)
151
+
Defined in: [modules/natural_language_processing/SpeechToTextModule.ts:197](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/SpeechToTextModule.ts#L197)
152
152
153
153
Inserts a new audio chunk into the streaming transcription session.
154
154
@@ -170,7 +170,7 @@ The audio chunk to insert.
170
170
171
171
> **streamStop**(): `void`
172
172
173
-
Defined in: [modules/natural_language_processing/SpeechToTextModule.ts:213](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/SpeechToTextModule.ts#L213)
173
+
Defined in: [modules/natural_language_processing/SpeechToTextModule.ts:204](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/SpeechToTextModule.ts#L204)
174
174
175
175
Stops the current streaming transcription session.
176
176
@@ -184,7 +184,7 @@ Stops the current streaming transcription session.
Defined in: [modules/natural_language_processing/SpeechToTextModule.ts:109](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/SpeechToTextModule.ts#L109)
187
+
Defined in: [modules/natural_language_processing/SpeechToTextModule.ts:100](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/SpeechToTextModule.ts#L100)
188
188
189
189
Starts a transcription process for a given input array (16kHz waveform).
190
190
For multilingual models, specify the language in `options`.
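
To make the numbers concrete: at 16 kHz, the 30-second segment limit mentioned in the `useSpeechToText` docs corresponds to 480,000 samples. The sketch below shows that arithmetic with a naive fixed-length split. `splitWaveform` is a hypothetical helper written for this example; it is not claimed to be how the library segments audio internally.

```typescript
const SAMPLE_RATE_HZ = 16000; // input sample rate stated in the docs
const MAX_SEGMENT_SECONDS = 30; // segment limit mentioned in the hook docs
const MAX_SEGMENT_SAMPLES = SAMPLE_RATE_HZ * MAX_SEGMENT_SECONDS; // 480000

// Splits a waveform into consecutive views of at most `maxSamples` samples.
// `subarray` returns views on the same buffer, so no audio data is copied.
function splitWaveform(
  waveform: Float32Array,
  maxSamples: number = MAX_SEGMENT_SAMPLES
): Float32Array[] {
  const segments: Float32Array[] = [];
  for (let start = 0; start < waveform.length; start += maxSamples) {
    segments.push(waveform.subarray(start, start + maxSamples));
  }
  return segments;
}
```

For example, a 70-second recording (1,120,000 samples) would split into two full segments and one 10-second remainder.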
docs/docs/06-api-reference/classes/TextToSpeechModule.md
Lines changed: 64 additions & 7 deletions
@@ -1,6 +1,6 @@
1
1
# Class: TextToSpeechModule
2
2
3
-
Defined in: [modules/natural_language_processing/TextToSpeechModule.ts:17](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/TextToSpeechModule.ts#L17)
3
+
Defined in: [modules/natural_language_processing/TextToSpeechModule.ts:18](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/TextToSpeechModule.ts#L18)
4
4
5
5
Module for Text to Speech (TTS) functionalities.
6
6
@@ -20,7 +20,7 @@ Module for Text to Speech (TTS) functionalities.
20
20
21
21
> **nativeModule**: `any` = `null`
22
22
23
-
Defined in: [modules/natural_language_processing/TextToSpeechModule.ts:21](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/TextToSpeechModule.ts#L21)
23
+
Defined in: [modules/natural_language_processing/TextToSpeechModule.ts:22](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/TextToSpeechModule.ts#L22)
24
24
25
25
Native module instance
26
26
@@ -30,7 +30,7 @@ Native module instance
30
30
31
31
> **delete**(): `void`
32
32
33
-
Defined in: [modules/natural_language_processing/TextToSpeechModule.ts:182](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/TextToSpeechModule.ts#L182)
33
+
Defined in: [modules/natural_language_processing/TextToSpeechModule.ts:229](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/TextToSpeechModule.ts#L229)
Defined in: [modules/natural_language_processing/TextToSpeechModule.ts:109](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/TextToSpeechModule.ts#L109)
47
+
Defined in: [modules/natural_language_processing/TextToSpeechModule.ts:118](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/TextToSpeechModule.ts#L118)
48
48
49
49
Synthesizes the provided text into speech.
50
50
Returns a promise that resolves to the full audio waveform as a `Float32Array`.
@@ -71,11 +71,43 @@ A promise resolving to the synthesized audio waveform.
Defined in: [modules/natural_language_processing/TextToSpeechModule.ts:135](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/TextToSpeechModule.ts#L135)
79
+
80
+
Synthesizes pre-computed phonemes into speech, bypassing the built-in phonemizer.
81
+
This allows using an external G2P system (e.g. the Python `phonemizer` library,
82
+
espeak-ng, or any custom phonemizer).
83
+
84
+
#### Parameters
85
+
86
+
##### phonemes
87
+
88
+
`string`
89
+
90
+
The pre-computed IPA phoneme string.
91
+
92
+
##### speed?
93
+
94
+
`number` = `1.0`
95
+
96
+
Optional speed multiplier for the speech synthesis (default is 1.0).
97
+
98
+
#### Returns
99
+
100
+
`Promise`\<`Float32Array`\<`ArrayBufferLike`\>\>
101
+
102
+
A promise resolving to the synthesized audio waveform.
Defined in: [modules/natural_language_processing/TextToSpeechModule.ts:30](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/TextToSpeechModule.ts#L30)
110
+
Defined in: [modules/natural_language_processing/TextToSpeechModule.ts:31](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/TextToSpeechModule.ts#L31)
79
111
80
112
Loads the model and voice assets specified by the config object.
81
113
`onDownloadProgressCallback` allows you to monitor the current progress.
@@ -104,7 +136,7 @@ Optional callback to monitor download progress.
Defined in: [modules/natural_language_processing/TextToSpeechModule.ts:127](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/TextToSpeechModule.ts#L127)
139
+
Defined in: [modules/natural_language_processing/TextToSpeechModule.ts:196](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/TextToSpeechModule.ts#L196)
108
140
109
141
Starts a streaming synthesis session. Yields audio chunks as they are generated.
Defined in: [modules/natural_language_processing/TextToSpeechModule.ts:210](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/TextToSpeechModule.ts#L210)
164
+
165
+
Starts a streaming synthesis session from pre-computed phonemes.
166
+
Bypasses the built-in phonemizer, allowing use of external G2P systems.
An async generator yielding Float32Array audio chunks.
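
One common way to use a generator of `Float32Array` chunks is to gather them into a single waveform once the stream ends. The sketch below shows that pattern with a stand-in generator; `fakeAudioStream`, `concatChunks`, and `collectAudio` are hypothetical names for this example and are not part of the documented API.

```typescript
// Joins audio chunks into one contiguous waveform.
function concatChunks(chunks: Float32Array[]): Float32Array {
  const total = chunks.reduce((sum, chunk) => sum + chunk.length, 0);
  const out = new Float32Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    out.set(chunk, offset); // copy this chunk at the current write position
    offset += chunk.length;
  }
  return out;
}

// Drains any async source of audio chunks and returns the joined waveform.
async function collectAudio(
  stream: AsyncIterable<Float32Array>
): Promise<Float32Array> {
  const chunks: Float32Array[] = [];
  for await (const chunk of stream) {
    chunks.push(chunk);
  }
  return concatChunks(chunks);
}

// Stand-in for a streaming synthesis session; sample values are arbitrary.
async function* fakeAudioStream(): AsyncGenerator<Float32Array> {
  yield new Float32Array([0.25, 0.5]);
  yield new Float32Array([0.75]);
}
```

For low-latency playback, each chunk would instead be handed to the audio player as soon as it arrives rather than buffered until the end.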
181
+
182
+
---
183
+
127
184
### streamStop()
128
185
129
186
> **streamStop**(): `void`
130
187
131
-
Defined in: [modules/natural_language_processing/TextToSpeechModule.ts:175](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/TextToSpeechModule.ts#L175)
188
+
Defined in: [modules/natural_language_processing/TextToSpeechModule.ts:222](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/TextToSpeechModule.ts#L222)
132
189
133
190
Stops the ongoing streaming process, if any.
Defined in: [hooks/natural_language_processing/useTextToSpeech.ts:19](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/hooks/natural_language_processing/useTextToSpeech.ts#L19)
5
+
Defined in: [hooks/natural_language_processing/useTextToSpeech.ts:22](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/hooks/natural_language_processing/useTextToSpeech.ts#L22)
Defined in: [types/stt.ts:277](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/types/stt.ts#L277)
13
+
Defined in: [types/stt.ts:269](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/types/stt.ts#L269)
14
14
15
-
A string that specifies the location of a `.pte` file for the decoder.
15
+
A boolean flag indicating whether the model supports multiple languages.
Defined in: [types/stt.ts:276](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/types/stt.ts#L276)
22
24
23
-
Defined in: [types/stt.ts:272](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/types/stt.ts#L272)
25
+
A string that specifies the location of a `.pte` file for the model.
24
26
25
-
A string that specifies the location of a `.pte` file for the encoder.
27
+
We expect the model to have 2 bundled methods: 'decode' and 'encode'.
Defined in: [types/stt.ts:267](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/types/stt.ts#L267)
35
+
Defined in: [types/stt.ts:281](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/types/stt.ts#L281)
34
36
35
-
A boolean flag indicating whether the model supports multiple languages.
37
+
A string that specifies the location of the tokenizer for the model.
Defined in: [types/stt.ts:282](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/types/stt.ts#L282)
43
+
> **type**: `"whisper"`
44
44
45
-
A string that specifies the location of the tokenizer for the model.
45
+
Defined in: [types/stt.ts:264](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/types/stt.ts#L264)