software-mansion
diff --git a/docs/docs/03-hooks/01-natural-language-processing/useSpeechToText.md b/docs/docs/03-hooks/01-natural-language-processing/useSpeechToText.md
Lines changed: 1 addition & 1 deletion
diff --git a/docs/docs/04-typescript-api/01-natural-language-processing/SpeechToTextModule.md b/docs/docs/04-typescript-api/01-natural-language-processing/SpeechToTextModule.md
Lines changed: 1 addition & 3 deletions
diff --git a/docs/docs/06-api-reference/classes/SpeechToTextModule.md b/docs/docs/06-api-reference/classes/SpeechToTextModule.md
Lines changed: 9 additions & 9 deletions
diff --git a/docs/docs/06-api-reference/classes/TextToSpeechModule.md b/docs/docs/06-api-reference/classes/TextToSpeechModule.md
Lines changed: 64 additions & 7 deletions
diff --git a/docs/docs/06-api-reference/functions/useTextToSpeech.md b/docs/docs/06-api-reference/functions/useTextToSpeech.md
Lines changed: 1 addition & 1 deletion
diff --git a/docs/docs/06-api-reference/index.md b/docs/docs/06-api-reference/index.md
Lines changed: 3 additions & 1 deletion
diff --git a/docs/docs/06-api-reference/interfaces/SpeechToTextModelConfig.md b/docs/docs/06-api-reference/interfaces/SpeechToTextModelConfig.md
Lines changed: 17 additions & 17 deletions
@@ -66,7 +66,7 @@ Since speech-to-text models can only process audio segments up to 30 seconds lon
 
 `useSpeechToText` takes [`SpeechToTextProps`](../../06-api-reference/interfaces/SpeechToTextProps.md) that consists of:
 
-- `model` of type [`SpeechToTextConfig`](../../06-api-reference/interfaces/SpeechToTextModelConfig.md), containing the [`isMultilingual` flag](../../06-api-reference/interfaces/SpeechToTextModelConfig.md#ismultilingual), [tokenizer source](../../06-api-reference/interfaces/SpeechToTextModelConfig.md#tokenizersource), [encoder source](../../06-api-reference/interfaces/SpeechToTextModelConfig.md#encodersource), and [decoder source](../../06-api-reference/interfaces/SpeechToTextModelConfig.md#decodersource).
+- `model` of type [`SpeechToTextConfig`](../../06-api-reference/interfaces/SpeechToTextModelConfig.md), containing the [`isMultilingual` flag](../../06-api-reference/interfaces/SpeechToTextModelConfig.md#ismultilingual), [tokenizer source](../../06-api-reference/interfaces/SpeechToTextModelConfig.md#tokenizersource) and [model source](../../06-api-reference/interfaces/SpeechToTextModelConfig.md#modelsource).
 - An optional flag [`preventLoad`](../../06-api-reference/interfaces/SpeechToTextProps.md#preventload) which prevents auto-loading of the model.
 
 You need more details? Check the following resources:
 
@@ -45,9 +45,7 @@ Create an instance of [`SpeechToTextModule`](../../06-api-reference/classes/Spee
 - [`model`](../../06-api-reference/classes/SpeechToTextModule.md#model) - Object containing:
   - [`isMultilingual`](../../06-api-reference/interfaces/SpeechToTextModelConfig.md#ismultilingual) - Flag indicating if model is multilingual.
 
-  - [`encoderSource`](../../06-api-reference/interfaces/SpeechToTextModelConfig.md#encodersource) - The location of the used encoder.
-
-  - [`decoderSource`](../../06-api-reference/interfaces/SpeechToTextModelConfig.md#decodersource) - The location of the used decoder.
+  - [`modelSource`](../../06-api-reference/interfaces/SpeechToTextModelConfig.md#modelsource) - The location of the used model (bundled encoder + decoder functionality).
 
   - [`tokenizerSource`](../../06-api-reference/interfaces/SpeechToTextModelConfig.md#tokenizersource) - The location of the used tokenizer.
 
 
@@ -1,6 +1,6 @@
 # Class: SpeechToTextModule
 
-Defined in: [modules/natural_language_processing/SpeechToTextModule.ts:16](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/SpeechToTextModule.ts#L16)
+Defined in: [modules/natural_language_processing/SpeechToTextModule.ts:15](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/SpeechToTextModule.ts#L15)
 
 Module for Speech to Text (STT) functionalities.
 
@@ -20,7 +20,7 @@ Module for Speech to Text (STT) functionalities.
 
 > **decode**(`tokens`, `encoderOutput`): `Promise`\<`Float32Array`\<`ArrayBufferLike`\>\>
 
-Defined in: [modules/natural_language_processing/SpeechToTextModule.ts:91](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/SpeechToTextModule.ts#L91)
+Defined in: [modules/natural_language_processing/SpeechToTextModule.ts:83](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/SpeechToTextModule.ts#L83)
 
 Runs the decoder of the model.
 
@@ -50,7 +50,7 @@ Decoded output.
 
 > **delete**(): `void`
 
-Defined in: [modules/natural_language_processing/SpeechToTextModule.ts:69](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/SpeechToTextModule.ts#L69)
+Defined in: [modules/natural_language_processing/SpeechToTextModule.ts:60](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/SpeechToTextModule.ts#L60)
 
 Unloads the model from memory.
 
@@ -64,7 +64,7 @@ Unloads the model from memory.
 
 > **encode**(`waveform`): `Promise`\<`Float32Array`\<`ArrayBufferLike`\>\>
 
-Defined in: [modules/natural_language_processing/SpeechToTextModule.ts:80](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/SpeechToTextModule.ts#L80)
+Defined in: [modules/natural_language_processing/SpeechToTextModule.ts:71](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/SpeechToTextModule.ts#L71)
 
 Runs the encoding part of the model on the provided waveform.
 Returns the encoded waveform as a Float32Array.
@@ -89,7 +89,7 @@ The encoded output.
 
 > **load**(`model`, `onDownloadProgressCallback?`): `Promise`\<`void`\>
 
-Defined in: [modules/natural_language_processing/SpeechToTextModule.ts:27](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/SpeechToTextModule.ts#L27)
+Defined in: [modules/natural_language_processing/SpeechToTextModule.ts:26](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/SpeechToTextModule.ts#L26)
 
 Loads the model specified by the config object.
 `onDownloadProgressCallback` allows you to monitor the current progress of the model download.
@@ -118,7 +118,7 @@ Optional callback to monitor download progress.
 
 > **stream**(`options?`): `AsyncGenerator`\<\{ `committed`: [`TranscriptionResult`](../interfaces/TranscriptionResult.md); `nonCommitted`: [`TranscriptionResult`](../interfaces/TranscriptionResult.md); \}\>
 
-Defined in: [modules/natural_language_processing/SpeechToTextModule.ts:133](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/SpeechToTextModule.ts#L133)
+Defined in: [modules/natural_language_processing/SpeechToTextModule.ts:124](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/SpeechToTextModule.ts#L124)
 
 Starts a streaming transcription session.
 Yields objects with `committed` and `nonCommitted` transcriptions.
@@ -148,7 +148,7 @@ An async generator yielding transcription updates.
 
 > **streamInsert**(`waveform`): `void`
 
-Defined in: [modules/natural_language_processing/SpeechToTextModule.ts:206](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/SpeechToTextModule.ts#L206)
+Defined in: [modules/natural_language_processing/SpeechToTextModule.ts:197](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/SpeechToTextModule.ts#L197)
 
 Inserts a new audio chunk into the streaming transcription session.
 
@@ -170,7 +170,7 @@ The audio chunk to insert.
 
 > **streamStop**(): `void`
 
-Defined in: [modules/natural_language_processing/SpeechToTextModule.ts:213](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/SpeechToTextModule.ts#L213)
+Defined in: [modules/natural_language_processing/SpeechToTextModule.ts:204](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/SpeechToTextModule.ts#L204)
 
 Stops the current streaming transcription session.
 
@@ -184,7 +184,7 @@ Stops the current streaming transcription session.
 
 > **transcribe**(`waveform`, `options?`): `Promise`\<[`TranscriptionResult`](../interfaces/TranscriptionResult.md)\>
 
-Defined in: [modules/natural_language_processing/SpeechToTextModule.ts:109](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/SpeechToTextModule.ts#L109)
+Defined in: [modules/natural_language_processing/SpeechToTextModule.ts:100](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/SpeechToTextModule.ts#L100)
 
 Starts a transcription process for a given input array (16kHz waveform).
 For multilingual models, specify the language in `options`.
 
@@ -1,6 +1,6 @@
 # Class: TextToSpeechModule
 
-Defined in: [modules/natural_language_processing/TextToSpeechModule.ts:17](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/TextToSpeechModule.ts#L17)
+Defined in: [modules/natural_language_processing/TextToSpeechModule.ts:18](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/TextToSpeechModule.ts#L18)
 
 Module for Text to Speech (TTS) functionalities.
 
@@ -20,7 +20,7 @@ Module for Text to Speech (TTS) functionalities.
 
 > **nativeModule**: `any` = `null`
 
-Defined in: [modules/natural_language_processing/TextToSpeechModule.ts:21](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/TextToSpeechModule.ts#L21)
+Defined in: [modules/natural_language_processing/TextToSpeechModule.ts:22](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/TextToSpeechModule.ts#L22)
 
 Native module instance
 
@@ -30,7 +30,7 @@ Native module instance
 
 > **delete**(): `void`
 
-Defined in: [modules/natural_language_processing/TextToSpeechModule.ts:182](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/TextToSpeechModule.ts#L182)
+Defined in: [modules/natural_language_processing/TextToSpeechModule.ts:229](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/TextToSpeechModule.ts#L229)
 
 Unloads the model from memory.
 
@@ -44,7 +44,7 @@ Unloads the model from memory.
 
 > **forward**(`text`, `speed?`): `Promise`\<`Float32Array`\<`ArrayBufferLike`\>\>
 
-Defined in: [modules/natural_language_processing/TextToSpeechModule.ts:109](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/TextToSpeechModule.ts#L109)
+Defined in: [modules/natural_language_processing/TextToSpeechModule.ts:118](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/TextToSpeechModule.ts#L118)
 
 Synthesizes the provided text into speech.
 Returns a promise that resolves to the full audio waveform as a `Float32Array`.
@@ -71,11 +71,43 @@ A promise resolving to the synthesized audio waveform.
 
 ---
 
+### forwardFromPhonemes()
+
+> **forwardFromPhonemes**(`phonemes`, `speed?`): `Promise`\<`Float32Array`\<`ArrayBufferLike`\>\>
+
+Defined in: [modules/natural_language_processing/TextToSpeechModule.ts:135](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/TextToSpeechModule.ts#L135)
+
+Synthesizes pre-computed phonemes into speech, bypassing the built-in phonemizer.
+This allows using an external G2P system (e.g. the Python `phonemizer` library,
+espeak-ng, or any custom phonemizer).
+
+#### Parameters
+
+##### phonemes
+
+`string`
+
+The pre-computed IPA phoneme string.
+
+##### speed?
+
+`number` = `1.0`
+
+Optional speed multiplier for the speech synthesis (default is 1.0).
+
+#### Returns
+
+`Promise`\<`Float32Array`\<`ArrayBufferLike`\>\>
+
+A promise resolving to the synthesized audio waveform.
+
+---
+
 ### load()
 
 > **load**(`config`, `onDownloadProgressCallback?`): `Promise`\<`void`\>
 
-Defined in: [modules/natural_language_processing/TextToSpeechModule.ts:30](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/TextToSpeechModule.ts#L30)
+Defined in: [modules/natural_language_processing/TextToSpeechModule.ts:31](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/TextToSpeechModule.ts#L31)
 
 Loads the model and voice assets specified by the config object.
 `onDownloadProgressCallback` allows you to monitor the current progress.
@@ -104,7 +136,7 @@ Optional callback to monitor download progress.
 
 > **stream**(`input`): `AsyncGenerator`\<`Float32Array`\<`ArrayBufferLike`\>\>
 
-Defined in: [modules/natural_language_processing/TextToSpeechModule.ts:127](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/TextToSpeechModule.ts#L127)
+Defined in: [modules/natural_language_processing/TextToSpeechModule.ts:196](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/TextToSpeechModule.ts#L196)
 
 Starts a streaming synthesis session. Yields audio chunks as they are generated.
 
@@ -124,11 +156,36 @@ An async generator yielding Float32Array audio chunks.
 
 ---
 
+### streamFromPhonemes()
+
+> **streamFromPhonemes**(`input`): `AsyncGenerator`\<`Float32Array`\<`ArrayBufferLike`\>\>
+
+Defined in: [modules/natural_language_processing/TextToSpeechModule.ts:210](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/TextToSpeechModule.ts#L210)
+
+Starts a streaming synthesis session from pre-computed phonemes.
+Bypasses the built-in phonemizer, allowing use of external G2P systems.
+
+#### Parameters
+
+##### input
+
+[`TextToSpeechStreamingPhonemeInput`](../interfaces/TextToSpeechStreamingPhonemeInput.md)
+
+Input object containing phonemes and optional speed.
+
+#### Returns
+
+`AsyncGenerator`\<`Float32Array`\<`ArrayBufferLike`\>\>
+
+An async generator yielding Float32Array audio chunks.
+
+---
+
 ### streamStop()
 
 > **streamStop**(): `void`
 
-Defined in: [modules/natural_language_processing/TextToSpeechModule.ts:175](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/TextToSpeechModule.ts#L175)
+Defined in: [modules/natural_language_processing/TextToSpeechModule.ts:222](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/modules/natural_language_processing/TextToSpeechModule.ts#L222)
 
 Stops the streaming process if there is any ongoing.
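
Note on the two phoneme entry points added in this file: they mirror `forward` and `stream`, so calling code can likely follow the same pattern. The sketch below is typed against a local interface that copies only the signatures documented in this diff, so it does not import the library. The `phonemes` and `speed` field names on the streaming input are assumptions based on the description "phonemes and optional speed", and `PhonemeSynthesizer`, `concatChunks`, `synthesizeOnce`, and `synthesizeStreamed` are hypothetical names used for illustration only.

```typescript
// Assumed shape of TextToSpeechStreamingPhonemeInput ("phonemes and optional
// speed" per the docs in this diff); the real interface may carry more fields.
interface PhonemeStreamInput {
  phonemes: string;
  speed?: number;
}

// Only the two methods documented in this diff, so the sketch compiles
// without the library. A TextToSpeechModule instance should satisfy this
// interface if the assumed input shape holds.
interface PhonemeSynthesizer {
  forwardFromPhonemes(phonemes: string, speed?: number): Promise<Float32Array>;
  streamFromPhonemes(input: PhonemeStreamInput): AsyncGenerator<Float32Array>;
}

// Joins streamed audio chunks into one contiguous waveform.
function concatChunks(chunks: Float32Array[]): Float32Array {
  const total = chunks.reduce((sum, chunk) => sum + chunk.length, 0);
  const waveform = new Float32Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    waveform.set(chunk, offset);
    offset += chunk.length;
  }
  return waveform;
}

// One-shot synthesis from phonemes produced by an external G2P tool.
async function synthesizeOnce(
  tts: PhonemeSynthesizer,
  phonemes: string
): Promise<Float32Array> {
  return tts.forwardFromPhonemes(phonemes, 1.0);
}

// Streaming synthesis that collects chunks as they arrive and returns
// the full waveform once the generator finishes.
async function synthesizeStreamed(
  tts: PhonemeSynthesizer,
  input: PhonemeStreamInput
): Promise<Float32Array> {
  const chunks: Float32Array[] = [];
  for await (const chunk of tts.streamFromPhonemes(input)) {
    chunks.push(chunk);
  }
  return concatChunks(chunks);
}
```

In a real app the chunks would more likely be handed to an audio player as they arrive rather than buffered; buffering is used here only to keep the sketch short. `streamStop()` remains the documented way to end a session early.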
 
 
@@ -2,7 +2,7 @@
 
 > **useTextToSpeech**(`TextToSpeechProps`): [`TextToSpeechType`](../interfaces/TextToSpeechType.md)
 
-Defined in: [hooks/natural_language_processing/useTextToSpeech.ts:19](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/hooks/natural_language_processing/useTextToSpeech.ts#L19)
+Defined in: [hooks/natural_language_processing/useTextToSpeech.ts:22](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/hooks/natural_language_processing/useTextToSpeech.ts#L22)
 
 React hook for managing Text to Speech instance.
 
 
@@ -101,7 +101,6 @@
 - [WHISPER_SMALL_EN](variables/WHISPER_SMALL_EN.md)
 - [WHISPER_TINY](variables/WHISPER_TINY.md)
 - [WHISPER_TINY_EN](variables/WHISPER_TINY_EN.md)
-- [WHISPER_TINY_EN_QUANTIZED](variables/WHISPER_TINY_EN_QUANTIZED.md)
 
 ## Models - Style Transfer
 
@@ -262,8 +261,11 @@
 - [TextToImageType](interfaces/TextToImageType.md)
 - [TextToSpeechConfig](interfaces/TextToSpeechConfig.md)
 - [TextToSpeechInput](interfaces/TextToSpeechInput.md)
+- [TextToSpeechPhonemeInput](interfaces/TextToSpeechPhonemeInput.md)
 - [TextToSpeechProps](interfaces/TextToSpeechProps.md)
+- [TextToSpeechStreamingCallbacks](interfaces/TextToSpeechStreamingCallbacks.md)
 - [TextToSpeechStreamingInput](interfaces/TextToSpeechStreamingInput.md)
+- [TextToSpeechStreamingPhonemeInput](interfaces/TextToSpeechStreamingPhonemeInput.md)
 - [TextToSpeechType](interfaces/TextToSpeechType.md)
 - [TokenizerProps](interfaces/TokenizerProps.md)
 - [TokenizerType](interfaces/TokenizerType.md)
 
@@ -6,40 +6,40 @@ Configuration for Speech to Text model.
 
 ## Properties
 
-### decoderSource
+### isMultilingual
 
-> **decoderSource**: [`ResourceSource`](../type-aliases/ResourceSource.md)
+> **isMultilingual**: `boolean`
 
-Defined in: [types/stt.ts:277](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/types/stt.ts#L277)
+Defined in: [types/stt.ts:269](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/types/stt.ts#L269)
 
-A string that specifies the location of a `.pte` file for the decoder.
+A boolean flag indicating whether the model supports multiple languages.
 
 ---
 
-### encoderSource
+### modelSource
+
+> **modelSource**: [`ResourceSource`](../type-aliases/ResourceSource.md)
 
-> **encoderSource**: [`ResourceSource`](../type-aliases/ResourceSource.md)
+Defined in: [types/stt.ts:276](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/types/stt.ts#L276)
 
-Defined in: [types/stt.ts:272](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/types/stt.ts#L272)
+A string that specifies the location of a `.pte` file for the model.
 
-A string that specifies the location of a `.pte` file for the encoder.
+We expect the model to have 2 bundled methods: 'decode' and 'encode'.
 
 ---
 
-### isMultilingual
+### tokenizerSource
 
-> **isMultilingual**: `boolean`
+> **tokenizerSource**: [`ResourceSource`](../type-aliases/ResourceSource.md)
 
-Defined in: [types/stt.ts:267](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/types/stt.ts#L267)
+Defined in: [types/stt.ts:281](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/types/stt.ts#L281)
 
-A boolean flag indicating whether the model supports multiple languages.
+A string that specifies the location to the tokenizer for the model.
 
 ---
 
-### tokenizerSource
+### type
 
-> **tokenizerSource**: [`ResourceSource`](../type-aliases/ResourceSource.md)
-
-Defined in: [types/stt.ts:282](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/types/stt.ts#L282)
+> **type**: `"whisper"`
 
-A string that specifies the location to the tokenizer for the model.
+Defined in: [types/stt.ts:264](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/types/stt.ts#L264)
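
Note for readers migrating an existing config: the interface change above amounts to a single bundled `modelSource` replacing the `encoderSource`/`decoderSource` pair, plus a `type` discriminant. The following is a minimal sketch of the new shape, not the library's own code. It assumes `ResourceSource` can be approximated locally as `string | number`; the helper names `isLegacySttConfig` and `assertNewSttConfig` and the `example.com` URLs are hypothetical placeholders.

```typescript
// Local stand-in for the library's ResourceSource alias (assumption: a URL or
// path string, or a bundled asset id). Import the real type in application code.
type ResourceSource = string | number;

// Shape of the config after this change, mirrored from the docs in this diff.
interface SpeechToTextModelConfig {
  type: 'whisper';
  isMultilingual: boolean;
  modelSource: ResourceSource; // single .pte bundling 'encode' and 'decode'
  tokenizerSource: ResourceSource;
}

// Hypothetical guard: detects configs still written against the removed
// encoderSource/decoderSource fields.
function isLegacySttConfig(config: object): boolean {
  return 'encoderSource' in config || 'decoderSource' in config;
}

// Hypothetical validator: rejects legacy configs and checks only that the two
// source fields are present before returning the config typed as the new shape.
function assertNewSttConfig(config: object): SpeechToTextModelConfig {
  if (isLegacySttConfig(config)) {
    throw new Error(
      'encoderSource/decoderSource were replaced by a single modelSource'
    );
  }
  const candidate = config as Partial<SpeechToTextModelConfig>;
  if (
    candidate.modelSource === undefined ||
    candidate.tokenizerSource === undefined
  ) {
    throw new Error('modelSource and tokenizerSource are required');
  }
  return candidate as SpeechToTextModelConfig;
}

// Placeholder URLs, not real model locations.
const exampleConfig: SpeechToTextModelConfig = {
  type: 'whisper',
  isMultilingual: false,
  modelSource: 'https://example.com/whisper_tiny_en.pte',
  tokenizerSource: 'https://example.com/tokenizer.json',
};
```

A legacy config cannot be converted mechanically, because the old encoder and decoder lived in two separate `.pte` files while the new field expects one file exposing both methods; the guard therefore fails loudly instead of guessing.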