fix!: speech to text live transcription#816

Merged
IgorSwat merged 24 commits into main from @is/speech-to-text
Mar 11, 2026

Conversation

@IgorSwat
Contributor

@IgorSwat IgorSwat commented Feb 17, 2026

Description

Various improvements and adjustments to the Speech-to-Text module. The changes include:

  • Adjusting the native implementation to the new format of Whisper models (single file, bundled encode & decode methods)
  • Refactoring the native implementation to support multiple STT models in the future
  • Fixing incorrect behavior of Whisper streaming

Introduces a breaking change?

  • Yes
  • No

Type of change

  • Bug fix (change which fixes an issue)
  • New feature (change which adds functionality)
  • Documentation update (improves or adds clarity to existing documentation)
  • Other (chores, tests, code style improvements etc.)

Tested on

  • iOS
  • Android

Testing instructions

You can run the tests defined for the Speech-to-Text module, as well as test it manually with the 'speech' demo app (SpeechToText screen).

Screenshots

Related issues

Checklist

  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have updated the documentation accordingly
  • My changes generate no new warnings

Additional notes

Comment thread packages/react-native-executorch/src/constants/modelUrls.ts Outdated
@msluszniak msluszniak added the bug fix label (PRs that are fixing bugs) Feb 20, 2026
@msluszniak msluszniak linked an issue Feb 20, 2026 that may be closed by this pull request
@msluszniak msluszniak changed the title from "@is/speech to text" to "fix: speech to text live transcription" Feb 20, 2026
@IgorSwat IgorSwat force-pushed the @is/speech-to-text branch from 7b1e6ff to 2ee6d1d Compare March 2, 2026 09:21
Member

@msluszniak msluszniak left a comment

Some comments are not needed imo

Collaborator

@chmjkb chmjkb left a comment

Overall solid work, thanks 👏🏻
Left a couple of nits

Comment thread packages/react-native-executorch/src/constants/modelUrls.ts Outdated
Comment thread packages/react-native-executorch/src/constants/modelUrls.ts
Comment thread packages/react-native-executorch/src/constants/modelUrls.ts Outdated
Collaborator

@chmjkb chmjkb left a comment

Two more things:

  1. I wasn't able to compile the app for Android (due to Norbert bumping minSdkVersion in RNET). You have to bump the minSdkVersion in the example app.
  2. Once compiled, it doesn't ask for mic permissions (I'm using a Pixel 10) and silently fails.

@IgorSwat IgorSwat force-pushed the @is/speech-to-text branch 3 times, most recently from 816c75a to ae017ef Compare March 6, 2026 16:12
Collaborator

@chmjkb chmjkb left a comment

I think you should change the TS side, since you're returning a different thing from C++, for example:

  public async encode(waveform: Float32Array): Promise<Float32Array> {
    return new Float32Array(await this.nativeModule.encode(waveform));
  }

Also, why did you switch back to type in SpeechToTextModelConfig?
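
To illustrate the kind of mismatch described above, here is a minimal sketch. It assumes, purely for illustration, that the native `encode` may resolve to an `ArrayBuffer`, a plain `number[]`, or a `Float32Array`; the actual return type of the C++ side is not shown in this thread. `NativeEncodeResult`, `NativeSpeechToText`, `toFloat32Array`, and `SpeechToTextWrapper` are hypothetical names, not the library's API.

```typescript
// Hypothetical union of shapes the native layer might return (an assumption).
type NativeEncodeResult = ArrayBuffer | number[] | Float32Array;

// Hypothetical native binding interface, standing in for the real module.
interface NativeSpeechToText {
  encode(waveform: Float32Array): Promise<NativeEncodeResult>;
}

// Normalize whatever the native layer returned into a Float32Array.
function toFloat32Array(result: NativeEncodeResult): Float32Array {
  if (result instanceof Float32Array) {
    return result; // already the right type, no copy needed
  }
  if (result instanceof ArrayBuffer) {
    return new Float32Array(result); // view over the raw bytes
  }
  return Float32Array.from(result); // copy from a plain number array
}

class SpeechToTextWrapper {
  private readonly nativeModule: NativeSpeechToText;

  constructor(nativeModule: NativeSpeechToText) {
    this.nativeModule = nativeModule;
  }

  // The declared return type stays stable even if the native side changes.
  public async encode(waveform: Float32Array): Promise<Float32Array> {
    return toFloat32Array(await this.nativeModule.encode(waveform));
  }
}
```

Keeping the conversion in one helper means the declared TypeScript return type only needs to be revisited in a single place if the native return value changes again.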

Member

@msluszniak msluszniak left a comment

And rebase

@msluszniak
Member

@IgorSwat could you fix the lint and the API reference? Then I will run the tests and the speech demo app.

Collaborator

@chmjkb chmjkb left a comment

Please update the docs (not the API reference) as they're outdated now.

@IgorSwat
Contributor Author

Hot take: can we just remove the API reference from the repo, please?

@msluszniak
Member

Hot take: can we just remove the API reference from the repo, please?

No, why would we do that?

@IgorSwat IgorSwat force-pushed the @is/speech-to-text branch 2 times, most recently from acd7682 to 04a71b4 Compare March 10, 2026 11:06
@IgorSwat
Contributor Author

IgorSwat commented Mar 10, 2026

Hot take: can we just remove the API reference from the repo, please?

No, why would we do that?

Okay, never mind, it passed after just 2 rebases. Could be worse :)

@msluszniak
Member

msluszniak commented Mar 10, 2026

Okay, never mind, it passed after just 2 rebases. Could be worse :)

It will always pass after a rebase, I can guarantee you. So really it's not that bad: you can handle it once at the very end of the review process, and it's just one command, with the big benefit of having a complete API reference. The first try didn't work because the signature of the updated Zod package changed, and then you changed the code, so the lines had mismatches ;)

@IgorSwat IgorSwat force-pushed the @is/speech-to-text branch from 1e9d17e to 3874dae Compare March 10, 2026 15:49
@IgorSwat IgorSwat force-pushed the @is/speech-to-text branch from 3874dae to b75f178 Compare March 10, 2026 16:23
@IgorSwat IgorSwat changed the title from "fix: speech to text live transcription" to "fix!: speech to text live transcription" Mar 10, 2026
Member

@msluszniak msluszniak left a comment

Brilliant work on this one 🚀

Collaborator

@chmjkb chmjkb left a comment

One minor comment, overall great work 👏🏻

Comment thread packages/react-native-executorch/src/index.ts Outdated
@chmjkb
Collaborator

chmjkb commented Mar 11, 2026

Also please get rid of the API reference files

@IgorSwat
Contributor Author

Also please get rid of the API reference files

Done.

@IgorSwat IgorSwat merged commit 6dd8fb6 into main Mar 11, 2026
5 checks passed
@IgorSwat IgorSwat deleted the @is/speech-to-text branch March 11, 2026 13:41
Labels

bug fix (PRs that are fixing bugs)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Crash during live transcription in Speech-to-Text demo app
Fix Speech to Text streaming mode

3 participants