voice-chat-transcriber

Automatically joins voice channels and transcribes speech in real time using local, offline speech recognition. Transcriptions are posted as embeds in a #logs-vc channel with per-user attribution (name and avatar). Supports running multiple bot instances in parallel to cover more than one voice channel at a time.

How It Works

A user joins a voice channel.
An available bot claims the channel and joins it.
Each user's audio is captured separately and streamed through Vosk for recognition.
Completed transcriptions are posted to #logs-vc as an embed attributed to that user.
When the channel is empty, the bot leaves and frees itself up for other channels.

Logs are only stored for a week on Discord and then automatically cleaned up.

Vosk Model

The bot requires a Vosk speech recognition model. Two recommended options:

Model	Size	Word Error Rate	Notes
`vosk-model-small-en-us-0.15`	~40 MB	~9.85%	Fast, low resource usage
`vosk-model-en-us-0.22`	~1.8 GB	~5.69%	Higher accuracy, recommended

Download from alphacephei.com/vosk/models and extract to a model/ directory in the project root.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

voice-chat-transcriber

How It Works

Vosk Model

FilesExpand file tree

README.md

Latest commit

History

README.md

File metadata and controls

voice-chat-transcriber

How It Works

Vosk Model