Bridge real-time phone call audio from Asterisk or FreeSWITCH into Deepgram's live speech-to-text API. This example shows how to capture RTP/PCM audio from a PBX and stream it over a WebSocket to Deepgram for real-time transcription.
A Python WebSocket server that acts as a bridge between your PBX and Deepgram. Incoming calls on Asterisk (via AudioSocket) or FreeSWITCH (via mod_audio_stream) send their audio to this server, which forwards it to Deepgram's streaming STT API and prints live transcripts to the console.
- Python 3.11+
- Deepgram account — get a free API key
- Asterisk 16+ with the app_audiosocket module, or FreeSWITCH with mod_audio_stream
| Variable | Where to find it |
|---|---|
| DEEPGRAM_API_KEY | Deepgram console |
Copy .env.example to .env and fill in your values.
cd examples/260-asterisk-freeswitch-deepgram-stt-python
pip install -r requirements.txt
cp .env.example .env
# Edit .env and add your DEEPGRAM_API_KEY
python src/bridge.py

The bridge listens on ws://0.0.0.0:8765 by default. Use --port to change it.
Add to your Asterisk dialplan (extensions.conf) to route call audio to the bridge:
[transcribe]
exten => _X.,1,Answer()
same => n,AudioSocket(ws://bridge-host:8765/asterisk)
same => n,Hangup()

Asterisk AudioSocket sends signed-linear 16-bit PCM at 8 kHz mono by default. The bridge parses AudioSocket's TLV (type-length-value) framing to extract audio frames.
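To make the TLV framing concrete, here is a minimal sketch of a parser, not the code in bridge.py. It assumes the layout described in the Asterisk AudioSocket documentation: a 1-byte type, a 2-byte big-endian payload length, then the payload. The type constants and the helper names `iter_tlv` and `extract_audio` are illustrative assumptions; check them against your Asterisk version.

```python
import struct
from typing import Iterator, Tuple

# Message kinds as listed in the Asterisk AudioSocket documentation
# (an assumption here; verify against your Asterisk version).
KIND_HANGUP = 0x00
KIND_UUID = 0x01
KIND_AUDIO = 0x10
KIND_ERROR = 0xFF

HEADER = struct.Struct("!BH")  # 1-byte type, 2-byte big-endian payload length


def iter_tlv(buffer: bytearray) -> Iterator[Tuple[int, bytes]]:
    """Yield (kind, payload) for each complete frame, consuming it from buffer.

    Incomplete trailing bytes stay in the buffer until more data arrives.
    """
    while len(buffer) >= HEADER.size:
        kind, length = HEADER.unpack_from(buffer, 0)
        end = HEADER.size + length
        if len(buffer) < end:
            break  # payload not fully received yet
        payload = bytes(buffer[HEADER.size:end])
        del buffer[:end]
        yield kind, payload


def extract_audio(buffer: bytearray) -> bytes:
    """Return the concatenated PCM from all complete audio frames in buffer."""
    return b"".join(p for kind, p in iter_tlv(buffer) if kind == KIND_AUDIO)
```

A caller would append each received chunk to a per-call `bytearray`, call `extract_audio` on it, and forward the returned PCM. Any partial frame stays in the buffer for the next chunk.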
Add to your FreeSWITCH dialplan to stream call audio to the bridge:
<action application="answer"/>
<action application="audio_stream" data="ws://bridge-host:8765/freeswitch 16000 mono L16"/>

FreeSWITCH mod_audio_stream sends raw PCM frames directly, with no framing protocol: each WebSocket message is just binary audio.
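Because raw PCM carries no header, the only way to know how much audio a message holds is to derive it from the byte count and the stream parameters. The sketch below shows that arithmetic for 16-bit mono audio. The function name `pcm_duration_ms` is hypothetical and is not taken from bridge.py.

```python
def pcm_duration_ms(
    num_bytes: int,
    sample_rate: int = 16000,
    channels: int = 1,
    sample_width: int = 2,  # bytes per sample; 2 for 16-bit linear PCM
) -> float:
    """Return the duration in milliseconds of a raw PCM chunk."""
    bytes_per_second = sample_rate * channels * sample_width
    return 1000.0 * num_bytes / bytes_per_second
```

For example, 640 bytes at 16 kHz and 320 bytes at 8 kHz both work out to 20 ms of audio, which is handy for logging or for checking that the PBX is sending at the rate you configured.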
| Parameter | Value | Description |
|---|---|---|
| model | nova-3-phonecall | Deepgram model optimised for telephony audio (8/16 kHz) |
| encoding | linear16 | Signed 16-bit little-endian PCM, the native format of both PBX platforms |
| sample_rate | 8000 / 16000 | 8 kHz for the Asterisk default, 16 kHz for FreeSWITCH (a higher rate generally improves accuracy) |
| smart_format | True | Adds punctuation, capitalisation, and number formatting |
| interim_results | True | Returns partial transcripts while the caller is still speaking |
| utterance_end_ms | 1000 | Fires an utterance-end event after 1 second of silence |
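The bridge passes these settings through the Python SDK. For readers who want to see what they amount to on the wire, the sketch below builds the equivalent query string for Deepgram's streaming endpoint (wss://api.deepgram.com/v1/listen). The helper name `deepgram_listen_url` is hypothetical, and the values simply mirror the table above.

```python
from urllib.parse import urlencode


def deepgram_listen_url(sample_rate: int) -> str:
    """Build a streaming URL carrying the telephony settings from the table."""
    params = {
        "model": "nova-3-phonecall",
        "encoding": "linear16",
        "sample_rate": sample_rate,   # 8000 for Asterisk, 16000 for FreeSWITCH
        "smart_format": "true",       # booleans travel as lowercase strings
        "interim_results": "true",
        "utterance_end_ms": 1000,
    }
    return "wss://api.deepgram.com/v1/listen?" + urlencode(params)
```

The only value that differs between the two PBX platforms is `sample_rate`, so a bridge can pick it per connection and keep everything else fixed.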
- PBX receives a call: Asterisk or FreeSWITCH answers and is configured to stream audio to this bridge
- Audio reaches the bridge: Asterisk sends AudioSocket TLV frames to /asterisk; FreeSWITCH sends raw PCM to /freeswitch
- Bridge opens a Deepgram connection: using the Python SDK's client.listen.v1.connect() with telephony-optimised settings
- Audio is forwarded: each PCM chunk is sent to Deepgram via connection.send_media()
- Transcripts arrive: Deepgram fires EventType.MESSAGE callbacks with interim and final transcripts, which the bridge logs to the console
- Call ends: the PBX closes the WebSocket; the bridge sends close_stream to Deepgram
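The transcript step can be illustrated with a small sketch that turns one streaming message into a console line. It works on the raw JSON shape described in Deepgram's streaming documentation (`type`, `is_final`, `channel.alternatives[0].transcript`) rather than on the SDK's typed objects, so treat the field access as an assumption if you adapt it to an EventType.MESSAGE callback. The function name `describe_message` is hypothetical.

```python
import json
from typing import Optional


def describe_message(raw: str) -> Optional[str]:
    """Return a printable line for a streaming message, or None to skip it."""
    msg = json.loads(raw)
    kind = msg.get("type")
    if kind == "Results":
        alternatives = msg.get("channel", {}).get("alternatives", [])
        text = alternatives[0].get("transcript", "") if alternatives else ""
        if not text:
            return None  # silence or an empty partial result
        label = "final" if msg.get("is_final") else "interim"
        return f"[{label}] {text}"
    if kind == "UtteranceEnd":
        return "[utterance end]"
    return None  # metadata and other message types are ignored here
```

Separating interim from final results this way lets an application show live partial text while only storing the final transcripts.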
Phone Call
|
| RTP audio
v
Asterisk / FreeSWITCH PBX
|
| WebSocket (AudioSocket TLV or raw PCM)
v
bridge.py (this server)
|
| Deepgram Python SDK (WebSocket)
v
Deepgram Live STT (nova-3-phonecall)
|
| transcript events
v
Console output (or your application)
- Deepgram FreeSWITCH integration
- Deepgram Live STT docs
- Asterisk AudioSocket
- FreeSWITCH mod_audio_stream
If you want a ready-to-run base for your own project, check the deepgram-starters org — there are starter repos for every language and every Deepgram product.