Use the Vercel AI SDK's unified interface to transcribe audio and generate speech with Deepgram, using the same API patterns you'd use with any other AI provider. Swap between Deepgram, OpenAI, and others by changing one import.
This example is a Node.js script that does two things. It transcribes an audio file using Deepgram's nova-3 model via the AI SDK's `transcribe()` function, then generates speech audio from text using Deepgram's Aura 2 TTS via the AI SDK's `generateSpeech()` function. The transcript prints to the console, and the generated audio is saved to a file you can play back.
- Node.js 18+
- Deepgram account — get a free API key
Copy .env.example to .env and fill in your API key:
| Variable | Where to find it |
|---|---|
| `DEEPGRAM_API_KEY` | Deepgram console → API Keys |
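
A missing key usually surfaces later as an authentication error from the API, so it can help to check the environment before making any request. The following is a minimal sketch, not the example's actual code: `requireEnv` is a hypothetical helper, and the environment is passed in as a plain object so the function is easy to test.

```typescript
// Hypothetical helper (not part of the AI SDK or the Deepgram provider):
// returns the named variable, or throws a readable error if it is unset or blank.
type Env = Record<string, string | undefined>;

function requireEnv(env: Env, name: string): string {
  const value = env[name]?.trim();
  if (!value) {
    throw new Error(`${name} is not set. Copy .env.example to .env and add your key.`);
  }
  return value;
}
```

In the script this could be called as `requireEnv(process.env, 'DEEPGRAM_API_KEY')` once at startup.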

    npm install
    npm start

To transcribe a different file, set the `AUDIO_URL` environment variable:

    AUDIO_URL=https://example.com/my-audio.wav npm start

- `transcribe()` from the `ai` package provides a provider-agnostic transcription interface
- `deepgram.transcription('nova-3')` routes the request through the `@ai-sdk/deepgram` provider to Deepgram's pre-recorded STT API
- The transcript is returned with text, segments (with timestamps), and duration metadata
- `generateSpeech()` provides a provider-agnostic TTS interface
- `deepgram.speech('aura-2-helena-en')` routes through Deepgram's Aura TTS API
- The generated audio is saved as a raw PCM file
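
Raw PCM has no header, so most audio players cannot open the file without being told the sample format. One option is to prepend a WAV header before saving. The sketch below assumes 16-bit little-endian mono samples at 24 kHz; those values are assumptions rather than something this example confirms, and `pcm16ToWav` is a hypothetical helper name.

```typescript
// Wraps raw 16-bit little-endian PCM samples in a 44-byte WAV (RIFF) header.
// sampleRate and channels are assumptions: match them to what the TTS request used.
function pcm16ToWav(pcm: Uint8Array, sampleRate = 24000, channels = 1): Uint8Array {
  const bytesPerSample = 2;
  const blockAlign = channels * bytesPerSample;
  const byteRate = sampleRate * blockAlign;
  const out = new Uint8Array(44 + pcm.length);
  const view = new DataView(out.buffer);
  const writeAscii = (offset: number, text: string): void => {
    for (let i = 0; i < text.length; i++) out[offset + i] = text.charCodeAt(i);
  };

  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + pcm.length, true); // file size minus the first 8 bytes
  writeAscii(8, "WAVE");
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true); // size of the fmt chunk for PCM
  view.setUint16(20, 1, true); // audio format 1 = uncompressed PCM
  view.setUint16(22, channels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, byteRate, true);
  view.setUint16(32, blockAlign, true);
  view.setUint16(34, bytesPerSample * 8, true); // bits per sample
  writeAscii(36, "data");
  view.setUint32(40, pcm.length, true); // size of the sample data in bytes
  out.set(pcm, 44);
  return out;
}
```

Assuming the speech result exposes its bytes as a `Uint8Array`, the wrapped output could be written with Node's `fs.writeFileSync('output.wav', pcm16ToWav(bytes))`; the `output.wav` file name is illustrative.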
The key advantage of the AI SDK approach is portability: you can swap `deepgram.transcription('nova-3')` for `openai.transcription('whisper-1')` by changing only the provider import and the model argument. The rest of the code stays the same.
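
One way to make that swap a configuration choice rather than a code edit is to pick the model from a small table of factories. This is only a sketch under stated assumptions: `selectModel` is a hypothetical helper, and the demo uses plain strings as stand-ins for real model objects so it runs without any SDK packages installed.

```typescript
// Hypothetical helper: picks one entry from a table of model factories by name.
// M is whatever the provider returns, e.g. the value of deepgram.transcription('nova-3').
function selectModel<M>(
  factories: Record<string, () => M>,
  requested: string | undefined,
  fallback: string,
): M {
  const name = (requested || fallback).toLowerCase();
  // Own-property check so names like "constructor" are rejected, not resolved.
  if (!Object.prototype.hasOwnProperty.call(factories, name)) {
    const known = Object.keys(factories).join(", ");
    throw new Error(`Unknown provider "${name}". Expected one of: ${known}`);
  }
  return factories[name]();
}

// Stand-in factories: strings here, real model objects in an actual script.
const demoModel = selectModel(
  { deepgram: () => "deepgram/nova-3", openai: () => "openai/whisper-1" },
  undefined,
  "deepgram",
);
console.log(demoModel); // deepgram/nova-3
```

With the real providers, the table entries would be `() => deepgram.transcription('nova-3')` and `() => openai.transcription('whisper-1')`, and the selected model would be passed to `transcribe()` unchanged. The requested name could come from an environment variable such as `STT_PROVIDER`, which is a hypothetical name and not something this example defines.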
- Vercel AI SDK Deepgram provider docs
- Vercel AI SDK transcription docs
- Vercel AI SDK speech docs
- Deepgram pre-recorded STT docs
- Deepgram TTS docs
If you want a ready-to-run base for your own project, check the deepgram-starters org — there are starter repos for every language and every Deepgram product.