caLLMe

A tiny, voice-first assistant that turns your microphone into a real-time conversation with an LLM.
caLLMe listens, transcribes, generates a response, speaks it back, and lets you interrupt at any time – all in just a few hundred lines of Python.

Features

Silero Voice Activity Detection – smartly starts/stops recording when you speak.
Groq Whisper STT – high-quality speech-to-text.
Groq Llama 3 LLM – streams replies token-by-token.
Groq PlayAI TTS – natural, low-latency speech synthesis.
Async audio queue – responses are played while the next ones are being generated; speak again to interrupt.
Simple, hackable architecture – every component lives in src/ and follows small base interfaces (VAD, STT, TTS, Gen, Player).

Quick Start

# 1. Grab the code
$ git clone https://github.com/yourname/caLLMe.git
$ cd caLLMe

# 2. Create & activate a virtual env (optional but recommended)
$ python -m venv .venv
$ source .venv/bin/activate

# 3. Install Python dependencies
$ pip install -r requirements.txt

# 4. Set your Groq API key (required for STT, TTS & LLM)
$ export GROQ_API_KEY="sk_..."

# 5. Run the assistant 🎙️
$ python src/main.py

Customising

Change the system prompt & initial dialogue in src/main.py.
Swap out models by tweaking default parameters in:
- src/gen/groq.py (LLM)
- src/stt/groqWhisper.py (STT)
- src/tts/groqPlayai.py (TTS)
Adjust VAD sensitivity in src/vad/silerovad.py (on_threshold, off_threshold, etc.).

Troubleshooting

PyAudio may require system packages (e.g. portaudio, alsa-utils). On Ubuntu:
sudo apt install portaudio19-dev python3-pyaudio
If audio is choppy, lower max_audio_queue in Conversation or tweak model temperatures.

Built with ❤️ & open-source software. Enjoy hacking!

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

caLLMe

Features

Quick Start

Customising

Troubleshooting

FilesExpand file tree

README.md

Latest commit

History

README.md

File metadata and controls

caLLMe

Features

Quick Start

Customising

Troubleshooting