Context Switch

Context Switch is a Rust-based framework for building real-time conversational applications with support for multiple modalities (audio and text). It provides a unified interface for interacting with speech and language services such as Azure Speech Services, OpenAI, and ElevenLabs.

Features

  • Multi-modal conversation support (audio and text)
  • Pluggable service architecture
  • Integration with:
    • Azure Speech Services (transcription, translation, synthesis)
    • ElevenLabs realtime speech-to-text (Scribe v2 Realtime)
    • OpenAI dialog services
  • Asynchronous processing using Tokio

Project Structure

  • core/: Core functionality and interfaces
  • services/: Implementation of various service integrations
    • azure/: Azure Speech Services integration
    • elevenlabs/: ElevenLabs speech-to-text integration
    • google-transcribe/: Google Speech-to-Text integration (WIP)
    • openai-dialog/: OpenAI conversational services integration
  • audio-knife/: WebSocket server that implements the mod_audio_fork protocol for real-time audio streaming from telephony systems via FreeSWITCH. Provides a bridge between audio sources and the Context Switch framework.
  • examples/: Example applications showcasing different features

Getting Started

Prerequisites

  • Rust
  • API keys for the services you intend to use:
    • OpenAI API key
    • Azure Speech Services subscription key
    • Google Cloud API key (for Google transcription)
  • For Aristech services:
    • Install protoc
      • macOS: brew install protobuf
      • Linux: apt-get install protobuf-compiler

Installation

  1. Clone the repository:

    git clone https://github.com/pragmatrix/context-switch.git
    cd context-switch
  2. Initialize submodules:

    git submodule update --init --recursive
  3. Create a .env file with your API keys (see .env.example for reference)

Running Examples

The project includes several examples demonstrating different features:

# Run OpenAI dialog example
cargo run --example openai-dialog

# Run generic transcribe example with Azure provider
cargo run --example transcribe -- azure

# Run generic transcribe example with ElevenLabs provider
cargo run --example transcribe -- elevenlabs

# Run generic transcribe example with Aristech provider
cargo run --example transcribe -- aristech

# Run Azure synthesize example
cargo run --example azure-synthesize

Using Audio Knife

Audio Knife is a WebSocket server that implements the mod_audio_fork protocol, allowing real-time audio streaming from and to FreeSWITCH. It acts as a bridge between audio sources and the Context Switch framework.

To run the Audio Knife server:

cargo run -p audio-knife

By default, it listens on 127.0.0.1:8123. You can customize the address by setting the AUDIO_KNIFE_ADDRESS environment variable.
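The environment-variable fallback described above could be resolved along these lines (a minimal sketch; listen_addr is a hypothetical helper for illustration, not part of audio-knife's actual code):

```rust
use std::net::SocketAddr;

/// Resolve the listen address from AUDIO_KNIFE_ADDRESS, falling back to the
/// documented default of 127.0.0.1:8123 when the variable is unset or does
/// not parse. (Illustrative sketch only, not the crate's public API.)
fn listen_addr() -> SocketAddr {
    std::env::var("AUDIO_KNIFE_ADDRESS")
        .ok()
        .and_then(|s| s.parse().ok())
        .unwrap_or_else(|| "127.0.0.1:8123".parse().unwrap())
}
```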

Configuration

Configure the services by setting the appropriate environment variables in your .env file:

# OpenAI Configuration
OPENAI_API_KEY=your_openai_key
OPENAI_REALTIME_API_MODEL=gpt-4o-mini-realtime-preview
OPENAI_REALTIME_ENDPOINT=

# Azure Configuration
AZURE_SUBSCRIPTION_KEY=your_azure_key
AZURE_REGION=your_azure_region

# ElevenLabs Configuration
ELEVENLABS_API_KEY=your_elevenlabs_key

# Audio Knife Configuration
AUDIO_KNIFE_ADDRESS=127.0.0.1:8123

For Azure OpenAI realtime endpoints (*.openai.azure.com), the realtime client automatically appends the api-key query parameter to the WebSocket URL. For other hosts, it uses the standard Authorization: Bearer ... header.

The WebSocket client does not follow redirects. If the endpoint responds with a 3xx status (for example, 302 Found), update the configured endpoint URL to point at the final WebSocket target.
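The host-based credential placement can be sketched as follows (an illustrative approximation of the behavior described above; apply_auth and the exact URL handling are assumptions, not the crate's actual implementation):

```rust
/// Decide where the API key goes for a realtime WebSocket endpoint:
/// Azure OpenAI hosts get an api-key query parameter appended to the URL,
/// everything else gets an Authorization: Bearer header.
/// (Illustrative sketch only.)
fn apply_auth(endpoint: &str, api_key: &str) -> (String, Option<String>) {
    // Crude host extraction; a real client would use a proper URL parser.
    let host = endpoint
        .trim_start_matches("wss://")
        .trim_start_matches("https://")
        .split('/')
        .next()
        .unwrap_or("");
    if host.ends_with(".openai.azure.com") {
        // Azure OpenAI: append api-key as a query parameter.
        let sep = if endpoint.contains('?') { '&' } else { '?' };
        (format!("{endpoint}{sep}api-key={api_key}"), None)
    } else {
        // Other hosts: keep the URL as-is and send a Bearer header.
        (endpoint.to_string(), Some(format!("Bearer {api_key}")))
    }
}
```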

License

MIT License
