| 1 | +# LLM Gateway |
| 2 | + |
| 3 | +Multi-provider LLM gateway with automatic fallback and cost tracking. Provides a single HTTP API that routes requests across DeepSeek, Gemini, OpenAI, and Anthropic — trying cheaper providers first and falling back automatically on failure. |
| 4 | + |
| 5 | +## Quick Start |
| 6 | + |
| 7 | +```bash |
| 8 | +# Install dependencies |
| 9 | +python -m venv venv |
| 10 | +source venv/bin/activate |
| 11 | +pip install -r requirements.txt |
| 12 | + |
| 13 | +# Set up at least one provider |
| 14 | +export LLM_PROVIDER=deepseek |
| 15 | +export DEEPSEEK_API_KEY=your-key |
| 16 | +export DEEPSEEK_MODEL=deepseek-chat |
| 17 | + |
| 18 | +# Start the server |
| 19 | +python main.py |
| 20 | +``` |
| 21 | + |
| 22 | +The server runs on `http://localhost:8090` by default. |
| 23 | + |
| 24 | +## API Endpoints |
| 25 | + |
| 26 | +| Endpoint | Method | Description | |
| 27 | +|----------|--------|-------------| |
| 28 | +| `/classify` | POST | Classify items using AI (returns JSON) | |
| 29 | +| `/plan` | POST | Generate structured plans using AI (returns JSON) | |
| 30 | +| `/embed` | POST | Generate text embeddings (requires `OPENAI_API_KEY`) | |
| 31 | +| `/v1/chat/completions` | POST | OpenAI-compatible chat with optional tool call support | |
| 32 | +| `/health` | GET | Health check with provider status | |
| 33 | + |
| 34 | +### POST /classify |
| 35 | + |
| 36 | +Send a prompt, get back a JSON classification response. |
| 37 | + |
| 38 | +```bash |
| 39 | +curl -X POST http://localhost:8090/classify \ |
| 40 | + -H "Content-Type: application/json" \ |
| 41 | + -d '{"prompt": "Classify these items: ..."}' |
| 42 | +``` |
| 43 | + |
| 44 | +### POST /plan |
| 45 | + |
| 46 | +Generate a structured plan from context and a system prompt. |
| 47 | + |
| 48 | +```bash |
| 49 | +curl -X POST http://localhost:8090/plan \ |
| 50 | + -H "Content-Type: application/json" \ |
| 51 | + -d '{ |
| 52 | + "context": {"task": "...", "constraints": []}, |
| 53 | + "system_prompt": "You are a planner. Return JSON." |
| 54 | + }' |
| 55 | +``` |
| 56 | + |
| 57 | +### POST /embed |
| 58 | + |
| 59 | +Generate text embeddings using OpenAI's embedding models. |
| 60 | + |
| 61 | +```bash |
| 62 | +curl -X POST http://localhost:8090/embed \ |
| 63 | + -H "Content-Type: application/json" \ |
| 64 | + -d '{"text": "text to embed"}' |
| 65 | +``` |
| 66 | + |
| 67 | +Request body: |
| 68 | +- `text`: String or list of strings to embed |
| 69 | +- `model`: Embedding model (default: `text-embedding-ada-002`) |
| 70 | + |
| 71 | +Response: |
| 72 | +```json |
| 73 | +{ |
| 74 | + "embeddings": [[0.1, 0.2, ...]], |
| 75 | + "model": "text-embedding-ada-002", |
| 76 | + "dimensions": 1536, |
| 77 | + "ai_call_log": { |
| 78 | + "provider": "openai", |
| 79 | + "model": "text-embedding-ada-002", |
| 80 | + "prompt_tokens": 5, |
| 81 | + "completion_tokens": 0, |
| 82 | + "cost_microcents": 1, |
| 83 | + "latency_ms": 150, |
| 84 | + "success": true |
| 85 | + } |
| 86 | +} |
| 87 | +``` |
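For callers written in Python, the request and response shapes above can be wrapped in a small client. This is a minimal sketch using only the standard library; the helper names (`build_embed_request`, `parse_embed_response`, `embed`) are hypothetical and not part of the gateway, and the dimension check is an illustrative safeguard rather than documented server behavior.

```python
import json
import urllib.request

GATEWAY_URL = "http://localhost:8090"  # default address from this README


def build_embed_request(text, model=None):
    """Build the /embed request body; omit `model` to use the server default."""
    body = {"text": text}  # a string or a list of strings
    if model is not None:
        body["model"] = model
    return body


def parse_embed_response(payload):
    """Return (vectors, cost_microcents) from a decoded /embed response."""
    vectors = payload["embeddings"]
    # Sanity check: every vector should match the reported dimensionality.
    for vec in vectors:
        if len(vec) != payload["dimensions"]:
            raise ValueError("embedding length does not match 'dimensions'")
    cost = payload.get("ai_call_log", {}).get("cost_microcents")
    return vectors, cost


def embed(text, model=None, base_url=GATEWAY_URL, timeout=30):
    """POST to /embed and parse the result (requires a running gateway)."""
    data = json.dumps(build_embed_request(text, model)).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/embed",
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return parse_embed_response(json.load(resp))
```

With the server running, `vectors, cost = embed(["first text", "second text"])` would return one vector per input string.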
| 88 | + |
| 89 | +### POST /v1/chat/completions |
| 90 | + |
| 91 | +OpenAI-compatible endpoint supporting optional tool calls. Provider-specific translation (e.g. Anthropic tool format) is handled transparently. |
| 92 | + |
| 93 | +```bash |
| 94 | +curl -X POST http://localhost:8090/v1/chat/completions \ |
| 95 | + -H "Content-Type: application/json" \ |
| 96 | + -d '{ |
| 97 | + "messages": [{"role": "user", "content": "Hello"}] |
| 98 | + }' |
| 99 | +``` |
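To illustrate the optional tool call support, here is a sketch of a request body that carries a tool definition and a helper that reads tool calls out of a response. It assumes the gateway accepts and returns the standard OpenAI chat-completions shapes (`tools` in the request, `choices[0].message.tool_calls` in the response); `get_weather` is a made-up tool used only for illustration.

```python
import json


def build_chat_request(messages, tools=None):
    """Build a /v1/chat/completions body; `tools` is optional."""
    body = {"messages": messages}
    if tools:
        body["tools"] = tools
    return body


# Hypothetical tool, written in the OpenAI function-calling schema.
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}


def extract_tool_calls(response):
    """Return [(name, arguments_dict), ...] from an OpenAI-style response."""
    message = response["choices"][0]["message"]
    calls = []
    for call in message.get("tool_calls") or []:
        fn = call["function"]
        # OpenAI-style responses carry arguments as a JSON-encoded string.
        calls.append((fn["name"], json.loads(fn["arguments"])))
    return calls


request_body = build_chat_request(
    [{"role": "user", "content": "What is the weather in Oslo?"}],
    tools=[get_weather_tool],
)
```

Serializing `request_body` with `json.dumps` gives a payload that can be passed to `curl -d` or any HTTP client.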
| 100 | + |
| 101 | +### GET /health |
| 102 | + |
| 103 | +Check service health and provider status. |
| 104 | + |
| 105 | +```bash |
| 106 | +curl http://localhost:8090/health |
| 107 | +``` |
| 108 | + |
| 109 | +Response: |
| 110 | +```json |
| 111 | +{ |
| 112 | + "status": "healthy", |
| 113 | + "providers": [{"name": "deepseek", "model": "deepseek-chat"}], |
| 114 | + "embeddings_available": true |
| 115 | +} |
| 116 | +``` |
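A deployment script or readiness probe might reduce this response to a few booleans. The sketch below assumes that `"healthy"` is the value reported when the service is up (the only value shown here); `summarize_health` is a hypothetical helper, not part of the gateway.

```python
def summarize_health(payload):
    """Condense a decoded /health response into the fields a probe needs."""
    return {
        # Assumption: any status other than "healthy" means not ready.
        "healthy": payload.get("status") == "healthy",
        "providers": [p["name"] for p in payload.get("providers", [])],
        "can_embed": bool(payload.get("embeddings_available")),
    }
```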
| 117 | + |
| 118 | +## Configuration |
| 119 | + |
| 120 | +All configuration is via environment variables. Copy `.env.example` to `.env` and fill in your keys. |
| 121 | + |
| 122 | +### Provider Selection |
| 123 | + |
| 124 | +| Variable | Default | Description | |
| 125 | +|----------|---------|-------------| |
| 126 | +| `LLM_PROVIDER` | `auto` | Provider: `auto`, `deepseek`, `gemini`, `openai`, `anthropic` | |
| 127 | + |
| 128 | +When `LLM_PROVIDER=auto`, providers are tried in the following cost-effectiveness order (prices are per 1M tokens): |
| 129 | +1. DeepSeek — $0.12/1M input, $0.20/1M output |
| 130 | +2. Gemini — $0.10/1M input, $0.40/1M output |
| 131 | +3. OpenAI — $0.15/1M input, $0.60/1M output |
| 132 | +4. Anthropic — $3/1M input, $15/1M output |
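The ordering and prices above can be sketched as a fallback loop with a cost estimate. This illustrates the general technique and is not the gateway's actual implementation: `clients` is a hypothetical mapping from provider name to a callable, and treating every exception as a reason to fall back is an assumption.

```python
# Prices in USD per 1M tokens, copied from the list above: (input, output).
PRICES = {
    "deepseek": (0.12, 0.20),
    "gemini": (0.10, 0.40),
    "openai": (0.15, 0.60),
    "anthropic": (3.00, 15.00),
}
AUTO_ORDER = ["deepseek", "gemini", "openai", "anthropic"]


def estimate_cost_usd(provider, prompt_tokens, completion_tokens):
    """Estimate the cost of one call from the per-1M-token prices."""
    price_in, price_out = PRICES[provider]
    return (prompt_tokens * price_in + completion_tokens * price_out) / 1_000_000


def call_with_fallback(prompt, clients, order=AUTO_ORDER):
    """Try each configured provider in order; return (provider, result).

    `clients` maps provider name -> callable(prompt). Providers missing
    from the mapping are skipped, mirroring unconfigured providers.
    """
    errors = {}
    for name in order:
        client = clients.get(name)
        if client is None:
            continue
        try:
            return name, client(prompt)
        except Exception as exc:  # assumption: any failure triggers fallback
            errors[name] = exc
    raise RuntimeError(f"all providers failed: {errors}")
```

For example, if the DeepSeek callable raises and the Gemini callable succeeds, `call_with_fallback` returns `("gemini", <result>)` without ever contacting OpenAI or Anthropic.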
| 133 | + |
| 134 | +### Provider API Keys |
| 135 | + |
| 136 | +| Variable | Description | |
| 137 | +|----------|-------------| |
| 138 | +| `DEEPSEEK_API_KEY` | DeepSeek API key | |
| 139 | +| `DEEPSEEK_MODEL` | DeepSeek model (e.g., `deepseek-chat`) | |
| 140 | +| `GEMINI_API_KEY` | Google Gemini API key | |
| 141 | +| `GEMINI_MODEL` | Gemini model (e.g., `gemini-2.0-flash`) | |
| 142 | +| `OPENAI_API_KEY` | OpenAI API key (also required for `/embed`) | |
| 143 | +| `OPENAI_MODEL` | OpenAI model (e.g., `gpt-4o-mini`) | |
| 144 | +| `ANTHROPIC_API_KEY` | Anthropic API key | |
| 145 | +| `ANTHROPIC_MODEL` | Anthropic model (e.g., `claude-3-5-sonnet-20241022`) | |
| 146 | + |
| 147 | +At least one provider must have both an API key and a model configured. |
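A startup check along these lines can catch a missing key or model before the first request. It is a sketch built from the variable names in the tables above; `configured_providers` and `check_config` are hypothetical helpers, and the rule that an explicit `LLM_PROVIDER` must itself be fully configured is an assumption.

```python
import os

PROVIDERS = ["deepseek", "gemini", "openai", "anthropic"]


def configured_providers(env=None):
    """Return providers that have both <NAME>_API_KEY and <NAME>_MODEL set."""
    env = os.environ if env is None else env
    ready = []
    for name in PROVIDERS:
        prefix = name.upper()
        if env.get(f"{prefix}_API_KEY") and env.get(f"{prefix}_MODEL"):
            ready.append(name)
    return ready


def check_config(env=None):
    """Raise RuntimeError if the environment cannot serve any request."""
    env = os.environ if env is None else env
    ready = configured_providers(env)
    if not ready:
        raise RuntimeError("no provider has both an API key and a model configured")
    choice = env.get("LLM_PROVIDER", "auto")
    # Assumption: a pinned provider must be one of the fully configured ones.
    if choice != "auto" and choice not in ready:
        raise RuntimeError(f"LLM_PROVIDER={choice} is not fully configured")
    return ready
```

Passing a plain dict as `env` makes the check easy to exercise in tests without touching the real environment.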
| 148 | + |
| 149 | +### Service Settings |
| 150 | + |
| 151 | +| Variable | Default | Description | |
| 152 | +|----------|---------|-------------| |
| 153 | +| `PORT` | `8090` | HTTP port | |
| 154 | +| `LOG_LEVEL` | `INFO` | Logging level | |
| 155 | + |
| 156 | +## Development |
| 157 | + |
| 158 | +### Running Tests |
| 159 | + |
| 160 | +```bash |
| 161 | +# Run all tests |
| 162 | +pytest -v |
| 163 | + |
| 164 | +# Run with coverage |
| 165 | +pytest --cov=. --cov-report=term-missing |
| 166 | + |
| 167 | +# Run specific test file |
| 168 | +pytest tests/test_providers.py -v |
| 169 | +``` |
| 170 | + |
| 171 | +### Docker |
| 172 | + |
| 173 | +```bash |
| 174 | +# Build |
| 175 | +docker build -t llm-gateway . |
| 176 | + |
| 177 | +# Run |
| 178 | +docker run -p 8090:8090 \ |
| 179 | + -e LLM_PROVIDER=auto \ |
| 180 | + -e DEEPSEEK_API_KEY=key \ |
| 181 | + -e DEEPSEEK_MODEL=deepseek-chat \ |
| 182 | + llm-gateway |
| 183 | +``` |
| 184 | + |
| 185 | +## Architecture |
| 186 | + |
| 187 | +``` |
| 188 | +┌─────────────┐ ┌─────────────┐ ┌─────────────┐ |
| 189 | +│ Your Svc A │ │ Your Svc B │ │ Your Svc C │ |
| 190 | +│ │ │ │ │ │ |
| 191 | +└──────┬──────┘ └──────┬──────┘ └──────┬──────┘ |
| 192 | + │ HTTP │ HTTP │ HTTP |
| 193 | + ▼ ▼ ▼ |
| 194 | +┌──────────────────────────────────────────────────────┐ |
| 195 | +│ llm-gateway (Python) │ |
| 196 | +│ ┌────────────────────────────────────────────────┐ │ |
| 197 | +│ │ Providers: DeepSeek | Gemini | OpenAI | Anthropic│ │ |
| 198 | +│ │ Features: Auto-fallback, Cost tracking, Retries │ │ |
| 199 | +│ │ Endpoints: /plan, /classify, /embed, /health │ │ |
| 200 | +│ └────────────────────────────────────────────────┘ │ |
| 201 | +└──────────────────────────────────────────────────────┘ |
| 202 | +``` |
| 203 | + |
| 204 | +## Contributing |
| 205 | + |
| 206 | +See [CONTRIBUTING.md](CONTRIBUTING.md). |
| 207 | + |
| 208 | +## License |
| 209 | + |
| 210 | +MIT — see [LICENSE](LICENSE). |