Commit bb5c352

initial commit: extract llm-gateway as standalone OSS project
0 parents  commit bb5c352

25 files changed

Lines changed: 4043 additions & 0 deletions

.env.example

Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
# Provider selection: auto, deepseek, gemini, openai, anthropic
LLM_PROVIDER=auto

# DeepSeek
DEEPSEEK_API_KEY=
DEEPSEEK_MODEL=deepseek-chat

# Google Gemini
GEMINI_API_KEY=
GEMINI_MODEL=gemini-2.0-flash

# OpenAI (also required for /embed endpoint)
OPENAI_API_KEY=
OPENAI_MODEL=gpt-4o-mini

# Anthropic
ANTHROPIC_API_KEY=
ANTHROPIC_MODEL=claude-3-5-sonnet-20241022

# Server
PORT=8090
LOG_LEVEL=INFO

.github/workflows/test.yml

Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
name: Tests

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install -r requirements.txt
      - run: pytest --cov=. --cov-report=term-missing --cov-fail-under=90

.gitignore

Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
venv/
__pycache__/
*.pyc
*.pyo
.pytest_cache/
.coverage
htmlcov/
*.egg-info/
dist/
build/
.env
.env.local

CONTRIBUTING.md

Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
# Contributing

## Getting Started

1. Fork the repository
2. Create a feature branch: `git checkout -b my-feature`
3. Install dependencies: `pip install -r requirements.txt`

## Development Workflow

This project follows test-driven development:

1. **Write a failing test first**: describe the behaviour you want
2. **Write minimal code to pass**: no more than needed
3. **Refactor**: clean up with tests still passing

## Running Tests

```bash
pytest -v
pytest --cov=. --cov-report=term-missing
```

Coverage must stay at or above 90%; CI fails the build below that threshold (`--cov-fail-under=90`).

## Submitting a PR

- Keep changes focused: one feature or fix per PR
- Ensure all tests pass and coverage does not drop
- Write a clear PR description explaining what and why

Dockerfile

Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
FROM python:3.11-slim

WORKDIR /app

# Install dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application code
COPY . .

# Create non-root user
RUN useradd -m -u 1000 appuser && chown -R appuser:appuser /app
USER appuser

EXPOSE 8090

CMD ["python", "main.py"]

LICENSE

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2026 NullRabbit

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

README.md

Lines changed: 210 additions & 0 deletions
@@ -0,0 +1,210 @@
# LLM Gateway

Multi-provider LLM gateway with automatic fallback and cost tracking. It provides a single HTTP API that routes requests across DeepSeek, Gemini, OpenAI, and Anthropic, trying cheaper providers first and falling back automatically on failure.

## Quick Start

```bash
# Install dependencies
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

# Set up at least one provider
export LLM_PROVIDER=deepseek
export DEEPSEEK_API_KEY=your-key
export DEEPSEEK_MODEL=deepseek-chat

# Start the server
python main.py
```

The server runs on `http://localhost:8090` by default.

## API Endpoints

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/classify` | POST | Classify items using AI (returns JSON) |
| `/plan` | POST | Generate structured plans using AI (returns JSON) |
| `/embed` | POST | Generate text embeddings (requires `OPENAI_API_KEY`) |
| `/v1/chat/completions` | POST | OpenAI-compatible chat with optional tool call support |
| `/health` | GET | Health check with provider status |

### POST /classify

Send a prompt, get back a JSON classification response.

```bash
curl -X POST http://localhost:8090/classify \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Classify these items: ..."}'
```

### POST /plan

Generate a structured plan from context and a system prompt.

```bash
curl -X POST http://localhost:8090/plan \
  -H "Content-Type: application/json" \
  -d '{
    "context": {"task": "...", "constraints": []},
    "system_prompt": "You are a planner. Return JSON."
  }'
```

### POST /embed

Generate text embeddings using OpenAI's embedding models.

```bash
curl -X POST http://localhost:8090/embed \
  -H "Content-Type: application/json" \
  -d '{"text": "text to embed"}'
```

Request body:
- `text`: String or list of strings to embed
- `model`: Embedding model (default: `text-embedding-ada-002`)

Response:
```json
{
  "embeddings": [[0.1, 0.2, ...]],
  "model": "text-embedding-ada-002",
  "dimensions": 1536,
  "ai_call_log": {
    "provider": "openai",
    "model": "text-embedding-ada-002",
    "prompt_tokens": 5,
    "completion_tokens": 0,
    "cost_microcents": 1,
    "latency_ms": 150,
    "success": true
  }
}
```
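
For callers written in Python, the request and response shapes above could be wrapped roughly as follows. This is a minimal sketch, not part of the gateway: the helper names (`build_embed_request`, `parse_embed_response`, `embed`) are hypothetical, and it assumes `dimensions` is the length of each returned vector, as the sample response suggests.

```python
import json
from urllib import request

GATEWAY_URL = "http://localhost:8090"  # default address from this README


def build_embed_request(text, model=None):
    """Build the POST /embed request without sending it."""
    body = {"text": text}
    if model is not None:
        body["model"] = model  # omit to use the server-side default
    return request.Request(
        f"{GATEWAY_URL}/embed",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_embed_response(payload):
    """Return the vectors after checking them against 'dimensions'.

    Assumption: 'dimensions' is the per-vector length.
    """
    vectors = payload["embeddings"]
    for vector in vectors:
        if len(vector) != payload["dimensions"]:
            raise ValueError("vector length does not match 'dimensions'")
    return vectors


def embed(text, model=None):
    """Send the request; needs a running gateway with OPENAI_API_KEY set."""
    with request.urlopen(build_embed_request(text, model), timeout=30) as resp:
        return parse_embed_response(json.load(resp))
```

Keeping request construction and response parsing separate from the network call makes both halves easy to unit test without a live server.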

### POST /v1/chat/completions

OpenAI-compatible endpoint supporting optional tool calls. Provider-specific translation (e.g. Anthropic tool format) is handled transparently.

```bash
curl -X POST http://localhost:8090/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```
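
To illustrate what tool-format translation involves, here is a rough sketch of mapping one OpenAI-style function tool to Anthropic's tool shape. It is based on the two providers' public request formats, not on the gateway's own translation code (which this README does not show); the function name `openai_tool_to_anthropic` is hypothetical.

```python
def openai_tool_to_anthropic(tool):
    """Convert one OpenAI-style function tool into Anthropic's tool shape."""
    if tool.get("type") != "function":
        raise ValueError(f"unsupported tool type: {tool.get('type')!r}")
    fn = tool["function"]
    return {
        "name": fn["name"],
        "description": fn.get("description", ""),
        # OpenAI names the JSON Schema "parameters"; Anthropic names it "input_schema".
        "input_schema": fn.get("parameters", {"type": "object", "properties": {}}),
    }


example_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}
anthropic_tool = openai_tool_to_anthropic(example_tool)
```

A full translation layer would also need to map tool-call responses back into the OpenAI format; that direction is omitted here.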

### GET /health

Check service health and provider status.

```bash
curl http://localhost:8090/health
```

Response:
```json
{
  "status": "healthy",
  "providers": [{"name": "deepseek", "model": "deepseek-chat"}],
  "embeddings_available": true
}
```
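
A client or deployment script might poll this endpoint until the gateway is ready, for example right after starting the container. The sketch below assumes that readiness means `status` equals `"healthy"` with at least one provider listed; the helper names `fetch_health` and `wait_until_healthy` are hypothetical.

```python
import json
import time
from urllib import request

HEALTH_URL = "http://localhost:8090/health"  # default port from this README


def fetch_health(url=HEALTH_URL):
    """GET /health and decode the JSON body."""
    with request.urlopen(url, timeout=5) as resp:
        return json.load(resp)


def wait_until_healthy(fetch=fetch_health, attempts=5, delay=1.0):
    """Poll until status is "healthy" and at least one provider is listed."""
    for attempt in range(attempts):
        try:
            payload = fetch()
            if payload.get("status") == "healthy" and payload.get("providers"):
                return payload
        except OSError:
            pass  # connection refused or timed out; the server may still be starting
        if attempt < attempts - 1:
            time.sleep(delay)
    raise TimeoutError(f"gateway not healthy after {attempts} attempts")
```

Passing `fetch` as a parameter lets tests substitute a fake without opening a socket.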

## Configuration

All configuration is via environment variables. Copy `.env.example` to `.env` and fill in your keys.

### Provider Selection

| Variable | Default | Description |
|----------|---------|-------------|
| `LLM_PROVIDER` | `auto` | Provider: `auto`, `deepseek`, `gemini`, `openai`, `anthropic` |

When `LLM_PROVIDER=auto`, providers are tried in cost-effectiveness order:
1. DeepSeek: $0.12/1M input, $0.20/1M output
2. Gemini: $0.10/1M input, $0.40/1M output
3. OpenAI: $0.15/1M input, $0.60/1M output
4. Anthropic: $3/1M input, $15/1M output
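
The fallback order and prices above can be sketched as a small loop plus a cost estimate. This is an illustration of the idea only, not the gateway's implementation: `complete_with_fallback`, `estimate_cost_usd`, and the `clients` mapping of provider name to callable are all hypothetical, and the prices are copied from the list above.

```python
# Order and USD prices per 1M tokens (input, output), as listed in this README.
PROVIDER_ORDER = ["deepseek", "gemini", "openai", "anthropic"]
PRICES_USD_PER_1M = {
    "deepseek": (0.12, 0.20),
    "gemini": (0.10, 0.40),
    "openai": (0.15, 0.60),
    "anthropic": (3.0, 15.0),
}


def complete_with_fallback(prompt, clients, order=PROVIDER_ORDER):
    """Try each configured provider in order; return (name, result) on first success.

    `clients` maps a provider name to a callable taking the prompt (hypothetical).
    """
    errors = {}
    for name in order:
        call = clients.get(name)
        if call is None:
            continue  # provider not configured, skip it
        try:
            return name, call(prompt)
        except Exception as exc:  # any failure triggers fallback to the next provider
            errors[name] = exc
    raise RuntimeError(f"all providers failed: {errors}")


def estimate_cost_usd(provider, prompt_tokens, completion_tokens):
    """Estimate the cost of one call in USD from the per-1M-token prices."""
    price_in, price_out = PRICES_USD_PER_1M[provider]
    return (prompt_tokens * price_in + completion_tokens * price_out) / 1_000_000
```

For instance, if the DeepSeek callable raises and the Gemini callable succeeds, the function returns `("gemini", <result>)` without ever calling OpenAI or Anthropic.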

### Provider API Keys

| Variable | Description |
|----------|-------------|
| `DEEPSEEK_API_KEY` | DeepSeek API key |
| `DEEPSEEK_MODEL` | DeepSeek model (e.g., `deepseek-chat`) |
| `GEMINI_API_KEY` | Google Gemini API key |
| `GEMINI_MODEL` | Gemini model (e.g., `gemini-2.0-flash`) |
| `OPENAI_API_KEY` | OpenAI API key (also required for `/embed`) |
| `OPENAI_MODEL` | OpenAI model (e.g., `gpt-4o-mini`) |
| `ANTHROPIC_API_KEY` | Anthropic API key |
| `ANTHROPIC_MODEL` | Anthropic model (e.g., `claude-3-5-sonnet-20241022`) |

At least one provider must have both API key and model configured.
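
The "both API key and model" rule can be checked before starting the server. The following is a sketch under stated assumptions, not the gateway's startup code: `configured_providers` and `select_providers` are hypothetical helpers, and the handling of an explicit `LLM_PROVIDER` whose variables are missing (raising an error) is a guess at reasonable behaviour.

```python
import os

PROVIDERS = ("deepseek", "gemini", "openai", "anthropic")


def configured_providers(env=None):
    """Return providers that have both <NAME>_API_KEY and <NAME>_MODEL set."""
    env = os.environ if env is None else env
    found = []
    for name in PROVIDERS:
        key = env.get(f"{name.upper()}_API_KEY", "").strip()
        model = env.get(f"{name.upper()}_MODEL", "").strip()
        if key and model:
            found.append(name)
    return found


def select_providers(env=None):
    """Apply LLM_PROVIDER: 'auto' keeps every configured provider, else just one."""
    env = os.environ if env is None else env
    available = configured_providers(env)
    choice = env.get("LLM_PROVIDER", "auto").strip().lower()
    selected = available if choice == "auto" else [p for p in available if p == choice]
    if not selected:
        # Assumed behaviour: refuse to start rather than run with no provider.
        raise RuntimeError(f"no usable provider for LLM_PROVIDER={choice!r}")
    return selected
```

Accepting `env` as a plain mapping keeps the check testable without touching the real process environment.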

### Service Settings

| Variable | Default | Description |
|----------|---------|-------------|
| `PORT` | `8090` | HTTP port |
| `LOG_LEVEL` | `INFO` | Logging level |

## Development

### Running Tests

```bash
# Run all tests
pytest -v

# Run with coverage
pytest --cov=. --cov-report=term-missing

# Run specific test file
pytest tests/test_providers.py -v
```

### Docker

```bash
# Build
docker build -t llm-gateway .

# Run
docker run -p 8090:8090 \
  -e LLM_PROVIDER=auto \
  -e DEEPSEEK_API_KEY=key \
  -e DEEPSEEK_MODEL=deepseek-chat \
  llm-gateway
```

## Architecture

```
┌─────────────┐       ┌─────────────┐       ┌─────────────┐
│ Your Svc A  │       │ Your Svc B  │       │ Your Svc C  │
│             │       │             │       │             │
└──────┬──────┘       └──────┬──────┘       └──────┬──────┘
       │ HTTP                │ HTTP                │ HTTP
       ▼                     ▼                     ▼
┌─────────────────────────────────────────────────────────┐
│                  llm-gateway (Python)                   │
│  ┌───────────────────────────────────────────────────┐  │
│  │ Providers: DeepSeek | Gemini | OpenAI | Anthropic │  │
│  │ Features: Auto-fallback, Cost tracking, Retries   │  │
│  │ Endpoints: /plan, /classify, /embed, /health      │  │
│  └───────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────┘
```

## Contributing

See [CONTRIBUTING.md](CONTRIBUTING.md).

## License

MIT. See [LICENSE](LICENSE).
