A drop-in dual API proxy that serves both OpenAI-compatible and Anthropic-compatible endpoints,
powered by the Claude Code CLI and your Claude Max subscription. No API key required.
Any tool, library, or application that speaks the OpenAI or Anthropic API can now use Claude models locally.
Quick Start • Usage Examples • Configuration • API Reference
┌──────────────────────────┐
OpenAI SDK clients ────▶│ /openai/v1/* │
│ │ ┌─────────────────┐
│ FastAPI Proxy │────▶│ Claude Code │
│ (server.py) │ │ CLI (claude -p)│
│ Port 4000 │◀────│ Max Sub Auth │
Anthropic SDK clients ─▶│ │ └─────────────────┘
│ /anthropic/v1/* │
└──────────────────────────┘
- Your app sends a standard API request to the proxy (OpenAI or Anthropic format)
- The proxy translates it and calls claude -p (pipe mode) with your Max subscription auth
- Claude's response is formatted back into the correct API response schema
- Your app receives a response identical to what the real API would return
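The translation step above can be sketched in a few lines. This is an illustrative, hypothetical simplification (the function names and the role-label format are assumptions, not the actual server.py code, which also handles tools, streaming, and response-schema formatting):

```python
import subprocess

# Illustrative sketch: flatten an OpenAI-style message list into a single
# prompt string suitable for piping into `claude -p`.
def messages_to_prompt(messages):
    labels = {"system": "System", "user": "Human", "assistant": "Assistant"}
    return "\n\n".join(
        f"{labels.get(m['role'], m['role'])}: {m['content']}" for m in messages
    )

def call_claude(prompt):
    # Pipe the prompt into the CLI, as the proxy does once per request.
    return subprocess.run(
        ["claude", "-p"], input=prompt, capture_output=True, text=True
    ).stdout

prompt = messages_to_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
])
```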
Shared across both APIs: Real streaming via SSE • All three Claude tiers (Opus, Sonnet, Haiku) • Multi-turn conversations • CORS enabled • Graceful shutdown • Multi-worker support • Concurrency limiting with 429 backpressure • Request ID tracing • Health check with CLI verification
| Requirement | Details |
|---|---|
| Claude Code CLI | Install and authenticate with claude auth login |
| Claude Max | Active subscription |
| Python | 3.10 or later |
| Linux | With systemd (for auto-start — the proxy itself runs anywhere Python runs) |
git clone https://github.com/RandomSynergy17/claudeMax-OpenAI-HTTP-Proxy.git
cd claudeMax-OpenAI-HTTP-Proxy
chmod +x install.sh
./install.sh

The installer creates a virtual environment, installs dependencies, configures a systemd user service, and starts the proxy automatically.
git clone https://github.com/RandomSynergy17/claudeMax-OpenAI-HTTP-Proxy.git
cd claudeMax-OpenAI-HTTP-Proxy
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
python server.py --port 4000 --host 0.0.0.0

curl http://localhost:4000/health

from openai import OpenAI
client = OpenAI(
base_url="http://192.168.x.x:4000/openai/v1",
api_key="not-needed"
)
response = client.chat.completions.create(
model="claude-sonnet-4-6",
messages=[
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Hello!"}
]
)
print(response.choices[0].message.content)

import anthropic
client = anthropic.Anthropic(
base_url="http://192.168.x.x:4000/anthropic",
api_key="not-needed"
)
message = client.messages.create(
model="claude-sonnet-4-6",
max_tokens=1024,
system="You are a helpful assistant.",
messages=[{"role": "user", "content": "Hello!"}]
)
print(message.content[0].text)

Streaming (OpenAI)
stream = client.chat.completions.create(
model="claude-sonnet-4-6",
messages=[{"role": "user", "content": "Tell me a story"}],
stream=True
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

Streaming (Anthropic)
with client.messages.stream(
model="claude-sonnet-4-6",
max_tokens=1024,
messages=[{"role": "user", "content": "Tell me a story"}]
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)

Tool / Function Calling (OpenAI)
response = client.chat.completions.create(
model="claude-sonnet-4-6",
messages=[{"role": "user", "content": "What's the weather in Paris?"}],
tools=[{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get current weather for a location",
"parameters": {
"type": "object",
"properties": {"location": {"type": "string"}},
"required": ["location"]
}
}
}]
)

Tool Use (Anthropic)
message = client.messages.create(
model="claude-sonnet-4-6",
max_tokens=1024,
tools=[{
"name": "get_weather",
"description": "Get current weather for a location",
"input_schema": {
"type": "object",
"properties": {"location": {"type": "string"}},
"required": ["location"]
}
}],
messages=[{"role": "user", "content": "What's the weather in Paris?"}]
)

curl
# OpenAI format
curl http://localhost:4000/openai/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{"model":"claude-sonnet-4-6","messages":[{"role":"user","content":"Hello!"}]}'
# Anthropic format
curl http://localhost:4000/anthropic/v1/messages \
-H "Content-Type: application/json" \
-H "x-api-key: not-needed" \
-H "anthropic-version: 2023-06-01" \
  -d '{"model":"claude-sonnet-4-6","max_tokens":1024,"messages":[{"role":"user","content":"Hello!"}]}'

JavaScript / TypeScript
// OpenAI
import OpenAI from "openai";
const openai = new OpenAI({ baseURL: "http://host:4000/openai/v1", apiKey: "na" });
const resp = await openai.chat.completions.create({
model: "claude-sonnet-4-6",
messages: [{ role: "user", content: "Hi" }]
});
// Anthropic
import Anthropic from "@anthropic-ai/sdk";
const anthropic = new Anthropic({ baseURL: "http://host:4000/anthropic", apiKey: "na" });
const msg = await anthropic.messages.create({
model: "claude-sonnet-4-6",
max_tokens: 1024,
messages: [{ role: "user", content: "Hi" }]
});

| Tool | Configuration |
|---|---|
| LangChain (OpenAI) | ChatOpenAI(base_url="http://host:4000/openai/v1", api_key="na") |
| LangChain (Anthropic) | ChatAnthropic(base_url="http://host:4000/anthropic", api_key="na") |
| LlamaIndex | OpenAI(api_base="http://host:4000/openai/v1", api_key="na") |
| Continue (VS Code) | Set API base to http://host:4000/openai/v1 |
| Open WebUI | Add as OpenAI provider with http://host:4000/openai/v1 |
| Cursor | Set OpenAI base URL to http://host:4000/openai/v1 |
| Endpoint | Method | Status |
|---|---|---|
| /openai/v1/chat/completions | POST | Full support (streaming, tools, system messages) |
| /openai/v1/completions | POST | Full support (streaming + non-streaming) |
| /openai/v1/models | GET | Lists Claude models + OpenAI aliases |
| /openai/v1/models/{id} | GET | Retrieve individual model |
| /openai/v1/embeddings | POST | 501 — Not available |
| /openai/v1/images/* | POST | 501 — Not available |
| /openai/v1/audio/* | POST | 501 — Not available |
| /openai/v1/fine_tuning/jobs | POST | 501 — Not available |
| /openai/v1/moderations | POST | 501 — Not available |
| Endpoint | Method | Status |
|---|---|---|
| /anthropic/v1/messages | POST | Full support (streaming, tools, system, stop_sequences) |
| /anthropic/v1/models | GET | Lists all Claude models |
| /anthropic/v1/models/{id} | GET | Retrieve individual model |
| /anthropic/v1/messages/count_tokens | POST | Token count estimation |
| Endpoint | Description |
|---|---|
| GET /health | Health check — verifies CLI availability, reports active/max concurrent requests, CLI version |
| GET / | Service info and endpoint directory |
Use Claude model names directly on both APIs. The OpenAI surface also accepts these aliases:
| OpenAI Alias | Routes To | Tier |
|---|---|---|
| gpt-4, gpt-4-turbo, gpt-4o | claude-sonnet-4-6 | Balanced |
| o1-mini, o3-mini, o4-mini | claude-sonnet-4-6 | Balanced |
| gpt-4o-mini, gpt-3.5-turbo | claude-haiku-4-5 | Fast |
| o1, o1-preview, o3 | claude-opus-4-6 | Most capable |
All models: 200K context window • 32K max output tokens
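The alias routing in the table above amounts to a lookup before dispatch. A sketch of the idea (the mapping is reproduced from the table; the function and dict names are illustrative, not the actual server.py code):

```python
# Alias table reproduced from above; unknown names pass through unchanged.
OPENAI_ALIASES = {
    "gpt-4": "claude-sonnet-4-6",
    "gpt-4-turbo": "claude-sonnet-4-6",
    "gpt-4o": "claude-sonnet-4-6",
    "o1-mini": "claude-sonnet-4-6",
    "o3-mini": "claude-sonnet-4-6",
    "o4-mini": "claude-sonnet-4-6",
    "gpt-4o-mini": "claude-haiku-4-5",
    "gpt-3.5-turbo": "claude-haiku-4-5",
    "o1": "claude-opus-4-6",
    "o1-preview": "claude-opus-4-6",
    "o3": "claude-opus-4-6",
}

def resolve_model(name: str) -> str:
    # Claude model names used directly are returned as-is.
    return OPENAI_ALIASES.get(name, name)
```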
python server.py [OPTIONS]

| Flag | Default | Description |
|---|---|---|
| --port | 4000 | Port to listen on |
| --host | 0.0.0.0 | Host to bind to (0.0.0.0 = all interfaces) |
| --timeout | 300 | CLI subprocess timeout in seconds |
| --max-concurrent | 10 | Max concurrent requests per worker |
| --workers | 1 | Number of uvicorn workers |
| --log-level | INFO | Log level (DEBUG, INFO, WARNING, ERROR) |
| --no-access-log | — | Disable uvicorn access log |
Environment variables can be set via shell export, in a .env file in the project directory, or in a systemd EnvironmentFile. The proxy loads .env automatically on startup using python-dotenv.
To get started, copy the template:
cp .env.example .env
# Edit .env with your settings| Variable | Default | Description |
|---|---|---|
CLAUDE_PROXY_API_KEY |
(unset) | API key for authentication. When set, all requests must provide this key via Authorization: Bearer <key> (OpenAI) or x-api-key: <key> (Anthropic). When unset, no auth is required. |
CLAUDE_PROXY_LOG_LEVEL |
INFO |
Log level (alternative to --log-level) |
CLAUDE_PROXY_CORS_ORIGINS |
* |
Comma-separated allowed CORS origins |
CLAUDE_PROXY_MAX_BODY_BYTES |
10485760 |
Max request body size (10 MB) |
CLAUDE_PROXY_ACCESS_LOG |
true |
Enable/disable access log |
CLAUDE_PROXY_DRAIN_SECONDS |
5 |
Graceful shutdown drain period |
For higher throughput, run with multiple workers:
python server.py --workers 4 --max-concurrent 5

Each worker gets its own concurrency pool.
Total capacity = workers × max-concurrent (e.g., 4 × 5 = 20 concurrent requests).
When all concurrency slots are full, the proxy returns HTTP 429 with a rate_limit_exceeded error. Clients should back off and retry.
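A client-side backoff loop can be as simple as the sketch below. It is SDK-agnostic and assumes the raised exception exposes a status_code attribute (as the OpenAI and Anthropic Python SDKs do for rate-limit errors); the helper name and defaults are illustrative:

```python
import random
import time

def with_backoff(call, max_retries=5, base_delay=0.5):
    """Retry `call` on HTTP 429 with exponential backoff plus jitter."""
    for attempt in range(max_retries):
        try:
            return call()
        except Exception as exc:
            status = getattr(exc, "status_code", None)
            if status != 429 or attempt == max_retries - 1:
                raise  # not a rate limit, or out of retries
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.25))

# Usage: with_backoff(lambda: client.chat.completions.create(...))
```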
./install.sh

The script handles venv creation, dependency install, systemd service setup, and boot persistence.
mkdir -p ~/.config/systemd/user
cp claude-proxy.service ~/.config/systemd/user/
# Edit paths in the service file if needed
systemctl --user daemon-reload
systemctl --user enable claude-proxy.service
systemctl --user start claude-proxy.service
loginctl enable-linger $USER

systemctl --user status claude-proxy    # Check status
systemctl --user restart claude-proxy # Restart
systemctl --user stop claude-proxy # Stop
journalctl --user -u claude-proxy -f    # Tail logs

cd claudeMax-OpenAI-HTTP-Proxy
git pull
cp server.py ~/claude-proxy/server.py
systemctl --user restart claude-proxy

| Problem | Solution |
|---|---|
| 502 errors | Check claude auth login and subscription status. Test: echo "hi" \| claude -p |
| 429 errors | Server at capacity. Increase --max-concurrent or --workers, or retry with backoff |
| Connection refused from LAN | Ensure --host 0.0.0.0. Check firewall: sudo ufw allow 4000/tcp |
| Service won't start at boot | Run loginctl enable-linger $USER |
| Port already in use | fuser -k 4000/tcp && systemctl --user restart claude-proxy |
| High latency | Normal ~1–2s overhead per subprocess. Use Haiku for faster responses |
| Rate limiting | Bound by Claude Max subscription limits. Reduce concurrent requests |
| Limitation | Detail |
|---|---|
| Subprocess latency | ~1–2s overhead per request (subprocess spawn) |
| Rate limits | Bound by Claude Max subscription limits |
| No embeddings | Claude does not provide embedding models |
| No image generation | Claude does not generate images |
| No audio | Claude does not provide TTS/STT |
| Image input | Image URLs/base64 noted in prompt but not passed through to Claude |
| Tool calling | Implemented via prompt injection — works well but not native |
| n > 1 | Only n=1 supported (single completion per request) |
| Ignored parameters | OpenAI params like temperature, top_p, max_tokens are accepted but not forwarded to Claude CLI |
| Feature | Detail |
|---|---|
| API key auth | Optional. Set CLAUDE_PROXY_API_KEY to require authentication. Clients must send the key via Authorization: Bearer (OpenAI) or x-api-key (Anthropic). Uses timing-safe comparison. /health and / are exempt. When unset, all requests are accepted (auth via Claude Max subscription). |
| No filesystem access | Claude Code tools are disabled (--tools ""). Claude cannot access your filesystem |
| CORS | Defaults to * for LAN convenience. Restrict via CLAUDE_PROXY_CORS_ORIGINS |
| Request size | 10 MB default, configurable via CLAUDE_PROXY_MAX_BODY_BYTES |
| Tool sanitization | Tool names and descriptions are sanitized to prevent prompt injection |
| Error sanitization | CLI error messages have file paths stripped before returning to clients |
| Request tracing | Every request gets an X-Request-ID header |
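The timing-safe comparison mentioned in the table is typically done with Python's hmac.compare_digest. A minimal sketch of the idea (illustrative, not the actual server.py implementation):

```python
import hmac

def key_matches(provided: str, expected: str) -> bool:
    # compare_digest takes time independent of where the strings first
    # differ, which prevents timing attacks against the API key check.
    return hmac.compare_digest(provided.encode(), expected.encode())
```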
Report a Bug • Request a Feature • Contribute
Made with Claude Code CLI