Provider-agnostic LLM abstraction layer. Normalizes Claude (Anthropic), OpenAI-compatible (OpenAI, OpenRouter, Groq, etc.), Gemini, Cloudflare Workers AI, and any other HTTP-based LLM provider behind a single LLMAdapter.complete(request) -> LLMResponse interface.
- Single interface — all adapters implement the
LLMAdapterProtocol:complete(LLMRequest) -> LLMResponse. - Retry on transient errors — all adapters use
_call_with_retrywith a configurableLLMRetryConfig. Default: exponential backoff on 429 / 5xx. - Streaming — when
LLMRequest.stream=True, responses are streamed via SSE. Theon_chunkcallback receives each text chunk incrementally. - Tool calls — tool call blocks are extracted from provider-specific response shapes and normalized into
LLMToolCalldataclasses. - Safety blocks — Gemini safety blocks are surfaced via
LLMSafetyBlockinLLMResponse.safety. Other providers returnNone. - Cost estimation —
LLMResponse.estimated_cost_centscomputes a best-effort cost using token counts and provider pricing from_config.py. - Workers AI quirks —
WorkersAIAdapterstrips@cf/model prefix for AI Gateway, removesresponse_formatto avoid 500s, and can disable tools entirely.
| Provider | Adapter class | Protocol |
|---|---|---|
claude |
ClaudeAdapter |
Anthropic Messages API |
openai, openrouter, groq, etc. |
OpenAICompatibleAdapter |
OpenAI Chat Completions |
gemini |
GeminiAdapter |
Gemini generateContent |
workers-ai |
WorkersAIAdapter |
Cloudflare Workers AI (OAI-compat) |
LLMRequest
→ adapter._prepare_payload() [provider-specific JSON]
→ transport.post_json() [HTTP POST]
→ _extract_*_content() [text extraction]
→ _extract_*_tool_calls() [tool call normalization]
→ LLMResponse
LLMResponse.contentis always a string (may be empty if only tool calls returned).LLMResponse.tool_callsis always a list (may be empty).ProviderConfig.resolved_api_key()raisesLLMConfigurationErrorif env var not set.ProviderConfig.resolved_model()always returns a string (falls back todefault_model).