api7
diff --git a/docs/internals/llm-types.md b/docs/internals/llm-types.md
Lines changed: 84 additions & 0 deletions
diff --git a/src/gateway/error.rs b/src/gateway/error.rs
Lines changed: 164 additions & 0 deletions
diff --git a/src/gateway/mod.rs b/src/gateway/mod.rs
Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,84 @@
+# LLM Types Deep Dive
+
+This document describes the Layer 1 type system for the LLM subsystem. It covers the wire models for the supported chat APIs plus the shared metadata and error types used by later bridge and provider code.
+
+## Namespace boundaries
+
+The OpenAI namespace owns both Chat Completions and Responses. Both are APIs from the same vendor, so the Responses types sit beneath the OpenAI module rather than beside the Anthropic types as a top-level peer.
+
+## OpenAI Chat as the hub format
+
+The types in `gateway::types::openai` are both the OpenAI Chat Completions wire models and the internal hub format used by the N:M bridge architecture.
+
+Three choices matter here:
+
+- `ChatCompletionRequest.extra` uses `#[serde(flatten)]` so provider-specific fields can survive round-trips without polluting the hub model.
+- `MessageContent` is untagged because OpenAI accepts either a plain string or a structured content array.
+- `StopCondition` is untagged because the API accepts either a single stop string or a list.
+
+That keeps the hub representation close to the public OpenAI schema while still being permissive enough for bridge code.
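To make the "single string or list" shape concrete, here is a minimal sketch of a two-variant stop type and a helper that normalizes it for bridge code. The variant names and the `into_vec` helper are assumptions for illustration, not the PR's actual definitions, and the serde derives (including `#[serde(untagged)]`) are left out so the snippet compiles with the standard library alone.

```rust
/// Illustrative stand-in for `StopCondition`; variant names are assumed.
/// The real type would additionally derive serde traits with `#[serde(untagged)]`.
#[derive(Debug, Clone, PartialEq)]
pub enum StopCondition {
    /// Wire form: `"stop": "END"`
    Single(String),
    /// Wire form: `"stop": ["END", "STOP"]`
    Many(Vec<String>),
}

impl StopCondition {
    /// Hypothetical helper: collapse both shapes into one list so bridge
    /// code only has to handle a single representation.
    pub fn into_vec(self) -> Vec<String> {
        match self {
            StopCondition::Single(s) => vec![s],
            StopCondition::Many(v) => v,
        }
    }
}
```

Keeping both variants in the hub type preserves whichever shape the client sent, while a helper like this lets a bridge treat them uniformly when the target format only accepts a list.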
+
+## OpenAI Responses under the OpenAI namespace
+
+The Responses API types now live in `gateway::types::openai::responses`.
+
+That module models the parts that are specific to Responses rather than Chat Completions:
+
+- polymorphic input (`ResponsesInput`)
+- built-in OpenAI tools (`ResponsesTool`)
+- richer output items (`ResponsesOutputItem`)
+- fine-grained SSE event types (`ResponsesApiStreamEvent`)
+
+Keeping these types under the OpenAI namespace avoids presenting `responses` as a top-level peer alongside provider-agnostic or vendor-level modules.
+
+## Anthropic message models
+
+`gateway::types::anthropic` stays separate because it describes a different vendor protocol.
+
+Its main differences from the hub format are:
+
+- system prompts are top-level, not embedded in the message list
+- content blocks are internally tagged by `type`
+- streaming uses event-specific records instead of a single chunk envelope
+- prompt caching metadata is part of the native schema
+
+## Shared bridge metadata
+
+`gateway::types::common::BridgeContext` carries information that cannot be represented cleanly in the hub request alone.
+
+It currently has three buckets:
+
+- `anthropic_messages_extras` for Anthropic Messages-specific request data
+- `openai_responses_extras` for OpenAI Responses-specific state such as `previous_response_id`
+- `passthrough` for arbitrary provider-specific values
+
+This lets future `to_hub()` implementations return a normalized hub request without losing format-specific data that must be restored later.
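The round trip described above could look roughly like the following. This is a simplified sketch under stated assumptions: `ResponsesRequest`, `HubRequest`, `ResponsesExtras`, `to_hub`, and `from_hub` are hypothetical toy definitions, the Anthropic bucket is omitted, and `passthrough` is assumed to be string-valued so the snippet needs only the standard library.

```rust
use std::collections::HashMap;

/// Hypothetical Responses-specific state; only `previous_response_id`
/// is named in the text above.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct ResponsesExtras {
    pub previous_response_id: Option<String>,
}

/// Simplified stand-in for `BridgeContext` (Anthropic bucket omitted).
#[derive(Debug, Clone, Default, PartialEq)]
pub struct BridgeContext {
    pub openai_responses_extras: Option<ResponsesExtras>,
    /// Assumed to be string-valued for this sketch.
    pub passthrough: HashMap<String, String>,
}

/// Toy Responses-style request.
#[derive(Debug, Clone, PartialEq)]
pub struct ResponsesRequest {
    pub model: String,
    pub input: String,
    pub previous_response_id: Option<String>,
}

/// Toy hub request: it has no slot for `previous_response_id`.
#[derive(Debug, Clone, PartialEq)]
pub struct HubRequest {
    pub model: String,
    pub user_message: String,
}

/// Normalize into the hub shape, parking format-specific data in the context.
pub fn to_hub(req: ResponsesRequest) -> (HubRequest, BridgeContext) {
    let ctx = BridgeContext {
        openai_responses_extras: Some(ResponsesExtras {
            previous_response_id: req.previous_response_id,
        }),
        passthrough: HashMap::new(),
    };
    let hub = HubRequest {
        model: req.model,
        user_message: req.input,
    };
    (hub, ctx)
}

/// Rebuild the original format, restoring what the hub could not carry.
pub fn from_hub(hub: HubRequest, ctx: &BridgeContext) -> ResponsesRequest {
    ResponsesRequest {
        model: hub.model,
        input: hub.user_message,
        previous_response_id: ctx
            .openai_responses_extras
            .as_ref()
            .and_then(|extras| extras.previous_response_id.clone()),
    }
}
```

The point of the sketch is that the hub request stays free of Responses-only fields, yet nothing is lost because the context travels alongside it.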
+
+## Unified usage accounting
+
+`gateway::types::common::Usage` is intentionally sparse: every field is optional.
+
+That matches real provider behavior. Some providers report only prompt tokens, some only final totals, and some stream usage late. `Usage::merge()` therefore follows two rules:
+
+- overwrite only fields that are present in the incoming value
+- derive `total_tokens` only when it was not explicitly provided and both prompt and completion counts exist
+
+The tests cover overwrite behavior, derived totals, and preservation of explicit totals.
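The two merge rules can be sketched as follows. This is not the PR's implementation: the field names `prompt_tokens` and `completion_tokens` are assumed, and the sketch takes "not explicitly provided" to mean that `total_tokens` is still empty after the field-wise overwrite.

```rust
/// Hypothetical stand-in for `gateway::types::common::Usage`.
/// Every field is optional, as described above; field names are assumed.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct Usage {
    pub prompt_tokens: Option<u64>,
    pub completion_tokens: Option<u64>,
    pub total_tokens: Option<u64>,
}

impl Usage {
    /// Merge a later usage report into this one.
    pub fn merge(&mut self, incoming: &Usage) {
        // Rule 1: overwrite only the fields the incoming value carries.
        self.prompt_tokens = incoming.prompt_tokens.or(self.prompt_tokens);
        self.completion_tokens = incoming.completion_tokens.or(self.completion_tokens);
        self.total_tokens = incoming.total_tokens.or(self.total_tokens);

        // Rule 2: derive the total only when none was provided and both
        // component counts are known.
        if self.total_tokens.is_none() {
            if let (Some(prompt), Some(completion)) =
                (self.prompt_tokens, self.completion_tokens)
            {
                self.total_tokens = Some(prompt + completion);
            }
        }
    }
}
```

With this shape, a provider that reports prompt tokens early and completion tokens late ends up with a derived total, while a provider that sends its own total is never overridden.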
+
+## GatewayError
+
+`gateway::error::GatewayError` is the common error surface for the LLM subsystem.
+
+It separates four concerns:
+
+- client-side request problems (`Validation`, `Bridge`, `NativeNotSupported`)
+- data conversion problems (`Transform`)
+- upstream/provider failures (`Provider`, `Http`)
+- stream lifecycle failures (`Stream`)
+
+Two helper methods make this usable from higher layers:
+
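+- `is_retryable()` centralizes retry policy: `Provider` errors defer to their `retryable` flag, `Http` errors retry only on timeouts and connection failures, `Stream` errors always retry, and all other variants never retry
+- `status_code()` maps failures to proxy-facing HTTP status codes: 400 for `Validation` and `Bridge`, 422 for `Transform`, 501 for `NativeNotSupported`, 502 for `Http` and `Stream`, and the upstream status for `Provider`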
+
+This keeps the later LLM runtime and proxy code from duplicating provider error classification logic.
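As an illustration of how a higher layer might lean on a single retry predicate, here is a minimal sketch of a retry loop. The `Retryable` trait, `DemoError`, and `with_retries` are hypothetical names introduced for this example; the real caller would consult `GatewayError::is_retryable()` directly, and backoff between attempts is deliberately left out.

```rust
/// Hypothetical trait standing in for `GatewayError::is_retryable()`.
pub trait Retryable {
    fn is_retryable(&self) -> bool;
}

/// Toy error with one retryable and one non-retryable case.
#[derive(Debug, PartialEq)]
pub enum DemoError {
    Stream(String),
    Validation(String),
}

impl Retryable for DemoError {
    fn is_retryable(&self) -> bool {
        matches!(self, DemoError::Stream(_))
    }
}

/// Call `op` until it succeeds, returns a non-retryable error, or
/// `max_attempts` is reached. `op` always runs at least once and
/// receives the 1-based attempt number.
pub fn with_retries<T, E, F>(max_attempts: u32, mut op: F) -> Result<T, E>
where
    E: Retryable,
    F: FnMut(u32) -> Result<T, E>,
{
    let mut attempt = 1;
    loop {
        match op(attempt) {
            Ok(value) => return Ok(value),
            // Retry only while attempts remain and the error allows it.
            Err(e) if attempt < max_attempts && e.is_retryable() => attempt += 1,
            Err(e) => return Err(e),
        }
    }
}
```

Because the loop asks the error itself whether a retry makes sense, it needs no knowledge of individual variants or provider status codes.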
@@ -0,0 +1,164 @@
+//! Gateway error types.
+//!
+//! `GatewayError` is the unified error type for the gateway SDK layer
+//! (Layer 1-3). It covers validation errors, format bridging errors,
+//! provider HTTP errors, and stream errors. Each variant carries enough
+//! context for the proxy layer to produce an appropriate HTTP response.
+
+use http::StatusCode;
+use serde_json::Value;
+
+/// Unified error type for the gateway SDK.
+#[derive(Debug, thiserror::Error)]
+pub enum GatewayError {
+    // ── Client errors (not retryable) ──
+    /// Request validation failed (e.g., missing required field).
+    #[error("validation: {0}")]
+    Validation(String),
+
+    /// Format bridging failed (e.g., cannot map an Anthropic field to hub format).
+    #[error("format bridge: {0}")]
+    Bridge(String),
+
+    /// Data transformation failed (e.g., JSON deserialization of provider response).
+    #[error("data transform: {0}")]
+    Transform(String),
+
+    /// The requested format is not natively supported by the provider.
+    #[error("format not natively supported by provider {provider}")]
+    NativeNotSupported { provider: String },
+
+    // ── Provider errors (may be retryable) ──
+    /// The upstream provider returned an error response.
+    #[error("provider {provider} returned {status}: {body}")]
+    Provider {
+        status: StatusCode,
+        body: Value,
+        provider: String,
+        retryable: bool,
+    },
+
+    // ── Infrastructure errors (usually retryable) ──
+    /// HTTP transport error (connection, timeout, etc.).
+    #[error("HTTP: {0}")]
+    Http(#[source] reqwest::Error),
+
+    /// Error during stream processing.
+    #[error("stream: {0}")]
+    Stream(String),
+}
+
+impl GatewayError {
+    /// Whether this error is safe to retry.
+    pub fn is_retryable(&self) -> bool {
+        match self {
+            Self::Provider { retryable, .. } => *retryable,
+            Self::Http(e) => e.is_timeout() || e.is_connect(),
+            Self::Stream(_) => true,
+            _ => false,
+        }
+    }
+
+    /// Map to an HTTP status code for proxy-layer responses.
+    pub fn status_code(&self) -> StatusCode {
+        match self {
+            Self::Validation(_) | Self::Bridge(_) => StatusCode::BAD_REQUEST,
+            Self::Transform(_) => StatusCode::UNPROCESSABLE_ENTITY,
+            Self::Provider { status, .. } => *status,
+            Self::Http(_) | Self::Stream(_) => StatusCode::BAD_GATEWAY,
+            Self::NativeNotSupported { .. } => StatusCode::NOT_IMPLEMENTED,
+        }
+    }
+}
+
+/// Convenience alias for gateway results.
+pub type Result<T> = std::result::Result<T, GatewayError>;
+
+#[cfg(test)]
+mod tests {
+    use serde_json::json;
+
+    use super::*;
+
+    #[test]
+    fn validation_not_retryable() {
+        let e = GatewayError::Validation("missing field".into());
+        assert!(!e.is_retryable());
+        assert_eq!(e.status_code(), StatusCode::BAD_REQUEST);
+    }
+
+    #[test]
+    fn bridge_not_retryable() {
+        let e = GatewayError::Bridge("cannot map field X".into());
+        assert!(!e.is_retryable());
+        assert_eq!(e.status_code(), StatusCode::BAD_REQUEST);
+    }
+
+    #[test]
+    fn transform_not_retryable() {
+        let e = GatewayError::Transform("bad json".into());
+        assert!(!e.is_retryable());
+        assert_eq!(e.status_code(), StatusCode::UNPROCESSABLE_ENTITY);
+    }
+
+    #[test]
+    fn native_not_supported() {
+        let e = GatewayError::NativeNotSupported {
+            provider: "gemini".into(),
+        };
+        assert!(!e.is_retryable());
+        assert_eq!(e.status_code(), StatusCode::NOT_IMPLEMENTED);
+        assert!(e.to_string().contains("gemini"));
+    }
+
+    #[test]
+    fn provider_retryable_when_flagged() {
+        let e = GatewayError::Provider {
+            status: StatusCode::TOO_MANY_REQUESTS,
+            body: json!({"error": "rate limited"}),
+            provider: "openai".into(),
+            retryable: true,
+        };
+        assert!(e.is_retryable());
+        assert_eq!(e.status_code(), StatusCode::TOO_MANY_REQUESTS);
+    }
+
+    #[test]
+    fn provider_not_retryable_when_not_flagged() {
+        let e = GatewayError::Provider {
+            status: StatusCode::BAD_REQUEST,
+            body: json!({"error": "bad request"}),
+            provider: "anthropic".into(),
+            retryable: false,
+        };
+        assert!(!e.is_retryable());
+        assert_eq!(e.status_code(), StatusCode::BAD_REQUEST);
+    }
+
+    #[test]
+    fn stream_error_retryable() {
+        let e = GatewayError::Stream("connection reset".into());
+        assert!(e.is_retryable());
+        assert_eq!(e.status_code(), StatusCode::BAD_GATEWAY);
+    }
+
+    #[test]
+    fn display_messages() {
+        assert_eq!(
+            GatewayError::Validation("x".into()).to_string(),
+            "validation: x"
+        );
+        assert_eq!(
+            GatewayError::Bridge("y".into()).to_string(),
+            "format bridge: y"
+        );
+        let provider_err = GatewayError::Provider {
+            status: StatusCode::INTERNAL_SERVER_ERROR,
+            body: json!("err"),
+            provider: "openai".into(),
+            retryable: false,
+        };
+        assert!(provider_err.to_string().contains("openai"));
+        assert!(provider_err.to_string().contains("500"));
+    }
+}
@@ -0,0 +1,2 @@
+pub mod error;
+pub mod types;