
Commit 3be76a4

feat(provider): add new provider gateway traits (#16)
1 parent 7c9ab68 commit 3be76a4

8 files changed

Lines changed: 866 additions & 3 deletions

File tree

docs/internals/llm-types.md

Lines changed: 69 additions & 1 deletion
@@ -1,4 +1,4 @@
-# LLM Types Deep Dive
+# LLM Type and Trait System

This document describes the Layer 1 type system for the LLM subsystem. It covers the wire models for the supported chat APIs plus the shared metadata and error types used by later bridge and provider code.

@@ -82,3 +82,71 @@ Two helper methods make this usable from higher layers:

- `status_code()` maps failures to proxy-facing HTTP status codes

This keeps the later LLM runtime and proxy code from duplicating provider error classification logic.
## Trait stack

The type layer and the trait layer are tightly coupled, so they are documented together here.

Two independent axes shape the design:

- API format semantics such as OpenAI Chat, Anthropic Messages, and OpenAI Responses
- provider-specific behavior such as endpoint shape, auth headers, request transforms, and native bypasses

`ChatFormat` models the first axis. `ProviderMeta`, `ChatTransform`, and `ProviderCapabilities` model the second.
### `ChatFormat`

`ChatFormat` defines a complete external chat protocol:

- request type
- non-streaming response type
- streaming chunk type
- bridge rules to and from the hub format

The hub format remains OpenAI Chat. Every format therefore explains how to:

- convert its request into a hub request plus `BridgeContext`
- convert a hub response back into its own response type
- convert a hub stream into its own stream events

The trait also includes a native escape hatch for providers that can serve the source format directly.
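As a sketch of that bridge contract — with placeholder `String` types and illustrative names (`MiniChatFormat`, `MiniBridgeContext`) rather than the real trait, which appears later in this commit — a format converts its request into a hub request plus side-channel context, then uses that context to map the hub response back:

```rust
// Simplified bridge contract: to_hub produces the hub request plus a
// side-channel context; from_hub uses that context on the way back.
// All names and types here are placeholders, not the crate's API.
struct MiniBridgeContext {
    original_model: String,
}

trait MiniChatFormat {
    type Request;
    type Response;

    fn to_hub(req: &Self::Request) -> (String, MiniBridgeContext);
    fn from_hub(hub_resp: &str, ctx: &MiniBridgeContext) -> Self::Response;
}

struct AliasedFormat;

impl MiniChatFormat for AliasedFormat {
    type Request = String; // "model|prompt" for this sketch
    type Response = String;

    fn to_hub(req: &Self::Request) -> (String, MiniBridgeContext) {
        let (model, prompt) = req.split_once('|').unwrap_or(("default", req.as_str()));
        // Remember the caller's model name so the response can echo it back.
        (
            prompt.to_string(),
            MiniBridgeContext { original_model: model.to_string() },
        )
    }

    fn from_hub(hub_resp: &str, ctx: &MiniBridgeContext) -> Self::Response {
        format!("{}: {}", ctx.original_model, hub_resp)
    }
}
```

The side channel is the important part: anything the hub format cannot represent rides in the context instead of being lost.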
### Explicit stream state

Two stream-state associated types are explicit:

- `BridgeState` for hub-to-format conversion
- `NativeStreamState` for provider-native streaming conversion

That keeps stream state typed and local to the format implementation instead of hiding it behind erased containers.

Hub stream state also keeps partially assembled tool calls keyed by `(choice_index, tool_call_index)`, because tool call indices are scoped to a streamed choice rather than globally unique across the whole chunk stream.
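A minimal sketch of why the key needs both components (the `accumulate` helper and its map type are illustrative, not the real `ChatStreamState`):

```rust
use std::collections::HashMap;

/// Accumulate streamed tool-call argument fragments. Keying by
/// (choice_index, tool_call_index) keeps two choices' calls separate even
/// though each choice's stream restarts its tool_call_index at 0.
fn accumulate(
    acc: &mut HashMap<(u32, usize), String>,
    choice_index: u32,
    tool_call_index: usize,
    fragment: &str,
) {
    acc.entry((choice_index, tool_call_index))
        .or_default()
        .push_str(fragment);
}
```

Keying by `tool_call_index` alone would interleave fragments from different choices into one corrupted argument string.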
### Provider layering

The provider side is split into three layers.

`ProviderMeta` contains stable metadata such as provider name, default base URL, endpoint path, stream reader kind, and auth header construction.

`ChatTransform` contains hub-to-provider request and response mapping. Its default behavior is intentionally OpenAI-compatible: serialize the request, apply `CompatQuirks`, deserialize the response, and treat SSE `data:` lines as OpenAI-style chunks.
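The default SSE treatment can be pictured as plain line handling, assuming the usual OpenAI-style `data: <json>` framing with a `data: [DONE]` terminator (a standalone simplification, not the crate's actual stream reader):

```rust
/// Extract JSON payloads from raw SSE text the way OpenAI-compatible
/// streams are typically framed: each event is a `data: <json>` line,
/// blank lines separate events, and `data: [DONE]` ends the stream.
fn sse_data_payloads(raw: &str) -> Vec<&str> {
    raw.lines()
        .filter_map(|line| line.strip_prefix("data: "))
        .take_while(|payload| *payload != "[DONE]")
        .collect()
}
```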
`ProviderCapabilities` is capability discovery. It returns typed trait objects such as `as_native_anthropic_messages()` and `as_native_openai_responses()` instead of booleans, so a provider cannot claim support for a feature without also exposing the methods behind that feature.
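The booleans-versus-trait-objects point can be sketched with hypothetical simplified traits (`Capabilities`, `NativeMessages` stand in for the real ones):

```rust
/// A capability is advertised by handing back a typed trait object rather
/// than a boolean, so claiming support and exposing the methods are the
/// same act and cannot drift apart.
trait NativeMessages {
    fn endpoint(&self) -> &'static str;
}

trait Capabilities {
    /// Default: no native support.
    fn as_native_messages(&self) -> Option<&dyn NativeMessages> {
        None
    }
}

struct Plain;
impl Capabilities for Plain {}

struct Native;
impl NativeMessages for Native {
    fn endpoint(&self) -> &'static str {
        "/v1/messages"
    }
}
impl Capabilities for Native {
    fn as_native_messages(&self) -> Option<&dyn NativeMessages> {
        Some(self)
    }
}
```

With a `supports_native_messages() -> bool` design, a provider could return `true` while forgetting the endpoint method; returning `Option<&dyn NativeMessages>` makes that state unrepresentable.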
### Native support traits

`NativeAnthropicMessagesSupport` and `NativeOpenAIResponsesSupport` are optional extensions layered on top of `ChatTransform`.

`NativeHandler` is the small type-erased enum that carries those typed trait objects across format dispatch boundaries.

### `CompatQuirks`

`CompatQuirks` is the declarative escape hatch for OpenAI-compatible providers that are almost, but not exactly, compatible.

The current implementation supports:

- removing unsupported parameters
- renaming request parameters
- forcing `stream_options.include_usage` when a provider requires usage in streaming mode
- recording provider-specific stream termination markers and tool-argument behavior

This keeps provider-specific compatibility patches out of custom transform implementations unless the provider genuinely needs bespoke logic.
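The remove/rename quirks can be pictured as pure data applied to a flat parameter map (a simplification with an illustrative `Quirks` struct; the real `CompatQuirks` also covers stream usage forcing and termination markers):

```rust
use std::collections::HashMap;

/// Declarative request patches for an almost-OpenAI-compatible provider.
/// The quirks are data, not code, so most providers need no custom transform.
struct Quirks {
    remove_params: &'static [&'static str],
    rename_params: &'static [(&'static str, &'static str)],
}

impl Quirks {
    fn apply(&self, params: &mut HashMap<String, String>) {
        // Drop parameters the provider rejects outright.
        for key in self.remove_params {
            params.remove(*key);
        }
        // Move values from the hub's parameter name to the provider's.
        for (from, to) in self.rename_params {
            if let Some(value) = params.remove(*from) {
                params.insert((*to).to_string(), value);
            }
        }
    }
}
```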

src/gateway/mod.rs

Lines changed: 1 addition & 0 deletions
@@ -1,2 +1,3 @@
 pub mod error;
+pub mod traits;
 pub mod types;

src/gateway/traits/chat_format.rs

Lines changed: 316 additions & 0 deletions
@@ -0,0 +1,316 @@
use std::collections::HashMap;

use serde::{Serialize, de::DeserializeOwned};
use serde_json::Value;

use crate::gateway::{
    error::{GatewayError, Result},
    traits::{NativeHandler, ProviderCapabilities},
    types::{
        common::BridgeContext,
        openai::{ChatCompletionChunk, ChatCompletionRequest, ChatCompletionResponse},
    },
};

/// A complete chat API format contract and its bridge rules to the hub format.
pub trait ChatFormat: Send + Sync + 'static {
    /// Request type for this format.
    type Request: DeserializeOwned + Serialize + Send + Sync;
    /// Non-streaming response type for this format.
    type Response: Serialize + Send + Sync;
    /// Streaming chunk type for this format.
    type StreamChunk: Serialize + Send + Sync;
    /// Stateful bridge data used while converting hub chunks.
    type BridgeState: Default + Send + Unpin;
    /// Stateful bridge data used on native streaming paths.
    type NativeStreamState: Default + Send + Unpin;

    /// Stable format name used for logs and diagnostics.
    fn name() -> &'static str;

    /// Whether the request expects a streaming response.
    fn is_stream(req: &Self::Request) -> bool;

    /// Extract the model identifier from the request.
    fn extract_model(req: &Self::Request) -> &str;

    /// Convert this request into the hub request plus side-channel bridge data.
    fn to_hub(req: &Self::Request) -> Result<(ChatCompletionRequest, BridgeContext)>;

    /// Convert a hub response back into this format.
    fn from_hub(resp: &ChatCompletionResponse, ctx: &BridgeContext) -> Result<Self::Response>;

    /// Convert a hub streaming chunk into zero or more chunks of this format.
    fn from_hub_stream(
        chunk: &ChatCompletionChunk,
        state: &mut Self::BridgeState,
        ctx: &BridgeContext,
    ) -> Result<Vec<Self::StreamChunk>>;

    /// Emit any format-specific end-of-stream events.
    fn stream_end_events(
        _state: &mut Self::BridgeState,
        _ctx: &BridgeContext,
    ) -> Vec<Self::StreamChunk> {
        vec![]
    }

    /// Return a native handler when the provider can bypass the hub format.
    fn native_support(_provider: &dyn ProviderCapabilities) -> Option<NativeHandler<'_>>
    where
        Self: Sized,
    {
        None
    }

    /// Prepare a native request body for providers that support this format directly.
    fn call_native(
        native: &NativeHandler<'_>,
        request: &Self::Request,
        stream: bool,
    ) -> Result<(String, Value)>
    where
        Self: Sized,
    {
        let _ = (request, stream);
        Err(GatewayError::NativeNotSupported {
            provider: native.provider_name().into(),
        })
    }

    /// Convert a native streaming chunk into zero or more chunks of this format.
    fn transform_native_stream_chunk(
        provider: &dyn ProviderCapabilities,
        raw: &str,
        state: &mut Self::NativeStreamState,
    ) -> Result<Vec<Self::StreamChunk>>;

    /// Parse a native non-streaming response into this format.
    fn parse_native_response(native: &NativeHandler<'_>, body: Value) -> Result<Self::Response>
    where
        Self: Sized,
    {
        let _ = body;
        Err(GatewayError::Bridge(format!(
            "parse_native_response called on a non-native format for provider {}",
            native.provider_name()
        )))
    }

    /// Serialize a chunk into the JSON payload used by SSE framing.
    fn serialize_chunk_payload(chunk: &Self::StreamChunk) -> String;

    /// Optional SSE event type for this chunk.
    fn sse_event_type(_chunk: &Self::StreamChunk) -> Option<&'static str> {
        None
    }
}

/// Incremental state for reconstructing tool calls across hub chunks.
#[derive(Debug, Clone, Default)]
pub struct ToolCallAccumulator {
    pub id: Option<String>,
    pub kind: Option<String>,
    pub name: Option<String>,
    pub arguments: String,
}

/// Key for partially assembled tool calls: (choice_index, tool_call_index).
pub type ToolCallAccumulatorKey = (u32, usize);

/// Stateful data used while transforming provider chunks into hub chunks.
#[derive(Debug, Clone, Default)]
pub struct ChatStreamState {
    pub chunk_index: usize,
    pub tool_call_accumulators: HashMap<ToolCallAccumulatorKey, ToolCallAccumulator>,
    pub input_tokens: u32,
    pub output_tokens: u32,
}

#[cfg(test)]
mod tests {
    use std::borrow::Cow;

    use http::HeaderMap;
    use serde_json::json;

    use super::{ChatFormat, ChatStreamState, ToolCallAccumulator};
    use crate::gateway::{
        error::GatewayError,
        traits::{
            NativeHandler, NativeOpenAIResponsesSupport, ProviderAuth, ProviderMeta,
            StreamReaderKind, provider::ChatTransform,
        },
        types::{
            common::BridgeContext,
            openai::{ChatCompletionChunk, ChatCompletionRequest, ChatCompletionResponse},
        },
    };

    struct DummyNativeProvider;

    impl ProviderMeta for DummyNativeProvider {
        fn name(&self) -> &'static str {
            "dummy-native-provider"
        }

        fn default_base_url(&self) -> &'static str {
            "https://example.com"
        }

        fn stream_reader_kind(&self) -> StreamReaderKind {
            StreamReaderKind::Sse
        }

        fn build_auth_headers(
            &self,
            _auth: &ProviderAuth,
        ) -> crate::gateway::error::Result<HeaderMap> {
            Ok(HeaderMap::new())
        }
    }

    impl ChatTransform for DummyNativeProvider {}

    impl NativeOpenAIResponsesSupport for DummyNativeProvider {
        fn native_openai_responses_endpoint(&self, _model: &str) -> Cow<'static, str> {
            Cow::Borrowed("/v1/responses")
        }

        fn transform_openai_responses_request(
            &self,
            _req: &crate::gateway::types::openai::responses::ResponsesApiRequest,
        ) -> crate::gateway::error::Result<serde_json::Value> {
            Ok(json!({}))
        }

        fn transform_openai_responses_response(
            &self,
            _body: serde_json::Value,
        ) -> crate::gateway::error::Result<
            crate::gateway::types::openai::responses::ResponsesApiResponse,
        > {
            unreachable!("not used in this test")
        }

        fn transform_openai_responses_stream_chunk(
            &self,
            _raw: &str,
            _state: &mut crate::gateway::traits::OpenAIResponsesNativeStreamState,
        ) -> crate::gateway::error::Result<
            Vec<crate::gateway::types::openai::responses::ResponsesApiStreamEvent>,
        > {
            Ok(vec![])
        }
    }

    struct DummyFormat;

    impl ChatFormat for DummyFormat {
        type Request = serde_json::Value;
        type Response = serde_json::Value;
        type StreamChunk = serde_json::Value;
        type BridgeState = ();
        type NativeStreamState = ();

        fn name() -> &'static str {
            "dummy"
        }

        fn is_stream(_req: &Self::Request) -> bool {
            false
        }

        fn extract_model(_req: &Self::Request) -> &str {
            "dummy-model"
        }

        fn to_hub(
            _req: &Self::Request,
        ) -> crate::gateway::error::Result<(ChatCompletionRequest, BridgeContext)> {
            unreachable!("not used in this test")
        }

        fn from_hub(
            _resp: &ChatCompletionResponse,
            _ctx: &BridgeContext,
        ) -> crate::gateway::error::Result<Self::Response> {
            unreachable!("not used in this test")
        }

        fn from_hub_stream(
            _chunk: &ChatCompletionChunk,
            _state: &mut Self::BridgeState,
            _ctx: &BridgeContext,
        ) -> crate::gateway::error::Result<Vec<Self::StreamChunk>> {
            Ok(vec![])
        }

        fn transform_native_stream_chunk(
            _provider: &dyn crate::gateway::traits::ProviderCapabilities,
            _raw: &str,
            _state: &mut Self::NativeStreamState,
        ) -> crate::gateway::error::Result<Vec<Self::StreamChunk>> {
            Ok(vec![])
        }

        fn serialize_chunk_payload(chunk: &Self::StreamChunk) -> String {
            serde_json::to_string(chunk).unwrap()
        }
    }

    #[test]
    fn default_call_native_uses_provider_name() {
        let provider = DummyNativeProvider;
        let native = NativeHandler::OpenAIResponses(&provider);

        let error = DummyFormat::call_native(&native, &json!({}), false).unwrap_err();
        assert!(matches!(
            error,
            GatewayError::NativeNotSupported { provider } if provider == "dummy-native-provider"
        ));
    }

    #[test]
    fn default_parse_native_response_returns_bridge_error() {
        let provider = DummyNativeProvider;
        let native = NativeHandler::OpenAIResponses(&provider);

        let error = DummyFormat::parse_native_response(&native, json!({})).unwrap_err();
        assert!(matches!(
            error,
            GatewayError::Bridge(message)
                if message.contains("parse_native_response called on a non-native format")
                && message.contains("dummy-native-provider")
        ));
    }

    #[test]
    fn stream_state_separates_tool_call_accumulators_by_choice_and_index() {
        let mut state = ChatStreamState::default();
        state.tool_call_accumulators.insert(
            (0, 0),
            ToolCallAccumulator {
                arguments: "first".into(),
                ..Default::default()
            },
        );
        state.tool_call_accumulators.insert(
            (1, 0),
            ToolCallAccumulator {
                arguments: "second".into(),
                ..Default::default()
            },
        );

        assert_eq!(state.tool_call_accumulators.len(), 2);
        assert_eq!(
            state.tool_call_accumulators.get(&(0, 0)).unwrap().arguments,
            "first"
        );
        assert_eq!(
            state.tool_call_accumulators.get(&(1, 0)).unwrap().arguments,
            "second"
        );
    }
}
