RSPEED-2885: sanitize model and MCP output in all response paths (#1563)

Lifto · sisyphus-dev-ai · web-flow · commit e59b4b9599bb · 2026-04-23T16:46:46.000+02:00
* test(responses): add failing tests for output and model sanitization

  Red tests for _sanitize_response_dict to cover:
  - mcp_list_tools/mcp_call items not filtered from output array
  - model field not stripping provider prefix (google-vertex/...)

  These tests document the expected behavior before the fix.

  Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

  Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

* fix(responses): sanitize model and MCP output in all response paths

  Move output array filtering into _sanitize_response_dict() so both
  streaming and non-streaming paths strip server-deployed MCP items
  (mcp_list_tools, mcp_call, mcp_approval_request).

  Strip provider routing prefix from model field
  (e.g. google-vertex/.../gemini-2.5-flash becomes gemini-2.5-flash).

  Removes redundant ad-hoc output filtering from the streaming generator
  that was missing from the non-streaming path, causing the leak QE found.

  Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

  Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

* test(e2e): update response model assertions for sanitized model field

  E2E feature files now use {MODEL_SHORT} placeholder for response body
  assertions since the model field is stripped of provider prefix.
  Request bodies still send {PROVIDER}/{MODEL} unchanged.

  Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

* fix(responses): only strip model prefix when server-substituted

  Follow the same pattern as instructions: if the client specified a
  model, echo it back unchanged. Only strip the provider routing prefix
  when the server chose the model (client sent empty/no model).

  Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

  Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

* docs(responses): document model sanitization behavior

* Revert "test(e2e): update response model assertions for sanitized model field"

  This reverts commit d49aa55.

* test: assert model prefix is stripped when server auto-selects

  Update e2e auto-select scenarios to verify the response model field
  contains just the model name (no provider prefix) when the client
  omits the model parameter and the server substitutes it.

* style: add docstring to _make_streaming_completed_chunk helper

* test: align auto-select e2e assertions with string input scenario

  Use full response structure assertion (object, status, model, output)
  matching the pattern from 'Responses accepts string input' scenario,
  per reviewer feedback.

---------

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
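
Reviewer note: the behavior described above can be tried in isolation with the minimal sketch below. It is not the code from `src/app/endpoints/responses.py`. The names `sanitize`, `is_server_mcp_item`, and `MCP_OUTPUT_TYPES` are hypothetical, and the item check (an MCP item type whose `server_label` is one of the configured labels) is an assumption, since the body of `_is_server_mcp_output_item` is not part of this diff.

```python
from typing import Any

# Item types named in the commit message as server-deployed MCP output.
MCP_OUTPUT_TYPES = {"mcp_list_tools", "mcp_call", "mcp_approval_request"}


def is_server_mcp_item(item: dict[str, Any], labels: set[str]) -> bool:
    """Assumed check: an MCP item type whose server_label is server-configured."""
    return item.get("type") in MCP_OUTPUT_TYPES and item.get("server_label") in labels


def sanitize(
    response: dict[str, Any], labels: set[str], model_substituted: bool = False
) -> None:
    """Sanitize a response dict in place (hypothetical stand-in for the real helper)."""
    # Drop server-deployed MCP items; everything else passes through.
    if output := response.get("output"):
        response["output"] = [
            item for item in output if not is_server_mcp_item(item, labels)
        ]
    # Strip the routing prefix only when the server picked the model.
    if model_substituted:
        model = response.get("model")
        if model and "/" in model:
            response["model"] = model.rsplit("/", 1)[-1]


FULL_ID = "google-vertex/publishers/google/models/gemini-2.5-flash"

# Server selected the model: prefix stripped, server MCP item removed.
auto_selected = {
    "model": FULL_ID,
    "output": [
        {"type": "mcp_call", "server_label": "internal"},
        {"type": "message", "role": "assistant"},
    ],
}
sanitize(auto_selected, {"internal"}, model_substituted=True)

# Client specified the model: echoed back unchanged.
client_chosen = {"model": FULL_ID, "output": []}
sanitize(client_chosen, {"internal"}, model_substituted=False)
```

Because one helper handles both fields, the streaming and non-streaming paths cannot drift apart the way the ad-hoc output filter in the streaming generator did.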
diff --git a/docs/responses.md b/docs/responses.md
@@ -399,7 +399,7 @@ The following response attributes are inherited directly from the LLS OpenAPI sp
 | `completed_at` | integer | Completion time (Unix), if set |
 | `error` | object | Error details if failed or incompleted |
 | `id` | string | Unique response ID or moderation ID |
-| `model` | string | Model ID (provider/model) used |
+| `model` | string | Model used for generation. If the client specified `model` in the request, it is echoed unchanged; if the server selected the model, the provider routing prefix is stripped (see [Model Selection](#model-selection)) |
 | `object` | string | Always `"response"` |
 | `output` | array[object] | Structured output (messages, tool calls, etc.) |
 | `parallel_tool_calls` | boolean | Parallel tool calls allowed |
@@ -515,6 +515,8 @@ In OpenResponses the `model` field is required; in LCORE it is optional. If you
 3. **First available** — Otherwise, the first available LLM model is used.
 4. If no model can be selected (e.g. no default and no LLM models), the request fails with 404 (model not found).
 
+**Model in response:** If the client specified a `model` in the request, it is echoed back unchanged in the response. If the server selected the model (because `model` was omitted from the request), the provider routing prefix is stripped and only the base model name is returned (e.g. `google-vertex/publishers/google/models/gemini-2.5-flash` → `gemini-2.5-flash`). This prevents leaking server infrastructure details and follows the same pattern as [System Prompt Resolution](#system-prompt-resolution).
+
 ### Output Representation
 
 Responses expose both:
diff --git a/src/app/endpoints/responses.py b/src/app/endpoints/responses.py
@@ -280,10 +280,12 @@ async def responses_endpoint_handler(
 
     # LCORE-specific: Automatically select model if not provided in request
     # This extends the base LLS API which requires model to be specified.
+    client_model = responses_request.model
     if not responses_request.model:
         responses_request.model = await select_model_for_responses(
             client, response_context.user_conversation
         )
+    model_substituted = not client_model
     if not await check_model_configured(client, responses_request.model):
         _, model_id = extract_provider_and_model_from_model_id(responses_request.model)
         error_response = NotFoundResponse(resource="model", resource_id=model_id)
@@ -371,6 +373,7 @@ async def responses_endpoint_handler(
         inline_rag_context=inline_rag_context,
         filter_server_tools=filter_server_tools,
         instructions_substituted=instructions_substituted,
+        model_substituted=model_substituted,
         background_tasks=background_tasks,
         rh_identity_context=rh_identity_context,
     )
@@ -386,6 +389,7 @@ async def handle_streaming_response(
     inline_rag_context: RAGContext,
     filter_server_tools: bool = False,
     instructions_substituted: bool = False,
+    model_substituted: bool = False,
     background_tasks: Optional[BackgroundTasks] = None,
     rh_identity_context: tuple[str, str] = ("", ""),
 ) -> StreamingResponse:
@@ -401,6 +405,7 @@ async def handle_streaming_response(
         inline_rag_context: Inline RAG context to be used for the response
         filter_server_tools: Whether to filter server-deployed MCP tool events from the stream
         instructions_substituted: Whether the server substituted the instructions
+        model_substituted: Whether the server substituted the model
         background_tasks: FastAPI background task manager for telemetry events
         rh_identity_context: Tuple of (org_id, system_id) from RH identity
     Returns:
@@ -453,6 +458,7 @@ async def handle_streaming_response(
                 inline_rag_context=inline_rag_context,
                 filter_server_tools=filter_server_tools,
                 instructions_substituted=instructions_substituted,
+                model_substituted=model_substituted,
             )
         except RuntimeError as e:  # library mode wraps 413 into runtime error
             if is_context_length_error(str(e)):
@@ -624,6 +630,7 @@ def _sanitize_response_dict(
     response_dict: dict[str, Any],
     configured_mcp_labels: set[str],
     instructions_substituted: bool = False,
+    model_substituted: bool = False,
 ) -> None:
     """Sanitize a serialized response object in-place to remove internal details.
 
@@ -637,7 +644,14 @@ def _sanitize_response_dict(
       they were used as-is, the value is left unchanged.
     - ``tools``: server-deployed MCP tool definitions are removed; client-
       provided tools (those whose ``server_label`` is not in
-      ``configured_mcp_labels``) are preserved
+      ``configured_mcp_labels``) are preserved.
+    - ``output``: server-deployed MCP output items (``mcp_list_tools``,
+      ``mcp_call``, ``mcp_approval_request``) are stripped so clients only
+      see item types they understand (``message``, ``function_call``, etc.).
+    - ``model``: the provider routing prefix (everything before the last
+      ``/``) is stripped only when the server selected the model
+      (``model_substituted=True``).  When the client specified the model,
+      it is echoed back unchanged.
 
     Args:
         response_dict: Mutable dict produced by ``model_dump`` on a response
@@ -646,6 +660,8 @@ def _sanitize_response_dict(
             server-deployed MCP servers.
         instructions_substituted: Whether the server substituted the
             instructions (True) or the client provided them (False).
+        model_substituted: Whether the server substituted the model
+            (True) or the client provided it (False).
     """
     if instructions_substituted:
         response_dict["instructions"] = SUBSTITUTED_INSTRUCTIONS_PLACEHOLDER
@@ -658,6 +674,18 @@ def _sanitize_response_dict(
             if tool.get("server_label") not in configured_mcp_labels
         ]
 
+    if output := response_dict.get("output"):
+        response_dict["output"] = [
+            item
+            for item in output
+            if not _is_server_mcp_output_item(item, configured_mcp_labels)
+        ]
+
+    if model_substituted:
+        model = response_dict.get("model")
+        if model and "/" in model:
+            response_dict["model"] = model.rsplit("/", 1)[-1]
+
 
 def _is_server_mcp_output_item(
     item: dict[str, Any], configured_mcp_labels: set[str]
@@ -775,6 +803,7 @@ async def response_generator(
     inline_rag_context: RAGContext,
     filter_server_tools: bool = False,
     instructions_substituted: bool = False,
+    model_substituted: bool = False,
 ) -> AsyncIterator[str]:
     """Generate SSE-formatted streaming response with LCORE-enriched events.
 
@@ -787,6 +816,7 @@ async def response_generator(
         inline_rag_context: Inline RAG context to be used for the response
         filter_server_tools: Whether to filter server-deployed MCP tool events from the stream
         instructions_substituted: Whether the server substituted the instructions
+        model_substituted: Whether the server substituted the model
     Yields:
         SSE-formatted strings for streaming events, ending with [DONE]
     """
@@ -824,6 +854,7 @@ async def response_generator(
                 chunk_dict["response"],
                 configured_mcp_labels,
                 instructions_substituted,
+                model_substituted,
             )
             tools = chunk_dict["response"].get("tools")
             if tools is not None:
@@ -833,16 +864,6 @@ async def response_generator(
                         configuration.rag_id_mapping,
                     )
                 )
-            # Remove server-deployed MCP items from the output array so
-            # clients only see item types they understand (message, function_call, etc.)
-            output = chunk_dict["response"].get("output")
-            if output is not None:
-                chunk_dict["response"]["output"] = [
-                    item
-                    for item in output
-                    if not _is_server_mcp_output_item(item, configured_mcp_labels)
-                ]
-
         # Intermediate response - no quota consumption and text yet
         if event_type == "response.in_progress":
             chunk_dict["response"]["available_quotas"] = {}
@@ -988,6 +1009,7 @@ async def handle_non_streaming_response(
     inline_rag_context: RAGContext,
     filter_server_tools: bool = False,
     instructions_substituted: bool = False,
+    model_substituted: bool = False,
     background_tasks: Optional[BackgroundTasks] = None,
     rh_identity_context: tuple[str, str] = ("", ""),
 ) -> ResponsesResponse:
@@ -1003,6 +1025,7 @@ async def handle_non_streaming_response(
         inline_rag_context: Inline RAG context to be used for the response
         filter_server_tools: Whether to filter server-deployed MCP tool output
         instructions_substituted: Whether the server substituted the instructions
+        model_substituted: Whether the server substituted the model
         background_tasks: FastAPI background task manager for telemetry events
         rh_identity_context: Tuple of (org_id, system_id) from RH identity
     Returns:
@@ -1168,7 +1191,10 @@ async def handle_non_streaming_response(
     configured_mcp_labels = {s.name for s in configuration.mcp_servers}
     response_dict = api_response.model_dump(exclude_none=True)
     _sanitize_response_dict(
-        response_dict, configured_mcp_labels, instructions_substituted
+        response_dict,
+        configured_mcp_labels,
+        instructions_substituted,
+        model_substituted,
     )
     tools = response_dict.get("tools")
     if tools is not None:
diff --git a/tests/e2e/features/responses.feature b/tests/e2e/features/responses.feature
@@ -150,6 +150,20 @@ Feature: Responses endpoint API tests
     """
     Then The status code of the response is 200
       And The body of the response contains hello
+      And the body of the response has the following structure
+        """
+        {
+          "object": "response",
+          "status": "completed",
+          "model": "{MODEL}",
+          "output": [
+            {
+              "type": "message",
+              "role": "assistant"
+            }
+          ]
+        }
+        """
 
   Scenario: Responses returns 404 for unknown model segment in provider slash model id
     Given The system is in default state
diff --git a/tests/e2e/features/responses_streaming.feature b/tests/e2e/features/responses_streaming.feature
@@ -200,6 +200,20 @@ Feature: Responses endpoint streaming API tests
     """
     Then The status code of the response is 200
       And The body of the response contains hello
+      And the body of the response has the following structure
+        """
+        {
+          "object": "response",
+          "status": "completed",
+          "model": "{MODEL}",
+          "output": [
+            {
+              "type": "message",
+              "role": "assistant"
+            }
+          ]
+        }
+        """
 
   Scenario: Streaming responses returns 404 for unknown model segment in provider slash model id  
     When I use "responses" to ask question with authorization header
diff --git a/tests/unit/app/endpoints/test_responses.py b/tests/unit/app/endpoints/test_responses.py