Commit 310def1

ankitbko, Ankit Sinha, Copilot, and RaviPidaparthi authored
[agentserver-responses] Harden response model, type safety, and builder API (#46302)
* Always include model field in response payload

  Default model to empty string when not provided in the request, ensuring the field is always present in the response payload. The OpenAI SDK requires model to be present to deserialize the response object.

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Add e2e tests for model field always present in response

  Address PR review feedback: add contract tests verifying the model field is present in the response payload when omitted from the request, for both sync (stream=False) and streaming (stream=True) modes.

* fix: DELETE response returns object='response' instead of 'response.deleted'

  The OpenAI spec returns {id, object: 'response', deleted: true} for DELETE /responses/{id}. Our handler was returning 'response.deleted' which doesn't match. Fixed the handler and updated all 5 test assertions.

* fix: stamp agent_session_id and conversation_id on to_snapshot fallback

  ResponseExecution now carries agent_session_id and conversation_id so that _RuntimeState.to_snapshot can forcibly stamp them (S-038/S-040) on both the response.as_dict() path and the minimal fallback dict. All four orchestrator ResponseExecution creation sites pass both fields from the execution context.

* fix: remove output:list override breaking get_history deserialization

  The manual _patch.py override of ResponseObject.output erased the element type (list instead of list[OutputItem]), preventing the model framework from deserializing nested dicts into OutputItem instances. This caused get_history to return plain dicts instead of typed models.

  Changes:
  - Remove output:list override; use generated list[OutputItem]
  - Remove ToolChoiceAllowed override (generated type is identical)
  - Move Sphinx docstring fixes into models_patch.py shim so make generate-models preserves them instead of overwriting
  - Accept emitter upgrade to model_base.py (XML refactor)
  - Regenerate _validators.py from current TypeSpec sources

* fix: use polymorphic _deserialize for OutputItem subtypes + contract type tests

  - Fix track_completed_output_item to use OutputItem._deserialize(dict, []) instead of OutputItem(dict) so response.output contains proper discriminated subtypes (OutputItemMessage, OutputItemFunctionToolCall, etc.) instead of base OutputItem instances. This ensures handler devs can use isinstance() and attribute access on output items.
  - Add test_public_contract_types.py with 22 tests covering every public handler/consumer surface for type fidelity:
    * context.request → CreateResponse
    * context.get_input_items() → Item subtypes
    * context.get_input_text() → str
    * context.get_history() → OutputItem subtypes (first-ever coverage)
    * stream.response → ResponseObject
    * stream.response.output → OutputItem subtypes
    * Builder emit_* → ResponseStreamEvent subtypes
    * Generator convenience → ResponseStreamEvent subtypes
    * InMemoryProvider round-trip preserves subtypes
  - Add isinstance assertions to existing tests in test_builders.py, test_event_stream_generators.py, and test_response_event_stream_builder.py

* feat: deterministic session ID derivation from conversational context

  Replace random UUID fallback for agent_session_id with deterministic SHA-256 derivation matching .NET SessionIdDerivation logic:

  Priority chain:
  1. Explicit agent_session_id from payload (unchanged)
  2. Platform env FOUNDRY_AGENT_SESSION_ID (unchanged)
  3. Deterministic: SHA256(agent_name:agent_version:partition_hint) where partition_hint is extracted from conversation_id or previous_response_id via IdGenerator.extract_partition_key
  4. Random 63-char lowercase hex (one-shot, no conversational context)

  This ensures session affinity: the same conversation + agent identity always resolves to the same session ID, enabling stateful backends to route consistently without requiring explicit session IDs.

  New functions in _request_parsing.py:
  - derive_session_id(): public deterministic derivation
  - _compute_hex_hash(): SHA-256 → 63-char hex
  - _generate_random_hex(): os.urandom fallback
  - _extract_agent_identity(): name/version from agent_reference

  Updated _resolve_session_id() signature to accept agent_reference. Updated call site in _endpoint_handler.py to pass agent_reference. Updated all tests (unit + contract) from UUID to 63-char hex format. Added 14 new derivation tests covering determinism, agent isolation, version isolation, priority, and non-standard ID formats.

* feat: strongly-typed return types for all emit_* builder methods

  Port .NET pattern: every emit_* method now returns its specific event subtype (e.g. ResponseCreatedEvent, ResponseOutputItemAddedEvent) via typing.cast() instead of the base ResponseStreamEvent.

  Covers all builders:
  - ResponseEventStream: 6 lifecycle methods
  - OutputItemBuilder / BaseOutputItemBuilder: emit_added, emit_done
  - OutputItemMessageBuilder, TextContentBuilder, RefusalContentBuilder
  - FunctionCallBuilder, FunctionCallOutputBuilder
  - ReasoningSummaryPartBuilder, ReasoningItemBuilder
  - FileSearchCall, WebSearchCall, CodeInterpreter, ImageGen, McpCall, McpListTools, CustomToolCall builders

  Adds test_emit_return_types.py with 70 isinstance assertions covering every public emit_* method across all 16 builder classes.

* refactor: tighten OutputItemBuilder.emit_added/emit_done to accept OutputItem only

  Remove dict[str, Any] from the public signature: all item types are generated models. Internal callers use _emit_added/_emit_done directly.

  Also: fix handler guide (emit_failed/emit_incomplete kwargs, request= pattern), revert CHANGELOG to initial-release form, remove session ID derivation docs (internal detail).

* refactor: tighten all public API parameters from dict to generated model types

  - ResponseEventStream constructor: agent_reference, request, response now accept only their respective model types (no dict[str, Any])
  - Terminal methods (emit_completed/failed/incomplete): usage accepts only ResponseUsage (no dict[str, Any])
  - Convenience generators (output_item_computer_call, _computer_call_output, _local_shell_call, _function_shell_call, _function_shell_call_output, _apply_patch_call): all action/output/environment params accept only their respective generated model types (no dict[str, Any])
  - Async mirrors: same tightening as sync counterparts
  - emit_annotation_added: annotation accepts only Annotation (no dict)
  - _set_terminal_fields: usage tightened
  - Internal _build_events: coerce dict→AgentReference before passing to ResponseEventStream
  - Tests updated to use model constructors instead of raw dicts
  - Docs updated to show ResponseUsage model usage

* refactor: internalize escape-hatch methods and tighten remaining list[Any] types

  - emit_event → _emit_event: internal only, all callers are sibling emit_* methods and _builders subpackage
  - with_output_item_defaults → _with_output_item_defaults: internal only, called only by _builders._base
  - validate_response_event_stream → _validate_response_event_stream: internal only, called only by _normalize_lifecycle_events
  - normalize_lifecycle_events → _normalize_lifecycle_events: internal only, called only by hosting._endpoint_handler
  - Removed both from streaming/__init__.py exports
  - output_item_custom_tool_call_output: output tightened from str | list[Any] to str | list[FunctionAndCustomToolCallOutput]
  - OutputItemFunctionCallOutputBuilder.emit_added/emit_done: output tightened from str | list[Any] to str | list[InputTextContentParam | InputImageContentParamAutoParam | InputFileContentParam]
  - Removed unused Any import from _function.py

* refactor: reduce public API surface: remove EVENT_TYPE alias, internalize 22 symbols

  - Remove EVENT_TYPE alias: replaced all ~80 usages across 10 files with generated_models.ResponseStreamEventType directly
  - Remove from streaming exports: EVENT_TYPE, encode_sse_event, encode_keep_alive_comment
  - Remove from hosting exports: CreateSpan, CreateSpanHook, InMemoryCreateSpanHook, RecordedSpan, build_create_span_tags, build_platform_server_header, start_create_span, build_api_error_response, build_invalid_mode_error_response, build_not_found_error_response, parse_and_validate_create_response, parse_create_response, to_api_error_response, validate_create_response
  - Remove from models exports: ResponseExecution, StreamEventRecord, StreamReplayState, get_instruction_items, get_output_item_id
  - Remove from top-level exports: to_output_item
  - Keep public: get_conversation_id, get_input_expanded, get_content_expanded, get_conversation_expanded, get_tool_choice_expanded, all builder classes, ResponseEventStream, TextResponse, all store/Foundry types

---------

Co-authored-by: Ankit Sinha <anksinha@microsoft.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: rapida <rapida@microsoft.com>
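The polymorphic-deserialization fix described in the commit message can be illustrated with a small stand-alone sketch. The classes below are hypothetical stand-ins, not the generated azure models, and the `deserialize` classmethod and `type_name` registry are assumptions made only to show why constructing the base class directly loses the subtype while a discriminator-aware factory keeps it.

```python
from __future__ import annotations

from typing import Any, ClassVar


class OutputItem:
    """Stand-in base model (hypothetical, not the generated class)."""

    _registry: ClassVar[dict[str, type[OutputItem]]] = {}
    type_name: ClassVar[str] = ""

    def __init__(self, data: dict[str, Any]) -> None:
        self.data = dict(data)

    def __init_subclass__(cls, **kwargs: Any) -> None:
        super().__init_subclass__(**kwargs)
        # Register each subtype under its discriminator value.
        if cls.type_name:
            OutputItem._registry[cls.type_name] = cls

    @classmethod
    def deserialize(cls, data: dict[str, Any]) -> OutputItem:
        # Pick the subtype from the "type" discriminator; fall back to cls.
        target = cls._registry.get(data.get("type", ""), cls)
        return target(data)


class OutputItemMessage(OutputItem):
    type_name = "message"


class OutputItemFunctionToolCall(OutputItem):
    type_name = "function_call"


raw = {"type": "function_call", "name": "lookup"}
plain = OutputItem(raw)              # base instance, subtype information lost
typed = OutputItem.deserialize(raw)  # discriminated subtype instance
```

With the factory path, handler code can rely on `isinstance(typed, OutputItemFunctionToolCall)`, which is the property the new contract tests are described as asserting.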
1 parent 6b9ea94 commit 310def1

35 files changed

Lines changed: 3108 additions & 788 deletions

sdk/agentserver/azure-ai-agentserver-responses/azure/ai/agentserver/responses/__init__.py

Lines changed: 0 additions & 2 deletions
@@ -14,7 +14,6 @@
 from .models._helpers import (
     get_conversation_id,
     get_input_expanded,
-    to_output_item,
 )
 from .store._base import ResponseProviderProtocol, ResponseStreamProviderProtocol
 from .store._foundry_errors import (
@@ -51,5 +50,4 @@
     "ResponseObject",
     "get_conversation_id",
     "get_input_expanded",
-    "to_output_item",
 ]

sdk/agentserver/azure-ai-agentserver-responses/azure/ai/agentserver/responses/hosting/__init__.py

Lines changed: 0 additions & 32 deletions
@@ -2,40 +2,8 @@
 # Licensed under the MIT license.
 """HTTP hosting, routing, and request orchestration for the Responses server."""
 
-from ._observability import (
-    CreateSpan,
-    CreateSpanHook,
-    InMemoryCreateSpanHook,
-    RecordedSpan,
-    build_create_span_tags,
-    build_platform_server_header,
-    start_create_span,
-)
 from ._routing import ResponsesAgentServerHost
-from ._validation import (
-    build_api_error_response,
-    build_invalid_mode_error_response,
-    build_not_found_error_response,
-    parse_and_validate_create_response,
-    parse_create_response,
-    to_api_error_response,
-    validate_create_response,
-)
 
 __all__ = [
     "ResponsesAgentServerHost",
-    "CreateSpan",
-    "CreateSpanHook",
-    "InMemoryCreateSpanHook",
-    "RecordedSpan",
-    "build_api_error_response",
-    "build_create_span_tags",
-    "build_invalid_mode_error_response",
-    "build_not_found_error_response",
-    "build_platform_server_header",
-    "parse_and_validate_create_response",
-    "parse_create_response",
-    "start_create_span",
-    "to_api_error_response",
-    "validate_create_response",
 ]

sdk/agentserver/azure-ai-agentserver-responses/azure/ai/agentserver/responses/hosting/_endpoint_handler.py

Lines changed: 15 additions & 9 deletions
@@ -22,17 +22,21 @@
 from starlette.responses import JSONResponse, Response, StreamingResponse
 
 from azure.ai.agentserver.core import detach_context, end_span, flush_spans, set_current_span, trace_stream
-from azure.ai.agentserver.responses.models._generated import AgentReference, CreateResponse
+from azure.ai.agentserver.responses.models._generated import (
+    AgentReference,
+    CreateResponse,
+    ResponseStreamEventType,
+)
 
 from .._options import ResponsesServerOptions
 from .._response_context import IsolationContext, ResponseContext
 from ..models._helpers import get_input_expanded, to_output_item
 from ..models.errors import RequestValidationError
 from ..models.runtime import ResponseExecution, ResponseModeFlags, build_cancelled_response, build_failed_response
 from ..store._base import ResponseProviderProtocol, ResponseStreamProviderProtocol
-from ..streaming._helpers import EVENT_TYPE, _encode_sse
+from ..streaming._helpers import _encode_sse
 from ..streaming._sse import encode_sse_any_event
-from ..streaming._state_machine import normalize_lifecycle_events
+from ..streaming._state_machine import _normalize_lifecycle_events
 from ._execution_context import _ExecutionContext
 from ._observability import (
     CreateSpan,
@@ -208,11 +212,11 @@ def __init__(
 
         # Validate the lifecycle event state machine on startup so
         # misconfigured state machines surface immediately.
-        normalize_lifecycle_events(
+        _normalize_lifecycle_events(
             response_id="resp_validation",
             events=[
-                {"type": EVENT_TYPE.RESPONSE_CREATED.value, "response": {"status": "in_progress"}},
-                {"type": EVENT_TYPE.RESPONSE_COMPLETED.value, "response": {"status": "completed"}},
+                {"type": ResponseStreamEventType.RESPONSE_CREATED.value, "response": {"status": "in_progress"}},
+                {"type": ResponseStreamEventType.RESPONSE_COMPLETED.value, "response": {"status": "completed"}},
             ],
         )
 
@@ -342,7 +346,7 @@ def _build_execution_context(
         stream = bool(getattr(parsed, "stream", False))
         store = True if getattr(parsed, "store", None) is None else bool(parsed.store)
         background = bool(getattr(parsed, "background", False))
-        model = getattr(parsed, "model", None)
+        model = getattr(parsed, "model", None) or ""
         _expanded = get_input_expanded(parsed)
         input_items = [out for item in _expanded if (out := to_output_item(item, response_id)) is not None]
         previous_response_id: str | None = (
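The effect of the `or ""` default in the hunk above can be checked in isolation. This is a minimal sketch: `resolve_model` is a hypothetical helper name, and `SimpleNamespace` stands in for the parsed request object.

```python
from types import SimpleNamespace


def resolve_model(parsed: object) -> str:
    """Return the request's model, or "" when it is absent or None."""
    # getattr covers a missing attribute; `or ""` also covers an explicit None.
    return getattr(parsed, "model", None) or ""


with_model = resolve_model(SimpleNamespace(model="my-model"))
none_model = resolve_model(SimpleNamespace(model=None))
missing_model = resolve_model(SimpleNamespace())
```

All three calls return a `str`, so a payload built from the result always has a `model` field, which is the guarantee the commit message describes.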
@@ -469,7 +473,9 @@ async def handle_create(self, request: Request) -> Response: # pylint: disable=
 
         # B39: Resolve session ID
         config_session_id = getattr(getattr(self._host, "config", None), "session_id", "") or ""
-        agent_session_id = _resolve_session_id(parsed, payload, env_session_id=config_session_id)
+        agent_session_id = _resolve_session_id(
+            parsed, payload, env_session_id=config_session_id, agent_reference=agent_reference
+        )
 
         ctx = self._build_execution_context(
             parsed=parsed,
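The priority chain from the commit message (explicit ID, then platform environment value, then a SHA-256 derivation, then random hex) might look roughly like the sketch below. The function names and signatures here are illustrative and do not reproduce `_request_parsing.py`. Truncating the 64-character digest to 63 characters is an assumption about how the stated 63-char length is reached, and the partition hint is passed in directly rather than extracted via `IdGenerator.extract_partition_key`.

```python
import hashlib
import os
from typing import Optional

SESSION_ID_LENGTH = 63  # length stated in the commit message


def compute_hex_hash(text: str) -> str:
    # A SHA-256 hex digest has 64 chars; truncation to 63 is an assumption.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:SESSION_ID_LENGTH]


def generate_random_hex() -> str:
    # 32 random bytes give 64 lowercase hex chars; keep the first 63.
    return os.urandom(32).hex()[:SESSION_ID_LENGTH]


def resolve_session_id(
    explicit: Optional[str],
    env_value: Optional[str],
    agent_name: str,
    agent_version: str,
    partition_hint: Optional[str],
) -> str:
    if explicit:          # 1. explicit agent_session_id from the payload
        return explicit
    if env_value:         # 2. platform-provided session ID
        return env_value
    if partition_hint:    # 3. deterministic derivation
        return compute_hex_hash(f"{agent_name}:{agent_version}:{partition_hint}")
    return generate_random_hex()  # 4. one-shot request, no context


first = resolve_session_id(None, "", "agent", "1", "conv_abc")
second = resolve_session_id(None, "", "agent", "1", "conv_abc")
other_version = resolve_session_id(None, "", "agent", "2", "conv_abc")
```

Because step 3 depends only on the agent identity and the partition hint, repeated requests in the same conversation resolve to the same ID, while a different agent version yields a different one.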
@@ -833,7 +839,7 @@ async def handle_delete(self, request: Request) -> Response:
         )
 
         return JSONResponse(
-            {"id": response_id, "object": "response.deleted", "deleted": True},
+            {"id": response_id, "object": "response", "deleted": True},
             status_code=200,
         )
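For reference, the corrected DELETE body serializes as shown in this small sketch. `build_delete_payload` is a hypothetical helper used only for illustration, and `resp_123` is a made-up ID. The handler in the diff builds the dict inline.

```python
import json


def build_delete_payload(response_id: str) -> dict:
    """Build the DELETE /responses/{id} body in the shape used after this fix."""
    return {"id": response_id, "object": "response", "deleted": True}


body = json.dumps(build_delete_payload("resp_123"))
decoded = json.loads(body)
```

A client checking the result would compare `decoded["object"]` against `"response"` rather than `"response.deleted"`.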

sdk/agentserver/azure-ai-agentserver-responses/azure/ai/agentserver/responses/hosting/_orchestrator.py

Lines changed: 23 additions & 15 deletions
@@ -33,7 +33,6 @@
 )
 from ..store._base import ResponseProviderProtocol, ResponseStreamProviderProtocol
 from ..streaming._helpers import (
-    EVENT_TYPE,
     _apply_stream_event_defaults,
     _build_events,
     _coerce_handler_event,
@@ -126,20 +125,20 @@ async def _iter_with_winddown(
 
 _OUTPUT_ITEM_EVENT_TYPES: frozenset[str] = frozenset(
     {
-        EVENT_TYPE.RESPONSE_OUTPUT_ITEM_ADDED.value,
-        EVENT_TYPE.RESPONSE_OUTPUT_ITEM_DONE.value,
+        generated_models.ResponseStreamEventType.RESPONSE_OUTPUT_ITEM_ADDED.value,
+        generated_models.ResponseStreamEventType.RESPONSE_OUTPUT_ITEM_DONE.value,
     }
 )
 
 # Response-level lifecycle events whose ``response`` field carries a full Response snapshot.
 # Used by FR-008a output manipulation detection.
 _RESPONSE_SNAPSHOT_TYPES: frozenset[str] = frozenset(
     {
-        EVENT_TYPE.RESPONSE_IN_PROGRESS.value,
-        EVENT_TYPE.RESPONSE_COMPLETED.value,
-        EVENT_TYPE.RESPONSE_FAILED.value,
-        EVENT_TYPE.RESPONSE_INCOMPLETE.value,
-        EVENT_TYPE.RESPONSE_QUEUED.value,
+        generated_models.ResponseStreamEventType.RESPONSE_IN_PROGRESS.value,
+        generated_models.ResponseStreamEventType.RESPONSE_COMPLETED.value,
+        generated_models.ResponseStreamEventType.RESPONSE_FAILED.value,
+        generated_models.ResponseStreamEventType.RESPONSE_INCOMPLETE.value,
+        generated_models.ResponseStreamEventType.RESPONSE_QUEUED.value,
     }
 )
 
@@ -308,7 +307,8 @@ async def _run_background_non_stream( # pylint: disable=too-many-locals,too-man
                     record.response_created_signal.set()
                 else:
                     # Track output_item.added events for FR-008a
-                    if normalized.get("type") == EVENT_TYPE.RESPONSE_OUTPUT_ITEM_ADDED.value:
+                    _item_added = generated_models.ResponseStreamEventType.RESPONSE_OUTPUT_ITEM_ADDED
+                    if normalized.get("type") == _item_added.value:
                         output_item_count += 1
 
                     # FR-008a: detect direct Output manipulation on response.* events
@@ -510,9 +510,9 @@ class _ResponseOrchestrator: # pylint: disable=too-many-instance-attributes
 
     _TERMINAL_SSE_TYPES: frozenset[str] = frozenset(
         {
-            EVENT_TYPE.RESPONSE_COMPLETED.value,
-            EVENT_TYPE.RESPONSE_FAILED.value,
-            EVENT_TYPE.RESPONSE_INCOMPLETE.value,
+            generated_models.ResponseStreamEventType.RESPONSE_COMPLETED.value,
+            generated_models.ResponseStreamEventType.RESPONSE_FAILED.value,
+            generated_models.ResponseStreamEventType.RESPONSE_INCOMPLETE.value,
         }
     )
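The frozenset-of-enum-values pattern used for `_TERMINAL_SSE_TYPES` can be sketched without the generated models. `StreamEventType` below is a hypothetical stand-in for `ResponseStreamEventType`. Its string values follow the OpenAI streaming event names, and `is_terminal` is an illustrative helper that does not appear in the diff.

```python
from enum import Enum


class StreamEventType(str, Enum):
    """Stand-in enum (hypothetical); the real one is generated from TypeSpec."""

    RESPONSE_CREATED = "response.created"
    RESPONSE_IN_PROGRESS = "response.in_progress"
    RESPONSE_COMPLETED = "response.completed"
    RESPONSE_FAILED = "response.failed"
    RESPONSE_INCOMPLETE = "response.incomplete"


# Store the plain string values so membership tests work on raw event dicts.
TERMINAL_SSE_TYPES: frozenset = frozenset(
    {
        StreamEventType.RESPONSE_COMPLETED.value,
        StreamEventType.RESPONSE_FAILED.value,
        StreamEventType.RESPONSE_INCOMPLETE.value,
    }
)


def is_terminal(event: dict) -> bool:
    """True when the event's "type" marks the end of the stream."""
    return event.get("type") in TERMINAL_SSE_TYPES


done = is_terminal({"type": "response.completed"})
still_running = is_terminal({"type": "response.in_progress"})
```

Referencing the enum directly, rather than through an alias such as the removed `EVENT_TYPE`, keeps a single source of truth for the event names.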

@@ -619,7 +619,7 @@ async def _cancel_terminal_sse_dict(
         :rtype: ResponseStreamEvent
         """
         cancel_event: dict[str, Any] = {
-            "type": EVENT_TYPE.RESPONSE_FAILED.value,
+            "type": generated_models.ResponseStreamEventType.RESPONSE_FAILED.value,
             "response": _build_cancelled_response(ctx.response_id, ctx.agent_reference, ctx.model).as_dict(),
         }
         return await self._normalize_and_append(ctx, state, cancel_event)
@@ -640,7 +640,7 @@ async def _make_failed_event(
         :rtype: ResponseStreamEvent
         """
         failed_event: dict[str, Any] = {
-            "type": EVENT_TYPE.RESPONSE_FAILED.value,
+            "type": generated_models.ResponseStreamEventType.RESPONSE_FAILED.value,
             "response": {
                 "id": ctx.response_id,
                 "object": "response",
@@ -685,6 +685,8 @@ async def _register_bg_execution(
             input_items=deepcopy(ctx.input_items),
             previous_response_id=ctx.previous_response_id,
             cancel_signal=ctx.cancellation_signal,
+            agent_session_id=ctx.agent_session_id,
+            conversation_id=ctx.conversation_id,
         )
         execution.set_response_snapshot(generated_models.ResponseObject(initial_payload))
         execution.subject = _ResponseEventSubject()
@@ -899,7 +901,7 @@ async def _process_handler_events( # pylint: disable=too-many-return-statements
                 # appended to the state machine before we emit response.failed.
                 _pre_coerced = _coerce_handler_event(raw)
                 _pre_type = _pre_coerced.get("type", "")
-                if _pre_type == EVENT_TYPE.RESPONSE_OUTPUT_ITEM_ADDED.value:
+                if _pre_type == generated_models.ResponseStreamEventType.RESPONSE_OUTPUT_ITEM_ADDED.value:
                     output_item_count += 1
                 if _pre_type in _RESPONSE_SNAPSHOT_TYPES:
                     _pre_response = _pre_coerced.get("response") or {}
@@ -1105,6 +1107,8 @@ async def _finalize_stream(self, ctx: _ExecutionContext, state: _PipelineState)
             input_items=deepcopy(ctx.input_items),
             previous_response_id=ctx.previous_response_id,
             cancel_signal=ctx.cancellation_signal if ctx.background else None,
+            agent_session_id=ctx.agent_session_id,
+            conversation_id=ctx.conversation_id,
         )
         execution.set_response_snapshot(generated_models.ResponseObject(response_payload))
         await self._runtime_state.add(execution)
@@ -1381,6 +1385,8 @@ async def run_sync(self, ctx: _ExecutionContext) -> dict[str, Any]:
             input_items=deepcopy(ctx.input_items),
             previous_response_id=ctx.previous_response_id,
             response_context=ctx.context,
+            agent_session_id=ctx.agent_session_id,
+            conversation_id=ctx.conversation_id,
         )
         record.set_response_snapshot(generated_models.ResponseObject(response_payload))
 
@@ -1444,6 +1450,8 @@ async def run_background(self, ctx: _ExecutionContext) -> dict[str, Any]:
             cancel_signal=ctx.cancellation_signal,
             initial_model=ctx.model,
             initial_agent_reference=ctx.agent_reference,
+            agent_session_id=ctx.agent_session_id,
+            conversation_id=ctx.conversation_id,
         )
 
         # Register so GET can observe in-flight state
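The four creation sites above pass `agent_session_id` and `conversation_id` so that the snapshot path can stamp them. The sketch below is a rough illustration of that idea only: `Execution` and `to_snapshot` are hypothetical stand-ins, and the flat placement of both keys in the snapshot dict is an assumption, not the layout used by `_RuntimeState.to_snapshot`.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Execution:
    """Stand-in for ResponseExecution (hypothetical, reduced to four fields)."""

    response_id: str
    agent_session_id: Optional[str] = None
    conversation_id: Optional[str] = None
    response: Optional[dict] = None  # plays the role of response.as_dict()


def to_snapshot(execution: Execution) -> dict:
    # Use the stored response when present, else a minimal fallback dict.
    if execution.response is not None:
        snapshot: dict = dict(execution.response)
    else:
        snapshot = {"id": execution.response_id, "object": "response"}
    # Stamp both IDs on either path, overwriting whatever the handler set.
    if execution.agent_session_id is not None:
        snapshot["agent_session_id"] = execution.agent_session_id
    if execution.conversation_id is not None:
        snapshot["conversation_id"] = execution.conversation_id
    return snapshot


fallback = to_snapshot(Execution("resp_1", "sess_a", "conv_a"))
stamped = to_snapshot(
    Execution("resp_2", "sess_b", "conv_b", response={"id": "resp_2", "agent_session_id": "stale"})
)
```

Stamping after the snapshot is built means the fallback dict and the full response carry the same session and conversation identifiers.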
