Commit 30035ea

[agentserver] responses: restore full spec 015/016 work on top of core PR
This commit restores the responses-package spec 015/016 work that was moved out of the core PR (#46997) to keep scope manageable. It sits on top of the core PR branch so it only shows the responses delta.

⚠️ NOT FOR REVIEW: the responses package is not the focus this cycle. The branch is preserved so the work isn't lost and can be picked up once core lands.

Restored from safety-spec016-backup-2026-06-02 (SHA 3df9c5b).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
1 parent 7dfbd7a commit 30035ea

125 files changed

Lines changed: 20715 additions & 335 deletions

sdk/agentserver/azure-ai-agentserver-responses/CHANGELOG.md

Lines changed: 55 additions & 5 deletions
Large diffs are not rendered by default.

sdk/agentserver/azure-ai-agentserver-responses/README.md

Lines changed: 8 additions & 0 deletions
@@ -113,6 +113,10 @@ The library orchestrates the complete response lifecycle: `created` → `in_prog

 For detailed handler implementation guidance, see [docs/handler-implementation-guide.md](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/agentserver/azure-ai-agentserver-responses/docs/handler-implementation-guide.md).

+### Durability
+
+Background responses with `store=True` are automatically crash-recoverable. If the server crashes mid-response, the handler is re-invoked on restart — no code changes needed. Stream events are persisted incrementally so clients can reconnect and resume from where they left off. For advanced scenarios (metadata checkpointing, multi-turn steering), see the [Durable Responses Developer Guide](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/agentserver/azure-ai-agentserver-responses/docs/durable-responses-developer-guide.md).
+
 ## Examples

 ### Echo handler
@@ -214,6 +218,10 @@ Visit the [Samples](https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/
 | [File Inputs](https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/agentserver/azure-ai-agentserver-responses/samples/sample_14_file_inputs.py) | Receive files via base64 data URL, URL, or file ID |
 | [Annotations](https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/agentserver/azure-ai-agentserver-responses/samples/sample_15_annotations.py) | Attach file_path, file_citation, and url_citation annotations |
 | [Structured Outputs](https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/agentserver/azure-ai-agentserver-responses/samples/sample_16_structured_outputs.py) | Return structured JSON as a `structured_outputs` item |
+| [Durable Claude](https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/agentserver/azure-ai-agentserver-responses/samples/durable_claude/agent.py) | Claude Agent SDK with stateful sessions and three-phase cancel |
+| [Durable Copilot](https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/agentserver/azure-ai-agentserver-responses/samples/durable_copilot/agent.py) | Copilot SDK with session lifecycle and steering |
+| [Durable LangGraph](https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/agentserver/azure-ai-agentserver-responses/samples/durable_langgraph/agent.py) | LangGraph multi-step graph with per-node checkpointing |
+| [Durable Multi-turn](https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/agentserver/azure-ai-agentserver-responses/samples/durable_multiturn/agent.py) | Multi-turn conversation with bounded metadata |

 - [Handler implementation guide](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/agentserver/azure-ai-agentserver-responses/docs/handler-implementation-guide.md) — Detailed reference for building handlers
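The Durability paragraph added to the README mentions metadata checkpointing for handlers that are re-invoked after a crash. The sketch below illustrates that general pattern only. It does not use the library: `run_steps`, the plain `dict` standing in for a persisted metadata store, and the simulated crash are all hypothetical, and the real handler API is the one described in the linked developer guide.

```python
from __future__ import annotations

from typing import Callable, Optional


def run_steps(
    steps: list[Callable[[], str]],
    checkpoint: dict,
    crash_at: Optional[int] = None,
) -> list[str]:
    """Run steps in order, recording progress in `checkpoint` after each one."""
    results = checkpoint.setdefault("results", [])
    start = checkpoint.get("next_step", 0)  # 0 on a fresh entry
    for index in range(start, len(steps)):
        if crash_at is not None and index == crash_at:
            raise RuntimeError("simulated crash")
        results.append(steps[index]())
        checkpoint["next_step"] = index + 1  # persist progress before moving on
    return results


steps = [lambda: "plan", lambda: "search", lambda: "answer"]
checkpoint: dict = {}  # hypothetical stand-in for a durable metadata store

try:
    run_steps(steps, checkpoint, crash_at=2)  # first invocation dies before step 2
except RuntimeError:
    pass

final = run_steps(steps, checkpoint)  # re-invocation resumes at step 2
```

Because progress is written after every completed step, the second call skips the two steps that already ran and only executes the last one.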

sdk/agentserver/azure-ai-agentserver-responses/azure/ai/agentserver/responses/__init__.py

Lines changed: 2 additions & 0 deletions
@@ -16,6 +16,7 @@
     get_input_expanded,
     to_output_item,
 )
+from .models.runtime import CancellationReason
 from .store._base import ResponseProviderProtocol, ResponseStreamProviderProtocol
 from .store._foundry_errors import (
     FoundryApiError,
@@ -32,6 +33,7 @@
 __all__ = [
     "__version__",
     "data_url",  # pylint: disable=naming-mismatch
+    "CancellationReason",
     "ResponsesAgentServerHost",
     "ResponseContext",
     "IsolationContext",
Lines changed: 216 additions & 0 deletions
@@ -0,0 +1,216 @@
+# Copyright (c) Microsoft Corporation.
+# Licensed under the MIT license.
+"""DurabilityContext — recovery-awareness state exposed to response handlers.
+
+Per spec 015 FR-040 / FR-005, the handler-facing metadata wrapper rejects
+any key (or named-namespace name) starting with ``_`` so that response
+handlers cannot accidentally collide with framework-reserved namespaces
+(e.g. ``_responses``). The framework layer reaches those namespaces via
+the underlying :class:`~azure.ai.agentserver.core.durable.TaskContext`
+directly — the primitive itself does not enforce the convention.
+"""
+
+from __future__ import annotations
+
+from collections.abc import Iterator, MutableMapping
+from typing import Any, Literal, Optional
+
+DurabilityEntryMode = Literal["fresh", "recovered"]
+
+
+class _DeveloperMetadataFacade(MutableMapping[str, Any]):
+    """Handler-facing wrapper over a ``TaskMetadata``-like backing store.
+
+    Provides the same dict-like + callable shape as
+    :class:`~azure.ai.agentserver.core.durable.TaskMetadata` but rejects
+    any key (or namespace name) starting with ``_``. Framework layers
+    that need to write into reserved namespaces (e.g. ``_responses``)
+    must use the underlying ``TaskContext.metadata`` directly — they do
+    NOT go through this wrapper.
+    """
+
+    def __init__(self, raw: Any, _namespaces: Optional[dict[str, Any]] = None) -> None:
+        self._raw = raw
+        # For plain-dict backing stores (used in unit tests where the
+        # backing object isn't a real TaskMetadata), maintain a private
+        # per-namespace dict registry so ``facade(name)`` returns a
+        # genuinely isolated store. For real TaskMetadata stores (callable),
+        # the underlying primitive owns the registry.
+        self._namespaces: dict[str, Any] = _namespaces if _namespaces is not None else {}
+
+    @staticmethod
+    def _check_key(key: Any) -> None:
+        if isinstance(key, str) and key.startswith("_"):
+            raise ValueError(
+                f"metadata keys starting with '_' are reserved for "
+                f"framework-internal namespaces (got {key!r}). Pick a "
+                f"non-underscore-prefixed name."
+            )
+
+    def __getitem__(self, key: str) -> Any:
+        self._check_key(key)
+        return self._raw[key]
+
+    def __setitem__(self, key: str, value: Any) -> None:
+        self._check_key(key)
+        self._raw[key] = value
+
+    def __delitem__(self, key: str) -> None:
+        self._check_key(key)
+        del self._raw[key]
+
+    def __iter__(self) -> Iterator[str]:
+        return iter(k for k in self._raw if not (isinstance(k, str) and k.startswith("_")))
+
+    def __len__(self) -> int:
+        return sum(1 for k in self._raw if not (isinstance(k, str) and k.startswith("_")))
+
+    def __contains__(self, key: object) -> bool:
+        if isinstance(key, str) and key.startswith("_"):
+            return False
+        return key in self._raw
+
+    def get(self, key: str, default: Any = None) -> Any:
+        if isinstance(key, str) and key.startswith("_"):
+            return default
+        return self._raw.get(key, default)
+
+    def __call__(self, name: Optional[str] = None) -> "_DeveloperMetadataFacade":
+        """Return a sibling namespace facade.
+
+        ``ctx.metadata`` accesses the default (unnamed) namespace.
+        ``ctx.metadata(name)`` accesses a named namespace.
+
+        :raises ValueError: If ``name`` starts with ``_`` (reserved).
+        """
+        if name is None:
+            return self
+        if not isinstance(name, str):
+            raise TypeError(
+                f"namespace name must be a str, got {type(name).__name__}"
+            )
+        if name.startswith("_"):
+            raise ValueError(
+                f"named namespace {name!r} starts with '_', which is "
+                f"reserved for framework-internal layers (e.g. "
+                f"'_responses'). Pick a non-underscore-prefixed name."
+            )
+        raw = self._raw
+        if callable(raw):
+            sub = raw(name)
+            return _DeveloperMetadataFacade(sub)
+        # Plain-dict fallback: keep an isolated sub-dict per namespace
+        sub = self._namespaces.setdefault(name, {})
+        return _DeveloperMetadataFacade(sub)
+
+    async def flush(self) -> None:
+        """Force-persist any pending metadata writes for this namespace.
+
+        Delegates to the underlying ``TaskMetadata.flush()`` when present.
+        For non-durable / transient contexts (e.g. ``store=false`` responses
+        or unit tests where the backing store is a plain ``dict``), this
+        is a no-op.
+        """
+        flush = getattr(self._raw, "flush", None)
+        if callable(flush):
+            import asyncio  # local import to avoid top-level cycle  # noqa: PLC0415
+
+            result = flush()
+            if asyncio.iscoroutine(result):
+                await result
+
+
+class DurabilityContext:
+    """Recovery-awareness context exposed to response handlers.
+
+    All properties are read-only except :attr:`metadata`, which is a
+    mutable mapping (also callable for named namespaces) for
+    developer-controlled checkpointing.
+
+    :param entry_mode: How the handler was entered — ``"fresh"`` for
+        normal invocation or ``"recovered"`` after a crash.
+    :param retry_attempt: Retry attempt counter — durable across crash
+        recovery. Resets to 0 on a successful invocation chain; increments
+        only on retryable failures.
+    :param was_steered: Whether this invocation resulted from steering.
+    :param pending_inputs: Number of queued steering inputs after this one.
+    :param metadata: Developer-accessible checkpoint store. Use
+        ``ctx.metadata`` for the default namespace or
+        ``ctx.metadata(name)`` for a named namespace.
+    """
+
+    __slots__ = (
+        "_entry_mode",
+        "_retry_attempt",
+        "_was_steered",
+        "_pending_inputs",
+        "_metadata",
+    )
+
+    def __init__(
+        self,
+        *,
+        entry_mode: DurabilityEntryMode,
+        retry_attempt: int,
+        was_steered: bool,
+        pending_inputs: int,
+        metadata: Any,
+    ) -> None:
+        self._entry_mode = entry_mode
+        self._retry_attempt = retry_attempt
+        self._was_steered = was_steered
+        self._pending_inputs = pending_inputs
+        self._metadata = (
+            metadata
+            if isinstance(metadata, _DeveloperMetadataFacade)
+            else _DeveloperMetadataFacade(metadata)
+        )
+
+    @property
+    def entry_mode(self) -> DurabilityEntryMode:
+        """How the handler was entered: ``'fresh'`` or ``'recovered'``."""
+        return self._entry_mode
+
+    @property
+    def is_recovery(self) -> bool:
+        """Convenience: True when this is a recovered re-invocation after a crash.
+
+        Equivalent to ``entry_mode == "recovered"``.
+        """
+        return self._entry_mode == "recovered"
+
+    @property
+    def retry_attempt(self) -> int:
+        """Retry attempt counter — durable across crash recovery.
+
+        Resets to 0 on a successful invocation; increments only when the
+        handler is re-invoked due to a retryable failure. The value is
+        persisted to the task store at lifecycle boundaries, so it is
+        stable across both in-process retries and post-crash recovery.
+
+        Per spec 015 FR-001/FR-002, this counter unifies the previous
+        ``run_attempt`` (per-process) and the cross-lifetime intent: the
+        framework now tracks a single durable retry count.
+        """
+        return self._retry_attempt
+
+    @property
+    def was_steered(self) -> bool:
+        """Whether this invocation was triggered by a steering input."""
+        return self._was_steered
+
+    @property
+    def pending_inputs(self) -> int:
+        """Number of queued steering inputs remaining after this one."""
+        return self._pending_inputs
+
+    @property
+    def metadata(self) -> _DeveloperMetadataFacade:
+        """Developer-accessible checkpoint store.
+
+        Use ``ctx.metadata["key"] = value`` for the default namespace, or
+        ``ctx.metadata("my_namespace")["key"] = value`` for a named
+        namespace. Keys (and namespace names) starting with ``_`` are
+        rejected — those are reserved for framework-internal layers.
+        """
+        return self._metadata
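The reserved-prefix rule in the metadata facade is easier to see on a small example. The sketch below is a reduced, hypothetical stand-in (`ReservedPrefixView` is not a name from this commit): it wraps a plain `dict`, hides keys that start with `_`, and rejects writes to them, while code holding the backing store can still use those keys. Named namespaces and `flush()` are left out.

```python
from collections.abc import Iterator, MutableMapping
from typing import Any


class ReservedPrefixView(MutableMapping):
    """Hypothetical stand-in: a view over `raw` that hides and rejects '_' keys."""

    def __init__(self, raw: dict) -> None:
        self._raw = raw

    @staticmethod
    def _is_reserved(key: Any) -> bool:
        return isinstance(key, str) and key.startswith("_")

    def _check(self, key: Any) -> None:
        if self._is_reserved(key):
            raise ValueError(f"keys starting with '_' are reserved (got {key!r})")

    def __getitem__(self, key: str) -> Any:
        self._check(key)
        return self._raw[key]

    def __setitem__(self, key: str, value: Any) -> None:
        self._check(key)
        self._raw[key] = value

    def __delitem__(self, key: str) -> None:
        self._check(key)
        del self._raw[key]

    def __iter__(self) -> Iterator[str]:
        return (k for k in self._raw if not self._is_reserved(k))

    def __len__(self) -> int:
        return sum(1 for _ in self)

    def __contains__(self, key: object) -> bool:
        # Overridden so membership tests report False instead of raising.
        return not self._is_reserved(key) and key in self._raw


# The "framework" writes a reserved key straight into the backing store.
raw = {"_responses": {"seq": 7}}
view = ReservedPrefixView(raw)
view["step"] = 3  # ordinary handler write goes through

try:
    view["_responses"] = {}
    rejected = False
except ValueError:
    rejected = True
```

Overriding `__contains__` matters here: the default `MutableMapping` implementation calls `__getitem__` and only catches `KeyError`, so without the override a membership test on a reserved key would raise `ValueError`.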

sdk/agentserver/azure-ai-agentserver-responses/azure/ai/agentserver/responses/_options.py

Lines changed: 32 additions & 2 deletions
@@ -23,6 +23,11 @@ def __init__(
         sse_keep_alive_interval_seconds: int | None = None,
         shutdown_grace_period_seconds: int = 10,
         create_span_hook: "CreateSpanHook | None" = None,
+        durable_background: bool = True,
+        steerable_conversations: bool = False,
+        store_disabled: bool = False,
+        max_pending: int = 10,
+        replay_event_ttl_seconds: float = 600,
     ) -> None:
         if additional_server_version is not None:
             normalized = additional_server_version.strip()
@@ -34,7 +39,10 @@ def __init__(
             default_model = normalized_model or None
         self.default_model = default_model

-        if sse_keep_alive_interval_seconds is not None and sse_keep_alive_interval_seconds <= 0:
+        if (
+            sse_keep_alive_interval_seconds is not None
+            and sse_keep_alive_interval_seconds <= 0
+        ):
             raise ValueError("sse_keep_alive_interval_seconds must be > 0 when set")
         self.sse_keep_alive_interval_seconds = sse_keep_alive_interval_seconds

@@ -48,8 +56,30 @@ def __init__(

         self.create_span_hook = create_span_hook

+        # Durability options (developer-controlled, baked into container image)
+        if steerable_conversations and store_disabled:
+            raise ValueError(
+                "steerable_conversations=True requires store to be enabled "
+                "(store_disabled must be False)"
+            )
+        if steerable_conversations and not durable_background:
+            raise ValueError(
+                "steerable_conversations=True requires durable_background=True "
+                "for background responses"
+            )
+        if max_pending <= 0:
+            raise ValueError("max_pending must be > 0")
+
+        self.durable_background = durable_background
+        self.steerable_conversations = steerable_conversations
+        self.store_disabled = store_disabled
+        self.max_pending = max_pending
+        self.replay_event_ttl_seconds = replay_event_ttl_seconds
+
     @classmethod
-    def from_env(cls, environ: Mapping[str, str] | None = None) -> "ResponsesServerOptions":
+    def from_env(
+        cls, environ: Mapping[str, str] | None = None
+    ) -> "ResponsesServerOptions":
         """Create options from environment variables.

         :param environ: Optional mapping of environment variables. Defaults to ``os.environ``.
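The constraints between the new durability options can be checked in isolation. The sketch below copies the three validation rules and the defaults from the hunk above into a standalone dataclass. `DurabilityOptions` and `try_build` are hypothetical names used for illustration; the real options live on `ResponsesServerOptions`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DurabilityOptions:
    """Hypothetical stand-in mirroring the durability checks in the diff."""

    durable_background: bool = True
    steerable_conversations: bool = False
    store_disabled: bool = False
    max_pending: int = 10
    replay_event_ttl_seconds: float = 600

    def __post_init__(self) -> None:
        # Steering needs a store to queue inputs against.
        if self.steerable_conversations and self.store_disabled:
            raise ValueError(
                "steerable_conversations=True requires store to be enabled "
                "(store_disabled must be False)"
            )
        # Steering also depends on durable background responses.
        if self.steerable_conversations and not self.durable_background:
            raise ValueError(
                "steerable_conversations=True requires durable_background=True "
                "for background responses"
            )
        if self.max_pending <= 0:
            raise ValueError("max_pending must be > 0")


def try_build(**kwargs) -> str:
    """Return 'ok' or the validation error message for the given options."""
    try:
        DurabilityOptions(**kwargs)
        return "ok"
    except ValueError as exc:
        return str(exc)


defaults = try_build()
steering_without_store = try_build(steerable_conversations=True, store_disabled=True)
bad_queue = try_build(max_pending=0)
```

The hunk shows no range check for `replay_event_ttl_seconds`, so the sketch does not add one either.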
