Commit c286146

docs(rfd): Initial draft of prompt format change
This is a very early draft with some lingering open questions. But I am opening now for discussion.
1 parent bbd2933 commit c286146

1 file changed

+306
-0
lines changed

docs/rfds/v2-prompt.mdx

Lines changed: 306 additions & 0 deletions
@@ -0,0 +1,306 @@
1+
---
2+
title: "Notification-based Prompting"
3+
---
4+
5+
Author(s): [@benbrandt](https://github.com/benbrandt)
6+
7+
## Elevator pitch
8+
9+
> What are you proposing to change?
10+
11+
For v2 of the protocol wire format, I propose moving to a fully notification-based format for prompting and, more generally, for session updates sent by both sides.
12+
13+
Once a session is created, both client and agent will be able to update the session at any point in time by sending updates to each other. As I'll go into later, this not only removes some current awkwardness around the prompt request lifecycle, but also provides a more flexible foundation for features like queued messages and multi-client replay. It can even allow the agent to initiate an interaction in a session rather than waiting for a user prompt. That is becoming increasingly important for background tasks and agents, which may send updates before or after a "turn" is over because their runtime may differ from that of the main conversation.
14+
15+
## Status quo
16+
17+
> How do things work today and what problems does this cause? Why would we change things?
18+
19+
Currently, the protocol assumes that all turns are initiated by a client and ended by an agent, with a series of session update notifications in between. While this is enough in many cases, it is becoming clear that the model is not flexible enough.
20+
21+
It is not clear how to model queued messages for instance: would these create a new turn request lifecycle? Fit into the existing one?
22+
23+
What if the agent wants to submit some text at the start of a session _before_ the user prompts? Or a status update? Also, if an agent finishes its turn and wants to wait for the next user prompt, but has a background subagent or task running, can it only submit updates about that task's status after the user prompts again? When replaying a session, the prompt request can be turned into a user message notification, but what about the end-of-turn response?
24+
25+
Some clients handle these out-of-turn updates more gracefully than others. But it is a constant point of confusion in discussions and issues.
26+
27+
In the spirit of leaving room for new paradigms and designs to emerge in the prompt lifecycle, I think we should impose fewer restrictions, whether explicitly described or merely inferred from the protocol, on when participants can make session updates. This will allow for more dynamic sessions and make it easier to extend the protocol to new use cases in the future.
28+
29+
## What we propose to do about it
30+
31+
> What are you proposing to improve the situation?
32+
33+
The first thing that needs to be decided is which `session/update` types are allowed from each side. While there may be some nice aspects of allowing both sides to send all updates, it is not clear this makes sense. It seems that there would be a more limited set of updates a client would/should make, and the agent session updates remain a superset of all session update notifications.
34+
35+
So an agent is allowed to send all session update types, and the client would be able to send only a few types. (This has the nice side-effect of allowing the agent, or a proxy in-between, to forward all client session updates to other connected clients in the case of attaching multiple clients to the same session. Perhaps in this case, the notifications have a client id of some kind, but this can be reserved for another RFD).
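As a minimal sketch of the asymmetry described above, the check below treats the client's allowed variants as a strict subset of the agent's. The allow-lists are assumptions for illustration: only `prompt` is proposed for the client, the agent-side names are the ones introduced later in this RFD, and `is_allowed` is a hypothetical helper, not part of any SDK.

```python
# Hypothetical allow-lists. The client may send only "prompt"; the agent set
# is a superset so the agent (or a proxy) can forward client updates.
ALLOWED = {
    "client": frozenset({"prompt"}),
    "agent": frozenset({"prompt", "user_message", "end_turn", "error"}),
}


def is_allowed(sender: str, message: dict) -> bool:
    """Return True if `sender` ("client" or "agent") may send this update."""
    if message.get("method") != "session/update":
        return False
    update = message.get("params", {}).get("update", {})
    kind = update.get("sessionUpdate")
    # Unknown senders get an empty allow-list, so everything is rejected.
    return kind in ALLOWED.get(sender, frozenset())
```

A real implementation would more likely express this in the schema itself, with one union of variants per direction, rather than as a runtime check.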
36+
37+
### Client user message `session/update` notification
38+
39+
With the current spec, this would basically be one variant: `prompt`.
40+
41+
It would look something like:
42+
43+
```json
44+
{
45+
"jsonrpc": "2.0",
46+
"method": "session/update",
47+
"params": {
48+
"sessionId": "sess_789xyz",
49+
"update": {
50+
"sessionUpdate": "prompt",
51+
"content": [
52+
{
53+
"type": "text",
54+
"text": "What's the capital of France?"
55+
}
56+
]
57+
}
58+
}
59+
}
60+
```
61+
62+
With content being an array of `ContentBlock`s.
63+
64+
v2 may bring some other session update types, but the proposal would be to just allow user message based notifications for now.
65+
66+
### Additional Agent `session/update` notification types
67+
68+
As far as I can tell at the moment, we only need to add a few notification types on the agent side to make this work.
69+
70+
#### User message accepted/acknowledged
71+
72+
In order to have a consistent understanding between agent and client on where the user message appears within the session history in relation to other messages, it is important to see when and where the agent has accepted the user message into the feed.
73+
74+
This will also be important for queueing messages, depending on how we implement that, so that the client can know if it is still allowed to edit the queued message, or where in the turn order it got inserted.
75+
76+
Even without a new queue, which may allow for editing the queued message, it means that the client doesn't necessarily have to send a `session/cancel` before prompting. This would need some exploration, but potentially the agent could decide whether it cancels the current turn and inserts it immediately, or inserts it at the next convenient break point. This should probably still be defined as "as soon as possible" and queueing would enable some later points, but it could still be more graceful than needing to cancel all current tool calls for example, as is required at the moment.
77+
78+
The question then turns to what makes up this notification. Which brings us to:
79+
80+
**Who owns the user message id?**
81+
82+
This is an open question at the moment for the [message id RFD](./message-id). If we allow the client to define the message id, this allows the client to eagerly create it and rely on it. However, if there isn't an agreement on "uniqueness", or if a given agent requires all message ids to be UUIDs or something similar, this could cause issues if both sides are allowed to treat ids as an opaque string, since there would need to be some agreement on them.
83+
84+
However, for acknowledgement, we will need to know which message the agent acknowledged. One option is to have the agent replay the full user message, which would be the safest way (and would be required for replay/fanning out to multiple clients). Alternatively, we would need some concept of a "prompt id", potentially distinct from a message id, so that the client can generate an id it can rely on while the agent continues to generate its own message ids and remains the sole source of truth for them. This is tempting, as ultimately the agent is responsible for the session persistence, and otherwise we may need to align on UUIDs or something similar for message ids, which may or may not fit well with the current agent implementations.
85+
86+
My current proposal is that this would look like the client sending the following message:
87+
88+
```json
89+
{
90+
"jsonrpc": "2.0",
91+
"method": "session/update",
92+
"params": {
93+
"sessionId": "sess_789xyz",
94+
"update": {
95+
"sessionUpdate": "prompt",
96+
"promptId": "prompt_123abc",
97+
"content": [
98+
{
99+
"type": "text",
100+
"text": "What's the capital of France?"
101+
}
102+
]
103+
}
104+
}
105+
}
106+
```
107+
108+
And the Agent responds with:
109+
110+
```json
111+
{
112+
"jsonrpc": "2.0",
113+
"method": "session/update",
114+
"params": {
115+
"sessionId": "sess_789xyz",
116+
"update": {
117+
"sessionUpdate": "user_message",
118+
"promptId": "prompt_123abc",
119+
"messageId": "mess_456def",
120+
"content": [
121+
{
122+
"type": "text",
123+
"text": "What's the capital of France?"
124+
}
125+
]
126+
}
127+
}
128+
}
129+
```
130+
131+
This likely needs some discussion, but it allows for:
132+
133+
- The client to send a prompt with a given id
134+
- And the agent to tell the client which inserted user message was created by that prompt, as well as its new message id.
135+
136+
The reason to differentiate the two ids is that a user message may originate from sources other than a prompt, so this lets the client know that the given prompt was accepted. It also allows the message to be fanned out to other clients as necessary.
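To illustrate how a client might use the two ids, here is a minimal sketch of client-side bookkeeping. `PromptTracker` and its methods are hypothetical names, and the use of a UUID for the prompt id is an assumption; the RFD only requires that the client can rely on the id it generates.

```python
import uuid


class PromptTracker:
    """Tracks prompts the client sent until the agent acknowledges them."""

    def __init__(self, session_id: str):
        self.session_id = session_id
        self.pending = {}  # promptId -> content awaiting acknowledgement
        self.history = []  # (messageId, content) in agent-confirmed order

    def send_prompt(self, content: list) -> dict:
        """Build a `prompt` notification and remember it as pending."""
        prompt_id = f"prompt_{uuid.uuid4().hex}"  # assumed id scheme
        self.pending[prompt_id] = content
        return {
            "jsonrpc": "2.0",
            "method": "session/update",
            "params": {
                "sessionId": self.session_id,
                "update": {
                    "sessionUpdate": "prompt",
                    "promptId": prompt_id,
                    "content": content,
                },
            },
        }

    def on_update(self, message: dict):
        """Record a `user_message`; return the promptId it acknowledged, if ours."""
        update = message["params"]["update"]
        if update.get("sessionUpdate") != "user_message":
            return None
        # The agent's position in the stream is the source of truth for order.
        self.history.append((update["messageId"], update["content"]))
        # promptId is missing when the message did not come from a prompt,
        # and unknown when another client sent the prompt.
        prompt_id = update.get("promptId")
        if prompt_id in self.pending:
            del self.pending[prompt_id]
            return prompt_id
        return None
```

Anything still in `pending` has not been accepted yet, which is the window in which a future queueing design could allow edits.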
137+
138+
Queuing could potentially build on this with, for example, a param on the prompt notification that says how eagerly to insert it, and potentially a `prompt_update` notification to edit the content of a given prompt id. This is again to be determined, but it feels like it could fit nicely into the design.
139+
140+
#### `end_turn` notification
141+
142+
This would be a notification from the agent to indicate that its current "turn" has ended, carrying information like `stopReason` and `usage` data for that turn.
143+
144+
It would also carry an optional error payload if a prompt was invalid or the turn failed to run for any reason.
145+
146+
Success:
147+
148+
```json
149+
{
150+
"jsonrpc": "2.0",
151+
"method": "session/update",
152+
"params": {
153+
"sessionId": "sess_789xyz",
154+
"update": {
155+
"sessionUpdate": "end_turn",
156+
"stopReason": "end_turn",
157+
"usage": {...}
158+
}
159+
}
160+
}
161+
```
162+
163+
Error for max tokens or other stop reason:
164+
165+
```json
166+
{
167+
"jsonrpc": "2.0",
168+
"method": "session/update",
169+
"params": {
170+
"sessionId": "sess_789xyz",
171+
"update": {
172+
"sessionUpdate": "end_turn",
173+
"stopReason": "max_tokens",
174+
"usage": {...}
175+
}
176+
}
177+
}
178+
```
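A client could reduce this notification to the few facts it needs, as in the rough sketch below. `summarize_end_turn` is a hypothetical helper, the camelCase `stopReason` key follows the field name used in the prose above, and treating only `"end_turn"` as a normal completion is an assumption.

```python
from typing import Optional


def summarize_end_turn(message: dict) -> Optional[dict]:
    """Summarize an `end_turn` update, or return None for any other message."""
    if message.get("method") != "session/update":
        return None
    params = message.get("params", {})
    update = params.get("update", {})
    if update.get("sessionUpdate") != "end_turn":
        return None
    stop_reason = update.get("stopReason")
    return {
        "sessionId": params.get("sessionId"),
        "stopReason": stop_reason,
        # Assumption: only "end_turn" counts as a normal completion.
        "completed": stop_reason == "end_turn",
        "usage": update.get("usage"),
    }
```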
179+
180+
#### Error Notification
181+
182+
During the course of execution, it is possible for errors to occur. Since we won't have the ability to return these errors in the response, we need a way to send them for a given session.
183+
184+
For example, an error for invalid prompt, using the normal ACP error shape:
185+
186+
```json
187+
{
188+
"jsonrpc": "2.0",
189+
"method": "session/update",
190+
"params": {
191+
"sessionId": "sess_789xyz",
192+
"update": {
193+
"sessionUpdate": "error",
194+
// Normal ACP Error shape
195+
"code": -32602,
196+
"message": "Invalid params",
197+
"data": {...}
198+
}
199+
}
200+
}
201+
```
202+
203+
Or auth required:
204+
206+
207+
```json
208+
{
209+
"jsonrpc": "2.0",
210+
"method": "session/update",
211+
"params": {
212+
"sessionId": "sess_789xyz",
213+
"update": {
214+
"sessionUpdate": "error",
215+
// Normal ACP Error shape
216+
"code": -32000,
217+
"message": "Authentication required",
218+
"data": {...}
219+
}
220+
}
221+
}
222+
```
223+
224+
## Shiny future
225+
226+
> How will things play out once this feature exists?
227+
228+
Overall, we move a few pieces of data from request/response into notifications, and we unlock a lot of new patterns of behavior. Basically the session becomes
229+
230+
## Implementation details and plan
231+
232+
> Tell me more about your implementation. What is your detailed implementation plan?
233+
234+
Overall, this isn't a huge lift on schema definition, but it is a large, **breaking** change in behavior which means we can only stabilize in protocol version 2.
235+
236+
Depending on how the rest of v2 testing goes, we can either:
237+
238+
1. Make this an opt-in "future-flag" capability on v1 so people can experiment, though it would remain an unstable feature regardless.
239+
2. Establish a preview/beta flow for v2.
240+
241+
We definitely need 2 regardless, and can likely handle this in a similar "unstable" manner as we do for current unstabilized features. It's likely timing will work out that we can just do it that way. If for some reason the timing doesn't work out, we can start experimenting with an unstable capability or some `_meta` flag.
242+
243+
## Frequently asked questions
244+
245+
> What questions have arisen over the course of authoring this document or during subsequent discussions?
246+
247+
#### Should we use the same method name here, and just rely on the schema defining for each side which variants are allowed?
248+
249+
This seems to be the right approach: each side is allowed to receive session update notifications, and the schema determines which types of notifications to expect.
250+
251+
#### Should `end_turn` be more like a `running/idle/requires_action` state change?
252+
253+
Since we are moving away from a user-initiated turn lifecycle, it raises the question of whether `end_turn` is the right name here. It is likely still a helpful concept, as it says when the generation stopped. But it might be helpful to do something similar to what the Claude Agent SDK does with its status update messages and just indicate at a high level whether the agent is "running", "idle", or "requires action" when user input is necessary to continue the current task.
254+
255+
This would require the agents to emit more of these at various points. But perhaps there is value in allowing the agent more explicit indications of the current status.
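To make the alternative concrete, here is a minimal sketch of what an explicit status update could look like. Everything in it is hypothetical: the `status` variant name, the `status` field, and both helpers are assumptions, since this RFD does not define them.

```python
from enum import Enum


class Status(str, Enum):
    RUNNING = "running"
    IDLE = "idle"
    REQUIRES_ACTION = "requires_action"


def status_notification(session_id: str, status: Status) -> dict:
    """Build a hypothetical `status` session update sent by the agent."""
    return {
        "jsonrpc": "2.0",
        "method": "session/update",
        "params": {
            "sessionId": session_id,
            "update": {"sessionUpdate": "status", "status": status.value},
        },
    }


def latest_status(messages: list, default: Status = Status.IDLE) -> Status:
    """Return the most recent status seen in a stream of notifications."""
    status = default
    for message in messages:
        update = message.get("params", {}).get("update", {})
        if update.get("sessionUpdate") == "status":
            status = Status(update["status"])
    return status
```

Because the last status wins, a client that attaches mid-session or replays history can recover the current state without reconstructing turn boundaries.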
256+
257+
#### Should we more closely associate errors with specific client notifications?
258+
259+
Or is it ok to have errors just occur in the stream and the lack of a replayed user message will mean it wasn't accepted?
260+
261+
We could also provide more fields on the error if necessary to make it easier to identify where it came from. An alternative would be:
262+
263+
```json
264+
{
265+
"jsonrpc": "2.0",
266+
"method": "session/error",
267+
"params": {
268+
"sessionId": "sess_789xyz",
269+
"promptId": "prompt_123abc", // Optional id fields as needed
270+
"error": {
271+
// Normal ACP Error shape
272+
"code": -32000,
273+
"message": "Authentication required",
274+
"data": {...}
275+
}
276+
}
277+
}
278+
```
279+
280+
This would make it a bit more generic, and might be a good alternative.
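As a sketch of how a client might consume this alternative, the function below attributes an error to a pending prompt when the optional `promptId` matches and otherwise treats it as session-wide. `route_error` and the `pending` mapping are hypothetical, and the `session/error` method itself is only the alternative floated above.

```python
def route_error(message: dict, pending: dict) -> tuple:
    """Return (scope, error), where scope is "prompt" or "session".

    `pending` maps promptId -> prompt content that the client sent but the
    agent has not yet acknowledged.
    """
    if message.get("method") != "session/error":
        raise ValueError("not a session/error notification")
    params = message["params"]
    prompt_id = params.get("promptId")  # optional by design
    if prompt_id is not None and prompt_id in pending:
        # The prompt was rejected, so it will never be acknowledged.
        del pending[prompt_id]
        return ("prompt", params["error"])
    return ("session", params["error"])
```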
281+
282+
### What alternative approaches did you consider, and why did you settle on this one?
283+
284+
#### Prompt remains a request
285+
286+
We could have prompt remain a request rather than a notification. The difference is that it would receive a response once it is _accepted_, not when the turn is over.
287+
288+
The response would become something like:
289+
290+
```json
291+
{
292+
"jsonrpc": "2.0",
293+
"id": "req_12345",
294+
"result": {
295+
"messageId": "msg_789xyz"
296+
}
297+
}
298+
```
299+
300+
And then the agent would still replay the message in the correct position. The benefit here is that we have perhaps a better mechanism of associating errors, since a prompt could be invalid or unsupported.
301+
302+
This might also be more desirable for queueing messages, so that you can cancel the request if you want to edit and requeue.
303+
304+
## Revision history
305+
306+
2026-04-13: Initial draft
