+@startuml
+
+participant Client
+participant Endpoint as "Streaming query endpoint handler"
+participant Auth
+participant LlamaStack as "Llama Stack Client"
+participant EventHandler as "Stream build event"
+participant SSE as "SSE Response Stream"
+
+Client->>Endpoint: HTTP POST /stream_query
+Endpoint->>Auth: Validate auth, user, conversation access
+Auth-->>Endpoint: Access granted
+Endpoint->>LlamaStack: Call retrieve_response(model, query)
+LlamaStack-->>Endpoint: AsyncIterator[AgentTurnResponseStreamChunk]
+
+Endpoint->>SSE: stream_start_event(conversation_id)
+SSE-->>Client: SSE: start
+
+loop For each chunk from LlamaStack
+    Endpoint->>EventHandler: stream_build_event(chunk, chunk_id, metadata)
+    alt Chunk Type: turn_start
+        EventHandler->>SSE: emit turn_start event
+    else Chunk Type: inference
+        EventHandler->>SSE: emit inference (token) event
+    else Chunk Type: tool_execution
+        EventHandler->>SSE: emit tool_call + tool_result events
+    else Chunk Type: shield
+        EventHandler->>SSE: emit shield validation event
+    else Chunk Type: turn_complete
+        EventHandler->>SSE: emit turn_complete event
+    else Error
+        EventHandler->>SSE: emit error event
+    end
+    SSE-->>Client: SSE event(s)
+end
+
+Endpoint->>SSE: stream_end_event(metadata, summary, token_usage)
+SSE-->>Client: SSE: end (with metadata)
+
+Endpoint->>Endpoint: Conditionally persist transcript & cache
+Endpoint-->>Client: Close stream
+
+@enduml
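
The streaming loop in the diagram above can be sketched in Python as an async generator. This is a minimal illustration under stated assumptions, not the endpoint's actual implementation: chunks are modeled as plain dicts with a `type` key rather than real `AgentTurnResponseStreamChunk` objects, and `format_sse`, `response_generator`, `fake_llama_stack_stream`, `collect_events`, and the JSON payload shape are all hypothetical. Auth, transcript persistence, and the `metadata`, `summary`, and `token_usage` arguments are left out to keep the sketch short.

```python
import asyncio
import json
from typing import Any, AsyncIterator, Dict, List


def format_sse(event: str, data: Dict[str, Any]) -> str:
    """Serialize one event as an SSE `data:` frame (payload shape is assumed)."""
    return f"data: {json.dumps({'event': event, 'data': data})}\n\n"


def stream_build_event(chunk: Dict[str, Any], chunk_id: int) -> List[str]:
    """Map one chunk to one or more SSE frames, mirroring the `alt` branches."""
    kind = chunk.get("type")
    if kind == "turn_start":
        return [format_sse("turn_start", {"id": chunk_id})]
    if kind == "inference":
        return [format_sse("token", {"id": chunk_id, "token": chunk["text"]})]
    if kind == "tool_execution":
        # A tool step yields two frames: the call and its result.
        return [
            format_sse("tool_call", {"id": chunk_id, "name": chunk["tool"]}),
            format_sse("tool_result", {"id": chunk_id, "result": chunk["result"]}),
        ]
    if kind == "shield":
        return [format_sse("shield", {"id": chunk_id, "status": chunk["status"]})]
    if kind == "turn_complete":
        return [format_sse("turn_complete", {"id": chunk_id})]
    # Anything unrecognized falls through to the error branch.
    return [format_sse("error", {"id": chunk_id, "message": f"unexpected chunk: {kind}"})]


async def response_generator(
    chunks: AsyncIterator[Dict[str, Any]], conversation_id: str
) -> AsyncIterator[str]:
    """Emit start, then per-chunk events, then end, in the diagram's order."""
    yield format_sse("start", {"conversation_id": conversation_id})
    chunk_id = 0
    async for chunk in chunks:
        for frame in stream_build_event(chunk, chunk_id):
            yield frame
        chunk_id += 1
    # Simplified end payload; the diagram's end event carries more fields.
    yield format_sse("end", {"chunks": chunk_id})


async def fake_llama_stack_stream() -> AsyncIterator[Dict[str, Any]]:
    """Stand-in for the Llama Stack chunk iterator (hypothetical data)."""
    for chunk in [
        {"type": "turn_start"},
        {"type": "inference", "text": "Hello"},
        {"type": "tool_execution", "tool": "search", "result": "ok"},
        {"type": "turn_complete"},
    ]:
        yield chunk


async def collect_events() -> List[str]:
    """Drain the generator and return only the event names, in order."""
    frames = [f async for f in response_generator(fake_llama_stack_stream(), "conv-1")]
    return [json.loads(f[len("data: "):])["event"] for f in frames]


if __name__ == "__main__":
    print(asyncio.run(collect_events()))
```

In a web framework, a generator like `response_generator` would typically be handed to a streaming response object with the `text/event-stream` media type. Keeping the chunk-to-event mapping in a separate pure function makes each branch of the `alt` block testable without a running stream.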