Commit b58f646

Merge pull request #108 from awslabs/feature/session-management-docs
Add session management documentation guide
2 parents 2b1155f + 2f1724b commit b58f646

2 files changed

Lines changed: 301 additions & 0 deletions

README.md

Lines changed: 1 addition & 0 deletions
@@ -159,6 +159,7 @@ fullstack-agentcore-solution-template/
 │ ├── CEDAR_POLICY_GUIDE.md # Cedar policy syntax, capabilities & reference
 │ ├── REPLACING_COGNITO.md # Identity provider swap & Gateway interceptors guide
 │ ├── RUNTIME_GATEWAY_AUTH.md # M2M authentication workflow
+│ ├── SESSION_MANAGEMENT.md # Session persistence & resumption guide
 │ ├── CONTEXT_MANAGEMENT.md # Context window management guide
 │ ├── STREAMING.md # Streaming implementation guide
 │ ├── TOOL_AC_CODE_INTERPRETER.md # Code Interpreter guide

docs/SESSION_MANAGEMENT.md

Lines changed: 300 additions & 0 deletions
@@ -0,0 +1,300 @@
1+
# Session Management Guide
2+
3+
By default, FAST sessions are ephemeral — the session ID lives only in React state, so refreshing the page or logging out loses the conversation. To let users resume past conversations, two things are needed: a place where conversations live, and an API the frontend can call to list and read them.
4+
5+
---
6+
7+
## The Shape That Doesn't Change
8+
9+
Whichever storage pattern you pick, the wiring looks the same:
10+
11+
- An **API Gateway endpoint** protected by Cognito
- A **Lambda** that reads from the chosen store
- A **frontend service** that calls the API and powers a sidebar
14+
15+
The minimum endpoints are:
16+
17+
| Method | Path | Description |
|--------|------|-------------|
| `GET` | `/sessions` | List sessions for the authenticated user |
| `GET` | `/sessions/{id}` | Return a session's conversation history |
| `DELETE` | `/sessions/{id}` | Remove a session |
22+
23+
There is no `POST /sessions`. A new session starts implicitly the first time the agent runs with a fresh session ID, which the client generates with `crypto.randomUUID()`. The agent writes conversation data from inside the runtime, the same way it already writes to AgentCore Memory today. The API never creates or appends to a conversation: it only lists, reads, and deletes.
24+
25+
---
26+
27+
## Three Patterns
28+
29+
### Pattern 1: AgentCore Memory Only
30+
31+
Use the memory resource the agent already writes to. The Lambda calls `ListSessions` and `ListEvents` from the AgentCore data plane to power the sidebar and load history, plus `DeleteEvent` for deletion. No new storage to manage.
32+
33+
#### API Calls
34+
35+
`ListSessions` — list a user's sessions:
36+
37+
```python
# Assumes `client` is a boto3 AgentCore data plane client,
# e.g. boto3.client("bedrock-agentcore"), and `user_sub` is the Cognito sub.
client.list_sessions(memoryId=MEMORY_ID, actorId=user_sub)
# {
#   "sessionSummaries": [
#     {"sessionId": "...", "actorId": "...", "createdAt": <datetime>},
#     ...
#   ],
#   "nextToken": "..."  # when paginated
# }
```
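
Because the response may carry a `nextToken`, a sidebar that shows every session has to follow it. A minimal sketch of that loop, assuming only the request and response fields shown above; the helper name `iter_session_summaries` is hypothetical, and the client is passed in so the function can be exercised with a stub:

```python
from typing import Any, Dict, Iterator


def iter_session_summaries(
    client: Any, memory_id: str, actor_id: str
) -> Iterator[Dict[str, Any]]:
    """Yield every session summary for one actor, following nextToken."""
    kwargs = {"memoryId": memory_id, "actorId": actor_id}
    while True:
        page = client.list_sessions(**kwargs)
        yield from page.get("sessionSummaries", [])
        token = page.get("nextToken")
        if not token:
            return  # last page reached
        kwargs["nextToken"] = token
```

Usage would look like `list(iter_session_summaries(client, MEMORY_ID, user_sub))`. Yielding lazily lets the Lambda stop early if it only needs the first few entries.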
47+
48+
`ListEvents` — read a session's history:
49+
50+
```python
client.list_events(
    memoryId=MEMORY_ID,
    actorId=user_sub,
    sessionId=session_id,
    includePayloads=True,
)
# {
#   "events": [
#     {
#       "eventId": "...",
#       "eventTimestamp": <datetime>,
#       "payload": [
#         {"conversational": {"role": "USER", "content": {"text": "..."}}}
#       ],
#     },
#     ...
#   ],
#   "nextToken": "..."
# }
```
71+
72+
Events come back newest-first.
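
A chat panel needs the oldest message first, so the Lambda (or the frontend) has to reorder the events before rendering. A minimal sketch of that conversion, assuming the payload shape shown above; `events_to_messages` is a hypothetical helper, and it sorts on `eventTimestamp` rather than relying on the returned order:

```python
from typing import Any, Dict, List


def events_to_messages(events: List[Dict[str, Any]]) -> List[Dict[str, str]]:
    """Flatten ListEvents output into chronological chat messages."""
    messages: List[Dict[str, str]] = []
    # Sort oldest-first so the result can be rendered top to bottom.
    for event in sorted(events, key=lambda e: e["eventTimestamp"]):
        for item in event.get("payload", []):
            conv = item.get("conversational")
            if not conv:
                continue  # skip payload entries that are not chat turns
            messages.append(
                {
                    "role": conv["role"].lower(),
                    "text": conv.get("content", {}).get("text", ""),
                }
            )
    return messages
```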
73+
74+
`DeleteEvent` — delete events from a session:
75+
76+
```python
client.delete_event(
    memoryId=MEMORY_ID,
    actorId=user_sub,
    sessionId=session_id,
    eventId=event_id,
)
```
84+
85+
To delete a full session, page through `ListEvents` and call `DeleteEvent` on each. The session summary remains in `ListSessions` afterwards — there is no API to remove it, so a fully "deleted" session needs to be masked out-of-band (for example, in DynamoDB).
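
The page-then-delete loop described above can be sketched as follows. This is an illustration under stated assumptions, not the template's implementation: `delete_session_events` is a hypothetical name, and collecting all event IDs before deleting is a design choice made here so that deletions cannot disturb pagination.

```python
from typing import Any, List


def delete_session_events(
    client: Any, memory_id: str, actor_id: str, session_id: str
) -> int:
    """Delete every event in one session and return how many were removed."""
    scope = {"memoryId": memory_id, "actorId": actor_id, "sessionId": session_id}
    event_ids: List[str] = []
    # Payloads are not needed for deletion, so skip them to keep pages small.
    kwargs = {**scope, "includePayloads": False}
    while True:
        page = client.list_events(**kwargs)
        event_ids.extend(event["eventId"] for event in page.get("events", []))
        token = page.get("nextToken")
        if not token:
            break
        kwargs["nextToken"] = token
    for event_id in event_ids:
        client.delete_event(**scope, eventId=event_id)
    return len(event_ids)
```

The session summary still has to be hidden separately, as noted above.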
86+
87+
#### Limitations
88+
89+
- AgentCore Memory is short-term: events expire after the `eventExpiryDuration` configured on the memory resource. Past that horizon, the conversation is gone.
- Listing results are ordered by `sessionId`, not recency. Session summaries don't carry a name or last-activity field, so the Lambda needs an extra `ListEvents` call per session to derive a title and sort the sidebar.
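
To make the second limitation concrete, here is a rough sketch of the extra work the Lambda does per session: it derives a title from the first user message and a last-activity time from the newest event, then sorts the sidebar by that time. The function names and the `"Untitled session"` fallback are assumptions for illustration.

```python
from typing import Any, Dict, List


def sidebar_entry(
    session_id: str, events: List[Dict[str, Any]], max_len: int = 50
) -> Dict[str, Any]:
    """Derive a title and last-activity timestamp from one session's events."""
    ordered = sorted(events, key=lambda e: e["eventTimestamp"])
    user_texts = (
        item["conversational"].get("content", {}).get("text", "")
        for event in ordered
        for item in event.get("payload", [])
        if item.get("conversational", {}).get("role") == "USER"
    )
    first_user_text = next(user_texts, "")
    return {
        "sessionId": session_id,
        "title": first_user_text[:max_len] or "Untitled session",
        "lastActivity": ordered[-1]["eventTimestamp"] if ordered else None,
    }


def sort_by_recency(entries: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Most recently active first; sessions with no events go last."""
    return sorted(
        entries,
        key=lambda e: (e["lastActivity"] is not None, e["lastActivity"]),
        reverse=True,
    )
```

Each call to `sidebar_entry` implies one `ListEvents` round trip, which is why this pattern gets slower as a user accumulates sessions.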
91+
92+
#### Best Fit
93+
94+
Short-lived apps, prototypes, or experiences where the sidebar only needs to surface recent sessions within the configured retention window.
95+
96+
---
97+
98+
### Pattern 2: AgentCore Memory + DynamoDB
99+
100+
Keep AgentCore Memory as the agent's runtime memory and add a DynamoDB table next to it. The agent still writes to Memory exactly as before — DynamoDB is a second store that the API reads from, not a replacement for Memory.
101+
102+
Two reasonable flavors:
103+
104+
| Flavor | DynamoDB Stores | Conversation Source | Use When |
|--------|-----------------|---------------------|----------|
| **Metadata only** | Session name, last-activity, status, custom fields | `ListEvents` from Memory | You want a fast, sortable sidebar but conversations fit within Memory's retention |
| **Full duplication** | Metadata + full conversation history | DynamoDB directly | You need conversations to outlive Memory's `eventExpiryDuration`, or want a durable record |
108+
109+
In both flavors the agent and the API stay decoupled — the agent doesn't know DynamoDB exists. Some other process (the runtime, a stream hook, or the Lambda itself on first read) populates it. The frontend reads from DynamoDB through the API.
110+
111+
#### Data Model (Metadata Flavor)
112+
113+
```
Table: {stack_name_base}-Sessions
├── Partition Key: userId (String)    - Cognito user sub
├── Sort Key: sessionId (String)      - UUID generated client-side
├── Attributes:
│   ├── name (String)                 - Display name
│   ├── status (String)               - "active" | "completed" | "cancelled"
│   ├── createdAt (String)            - ISO 8601
│   ├── updatedAt (String)            - ISO 8601 (last activity)
│   └── metadata (Map)                - Application-specific data
```
124+
125+
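**Why userId as partition key?** Each user only queries their own sessions, so a single `Query` on `userId` returns everything the sidebar needs without a GSI. Note that the results come back ordered by the sort key (`sessionId`), not by `updatedAt`, so the Lambda sorts by recency in memory after the query.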
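
Whichever process populates the table, the write is typically an upsert that creates the item on first contact and bumps `updatedAt` afterwards. A minimal sketch against a boto3 DynamoDB `Table` resource, assuming the data model above; `touch_session` is a hypothetical helper. The `#name` and `#status` placeholders are used because `name` and `status` are DynamoDB reserved words.

```python
from datetime import datetime, timezone
from typing import Any, Optional


def touch_session(
    table: Any, user_id: str, session_id: str, name: Optional[str] = None
) -> str:
    """Create the session item if missing and record the latest activity time."""
    now = datetime.now(timezone.utc).isoformat()
    update = (
        "SET #updatedAt = :now, "
        "#createdAt = if_not_exists(#createdAt, :now), "
        "#status = if_not_exists(#status, :active)"
    )
    names = {
        "#updatedAt": "updatedAt",
        "#createdAt": "createdAt",
        "#status": "status",
    }
    values = {":now": now, ":active": "active"}
    if name is not None:
        update += ", #name = :name"
        names["#name"] = "name"
        values[":name"] = name
    table.update_item(
        Key={"userId": user_id, "sessionId": session_id},
        UpdateExpression=update,
        ExpressionAttributeNames=names,
        ExpressionAttributeValues=values,
    )
    return now
```

Using `if_not_exists` keeps `createdAt` and `status` stable across repeated calls, so the same function serves both the first turn and every later one.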
126+
127+
#### Best Fit
128+
129+
Production applications where you need durable session metadata, fast sidebar listing sorted by recency, or session data that outlives Memory's retention window.
130+
131+
---
132+
133+
### Pattern 3: Skip AgentCore Memory
134+
135+
Drop AgentCore Memory and use your own storage as the agent's memory system. The agent reads and writes session state to that store on every turn — the storage *is* the conversation history, not a parallel copy of it.
136+
137+
Strands makes this straightforward with its [`S3SessionManager`](https://strandsagents.com/docs/user-guide/concepts/agents/session-management/#s3sessionmanager--s3storage), which loads and persists session state to an S3 prefix on each invocation:
138+
139+
```python
from strands import Agent
from strands.session import S3SessionManager

# `model`, `tools`, and `session_id` come from the surrounding agent setup.
agent = Agent(
    model=model,
    tools=tools,
    session_manager=S3SessionManager(
        session_id=session_id,
        bucket="my-sessions-bucket",  # S3 bucket name
        prefix="sessions/",
    ),
)
```
153+
154+
A DynamoDB-backed equivalent works the same way. The Lambda powering the API reads from the same store the agent uses.
155+
156+
#### Best Fit
157+
158+
When you don't want to operate a Memory resource at all, or when you'd rather keep all conversation state in storage your team already manages.
159+
160+
---
161+
162+
## Picking a Pattern
163+
164+
| Consideration | Memory Only | Memory + DDB | Skip Memory |
|---------------|-------------|--------------|-------------|
| New infrastructure | None | DynamoDB table | S3 bucket or DDB table |
| Sidebar speed | Slow (extra API calls per session) | Fast (single Query) | Fast (direct read) |
| Session retention | Limited by `eventExpiryDuration` | Unlimited | Unlimited |
| Custom metadata (name, status) | No | Yes | Yes |
| Complexity | Low | Medium | Medium |
171+
172+
Start with **Memory-only** if your retention horizon fits the configured expiry and your sidebar needs are basic. Add **DynamoDB** when you need durable session metadata or fast listing. **Skip Memory** entirely if your framework already persists state where you want it.
173+
174+
In all three cases the API contract, the Lambda's role, and the frontend stay the same. Only the Lambda's data source changes.
175+
176+
---
177+
178+
## Frontend Pieces
179+
180+
The same set of pieces applies regardless of pattern:
181+
182+
1. **A service module** that calls the three endpoints (`GET /sessions`, `GET /sessions/{id}`, `DELETE /sessions/{id}`)
2. **A sidebar component** that lists sessions and triggers resume on click
3. **A resume handler** that loads the chosen session's history and reuses its session ID for follow-up messages
4. **A "new chat" handler** that generates a fresh session ID and clears the panel
186+
187+
### Session Resumption
188+
189+
When resuming a session, you don't need to re-send past messages: the agent's session manager (AgentCore Memory, S3, or DDB) stores conversation history keyed by session ID. To resume:

1. Set the `runtimeSessionId` to the existing session's ID
2. Send the new user message
3. The session manager loads prior context automatically
194+
195+
### Session Naming
196+
197+
| Strategy | Implementation | Quality |
|----------|----------------|---------|
| **First message truncation** | `message.slice(0, 50)` | Low: often not descriptive |
| **LLM-generated title** | Ask the model after first exchange | High: adds latency/cost |
| **User-provided** | Rename option in sidebar | Highest: requires user action |
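
If titles are derived server-side instead of in the browser, the first-message strategy can be made slightly less rough by collapsing whitespace and cutting at a word boundary. A small sketch; `title_from_message` and the `"New chat"` fallback are illustrative assumptions rather than part of FAST:

```python
def title_from_message(message: str, max_len: int = 50) -> str:
    """Build a sidebar title from the first user message."""
    text = " ".join(message.split())  # collapse newlines and repeated spaces
    if not text:
        return "New chat"
    if len(text) <= max_len:
        return text
    # Cut at the last space inside the limit; fall back to a hard cut
    # when the message has no spaces (for example, a long URL).
    cut = text[:max_len].rsplit(" ", 1)[0] or text[:max_len]
    return cut.rstrip() + "..."
```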
202+
203+
---
204+
205+
## Implementation Reference: DynamoDB + API Gateway
206+
207+
This section provides concrete infrastructure guidance for Pattern 2. The implementation follows the same patterns as the existing FAST Feedback API (`infra-cdk/lambdas/feedback/`).
208+
209+
### CDK: DynamoDB Table
210+
211+
```typescript
import * as cdk from "aws-cdk-lib";
import * as dynamodb from "aws-cdk-lib/aws-dynamodb";

const sessionsTable = new dynamodb.Table(this, "SessionsTable", {
  tableName: `${config.stack_name_base}-Sessions`,
  partitionKey: { name: "userId", type: dynamodb.AttributeType.STRING },
  sortKey: { name: "sessionId", type: dynamodb.AttributeType.STRING },
  billingMode: dynamodb.BillingMode.PAY_PER_REQUEST,
  // DESTROY removes the table with the stack; consider RETAIN for production data.
  removalPolicy: cdk.RemovalPolicy.DESTROY,
  encryption: dynamodb.TableEncryption.AWS_MANAGED,
  pointInTimeRecovery: true,
});
```
224+
225+
### CDK: API Gateway Routes
226+
227+
Add session routes to the existing REST API, protected by the Cognito authorizer:
228+
229+
```typescript
// `api`, `cognitoAuthorizer`, and `sessionsLambdaIntegration` are defined elsewhere in the stack.
const sessionsResource = api.root.addResource("sessions");
const sessionByIdResource = sessionsResource.addResource("{sessionId}");

sessionsResource.addMethod("GET", sessionsLambdaIntegration, {
  authorizer: cognitoAuthorizer,
  authorizationType: apigateway.AuthorizationType.COGNITO,
});
sessionByIdResource.addMethod("GET", sessionsLambdaIntegration, {
  authorizer: cognitoAuthorizer,
  authorizationType: apigateway.AuthorizationType.COGNITO,
});
sessionByIdResource.addMethod("DELETE", sessionsLambdaIntegration, {
  authorizer: cognitoAuthorizer,
  authorizationType: apigateway.AuthorizationType.COGNITO,
});
```
246+
247+
Follow the existing Feedback API pattern for CORS configuration and Lambda integration. See `infra-cdk/lib/backend-stack.ts` for the full reference.
248+
249+
### Lambda Handler
250+
251+
Create `infra-cdk/lambdas/sessions/index.py` following the Powertools pattern used by the Feedback Lambda. The handler should:
252+
253+
- Use `APIGatewayRestResolver` with CORS config from environment
- Extract `userId` from `request_context.authorizer.claims["sub"]`
- Query DynamoDB with `userId` as partition key
- Return sessions sorted by `updatedAt` descending
257+
258+
See `infra-cdk/lambdas/feedback/index.py` for the exact patterns to follow (imports, CORS setup, Cognito claims extraction, error handling).
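
The query-and-sort step of the handler might look like the sketch below. It leaves out the Powertools routing and claims extraction, which should follow the Feedback Lambda, and assumes `table` is a boto3 DynamoDB `Table` resource for the sessions table; `list_sessions_for_user` is a hypothetical name.

```python
from typing import Any, Dict, List


def list_sessions_for_user(table: Any, user_id: str) -> List[Dict[str, Any]]:
    """Return all of a user's session items, most recently updated first."""
    items: List[Dict[str, Any]] = []
    kwargs: Dict[str, Any] = {
        "KeyConditionExpression": "userId = :uid",
        "ExpressionAttributeValues": {":uid": user_id},
    }
    while True:
        page = table.query(**kwargs)
        items.extend(page.get("Items", []))
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            break  # no more pages
        kwargs["ExclusiveStartKey"] = last_key
    # ISO 8601 strings in one consistent format and timezone sort
    # lexicographically in time order, so a plain string sort is enough.
    items.sort(key=lambda item: item.get("updatedAt", ""), reverse=True)
    return items
```

Because `userId` always comes from the verified Cognito claim and never from the request path or body, a caller cannot list another user's sessions.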
259+
260+
### SSM Parameter
261+
262+
Store the API URL for cross-stack access:
263+
264+
```typescript
import * as ssm from "aws-cdk-lib/aws-ssm";

new ssm.StringParameter(this, "SessionsApiUrlParam", {
  parameterName: `/${config.stack_name_base}/sessions-api-url`,
  stringValue: api.url,
});
```
270+
271+
### Cost
272+
273+
| Resource | Cost | Notes |
|----------|------|-------|
| DynamoDB (on-demand) | ~$1.25 per million writes, ~$0.25 per million reads | Negligible for most apps |
| API Gateway | $3.50 per million requests | Shared with other routes |
| Lambda | Free tier covers 1M requests/month | Minimal compute |
278+
279+
For most GenAI applications, these costs are negligible — token usage dominates.
280+
281+
---
282+
283+
## Advanced: Long-Running Agent Sessions
284+
285+
For agents that run autonomously for extended periods (minutes to hours), session management becomes more complex. Consider:
286+
287+
- **Status tracking**: Extend DynamoDB with `status`, `detail`, `lastHeartbeat` fields. The agent writes progress updates during execution.
- **Polling**: If the agent runs longer than the SSE connection timeout (~60–90s on AgentCore), the frontend polls `GET /sessions/{id}` for progress instead of streaming.
- **Cancellation**: The frontend sets `status: "cancelled"` in DynamoDB. The agent checks before each tool call and stops gracefully.
- **Streaming to S3**: The agent writes detailed output to S3 (JSONL format) for the frontend to consume independently of the session metadata.
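
The cancellation check can be sketched as a small gate in front of each unit of work. This assumes the sessions table from Pattern 2 and a boto3 `Table` resource; `is_cancelled` and `run_until_cancelled` are hypothetical names, and how the check is wired into a real agent loop (for example, through a tool-call hook) depends on the framework.

```python
from typing import Any, Callable, Iterable, List


def is_cancelled(table: Any, user_id: str, session_id: str) -> bool:
    """Read the session item and report whether the user cancelled it."""
    response = table.get_item(
        Key={"userId": user_id, "sessionId": session_id},
        ConsistentRead=True,  # see the frontend's write immediately
    )
    return response.get("Item", {}).get("status") == "cancelled"


def run_until_cancelled(
    steps: Iterable[Callable[[], Any]], cancelled: Callable[[], bool]
) -> List[Any]:
    """Run steps in order, checking for cancellation before each one."""
    results: List[Any] = []
    for step in steps:
        if cancelled():
            break  # stop gracefully; completed results are kept
        results.append(step())
    return results
```

Checking before each step rather than on a timer keeps the agent from abandoning a tool call halfway through.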
291+
292+
---
293+
294+
## Further Reading
295+
296+
- [Strands S3SessionManager](https://strandsagents.com/docs/user-guide/concepts/agents/session-management/#s3sessionmanager--s3storage)
297+
- AgentCore Memory data plane: `CreateEvent`, `ListSessions`, `ListEvents`, `DeleteEvent`
298+
- [FAST Memory Integration Guide](./MEMORY_INTEGRATION.md)
299+
- [FAST Streaming Guide](./STREAMING.md)
300+
- [FAST Deployment Guide](./DEPLOYMENT.md)
