| 1 | +# Session Management Guide |
| 2 | + |
| 3 | +By default, FAST sessions are ephemeral — the session ID lives only in React state, so refreshing the page or logging out loses the conversation. To let users resume past conversations, two things are needed: a place where conversations live, and an API the frontend can call to list and read them. |
| 4 | + |
| 5 | +--- |
| 6 | + |
| 7 | +## The Shape That Doesn't Change |
| 8 | + |
| 9 | +Whichever storage pattern you pick, the wiring looks the same: |
| 10 | + |
| 11 | +- An **API Gateway endpoint** protected by Cognito |
| 12 | +- A **Lambda** that reads from the chosen store |
| 13 | +- A **frontend service** that calls the API and powers a sidebar |
| 14 | + |
| 15 | +The minimum endpoints are: |
| 16 | + |
| 17 | +| Method | Path | Description | |
| 18 | +|--------|------|-------------| |
| 19 | +| `GET` | `/sessions` | List sessions for the authenticated user | |
| 20 | +| `GET` | `/sessions/{id}` | Return a session's conversation history | |
| 21 | +| `DELETE` | `/sessions/{id}` | Remove a session | |
| 22 | + |
| 23 | +There is no `POST /sessions`. New sessions start implicitly the first time the agent runs with a fresh session ID, generated client-side via `crypto.randomUUID()`. The agent itself does the writing inside the runtime, the same way it already writes to AgentCore Memory today. The API never creates or updates sessions; it only lists, reads, and deletes them. |
| 24 | + |
| 25 | +--- |
| 26 | + |
| 27 | +## Three Patterns |
| 28 | + |
| 29 | +### Pattern 1: AgentCore Memory Only |
| 30 | + |
| 31 | +Use the memory resource the agent already writes to. The Lambda calls `ListSessions` and `ListEvents` from the AgentCore data plane to power the sidebar and load history, plus `DeleteEvent` for deletion. No new storage to manage. In the snippets below, `client` is the boto3 data plane client, `boto3.client("bedrock-agentcore")`. |
| 32 | + |
| 33 | +#### API Calls |
| 34 | + |
| 35 | +`ListSessions` — list a user's sessions: |
| 36 | + |
| 37 | +```python |
| 38 | +client.list_sessions(memoryId=MEMORY_ID, actorId=user_sub) |
| 39 | +# { |
| 40 | +# "sessionSummaries": [ |
| 41 | +# {"sessionId": "...", "actorId": "...", "createdAt": <datetime>}, |
| 42 | +# ... |
| 43 | +# ], |
| 44 | +# "nextToken": "..." # when paginated |
| 45 | +# } |
| 46 | +``` |
| 47 | + |
| 48 | +`ListEvents` — read a session's history: |
| 49 | + |
| 50 | +```python |
| 51 | +client.list_events( |
| 52 | + memoryId=MEMORY_ID, |
| 53 | + actorId=user_sub, |
| 54 | + sessionId=session_id, |
| 55 | + includePayloads=True, |
| 56 | +) |
| 57 | +# { |
| 58 | +# "events": [ |
| 59 | +# { |
| 60 | +# "eventId": "...", |
| 61 | +# "eventTimestamp": <datetime>, |
| 62 | +# "payload": [ |
| 63 | +# {"conversational": {"role": "USER", "content": {"text": "..."}}} |
| 64 | +# ], |
| 65 | +# }, |
| 66 | +# ... |
| 67 | +# ], |
| 68 | +# "nextToken": "..." |
| 69 | +# } |
| 70 | +``` |
| 71 | + |
| 72 | +Events come back newest-first, so reverse them before rendering the conversation top to bottom. |
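
As a rough illustration of loading a full history, the sketch below follows `nextToken` across pages and flattens the newest-first events into oldest-first messages. The helper names (`iter_events`, `to_messages`) and the `{role, text}` output shape are assumptions made for this example, not part of FAST or the AgentCore SDK.

```python
from typing import Any, Iterator


def iter_events(client: Any, memory_id: str, actor_id: str, session_id: str) -> Iterator[dict]:
    """Yield every event in a session, following nextToken across pages."""
    kwargs = {
        "memoryId": memory_id,
        "actorId": actor_id,
        "sessionId": session_id,
        "includePayloads": True,
    }
    while True:
        page = client.list_events(**kwargs)
        yield from page.get("events", [])
        token = page.get("nextToken")
        if not token:
            break
        kwargs["nextToken"] = token


def to_messages(events: list[dict]) -> list[dict]:
    """Flatten newest-first events into oldest-first messages.

    The {role, text} shape is a hypothetical contract for the frontend.
    """
    messages = []
    for event in reversed(events):
        for item in event.get("payload", []):
            conv = item.get("conversational")
            if conv:  # skip payload entries that are not conversational turns
                messages.append({"role": conv["role"].lower(), "text": conv["content"]["text"]})
    return messages
```

A caller would pass the real data plane client: `to_messages(list(iter_events(client, MEMORY_ID, user_sub, session_id)))`.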
| 73 | + |
| 74 | +`DeleteEvent` — delete events from a session: |
| 75 | + |
| 76 | +```python |
| 77 | +client.delete_event( |
| 78 | + memoryId=MEMORY_ID, |
| 79 | + actorId=user_sub, |
| 80 | + sessionId=session_id, |
| 81 | + eventId=event_id, |
| 82 | +) |
| 83 | +``` |
| 84 | + |
| 85 | +To delete a full session, page through `ListEvents` and call `DeleteEvent` on each. The session summary remains in `ListSessions` afterwards — there is no API to remove it, so a fully "deleted" session needs to be masked out-of-band (for example, in DynamoDB). |
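
The page-then-delete loop described above might look like the following sketch. The function name `delete_session_events` is hypothetical, and collecting IDs before deleting is a design choice of this example (it avoids mutating the session while still paginating through it).

```python
from typing import Any


def delete_session_events(client: Any, memory_id: str, actor_id: str, session_id: str) -> int:
    """Delete every event in a session and return how many were removed."""
    base = {"memoryId": memory_id, "actorId": actor_id, "sessionId": session_id}

    # Pass 1: collect all event IDs, following nextToken across pages.
    event_ids: list[str] = []
    kwargs = dict(base)
    while True:
        page = client.list_events(**kwargs)
        event_ids.extend(event["eventId"] for event in page.get("events", []))
        token = page.get("nextToken")
        if not token:
            break
        kwargs["nextToken"] = token

    # Pass 2: delete one event at a time; there is no batch delete in this sketch.
    for event_id in event_ids:
        client.delete_event(eventId=event_id, **base)
    return len(event_ids)
```

The session summary still shows up in `ListSessions` after this runs, so the masking step mentioned above is still needed.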
| 86 | + |
| 87 | +#### Limitations |
| 88 | + |
| 89 | +- AgentCore Memory is short-term — events expire after the `eventExpiryDuration` configured on the memory resource. Past that horizon, the conversation is gone. |
| 90 | +- Listing results are ordered by `sessionId`, not recency. Session summaries don't carry a name or last-activity field, so the Lambda needs an extra `ListEvents` call per session to derive a title and sort the sidebar. |
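
To make the second limitation concrete, here is a minimal sketch of the extra work the Lambda does per session. It assumes events arrive newest-first, reads only the first page of each session (so the title is a best-effort guess for long conversations), and uses hypothetical names (`summarize`, `build_sidebar`) and a hypothetical row shape.

```python
from typing import Any, Optional


def summarize(session_id: str, events: list[dict]) -> Optional[dict]:
    """Derive a sidebar row from newest-first events; None if the session is empty."""
    if not events:
        return None
    user_texts = (
        item["conversational"]["content"]["text"]
        for event in reversed(events)  # oldest event on this page first
        for item in event.get("payload", [])
        if item.get("conversational", {}).get("role") == "USER"
    )
    return {
        "sessionId": session_id,
        "title": next(user_texts, "Untitled")[:50],
        "lastActivity": events[0]["eventTimestamp"],  # newest event
    }


def build_sidebar(client: Any, memory_id: str, actor_id: str) -> list[dict]:
    """List sessions, fetch one page of events for each, and sort by recency."""
    rows: list[dict] = []
    kwargs = {"memoryId": memory_id, "actorId": actor_id}
    while True:
        page = client.list_sessions(**kwargs)
        for summary in page.get("sessionSummaries", []):
            events = client.list_events(
                memoryId=memory_id,
                actorId=actor_id,
                sessionId=summary["sessionId"],
                includePayloads=True,
            ).get("events", [])
            row = summarize(summary["sessionId"], events)
            if row:  # sessions with no remaining events are skipped
                rows.append(row)
        token = page.get("nextToken")
        if not token:
            break
        kwargs["nextToken"] = token
    return sorted(rows, key=lambda r: r["lastActivity"], reverse=True)
```

The cost is one `ListEvents` call per session on every sidebar load, which is what Pattern 2 avoids.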
| 91 | + |
| 92 | +#### Best Fit |
| 93 | + |
| 94 | +Short-lived apps, prototypes, or experiences where the sidebar only needs to surface recent sessions within the configured retention window. |
| 95 | + |
| 96 | +--- |
| 97 | + |
| 98 | +### Pattern 2: AgentCore Memory + DynamoDB |
| 99 | + |
| 100 | +Keep AgentCore Memory as the agent's runtime memory and add a DynamoDB table next to it. The agent still writes to Memory exactly as before — DynamoDB is a second store that the API reads from, not a replacement for Memory. |
| 101 | + |
| 102 | +Two reasonable flavors: |
| 103 | + |
| 104 | +| Flavor | DynamoDB Stores | Conversation Source | Use When | |
| 105 | +|--------|----------------|--------------------| ---------| |
| 106 | +| **Metadata only** | Session name, last-activity, status, custom fields | `ListEvents` from Memory | You want a fast, sortable sidebar but conversations fit within Memory's retention | |
| 107 | +| **Full duplication** | Metadata + full conversation history | DynamoDB directly | You need conversations to outlive Memory's `eventExpiryDuration`, or want a durable record | |
| 108 | + |
| 109 | +In both flavors the agent and the API stay decoupled — the agent doesn't know DynamoDB exists. Some other process (the runtime, a stream hook, or the Lambda itself on first read) populates it. The frontend reads from DynamoDB through the API. |
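
Whichever process populates the table, the write is typically an upsert: create the row on first sight, then bump its last-activity timestamp on every turn. The sketch below assumes a boto3 DynamoDB `Table` resource and the attribute names from the metadata data model in this guide; the function name `touch_session` is hypothetical.

```python
from datetime import datetime, timezone
from typing import Any


def touch_session(table: Any, user_id: str, session_id: str, name: str) -> None:
    """Create the session row if missing, otherwise only refresh updatedAt."""
    now = datetime.now(timezone.utc).isoformat()
    table.update_item(
        Key={"userId": user_id, "sessionId": session_id},
        # if_not_exists keeps the original createdAt, name, and status on later turns.
        UpdateExpression=(
            "SET updatedAt = :now, "
            "createdAt = if_not_exists(createdAt, :now), "
            "#n = if_not_exists(#n, :name), "
            "#s = if_not_exists(#s, :active)"
        ),
        # "name" and "status" are DynamoDB reserved words, so they need aliases.
        ExpressionAttributeNames={"#n": "name", "#s": "status"},
        ExpressionAttributeValues={":now": now, ":name": name, ":active": "active"},
    )
```

Because the update is idempotent, it can be called on every turn without checking first whether the row exists.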
| 110 | + |
| 111 | +#### Data Model (Metadata Flavor) |
| 112 | + |
| 113 | +``` |
| 114 | +Table: {stack_name_base}-Sessions |
| 115 | +├── Partition Key: userId (String) — Cognito user sub |
| 116 | +├── Sort Key: sessionId (String) — UUID generated client-side |
| 117 | +├── Attributes: |
| 118 | +│ ├── name (String) — Display name |
| 119 | +│ ├── status (String) — "active" | "completed" | "cancelled" |
| 120 | +│ ├── createdAt (String) — ISO 8601 |
| 121 | +│ ├── updatedAt (String) — ISO 8601 (last activity) |
| 122 | +│ └── metadata (Map) — Application-specific data |
| 123 | +``` |
| 124 | + |
| 125 | +**Why userId as partition key?** Each user only queries their own sessions. This gives efficient `Query` operations without a GSI. |
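
A minimal sketch of that query, assuming `table` is a boto3 DynamoDB `Table` resource for the table above. Because the sort key is `sessionId`, the items do not come back in recency order, so this example sorts by `updatedAt` in memory; the function name `list_user_sessions` is hypothetical.

```python
from typing import Any


def list_user_sessions(table: Any, user_id: str) -> list[dict]:
    """Return all of one user's sessions, most recently active first."""
    items: list[dict] = []
    kwargs = {
        "KeyConditionExpression": "userId = :u",
        "ExpressionAttributeValues": {":u": user_id},
    }
    while True:
        page = table.query(**kwargs)
        items.extend(page.get("Items", []))
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            break
        kwargs["ExclusiveStartKey"] = last_key  # continue from the previous page
    # ISO 8601 timestamps written in one timezone sort correctly as plain strings.
    return sorted(items, key=lambda item: item["updatedAt"], reverse=True)
```

In-memory sorting is fine while each user has a modest number of sessions; a local secondary index on `updatedAt` would be one option if that stops being true.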
| 126 | + |
| 127 | +#### Best Fit |
| 128 | + |
| 129 | +Production applications where you need durable session metadata, fast sidebar listing sorted by recency, or session data that outlives Memory's retention window. |
| 130 | + |
| 131 | +--- |
| 132 | + |
| 133 | +### Pattern 3: Skip AgentCore Memory |
| 134 | + |
| 135 | +Drop AgentCore Memory and use your own storage as the agent's memory system. The agent reads and writes session state to that store on every turn — the storage *is* the conversation history, not a parallel copy of it. |
| 136 | + |
| 137 | +Strands makes this straightforward with its [`S3SessionManager`](https://strandsagents.com/docs/user-guide/concepts/agents/session-management/#s3sessionmanager--s3storage), which loads and persists session state to an S3 prefix on each invocation: |
| 138 | + |
| 139 | +```python |
| 140 | +from strands import Agent |
| 141 | +from strands.session import S3SessionManager |
| 142 | + |
| 143 | +agent = Agent( |
| 144 | + model=model, |
| 145 | + tools=tools, |
| 146 | + session_manager=S3SessionManager( |
| 147 | +        bucket="my-sessions-bucket", |
| 148 | + prefix="sessions/", |
| 149 | + session_id=session_id, |
| 150 | + ), |
| 151 | +) |
| 152 | +``` |
| 153 | + |
| 154 | +A DynamoDB-backed equivalent works the same way. The Lambda powering the API reads from the same store the agent uses. |
| 155 | + |
| 156 | +#### Best Fit |
| 157 | + |
| 158 | +When you don't want to operate a Memory resource at all, or when you'd rather keep all conversation state in storage your team already manages. |
| 159 | + |
| 160 | +--- |
| 161 | + |
| 162 | +## Picking a Pattern |
| 163 | + |
| 164 | +| Consideration | Memory Only | Memory + DDB | Skip Memory | |
| 165 | +|---------------|-------------|--------------|-------------| |
| 166 | +| New infrastructure | None | DynamoDB table | S3 bucket or DDB table | |
| 167 | +| Sidebar speed | Slow (extra API calls per session) | Fast (single Query) | Fast (direct read) | |
| 168 | +| Session retention | Limited by `eventExpiryDuration` | Unlimited for metadata; unlimited for history only with full duplication | Unlimited | |
| 169 | +| Custom metadata (name, status) | No | Yes | Yes | |
| 170 | +| Complexity | Low | Medium | Medium | |
| 171 | + |
| 172 | +Start with **Memory-only** if your retention horizon fits the configured expiry and your sidebar needs are basic. Add **DynamoDB** when you need durable session metadata or fast listing. **Skip Memory** entirely if your framework already persists state where you want it. |
| 173 | + |
| 174 | +In all three cases the API contract, the Lambda's role, and the frontend stay the same. Only the Lambda's data source changes. |
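
One way to keep that swap cheap is to have the Lambda depend on a small interface rather than on a specific store. The sketch below is an assumption about how this could be structured, not something FAST ships: `SessionStore`, `InMemorySessionStore`, and `load_sidebar` are hypothetical names, and the in-memory class only stands in for a Memory, DynamoDB, or S3 backed implementation.

```python
from typing import Protocol


class SessionStore(Protocol):
    """The three operations behind GET /sessions, GET and DELETE /sessions/{id}."""

    def list_sessions(self, user_id: str) -> list[dict]: ...
    def get_history(self, user_id: str, session_id: str) -> list[dict]: ...
    def delete_session(self, user_id: str, session_id: str) -> None: ...


class InMemorySessionStore:
    """Stand-in store for local tests; real stores would wrap an AWS client."""

    def __init__(self) -> None:
        self._history: dict[tuple[str, str], list[dict]] = {}

    def append(self, user_id: str, session_id: str, message: dict) -> None:
        self._history.setdefault((user_id, session_id), []).append(message)

    def list_sessions(self, user_id: str) -> list[dict]:
        return [{"sessionId": sid} for (uid, sid) in self._history if uid == user_id]

    def get_history(self, user_id: str, session_id: str) -> list[dict]:
        return list(self._history.get((user_id, session_id), []))

    def delete_session(self, user_id: str, session_id: str) -> None:
        self._history.pop((user_id, session_id), None)


def load_sidebar(store: SessionStore, user_id: str) -> list[str]:
    """Handler-side code only sees the interface, never the concrete store."""
    return [row["sessionId"] for row in store.list_sessions(user_id)]
```

Switching patterns then means writing a new class with the same three methods and leaving the handler untouched.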
| 175 | + |
| 176 | +--- |
| 177 | + |
| 178 | +## Frontend Pieces |
| 179 | + |
| 180 | +The same set of pieces applies regardless of pattern: |
| 181 | + |
| 182 | +1. **A service module** that calls the three endpoints (`GET /sessions`, `GET /sessions/{id}`, `DELETE /sessions/{id}`) |
| 183 | +2. **A sidebar component** that lists sessions and triggers resume on click |
| 184 | +3. **A resume handler** that loads the chosen session's history and reuses its session ID for follow-up messages |
| 185 | +4. **A "new chat" handler** that generates a fresh session ID and clears the panel |
| 186 | + |
| 187 | +### Session Resumption |
| 188 | + |
| 189 | +Resuming works because the agent's session manager (whether AgentCore Memory, S3, or DDB) stores conversation history keyed by session ID, so you don't need to re-send past messages. Simply: |
| 190 | + |
| 191 | +1. Set the `runtimeSessionId` to the existing session's ID |
| 192 | +2. Send the new user message |
| 193 | +3. The session manager loads prior context automatically |
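
The steps above can be sketched as a small request builder. This is an illustration only: the `{"prompt": ...}` payload shape and the function name `build_invoke_request` are assumptions, since the payload format depends on what your agent expects.

```python
import json
import uuid
from typing import Optional


def build_invoke_request(agent_runtime_arn: str, message: str, session_id: Optional[str] = None) -> dict:
    """Build keyword arguments for invoking the runtime.

    Passing an existing session_id resumes that conversation; omitting it
    starts a new one with a fresh UUID (the "new chat" case).
    """
    return {
        "agentRuntimeArn": agent_runtime_arn,
        "runtimeSessionId": session_id or str(uuid.uuid4()),
        # Only the new message is sent; prior turns are loaded by the session manager.
        "payload": json.dumps({"prompt": message}).encode("utf-8"),  # assumed payload shape
    }
```

With boto3, the result would be passed as `boto3.client("bedrock-agentcore").invoke_agent_runtime(**request)`; a browser client would set the same session ID on its own request.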
| 194 | + |
| 195 | +### Session Naming |
| 196 | + |
| 197 | +| Strategy | Implementation | Quality | |
| 198 | +|----------|---------------|---------| |
| 199 | +| **First message truncation** | `message.slice(0, 50)` | Low — often not descriptive | |
| 200 | +| **LLM-generated title** | Ask the model after first exchange | High — adds latency/cost | |
| 201 | +| **User-provided** | Rename option in sidebar | Highest — requires user action | |
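
For the first strategy, a slightly more careful variant than a raw slice cuts at a word boundary and normalizes whitespace. This is a sketch in Python (for example, if the title is derived in the Lambda); the function name and the `"New chat"` fallback are hypothetical.

```python
def title_from_message(message: str, max_len: int = 50) -> str:
    """Derive a sidebar title from the first user message."""
    text = " ".join(message.split())  # collapse newlines and repeated spaces
    if not text:
        return "New chat"  # hypothetical fallback for empty input
    if len(text) <= max_len:
        return text
    # Back up to the last whole word inside the limit, then mark the truncation.
    cut = text[:max_len].rsplit(" ", 1)[0]
    return cut + "..."
```

The result is still only as descriptive as the user's opening message, which is why the table rates this strategy lowest.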
| 202 | + |
| 203 | +--- |
| 204 | + |
| 205 | +## Implementation Reference: DynamoDB + API Gateway |
| 206 | + |
| 207 | +This section provides concrete infrastructure guidance for Pattern 2. The implementation follows the same patterns as the existing FAST Feedback API (`infra-cdk/lambdas/feedback/`). |
| 208 | + |
| 209 | +### CDK: DynamoDB Table |
| 210 | + |
| 211 | +```typescript |
| 212 | +import * as dynamodb from "aws-cdk-lib/aws-dynamodb"; |
| 213 | + |
| 214 | +const sessionsTable = new dynamodb.Table(this, "SessionsTable", { |
| 215 | + tableName: `${config.stack_name_base}-Sessions`, |
| 216 | + partitionKey: { name: "userId", type: dynamodb.AttributeType.STRING }, |
| 217 | + sortKey: { name: "sessionId", type: dynamodb.AttributeType.STRING }, |
| 218 | + billingMode: dynamodb.BillingMode.PAY_PER_REQUEST, |
| 219 | +  removalPolicy: cdk.RemovalPolicy.DESTROY, // fine for dev stacks; use RETAIN to keep session data when the stack is deleted |
| 220 | + encryption: dynamodb.TableEncryption.AWS_MANAGED, |
| 221 | + pointInTimeRecovery: true, |
| 222 | +}); |
| 223 | +``` |
| 224 | + |
| 225 | +### CDK: API Gateway Routes |
| 226 | + |
| 227 | +Add session routes to the existing REST API, protected by the Cognito authorizer: |
| 228 | + |
| 229 | +```typescript |
| 230 | +const sessionsResource = api.root.addResource("sessions"); |
| 231 | +const sessionByIdResource = sessionsResource.addResource("{sessionId}"); |
| 232 | + |
| 233 | +sessionsResource.addMethod("GET", sessionsLambdaIntegration, { |
| 234 | + authorizer: cognitoAuthorizer, |
| 235 | + authorizationType: apigateway.AuthorizationType.COGNITO, |
| 236 | +}); |
| 237 | +sessionByIdResource.addMethod("GET", sessionsLambdaIntegration, { |
| 238 | + authorizer: cognitoAuthorizer, |
| 239 | + authorizationType: apigateway.AuthorizationType.COGNITO, |
| 240 | +}); |
| 241 | +sessionByIdResource.addMethod("DELETE", sessionsLambdaIntegration, { |
| 242 | + authorizer: cognitoAuthorizer, |
| 243 | + authorizationType: apigateway.AuthorizationType.COGNITO, |
| 244 | +}); |
| 245 | +``` |
| 246 | + |
| 247 | +Follow the existing Feedback API pattern for CORS configuration and Lambda integration. See `infra-cdk/lib/backend-stack.ts` for the full reference. |
| 248 | + |
| 249 | +### Lambda Handler |
| 250 | + |
| 251 | +Create `infra-cdk/lambdas/sessions/index.py` following the Powertools pattern used by the Feedback Lambda. The handler should: |
| 252 | + |
| 253 | +- Use `APIGatewayRestResolver` with CORS config from environment |
| 254 | +- Extract `userId` from `request_context.authorizer.claims["sub"]` |
| 255 | +- Query DynamoDB with `userId` as partition key |
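| 256 | +- Return sessions sorted by `updatedAt` descending (sort in the Lambda, because the table's sort key is `sessionId` and `Query` results are ordered by it) |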
| 257 | + |
| 258 | +See `infra-cdk/lambdas/feedback/index.py` for the exact patterns to follow (imports, CORS setup, Cognito claims extraction, error handling). |
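
To show the routing and claims extraction without pulling in dependencies, the sketch below handles the raw API Gateway REST proxy event directly instead of using Powertools; the real handler would use `APIGatewayRestResolver` as described above. The `store` object, `make_handler`, and the response body shapes are hypothetical, and CORS headers are omitted for brevity.

```python
import json
from typing import Any, Callable, Optional


def _response(status: int, body: Optional[dict]) -> dict:
    """Format an API Gateway proxy response (CORS headers omitted in this sketch)."""
    return {"statusCode": status, "body": json.dumps(body) if body is not None else ""}


def make_handler(store: Any) -> Callable[..., dict]:
    """Build a Lambda handler around a store exposing list/get/delete methods."""

    def handler(event: dict, context: Any = None) -> dict:
        # The Cognito authorizer places the verified token claims on the request context.
        authorizer = event.get("requestContext", {}).get("authorizer") or {}
        user_id = (authorizer.get("claims") or {}).get("sub")
        if not user_id:
            return _response(401, {"message": "Unauthorized"})

        method = event.get("httpMethod")
        session_id = (event.get("pathParameters") or {}).get("sessionId")

        if method == "GET" and session_id is None:
            return _response(200, {"sessions": store.list_sessions(user_id)})
        if method == "GET":
            return _response(200, {"messages": store.get_history(user_id, session_id)})
        if method == "DELETE" and session_id:
            store.delete_session(user_id, session_id)
            return _response(204, None)
        return _response(405, {"message": "Method not allowed"})

    return handler
```

Every store call is scoped by the `sub` claim rather than by anything in the request path or body, so one user cannot read or delete another user's sessions by guessing an ID.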
| 259 | + |
| 260 | +### SSM Parameter |
| 261 | + |
| 262 | +Store the API URL for cross-stack access: |
| 263 | + |
| 264 | +```typescript |
| 265 | +new ssm.StringParameter(this, "SessionsApiUrlParam", { |
| 266 | + parameterName: `/${config.stack_name_base}/sessions-api-url`, |
| 267 | + stringValue: api.url, |
| 268 | +}); |
| 269 | +``` |
| 270 | + |
| 271 | +### Cost |
| 272 | + |
| 273 | +| Resource | Cost | Notes | |
| 274 | +|----------|------|-------| |
| 275 | +| DynamoDB (on-demand) | ~$1.25 per million writes, ~$0.25 per million reads | Negligible for most apps | |
| 276 | +| API Gateway | $3.50 per million requests | Shared with other routes | |
| 277 | +| Lambda | Free tier covers 1M requests/month | Minimal compute | |
| 278 | + |
| 279 | +For most GenAI applications, these costs are negligible — token usage dominates. |
| 280 | + |
| 281 | +--- |
| 282 | + |
| 283 | +## Advanced: Long-Running Agent Sessions |
| 284 | + |
| 285 | +For agents that run autonomously for extended periods (minutes to hours), session management becomes more complex. Consider: |
| 286 | + |
| 287 | +- **Status tracking**: Extend DynamoDB with `status`, `detail`, and `lastHeartbeat` fields. The agent writes progress updates during execution. |
| 288 | +- **Polling**: If the agent runs longer than the SSE connection timeout (~60–90s on AgentCore), the frontend polls `GET /sessions/{id}` for progress instead of streaming. |
| 289 | +- **Cancellation**: The frontend requests `status: "cancelled"` in DynamoDB. Because the minimum API has no write endpoint, this needs an additional authenticated route. The agent checks the status before each tool call and stops gracefully. |
| 290 | +- **Streaming to S3**: The agent writes detailed output to S3 (JSONL format) for the frontend to consume independently of the session metadata. |
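
The status-tracking and cancellation bullets could fit together roughly as follows. This is a sketch under several assumptions: `table` is a boto3 DynamoDB `Table` resource keyed as in the data model above, each step is a plain callable standing in for a tool call, and `run_steps` is a hypothetical name.

```python
from datetime import datetime, timezone
from typing import Any, Callable, Iterable


def run_steps(
    table: Any, user_id: str, session_id: str, steps: Iterable[Callable[[], None]]
) -> tuple[str, int]:
    """Run steps in order, checking for cancellation before each one.

    Returns ("cancelled" | "completed", number of steps that ran).
    """
    key = {"userId": user_id, "sessionId": session_id}
    done = 0
    for step in steps:
        # Strongly consistent read so a just-written cancellation is seen.
        item = table.get_item(Key=key, ConsistentRead=True).get("Item", {})
        if item.get("status") == "cancelled":
            return ("cancelled", done)
        step()
        done += 1
        # Heartbeat and progress detail for the polling frontend.
        table.update_item(
            Key=key,
            UpdateExpression="SET #hb = :now, #d = :detail",
            ExpressionAttributeNames={"#hb": "lastHeartbeat", "#d": "detail"},
            ExpressionAttributeValues={
                ":now": datetime.now(timezone.utc).isoformat(),
                ":detail": f"completed step {done}",
            },
        )
    return ("completed", done)
```

Cancellation here is cooperative: a step that is already running finishes before the agent notices the flag, so long tool calls delay the stop.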
| 291 | + |
| 292 | +--- |
| 293 | + |
| 294 | +## Further Reading |
| 295 | + |
| 296 | +- [Strands S3SessionManager](https://strandsagents.com/docs/user-guide/concepts/agents/session-management/#s3sessionmanager--s3storage) |
| 297 | +- AgentCore Memory data plane: `CreateEvent`, `ListSessions`, `ListEvents`, `DeleteEvent` |
| 298 | +- [FAST Memory Integration Guide](./MEMORY_INTEGRATION.md) |
| 299 | +- [FAST Streaming Guide](./STREAMING.md) |
| 300 | +- [FAST Deployment Guide](./DEPLOYMENT.md) |