Commit 974c94a

feat: added agent.md
1 parent 8c74d25 commit 974c94a

2 files changed

Lines changed: 462 additions & 43 deletions

AGENTS.md

Lines changed: 246 additions & 0 deletions
@@ -0,0 +1,246 @@
# AGENTS.md

AI coding agent instructions for the AWS Lambda Durable Execution Java SDK.

## Project Overview

**Java SDK for AWS Lambda Durable Functions** - enables building resilient, multi-step workflows that can run for up to one year with automatic state management and failure recovery.

### Key Concepts

- **Checkpoint-and-replay**: Operations create checkpoints; on interruption, replay skips completed work
- **Durable operations**: `step()` executes with retry, `wait()` suspends without compute charges
- **Use cases**: Order processing, human approvals, AI agent workflows, distributed transactions

This implements the Java version of AWS's durable execution SDK (official SDKs exist for JavaScript/TypeScript and Python).

## Build & Test Commands

```bash
# Build all modules
mvn clean install

# Run unit tests only
mvn test

# Run specific test class
mvn test -Dtest=DurableContextTest

# Skip tests
mvn install -DskipTests
```

## Key Directories

```
sdk/                           # Core SDK module
├── src/main/java/com/amazonaws/lambda/durable/
│   ├── DurableHandler.java    # Lambda entry point (extend this)
│   ├── DurableContext.java    # User-facing API (step, wait)
│   ├── DurableExecutor.java   # Execution lifecycle
│   ├── execution/             # Thread coordination, checkpointing
│   ├── operation/             # StepOperation, WaitOperation
│   ├── model/                 # Data structures
│   ├── serde/                 # JSON serialization
│   ├── client/                # AWS API integration
│   └── exception/             # Domain exceptions

sdk-testing/                   # Test utilities (LocalDurableTestRunner, etc.)
examples/                      # Customer-facing examples with local and cloud tests
sdk-integration-tests/         # Integration tests for the SDK
```

## Coding Guidelines

### Java Style (MUST follow)

```java
// USE var when type is obvious
var ctx = new DurableContext();
var operations = new HashMap<Integer, Operation>();

// USE static imports for common utilities and factory methods
import static org.junit.jupiter.api.Assertions.*;              // Tests
import static java.util.Collections.emptyList;                 // Factory methods
import static com.amazonaws.lambda.durable.model.Status.*;     // Enums

// AVOID fully qualified names in code
// Bad:  com.amazonaws.lambda.durable.model.Status.SUCCESS
// Good: import static and use SUCCESS directly

// USE constructor injection
public DurableExecutor(DurableExecutionClient client, SerDes serDes) {
    this.client = client;
    this.serDes = serDes;
}
```

### Architecture Rules

- **No unnecessary interfaces** - Use concrete classes when only one implementation exists
- **Constructor injection** - All dependencies via constructor, no field injection
- **Defensive copies** - Copy mutable collections in constructors
- **Single responsibility** - One class, one job
- **Methods ≤30 lines** - Extract if longer

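The constructor-injection and defensive-copy rules above can be sketched as follows. `OperationLog` and its fields are hypothetical names chosen for illustration; they are not classes from the SDK.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class for illustration only; not part of the SDK.
final class OperationLog {
    private final String executionId;
    private final List<String> operationNames;

    // All dependencies arrive through the constructor (no field injection).
    OperationLog(String executionId, List<String> operationNames) {
        this.executionId = executionId;
        // Defensive copy: later changes to the caller's list do not leak in.
        // List.copyOf also returns an unmodifiable list, so the getter below
        // cannot expose mutable state.
        this.operationNames = List.copyOf(operationNames);
    }

    String executionId() {
        return executionId;
    }

    List<String> operationNames() {
        return operationNames;
    }

    public static void main(String[] args) {
        var names = new ArrayList<>(List.of("validate", "charge"));
        var log = new OperationLog("exec-1", names);
        names.add("ship"); // mutates only the caller's list
        System.out.println(log.operationNames()); // prints [validate, charge]
    }
}
```

Copying once in the constructor keeps the getter trivial and also covers the "Expose mutable state via getters" item in the list that follows.
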
### Package Naming

Prefer descriptive domain names: `model`, `execution`, `operation`, `serde`, `exception`

## Do Not

- Add new dependencies without explicit approval
- Create interfaces for single implementations
- Write tests for POJO getters/setters
- Expose mutable state via getters
- Change public API signatures without instruction
- Swallow exceptions silently
- Use field injection

## Testing Approach

### Test Organization

```
sdk/src/test/                        # Unit tests for SDK internals
├── DurableContextTest               # Test DurableContext behavior
├── DurableExecutorTest              # Test execution lifecycle
├── serde/JacksonSerDesTest          # Test serialization
└── retry/RetryStrategiesTest        # Test retry logic

sdk-integration-tests/src/test/      # Integration tests (SDK + mock AWS)
├── IntegrationTest                  # End-to-end with LocalDurableTestRunner
├── RetryIntegrationTest             # Retry behavior across operations
└── StepSemanticsIntegrationTest     # Step execution semantics

examples/src/test/                   # Customer-facing examples + cloud tests
├── SimpleStepExampleTest            # Local test with LocalDurableTestRunner
├── WaitExampleTest                  # Local test for wait operations
└── CloudBasedIntegrationTest        # Cloud tests with CloudDurableTestRunner
```

### Testing Strategy

**Unit Tests (sdk/src/test/)**
- Test individual classes in isolation
- Mock dependencies
- Fast, no external dependencies
- Run on every build

```java
@Test
void stepReturnsResultOnReplay() {
    var context = createTestContext(completedOperations);
    var result = context.step("test", String.class, () -> "new");
    assertEquals("cached", result); // Returns cached, doesn't re-execute
}
```

**Integration Tests (sdk-integration-tests/)**
- Test SDK components working together
- Use `LocalDurableTestRunner` (in-memory, no AWS)
- Test replay, checkpointing, error handling
- Run on every build

```java
@Test
void testRetryBehavior() {
    var runner = LocalDurableTestRunner.create(Input.class, handler::handleRequest);
    var result = runner.run(new Input("test"));
    assertEquals(ExecutionStatus.SUCCEEDED, result.getStatus());
}
```

**Example Tests (examples/src/test/)**
- Demonstrate SDK usage patterns
- Local tests use `LocalDurableTestRunner`
- Cloud tests use `CloudDurableTestRunner` (requires deployed Lambda)
- Cloud tests are disabled by default (enable with `-Dtest.cloud.enabled=true`)

```java
@Test
@EnabledIf("isCloudTestsEnabled")
void testAgainstRealLambda() {
    var arn = "arn:aws:lambda:us-east-1:123456789012:function:my-fn";
    var runner = CloudDurableTestRunner.create(arn, Input.class, Output.class);
    var result = runner.run(new Input("test"));
    assertEquals(ExecutionStatus.SUCCEEDED, result.getStatus());
}
```

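The cloud test example names an `isCloudTestsEnabled` condition method but does not show it. One plausible way to tie it to the `-Dtest.cloud.enabled=true` flag is sketched below. The holder class `CloudTestCondition` and the method body are assumptions, not code from this repository; in a real test class the method would sit next to the test that carries JUnit 5's `@EnabledIf`.

```java
// Hypothetical holder class for illustration. In practice the method would be
// declared in the test class annotated with @EnabledIf("isCloudTestsEnabled").
final class CloudTestCondition {
    // Assumed property name, taken from the -Dtest.cloud.enabled=true flag.
    static final String PROPERTY = "test.cloud.enabled";

    // Boolean.getBoolean returns true only when the system property is set
    // and equals "true" (ignoring case), so cloud tests stay off by default.
    static boolean isCloudTestsEnabled() {
        return Boolean.getBoolean(PROPERTY);
    }

    public static void main(String[] args) {
        System.out.println("cloud tests enabled: " + isCloudTestsEnabled());
    }
}
```

Declaring the method `static` keeps it usable whether `@EnabledIf` is placed on a single test method or on the whole class. This assumes the build forwards the `-D` property to the test JVM.
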
### Test Guidelines

- Test business logic, replay behavior, edge cases
- Don't test POJO getters/setters
- Use `LocalDurableTestRunner` for fast tests
- Use `CloudDurableTestRunner` only for end-to-end validation
- JUnit 5 with static imports for assertions

## Architecture Essentials

### Checkpoint-and-Replay

1. Operations get sequential IDs
2. Completed operations stored in ExecutionManager
3. On replay: return cached result, skip re-execution
4. New operations: execute, checkpoint, continue

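The four steps above can be illustrated with a deliberately simplified, single-threaded sketch. `ReplaySketch` is a hypothetical class that keeps checkpoints in an in-memory map; it is not the SDK's `ExecutionManager` and ignores retries, serialization, and thread coordination.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Simplified illustration of checkpoint-and-replay; not the SDK implementation.
final class ReplaySketch {
    // Completed operations keyed by sequential operation ID (the checkpoints).
    private final Map<Integer, Object> completed;
    private int nextOperationId = 0;

    ReplaySketch(Map<Integer, Object> checkpoints) {
        this.completed = new HashMap<>(checkpoints); // defensive copy
    }

    // Returns the cached result on replay; otherwise executes and checkpoints.
    <T> T step(Class<T> type, Supplier<T> action) {
        int id = nextOperationId++;              // 1. sequential ID
        if (completed.containsKey(id)) {         // 3. replay: skip re-execution
            return type.cast(completed.get(id));
        }
        T result = action.get();                 // 4. new operation: execute,
        completed.put(id, result);               //    checkpoint, continue
        return result;
    }

    Map<Integer, Object> checkpoints() {
        return Map.copyOf(completed);
    }

    public static void main(String[] args) {
        // First invocation: nothing is checkpointed, so both steps execute.
        var first = new ReplaySketch(Map.of());
        first.step(String.class, () -> "order-42");
        first.step(Integer.class, () -> 100);

        // Replay: steps 0 and 1 come from checkpoints; only step 2 runs.
        var replay = new ReplaySketch(first.checkpoints());
        var id = replay.step(String.class, () -> {
            throw new IllegalStateException("re-executed");
        });
        var total = replay.step(Integer.class, () -> -1);
        var status = replay.step(String.class, () -> "shipped");
        System.out.println(id + " " + total + " " + status); // order-42 100 shipped
    }
}
```

Because IDs are assigned by call order, this scheme only works when the code issues operations in the same order on every replay.
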
### Key Classes

| Class | Responsibility |
|-------|----------------|
| `DurableHandler<I,O>` | Lambda entry point, extend this |
| `DurableContext` | User API: `step()`, `wait()` |
| `DurableExecutor` | Orchestrates execution lifecycle |
| `ExecutionManager` | Thread coordination, state management |
| `CheckpointBatcher` | Batches checkpoint API calls (750KB limit) |
| `StepOperation` | Executes steps with retry logic |
| `WaitOperation` | Handles wait checkpointing |

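The table says `CheckpointBatcher` batches checkpoint calls under a 750KB limit but does not describe how. As a rough illustration of size-bounded batching in general, the sketch below groups string payloads so that no batch exceeds a byte budget. `SizeBoundedBatcher` is hypothetical, and reading "750KB" as 750 * 1024 bytes of UTF-8 payload is an assumption; the real class may measure size differently.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Illustrative size-bounded batching; not the SDK's CheckpointBatcher.
final class SizeBoundedBatcher {
    // Assumption: "750KB" read as 750 * 1024 bytes of UTF-8 payload.
    static final int MAX_BATCH_BYTES = 750 * 1024;

    // Groups payloads in order so that no batch exceeds maxBytes.
    static List<List<String>> batch(List<String> payloads, int maxBytes) {
        var batches = new ArrayList<List<String>>();
        var current = new ArrayList<String>();
        int currentBytes = 0;
        for (var payload : payloads) {
            int size = payload.getBytes(StandardCharsets.UTF_8).length;
            if (size > maxBytes) {
                throw new IllegalArgumentException("payload exceeds batch limit: " + size);
            }
            // Flush the open batch before it would overflow.
            if (currentBytes + size > maxBytes) {
                batches.add(List.copyOf(current));
                current.clear();
                currentBytes = 0;
            }
            current.add(payload);
            currentBytes += size;
        }
        if (!current.isEmpty()) {
            batches.add(List.copyOf(current));
        }
        return List.copyOf(batches);
    }

    public static void main(String[] args) {
        // A tiny limit makes the split visible: 4 + 4 bytes fit, the third does not.
        System.out.println(batch(List.of("aaaa", "bbbb", "cccc"), 10)); // [[aaaa, bbbb], [cccc]]
    }
}
```
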
## Common Tasks

### Add a New Operation Type

1. Create class in `operation/` implementing `DurableOperation<T>`
2. Add method to `DurableContext` that delegates to new operation
3. Add tests for: first execution, replay, error cases

### Add a Test

```java
@Test
void descriptiveTestName() {
    // Given
    var handler = new MyHandler();
    var runner = LocalDurableTestRunner.create(MyInput.class, handler::handleRequest);

    // When
    var result = runner.runUntilComplete(new MyInput("test"));

    // Then
    assertEquals(expected, result);
}
```

### Debug Thread Coordination

When debugging concurrency issues, start with `ExecutionManager`, which holds the thread registration and coordination logic.

## When Unsure

- Ask clarifying questions before making assumptions
- Check existing code for patterns (especially in `operation/` package)
- Prefer minimal changes over large refactors

## Further Reading

### Official AWS SDKs

- **JavaScript/TypeScript**: https://github.com/aws/aws-durable-execution-sdk-js
- **Python**: https://github.com/aws/aws-durable-execution-sdk-python

### AWS Documentation

- [Lambda Durable Functions](https://docs.aws.amazon.com/lambda/latest/dg/durable-functions.html)
- [Durable Execution SDK](https://docs.aws.amazon.com/lambda/latest/dg/durable-execution-sdk.html)
- [Best Practices](https://docs.aws.amazon.com/lambda/latest/dg/durable-best-practices.html)
