Skip to content

Commit 8121ee0

Browse files
scottschreckengaustclaudekrokoko
authored
docs: establish ADR framework and ADR-001 stacked pull requests (aws-samples#130)
* docs: establish ADR framework and ADR-001 stacked pull requests Introduces `docs/decisions/` as the normative location for Architecture Decision Records. ADR-001 documents the stacked PR methodology for delivering multi-PR features with reduced review fatigue. - docs/decisions/README.md — ADR process, template, lifecycle - docs/decisions/001-stacked-pull-requests.md — stacked PR methodology - AGENTS.md routing table updated for agent discoverability - Starlight sync + sidebar includes Decisions section Closes aws-samples#129 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs(adr-001): address PR aws-samples#130 review feedback - Drop `of {M}` from position statement to avoid backfill thrashing when stack size changes (scottschreckengaust, ayushtr-aws) - Make "Next/Remaining" section explicitly optional — only include when scope is fixed and known (scottschreckengaust, isadeks, ayushtr-aws) - Add `blocked by` / `blocking` dependency relationships to sub-issue linking section (scottschreckengaust, ayushtr-aws) - Keep Pragmatic Engineer reference (ayushtr-aws) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs(ADR-001): add merge semantics section Clarify what happens when an early PR in a stack merges before later PRs are reviewed: retarget to main, rebase, verify CI. Resolves the tension between "PR N targets N-1" and "early PRs can ship." Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs(ADR-001): add infra-stack deployability exception For multi-PR CDK/IAM stacks, intermediate slices may not deploy independently. Clarify that synth+tests is the validation gate, and introduce a "deploy-gate PR" concept in the stack position block. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs(ADR-001): address remaining nits (3-5) from review - Clarify that review time/blast radius drives stacking, not file count - Add "Relationship to docs/design/" section in ADR README - Add ADR pointer in CONTRIBUTING.md alongside design docs guidance Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: bgagent <345885+scottschreckengaust@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: Alain Krok <alkrok@amazon.com>
1 parent 557e22c commit 8121ee0

9 files changed

Lines changed: 397 additions & 0 deletions

File tree

AGENTS.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -22,6 +22,7 @@ Use this routing before editing so the right package and tests get updated:
2222
| Agent runtime (clone, tools, prompts, container) | `agent/src/` (`pipeline.py`, `runner.py`, `config.py`, `hooks.py`, `policy.py`, `prompts/`, Dockerfile, etc.) | `agent/tests/`, `agent/README.md` for env/PAT |
2323
| Agent progress events (written to `TaskEventsTable` from the MicroVM; read by `bgagent watch`) | `agent/src/progress_writer.py`, `agent/src/pipeline.py` and `agent/src/runner.py` (integration points) | `agent/tests/test_progress_writer.py`; `cli/src/commands/watch.ts` for the consumer side |
2424
| User-facing or design prose | `docs/guides/`, `docs/design/` | Run **`mise //docs:sync`** or **`mise //docs:build`** (do not edit `docs/src/content/docs/` by hand) |
25+
| Architecture decisions (ADRs) | `docs/decisions/` | Run **`mise //docs:sync`** after adding or editing an ADR |
2526
| Monorepo tasks, CI glue | Root `mise.toml`, `scripts/`, `.github/workflows/` ||
2627

2728
### CDK handler tests (quick map)

CONTRIBUTING.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -28,6 +28,7 @@ Guidelines:
2828
- If you change API types in `cdk/src/handlers/shared/types.ts`, update `cli/src/types.ts` to match.
2929
- If you change docs sources (`docs/guides/`, `docs/design/`), run `mise //docs:sync` so generated content stays in sync.
3030
- For significant features, add a design document to `docs/design/`.
31+
- For cross-cutting or hard-to-reverse decisions, add an ADR to `docs/decisions/` (see [ADR README](./docs/decisions/README.md)).
3132

3233
### 4. Commit
3334

docs/astro.config.mjs

Lines changed: 5 additions & 0 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.
Lines changed: 120 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,120 @@
1+
# ADR-001: Stacked pull requests for multi-PR features
2+
3+
**Status:** accepted
4+
**Date:** 2026-05-19
5+
6+
## Context
7+
8+
Complex features in ABCA often span multiple packages, resource types, and concerns. Delivering these as a single large PR creates several problems:
9+
10+
- **Review fatigue:** PRs exceeding ~500 lines suffer from diminished reviewer attention — critical issues get missed in the noise of mechanical changes.
11+
- **Context loss:** Without a framework, sequential PRs leave reviewers without knowledge of where they are in the overall delivery, what came before, or what remains.
12+
- **Agent discoverability:** AI coding agents picking up a sub-task cannot determine the broader goal, prior decisions, or remaining work without reconstructing context from scattered commits and issues.
13+
- **Blocked progress:** A single large PR blocks all progress until the entire feature is reviewed. Stalling on one concern (e.g., IAM review) blocks unrelated work (e.g., documentation).
14+
15+
The [Pragmatic Engineer analysis of stacked diffs](https://newsletter.pragmaticengineer.com/p/stacked-diffs) documents how organizations (Meta, Google, Graphite users) use this pattern to maintain velocity on complex changes while keeping review quality high.
16+
17+
## Decision
18+
19+
Use **stacked pull requests** for features spanning multiple concerns or where review time and blast radius justify decomposition. The numeric thresholds below are guidelines — the primary signal is whether a single PR would exceed a reasonable review session, not file count alone. Each PR in the stack follows these rules:
20+
21+
### 1. Position statement
22+
23+
Every PR description states its position:
24+
25+
```markdown
26+
## Stack position
27+
28+
PR {N} for #{parent-issue} — {overall goal one-liner}
29+
30+
### Prior: {what the previous PR delivered}
31+
### This PR: {what this adds}
32+
### Next (optional): {what comes next, if scope is known}
33+
```
34+
35+
This gives reviewers and agents immediate orientation. The "Next" section is optional — include it when the remaining scope is fixed and known; omit it when scope is still evolving. The parent issue is the source of truth for overall progress.
36+
37+
### 2. Branch targeting
38+
39+
- PR 1 targets `main`
40+
- PR N targets PR N-1's branch
41+
- Final PR merges the full stack to `main`
42+
43+
```
44+
main
45+
└── feat/first-concern (PR 1)
46+
└── feat/second-concern (PR 2)
47+
└── feat/third-concern (PR 3 → merge to main)
48+
```
49+
50+
### 3. Self-contained reviewability
51+
52+
Each PR:
53+
- Compiles and passes tests independently
54+
- Can be deployed without breaking the system (see exception below)
55+
- Has a single clear responsibility (one concern per PR)
56+
- Does not leave dead code, TODOs, or broken intermediate states
57+
58+
**Infrastructure stack exception:** For multi-PR CDK/IAM changes where intermediate slices cannot deploy independently (e.g., a policy referencing a resource added in a later PR), the validation gate is **synth + tests passing** — not a successful deploy. In this case, designate a **deploy-gate PR** in the stack position block: the specific PR where the stack becomes end-to-end deployable. Acceptable intermediate states include feature-flagged resources, no-op stubs, and constructs gated behind context variables.
59+
60+
### 4. Size guidelines
61+
62+
| Metric | Target | Maximum |
63+
|--------|--------|---------|
64+
| Lines changed | 200–400 | 600 |
65+
| Review time | 20–30 min | 45 min |
66+
| Files touched | 3–8 | 12 |
67+
68+
If a PR exceeds these, decompose further.
69+
70+
### 5. Rebase discipline
71+
72+
When a lower PR changes after review feedback:
73+
- All PRs above it in the stack must be rebased
74+
- CI must pass on each PR independently after rebase
75+
- Reviewers are notified of the rebase (GitHub does this automatically)
76+
77+
### 6. Sub-issue linking
78+
79+
- Parent issue lists all sub-issues with a stack visualization diagram
80+
- Each sub-issue references the parent and its position in the stack
81+
- GitHub's task list in the parent tracks completion
82+
- Estimated review time is listed per sub-issue to help reviewers plan
83+
- Sub-issues use `blocked by #NNN` / `blocking #NNN` relationships to express dependency order — agents and reviewers can identify which issues are unblocked and ready for pickup
84+
85+
### 7. When NOT to use stacked PRs
86+
87+
- Changes under ~200 lines that fit naturally in one PR
88+
- Hotfixes that need immediate merge
89+
- Dependency bumps (use Dependabot grouping instead)
90+
- Documentation-only changes that are self-contained
91+
92+
### 8. Merge semantics
93+
94+
The default topology is a **classic stack** — each PR targets its predecessor's branch. When an early PR merges to `main` before later PRs are reviewed:
95+
96+
1. **Retarget** all PRs that pointed at the merged branch to `main` (or to the next unmerged predecessor). Use `gh pr edit <N> --base main` or GitHub's "Retarget" button.
97+
2. **Rebase** each retargeted PR onto its new base so the diff is clean.
98+
3. **CI must pass** on each retargeted PR independently after rebase.
99+
100+
After retargeting, the remaining PRs form a shorter stack rooted on `main`. This is the expected, normal path — not an exception.
101+
102+
**When the stack diverges:** If review feedback on PR 2 invalidates assumptions in PRs 3+, prefer closing and re-opening the affected PRs over accumulating fixup commits that obscure intent. The parent issue remains the source of truth for what shipped and what remains.
103+
104+
## Consequences
105+
106+
- (+) Each PR stays in the "reviewable without fatigue" window (~15–40 min)
107+
- (+) Agents can pick up any sub-issue independently — the position statement provides full context
108+
- (+) Partial delivery is meaningful — each merged PR adds value independently
109+
- (+) Reviewers approve incrementally without needing full-stack mental context
110+
- (+) Early PRs can merge and ship while later ones are still in review
111+
- (-) Rebase cascades when early PRs receive feedback
112+
- (-) More overhead in PR descriptions and branch management
113+
- (-) Requires discipline to keep each PR independently valid (no "this will be fixed in PR N+1")
114+
- (!) If the stack grows beyond ~8 PRs, consider decomposing into independent sub-stacks
115+
116+
## References
117+
118+
- [Stacked Diffs — Pragmatic Engineer](https://newsletter.pragmaticengineer.com/p/stacked-diffs)
119+
- RFC #120 — first formal use of this pattern in ABCA
120+
- Issue #129 — implementation of this ADR

docs/decisions/README.md

Lines changed: 69 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,69 @@
1+
# Architecture Decision Records (ADRs)
2+
3+
This directory captures significant design decisions for the ABCA project. Each ADR explains **why** a decision was made — not just what was decided — so that future contributors (human and AI) can understand the reasoning without excavating git history or PR discussions.
4+
5+
## When to write an ADR
6+
7+
Write an ADR when a decision:
8+
9+
- Affects multiple packages or the overall architecture
10+
- Establishes a pattern other code will follow
11+
- Is non-obvious — a reasonable person might choose differently
12+
- Is hard to reverse once implemented
13+
14+
Do **not** write an ADR for routine implementation choices that are self-evident from the code.
15+
16+
## Template
17+
18+
```markdown
19+
# ADR-NNN: Title
20+
21+
**Status:** proposed | accepted | superseded | deprecated
22+
**Date:** YYYY-MM-DD
23+
**Supersedes:** ADR-NNN (if applicable)
24+
**Superseded by:** ADR-NNN (if applicable)
25+
26+
## Context
27+
28+
What is the problem or situation that requires a decision? Include constraints, requirements, and forces at play.
29+
30+
## Decision
31+
32+
What was decided and why. Be specific — name the approach chosen.
33+
34+
## Consequences
35+
36+
What follows from this decision:
37+
- (+) Positive outcomes
38+
- (-) Negative outcomes or trade-offs
39+
- (!) Risks or things to watch
40+
41+
## References
42+
43+
- Links to RFCs, issues, PRs, or external resources that informed the decision
44+
```
45+
46+
## Numbering
47+
48+
ADRs are numbered sequentially with zero-padded three-digit prefixes: `001-slug.md`, `002-slug.md`, etc. Numbers are never reused.
49+
50+
## Lifecycle
51+
52+
| Status | Meaning |
53+
|--------|---------|
54+
| `proposed` | Under discussion, not yet binding |
55+
| `accepted` | Active and authoritative |
56+
| `superseded` | Replaced by a newer ADR (link to successor) |
57+
| `deprecated` | No longer applicable (context changed) |
58+
59+
A decision starts as `proposed` during RFC discussion and moves to `accepted` when the implementing PR merges. To change an accepted decision, write a new ADR that supersedes it — do not edit the original.
60+
61+
## Relationship to `docs/design/`
62+
63+
Design documents describe system shape, interfaces, and implementation detail. ADRs capture cross-cutting choices that constrain multiple designs. When a design decision is significant enough to be "hard to reverse" or "non-obvious," extract it as an ADR and reference it from the design doc. An ADR may supersede another ADR; a design doc is simply updated in place.
64+
65+
## Discovery
66+
67+
- **Agents:** `AGENTS.md` routes to this directory for understanding past design rationale.
68+
- **Humans:** Browse this directory or the docs site under the "Decisions" section.
69+
- **Search:** Each ADR title and context section are written to be grep-friendly.

docs/scripts/sync-starlight.mjs

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -262,5 +262,8 @@ mirrorMarkdownFile(
262262
// AGENTS.md reference for contributors. The rename happens only at the site level.
263263
mirrorDirectory(path.join(docsRoot, 'design'), path.join('src', 'content', 'docs', 'architecture'));
264264

265+
// --- Decision records (ADRs): mirror to decisions/ ---
266+
mirrorDirectory(path.join(docsRoot, 'decisions'), path.join('src', 'content', 'docs', 'decisions'));
267+
265268
// Guardrail: ensure target tree exists when running in a clean checkout.
266269
fs.mkdirSync(targetRoot, { recursive: true });
Lines changed: 124 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,124 @@
1+
---
2+
title: 001 stacked pull requests
3+
---
4+
5+
# ADR-001: Stacked pull requests for multi-PR features
6+
7+
**Status:** accepted
8+
**Date:** 2026-05-19
9+
10+
## Context
11+
12+
Complex features in ABCA often span multiple packages, resource types, and concerns. Delivering these as a single large PR creates several problems:
13+
14+
- **Review fatigue:** PRs exceeding ~500 lines suffer from diminished reviewer attention — critical issues get missed in the noise of mechanical changes.
15+
- **Context loss:** Without a framework, sequential PRs leave reviewers without knowledge of where they are in the overall delivery, what came before, or what remains.
16+
- **Agent discoverability:** AI coding agents picking up a sub-task cannot determine the broader goal, prior decisions, or remaining work without reconstructing context from scattered commits and issues.
17+
- **Blocked progress:** A single large PR blocks all progress until the entire feature is reviewed. Stalling on one concern (e.g., IAM review) blocks unrelated work (e.g., documentation).
18+
19+
The [Pragmatic Engineer analysis of stacked diffs](https://newsletter.pragmaticengineer.com/p/stacked-diffs) documents how organizations (Meta, Google, Graphite users) use this pattern to maintain velocity on complex changes while keeping review quality high.
20+
21+
## Decision
22+
23+
Use **stacked pull requests** for features spanning multiple concerns or where review time and blast radius justify decomposition. The numeric thresholds below are guidelines — the primary signal is whether a single PR would exceed a reasonable review session, not file count alone. Each PR in the stack follows these rules:
24+
25+
### 1. Position statement
26+
27+
Every PR description states its position:
28+
29+
```markdown
30+
## Stack position
31+
32+
PR {N} for #{parent-issue} — {overall goal one-liner}
33+
34+
### Prior: {what the previous PR delivered}
35+
### This PR: {what this adds}
36+
### Next (optional): {what comes next, if scope is known}
37+
```
38+
39+
This gives reviewers and agents immediate orientation. The "Next" section is optional — include it when the remaining scope is fixed and known; omit it when scope is still evolving. The parent issue is the source of truth for overall progress.
40+
41+
### 2. Branch targeting
42+
43+
- PR 1 targets `main`
44+
- PR N targets PR N-1's branch
45+
- Final PR merges the full stack to `main`
46+
47+
```
48+
main
49+
└── feat/first-concern (PR 1)
50+
└── feat/second-concern (PR 2)
51+
└── feat/third-concern (PR 3 → merge to main)
52+
```
53+
54+
### 3. Self-contained reviewability
55+
56+
Each PR:
57+
- Compiles and passes tests independently
58+
- Can be deployed without breaking the system (see exception below)
59+
- Has a single clear responsibility (one concern per PR)
60+
- Does not leave dead code, TODOs, or broken intermediate states
61+
62+
**Infrastructure stack exception:** For multi-PR CDK/IAM changes where intermediate slices cannot deploy independently (e.g., a policy referencing a resource added in a later PR), the validation gate is **synth + tests passing** — not a successful deploy. In this case, designate a **deploy-gate PR** in the stack position block: the specific PR where the stack becomes end-to-end deployable. Acceptable intermediate states include feature-flagged resources, no-op stubs, and constructs gated behind context variables.
63+
64+
### 4. Size guidelines
65+
66+
| Metric | Target | Maximum |
67+
|--------|--------|---------|
68+
| Lines changed | 200–400 | 600 |
69+
| Review time | 20–30 min | 45 min |
70+
| Files touched | 3–8 | 12 |
71+
72+
If a PR exceeds these, decompose further.
73+
74+
### 5. Rebase discipline
75+
76+
When a lower PR changes after review feedback:
77+
- All PRs above it in the stack must be rebased
78+
- CI must pass on each PR independently after rebase
79+
- Reviewers are notified of the rebase (GitHub does this automatically)
80+
81+
### 6. Sub-issue linking
82+
83+
- Parent issue lists all sub-issues with a stack visualization diagram
84+
- Each sub-issue references the parent and its position in the stack
85+
- GitHub's task list in the parent tracks completion
86+
- Estimated review time is listed per sub-issue to help reviewers plan
87+
- Sub-issues use `blocked by #NNN` / `blocking #NNN` relationships to express dependency order — agents and reviewers can identify which issues are unblocked and ready for pickup
88+
89+
### 7. When NOT to use stacked PRs
90+
91+
- Changes under ~200 lines that fit naturally in one PR
92+
- Hotfixes that need immediate merge
93+
- Dependency bumps (use Dependabot grouping instead)
94+
- Documentation-only changes that are self-contained
95+
96+
### 8. Merge semantics
97+
98+
The default topology is a **classic stack** — each PR targets its predecessor's branch. When an early PR merges to `main` before later PRs are reviewed:
99+
100+
1. **Retarget** all PRs that pointed at the merged branch to `main` (or to the next unmerged predecessor). Use `gh pr edit <N> --base main` or GitHub's "Retarget" button.
101+
2. **Rebase** each retargeted PR onto its new base so the diff is clean.
102+
3. **CI must pass** on each retargeted PR independently after rebase.
103+
104+
After retargeting, the remaining PRs form a shorter stack rooted on `main`. This is the expected, normal path — not an exception.
105+
106+
**When the stack diverges:** If review feedback on PR 2 invalidates assumptions in PRs 3+, prefer closing and re-opening the affected PRs over accumulating fixup commits that obscure intent. The parent issue remains the source of truth for what shipped and what remains.
107+
108+
## Consequences
109+
110+
- (+) Each PR stays in the "reviewable without fatigue" window (~15–40 min)
111+
- (+) Agents can pick up any sub-issue independently — the position statement provides full context
112+
- (+) Partial delivery is meaningful — each merged PR adds value independently
113+
- (+) Reviewers approve incrementally without needing full-stack mental context
114+
- (+) Early PRs can merge and ship while later ones are still in review
115+
- (-) Rebase cascades when early PRs receive feedback
116+
- (-) More overhead in PR descriptions and branch management
117+
- (-) Requires discipline to keep each PR independently valid (no "this will be fixed in PR N+1")
118+
- (!) If the stack grows beyond ~8 PRs, consider decomposing into independent sub-stacks
119+
120+
## References
121+
122+
- [Stacked Diffs — Pragmatic Engineer](https://newsletter.pragmaticengineer.com/p/stacked-diffs)
123+
- RFC #120 — first formal use of this pattern in ABCA
124+
- Issue #129 — implementation of this ADR

0 commit comments

Comments
 (0)