| 1 | +--- |
| 2 | +title: 001 stacked pull requests |
| 3 | +--- |
| 4 | + |
| 5 | +# ADR-001: Stacked pull requests for multi-PR features |
| 6 | + |
| 7 | +**Status:** accepted |
| 8 | +**Date:** 2026-05-19 |
| 9 | + |
| 10 | +## Context |
| 11 | + |
| 12 | +Complex features in ABCA often span multiple packages, resource types, and concerns. Delivering these as a single large PR, or as an unstructured series of PRs, creates several problems: |
| 13 | + |
| 14 | +- **Review fatigue:** PRs exceeding ~500 lines suffer from diminished reviewer attention — critical issues get missed in the noise of mechanical changes. |
| 15 | +- **Context loss:** Without a framework, sequential PRs leave reviewers without knowledge of where they are in the overall delivery, what came before, or what remains. |
| 16 | +- **Agent discoverability:** AI coding agents picking up a sub-task cannot determine the broader goal, prior decisions, or remaining work without reconstructing context from scattered commits and issues. |
| 17 | +- **Blocked progress:** A single large PR blocks all progress until the entire feature is reviewed. Stalling on one concern (e.g., IAM review) blocks unrelated work (e.g., documentation). |
| 18 | + |
| 19 | +The [Pragmatic Engineer analysis of stacked diffs](https://newsletter.pragmaticengineer.com/p/stacked-diffs) documents how organizations (Meta, Google, Graphite users) use this pattern to maintain velocity on complex changes while keeping review quality high. |
| 20 | + |
| 21 | +## Decision |
| 22 | + |
| 23 | +Use **stacked pull requests** for features spanning multiple concerns or where review time and blast radius justify decomposition. The numeric thresholds below are guidelines — the primary signal is whether a single PR would exceed a reasonable review session, not file count alone. Each PR in the stack follows these rules: |
| 24 | + |
| 25 | +### 1. Position statement |
| 26 | + |
| 27 | +Every PR description states its position: |
| 28 | + |
| 29 | +```markdown |
| 30 | +## Stack position |
| 31 | + |
| 32 | +PR {N} for #{parent-issue} — {overall goal one-liner} |
| 33 | + |
| 34 | +### Prior: {what the previous PR delivered} |
| 35 | +### This PR: {what this adds} |
| 36 | +### Next (optional): {what comes next, if scope is known} |
| 37 | +``` |
| 38 | + |
| 39 | +This gives reviewers and agents immediate orientation. The "Next" section is optional — include it when the remaining scope is fixed and known; omit it when scope is still evolving. The parent issue is the source of truth for overall progress. |
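
A minimal sketch of how the position block could be generated rather than typed by hand. The helper name `render_stack_position` and the example values (issue number, descriptions) are hypothetical and not part of this ADR. It also assumes the "(optional)" marker in the template is an annotation that is dropped from rendered output.

```python
from typing import Optional

EM_DASH = "\u2014"  # separator character used in the template above


def render_stack_position(n: int, parent_issue: int, goal: str,
                          prior: str, this_pr: str,
                          next_pr: Optional[str] = None) -> str:
    """Render the stack position block for a PR description (hypothetical helper)."""
    lines = [
        "## Stack position",
        "",
        f"PR {n} for #{parent_issue} {EM_DASH} {goal}",
        "",
        f"### Prior: {prior}",
        f"### This PR: {this_pr}",
    ]
    # "Next" is included only when the remaining scope is fixed and known.
    if next_pr is not None:
        lines.append(f"### Next: {next_pr}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Illustrative values only; #999 is not a real issue.
    print(render_stack_position(
        2, 999, "example overall goal",
        prior="data model and migrations",
        this_pr="API handlers built on the new model",
    ))
```

Omitting `next_pr` mirrors the rule above: when scope is still evolving, the block ends at "This PR" and the parent issue carries the rest.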
| 40 | + |
| 41 | +### 2. Branch targeting |
| 42 | + |
| 43 | +- PR 1 targets `main` |
| 44 | +- PR N targets PR N-1's branch |
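| 45 | +- Final PR merges the full stack to `main`, unless earlier PRs have already merged and the remaining PRs were retargeted (see section 8, Merge semantics) |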
| 46 | + |
| 47 | +``` |
| 48 | +main |
| 49 | + └── feat/first-concern (PR 1) |
| 50 | + └── feat/second-concern (PR 2) |
| 51 | + └── feat/third-concern (PR 3 → merge to main) |
| 52 | +``` |
| 53 | + |
| 54 | +### 3. Self-contained reviewability |
| 55 | + |
| 56 | +Each PR: |
| 57 | +- Compiles and passes tests independently |
| 58 | +- Can be deployed without breaking the system (see exception below) |
| 59 | +- Has a single clear responsibility (one concern per PR) |
| 60 | +- Does not leave dead code, TODOs, or broken intermediate states |
| 61 | + |
| 62 | +**Infrastructure stack exception:** For multi-PR CDK/IAM changes where intermediate slices cannot deploy independently (e.g., a policy referencing a resource added in a later PR), the validation gate is **synth + tests passing** — not a successful deploy. In this case, designate a **deploy-gate PR** in the stack position block: the specific PR where the stack becomes end-to-end deployable. Acceptable intermediate states include feature-flagged resources, no-op stubs, and constructs gated behind context variables. |
| 63 | + |
| 64 | +### 4. Size guidelines |
| 65 | + |
| 66 | +| Metric | Target | Maximum | |
| 67 | +|--------|--------|---------| |
| 68 | +| Lines changed | 200–400 | 600 | |
| 69 | +| Review time | 20–30 min | 45 min | |
| 70 | +| Files touched | 3–8 | 12 | |
| 71 | + |
| 72 | +If a PR exceeds any maximum, decompose further. |
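
The table can be read as a simple check. The sketch below encodes the targets and maximums from the table. The function names and the three-way classification ("ok", "above target", "over maximum") are assumptions made for illustration, and values below the target range are treated as "ok".

```python
# Upper bound of each target range and the hard maximum, taken from the table above.
TARGET_UPPER = {"lines_changed": 400, "review_minutes": 30, "files_touched": 8}
MAXIMUM = {"lines_changed": 600, "review_minutes": 45, "files_touched": 12}


def size_report(lines_changed: int, review_minutes: int, files_touched: int) -> dict:
    """Classify each metric against the guideline table (hypothetical helper)."""
    actual = {
        "lines_changed": lines_changed,
        "review_minutes": review_minutes,
        "files_touched": files_touched,
    }
    report = {}
    for metric, value in actual.items():
        if value > MAXIMUM[metric]:
            report[metric] = "over maximum"
        elif value > TARGET_UPPER[metric]:
            report[metric] = "above target"
        else:
            report[metric] = "ok"
    return report


def should_decompose(report: dict) -> bool:
    """A PR is a candidate for further decomposition if any metric is over its maximum."""
    return any(status == "over maximum" for status in report.values())


if __name__ == "__main__":
    report = size_report(lines_changed=700, review_minutes=25, files_touched=6)
    print(report, should_decompose(report))
```

Because the thresholds are guidelines, a result like this is a prompt for a judgment call about review effort, not a hard gate.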
| 73 | + |
| 74 | +### 5. Rebase discipline |
| 75 | + |
| 76 | +When a lower PR changes after review feedback: |
| 77 | +- All PRs above it in the stack must be rebased |
| 78 | +- CI must pass on each PR independently after rebase |
| 79 | +- Reviewers are notified of the rebase (GitHub does this automatically) |
| 80 | + |
| 81 | +### 6. Sub-issue linking |
| 82 | + |
| 83 | +- Parent issue lists all sub-issues with a stack visualization diagram |
| 84 | +- Each sub-issue references the parent and its position in the stack |
| 85 | +- GitHub's task list in the parent tracks completion |
| 86 | +- Estimated review time is listed per sub-issue to help reviewers plan |
| 87 | +- Sub-issues use `blocked by #NNN` / `blocking #NNN` relationships to express dependency order — agents and reviewers can identify which issues are unblocked and ready for pickup |
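
One way to picture "unblocked and ready for pickup": a sub-issue is ready when it is still open and every issue blocking it is closed. The sketch below assumes the dependency data has already been collected into a plain dictionary. The function name and the issue numbers are hypothetical, and it does not call any GitHub API.

```python
def ready_for_pickup(blocked_by: dict, closed: set) -> list:
    """Return open issues whose blockers are all closed (hypothetical helper).

    blocked_by maps an issue number to the set of issue numbers blocking it.
    closed is the set of issue numbers that are already closed.
    """
    return sorted(
        issue
        for issue, blockers in blocked_by.items()
        if issue not in closed and blockers <= closed
    )


if __name__ == "__main__":
    # Illustrative three-issue chain: 202 is blocked by 201, 203 by 202.
    blocked_by = {201: set(), 202: {201}, 203: {202}}
    print(ready_for_pickup(blocked_by, closed=set()))
    print(ready_for_pickup(blocked_by, closed={201}))
```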
| 88 | + |
| 89 | +### 7. When NOT to use stacked PRs |
| 90 | + |
| 91 | +- Changes under ~200 lines that fit naturally in one PR |
| 92 | +- Hotfixes that need immediate merge |
| 93 | +- Dependency bumps (use Dependabot grouping instead) |
| 94 | +- Documentation-only changes that are self-contained |
| 95 | + |
| 96 | +### 8. Merge semantics |
| 97 | + |
| 98 | +The default topology is a **classic stack** — each PR targets its predecessor's branch. When an early PR merges to `main` before later PRs are reviewed: |
| 99 | + |
| 100 | +1. **Retarget** all PRs that pointed at the merged branch to `main` (or to the next unmerged predecessor). Use `gh pr edit <N> --base main`, or click **Edit** next to the PR title on GitHub and choose a new base branch. |
| 101 | +2. **Rebase** each retargeted PR onto its new base so the diff is clean. |
| 102 | +3. **CI must pass** on each retargeted PR independently after rebase. |
| 103 | + |
| 104 | +After retargeting, the remaining PRs form a shorter stack rooted on `main`. This is the expected, normal path — not an exception. |
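
The retargeting step can be sketched as a small planning function: for each unmerged PR whose base branch has merged, walk down the stack to the nearest unmerged base, ending at `main`. The function name, tuple layout, and PR numbers are hypothetical. The branch names come from the diagram in section 2, and the only real command referenced is `gh pr edit <N> --base <branch>`. The sketch assumes every merged branch is the head of a PR in the stack.

```python
def retarget_plan(stack: list, merged: set) -> list:
    """Compute new base branches after some PRs in a stack have merged.

    stack is a bottom-up list of (pr_number, head_branch, base_branch).
    merged is the set of head branches already merged to main.
    Returns (pr_number, new_base) for each PR that needs retargeting.
    """
    base_of = {head: base for _, head, base in stack}
    plan = []
    for number, head, base in stack:
        if head in merged:
            continue  # already landed, nothing to retarget
        new_base = base
        # Skip over merged predecessors until an unmerged base (or main) is found.
        while new_base in merged:
            new_base = base_of[new_base]
        if new_base != base:
            plan.append((number, new_base))
    return plan


if __name__ == "__main__":
    # Illustrative PR numbers; branch names follow the diagram in section 2.
    stack = [
        (101, "feat/first-concern", "main"),
        (102, "feat/second-concern", "feat/first-concern"),
        (103, "feat/third-concern", "feat/second-concern"),
    ]
    for number, new_base in retarget_plan(stack, merged={"feat/first-concern"}):
        print(f"gh pr edit {number} --base {new_base}")
```

The plan covers only step 1. Each retargeted PR still needs the rebase and the independent CI run described in steps 2 and 3.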
| 105 | + |
| 106 | +**When the stack diverges:** If review feedback on PR 2 invalidates assumptions in PRs 3+, prefer closing the affected PRs and opening replacements over accumulating fixup commits that obscure intent. The parent issue remains the source of truth for what shipped and what remains. |
| 107 | + |
| 108 | +## Consequences |
| 109 | + |
| 110 | +- (+) Each PR stays in the "reviewable without fatigue" window (target 20–30 min, at most 45 min, per the size guidelines) |
| 111 | +- (+) Agents can pick up any unblocked sub-issue independently; the position statement and the parent issue provide the context they need |
| 112 | +- (+) Partial delivery is meaningful — each merged PR adds value independently |
| 113 | +- (+) Reviewers approve incrementally without needing full-stack mental context |
| 114 | +- (+) Early PRs can merge and ship while later ones are still in review |
| 115 | +- (-) Rebase cascades when early PRs receive feedback |
| 116 | +- (-) More overhead in PR descriptions and branch management |
| 117 | +- (-) Requires discipline to keep each PR independently valid (no "this will be fixed in PR N+1") |
| 118 | +- (!) If the stack grows beyond ~8 PRs, consider decomposing into independent sub-stacks |
| 119 | + |
| 120 | +## References |
| 121 | + |
| 122 | +- [Stacked Diffs — Pragmatic Engineer](https://newsletter.pragmaticengineer.com/p/stacked-diffs) |
| 123 | +- RFC #120 — first formal use of this pattern in ABCA |
| 124 | +- Issue #129 — implementation of this ADR |