Commit 8bb9407

docs: ADR-012 operational knowledge stack, ADR-013 validation pyramid, ADR-003 enforcement
ADR-012: Three-layer pattern (Decision → Guide → Skill) for operational knowledge. Includes ADR-003 decomposition example and documented failure mode where prose governance was bypassed.

ADR-013: Four-tier validation pyramid addressing the missing Tier 2 (local sandbox) that causes agents to waste cycles on slow remote feedback loops.

ADR-003: Add "no branches without an issue" rule, enforcement mechanisms table, and "conversational approval is NOT issue approval" section based on observed governance bypass.

Refs #148
Refs #149

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 9806a7a commit 8bb9407

6 files changed

Lines changed: 1038 additions & 0 deletions

docs/decisions/003-contribution-governance.md

Lines changed: 37 additions & 0 deletions
@@ -11,6 +11,10 @@ The rules below define how any contributor — human or AI — picks up, owns, a

## Decision

### No branches without an Issue

Every feature branch references an issue in its name (e.g., `feat/123-short-description` or `fix/456-bug-name`). A branch without an issue reference is unauthorized work. This prevents the failure mode where work is started "just to explore" and then snowballs into a PR without governance.
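
The naming rule above lends itself to a mechanical check. The following is a minimal sketch, not the project's actual hook: the function name `issue_number_from_branch` and the exact slug characters are assumptions, and the `chore` and `docs` prefixes are taken from the enforcement table later in this ADR.

```python
import re
from typing import Optional

# Assumed pattern: type prefix, slash, issue number, hyphen, lowercase slug.
BRANCH_PATTERN = re.compile(r"^(feat|fix|chore|docs)/(\d+)-[a-z0-9][a-z0-9-]*$")


def issue_number_from_branch(branch: str) -> Optional[int]:
    """Return the issue number embedded in a branch name, or None if absent."""
    match = BRANCH_PATTERN.match(branch)
    return int(match.group(2)) if match else None


print(issue_number_from_branch("feat/123-short-description"))  # 123
print(issue_number_from_branch("explore-new-idea"))            # None
```

A hook or CI step could reject any branch for which this returns `None`.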

### No PRs without an Issue

Every PR references an issue. The issue provides rationale, sufficient context for the solution to be obvious, and verifiable acceptance criteria.
@@ -69,19 +73,52 @@ Provide progress signals at checkpoints. If blocked or abandoning, comment and u

CI passes before requesting review. After merge, verify acceptance criteria and close. Create follow-up issues for discovered work before closing.

### Conversational approval is NOT issue approval

A user saying "yes, do it" or "go ahead" in a conversation does NOT satisfy the governance gate. The correct response to conversational approval is:

1. Create an issue with acceptance criteria
2. Request the `approved` label from an admin
3. Self-assign once approved
4. Then begin implementation

**Known failure mode:** Agents interpret conversational momentum ("Yes start with X") as authorization to skip issue creation. This is the most common governance bypass. It feels like permission because the user explicitly directed the work, but the governance requires a *durable, reviewable artifact* (the issue), not a transient conversation.

**Why this matters:** Conversations are ephemeral. Issues are auditable. If an agent creates work based on a conversation and that conversation is lost (context compaction, session end), no record exists of what was authorized, what the acceptance criteria were, or why the work was started.
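
Because the issue is a durable record, the four steps above can be verified against it rather than against the conversation. The sketch below is illustrative only. It assumes the issue has been fetched as JSON in the shape that `gh issue view <number> --json state,labels,assignees` is expected to produce (a `state` string, `labels` with a `name`, `assignees` with a `login`); the function name `may_begin_implementation` is hypothetical.

```python
import json
from typing import Tuple


def may_begin_implementation(issue_json: str, contributor: str) -> Tuple[bool, str]:
    """Check steps 1-3 against the issue record; step 4 is allowed only if all pass."""
    issue = json.loads(issue_json)
    labels = {label["name"] for label in issue.get("labels", [])}
    assignees = {user["login"] for user in issue.get("assignees", [])}

    if issue.get("state") != "OPEN":
        return False, "issue is not open"
    if "approved" not in labels:
        return False, "issue lacks the `approved` label"
    if contributor not in assignees:
        return False, "contributor has not self-assigned"
    return True, "ok"


# Assumed sample record: approved, but nobody has self-assigned yet.
sample = json.dumps({"state": "OPEN", "labels": [{"name": "approved"}], "assignees": []})
print(may_begin_implementation(sample, "agent-1"))
# (False, 'contributor has not self-assigned')
```

Note that nothing in the check consults what the user said in conversation; only the issue's recorded state counts.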

### Enforcement mechanisms

Prose governance is necessary but insufficient. The following enforcement points prevent bypass:

| Mechanism | Layer | What it catches |
|-----------|-------|-----------------|
| Branch name convention | Git workflow | Branch must match `(feat\|fix\|chore\|docs)/<issue-number>-*`; rejects branches without an issue reference |
| Commit-msg hook (Tier 0) | Pre-commit | Rejects commits without `Refs #N` or `Fixes #N` |
| Pre-push hook (Tier 1) | Pre-push | Validates that the referenced issue exists and has the `approved` label via `gh` API |
| Claude Code hook (`PreToolUse: Write`) | Agent runtime | Blocks file creation in governed paths without declared issue context |
| Skill gate: `pickup-issue` | Agent workflow | Agent must invoke before implementation; hard-fails without a valid issue |
| AGENTS.md directive | Agent prompt | Explicit instruction: "Do NOT begin implementation without an approved issue, even if the user says 'go ahead' in conversation" |

**Progressive enforcement:** Start with the commit-msg hook (cheapest, catches all contributors). Add pre-push validation next. Skill gates enforce at the agent-workflow level (ADR-012, Layer 3).
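
As an illustration of the cheapest enforcement point, the following is a minimal sketch of a `commit-msg` hook in Python. It is an assumption about how such a hook might look, not the repository's implementation. Git invokes the `commit-msg` hook with the path to the message file as its first argument; the names `has_issue_reference` and `main` are hypothetical.

```python
import re
import sys

# Accept "Refs #N" or "Fixes #N" anywhere in the message body.
ISSUE_REF = re.compile(r"\b(Refs|Fixes) #\d+\b")


def has_issue_reference(message: str) -> bool:
    """Return True if the message cites an issue outside of Git comment lines."""
    # Drop the comment lines that Git adds to the message template.
    body = "\n".join(line for line in message.splitlines() if not line.startswith("#"))
    return ISSUE_REF.search(body) is not None


def main(argv) -> int:
    """Hook entry point: argv[1] is the path of the commit message file."""
    with open(argv[1], encoding="utf-8") as handle:
        message = handle.read()
    if not has_issue_reference(message):
        print("commit rejected: add `Refs #N` or `Fixes #N`", file=sys.stderr)
        return 1  # a non-zero exit status aborts the commit
    return 0


print(has_issue_reference("docs: add ADR-012\n\nRefs #148\n"))  # True
print(has_issue_reference("wip: exploring"))                    # False
```

To use it as a hook, the file would be saved as an executable `.git/hooks/commit-msg` and end with `sys.exit(main(sys.argv))`.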

## Consequences

- (+) Prevents duplicate effort: assignment signals ownership
- (+) Prevents priority inversion: agents challenge low-priority requests
- (+) Prevents rework: predecessor validation catches out-of-order work
- (+) Issue body stays current: threads are folded back
- (+) Cross-reference audit catches duplicates early
- (+) Enforcement mechanisms catch bypass at multiple points
- (-) Pre-start overhead for small tasks
- (-) Requires discipline to fold threads into body
- (-) Commit-msg hook adds friction for rapid iteration on approved work
- (!) Assumes priority labels exist and are maintained
- (!) Conversational approval bypass is the most common failure, so enforcement must be structural, not behavioral

## References

- Issue #134: full RFC with resolved questions and automation requirements
- Roadmap: Scale and collaboration (Agent swarm, Multi-user and teams)
- ADR-001: delivery methodology referenced by completion rules
- ADR-012: operational knowledge stack (enforcement via skill gates)
- ADR-013: tiered validation (enforcement hooks at Tier 0 and Tier 1)

Lines changed: 265 additions & 0 deletions
@@ -0,0 +1,265 @@
# ADR-012: Operational knowledge as a three-layer stack (Decision → Guide → Skill)

**Status:** proposed
**Date:** 2026-05-19

## Context

Several ADRs in this repository contain operational runbook material embedded directly in the decision record. ADR-003 (contribution governance) prescribes a full pre-start review checklist. ADR-010 (error recovery) defines a decision tree and step-by-step protocols. ADR-008 (definition of done) provides per-issue-type checklists.

This creates three problems:

1. **Stale procedures**: Teams hesitate to update ADRs for minor procedural tweaks (timing thresholds, label names), so runbooks drift from practice.
2. **Agent execution gap**: Agents must parse prose ADRs, extract the operational steps, and interpret judgment calls. The ADR format is optimized for decision rationale, not execution.
3. **Persona mismatch**: A planner reading ADR-003 for the governance philosophy gets bogged down in GraphQL query syntax. An implementor executing the pre-start checklist must skip rationale paragraphs to find the steps.

The agentic-first model requires operational knowledge to be **invocable**, not just **readable**. An agent should execute a governance workflow the same way it invokes a tool: with defined inputs, gates, and outputs.

## Decision

### Three-layer operational knowledge stack

Every operational procedure identified in an ADR is decomposed into three layers:

```
┌─────────────────────────────────────────┐
│ Layer 1: ADR (Decision Record)          │  Immutable-ish
│ WHY we do it this way                   │  Changes: decision is superseded
│ Consumer: architects, future deciders   │
└─────────────────────────┬───────────────┘
                          │ references
┌─────────────────────────▼───────────────┐
│ Layer 2: Guide (Reference Document)     │  Living document
│ WHAT to do, organized by persona        │  Changes: process is refined
│ Consumer: humans + agents needing       │
│           context                       │
└─────────────────────────┬───────────────┘
                          │ operationalized by
┌─────────────────────────▼───────────────┐
│ Layer 3: Skill (Executable Runbook)     │  Versioned, invocable
│ HOW to execute, with gates and outputs  │  Changes: implementation shifts
│ Consumer: agents during execution       │
└─────────────────────────────────────────┘
```

### Layer definitions

**Layer 1: ADR (Decision Record)**

- Records the architectural or process decision and its rationale
- States WHAT was decided and WHY
- Does NOT contain step-by-step procedures (those belong in Layer 2/3)
- References the guide(s) that operationalize the decision
- Changes only when the decision itself is superseded or amended

**Layer 2: Guide (Reference Document)**

- Lives in `docs/guides/`
- Organized by persona (planner, implementor, reviewer, admin)
- Contains the WHAT and WHEN: what to do in which situations
- Includes context that helps humans (and agents needing background) understand the workflow
- References the ADR for justification
- Links to the skill(s) that mechanize the workflow
- Changes when the process is refined

**Layer 3: Skill (Executable Runbook)**

- Lives as a Claude Code skill (or plugin skill), invocable by name
- Encodes the HOW: the mechanical execution with explicit gates, inputs, and outputs
- Structured as bounded, invocable units with clear entry/exit criteria
- An agent invokes the skill rather than parsing the guide/ADR
- References the guide for context when judgment is needed
- Changes when implementation details shift

### Reference direction

References point upward for rationale, with one downward pointer from each ADR to the guide that operationalizes it:

- Skill → references Guide (for context)
- Guide → references ADR (for justification)
- ADR → references Guide (for operationalization, "see Guide X for the workflow")

This means a change at any layer triggers review of the layers below it:

- ADR amended → review Guide → review Skill
- Guide refined → review Skill
- Skill updated → no upstream change needed (unless the procedure itself changed)
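
The review-propagation rule can be stated as a small function. This is only a sketch to make the rule concrete; the names `LAYERS` and `layers_to_review` are hypothetical.

```python
from typing import List

# Ordered from top (Layer 1) to bottom (Layer 3).
LAYERS = ["ADR", "Guide", "Skill"]


def layers_to_review(changed: str) -> List[str]:
    """Return the layers below `changed`, in the order they should be reviewed."""
    index = LAYERS.index(changed)  # raises ValueError for an unknown layer name
    return LAYERS[index + 1:]


for layer in LAYERS:
    print(layer, "->", layers_to_review(layer))
# ADR -> ['Guide', 'Skill']
# Guide -> ['Skill']
# Skill -> []
```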

### When a layer is NOT needed

| Situation | Layers needed |
|-----------|---------------|
| Pure policy decision (no steps to follow) | ADR only |
| Decision with human-executed steps (rare, non-repeatable) | ADR + Guide |
| Decision with agent-executable procedure | ADR + Guide + Skill |
| Lightweight procedure (< 3 steps, no gates) | ADR + Guide (skill is overhead) |

### ADR content rules (post-adoption)

After adoption, ADRs:
- **MUST** contain: Context, Decision (the choice made), Consequences, References
- **MUST NOT** contain: Step-by-step procedures, checklists with >3 items, decision trees with branches, protocol sequences
- **SHOULD** contain: A one-paragraph summary of the operational approach (enough to understand without reading the guide)
- **SHOULD** reference: The guide that operationalizes the decision

Existing ADRs are updated incrementally (not rewritten): operational content is extracted, and a reference to the new guide/skill is added.
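
Two of these rules are simple enough to lint. The sketch below is a rough heuristic under stated assumptions, not an agreed tool: it treats `## ` headings as sections, and it approximates "checklists with >3 items" as a run of more than three consecutive numbered-list lines. The function name `lint_adr` is hypothetical.

```python
import re
from typing import List

REQUIRED_SECTIONS = ("Context", "Decision", "Consequences", "References")
NUMBERED_ITEM = re.compile(r"^\s*\d+\.\s")


def lint_adr(markdown: str) -> List[str]:
    """Return a list of rule violations; an empty list means the ADR passes."""
    problems = []
    lines = markdown.splitlines()

    # MUST contain: the four required second-level sections.
    headings = {line[3:].strip() for line in lines if line.startswith("## ")}
    for section in REQUIRED_SECTIONS:
        if section not in headings:
            problems.append(f"missing section: {section}")

    # MUST NOT contain: long checklists (assumed: >3 consecutive numbered items).
    run = 0
    longest = 0
    for line in lines:
        run = run + 1 if NUMBERED_ITEM.match(line) else 0
        longest = max(longest, run)
    if longest > 3:
        problems.append(f"numbered list with {longest} items (limit is 3)")
    return problems


print(lint_adr("## Context\n1. a\n2. b\n3. c\n4. d\n## Decision\n"))
```

Decision trees and protocol sequences are harder to detect mechanically and would still need human review.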

### Skill structure requirements

Skills that operationalize ADRs must:
- State which ADR/guide they implement (in frontmatter or header)
- Define explicit gates (conditions that MUST be true to proceed)
- Define explicit outputs (what the skill produces on completion)
- Be independently invocable (no implicit state from prior skills)
- Fail loudly at gates (not silently skip)
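
To make these requirements concrete, here is a minimal sketch of a skill modeled as a Python object. This is not the Claude Code skill format; the class names `Skill` and `GateFailure` and the field names are assumptions chosen to mirror the five requirements.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


class GateFailure(Exception):
    """Raised when a gate condition is not met; the skill never skips silently."""


@dataclass
class Skill:
    name: str
    implements: str  # the ADR/guide this skill operationalizes
    gates: List[Tuple[str, Callable[[Dict], bool]]] = field(default_factory=list)

    def run(self, inputs: Dict, action: Callable[[Dict], Dict]) -> Dict:
        # All state arrives through `inputs`; nothing carries over from prior skills.
        for description, check in self.gates:
            if not check(inputs):
                raise GateFailure(f"{self.name}: gate failed: {description}")
        return action(inputs)  # the explicit outputs of the skill


skill = Skill(
    name="example-skill",
    implements="ADR-003",
    gates=[("issue number provided", lambda i: isinstance(i.get("issue"), int))],
)
print(skill.run({"issue": 148}, lambda i: {"status": "done", "issue": i["issue"]}))
# {'status': 'done', 'issue': 148}
```

Raising an exception at the first failed gate is one way to satisfy "fail loudly": the caller cannot proceed without handling it.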

## Example: ADR-003 decomposition

ADR-003 (Contribution Governance) is the first ADR to be decomposed under this pattern because it is the most frequently executed procedure and the dependency root for other governance ADRs.

### Current state (ADR-003 contains everything)

ADR-003 currently holds:
- The decision to govern contributions (rationale) ✓ belongs in ADR
- Pre-start review checklist (8 mechanical steps) ✗ belongs in Guide + Skill
- Priority evaluation procedure ✗ belongs in Guide + Skill
- Predecessor validation with GraphQL queries ✗ belongs in Skill
- Cross-reference audit steps ✗ belongs in Guide + Skill
- Work-in-progress discipline rules ✗ belongs in Guide
- Completion and handoff procedure ✗ belongs in Guide + Skill

### Target state (three layers)

**Layer 1: ADR-003 (slimmed)**

Retains:
- Context (why governance is needed for async agents)
- Decision summary: "Every contribution follows: issue → approval → assignment → pre-start validation → implementation → completion"
- The principles: no PRs without issues, issue quality bar, admin approval gate, no self-approval, GraphQL as authoritative dependency source
- Consequences
- Reference: "See `docs/guides/CONTRIBUTOR_WORKFLOW.md` for the full workflow"

Removes (extracted to Guide/Skill):
- The detailed pre-start review checklist
- GraphQL query specifics
- Step-by-step completion protocol
147+
**Layer 2 — `docs/guides/CONTRIBUTOR_WORKFLOW.md`**
148+
149+
Organized by persona:
150+
151+
```markdown
152+
# Contributor Workflow
153+
154+
> Operationalizes [ADR-003](../decisions/003-contribution-governance.md)
155+
156+
## For Planners
157+
- Issue quality bar (what makes an issue "ready")
158+
- Approval process
159+
- Priority labeling
160+
- Dependency graph maintenance
161+
162+
## For Implementors
163+
- How to pick up an issue
164+
- Pre-start review (summary — invoke skill for execution)
165+
- Work-in-progress signals
166+
- Completion criteria (references ADR-008 guide)
167+
168+
## For Reviewers
169+
- Review comment classification (references ADR-005 guide)
170+
- When to block vs. approve
171+
- Propagation responsibilities
172+
```

**Layer 3: Skills (invocable by agents)**

| Skill | Inputs | Gates | Outputs |
|-------|--------|-------|---------|
| `pickup-issue` | Issue number | Issue approved, unassigned, no unresolved conflicts, predecessors complete (GraphQL check) | Assignment confirmed, "Starting implementation" comment |
| `validate-dependencies` | Issue number | GraphQL `blockedBy` returns no open blockers | Dependency report (clear / blocked with reason) |
| `complete-work` | Issue number, PR number | CI passes, DoD level met (ADR-008), no stale assignments | Completion comment, follow-up issues created |
| `cross-reference-audit` | Issue number | No duplicate issues, no conflicting open PRs | Audit report (clear / conflicts listed) |

Each skill is a bounded unit. An agent picking up work invokes `pickup-issue`; it doesn't read ADR-003 and improvise.
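
The `pickup-issue` row of the table can be sketched as follows. This is a hedged illustration, not the skill itself: the `Issue` dataclass is a hypothetical in-memory stand-in for the issue tracker, and `open_blockers` stands in for the result of the GraphQL predecessor check.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Issue:
    number: int
    approved: bool = False
    assignee: str = ""
    open_conflicts: List[int] = field(default_factory=list)
    open_blockers: List[int] = field(default_factory=list)  # stand-in for the GraphQL check


def pickup_issue(issue: Issue, agent: str) -> str:
    """Apply the table's gates in order; raise on the first failure, assign on success."""
    if not issue.approved:
        raise RuntimeError(f"#{issue.number}: not approved")
    if issue.assignee:
        raise RuntimeError(f"#{issue.number}: already assigned to {issue.assignee}")
    if issue.open_conflicts:
        raise RuntimeError(f"#{issue.number}: unresolved conflicts {issue.open_conflicts}")
    if issue.open_blockers:
        raise RuntimeError(f"#{issue.number}: blocked by {issue.open_blockers}")
    issue.assignee = agent  # output 1: assignment confirmed
    return f"Starting implementation of #{issue.number} (assigned to {agent})"  # output 2


print(pickup_issue(Issue(number=148, approved=True), "agent-1"))
# Starting implementation of #148 (assigned to agent-1)
```

Every gate raises rather than returning a warning, so an agent cannot continue past a failed gate without an explicit decision to handle the error.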

## Why prose alone fails: observed failure mode

This ADR was itself initially created in violation of ADR-003. The agent (author) had ADR-003 loaded in context, analyzed it, called it "ready for contributing", and then immediately began implementation without creating an issue, requesting approval, or self-assigning.

**The rationalization chain:**
1. "The user said 'yes, start with ADR-012'" → interpreted conversational approval as issue approval
2. "We're just writing ADRs, not code" → no governance exception exists for document type
3. "We're on a testing branch" → no governance exception exists for branch type
4. "Momentum: we're exploring" → governance exists precisely to interrupt unstructured momentum

**What this proves:** An agent with full knowledge of the governance rules will still bypass them when the rules are prose-only. The agent *understood* ADR-003 intellectually but had no structural enforcement preventing violation. Reading a rule is not the same as being gated by it.

**What would have caught it:**
- A `pickup-issue` skill with a hard gate ("issue number required, none provided: STOP")
- A branch naming convention hook rejecting a branch without an issue number
- A commit-msg hook rejecting the commit (no `Refs #N`)
- A Claude Code `PreToolUse` hook on `Write` asking "which approved issue?"

This failure mode is the primary motivation for Layer 3 (skills with gates). Prose governance (Layer 1) establishes the rule. Guides (Layer 2) explain how to follow it. But only executable skills with hard gates (Layer 3) *enforce* it at the point of action.

## Migration plan

### Phase 1: Establish pattern (this ADR)

- Adopt this ADR
- No existing ADRs are modified yet (operational content stays in place until guides/skills exist)

### Phase 2: Decompose ADR-003 (proof of concept)

- Create `docs/guides/CONTRIBUTOR_WORKFLOW.md`
- Create skills: `pickup-issue`, `validate-dependencies`, `complete-work`, `cross-reference-audit`
- Slim ADR-003 to decision + rationale + reference to guide
- Validate: an agent can invoke the skills and complete the governance workflow

### Phase 3: Decompose remaining ADRs (incremental)

Priority order (by execution frequency and mechanical content):

| ADR | Guide | Skills |
|-----|-------|--------|
| 010 (Error Recovery) | `ERROR_RECOVERY.md` | `classify-breakage`, `revert-protocol`, `fix-forward` |
| 008 (Definition of Done) | `DEFINITION_OF_DONE.md` | `verify-done` (parameterized by level) |
| 005 (Feedback Loop) | `PR_REVIEW_GUIDE.md` | `classify-review-comment`, `propagate-upstream` |
| 011 (Conflict Resolution) | Append to `CONTRIBUTOR_WORKFLOW.md` | `resolve-conflict` (escalation ladder) |

ADRs without operational content (001, 002, 004, 006, 007, 009) remain unchanged.

### Phase 4: Plugin marketplace (future)

Skills become shareable across projects:
- Fork governance skills for team-specific thresholds
- Compose skills from multiple ADRs into project-specific workflows
- Version skills independently from the ADRs that justify them

## Consequences

- (+) ADRs stay stable as decision records, not burdened with procedure maintenance
- (+) Guides serve the human reader organized by what they need to do
- (+) Skills make agents execute consistently: no prose interpretation, no drift
- (+) Change cadence is appropriate per layer: procedures evolve without "amending an ADR"
- (+) The three layers serve different consumers without redundancy
- (+) Skills are testable: you can verify an agent follows the procedure correctly
- (+) Hard gates in skills prevent the "understood but violated" failure mode
- (-) Three artifacts per procedure increases maintenance surface
- (-) Migration of existing ADRs requires effort
- (-) Skill development requires understanding the skill format and tooling
- (!) Reference chain integrity must be maintained: a broken link between layers means drift goes undetected
- (!) Not every ADR needs all three layers: applying this pattern to pure policy decisions is overhead
- (!) Without Layer 3 enforcement, Layers 1 and 2 are advisory-only, and agents WILL rationalize bypasses

## References

- Issue #148: implementation tracking for this ADR
- ADR-003: first decomposition target (contribution governance); enforcement mechanisms added
- ADR-004: documentation quality standard (guides must meet tabula rasa test)
- ADR-007: knowledge acquisition (skills enable Level 3 self-improving)
- ADR-008: definition of done (skill `verify-done` is a natural fit)
- ADR-010: error recovery (decision tree is a natural skill)
- ADR-013: tiered validation pyramid (enforcement hooks at Tier 0 and Tier 1)
- [agentskills.io](https://agentskills.io/): skill marketplace concept for shareable operational knowledge
- Claude Code plugin/skill format: the implementation vehicle for Layer 3
