Capabilities as code for AI-native software teams.
4
4
5
-
## Why CapabilityKit?
5
+
CapabilityKit helps developers review what changed in product behavior, not only what changed in code. It keeps capability intent, acceptance criteria, implementation references, dependency relationships, and verification evidence in a repo-native `.capabilities/` folder so a PR can answer three questions quickly:
6
6
7
-
AI agents can write more code faster, but teams still need a reliable way to describe what the system is supposed to do and how to verify it.
7
+
1. Which capabilities changed?
8
+
2. How deeply are those capabilities verified against implementation?
9
+
3. What other capabilities may be affected by this change?
8
10
9
-
CapabilityKit adds a `.capabilities/` folder to your repo so product intent, acceptance criteria, human guidance, implementation review notes, and verification checks live beside the code.
11
+
## Why CapabilityKit?
10
12
11
-
The practical goal is to reduce the human bottleneck in review. Humans should not have to rediscover intent or manually invent every regression check after each AI-assisted change. Capability specs make the expected behavior and required verification visible before code changes start.
13
+
AI agents can produce a lot of implementation quickly. The harder engineering problem is preserving the reason the code was written and proving that the resulting system still delivers the intended capability.
12
14
13
-
## Install
15
+
Planning documents help make code decisions, but they often diverge from implementation. After an AI agent finishes coding, the plan may no longer explain what behavior exists, which files implement it, what checks prove it works, or which downstream behavior depends on it.
14
16
15
-
This repository is currently set up as a workspace project:
17
+
CapabilityKit makes that review surface explicit. A capability file is not a one-time plan. It is a living contract between product intent, code, tests, manual review, and future agent work.
16
18
17
-
```bash
18
-
npm install
19
-
npm run build
20
-
```
21
-
22
-
The package is designed for pnpm workspaces and the CLI package is named `@capabilitykit/cli`.
19
+
## The Developer Review Loop
23
20
24
-
## Quick Start
21
+
Use CapabilityKit during review when a change is more meaningful than a raw code diff can explain:
25
22
26
23
```bash
27
24
npm run build
28
-
npm run capabilitykit -- validate
29
-
npm run capabilitykit -- compile
25
+
npm run capabilitykit -- status
26
+
npm run capabilitykit -- diff HEAD
27
+
npm run capabilitykit -- assess core/assessment/assess-implementation-coverage
28
+
npm run capabilitykit -- impact core/graph/compile-capabilities
30
29
```
31
30
32
-
In another repository, the CLI will eventually be used as:
31
+
`status` gives a project-wide health view. It separates capabilities into `ok`, `needs-review`, `needs-action`, and `planned` so reviewers know where confidence is thin.
33
32
34
-
```bash
35
-
npx @capabilitykit/cli init
36
-
capabilitykit create "User login" --area account
37
-
capabilitykit skill
38
-
capabilitykit validate
39
-
capabilitykit compile
40
-
```
41
-
42
-
## What Is A Capability?
33
+
`diff` compares capability intent against a Git base. Instead of asking reviewers to infer product meaning from YAML or code, it summarizes added, changed, and removed capabilities, highlights changes to intent, acceptance, verification, implementation references, and ignore policy, and includes downstream impact context.
43
34
44
-
A capability is a repo-native description of something the system should do. The default format keeps human-authored intent and guidance at the root of the file and puts implementation details, dependencies, and verification that agents can infer or maintain under `agent`.
35
+
`assess` reads the implementation references declared by a capability and places each acceptance criterion beside concrete source, test, or documentation evidence. It marks criteria as `covered`, `uncovered`, or `uncertain`; uncertainty is intentional because deterministic text evidence can identify review targets but cannot prove semantic correctness by itself.
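As a rough illustration of why `uncertain` is a useful third state, the sketch below classifies a criterion by plain term matching against referenced file text. Everything in it is an assumption made for illustration: the `Evidence` shape, the `classifyCriterion` name, and the matching rule are hypothetical and are not how `assess` is implemented.

```typescript
type CoverageStatus = "covered" | "uncovered" | "uncertain";

// Hypothetical evidence record: a referenced file and its text content.
interface Evidence {
  path: string;
  text: string;
}

// Split a criterion into lowercase terms, dropping words of three letters or fewer.
function significantTerms(criterion: string): string[] {
  return criterion
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((term) => term.length > 3);
}

// Illustrative rule only: textual matches point at review targets,
// they do not prove that the behavior is semantically correct.
function classifyCriterion(criterion: string, evidence: Evidence[]): CoverageStatus {
  const terms = significantTerms(criterion);
  if (terms.length === 0) return "uncertain"; // nothing specific enough to match
  const corpus = evidence.map((e) => e.text.toLowerCase()).join("\n");
  const matched = terms.filter((term) => corpus.indexOf(term) !== -1);
  if (matched.length === 0) return "uncovered";
  return matched.length === terms.length ? "covered" : "uncertain";
}
```

Partial textual overlap lands in `uncertain`, which marks a review target rather than a verdict.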
45
36
46
-
Capability IDs should mirror the file path when a project has enough capabilities to benefit from hierarchy. For example, `.capabilities/core/validation/validate-capability-files.capability.yaml` should use `id: core/validation/validate-capability-files`.
47
-
48
-
Use folders to show ownership and maintenance boundaries:
49
-
50
-
- `core/model` for schema and format capabilities.
51
-
- `core/validation` for checks that protect capability quality.
52
-
- `core/graph` for compile-time graph and impact analysis.
53
-
- `core/agents` for agent handoff and review workflows.
54
-
- `developer-experience/*` for CLI, examples, skills, and integrations.
55
-
- `docs/*` for user-facing and reference documentation.
37
+
`impact` traverses explicit `agent.depends_on` relationships to show direct and transitive dependents. A small edit to a foundational capability can affect agent handoff, diff reporting, CLI behavior, and verification commands; the graph makes that visible before review narrows too early.
56
38
57
-
Capability dependencies still belong in `agent.depends_on`. Folder hierarchy makes the map easier to scan, but explicit dependencies are the source of truth for impact analysis.
39
+
## What A Capability Captures
58
40
59
-
## Example Capability File
41
+
A capability is a repo-native description of something the system should do and how that claim is checked.
60
42
61
43
```yaml
44
+
id: account/user-login
62
45
title: User login
63
46
status: implemented
64
47
area: account
@@ -71,17 +54,96 @@ acceptance:
71
54
guidance:
72
55
- Keep credential errors clear without exposing sensitive details.
73
56
agent:
74
-
verification:
75
-
manual:
76
-
- Review login behavior against the acceptance criteria.
57
+
depends_on:
58
+
- account/session-management
77
59
implementation:
78
60
references:
79
61
- src/auth/login.ts
80
62
- src/auth/session.ts
81
-
review:
82
-
depth: partial
63
+
- tests/auth/login.test.ts
64
+
verification:
65
+
automated:
66
+
- id: login-tests
67
+
description: Covers valid and invalid credential flows.
68
+
command: npm test -- tests/auth/login.test.ts
69
+
manual:
70
+
- Review login copy and lockout behavior against the acceptance criteria.
83
71
gaps:
84
-
- Add automated tests for invalid credentials.
72
+
- Add rate-limit tests before marking this verified.
73
+
```
74
+
75
+
The root fields are human-authored intent. The `agent` section contains the implementation references, dependencies, verification checks, review evidence, and accepted gaps that developers and AI agents use during follow-up work.
76
+
77
+
## Reviewing Capability Diffs
78
+
79
+
Code diffs show how files changed. Capability diffs show how declared behavior changed.
80
+
81
+
CapabilityKit reports:
82
+
83
+
- Added, changed, and removed capabilities by ID.
84
+
- Intent, summary, status, and acceptance changes.
85
+
- Implementation reference changes.
86
+
- Automated and manual verification changes.
87
+
- Verification gaps and ignore policy changes.
88
+
- Direct and transitive downstream impact.
89
+
90
+
Review evidence churn is excluded from the default diff because saved review output can be large and stale. Use `--include-review` when review evidence itself is the subject of the change.
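The added, changed, and removed grouping described in this section can be sketched as a comparison of two capability snapshots keyed by ID. This is a minimal sketch under stated assumptions: `CapabilitySnapshot`, `diffCapabilities`, and the reduced field set are hypothetical names chosen for illustration, not the CapabilityKit schema or API.

```typescript
// Hypothetical, simplified capability shape for illustration only.
interface CapabilitySnapshot {
  id: string;
  intent: string;
  acceptance: string[];
}

interface CapabilityDiff {
  added: string[];
  changed: string[];
  removed: string[];
}

// Index snapshots by capability ID. A null-prototype object avoids
// accidental hits on inherited keys such as "constructor".
function indexById(caps: CapabilitySnapshot[]): Record<string, CapabilitySnapshot> {
  const index: Record<string, CapabilitySnapshot> = Object.create(null);
  for (const cap of caps) {
    index[cap.id] = cap;
  }
  return index;
}

// Two snapshots count as equal when intent and acceptance criteria match.
function sameCapability(a: CapabilitySnapshot, b: CapabilitySnapshot): boolean {
  return (
    a.intent === b.intent &&
    a.acceptance.length === b.acceptance.length &&
    a.acceptance.every((criterion, i) => criterion === b.acceptance[i])
  );
}

// Classify capability IDs as added, changed, or removed between base and head.
function diffCapabilities(
  base: CapabilitySnapshot[],
  head: CapabilitySnapshot[],
): CapabilityDiff {
  const baseById = indexById(base);
  const headById = indexById(head);
  const diff: CapabilityDiff = { added: [], changed: [], removed: [] };

  for (const cap of head) {
    const before = baseById[cap.id];
    if (before === undefined) {
      diff.added.push(cap.id);
    } else if (!sameCapability(before, cap)) {
      diff.changed.push(cap.id);
    }
  }
  for (const cap of base) {
    if (headById[cap.id] === undefined) {
      diff.removed.push(cap.id);
    }
  }
  diff.added.sort();
  diff.changed.sort();
  diff.removed.sort();
  return diff;
}
```

Keying the comparison on capability ID rather than file path keeps the report focused on declared behavior, even when the sketch's inputs come from differently organized files.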
91
+
92
+
## Assessing Verification Depth
93
+
94
+
CapabilityKit treats verification as part of the capability, not a separate checklist that gets reconstructed during PR review.
95
+
96
+
Verification depth comes from several signals:
97
+
98
+
- Acceptance criteria that are specific enough to inspect.
99
+
- Implementation references that point to real files.
100
+
- Automated checks with commands reviewers can run.
101
+
- Manual review steps for behavior that cannot be proven by tests alone.
102
+
- Saved `agent.review` evidence when a human or external agent has reviewed semantic coverage.
103
+
- Declared gaps and ignored findings with explicit reasons.
104
+
105
+
Missing confidence is visible by design. `validate`, `status`, `assess`, `advise`, `review-noisy`, `agent-review`, `review-result`, and `sync-review` all exist to help teams grow capabilities from planned intent toward properly verified behavior without pretending that filename matches or generated prose are proof.
106
+
107
+
## Understanding Impact
108
+
109
+
Capability folders help people navigate ownership, but explicit dependencies are the source of truth for impact analysis.
110
+
111
+
Use `agent.depends_on` when one capability relies on another:
112
+
113
+
```yaml
114
+
agent:
115
+
  depends_on:
116
+
    - core/model/define-capability-format
117
+
    - core/validation/validate-capability-files
118
+
```
119
+
120
+
Then run:
121
+
122
+
```bash
123
+
npm run capabilitykit -- impact core/graph/compile-capabilities
124
+
```
125
+
126
+
The report includes dependencies, direct dependents, transitive dependents, impacted capabilities, suggested automated checks, manual review steps, and known verification gaps. This is useful when a simple-looking change affects shared schema, compiled output, agent prompts, CLI behavior, or docs.
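The direct and transitive dependents in such a report can be found by walking `agent.depends_on` edges in reverse. The following is a minimal sketch, assuming a hypothetical in-memory `CapabilityNode` list and a hypothetical `findDependents` helper; it is not the CapabilityKit graph implementation, and the sample edges in the usage note are invented for illustration.

```typescript
// Hypothetical node shape: dependsOn mirrors the agent.depends_on list.
interface CapabilityNode {
  id: string;
  dependsOn: string[];
}

interface ImpactReport {
  direct: string[];
  transitive: string[];
}

// Breadth-first walk over reversed depends_on edges, one level at a time.
// Level 1 holds direct dependents; deeper levels hold transitive dependents.
function findDependents(capabilities: CapabilityNode[], target: string): ImpactReport {
  const seen: Record<string, boolean> = Object.create(null);
  seen[target] = true; // guards against cycles that lead back to the target
  const direct: string[] = [];
  const transitive: string[] = [];
  let frontier: string[] = [target];
  let depth = 0;

  while (frontier.length > 0) {
    depth += 1;
    const next: string[] = [];
    for (const cap of capabilities) {
      if (seen[cap.id]) continue;
      const reliesOnFrontier = cap.dependsOn.some((dep) => frontier.indexOf(dep) !== -1);
      if (!reliesOnFrontier) continue;
      seen[cap.id] = true;
      next.push(cap.id);
      (depth === 1 ? direct : transitive).push(cap.id);
    }
    frontier = next;
  }
  return { direct: direct.sort(), transitive: transitive.sort() };
}
```

For example, if `core/graph/compile-capabilities` depended on `core/model/define-capability-format`, and a third capability depended on `core/graph/compile-capabilities`, then asking for the dependents of the format capability would list the compile capability as direct and the third as transitive.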
127
+
128
+
## Install
129
+
130
+
This repository is currently set up as a workspace project:
131
+
132
+
```bash
133
+
npm install
134
+
npm run build
135
+
```
136
+
137
+
The package is designed for pnpm workspaces and the CLI package is named `@capabilitykit/cli`.
138
+
139
+
In another repository, the CLI will eventually be used as:
140
+
141
+
```bash
142
+
npx @capabilitykit/cli init
143
+
capabilitykit create "User login" --area account
144
+
capabilitykit skill
145
+
capabilitykit validate
146
+
capabilitykit compile
85
147
```
86
148
87
149
## CLI Commands
@@ -90,25 +152,34 @@ agent:
90
152
- `capabilitykit create <name> --area <area>` creates a capability file.
91
153
- `capabilitykit skill` creates or updates CapabilityKit skill files and agent entrypoints.
92
154
- `capabilitykit status [capability-id]` shows a developer-friendly capability health summary.
155
+
- `capabilitykit diff [base]` compares capability changes against a Git base ref.
156
+
- `capabilitykit assess <capability-id>` compares acceptance criteria with referenced implementation evidence.
157
+
- `capabilitykit advise [capability-id]` groups assessment findings into recommended next actions.
158
+
- `capabilitykit impact <capability-id>` reports direct and transitive downstream capabilities plus suggested verification.
93
159
- `capabilitykit validate` validates capability files and reports verification gaps.
94
160
- `capabilitykit compile` writes normalized JSON to `.capabilities/dist/capabilities.json`.
95
161
- `capabilitykit inspect <capability-id>` prints one capability and its relationships.
96
-
- `capabilitykit impact <capability-id>` reports direct and transitive downstream capabilities plus suggested verification.
97
-
- `capabilitykit diff [capability-id]` compares capability changes against a Git base ref.
98
-
- `capabilitykit assess <capability-id>` compares acceptance criteria with referenced implementation evidence.
99
-
- `capabilitykit advise [capability-id]` groups assessment findings into recommended next actions.
100
162
- `capabilitykit review-noisy --limit 5` lists high-value capabilities for semantic Codex or human review.
163
+
- `capabilitykit agent-task <capability-id>` creates an inspectable implementation or review prompt bundle.
164
+
- `capabilitykit agent-review <capability-id>` combines a review bundle with deterministic coverage evidence.
165
+
- `capabilitykit review-result <capability-id>` validates or saves structured review JSON under `agent.review`.
101
166
- `capabilitykit sync-review [capability-id]` updates `agent.review` from current implementation evidence without changing capability status.
102
167
103
-
`status` is the best first command when you want to understand what the
104
-
capability map says about the project:
168
+
## Organizing Capabilities
105
169
106
-
```bash
107
-
capabilitykit status
108
-
capabilitykit status core/graph/compile-capabilities
109
-
capabilitykit diff --base HEAD
110
-
capabilitykit diff --base HEAD --verbose
111
-
```
170
+
Capability IDs should mirror the file path when a project has enough capabilities to benefit from hierarchy. For example, `.capabilities/core/validation/validate-capability-files.capability.yaml` should use `id: core/validation/validate-capability-files`.
171
+
172
+
Use folders to show ownership and maintenance boundaries:
173
+
174
+
- `core/model` for schema and format capabilities.
175
+
- `core/validation` for checks that protect capability quality.
176
+
- `core/graph` for compile-time graph, diff, and impact analysis.
177
+
- `core/assessment` for implementation coverage and review depth.
178
+
- `core/agents` for agent handoff and review workflows.
179
+
- `developer-experience/*` for CLI, examples, skills, and integrations.
180
+
- `docs/*` for user-facing and reference documentation.
181
+
182
+
Capability dependencies still belong in `agent.depends_on`. Folder hierarchy makes the map easier to scan, but explicit dependencies power impact analysis.
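The path-to-ID convention from this section can be checked mechanically. Below is a minimal sketch, assuming repo-relative paths under `.capabilities/` with a `.capability.yaml` suffix as in the example ID; `expectedIdFromPath` and `idMatchesPath` are hypothetical helper names, not part of the CapabilityKit CLI.

```typescript
const CAPABILITY_ROOT = ".capabilities/";
const CAPABILITY_SUFFIX = ".capability.yaml";

// Derive the expected capability ID from a repo-relative file path.
// Returns null when the path is not a capability file under .capabilities/.
function expectedIdFromPath(filePath: string): string | null {
  const normalized = filePath.split("\\").join("/"); // tolerate Windows separators
  const suffixStart = normalized.length - CAPABILITY_SUFFIX.length;
  const hasRoot = normalized.indexOf(CAPABILITY_ROOT) === 0;
  const hasSuffix =
    suffixStart >= 0 && normalized.lastIndexOf(CAPABILITY_SUFFIX) === suffixStart;
  if (!hasRoot || !hasSuffix) return null;
  const id = normalized.slice(CAPABILITY_ROOT.length, suffixStart);
  return id.length > 0 ? id : null;
}

// True when a declared ID mirrors the file path it lives at.
function idMatchesPath(id: string, filePath: string): boolean {
  return expectedIdFromPath(filePath) === id;
}
```

A check like this only confirms naming. Dependencies still come from `agent.depends_on`, not from the folder a file sits in.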
112
183
113
184
## Verification Gaps
114
185
@@ -129,12 +200,7 @@ agent:
129
200
reason: Tracked outside CapabilityKit for this release.
130
201
```
131
202
132
-
Use `code: "*"` only when every verification gap for that capability is intentionally handled elsewhere.
133
-
134
-
Advisory assessment findings can also be ignored when a maintainer accepts the
135
-
deterministic assessor's limitation for a specific criterion. Ignored findings
136
-
are removed from recommended actions and `review-noisy` scoring, but remain
137
-
auditable in the capability file:
203
+
Advisory assessment findings can also be ignored when a maintainer accepts the deterministic assessor's limitation for a specific criterion:
138
204
139
205
```yaml
140
206
agent:
@@ -145,30 +211,21 @@ agent:
145
211
reason: Documentation wording was manually reviewed and accepted.
146
212
```
147
213
148
-
Use `criterion_contains` for a small family of related findings, and `status: "*"` only for intentionally accepted findings across statuses.
214
+
Ignored findings are removed from recommended actions and `review-noisy` scoring, but remain auditable in the capability file.
149
215
150
216
## Dogfooding
151
217
152
-
CapabilityKit uses its own `.capabilities/` folder from the first usable version. Each MVP feature has a matching capability spec, and the project verification loop validates and compiles those specs.
153
-
154
-
## Roadmap
155
-
156
-
- Bootstrap the TypeScript CLI and core library.
157
-
- Strengthen validation and verification gap detection.
158
-
- Add richer examples and documentation.
159
-
- Prepare for editor integrations without making the MVP dependent on them.
160
-
161
-
## Contributing
218
+
CapabilityKit uses its own `.capabilities/` folder. Current capabilities cover the schema, validation, implementation reference checks, compiled graph output, capability diffing, impact analysis, implementation coverage assessment, external agent handoff, CLI workflow, skill installation, examples, and documentation.
162
219
163
-
Keep changes close to the code, specs, and tests they affect. When behavior changes, update the relevant capability spec and run the local verification loop:
220
+
The project verification loop validates and compiles those specs:
164
221
165
222
```bash
166
223
npm run verify
167
224
```
168
225
169
-
## Website (capabilitykit.com)
226
+
## Website
170
227
171
-
A simple static marketing site is available in `website/` and is ready for Amazon S3 static hosting.
228
+
A static site is available in `website/` and is ready for Amazon S3 static hosting.