`docs/templates/AGENTS.md` (22 additions, 3 deletions)
@@ -173,10 +173,20 @@ Local `AGENTS.md` files may tighten these values, but they must not loosen them

- task goal and scope
- a detailed implementation plan with ordered steps
- constraints and risks
- explicit test steps as part of the ordered plan, not as a later add-on
- the test and verification strategy for each planned step
- the testing methodology for the task: what flows will be tested, how they will be tested, and what quality bar the tests must meet
- an explicit full-test baseline step after the plan is prepared
- a tracked list of already failing tests, with one checklist item per failing test
- root-cause notes and an intended fix path for each failing test that must be addressed
- a checklist with explicit done criteria for each step
- ordered final validation skills and commands, with a reason for each
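A minimal `<slug>.plan.md` skeleton covering these elements might look like the following; the task, step names, and test names are hypothetical, not a mandated format:

```markdown
## Goal and scope
Fix the login session timeout (hypothetical task).

## Ordered plan (tests are steps, not a later add-on)
1. [ ] Write failing regression test `test_session_expires` (verify: it fails for the right reason)
2. [ ] Fix the timeout handling (verify: `test_session_expires` passes)
3. [ ] Run the full relevant suite (verify: baseline is green)

## Failing-test baseline
- [ ] `test_refresh_token`: symptom 401 on refresh, suspected cause clock skew, status open

## Final validation
1. format (reason: keep the diff reviewable)
2. full test suite (reason: prove all checklist items are done)
```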
- Use the Ralph Loop for every non-trivial task:
  - plan in detail in `<slug>.plan.md` before coding or document edits
  - include test creation, test updates, and verification work in the ordered steps from the start
  - once the initial plan is ready, run the full relevant test suite to establish the real baseline
  - if tests are already failing, add each failing test back into `<slug>.plan.md` as a tracked item with its failure symptom, suspected cause, and fix status
  - work through failing tests one by one: reproduce, find the root cause, apply the fix, rerun, and update the plan file
  - include ordered final validation skills in the plan file, with a reason for each skill
  - require each selected skill to produce a concrete action, artifact, or verification outcome
  - execute one planned step at a time
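The one-by-one failing-test loop above can be sketched in Python; the `reproduce` and `fix` hooks stand in for real repo tooling and are assumptions, not an actual API:

```python
def triage_failing_tests(failing, reproduce, fix):
    """Work through failing tests one by one, tracking symptom, cause, and status.

    reproduce(test) returns a failure symptom string, or None when the test passes.
    fix(test, symptom) applies a fix and returns a root-cause note.
    """
    tracked = []
    for test in failing:
        symptom = reproduce(test)        # 1. reproduce to confirm the failure
        cause = fix(test, symptom)       # 2. find the root cause and apply the fix
        rerun = reproduce(test)          # 3. rerun the test after the fix
        tracked.append({
            "test": test,
            "symptom": symptom,
            "cause": cause,
            "status": "fixed" if rerun is None else "open",
        })
    return tracked                       # 4. mirror each entry into <slug>.plan.md
```

Each returned entry corresponds to one tracked checklist item in the plan file.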
@@ -190,6 +200,7 @@ Local `AGENTS.md` files may tighten these values, but they must not loosen them

- broader required regressions
- If `build` is separate from `test`, run `build` before `test`.
- After tests pass, run `format`, then the final required verification commands.
- The task is complete only when every planned checklist item is done and all relevant tests are green.
- Summarize the change, risks, and verification before marking the task complete.
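Assuming a repo that exposes `build`, `test`, `format`, and final `verify` stages, the required ordering can be sketched as follows; the stage names are illustrative, not a fixed contract:

```python
def run_quality_pass(stages):
    """Run repo-defined stages in the required order, stopping at the first failure.

    stages maps a stage name to a zero-argument callable returning True on success.
    """
    executed = []
    # build runs before test when the two are separate stages;
    # format and final verification run only after tests pass
    for name in ("build", "test", "format", "verify"):
        stage = stages.get(name)
        if stage is None:
            continue  # some repos fold stages together
        executed.append(name)
        if not stage():
            return executed, False  # stop: later gates assume earlier ones passed
    return executed, True
```

A failing `test` stage therefore skips `format` and `verify` entirely.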
### Documentation
@@ -204,6 +215,11 @@ Local `AGENTS.md` files may tighten these values, but they must not loosen them

- Public bootstrap templates are limited to root-level agent files. Authoring scaffolds for architecture, features, ADRs, and other workflows live in skills.
- Update feature docs when behaviour changes.
- Update ADRs when architecture, boundaries, or standards change.
- For non-trivial work, the plan file, feature doc, or ADR MUST document the testing methodology:
  - what flows are covered
  - how they are tested
  - which commands prove them
  - what quality and coverage requirements must hold
- Every feature doc under `docs/Features/` MUST contain at least one Mermaid diagram for the main behaviour or flow.
- Every ADR under `docs/ADR/` MUST contain at least one Mermaid diagram for the decision, boundaries, or interactions.
- Mermaid diagrams are mandatory in architecture docs, feature docs, and ADRs.
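A feature doc's main-flow diagram can be as small as this sketch; the login flow shown is a hypothetical example, not part of the template:

```mermaid
sequenceDiagram
    participant User
    participant API
    User->>API: POST /login
    API-->>User: 200 with session token
    API-->>User: 401 on bad credentials
```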
@@ -213,16 +229,19 @@ Local `AGENTS.md` files may tighten these values, but they must not loosen them

- TDD is the default for new behaviour and bug fixes: write the failing test first, make it pass, then refactor.
- Bug fixes start with a failing regression test that reproduces the issue.
- Every behaviour change needs new or updated automated tests with meaningful assertions. New tests are mandatory for new behaviour and bug fixes.
- Tests must prove the real user flow or caller-visible system flow, not only internal implementation details.
- Tests should be as realistic as possible and exercise the system through real flows, contracts, and dependencies.
- Tests must cover positive flows, negative flows, edge cases, and unexpected paths from multiple relevant angles when the behaviour can fail in different ways.
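A minimal illustration of covering both the happy path and the most important failure paths; `parse_port` is a hypothetical function invented for this sketch:

```python
def parse_port(value):
    """Parse a TCP port from a string; raise ValueError outside 1-65535."""
    port = int(value)                 # raises ValueError for non-numeric input
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port

# positive flow: a valid port round-trips
assert parse_port("8080") == 8080

# negative flows and edge cases: out-of-range and non-numeric inputs are rejected
for bad in ("0", "70000", "http"):
    try:
        parse_port(bad)
    except ValueError:
        pass
    else:
        raise AssertionError(f"{bad!r} should have been rejected")
```

Note that every branch of the behaviour gets an explicit assertion, not just the success case.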
- Prefer integration/API/UI tests over isolated unit tests when behaviour crosses boundaries.
- Do not use mocks, fakes, stubs, or service doubles in verification.
- Exercise internal and external dependencies through real containers, test instances, or sandbox environments that match the real contract.
- Flaky tests are failures. Fix the cause.
- Changed production code MUST reach at least 80% line coverage, and at least 70% branch coverage where branch coverage is available.
- Critical flows and public contracts MUST reach at least 90% line coverage with explicit success and failure assertions.
- Repository or module coverage must not decrease without an explicit written exception. Coverage after the change must stay at least at the previous baseline or improve.
- Coverage is for finding gaps, not gaming a number. Coverage numbers do not replace scenario coverage or user-flow verification.
- The task is not done until the full relevant test suite is green, not only the newly added tests.
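As a sketch, a gate enforcing the coverage rules above (80% line and 70% branch on changed code, no regression below the stored baseline) could look like this; the report shape and baseline mapping are assumptions, not a real tool's format:

```python
def coverage_gate(changed, baseline, line_floor=80.0, branch_floor=70.0):
    """Check the coverage rules for changed code; return a list of violations.

    changed:  module name -> {"line": percent, "branch": percent or None}
    baseline: module name -> previous line-coverage percent
    """
    problems = []
    for module, cov in changed.items():
        if cov["line"] < line_floor:
            problems.append(f"{module}: line coverage {cov['line']}% is below {line_floor}%")
        # branch coverage is enforced only where the tool reports it
        branch = cov.get("branch")
        if branch is not None and branch < branch_floor:
            problems.append(f"{module}: branch coverage {branch}% is below {branch_floor}%")
        # coverage must stay at or above the previous baseline
        if module in baseline and cov["line"] < baseline[module]:
            problems.append(f"{module}: regressed below baseline {baseline[module]}%")
    return problems
```

An empty result means the change passes the gate; each entry otherwise names one violated rule.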
- If the stack is `.NET`, document the active framework and runner model explicitly so agents do not mix VSTest and Microsoft.Testing.Platform assumptions.
- If the stack is `.NET`, after changing production code run the repo-defined quality pass: format, build, analyze, focused tests, broader tests, coverage, and any configured extra gates such as architecture, security, or mutation checks.