115 changes: 58 additions & 57 deletions docs/spec-driven-workflow.adoc
@@ -10,12 +10,12 @@ This document describes how to build production-quality software with AI agents,

Semantic Anchors are compact terms that reliably activate rich knowledge domains in LLMs.
Instead of writing pages of instructions, you reference a concept the model already understands deeply.
"Use TDD" is shorter than explaining test-driven development.
"Follow arc42" is shorter than describing 12 architecture sections.
"Use link:#/anchor/tdd-london-school[TDD, London School]" is shorter than explaining test-driven development with mocks and outside-in design.
"Follow link:#/anchor/arc42[arc42]" is shorter than describing 12 architecture sections.
The prompts stay short, precise, and maintainable.

This workflow was used to build three open source projects, all 100% AI-generated.
The golden rule: I only prompt, I never touch the code myself.
*The golden rule: I only prompt, I never touch the code myself.*
Every line of code, every test, every documentation file was written by the AI under my guidance.

* https://github.com/docToolchain/dacli[*dacli*]: A full CLI tool with spec, architecture docs, tests, and user manual. Built by one AI, cross-reviewed by 5 different LLMs.
@@ -41,20 +41,40 @@ That is like asking a junior developer to build an entire application from a one
The result will be unpredictable at best.

This workflow takes the opposite approach: break the work into small, well-defined steps and let the AI handle each one autonomously.
Each step produces a concrete artifact that you can review before moving on.
Each step produces a concrete artifact that you can review whenever you see the need.

The paradox: the smaller you make each task, the more autonomy you can give the agent.
A vague "build me X" needs constant supervision.
A precise "implement issue #42 using TDD, respecting the spec in src/docs/specs/" can run on its own.
A precise "implement issue #42 using TDD, respecting the spec and architecture in src/docs/" can run on its own.
The phases described in this document are designed to produce exactly that kind of precise, self-contained task.

This connects directly to Shannon's theorem.
Each small step is a short transmission over the noisy channel.
Short transmissions with error correction (tests, linting, review) are far more reliable than one long, unchecked transmission.
LLM output is inherently non-deterministic -- the same prompt can produce different code each time.
TDD and reviews are the error correction that makes this noisy channel usable.
The agent works in a tight loop: implement, test, commit, check docs.
You review at the boundaries between phases, not after every line of code.
This connects directly to *Eichhorst's Principle*, which applies Shannon's noisy channel theorem to LLM coding.
An LLM is not a deterministic tool.
It is a noisy, non-deterministic channel.
It hallucinates, loses context, and is sometimes plain wrong.
But an agent in a loop corrects itself: the compiler reports an error, the agent reads it, fixes the code, runs the tests, reads the failure, fixes the logic, and repeats until green.
That is not magic -- that is error correction, exactly as Shannon described.

When you prompt an LLM and paste the result into your project, you run an *open loop*.
No compiler check, no test suite, no review.
The LLM guesses once and you hope it guessed right.
When an agent writes code, runs the compiler, runs the tests, and iterates until everything passes, that is a *closed loop*.
It is the same principle that makes a thermostat work.

Different tests correct different error classes.
The compiler catches syntax errors.
Unit tests catch logic errors.
BDD tests catch domain errors.
Each layer increases the reliability of the channel.
Untested code is an uncorrected channel -- the noise passes straight through.

The consequence: *better tests beat better prompts*.
A comprehensive test suite turns a mediocre model into a reliable coding partner.
And if the complexity of a specification exceeds the capacity of the LLM, more tokens will not help.
The answer is smaller specifications, clearer boundaries, and better tests.

Each small step in this workflow is a short transmission over the noisy channel.
Short transmissions with error correction are far more reliable than one long, unchecked transmission.
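
The closed loop described above can be sketched in a few lines of Python.
This is a minimal illustration, not how any particular agent is implemented: `closed_loop`, `compiles`, `unit_test`, and `toy_agent` are hypothetical names, and `toy_agent` is a hard-coded stand-in for the LLM so the example runs without a model.

```python
from typing import Callable, List, Optional, Tuple

# A check inspects the code and returns an error message, or None if it passes.
Check = Callable[[str], Optional[str]]


def closed_loop(code: str,
                checks: List[Tuple[str, Check]],
                fix: Callable[[str, str, str], str],
                max_rounds: int = 5) -> Tuple[str, int]:
    """Run the checks in order. On the first failure, feed the error back
    to `fix` and start over. Returns the final code and the rounds used."""
    for round_no in range(1, max_rounds + 1):
        failure = None
        for layer, check in checks:
            message = check(code)
            if message is not None:
                failure = (layer, message)
                break
        if failure is None:
            return code, round_no  # everything is green
        code = fix(code, failure[0], failure[1])
    raise RuntimeError(f"still failing after {max_rounds} rounds")


def compiles(code: str) -> Optional[str]:
    """Layer 1: catches syntax errors."""
    try:
        compile(code, "<candidate>", "exec")
        return None
    except SyntaxError as exc:
        return f"syntax error: {exc.msg}"


def unit_test(code: str) -> Optional[str]:
    """Layer 2: catches logic errors."""
    namespace: dict = {}
    exec(code, namespace)
    result = namespace["add"](2, 3)
    return None if result == 5 else f"add(2, 3) returned {result}, expected 5"


def toy_agent(code: str, layer: str, message: str) -> str:
    """Hypothetical stand-in for the LLM: returns a canned fix per layer."""
    if layer == "compiler":
        return "def add(a, b):\n    return a - b\n"  # syntax fixed, logic still wrong
    return "def add(a, b):\n    return a + b\n"      # logic fixed


first_guess = "def add(a, b)\n    return a - b\n"  # missing colon, wrong operator
final_code, rounds = closed_loop(
    first_guess,
    [("compiler", compiles), ("unit test", unit_test)],
    toy_agent,
)
print(rounds)
print(final_code)
```

Pasting `first_guess` straight into a project would be the open loop.
In the closed loop, each layer reports one error class, and the loop only ends when all layers pass or the round budget is exhausted.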

== Workflow Overview

@@ -68,7 +88,7 @@ These principles apply to all phases:
⚓ link:#/anchor/conventional-commits[*Conventional Commits*]:: All commits follow a standardized format for a clean, parseable git history.
⚓ link:#/anchor/docs-as-code[*Docs-as-Code according to Ralf D. Müller*]:: Documentation lives in the repository as AsciiDoc, built by https://doctoolchain.org[docToolchain].
Docs-as-Code treats documentation like source code: version-controlled, peer-reviewed, and built automatically.
*Definition of Done*:: Code passes all tests, feature branch is merged or PR is created, documentation is updated, architecture decisions are recorded.
⚓ link:#/anchor/definition-of-done[*Definition of Done*]:: Code passes all tests, feature branch is merged or PR is created, documentation is updated, architecture decisions are recorded.
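
A parseable history means a script can check it.
The following sketch validates a commit header against the basic Conventional Commits shape (`type(scope)!: description`) and checks for an issue reference.
The list of allowed types is an assumption (the convention itself only fixes `feat` and `fix`), and `parse_header` and `references_issue` are hypothetical helper names.

```python
import re

# Assumed type list: `feat` and `fix` plus commonly used additions.
# Adjust it to whatever your project agrees on.
TYPES = ("feat", "fix", "docs", "style", "refactor",
         "perf", "test", "build", "ci", "chore", "revert")

HEADER = re.compile(
    r"^(?P<type>" + "|".join(TYPES) + r")"   # commit type
    r"(?:\((?P<scope>[^()\s]+)\))?"          # optional scope in parentheses
    r"(?P<breaking>!)?"                      # optional breaking-change marker
    r": (?P<description>\S.*)$"              # colon, space, description
)


def parse_header(line: str):
    """Return the header parts as a dict, or None if the line does not match."""
    match = HEADER.match(line)
    return match.groupdict() if match else None


def references_issue(message: str) -> bool:
    """True if the message mentions an issue such as #42."""
    return re.search(r"#\d+", message) is not None


print(parse_header("feat(cli): add --json output (#42)"))
print(parse_header("Fixed stuff"))
print(references_issue("feat(cli): add --json output (#42)"))
```

A check like this could run in a `commit-msg` hook or in CI, which makes it one more error correction layer.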

== Prerequisites

@@ -77,6 +97,8 @@ Before starting, set up your project infrastructure:
. Initialize a git repository
. Install https://doctoolchain.org[docToolchain] and download the ⚓ link:#/anchor/arc42[*arc42*] template
. Configure your AI coding environment with an `AGENTS.md` (or tool-specific equivalent like `CLAUDE.md`)
. Give the AI agent access to GitHub or GitLab via the CLI (`gh` or `glab`). The agent will need this later to create issues, pull requests, and ADR discussions. Consider using a dedicated account for audit traceability.
. Following Eichhorst's Principle, set up error correction layers for your project: linters, pre-commit hooks, CI pipelines, and static analysis. Each layer catches a different class of error and makes the LLM channel more reliable. The https://github.com/LLM-Coding/vibe-coding-risk-radar[Vibe Coding Risk Radar] can help determine which checks are appropriate for your project's risk profile. These checks take full effect once the first lines of runnable code exist -- set them up early, but revisit them as the project grows.

.Installing docToolchain (Linux/macOS)
[source,bash]
@@ -112,20 +134,20 @@ A minimal `AGENTS.md` for this workflow:
## Conventions
- Documentation: Plain English according to Strunk & White
- Testing: TDD (London or Chicago School as appropriate)
- Code: DRY, SOLID, KISS, Ubiquitous Language (DDD)
- Commits: Conventional Commits, reference issue number
- Branches: feature/<issue-description>

## Current State
- Working on: EPIC #3
- Next issue: #42
----

Update this file as the project evolves.
As the project progresses, the AI agent will maintain this file itself.
When starting a new AI session, the agent reads `AGENTS.md` and immediately has the context it needs.

[TIP]
====
Start a new AI session for each EPIC or when the conversation becomes sluggish.
Compact the context before starting a new EPIC.
Within a session, keep an eye on the context window.
Compact the conversation manually at natural breakpoints (e.g. after completing an issue) rather than waiting for the model to auto-compact at an inconvenient moment and lose important context.
The agent picks up context from `AGENTS.md` and the spec files automatically.
====

@@ -153,8 +175,8 @@ Prompt the AI to use the ⚓ link:#/anchor/socratic-method[*Socratic Method*] to
[source]
----
Use the Socratic Method to help me clarify requirements for [your project].
Ask me at most 3 questions at a time. Challenge my assumptions before
documenting anything.
Ask me at most 3 questions at a time. Challenge my assumptions.
Keep asking until you fully understand the requirements.
----

This activates targeted questioning, assumption challenging, and productive use of not-knowing.
@@ -173,13 +195,12 @@ Continue the dialogue until both you and the AI are satisfied that the requireme

=== Step 3: Document as PRD

Ask the AI to write a *Product Requirements Document (PRD)* and save it as AsciiDoc.
Ask the AI to write a ⚓ link:#/anchor/prd[*Product Requirements Document (PRD)*] and save it as AsciiDoc.
A PRD captures the what and why, not the how: problem statement, goals, user personas, success criteria, and scope boundaries.

[source]
----
Write a PRD based on our discussion. Save it as src/docs/specs/prd.adoc.
Follow Plain English guidelines according to Strunk and White.
----

=== Step 4: Create Detailed Specification
@@ -198,7 +219,7 @@ Save as .adoc files in src/docs/specs/.
⚓ link:#/anchor/gherkin[*Gherkin*] (Given/When/Then) provides acceptance criteria that are both human-readable and machine-testable.
These criteria become the foundation for TDD later.
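
To illustrate how a Gherkin criterion turns into a test, here is a small sketch.
The scenario and the `ShoppingCart` class are hypothetical and do not come from any of the projects mentioned in this document.
In a real TDD cycle the test is written first and fails until the class exists.

```python
# Hypothetical acceptance criterion, as it might appear in the specification:
#
#   Scenario: Removing the last item empties the cart
#     Given a cart containing 1 "notebook"
#     When the user removes "notebook"
#     Then the cart is empty


class ShoppingCart:
    """Minimal implementation, just enough to make the scenario pass."""

    def __init__(self):
        self._items = {}

    def add(self, name, quantity=1):
        self._items[name] = self._items.get(name, 0) + quantity

    def remove(self, name):
        if name not in self._items:
            raise KeyError(name)
        del self._items[name]

    def is_empty(self):
        return not self._items


def test_removing_the_last_item_empties_the_cart():
    # Given a cart containing 1 "notebook"
    cart = ShoppingCart()
    cart.add("notebook")

    # When the user removes "notebook"
    cart.remove("notebook")

    # Then the cart is empty
    assert cart.is_empty()


test_removing_the_last_item_empties_the_cart()
```

Each Given/When/Then line maps to one block of the test, so a reviewer can compare the test against the specification line by line.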

Activity Diagrams force you to think about error paths, edge cases, and alternative flows early.
Activity Diagrams are an important part of the specification because they define flows, error paths, and edge cases in a way the AI can follow during implementation.

== Phase 2: Architecture

@@ -208,16 +229,17 @@ Ask the AI to derive an architecture from the specification:

[source]
----
Create an arc42 architecture document based on the specification in src/docs/specs/.
Save it in src/docs/arc42/.
Fill the arc42 template in src/docs/arc42/ based on the specification in src/docs/specs/.
Use PlantUML C4 diagrams for architecture visualization.
----

The arc42 template was downloaded in the prerequisites step.
The AI knows the template structure and fills the 12 sections appropriately.

⚓ link:#/anchor/arc42[*arc42*] provides 12 sections covering everything from context to deployment.
The AI knows the template structure and fills it appropriately.

⚓ link:#/anchor/c4-diagrams[*C4 Diagrams*] combined with PlantUML provide text-based architecture visualization at four levels: Context, Container, Component, Code.
The AI can create and modify these diagrams without graphical tools.
Since the documentation uses AsciiDoc, PlantUML and other text-to-diagram tools are supported natively -- the AI generates diagrams as code, and the build renders them automatically.

==== Architecture Decision Records

@@ -238,7 +260,7 @@ The AI creates each ADR as a GitHub/GitLab issue first.
You review the issue, comment, or approve it.
Only after your approval is the ADR incorporated into the arc42 documentation.
This way, every architectural decision is traceable through the issue history.
All ADRs must align with the quality goals defined in arc42 Section 1.2.
All ADRs must align with the quality requirements defined in arc42 Section 10.

=== Step 6: Architecture Review (ATAM)

@@ -269,7 +291,7 @@ Use MoSCoW prioritization for the initial backlog order.
Mark dependencies between issues with labels or cross-references.
----

⚓ *INVEST* ensures User Stories are Independent, Negotiable, Valuable, Estimable, Small, and Testable.
link:#/anchor/invest[*INVEST*] ensures User Stories are Independent, Negotiable, Valuable, Estimable, Small, and Testable.

⚓ link:#/anchor/moscow[*MoSCoW*] (Must have, Should have, Could have, Won't have) provides clear prioritization.
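
Priority and dependencies together determine which issue is worked on next.
The following sketch shows one possible selection rule: among all issues whose dependencies are done, pick the one with the highest MoSCoW priority.
The `Issue` class, the issue numbers, and the tie-break by issue number are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set

# Lower rank is picked first. "Won't have" items are never scheduled.
MOSCOW_RANK = {"must": 0, "should": 1, "could": 2}


@dataclass
class Issue:
    number: int
    priority: str                      # "must", "should", "could", or "wont"
    depends_on: Set[int] = field(default_factory=set)


def next_issue(backlog: List[Issue], done: Set[int]) -> Optional[Issue]:
    """Return the highest-priority open issue whose dependencies are all done."""
    ready = [
        issue for issue in backlog
        if issue.number not in done
        and issue.priority in MOSCOW_RANK
        and issue.depends_on <= done   # every dependency is already finished
    ]
    return min(
        ready,
        key=lambda issue: (MOSCOW_RANK[issue.priority], issue.number),
        default=None,
    )


backlog = [
    Issue(41, "should"),
    Issue(42, "must", depends_on={41}),
    Issue(43, "could"),
    Issue(44, "wont"),
]

done: Set[int] = set()
order = []
while (issue := next_issue(backlog, done)) is not None:
    order.append(issue.number)
    done.add(issue.number)
print(order)
```

Issue 42 is a must-have but has to wait for issue 41.
This simple rule does not raise the priority of an issue that blocks a must-have, so a real backlog may need that refinement.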

@@ -281,20 +303,12 @@ As the project evolves, groom the backlog regularly to re-prioritize based on ne

=== Step 8: Implement Issue by Issue

Create a feature branch for each EPIC or logical group of issues:

[source,bash]
----
git checkout -b feature/<epic-or-issue-description>
----

Then enter the core development loop:

[source]
----
Create a feature branch for this EPIC.
Select the next logical issue from the backlog (respect dependencies).
Analyze it and document your analysis as a comment on the issue.
Implement it using TDD. Commit when done.
Implement it using TDD (choose London or Chicago School as appropriate). Commit when done.
Check if the spec or architecture docs need updating.
----

@@ -305,19 +319,13 @@ For each issue:
. *Commit*: After the issue is implemented and all tests pass, commit with a reference to the issue number
. *Check docs*: Ask whether the specification or architecture documentation needs updating based on what was learned during implementation

⚓ *TDD* (Test-Driven Development) comes in two schools:
TDD (Test-Driven Development) comes in two schools:

* ⚓ link:#/anchor/tdd-london-school[*London School*] (mockist): isolate the unit under test, mock dependencies. Good for interaction-heavy code.
* ⚓ link:#/anchor/tdd-chicago-school[*Chicago School*] (classicist): test behavior through the public API, use real collaborators. Good for state-based logic.
* ⚓ link:#/anchor/tdd-london-school[*TDD, London School*] (mockist): isolate the unit under test, mock dependencies. Good for interaction-heavy code.
* ⚓ link:#/anchor/tdd-chicago-school[*TDD, Chicago School*] (classicist): test behavior through the public API, use real collaborators. Good for state-based logic.

The AI selects the appropriate school based on the code's characteristics.
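
The difference between the two schools shows best on the same unit under test.
In this sketch, `OrderService`, `InMemoryRepo`, and `RecordingMailer` are hypothetical classes invented for the example.
The London-style test uses `unittest.mock.Mock` and verifies interactions.
The Chicago-style test wires in simple in-memory collaborators and verifies the resulting state.

```python
from unittest.mock import Mock


class OrderService:
    """Hypothetical unit under test with two collaborators."""

    def __init__(self, repo, mailer):
        self.repo = repo
        self.mailer = mailer

    def place(self, customer, item):
        order = {"customer": customer, "item": item}
        self.repo.save(order)
        self.mailer.send(customer, f"Order confirmed: {item}")
        return order


def test_london_style():
    # Mock every dependency and assert on the interactions.
    repo, mailer = Mock(), Mock()

    OrderService(repo, mailer).place("ada", "notebook")

    repo.save.assert_called_once_with({"customer": "ada", "item": "notebook"})
    mailer.send.assert_called_once_with("ada", "Order confirmed: notebook")


class InMemoryRepo:
    """Simple working collaborator that keeps orders in a list."""

    def __init__(self):
        self.saved = []

    def save(self, order):
        self.saved.append(order)


class RecordingMailer:
    """Simple working collaborator that keeps sent mails in a list."""

    def __init__(self):
        self.outbox = []

    def send(self, to, text):
        self.outbox.append((to, text))


def test_chicago_style():
    # Use working collaborators and assert on the observable state.
    repo, mailer = InMemoryRepo(), RecordingMailer()

    OrderService(repo, mailer).place("ada", "notebook")

    assert repo.saved == [{"customer": "ada", "item": "notebook"}]
    assert mailer.outbox == [("ada", "Order confirmed: notebook")]


test_london_style()
test_chicago_style()
```

The London-style test breaks if the service stops calling `save`, even when the outcome is unchanged.
The Chicago-style test only breaks when the observable state changes.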

Each implementation follows the Red-Green-Refactor cycle and respects:

* ⚓ link:#/anchor/dry-principle[*DRY*] (Don't Repeat Yourself)
* ⚓ link:#/anchor/solid-principles[*SOLID*] principles
* ⚓ *KISS* (Keep It Simple, Stupid)
* ⚓ link:#/anchor/domain-driven-design[*Ubiquitous Language*] from Domain-Driven Design: use the same terms in code as in the specification

==== Feedback Loop to Specification

@@ -337,15 +345,8 @@ The spec is not just a means to generate code -- it remains the authoritative de

==== Merging

When an EPIC or a logical group of issues is complete, merge the feature branch:

[source,bash]
----
git checkout main
git merge feature/<epic-or-issue-description>
----

For team projects, use Pull Requests for code review before merging.
When an EPIC is complete, the AI agent creates a Pull Request for the feature branch.
You review and merge it.

== Phase 5: Quality Assurance
