Commit b4c911d

Merge pull request #455 from raifdmueller/feat/socratic-code-theory-recovery
Add Socratic Code Theory Recovery to Brownfield workflow
2 parents c706cdf + 8ab7663 commit b4c911d

1 file changed

Lines changed: 86 additions & 19 deletions

docs/brownfield-workflow.adoc

@@ -43,6 +43,7 @@ Use ⚓ link:#/anchor/domain-driven-design[Domain-Driven Design] to identify the
4343
The AI can help: point it at the code and ask it to identify bounded contexts and their interfaces.
4444

4545
.Prompt
46+
[source,txt]
4647
----
4748
Analyze the codebase in src/. Identify bounded contexts using Domain-Driven Design.
4849
For each context, list: name, responsibility, key entities, interfaces to other contexts.
@@ -52,41 +53,74 @@ Present as a table.
5253
Pick one bounded context to start with.
5354
Choose one that is small, well-isolated, and has a change request pending.
5455

55-
== Phase 0.5: Reverse-Engineer the Safety Net
56+
== Phase 0.5: Socratic Code Theory Recovery
5657

57-
Before changing anything, you need two things: understanding and tests.
58+
Before changing anything, you need to recover the "theory" of the bounded context -- what https://pages.cs.wisc.edu/~remzi/Naur.pdf[Peter Naur] called the mental model that lives in the heads of the original developers. In a brownfield project, this model is not documented. The code is the only source.
5859

59-
=== Extract Existing Behavior as Use Cases
60+
This phase uses *Socratic Code Theory Recovery*: a two-phase workflow that builds understanding through recursive question refinement before producing documentation.
6061

61-
Let the AI read the code in your bounded context and extract what the system currently does.
62-
The output is a set of use cases that describe the *existing* behavior -- not what you want to build, but what is already there.
62+
=== Phase 1: Build the Question Tree
6363

64-
.Prompt
64+
Start with five high-level questions about the bounded context and decompose them recursively. Use Semantic Anchors as decomposition guides: *arc42* for architecture, *Cockburn Use Cases* for specification, *ISO 25010* for quality, *Nygard ADRs* for decisions.
65+
66+
.Starting Questions (adapt to your bounded context)
67+
[source,txt]
6568
----
66-
Read the code in [bounded context path]. Extract the existing behavior as Use Cases.
67-
For each Use Case: ID, Trigger, Actors, Preconditions, Main Flow, Alternative Flows, Postconditions, Business Rules.
68-
Save as docs/specs/use-cases-[context-name].adoc.
69+
1. What problem does this bounded context solve and for whom?
70+
2. What is the specification of this bounded context?
71+
3. What is the architecture of this bounded context?
72+
4. What quality goals drive the design?
73+
5. What risks and technical debt exist?
6974
----
7075

71-
Review the extracted use cases against the running system.
72-
The AI may miss implicit behavior or misinterpret code.
73-
This is the one step where domain knowledge is irreplaceable.
76+
Each leaf in the tree is either `[ANSWERED]` (with code evidence: file, function, line) or `[OPEN]` (with Category and Ask role).
77+
78+
The output is two files:
79+
80+
* `QUESTION_TREE.adoc` -- the full reasoning trace
81+
* `OPEN_QUESTIONS.adoc` -- the handoff document, grouped by role (Product Owner, Architect, Developer, Domain Expert, Operations)
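
A minimal sketch of how the Question Tree and the role-grouped handoff could be modelled, in case the files are post-processed by a script. The workflow itself only specifies the two AsciiDoc files; the `Question` class, its field names, and both helper functions are assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Question:
    """One node in the Question Tree (hypothetical structure)."""
    qid: str                 # e.g. "Q-3.9"
    text: str
    status: str = ""         # "ANSWERED" or "OPEN" on leaves, empty on inner nodes
    evidence: str = ""       # file, function, line (ANSWERED leaves)
    category: str = ""       # e.g. "Design Rationale" (OPEN leaves)
    ask_role: str = ""       # e.g. "Architect" (OPEN leaves)
    children: list = field(default_factory=list)


def leaves(node):
    """Yield every leaf below node, depth-first."""
    if not node.children:
        yield node
        return
    for child in node.children:
        yield from leaves(child)


def open_questions_by_role(root):
    """Group OPEN leaves by the role that should answer them."""
    grouped = defaultdict(list)
    for leaf in leaves(root):
        if leaf.status == "OPEN":
            grouped[leaf.ask_role].append(leaf)
    return dict(grouped)
```

Walking the tree once and grouping by `ask_role` mirrors the split between the full reasoning trace and the handoff document: the tree keeps everything, the grouped view keeps only what the team still has to answer.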
82+
83+
=== Between Phases: Team Answers the Open Questions
84+
85+
Route the Open Questions to the people who can answer them. In a controlled experiment with a 13,000-line Go codebase, *11 targeted questions* were sufficient to close the gap between reverse-engineered documentation and the original. The questions stay precise because recursive decomposition keeps splitting each question until it is either answered from code or narrowed to a single gap that one role can close.
86+
87+
Typical questions the LLM cannot answer from code:
88+
89+
[cols="2,3",options="header"]
90+
|===
91+
| Category | Example
92+
93+
| Business Context | Why was this built? What alternatives existed?
94+
| Design Rationale | Why JSONC instead of YAML? Why this library?
95+
| Quality Goals | Which quality goal has priority? What are the thresholds?
96+
| Stakeholder Context | Who uses this? What is their skill level?
97+
| Future Direction | What is planned but not yet implemented?
98+
|===
99+
100+
=== Phase 2: Synthesize Documentation
101+
102+
The LLM synthesizes the answered questions plus the code evidence from Phase 1 into documentation following the spec-driven workflow:
103+
104+
* *PRD* from Q-1 branch answers
105+
* *Specification* (Cockburn Use Cases, CLI spec, data models, Gherkin acceptance criteria) from Q-2 branch
106+
* *arc42* with all 12 chapters from Q-3 branch
107+
* *Nygard ADRs* with Pugh Matrix from Q-3.9 branch
108+
109+
Every claim references a Question ID and marks team-provided information with `(team answer)`. This dual traceability (code evidence + team input) is the key difference from a simple reverse-engineering prompt.
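
The traceability rule can be checked mechanically. The following is a sketch under stated assumptions: claims are available as one string each, and Question IDs look like `Q-1` or `Q-3.9` (a pattern inferred from the IDs used above, not a documented format). The function names are hypothetical.

```python
import re

# Assumed ID shape, inferred from "Q-1" and "Q-3.9" in the text above.
QID = re.compile(r"\bQ-\d+(?:\.\d+)*\b")
TEAM_MARK = "(team answer)"


def untraced_claims(claims):
    """Return claims that cite no Question ID."""
    return [claim for claim in claims if not QID.search(claim)]


def team_sourced_claims(claims):
    """Return claims marked as coming from the team rather than from code."""
    return [claim for claim in claims if TEAM_MARK in claim]
```

A non-empty result from `untraced_claims` points at statements that were synthesized without a traceable origin and are worth reviewing before the documentation is accepted.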
74110

75111
=== Establish Baseline Tests
76112

77-
Write tests that verify the existing behavior.
78-
These tests are your safety net: if a change breaks something, the tests will catch it.
113+
From the synthesized Use Cases, write tests that verify the existing behavior. These tests are your safety net.
79114

80115
.Prompt
116+
[source,txt]
81117
----
82118
Based on the Use Cases in docs/specs/use-cases-[context-name].adoc, write tests that verify the current behavior.
83119
Use TDD, London School. Each test references its Use Case ID for traceability.
84120
Do not change any production code. Only add tests.
85121
----
86122

87-
Run the tests.
88-
Every test must pass against the current code.
89-
If a test fails, the extracted use case was wrong -- fix the use case, then fix the test.
123+
Run the tests. Every test must pass against the current code. If a test fails, the extracted use case was wrong -- fix the use case, then fix the test.
90124

91125
[IMPORTANT]
92126
====
@@ -95,6 +129,24 @@ Without them, you cannot distinguish between "my change broke something" and "it
95129
This is the closed loop that makes brownfield changes safe.
96130
====
97131

132+
=== What the LLM Can and Cannot Recover
133+
134+
A controlled experiment (deleting documentation from a greenfield project and regenerating it from code) showed:
135+
136+
*Derivable from code:* Functional requirements (21 vs. 7 in the original), acceptance criteria (69 vs. 40), building block views, glossary (31 terms vs. 2 placeholders), security mechanisms, crosscutting concepts.
137+
138+
*NOT derivable from code:* Business context, design rationale (ADR "why"), quality goal *priorities*, stakeholder concerns, aspirational features, performance budgets, tutorials, review results.
139+
140+
Semantic Anchors serve a dual purpose in this workflow: *prompt compression* (a 69-line prompt produced 3,850 lines of correctly structured documentation) and *decomposition heuristics* ("arc42" generates 12 MECE sub-questions without additional instructions).
141+
142+
=== Spec Drift and Reconciliation
143+
144+
Even in well-documented projects, the specification drifts from the code. The implementation LLM adds security hardening, validation rules, and edge cases that were never in the original specification. This is not a discipline problem -- it is a structural property of the workflow.
145+
146+
The fix: periodic *spec reconciliation*. Run the reverse-engineering prompt against current code and diff against the existing spec. The diff reveals new requirements (in code, not in spec), changed behavior (diverged), and dead spec (documented but removed).
147+
148+
Three natural trigger points: before a release, after a security review, before onboarding.
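
The three diff categories reduce to set operations once requirements on both sides can be matched. This sketch assumes each requirement has a stable ID and a normalized description, which is the hard part in practice and is left to the LLM prompt; `reconcile` and its dictionary inputs are hypothetical.

```python
def reconcile(spec, code):
    """Classify requirement IDs as NEW, CHANGED, or DEAD.

    Both arguments map a requirement ID to a normalized description:
    spec holds the existing specification, code holds what was
    reverse-engineered from the current code.
    """
    new = sorted(code.keys() - spec.keys())        # in code, not in spec
    dead = sorted(spec.keys() - code.keys())       # in spec, not in code
    changed = sorted(
        rid for rid in spec.keys() & code.keys() if spec[rid] != code[rid]
    )                                              # present in both, diverged
    return {"NEW": new, "CHANGED": changed, "DEAD": dead}
```

The function only reports; like the reconciliation prompt, it leaves the existing spec files untouched so a person decides which side is correct.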
149+
98150
== Phase 1-12: The Standard Workflow
99151

100152
Once you have use cases and baseline tests for your bounded context, the standard workflow applies.
@@ -143,17 +195,29 @@ Stable code that nobody touches does not need specs.
143195
|`Analyze the codebase in [path]. Identify bounded contexts using DDD. List name, responsibility, entities, interfaces.`
144196
|link:#/anchor/domain-driven-design[DDD]
145197

146-
|Reverse-Engineer
147-
|`Read the code in [path]. Extract existing behavior as Use Cases with Trigger, Main Flow, Alternative Flows, Postconditions.`
198+
|Theory Recovery (Phase 1)
199+
|`You have access to [bounded context path]. No documentation exists. Build a Question Tree by recursively refining 5 questions: Problem/Users, Specification, Architecture, Quality Goals, Risks. Each leaf: [ANSWERED] with code evidence or [OPEN] with Category and Ask role.`
200+
|link:#/anchor/arc42[arc42], link:#/anchor/cockburn-use-cases[Cockburn], link:#/anchor/iso-25010[ISO 25010], link:#/anchor/nygard-adrs[Nygard ADR]
201+
202+
|Team Answers
203+
|Route OPEN_QUESTIONS.adoc to the team by Ask role. Typically 10-15 questions.
148204
|{empty}--
149205

206+
|Theory Recovery (Phase 2)
207+
|`Synthesize documentation from the Question Tree and team answers. Every claim references a Q-ID. Mark team input with (team answer).`
208+
|link:#/spec-driven-development[Spec-Driven Workflow]
209+
150210
|Baseline Tests
151211
|`Write tests for the Use Cases in [spec file]. Each test references its Use Case ID. Do not change production code.`
152212
|link:#/anchor/tdd-london-school[TDD London] / link:#/anchor/tdd-chicago-school[Chicago]
153213

154214
|Continue
155215
|Follow link:#/spec-driven-development[the standard workflow] from Step 3 (PRD) or Step 8 (Implementation), depending on whether you are adding new features or fixing bugs.
156216
|{empty}--
217+
218+
|Reconciliation
219+
|`Compare existing spec in [path] against current code. Report: NEW (in code, not in spec), CHANGED (diverged), DEAD (in spec, not in code). Do not modify existing files.`
220+
|{empty}--
157221
|===
158222

159223
== When Not to Use This Approach
@@ -169,3 +233,6 @@ If the system cannot be built or started, you have a different problem -- fix th
169233
* Simon Martinelli, https://unifiedprocess.ai/[AI Unified Process] -- the bounded-context approach to spec-driven development in existing systems.
170234
* Eric Evans, https://www.domainlanguage.com/ddd/[Domain-Driven Design] -- the foundational work on bounded contexts and strategic design.
171235
* Michael Feathers, _Working Effectively with Legacy Code_ -- techniques for establishing test coverage in systems without tests.
236+
* Peter Naur, "Programming as Theory Building" (1985) -- argues that programming is about building a mental model ("theory") that cannot be fully captured in documentation. Socratic Code Theory Recovery tests this claim in the context of LLM-generated code.
237+
* https://github.com/rdmueller/personalAssistant/blob/main/resources/brownfield-experiment-report.adoc[Brownfield Experiment Report] -- controlled experiment: delete documentation from a greenfield project, regenerate from code, compare. Full methodology and findings.
238+
* https://github.com/rdmueller/personalAssistant/blob/main/resources/brownfield-fair-comparison.adoc[Fair Comparison Report] -- three approaches (Direct, Socratic, Two-Phase) with identical team answers. Measures the structural value of the Question Tree.
