feat: socratic-code-theory-recovery Claude Code Skill (#473) #478
Merged: rdmueller merged 2 commits into LLM-Coding:main from raifdmueller:feat/socratic-recovery-skill-473 on May 13, 2026
Changes from 1 commit
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,114 @@ | ||
| --- | ||
| name: socratic-code-theory-recovery | ||
| description: Recover the "theory" (Naur 1985) of an existing codebase through recursive question refinement before writing documentation. Use on brownfield projects where the spec is missing — produces a Question Tree separating what is answerable from code (with evidence) from what must be asked of the team (routed by role). Phase 1 builds the tree; team answers the OPEN leaves; Phase 2 synthesizes PRD, Cockburn use cases, arc42 architecture, and Nygard ADRs from the answered tree. | ||
| metadata: | ||
| author: LLM-Coding | ||
| version: "0.1" | ||
| source: https://github.com/LLM-Coding/Semantic-Anchors | ||
| license: MIT | ||
| --- | ||
|
|
||
| # Socratic Code-Theory Recovery | ||
|
|
||
| Reverse-engineer a bounded context into documentation without hallucinating the parts the code cannot tell you. | ||
|
|
||
| ## When to use this skill | ||
|
|
||
| Use this skill on a brownfield codebase when: | ||
|
|
||
| - Documentation is missing, outdated, or untrusted. | ||
| - A change is about to be made and you need a spec before you can change safely. | ||
| - You want documentation that distinguishes code-derived facts from team-supplied context — auditable, not generated prose. | ||
| - You want to surface the *open questions* in the system, not just synthesize an answer the team has not seen. | ||
|
|
||
| Do **not** use this skill when: | ||
|
|
||
| - You are doing greenfield development — use the spec-driven workflow instead. | ||
| - The whole system needs to be documented at once — work one bounded context at a time. | ||
| - The code is not runnable — fix that first. | ||
|
|
||
| ## The contract | ||
|
|
||
| This skill implements the *Socratic Code-Theory Recovery* contract from the Semantic Anchors project. The methodology rests on Peter Naur's 1985 paper *Programming as Theory Building*: a program's theory lives in the heads of its developers and cannot be fully captured in code alone. A documentation-recovery process that ignores this produces confident-looking prose that fills in the gaps with invention. | ||
|
|
||
| The fix: model the gaps explicitly. Every question about the system is either `[ANSWERED]` from code (with file:line evidence) or `[OPEN]` (with a category and the role that must answer it). The OPEN leaves are the handoff to humans. | ||
|
|
||
| ## Two-phase workflow | ||
|
|
||
| ``` | ||
| ┌────────────────────────────────┐ | ||
| Phase 1 │ CODE ──► Question Tree │ | ||
| │ ├─ [ANSWERED] leaves│ | ||
| │ └─ [OPEN] leaves │ | ||
| └────────────────┬───────────────┘ | ||
| ▼ | ||
| ┌────────────────────────────────┐ | ||
| Between │ OPEN_QUESTIONS.adoc │ | ||
| │ ──► team (routed by role) │ | ||
| │ ──► answers fill in OPENs │ | ||
| └────────────────┬───────────────┘ | ||
| ▼ | ||
| ┌────────────────────────────────┐ | ||
| Phase 2 │ Answered tree ──► Docs │ | ||
| │ PRD · Cockburn UCs · arc42 · │ | ||
| │ Nygard ADRs (every claim Q-ID) │ | ||
| └────────────────────────────────┘ | ||
| ``` | ||
|
|
||
| ### Phase 1: Build the Question Tree | ||
|
|
||
| Use [prompts/phase-1-question-tree.md](prompts/phase-1-question-tree.md). Adapt the bounded-context path and any domain-specific Q1 examples; do not change the leaf classification, Q-ID scheme, or output files. | ||
|
|
||
| Outputs: | ||
|
|
||
| - `QUESTION_TREE.adoc` — the full hierarchical reasoning trace | ||
| - `OPEN_QUESTIONS.adoc` — only the `[OPEN]` leaves, grouped by Ask role | ||
|
|
||
| Decomposition heuristics — use these Semantic Anchors as guides, not as rigid templates: | ||
|
|
||
| - **arc42** — 12 architecture sub-questions (Q3 branch). See [references/arc42.md](references/arc42.md). | ||
| - **Cockburn Use Cases** — specification structure (Q2 branch). See [references/cockburn-use-cases.md](references/cockburn-use-cases.md). | ||
| - **ISO/IEC 25010** — 8 quality characteristics (Q4 branch). See [references/iso-25010.md](references/iso-25010.md). | ||
| - **Nygard ADRs** — design-rationale capture (Q3.9 branch). See [references/nygard-adrs.md](references/nygard-adrs.md). | ||
|
|
||
| Leaf classification rules and Q-ID scheme: [references/output-schema.md](references/output-schema.md). | ||
|
|
||
| Worked examples — one `[ANSWERED]` and one `[OPEN]` leaf for each major branch: [references/examples.md](references/examples.md). | ||
|
|
||
| ### Between Phases: Team answers the OPEN leaves | ||
|
|
||
| Route `OPEN_QUESTIONS.adoc` to the people whose role appears in each section: Product Owner, Architect, Developer, Domain Expert, Operations. In one controlled experiment with a 13,000-line Go codebase, 11 targeted OPEN questions were enough to close the gap to the original documentation. | ||
|
|
||
| Team answers are written **directly into `OPEN_QUESTIONS.adoc`** under each question, marked clearly. Do not call Phase 2 until every OPEN leaf has either an answer or an explicit `(deferred)` marker. | ||
|
|
||
| ### Phase 2: Synthesize documentation | ||
|
|
||
| Use [prompts/phase-2-synthesize.md](prompts/phase-2-synthesize.md). The Phase 2 LLM reads the answered tree and produces: | ||
|
|
||
| - **PRD** from the Q1 branch (problem, users, goals, success criteria) | ||
| - **Specification** from the Q2 branch (Cockburn use cases at User Goal level, system use cases for each technical interface, supplementary specifications) | ||
| - **arc42** with all 12 chapters from the Q3 branch | ||
| - **Nygard ADRs** with Pugh Matrix from the Q3.9 branch | ||
|
|
||
| Every claim references a Q-ID. Team-supplied information is marked `(team answer)`. This dual traceability — code evidence plus team input — is the difference from a simple reverse-engineering prompt that fills in gaps silently. | ||
|
|
||
| ## What the LLM can and cannot recover | ||
|
|
||
| A controlled experiment (deleting documentation from a greenfield project and regenerating it from code) showed: | ||
|
|
||
| **Derivable from code**: functional requirements, acceptance criteria, building-block views, glossary, security mechanisms, crosscutting concepts. | ||
|
|
||
| **NOT derivable from code**: business context, design rationale (the ADR "why"), quality-goal *priorities*, stakeholder concerns, aspirational features, performance budgets, tutorials, review results. | ||
|
|
||
| If your synthesized documentation contains a claim from the second list without a `(team answer)` marker, the LLM hallucinated it. Mark it `[OPEN]` and ask the team. | ||
|
|
||
| ## Spec drift and reconciliation | ||
|
|
||
| After this skill produces documentation, the implementation LLM will add security hardening, validation rules, and edge cases that are not in the spec. This is structural, not a discipline problem. Re-run Phase 1 against the current code periodically — before a release, after a security review, before onboarding — and diff against the existing spec. The diff reveals NEW (in code, not in spec), CHANGED (diverged), and DEAD (in spec, not in code). | ||
|
|
||
| ## Further reading | ||
|
|
||
| - Peter Naur, *Programming as Theory Building* (1985). https://pages.cs.wisc.edu/~remzi/Naur.pdf | ||
| - Brownfield Workflow (Semantic Anchors). https://llm-coding.github.io/Semantic-Anchors/brownfield | ||
| - Brownfield Experiment Report. https://llm-coding.github.io/Semantic-Anchors/brownfield-experiment-report | ||
| - Fair Comparison Report (three recovery approaches). https://llm-coding.github.io/Semantic-Anchors/brownfield-fair-comparison |
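A minimal Python sketch of the leaf contract described in SKILL.md above: every leaf is either `[ANSWERED]` with evidence or `[OPEN]` with a category and an Ask role. The `Leaf` class, the `validate` helper, and the field names are illustrative assumptions and not part of the skill; the category and role values are copied from the Phase 1 prompt.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    ANSWERED = "ANSWERED"
    OPEN = "OPEN"


# Values taken from the Phase 1 prompt; the constant names are hypothetical.
CATEGORIES = {
    "business-context",
    "design-rationale",
    "quality-goals",
    "stakeholder-context",
    "future-direction",
}
ROLES = {"Product Owner", "Architect", "Developer", "Domain Expert", "Operations"}


@dataclass
class Leaf:
    q_id: str                        # hierarchical ID, e.g. "Q3.5.1"
    question: str
    status: Status
    evidence: Optional[str] = None   # "<file>:<line>" or "<file>::<function>"
    category: Optional[str] = None   # only meaningful for OPEN leaves
    ask_role: Optional[str] = None   # only meaningful for OPEN leaves


def validate(leaf: Leaf) -> list[str]:
    """Return a list of contract violations for one leaf (empty if valid)."""
    problems = []
    if leaf.status is Status.ANSWERED:
        if not leaf.evidence:
            problems.append(f"{leaf.q_id}: [ANSWERED] leaf has no evidence")
    else:
        if leaf.category not in CATEGORIES:
            problems.append(f"{leaf.q_id}: [OPEN] leaf has no valid category")
        if leaf.ask_role not in ROLES:
            problems.append(f"{leaf.q_id}: [OPEN] leaf has no valid Ask role")
    return problems
```

Running `validate` over every leaf before routing `OPEN_QUESTIONS.adoc` is one possible way to catch leaves that were classified without the required metadata.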
76 changes: 76 additions & 0 deletions
skill/socratic-code-theory-recovery/prompts/phase-1-question-tree.md
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,76 @@ | ||
| # Phase 1 Prompt: Build the Question Tree | ||
|
|
||
| Copy the block below into a session that has read access to the bounded context. Replace `[bounded context path]` with the actual path. Adapt the Q1-Q5 examples if your domain has different starting concerns, but do not change the leaf classification, Q-ID scheme, or output files. | ||
|
|
||
| ``` | ||
| You are performing Socratic Code-Theory Recovery on a brownfield bounded | ||
| context located at [bounded context path]. Phase 1 of two. | ||
|
|
||
| Goal: recover the program's theory (Naur, 1985) from source code through | ||
| recursive question refinement, before any documentation is written. | ||
|
|
||
| Process: | ||
|
|
||
| 1. Start with five high-level questions about the bounded context: | ||
| Q1 What problem does this bounded context solve, and for whom? | ||
| Q2 What is the specification of this bounded context? | ||
| Q3 What is the architecture of this bounded context? | ||
| Q4 What quality goals drive the design? | ||
| Q5 What risks and technical debt exist? | ||
|
|
||
| 2. Decompose each question recursively. Use these Semantic Anchors as | ||
| decomposition guides: | ||
| - arc42 — 12 sub-questions for architecture (Q3 branch) | ||
| - Cockburn Use Cases — Primary Actor, Trigger, Main Success Scenario, | ||
| Extensions, Postconditions for specification (Q2 branch) | ||
| - ISO/IEC 25010 — 8 quality characteristics for quality goals (Q4 branch) | ||
| - Nygard ADRs — Context, Decision, Status, Consequences for design | ||
| rationale (Q3.9 branch) | ||
| Stop decomposing when a question is precise enough to be answered with a | ||
| single piece of code evidence or a single fact from a stakeholder. | ||
|
|
||
| 3. Assign a hierarchical Q-ID to every node (Q1, Q1.2, Q1.2.3, ...) so that | ||
| later documentation can cite back to it. | ||
|
|
||
| 4. For each leaf, classify it: | ||
|
|
||
| [ANSWERED] | ||
| - You found the answer in the code. | ||
| - Cite the evidence as <file>:<line> or <file>::<function>. | ||
| - Be exact. No "see X for details." | ||
|
|
||
| [OPEN] | ||
| - The answer is not derivable from code alone. | ||
| - Category: business-context | design-rationale | quality-goals | | ||
| stakeholder-context | future-direction | ||
| - Ask role: Product Owner | Architect | Developer | Domain Expert | | ||
| Operations | ||
| - State precisely what cannot be answered, and why. | ||
|
|
||
| 5. Output two files in AsciiDoc: | ||
|
|
||
| QUESTION_TREE.adoc | ||
| - Full hierarchical tree with all nodes and Q-IDs | ||
| - Each leaf marked [ANSWERED] (with evidence) or [OPEN] (with Category | ||
| and Ask role) | ||
| - Includes all reasoning, not only the leaves | ||
|
|
||
| OPEN_QUESTIONS.adoc | ||
| - Only the [OPEN] leaves, copied verbatim from QUESTION_TREE.adoc | ||
| - Grouped by Ask role (one section per role) | ||
| - Each question short enough to be answered in 1-3 sentences | ||
|
|
||
| Do not write any other documentation in this phase. Phase 2 will synthesize | ||
| the answered tree into PRD, specification, arc42, and ADRs — only after the | ||
| team has filled in the [OPEN] leaves. | ||
| ``` | ||
|
|
||
| ## What to do after the prompt completes | ||
|
|
||
| 1. **Sanity-check `QUESTION_TREE.adoc`.** Pick three `[ANSWERED]` leaves at random and verify the cited file:line actually contains the claim. If any cite is wrong, the LLM is hallucinating evidence — re-run with a smaller bounded context. | ||
|
|
||
| 2. **Route `OPEN_QUESTIONS.adoc` to the team.** One section per Ask role. Typically 10-15 questions for a small bounded context; if you see 50+, the bounded context is too large. | ||
|
|
||
| 3. **Team writes answers directly into `OPEN_QUESTIONS.adoc`** under each question. Mark deferrals explicitly as `(deferred)` so Phase 2 can decide whether to leave them as gaps in the documentation. | ||
|
|
||
| 4. Only after every leaf has an answer or an explicit deferral, run Phase 2. |
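The sanity check in step 1 above (pick three `[ANSWERED]` leaves at random and look at the cited line) can be partly mechanized. The sketch below only fetches the cited lines so that a human can compare them against the claim; it does not judge whether the claim is true. It handles the `<file>:<line>` form only, not `<file>::<function>`. The function names and the idea of passing citations as a plain list are assumptions for illustration.

```python
import random
from pathlib import Path
from typing import Optional


def parse_evidence(cite: str) -> tuple[str, int]:
    """Split 'path/to/file.go:42' into ('path/to/file.go', 42)."""
    path, _, line = cite.rpartition(":")
    if not path or not line.isdigit():
        raise ValueError(f"not a <file>:<line> citation: {cite!r}")
    return path, int(line)


def cited_line(root: Path, cite: str) -> str:
    """Return the source line a citation points at."""
    path, line_no = parse_evidence(cite)
    lines = (root / path).read_text(encoding="utf-8").splitlines()
    if not 1 <= line_no <= len(lines):
        raise ValueError(f"{cite}: file has only {len(lines)} lines")
    return lines[line_no - 1]


def spot_check(
    root: Path,
    cites: list[str],
    sample_size: int = 3,
    seed: Optional[int] = None,
) -> dict[str, str]:
    """Pick up to `sample_size` citations at random and fetch their lines."""
    rng = random.Random(seed)
    picked = rng.sample(cites, min(sample_size, len(cites)))
    return {cite: cited_line(root, cite) for cite in picked}
```

A citation that raises `ValueError` here points past the end of the file or is malformed, which is already a sign that the evidence may be invented.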
68 changes: 68 additions & 0 deletions
skill/socratic-code-theory-recovery/prompts/phase-2-synthesize.md
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,68 @@ | ||
| # Phase 2 Prompt: Synthesize Documentation | ||
|
|
||
| Run this prompt only after every `[OPEN]` leaf in `OPEN_QUESTIONS.adoc` has either a team answer or an explicit `(deferred)` marker. | ||
|
|
||
| ``` | ||
| You are performing Phase 2 of Socratic Code-Theory Recovery. | ||
|
|
||
| Inputs: | ||
| - QUESTION_TREE.adoc — the answered Question Tree from Phase 1. | ||
| - OPEN_QUESTIONS.adoc — same OPEN leaves, now with team answers (or | ||
| (deferred) markers) written under each question. | ||
|
|
||
| Goal: synthesize documentation from the answered tree. Every claim must be | ||
| traceable to a Q-ID. Team-supplied facts must be marked (team answer). | ||
| Anything still marked (deferred) must remain an explicit gap in the output, | ||
| not be filled with invention. | ||
|
|
||
| Produce four artifacts: | ||
|
|
||
| 1. docs/specs/prd-[context-name].adoc — Product Requirements Document | ||
| - Problem statement, target users, goals, success criteria, scope | ||
| boundaries, constraints, open questions | ||
| - Source: Q1 branch of QUESTION_TREE.adoc | ||
| - Anchor: PRD (Cagan / Pichler) | ||
|
|
||
| 2. docs/specs/use-cases-[context-name].adoc — Specification | ||
| - Persona Use Cases in Cockburn Fully Dressed format at User Goal level: | ||
| Primary Actor, Trigger, Stakeholders & Interests, Preconditions, | ||
| Main Success Scenario, Extensions, Postconditions, Business Rules. | ||
| - System Use Cases for each technical interface (API endpoint, CLI | ||
| command, event, file format): input + validation, processing, | ||
| output + status codes, error responses. | ||
| - Supplementary Specifications: Entity Model, State Machines, Interface | ||
| Contracts, Validation Rules. | ||
| - Gherkin acceptance criteria where applicable. | ||
| - Source: Q2 branch of QUESTION_TREE.adoc | ||
| - Anchor: Cockburn Use Cases | ||
|
|
||
| 3. docs/arc42/arc42-[context-name].adoc — Architecture | ||
| - All 12 arc42 chapters. Mark chapters with no content as | ||
| "No information from Phase 1" rather than fabricating content. | ||
| - Source: Q3 branch of QUESTION_TREE.adoc | ||
| - Anchor: arc42 (Starke / Hruschka) | ||
|
|
||
| 4. docs/specs/adrs/*.adoc — one ADR per significant design decision | ||
| - Nygard format: Title, Status, Context, Decision, Consequences. | ||
| - Include a Pugh Matrix listing the alternatives considered with a | ||
| 3-point scale (-1, 0, +1) against the quality goals from Q4. | ||
| - Source: Q3.9 branch of QUESTION_TREE.adoc | ||
| - Anchor: ADR according to Nygard | ||
|
|
||
| Rules for traceability: | ||
| - Every paragraph references the Q-IDs that support it, in square brackets: | ||
| "The system uses Hexagonal Architecture [Q3.5]." | ||
| - Team-supplied facts get an inline marker: "Sessions expire after 24 hours | ||
| (team answer, Q3.4.2)." | ||
| - Deferred questions stay as explicit gaps: "Quality-goal priorities are | ||
| deferred (Q4.1.deferred) and must be resolved before the next release." | ||
| - Do not introduce facts that do not appear in QUESTION_TREE.adoc or | ||
| OPEN_QUESTIONS.adoc. If a section feels under-specified, leave it | ||
| under-specified — that is signal, not a defect. | ||
| ``` | ||
|
|
||
| ## After Phase 2 | ||
|
|
||
| - **Spec drift starts immediately.** Re-run Phase 1 against the current code before each release; diff the new Question Tree against the existing documentation to surface NEW (in code, not in spec), CHANGED (diverged), and DEAD (in spec, not in code) findings. | ||
|
|
||
| - **Extend bounded contexts incrementally.** Don't reverse-engineer the whole system in one pass. Pick the next bounded context only when the first one's documentation is being actively used. | ||
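The NEW / CHANGED / DEAD classification described under "Spec drift starts immediately" amounts to a three-way comparison of two sets of findings. The sketch below assumes that both the existing spec and the fresh Phase 1 run have already been reduced to a mapping from a stable key to a normalized claim; how to derive such keys is not specified in the PR, and the key names in the usage example are hypothetical.

```python
def classify_drift(spec: dict[str, str], code: dict[str, str]) -> dict[str, list[str]]:
    """Compare spec findings against findings from the current code.

    NEW:     present in code, absent from spec
    CHANGED: present in both, but the claims differ
    DEAD:    present in spec, absent from code
    """
    new = sorted(key for key in code if key not in spec)
    dead = sorted(key for key in spec if key not in code)
    changed = sorted(key for key in spec if key in code and spec[key] != code[key])
    return {"NEW": new, "CHANGED": changed, "DEAD": dead}


if __name__ == "__main__":
    # Hypothetical findings, for illustration only.
    spec = {"auth.session-ttl": "24h", "export.csv": "supported"}
    code = {"auth.session-ttl": "12h", "auth.rate-limit": "5/min"}
    for kind, keys in classify_drift(spec, code).items():
        print(kind, keys)
```

Exact string comparison is a deliberate simplification; in practice a CHANGED finding probably needs human review to decide whether the spec or the code is right.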
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,42 @@ | ||
| # arc42 — Decomposition Guide for Q3 (Architecture) | ||
|
|
||
| arc42 is a 12-section template for documenting software architecture (Gernot Starke, Peter Hruschka). In this skill, the 12 sections serve as decomposition heuristics for the Q3 branch of the Question Tree — each section becomes a sub-question. | ||
|
|
||
| ## The 12 sections as Q3 sub-questions | ||
|
|
||
| | Q-ID | Section | Sub-question(s) | | ||
| |------|---------|-----------------| | ||
| | Q3.1 | Introduction and Goals | What does the system do at the highest level? Which 3-5 quality goals drive design? Who are the most important stakeholders? | | ||
| | Q3.2 | Architecture Constraints | Which technical, organizational, conventional constraints restrict design choices? | | ||
| | Q3.3 | Context and Scope | What are the system's external interfaces — neighbours, channels, protocols? Business context vs technical context? | | ||
| | Q3.4 | Solution Strategy | Which fundamental decisions and patterns shape the architecture? Technology choices, top-level decomposition, quality-goal approaches, organizational decisions? | | ||
| | Q3.5 | Building Block View | How is the system decomposed into containers, components, classes? Static structure at multiple levels of zoom. | | ||
| | Q3.6 | Runtime View | How do components interact for the most important scenarios — startup, user-visible flows, error handling? | | ||
| | Q3.7 | Deployment View | Which hardware/infrastructure runs the system? Deployment topology, environments, mapping building blocks to infrastructure. | | ||
| | Q3.8 | Crosscutting Concepts | Domain models, architecture/design patterns used, persistence, UI, communication, plausibility checks, exception/error handling, logging, security, internationalisation, configurability? | | ||
| | Q3.9 | Architecture Decisions | Why was each significant decision made? Each becomes a Nygard ADR — see [nygard-adrs.md](nygard-adrs.md). | | ||
| | Q3.10 | Quality Requirements | Quality tree, quality scenarios (when/where/who/measurement). Connects to Q4 (ISO 25010). | | ||
| | Q3.11 | Risks and Technical Debt | Known technical risks, debt items, and their mitigation status. Overlaps with Q5. | | ||
| | Q3.12 | Glossary | Domain terminology — terms the team uses with project-specific meaning. | | ||
|
|
||
| ## Decomposition hints | ||
|
|
||
| - **Q3.1 Quality Goals** is *almost always* `[OPEN]` — priorities live in stakeholder heads, not code. Don't fake a ranking from package structure. | ||
| - **Q3.4 Solution Strategy** and **Q3.9 Architecture Decisions** are the *why* of the system. Code shows *what* was decided; the *why* is `[OPEN]` unless ADRs or commit messages explain it. | ||
| - **Q3.5 Building Block View** is the most code-derivable section. Walk packages/modules and trace dependencies. | ||
| - **Q3.6 Runtime View** is partially derivable — entry points, request flows. Error scenarios are often `[OPEN]` because the team's *intent* differs from what happens to compile. | ||
| - **Q3.11 Risks/Tech Debt** is `[OPEN]` unless TODO/FIXME comments are systematically maintained. Recent bug fixes and reverts often hint at debt the team already knows about. | ||
|
|
||
| ## When to stop decomposing | ||
|
|
||
| A Q3 sub-question is fine-grained enough to be a leaf when: | ||
|
|
||
| - It can be answered with a single file:line reference, or | ||
| - It cannot be answered at all from code (mark `[OPEN]` with category and role). | ||
|
|
||
| Avoid making sub-questions like "How does the system handle errors?" — too broad. Prefer "What happens when `OrderService.create()` is called with a duplicate idempotency key?" — answerable. | ||
|
|
||
| ## Reference | ||
|
|
||
| - Project: https://arc42.org/ | ||
| - Anchor in the catalog: https://llm-coding.github.io/Semantic-Anchors/anchor/arc42 |
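Because the arc42 branch runs from Q3.1 to Q3.12, plain string sorting would place Q3.10 before Q3.2. A small sketch of hierarchical Q-ID handling follows, assuming IDs always have the form `Q<number>(.<number>)*` as in the table above; the helper names `qid_key` and `parent` are illustrative and not defined by the skill.

```python
from typing import Optional


def qid_key(q_id: str) -> tuple[int, ...]:
    """Turn 'Q3.10.2' into (3, 10, 2) so that IDs sort numerically."""
    if not q_id.startswith("Q"):
        raise ValueError(f"not a Q-ID: {q_id!r}")
    return tuple(int(part) for part in q_id[1:].split("."))


def parent(q_id: str) -> Optional[str]:
    """Return the parent Q-ID, or None for a top-level question."""
    head, sep, _ = q_id.rpartition(".")
    return head if sep else None


if __name__ == "__main__":
    ids = ["Q3.10", "Q3.9", "Q3.2", "Q3"]
    print(sorted(ids, key=qid_key))
    print(parent("Q3.9.1"), parent("Q3"))
```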
Unify the Q-ID notation for traceability
Lines 53-54 require Q-IDs in `[]`, but lines 55-56 use `(team answer, Q3.4.2)`. For robust, automatable traceability, a single citable format should apply (e.g. always `[...]`, including the team-answer marker).
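To illustrate why the review comment asks for a single format: with one bracketed notation, citations can be extracted and checked by a single regular expression. The unified form assumed here, `[Q3.5]` for code-derived claims and `[Q3.4.2, team answer]` or `[Q4.1, deferred]` for the other two cases, is a hypothetical proposal and not something the PR defines.

```python
import re

# Assumed unified notation: [Q<id>] optionally followed by ", team answer"
# or ", deferred" inside the same brackets.
CITATION = re.compile(r"\[(Q\d+(?:\.\d+)*)(?:,\s*(team answer|deferred))?\]")


def extract_citations(text: str) -> list[tuple[str, str]]:
    """Return (q_id, source) pairs; source defaults to 'code'."""
    return [(m.group(1), m.group(2) or "code") for m in CITATION.finditer(text)]


def uncited_paragraphs(doc: str) -> list[str]:
    """Return paragraphs (separated by blank lines) that cite no Q-ID."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", doc) if p.strip()]
    return [p for p in paragraphs if not CITATION.search(p)]
```

With the mixed notation currently in the Phase 2 prompt, a checker like this would need a second pattern for the parenthesized form and a third for `Q4.1.deferred`, which is the maintenance cost the comment points at.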