Merged
114 changes: 114 additions & 0 deletions skill/socratic-code-theory-recovery/SKILL.md
@@ -0,0 +1,114 @@
---
name: socratic-code-theory-recovery
description: Recover the "theory" (Naur 1985) of an existing codebase through recursive question refinement before writing documentation. Use on brownfield projects where the spec is missing — produces a Question Tree separating what is answerable from code (with evidence) from what must be asked of the team (routed by role). Phase 1 builds the tree; team answers the OPEN leaves; Phase 2 synthesizes PRD, Cockburn use cases, arc42 architecture, and Nygard ADRs from the answered tree.
metadata:
author: LLM-Coding
version: "0.1"
source: https://github.com/LLM-Coding/Semantic-Anchors
license: MIT
---

# Socratic Code-Theory Recovery

Reverse-engineer a bounded context into documentation without hallucinating the parts the code cannot tell you.

## When to use this skill

Use this skill on a brownfield codebase when:

- Documentation is missing, outdated, or untrusted.
- A change is about to be made and you need a spec before you can change safely.
- You want documentation that distinguishes code-derived facts from team-supplied context — auditable, not generated prose.
- You want to surface the *open questions* in the system, not just synthesize an answer the team has not seen.

Do **not** use this skill when:

- You are doing greenfield development — use the spec-driven workflow instead.
- The whole system needs to be documented at once — work one bounded context at a time.
- The code is not runnable — fix that first.

## The contract

This skill implements the *Socratic Code-Theory Recovery* contract from the Semantic Anchors project. The methodology rests on Peter Naur's 1985 paper *Programming as Theory Building*: a program's theory lives in the heads of its developers and cannot be fully captured in code alone. A documentation-recovery process that ignores this produces confident-looking prose that fills in the gaps with invention.

The fix: model the gaps explicitly. Every question about the system is either `[ANSWERED]` from code (with file:line evidence) or `[OPEN]` (with a category and the role that must answer it). The OPEN leaves are the handoff to humans.
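
As a minimal sketch of that contract, a leaf could be modelled as follows in Python. The `Leaf` class, its field names, and the `problems` method are illustrative assumptions, not part of the skill; the category and role vocabularies are the ones listed in the Phase 1 prompt.

```python
from dataclasses import dataclass
from typing import Optional

# Vocabularies as listed in the Phase 1 prompt.
CATEGORIES = {
    "business-context", "design-rationale", "quality-goals",
    "stakeholder-context", "future-direction",
}
ROLES = {"Product Owner", "Architect", "Developer", "Domain Expert", "Operations"}


@dataclass(frozen=True)
class Leaf:
    """One leaf of the Question Tree (hypothetical in-memory shape)."""
    q_id: str
    question: str
    status: str                      # "ANSWERED" or "OPEN"
    evidence: Optional[str] = None   # e.g. "<file>:<line>", ANSWERED leaves only
    category: Optional[str] = None   # OPEN leaves only
    ask_role: Optional[str] = None   # OPEN leaves only

    def problems(self) -> list[str]:
        """Return contract violations; an empty list means the leaf is well-formed."""
        if self.status == "ANSWERED":
            return [] if self.evidence else ["ANSWERED leaf needs code evidence"]
        if self.status == "OPEN":
            issues = []
            if self.category not in CATEGORIES:
                issues.append("OPEN leaf needs a valid category")
            if self.ask_role not in ROLES:
                issues.append("OPEN leaf needs a valid Ask role")
            return issues
        return [f"unknown status: {self.status}"]
```

A leaf with neither evidence nor a category and role is exactly the kind of gap the contract is meant to make visible.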

## Two-phase workflow

```
┌────────────────────────────────┐
Phase 1 │ CODE ──► Question Tree │
│ ├─ [ANSWERED] leaves│
│ └─ [OPEN] leaves │
└────────────────┬───────────────┘
┌────────────────────────────────┐
Between │ OPEN_QUESTIONS.adoc │
│ ──► team (routed by role) │
│ ──► answers fill in OPENs │
└────────────────┬───────────────┘
┌────────────────────────────────┐
Phase 2 │ Answered tree ──► Docs │
│ PRD · Cockburn UCs · arc42 · │
│ Nygard ADRs (every claim Q-ID) │
└────────────────────────────────┘
```

### Phase 1: Build the Question Tree

Use [prompts/phase-1-question-tree.md](prompts/phase-1-question-tree.md). Adapt the bounded-context path and any domain-specific Q1 examples; do not change the leaf classification, Q-ID scheme, or output files.

Outputs:

- `QUESTION_TREE.adoc` — the full hierarchical reasoning trace
- `OPEN_QUESTIONS.adoc` — only the `[OPEN]` leaves, grouped by Ask role
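
How the second file relates to the first can be sketched in a few lines of Python. This is only an illustration: the dictionary keys (`q_id`, `status`, `ask_role`, `category`, `question`) and the AsciiDoc layout with one `==` section per role are assumptions, and the authoritative layout is whatever `references/output-schema.md` defines.

```python
from collections import defaultdict

# Role order as listed in this skill; any other role is appended alphabetically.
ROLE_ORDER = ["Product Owner", "Architect", "Developer", "Domain Expert", "Operations"]


def render_open_questions(leaves: list[dict]) -> str:
    """Render only the OPEN leaves as AsciiDoc, one section per Ask role."""
    by_role = defaultdict(list)
    for leaf in leaves:
        if leaf["status"] == "OPEN":
            by_role[leaf["ask_role"]].append(leaf)

    roles = [r for r in ROLE_ORDER if r in by_role]
    roles += sorted(set(by_role) - set(ROLE_ORDER))

    out = ["= Open Questions", ""]
    for role in roles:
        out += [f"== {role}", ""]
        for leaf in by_role[role]:
            out.append(f"* {leaf['q_id']} ({leaf['category']}): {leaf['question']}")
        out.append("")
    return "\n".join(out)
```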

Decomposition heuristics — use these Semantic Anchors as guides, not as rigid templates:

- **arc42** — 12 architecture sub-questions (Q3 branch). See [references/arc42.md](references/arc42.md).
- **Cockburn Use Cases** — specification structure (Q2 branch). See [references/cockburn-use-cases.md](references/cockburn-use-cases.md).
- **ISO/IEC 25010** — 8 quality characteristics (Q4 branch). See [references/iso-25010.md](references/iso-25010.md).
- **Nygard ADRs** — design-rationale capture (Q3.9 branch). See [references/nygard-adrs.md](references/nygard-adrs.md).

Leaf classification rules and Q-ID scheme: [references/output-schema.md](references/output-schema.md).

Worked examples — one `[ANSWERED]` and one `[OPEN]` leaf for each major branch: [references/examples.md](references/examples.md).

### Between Phases: Team answers the OPEN leaves

Route `OPEN_QUESTIONS.adoc` to the people whose role appears in each section: Product Owner, Architect, Developer, Domain Expert, Operations. In one controlled experiment with a 13,000-line Go codebase, 11 targeted OPEN questions were enough to close the gap to the original documentation.

Team answers are written **directly into `OPEN_QUESTIONS.adoc`** under each question, marked clearly. Do not run Phase 2 until every OPEN leaf has either an answer or an explicit `(deferred)` marker.
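
The gate before Phase 2 can be checked mechanically. The sketch below assumes the team's text has already been extracted into a mapping from Q-ID to whatever was written under that question (an empty string if nothing was written); that mapping and the function names are hypothetical, since the skill does not prescribe a parser.

```python
def triage(open_answers: dict[str, str]) -> dict[str, list[str]]:
    """Sort OPEN leaves into answered, deferred, and unresolved Q-IDs."""
    result = {"answered": [], "deferred": [], "unresolved": []}
    for q_id in sorted(open_answers):
        text = open_answers[q_id].strip()
        if not text:
            result["unresolved"].append(q_id)   # nothing written yet
        elif text == "(deferred)":
            result["deferred"].append(q_id)     # explicit gap, allowed
        else:
            result["answered"].append(q_id)
    return result


def ready_for_phase_2(open_answers: dict[str, str]) -> bool:
    """Phase 2 may start only when no OPEN leaf is left unresolved."""
    return not triage(open_answers)["unresolved"]
```

Deferred leaves pass the gate on purpose: they reach Phase 2 as explicit gaps rather than as silence.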

### Phase 2: Synthesize documentation

Use [prompts/phase-2-synthesize.md](prompts/phase-2-synthesize.md). The Phase 2 LLM reads the answered tree and produces:

- **PRD** from the Q1 branch (problem, users, goals, success criteria)
- **Specification** from the Q2 branch (Cockburn use cases at User Goal level, system use cases for each technical interface, supplementary specifications)
- **arc42** with all 12 chapters from the Q3 branch
- **Nygard ADRs** with Pugh Matrix from the Q3.9 branch

Every claim references a Q-ID. Team-supplied information is marked `(team answer)`. This dual traceability — code evidence plus team input — is the difference from a simple reverse-engineering prompt that fills in gaps silently.

## What the LLM can and cannot recover

A controlled experiment (deleting documentation from a greenfield project and regenerating it from code) showed:

**Derivable from code**: functional requirements, acceptance criteria, building-block views, glossary, security mechanisms, crosscutting concepts.

**NOT derivable from code**: business context, design rationale (the ADR "why"), quality-goal *priorities*, stakeholder concerns, aspirational features, performance budgets, tutorials, review results.

If your synthesized documentation contains a claim from the second list without a `(team answer)` marker, assume the LLM invented it. Mark it `[OPEN]` and ask the team.

## Spec drift and reconciliation

After this skill produces documentation, the implementation LLM will add security hardening, validation rules, and edge cases that are not in the spec. This is structural, not a discipline problem. Re-run Phase 1 against the current code periodically — before a release, after a security review, before onboarding — and diff against the existing spec. The diff reveals NEW (in code, not in spec), CHANGED (diverged), and DEAD (in spec, not in code).
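
The three drift categories amount to a set comparison. The following Python sketch assumes that findings from the spec and from the fresh Phase 1 run have both been normalised to a mapping from a stable key to a short statement; those keys and the `classify_drift` name are assumptions, and in practice matching findings across runs needs human judgment.

```python
def classify_drift(spec: dict[str, str], code: dict[str, str]) -> dict[str, list[str]]:
    """Compare spec findings with code findings, keyed by a stable identifier."""
    shared = spec.keys() & code.keys()
    return {
        "NEW": sorted(code.keys() - spec.keys()),                 # in code, not in spec
        "CHANGED": sorted(k for k in shared if spec[k] != code[k]),  # diverged
        "DEAD": sorted(spec.keys() - code.keys()),                # in spec, not in code
    }
```

For example, a validation rule added during implementation would show up under `NEW`, while a documented feature that was removed would show up under `DEAD`.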

## Further reading

- Peter Naur, *Programming as Theory Building* (1985). https://pages.cs.wisc.edu/~remzi/Naur.pdf
- Brownfield Workflow (Semantic Anchors). https://llm-coding.github.io/Semantic-Anchors/brownfield
- Brownfield Experiment Report. https://llm-coding.github.io/Semantic-Anchors/brownfield-experiment-report
- Fair Comparison Report (three recovery approaches). https://llm-coding.github.io/Semantic-Anchors/brownfield-fair-comparison
76 changes: 76 additions & 0 deletions skill/socratic-code-theory-recovery/prompts/phase-1-question-tree.md
@@ -0,0 +1,76 @@
# Phase 1 Prompt: Build the Question Tree

Copy the block below into a session that has read access to the bounded context. Replace `[bounded context path]` with the actual path. Adapt the Q1-Q5 examples if your domain has different starting concerns, but do not change the leaf classification, Q-ID scheme, or output files.

```
You are performing Socratic Code-Theory Recovery on a brownfield bounded
context located at [bounded context path]. Phase 1 of two.

Goal: recover the program's theory (Naur, 1985) from source code through
recursive question refinement, before any documentation is written.

Process:

1. Start with five high-level questions about the bounded context:
Q1 What problem does this bounded context solve, and for whom?
Q2 What is the specification of this bounded context?
Q3 What is the architecture of this bounded context?
Q4 What quality goals drive the design?
Q5 What risks and technical debt exist?

2. Decompose each question recursively. Use these Semantic Anchors as
decomposition guides:
- arc42 — 12 sub-questions for architecture (Q3 branch)
- Cockburn Use Cases — Primary Actor, Trigger, Main Success Scenario,
Extensions, Postconditions for specification (Q2 branch)
- ISO/IEC 25010 — 8 quality characteristics for quality goals (Q4 branch)
- Nygard ADRs — Context, Decision, Status, Consequences for design
rationale (Q3.9 branch)
Stop decomposing when a question is precise enough to be answered with a
single piece of code evidence or a single fact from a stakeholder.

3. Assign a hierarchical Q-ID to every node (Q1, Q1.2, Q1.2.3, ...) so that
later documentation can cite back to it.

4. For each leaf, classify it:

[ANSWERED]
- You found the answer in the code.
- Cite the evidence as <file>:<line> or <file>::<function>.
- Be exact. No "see X for details."

[OPEN]
- The answer is not derivable from code alone.
- Category: business-context | design-rationale | quality-goals |
stakeholder-context | future-direction
- Ask role: Product Owner | Architect | Developer | Domain Expert |
Operations
- State precisely what cannot be answered, and why.

5. Output two files in AsciiDoc:

QUESTION_TREE.adoc
- Full hierarchical tree with all nodes and Q-IDs
- Each leaf marked [ANSWERED] (with evidence) or [OPEN] (with Category
and Ask role)
- Includes all reasoning, not only the leaves

OPEN_QUESTIONS.adoc
- Only the [OPEN] leaves, copied verbatim from QUESTION_TREE.adoc
- Grouped by Ask role (one section per role)
- Each question short enough to be answered in 1-3 sentences

Do not write any other documentation in this phase. Phase 2 will synthesize
the answered tree into PRD, specification, arc42, and ADRs — only after the
team has filled in the [OPEN] leaves.
```
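
The hierarchical Q-ID scheme from step 3 lends itself to a small consistency check. The Python sketch below is an assumption about how one might validate a generated tree, not part of the prompt: it parses IDs of the form `Q1.2.3`, derives each parent, and reports nodes whose parent is missing. Sorting by the parsed tuple keeps `Q3.10` after `Q3.2`, which plain string sorting would not.

```python
import re
from typing import Optional

Q_ID = re.compile(r"^Q(\d+(?:\.\d+)*)$")


def parse_q_id(q_id: str) -> tuple[int, ...]:
    """Turn 'Q3.10.2' into (3, 10, 2); raise ValueError on malformed IDs."""
    match = Q_ID.match(q_id)
    if match is None:
        raise ValueError(f"not a Q-ID: {q_id!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def parent_of(q_id: str) -> Optional[str]:
    """Return the parent Q-ID, or None for a top-level question."""
    parts = parse_q_id(q_id)
    if len(parts) == 1:
        return None
    return "Q" + ".".join(str(p) for p in parts[:-1])


def orphans(q_ids: set[str]) -> list[str]:
    """List Q-IDs whose parent node is absent from the tree."""
    found = []
    for q_id in q_ids:
        parent = parent_of(q_id)
        if parent is not None and parent not in q_ids:
            found.append(q_id)
    return sorted(found, key=parse_q_id)
```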

## What to do after the prompt completes

1. **Sanity-check `QUESTION_TREE.adoc`.** Pick three `[ANSWERED]` leaves at random and verify that the cited file:line actually supports the claim. If any citation is wrong, the LLM is hallucinating evidence; re-run with a smaller bounded context.

2. **Route `OPEN_QUESTIONS.adoc` to the team.** One section per Ask role. Typically 10-15 questions for a small bounded context; if you see 50+, the bounded context is too large.

3. **Team writes answers directly into `OPEN_QUESTIONS.adoc`** under each question. Mark deferrals explicitly as `(deferred)` so Phase 2 can decide whether to leave them as gaps in the documentation.

4. Only after every leaf has an answer or an explicit deferral, run Phase 2.
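
The sanity check in step 1 can be partly automated. The sketch below is a hedged illustration in Python: it assumes the `[ANSWERED]` evidence has already been extracted into a mapping from Q-ID to a `<file>:<line>` string (the extraction itself is not shown, and the function names are hypothetical). It only fetches the cited line for a human to read; it does not judge whether the line supports the claim, and it does not resolve the `<file>::<function>` form.

```python
import random
from pathlib import Path
from typing import Optional


def cited_line(root: Path, evidence: str) -> Optional[str]:
    """Return the source line cited as '<file>:<line>', or None if it cannot be found."""
    file_part, sep, line_part = evidence.rpartition(":")
    if not sep or not line_part.isdigit():
        return None                      # not in <file>:<line> form
    path = root / file_part
    if not path.is_file():
        return None                      # cited file does not exist
    lines = path.read_text(encoding="utf-8", errors="replace").splitlines()
    number = int(line_part)
    return lines[number - 1] if 1 <= number <= len(lines) else None


def sample_for_review(evidence_by_qid: dict[str, str], root: Path,
                      k: int = 3, seed: Optional[int] = None):
    """Pick up to k ANSWERED leaves at random and fetch their cited lines."""
    rng = random.Random(seed)
    picked = rng.sample(sorted(evidence_by_qid), min(k, len(evidence_by_qid)))
    return [(q, evidence_by_qid[q], cited_line(root, evidence_by_qid[q])) for q in picked]
```

A `None` in the third position means the citation points at nothing, which is already enough to distrust the run.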
68 changes: 68 additions & 0 deletions skill/socratic-code-theory-recovery/prompts/phase-2-synthesize.md
@@ -0,0 +1,68 @@
# Phase 2 Prompt: Synthesize Documentation

Run this prompt only after every `[OPEN]` leaf in `OPEN_QUESTIONS.adoc` has either a team answer or an explicit `(deferred)` marker.

```
You are performing Phase 2 of Socratic Code-Theory Recovery.

Inputs:
- QUESTION_TREE.adoc — the answered Question Tree from Phase 1.
- OPEN_QUESTIONS.adoc — same OPEN leaves, now with team answers (or
(deferred) markers) written under each question.

Goal: synthesize documentation from the answered tree. Every claim must be
traceable to a Q-ID. Team-supplied facts must be marked (team answer).
Anything still marked (deferred) must remain an explicit gap in the output,
not be filled with invention.

Produce four artifacts:

1. docs/specs/prd-[context-name].adoc — Product Requirements Document
- Problem statement, target users, goals, success criteria, scope
boundaries, constraints, open questions
- Source: Q1 branch of QUESTION_TREE.adoc
- Anchor: PRD (Cagan / Pichler)

2. docs/specs/use-cases-[context-name].adoc — Specification
- Persona Use Cases in Cockburn Fully Dressed format at User Goal level:
Primary Actor, Trigger, Stakeholders & Interests, Preconditions,
Main Success Scenario, Extensions, Postconditions, Business Rules.
- System Use Cases for each technical interface (API endpoint, CLI
command, event, file format): input + validation, processing,
output + status codes, error responses.
- Supplementary Specifications: Entity Model, State Machines, Interface
Contracts, Validation Rules.
- Gherkin acceptance criteria where applicable.
- Source: Q2 branch of QUESTION_TREE.adoc
- Anchor: Cockburn Use Cases

3. docs/arc42/arc42-[context-name].adoc — Architecture
- All 12 arc42 chapters. Mark chapters with no content as
"No information from Phase 1" rather than fabricating content.
- Source: Q3 branch of QUESTION_TREE.adoc
- Anchor: arc42 (Starke / Hruschka)

4. docs/specs/adrs/*.adoc — one ADR per significant design decision
- Nygard format: Title, Status, Context, Decision, Consequences.
- Include a Pugh Matrix listing the alternatives considered with a
3-point scale (-1, 0, +1) against the quality goals from Q4.
- Source: Q3.9 branch of QUESTION_TREE.adoc
- Anchor: ADR according to Nygard

Rules for traceability:
- Every paragraph references the Q-IDs that support it, in square brackets:
"The system uses Hexagonal Architecture [Q3.5]."
- Team-supplied facts get an inline marker: "Sessions expire after 24 hours
(team answer, Q3.4.2)."
- Deferred questions stay as explicit gaps: "Quality-goal priorities are
Comment on lines +53 to +57

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Unify the Q-ID notation for traceability

Lines 53-54 require Q-IDs in [], but lines 55-56 use (team answer, Q3.4.2). For robust, automatable traceability, a single citable format should apply (e.g. always [...], including the team-answer marker).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@skill/socratic-code-theory-recovery/prompts/phase-2-synthesize.md` around
lines 53 - 57, unify the Q-ID notation on the square-bracket format for all
references: replace occurrences of the parenthetical form "(team answer,
Q3.4.2)" and similar with consistent bracket notation like "[team answer,
Q3.4.2]", and adjust the example lines so that all paragraph references,
team-supplied facts, and deferred questions use the same citable format "[...]";
update the three examples in the section (the lines with "The system uses
Hexagonal Architecture [Q3.5].", "Sessions expire after 24 hours (team answer,
Q3.4.2)." and "Deferred questions ...") so that they uniformly use "[Q...]" or
"[team answer, Q...]".

deferred (Q4.1.deferred) and must be resolved before the next release."
- Do not introduce facts that do not appear in QUESTION_TREE.adoc or
OPEN_QUESTIONS.adoc. If a section feels under-specified, leave it
under-specified — that is signal, not a defect.
```
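
The traceability rules can be spot-checked after synthesis. The Python sketch below flags paragraphs that cite no Q-ID at all. It rests on assumptions that the prompt does not state: paragraphs are separated by blank lines, AsciiDoc headings start with `=`, and any token shaped like `Q3.4.2` counts as a citation regardless of the surrounding brackets or parentheses.

```python
import re

# Matches Q1, Q3.5, Q3.4.2, ... wherever they appear in a paragraph.
Q_ID = re.compile(r"\bQ\d+(?:\.\d+)*\b")


def untraced_paragraphs(adoc_text: str) -> list[str]:
    """Return paragraphs that contain no Q-ID reference (headings are skipped)."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", adoc_text)]
    return [
        p for p in paragraphs
        if p and not p.startswith("=") and not Q_ID.search(p)
    ]
```

A paragraph returned by this check is not necessarily wrong, but it is a claim that cannot be traced back to the Question Tree and deserves a second look.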

## After Phase 2

- **Spec drift starts immediately.** Re-run Phase 1 against the current code before each release; diff the new Question Tree against the existing documentation to surface NEW (in code, not in spec), CHANGED (diverged), and DEAD (in spec, not in code) findings.

- **Extend bounded contexts incrementally.** Don't reverse-engineer the whole system in one pass. Pick the next bounded context only when the first one's documentation is being actively used.
42 changes: 42 additions & 0 deletions skill/socratic-code-theory-recovery/references/arc42.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,42 @@
# arc42 — Decomposition Guide for Q3 (Architecture)

arc42 is a 12-section template for documenting software architecture (Gernot Starke, Peter Hruschka). In this skill, the 12 sections serve as decomposition heuristics for the Q3 branch of the Question Tree — each section becomes a sub-question.

## The 12 sections as Q3 sub-questions

| Q-ID | Section | Sub-question(s) |
|------|---------|-----------------|
| Q3.1 | Introduction and Goals | What does the system do at the highest level? Which 3-5 quality goals drive design? Who are the most important stakeholders? |
| Q3.2 | Architecture Constraints | Which technical and organizational constraints, and which conventions, restrict design choices? |
| Q3.3 | Context and Scope | What are the system's external interfaces — neighbours, channels, protocols? Business context vs technical context? |
| Q3.4 | Solution Strategy | Which fundamental decisions and patterns shape the architecture? Technology choices, top-level decomposition, quality-goal approaches, organizational decisions? |
| Q3.5 | Building Block View | How is the system decomposed into containers, components, classes? Static structure at multiple levels of zoom. |
| Q3.6 | Runtime View | How do components interact for the most important scenarios — startup, user-visible flows, error handling? |
| Q3.7 | Deployment View | Which hardware/infrastructure runs the system? Deployment topology, environments, mapping building blocks to infrastructure. |
| Q3.8 | Crosscutting Concepts | Which concepts apply across building blocks: domain models, architecture/design patterns, persistence, UI, communication, plausibility checks, exception/error handling, logging, security, internationalisation, configurability? |
| Q3.9 | Architecture Decisions | Why was each significant decision made? Each becomes a Nygard ADR — see [nygard-adrs.md](nygard-adrs.md). |
| Q3.10 | Quality Requirements | Quality tree, quality scenarios (when/where/who/measurement). Connects to Q4 (ISO 25010). |
| Q3.11 | Risks and Technical Debt | Known technical risks, debt items, and their mitigation status. Overlaps with Q5. |
| Q3.12 | Glossary | Domain terminology — terms the team uses with project-specific meaning. |

## Decomposition hints

- **Q3.1 Quality Goals** is *almost always* `[OPEN]` — priorities live in stakeholder heads, not code. Don't fake a ranking from package structure.
- **Q3.4 Solution Strategy** and **Q3.9 Architecture Decisions** are the *why* of the system. Code shows *what* was decided; the *why* is `[OPEN]` unless ADRs or commit messages explain it.
- **Q3.5 Building Block View** is the most code-derivable section. Walk packages/modules and trace dependencies.
- **Q3.6 Runtime View** is partially derivable: entry points and request flows can be traced. Error scenarios are often `[OPEN]` because the team's *intent* may differ from what the code happens to do.
- **Q3.11 Risks/Tech Debt** is `[OPEN]` unless TODO/FIXME comments are systematically maintained. Recent bug fixes and reverts often hint at debt the team already knows about.

## When to stop decomposing

A Q3 sub-question is fine-grained enough to be a leaf when:

- It can be answered with a single file:line reference, or
- It cannot be answered at all from code (mark `[OPEN]` with category and role).

Avoid sub-questions such as "How does the system handle errors?", which is too broad. Prefer "What happens when `OrderService.create()` is called with a duplicate idempotency key?", which a single code reference can answer.

## Reference

- Project: https://arc42.org/
- Anchor in the catalog: https://llm-coding.github.io/Semantic-Anchors/anchor/arc42