Commit 7fa4ab0

Merge pull request #525 from raifdmueller/fix/socratic-adaptive-depth-qid-522-524
Adaptive Question Tree depth + Q-ID wording fix in Socratic recovery skill
2 parents: deb756e + 0c66337

14 files changed

Lines changed: 136 additions & 54 deletions

docs/changelog.adoc

Lines changed: 4 additions & 0 deletions
@@ -9,6 +9,10 @@ A chronological record of all semantic anchors added to the catalog. Community c
 * *Architecture Documentation* — sharpened with cross-section traceability and arc42 chapter-level rules. The contract now directs scaffolding the arc42 "with-help" template via docToolchain `downloadTemplate` instead of restating chapter structure; adds five cross-section traceability rules (quality goal → strategy, context ↔ building blocks, building block → runtime, Chapter 9 ADR index, building-block detail); separates Chapter 11 Risks from Technical Debt with probability/impact/priority columns; wires ADR Consequences to Chapter 11 risk IDs; and clarifies that Chapter 1.2 holds the top 3-5 quality goals while Chapter 10 may elaborate further characteristics (https://github.com/LLM-Coding/Semantic-Anchors/issues/502[#502], https://github.com/LLM-Coding/Semantic-Anchors/issues/503[#503], https://github.com/LLM-Coding/Semantic-Anchors/issues/504[#504], https://github.com/LLM-Coding/Semantic-Anchors/issues/505[#505])
 * *Socratic Code Theory Recovery* — the Q4 quality wording now states the Chapter 1.2-vs-10 relationship: Chapter 1.2 names only the top quality goals, Chapter 10 covers all eight characteristics, each marked as concretising a top goal or as derived (https://github.com/LLM-Coding/Semantic-Anchors/issues/505[#505])
 
+*Skill updates:*
+
+* *Socratic Code-Theory Recovery skill* — Phase 1 decomposition is now adaptive instead of near-fixed: a node is a leaf only when answerable from specific `file:line` evidence, otherwise it decomposes further (capped at four levels below a fixed node). Tree depth tracks code density, so a large bounded context no longer collapses into one thin leaf per arc42 chapter. Directory-level evidence is no longer valid. The Phase 2 prompt wording was corrected to match the schema — claims trace back to a tree leaf as a build-time check, Q-IDs never appear in the final documents (https://github.com/LLM-Coding/Semantic-Anchors/issues/522[#522], https://github.com/LLM-Coding/Semantic-Anchors/issues/524[#524])
+
 == 2026-05-13
 
 *New anchors:*

docs/socratic-recovery-skill.adoc

Lines changed: 1 addition & 1 deletion
@@ -9,7 +9,7 @@ Recovers documentation from a brownfield codebase without hallucinating the part
 
 === Phase 1 — Build the Question Tree
 
-The skill prompts the LLM to build a Question Tree from five root questions about a bounded context (Problem/Users, Specification, Architecture, Quality Goals, Risks). Their second level is fixed — every run emits the same enumerated nodes (Q1.1–Q5.5: six PRD elements, six specification categories, the twelve arc42 chapters, the eight ISO/IEC 25010 characteristics plus a priority question, five risk categories), so Q-IDs are stable and trees from different runs can be diffed node-by-node. Free, code-driven decomposition happens only below that fixed level. Each leaf is classified:
+The skill prompts the LLM to build a Question Tree from five root questions about a bounded context (Problem/Users, Specification, Architecture, Quality Goals, Risks). Their second level is fixed — every run emits the same enumerated nodes (Q1.1–Q5.5: six PRD elements, six specification categories, the twelve arc42 chapters, the eight ISO/IEC 25010 characteristics plus a priority question, five risk categories), so Q-IDs are stable and trees from different runs can be diffed node-by-node. Adaptive, code-driven decomposition happens only below that fixed level — a node keeps splitting until each leaf maps to one specific `file:line`, so tree depth tracks code density and a large bounded context yields a deeper tree. Each leaf is classified:
 
 * `[ANSWERED]` -- the LLM found it in the code, with `<file>:<line>` evidence
 * `[OPEN]` -- the answer is not derivable from code; tagged with a Category and the role that must answer (Product Owner, Architect, Developer, Domain Expert, Operations)
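
The paragraph in this hunk says stable Q-IDs let trees from different runs be diffed node by node. A minimal sketch of what such a comparison could look like, assuming each leaf is reduced to a small record keyed by its Q-ID. The `Leaf` type, the `diff_trees` helper, the three-part ID `Q3.5.1`, and the file paths are hypothetical and not taken from the skill's schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Leaf:
    """One classified leaf of a Question Tree (hypothetical record shape)."""
    qid: str            # stable question ID, e.g. "Q1.5"
    status: str         # "ANSWERED" or "OPEN"
    evidence: str = ""  # "<file>:<line>" for ANSWERED leaves
    role: str = ""      # role that must answer an OPEN leaf


def diff_trees(old, new):
    """Compare two runs by Q-ID and return (added, removed, changed) ID lists."""
    old_by_id = {leaf.qid: leaf for leaf in old}
    new_by_id = {leaf.qid: leaf for leaf in new}
    added = sorted(new_by_id.keys() - old_by_id.keys())
    removed = sorted(old_by_id.keys() - new_by_id.keys())
    changed = sorted(
        qid
        for qid in old_by_id.keys() & new_by_id.keys()
        if old_by_id[qid] != new_by_id[qid]
    )
    return added, removed, changed


# Two made-up runs over the same bounded context.
run_1 = [
    Leaf("Q1.1", "ANSWERED", evidence="README.md:3"),
    Leaf("Q1.5", "OPEN", role="Product Owner"),
]
run_2 = [
    Leaf("Q1.1", "ANSWERED", evidence="README.md:3"),
    Leaf("Q1.5", "ANSWERED", evidence="docs/metrics.md:12"),
    Leaf("Q3.5.1", "ANSWERED", evidence="src/app.py:40"),
]
added, removed, changed = diff_trees(run_1, run_2)
```

Because the fixed second level emits the same IDs on every run, only nodes below it should show up as added or removed when the adaptive depth differs between runs.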

docs/socratic-recovery-skill.de.adoc

Lines changed: 1 addition & 1 deletion
@@ -9,7 +9,7 @@ Stellt Dokumentation aus einem Brownfield-Codebase wieder her, ohne die Lücken
 
 === Phase 1 — Question Tree aufbauen
 
-Der Skill weist das LLM an, aus fünf Wurzelfragen zum Bounded Context (Problem/User, Spezifikation, Architektur, Qualitätsziele, Risiken) einen Question Tree zu bauen. Ihre zweite Ebene ist fix — jeder Lauf erzeugt dieselben enumerierten Knoten (Q1.1–Q5.5: sechs PRD-Elemente, sechs Spezifikationskategorien, die zwölf arc42-Kapitel, die acht ISO/IEC-25010-Merkmale plus eine Prioritätsfrage, fünf Risikokategorien), sodass Q-IDs stabil sind und Bäume verschiedener Läufe Knoten für Knoten verglichen werden können. Freie, code-getriebene Zerlegung passiert nur unterhalb dieser fixen Ebene. Jedes Blatt wird klassifiziert:
+Der Skill weist das LLM an, aus fünf Wurzelfragen zum Bounded Context (Problem/User, Spezifikation, Architektur, Qualitätsziele, Risiken) einen Question Tree zu bauen. Ihre zweite Ebene ist fix — jeder Lauf erzeugt dieselben enumerierten Knoten (Q1.1–Q5.5: sechs PRD-Elemente, sechs Spezifikationskategorien, die zwölf arc42-Kapitel, die acht ISO/IEC-25010-Merkmale plus eine Prioritätsfrage, fünf Risikokategorien), sodass Q-IDs stabil sind und Bäume verschiedener Läufe Knoten für Knoten verglichen werden können. Adaptive, code-getriebene Zerlegung passiert nur unterhalb dieser fixen Ebene — ein Knoten wird so lange zerlegt, bis jedes Blatt auf eine konkrete `file:line` zeigt, sodass die Baumtiefe der Code-Dichte folgt und ein großer Bounded Context einen tieferen Baum ergibt. Jedes Blatt wird klassifiziert:
 
 * `[ANSWERED]` -- das LLM hat es im Code gefunden, mit `<file>:<line>`-Evidenz
 * `[OPEN]` -- die Antwort steckt nicht im Code; mit Category und der Rolle markiert, die antworten muss (Product Owner, Architect, Developer, Domain Expert, Operations)

plugins/semantic-anchors/skills/socratic-code-theory-recovery/SKILL.md

Lines changed: 3 additions & 1 deletion
Some generated files are not rendered by default.

plugins/semantic-anchors/skills/socratic-code-theory-recovery/prompts/phase-1-question-tree.md

Lines changed: 31 additions & 11 deletions
Some generated files are not rendered by default.

plugins/semantic-anchors/skills/socratic-code-theory-recovery/prompts/phase-2-synthesize.md

Lines changed: 4 additions & 2 deletions
Some generated files are not rendered by default.

plugins/semantic-anchors/skills/socratic-code-theory-recovery/references/examples.md

Lines changed: 23 additions & 9 deletions
Some generated files are not rendered by default.

plugins/semantic-anchors/skills/socratic-code-theory-recovery/references/output-schema.md

Lines changed: 3 additions & 2 deletions
Some generated files are not rendered by default.

skill/socratic-code-theory-recovery/SKILL.md

Lines changed: 3 additions & 1 deletion
@@ -78,7 +78,7 @@ Outputs:
 - `QUESTION_TREE.adoc` — the full hierarchical reasoning trace
 - `OPEN_QUESTIONS.adoc` — only the `[OPEN]` leaves, grouped by Ask role
 
-The five root questions decompose into a **fixed second level** — the same enumerated node set on every run, so Q-IDs are stable and trees from different runs can be diffed node-by-node. Free, code-driven decomposition applies only *below* the fixed level. The fixed nodes:
+The five root questions decompose into a **fixed second level** — the same enumerated node set on every run, so Q-IDs are stable and trees from different runs can be diffed node-by-node. Adaptive, code-driven decomposition applies only *below* the fixed level. The fixed nodes:
 
 - **Q1.1–Q1.6** — product identity, primary users, channels, why-built, success metrics, segment priority.
 - **Q2.1–Q2.6** — actors, use-case catalog, per-interface system specs, data/entity model, acceptance criteria, cross-cutting business rules. See [references/cockburn-use-cases.md](references/cockburn-use-cases.md).
@@ -88,6 +88,8 @@ The five root questions decompose into a **fixed second level** — the same enu
 
 Every fixed node is emitted even when its only leaf is `[OPEN]` or `[ANSWERED: not applicable]`.
 
+**Depth below the fixed level is adaptive, not fixed.** A node is a leaf only when its question can be answered with specific `file:line` evidence or definitively marked `[OPEN]`. If the honest answer would still be coarse — a whole directory as evidence, one paragraph for an entire arc42 chapter — the node decomposes further (under a four-level cap below each fixed node). Tree depth therefore tracks code density: a small bounded context yields a shallow tree, a large one a deep tree. The fixed skeleton stays diffable; only the depth varies. This prevents the thin-documentation failure where a large context produces one leaf per arc42 chapter and Phase 2 cannot synthesize a substantial chapter without inventing detail.
+
 Leaf classification rules and Q-ID scheme: [references/output-schema.md](references/output-schema.md).
 
 Worked examples — one `[ANSWERED]` and one `[OPEN]` leaf for each major branch: [references/examples.md](references/examples.md).
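
The adaptive-depth rule added in this file can be read as a small recursive procedure. The sketch below is one possible reading, not the skill's implementation: the skill is a prompt and the LLM performs the decomposition. The `Node` type, the `find_evidence` and `split` callbacks, the child ID scheme (`Q3.5.1`), the `FILE_LINE` pattern, and the choice to mark a node `[OPEN]` when the cap is reached are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List, Optional

MAX_LEVELS_BELOW_FIXED = 4  # the four-level cap below each fixed node
FILE_LINE = re.compile(r"[^\s:]+:\d+")  # assumed shape of `file:line` evidence


@dataclass
class Node:
    qid: str
    question: str
    status: Optional[str] = None    # "ANSWERED" or "OPEN" once the node is a leaf
    evidence: Optional[str] = None  # set only for ANSWERED leaves
    children: List["Node"] = field(default_factory=list)


def decompose(
    node: Node,
    find_evidence: Callable[[str], Optional[str]],
    split: Callable[[str], List[str]],
    level: int = 0,
) -> Node:
    """Make `node` a leaf or split it further; level 0 is the fixed node."""
    evidence = find_evidence(node.question)
    # Only specific file:line evidence closes a node; a bare directory does not.
    if evidence is not None and FILE_LINE.fullmatch(evidence):
        node.status, node.evidence = "ANSWERED", evidence
        return node

    # Past the cap, stop asking for sub-questions.
    sub_questions = split(node.question) if level < MAX_LEVELS_BELOW_FIXED else []
    if not sub_questions:
        # Assumption: nothing left to split (or cap reached), so a human must answer.
        node.status = "OPEN"
        return node

    for i, question in enumerate(sub_questions, start=1):
        child = Node(qid=f"{node.qid}.{i}", question=question)  # hypothetical IDs
        node.children.append(decompose(child, find_evidence, split, level + 1))
    return node


# Made-up inputs: the coarse question only has directory-level evidence,
# so it is split until each leaf points at a specific file:line.
evidence = {
    "building blocks?": "src/",
    "auth module?": "src/auth.py:10",
    "billing module?": "src/billing.py:22",
}
splits = {"building blocks?": ["auth module?", "billing module?"]}
tree = decompose(
    Node("Q3.5", "building blocks?"),
    evidence.get,
    lambda q: splits.get(q, []),
)
```

In this sketch a larger context simply means `split` keeps returning sub-questions for longer, which is how depth ends up tracking code density while the fixed node IDs stay unchanged.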
