
Commit 35d5768 (parent: b8b0cca)

rdmueller and claude committed

Add landing page, move reports to reports/ subfolder

Landing page describes the experiment, links to all reports and prompts, shows key findings and the three-approach comparison table. Reports moved to src/docs/reports/ so docToolchain shows them in the menu. Menu: Reports | Prompts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

7 files changed: 139 additions, 6 deletions

docToolchainConfig.groovy (7 additions, 6 deletions)

@@ -3,11 +3,11 @@ outputPath = 'build'
 inputPath = 'src/docs'

 inputFiles = [
-    [file: 'report.adoc', formats: ['html']],
-    [file: 'experiment-1a-direct.adoc', formats: ['html']],
-    [file: 'experiment-1c-socratic.adoc', formats: ['html']],
-    [file: 'experiment-2-twophase.adoc', formats: ['html']],
-    [file: 'experiment-fair-comparison.adoc', formats: ['html']],
+    [file: 'reports/00-consolidated.adoc', formats: ['html']],
+    [file: 'reports/experiment-1a-direct.adoc', formats: ['html']],
+    [file: 'reports/experiment-1c-socratic.adoc', formats: ['html']],
+    [file: 'reports/experiment-2-twophase.adoc', formats: ['html']],
+    [file: 'reports/experiment-fair-comparison.adoc', formats: ['html']],
 ]

 imageDirs = ['images/.']
@@ -18,9 +18,10 @@ microsite.with {
     contextPath = '/'
     host = 'https://llm-coding.github.io/brownfield-experiment'
     title = 'Brownfield Experiment'
+    landingPage = 'landingpage.gsp'
     footerGithub = 'https://github.com/LLM-Coding/brownfield-experiment'
     footerText = '<small class="text-white">built with <a href="https://doctoolchain.org">docToolchain</a></small>'
     issueUrl = 'https://github.com/LLM-Coding/brownfield-experiment/issues/new'
     gitRepoUrl = 'https://github.com/LLM-Coding/brownfield-experiment/edit/main/src/docs'
-    menu = ['report':'Report', 'experiment-1a-direct':'Direct (1a)', 'experiment-1c-socratic':'Socratic (1c)', 'experiment-2-twophase':'Two-Phase', 'experiment-fair-comparison':'Fair Comparison']
+    menu = ['reports':'Reports', 'prompts':'Prompts']
 }

src/docs/landingpage.gsp (new file, 132 additions)

<div class="row flex-xl-nowrap">
<main class="col-12 col-md-12 col-xl-12 pl-md-12" role="main">
<!-- Hero Section -->
<div class="p-5 rounded" style="background: linear-gradient(135deg, #1a365d 0%, #2d5a87 100%); color: white; margin-bottom: 2rem;">
<h1 style="font-size: 2.5rem; font-weight: 700;">Socratic Code Theory Recovery</h1>
<p class="lead" style="font-size: 1.4rem; opacity: 0.95;">
Can an LLM reverse-engineer software documentation from code?
</p>
<p style="font-size: 1.1rem; opacity: 0.85; max-width: 700px;">
A controlled experiment measuring what LLMs can and cannot recover from source code alone. We deleted all documentation from a well-documented project, asked an LLM to reconstruct it, and compared the output against the originals.
</p>
<p style="margin-top: 1.5rem;">
<a href="reports/00-consolidated.html" class="btn btn-light btn-lg" style="font-weight: 600;">
Read the Report
</a>
<a href="https://github.com/LLM-Coding/brownfield-experiment" class="btn btn-outline-light btn-lg" style="margin-left: 0.5rem;">
GitHub
</a>
</p>
</div>

<!-- Key Results -->
<h2 style="text-align: center; margin-bottom: 2rem; color: #1a365d;">Key Findings</h2>

<div class="row row-cols-1 row-cols-md-2 mb-4">
<div class="col mb-4">
<div class="card h-100 shadow-sm border-0" style="border-left: 4px solid #16a34a !important;">
<div class="card-body">
<h4 class="card-title" style="color: #16a34a;">What the LLM recovers from code</h4>
<p class="card-text">
Functional requirements (21 vs 7 in the original), acceptance criteria (69 vs 40), building block views, glossary (31 vs 2 terms), security documentation. In some areas, the generated output was <strong>better</strong> than the original.
</p>
</div>
</div>
</div>
<div class="col mb-4">
<div class="card h-100 shadow-sm border-0" style="border-left: 4px solid #dc2626 !important;">
<div class="card-body">
<h4 class="card-title" style="color: #dc2626;">What the LLM cannot recover from code</h4>
<p class="card-text">
Business context (why, against whom), design rationale (why alternative A over B), quality goal <em>priorities</em>, stakeholder concerns, aspirational features, performance budgets. Code is the result of decisions, not the decision itself.
</p>
</div>
</div>
</div>
<div class="col mb-4">
<div class="card h-100 shadow-sm border-0" style="border-left: 4px solid #2563eb !important;">
<div class="card-body">
<h4 class="card-title" style="color: #2563eb;">11 questions close the gap</h4>
<p class="card-text">
The two-phase workflow identifies exactly what the team needs to provide. In our experiment, 11 targeted questions (routed by role) were sufficient to produce documentation matching the original's ADR topics, quality goals, and performance budgets.
</p>
</div>
</div>
</div>
<div class="col mb-4">
<div class="card h-100 shadow-sm border-0" style="border-left: 4px solid #9333ea !important;">
<div class="card-body">
<h4 class="card-title" style="color: #9333ea;">Semantic Anchors validated</h4>
<p class="card-text">
Terms like "arc42", "Cockburn", "Nygard ADR" serve as both <strong>prompt compression</strong> (69 lines produce 3,850 lines of correct output) and <strong>decomposition heuristics</strong> ("arc42" generates 12 MECE sub-questions automatically).
</p>
</div>
</div>
</div>
</div>

<!-- Three Approaches -->
<div class="bg-light p-4 rounded mb-4">
<h3 style="color: #1a365d; margin-bottom: 1.5rem;">Three Approaches Compared</h3>
<table class="table table-bordered">
<thead style="background-color: #1a365d; color: white;">
<tr><th>Approach</th><th>Score</th><th>Strength</th><th>Report</th></tr>
</thead>
<tbody>
<tr>
<td><strong>Direct</strong></td>
<td>17.5/30</td>
<td>Most detailed functional requirements, inline threat model</td>
<td><a href="reports/experiment-1a-direct.html">Detailed report</a></td>
</tr>
<tr>
<td><strong>Socratic</strong></td>
<td>18.5/30</td>
<td>Only version with correct quality goal priorities, most efficient (21% of original)</td>
<td><a href="reports/experiment-1c-socratic.html">Detailed report</a></td>
</tr>
<tr>
<td><strong>Two-Phase</strong></td>
<td>22/30</td>
<td>All 5 ADR topics correct, highest traceability (50 team-answer markers)</td>
<td><a href="reports/experiment-2-twophase.html">Detailed report</a></td>
</tr>
</tbody>
</table>
<p class="text-center">
<a href="reports/experiment-fair-comparison.html" class="btn btn-outline-primary">Fair Comparison (all with team answers)</a>
</p>
</div>

<!-- Prompts Section -->
<div class="mb-4 p-4">
<h3 style="color: #1a365d; margin-bottom: 1.5rem;">Reproduce the Experiment</h3>
<p>
All prompts are available in the <a href="https://github.com/LLM-Coding/brownfield-experiment/tree/main/src/docs/prompts">prompts/</a> directory. Use them on the <a href="https://github.com/docToolchain/Bausteinsicht">Bausteinsicht</a> repo (branch <code>brownfield</code>) or on your own project.
</p>
<table class="table">
<thead><tr><th>Prompt</th><th>Lines</th><th>Use when</th></tr></thead>
<tbody>
<tr><td><a href="https://github.com/LLM-Coding/brownfield-experiment/blob/main/src/docs/prompts/01-direct.md">01-direct.md</a></td><td>69</td><td>Quick documentation from code alone</td></tr>
<tr><td><a href="https://github.com/LLM-Coding/brownfield-experiment/blob/main/src/docs/prompts/02-socratic.md">02-socratic.md</a></td><td>97</td><td>Identifying knowledge gaps</td></tr>
<tr><td><a href="https://github.com/LLM-Coding/brownfield-experiment/blob/main/src/docs/prompts/03-twophase-p1.md">03-twophase-p1.md</a></td><td>51</td><td>Phase 1: Build Question Tree</td></tr>
<tr><td><a href="https://github.com/LLM-Coding/brownfield-experiment/blob/main/src/docs/prompts/04-twophase-p2.md">04-twophase-p2.md</a></td><td>61</td><td>Phase 2: Synthesize with team answers</td></tr>
<tr><td><a href="https://github.com/LLM-Coding/brownfield-experiment/blob/main/src/docs/prompts/05-reconcile.md">05-reconcile.md</a></td><td>82</td><td>Detect spec drift</td></tr>
</tbody>
</table>
</div>

<!-- Theoretical Foundation -->
<div class="text-center p-4 rounded" style="background-color: #f0f7ff;">
<h3 style="color: #1a365d;">Built on Theory</h3>
<p style="max-width: 700px; margin: 0 auto;">
Peter Naur argued that a program's "theory" &mdash; the mental model of how the problem maps to the solution &mdash; cannot be fully documented.
This experiment tests that claim: for LLM-generated code, the theory <em>can</em> be externalized in structured documentation. And for legacy code, a recursive question tree can recover most of it.
</p>
<p style="margin-top: 1rem;">
<a href="https://llm-coding.github.io/Semantic-Anchors/brownfield" class="btn btn-primary">Brownfield Workflow</a>
<a href="https://llm-coding.github.io/Semantic-Anchors/spec-driven-development" class="btn btn-outline-secondary">Spec-Driven Development</a>
</p>
</div>
</main>
</div>
5 files renamed without changes.
