
docs: add "Anchors and Training Data" article (from #582 discussion) #586

Merged
rdmueller merged 1 commit into LLM-Coding:main from raifdmueller:article/training-data-vs-practice
Jun 9, 2026

Conversation

@raifdmueller

@raifdmueller raifdmueller commented Jun 8, 2026

Contributor

What

A new website article — An Anchor Delivers Only as Far as the Prior Reaches — on how a semantic anchor's strength depends on how densely the concept sits in an LLM's training data. It grew directly out of the discussion in #582 about the Cockburn Use Cases anchor.

It includes a reproducible A–E experiment (prompts included, run it yourself) across Claude Haiku 4.5, Sonnet 4.6 and Opus 4.8, showing:

  • the anchor reshapes generic requirements into a full Cockburn use case — and secures that behaviour even on a weaker model (portability insurance when you can't pin the model);
  • an anchor delivers whenever its words name a concept the model holds densely (even "slices");
  • when the concept is absent ("Use-Case 3.0"), the model does not error — it silently substitutes the nearest concept it does hold.

Why it matters for the catalog

It draws a clean line between anchors (dense priors), contracts (vocabulary supplied in text, safe even when the prior is weak), and articles (meta-knowledge about a term). This is the reasoning behind keeping the Cockburn Use Cases anchor as-is rather than renaming or modernising it with Use-Case 2.0/3.0.

Changes

  • docs/training-data-vs-practice.adoc — the article
  • wired into render-docs, the router, and the main nav (EN)
  • linked from the Cockburn Use Cases anchor popup (EN + DE)
  • changelog entry

Method note

All model outputs were produced in a clean room (claude -p --setting-sources "" --strict-mcp-config from a neutral directory) so the project's own CLAUDE.md could not bias the results — the article documents this, including a contamination we caught and removed.
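The clean-room call can be scripted so that every prompt runs the same way. The following is a minimal sketch, assuming Node.js and a `claude` CLI on the PATH. The flags are the ones quoted in the method note; `CLEAN_ROOM_ARGS`, `runCleanRoom`, passing the prompt on stdin, and using the OS temp directory as the "neutral directory" are illustrative assumptions, not the script used for the article.

```javascript
const { spawnSync } = require('child_process')
const os = require('os')

// Flags copied from the method note. The empty string is the value
// passed to --setting-sources.
const CLEAN_ROOM_ARGS = ['-p', '--setting-sources', '', '--strict-mcp-config']

// Run one prompt from a neutral directory so that no project-level
// CLAUDE.md can be picked up. Assumptions: the CLI reads the prompt from
// stdin in this mode, and os.tmpdir() is an acceptable neutral directory.
function runCleanRoom(prompt) {
  const result = spawnSync('claude', CLEAN_ROOM_ARGS, {
    cwd: os.tmpdir(),
    input: prompt,
    encoding: 'utf8',
  })
  if (result.error) throw result.error
  return result.stdout
}
```

Keeping the argument list in one constant makes it easy to confirm that each run in the battery used identical settings.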


@simasch — this is the article I promised on #582. Your pull request is what prompted it and you're credited throughout. I'd value your review of how your point and your role are represented before this goes live.

@JensGrote — would value your eyes too.

🤖 Generated with Claude Code

Summary by CodeRabbit

Release Notes

  • New Features

    • Added new documentation "Anchors & Training Data" with thought experiments on semantic anchors and their influence on language-model output
    • Extended the navigation menu with direct access to the new documentation page (desktop and mobile versions)
    • Multilingual support for the new content (English and German)
  • Documentation

    • Cockburn Use Cases extended with further-reading references
    • Changelog updated with the new article documentation

…scussion)

A website article on how a semantic anchor's strength depends on how densely
the concept sits in an LLM's training data, with a reproducible A-E experiment
across Claude Haiku 4.5, Sonnet 4.6 and Opus 4.8. Wired into render-docs, the
router and the main nav (EN), and linked from the Cockburn Use Cases anchor
popup (EN + DE). Prompted by the discussion in LLM-Coding#582.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Jun 8, 2026

Contributor


Walkthrough

A new article, "Training Data vs Practice", is added to the documentation and made available through a new website route with full navigation and translation. The change covers the core article, cross-references in existing anchors, routing, header navigation, internationalization, and build integration.

Changes

Training Data vs Practice Article & Website Integration

| Layer / File(s) | Summary |
|---|---|
| Core Article: Training Data vs Practice (docs/training-data-vs-practice.adoc) | New documentation with five sections: introduction to the semantic-anchor concept, experiment on naming and model dependence, measuring the reach of the prior, catalog structure for the Anchor/Contract/Article layers, and reproduction instructions with prompts. |
| Documentation Cross-References & Changelog (docs/anchors/cockburn-use-cases.adoc, docs/anchors/cockburn-use-cases.de.adoc, docs/changelog.adoc) | "Further Reading" and "Weiterführend" sections in the Cockburn anchors link to the new page; the changelog records the article as a new publication with a description of the experiment. |
| Website Route Registration & Page Rendering (website/src/main.js, scripts/render-docs.js) | New route /training-data-vs-practice is registered and linked to renderTrainingDataPage; the page renderer loads the AsciiDoc file; the render script converts the file to HTML. |
| Navigation, Header Links & Internationalization (website/src/components/header.js, website/src/utils/router.js, website/src/translations/en.json, website/src/translations/de.json) | Desktop and mobile headers get new navigation links; the translation key nav.trainingData is defined for English ("Anchors & Training Data") and German ("Anker & Trainingsdaten"); a router title for the browser tab is added. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Possibly related PRs

  • LLM-Coding/Semantic-Anchors#193: Also modifies docs/changelog.adoc for anchor-related changelog entries; the changelog change overlaps directly with existing changelog updates.
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The PR title summarizes the main change precisely: a new article on 'Anchors and Training Data' is added, with a direct reference to the discussion in issue #582. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@website/src/main.js`:
- Around line 330-337: The language switch does not update the new doc route;
open the switch in handleLanguageChange() and add a case for the route
'/training-data-vs-practice' so that the page is reloaded on a language switch
(e.g. by calling renderTrainingDataPage() or by reloading the document via
loadDocContent('docs/training-data-vs-practice.adoc') with the matching
localized path); reference renderTrainingDataPage and loadDocContent in the
new case so that the article content is replaced correctly when switching
between EN and DE.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 03a2ef15-0d65-4034-ab90-72bf4e8a000e

📥 Commits

Reviewing files that changed from the base of the PR and between ecffd8d and d8e3fe9.

📒 Files selected for processing (10)
  • docs/anchors/cockburn-use-cases.adoc
  • docs/anchors/cockburn-use-cases.de.adoc
  • docs/changelog.adoc
  • docs/training-data-vs-practice.adoc
  • scripts/render-docs.js
  • website/src/components/header.js
  • website/src/main.js
  • website/src/translations/de.json
  • website/src/translations/en.json
  • website/src/utils/router.js

Comment thread website/src/main.js
Comment on lines +330 to +337
function renderTrainingDataPage() {
  const pageContent = document.getElementById('page-content')
  if (!pageContent) return

  pageContent.innerHTML = renderDocPage()
  updateActiveNavLink()
  loadDocContent('docs/training-data-vs-practice.adoc')
}


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

The language switch currently does not update this new doc route.

handleLanguageChange() has no reload path for /training-data-vs-practice. Result: when switching between EN and DE, the article content stays in the old language even though the rest of the UI is translated.

🔧 Suggestion
 function handleLanguageChange() {
   const currentRoute = getCurrentRouteSync()

   if (currentRoute === '/about') {
     loadDocContent('docs/about.adoc')
@@
   } else if (currentRoute === '/harness-inventory') {
     loadDocContent('docs/harness-inventory.adoc')
+  } else if (currentRoute === '/training-data-vs-practice') {
+    loadDocContent('docs/training-data-vs-practice.adoc')
   } else if (currentRoute === '/contracts') {
     renderContractsPageHandler()
   } else if (currentRoute === '/') {
     initCardGridVisualization()
   }
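One way to keep this class of omission from recurring is to drive the language-change reload from a single route table, so that a doc route cannot be registered in one place and forgotten in another. The following is a minimal sketch under that assumption: `DOC_ROUTES`, `docPathForRoute`, and `reloadDocOnLanguageChange` are hypothetical names that do not exist in the repository, and only the three routes and file paths visible in this diff are used.

```javascript
// Hypothetical single source of truth for doc-backed routes.
const DOC_ROUTES = {
  '/about': 'docs/about.adoc',
  '/harness-inventory': 'docs/harness-inventory.adoc',
  '/training-data-vs-practice': 'docs/training-data-vs-practice.adoc',
}

// Return the AsciiDoc path for a route, or null if the route is not doc-backed.
function docPathForRoute(route) {
  return Object.prototype.hasOwnProperty.call(DOC_ROUTES, route)
    ? DOC_ROUTES[route]
    : null
}

// Reload the current doc after a language change.
// loadDocContent is injected so the sketch stays self-contained.
// Returns true if a doc was reloaded, false for non-doc routes.
function reloadDocOnLanguageChange(currentRoute, loadDocContent) {
  const docPath = docPathForRoute(currentRoute)
  if (docPath === null) return false
  loadDocContent(docPath)
  return true
}
```

Routes with custom handlers, such as /contracts and /, would still need their own branches; the table only covers the routes that simply reload an AsciiDoc file.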

Copilot AI left a comment

Contributor


Pull request overview

This PR adds a new long-form documentation article (“An Anchor Delivers Only as Far as the Prior Reaches”) and wires it into the website so it’s routable, visible in navigation, linked from the Cockburn Use Cases anchor, and recorded in the changelog.

Changes:

  • Add new AsciiDoc article docs/training-data-vs-practice.adoc and render it into the website docs build.
  • Register a new SPA route (/training-data-vs-practice) and expose it in desktop/mobile navigation (EN/DE).
  • Link to the new article from the Cockburn Use Cases anchor (EN/DE) and add a changelog entry.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 4 comments.

| File | Description |
|---|---|
| website/src/utils/router.js | Adds route title mapping for the new article route. |
| website/src/translations/en.json | Adds EN nav label for the new page. |
| website/src/translations/de.json | Adds DE nav label for the new page. |
| website/src/main.js | Registers the new route and adds a page renderer that loads the pre-rendered doc. |
| website/src/components/header.js | Adds the new page link to desktop "More" menu and mobile menu. |
| scripts/render-docs.js | Ensures the new AsciiDoc file is pre-rendered to HTML at build time. |
| docs/training-data-vs-practice.adoc | New article content (with experiment + prompts). |
| docs/changelog.adoc | Adds a changelog entry for the new article. |
| docs/anchors/cockburn-use-cases.adoc | Adds "Further Reading" link to the new article. |
| docs/anchors/cockburn-use-cases.de.adoc | Adds "Weiterführend" link to the new article. |


Comment on lines 13 to 16
"nav.agentskill": "AgentSkill",
"nav.harnessInventory": "Harness Inventory",
"nav.trainingData": "Anchors & Training Data",
"nav.workflow": "Spec-Driven Dev",
Comment on lines 13 to 16
"nav.agentskill": "AgentSkill",
"nav.harnessInventory": "Harness-Inventar",
"nav.trainingData": "Anker & Trainingsdaten",
"nav.workflow": "Spec-Driven Dev",
Comment on lines +10 to +12
====
*The short version.* A semantic anchor works by triggering a concept the model already learned. Its power is therefore proportional to how _densely_ that concept appears in the training data. We tested this directly: naming "Cockburn use cases" reshapes a generic answer into a full fully-dressed use case (the anchor delivers), while naming "Use-Case 3.0" delivers nothing distinct — the model silently falls back to the nearest concept it does know. That is why an anchor's popup describes the _triggered_ definition, not the state of the art, and why weak-prior terms belong in a *contract* (which supplies its own meaning), not an anchor.
====
Comment on lines 24 to 26
'/harness-inventory': 'The Harness Inventory — Semantic Anchors',
'/training-data-vs-practice': 'Anchors and Training Data — Semantic Anchors',
'/evaluations': 'Evaluations — Semantic Anchors',
@rdmueller rdmueller merged commit 957a67a into LLM-Coding:main Jun 9, 2026
8 checks passed
@JensGrote

JensGrote commented Jun 9, 2026

Collaborator

Review Text (GitHub-ready)


Cross-model validation: GPT-5, GPT-5-mini, and Gemini 2.5 Flash confirm and extend the findings

(German version below)

I ran the same A–E experiment against OpenAI GPT-5, GPT-5-mini (via OpenAI API), and Google Gemini 2.5 Flash (via AI Studio), all with empty context — no system prompt, no custom instructions, fresh session per prompt.

The results confirm the article's thesis across three model families and surface one notable divergence worth discussing.


Finding 1: The Cockburn anchor delivers universally (A → C transition)

All five models tested (Claude Opus/Haiku, GPT-5/5-mini, Gemini) show the same structural shift:

| Framing | GPT-5 | GPT-5-mini | Gemini 2.5 Flash |
|---|---|---|---|
| A (no anchor) | Operational checklist (legal/compliance framing) | Practical checklist (implementation-ready) | Dual-track timeline (legal + logistics) |
| C (Cockburn) | Full fully-dressed: Scope, Level, Primary Actor, 9 Stakeholders & Interests, Preconditions, Minimal Guarantees, Success Guarantees, Main Success Scenario | Full fully-dressed: Goal in Context, Scope, Level, Stakeholders, Preconditions, Minimal Guarantee | Full fully-dressed: Level (Sea-level), Stakeholders & Interests, Preconditions, Success Guarantees |

Without the anchor: requirements lists, checklists, compliance guides — no use-case structure whatsoever.
With the anchor: immediate activation of the full Cockburn apparatus.

This is the portability insurance the article describes. Even GPT-5-mini, a weaker model, delivers the full Cockburn structure when the anchor is named. The anchor pins the behavior across model families and model sizes.
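The A → C shift can be checked mechanically instead of by eye. The following is a rough sketch, assuming a case-insensitive substring match is good enough for a first pass. The section names come from the GPT-5 column of the table above; `cockburnSectionsFound`, `looksFullyDressed`, and the threshold of five sections are illustrative choices, not part of the experiment, and the two sample outputs are made up for the example.

```javascript
// Section headings of a fully-dressed use case, as listed for GPT-5 above.
const FULLY_DRESSED_SECTIONS = [
  'Scope',
  'Level',
  'Primary Actor',
  'Stakeholders',
  'Preconditions',
  'Minimal Guarantee',
  'Success Guarantee',
  'Main Success Scenario',
]

// Return the section names that occur anywhere in the model output.
// Caveat: short names such as "Level" can also match ordinary prose.
function cockburnSectionsFound(output) {
  const text = output.toLowerCase()
  return FULLY_DRESSED_SECTIONS.filter((name) => text.includes(name.toLowerCase()))
}

// Heuristic: treat the output as fully dressed if enough sections appear.
function looksFullyDressed(output, minSections = 5) {
  return cockburnSectionsFound(output).length >= minSections
}

// Made-up sample outputs, for illustration only.
const checklistStyle = '1. Check the deadline. 2. Notify the customer. 3. Archive the record.'
const useCaseStyle = [
  'Scope: Web shop',
  'Level: User goal',
  'Primary Actor: Customer',
  'Stakeholders & Interests: ...',
  'Preconditions: ...',
  'Main Success Scenario: 1. ...',
].join('\n')

console.log(looksFullyDressed(checklistStyle)) // prints false
console.log(looksFullyDressed(useCaseStyle)) // prints true
```

A check like this only detects structure, not quality, so it complements reading the outputs and does not replace it.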


Finding 2: "Use-Case 3.0" — three distinct failure modes across model families

This is where it gets interesting. The article documents Claude's behavior (Opus hedges transparently, Haiku substitutes silently). The other models show different failure patterns for the same thin prior:

| Model | P5: "What is Use-Case 3.0?" | E: Apply it to Place Order |
|---|---|---|
| Claude Opus 4.8 | Flags uncertainty, falls back to 2.0 | Says "not aware of 3.0," delivers 2.0 slices |
| Claude Haiku 4.5 | Thin, barely knows it | Labels output "Use-Case 3.0" but body is plain Cockburn: silent substitution |
| GPT-5 | 10 specific principles, detailed and confident | Delivers slices structurally identical to D (2.0), with added "Reference flow steps" |
| GPT-5-mini | Explicitly asks "which Use-Case 3.0 do you mean?" and refuses to guess | Delivers slices, calls them "per Jacobson's Use-Case 3.0" |
| Gemini 2.5 Flash | 10 detailed principles with attribution to "Jacobson and Cockburn" | Opens with "In Use-Case 2.0 and 3.0", merges both labels, delivers 2.0 content |

Three failure modes for a thin prior:

  1. Transparent hedge (Claude Opus, GPT-5-mini): Model admits uncertainty
  2. Silent substitution (Claude Haiku, Gemini): Model delivers confidently but the content is actually 2.0 or Cockburn under a 3.0 label
  3. Confabulation (GPT-5, Gemini P5): Model invents plausible-sounding principles that may not correspond to any published 3.0 source

Finding 3: The GPT-5 divergence — dense prior or confabulation?

GPT-5 is the outlier. When asked "What is Use-Case 3.0?", it confidently returns 10 specific principles including "Single shared model," "Example- and test-driven," "Scalable and fractal." Gemini does the same (10 principles, attributed to "Jacobson and Cockburn").

This raises the question your article explicitly frames: Is the prior genuinely denser in GPT-5/Gemini (trained on the 2024 IJI eBook?), or are these models confabulating plausible principles from the 2.0 prior?

A verification against the actual Jacobson/Spence/de Mendonca 2024 source would settle this. If the 10 principles match the published text, GPT-5 has a real 3.0 prior and the anchor could work for that model. If they don't match, it's sophisticated confabulation — which is arguably the most dangerous failure mode because it's the hardest to detect.

Either way, this validates the article's core claim: you cannot know whether an anchor delivers until you test it per model. The same term activates real knowledge in one model and fabrication in another.


Finding 4: Gemini shows an additional behavior — visual output

For framing D (Use-Case 2.0 slices), Gemini spontaneously generated an ASCII sequence diagram and a Jacobson-attributed image. It also offered interactive follow-up ("Would you like to see a detailed step-by-step sequence flow for Slice 1?"). This suggests Gemini's prior for Use-Case 2.0 is extremely dense — richer than Claude's in terms of modalities activated.


Finding 5: Attribution is universally correct

All models (P3) correctly identify Jacobson as inventor (OOPSLA 1986/87, OOSE 1992) and Cockburn as the writing-craft codifier (2001). Not a single model misattributes. This confirms what the article states: the misattribution lives in casual human shorthand, not in the models.


Implications for the article

  1. The three-layer model (Anchor / Contract / Article) holds across all tested model families.
  2. The "silent substitution" failure mode the article describes for Haiku also appears in Gemini — it's not a Claude-specific behavior but a general property of weaker or differently-trained models.
  3. Consider adding a note about GPT-5/Gemini P5 results: These models appear to have a denser 3.0 prior than Claude. If verified against the source material, this would demonstrate that anchor viability can change over time as new models are trained on newer data. That's a forward-looking point worth making: today's thin prior may become tomorrow's valid anchor.
  4. Ralf's experiment is reproducible across model families; running the published prompts independently confirmed this.

Methodology note

  • GPT-5 / GPT-5-mini: OpenAI API, no system message, single-shot per prompt.
  • Gemini 2.5 Flash: Google AI Studio, no system instructions, new chat per prompt.
  • All prompts are identical to those published in the article's "Run It Yourself" section.
  • GPT-5 results stored in anchor-activation-test-20260609/gpt-5/ and gpt-5-mini/; Gemini results in anchor-activation-test-20260609/gemini/.
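The isolation rule in this note (no system message, one fresh request per prompt) can be written down as a small runner. This is a sketch only: `runBattery` and the `askModel` callback are hypothetical, the callback stands in for whichever API client is used, and this is not the script that produced the stored results.

```javascript
// Run each prompt as its own single-shot request.
// No history is carried over between prompts and no system message is added:
// every call receives exactly one user message.
async function runBattery(prompts, askModel) {
  const results = {}
  for (const [label, prompt] of Object.entries(prompts)) {
    const messages = [{ role: 'user', content: prompt }] // built fresh each time
    results[label] = await askModel(messages)
  }
  return results
}
```

Because the message array is rebuilt for every prompt, an answer to one framing cannot leak into the context of the next.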

German version (translated to English)

Cross-model validation: GPT-5, GPT-5-mini and Gemini 2.5 Flash confirm and extend the results

I ran the same A–E experiment against OpenAI GPT-5, GPT-5-mini (via the OpenAI API) and Google Gemini 2.5 Flash (via AI Studio), all with empty context: no system prompt, a new session per prompt.

Key results:

  1. The Cockburn anchor delivers universally. All five models tested show the same structural shift from A (checklist) to C (full fully-dressed). The portability insurance holds across models.

  2. "Use-Case 3.0" shows three distinct failure modes:

    • Transparent hedging (Claude Opus, GPT-5-mini)
    • Silent substitution (Claude Haiku, Gemini)
    • Confabulation (GPT-5, Gemini P5): invents plausible-sounding principles
  3. GPT-5 divergence: GPT-5 returns 10 specific 3.0 principles. Either the prior really is denser there (newer training data including the 2024 eBook), or it is sophisticated confabulation. Verification against the original source would settle this.

  4. Gemini shows additional modalities: for Use-Case 2.0 it spontaneously generates visual output (diagrams) and offers interactive follow-up, which points to an extremely dense prior.

  5. Attribution is universally correct: all models name Jacobson as the inventor and Cockburn as the codifier of the writing craft.

Implication for the article: the three-layer distinction (Anchor / Contract / Article) holds across all tested model families. Silent substitution is not Claude-specific behavior but a general property of a thin prior. A note on GPT-5's possibly denser 3.0 prior would be a valuable outlook: anchor viability can change over time.

@raifdmueller

Copy link
Copy Markdown
Contributor Author

@JensGrote this is excellent — thank you. Reproducing the whole A–E battery against GPT-5, GPT-5-mini and Gemini 2.5 Flash is exactly the cross-family validation the article was missing (it only tested Claude tiers), and your three-failure-mode breakdown is sharper than what's written today.

Two things stand out:

  • Silent substitution is not Claude-specific. Gemini doing it too is an important correction to how the article frames that mode.
  • Confabulation (GPT-5 / Gemini confidently inventing "10 principles") is the most valuable addition — it is the hardest failure to detect.

On your open question (genuine 3.0 prior vs confabulation): I checked the source. The published Use-Case Foundation (2024) lists nine principles, with quite different wording — "use cases apply to systems of all types", "stakeholder involvement is essential", "tells the whole story", "trigger conversations", "prioritize readability" (IJI guide). GPT-5's "10 principles" ("Single shared model", "Example- and test-driven", "Scalable and fractal") do not match that set — different count, different names. So it leans confabulation, not a denser prior — which makes your GPT-5 result the strongest example of the dangerous mode, not a counterexample. A line-by-line check against the IJI PDF would settle it definitively.
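The comparison above can be made mechanical. The following is a minimal sketch, assuming exact matching after lower-casing and stripping punctuation. It uses only the principle names quoted in this thread (five of the nine published ones and three of GPT-5's ten), so an empty overlap here is suggestive and does not replace the line-by-line check against the source. `normalize` and `overlap` are illustrative helpers.

```javascript
// Principle names exactly as quoted in this thread (partial lists).
const PUBLISHED_QUOTED = [
  'use cases apply to systems of all types',
  'stakeholder involvement is essential',
  'tells the whole story',
  'trigger conversations',
  'prioritize readability',
]
const GPT5_QUOTED = ['Single shared model', 'Example- and test-driven', 'Scalable and fractal']

// Lower-case and collapse punctuation so trivial formatting differences
// do not hide a genuine match.
function normalize(name) {
  return name.toLowerCase().replace(/[^a-z0-9]+/g, ' ').trim()
}

// Return the claimed principles that also appear in the published list.
function overlap(claimed, published) {
  const known = new Set(published.map(normalize))
  return claimed.filter((name) => known.has(normalize(name)))
}

console.log(overlap(GPT5_QUOTED, PUBLISHED_QUOTED)) // prints []
```

Exact matching is deliberately strict; a paraphrased principle would not be caught, which is another reason the manual check remains the deciding step.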

This deserves to be in the article — in your words and with your data. Would you open a PR adding a "Cross-model validation" section? You have the raw outputs (anchor-activation-test-20260609/…); a section covering the universal Cockburn delivery, the three failure modes, and your forward-looking point (anchor viability can shift as models are retrained on newer data) would be a great addition. Happy to review and help wire it into the page (route/nav are already set up). The credit is yours.

rdmueller added a commit that referenced this pull request Jun 9, 2026
- main.js: reload the article on language switch (handleLanguageChange was
  missing a case for /training-data-vs-practice, unlike every other doc route)
- router.js: align the browser/tab title with the article's actual title
- article: drop the "full fully-dressed" redundancy

Addresses CodeRabbit + Copilot review feedback on #586.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
JensGrote pushed a commit to JensGrote/Semantic-Anchors that referenced this pull request Jun 10, 2026
Reproduces the A–E anchor activation battery against GPT-5,
GPT-5-mini and Gemini 2.5 Flash. Confirms the mechanism is
model-family-independent and documents a third failure mode
(confabulation) not visible in the Claude-only test.

Addresses: LLM-Coding#586 (comment)

4 participants