[Process Proposal]: Formal Artifact Taxonomy — Anchor vs. Contract vs. Skill

# [Process Proposal]: Formal Artifact Taxonomy — Anchor vs. Contract vs. Skill

## Issue Text (GitHub-ready)

---

*(German version below)*

### Motivation

While reviewing PR #582 (Use Cases rename) with @raifdmueller, we found ourselves reconstructing from first principles what a Semantic Anchor actually *is* — and how it differs from a Contract or a Skill. The key insight that emerged: an Anchor is not a definition of a concept. It's an **activation signal for a pre-existing knowledge cluster** in an LLM's training data. It focuses; it does not teach.

This distinction was never formally written down. The consequence is visible across multiple independent threads:

- **PR #582**: A contributor renamed "Cockburn Use Cases" to "Use Cases" — a perfectly reasonable move from a practitioner perspective, but one that destroys the anchor's focusing property because the generic term activates multiple knowledge clusters simultaneously (UML, ISO, Cockburn). The reviewer had to explain activation theory ad hoc in a PR comment.
- **#529**: The Architecture Documentation contract grew into a multi-page procedural document because there was no formal rule saying "a Contract pins vocabulary; a Skill carries procedure." The boundary had to be re-derived from scratch in the issue body.
- **#580**: The project is invisible to AI search engines despite strong SERP performance. One contributing factor: without a clear, extractable statement of what each artifact type *is*, crawlers cannot categorize and cite the content correctly. A formal taxonomy provides exactly the direct-answer blocks that AI crawlers need.
- **The PR #582 review discussion** (raifdmueller vs. simasch) crystallized the deeper principle: simasch argued from practitioner reality ("I don't distinguish Cockburn from Jacobson in daily work"), raifdmueller argued from LLM activation theory ("the names summon different concept fields"). Both are right in their respective frame — but the project needs to make explicit *which* frame it operates in.

Each time, the same taxonomy was implicitly reconstructed. It's time to write it down once.

### Proposal

I've drafted an ADR that formally defines the three artifact types:

| Type | Function | Linguistic Shape | Loading |
|------|----------|-----------------|---------|
| **Anchor** | "What do I know?" — Focuses pre-existing knowledge | Noun (e.g. "Cockburn Use Cases") | Passive (activation signal) |
| **Contract** | "What may I do?" — Pins vocabulary + invariants | Declarative ("follows/NEVER/MUST") | Always-on |
| **Skill** | "How do I do it?" — Supplies procedure | Verb/Imperative ("Authoring/Create") | On-demand |

The core discrimination test:
- Knowledge already dense in the LLM → **Anchor**
- Need to pin which anchors apply and set guardrails → **Contract**
- Need to describe *how* to do it concretely → **Skill**

### What this would enable

1. A clear contributor guide: "Is my artifact an Anchor, a Contract, or a Skill?"
2. A formal basis for the split proposed in #529 (overscoped Contracts → Skill extraction)
3. A reviewable criterion for anchor naming decisions (PR #582: does the name hit one cluster or many?)
4. Extractable "What is X?" definitions for each artifact type, supporting #580's direct-answer blocks

### Full ADR Draft

<details>
<summary>Click to expand the complete ADR</summary>

# ADR: Artifact Taxonomy — Anchor vs. Contract vs. Skill

**Status:** Proposed
**Date:** 2026-06-08
**Author:** Jens Grote
**Related:** [PR #582](https://github.com/LLM-Coding/Semantic-Anchors/pull/582), [Issue #529](https://github.com/LLM-Coding/Semantic-Anchors/issues/529), [Issue #580](https://github.com/LLM-Coding/Semantic-Anchors/issues/580)

---

## Introduction

The Semantic Anchors catalog ships three distinct artifact types — Anchors, Contracts, and Skills — but their boundaries are defined only implicitly through examples and convention. This ADR proposes an explicit taxonomy that answers:

1. **What is a Semantic Anchor, fundamentally?** Not a definition of a concept, but an *activation signal* for pre-existing knowledge in an LLM's training data. It selects and focuses — it does not teach.
2. **How does it differ from a Contract and a Skill?** Each artifact type has a distinct function, loading behavior, and linguistic shape.
3. **Why does this matter for search engines and LLM citability?** If the project cannot clearly state what its own artifact types *are* in extractable, self-contained prose (#580), neither humans nor AI crawlers can reliably categorize and cite them.

### The core insight (from PR #582)

When you prompt a bare LLM "Write me a use case for feature X", it draws from at least three traditions simultaneously: UML/Jacobson (diagrams, «include»/«extend»), ISO 29148 (formal requirement structure), and Cockburn (Fully Dressed Template, Goal Levels). The result is an undifferentiated blend.

A Semantic Anchor like "Cockburn Use Cases" acts as a **focusing lens**: it tells the LLM "You already know this — use *this specific* knowledge cluster, not the others." It adds no new information; it selects.

This is fundamentally different from a Contract (which sets boundaries: "You MUST use this format, you MUST NOT do that") and from a Skill (which teaches a procedure the model doesn't inherently know how to execute).

The rename discussion in PR #582 demonstrated that weakening the anchor name from "Cockburn Use Cases" to the generic "Use Cases" destroys this focusing property — the generic term activates multiple clusters simultaneously, violating the *Precise* quality criterion. Issue #529 demonstrated that Contracts grow into Skills when procedural detail creeps in, because the boundary was never formally defined.

---

## Decision

We define three artifact types with clearly distinct functions in the LLM interaction model:

### Anchor — "What do I know?"

**Canonical definition:** A Semantic Anchor is a named activation signal for a dense, pre-existing knowledge cluster in an LLM's training data. It focuses existing knowledge without supplying new information.

**Function:** Selection lens. A WHERE-clause on the model's training knowledge.

**Linguistic shape:** Noun / Named concept with attribution.
Examples: "Cockburn Use Cases", "Clean Architecture (Martin)", "Nygard ADRs"

**Quality criteria** (from CONTRIBUTING.adoc):
- **Precise** — Hits exactly one coherent knowledge cluster, not multiple
- **Rich** — The hit cluster contains enough structured knowledge to be useful
- **Consistent** — Activates the same cluster reliably across different prompts
- **Attributable** — Traceable to a verifiable source

**Key constraint:** The named concept must already be *dense* in the LLM's training data. If asking "What do you associate with X?" returns thin or inconsistent results, the concept cannot function as an anchor — it needs a Skill to supply the missing knowledge.
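
The density probe described above could be approximated roughly as follows. This is only a sketch under stated assumptions: the responses are collected by hand (no model API is called here), `key_terms`, `consistency_score` and `looks_dense` are hypothetical helper names, and the thresholds are placeholders rather than calibrated values.

```python
import re
from itertools import combinations


def key_terms(response: str) -> set[str]:
    """Lower-cased words of four or more letters, a crude stand-in for 'concepts'."""
    return set(re.findall(r"[a-z]{4,}", response.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    """Overlap of two term sets: 1.0 means identical, 0.0 means disjoint."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def consistency_score(responses: list[str]) -> float:
    """Mean pairwise Jaccard overlap of key terms across all responses."""
    term_sets = [key_terms(r) for r in responses]
    pairs = list(combinations(term_sets, 2))
    if not pairs:
        raise ValueError("need at least two responses to compare")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


def looks_dense(responses: list[str], min_terms: int = 5,
                min_consistency: float = 0.5) -> bool:
    """Assumed rule of thumb: no response is thin and the responses agree."""
    thin = min(len(key_terms(r)) for r in responses) < min_terms
    return not thin and consistency_score(responses) >= min_consistency
```

Feeding it several answers to "What do you associate with X?" gives a first, very rough signal. A low score suggests the concept needs a Skill rather than an Anchor, but a human review of the answers should still decide.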

**Anti-patterns:**
- A name that hits multiple clusters simultaneously ("Use Cases" → UML + ISO + Cockburn)
- A concept not dense in training data ("Use-Case 2.0": published in 2011 but thin across all models)
- A name so generic it triggers colloquial usage ("Good design", "Best practices")

### Contract — "What may I do?"

**Canonical definition:** A Contract is terse, always-on shared vocabulary that pins which composition of anchors applies in a given project context and defines invariants (what MUST be done, what MUST NOT be done).

**Function:** Guardrails. Configuration of which lenses apply and how they combine.

**Linguistic shape:** Declarative statements.
Examples: "Documentation follows arc42", "Diagrams are C4 via PlantUML", "NEVER more than 10,000 rows without LIMIT"

**Properties:**
- Terse (one paragraph, not pages)
- Always-on (lives in CLAUDE.md / AGENTS.md / Kiro Steering)
- Pins vocabulary and constraints, does NOT describe procedure

**Key constraint:** The moment a Contract describes *how* to do something (step-by-step instructions, scaffolding commands, template fill-in guides), it has exceeded its scope and the procedural parts belong in a Skill.
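
A reviewer could automate a first pass over this constraint. The sketch below is an assumption about what such a check might look like, not an existing tool: `lint_contract`, the regular expressions, and the `MAX_WORDS` budget are all hypothetical, and the word limit is only a guess at what "one paragraph, not pages" means in practice.

```python
import re

# Hypothetical markers of procedural wording inside a Contract.
PROCEDURAL_PATTERNS = [
    re.compile(r"^\s*\d+[.)]\s"),                               # numbered step
    re.compile(r"\bstep[- ]by[- ]step\b", re.IGNORECASE),       # explicit how-to wording
    re.compile(r"\b(run|execute|install)\s+`", re.IGNORECASE),  # command invocation
]

# Assumed budget for a terse, always-on text.
MAX_WORDS = 120


def lint_contract(text: str) -> list[str]:
    """Return human-readable findings; an empty list means nothing was flagged."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        if any(pattern.search(line) for pattern in PROCEDURAL_PATTERNS):
            findings.append(
                f"line {number}: procedural wording, consider moving it to a Skill"
            )
    if len(text.split()) > MAX_WORDS:
        findings.append(
            f"contract exceeds {MAX_WORDS} words, too long for an always-on reminder"
        )
    return findings
```

The patterns will produce false positives and misses, so the output is a prompt for human review rather than a verdict.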

**Anti-patterns:**
- A Contract containing scaffolding instructions ("run `downloadTemplate`…")
- A Contract with cross-section traceability rules (procedural detail → Skill)
- A Contract so long it cannot function as an always-on reminder

### Skill — "How do I do it?"

**Canonical definition:** A Skill is on-demand procedural machinery that supplies new knowledge or step-by-step guidance that the model does not carry in its training data, or does not carry densely enough.

**Function:** Recipe / How-To. The concrete *how* of applying anchors within contract boundaries.

**Linguistic shape:** Verb / Imperative.
Examples: "Use Case Authoring", "arc42 Documentation Authoring", "UML Diagram Authoring"

**Properties:**
- On-demand (loaded when the task requires it, not always-on)
- Carries `references/`, prompts, invocation workflow
- Supplies procedural detail and/or meaning the model lacks
- Can teach new concepts that are not dense in training data

**Key constraint:** A Skill is needed precisely when the anchor alone is insufficient — either because the procedure is project-specific, or because the concept is too thin in the model's training data to be activated by name alone.

**Anti-patterns:**
- A Skill that only defines vocabulary without action guidance (that's a Contract)
- A Skill that only names a well-known concept (that's an Anchor)

---

## Discrimination Test

| Question | Answer → Type |
|----------|---------------|
| Is the knowledge already dense in the LLM and only needs focusing? | → **Anchor** |
| Do we need to pin which anchors apply and what's allowed/forbidden? | → **Contract** |
| Do we need to describe *how* something is concretely done? | → **Skill** |
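
The three questions can be written down as a small decision function. This is a sketch, not part of the proposal itself. The table does not define an order of precedence, so the order below is an assumption derived from the key constraints: procedure always points to a Skill, and a concept that is not dense falls back to a Skill as well. `ArtifactType` and `discriminate` are hypothetical names.

```python
from enum import Enum


class ArtifactType(Enum):
    ANCHOR = "Anchor"
    CONTRACT = "Contract"
    SKILL = "Skill"


def discriminate(*, describes_procedure: bool,
                 pins_anchors_or_rules: bool,
                 knowledge_is_dense: bool) -> ArtifactType:
    """Map the three discrimination questions to an artifact type."""
    if describes_procedure:
        # Any "how" content belongs in a Skill, even if it started as a Contract.
        return ArtifactType.SKILL
    if pins_anchors_or_rules:
        return ArtifactType.CONTRACT
    if knowledge_is_dense:
        return ArtifactType.ANCHOR
    # Concept too thin to be activated by name alone: a Skill must supply it.
    return ArtifactType.SKILL
```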

**Word-form heuristic:**
- Sounds like a standalone technical term → Anchor
- Sounds like "is/uses/follows/NEVER/ALWAYS" → Contract
- Sounds like "create/verify/wire/write" → Skill
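
As a rough illustration, the word-form heuristic can be expressed as a keyword check. The marker lists are assumptions: they combine the words named above with "are", "must" and "authoring", which appear in the examples elsewhere in this ADR. `guess_type` is a hypothetical helper and deliberately naive.

```python
import re

# Declarative markers suggest a Contract.
CONTRACT_MARKERS = re.compile(
    r"\b(is|are|uses|follows|never|always|must)\b", re.IGNORECASE
)
# Action words suggest a Skill.
SKILL_MARKERS = re.compile(
    r"\b(create|verify|wire|write|authoring)\b", re.IGNORECASE
)


def guess_type(phrase: str) -> str:
    """First guess at the artifact type from the wording of a title or statement."""
    if CONTRACT_MARKERS.search(phrase):
        return "Contract"
    if SKILL_MARKERS.search(phrase):
        return "Skill"
    # No declarative or action marker: treat it as a standalone term.
    return "Anchor"
```

Contract markers are checked first, so a phrase containing both kinds of words is reported as a Contract. That ordering is a design choice of this sketch, and a reviewer should treat the result as a hint only.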

---

## Worked Examples

### Use Cases

| Layer | Artifact | Content |
|-------|----------|---------|
| Anchor | "Cockburn Use Cases" | Activates: Fully Dressed Template, Goal Levels, Actor-Goal List |
| Contract | "Specification" | "Use Cases in Cockburn Fully Dressed format. Always Actor-Goal List first. Never UML diagrams as substitute for textual use case." |
| Skill | "Use Case Authoring" | Step-by-step: analyze input → identify actors → derive goals → write main path → write extensions → format |

### Architecture Documentation

| Layer | Artifact | Content |
|-------|----------|---------|
| Anchors | "arc42", "C4 Model", "Nygard ADRs" | Activates respective knowledge clusters |
| Contract | "Architecture Documentation" | "Documentation follows arc42. Diagrams are C4 via PlantUML. Decisions are Nygard ADRs with 3-point Pugh matrix." |
| Skill | "arc42 Documentation Authoring" | Scaffolding, traceability rules, Chapter 11 structure, ADR↔risk wiring |
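
The different loading behaviors in these examples can be sketched in code. Everything here is illustrative and assumed: the `Contract` and `Skill` classes, the keyword-based `triggers`, and `build_context` do not describe how any particular agent tool loads its files. Anchors do not appear as separate objects because they are only names inside the contract text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contract:
    name: str
    text: str  # terse and always-on; mentions anchors by name


@dataclass(frozen=True)
class Skill:
    name: str
    triggers: tuple[str, ...]  # hypothetical keywords that make the skill relevant
    body: str                  # procedural detail, loaded on demand


def build_context(task: str, contracts: list[Contract], skills: list[Skill]) -> str:
    """Assemble the prompt context: every contract, plus only the matching skills."""
    parts = [contract.text for contract in contracts]  # always-on
    task_lower = task.lower()
    for skill in skills:
        if any(trigger in task_lower for trigger in skill.triggers):
            parts.append(skill.body)  # on-demand
    parts.append(task)
    return "\n\n".join(parts)
```

With the Use Cases example, the "Specification" contract would be present for every task, while the "Use Case Authoring" skill would only be added when the task mentions use cases.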

---

## Why This Taxonomy Is Needed — Converging Evidence

This proposal isn't motivated by a single issue. Multiple independent threads have surfaced the same gap from different angles:

**PR #582 (Use Cases rename)** exposed that without a formal definition of what an Anchor *is*, contributors make well-intentioned changes that destroy the focusing property. The reviewer (raifdmueller) had to reconstruct the theory from first principles in a PR comment — that reasoning should live in a canonical document, not be buried in review threads.

**Issue #529 (Architecture Documentation split)** demonstrated that Contracts silently drift into Skills when nobody enforces the boundary. The issue's author had to define "A contract is shared vocabulary... A skill is on-demand machinery" ad hoc in the issue body — again, canonical knowledge that should exist once, not be re-derived.

**Issue #580 (AI/LLM citability)** found that the project itself is invisible to AI search engines despite strong SERP performance. One root cause: the project cannot state in extractable, self-contained prose what its own artifact types *are*. A formal taxonomy provides exactly the kind of direct-answer block that AI crawlers and LLM retrieval pipelines need to categorize and cite the project's content. If a search engine crawls an anchor page but cannot determine whether it's looking at a concept definition, a set of rules, or a how-to guide, it cannot surface it for the right queries.

**The PR #582 review discussion between raifdmueller and simasch** crystallized the deeper principle: simasch argued from practitioner reality ("I don't distinguish Cockburn from Jacobson in daily work"), raifdmueller argued from LLM activation theory ("the names summon different concept fields"). Both are right in their respective frame — but the project needed to make explicit *which* frame it operates in. This ADR does that.

---

## Consequences

### Positive
- Clear assignment for contributors: what belongs where?
- Overscoped Contracts become recognizable and splittable (#529)
- Anchor renames become evaluable against the *Precise* quality criterion (#582)
- Consistent naming: nouns for Anchors, verbs for Skills
- Enables extractable "What is X?" blocks for AI citability (#580)

### Negative
- Existing artifacts must be audited against the taxonomy (migration effort)
- Edge cases with partially dense concepts need case-by-case decisions

### Open Questions
- How to handle concepts that are *partially* dense in training data? (Anchor + supplementary Skill?)
- Is the taxonomy stable across models or does it need model-specific density indicators?
- Should the LLM Activation Test (CONTRIBUTING.adoc) explicitly measure cluster density?
- Should the taxonomy itself become an Anchor (if LLMs learn it) or remain a Contract (project-internal rule)?

---

## References

- [PR #582](https://github.com/LLM-Coding/Semantic-Anchors/pull/582) — "Cockburn Use Cases" → "Use Cases" rename discussion
- [Issue #529](https://github.com/LLM-Coding/Semantic-Anchors/issues/529) — "Split the over-grown Architecture Documentation contract"
- [Issue #580](https://github.com/LLM-Coding/Semantic-Anchors/issues/580) — "Direct-answer blocks per anchor to improve AI/LLM citability"
- [PR #570](https://github.com/LLM-Coding/Semantic-Anchors/pull/570) — Issue Title Naming Convention
- CONTRIBUTING.adoc — Anchor Quality Criteria (Precise, Rich, Consistent, Attributable)

</details>

### Questions for discussion

- Does this three-way split match everyone's mental model, or are there artifacts that don't fit?
- Should the taxonomy live in CONTRIBUTING.adoc, as a standalone ADR document, or both?
- Is the word-form heuristic (Noun → Anchor, Declarative → Contract, Verb → Skill) too reductive, or a useful shorthand?
- How should we handle partially dense concepts (e.g. Use-Case 2.0, which exists but is thin in training data)?

---

### German Version (translated)

#### Motivation

While reviewing PR #582 (Use Cases rename) with @raifdmueller, we reconstructed from first principles what a Semantic Anchor actually *is*, and how it differs from a Contract and a Skill. The key insight: an Anchor is not a concept label. It is an **activation signal for an existing knowledge cluster** in an LLM's training data. It focuses; it does not teach.

This distinction was never formally written down. The consequence shows up in several independent threads:

- **PR #582**: A contributor renamed "Cockburn Use Cases" to "Use Cases". That is reasonable from a practitioner's point of view, but it destroys the anchor's focusing property because the generic term activates several knowledge clusters at once (UML, ISO, Cockburn).
- **#529**: The Architecture Documentation contract grew into a multi-page procedural document because no formal rule said "a Contract pins vocabulary; a Skill carries procedure."
- **#580**: The project is invisible to AI search engines. One reason: without clear, extractable statements of what each artifact type *is*, crawlers cannot correctly categorize and cite the content.
- **The discussion in PR #582** (raifdmueller vs. simasch) crystallized the deeper principle: simasch argued from practice ("I don't distinguish between Cockburn and Jacobson"), raifdmueller from LLM activation theory ("the names summon different concept fields"). Both are right within their frame of reference, but the project has to make explicit which frame it operates in.

Each time, the same taxonomy was implicitly reconstructed. It is time to write it down once.

#### Proposal

An ADR that formally defines the three artifact types:

- **Anchor** = "What do I know?": a lens on existing knowledge (noun, activation signal)
- **Contract** = "What may I do?": guardrails and invariants (declarative, always-on in CLAUDE.md / AGENTS.md / Kiro Steering)
- **Skill** = "How do I do it?": procedural guidance (verb/imperative, on-demand)

The full ADR draft is in the details section above.

#### What this enables

1. A clear contributor guide: "Is my artifact an Anchor, a Contract, or a Skill?"
2. A formal basis for the split proposed in #529 (overscoped Contracts → Skill extraction)
3. A reviewable criterion for anchor naming (PR #582: does the name hit one cluster or several?)
4. Extractable "What is X?" definitions for each type, supporting #580's direct-answer blocks

#### Questions for discussion

- Does this three-way split match the mental model of everyone involved, or are there artifacts that do not fit?
- Should the taxonomy live in CONTRIBUTING.adoc, in a standalone ADR document, or in both?
- Is the word-form heuristic (Noun → Anchor, Declarative → Contract, Verb → Skill) too simplistic, or a useful shorthand?
- How should partially dense concepts be handled (e.g. Use-Case 2.0, which exists but is thin)?


[Process Proposal]: Formal Artifact Taxonomy — Anchor vs. Contract vs. Skill #585
