[Process Proposal]: Formal Artifact Taxonomy — Anchor vs. Contract vs. Skill #585

@JensGrote

(German version below)

Motivation

While reviewing PR #582 (Use Cases rename) with @raifdmueller, we found ourselves reconstructing from first principles what a Semantic Anchor actually is — and how it differs from a Contract or a Skill. The key insight that emerged: an Anchor is not a definition of a concept. It's an activation signal for a pre-existing knowledge cluster in an LLM's training data. It focuses; it does not teach.

This distinction was never formally written down. The consequence is visible across multiple independent threads:

  • PR #582 (Refactored "Cockburn Use Cases" to "Use Cases"): A contributor renamed "Cockburn Use Cases" to "Use Cases". This is a perfectly reasonable move from a practitioner perspective, but it destroys the anchor's focusing property because the generic term activates multiple knowledge clusters simultaneously (UML, ISO, Cockburn). The reviewer had to explain activation theory ad hoc in a PR comment.
  • Issue #529 (Split the over-grown Architecture Documentation contract: vocabulary stays, procedure moves to a skill): The Architecture Documentation contract grew into a multi-page procedural document because there was no formal rule saying "a Contract pins vocabulary; a Skill carries procedure." The boundary had to be re-derived from scratch in the issue body.
  • Issue #580 (feat: Direct-answer blocks per anchor to improve AI/LLM citability): The project is invisible to AI search engines despite strong SERP performance. One contributing factor: without a clear, extractable statement of what each artifact type is, crawlers cannot categorize and cite the content correctly. A formal taxonomy provides exactly the direct-answer blocks that AI crawlers need.
  • The PR #582 review discussion (raifdmueller vs. simasch) crystallized the deeper principle: simasch argued from practitioner reality ("I don't distinguish Cockburn from Jacobson in daily work"), raifdmueller argued from LLM activation theory ("the names summon different concept fields"). Both are right in their respective frame, but the project needs to make explicit which frame it operates in.

Each time, the same taxonomy was implicitly reconstructed. It's time to write it down once.

Proposal

I've drafted an ADR that formally defines the three artifact types:

| Type | Function | Linguistic Shape | Loading |
|---|---|---|---|
| Anchor | "What do I know?" (focuses pre-existing knowledge) | Noun (e.g. "Cockburn Use Cases") | Passive (activation signal) |
| Contract | "What may I do?" (pins vocabulary + invariants) | Declarative ("follows/NEVER/MUST") | Always-on |
| Skill | "How do I do it?" (supplies procedure) | Verb/Imperative ("Authoring/Create") | On-demand |

The core discrimination test:

  • Knowledge already dense in the LLM → Anchor
  • Need to pin which anchors apply and set guardrails → Contract
  • Need to describe how to do it concretely → Skill

What this would enable

  1. A clear contributor guide: "Is my artifact an Anchor, a Contract, or a Skill?"
  2. A formal basis for the split proposed in Issue #529 (overscoped Contracts → Skill extraction)
  3. A reviewable criterion for anchor naming decisions (PR #582: does the name hit one cluster or many?)
  4. Extractable "What is X?" definitions for each artifact type, supporting the direct-answer blocks proposed in Issue #580

Full ADR Draft

ADR: Artifact Taxonomy — Anchor vs. Contract vs. Skill

Status: Proposed
Date: 2026-06-08
Author: Jens Grote
Related: PR #582, Issue #529, Issue #580


Introduction

The Semantic Anchors catalog ships three distinct artifact types — Anchors, Contracts, and Skills — but their boundaries are defined only implicitly through examples and convention. This ADR proposes an explicit taxonomy that answers:

  1. What is a Semantic Anchor, fundamentally? Not a definition of a concept, but an activation signal for pre-existing knowledge in an LLM's training data. It selects and focuses — it does not teach.
  2. How does it differ from a Contract and a Skill? Each artifact type has a distinct function, loading behavior, and linguistic shape.
  3. Why does this matter for search engines and LLM citability? If the project cannot clearly state what its own artifact types are in extractable, self-contained prose (Issue #580), neither humans nor AI crawlers can reliably categorize and cite them.

The core insight (from PR #582)

When you prompt a bare LLM "Write me a use case for feature X", it draws from at least three traditions simultaneously: UML/Jacobson (diagrams, «include»/«extend»), ISO 29148 (formal requirement structure), and Cockburn (Fully Dressed Template, Goal Levels). The result is an undifferentiated blend.

A Semantic Anchor like "Cockburn Use Cases" acts as a focusing lens: it tells the LLM "You already know this — use this specific knowledge cluster, not the others." It adds no new information; it selects.

This is fundamentally different from a Contract (which sets boundaries: "You MUST use this format, you MUST NOT do that") and from a Skill (which teaches a procedure the model doesn't inherently know how to execute).

The rename discussion in PR #582 demonstrated that weakening the anchor name from "Cockburn Use Cases" to the generic "Use Cases" destroys this focusing property — the generic term activates multiple clusters simultaneously, violating the Precise quality criterion. Issue #529 demonstrated that Contracts grow into Skills when procedural detail creeps in, because the boundary was never formally defined.


Decision

We define three artifact types with clearly distinct functions in the LLM interaction model:

Anchor — "What do I know?"

Canonical definition: A Semantic Anchor is a named activation signal for a dense, pre-computed knowledge cluster in an LLM's training data. It focuses existing knowledge without supplying new information.

Function: Selection lens. A WHERE-clause on the model's training knowledge.

Linguistic shape: Noun / Named concept with attribution.
Examples: "Cockburn Use Cases", "Clean Architecture (Martin)", "Nygard ADRs"

Quality criteria (from CONTRIBUTING.adoc):

  • Precise — Hits exactly one coherent knowledge cluster, not multiple
  • Rich — The hit cluster contains enough structured knowledge to be useful
  • Consistent — Activates the same cluster reliably across different prompts
  • Attributable — Traceable to a verifiable source

Key constraint: The named concept must already be dense in the LLM's training data. If asking "What do you associate with X?" returns thin or inconsistent results, the concept cannot function as an anchor — it needs a Skill to supply the missing knowledge.
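
The "What do you associate with X?" probe could be made repeatable along the following lines. This is only a sketch under stated assumptions: the responses are collected by hand or by some other tool (no model is called here), each response is a comma-separated list of terms, and the function names and both thresholds (`min_terms`, `min_overlap`) are hypothetical values chosen for illustration, not project rules.

```python
from itertools import combinations


def term_set(response: str) -> set[str]:
    """Split one comma-separated response into a set of lowercased terms."""
    return {t.strip().lower() for t in response.split(",") if t.strip()}


def mean_pairwise_jaccard(responses: list[str]) -> float:
    """Average Jaccard overlap between the term sets of every pair of responses."""
    sets = [term_set(r) for r in responses]
    pairs = list(combinations(sets, 2))
    if not pairs:
        return 0.0
    total = 0.0
    for a, b in pairs:
        union = a | b
        total += len(a & b) / len(union) if union else 0.0
    return total / len(pairs)


def looks_like_anchor(responses: list[str],
                      min_terms: int = 3,       # hypothetical "Rich" threshold
                      min_overlap: float = 0.5  # hypothetical "Consistent" threshold
                      ) -> bool:
    """True if every response is rich enough and the responses overlap enough."""
    rich = all(len(term_set(r)) >= min_terms for r in responses)
    consistent = mean_pairwise_jaccard(responses) >= min_overlap
    return rich and consistent


# Made-up sample responses, for illustration only.
dense = [
    "fully dressed template, goal levels, actor-goal list",
    "goal levels, fully dressed template, actor-goal list",
    "fully dressed template, goal levels, main success scenario",
]
thin = ["use cases", "slices, stories", "agile"]

print(looks_like_anchor(dense))  # True
print(looks_like_anchor(thin))   # False
```

Term overlap is a crude proxy for cluster density; it only shows that "thin or inconsistent" can be turned into a reviewable number.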

Anti-patterns:

  • A name that hits multiple clusters simultaneously ("Use Cases" → UML + ISO + Cockburn)
  • A concept not dense in training data ("Use-Case 2.0": published in 2011 but thin across all models)
  • A name so generic it triggers colloquial usage ("Good design", "Best practices")
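
The WHERE-clause picture and the first anti-pattern can be illustrated with a toy lookup. A real model is of course not a lookup table; the `CLUSTERS` mapping below is a hand-written assumption that only mirrors the three traditions named in this ADR, and `activated_clusters` and `is_precise` are hypothetical helper names.

```python
# Toy stand-in for "which knowledge clusters does a name select?".
# The mapping is hand-written for illustration, not measured from any model.
CLUSTERS: dict[str, set[str]] = {
    "UML/Jacobson": {"use cases", "jacobson use cases"},
    "ISO 29148": {"use cases", "iso 29148 use cases"},
    "Cockburn": {"use cases", "cockburn use cases"},
}


def activated_clusters(name: str) -> list[str]:
    """Return every cluster the name matches, like rows passing a WHERE filter."""
    key = name.strip().lower()
    return sorted(cluster for cluster, names in CLUSTERS.items() if key in names)


def is_precise(name: str) -> bool:
    """The Precise criterion: the name selects exactly one cluster."""
    return len(activated_clusters(name)) == 1


print(activated_clusters("Use Cases"))           # ['Cockburn', 'ISO 29148', 'UML/Jacobson']
print(activated_clusters("Cockburn Use Cases"))  # ['Cockburn']
print(is_precise("Good design"))                 # False (selects no cluster at all)
```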

Contract — "What may I do?"

Canonical definition: A Contract is terse, always-on shared vocabulary that pins which composition of anchors applies in a given project context and defines invariants (what MUST be done, what MUST NOT be done).

Function: Guardrails. Configuration of which lenses apply and how they combine.

Linguistic shape: Declarative statements.
Examples: "Documentation follows arc42", "Diagrams are C4 via PlantUML", "NEVER more than 10,000 rows without LIMIT"

Properties:

  • Terse (one paragraph, not pages)
  • Always-on (lives in CLAUDE.md / AGENTS.md / Kiro Steering)
  • Pins vocabulary and constraints, does NOT describe procedure

Key constraint: The moment a Contract describes how to do something (step-by-step instructions, scaffolding commands, template fill-in guides), it has exceeded its scope and the procedural parts belong in a Skill.

Anti-patterns:

  • A Contract containing scaffolding instructions ("run downloadTemplate…")
  • A Contract with cross-section traceability rules (procedural detail → Skill)
  • A Contract so long it cannot function as an always-on reminder
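
A reviewer could approximate the key constraint and the anti-patterns above with a small lint. The sketch below is an assumption-heavy illustration: the cue words in `PROCEDURAL`, the `max_lines` limit, and the function name `lint_contract` are all hypothetical and would need tuning against real Contracts.

```python
import re

# Lines that start with a numbered step, "Step N", or an imperative command verb
# are treated as procedural. The verb list is an illustrative assumption.
PROCEDURAL = re.compile(
    r"^\s*(?:\d+[.)]\s+|step\s+\d+\b|(?:run|execute|install|create|download)\b)",
    re.IGNORECASE,
)


def lint_contract(text: str, max_lines: int = 10) -> list[str]:
    """Return findings for a Contract that is too long or contains procedure."""
    findings: list[str] = []
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if len(lines) > max_lines:
        findings.append(f"too long: {len(lines)} non-empty lines (limit {max_lines})")
    for ln in lines:
        if PROCEDURAL.match(ln):
            findings.append(f"procedural: {ln.strip()}")
    return findings


ok = "Documentation follows arc42.\nDiagrams are C4 via PlantUML."
drifted = ok + "\nrun downloadTemplate…"

print(lint_contract(ok))       # []
print(lint_contract(drifted))  # ['procedural: run downloadTemplate…']
```

A finding does not mean the text is wrong, only that the flagged lines are candidates for extraction into a Skill.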

Skill — "How do I do it?"

Canonical definition: A Skill is on-demand procedural machinery that supplies new knowledge or step-by-step guidance the model does not (or not densely enough) carry in its training data.

Function: Recipe / How-To. The concrete how of applying anchors within contract boundaries.

Linguistic shape: Verb / Imperative.
Examples: "Use Case Authoring", "arc42 Documentation Authoring", "UML Diagram Authoring"

Properties:

  • On-demand (loaded when the task requires it, not always-on)
  • Carries references/, prompts, invocation workflow
  • Supplies procedural detail and/or meaning the model lacks
  • Can teach new concepts that are not dense in training data

Key constraint: A Skill is needed precisely when the anchor alone is insufficient — either because the procedure is project-specific, or because the concept is too thin in the model's training data to be activated by name alone.

Anti-patterns:

  • A Skill that only defines vocabulary without action guidance (that's a Contract)
  • A Skill that only names a well-known concept (that's an Anchor)

Discrimination Test

| Question | Answer → Type |
|---|---|
| Is the knowledge already dense in the LLM and only needs focusing? | Anchor |
| Do we need to pin which anchors apply and what's allowed/forbidden? | Contract |
| Do we need to describe how something is concretely done? | Skill |
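
The three questions can be read as a small decision function. This is a minimal sketch, assuming each question is answered yes or no by the contributor; the names `ArtifactType` and `classify` are hypothetical. Returning a list rather than a single type is a design choice made for this illustration: more than one entry suggests a mixed artifact that may need splitting, as happened in Issue #529.

```python
from enum import Enum


class ArtifactType(Enum):
    ANCHOR = "Anchor"
    CONTRACT = "Contract"
    SKILL = "Skill"


def classify(only_needs_focusing: bool,
             pins_anchors_or_rules: bool,
             describes_procedure: bool) -> list[ArtifactType]:
    """Map the three yes/no answers to the artifact types they point to."""
    result: list[ArtifactType] = []
    if only_needs_focusing:
        result.append(ArtifactType.ANCHOR)
    if pins_anchors_or_rules:
        result.append(ArtifactType.CONTRACT)
    if describes_procedure:
        result.append(ArtifactType.SKILL)
    return result


# "Cockburn Use Cases": dense knowledge that only needs focusing.
print(classify(True, False, False))
# A Contract that also carries scaffolding steps: two types, so consider a split.
print(classify(False, True, True))
```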

Word-form heuristic:

  • Sounds like a standalone technical term → Anchor
  • Sounds like "is/uses/follows/NEVER/ALWAYS" → Contract
  • Sounds like "create/verify/wire/write" → Skill
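
The word-form heuristic lends itself to a rough title check. The sketch below is not a classifier, just the heuristic written down: the cue lists extend the words quoted above with "are", "must", and "authoring" (assumptions taken from the examples in this ADR), checking Skill cues before Contract cues is an arbitrary choice, and `guess_type` is a hypothetical name.

```python
import re

# Cue words from the heuristic, plus a few assumed additions ("are", "must", "authoring").
CONTRACT_CUES = re.compile(r"\b(?:is|are|uses|follows|never|always|must)\b", re.IGNORECASE)
SKILL_CUES = re.compile(r"\b(?:create|verify|wire|write|authoring)\b", re.IGNORECASE)


def guess_type(title: str) -> str:
    """Guess the artifact type from the wording of its title or first sentence."""
    if SKILL_CUES.search(title):
        return "Skill"
    if CONTRACT_CUES.search(title):
        return "Contract"
    # No verb or declarative cue: treat it as a standalone technical term.
    return "Anchor"


for title in ("Cockburn Use Cases",
              "Documentation follows arc42",
              "Use Case Authoring"):
    print(f"{title} -> {guess_type(title)}")
```

Because the fallback is "Anchor", generic phrases such as "Good design" would also pass, so the heuristic complements the quality criteria rather than replacing them.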

Worked Examples

Use Cases

| Layer | Artifact | Content |
|---|---|---|
| Anchor | "Cockburn Use Cases" | Activates: Fully Dressed Template, Goal Levels, Actor-Goal List |
| Contract | "Specification" | "Use Cases in Cockburn Fully Dressed format. Always Actor-Goal List first. Never UML diagrams as substitute for textual use case." |
| Skill | "Use Case Authoring" | Step-by-step: analyze input → identify actors → derive goals → write main path → write extensions → format |

Architecture Documentation

| Layer | Artifact | Content |
|---|---|---|
| Anchors | "arc42", "C4 Model", "Nygard ADRs" | Activates respective knowledge clusters |
| Contract | "Architecture Documentation" | "Documentation follows arc42. Diagrams are C4 via PlantUML. Decisions are Nygard ADRs with 3-point Pugh matrix." |
| Skill | "arc42 Documentation Authoring" | Scaffolding, traceability rules, Chapter 11 structure, ADR↔risk wiring |

Why This Taxonomy Is Needed — Converging Evidence

This proposal isn't motivated by a single issue. Multiple independent threads have surfaced the same gap from different angles:

PR #582 (Use Cases rename) exposed that without a formal definition of what an Anchor is, contributors make well-intentioned changes that destroy the focusing property. The reviewer (raifdmueller) had to reconstruct the theory from first principles in a PR comment — that reasoning should live in a canonical document, not be buried in review threads.

Issue #529 (Architecture Documentation split) demonstrated that Contracts silently drift into Skills when nobody enforces the boundary. The issue's author had to define "A contract is shared vocabulary... A skill is on-demand machinery" ad hoc in the issue body — again, canonical knowledge that should exist once, not be re-derived.

Issue #580 (AI/LLM citability) found that the project itself is invisible to AI search engines despite strong SERP performance. One root cause: the project cannot state in extractable, self-contained prose what its own artifact types are. A formal taxonomy provides exactly the kind of direct-answer block that AI crawlers and LLM retrieval pipelines need to categorize and cite the project's content. If a search engine crawls an anchor page but cannot determine whether it's looking at a concept definition, a set of rules, or a how-to guide, it cannot surface it for the right queries.

The PR #582 review discussion between raifdmueller and simasch crystallized the deeper principle: simasch argued from practitioner reality ("I don't distinguish Cockburn from Jacobson in daily work"), raifdmueller argued from LLM activation theory ("the names summon different concept fields"). Both are right in their respective frame — but the project needed to make explicit which frame it operates in. This ADR does that.


Consequences

Positive

Negative

  • Existing artifacts must be audited against the taxonomy (migration effort)
  • Edge cases with partially dense concepts need case-by-case decisions

Open Questions

  • How to handle concepts that are partially dense in training data? (Anchor + supplementary Skill?)
  • Is the taxonomy stable across models or does it need model-specific density indicators?
  • Should the LLM Activation Test (CONTRIBUTING.adoc) explicitly measure cluster density?
  • Should the taxonomy itself become an Anchor (if LLMs learn it) or remain a Contract (project-internal rule)?

References

  • PR #582 — "Cockburn Use Cases" → "Use Cases" rename discussion
  • Issue #529 — "Split the over-grown Architecture Documentation contract"
  • Issue #580 — "Direct-answer blocks per anchor to improve AI/LLM citability"
  • PR #570 — Issue Title Naming Convention
  • CONTRIBUTING.adoc — Anchor Quality Criteria (Precise, Rich, Consistent, Attributable)

Questions for discussion

  • Does this three-way split match everyone's mental model, or are there artifacts that don't fit?
  • Should the taxonomy live in CONTRIBUTING.adoc, as a standalone ADR document, or both?
  • Is the word-form heuristic (Noun → Anchor, Declarative → Contract, Verb → Skill) too reductive, or a useful shorthand?
  • How should we handle partially dense concepts (e.g. Use-Case 2.0, which exists but is thin)?

German Version

Motivation

While reviewing PR #582 (Use Cases rename) with @raifdmueller, we reconstructed from first principles what a Semantic Anchor actually is, and how it differs from a Contract and a Skill. The key insight: an Anchor is not a concept label. It is an activation signal for an existing knowledge cluster in an LLM's training data. It focuses; it does not teach.

This distinction was never formally written down. The consequence is visible across several independent threads (PR #582, Issue #529, Issue #580).

Each time, the same taxonomy was implicitly reconstructed. It is time to write it down once.

Proposal

An ADR that formally defines the three artifact types:

  • Anchor = "What do I know?": a lens on existing knowledge (noun, activation signal)
  • Contract = "What may I do?": guardrails and invariants (declarative, always-on in CLAUDE.md / AGENTS.md / Kiro Steering)
  • Skill = "How do I do it?": procedural guidance (verb/imperative, on-demand)

The complete ADR draft is in the "Full ADR Draft" section above.

What this enables

  1. A clear contributor guide: "Is my artifact an Anchor, a Contract, or a Skill?"
  2. A formal basis for the split proposed in Issue #529 (overscoped Contracts → Skill extraction)
  3. A reviewable criterion for anchor naming (PR #582: does the name hit one cluster or several?)
  4. Extractable "What is X?" definitions for each type, supporting the direct-answer blocks proposed in Issue #580

Discussion questions

  • Does this three-way split match everyone's mental model, or are there artifacts that do not fit?
  • Should the taxonomy live in CONTRIBUTING.adoc, in a standalone ADR document, or in both?
  • Is the word-form heuristic (Noun → Anchor, Declarative → Contract, Verb → Skill) too simplistic, or a useful shorthand?
  • How should partially dense concepts be handled (e.g. Use-Case 2.0, which exists but is thin)?
