
feat(skill): add anchor-prior-test skill for vetting anchor candidates #587

Merged: rdmueller merged 1 commit into LLM-Coding:main from raifdmueller:feat/anchor-prior-test-skill on Jun 9, 2026

@raifdmueller raifdmueller commented Jun 9, 2026

What

Adds a new skill anchor-prior-test — a clean-room procedure that empirically tests whether a candidate term is a strong semantic anchor before it is proposed, instead of guessing.

It operationalises the manual litmus test in CONTRIBUTING.adoc ("ask the LLM what it associates") into a rigorous, multi-model method and emits a structured verdict.

How it works

  1. Frame the candidate (precise name + ambiguous bare form + rivals).
  2. Clean room: run a fresh `claude -p --setting-sources "" --strict-mcp-config` from a neutral directory, so that the project's own CLAUDE.md and memory cannot bias the result. (Sub-agents are contaminated; the skill explains why.)
  3. Probe battery — recognition, anchor-as-instruction, attribution, and name-ambiguity, across weak/mid/strong model tiers, ≥2 runs for decisive probes.
  4. Score — map to the four criteria (Precise, Rich, Consistent, Attributable) + prior density + a tier (★★★/★★☆/★☆☆).
  5. Emit — verdict + route (anchor / contract / reject) + ready-to-paste proposal or rejection.
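
Steps 2 and 3 can be sketched roughly as follows. This is an illustration under stated assumptions, not the skill's implementation: only `claude -p --setting-sources "" --strict-mcp-config` comes from the description above. The function names, the use of `--model` with the aliases `haiku`/`sonnet`/`opus`, the temporary directory standing in for the neutral directory, and the timeout are all assumptions.

```python
import subprocess
import tempfile

# Assumption: weak/mid/strong map to these model aliases.
TIERS = {"weak": "haiku", "mid": "sonnet", "strong": "opus"}


def build_clean_room_command(prompt: str, model: str) -> list[str]:
    """Build the argv for one clean-room probe (hypothetical helper)."""
    return [
        "claude",
        "-p",                     # non-interactive print mode
        "--setting-sources", "",  # load no settings sources
        "--strict-mcp-config",    # ignore MCP servers configured elsewhere
        "--model", model,         # assumption: tier selected via --model
        prompt,
    ]


def run_probe(prompt: str, model: str, runs: int = 2) -> list[str]:
    """Run one probe `runs` times, each from a fresh, empty directory."""
    outputs = []
    for _ in range(runs):
        # A new temporary directory holds no CLAUDE.md or project memory.
        with tempfile.TemporaryDirectory() as neutral_dir:
            result = subprocess.run(
                build_clean_room_command(prompt, model),
                cwd=neutral_dir,
                capture_output=True,
                text=True,
                check=True,
                timeout=300,
            )
        outputs.append(result.stdout.strip())
    return outputs
```

A call such as `run_probe("What do you associate with 'Morphological Box'?", TIERS["weak"])` would then return one raw answer per run for later scoring; the prompt wording here is only a placeholder, since the real prompts live in `references/probe-battery.md`.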

Why

  • A good but niche method (e.g. "Use-Case 3.0") can be a weak anchor: a model silently substitutes the nearest concept rather than admit the gap. This skill measures prior density, which is what actually matters, rather than the merit of the method.
  • A weak prior is routed to a contract (supplies its own meaning) rather than forced into an anchor.
  • Relates to #329 (EPIC: Semantic Anchor Evaluations across LLMs); this skill is a concrete tool for it.
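
The routing idea in the second bullet can be pictured with a small sketch. The real rules are defined in `references/scoring.md` and are not reproduced in this PR description, so everything below is hypothetical: the two boolean fields, the `route` function, and in particular the condition that leads to a rejection are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    ANCHOR = "anchor"
    CONTRACT = "contract"
    REJECT = "reject"


@dataclass
class ProbeSummary:
    # Both fields are hypothetical simplifications of the scoring output.
    prior_is_dense: bool        # models recognise and apply the term unaided
    meaning_is_definable: bool  # a contract could spell the meaning out


def route(summary: ProbeSummary) -> Route:
    """Pick a route: a dense prior becomes an anchor, a weak one a contract."""
    if summary.prior_is_dense:
        return Route.ANCHOR
    if summary.meaning_is_definable:
        # Weak prior: supply the meaning explicitly instead of forcing an anchor.
        return Route.CONTRACT
    # Assumption: nothing usable remains, so the candidate is rejected.
    return Route.REJECT
```

Under these assumptions, a niche method with a clear definition but a weak prior would come out as `Route.CONTRACT` rather than `Route.ANCHOR`.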

Contents

  • skill/anchor-prior-test/SKILL.md + references/ (clean-room, probe-battery, scoring, worked-example)
  • references/worked-example.md documents a full run on the Morphological Box candidate (Haiku/Sonnet/Opus)
  • plugin synced to plugins/semantic-anchors/skills/, version bumped 0.2.0 → 0.3.0

Companion to the article in #586 (the why); this skill is the how.

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Documentation

    • Added extensive documentation on validating semantic anchors. It covers detailed test procedures, evaluation criteria, requirements for isolated test environments, and practical examples and guides for carrying out these validations.
  • Chores

    • Updated the version number.

A clean-room procedure that empirically tests whether a term is a strong
semantic anchor - probing recognition, anchor-action, attribution and name
ambiguity across model tiers via `claude -p --setting-sources ""` - and emits
a verdict against the four criteria, a tier rating, and a ready-to-paste
proposal or rejection. Operationalises the CONTRIBUTING litmus test; relates
to LLM-Coding#329. Includes the Morphological Box run as a worked example.

Bumps the semantic-anchors plugin to 0.3.0.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

coderabbitai Bot commented Jun 9, 2026


Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 2cf8c894-ddbf-4017-80a0-15c23b8c4a53

📥 Commits

Reviewing files that changed from the base of the PR and between 957a67a and a05251f.

📒 Files selected for processing (11)
  • plugins/semantic-anchors/plugin.json
  • plugins/semantic-anchors/skills/anchor-prior-test/SKILL.md
  • plugins/semantic-anchors/skills/anchor-prior-test/references/clean-room.md
  • plugins/semantic-anchors/skills/anchor-prior-test/references/probe-battery.md
  • plugins/semantic-anchors/skills/anchor-prior-test/references/scoring.md
  • plugins/semantic-anchors/skills/anchor-prior-test/references/worked-example.md
  • skill/anchor-prior-test/SKILL.md
  • skill/anchor-prior-test/references/clean-room.md
  • skill/anchor-prior-test/references/probe-battery.md
  • skill/anchor-prior-test/references/scoring.md
  • skill/anchor-prior-test/references/worked-example.md

Walkthrough

The pull request adds complete documentation for the new "Anchor Prior Test" skill, which empirically vets semantic anchor candidates, and bumps the plugin version to 0.3.0. The documentation is provided in parallel in two directories (plugin and root) with identical content.

Changes

Anchor Prior Test Skill Release

| Layer / File(s) | Summary |
| --- | --- |
| Plugin versioning (`plugins/semantic-anchors/plugin.json`) | Plugin version bumped from 0.2.0 to 0.3.0. |
| Skill definition and core principles (`plugins/semantic-anchors/skills/anchor-prior-test/SKILL.md`, `skill/anchor-prior-test/SKILL.md`) | The Anchor Prior Test documentation defines the empirical procedure for classifying semantic anchors by prior density rather than merit. It documents a structured five-step flow (candidate framing, clean-room execution, probe battery, scoring, verdict output) and sets the critical constraints (no sub-agents, recall-vs-execution test, bare-term ambiguity, n>1 requirement). |
| Clean-room execution methodology (`plugins/semantic-anchors/skills/anchor-prior-test/references/clean-room.md`, `skill/anchor-prior-test/references/clean-room.md`) | The clean-room procedure documents context isolation via specific CLI parameters (`--setting-sources ""`, a /tmp working directory, `--strict-mcp-config`), validation steps, the model tier scheme (haiku/sonnet/opus), and fallback behavior. Sub-agents are excluded for context-freezing reasons. |
| Probe battery specification (`plugins/semantic-anchors/skills/anchor-prior-test/references/probe-battery.md`, `skill/anchor-prior-test/references/probe-battery.md`) | Defines a four-probe battery (P1–P4) plus an optional substitution probe, with exact prompt templates and a structured SELF: output format. Specifies cross-tier execution instructions and the procedure for aggregating the results matrix. |
| Scoring and routing rules (`plugins/semantic-anchors/skills/anchor-prior-test/references/scoring.md`, `skill/anchor-prior-test/references/scoring.md`) | Four quality criteria (Precise, Rich, Consistent, Attributable) are mapped to probe responses; "prior density" is introduced as the decisive fifth criterion. Defines the tier rating scale and the routing algorithm (anchor activation / contract / rejection), and sets "honesty rules" for citation and tier availability. |
| Worked example as reference (`plugins/semantic-anchors/skills/anchor-prior-test/references/worked-example.md`, `skill/anchor-prior-test/references/worked-example.md`) | Complete example with "Morphological Box (Zwicky Box)" demonstrating end-to-end execution: results tables (Haiku/Sonnet/Opus), verdict criteria, tier rating, and a "Lesson captured" summary. |
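
The n>1 requirement and the results matrix mentioned in the summary can be illustrated with a short sketch. The actual aggregation procedure is specified in `references/probe-battery.md` and is not shown here, so the data shape (a dict keyed by tier and probe), the outcome labels, and the function name are assumptions chosen for illustration.

```python
from collections import Counter


def aggregate_matrix(runs, decisive=frozenset()):
    """Collapse per-run outcomes into one cell per (tier, probe).

    runs: dict mapping (tier, probe_id) -> list of outcome labels, one per run.
    decisive: probe ids that must have been run at least twice.
    """
    matrix = {}
    for (tier, probe), outcomes in runs.items():
        required = 2 if probe in decisive else 1
        if len(outcomes) < required:
            raise ValueError(
                f"{tier}/{probe}: need {required} run(s), got {len(outcomes)}"
            )
        label, count = Counter(outcomes).most_common(1)[0]
        matrix[(tier, probe)] = {
            "majority": label,
            # A cell is consistent only if every run gave the same outcome.
            "consistent": count == len(outcomes),
        }
    return matrix
```

A cell whose runs disagree is flagged as inconsistent, which in this sketch would feed the "Consistent" criterion; how the skill itself weighs such disagreement is not stated in this PR.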

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Suggested reviewers

  • JensGrote
@rdmueller rdmueller merged commit 5793584 into LLM-Coding:main Jun 9, 2026
6 of 7 checks passed