moonrunnerkc
diff --git a/.gitignore b/.gitignore
Lines changed: 10 additions & 0 deletions
diff --git a/CHANGELOG.md b/CHANGELOG.md
Lines changed: 28 additions & 1 deletion
diff --git a/README.md b/README.md
Lines changed: 29 additions & 8 deletions
diff --git a/RELEASE_NOTES_v1.0.1.md b/RELEASE_NOTES_v1.0.1.md
Lines changed: 50 additions & 0 deletions
@@ -11,3 +11,13 @@ CLAUDE.md
 .copilot-instructions.md
 # Field test artifacts (kept locally, not in git)
 runs/
+
+# Launch post drafts (kept locally, never push)
+launch-post/
+
+# Verification venvs (created by end-to-end verification runs)
+.venv-verify/
+
+# Agent scratchpads (kept locally; never push)
+.codex
+review.md
@@ -8,7 +8,33 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ## [Unreleased]
 
 ### Added
-- Release prep artifacts for v1.0.0: `RELEASE_NOTES_v1.0.0.md`, `LAUNCH_POST_v1.0.md`, `LAUNCH_CHECKLIST.md`.
+- `scripts/summarize_batch.py` and `tests/test_batch15_summarize.py`: maintainer-facing tool that consumes a directory of skillcheck batch-run artifacts (one directory per repo, one subdirectory per skill, paired `*.json` / `*.txt` reports per phase) and writes `summary.csv` plus `findings.md`. Invoked as `python scripts/summarize_batch.py <batch_dir>`. Not exposed as a console script, not wired into the GitHub Action; the action runs skillcheck against one path, this consumes outputs across many. Documented under Maintainer Notes in the README.
+- `tests/test_readme_test_count_claim.py`: parses the README's "N tests cover ..." sentence and asserts it matches `pytest --collect-only`. The next time the suite grows without bumping the README number, CI fails. Closes the recurring drift pattern that v1.0.1 had to correct twice.
+
+### Changed
+- README test count bumped from 663 to 664 to include the new drift-guard test.
+
+## [1.0.1] - 2026-04-28
+
+End-to-end verification against `anthropics/skills` surfaced documentation drift in the published v1.0.0 README and a batch of post-tag implementation work that had not been committed. v1.0.1 commits that work, ships the docs corrections, and adds guide-parity flags. Behavior change: warning-only runs now return exit code 2 (was 0).
+
+### Changed
+- Warning-only CLI reports now return exit code 2. Exit code 1 remains errors; exit code 3 remains semantic drift. README Exit Codes table row 0 updated to "no errors and no warnings".
+- README test count corrected from 653 to 663.
+- README JSON-stability promise updated from "0.x series" to "v1.x series".
+- README field-test numbers reframed as April 2026 snapshots against `anthropics/skills`, with a note that they will drift as upstream evolves.
+- `action.yml` `format` input description clarified: accepted but ignored at runtime; the action always invokes skillcheck with `--format json`.
+- Development extras now include `ruff>=0.6`, `mypy>=1.10`, and `types-PyYAML>=6.0`.
+
+### Added
+- `--semantic`: guide-compatible shortcut that enables semantic-adjacent validation. In standalone mode it runs heuristic graph analysis; with ingested agent responses it merges those diagnostics.
+- `--agent-reason`: guide-compatible agent-workflow shortcut. Emits a combined critique and graph prompt packet so the calling agent can run both reasoning steps and feed JSON back through `--ingest-critique` and `--ingest-graph`.
+- `--format md` and `--format agent`: Markdown report output and agent-oriented next-action output.
+- `skillcheck.toml` config loading: top-level defaults for format, thresholds, target agent, strict VS Code mode, skip flags, ignored rule prefixes, graph analysis, semantic mode, history, and agent variants.
+- Experimental `--activation-hypotheses`: generates likely natural-language routing triggers plus a discoverability entropy score. Routing caveat included in every report.
+- Machine-readable diagnostic metadata: JSON diagnostics now include `source` and `confidence` fields.
+- GitHub Action inputs for the v1.0 modes: `semantic`, `analyze-graph`, `ingest-critique`, `critique-agent`, `ingest-graph`, `graph-agent`, `history`, `activation-hypotheses`. The action still always emits JSON internally for PR annotations.
+- `tests/test_v1_completion.py`: covers `--format md`, `--format agent`, `--agent-reason`, `--semantic` graph enabling, `--activation-hypotheses` JSON, `skillcheck.toml` loading, and source/confidence in JSON output.
 
 ## [1.0.0] - 2026-04-25
 
@@ -17,6 +43,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - Added `docs/case-study-v1-real-world-runs.md`: full breakdown of the pre-3B field test covering 18 Anthropic skills (symbolic), `mcp-builder` through the full v1.0 pipeline (symbolic + heuristic graph + agent critique + agent graph), and 5 uxuiprinciples skills (strict VS Code mode). Documents three `semantic.contradiction.detected` errors on a skill that passed all symbolic checks, five `graph.capability.orphaned` patterns, and the recurring unknown-field pattern (`license`, `homepage`, `env`) across official catalogs.
 
 ### Added
+- Release prep artifacts: `RELEASE_NOTES_v1.0.0.md`, `LAUNCH_POST_v1.0.md`, `LAUNCH_CHECKLIST.md`.
 - `skills/skillcheck/SKILL.md`: skillcheck's own SKILL.md, validating the tool against itself. Passes symbolic, graph, critique, and history validation with zero errors and zero warnings. Serves as the worked example for the Rules table in the README.
 - Self-host integration test suite (`tests/test_self_host.py`): confirms the bundled SKILL.md passes symbolic validation, all five graph analyzers, critique ingestion, agent graph ingestion with divergence analysis, full CLI pipeline, history round-trip, and description scoring threshold.
 - `scripts/regen_self_host_fixtures.py`: regenerates `tests/fixtures/self_host/graph_clean.json` from the live heuristic graph after skill edits.
 
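Reviewer sketch, not part of the patch: the drift guard described in the CHANGELOG entry for `tests/test_readme_test_count_claim.py` could work roughly as follows. The function names and the regular expression are assumptions for illustration. The real test is not shown in this diff, and how it obtains the collected count from `pytest --collect-only` is left out here.

```python
import re


def parse_claimed_test_count(readme_text: str) -> int:
    """Return N from the README's "N tests cover ..." sentence.

    Hypothetical helper: the actual test's parsing logic is not shown in the diff.
    """
    match = re.search(r"\b(\d+) tests cover\b", readme_text)
    if match is None:
        raise ValueError("README has no 'N tests cover ...' sentence")
    return int(match.group(1))


def count_matches(readme_text: str, collected: int) -> bool:
    """True when the README claim equals the number of collected tests."""
    return parse_claimed_test_count(readme_text) == collected
```

A test built on this would fail as soon as the suite grows without the README number changing in the same commit, which is the behavior the CHANGELOG entry describes.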
@@ -49,6 +49,12 @@ skillcheck path/to/SKILL.md --analyze-graph
 skillcheck path/to/SKILL.md --emit-critique-prompt > prompt.txt
 # Run prompt.txt through your agent. Agent returns JSON. Then:
 skillcheck path/to/SKILL.md --ingest-critique response.json
+
+# Agent shortcut: emit critique and graph prompts in one packet
+skillcheck path/to/SKILL.md --agent-reason --format agent
+
+# Experimental activation estimates
+skillcheck path/to/SKILL.md --activation-hypotheses --format json
 ```
 
 ## Modes
@@ -63,7 +69,7 @@ skillcheck skills/            # recursive scan; finds every file named SKILL.md
 skillcheck SKILL.md --format json
 ```
 
-From the field test on Anthropic's official skills repository (18 skills, `runs/anthropics-corpus/01-symbolic-all.txt`): four of eighteen files failed. `claude-api/SKILL.md` failed with `frontmatter.name.reserved-word` because the name contains the reserved word "claude". `template/SKILL.md` failed with `frontmatter.name.directory-mismatch` (name `template-skill`, directory `template`). Both files look correct on casual inspection.
+From the field test on Anthropic's official skills repository (18 skills, `runs/anthropics-corpus/01-symbolic-all.txt`, snapshot taken during v1.0 release prep in April 2026): four of eighteen files failed. `claude-api/SKILL.md` failed with `frontmatter.name.reserved-word` because the name contains the reserved word "claude". `template/SKILL.md` failed with `frontmatter.name.directory-mismatch` (name `template-skill`, directory `template`). Both files look correct on casual inspection.
 
 ### Heuristic Graph
 
@@ -86,7 +92,7 @@ From the field test on `mcp-builder/SKILL.md` (`runs/anthropics-mcp-builder/02-g
                         has no declared inputs or outputs.
 ```
 
-Thirteen of fourteen capability headings in that skill had no declared I/O. That is a signal the skill relies entirely on implicit context rather than declared contracts.
+Thirteen of fourteen capability headings in that skill had no declared I/O at the time of the field test. That is a signal the skill relies entirely on implicit context rather than declared contracts. Numbers reflect a snapshot of `anthropics/skills` from April 2026 and will drift as upstream evolves; rerun against the current repo to see fresh counts.
 
 ### Agent Critique
 
@@ -98,6 +104,7 @@ skillcheck SKILL.md --emit-critique-prompt > prompt.txt
 skillcheck SKILL.md --ingest-critique response.json
 skillcheck SKILL.md --ingest-critique -                   # read from stdin
 skillcheck SKILL.md --emit-critique-prompt --critique-agent codex > prompt.txt
+skillcheck SKILL.md --agent-reason --format agent         # critique + graph prompt packet
 ```
 
 `--critique-agent` selects a framing variant tuned for each platform (claude, codex, cursor). The schema and exit codes are identical across all variants.
@@ -218,13 +225,16 @@ JSON output (`--format json`):
 }
 ```
 
-The JSON schema is stable. It will not change in a backward-incompatible way within the 0.x series.
+Each diagnostic includes `source` and `confidence` fields in JSON output. `source` is one of `spec`, `advisory`, `heuristic`, `agent`, or `history`; `confidence` is `high`, `medium`, or `low`.
+
+The JSON schema is stable. It will not change in a backward-incompatible way within the v1.x series.
 
 ## Options
 
 | Flag | Default | Description |
 |---|---|---|
-| `--format {text,json}` | `text` | Output format |
+| `--format {text,json,md,agent}` | `text` | Output format |
+| `--config PATH` | nearest `skillcheck.toml` | Load config defaults from TOML |
 | `--max-lines N` | `500` | Override the line-count threshold |
 | `--max-tokens N` | `8000` | Override the token-count threshold |
 | `--ignore PREFIX` | | Suppress rules matching this prefix; can be repeated |
@@ -235,6 +245,8 @@ The JSON schema is stable. It will not change in a backward-incompatible way wit
 | `--min-desc-score N` | | Minimum description quality score (0-100); below this triggers a warning |
 | `--target-agent {claude,vscode,all}` | `all` | Scope compatibility checks to a specific agent |
 | `--strict-vscode` | `false` | Promote VS Code compatibility issues to errors |
+| `--semantic` | `false` | Enable semantic-adjacent validation; standalone mode runs heuristic graph analysis |
+| `--agent-reason` | `false` | Emit a combined critique + graph prompt packet for the calling agent |
 | `--emit-critique-prompt` | `false` | Print agent self-critique prompt to stdout and exit 0 |
 | `--ingest-critique PATH` | | Read agent critique JSON from PATH or `-` for stdin; merge with symbolic results |
 | `--critique-agent NAME` | `claude` | Prompt variant: `claude`, `codex`, or `cursor`. Requires `--emit-critique-prompt` or `--ingest-critique` |
@@ -245,15 +257,16 @@ The JSON schema is stable. It will not change in a backward-incompatible way wit
 | `--graph-agent NAME` | `claude` | Prompt variant for graph extraction: `claude`, `codex`, or `cursor`. Requires `--emit-graph-prompt` or `--ingest-graph` |
 | `--history` | `false` | Append a validation record to `.skillcheck-history.json` next to the skill |
 | `--show-history` | `false` | Print the validation ledger and exit 0 |
+| `--activation-hypotheses` | `false` | Experimental emit mode for likely natural-language activation triggers |
 | `--version` | | Show version and exit |
 
 ## Exit Codes
 
 | Code | Meaning | Example invocation |
 |---|---|---|
-| `0` | No errors; warnings and info are allowed | `skillcheck skills/skillcheck/SKILL.md` |
+| `0` | No errors and no warnings | `skillcheck skills/skillcheck/SKILL.md` |
 | `1` | One or more errors found | `skillcheck SKILL.md` when the name is invalid |
-| `2` | Input error: missing file or empty directory | `skillcheck path/that/does/not/exist` |
+| `2` | Warning-only report or input error | `skillcheck SKILL.md --max-lines 1` |
 | `3` | Symbolic passed but ingested critique found semantic errors | `skillcheck SKILL.md --ingest-critique response.json` when the agent reported contradictions |
 
 Exit code 1 takes priority over 3 when symbolic errors also exist.
@@ -307,7 +320,7 @@ Source tags: `spec` rules derive from the agentskills.io specification or agent-
 
 ## Case Study
 
-We ran skillcheck against three corpora: Anthropic's official skills repository (18 skills), the `mcp-builder` skill through the full v1.0 pipeline, and five skills from the uxuiprinciples/agent-skills collection. Full run artifacts: `runs/anthropics-corpus/`, `runs/anthropics-mcp-builder/`, `runs/uxuiprinciples-corpus/`.
+We ran skillcheck against three corpora during v1.0 release prep (April 2026 snapshots): Anthropic's official skills repository (18 skills), the `mcp-builder` skill through the full v1.0 pipeline, and five skills from the uxuiprinciples/agent-skills collection. Full run artifacts: `runs/anthropics-corpus/`, `runs/anthropics-mcp-builder/`, `runs/uxuiprinciples-corpus/`.
 
 The symbolic run of the Anthropic corpus returned four failures from eighteen files (exit 1). All four files look correct on review: two had second-person voice in the description, one used "claude" as part of the name (reserved word per spec), and the template skill had a name/directory mismatch. The deeper finding came from running `mcp-builder` through the critique pipeline: the symbolic run passed (exit 0), but the ingested agent critique returned exit 3 with three `semantic.contradiction.detected` errors. The skill's frontmatter offers Python and TypeScript as equal options; its body unconditionally recommends TypeScript in Phase 1.3. That inconsistency means any agent following the Python path hits an unresolved decision point. No static linter catches it. See [docs/case-study-v1-real-world-runs.md](docs/case-study-v1-real-world-runs.md) for the full breakdown.
 
@@ -334,7 +347,7 @@ pip install -e ".[dev]"
 python3 -m pytest tests/ -q
 ```
 
-653 tests cover all rule modules, CLI exit codes, graph analyzers, divergence detection, critique parsing, history round-trips, and the full self-host pipeline against `skills/skillcheck/SKILL.md`. Fixtures are in `tests/fixtures/`; every rule has at least one positive and one negative test case.
+664 tests cover all rule modules, CLI exit codes, graph analyzers, divergence detection, critique parsing, history round-trips, and the full self-host pipeline against `skills/skillcheck/SKILL.md`. Fixtures are in `tests/fixtures/`; every rule has at least one positive and one negative test case. `tests/test_readme_test_count_claim.py` asserts this count matches `pytest --collect-only`, so any future suite change has to update the number in the same commit or CI fails.
 
 ## Maintainer Notes
 
@@ -346,6 +359,14 @@ make regen-self-host-fixtures
 
 This runs `scripts/regen_self_host_fixtures.py`, which extracts a fresh heuristic graph and writes it to `tests/fixtures/self_host/graph_clean.json`.
 
+To summarize a batch of skillcheck JSON outputs across many repos (the layout the field-test runs use, with one directory per repo, one subdirectory per skill, and `01-symbolic.json` / `02-strict-vscode.json` / `03-graph-analyze.json` / `04-graph-extracted.json` / `08-critique-report.json` / `09-graph-agent-report.json` / `10-full-pipeline.json` per skill), run:
+
+```bash
+python scripts/summarize_batch.py path/to/batch-dir
+```
+
+It writes `summary.csv` and `findings.md` next to the batch directory. The script is intended for benchmark and field-test workflows; it is not part of the CLI surface and is not exposed as a console script.
+
 To add a new rule: implement `def check_something(skill: ParsedSkill) -> list[Diagnostic]` in the appropriate module under `src/skillcheck/rules/`, register it in `src/skillcheck/rules/__init__.py`, add at least one positive and one negative fixture, and add a row to the Rules table above. Full conventions are in [`.github/CLAUDE.md`](.github/CLAUDE.md).
 
 ## License
 
@@ -0,0 +1,50 @@
+# skillcheck 1.0.1
+
+skillcheck v1.0.1 commits a batch of post-v1.0.0 implementation work that had been sitting uncommitted, ships the docs corrections an end-to-end verification surfaced, and aligns the README, CHANGELOG, and CLI surface so they describe the same release.
+
+There is one behavior change relative to v1.0.0: warning-only runs now return exit code 2. Errors return 1; semantic drift returns 3. CI consumers that previously relied on warning-only exiting 0 must update.
+
+## Changed
+
+- Warning-only CLI reports now return exit code 2. Exit code 1 remains errors; exit code 3 remains semantic drift.
+- README Exit Codes table row 0 now reads "no errors and no warnings".
+- README test count corrected from 653 to 663.
+- README JSON-stability promise updated from "0.x series" to "v1.x series".
+- README field-test numbers reframed as April 2026 snapshots against `anthropics/skills`, with a note that they will drift as upstream evolves.
+- `action.yml` `format` input description clarified: accepted but ignored at runtime; the action always invokes skillcheck with `--format json` so it can parse diagnostics for PR annotations and the step summary.
+- Development extras now include `ruff>=0.6`, `mypy>=1.10`, and `types-PyYAML>=6.0`.
+
+## Added
+
+- `--semantic`: guide-compatible shortcut that enables semantic-adjacent validation. In standalone mode it runs heuristic graph analysis; with ingested agent responses it merges those diagnostics.
+- `--agent-reason`: guide-compatible agent-workflow shortcut. Emits a combined critique and graph prompt packet so the calling agent can run both reasoning steps and feed JSON back through `--ingest-critique` and `--ingest-graph`.
+- `--format md` and `--format agent`: Markdown report output and agent-oriented next-action output.
+- `skillcheck.toml` config loading: top-level defaults for format, thresholds, target agent, strict VS Code mode, skip flags, ignored rule prefixes, graph analysis, semantic mode, history, and agent variants. CLI flags always win; the loader fills unset values.
+- Experimental `--activation-hypotheses`: generates likely natural-language routing triggers plus a discoverability entropy score. Routing caveat included in every report.
+- Machine-readable diagnostic metadata: JSON diagnostics now include `source` and `confidence` fields.
+- GitHub Action inputs for the v1.0 modes: `semantic`, `analyze-graph`, `ingest-critique`, `critique-agent`, `ingest-graph`, `graph-agent`, `history`, `activation-hypotheses`. The action still always emits JSON internally for PR annotations.
+
+## Why this is a patch and not a minor
+
+Every addition above either documents existing behavior, refines a flag, or is gated behind a new opt-in flag. There is one breaking-ish change: warning-only runs now exit 2 instead of 0. Strict semver would call for more than a patch bump here, since it treats any backward-incompatible change as a major one. The judgment call here: v1.0.0 shipped with documentation that already implied the v2-style exit codes (and the v1.0.1 README makes it explicit), the prior "warnings exit 0" behavior was undocumented in the released README, and the change matches what users running this in CI would expect. If your CI pipeline depended on the old behavior, pin to `@v1.0.0` rather than `@v1` until you can update.
+
+## Verification
+
+After installing `skillcheck==1.0.1`:
+
+```bash
+skillcheck --version
+# skillcheck 1.0.1
+
+skillcheck skills/skillcheck/SKILL.md --analyze-graph
+# exit 0 with no errors and no warnings (only INFO diagnostics)
+```
+
+End-to-end verification was run against `anthropics/skills` at commit `5128e186` (18 SKILL.md files). All 26 documented flags were exercised, all four exit codes (0, 1, 2, 3) were reproduced, and the action entrypoint produced JSON byte-identical to the CLI's. Full report: see the v1.0.1 verification artifacts.
+
+## Links
+
+- PyPI: https://pypi.org/project/skillcheck/1.0.1/ (available after publish)
+- GitHub Release: https://github.com/moonrunnerkc/skillcheck/releases/tag/v1.0.1
+- agentskills.io specification: https://agentskills.io/specification
+- README: https://github.com/moonrunnerkc/skillcheck/blob/main/README.md
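Reviewer sketch, not part of the patch: since warning-only runs now exit 2, a CI wrapper may want to make its policy explicit. The snippet below is a minimal illustration under stated assumptions. The exit-code meanings are copied from the README Exit Codes table in this change. `should_fail_build`, `run_skillcheck`, and the `tolerate_warnings` switch are hypothetical names and are not part of skillcheck.

```python
import subprocess

# Meanings taken from the README Exit Codes table in this change.
EXIT_MEANINGS = {
    0: "no errors and no warnings",
    1: "one or more errors found",
    2: "warning-only report or input error",
    3: "symbolic passed but ingested critique found semantic errors",
}


def should_fail_build(exit_code: int, tolerate_warnings: bool = False) -> bool:
    """Hypothetical CI policy: decide whether a skillcheck exit code fails the job."""
    if exit_code == 0:
        return False
    if exit_code == 2 and tolerate_warnings:
        # Caveat: code 2 also covers input errors (for example a missing file),
        # so tolerating it can hide a mistyped path.
        return False
    return True


def run_skillcheck(path: str) -> int:
    """Run the skillcheck CLI on one path and return its exit code."""
    result = subprocess.run(["skillcheck", path], check=False)
    return result.returncode
```

Because code 2 is shared between warning-only reports and input errors, a pipeline that tolerates warnings may still want to check separately that the target path exists before invoking the tool.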