
Commit 1bc2edc

docs: backfill apm-usage and consolidate registry guides (v0.14->v0.15 drift sweep) (#1511)
* docs: backfill apm-usage and consolidate registry guides (v0.14->v0.15 drift sweep)

Holistic docs-sync retrospective on the v0.14.0->v0.15.0 release window flagged 23 of 39 user-impact PRs as docs-debt: 7 Rule 4 violations (apm-usage/ skipped) plus 16 silent-drift PRs. This PR closes the highest-priority gaps (P0/P1 from the retrospective) in one sweep.

Backfills (apm-usage/ training corpus):
- dependencies.md: registry-sourced APM dep object form (#1471)
- authentication.md: APM_REGISTRY_TOKEN_{NAME} precedence (#1471)
- governance.md: registry_source + allow_non_registry policy (#1471)
- package-authoring.md: apm publish workflow (#1471) and project-scope hook command path semantics (#1396)
- commands.md: apm publish entry (#1471), apm config transport keys (#1308), apm compile live-reload + --clean --watch warning (#1403), Claude Code instruction dedup (#1146), MCP env-var placeholder resolution (#1277), AppLocker/WDAC staged-install diagnostic (#1390)

Structural fix (per docs-impact-architect verdict):
- Merge guides/private-registries.md INTO guides/registries.md with progressive disclosure (public -> private -> per-dep routing -> enterprise link). Adds Starlight redirect for the old slug, patches 5 cross-references across consumer/, reference/cli/.

Editorial fixes (per editorial-owner sweep):
- integrations/copilot-app.md (#1431): lead with user value before WS-IPC/SQLite mechanics; add 'restart the Copilot App once' troubleshooting hint
- producer/compile.md: dedup the Claude Code instruction dedup explanation (was stated twice)
- enterprise/security.md: reframe defensive memo voice ('do not call this X') to user voice ('here is what we provide / here is what we don't')

Method: docs-sync skill end-to-end. 5-panelist fan-out plus CDO synthesis. Every CLI claim in the apm-usage adds was verified against the live 'apm <verb> --help' surface (S7 tool bridge).

Out of scope (tracked as P1 follow-up): backfilling docs for the 16 silent-drift PRs grouped by subsystem (MCP, install, compile, auth).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: full-corpus regrounding audit (55 pages, 14 surgical fixes)

Wave-batched grounding audit across 55 high-risk pages (CLI ref x27, schemas/specs x10, consumer ramp x12, onboarding x6). Each page's factual claims (flags, env vars, exit codes, schema fields, file paths, code links) were extracted and verified against current src/apm_cli/ and 'apm <verb> --help' output via S7 tool-bridge.

Fixes applied (14 files):

CLI reference:
- pack.md: add --check-versions, --check-clean flags + exit codes 3, 4
- targets.md: expand copilot detection signals (5, not 1)
- experimental.md: add copilot-app, marketplace-authoring, registries
- install.md: dedup duplicate '## Exit codes' + '## Notes' sections

Schemas / specs:
- lockfile-spec.md: expand package_type enum to full 6-value list
- manifest-schema.md: document plural 'targets:' alias (#1335)
- environment-variables.md: add APM_BROAD_FETCH_DEPTH, APM_COPILOT_APP_DB
- package-types.md: add 5th layout (hook_package, hooks/*.json only)

Consumer ramp:
- install-mcp-servers.md: fix stale code citation + 'Or' -> 'And'
- private-and-org-packages.md: drop nonexistent BITBUCKET_APM_PAT

Onboarding (6 broken navigation links, 4 files):
- quickstart.mdx, getting-started/installation.md, getting-started/first-package.md, getting-started/migration.md: repoint self-loops and dead routes to actual page paths

Process: dispatched as 6 parallel grounding-verifier agents (general-purpose) across disjoint page scopes; each agent had edit authority on its scope and applied surgical fixes inline. Reusable pattern via the docs-corpus-audit sibling skill design (PANEL + WAVE EXECUTION + S7 verifier fan-out, see files/docs-corpus-audit-design.md).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: wave 3 corpus audit + IA-reshuffle dead-link cleanup (53 pages)

Second sweep of the regrounding audit. Covers the 57 pages deferred in wave 2: producer/ (15), enterprise/ (15), concepts/ (6), integrations/ (7), troubleshooting/ (7), contributing/ (3), reference tail (3), 404.

Process: 6 parallel grounding-verifier agents on disjoint scopes; each agent extracts factual claims, S7-verifies against current source ('apm <verb> --help' + grep src/apm_cli/), and applies surgical edits inline. Same pattern as wave 2 (PANEL + WAVE EXECUTION + S7 verifier fan-out). Orchestrator post-pass swept three cross-corpus broken-link patterns the per-scope agents could not fix alone.

High-signal factual fixes:

enterprise/governance-guide.md:
- --output-file -> --output (real flag is --output / -o)
- 7+17 check count -> 8+17 (8 baseline checks, not 7)

enterprise/apm-policy.md:
- '16 of 22 checks' -> '17 of 25 checks' (phantom counts)
- conflated --no-policy (install-only) with APM_POLICY_DISABLE (env)

enterprise/apm-policy-getting-started.md:
- dropped 'apm compile' from list of commands that run policy (compile enforces zero policy per governance-overview.md L57)

enterprise/policy-reference.md:
- compilation.target.allow: added copilot, gemini, vscode, windsurf, agent-skills (only 5 of 9 runtimes were listed)

enterprise/registry-proxy.md:
- 'apm marketplace add --branch main' -> '--ref main' (no --branch flag)

enterprise/security-and-supply-chain.md:
- 3 stale source line-number citations corrected

producer/author-primitives/index.md:
- legacy '.hook.md' extension -> '.json' (hook_integrator scans JSON)
- removed nonexistent '.apm/commands/' subdirectory from layout example

concepts/lifecycle.md:
- 4 reference-page links all pointed at install/ (copy-paste)

Cross-corpus IA-reshuffle dead-link cleanup (orchestrator pass):
- introduction/* -> concepts/* (4 links across 2 files)
- guides/ci-policy-setup/ -> enterprise/enforce-in-ci/ (8 links, 4 files)
- guides/pack-distribute/ -> producer/pack-a-bundle/ (5 links, 4 files)
- guides/dependencies/ -> consumer/manage-dependencies/ (1 link)
- guides/agent-workflows/ -> contextual canonical (3 links, 3 files)
- guides/install-and-use/mcp-servers/ -> consumer/install-mcp-servers/ (3)
- guides/compilation/ -> producer/compile/ (1)
- guides/prompts/ -> producer/author-primitives/prompts/ (2)
- guides/drift-detection/ -> enterprise/drift-detection/ (1)

enterprise/security.md side-fix:
- 'apm unpack scheduled for removal in v0.14' -> drop version target (APM is 0.15.0 and unpack still ships marked DEPRECATED in --help). Upstream remediation (refresh deprecation timeline in source or remove the shim) tracked outside this PR.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: close deferred items from corpus regrounding audit

Closes the three items deferred from the v0.14->v0.15 docs-sync retrospective and the full-corpus regrounding waves (commits 4f00c2b, 242bb9e, b80da69):

1. apm unpack source-side deprecation timeline
   - src/apm_cli/commands/pack.py: 'will be removed in v0.14' -> 'will be removed in a future release'. Current version is 0.15.0; the v0.14 target had already passed. Docs were softened in wave 3; this mirrors the choice in source.
   - CHANGELOG.md: [Unreleased] Fixed entry.

2. Bucket-C silent-drift backfills (20 PRs, parallel triage)
   - 3 grounding-verifier subagents reviewed 20 of the 21 bucket-C PRs (#1477 excluded as test-flake fix, no doc surface). Verdicts: 17 ALREADY_COVERED or NO_DOC_SURFACE (verified honestly against wave 2-3 backfills, not manufactured), 3 BACKFILLED:
     - #1385 SSH dep user-from-URL: added supported-form row in docs/src/content/docs/consumer/manage-dependencies.md and bullet in apm-usage/dependencies.md.
     - #1434 Copilot App schema range [13,15] + warn-not-fail: rewrote the 'Schema compatibility' paragraph in docs/src/content/docs/integrations/copilot-app.md (was factually wrong, claimed [13,13] hard-fail).
     - #1440 Copilot file-based detection signals: added the four .github/{instructions,agents,prompts,hooks}/ directories to the canonical-signals list in troubleshooting/compile-zero-output-warning.md and to the apm-usage commands.md + package-authoring.md auto-detect rules.

3. docs-corpus-audit skill extracted
   - .apm/skills/docs-corpus-audit/SKILL.md: first-class skill module emitted from the genesis design artifact used to drive waves 2 and 3. Pattern: PANEL + WAVE EXECUTION + S7 verification. Wave-batched (scales as O(waves), not O(claims)), disjoint page ownership (no merge conflicts), orchestrator post-pass for cross-corpus drift patterns invisible to per-scope agents.
   - references/design-handoff.md: full design artifact preserved for future maintainers.
   - Sibling to docs-sync (per-PR), not a replacement.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: fix dead links + address Copilot review findings

Two classes of fix on PR #1511:

1. Deploy Docs CI -- starlight-links-validator failure (2 dead links)
   - getting-started/first-package.md:18 and quickstart.mdx:40 used absolute /apm/getting-started/installation/ paths introduced in wave 2 (242bb9e). Converted to relative paths matching the surrounding link convention.
   - Verified with local 'npm run build' under docs/: 'All internal links are valid.'

2. Copilot PR review -- 7 inline factual accuracy comments, all verified against source and addressed:
   - apm-usage/package-authoring.md: hook path rewrite is performed by 'apm install' (hook integrator pass), not 'apm compile'.
   - apm-usage/dependencies.md + docs/guides/registries.md: registry resolver requires semver per apm_cli/deps/registry/semver.py (is_semver_range gate). Removed examples implying opaque labels (#stable, #v2.0.0, 'latest') route through a registry; updated selector tables to flag non-semver refs as rejected for registry sources.
   - apm-usage/dependencies.md + docs/guides/registries.md: lockfile_version: '2' promotion triggers on registry deps OR git-source semver resolution fields (constraint / resolved_tag / resolved_at per lockfile.py:_needs_v2, issue #1488), not just registry deps.
   - apm-usage/authentication.md: 'token:' in apm-policy.yml is not parse-rejected, only surfaces as an 'Unknown top-level policy key' warning per policy/parser.py. Still discouraged (leaks to repo), but the rejection mechanism is different from apm.yml.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* skill(docs-corpus-audit): refactor under genesis discipline + self-test

Round-trip assessment found the original SKILL.md draft violated genesis SoC in 7 ways:

1. Invented inline 'grounding-verifier' persona instead of composing shared agent personas (python-architect for S7, doc-writer for edits). R3 EXTRACT in reverse.
2. Subagent prompt template inlined in SKILL body (~40 lines that belong in assets/).
3. IA-reshuffle grep patterns hard-coded in body as bash heredoc -- the patterns rot per release and belong in scripts/ with --help and a versioned update cadence.
4. PHANTOM DEPENDENCY on docs-sync's substrate (.apm/docs-index.yml, personas, panelist-return-schema, the apm-usage Rule-4 corpus) never declared via tool-call probes -- A9 SUPERVISED EXECUTION violation per genesis Step 7b.
5. Missing A8 ALIGNMENT LOOP: wave agents edited inline and nothing re-verified the edits grounded.
6. DISPATCH COLLISION risk vs docs-sync: identical 'drift between docs and code' triggers; dispatcher LLM could misroute.
7. BUNDLE LEAKAGE: references/design-handoff.md was session-history (maintainer-scope), not runtime-loaded. Per genesis 3.5 it must NOT ship with the user-facing bundle.

Refactor:
- SKILL.md (218 lines, well under 500-line cap): adds explicit Sibling Contract table with docs-sync; declares roster as composition of existing personas via relative links; PROBE / RISK-TRIAGE / WAVE / POST-PASS / ALIGNMENT-LOOP / COMMIT / PR phases; sharpened trigger description naming whole-corpus scope.
- assets/subagent-prompt-template.md: extracted the per-scope prompt that composes python-architect + doc-writer.
- assets/panelist-return-schema.json: explicit JSON schema for agent returns; orchestrator validates and rejects malformed.
- scripts/scan-cross-corpus-drift.sh: deterministic cross-corpus drift sweep with 4 pattern groups (ia-links, stale-deprecation, absolute-base, ascii-leak). Non-interactive, --help-documented, stdout/stderr split per genesis script conventions.
- evals/{trigger,content}-evals.json + README.md: ship gate exercising 10+10 trigger queries (docs-sync boundary is the load-bearing distinction) and 3 seeded-drift scenarios with control baselines.
- Deleted references/design-handoff.md (bundle leak; design artifact stays in session state only).

Self-test (proves the refactor works end-to-end):
- Ran scan-cross-corpus-drift.sh against the live corpus; it immediately surfaced two genuine drift items that wave 3 missed:
  - src/apm_cli/commands/pack.py:606: click help= string still said 'removed in v0.14' (the logger.warning at line 633 was fixed last commit; this is a sibling string the wave 3 agent didn't see because each agent only owned ~9 pages).
  - docs/src/content/docs/reference/cli/unpack.md:9: caution banner still said 'scheduled for removal in v0.14'.
- Both softened to 'in a future release' (consistent with the rest of the wave 3 choice).
- Lint clean; docs build clean ('All internal links are valid').

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* skill(docs-grounding-verifier): claim-level grounding harness + 7 drift fixes

New sibling skill to docs-corpus-audit. Genesis-designed PIPELINE-of-PANELS (RAGAS-faithfulness adapted from RAG to docs/code):
- Stage 1: per-page LLM claim extraction
- Stage 2: deterministic grep-based evidence retrieval (S7, no LLM)
- Stage 3: adversarial LLM grounding judge (A7, 4-verdict calibrated)

Empirical proof bundle (.apm/skills/docs-grounding-verifier/evals/runs/proof/):
- 5 high-stakes pages -> 75 atomic claims extracted
- Tally: 63 GROUNDED / 6 PARTIAL / 4 CONTRADICTED / 2 UNSUPPORTED (84%)
- Trigger eval: 20/20 dispatch classification correct (precision=1.0, recall=1.0, specificity=1.0, pass_gate=true)

High-confidence drift fixes applied:
- apm-policy.md: MCP transport defaults (was 'block sse/streamable-http by default' -> actually allow=None means all permitted; sample policy now correctly framed as restriction example)
- apm-policy.md: inheritance levels (was '5 levels including team policy' -> canonical chain is 3 semantic levels; 5 is MAX_CHAIN_DEPTH for intermediate extends: jumps)
- Plus 5 editorial fixes from prior pass (examples, registries x2, security, copilot-app)

Lower-confidence findings (judge retrieval gaps, vague reasoning) left for follow-up rather than risk introducing new drift via speculative edits.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: danielmeppiel <danielmeppiel@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
1 parent 3d16520 commit 1bc2edc

159 files changed

Lines changed: 10158 additions & 622 deletions

Lines changed: 218 additions & 0 deletions
@@ -0,0 +1,218 @@
---
name: docs-corpus-audit
description: >-
  Use this skill to run a holistic regrounding pass on the entire
  microsoft/apm documentation corpus against current source code,
  page-by-page, and emit surgical fixes for stale claims. Activate
  when the maintainer wants a WHOLE-CORPUS audit (not per-PR review)
  -- typical triggers include "audit the docs", "reground the
  corpus", "check every page against code", "pre-release docs
  sweep", "the docs have drifted everywhere", or "we just reshaped
  the TOC, find dead links". Wave-batched and S7-verified; scales
  to the full ~112-page corpus in ~10 minutes wall-time. This is a
  SIBLING to docs-sync, not a replacement: docs-sync is per-PR
  (triggered by a diff); this skill is per-corpus (triggered by a
  maintainer ask). They share agent personas, schemas, and the
  docs index, but their triggers MUST NOT collide. Does NOT
  auto-merge, does NOT push without maintainer review, and does
  NOT replace per-PR drift detection.
---

# docs-corpus-audit -- whole-corpus regrounding pass

The docs corpus drifts silently between releases. `docs-sync` catches
drift introduced by individual PRs at PR-open time. This skill catches
the **accumulated** drift that slips past per-PR review -- stale flag
names, dead nav links from past IA reshuffles, deprecation banners
that outlived their version targets, factual claims whose source-side
truth has moved.

The pattern is **A1 PANEL + WAVE EXECUTION + S7 DETERMINISTIC TOOL
BRIDGE + A8 ALIGNMENT LOOP + A9 SUPERVISED EXECUTION**. The corpus is
split into disjoint page scopes; one verifier subagent owns each
scope; agents extract factual claims, S7-verify against source, apply
surgical fixes inline. The orchestrator then runs an alignment-loop
pass to re-verify that applied edits actually ground out true.

This skill is ADVISORY but ACTIONABLE: agents apply edits inline on a
working branch. The orchestrator is the sole writer to git -- stages,
commits, pushes. Maintainer reviews the resulting PR.

## Sibling contract with docs-sync

These two skills share substrate. Be explicit:

| Shared resource | Owner | Both use |
|---|---|---|
| `.apm/docs-index.yml` (corpus map) | docs-sync | yes |
| [doc-writer](../../agents/doc-writer.agent.md) persona | shared | yes (per-page edits) |
| [python-architect](../../agents/python-architect.agent.md) persona | shared | yes (S7 verification) |
| [editorial-owner](../../agents/editorial-owner.agent.md) persona | shared | optional (voice pass at scale) |
| [cdo](../../agents/cdo.agent.md) persona | shared | yes (final synthesis) |
| `assets/panelist-return-schema.json` | docs-sync (mirrored) | yes |

**Trigger boundary (avoid DISPATCH COLLISION):**

- `docs-sync` triggers on a PR event ("PR opened/synchronized",
  source-diff-driven).
- `docs-corpus-audit` triggers on a maintainer ask for a
  WHOLE-CORPUS pass ("audit the corpus", "reground", "pre-release
  sweep") -- no PR required, no diff required, the whole corpus
  is the input.

If a maintainer asks "review this PR's doc impact", route to
`docs-sync`. If they ask "audit all our docs" or "the docs feel
stale everywhere", route here.
66+
67+
## Architecture invariants
68+
69+
- **Wave-batched, not flat.** Pages are partitioned into 6-8 disjoint
70+
scopes; each scope is one verifier subagent. Cost scales with
71+
wave size, not corpus size. A wave of 6 agents on ~10 pages each
72+
is the canonical shape.
73+
- **Disjoint page ownership.** Each subagent has EDIT AUTHORITY on
74+
its scope only. No two agents touch the same file -- guarantees
75+
no merge conflicts during fan-in.
76+
- **S7 verification is mandatory.** Every factual claim is verified
77+
against deterministic source: `uv run apm <verb> --help` for CLI,
78+
`grep -n src/apm_cli/` for symbols, `python -c "import ..."` for
79+
module shape, file-existence checks for nav links. Never assert
80+
from LLM recall.
81+
- **Surgical edits only.** 1-3 line patches per drift, preserving
82+
voice. Restructuring is deferred to the orchestrator post-pass,
83+
never auto-applied by per-scope agents.
84+
- **Single-writer interlock for git.** Subagents NEVER run
85+
`git commit`, `git push`, or `gh pr <write>`. Orchestrator
86+
commits per wave; pushes once per session.
87+
- **Alignment loop (A8).** After waves return, orchestrator
88+
re-greps the corpus for the patterns the agents claimed to fix.
89+
Any residue triggers a targeted re-dispatch (max 2 redrafts) or
90+
is escalated to maintainer.
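
The alignment-loop invariant can be sketched as a bounded retry. This is a minimal illustration, not the orchestrator's actual code: `find_residue` and `redispatch` are hypothetical callables standing in for the deterministic re-grep and the targeted re-dispatch, and the in-memory `corpus` dict is made up for the demo.

```python
from typing import Callable, Iterable

MAX_REDRAFTS = 2  # the bound named in the alignment-loop invariant


def alignment_loop(
    patterns: Iterable[str],
    find_residue: Callable[[str], list[str]],
    redispatch: Callable[[str, list[str]], None],
    max_redrafts: int = MAX_REDRAFTS,
) -> dict[str, list[str]]:
    """Return {pattern: remaining hits} for patterns that need escalation."""
    escalate: dict[str, list[str]] = {}
    for pattern in patterns:
        hits = find_residue(pattern)
        attempts = 0
        # Re-dispatch only while residue remains and the redraft budget allows.
        while hits and attempts < max_redrafts:
            redispatch(pattern, hits)  # hypothetical hook: owning agent retries
            attempts += 1
            hits = find_residue(pattern)
        if hits:
            escalate[pattern] = hits  # residue survived every redraft
    return escalate


# Demo with a made-up two-page corpus held in memory.
corpus = {"a.md": "use --output-file here", "b.md": "removed in v0.14"}


def find_residue(pattern: str) -> list[str]:
    return sorted(path for path, text in corpus.items() if pattern in text)


def redispatch(pattern: str, hits: list[str]) -> None:
    # Pretend the agent can fix the stale flag but not the deprecation banner.
    if pattern == "--output-file":
        for path in hits:
            corpus[path] = corpus[path].replace("--output-file", "--output")


print(alignment_loop(["--output-file", "removed in v0.14"], find_residue, redispatch))
```

Keeping the loop free of LLM calls means the escalation list is reproducible: the same corpus state always yields the same residue.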

## Roster (composition, not invention)

Reuse docs-sync's personas. Do NOT invent a one-off
"grounding-verifier" role; that's R3 EXTRACT in reverse.

| Role | Persona | Always active? |
|---|---|---|
| Per-scope verifier+editor | [python-architect](../../agents/python-architect.agent.md) (S7) and [doc-writer](../../agents/doc-writer.agent.md) (edits), bundled into one subagent prompt per scope | Yes -- one per page scope, parallel fan-out |
| Cross-corpus post-pass | orchestrator (deterministic greps via `scripts/scan-cross-corpus-drift.sh`) | Yes -- once after waves return |
| Alignment-loop checker | orchestrator (deterministic re-grep + targeted re-dispatch) | Yes -- once after post-pass |
| Voice pass (optional) | [editorial-owner](../../agents/editorial-owner.agent.md) | Only when >20 edits to keep tone coherent |
| Final synthesis | [cdo](../../agents/cdo.agent.md) | Once, for the PR summary comment |

The per-scope subagent prompt that composes `python-architect` +
`doc-writer` is in `assets/subagent-prompt-template.md` -- the
orchestrator substitutes scope + working dir + branch and dispatches
via the task tool.

## Process

```
1. PROBE (A9 SUPERVISED EXECUTION)
   - Check working tree: docs/src/content/docs/ exists?
   - Check working tree: packages/apm-guide/.apm/skills/apm-usage/
     exists? (Rule-4 backfill target. If missing, the audit cannot
     close Rule 4; ask maintainer before continuing.)
   - Check `.apm/docs-index.yml` reachable.
   - Verify on a working branch (not main).

2. RISK-TRIAGE (orchestrator, ~1 LLM call)
   - Read .apm/docs-index.yml only (NOT the corpus body).
   - Bucket pages by drift risk: HIGH (CLI ref, schemas, consumer
     flows), MEDIUM (producer, enterprise policy), LOW (concepts,
     contributing, troubleshooting, integrations).
   - Decide wave order: HIGH first, MEDIUM next, LOW last.

3. WAVE-PLANNER (orchestrator, deterministic)
   - Partition pages into 6-8 disjoint scopes per wave.
   - Each agent gets ~9 pages, mixed surface types.

4. WAVE EXECUTION (parallel, one subagent per scope)
   - Orchestrator dispatches one task per scope using the prompt
     template in assets/subagent-prompt-template.md.
   - Subagents read pages, extract claims, S7-verify, apply
     surgical edits, return JSON per the docs-sync panelist
     schema (mirrored at assets/panelist-return-schema.json).
   - Validate every return against the schema; reject malformed
     JSON.

5. CROSS-CORPUS POST-PASS (orchestrator, deterministic)
   - Run scripts/scan-cross-corpus-drift.sh to grep for patterns
     a per-scope agent cannot see (IA-reshuffle dead links, stale
     deprecation version targets, phantom flag references).
   - Patch residue inline.

6. ALIGNMENT LOOP (orchestrator, deterministic)
   - Re-run scripts/scan-cross-corpus-drift.sh.
   - Re-grep for claims the agents marked DRIFTED-FIXED.
   - If residue: targeted re-dispatch to the owning agent
     (bounded: max 2 redrafts per wave).

7. COMMIT + PUSH (orchestrator, single writer)
   - One commit per wave; structured message naming closed items.
   - Push to working branch.

8. PR + SUMMARY COMMENT (orchestrator)
   - If no PR exists: open one with the [pr-description-skill]
     (../pr-description-skill/SKILL.md).
   - Post per-wave summary comment: pages audited, drift caught,
     fixes applied, items deferred, alignment-loop residue.
```
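
Step 3 (WAVE-PLANNER) is described as deterministic, so it can be sketched in a few lines. The function below is one possible partitioning, not the skill's actual planner: `plan_wave` and its round-robin strategy are assumptions chosen to satisfy the two stated constraints (disjoint scopes, roughly 9 pages per agent with surface types mixed).

```python
def plan_wave(pages: list[str], pages_per_scope: int = 9) -> list[list[str]]:
    """Split pages into disjoint scopes of roughly pages_per_scope each."""
    if not pages:
        return []
    unique = sorted(set(pages))  # dedupe so no page lands in two scopes
    n_scopes = max(1, -(-len(unique) // pages_per_scope))  # ceiling division
    # Sorted paths cluster by directory, so striding across them hands each
    # scope a mix of surface types instead of one whole directory.
    return [unique[i::n_scopes] for i in range(n_scopes)]


# Hypothetical 55-page wave (page names are made up for the demo).
scopes = plan_wave([f"docs/page-{i:02d}.md" for i in range(55)])
for index, scope in enumerate(scopes):
    print(index, len(scope))
```

Because the input is deduplicated and each page is assigned by index, two scopes can never share a file, which is what lets fan-in skip merge-conflict handling.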

## Bundled assets

- `assets/subagent-prompt-template.md` -- the per-scope prompt the
  orchestrator substitutes and dispatches. Composes python-architect
  (S7) + doc-writer (surgical edit). Loaded once per scope.
- `assets/panelist-return-schema.json` -- subagent return schema,
  mirrored from docs-sync. Loaded once at wave start; validated
  against every return.
- `scripts/scan-cross-corpus-drift.sh` -- deterministic grep sweep
  for cross-corpus patterns (IA dead links, stale deprecation
  targets, phantom flags). Non-interactive; emits structured
  matches on stdout, diagnostics on stderr. Run `--help` for
  pattern list. Update this script after each major IA reshuffle.

## Cost model

| Wave size | Pages | Subagents | LLM dispatches | Wall time |
|---:|---:|---:|---:|---:|
| Small | ~30 | 4 | ~5 | ~3 min |
| Medium (default) | ~55 | 6 | ~7 | ~5 min |
| Large | ~110 (full corpus) | 12 (two medium waves) | ~14 | ~10 min |

Compared to docs-sync (15-call flat ceiling), this skill scales as
O(waves), not O(claims), because per-agent work fits in one context
window. S7 verification dominates wall-time, not LLM cost.

## Boundary (what this skill does NOT do)

- Per-PR doc-impact review -- use `docs-sync`.
- Single-page typo or copy edit -- direct edit is faster.
- Writing docs for a brand-new feature -- use `docs-impact-architect`
  and `doc-writer` directly.
- Auto-merging or pushing without maintainer review.
- Reviewing code quality, security, or test coverage (out of scope).

## Evals

See `evals/`:
- `evals/content-evals.json` -- 3 corpus snapshots with seeded drift
  (stale CLI flag, dead nav link, expired deprecation target);
  expected behavior is that the skill catches all three and applies
  surgical fixes that ground out true on re-verification.
- `evals/trigger-evals.json` -- 10 should-trigger + 10
  should-NOT-trigger queries, 60/40 train/val. The val split is the
  ship gate (>=0.5 should-trigger AND <0.5 should-not-trigger).
- `evals/README.md` -- how to run.

## Provenance

This skill was extracted from a real session that audited the
microsoft/apm corpus across 3 waves (PR #1511, 2026-05-27):
112/112 pages audited, 49 surgical fixes, ~25 LLM dispatches,
~30 min wall-time. The session design artifact (genesis hand-off
packet) lives in session state, not in this bundle
(maintainer-scope, not runtime-loaded).
Lines changed: 57 additions & 0 deletions
@@ -0,0 +1,57 @@
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "docs-corpus-audit per-scope verifier return schema",
  "description": "Schema for the JSON returned by each per-scope verifier+editor subagent. Mirrors the docs-sync panelist-return-schema. The orchestrator validates every wave-agent return against this; malformed JSON triggers re-dispatch.",
  "type": "object",
  "required": ["agent", "pages", "summary"],
  "additionalProperties": false,
  "properties": {
    "agent": {
      "type": "string",
      "description": "Scope identifier the orchestrator assigned (e.g., 'wave2-scope-cli-ref')."
    },
    "pages": {
      "type": "array",
      "items": {
        "type": "object",
        "required": ["page", "claims_checked", "grounded", "drifted", "edits_applied", "verdict"],
        "additionalProperties": false,
        "properties": {
          "page": {"type": "string", "description": "Absolute path to the page."},
          "claims_checked": {"type": "integer", "minimum": 0},
          "grounded": {"type": "integer", "minimum": 0},
          "drifted": {
            "type": "array",
            "items": {
              "type": "object",
              "required": ["claim", "evidence", "fix"],
              "properties": {
                "claim": {"type": "string"},
                "evidence": {"type": "string", "description": "Deterministic source proving the claim is wrong (command output, file:line)."},
                "fix": {"type": "string", "description": "Surgical patch description (1-3 lines)."}
              }
            }
          },
          "unverifiable": {
            "type": "array",
            "items": {
              "type": "object",
              "required": ["claim", "reason"],
              "properties": {
                "claim": {"type": "string"},
                "reason": {"type": "string"}
              }
            }
          },
          "edits_applied": {"type": "integer", "minimum": 0},
          "verdict": {"enum": ["CLEAN", "MINOR_DRIFT", "MAJOR_DRIFT"]}
        }
      }
    },
    "summary": {"type": "string"},
    "open_questions": {
      "type": "array",
      "items": {"type": "string"}
    }
  }
}
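
As a rough illustration of what "validate every return; reject malformed JSON" means for this schema, the sketch below checks a subset of its rules using only the standard library. It is an assumption-laden stand-in, not the orchestrator's validator: `check_return` is a hypothetical helper, it covers only required keys, unexpected keys, and the verdict enum, and a real implementation would more likely hand the schema file to a full JSON Schema validator.

```python
import json

TOP_REQUIRED = {"agent", "pages", "summary"}
TOP_ALLOWED = TOP_REQUIRED | {"open_questions"}
PAGE_REQUIRED = {"page", "claims_checked", "grounded", "drifted",
                 "edits_applied", "verdict"}
PAGE_ALLOWED = PAGE_REQUIRED | {"unverifiable"}
VERDICTS = ("CLEAN", "MINOR_DRIFT", "MAJOR_DRIFT")


def check_return(raw: str) -> list[str]:
    """Return a list of problems; an empty list means these checks passed."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top level must be an object"]

    problems = [f"missing top-level key: {k}" for k in sorted(TOP_REQUIRED - set(data))]
    problems += [f"unexpected top-level key: {k}" for k in sorted(set(data) - TOP_ALLOWED)]

    pages = data.get("pages", [])
    if not isinstance(pages, list):
        problems.append("pages must be an array")
        pages = []
    for i, page in enumerate(pages):
        if not isinstance(page, dict):
            problems.append(f"pages[{i}] must be an object")
            continue
        problems += [f"pages[{i}] missing key: {k}" for k in sorted(PAGE_REQUIRED - set(page))]
        problems += [f"pages[{i}] unexpected key: {k}" for k in sorted(set(page) - PAGE_ALLOWED)]
        if "verdict" in page and page["verdict"] not in VERDICTS:
            problems.append(f"pages[{i}] verdict not in enum")
    return problems


# Made-up payload: the verdict is outside the enum and "summary" is missing.
bad = json.dumps({"agent": "wave2-scope-cli-ref", "pages": [{
    "page": "/repo/docs/x.md", "claims_checked": 3, "grounded": 3,
    "drifted": [], "edits_applied": 0, "verdict": "FINE"}]})
print(check_return(bad))
```

An orchestrator following the schema description would treat any non-empty problem list as grounds for re-dispatch.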
Lines changed: 85 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,85 @@
# Per-scope verifier+editor subagent prompt template

The orchestrator substitutes the placeholders (`<...>`) and dispatches
this as one task per page scope via the task tool. The prompt composes
two personas: `python-architect` (for S7 deterministic verification of
factual claims) and `doc-writer` (for surgical, voice-preserving
edits). Do NOT invent a one-off "grounding-verifier" persona; the
composition above is the contract.

---

You are a per-scope verifier+editor subagent in a `docs-corpus-audit`
wave. You combine two personas:

- **python-architect** for S7 deterministic verification: every
  factual claim is checked against runnable source, not LLM recall.
- **doc-writer** for surgical edits: 1-3 line patches per drift,
  voice-preserving, no restructuring.

**Working directory:** `<ABSOLUTE PATH>`
**Branch:** `<branch-name>` (PR #`<num>` in microsoft/apm, if open)

**Your page scope (EDIT AUTHORITY on these only, ABSOLUTE PATHS):**

- `<page 1>`
- `<page 2>`
- ...

**Method per page:**

1. Read the page.
2. Extract every FACTUAL CLAIM: CLI invocation, flag, env var, file
   path, code symbol, config field, exit code, behavior assertion,
   internal nav link.
3. For each claim, verify against source of truth:
   - CLI claims: `cd <WORKDIR> && uv run apm <verb> --help`
   - File paths / symbols: `grep -rn '<symbol>' src/apm_cli/`
   - Module shape: `python -c "import <mod>; print(...)"`
   - Internal nav links: file-existence check against
     `docs/src/content/docs/`
   - Code links with line numbers: cross-check exact line
4. Bucket each claim: GROUNDED | DRIFTED | UNVERIFIABLE.
5. For DRIFTED claims, apply a SURGICAL edit (1-3 lines, preserve
   voice, no scope creep). Restructuring is deferred to the
   orchestrator post-pass -- never auto-applied at scope level.
6. NEVER edit files outside your scope. NEVER commit, push, or
   delete. NEVER touch `git`, `gh`, or any tool that writes outside
   your page scope.
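
Steps 3 and 4 for a CLI flag claim can be sketched as a pure function over `--help` output. This is an illustrative sketch only: `flags_in_help` and `bucket_flag_claim` are hypothetical helpers, and the sample help text is invented. In a real run the text would come from the `uv run apm <verb> --help` command named in step 3, for example captured with `subprocess.run(..., capture_output=True, text=True)`.

```python
import re


def flags_in_help(help_text: str) -> set[str]:
    """Collect every option token (-o, --output, ...) in --help output."""
    # The lookbehind rejects hyphens inside ordinary words such as "live-reload".
    return set(re.findall(r"(?<![\w-])(--?[A-Za-z][\w-]*)", help_text))


def bucket_flag_claim(flag: str, help_text: str) -> str:
    """GROUNDED if the documented flag exists in the help text, else DRIFTED."""
    return "GROUNDED" if flag in flags_in_help(help_text) else "DRIFTED"


# Invented help output, loosely shaped like a click-based CLI.
SAMPLE_HELP = """\
Usage: apm <verb> [OPTIONS]

Options:
  -o, --output PATH  Write the report to PATH.
  --help             Show this message and exit.
"""

print(bucket_flag_claim("--output", SAMPLE_HELP))
print(bucket_flag_claim("--output-file", SAMPLE_HELP))
```

Matching whole option tokens rather than substrings matters here: a plain substring test would accept `--out` because it is a prefix of `--output`.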

**Conventions:**

- ASCII-only rule does NOT apply to `docs/src/content/docs/`
  (Starlight handles UTF-8).
- ASCII-only rule DOES apply to
  `packages/apm-guide/.apm/skills/apm-usage/` (cp1252 hostility).
- External-tool commands (`gh`, `codex`, `claude`) are
  UNVERIFIABLE from inside the apm repo -- mark them and move on.
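
The ASCII-only convention lends itself to a mechanical check before an edit is saved. The helper below is a minimal sketch under that assumption; `non_ascii_lines` is a hypothetical name and is not part of the skill's bundled scripts.

```python
def non_ascii_lines(text: str) -> list[tuple[int, str]]:
    """Return (1-based line number, offending characters) per bad line."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        # Anything above code point 127 falls outside ASCII.
        bad = "".join(sorted({ch for ch in line if ord(ch) > 127}))
        if bad:
            hits.append((lineno, bad))
    return hits


sample = "plain line\ncaf\u00e9 menu\nstill plain"
print(non_ascii_lines(sample))
```

An agent editing under `packages/apm-guide/.apm/skills/apm-usage/` could run this over its patched text and treat any hit as a reason to rephrase, while skipping the check for pages under `docs/src/content/docs/`.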

**Return JSON in this shape ONLY (no other prose, no markdown):**

```json
{
  "agent": "<scope-id>",
  "pages": [
    {
      "page": "<absolute path>",
      "claims_checked": 0,
      "grounded": 0,
      "drifted": [
        {"claim": "...", "evidence": "...", "fix": "..."}
      ],
      "unverifiable": [
        {"claim": "...", "reason": "..."}
      ],
      "edits_applied": 0,
      "verdict": "CLEAN|MINOR_DRIFT|MAJOR_DRIFT"
    }
  ],
  "summary": "...",
  "open_questions": []
}
```

Validation: this schema is mirrored in
`assets/panelist-return-schema.json`. The orchestrator validates
every return; malformed JSON is rejected and re-dispatched.
Lines changed: 31 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,31 @@
1+
# docs-corpus-audit evals
2+
3+
Two eval suites, both required for the ship gate.
4+
5+
## trigger-evals.json
6+
7+
20 queries (10 should-trigger + 10 should-NOT-trigger), 60/40 train/val
8+
split. The validation split is the ship gate:
9+
10+
- should-trigger val: >=0.5 must invoke `docs-corpus-audit`.
11+
- should-NOT-trigger val: <0.5 must invoke `docs-corpus-audit` (the
12+
rest route to `docs-sync`, `doc-writer`, or direct edit).
13+
14+
The boundary with `docs-sync` is the load-bearing distinction:
15+
PR-scope queries -> docs-sync; whole-corpus-scope queries -> here.
16+
17+
## content-evals.json
18+
19+
Three corpus-drift scenarios, each with seeded drift, expected
20+
behavior, and a control baseline (what the LLM does WITHOUT the
21+
skill loaded). The skill must produce a measurably different and
22+
better outcome on each scenario -- if with-skill and without-skill
23+
are indistinguishable, the skill adds no value and should be
24+
redesigned or deleted.
25+
26+
## How to run
27+
28+
These evals are descriptive at present (the run harness is a TODO).
29+
Until the harness lands, treat them as the operator checklist when
30+
authoring or modifying this skill: every change MUST be re-checked
31+
against the val splits manually.
