Skip to content

chore(commands): modernize slash command frontmatter (v1.2.3)#12

Merged
JonyanDunh merged 1 commit into
mainfrom
chore/slash-command-frontmatter-v1.2.3
Apr 11, 2026
Merged

chore(commands): modernize slash command frontmatter (v1.2.3)#12
JonyanDunh merged 1 commit into
mainfrom
chore/slash-command-frontmatter-v1.2.3

Conversation

@JonyanDunh
Copy link
Copy Markdown
Owner

Summary

Sweep all three slash command `.md` files under `commands/` so they use only documented frontmatter fields from the canonical Claude Code reference at https://code.claude.com/docs/en/skills.md#frontmatter-reference.

Changes

  • Removed hide-from-slash-command-tool from `commands/start.md` and `commands/stop.md`. This field is not documented anywhere in the official Claude Code reference — it was inherited from the upstream ralph-loop plugin as a legacy / undocumented convention. Claude Code silently ignores unknown fields so removing it is a no-op behaviorally, but it cleans up the file and removes the implication that we know something the docs don't.
  • Added `disable-model-invocation: true` to all three slash commands. This is the documented way to block programmatic invocation by the agent — ensures `/watchdog:start`, `/watchdog:stop`, and `/watchdog:help` are only triggered by a user typing them directly.
  • Added explicit `argument-hint` to `stop.md` and `help.md` (empty string for no-args). `start.md` already had one; tightened it to `"" [--max-iterations N]` to hint at the required double quotes around the prompt.

Threat model clarification

Watchdog was never designed to survive an adversarial agent deliberately trying to escape — only a confused or lazy agent. With just `disable-model-invocation: true`:

  • The agent cannot call `/watchdog:stop` programmatically even if it wanted to
  • The classifier subprocess lives in a separate claude process the agent has no handle on
  • The Stop hook fires regardless of any agent output; the agent cannot suppress or spoof it
  • Even if the agent deletes `.claude/watchdog.claudepid.*.local.json` to break the loop, that action shows up in the session transcript and is visible to the user — a self-owning "win"

So the level of protection provided by the removed undocumented field was zero in practice — it didn't buy any adversarial resistance, and the documented fields give all the protection the realistic threat model needs.

Scope

All changes confined to `commands/*.md`. Zero changes to `lib/`, `hooks/`, `scripts/`, or tests. Bumped `.claude-plugin/plugin.json` and `.claude-plugin/marketplace.json` to 1.2.3.

Test plan

  • 80-test suite passes locally (78 active + 2 skipped-inside-Claude-Code, 0 failures)
  • CI will confirm on 9 matrix jobs (ubuntu/macos/windows × Node 18/20/22)

Co-Authored-By: Claude Opus 4.6 (1M context) noreply@anthropic.com

Update all three slash command .md files (start, stop, help) to use
only documented frontmatter fields from the canonical Claude Code
reference at https://code.claude.com/docs/en/skills.md#frontmatter-reference.

## What changed

- Removed `hide-from-slash-command-tool` from start.md and stop.md.
  This field is NOT documented in the official Claude Code reference.
  It was inherited from the upstream ralph-loop plugin where it
  appears as a legacy / undocumented convention — Claude Code
  silently ignores unknown fields, so removing it has no behavioral
  downside. Confirmed absent from:
    - https://code.claude.com/docs/en/skills.md
    - https://code.claude.com/docs/en/plugins-reference.md
    - https://code.claude.com/docs/en/commands.md

- Added `disable-model-invocation: true` to all three commands.
  This IS the documented way to block programmatic invocation of a
  slash command by the agent — ensures /watchdog:start, /watchdog:stop,
  and /watchdog:help can only be triggered by the user typing them
  directly, never by the agent auto-invoking them via the Skill tool.

- Added explicit `argument-hint` to stop.md and help.md (empty string
  for no-args commands, so autocomplete shows the command takes no
  arguments). start.md already had `argument-hint` from earlier,
  tightened to "\"<PROMPT>\" [--max-iterations N]" which hints at
  the required double quotes around the prompt.

## Threat model clarification

Even without any "hide from agent" mechanism, the watchdog plugin is
robust against a *confused or lazy* agent — which is the realistic
threat model for a self-referential loop:

- `disable-model-invocation: true` means the agent cannot call
  /watchdog:stop programmatically even if it wanted to
- The classifier subprocess lives in a separate claude process
  the agent has no handle on
- The Stop hook fires regardless of any output the agent produces;
  the agent cannot suppress or spoof it
- Even if the agent deletes `.claude/watchdog.claudepid.*.local.json`
  to break the loop, that action shows up in the session transcript
  and is visible to the user — a self-owning "win"

Watchdog is not designed to survive an *adversarial* agent that is
deliberately trying to escape a sandbox; it is designed to keep a
task-following agent honest about whether it has actually done
the work.

## No code changes

All changes are confined to the three .md files under `commands/`.
Zero changes to lib/, hooks/, scripts/, or tests. The 80-test suite
still passes unchanged (78 active + 2 skipped-inside-Claude-Code,
0 failures).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@JonyanDunh JonyanDunh merged commit cc2d6d0 into main Apr 11, 2026
11 checks passed
@JonyanDunh JonyanDunh deleted the chore/slash-command-frontmatter-v1.2.3 branch April 11, 2026 02:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant