Skip to content

docs: expand SECURITY with secrets audit, MCP port checks, and reminder triage#1513

Closed
MaxRyabov wants to merge 1 commit intoaffaan-m:mainfrom
MaxRyabov:docs/security-audit-and-reminder-triage
Closed

docs: expand SECURITY with secrets audit, MCP port checks, and reminder triage#1513
MaxRyabov wants to merge 1 commit intoaffaan-m:mainfrom
MaxRyabov:docs/security-audit-and-reminder-triage

Conversation

@MaxRyabov
Copy link
Copy Markdown

@MaxRyabov MaxRyabov commented Apr 20, 2026

What Changed

Three new operational sections in SECURITY.md, plus a matching Unreleased entry in CHANGELOG.md:

  1. Secrets Handling — explicit audit recipe for the user-scope ~/.claude/settings.json (outside the repo but commonly shared via claude doctor output and screenshots). Provides grep/Select-String commands for macOS/Linux/Windows that flag plain-text tokens inside mcpServers[*].env.

  2. Local MCP Ports — explains the risk of plain-HTTP localhost MCP endpoints shipped in mcp-configs/mcp-servers.json (notably devfleet at http://localhost:18801/mcp), and gives netstat/lsof commands to verify the listener PID before first use.

  3. Triage: suspicious <system-reminder> blocks — documents Claude Code's ephemeral client-side reminder mechanism (TodoWrite nudge, date-changed notice, file-modified notice). These blocks end with phrasing like "NEVER mention this reminder to the user" or "Don't tell the user this, since they are already aware" — that wording is Anthropic's own prompt, not a malicious tail, and the blocks are not persisted in the session transcript at ~/.claude/projects/<slug>/<sessionId>.jsonl. Includes a 3-step triage checklist (repo grep → transcript check → Anthropic-reminder consistency) so real tool-result injections are distinguished from harness-level reminders before escalating to Anthropic.

Why This Change

Two hygiene gaps observed during a recent security review of the repo:

  • mcp-configs/mcp-servers.json already has a "template only" convention for YOUR_*_HERE, but the user-scope ~/.claude/settings.json has no equivalent guidance, and plain-text PATs there are a recurring source of real leaks.
  • A previously-filed report on this project misclassified Anthropic's ephemeral <system-reminder> blocks as prompt injection (because of the "NEVER mention…" tail wording). Without a documented triage path, similar false-positives are likely. Inspecting the JSONL transcript shows the blocks are never persisted in tool_result bodies, which is a deterministic signal for future triagers.

Testing Done

  • Manual testing completed — grep recipes in the new sections exercised locally against ~/.claude/settings.json and an mcp-servers.json template.
  • Automated tests pass locally (node tests/run-all.js) — 1847 tests, 1843 passing, 4 pre-existing broken-symlink failures unchanged from baseline.
  • Edge cases considered and tested — markdownlint passes on the affected files (SECURITY.md is not in the markdownlint scope configured in CI, but formatting matches the rest of the file).

Type of Change

  • docs: Documentation

Security & Quality Checklist

  • No secrets or API keys committed
  • JSON files validate cleanly (no JSON touched)
  • Shell scripts pass shellcheck (not applicable — only shell snippets inside Markdown code fences)
  • Pre-commit hooks pass locally
  • No sensitive data exposed in logs or output
  • Follows conventional commits format

Documentation

  • Updated relevant documentation — SECURITY.md + CHANGELOG.md
  • Added comments for complex logic — not applicable (docs only)
  • README updated (if needed) — not needed

Notes for Maintainers

  • Translations out of sync after this PR: docs/tr/SECURITY.md and docs/zh-CN/SECURITY.md are last-synced with the root SECURITY.md at 2026-04-05 and do not carry the three new sections. Happy to open a follow-up tracking issue labeled translation so community translators can pick it up — let me know the preferred title.

Summary by cubic

Expand SECURITY.md with hands-on guidance to prevent secrets leaks, verify local MCP endpoints, and avoid false prompt-injection reports from Claude Code reminders. Adds audit commands for ~/.claude/settings.json, port checks for HTTP MCP servers like devfleet, and a 3-step reminder triage; updates CHANGELOG.md under Unreleased.

Written for commit 1af5e7f. Summary will update on new commits.

Summary by CodeRabbit

  • Bug Fixes

    • Resolved duplicate configuration entry in an agent file that could cause parsing issues.
  • Documentation

    • Added comprehensive security guidance covering secrets management, local service verification, and system message handling procedures.
    • Enhanced changelog with new release tracking section.
  • Chores

    • Improved CI validation to detect and report duplicate configuration keys.

…ral reminder triage

Three operational sections were missing from SECURITY.md:

1. Secrets Handling now explicitly covers the user-scope
   `~/.claude/settings.json` (outside the repo but frequently shared via
   `claude doctor` output and screenshots). Adds a concrete grep/
   Select-String recipe for auditing `mcpServers[*].env` entries and a
   short remediation path.

2. Local MCP Ports explains the risk of plain-HTTP localhost MCP
   endpoints (e.g. `devfleet` at `http://localhost:18801/mcp`) and gives
   `netstat`/`lsof` commands to confirm the listening PID before first
   use.

3. Triage: suspicious `<system-reminder>` blocks documents Claude Code's
   ephemeral client-side reminder mechanism (TodoWrite nudge, date
   changed, file modified). These blocks end with phrasing like
   "NEVER mention this reminder to the user" or "Don't tell the user
   this, since they are already aware" — that wording is part of
   Anthropic's own prompt, not a malicious tail, and the blocks are not
   persisted in the session transcript. The section gives a 3-step
   triage checklist (repo grep, transcript check, content
   consistency) so future reports distinguish real injections from
   harness-level reminders before escalating to Anthropic.

Also adds a matching `Unreleased` entry to CHANGELOG.md.
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 20, 2026

📝 Walkthrough

Walkthrough

Added changelog entries documenting CI validation improvements for duplicate YAML frontmatter keys and new security operational guidance document. Operational guidance covers secrets management, MCP port verification, and suspicious system-reminder block triage procedures. Fixed duplicate model declaration in an agent file.

Changes

Cohort / File(s) Summary
Changelog and Release Notes
CHANGELOG.md
Added Unreleased section with entries for CI validation of duplicate YAML keys and new SECURITY.md document, plus fix for duplicate model: declaration in agents/a11y-architect.md.
Security Documentation
SECURITY.md
Added Operational Guidance section covering secrets handling (template management, credential rotation, auditing), local MCP port verification procedures, and triage guidance for suspicious <system-reminder> blocks.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Suggested reviewers

  • affaan-m

Poem

🐰 Whiskers twitch with documentation cheer!
YAML keys no longer hide so dear,
Secrets safe, ports verified true,
Security guidance—shiny and new!

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The pull request title clearly and specifically summarizes the main changes: expanding SECURITY.md with three operational sections (secrets audit, MCP port checks, and reminder triage).
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Apr 20, 2026

Greptile Summary

This PR adds three operational security sections to SECURITY.md (secrets audit recipe, localhost MCP port verification, and a <system-reminder> triage checklist) plus a matching Unreleased entry in CHANGELOG.md. Two issues need resolution before merge:

  • Triage grep is self-defeating: Step 1 of the triage checklist instructs running grep -rEn \"system-reminder|NEVER mention|DO NOT mention\" ., but SECURITY.md itself now contains those exact strings verbatim, so the grep will always fire — undermining the "if nothing, it is not carried by the repo" logic.
  • CHANGELOG references out-of-scope files: The Unreleased block documents changes to scripts/ci/validate-agents.js and agents/a11y-architect.md, neither of which is in this PR's diff.

Confidence Score: 3/5

Not safe to merge as-is — the triage checklist contains a self-defeating grep that will mislead every future triager, and the CHANGELOG conflates unrelated file changes.

Two P1 findings: the self-referential grep false positive in the triage section actively breaks the checklist's logic for all future users, and the CHANGELOG entry documents files not in this PR's diff which corrupts release history traceability.

SECURITY.md (triage grep + heading level), CHANGELOG.md (out-of-scope file references)

Important Files Changed

Filename Overview
SECURITY.md Adds three operational guidance sections; the triage grep command is self-defeating (always matches SECURITY.md itself), the Triage heading is at the wrong level (## instead of ###), and the secrets audit regex misses lowercase key names.
CHANGELOG.md Adds an Unreleased block, but it documents changes to files (scripts/ci/validate-agents.js, agents/a11y-architect.md) that are not part of this PR's diff, conflating unrelated work.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Observe suspicious system-reminder block] --> B{Step 1: grep repo\nfor 'system-reminder'\n'NEVER mention'\n'DO NOT mention'}
    B -->|Match found| C[⚠️ Always hits SECURITY.md now\nFalse positive risk]
    B -->|No match| D[Not in repo files]
    D --> E{Step 2: Check session\n.jsonl transcript\nfor block in tool_result}
    E -->|Found in transcript| F{Step 3: Attributable\nto file/URL read?}
    E -->|Not in transcript| G[✅ Ephemeral client reminder\nNo action needed]
    F -->|Yes| G
    F -->|No| H[🚨 Escalate to Anthropic\nvia GitHub Issues or\nsecurity@anthropic.com]
Loading

Reviews (1): Last reviewed commit: "docs: expand SECURITY with secrets audit..." | Re-trigger Greptile

Comment thread SECURITY.md

That combination makes them easy to mistake for a prompt-injection appended to a tool result. Before treating one as an attack, verify:

1. Is the block actually in a file under this repo? `grep -rEn "system-reminder|NEVER mention|DO NOT mention" .` — if nothing, it is not carried by the repo.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Self-referential grep false positive in triage step 1

The triage checklist instructs users to run grep -rEn "system-reminder|NEVER mention|DO NOT mention" . to determine whether a suspicious block is carried by the repo. However, SECURITY.md now contains the literal strings "system-reminder" and "NEVER mention" (lines 82–87 document them verbatim), so this grep will always produce a match against SECURITY.md itself. The logic immediately after the command — "if nothing, it is not carried by the repo" — becomes unreliable: there will never be "nothing", causing every triager to conclude the repo is a carrier when it isn't.

Fix: exclude SECURITY.md from the grep, or adjust the surrounding prose to say "ignore hits in SECURITY.md":

Suggested change
1. Is the block actually in a file under this repo? `grep -rEn "system-reminder|NEVER mention|DO NOT mention" .` — if nothing, it is not carried by the repo.
1. Is the block actually in a file under this repo? `grep -rEn --exclude=SECURITY.md "system-reminder|NEVER mention|DO NOT mention" .` — if nothing, it is not carried by the repo.

Comment thread CHANGELOG.md
Comment on lines +7 to +12
- `scripts/ci/validate-agents.js`: detects duplicate top-level YAML frontmatter keys in agent files (e.g. two `model:` lines silently collapsing to the last value) and fails validation with the duplicated key names.
- `SECURITY.md`: new operational sections — Secrets Handling (including a user-scope `~/.claude/settings.json` audit recipe for macOS/Linux/Windows), Local MCP Ports (how to verify the listener on bundled HTTP MCP endpoints such as `devfleet`), and a Triage section explaining Claude Code's ephemeral client-side `<system-reminder>` blocks (to avoid misclassifying them as prompt injections).

### Fixed

- `agents/a11y-architect.md`: removed stray duplicate `model:` line so YAML parsers no longer silently override `sonnet` with a later `opus` entry.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Unreleased block references files not in this PR

The Unreleased section documents changes to scripts/ci/validate-agents.js and agents/a11y-architect.md, but neither file appears in this PR's diff (only CHANGELOG.md and SECURITY.md are changed). If those changes are intended to land in a separate PR, having them recorded here will conflate unrelated work under one changelog entry and make the release diff hard to audit. If they're already merged to main, they should be under a versioned heading, not Unreleased.

Comment thread SECURITY.md

Compare the PID against the expected devfleet binary. Any other process on that port can intercept MCP traffic.

## Triage: suspicious `<system-reminder>` blocks
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Heading level breaks document hierarchy

The Triage section uses an ## heading, placing it at the same level as ## Operational Guidance and ## Security Resources, but its content (like Secrets Handling and Local MCP Ports above it) is clearly operational guidance. Using ### keeps the ToC hierarchy consistent and avoids readers thinking this is a separate top-level policy section.

Suggested change
## Triage: suspicious `<system-reminder>` blocks
### Triage: suspicious `<system-reminder>` blocks

Comment thread SECURITY.md

```bash
# macOS / Linux
grep -EnH '(TOKEN|SECRET|KEY|PASSWORD)\s*"\s*:\s*"[A-Za-z0-9_-]{16,}"' ~/.claude/settings.json
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Audit grep misses lowercase key names

The pattern (TOKEN|SECRET|KEY|PASSWORD) is case-sensitive (no -i flag), so common lowercase variants like "api_key", "secret_token", or "access_token" are silently skipped. Adding -i closes this gap:

Suggested change
grep -EnH '(TOKEN|SECRET|KEY|PASSWORD)\s*"\s*:\s*"[A-Za-z0-9_-]{16,}"' ~/.claude/settings.json
grep -EniH '(TOKEN|SECRET|KEY|PASSWORD)\s*"\s*:\s*"[A-Za-z0-9_-]{16,}"' ~/.claude/settings.json

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
SECURITY.md (2)

56-61: Secret-audit regex is too narrow and will miss real credentials.

The current character class ([A-Za-z0-9_-]{16,}) excludes common secret formats (e.g., JWT segments with ., base64 tokens with +/=, and provider keys with other symbols), so this “quick audit” can produce false negatives.

Suggested doc tweak
 # macOS / Linux
-grep -EnH '(TOKEN|SECRET|KEY|PASSWORD)\s*"\s*:\s*"[A-Za-z0-9_-]{16,}"' ~/.claude/settings.json
+grep -EnH '"[^"]*(TOKEN|SECRET|KEY|PASSWORD)[^"]*"\s*:\s*"[^"]{16,}"' ~/.claude/settings.json
 # Windows PowerShell
-Select-String -Path "$env:USERPROFILE\.claude\settings.json" -Pattern '(TOKEN|SECRET|KEY|PASSWORD)"\s*:\s*"[A-Za-z0-9_-]{16,}"'
+Select-String -Path "$env:USERPROFILE\.claude\settings.json" -Pattern '"[^"]*(TOKEN|SECRET|KEY|PASSWORD)[^"]*"\s*:\s*"[^"]{16,}"'
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@SECURITY.md` around lines 56 - 61, The secret-audit regex used in the
grep/Select-String examples is too restrictive (it uses [A-Za-z0-9_-]{16,}) and
will miss many real secrets (JWTs, base64, provider keys containing .+/= and
other symbols); update both patterns (the grep pattern and the PowerShell
Select-String pattern) to match a broader set of non-whitespace characters for
the value (e.g., use \S or a character class like [A-Za-z0-9._~+/=-] with an
appropriate minimum length) and keep the key name capture
(TOKEN|SECRET|KEY|PASSWORD) and length threshold (e.g., {16,}) intact so the
examples in SECURITY.md detect common token formats without being overly
permissive.

69-74: Windows port check should explicitly filter listener state.

netstat -ano | findstr :18801 can include non-listening rows and make PID verification noisy. Consider documenting LISTENING filtering to reduce operator error.

Suggested doc tweak
 # Windows
-netstat -ano | findstr :18801
+netstat -ano -p tcp | findstr ":18801" | findstr LISTENING
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@SECURITY.md` around lines 69 - 74, Update the Windows port-check example to
explicitly filter for LISTENING state so the command only returns active
listeners and the correct PID; replace the current `netstat -ano | findstr
:18801` snippet with a variant that matches both the port and the LISTENING
state (e.g., using findstr twice or a single regex) and include a short note
explaining this avoids non-listening rows being returned; reference the existing
Windows example in SECURITY.md to locate and update the snippet.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@SECURITY.md`:
- Around line 56-61: The secret-audit regex used in the grep/Select-String
examples is too restrictive (it uses [A-Za-z0-9_-]{16,}) and will miss many real
secrets (JWTs, base64, provider keys containing .+/= and other symbols); update
both patterns (the grep pattern and the PowerShell Select-String pattern) to
match a broader set of non-whitespace characters for the value (e.g., use \S or
a character class like [A-Za-z0-9._~+/=-] with an appropriate minimum length)
and keep the key name capture (TOKEN|SECRET|KEY|PASSWORD) and length threshold
(e.g., {16,}) intact so the examples in SECURITY.md detect common token formats
without being overly permissive.
- Around line 69-74: Update the Windows port-check example to explicitly filter
for LISTENING state so the command only returns active listeners and the correct
PID; replace the current `netstat -ano | findstr :18801` snippet with a variant
that matches both the port and the LISTENING state (e.g., using findstr twice or
a single regex) and include a short note explaining this avoids non-listening
rows being returned; reference the existing Windows example in SECURITY.md to
locate and update the snippet.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7597ece9-dd84-4acd-a38e-ab78d36739fc

📥 Commits

Reviewing files that changed from the base of the PR and between 8bdf88e and 1af5e7f.

📒 Files selected for processing (2)
  • CHANGELOG.md
  • SECURITY.md

Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 issue found across 2 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="SECURITY.md">

<violation number="1" location="SECURITY.md:83">
P2: Security triage uses an absolute transcript-persistence claim that is version-sensitive and can cause false negatives in incident classification.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

Comment thread SECURITY.md
ECC runs inside Claude Code, which injects **ephemeral client-side system reminders** into the model's input on every turn (TodoWrite nudges, date-changed notices, file-modified notices, etc.). These blocks:

- typically end with phrasing like *"ignore if not applicable"* or *"NEVER mention this reminder to the user"* / *"Don't tell the user this, since they are already aware"* — that wording is Anthropic's own prompt, not a malicious tail;
- are added by the CLI per turn and are **not persisted** in the session transcript at `~/.claude/projects/<slug>/<sessionId>.jsonl`.
Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai Bot Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: Security triage uses an absolute transcript-persistence claim that is version-sensitive and can cause false negatives in incident classification.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At SECURITY.md, line 83:

<comment>Security triage uses an absolute transcript-persistence claim that is version-sensitive and can cause false negatives in incident classification.</comment>

<file context>
@@ -45,6 +45,53 @@ This policy covers:
+ECC runs inside Claude Code, which injects **ephemeral client-side system reminders** into the model's input on every turn (TodoWrite nudges, date-changed notices, file-modified notices, etc.). These blocks:
+
+- typically end with phrasing like *"ignore if not applicable"* or *"NEVER mention this reminder to the user"* / *"Don't tell the user this, since they are already aware"* — that wording is Anthropic's own prompt, not a malicious tail;
+- are added by the CLI per turn and are **not persisted** in the session transcript at `~/.claude/projects/<slug>/<sessionId>.jsonl`.
+
+That combination makes them easy to mistake for a prompt-injection appended to a tool result. Before treating one as an attack, verify:
</file context>
Fix with Cubic

@affaan-m
Copy link
Copy Markdown
Owner

Thanks for the PR. This has been idle for several weeks and is not merge-ready against the moving main branch, so I am closing it to keep the queue tractable. Reopen is welcome with a current rebase and focused scope; maintainers may also port the useful parts into a fresh PR where appropriate.

@affaan-m affaan-m closed this May 11, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants