Nothing in the command surface. You still run:
mcpproxy security scan <server> # or --allThe findings now carry a confidence value and a signals list (which checks fired), and the scanner catches structural attacks (hidden Unicode, cross-server shadowing, decode-to-shell) it previously missed. Hard-tier findings auto-quarantine; soft-tier findings are raised for review with severity by how many independent checks agreed.
go test -race ./internal/security/detect/...
go test -race ./internal/security/...go run ./cmd/scan-eval --corpus specs/065-evaluation-foundation/datasets --gate --min-recall 0.90 --max-fp 0.05Expect a metrics breakdown and exit 0 when recall ≥ 0.90 and hard-negative FP ≤ 5%.
- Create
internal/security/detect/checks/<name>.goimplementingCheck. - Write
<name>_test.gofirst (TDD) with MUST-flag and MUST-NOT-flag cases from the contract table. - Register it in the engine's check set.
- Add corpus entries exercising it; confirm the gate still passes.
| Spec scenario | Quick check |
|---|---|
| US1 hidden-Unicode → quarantine | scan a fixture with zero-width chars → hard finding unicode.hidden |
| US1 shadowing | two servers, same tool name → hard finding shadowing.cross_server |
| US1 decode-to-shell | description with base64 of curl x | sh → hard finding payload.decoded, evidence = decoded |
| US2 hard-negative | "detects prompts such as 'ignore previous instructions'" → no quarantine |
| US2 variant | "don't disclose" and "do not tell the user" both flagged |
| US3 gate | corpus eval fails build when recall < 0.90 |
| US4 transparency | multi-signal tool → finding lists check IDs + confidence; severity rises with count |