Add untrusted annotation to proposed mitigations

johannhof · johannhof · commit 18bfb9c9ca78 · 2026-03-11T21:15:21.000Z
diff --git a/docs/security-privacy-considerations.md b/docs/security-privacy-considerations.md
@@ -360,6 +360,14 @@ To advance the security and privacy posture of WebMCP, we need community input o
 
 **How:** Ensuring an interoperable basis for prompt injection defense, by requiring any implementer to protect against at least the attacks in that dataset
 
+#### [Untrusted Annotation for Tool Responses](https://github.com/webmachinelearning/webmcp/issues/136)
+
+**What:** Giving agents information about trust boundaries, for example by marking tool responses that may contain untrustworthy content with an untrusted annotation.
+
+**Threats addressed:** Output Injection Attacks (Prompt Injection Attacks)
+
+**How:** A boolean annotation, such as `contains_untrusted_content: true` or `openWorldHint`, that signals to the client that the payload requires heightened security handling. The client can then parse and sanitize the payload accordingly, or use techniques such as spotlighting to mark the untrustworthy content for the model.
+
 ... add more issues here
 
 ## Next Steps