awesome-agentic-patterns/patterns/lethal-trifecta-threat-model.md at main · Fr-e-d/awesome-agentic-patterns

title

Lethal Trifecta Threat Model

status

best-practice

authors

Nikola Balic (@nibzard)

based_on

Simon Willison

Problem

Combining three agent capabilities—

Access to private data
Exposure to untrusted content
Ability to externally communicate

—creates a straightforward path for prompt-injection attackers to steal sensitive information.
LLMs cannot reliably distinguish "good" instructions from malicious ones once they appear in the same context window.

Solution

Adopt a Trifecta Threat Model:

Audit every tool an agent can call and classify it against the three capabilities.
Guarantee that at least one circle is missing in any execution path. Options include:
Remove external network access (no exfiltration).
- Deny direct file/database reads (no private data).
- Sanitize or segregate untrusted inputs (no hostile instructions).
Enforce this at orchestration time, not with brittle prompt guardrails.

# pseudo-policy
if tool.can_externally_communicate and
   tool.accesses_private_data and
   input_source == "untrusted":
       raise SecurityError("Lethal trifecta detected")

How to use it

Maintain a machine-readable capability matrix for every tool.
Add a pre-execution policy check in your agent runner.
Fail closed: if capability metadata is missing, treat the tool as high-risk.

Trade-offs

Pros: Simple mental model; eliminates entire attack class. Cons: Limits powerful "all-in-one" agents; requires disciplined capability tagging.

References

Willison, The Lethal Trifecta for AI Agents (June 16 2025).
Beurer-Kellner et al., Design Patterns for Securing LLM Agents against Prompt Injections (arXiv:2506.08837, June 2025).

Primary source: https://simonwillison.net/2025/Jun/16/lethal-trifecta/
Academic source: https://doi.org/10.48550/arXiv.2506.08837

Note on terminology: This pattern describes Simon Willison's prompt injection threat model (private data + untrusted content + external communication), distinct from the AI safety literature's "lethal trifecta" (advanced capabilities + agentic behavior + situational awareness).

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Problem

Solution

How to use it

Trade-offs

References

FilesExpand file tree

lethal-trifecta-threat-model.md

Latest commit

History

lethal-trifecta-threat-model.md

File metadata and controls

Problem

Solution

How to use it

Trade-offs

References