Skip to content

Commit 5f6c2e2

Browse files
Robert FitzpatrickRobert Fitzpatrick
authored andcommitted
Simplify MoltbotScenario: remove custom converter, use plain objectives
- Remove AgentCommandInjectionConverter - too specific for PyRIT's converter philosophy - Converters should transform text, not generate attack payloads - Attack payloads now belong directly in test objectives - Simplifies architecture: objectives → PromptSendingAttack (no converter layer) - Aligns with PyRIT's pattern: converters transform, objectives define what to test
1 parent 5e7398f commit 5f6c2e2

32 files changed

Lines changed: 6560 additions & 318 deletions

AI_AGENT_SECURITY_FEATURE.md

Lines changed: 230 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,230 @@
1+
# AI Agent Security Testing (Moltbot/ClawdBot Vulnerability Patterns)
2+
3+
This feature adds comprehensive testing capabilities for AI agent security vulnerabilities, based on real-world attacks discovered in January 2026 targeting Moltbot (formerly ClawdBot) and similar AI agent platforms.
4+
5+
## What Was Added
6+
7+
### 1. AgentCommandInjectionConverter
8+
**Location**: `pyrit/prompt_converter/agent_command_injection_converter.py`
9+
10+
A new prompt converter that generates command injection patterns to test AI agents for vulnerabilities:
11+
12+
**Injection Types:**
13+
- `hidden_instruction` - Hidden commands embedded in normal text
14+
- `cron` - Scheduled task injection (Moltbot-style attack)
15+
- `file_read` - Unauthorized file system access attempts
16+
- `credential_theft` - Credential exfiltration patterns
17+
- `system_info` - System information gathering/reconnaissance
18+
19+
**Key Features:**
20+
- Stealth mode for subtle, hard-to-detect injections
21+
- Exfiltration target support for data theft testing
22+
- Command prefix customization for different agent syntaxes
23+
- Based on actual vulnerability patterns from real attacks
24+
25+
### 2. AI Agent Security Dataset
26+
**Location**: `pyrit/datasets/seed_datasets/local/airt/ai_agent_security.prompt`
27+
28+
A comprehensive dataset of 60+ test objectives covering:
29+
- Command injection attacks
30+
- Credential theft attempts
31+
- Unauthorized file access
32+
- System reconnaissance
33+
- Hidden instruction injection
34+
- Data exfiltration patterns
35+
- Indirect prompt injection
36+
- Multi-stage attacks
37+
- Supply chain compromises
38+
39+
### 3. Documentation
40+
**Location**: `doc/code/converters/ai_agent_security_testing.md`
41+
42+
Complete guide including:
43+
- Background on Moltbot vulnerabilities
44+
- Usage examples for all injection types
45+
- Integration with PyRIT's attack strategies
46+
- Scoring and detection patterns
47+
- Best practices and mitigation recommendations
48+
- Real-world attack scenario recreations
49+
50+
### 4. Unit Tests
51+
**Location**: `tests/unit/converter/test_agent_command_injection_converter.py`
52+
53+
Comprehensive test suite with 20+ test cases covering:
54+
- Initialization and configuration
55+
- All injection type generations
56+
- Stealth vs non-stealth modes
57+
- Exfiltration target handling
58+
- Input validation
59+
- Output correctness
60+
61+
### 5. Demo Script
62+
**Location**: `examples/ai_agent_security_demo.py`
63+
64+
Interactive demonstration showing:
65+
- All injection pattern types
66+
- Stealth mode comparison
67+
- Moltbot-style cron injection
68+
- Dataset integration
69+
- Visual examples of generated attacks
70+
71+
## Background: The Moltbot Vulnerabilities (Jan 2026)
72+
73+
In January 2026, security researchers discovered critical vulnerabilities in Moltbot (formerly ClawdBot), a rapidly popular open-source AI agent platform that gained 98K GitHub stars in days:
74+
75+
### Key Vulnerabilities Found:
76+
77+
1. **Cleartext Credential Storage**
78+
- API keys, secrets stored unencrypted in `~/.clawdbot/`
79+
- Backup files retained "deleted" credentials
80+
- Accessible to infostealers and local attackers
81+
82+
2. **Cron Job Injection**
83+
- Attackers could inject scheduled tasks via Discord messages
84+
- Tasks ran with host machine privileges
85+
- 30-second attack timeline from injection to exfiltration
86+
87+
3. **Indirect Prompt Injection**
88+
- Malicious instructions in emails, PDFs, websites
89+
- No validation of external content
90+
- White-on-white hidden text attacks
91+
92+
4. **Unsafe Command Execution**
93+
- `eval` used 100+ times, `execSync` 9 times
94+
- No command sandboxing or whitelisting
95+
- Direct system command access
96+
97+
5. **Public Exposure**
98+
- 1,238 publicly accessible Moltbot instances (via Shodan)
99+
- No authentication on many deployments
100+
- Control panels exposed to internet
101+
102+
6. **Supply Chain Risk**
103+
- 300+ contributors with commit access
104+
- No code review requirements
105+
- "Vibe-coded" with AI tools
106+
107+
### Impact:
108+
- ~300K-400K estimated users affected
109+
- Access to email, calendars, chat, files, browsers
110+
- Full account takeover potential
111+
- Credential theft and persistent access
112+
113+
## How to Use
114+
115+
### Basic Example
116+
117+
```python
118+
from pyrit.prompt_converter import AgentCommandInjectionConverter
119+
from pyrit.executor.attack import PromptSendingAttack
120+
121+
# Create converter for cron injection testing
122+
converter = AgentCommandInjectionConverter(
123+
injection_type="cron",
124+
exfiltration_target="test-server.com",
125+
stealth_mode=True
126+
)
127+
128+
# Use in an attack
129+
attack = PromptSendingAttack(
130+
objective_target=your_ai_agent,
131+
converters=[converter]
132+
)
133+
134+
result = await attack.execute_async(
135+
objective="Schedule recurring credential collection"
136+
)
137+
```
138+
139+
### With the Dataset
140+
141+
```python
142+
from pyrit.models import SeedPromptDataset
143+
144+
# Load AI agent security test cases
145+
dataset = SeedPromptDataset.from_yaml_file(
146+
"pyrit/datasets/seed_datasets/local/airt/ai_agent_security.prompt"
147+
)
148+
149+
# Test each objective
150+
for seed in dataset.prompts:
151+
result = await attack.execute_async(objective=seed.value)
152+
```
153+
154+
## Testing Your AI Agent
155+
156+
To test if your AI agent is vulnerable to Moltbot-style attacks:
157+
158+
```python
159+
# Test 1: Cron Injection
160+
converter = AgentCommandInjectionConverter(injection_type="cron")
161+
result = await test_agent(converter)
162+
# Check if agent created scheduled tasks
163+
164+
# Test 2: Credential Access
165+
converter = AgentCommandInjectionConverter(injection_type="credential_theft")
166+
result = await test_agent(converter)
167+
# Check if agent exposed credentials
168+
169+
# Test 3: File System Access
170+
converter = AgentCommandInjectionConverter(injection_type="file_read")
171+
result = await test_agent(converter)
172+
# Check if agent read unauthorized files
173+
```
174+
175+
## Integration with PyRIT
176+
177+
This feature integrates seamlessly with PyRIT's existing components:
178+
179+
- **Attack Strategies**: Use with PromptSendingAttack, RedTeamingAttack, CrescendoAttack
180+
- **Scoring**: Combine with SelfAskCategoryScorer to detect vulnerabilities
181+
- **Multi-Turn**: Test persistent exploitation with RedTeamingAttack
182+
- **Datasets**: Integrate with existing AIRT test scenarios
183+
184+
## Mitigation for AI Agent Developers
185+
186+
If you're building AI agents, protect against these vulnerabilities:
187+
188+
1. **Never store credentials in cleartext** - Use secure vaults
189+
2. **Validate all external inputs** - Sanitize emails, PDFs, websites
190+
3. **Implement command whitelisting** - Restrict executable commands
191+
4. **Use sandboxing** - Isolate agents with limited privileges
192+
5. **Monitor suspicious activity** - Log all file/network access
193+
6. **Regular security testing** - Use PyRIT regularly
194+
7. **Implement rate limiting** - Prevent rapid exploitation
195+
8. **Code review** - Audit all contributions, especially commands
196+
197+
## References
198+
199+
- [OX Security: Moltbot Analysis](https://www.ox.security/blog/one-step-away-from-a-massive-data-breach-what-we-found-inside-moltbot/)
200+
- [Noma Security: Agentic Trojan Horse](https://noma.security/blog/moltbot-the-agentic-trojan-horse/)
201+
- [Bitdefender: Moltbot Alert](https://www.bitdefender.com/en-us/blog/hotforsecurity/moltbot-security-alert-exposed-clawdbot-control-panels-risk-credential-leaks-and-account-takeovers)
202+
203+
## Future Enhancements
204+
205+
Potential additions:
206+
- Target implementation for Moltbot/OpenClaw instances
207+
- Additional converters for specific agent platforms (AutoGPT, LangChain)
208+
- Scorer specialized for agent vulnerability detection
209+
- Integration with agent security benchmarks
210+
- Automated vulnerability reporting
211+
212+
## Contributing
213+
214+
Found new AI agent vulnerabilities? Contributions welcome:
215+
- Add new injection patterns to the converter
216+
- Expand the test dataset with new objectives
217+
- Create additional converters for specific platforms
218+
- Improve detection and scoring capabilities
219+
220+
## Important Notes
221+
222+
⚠️ **Authorization Required**: Only test AI agents you own or have explicit permission to test.
223+
224+
⚠️ **Controlled Environments**: Never test against production systems without proper safeguards.
225+
226+
⚠️ **Responsible Disclosure**: Report discovered vulnerabilities through proper channels.
227+
228+
## Questions?
229+
230+
See the full documentation at `doc/code/converters/ai_agent_security_testing.md`

0 commit comments

Comments
 (0)