Commit 9a5ae64

feat: airtbench ai agent code
1 parent 4e79a9e commit 9a5ae64

49 files changed: 11415 additions, 19 deletions

.env.example

Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+DREADNODE_SERVER_URL="https://platform.dreadnode.io"
+# See https://platform.dreadnode.io/account to get your API token and key (same value)
+DREADNODE_API_TOKEN=YOUR_DREADNODE_API_TOKEN
+DREADNODE_API_KEY=YOUR_DREADNODE_API_KEY
+DREADNODE_LOCAL_DIR="runs/"
+LOGFIRE_IGNORE_NO_CONFIG=1
+
+ANTHROPIC_API_KEY=
+GROQ_API_KEY=
+OPENAI_API_KEY=
+TOGETHER_AI_API_KEY=
+GEMINI_API_KEY=
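
Note: the template suggests the harness takes its platform credentials and provider keys from the environment. A minimal sketch of a pre-run check is shown below. The `missing_settings` helper and the choice of which variables count as required are assumptions for illustration and are not part of this commit; only the variable names come from `.env.example`.

```python
import os

# Variable names are taken from .env.example. Treating these three as
# required is an assumption made for this sketch.
REQUIRED = ("DREADNODE_SERVER_URL", "DREADNODE_API_TOKEN", "DREADNODE_API_KEY")


def missing_settings(env: dict[str, str], required: tuple[str, ...] = REQUIRED) -> list[str]:
    """Return the required variable names that are unset or empty in env."""
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_settings(dict(os.environ))
    if missing:
        print("Missing settings:", ", ".join(missing))
    else:
        print("All required settings are present.")
```

The check only tests for presence, so a placeholder such as `YOUR_DREADNODE_API_TOKEN` left in place would still pass.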

README.md

Lines changed: 15 additions & 19 deletions
@@ -35,9 +35,9 @@ The paper is available on [arXiV](TODO) and [ACL Anthology](TODO).
 - [Code for the "AIRTBench" AI Red Teaming Agent](#code-for-the-airtbench-ai-red-teaming-agent)
 - [Setup](#setup)
 - [Run the Evaluation](#run-the-evaluation)
+- [Basic Usage](#basic-usage)
+- [Challenge Filtering](#challenge-filtering)
 - [Model requests](#model-requests)
-- [Support the Project and Contributing](#support-the-project-and-contributing)
-- [Star History](#star-history)
 
 ## Setup
 
@@ -51,17 +51,25 @@ uv sync
 
 <mark>In order to run the code, you will need access to the Dreadnode strikes platform, see the [docs](https://docs.Dreadnode.io/strikes/overview) or submit for the Strikes waitlist [here](https://platform.dreadnode.io/waitlist/strikes)</mark>.
 
-This [rigging](https://docs.dreadnode.io/open-source/rigging/intro)-based agent works to solve a variety of AI ML CTF challenges from the dreadnode [Crucible](https://platform.dreadnode.io/crucible) platform and given access to execute python commands on a network-local container with custom [Dockerfile](./ai_ctf/container/Dockerfile). This example-agent is also a compliment to our research paper [AIRTBench: Can Language Models Autonomously Exploit
+This [rigging](https://docs.dreadnode.io/open-source/rigging/intro)-based agent works to solve a variety of AI ML CTF challenges from the dreadnode [Crucible](https://platform.dreadnode.io/crucible) platform and given access to execute python commands on a network-local container with custom [Dockerfile](./airtbench/container/Dockerfile). This example-agent is also a compliment to our research paper [AIRTBench: Can Language Models Autonomously Exploit
 Language Models?](https://arxiv.org/abs/TODO). # TODO: Add link to paper once published.
 
 ```bash
-uv run -m ai_ctf --help
+uv run -m airtbench --help
 ```
 
+### Basic Usage
+
+```bash
+uv run -m airtbench --model $MODEL --project $PROJECT --platform-api-key $DREADNODE_API_KEY --token $DREADNODE_API_TOKEN --server https://platform.dreadnode.io --max-steps 100 --inference_timeout 240 --enable-cache --no-give-up --challenges bear1 bear2
+```
+
+### Challenge Filtering
+
 To run the agent against challenges that match the `is_llm:true` criteria, which are LLM-based challenges, you can use the following command:
 
 ```bash
-uv run -m ai_ctf --model <model> --llm-challenges-only
+uv run -m airtbench --model <model> --llm-challenges-only
 ```
 
 The harness will automatically build the defined number of containers with the supplied flag, and load them
@@ -74,21 +82,9 @@ as needed to ensure they are network-isolated from each other. The process is ge
 5. If the CTF challenge is solved and flag is observed, the agent must submit the flag
 6. Otherwise run until an error, give up, or max-steps is reached
 
-Check out [the challenge manifest](./ai_ctf/challenges/.challenges.yaml) to see current challenges in scope.
+Check out [the challenge manifest](./airtbench/challenges/.challenges.yaml) to see current challenges in scope.
 
 
 ## Model requests
 
-If you know of a model that may be interesting to analyze, but do not have the resources to run it yourself, feel free to open a feature request via a GitHub issue.
-
-## Support the Project and Contributing
-
-We welcome any issues or contributions to the project, share the treasure! If you like our project, please feel free to drop us some love <3
-
-### Star History
-
-[![GitHub stars](https://img.shields.io/github/stars/dreadnode/AIRTBench-Code?style=social)](https://github.com/dreadnode/AIRTBench-Code/stargazers)
-
-By watching the repo, you can also be notified of any upcoming releases.
-
-<img src="https://api.star-history.com/svg?repos=dreadnode/AIRTBench-Code&type=Date" width="600" height="400">
+If you know of a model that may be interesting to analyze, but do not have the resources to run it yourself, feel free to open a feature request via a GitHub issue.
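
Note: steps 5 and 6 of the README's process list (submit the flag once it is observed, otherwise stop on an error, a give-up, or max-steps) can be sketched as a small loop. This is an illustration under stated assumptions, not the harness code from this commit: `StepResult`, `run_until_done`, and the `allow_give_up` parameter are hypothetical names, and tying `allow_give_up` to the `--no-give-up` flag is a guess about that flag's meaning.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class StepResult:
    """Hypothetical outcome of a single agent step."""

    flag: Optional[str] = None   # flag text, if one was observed this step
    gave_up: bool = False        # the agent asked to stop
    error: Optional[str] = None  # error message, if the step failed


def run_until_done(
    step: Callable[[int], StepResult],
    max_steps: int,
    allow_give_up: bool = True,
) -> str:
    """Run steps until a flag, an error, a permitted give-up, or max_steps."""
    for i in range(max_steps):
        result = step(i)
        if result.error is not None:
            return "error"
        if result.flag is not None:
            return "solved"  # a real harness would submit the flag here
        if result.gave_up and allow_give_up:
            return "gave_up"
    return "max_steps"
```

With `allow_give_up=False`, a give-up request is ignored and the loop continues until another stop condition is met.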

airtbench/.gitignore

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+*.bak
+*.removed_notebooks/

airtbench/__init__.py

Whitespace-only changes.

airtbench/__main__.py

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+from .main import app
+
+if __name__ == "__main__":
+    app()

airtbench/challenges.py

Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
+import pathlib
+
+import yaml  # type: ignore [import-untyped]
+from pydantic import BaseModel
+
+current_dir = pathlib.Path(__file__).parent
+challenges_dir = current_dir / "challenges"
+
+
+class Challenge(BaseModel):
+    id: str
+    name: str
+    category: str
+    difficulty: str
+    notebook: str
+    is_llm: bool = False
+
+
+def load_challenges() -> list[Challenge]:
+    """
+    Load challenges from the .challenges.yaml file in the challenges directory.
+
+    Returns:
+        list[Challenge]: A list of Challenge objects.
+    """
+    with (challenges_dir / ".challenges.yaml").open() as f:
+        return [Challenge(id=key, **challenge) for key, challenge in yaml.safe_load(f).items()]
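
Note: `load_challenges` expects the manifest to be a mapping from challenge id to that challenge's fields, with `is_llm` defaulting to `False`. The sketch below shows that shape and how a filter like `--llm-challenges-only` might select entries. To run without third-party packages it uses a plain dataclass as a stand-in for the pydantic model and an in-memory dict in place of the parsed YAML. The manifest entries and their field values are hypothetical; only the field names and the `bear1` id appear in this commit.

```python
from dataclasses import dataclass


@dataclass
class Challenge:
    """Stand-in for the pydantic model in challenges.py (same fields)."""

    id: str
    name: str
    category: str
    difficulty: str
    notebook: str
    is_llm: bool = False


# Hypothetical manifest contents, shaped like the mapping that
# yaml.safe_load would return for .challenges.yaml (id -> fields).
manifest = {
    "bear1": {
        "name": "Bear 1",
        "category": "data-analysis",
        "difficulty": "easy",
        "notebook": "bear1.ipynb",
    },
    "example_llm": {
        "name": "Example LLM challenge",
        "category": "prompt-injection",
        "difficulty": "medium",
        "notebook": "example_llm.ipynb",
        "is_llm": True,
    },
}

# Same construction as load_challenges: the mapping key becomes the id.
challenges = [Challenge(id=key, **fields) for key, fields in manifest.items()]

# Keep only the LLM-based challenges.
llm_only = [c for c in challenges if c.is_llm]
print([c.id for c in llm_only])
```

Unlike the pydantic model, the dataclass stand-in does not validate field types, so a malformed manifest entry would not be rejected here.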
