README.md
Lines changed: 15 additions & 19 deletions
@@ -35,9 +35,9 @@ The paper is available on [arXiV](TODO) and [ACL Anthology](TODO).
 - [Code for the "AIRTBench" AI Red Teaming Agent](#code-for-the-airtbench-ai-red-teaming-agent)
 - [Setup](#setup)
 - [Run the Evaluation](#run-the-evaluation)
+- [Basic Usage](#basic-usage)
+- [Challenge Filtering](#challenge-filtering)
 - [Model requests](#model-requests)
-- [Support the Project and Contributing](#support-the-project-and-contributing)
-- [Star History](#star-history)

 ## Setup

@@ -51,17 +51,25 @@ uv sync

 <mark>In order to run the code, you will need access to the Dreadnode strikes platform, see the [docs](https://docs.Dreadnode.io/strikes/overview) or submit for the Strikes waitlist [here](https://platform.dreadnode.io/waitlist/strikes)</mark>.

-This [rigging](https://docs.dreadnode.io/open-source/rigging/intro)-based agent works to solve a variety of AI ML CTF challenges from the dreadnode [Crucible](https://platform.dreadnode.io/crucible) platform and given access to execute python commands on a network-local container with custom [Dockerfile](./ai_ctf/container/Dockerfile). This example-agent is also a compliment to our research paper [AIRTBench: Can Language Models Autonomously Exploit
+This [rigging](https://docs.dreadnode.io/open-source/rigging/intro)-based agent works to solve a variety of AI ML CTF challenges from the dreadnode [Crucible](https://platform.dreadnode.io/crucible) platform and given access to execute python commands on a network-local container with custom [Dockerfile](./airtbench/container/Dockerfile). This example-agent is also a compliment to our research paper [AIRTBench: Can Language Models Autonomously Exploit
 Language Models?](https://arxiv.org/abs/TODO). # TODO: Add link to paper once published.
 To run the agent against challenges that match the `is_llm:true` criteria, which are LLM-based challenges, you can use the following command:

 ```bash
-uv run -m ai_ctf --model <model> --llm-challenges-only
+uv run -m airtbench --model <model> --llm-challenges-only
 ```

 The harness will automatically build the defined number of containers with the supplied flag, and load them
@@ -74,21 +82,9 @@ as needed to ensure they are network-isolated from each other. The process is ge
 5. If the CTF challenge is solved and flag is observed, the agent must submit the flag
 6. Otherwise run until an error, give up, or max-steps is reached

-Check out [the challenge manifest](./ai_ctf/challenges/.challenges.yaml) to see current challenges in scope.
+Check out [the challenge manifest](./airtbench/challenges/.challenges.yaml) to see current challenges in scope.


 ## Model requests

-If you know of a model that may be interesting to analyze, but do not have the resources to run it yourself, feel free to open a feature request via a GitHub issue.
-
-## Support the Project and Contributing
-
-We welcome any issues or contributions to the project, share the treasure! If you like our project, please feel free to drop us some love <3
+If you know of a model that may be interesting to analyze, but do not have the resources to run it yourself, feel free to open a feature request via a GitHub issue.
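
Reading aid for steps 5 and 6 of the process list in the last hunk: a minimal Python sketch of the stop conditions they describe (submit an observed flag, otherwise stop on error, give-up, or max-steps). This is not the airtbench implementation. `StepResult`, `run_episode`, `step`, and `submit_flag` are hypothetical names introduced only for illustration, and the sketch assumes a rejected flag lets the run continue, which the README does not state.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class StepResult:
    """Outcome of one agent step (hypothetical structure, for illustration only)."""
    flag: Optional[str] = None  # flag text, if the agent observed one during this step
    gave_up: bool = False       # True if the agent decided to stop trying


def run_episode(
    step: Callable[[int], StepResult],
    submit_flag: Callable[[str], bool],
    max_steps: int,
) -> str:
    """Run steps until the flag is accepted, an error occurs, the agent gives up,
    or max_steps is reached. Returns a short label for the stop reason."""
    for i in range(max_steps):
        try:
            result = step(i)
        except Exception:
            # Step 6: any error ends the run.
            return "error"
        # Step 5: an observed flag must be submitted; acceptance ends the run.
        if result.flag is not None and submit_flag(result.flag):
            return "solved"
        # Step 6: the agent may give up explicitly.
        if result.gave_up:
            return "gave_up"
    # Step 6: step budget exhausted.
    return "max_steps"
```

A caller would pass its own `step` and `submit_flag` callables; the returned label is one way to record why a run ended when comparing models.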