Aider runs LLM-driven edits and shell commands as part of its core loop, which means it shares the threat surface that has been catalogued for general AI coding agents: prompt injection, exfiltration patterns, dangerous shell invocations slipped into LLM output, and crafted file content that triggers unintended actions during edit/commit. Many of these cases are already documented as concrete regex and rule patterns in the open Agent Threat Rules project at https://github.com/Agent-Threat-Rule/agent-threat-rules.
I would like to propose an optional pre-commit safety scan that runs ATR-style detection against the diff and recent prompt context before aider commits, behind an opt-in flag such as --scan-with-atr or an equivalent config option. The scan would be local, deterministic, and regex-based. It would make no external service calls and would change nothing unless the flag is enabled. On a match it would print a one-line warning per finding and then continue, prompt, or abort, depending on how the flag is configured.
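To make the on-match behavior concrete, here is a minimal sketch of the decision step. The names `OnMatch`, `Finding`, and `should_commit` are hypothetical and are not part of aider today, and the warning format is only a suggestion.

```python
from dataclasses import dataclass
from enum import Enum


class OnMatch(Enum):
    """Hypothetical modes for the proposed flag."""
    WARN = "warn"      # print findings, then commit as usual
    PROMPT = "prompt"  # print findings, then ask the user
    ABORT = "abort"    # print findings, then skip the commit


@dataclass(frozen=True)
class Finding:
    rule_id: str
    path: str
    line_no: int
    excerpt: str


def format_finding(f: Finding) -> str:
    # One line per finding, with the excerpt truncated to keep output short.
    return f"ATR {f.rule_id}: {f.path}:{f.line_no}: {f.excerpt[:80]}"


def should_commit(findings, mode, confirm=lambda question: False, out=print):
    """Print one warning per finding and return True if the commit may proceed.

    `confirm` stands in for whatever yes/no prompt the host tool provides;
    the default declines, so PROMPT is safe in non-interactive runs.
    """
    for f in findings:
        out(format_finding(f))
    if not findings or mode is OnMatch.WARN:
        return True
    if mode is OnMatch.PROMPT:
        return bool(confirm(f"{len(findings)} finding(s). Commit anyway?"))
    return False  # OnMatch.ABORT
```

With no findings the function always returns True, so enabling the scan would not change behavior on a clean diff.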
The implementation footprint is small. It can live in a new aider/scan/ module with about 5 to 10 inline patterns covering the highest-precision detection categories: shell command injection in code blocks, exfiltration to suspicious domains, secret leakage in diffs, and prompt-override directives in commit messages or modified files. ATR is Apache-2.0 licensed and the rule set has been adopted by Cisco AI Defense skill-scanner and Microsoft agent-governance-toolkit, so it is reasonable to reference upstream rather than rebuild detection logic.
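As a rough illustration of the size involved, the sketch below scans only the added lines of a unified diff against a handful of inline patterns. The patterns are examples written for this issue, not copied from the ATR rule set, and `PATTERNS`, `HUNK_RE`, and `scan_diff` are hypothetical names. A real PR would reference the upstream rules.

```python
import re

# Illustrative patterns only (assumption: these are NOT the upstream ATR rules).
PATTERNS = {
    "shell-pipe-to-interpreter": re.compile(
        r"\b(?:curl|wget)\b[^\n|]*\|\s*(?:sudo\s+)?(?:ba|z)?sh\b"
    ),
    "aws-access-key-id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private-key-block": re.compile(r"-----BEGIN (?:[A-Z]+ )?PRIVATE KEY-----"),
    "prompt-override": re.compile(
        r"\bignore (?:all )?(?:previous|prior) instructions\b", re.IGNORECASE
    ),
}

# Unified diff hunk header, e.g. "@@ -1,2 +1,3 @@"; captures the new-file start line.
HUNK_RE = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@")


def scan_diff(diff_text):
    """Return (rule_id, path, line_no, text) for each pattern hit on an added line."""
    findings = []
    path, line_no = None, 0
    for raw in diff_text.splitlines():
        # Simplification: any line starting with "+++ " or "--- " is treated
        # as a file header, which can misread unusual content lines.
        if raw.startswith("+++ "):
            path = raw[4:]
            if path.startswith("b/"):
                path = path[2:]
            continue
        if raw.startswith("--- "):
            continue
        hunk = HUNK_RE.match(raw)
        if hunk:
            line_no = int(hunk.group(1))
            continue
        if raw.startswith("+"):
            text = raw[1:]
            for rule_id, pattern in PATTERNS.items():
                if pattern.search(text):
                    findings.append((rule_id, path, line_no, text.strip()))
            line_no += 1
        elif raw.startswith(" "):
            line_no += 1  # context lines advance the new-file line counter
    return findings
```

Restricting the scan to added lines keeps it from flagging content that was already in the repository before the edit, which should help precision.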
Before I open a PR I want to check whether this fits aider's design philosophy. A few questions: Is an opt-in security flag something you would consider, or do you prefer to keep aider focused only on the edit and commit loop? Is aider/scan/ an acceptable location, or would you want this as a separate plugin? Is regex-only detection acceptable for a first pass, with no LLM calls and no network?
If the design fits, I am happy to send a minimal PR with a single flag, a small pattern set, tests, and docs. I can also scope it down further if you prefer.