Summary
Planner/review-follow-up artifacts are currently machine-readable and can influence downstream agent behavior, but they are not authenticated. In a public repository, a malicious actor who can inject or edit a suitably formatted artifact-like payload could potentially trick downstream planner/implementor stages into acting on forged instructions.
This is not required to merge PR #200, but it should be handled as a later hardening pass for the multi-agent issue/PR workflow.
Threat model
- Public PRs/issues/comments contain machine-readable artifact blocks.
- Downstream agents parse those blocks and use them to decide follow-up work.
- A forged artifact could spoof review-follow-up metadata, redirect replies, or coerce implementor/planner stages into working on attacker-chosen content.
- Strict parsing alone does not address authenticity.
Goal
Add authenticity/integrity checks for planner/review-follow-up artifacts so downstream automation can distinguish trusted agent-produced artifacts from untrusted user-supplied text.
Proposed direction
- Introduce config-defined signing/verification keys for agent artifacts.
- Sign serialized planner/review-follow-up artifacts and verify signatures before trusted consumption.
- Fail closed for artifact-driven automation when signature verification fails.
- Preserve backwards-compatible handling for legacy unsigned artifacts only behind an explicit compatibility mode, if needed.
- Keep human-readable comments intact while treating unsigned payloads as untrusted text.
Acceptance criteria
Notes
Agent-authored issue.
Summary
Planner/review-follow-up artifacts are currently machine-readable and can influence downstream agent behavior, but they are not authenticated. In a public repository, a malicious actor who can inject or edit a suitably formatted artifact-like payload could potentially trick downstream planner/implementor stages into acting on forged instructions.
This is not required to merge PR #200, but it should be handled as a later hardening pass for the multi-agent issue/PR workflow.
Threat model
Goal
Add authenticity/integrity checks for planner/review-follow-up artifacts so downstream automation can distinguish trusted agent-produced artifacts from untrusted user-supplied text.
Proposed direction
Acceptance criteria
Notes
Agent-authored issue.