Skip to content

Harden agent artifacts against forged planner/review-follow-up payloads #203

Description

@davidruzicka

Summary

Planner/review-follow-up artifacts are currently machine-readable and can influence downstream agent behavior, but they are not authenticated. In a public repository, a malicious actor who can inject or edit a suitably formatted artifact-like payload could potentially trick downstream planner/implementor stages into acting on forged instructions.

This is not required to merge PR #200, but it should be handled as a later hardening pass for the multi-agent issue/PR workflow.

Threat model

  • Public PRs/issues/comments contain machine-readable artifact blocks.
  • Downstream agents parse those blocks and use them to decide follow-up work.
  • A forged artifact could spoof review-follow-up metadata, redirect replies, or coerce implementor/planner stages into working on attacker-chosen content.
  • Strict parsing alone does not address authenticity.

Goal

Add authenticity/integrity checks for planner/review-follow-up artifacts so downstream automation can distinguish trusted agent-produced artifacts from untrusted user-supplied text.

Proposed direction

  • Introduce config-defined signing/verification keys for agent artifacts.
  • Sign serialized planner/review-follow-up artifacts and verify signatures before trusted consumption.
  • Fail closed for artifact-driven automation when signature verification fails.
  • Preserve backwards-compatible handling for legacy unsigned artifacts only behind an explicit compatibility mode, if needed.
  • Keep human-readable comments intact while treating unsigned payloads as untrusted text.

Acceptance criteria

  • Define the trust boundary for artifact consumers (planner dedupe vs planner input vs implementor input vs merger gating).
  • Extend artifact format with signature metadata and signing algorithm/version fields.
  • Add signing on artifact emission and verification on trusted parsing paths.
  • Ensure untrusted/unsigned/invalidly signed artifacts do not drive follow-up execution.
  • Add typed errors or structured failure handling for signature verification failures.
  • Add tests for valid signature, invalid signature, missing signature, tampered payload, and compatibility mode behavior.
  • Document key configuration, rotation expectations, and operational failure behavior.

Notes


Agent-authored issue.

Metadata

Metadata

Assignees

No one assigned

    Labels

    agent-supportPlanning and implementation work for GitHub-native agent development supportarchitectureSeparation-of-concerns, dependency, or boundary design improvementenhancementNew feature or requestsecuritySecurity weakness or hardening opportunity

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions