docs(rfcs): 0001 signed decision receipts for ABCA via AgentCore Gateway #40

Open
tomjwxf wants to merge 4 commits into aws-samples:main from tomjwxf:rfc/0001-signed-decision-receipts

@tomjwxf tomjwxf commented Apr 18, 2026

Follow-up to closed #39 per @krokoko's review feedback. Lands as an RFC / design document rather than an integration guide, with the signing path placed explicitly outside the agent runtime's trust boundary.

Author disclosure (not hidden this time): I am the author of the receipt format specification (draft-farley-acta-signed-receipts) and of one of the four conformant signing implementations referenced in the RFC. The receipt contract in Section 5 is a format, not a vendor choice; any conformant implementation is interchangeable.

What this RFC does

  • Adds signed decision receipts alongside AgentCore Gateway's native policy engine. The gateway's existing Cedar policy evaluation stays in place; the RFC adds a parallel signed-receipt output in S3 with its own identity.
  • Places the signing identity outside the agent container. A compromised agent runtime cannot forge, alter, or suppress receipts. Signing key lives in KMS, scoped to a new Receipt Signer Lambda IAM role only, with explicit denies on the agent runtime role.
  • Makes the chain tamper-evident as a whole. Hash-chained, JCS-canonical, Ed25519-signed receipts. Verifier walks the chain offline with no AWS credentials.
  • Keeps the receipt format tool-agnostic. Four conformant implementations exist today (Section 12); ABCA can pick any.
  • Infrastructure in CDK. KMS key, S3 bucket, Receipt Signer Lambda, EventBridge rule, CloudTrail data events on kms:Sign. All reviewable TypeScript.
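
A minimal sketch of the hash-chained, signed receipt idea from the bullets above, assuming Node's built-in crypto module and a locally generated Ed25519 key standing in for KMS. The `Receipt` shape, the helper names, and the simplified canonicalization are illustrative assumptions, not the wire format defined in the RFC or in draft-farley-acta-signed-receipts; `previousReceiptHash` is the only field name taken from this thread.

```typescript
import { createHash, generateKeyPairSync, sign, verify, KeyObject } from "node:crypto";

type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Simplified JCS-style canonicalization (assumption): sorted keys, no whitespace.
function canonicalize(value: Json): string {
  if (value === null || typeof value !== "object") return JSON.stringify(value);
  if (Array.isArray(value)) return "[" + value.map(canonicalize).join(",") + "]";
  const entries = Object.entries(value).sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0));
  return "{" + entries.map(([k, v]) => JSON.stringify(k) + ":" + canonicalize(v)).join(",") + "}";
}

// Hypothetical receipt shape for illustration only.
interface Receipt {
  payload: Json;
  previousReceiptHash: string | null;
  signature: string; // base64 Ed25519 signature over the canonical body
}

// Bytes that get signed: everything except the signature itself.
function signedBody(payload: Json, previousReceiptHash: string | null): Buffer {
  return Buffer.from(canonicalize({ payload, previousReceiptHash }), "utf8");
}

// Hash of the full receipt (signature included); the next receipt points at this.
function receiptHash(r: Receipt): string {
  const full = { payload: r.payload, previousReceiptHash: r.previousReceiptHash, signature: r.signature };
  return createHash("sha256").update(canonicalize(full)).digest("hex");
}

function appendReceipt(chain: Receipt[], payload: Json, privateKey: KeyObject): Receipt[] {
  const last = chain[chain.length - 1];
  const previousReceiptHash = last ? receiptHash(last) : null;
  const signature = sign(null, signedBody(payload, previousReceiptHash), privateKey).toString("base64");
  return [...chain, { payload, previousReceiptHash, signature }];
}

// Offline verification: needs only the receipts and the public key.
function verifyChain(chain: Receipt[], publicKey: KeyObject): boolean {
  let expectedPrev: string | null = null;
  for (const r of chain) {
    if (r.previousReceiptHash !== expectedPrev) return false;
    const sig = Buffer.from(r.signature, "base64");
    if (!verify(null, signedBody(r.payload, r.previousReceiptHash), publicKey, sig)) return false;
    expectedPrev = receiptHash(r);
  }
  return true;
}

const { publicKey, privateKey } = generateKeyPairSync("ed25519");
let chain: Receipt[] = [];
chain = appendReceipt(chain, { tool: "example_tool", decision: "allow" }, privateKey);
chain = appendReceipt(chain, { tool: "example_tool", decision: "deny" }, privateKey);

// Rewriting the first receipt's payload invalidates its signature.
const tampered = chain.map((r, i) =>
  i === 0 ? { ...r, payload: { tool: "example_tool", decision: "deny" } } : r
);
console.log(verifyChain(chain, publicKey), verifyChain(tampered, publicKey));
```

The verifier touches no AWS API, which is the property the offline-verification bullet relies on. Note that this sketch detects edits and a missing head, but detecting a truncated tail would need an external anchor such as a published latest hash.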

What this RFC deliberately does NOT do

  • Does not replace AgentCore Gateway's native policy engine. The agentcore add policy-engine --attach-mode ENFORCE workflow stays exactly as documented. Cedar evaluation is not reimplemented.
  • Does not modify ABCA source code under agent/. The signing path is additive infrastructure, not agent changes.
  • Does not require any specific signing implementation. Section 12 lists four conformant implementations; the ABCA team can choose or write their own.
  • Does not cover transparency-log anchoring (Sigstore Rekor). Supported by the chain structure but out of scope here.

How this addresses the review on #39

@krokoko raised four points; this RFC addresses each:

| Point from #39 | How addressed in this RFC |
| --- | --- |
| Author transparency was implicit | Disclosure in the header of the RFC, not buried in a sign-off |
| "No code changes" was misleading | Explicitly framed as infrastructure proposal; Section 1 says so |
| Signer inside the agent container (false-security critique) | Section 4.3 puts the Receipt Signer in a separate Lambda with a separate IAM role, KMS access, and trust boundary |
| RFC / design doc first | This IS the RFC. Tool-agnostic contract (Section 5), CDK-managed infra (Section 7), explicit trust model (Section 4), compromise analysis (Section 9) |

What I need from the reviewers

Section 10 lists seven open questions that need ABCA-specific or AgentCore-team guidance before a reference-implementation PR can follow. The most important:

  1. AgentCore decision-event API shape. Does AgentCore Gateway emit EventBridge events per policy-engine decision? What is the detailType? If not, is CloudWatch Logs subscription on the policy-engine log group the supported fallback?
  2. Public-key distribution. Where does the signing public key live for external verifiers?
  3. Cedar policy versioning / session-id convention. Does AgentCore support per-session policy pinning, and is there a native ABCA session identifier the RFC should use?

The other four are in the RFC.

Files

  • docs/rfcs/0001-signed-decision-receipts.md (new; docs/rfcs/ directory also new)

Verification

  • RFC body references the AgentCore Gateway policy docs @krokoko provided
  • Cedar policy engine described as native (not reinvented)
  • Trust boundaries named at the entity level with a table
  • CDK snippets use real AWS CDK TypeScript constructs (kms.KeySpec.ED25519, cloudtrail.Trail, events.Rule, lambda.Function)
  • Author disclosure in the header
  • Open questions section names unknowns rather than guessing

Notes

  • The RFC deliberately references closed #39 so reviewers who followed that thread can see the connection.
  • Appendix A explicitly compares this RFC to the earlier guide, naming the three specific weaknesses that the RFC addresses.
  • Happy to split into multiple follow-up PRs (RFC acceptance first, then reference-implementation code, then CDK contribution) once the design direction is agreed.
  • Filed under the aws-samples CLA per CONTRIBUTING.md.

Follow-up to closed aws-samples#39. Per @krokoko's review feedback, this lands as
an RFC / design document rather than an integration guide.

Key properties:

- The signing identity is a separate Lambda (Receipt Signer), outside
  the agent runtime's trust boundary. A compromised agent container
  cannot forge, alter, or suppress receipts.
- Cedar policy evaluation stays native to AgentCore Gateway's built-in
  policy engine. This RFC does not reinvent Cedar evaluation; it adds
  a signed-receipt output alongside the existing CloudWatch log.
- Signing key lives in KMS (Ed25519), scoped to the Receipt Signer
  IAM role only, with explicit deny on the agent runtime role.
- Receipt chain is tamper-evident as a whole (hash-chained, JCS
  canonical, Ed25519 signed). Verifiable offline with
  @veritasacta/verify, no AWS credentials required.
- Receipt format is tool-agnostic (IETF draft-farley-acta-signed-receipts,
  four independent conformant implementations).
- CDK-managed infra: KMS key, S3 bucket with explicit denies, CloudTrail
  data events on kms:Sign for second-layer audit.
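
As an illustration of the "scoped to the Receipt Signer IAM role only, with explicit deny on the agent runtime role" property, the key policy could carry two statements along these lines. This is a sketch under stated assumptions: the role ARNs, the `Sid` values, and the `canSign` helper are hypothetical, and `canSign` is a toy model of "explicit deny overrides allow", not IAM's full evaluation logic.

```typescript
// Shape of a KMS key policy statement, reduced to the fields used here.
interface Statement {
  Sid: string;
  Effect: "Allow" | "Deny";
  Principal: { AWS: string };
  Action: string[];
  Resource: string;
}

// Hypothetical role ARNs (111122223333 is a documentation-style placeholder account).
const signerRoleArn = "arn:aws:iam::111122223333:role/ReceiptSignerRole";
const agentRoleArn = "arn:aws:iam::111122223333:role/AgentRuntimeRole";

function receiptKeyStatements(signerArn: string, agentArn: string): Statement[] {
  return [
    {
      Sid: "AllowReceiptSignerToSign",
      Effect: "Allow",
      Principal: { AWS: signerArn },
      Action: ["kms:Sign"],
      Resource: "*", // in a key policy, "*" refers to the key the policy is attached to
    },
    {
      Sid: "DenyAgentRuntimeSign",
      Effect: "Deny",
      Principal: { AWS: agentArn },
      Action: ["kms:Sign"],
      Resource: "*",
    },
  ];
}

// Toy evaluator (assumption): any matching Deny wins; otherwise an Allow is required.
function canSign(statements: Statement[], principalArn: string): boolean {
  const matching = statements.filter(
    (s) => s.Principal.AWS === principalArn && s.Action.includes("kms:Sign")
  );
  if (matching.some((s) => s.Effect === "Deny")) return false;
  return matching.some((s) => s.Effect === "Allow");
}

const statements = receiptKeyStatements(signerRoleArn, agentRoleArn);
console.log(canSign(statements, signerRoleArn), canSign(statements, agentRoleArn));
```

The explicit deny matters because it still blocks the agent role if a broader allow is attached to that role later.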

Sections:

1. Problem statement
2. What AgentCore Gateway already provides (policy engine native)
3. Goals and non-goals
4. Trust model (explicit entity + boundary table)
5. Receipt contract (wire format + three invariants)
6. Reference implementation: AgentCore Gateway + Receipt Signer Lambda
7. CDK infrastructure (KMS, S3, Lambda, CloudTrail)
8. Verification flow (no AWS credentials required)
9. Compromise analysis (5 scenarios walked through)
10. Open questions (7 ABCA-specific unknowns)
11. References
12. Conformant signing implementations
Appendix A: how this differs from the closed aws-samples#39 guide

Explicitly acknowledges the critique @krokoko raised on aws-samples#39 that led
to this rewrite. Author disclosure at the top of the document.

This RFC references live AgentCore Gateway documentation at
https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/policy-getting-started.html
provided by @krokoko. The open questions in Section 10 are the places
where AgentCore team guidance is needed before a reference-
implementation PR can follow.
@tomjwxf tomjwxf requested a review from a team as a code owner April 18, 2026 01:40
@tomjwxf tomjwxf changed the title rfc(governance): signed decision receipts for ABCA via AgentCore Gateway docs(rfcs): 0001 signed decision receipts for ABCA via AgentCore Gateway Apr 18, 2026
Contributor

krokoko commented Apr 20, 2026

Hi @tomjwxf, thank you for the follow-up RFC. This is a significant improvement over #39 — the trust model is explicit, the signer is outside the agent runtime, the compromise analysis is thorough, and the author disclosure is upfront. I appreciate the work that went into this.

However, I have a fundamental concern with the trigger mechanism, and I think it needs to be reworked before we can move forward. The EventBridge trigger is the wrong pattern here.

In my review on #39, I proposed interceptor Lambdas — synchronous, in-path, part of the gateway request/response lifecycle:

Agent → Gateway → REQUEST Interceptor (sign decision receipt) → Tool Target → RESPONSE Interceptor (sign outcome receipt) → Agent

The RFC instead proposes an asynchronous EventBridge-triggered sidecar that reacts to policy decision events after the fact. A few problems with that:

  1. The event source doesn't exist. The RFC assumes source: ['aws.bedrock-agentcore'] with detailType: ['Policy Engine Decision'], but AgentCore Gateway does not emit EventBridge events for policy decisions today. The docs only mention CloudWatch logging. Building the core infrastructure around a speculative API is not something we can accept in an RFC.
  2. AgentCore Gateway already supports interceptors. REQUEST and RESPONSE interceptor Lambdas are a documented, supported feature — one of each per gateway, backed by Lambda. This is exactly the pattern I proposed. It's available today.
  3. Async is architecturally weaker for integrity. With interceptors, receipt signing is synchronous and in-path: the receipt is the authority for the decision, produced before the tool executes. With EventBridge, the receipt is a post-hoc recording with at-least-once delivery semantics — events can be delayed, duplicated, or dropped in failure scenarios. The hash chain's ordering guarantees become fragile on an eventually-consistent event bus. For an integrity-critical system targeting regulated environments, if you can be in-path, you should be in-path.
  4. You agreed with this in #39. Your own response said: "The REQUEST interceptor signs the decision receipt before the tool runs. That makes the receipt the authority for the decision, not a post-hoc recording of it." The RFC reverts to the post-hoc pattern you acknowledged was weaker.
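
To make point 3 concrete, here is a small simulation of why chaining on arrival is fragile when delivery can reorder or duplicate events. It is a sketch only: the event shape, the IDs, and the delivery order are invented for illustration, and signing is omitted so the ordering effect stays visible.

```typescript
import { createHash } from "node:crypto";

// Hypothetical decision event; not an AgentCore payload.
interface DecisionEvent {
  decisionId: string;
  decision: "allow" | "deny";
}

interface ChainedEntry extends DecisionEvent {
  previousReceiptHash: string | null;
  hash: string;
}

// Chains entries in whatever order they are handed to the signer.
function chainInArrivalOrder(arrivals: DecisionEvent[]): ChainedEntry[] {
  const chain: ChainedEntry[] = [];
  let prev: string | null = null;
  for (const e of arrivals) {
    const hash: string = createHash("sha256")
      .update(JSON.stringify([e.decisionId, e.decision, prev]))
      .digest("hex");
    chain.push({ ...e, previousReceiptHash: prev, hash });
    prev = hash;
  }
  return chain;
}

const d1: DecisionEvent = { decisionId: "d-1", decision: "allow" };
const d2: DecisionEvent = { decisionId: "d-2", decision: "deny" };
const d3: DecisionEvent = { decisionId: "d-3", decision: "allow" };

// In-path: the signer sees decisions in the order they are made.
const inPath = chainInArrivalOrder([d1, d2, d3]);

// Assumed async delivery: d-2 overtakes d-1, and d-1 is delivered twice.
const asyncChain = chainInArrivalOrder([d2, d1, d1, d3]);

const order = (c: ChainedEntry[]): string => c.map((e) => e.decisionId).join(",");
console.log(order(inPath));
console.log(order(asyncChain));
```

Both chains are internally consistent, so a verifier would accept either. Only the in-path chain reflects the order in which decisions were actually made, which is why an async signer would need extra sequencing and deduplication that the interceptor path gets for free.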

What I'd like to see:

  • PRIMARY path: REQUEST interceptor signs the decision receipt, RESPONSE interceptor signs the outcome receipt. Synchronous, in-path, using a feature that exists today.
  • Drop Section 7.3's EventBridge rule entirely.

The rest of the RFC is solid — the receipt contract (Section 5), trust model (Section 4), CDK patterns for KMS/S3/IAM, and compromise analysis (Section 9) can stay largely as-is. It's the trigger mechanism in Sections 6 and 7.3 that needs to be rebuilt around what AgentCore actually supports.

Minor notes:

  • Three of the four "conformant implementations" are your own packages. That's fine (someone has to write them first), but the language "four independent implementations" overstates the diversity. Consider saying "four implementations, three by the RFC author" for accuracy.
  • S3 bucket should use BlockPublicAccess.BLOCK_ALL, not just BLOCK_ACLS, for a regulated-environment receipt store.
  • Section 7.4 mixes a CloudTrail Trail with an EventDataStore (CloudTrail Lake). Pick one — for the cross-check use case, the event data store with advanced event selectors is sufficient.

Happy to discuss further. The direction is right, the trigger mechanism just needs to match what we actually have available and what gives us the strongest guarantees.

Thank you!

Author

tomjwxf commented Apr 21, 2026

@krokoko agreed on all.

On the trigger mechanism: I contradicted my own #39 position without a technical reason for doing so. The REQUEST interceptor signs the decision as the authority. EventBridge would react to it as a post-hoc recording.

Different integrity claims, and the interceptor version is materially stronger on ordering, delivery, and freshness. Section 7.3 also built around an event source (source: aws.bedrock-agentcore, detailType: Policy Engine Decision) that does not exist today, which shouldn't be in an RFC.

Revised architecture for the next push:

  • REQUEST interceptor (in-path, sync): builds and signs the pre-tool
    decision receipt via KMS; returns signed receipt as authority; if
    signing fails, returns a deny that AgentCore enforces.
  • RESPONSE interceptor (in-path, sync, optional per-tool): builds and
    signs the outcome receipt, chained via previousReceiptHash to the
    REQUEST receipt.
  • Both write to S3 after return. EventBridge rule dropped entirely.
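
The fail-closed behaviour described above might look roughly like this. Everything here is a sketch under assumptions: the request, receipt, and result shapes are hypothetical and do not reflect AgentCore Gateway's actual interceptor event format, the `Signer` is a local Ed25519 stand-in for a KMS call (which would be asynchronous in practice), and plain `JSON.stringify` is used where a conformant implementation would apply JCS canonicalization.

```typescript
import { createHash, generateKeyPairSync, sign } from "node:crypto";

// Stand-in for a KMS-backed signer; throws when signing is unavailable.
type Signer = (message: string) => string;

// Hypothetical shapes for illustration only.
interface ToolRequest { sessionId: string; toolName: string; }
interface DecisionReceipt {
  kind: "decision";
  sessionId: string;
  toolName: string;
  previousReceiptHash: string | null;
  signature: string;
}
interface OutcomeReceipt {
  kind: "outcome";
  outcome: string;
  previousReceiptHash: string;
  signature: string;
}
type InterceptorResult =
  | { action: "allow"; receipt: DecisionReceipt }
  | { action: "deny"; reason: string };

// REQUEST side: sign before the tool runs; any signing failure becomes a deny.
function requestInterceptor(
  req: ToolRequest,
  previousReceiptHash: string | null,
  signer: Signer
): InterceptorResult {
  const unsigned = {
    kind: "decision" as const,
    sessionId: req.sessionId,
    toolName: req.toolName,
    previousReceiptHash,
  };
  try {
    const signature = signer(JSON.stringify(unsigned));
    return { action: "allow", receipt: { ...unsigned, signature } };
  } catch {
    return { action: "deny", reason: "receipt signing failed" };
  }
}

// RESPONSE side: the outcome receipt points back at the decision receipt.
function responseInterceptor(decision: DecisionReceipt, outcome: string, signer: Signer): OutcomeReceipt {
  const previousReceiptHash = createHash("sha256").update(JSON.stringify(decision)).digest("hex");
  const unsigned = { kind: "outcome" as const, outcome, previousReceiptHash };
  return { ...unsigned, signature: signer(JSON.stringify(unsigned)) };
}

const { privateKey } = generateKeyPairSync("ed25519");
const localSigner: Signer = (m) => sign(null, Buffer.from(m, "utf8"), privateKey).toString("base64");
const failingSigner: Signer = () => { throw new Error("signing unavailable"); };

const request: ToolRequest = { sessionId: "s-1", toolName: "example_tool" };
const ok = requestInterceptor(request, null, localSigner);
const failed = requestInterceptor(request, null, failingSigner);
console.log(ok.action, failed.action);
```

Returning a deny on signing failure keeps the invariant that no tool call proceeds without a signed decision receipt, at the cost of making the signer an availability dependency for every call.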

Receipt contract (Section 5), trust model (Section 4), compromise analysis (Section 9), and KMS/S3/IAM CDK patterns stay as-is. Only Sections 6 and 7.3 rebuild.

Minor points, all accepted:

  • "Four implementations": three are mine (protect-mcp, protect-mcp-adk,
    sb-runtime), APS by aeoess is the genuinely independent one. Language
    will change to "four implementations, three by the RFC author and one
    independent."
  • S3 BlockPublicAccess.BLOCK_ALL: correct for a regulated-environment
    receipts bucket. Will update.
  • CloudTrail Trail vs EventDataStore: picking EventDataStore with
    advanced event selectors on eventName = Sign.

One question before I push: would it be useful to validate the interceptor-Lambda wiring against a working AgentCore deployment first so the CDK snippets match what actually ships, or is the doc-level spec enough for this round?

Revised RFC this week. Thanks!
