Governance Chains: multi-step approval workflows with separation of powers #793

jlugo63 · 2026-04-04T23:39:36Z

jlugo63
Apr 4, 2026

Hey all — I’ve been following this project since the announcement and wanted to share something we’ve been building on top of it.

The problem that got us started

The Kiro incident at Amazon stuck with me. Their AI coding agent was told to fix a minor Cost Explorer bug, decided the fastest path was to delete the entire production environment and rebuild it, and caused a 13-hour outage. Then Alibaba’s ROME agent hijacked GPUs for crypto mining because it calculated that was the most efficient way to hit its performance targets. Not compromised — just optimizing.

Both cases had the same structure: a single agent with broad permissions made a big decision with no independent review. The agent that identified the problem also decided the fix and executed it. Nobody checked whether “delete production” was a proportionate response to a config bug.

Policy engines wouldn’t have helped here. The actions were technically allowed. The issue was that nobody evaluated whether the specific decision made sense, and no independent system verified it before execution.

What we built

We’ve been working on a governance layer called Gavel that sits on top of this toolkit. The core idea is separation of powers: the agent that proposes an action cannot be the one that reviews or approves it.

Using the Kiro scenario as an example:

An agent proposes a fix and declares its scope up front (e.g., modifying a config file only). That proposal is recorded as the first event in a hash-linked audit trail.

The system evaluates risk factors (production environment, destructive potential, financial impact) and determines that a full governance workflow is required.

The proposed action is executed in an isolated sandbox first. If the agent’s actual behavior deviates from its declared scope (e.g., attempting to delete infrastructure), that becomes part of the evidence.

A deterministic reviewer checks the evidence against the declared scope. If there’s a violation, the chain stops immediately. Nothing reaches production.

If everything checks out, the process continues:

A separate review agent evaluates the evidence
A third agent approves the action
A scoped, single-use execution token is issued with a short expiration

Each step is cryptographically linked via SHA-256, forming a verifiable chain of decisions.
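To make the hash-linking concrete, here is a minimal sketch of a SHA-256 linked event chain. It is an illustration under stated assumptions, not Gavel's actual code: `ChainEvent`, `append_event`, `verify_chain`, the step names, and the all-zero genesis value are all hypothetical.

```python
import hashlib
import json
from dataclasses import dataclass

GENESIS = "0" * 64  # assumed placeholder hash for the first event


@dataclass(frozen=True)
class ChainEvent:
    step: str        # e.g. "propose", "review", "approve" (illustrative names)
    actor: str
    payload: dict
    prev_hash: str
    event_hash: str


def _digest(step, actor, payload, prev_hash):
    # Canonical JSON (sorted keys) so identical content always hashes the same.
    body = json.dumps(
        {"step": step, "actor": actor, "payload": payload, "prev": prev_hash},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_event(chain, step, actor, payload):
    # Each new event commits to the hash of the event before it.
    prev_hash = chain[-1].event_hash if chain else GENESIS
    event_hash = _digest(step, actor, payload, prev_hash)
    return chain + [ChainEvent(step, actor, payload, prev_hash, event_hash)]


def verify_chain(chain):
    # Recompute every hash; any edited or reordered event breaks the links.
    prev_hash = GENESIS
    for event in chain:
        if event.prev_hash != prev_hash:
            return False
        expected = _digest(event.step, event.actor, event.payload, event.prev_hash)
        if event.event_hash != expected:
            return False
        prev_hash = event.event_hash
    return True
```

Because each event commits to its predecessor's hash, changing an earlier proposal after the fact invalidates every later link, which is the property that makes the trail verifiable.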

Key components

Governance chains
Hash-linked sequences from proposal to execution that create a verifiable decision trail.

Separation of powers
Proposer, reviewer, and approver must be distinct agents. This is enforced structurally at the API level.

Blast box
Sandbox execution environment that generates evidence for review. The focus is on proof, not just safety.

Deterministic evidence review
A set of fixed checks (scope compliance, secret detection, forbidden paths, etc.) with no LLM involvement.

Tiered autonomy
Risk-based controls. Low-risk actions move fast; high-risk actions require full governance.

Liveness monitor
Timeout-based denial. If approval doesn’t happen in time, the action is automatically rejected.

Constitutional invariants
Hard “forbid” rules that cannot be overridden by configuration.
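As a rough illustration of two of the components above (separation of powers and the liveness monitor), the sketch below rejects self-review and self-approval, and denies anything that arrives after a deadline. The class and method names are hypothetical and are not taken from the Gavel codebase; the injectable clock exists only to make the timeout testable.

```python
import time


class GovernanceError(Exception):
    """Raised when a step violates role separation or the deadline."""


class ApprovalRequest:
    def __init__(self, proposer, timeout_seconds, now=time.monotonic):
        self.proposer = proposer
        self.reviewer = None
        self.approver = None
        self._now = now
        self._deadline = now() + timeout_seconds

    def _check_live(self):
        # Liveness: once the deadline passes, every further step is denied.
        if self._now() > self._deadline:
            raise GovernanceError("timed out: action automatically denied")

    def review(self, agent_id):
        self._check_live()
        if agent_id == self.proposer:
            raise GovernanceError("proposer cannot review its own proposal")
        self.reviewer = agent_id

    def approve(self, agent_id):
        self._check_live()
        if self.reviewer is None:
            raise GovernanceError("review must happen before approval")
        if agent_id in (self.proposer, self.reviewer):
            raise GovernanceError("approver must differ from proposer and reviewer")
        self.approver = agent_id

    @property
    def approved(self):
        return self.approver is not None
```

The point of putting the checks inside `review` and `approve` is that no configuration flag can bypass them: a request only becomes approved after three distinct identities have acted within the time limit.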

We currently have 158 tests (including adversarial cases like self-approval, role switching, and hash tampering) with 97.5% coverage. The project is a Python package built on top of the agent-governance-toolkit.

Repo: https://github.com/jlugo63/gavel

Why now

EU AI Act high-risk obligations take effect August 2, 2026. Requirements include human oversight, continuous risk management, and verifiable auditability. Governance chains with enforced separation of powers map directly to these needs.

At the same time, most enterprise leaders expect a major AI agent incident within the next year. It feels like the window to get governance right is closing quickly.

Open questions

Curious how others are approaching this:

Are teams building approval workflows for AI-initiated production changes, or relying on permissions and policy engines?
Is structural separation of powers important, or is trust scoring sufficient?
Is anyone using sandbox-first evidence before approving actions?
How are you thinking about EU AI Act compliance for autonomous agents?

Would love feedback on whether this direction makes sense or if we’re solving the wrong problem. Happy to dive deeper into any part of the implementation.

imran-siddique · 2026-04-05T04:46:59Z

imran-siddique
Apr 5, 2026
Collaborator

Great design, @jlugo63. The separation of powers model maps directly to problems we have been working on.

What aligns with AGT today:

Hash-linked audit trail maps to our MerkleAuditChain and FlightRecorder
Risk-based tiered autonomy maps to our ContextualPolicyEngine (inner_loop/ci_cd/autonomous)
Constitutional invariants map to our deny-overrides conflict resolution
Sandbox-first evidence aligns with agent-hypervisor execution isolation

Where Gavel adds value we do not have:

Structural separation of powers (proposer != reviewer != approver) — our model uses a single policy engine, not a multi-agent review chain
Liveness monitor (timeout-based auto-deny): a pattern our kill switch should adopt
Deterministic evidence review without LLM involvement — important for auditability

On your open questions:

Policy engines + structural separation are complementary. Policy catches known-bad; separation catches novel-bad that passes policy
Trust scoring is a signal, not a gate. Separation provides the gate
EU AI Act Art. 14 practically requires something like governance chains for high-risk systems

Would be interested in exploring integration — AGT provides policy evaluation and trust scoring, Gavel wraps that in multi-agent approval workflow. Happy to discuss architecture.

jlugo63 · 2026-04-05T05:19:13Z

jlugo63
Apr 5, 2026
Author

@imran-siddique Thanks for the detailed mapping — really helpful to see how everything lines up. The MerkleAuditChain + Gavel governance chains feel like a strong combination.

The integration you’re describing is very much where this is heading — AGT handling identity + policy evaluation, with Gavel layering on the multi-step governance workflow. I’d be happy to put together a quick POC to show how those pieces fit together.

Would you prefer starting with a design doc, or should I open a PR with an integration example?

CorellisOrg · 2026-04-16T07:37:41Z

CorellisOrg
Apr 16, 2026

The Kiro incident is a perfect motivator for this kind of design. We had our own near-miss: an agent decided to "clean up" a production Slack channel by archiving it, which would have deleted months of operational context.

We've been running approval workflows in production since February with 28 agents. Our approach is simpler than your governance chains but solves a similar problem:

Confidence scoring instead of static rules. Each agent self-reports confidence (0-1) on every action. Below a threshold → auto-approved. Above → queued for human review. Over time the thresholds adjust based on the agent's track record. An agent that has never made a destructive mistake gets a higher auto-approve ceiling.

Correction propagation as a governance primitive. When an approval is denied and the human explains why, that correction propagates to ALL agents in the fleet, not just the one that triggered it. This is the "fleet learning" layer. After 2 months, our denial rate dropped because agents learned from each other's mistakes.

The hard part isn't the approval gate; it's what happens after. Denying an action is easy. Teaching the agent to not propose it again is the real challenge.

Our framework: Corellis (https://github.com/CorellisOrg/corellis), open source, built on OpenClaw. Would be interesting to see if the governance chain model could be combined with fleet-wide correction propagation.

musaabhasan · 2026-05-08T17:36:49Z

musaabhasan
May 8, 2026

The separation-of-powers model is strong, but I would make the approval binding very explicit.

A governance chain should not approve a vague intent such as "fix production issue". It should approve a specific bounded action:

proposed action,
declared scope,
target environment,
exact resources affected,
tool names and arguments,
risk tier,
rollback plan,
evidence collected in sandbox,
expiration time,
and the identity allowed to execute it.

Then the execution token should be invalid if any of those fields change. This prevents the common failure where an agent gets approval for a narrow action and then executes a broader variant after context changes.
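One way to sketch that binding: hash the approved fields into a digest, store it on the token, and recompute it from the actual request at execution time. Everything here (`approval_digest`, `ExecutionToken`, the field names) is hypothetical and only illustrates the idea. A real token would also need a signature or HMAC so it cannot be forged, which this sketch omits.

```python
import hashlib
import json


def approval_digest(fields):
    # Canonical JSON so the same approved fields always yield the same digest.
    canonical = json.dumps(fields, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class ExecutionToken:
    def __init__(self, approved_fields, expires_at):
        self.digest = approval_digest(approved_fields)
        self.expires_at = expires_at
        self.used = False

    def authorize(self, requested_fields, now):
        if self.used:
            return False  # single use: a second attempt is refused
        if now >= self.expires_at:
            return False  # expired
        if approval_digest(requested_fields) != self.digest:
            return False  # any changed field invalidates the approval
        self.used = True
        return True
```

For example, a token approved for `{"resources": ["config/app.yaml"], ...}` refuses a request whose `resources` list has grown, even if every other field is identical.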

I would also separate three review outcomes: policy denial, proportionality denial, and evidence insufficiency. They look similar operationally, but they teach different lessons. A policy denial means the action is not allowed. A proportionality denial means the action is technically allowed but excessive for the problem. Evidence insufficiency means the chain needs more sandbox proof before humans or agents can approve.
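A small sketch of keeping those outcomes distinct in code, so downstream handling can branch on them. The enum, the `classify` helper, and the order in which the checks are applied are illustrative assumptions rather than anything specified in this thread.

```python
from enum import Enum


class ReviewOutcome(Enum):
    APPROVED = "approved"
    POLICY_DENIAL = "policy_denial"
    PROPORTIONALITY_DENIAL = "proportionality_denial"
    EVIDENCE_INSUFFICIENT = "evidence_insufficient"


def classify(policy_allows, evidence_complete, proportionate):
    # Assumed precedence: policy first, then evidence, then proportionality.
    if not policy_allows:
        return ReviewOutcome.POLICY_DENIAL
    if not evidence_complete:
        return ReviewOutcome.EVIDENCE_INSUFFICIENT
    if not proportionate:
        return ReviewOutcome.PROPORTIONALITY_DENIAL
    return ReviewOutcome.APPROVED
```

Recording the specific outcome in the audit trail, rather than a generic "denied", preserves the lesson each denial is supposed to teach.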

The correction-propagation idea in the comments is useful, but I would avoid turning denials directly into global behavior changes without review. A safer pattern is: denial -> classified correction -> policy/rubric update -> regression test -> rollout to agents. That keeps fleet learning auditable.

ElamOlame31 · 2026-05-28T00:22:24Z

ElamOlame31
May 28, 2026

This maps directly to something we found critical: when an agent in a delegation chain gets compromised, you need trust contagion — the parent's trust score should drop automatically when a child is quarantined.

In AgentGate: child quarantined → parent -15pts, parent quarantined → all children -30pts, 1h TTL. Curious how AGT handles propagation across delegation chains.
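The propagation rule can be sketched roughly as follows. This is not AgentGate's implementation: the `TrustGraph` class, the base score of 100, the one-hop propagation, and the reading of "1h TTL" as "the penalty expires after one hour" are all assumptions made for illustration.

```python
from collections import defaultdict

CHILD_QUARANTINE_PARENT_PENALTY = 15   # child quarantined -> parent -15
PARENT_QUARANTINE_CHILD_PENALTY = 30   # parent quarantined -> each child -30
PENALTY_TTL_SECONDS = 3600             # assumed meaning of "1h TTL"


class TrustGraph:
    def __init__(self, base_score=100):
        self.base_score = base_score          # assumed starting score
        self.parent_of = {}
        self.children_of = defaultdict(list)
        self.penalties = defaultdict(list)    # agent -> [(points, expires_at)]

    def delegate(self, parent, child):
        self.parent_of[child] = parent
        self.children_of[parent].append(child)

    def quarantine(self, agent, now):
        # Propagate one hop up and one hop down (an assumption of this sketch).
        expires_at = now + PENALTY_TTL_SECONDS
        parent = self.parent_of.get(agent)
        if parent is not None:
            self.penalties[parent].append((CHILD_QUARANTINE_PARENT_PENALTY, expires_at))
        for child in self.children_of.get(agent, []):
            self.penalties[child].append((PARENT_QUARANTINE_CHILD_PENALTY, expires_at))

    def score(self, agent, now):
        # Only penalties that have not yet expired count against the score.
        active = sum(p for p, exp in self.penalties.get(agent, []) if exp > now)
        return max(0, self.base_score - active)
```

Under these assumptions, quarantining one worker lowers its parent's score by 15 for the next hour, after which the score returns to its base value.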

https://github.com/ElamOlame31/agentgate-public

https://www.tryagentgate.com/
