Replies: 2 comments
This gap is important because a hash-chained audit record proves integrity, but not decision sufficiency. In regulated environments, reviewers usually ask not only "was this event changed?" but "was the decision explainable at the time it was made?" I would include a bounded decision context in the audit entry, but with careful redaction:
The key is to store enough to replay or defend the governance decision without turning the audit log into a sensitive prompt archive. For healthcare, financial services, and education, the audit schema should support privacy-preserving evidence by default, with deeper forensic payloads stored separately under stricter access controls.
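One way to picture the split between privacy-preserving evidence and a separately stored forensic payload is the minimal sketch below. It is an illustration under stated assumptions, not a schema from the project: `bounded_context`, the field names, and the `vault://` reference format are all hypothetical.

```python
import hashlib
from typing import Any, Dict


def bounded_context(prompt: str, matched_rule: str, payload_ref: str) -> Dict[str, Any]:
    """Build an audit-safe context: a digest and length instead of the raw prompt.

    All field names here are hypothetical, chosen only for illustration.
    """
    digest = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
    return {
        "matched_rule": matched_rule,
        # The digest lets a reviewer match this entry to the stored payload.
        "prompt_sha256": digest,
        "prompt_length": len(prompt),
        # Pointer into a separate store with stricter access controls.
        "forensic_payload_ref": payload_ref,
    }


prompt = "Check loan eligibility for John Smith"
entry_data = bounded_context(prompt, "allow_loan_inquiries", "vault://payloads/example-001")
```

The audit entry carries no raw prompt text, so the chained log can be shared with reviewers more widely than the forensic store it points to.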
In regulated environments we found that decision context needs to capture not just the current request but also the agent's behavioral trajectory: what it has been doing over the last 24h. A single read of a sensitive file looks different when it is the 11th read in 5 minutes than when it is the first read of the day. AgentGate encodes this trajectory into the trust score and the audit entry before execution.
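The "11th read in 5 minutes" signal can be approximated with a sliding-window counter. This is a rough sketch only and does not describe how AgentGate computes its trust score; `TrajectoryWindow` and its parameters are hypothetical names.

```python
from collections import deque
from typing import Deque, Dict, Tuple


class TrajectoryWindow:
    """Counts recent accesses per (agent, resource) inside a sliding time window."""

    def __init__(self, window_seconds: float) -> None:
        self.window_seconds = window_seconds
        self._events: Dict[Tuple[str, str], Deque[float]] = {}

    def record(self, agent_id: str, resource: str, now: float) -> int:
        """Record one access at time `now` and return the count inside the window."""
        events = self._events.setdefault((agent_id, resource), deque())
        events.append(now)
        # Drop timestamps that have fallen out of the window.
        while events and now - events[0] > self.window_seconds:
            events.popleft()
        return len(events)


window = TrajectoryWindow(window_seconds=300)  # 5 minutes
# Eleven reads, 20 seconds apart, all within the same window.
counts = [window.record("agent-1", "patients.csv", now=t * 20.0) for t in range(11)]
first_read, eleventh_read = counts[0], counts[-1]
# A read long after the burst starts from a count of 1 again.
quiet_read = window.record("agent-1", "patients.csv", now=10_000.0)
```

The returned count could be written into the audit entry before execution, so a reviewer sees the burst rather than eleven entries that each look routine.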
What the audit trail records today
`GovernancePolicyMiddleware` in `agent_os/integrations/maf_adapter.py` calls `AuditLog.log()` after `PolicyEvaluator.evaluate()` returns. The `data` dict passed to the log contains:

`AuditEntry` in `agentmesh/governance/audit.py` captures:

`MerkleAuditChain` links entries via SHA-256, and `AuditLog.verify_integrity()` confirms the chain is unbroken.

The specific gap
`PolicyEvaluator.evaluate()` already builds a richer `audit_entry` dict per `PolicyDecision`:

`GovernancePolicyMiddleware` receives this on `decision.audit_entry` but does not forward it to `AuditLog.log()`. The evaluator builds the richer context. The middleware drops it. The `AuditLog.log()` signature itself has no `audit_entry` parameter -- pre-decision context would pass through the existing `data: Optional[dict]` argument.

The result: every audit entry records what the policy decided. None records what the agent was doing, at what confidence, and what governance weight that decision carried before the policy engine saw it.
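To make the gap concrete, here is a minimal sketch of forwarding the evaluator's context through `data`. It does not use the real classes: `StubDecision` and `StubAuditLog` are hypothetical stand-ins, the `log()` signature is assumed, the chain is a simple linear SHA-256 chain rather than the project's `MerkleAuditChain`, and the contents of `context_snapshot` are invented for the example.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class StubDecision:
    """Hypothetical stand-in for PolicyDecision; only the fields used here."""
    action: str
    matched_rule: str
    audit_entry: Dict[str, Any] = field(default_factory=dict)


class StubAuditLog:
    """Hypothetical stand-in for AuditLog: a linear SHA-256 hash chain."""

    def __init__(self) -> None:
        self.entries: List[Dict[str, Any]] = []

    @staticmethod
    def _digest(previous_hash: str, body: Dict[str, Any]) -> str:
        payload = previous_hash + json.dumps(body, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def log(self, event_type: str, action: str, data: Optional[dict] = None) -> Dict[str, Any]:
        body = {"event_type": event_type, "action": action, "data": data or {}}
        previous_hash = self.entries[-1]["hash"] if self.entries else ""
        entry = {**body, "previous_hash": previous_hash,
                 "hash": self._digest(previous_hash, body)}
        self.entries.append(entry)
        return entry

    def verify_integrity(self) -> bool:
        previous_hash = ""
        for entry in self.entries:
            body = {k: entry[k] for k in ("event_type", "action", "data")}
            if entry["previous_hash"] != previous_hash:
                return False
            if entry["hash"] != self._digest(previous_hash, body):
                return False
            previous_hash = entry["hash"]
        return True


def log_decision(audit_log: StubAuditLog, decision: StubDecision) -> Dict[str, Any]:
    """Merge the evaluator's audit_entry into `data` instead of dropping it."""
    data = {"matched_rule": decision.matched_rule, **decision.audit_entry}
    return audit_log.log("policy_evaluation", decision.action, data=data)


audit_log = StubAuditLog()
decision = StubDecision(
    action="allow",
    matched_rule="allow_loan_inquiries",
    audit_entry={"context_snapshot": {"agent_id": "loan-agent"}},  # invented content
)
entry = log_decision(audit_log, decision)
```

Because the merged context is part of the hashed body, any later edit to it breaks `verify_integrity()` in the same way an edit to the policy outcome would.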
Why this matters in regulated verticals
The five MAF integration scenarios cover banking, retail, healthcare, enterprise IT, and DevOps. In each, a compliance reviewer reading the Merkle-chained log can answer one question per entry: was this action permitted?
In regulated industries -- insurance, financial services, pharma -- auditors routinely ask a second question: was this decision sound? That is not a question about policy compliance. It is a question about governance quality: what confidence was the agent operating at, what category of deliberation did this decision require, and who in the accountability chain holds responsibility for this decision class.
The loan processing scenario (scenario 01) illustrates the gap concretely. An allowed loan inquiry writes this to the audit trail:

    {
      "event_type": "policy_evaluation",
      "action": "allow",
      "outcome": "success",
      "data": {
        "matched_rule": "allow_loan_inquiries",
        "message_preview": "Check loan eligibility for John Smith..."
      }
    }

A compliance reviewer confirms the policy cleared. They cannot determine:
The EU AI Act Art. 12 requires automatic logging. Art. 14 requires human oversight with the ability to intervene. Art. 86 requires the ability to explain a decision. A Merkle-chained trail of `allow / matched_rule: allow_loan_inquiries` satisfies Art. 12 structurally. It provides no material for Art. 14 or Art. 86 review.

Where the extension point already exists
`decision.audit_entry` already carries `context_snapshot` from the evaluator. The `data` dict in `AuditLog.log()` accepts arbitrary key-value pairs. The extension point requires no new parameters and no schema changes to `AuditEntry`.

Pre-decision governance context -- confidence score, gate classification, reasoning reconstruction, accountability owner -- would merge into the same `data` dict before `AuditLog.log()` is called, landing in the same Merkle-chained entry alongside the policy outcome.

Proposed contribution
A new MAF integration scenario (`examples/maf-integration/`) for an insurance claims processing agent, following the existing four-act structure with a fifth act added:

Act 5 prints each audit entry twice: once as Acts 1-4 currently display it (policy outcome and matched rule), and once with the pre-decision context merged into `data`:

    {
      "event_type": "policy_evaluation",
      "action": "allow",
      "outcome": "success",
      "data": {
        "matched_rule": "allow_claim_approval",
        "confidence_score": 0.85,
        "confidence_threshold": 0.80,
        "confidence_zone": "GREEN",
        "gate_classification": "elevated_review",
        "gate_rationale": "Tool class database_write triggers elevated_review regardless of confidence zone per contoso-insurance-governance.",
        "alternatives_considered": ["flag_for_manual_review", "request_additional_documentation"],
        "reasoning_reconstruction": "claims-agent-001 evaluated database_write against contoso-insurance-governance at confidence 0.85 (GREEN). Tool class overrides to elevated_review. Human reviewer acknowledgment required before execution proceeds.",
        "accountability_owner": "senior-claims-manager",
        "decision_written_at": "2026-04-27T14:00:00Z"
      }
    }

A compliance reviewer reading the second view can answer all six questions in the table above from the same Merkle-chained artifact.
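The merged view in the example entry could be produced by a small classifier along the following lines. This is a sketch under loud assumptions: only the `database_write` to `elevated_review` mapping and the `GREEN` zone appear in the example entry. Every other gate name, the zone rule, the fallback for unknown tool classes, and the function names are hypothetical placeholders.

```python
from typing import Any, Dict

# Hypothetical mapping. Only database_write -> elevated_review comes from the
# example entry; the other three values are placeholders for illustration.
GATE_BY_TOOL_CLASS = {
    "file_read": "routine",
    "web_search": "routine",
    "database_write": "elevated_review",
    "bulk_delete": "human_approval",
}


def classify_gate(tool_class: str, confidence: float, threshold: float) -> Dict[str, Any]:
    """Return pre-decision context for one tool call."""
    # Assumed zone rule: GREEN at or above the threshold. The label used
    # below the threshold is a placeholder, not a value from the proposal.
    zone = "GREEN" if confidence >= threshold else "BELOW_THRESHOLD"
    # Assumed behavior: the tool class decides the gate regardless of zone,
    # and unknown tool classes fall back to the stricter review gate.
    gate = GATE_BY_TOOL_CLASS.get(tool_class, "elevated_review")
    return {
        "confidence_score": confidence,
        "confidence_threshold": threshold,
        "confidence_zone": zone,
        "gate_classification": gate,
    }


def with_gate_context(policy_data: Dict[str, Any], tool_class: str,
                      confidence: float, threshold: float) -> Dict[str, Any]:
    """Merge gate context into the policy outcome's data without mutating the input."""
    return {**policy_data, **classify_gate(tool_class, confidence, threshold)}


data = with_gate_context({"matched_rule": "allow_claim_approval"},
                         "database_write", confidence=0.85, threshold=0.80)
```

Returning a new dict keeps the policy outcome untouched, which matches the goal of adding context without changing what the evaluator produced.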
The scenario covers four decision categories across the gate classification model, producing one audit entry per category:

- `file_read`
- `web_search`
- `database_write`
- `bulk_delete`

Implementation shape
The gate classifier wraps the existing `GovernancePolicyMiddleware` call without modifying `PolicyEvaluator.evaluate()` or `AuditLog.log()`:

No changes to `AuditEntry`, `MerkleAuditChain`, `PolicyEvaluator`, or `AuditLog`. The scenario is self-contained and standalone per the existing five scenarios' pattern.

AI-assisted contributions disclosure
This proposal was developed with AI assistance (Claude). Per CONTRIBUTING.md disclosure requirements: I directed the analysis, reviewed every claim against the source code, and can walk through any part of this. The specific findings -- that `decision.audit_entry` is not forwarded to `AuditLog.log()`, that `AuditLog.log()` has no `audit_entry` parameter, that the `data` dict is the correct extension point -- were verified against `maf_adapter.py`, `audit.py`, `evaluator.py`, and `decision.py` directly.

Prior art
Gate classification model and pre-decision artifact schema: `mj3b/governed-decision-intelligence` (Apache 2.0). All patterns used in the implementation will be attributed per CONTRIBUTING.md requirements in the PR description and in code comments.

Three questions before building
- Is `examples/maf-integration/` the right path, or would maintainers prefer a `regulated-verticals/` subdirectory given the compliance-reviewer framing?
- `PolicyDecision.audit_entry` carries `context_snapshot` from the evaluator. Would maintainers prefer the gate classification fields merge into that existing dict, or remain separate in `data`?