Commit 3ff9a59

feat(governance): confidential computing, MoE routing, PQC WORM — runnable assurance
Extend the runnable-assurance suite into four net-new verifiable domains:

- Confidential computing (env-01): rego/attestation_gate.rego enforces SEV-SNP/TDX + vTPM PCR_MATCH admission (golden measurement, TCB anti-rollback, fresh nonce), with structured denial reasons; 9 OPA tests (21/21 total). TLA+ AdmissionWithAttestation proves no T0 workload runs without valid attestation and that TCB rollback / PCR drift force eviction (TLC, 64 states, no error).
- MoE routing stability (rte-01): routing/sara_acr_router.py implements SARA (load-aware gating) + ACR (capacity regulation); demonstrates baseline expert collapse (entropy 0.38, load ratio 5.6) vs stabilized (entropy 0.99, ratio 1.25) satisfying entropy/load/drop invariants; 4 pytests.
- PQC WORM (cry-02): kafka/pqc_worm_logger_v2.py replaces the HMAC placeholder with real CRYSTALS-Dilithium (ML-DSA-65 / FIPS 204) signatures + tamper-evident hash chain + S3 Object Lock COMPLIANCE retention; verify_chain() detects entry mutation, batch reorder, and signature forgery; 6 pytests.
- OSCAL: new catalog_sentinel_v24_env_rte.json adding ENV and RTE control groups, each backed by a runnable artifact.

run_runnable_assurance.sh now runs 8 checks (all PASS); CI + docs + requirements updated. No regressions in existing governance tests.
1 parent 00e43e9 commit 3ff9a59

13 files changed

Lines changed: 947 additions & 16 deletions

.github/workflows/runnable-assurance.yml

Lines changed: 6 additions & 1 deletion
@@ -40,7 +40,7 @@ jobs:
           java-version: '17'
 
       - name: Install Python deps
-        run: pip install pyyaml jsonschema
+        run: pip install pyyaml jsonschema dilithium-py pytest
 
       - name: Install OPA
         run: |
@@ -71,5 +71,10 @@ jobs:
           circom circuits/src1_concentration_bound.circom --r1cs --wasm --sym --O0 -o circuits/
           circom circuits/src_fair1_reason_code_check.circom --r1cs --wasm --sym --O0 -o circuits/
 
+      - name: Unit tests (routing + PQC WORM)
+        run: |
+          pytest governance_artifacts/routing/test_sara_acr_router.py -q
+          pytest governance_artifacts/kafka/test_pqc_worm_logger_v2.py -q
+
       - name: Run runnable assurance suite
         run: bash governance_artifacts/run_runnable_assurance.sh

governance_artifacts/RUNNABLE_ASSURANCE.md

Lines changed: 66 additions & 5 deletions
@@ -23,11 +23,21 @@ Runs all five checks below and fails fast on any error.
 
 | # | Check | Tool | Backs OSCAL control | Regime anchor |
 |---|-------|------|---------------------|---------------|
-| 1 | Deny-by-default release gate + high-impact credit gate | `opa test` (12 tests) | release-gate semantics; `con-07` quorum | SR 11-7, EU AI Act Art. 14, ECOA, GDPR Art. 22 |
+| 1 | Release gate + credit gate + confidential-computing attestation gate (PCR_MATCH) | `opa test` (21 tests) | release-gate, `con-07`, `env-01` | SR 11-7, EU AI Act Art. 14/15, ECOA, GDPR Art. 22, DORA |
 | 2 | Containment one-way ratchet & terminal-actuation quorum | TLA+ `tlc2.TLC` | `con-04`, `con-07` | EU AI Act Art. 14, DORA resilience testing |
-| 3 | GC-IR cross-target conformance (policy ⇔ circuit ⇔ expectation) | `opa eval` + Circom witness | obligation `ob-ecoa-adverse-reason-codes` | ECOA, GDPR Art. 22, EU AI Act Art. 13 |
-| 4 | Systemic-risk concentration bound (HHI) zk proof | Circom + Groth16 (snarkjs) | `cry-05` | Basel op-risk, systemic telemetry |
-| 5 | Governance artifact schema validation | Python validator | manifest/schema integrity | OSCAL, evidence logging (EU AI Act Art. 12) |
+| 3 | Attested admission — no T0 workload runs without fresh valid attestation; TCB rollback / PCR drift force eviction | TLA+ `tlc2.TLC` | `env-01` | EU AI Act Art. 15, DORA ICT risk, NIST AI RMF |
+| 4 | GC-IR cross-target conformance (policy ⇔ circuit ⇔ expectation) | `opa eval` + Circom witness | obligation `ob-ecoa-adverse-reason-codes` | ECOA, GDPR Art. 22, EU AI Act Art. 13 |
+| 5 | Systemic-risk concentration bound (HHI) zk proof | Circom + Groth16 (snarkjs) | `cry-05` | Basel op-risk, systemic telemetry |
+| 6 | SARA/ACR MoE routing stabilization invariants (entropy / load balance / drop) | Python simulator + pytest | `rte-01` | EU AI Act Art. 15 robustness, SR 11-7 |
+| 7 | PQC WORM audit log — real CRYSTALS-Dilithium (ML-DSA-65) signatures + tamper-evident hash chain + S3 Object Lock retention | Python (`dilithium-py`) + pytest | `cry-02` | DORA, EU AI Act Art. 12 logging |
+| 8 | Governance artifact schema validation | Python validator | manifest/schema integrity | OSCAL, evidence logging (EU AI Act Art. 12) |
+
+### New control groups (`oscal/catalog_sentinel_v24_env_rte.json`)
+
+- **ENV — Confidential Computing & Attested Execution**: `env-01` (hardware-attested
+  admission for T0/T1 via SEV-SNP / TDX + vTPM PCR_MATCH; runtime TCB-rollback and
+  PCR-drift eviction), `env-02` (enclave-bound ML-DSA key custody).
+- **RTE — MoE Routing Stability**: `rte-01` (SARA/ACR stabilization invariants).
 
 ## 1. OPA policy tests — `rego/`
 
@@ -90,6 +100,57 @@ cd governance_artifacts/zk && bash run_src1_proof.sh
 > production-secure. A production deployment requires a multi-party trusted setup
 > (or a transparent system such as PLONK/STARK as noted in the schema enum).
 
+## 6. Confidential-computing attestation gate — `rego/attestation_gate.rego` + `tla/AdmissionWithAttestation.tla`
+
+The `PCR_MATCH=TRUE` assertion that recurs throughout the master docs is now
+*enforced*, not merely stated. The Rego gate (`sentinel.attestation`) admits a
+T0/T1 workload only when it presents a SEV-SNP or TDX report with a verified
+signature, fresh anti-replay nonce, a launch measurement in the golden registry,
+platform TCB at/above the ratified minimum (no rollback), and a vTPM PCR quote
+matching the policy digest. The TLA+ spec proves the *temporal* guarantee: across
+all 64 initial evidence combinations, no workload reaches `RUNNING` without a
+valid attestation, and runtime TCB rollback or PCR drift forces `EVICTED`.
+
+```bash
+opa test governance_artifacts/rego/ # includes 9 attestation tests
+cd governance_artifacts/tla
+java -cp tools/tla2tools.jar tlc2.TLC -config AdmissionWithAttestation.cfg AdmissionWithAttestation.tla
+```
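
The admission rule described in section 6 can be illustrated with a minimal Python sketch of the same checks. This is not the Rego policy from `rego/attestation_gate.rego`, which this diff does not show. The `Evidence` fields, the policy constants, the nonce age limit, and the denial-reason strings are all hypothetical names chosen for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass

# Illustrative policy data (assumptions, not the repository's real values).
SUPPORTED_TEES = {"SEV-SNP", "TDX"}
GOLDEN_MEASUREMENTS = {"sha384:golden-a", "sha384:golden-b"}
MIN_TCB_VERSION = 7
POLICY_PCR_DIGEST = "sha256:policy-digest"
MAX_NONCE_AGE_S = 60


@dataclass(frozen=True)
class Evidence:
    """Hypothetical, already-parsed attestation evidence for one workload."""
    tee_type: str
    signature_verified: bool
    nonce_age_s: int
    launch_measurement: str
    tcb_version: int
    pcr_digest: str


def denial_reasons(ev: Evidence) -> list[str]:
    """Return every failed check; an empty list means the workload is admitted."""
    reasons = []
    if ev.tee_type not in SUPPORTED_TEES:
        reasons.append("unsupported_tee")
    if not ev.signature_verified:
        reasons.append("bad_report_signature")
    if ev.nonce_age_s > MAX_NONCE_AGE_S:
        reasons.append("stale_nonce")
    if ev.launch_measurement not in GOLDEN_MEASUREMENTS:
        reasons.append("measurement_not_golden")
    if ev.tcb_version < MIN_TCB_VERSION:
        reasons.append("tcb_rollback")
    if ev.pcr_digest != POLICY_PCR_DIGEST:
        reasons.append("pcr_mismatch")
    return reasons


def admit(ev: Evidence) -> bool:
    """Deny by default: admit only when no check failed."""
    return not denial_reasons(ev)
```

Collecting every failed check, rather than stopping at the first one, mirrors the "structured denial reasons" mentioned in the commit message.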
+
+## 7. SARA/ACR MoE routing stabilization — `routing/sara_acr_router.py`
+
+Defines and demonstrates two stack-specific mechanisms (not external standards):
+**SARA** (Stabilized Adaptive Routing — load-aware gating bias + temperature) and
+**ACR** (Adaptive Capacity Regulation — per-expert capacity factor with overflow
+handling). The simulator shows that under skewed gating a naive top-k router
+collapses (normalised entropy ≈ 0.38, load ratio ≈ 5.6) and *violates* the
+`rte-01` invariants, while SARA+ACR holds entropy ≈ 0.99 and load ratio ≈ 1.25,
+*satisfying* all invariants (entropy ≥ 0.80, load ratio ≤ 1.60, drop ≤ 0.02).
+
+```bash
+python3 governance_artifacts/routing/sara_acr_router.py
+pytest governance_artifacts/routing/test_sara_acr_router.py -q # 4 tests
+```
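
The `rte-01` thresholds quoted in section 7 can be checked from per-expert token counts. The sketch below is not taken from `routing/sara_acr_router.py`, which this diff does not show. It assumes that normalised entropy is the Shannon entropy of the load distribution divided by the log of the expert count, and that load ratio is the busiest expert's load divided by the mean load; neither definition is confirmed by the diff, and the function names are hypothetical.

```python
import math

# Thresholds as quoted for rte-01 in the README text.
ENTROPY_MIN = 0.80
LOAD_RATIO_MAX = 1.60
DROP_MAX = 0.02


def normalised_entropy(loads):
    """Shannon entropy of the load distribution, scaled to [0, 1] (assumed definition)."""
    if len(loads) < 2:
        raise ValueError("need at least two experts")
    total = sum(loads)
    probs = [x / total for x in loads if x > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return entropy / math.log(len(loads))


def load_ratio(loads):
    """Busiest expert's load relative to the mean load (assumed definition)."""
    mean = sum(loads) / len(loads)
    return max(loads) / mean


def satisfies_rte01(loads, dropped, routed):
    """True when entropy, load-ratio, and drop-rate thresholds all hold."""
    drop_rate = dropped / routed
    return (
        normalised_entropy(loads) >= ENTROPY_MIN
        and load_ratio(loads) <= LOAD_RATIO_MAX
        and drop_rate <= DROP_MAX
    )
```

With eight equally loaded experts the entropy is 1 and the load ratio is 1, so the check passes; a distribution where one expert takes most of the tokens fails on both entropy and load ratio.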
+
+## 8. PQC WORM audit log — `kafka/pqc_worm_logger_v2.py`
+
+Replaces the original HMAC placeholder with **real CRYSTALS-Dilithium (ML-DSA-65,
+FIPS 204)** signatures over canonical batch payloads, linked in a tamper-evident
+**hash chain** (`prev_batch_hash`), with an S3 Object Lock COMPLIANCE-mode
+retention record per batch. `verify_chain()` re-validates every signature and link
+and returns a supervisory-ready report; the demo and tests show that entry mutation,
+batch reordering, and signature forgery are all detected.
+
+```bash
+python3 governance_artifacts/kafka/pqc_worm_logger_v2.py
+pytest governance_artifacts/kafka/test_pqc_worm_logger_v2.py -q # 6 tests
+```
+
+> ML-DSA-65 here is provided by the pure-Python `dilithium-py` reference
+> implementation — correct and FIPS-204-aligned, but **not** constant-time or
+> side-channel-hardened. Production signing belongs in the env-02 enclave using a
+> validated cryptographic module.
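
The hash-chain half of the design in section 8 can be shown on its own with the standard library. The sketch below deliberately omits the ML-DSA signatures and the retention record, and its function names (`append_batch`, `verify_chain`) and dictionary layout are illustrative assumptions rather than the logger's API. It only demonstrates why linking each batch to its predecessor's payload hash exposes mutation, reordering, and removal of a batch from the middle of the chain.

```python
import hashlib
import json

GENESIS = "sha256:" + "0" * 64


def canon(obj) -> bytes:
    """Deterministic JSON encoding so equal payloads hash identically."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode()


def digest(data: bytes) -> str:
    return "sha256:" + hashlib.sha256(data).hexdigest()


def append_batch(chain: list, entries: list) -> None:
    """Append a batch whose payload binds the previous batch's payload hash."""
    prev = chain[-1]["payload_hash"] if chain else GENESIS
    payload = {"entries": entries, "prev_batch_hash": prev}
    chain.append({**payload, "payload_hash": digest(canon(payload))})


def verify_chain(chain: list) -> list:
    """Return a list of error strings; an empty list means the chain is intact."""
    errors = []
    prev = GENESIS
    for i, batch in enumerate(chain):
        if batch["prev_batch_hash"] != prev:
            errors.append(f"batch[{i}]: broken hash chain link")
        payload = {"entries": batch["entries"],
                   "prev_batch_hash": batch["prev_batch_hash"]}
        if digest(canon(payload)) != batch["payload_hash"]:
            errors.append(f"batch[{i}]: payload hash mismatch")
        prev = batch["payload_hash"]
    return errors
```

Linkage alone does not reveal removal of the most recent batches, because a truncated chain is still internally consistent. A hash chain without signatures can also be recomputed wholesale by anyone able to rewrite the log, which is the gap the per-batch signature closes.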
+
 ## Reproducing from a clean checkout
 
 ```bash
@@ -101,7 +162,7 @@ curl -L -o ~/.local/bin/circom https://github.com/iden3/circom/releases/download
 # TLA+ tools
 curl -L -o governance_artifacts/tla/tools/tla2tools.jar https://github.com/tlaplus/tlaplus/releases/download/v1.7.4/tla2tools.jar
 # Python
-pip install pyyaml jsonschema
+pip install pyyaml jsonschema dilithium-py
 # Run everything
 bash governance_artifacts/run_runnable_assurance.sh
 ```
Lines changed: 181 additions & 0 deletions
@@ -0,0 +1,181 @@
+#!/usr/bin/env python3
+"""
+PQC WORM Logger v2 — CRYSTALS-Dilithium (ML-DSA-65 / FIPS 204) signed audit log.
+================================================================================
+Backs OSCAL control cry-02 (hybrid PQC signatures on governance event envelopes)
+and the Kafka/S3 Object Lock WORM evidence pipeline.
+
+Improvements over the original pqc_worm_logger.py (which used an HMAC placeholder):
+
+  * REAL post-quantum signatures via ML-DSA-65 (CRYSTALS-Dilithium), the exact
+    algorithm named in cry-02. Each batch is signed; verification uses the public
+    key only (asymmetric, unlike the prior HMAC).
+  * Tamper-evident HASH CHAIN: each batch records prev_batch_hash, so any
+    reordering, deletion, or mutation of historic batches is detectable.
+  * WORM semantics modelled: an immutable "retention" record (S3 Object Lock
+    COMPLIANCE mode + retain-until date) accompanies each committed batch.
+  * verify_chain() re-validates every signature AND the hash linkage; returns a
+    machine-readable report suitable for supervisory evidence.
+
+A fallback is intentionally absent: if dilithium_py is unavailable the import
+fails loudly rather than silently downgrading crypto.
+"""
+from __future__ import annotations
+import hashlib
+import json
+import time
+from dataclasses import dataclass, field
+from datetime import datetime, timedelta, timezone
+from typing import Any
+
+from dilithium_py.ml_dsa import ML_DSA_65
+
+ALG = "ML-DSA-65"  # CRYSTALS-Dilithium, FIPS 204
+RETENTION_YEARS = 7  # Basel/DORA-style retention default
+
+
+def _canon(obj: Any) -> bytes:
+    """Deterministic canonical JSON for signing/hashing."""
+    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode()
+
+
+def _sha256(b: bytes) -> str:
+    return "sha256:" + hashlib.sha256(b).hexdigest()
+
+
+@dataclass
+class CommittedBatch:
+    batch_id: str
+    timestamp: str
+    entries: list[dict]
+    prev_batch_hash: str
+    payload_hash: str
+    signature_hex: str
+    retention: dict  # S3 Object Lock model
+
+    def to_dict(self) -> dict:
+        return {
+            "batch_id": self.batch_id,
+            "timestamp": self.timestamp,
+            "entries": self.entries,
+            "prev_batch_hash": self.prev_batch_hash,
+            "payload_hash": self.payload_hash,
+            "signature_alg": ALG,
+            "signature_hex": self.signature_hex,
+            "retention": self.retention,
+        }
+
+
+@dataclass
+class PQCWormLoggerV2:
+    bucket: str = "kacg-gsifi-worm-evidence-prod"
+    batch_size_threshold: int = 10
+    _pk: bytes = field(default=None, repr=False)
+    _sk: bytes = field(default=None, repr=False)
+    _pending: list[dict] = field(default_factory=list, repr=False)
+    _chain: list[CommittedBatch] = field(default_factory=list, repr=False)
+    _genesis: str = "sha256:" + "0" * 64
+
+    def __post_init__(self):
+        if self._pk is None or self._sk is None:
+            self._pk, self._sk = ML_DSA_65.keygen()
+
+    @property
+    def public_key_fingerprint(self) -> str:
+        return _sha256(self._pk)
+
+    def add_entry(self, entry: dict) -> CommittedBatch | None:
+        self._pending.append(entry)
+        if len(self._pending) >= self.batch_size_threshold:
+            return self.commit_batch()
+        return None
+
+    def commit_batch(self) -> CommittedBatch | None:
+        if not self._pending:
+            return None
+        entries = self._pending
+        self._pending = []
+
+        prev_hash = self._chain[-1].payload_hash if self._chain else self._genesis
+        ts = datetime.now(timezone.utc).isoformat()
+        batch_id = hashlib.sha256(f"{ts}{len(self._chain)}".encode()).hexdigest()[:16]
+
+        # Payload binds entries + the previous hash (chain linkage).
+        payload = {"batch_id": batch_id, "timestamp": ts,
+                   "entries": entries, "prev_batch_hash": prev_hash}
+        payload_bytes = _canon(payload)
+        payload_hash = _sha256(payload_bytes)
+
+        # REAL ML-DSA signature over the canonical payload.
+        signature = ML_DSA_65.sign(self._sk, payload_bytes)
+
+        retain_until = (datetime.now(timezone.utc)
+                        + timedelta(days=365 * RETENTION_YEARS)).isoformat()
+        retention = {
+            "mode": "COMPLIANCE",  # S3 Object Lock COMPLIANCE mode
+            "retain_until": retain_until,
+            "legal_hold": False,
+            "bucket": self.bucket,
+        }
+
+        batch = CommittedBatch(
+            batch_id=batch_id, timestamp=ts, entries=entries,
+            prev_batch_hash=prev_hash, payload_hash=payload_hash,
+            signature_hex=signature.hex(), retention=retention,
+        )
+        self._chain.append(batch)
+        return batch
+
+    def verify_chain(self) -> dict:
+        """Re-verify every signature and the hash linkage. Returns a report."""
+        errors: list[str] = []
+        prev = self._genesis
+        for i, b in enumerate(self._chain):
+            if b.prev_batch_hash != prev:
+                errors.append(f"batch[{i}] {b.batch_id}: broken hash chain link")
+            payload = {"batch_id": b.batch_id, "timestamp": b.timestamp,
+                       "entries": b.entries, "prev_batch_hash": b.prev_batch_hash}
+            payload_bytes = _canon(payload)
+            if _sha256(payload_bytes) != b.payload_hash:
+                errors.append(f"batch[{i}] {b.batch_id}: payload hash mismatch")
+            if not ML_DSA_65.verify(self._pk, payload_bytes, bytes.fromhex(b.signature_hex)):
+                errors.append(f"batch[{i}] {b.batch_id}: ML-DSA signature INVALID")
+            prev = b.payload_hash
+        return {
+            "alg": ALG,
+            "public_key_fingerprint": self.public_key_fingerprint,
+            "batches": len(self._chain),
+            "status": "VERIFIED" if not errors else "FAILED",
+            "errors": errors,
+        }
+
+
+def _demo() -> int:
+    log = PQCWormLoggerV2(batch_size_threshold=3)
+    for i in range(7):
+        log.add_entry({
+            "event_id": f"evt-{i:03d}",
+            "timestamp": datetime.now(timezone.utc).isoformat(),
+            "control_id": "cry-02",
+            "decision": ["allow", "deny", "escalate"][i % 3],
+        })
+    log.commit_batch()  # flush remainder
+
+    report = log.verify_chain()
+    print("PQC WORM Logger v2 —", ALG)
+    print(f"  public key fingerprint: {report['public_key_fingerprint'][:23]}...")
+    print(f"  committed batches     : {report['batches']}")
+    print(f"  chain verification    : {report['status']}")
+    assert report["status"] == "VERIFIED", report
+
+    # Tamper test: mutate a historic entry and confirm detection.
+    log._chain[0].entries[0]["decision"] = "TAMPERED"
+    bad = log.verify_chain()
+    print(f"  after tamper          : {bad['status']} ({len(bad['errors'])} error(s))")
+    assert bad["status"] == "FAILED", "tamper went undetected!"
+    print("  RESULT: signatures + hash chain verify; tampering detected")
+    return 0
+
+
+if __name__ == "__main__":
+    raise SystemExit(_demo())
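
One detail in `commit_batch()` is that the retain-until date is computed as `365 * RETENTION_YEARS` days, which can land a day or two before the same calendar date seven years later when leap years fall in the interval. If calendar-exact retention matters, it could be computed as sketched below. The helper names are hypothetical, and clamping 29 February to 28 February is an assumed policy, not something the source specifies.

```python
from datetime import datetime, timedelta, timezone

RETENTION_YEARS = 7


def retain_until_fixed_days(start: datetime, years: int = RETENTION_YEARS) -> datetime:
    """Same arithmetic as the logger: every year counted as 365 days."""
    return start + timedelta(days=365 * years)


def retain_until_calendar(start: datetime, years: int = RETENTION_YEARS) -> datetime:
    """Same month and day, `years` later (assumed policy for 29 February: clamp to 28)."""
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        # start is 29 February and the target year is not a leap year.
        return start.replace(year=start.year + years, month=2, day=28)


if __name__ == "__main__":
    start = datetime(2025, 1, 1, tzinfo=timezone.utc)
    print(retain_until_fixed_days(start).date())  # one leap day (2028) is skipped
    print(retain_until_calendar(start).date())
```

For a start date of 1 January 2025 the two results differ by one day, because 2028 is the only leap year in the interval.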
Lines changed: 63 additions & 0 deletions
@@ -0,0 +1,63 @@
+"""Tests for PQC WORM Logger v2 (ML-DSA-65 signed, hash-chained audit log)."""
+import os
+import sys
+
+sys.path.insert(0, os.path.dirname(__file__))
+
+import pytest  # noqa: E402
+
+from pqc_worm_logger_v2 import PQCWormLoggerV2, ALG  # noqa: E402
+
+
+def _fill(log, n):
+    for i in range(n):
+        log.add_entry({"event_id": f"e{i}", "control_id": "cry-02", "decision": "allow"})
+    log.commit_batch()
+
+
+def test_alg_is_ml_dsa_65():
+    assert ALG == "ML-DSA-65"
+
+
+def test_chain_verifies_clean():
+    log = PQCWormLoggerV2(batch_size_threshold=3)
+    _fill(log, 7)
+    report = log.verify_chain()
+    assert report["status"] == "VERIFIED"
+    assert report["batches"] == 3
+    assert not report["errors"]
+
+
+def test_retention_is_compliance_worm():
+    log = PQCWormLoggerV2(batch_size_threshold=2)
+    _fill(log, 2)
+    batch = log._chain[0]
+    assert batch.retention["mode"] == "COMPLIANCE"
+    assert "retain_until" in batch.retention
+
+
+def test_tamper_entry_detected():
+    log = PQCWormLoggerV2(batch_size_threshold=2)
+    _fill(log, 4)
+    log._chain[0].entries[0]["decision"] = "TAMPERED"
+    report = log.verify_chain()
+    assert report["status"] == "FAILED"
+
+
+def test_chain_reorder_detected():
+    log = PQCWormLoggerV2(batch_size_threshold=2)
+    _fill(log, 6)
+    # Swap two batches -> hash linkage breaks.
+    log._chain[0], log._chain[1] = log._chain[1], log._chain[0]
+    report = log.verify_chain()
+    assert report["status"] == "FAILED"
+
+
+def test_signature_forgery_detected():
+    log = PQCWormLoggerV2(batch_size_threshold=2)
+    _fill(log, 2)
+    sig = bytearray(bytes.fromhex(log._chain[0].signature_hex))
+    sig[0] ^= 0xFF
+    log._chain[0].signature_hex = sig.hex()
+    report = log.verify_chain()
+    assert report["status"] == "FAILED"
