This runbook provides step-by-step procedures for responding to security
incidents in the pi extension runtime. It maps directly to implemented
controls in src/extensions.rs and src/config.rs.
| Level | Label | Example | Response Time | Escalation |
|---|---|---|---|---|
| P0 | Critical | Policy boundary breached, data exfiltration detected | < 1 hour | Immediate kill-switch + rollback |
| P1 | High | Risk controller degraded, sustained deny escalation | < 4 hours | Investigation + potential rollback |
| P2 | Medium | SLO budget at risk, elevated FP rate | < 1 sprint | Root cause analysis + tuning |
| P3 | Low | Informational, single transient anomaly | Next sprint | Log and backlog |
Alert received
├── Is extension quarantined (Terminate state)?
│ └── YES → Go to: Quarantine Investigation (P0/P1)
├── Is it a policy denial (exec/env denied)?
│ ├── Single occurrence → Log, monitor (P3)
│ └── Repeated burst (>3 in 60s) → Go to: Burst Denial Triage (P1)
├── Is it an enforcement state transition?
│ ├── Escalation (Allow→Harden→Deny) → Go to: Escalation Review (P2)
│ └── De-escalation → Log, verify cooldown worked (P3)
├── Is it a rollback trigger event?
│ └── YES → Go to: Automatic Rollback Verification (P1)
└── Is it a ledger integrity failure?
└── YES → Go to: Ledger Corruption Recovery (P0)
Trigger: Extension reaches Terminate enforcement state (3+ consecutive
unsafe evaluations).
Step 1 — Verify quarantine is active
Check the extension's enforcement state via the runtime risk ledger:
// Programmatic: ExtensionManager API
let state = manager.runtime_risk_config();
assert!(state.enabled, "risk controller must be enabled");
// Check per-extension state
let ledger = manager.runtime_risk_ledger_artifact();
let ext_entries: Vec<_> = ledger.entries.iter()
.filter(|e| e.extension_id.as_deref() == Some("suspect-ext-id"))
.collect();
// Last entry should show action = "terminate"Expected artifact: RuntimeRiskLedgerEntry with action: "terminate" and
consecutive unsafe count >= 3.
Step 2 — Assess threat severity
Review the hostcall telemetry for the quarantined extension:
let telemetry = manager.runtime_hostcall_telemetry_artifact();
let suspicious: Vec<_> = telemetry.entries.iter()
.filter(|e| e.extension_id.as_deref() == Some("suspect-ext-id"))
.collect();
// Look for: capability="exec", unusual param patterns, resource_target_classQuestions to answer:
- What capabilities was the extension requesting? (
exec,http,fs/write) - Were the targets unusual? (system paths, external URLs)
- Was there a burst pattern? (many calls in short window)
Step 3 — Execute response
For confirmed threats:
- Activate kill-switch:
manager.set_kill_switch("suspect-ext-id", true, "incident-ref") - Verify: Check that subsequent calls from the extension are blocked
- Export evidence bundle for forensic review
For false positives:
- Document the FP in the incident log
- Consider tuning alpha or window_size (see Policy Tuning Guide)
- Reset extension state if appropriate
Step 4 — Verify and close
- Confirm kill-switch is active in trust state
- Verify ledger hash chain integrity:
verify_runtime_risk_ledger_artifact(&ledger) - Create post-incident bead with linked evidence
Trigger: 3+ policy denials from a single extension within 60 seconds.
Step 1 — Identify the extension and denied capabilities
Check security alerts:
let alerts = manager.security_alerts();
let recent: Vec<_> = alerts.iter()
.filter(|a| a.extension_id.as_deref() == Some("ext-id"))
.filter(|a| a.ts_ms > (now_ms - 60_000))
.collect();Step 2 — Determine if legitimate
- Is the extension newly installed? May need capability grants
- Did policy change recently? Check policy source (cli, env, config)
- Is the extension attempting capabilities it has never used before?
Step 3 — Respond
If legitimate need:
- Add per-extension capability override in policy config
- Verify the override takes effect
- Monitor for 24 hours
If suspicious:
- Let denial continue (policy is working correctly)
- Escalate to P0 if extension appears to be probing permissions
- Consider activating kill-switch preemptively
Trigger: Enforcement state escalated (e.g., Allow → Harden, or Harden → Deny).
Step 1 — Review the transition
// The enforcement transition is recorded in telemetry
let telemetry = manager.runtime_hostcall_telemetry_artifact();
// Look for entries where risk_action changedStep 2 — Evaluate if expected
- Check the risk score that triggered escalation against score band thresholds
- Review the extension's recent hostcall pattern
- Verify hysteresis is working (no rapid flapping)
Step 3 — Tune if needed
If escalation is too aggressive:
- Increase score band thresholds (see Policy Tuning Guide)
- Increase
cooldown_callsfor slower de-escalation - Consider switching to a more permissive policy profile
If escalation is correct:
- Log and monitor
- No action needed — the state machine is working as designed
Trigger: Rollback trigger fired, rollout phase reverted to Shadow.
Step 1 — Verify rollback occurred
let state = manager.rollout_state();
assert_eq!(state.phase, RolloutPhase::Shadow);
assert!(state.rolled_back_from.is_some());
// rolled_back_from tells you what phase we were inStep 2 — Identify trigger cause
Check the window statistics:
let stats = state.window_stats;
// Check which threshold was breached:
// - stats.false_positive_count / stats.total_decisions > max_false_positive_rate?
// - stats.error_count / stats.total_decisions > max_error_rate?
// - stats.avg_latency_ms > max_latency_ms?Step 3 — Root cause analysis
| Trigger | Likely Cause | Remediation |
|---|---|---|
| High FP rate | Score thresholds too aggressive | Raise band thresholds or alpha |
| High error rate | Controller bug or resource exhaustion | Check logs, fix bug, increase timeout |
| High latency | Window size too large or system load | Reduce window_size or decision_timeout_ms |
Step 4 — Recovery
- Fix the root cause
- Gradually re-advance rollout:
manager.advance_rollout() - Monitor window stats at each phase before advancing further
- Document in incident bead
Trigger: Hash chain verification fails on the runtime risk ledger.
Step 1 — Confirm corruption
let ledger = manager.runtime_risk_ledger_artifact();
let report = verify_runtime_risk_ledger_artifact(&ledger);
assert!(!report.valid, "corruption confirmed");
// report.first_invalid_index tells you where the chain brokeStep 2 — Assess impact
- How many entries are affected?
- When did corruption start? (check timestamps)
- Is the risk controller still making correct decisions?
Step 3 — Immediate response
- Roll back enforcement to Shadow mode:
manager.set_rollout_phase(RolloutPhase::Shadow) - Export the corrupted ledger as evidence
- Restart the extension manager to get a fresh ledger
Step 4 — Investigation
- Check for concurrent modification bugs
- Check for memory corruption indicators
- Review recent code changes to ledger-touching paths
Expected artifacts:
- Corrupted ledger export (JSON)
RuntimeRiskLedgerVerificationReportshowingvalid: false- Incident evidence bundle
When to use: After any P0-P2 incident, or when forensic evidence must be preserved for audit or sharing.
Step 1 -- Collect raw artifacts
let ledger = manager.runtime_risk_ledger_artifact();
let alerts = manager.security_alert_artifact();
let telemetry = manager.runtime_hostcall_telemetry_artifact();
let exec = manager.exec_mediation_artifact();
let secret = manager.secret_broker_artifact();
let quota_breaches = manager.quota_breach_events();Step 2 -- Define scope with a filter
use pi::extensions::{
IncidentBundleFilter, SecurityAlertCategory, SecurityAlertSeverity,
};
let filter = IncidentBundleFilter {
start_ms: Some(incident_start_ms), // Time window start
end_ms: Some(incident_end_ms), // Time window end
extension_id: Some("suspect-ext".into()), // Scope to one extension
alert_categories: Some(vec![ // Limit to relevant categories
SecurityAlertCategory::ExecMediation,
SecurityAlertCategory::AnomalyDenial,
]),
min_severity: Some(SecurityAlertSeverity::Warning),
};Step 3 -- Set redaction policy
For external sharing (redact all hashes):
use pi::extensions::IncidentBundleRedactionPolicy;
let redaction = IncidentBundleRedactionPolicy {
redact_params_hash: true,
redact_context_hash: true,
redact_args_shape_hash: true,
redact_command_hash: true,
redact_name_hash: true,
redact_remediation: false, // Keep human-readable remediation text
};For internal investigation (no redaction):
let redaction = IncidentBundleRedactionPolicy {
redact_params_hash: false,
redact_context_hash: false,
redact_args_shape_hash: false,
redact_command_hash: false,
redact_name_hash: false,
redact_remediation: false,
};Step 4 -- Build the bundle
use pi::extensions::build_incident_evidence_bundle;
let bundle = build_incident_evidence_bundle(
&ledger, &alerts, &telemetry, &exec, &secret,
"a_breaches, &filter, &redaction, now_ms,
);Step 5 -- Verify integrity
use pi::extensions::verify_incident_evidence_bundle;
let report = verify_incident_evidence_bundle(&bundle);
assert!(report.valid, "Bundle integrity check failed: {:?}", report.errors);
assert!(report.ledger_chain_intact, "Ledger hash chain broken");
assert_eq!(report.bundle_hash, report.recomputed_hash, "Hash mismatch");Step 6 -- Review bundle summary
let summary = &bundle.summary;
println!("Ledger entries: {}", summary.ledger_entry_count);
println!("Alerts: {}", summary.alert_count);
println!("Telemetry events: {}", summary.telemetry_event_count);
println!("Exec mediation: {}", summary.exec_mediation_count);
println!("Secret broker: {}", summary.secret_broker_count);
println!("Quota breaches: {}", summary.quota_breach_count);
println!("Distinct extensions: {}", summary.distinct_extensions);
println!("Peak risk score: {:.3}", summary.peak_risk_score);
println!("Deny/Terminate count: {}", summary.deny_or_terminate_count);
println!("Ledger chain intact: {}", summary.ledger_chain_intact);Step 7 -- Forensic replay (optional)
Reconstruct the decision sequence step-by-step:
use pi::extensions::replay_runtime_risk_ledger_artifact;
let replay = replay_runtime_risk_ledger_artifact(&ledger)?;
for step in &replay.steps {
println!("[{}] ext={} cap={}.{} score={:.3} action={:?} state={:?}",
step.ts_ms, step.extension_id, step.capability, step.method,
step.risk_score, step.selected_action, step.derived_state);
}Expected artifacts:
IncidentEvidenceBundle(schemapi.ext.incident_evidence_bundle.v1)IncidentBundleVerificationReportconfirmingvalid: true- Optional
RuntimeRiskReplayArtifactfor decision reconstruction
All incidents should produce an evidence bundle containing:
- Runtime risk ledger -- hash-chained decision history
- Hostcall telemetry -- per-call feature vectors and scores
- Security alerts -- alert stream filtered to incident timeframe
- Exec mediation log -- command classifications and decisions
- Secret broker log -- redaction decisions
- Quota breach events -- resource limit violations
- Rollout state snapshot -- phase, enforce flag, window stats
- Extension trust state -- current trust level and kill-switch audit
Bundle integrity is verified via SHA-256 hashing. The
verify_incident_evidence_bundle() function confirms bundle hash and ledger
chain validity.
- Incident severity classified (P0-P3)
- Immediate response executed per procedure above
- Evidence bundle collected and hash verified
- Root cause identified
- Remediation applied (config tuning, bug fix, policy update)
- Verification steps confirm remediation worked
- Incident bead created with linked evidence artifacts
- Rollout phase restored (if rolled back) with monitoring