SEC-7.3 (bd-2kle2) -- Procedures for ongoing security system maintenance, calibration, and operations.
- Scanner Rule Maintenance
- Risk Controller Calibration
- Policy Profile Updates
- Secret Broker Maintenance
- Exec Mediation Maintenance
- CI Security Gate Maintenance
- Waiver Lifecycle Management
- Extension Conformance Monitoring
- Troubleshooting Guide
The compatibility scanner detects dangerous imports and code patterns at extension load time.
When to add a pattern:
- A new evasion technique was discovered (see INC-4 in Incident Response Runbook)
- A new dangerous API surface was identified
- The scanner's detection rate (SLO-02) drops below 95%
Steps:
- Identify the pattern to add (e.g., require('node:child_process'))
- Add to the scanner's forbidden/flagged pattern list in src/extensions.rs
- Add a regression test in tests/install_time_security_scanner.rs
- Verify:
cargo test --test install_time_security_scanner -- --nocapture
- Check for false positives against the extension corpus:
cargo test --test ext_conformance_generated --features ext-conformance -- --nocapture
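The steps above amount to adding one table entry and one regression check. The following is a minimal sketch of that idea, not the scanner's actual implementation: the `Severity` enum, the `PATTERNS` table, and the `scan` function are hypothetical names, and plain substring matching is an assumption made for brevity.

```rust
/// How a matched pattern is treated. The two levels mirror the
/// "forbidden/flagged" wording above; the names are illustrative.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Severity {
    Forbidden,
    Flagged,
}

/// Hypothetical pattern table. The real list lives in src/extensions.rs
/// and may be structured differently.
pub const PATTERNS: &[(&str, Severity)] = &[
    ("require('node:child_process')", Severity::Forbidden),
    ("require('child_process')", Severity::Forbidden),
    ("eval(", Severity::Flagged),
];

/// Returns every pattern found in `source`, in table order.
/// Substring matching is an assumption; a real scanner may parse the code.
pub fn scan(source: &str) -> Vec<(&'static str, Severity)> {
    let mut hits = Vec::new();
    for &(pattern, severity) in PATTERNS {
        if source.contains(pattern) {
            hits.push((pattern, severity));
        }
    }
    hits
}
```

A regression test for a new pattern would then assert that a known-bad snippet produces a hit and that a benign snippet, such as one importing only `node:fs`, produces none.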
Periodic review (monthly):
- Run the full conformance suite to measure current detection rates
- Compare against SLO-02 (>= 95% detection) and SLO-03 (<= 5% false positives)
- Review any scanner-bypass incidents in the past period
- Update patterns if new evasion techniques are documented
When to recalibrate the risk controller:
- After deploying new extensions with different call patterns
- When false-positive rate (SLO-10) exceeds 10%
- When false-negative rate (SLO-11) exceeds 5%
- After significant changes to the hostcall dispatch pipeline
1. Baseline in shadow mode:
{
"extensionRisk": {
"enabled": true,
"enforce": false
}
}
Run for a representative workload period (at least 100 hostcall decisions).
2. Analyze results:
cargo test --test accuracy_performance_sec63 -- --nocapture
cargo test --test runtime_risk_quantile_validation -- --nocapture
Check:
- False positive rate from benign traces
- False negative rate from adversarial traces
- Latency distribution (SLO-06: p99 <= 5ms)
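The false-positive and false-negative checks can be expressed as a small rate comparison against the recalibration triggers (SLO-10 at 10%, SLO-11 at 5%). This is a sketch under stated assumptions: the `TraceCounts` struct and both function names are hypothetical, and it assumes each trace has already been labeled benign or adversarial.

```rust
/// Outcome counts from replaying labeled traces through the risk controller.
/// All field names are illustrative, not taken from the codebase.
pub struct TraceCounts {
    pub benign_total: u32,
    pub benign_blocked: u32,      // benign traces that were blocked (false positives)
    pub adversarial_total: u32,
    pub adversarial_allowed: u32, // adversarial traces that were allowed (false negatives)
}

/// Fraction `part / total`, or 0.0 when there are no samples.
pub fn rate(part: u32, total: u32) -> f64 {
    if total == 0 {
        0.0
    } else {
        f64::from(part) / f64::from(total)
    }
}

/// True when either rate exceeds its recalibration trigger.
pub fn needs_recalibration(counts: &TraceCounts) -> bool {
    const MAX_FALSE_POSITIVE_RATE: f64 = 0.10; // SLO-10
    const MAX_FALSE_NEGATIVE_RATE: f64 = 0.05; // SLO-11
    rate(counts.benign_blocked, counts.benign_total) > MAX_FALSE_POSITIVE_RATE
        || rate(counts.adversarial_allowed, counts.adversarial_total) > MAX_FALSE_NEGATIVE_RATE
}
```

With 200 benign traces of which 12 were blocked (6%) and 50 adversarial traces of which 4 were allowed (8%), the false-negative trigger fires even though the false-positive rate is within bounds.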
3. Tune parameters:
| Symptom | Adjustment |
|---|---|
| Too many false positives | Decrease alpha (e.g., 0.01 -> 0.005) |
| Missing real threats | Increase alpha (e.g., 0.01 -> 0.02) |
| Slow decisions | Decrease windowSize or increase decisionTimeoutMs |
| Memory pressure from ledger | Decrease ledgerLimit |
4. Enable enforcement:
{
"extensionRisk": {
"enabled": true,
"enforce": true
}
}
5. Verify:
cargo test --test ledger_calibration_sec35 -- --nocapture
cargo test --test baseline_modeling_evidence -- --nocapture
The risk scorer has golden fixtures that validate scoring determinism:
cargo test --test risk_scorer_golden_fixtures -- --nocapture
After calibration changes, update golden fixtures if the scoring algorithm changed.
If a new non-dangerous capability is introduced (e.g., analytics):
- Add to PolicyProfile::Safe.to_policy().default_caps
- Add to the PolicyProfile::Standard default policy
- Update the Capability enum if needed
- Update the compatibility test matrix:
cargo test --test security_conformance_benign -- --nocapture
- Update BENIGN_CAPABILITIES in the test if the capability should be tested
- Verify the compatibility dashboard shows the new capability
If a new capability should be classified as dangerous:
- Add to the Capability::is_dangerous() check
- Add to Capability::dangerous_list()
- Add to deny_caps in Safe and Standard profiles
- Update tests:
cargo test --test policy_profile_hardening -- --nocapture
cargo test --test capability_denial_matrix -- --nocapture
- Verify invariant INV-008 (dangerous caps default-deny)
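The default-deny invariant can be pictured as a property that every profile must satisfy. The sketch below is illustrative only: the capability variants, the `Policy` struct, and `upholds_default_deny` are hypothetical stand-ins, and the real `Capability` and `PolicyProfile` types may look quite different.

```rust
/// Illustrative capability set; the variants are assumptions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Capability {
    Read,
    Http,
    Analytics,
    Exec,
    Env,
}

impl Capability {
    /// True for capabilities classified as dangerous.
    pub fn is_dangerous(self) -> bool {
        matches!(self, Capability::Exec | Capability::Env)
    }

    /// Must list exactly the variants for which `is_dangerous` is true.
    pub fn dangerous_list() -> &'static [Capability] {
        &[Capability::Exec, Capability::Env]
    }
}

/// Simplified stand-in for the policy produced by a profile.
pub struct Policy {
    pub default_caps: Vec<Capability>,
    pub deny_caps: Vec<Capability>,
}

/// INV-008 as a predicate: every dangerous capability is denied
/// and none is granted by default.
pub fn upholds_default_deny(policy: &Policy) -> bool {
    Capability::dangerous_list()
        .iter()
        .all(|cap| policy.deny_caps.contains(cap) && !policy.default_caps.contains(cap))
}
```

Keeping `is_dangerous()` and `dangerous_list()` in agreement is the reason both appear in the checklist above; a test that iterates one and asserts the other catches a capability added in only one place.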
The 5-layer precedence chain is an invariant (INV-001). Modifications require:
- Review against the threat model (T3: capability escalation)
- Update evaluate_for() in ExtensionPolicy
- Update all precedence tests:
cargo test --test capability_policy_model -- --nocapture
cargo test --test capability_policy_scoped -- --nocapture
- Update the operator handbook documentation
Exact name (highest priority):
Add to the secret_exact list in SecretBrokerPolicy::default().
Suffix pattern (catches *_API_KEY, *_SECRET, etc.):
Add to secret_suffixes.
Prefix pattern (catches AWS_SECRET_*, etc.):
Add to secret_prefixes.
Verification:
cargo test --test security_budgets -- secret_broker --nocapture
If a variable matches a secret pattern but should be disclosed:
Add to disclosure_allowlist in the policy config.
Document why the variable is safe to expose.
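The matching order described above (exact name, then suffix, then prefix, with the allowlist overriding a match) could look roughly like the following. This is a sketch, not the broker's code: the struct reuses the field names from this section, but the `Decision` enum, `classify`, `sample_policy`, and the sample entries are hypothetical, and checking the allowlist first is an assumption about how the override is applied.

```rust
/// Simplified stand-in for SecretBrokerPolicy, using the field names above.
pub struct SecretBrokerPolicy {
    pub secret_exact: Vec<String>,
    pub secret_suffixes: Vec<String>,
    pub secret_prefixes: Vec<String>,
    pub disclosure_allowlist: Vec<String>,
}

/// Result of classifying one environment variable name.
#[derive(Debug, PartialEq, Eq)]
pub enum Decision {
    Disclose,
    /// Carries the rule kind that matched, for audit output.
    Redact(&'static str),
}

pub fn classify(policy: &SecretBrokerPolicy, name: &str) -> Decision {
    // Assumption: an allowlisted name is disclosed even if a pattern matches.
    if policy.disclosure_allowlist.iter().any(|n| n.as_str() == name) {
        return Decision::Disclose;
    }
    if policy.secret_exact.iter().any(|n| n.as_str() == name) {
        return Decision::Redact("exact");
    }
    if policy.secret_suffixes.iter().any(|s| name.ends_with(s.as_str())) {
        return Decision::Redact("suffix");
    }
    if policy.secret_prefixes.iter().any(|p| name.starts_with(p.as_str())) {
        return Decision::Redact("prefix");
    }
    Decision::Disclose
}

/// Example policy with made-up entries, for illustration only.
pub fn sample_policy() -> SecretBrokerPolicy {
    SecretBrokerPolicy {
        secret_exact: vec!["GITHUB_TOKEN".to_string()],
        secret_suffixes: vec!["_API_KEY".to_string(), "_SECRET".to_string()],
        secret_prefixes: vec!["AWS_SECRET_".to_string()],
        disclosure_allowlist: vec!["PUBLIC_API_KEY".to_string()],
    }
}
```

Under this sample policy, `PUBLIC_API_KEY` matches the `_API_KEY` suffix but is still disclosed because it is allowlisted, which is the situation the allowlist exists for.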
The exec mediation layer filters shell commands after the exec capability is granted.
- Add the pattern to ExecMediationPolicy.deny_patterns
- Set the appropriate deny_threshold (Low/Medium/High/Critical)
- Verify:
cargo test --test exec_mediation_integration -- --nocapture
For known-safe commands that might match deny patterns:
Add to allow_patterns. Allow patterns are checked before deny patterns.
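The allow-before-deny ordering can be sketched as follows. Everything here beyond the field names `allow_patterns`, `deny_patterns`, and `deny_threshold` is an assumption: in particular, attaching a severity to each deny pattern and denying only at or above the threshold is one plausible reading of `deny_threshold`, and substring matching is used purely for brevity.

```rust
/// Severity levels in ascending order; derived ordering follows declaration order.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Severity {
    Low,
    Medium,
    High,
    Critical,
}

/// Simplified stand-in for ExecMediationPolicy.
pub struct ExecMediationPolicy {
    pub allow_patterns: Vec<&'static str>,
    /// Assumption: each deny pattern carries its own severity.
    pub deny_patterns: Vec<(&'static str, Severity)>,
    pub deny_threshold: Severity,
}

/// Allow patterns are consulted first; otherwise a command is denied when it
/// matches a deny pattern whose severity is at or above the threshold.
pub fn is_allowed(policy: &ExecMediationPolicy, command: &str) -> bool {
    if policy.allow_patterns.iter().any(|p| command.contains(*p)) {
        return true;
    }
    !policy
        .deny_patterns
        .iter()
        .any(|(p, sev)| command.contains(*p) && *sev >= policy.deny_threshold)
}

/// Example policy with made-up patterns, for illustration only.
pub fn sample_policy() -> ExecMediationPolicy {
    ExecMediationPolicy {
        allow_patterns: vec!["git clean -n"],
        deny_patterns: vec![
            ("git clean", Severity::High),
            ("rm -rf", Severity::Critical),
            ("curl ", Severity::Medium),
        ],
        deny_threshold: Severity::High,
    }
}
```

Because allow patterns short-circuit the deny check, they should be as narrow as possible. With substring matching as sketched here, a command that merely contains an allowed fragment would also pass, so a real allow rule likely needs to match the whole command.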
The full-suite CI gate (ci_full_suite_gate.rs) includes 14 sub-gates. Security-relevant gates:
| Gate ID | Name | Blocking | Artifact |
|---|---|---|---|
| security_compat | Security compatibility | YES | tests/security_compat/security_compat_dashboard.json |
| conformance_regression | Conformance regression | YES | tests/ext_conformance/reports/regression_verdict.json |
| ext_must_pass | Extension must-pass (208) | YES | tests/ext_conformance/reports/gate/must_pass_gate_verdict.json |
| non_mock_unit | Non-mock compliance | YES | docs/non-mock-rubric.json |
| waiver_lifecycle | Waiver lifecycle | YES | tests/full_suite_gate/waiver_audit.json |
- Read the gate detail. Each gate produces a detail message explaining the failure.
- Run the reproduction command. Each gate includes a reproduce_command.
- Check for regressions. Compare against the last known-good state.
- Fix or waive. Either fix the underlying issue or create a time-bounded waiver.
Gate thresholds are configured in ci.yml:
CI_GATE_MIN_PASS_RATE_PCT: "80.0" # Minimum conformance pass rate
CI_GATE_MAX_FAIL_COUNT: "36" # Maximum failures
CI_GATE_MAX_NA_COUNT: "170" # Maximum N/A count
To adjust: update the GitHub variable and document the justification.
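As a rough illustration of how the three thresholds interact, the sketch below evaluates a conformance summary against them. The struct and function names are hypothetical, and computing the pass rate over passed plus failed (excluding N/A) is an assumption; the actual gate may define the denominator differently.

```rust
/// Thresholds corresponding to the three CI variables above.
pub struct GateThresholds {
    pub min_pass_rate_pct: f64,
    pub max_fail_count: u32,
    pub max_na_count: u32,
}

/// Hypothetical summary of one conformance run.
pub struct ConformanceSummary {
    pub passed: u32,
    pub failed: u32,
    pub not_applicable: u32,
}

/// Returns one message per violated threshold; empty means the gate passes.
pub fn gate_failures(t: &GateThresholds, s: &ConformanceSummary) -> Vec<String> {
    let mut failures = Vec::new();
    // Assumption: N/A results are excluded from the pass-rate denominator.
    let evaluated = s.passed + s.failed;
    let pass_rate_pct = if evaluated == 0 {
        0.0
    } else {
        100.0 * f64::from(s.passed) / f64::from(evaluated)
    };
    if pass_rate_pct < t.min_pass_rate_pct {
        failures.push(format!(
            "pass rate {:.1}% is below minimum {:.1}%",
            pass_rate_pct, t.min_pass_rate_pct
        ));
    }
    if s.failed > t.max_fail_count {
        failures.push(format!("{} failures exceed maximum {}", s.failed, t.max_fail_count));
    }
    if s.not_applicable > t.max_na_count {
        failures.push(format!("{} N/A results exceed maximum {}", s.not_applicable, t.max_na_count));
    }
    failures
}
```

The fail-count and N/A caps guard against a run that keeps a healthy pass rate only because many extensions stopped being evaluated, which is why all three are checked independently.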
Waivers provide time-bounded CI gate bypass. Add to tests/suite_classification.toml:
[waiver.security_compat]
owner = "YourName"
created = "2026-02-14"
expires = "2026-02-28"
bead = "bd-XXXX"
reason = "Scanner update pending for new evasion pattern"
scope = "full"
remove_when = "Scanner update deployed and all compatibility tests pass"
Required fields: owner, created, expires, bead, reason, scope, remove_when.
Constraints:
- Maximum duration: 30 days
- Valid scopes: full, preflight, both
- Must link to a bead tracking the fix
- Expired waivers cause CI failure
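The constraints above can be checked mechanically. The following sketch uses only the standard library and is not the audit's real implementation: `Waiver`, `validate`, and `sample_waiver` are hypothetical names, dates are assumed to be `YYYY-MM-DD`, and a waiver is assumed to remain valid through its `expires` date inclusive.

```rust
/// Hypothetical in-memory form of one waiver entry.
pub struct Waiver<'a> {
    pub owner: &'a str,
    pub created: &'a str,
    pub expires: &'a str,
    pub bead: &'a str,
    pub reason: &'a str,
    pub scope: &'a str,
    pub remove_when: &'a str,
}

/// Days since 1970-01-01 for a proleptic Gregorian date.
fn days_from_civil(y: i64, m: i64, d: i64) -> i64 {
    let y = if m <= 2 { y - 1 } else { y }; // treat Jan/Feb as end of previous year
    let era = y.div_euclid(400);
    let yoe = y.rem_euclid(400);
    let mp = (m + 9) % 12; // March = 0
    let doy = (153 * mp + 2) / 5 + d - 1;
    let doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
    era * 146_097 + doe - 719_468
}

/// Parses "YYYY-MM-DD" into a day number; only coarse range checks are done.
fn parse_day(s: &str) -> Option<i64> {
    let mut parts = s.split('-');
    let y: i64 = parts.next()?.parse().ok()?;
    let m: i64 = parts.next()?.parse().ok()?;
    let d: i64 = parts.next()?.parse().ok()?;
    if parts.next().is_some() || !(1..=12).contains(&m) || !(1..=31).contains(&d) {
        return None;
    }
    Some(days_from_civil(y, m, d))
}

/// Returns the days remaining on success, or a description of the first violation.
pub fn validate(w: &Waiver<'_>, today: &str) -> Result<i64, String> {
    let fields = [w.owner, w.created, w.expires, w.bead, w.reason, w.scope, w.remove_when];
    if fields.iter().any(|f| f.trim().is_empty()) {
        return Err("missing required field".to_string());
    }
    if !["full", "preflight", "both"].contains(&w.scope) {
        return Err(format!("invalid scope: {}", w.scope));
    }
    let created = parse_day(w.created).ok_or("invalid created date")?;
    let expires = parse_day(w.expires).ok_or("invalid expires date")?;
    let today = parse_day(today).ok_or("invalid current date")?;
    if expires < created {
        return Err("expires precedes created".to_string());
    }
    if expires - created > 30 {
        return Err(format!("duration of {} days exceeds 30", expires - created));
    }
    if today > expires {
        return Err("waiver expired".to_string());
    }
    Ok(expires - today)
}

/// The example waiver from this section.
pub fn sample_waiver() -> Waiver<'static> {
    Waiver {
        owner: "YourName",
        created: "2026-02-14",
        expires: "2026-02-28",
        bead: "bd-XXXX",
        reason: "Scanner update pending for new evasion pattern",
        scope: "full",
        remove_when: "Scanner update deployed and all compatibility tests pass",
    }
}
```

For the example waiver, the window from 2026-02-14 to 2026-02-28 is 14 days, well inside the 30-day limit. Moving `expires` to 2026-03-20 would make it 34 days and fail validation.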
cargo test --test ci_full_suite_gate -- waiver_lifecycle_audit --nocapture --exact
This validates all waivers and produces tests/full_suite_gate/waiver_audit.json with:
- Active/expired/expiring-soon/invalid counts
- Per-waiver validation details
- Days remaining for active waivers
To extend a waiver, update expires and document why more time is needed. The 30-day maximum is measured from created, so an extension past that window requires setting a new created date and justifying the extension.
The compatibility dashboard tracks benign extension behavior under hardened policy:
cargo test --test security_conformance_benign -- generate_compat_dashboard_artifact --nocapture --exact
Produces tests/security_compat/security_compat_dashboard.json with:
- Per-profile pass rates (Safe, Standard)
- Individual check results (24 compatibility checks)
- Regression detection flag
If the dashboard shows regression_detected: true:
- Check which specific checks failed (see the checks array)
- Determine if a policy change or code change caused the regression
- Fix the regression or document it as intentional (with waiver if needed)
- Re-run to confirm the dashboard shows regression_detected: false
When new extensions are added to the conformance corpus:
- Run the full conformance suite
- Check that conformance pass rate stays above 80% (SLO threshold)
- Update the conformance baseline if the new extensions are expected to pass
- File beads for any new failures that need investigation
Check:
- Is the extension in the compatibility scanner's blocklist?
  - Scanner may flag dangerous patterns
  - Check scanner results for the extension
- Is the policy too restrictive?
  - Run pi --explain-extension-policy to see effective policy
- Is the QuickJS runtime healthy?
  - Check for module resolution errors in extension logs
Check:
- Is the risk controller alpha too aggressive?
  - Default alpha: 0.01 may be too sensitive for some workloads
  - Try alpha: 0.005 in shadow mode first
- Is the extension's behavior pattern unusual but benign?
  - Add to the risk controller's baseline if confirmed benign
- Is the secret broker matching non-secret variables?
  - Add to disclosure_allowlist
Check:
- Did the dependency change any security-relevant behavior?
- Run the specific failing gate's reproduction command
- Update conformance baselines if the change is expected
- File a waiver if the fix requires time
Target: SLO-06 requires p99 <= 5ms.
Actions:
- Reduce windowSize (fewer entries to evaluate per decision)
- Increase decisionTimeoutMs only as last resort (allows more time but increases latency)
- Check for I/O contention in the ledger write path
- Profile with:
cargo bench --bench system -- risk_decision
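When reading the benchmark output, it helps to be explicit about how p99 is computed. The sketch below uses the nearest-rank method on latencies recorded in microseconds; the function names are hypothetical and the benchmark itself may report percentiles differently.

```rust
/// Nearest-rank 99th percentile of latency samples, in microseconds.
/// Returns None when there are no samples.
pub fn p99_micros(samples: &[u64]) -> Option<u64> {
    if samples.is_empty() {
        return None;
    }
    let mut sorted = samples.to_vec();
    sorted.sort_unstable();
    // 1-based rank = ceil(0.99 * n), computed in integer arithmetic.
    let rank = (sorted.len() * 99 + 99) / 100;
    Some(sorted[rank - 1])
}

/// SLO-06: p99 decision latency must not exceed 5 ms (5,000 microseconds).
/// An empty sample set is treated as not meeting the SLO.
pub fn meets_slo_06(samples: &[u64]) -> bool {
    matches!(p99_micros(samples), Some(p99) if p99 <= 5_000)
}
```

With 100 samples, a single 9 ms outlier leaves p99 unaffected, while two such outliers push p99 to 9 ms and violate the target. This is why a short run can pass the SLO and a longer run of the same workload can fail it.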
Default: ledgerLimit: 2048 entries in memory.
Actions:
- Reduce ledgerLimit to a smaller value
- Export and archive evidence bundles periodically
- The oldest entries are automatically evicted when the limit is reached
- Chain integrity is preserved even after eviction (entries reference hashes, not indices)
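The last two points can be illustrated with a bounded, hash-linked ledger. This is a toy sketch rather than the real ledger: `Entry`, `Ledger`, and `link_hash` are hypothetical, and `DefaultHasher` from the standard library stands in for whatever cryptographic hash the actual implementation uses (it is not suitable for tamper evidence in practice).

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::VecDeque;
use std::hash::{Hash, Hasher};

/// One ledger record. It links to its predecessor by hash, not by index.
pub struct Entry {
    pub payload: String,
    pub prev_hash: u64,
    pub hash: u64,
}

/// Hash of (previous hash, payload). DefaultHasher is a non-cryptographic stand-in.
fn link_hash(prev_hash: u64, payload: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    prev_hash.hash(&mut hasher);
    payload.hash(&mut hasher);
    hasher.finish()
}

/// In-memory ledger that keeps at most `limit` entries.
pub struct Ledger {
    pub limit: usize,
    pub entries: VecDeque<Entry>,
}

impl Ledger {
    pub fn new(limit: usize) -> Self {
        Ledger { limit: limit.max(1), entries: VecDeque::new() }
    }

    /// Appends an entry, evicting the oldest one when the limit is reached.
    pub fn append(&mut self, payload: &str) {
        let prev_hash = self.entries.back().map_or(0, |e| e.hash);
        let hash = link_hash(prev_hash, payload);
        if self.entries.len() == self.limit {
            self.entries.pop_front();
        }
        self.entries.push_back(Entry { payload: payload.to_string(), prev_hash, hash });
    }

    pub fn len(&self) -> usize {
        self.entries.len()
    }

    /// Checks each retained entry's own hash and its link to the previous
    /// retained entry. The oldest retained entry's prev_hash may point at an
    /// evicted record, which is acceptable because links are by hash.
    pub fn verify(&self) -> bool {
        let mut expected_prev: Option<u64> = None;
        for entry in &self.entries {
            if let Some(prev) = expected_prev {
                if entry.prev_hash != prev {
                    return false;
                }
            }
            if entry.hash != link_hash(entry.prev_hash, &entry.payload) {
                return false;
            }
            expected_prev = Some(entry.hash);
        }
        true
    }
}
```

Because verification never depends on positions, lowering `ledgerLimit` only shortens the window that can be verified in memory. Archived evidence bundles would be needed to verify anything older.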