Guardian Shell - Claude Agent Handoff Document

This document provides everything a Claude agent needs to continue development on this project on a Linux machine.

Project Summary

Guardian Shell is a Linux security tool that uses eBPF to monitor and restrict LLM agent activities. It's built with Rust and the Aya eBPF framework.

Current state: Phase 11 - Security Hardening & Performance (compiled on Linux)

Phase 10 introduces defense-in-depth for cgroup agents: an architectural shift that mitigates every CRITICAL and HIGH vulnerability known at the time (symlinks, TOCTOU, io_uring, rename/hardlink) for cgroup agents by using the right enforcement tool for each layer:

Landlock LSM sandbox (Linux 5.13+): Inode-level file access control applied in guardian-launch before exec. Resolves symlinks at kernel VFS layer — completely immune to the #1 CRITICAL symlink bypass. Default-deny model mirrors agent policy.
Expanded seccomp filter: Blocks mount (165-166), namespace escape (setns 308, unshare 272), chroot (161), pivot_root (155), and new mount API (428-433, 442) in addition to existing io_uring and memfd_create blocks.
PR_SET_NO_NEW_PRIVS: Prevents SUID privilege escalation. Required by Landlock, good practice regardless.
IPC sandbox config: Daemon sends agent policy to launcher during registration via SandboxConfig in IPC response. Launcher builds Landlock rules + seccomp filter from it.
Two security tiers: Cgroup agents (Tier 1, hardened: Landlock+seccomp+eBPF+cgroup) vs comm-based agents (Tier 2, limited: eBPF monitoring only).
Landlock TCP network filtering (kernel 6.7+): Port-based outbound TCP control as additional layer alongside eBPF LSM socket_connect.

Key insight: eBPF tracepoints operate on path strings (vulnerable to symlinks, TOCTOU). Landlock operates on inodes (immune). Phase 10 makes Landlock the primary enforcement layer for cgroup agents, with eBPF as the audit/visibility layer.
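To make the expanded seccomp deny-list above easier to audit, here is a minimal sketch of it as a plain lookup table. The syscall numbers come from this document (x86_64 only); the names attached to 428-433 and 442 follow the x86_64 syscall table. The helper `seccomp_verdict` is hypothetical, and returning EPERM for every entry is an assumption (this document only states EPERM for io_uring and memfd_create). The real filter is built with seccompiler in guardian-launch and may be structured differently.

```rust
/// x86_64 syscall numbers from this document's seccomp deny-list.
const BLOCKED_SYSCALLS: &[(i64, &str)] = &[
    (155, "pivot_root"),
    (161, "chroot"),
    (165, "mount"),
    (166, "umount2"),
    (272, "unshare"),
    (308, "setns"),
    (319, "memfd_create"),
    (425, "io_uring_setup"),
    (426, "io_uring_enter"),
    (427, "io_uring_register"),
    (428, "open_tree"),
    (429, "move_mount"),
    (430, "fsopen"),
    (431, "fsconfig"),
    (432, "fsmount"),
    (433, "fspick"),
    (442, "mount_setattr"),
];

/// Linux EPERM. Assumed here as the errno for every blocked syscall.
const EPERM: i32 = 1;

/// Hypothetical helper: Some(errno) if the syscall is on the deny-list,
/// None if the filter would let it through.
fn seccomp_verdict(nr: i64) -> Option<i32> {
    BLOCKED_SYSCALLS
        .iter()
        .find(|(n, _)| *n == nr)
        .map(|_| EPERM)
}
```

A table like this is also a convenient place to start when porting the filter to another architecture, since the numbers differ on aarch64.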

Phase 11 addresses critical security vulnerabilities from comprehensive code audit:

PENDING map fail-closed: Per-CPU overflow arrays prevent enforcement bypass when BPF HashMaps are full (16,384 entries, up from 4,096)
Privilege dropping: guardian-launch drops root to SUDO_UID/SUDO_GID before Landlock+exec. Fixes Landlock+exec EACCES on SELinux. New --user/--group flags.
Grant accumulation enforced: Limits checked before sending decision to agent (was checked after)
CSRF protection: Dashboard validates HX-Request header on POST/PUT/DELETE
O(1) agent lookup: HashMap cache for event processing (was O(N) linear scan)
Default cgroup config: Auto-created when new agent registers without config
Memory cleanup: Rate limiter (1h TTL) and grant accumulator (24h TTL) prevent unbounded growth
IPC timeout: 30-second read timeout prevents client stall attacks
Debian/Ubuntu: Dynamic linker multiarch paths, /sbin, /snap support
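The fail-closed behaviour of the PENDING maps can be modelled in userspace. The sketch below is an illustration under assumptions, not the eBPF code: `PendingMap`, `mark_deny`, and `should_deny` are hypothetical names, a `HashMap` stands in for the BPF HashMap, and a single boolean stands in for one slot of the per-CPU overflow array.

```rust
use std::collections::HashMap;

/// Userspace model of a bounded pending-deny map with an overflow flag.
struct PendingMap {
    entries: HashMap<u64, u8>, // key: pid_tgid, value: pending marker
    capacity: usize,           // e.g. 16_384 in Phase 11
    overflowed: bool,          // stands in for a per-CPU overflow slot
}

impl PendingMap {
    fn new(capacity: usize) -> Self {
        PendingMap { entries: HashMap::new(), capacity, overflowed: false }
    }

    /// Tracepoint side: record a pending deny. If the map is full,
    /// raise the overflow flag instead of silently dropping the verdict.
    fn mark_deny(&mut self, pid_tgid: u64) {
        let is_new = !self.entries.contains_key(&pid_tgid);
        if is_new && self.entries.len() >= self.capacity {
            self.overflowed = true;
        } else {
            self.entries.insert(pid_tgid, 1);
        }
    }

    /// LSM side: deny if an entry exists OR the overflow flag is set.
    /// Both are consumed so the next operation starts clean.
    fn should_deny(&mut self, pid_tgid: u64) -> bool {
        let had_entry = self.entries.remove(&pid_tgid).is_some();
        let overflow = std::mem::take(&mut self.overflowed);
        had_entry || overflow
    }
}
```

The point of the pattern is that a full map turns into a denial rather than an enforcement bypass.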

Phase 9 adds kernel-level network enforcement, upgrading from Phase 7's log-only network monitoring to actual connection blocking:

Network enforcement: LSM socket_connect hook blocks denied connections at kernel level (returns -ECONNREFUSED)
Network policy maps: Port-based deny/allow BPF maps (NET_DENY_PORTS, NET_ALLOW_PORTS) evaluated in-kernel by sys_enter_connect tracepoint
PENDING_NET_DENY map: Same tracepoint→PENDING_MAP→LSM pattern as file_open, inode_rename, etc.
Per-cgroup network defaults: NET_CGROUP_DEFAULT_ACTION map for cgroup-based agents
BPF stack fix: Dynamic linker detection reads argv[1] directly into event buffer (eliminates 256-byte stack allocation that exceeded BPF 512-byte limit)
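The two-stage network enforcement flow above can be sketched as ordinary Rust. This is a userspace model only: `NetPolicy`, `on_sys_enter_connect`, and `on_lsm_socket_connect` are hypothetical names, hash sets stand in for the NET_DENY_PORTS, NET_ALLOW_PORTS, and PENDING_NET_DENY maps, and giving deny ports precedence over allow ports is an assumption carried over from the file policy.

```rust
use std::collections::HashSet;

const ECONNREFUSED: i32 = 111; // Linux errno value

#[derive(Clone, Copy, PartialEq, Debug)]
enum Action {
    Allow,
    Deny,
}

struct NetPolicy {
    deny_ports: HashSet<u16>,
    allow_ports: HashSet<u16>,
    default_action: Action,
}

impl NetPolicy {
    fn new(deny: &[u16], allow: &[u16], default_action: Action) -> Self {
        NetPolicy {
            deny_ports: deny.iter().copied().collect(),
            allow_ports: allow.iter().copied().collect(),
            default_action,
        }
    }

    /// Deny list first, then allow list, then the default action.
    fn evaluate(&self, port: u16) -> Action {
        if self.deny_ports.contains(&port) {
            Action::Deny
        } else if self.allow_ports.contains(&port) {
            Action::Allow
        } else {
            self.default_action
        }
    }
}

/// Stage 1 (tracepoint): evaluate policy and leave a pending-deny marker.
fn on_sys_enter_connect(policy: &NetPolicy, pending: &mut HashSet<u64>, pid_tgid: u64, port: u16) {
    if policy.evaluate(port) == Action::Deny {
        pending.insert(pid_tgid);
    }
}

/// Stage 2 (LSM hook): consume the marker and return 0 or -ECONNREFUSED.
fn on_lsm_socket_connect(pending: &mut HashSet<u64>, pid_tgid: u64) -> i32 {
    if pending.remove(&pid_tgid) {
        -ECONNREFUSED
    } else {
        0
    }
}
```

Because the marker is removed when it is read, a denied connect does not leak into the next connect from the same thread.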

Phase 8 adds security hardening (IPC auth, rate limiting, SSRF prevention, SRI hashes, etc.). Phase 7 adds:

Path canonicalization: normalize_path() strips /proc/self/root/, /proc/<pid>/root/, resolves .. components
openat2 tracepoint: sys_enter_openat2 eBPF hook closes the openat2 syscall bypass (Linux 5.6+)
Permission rate limiting: Per-agent rate limits (3/min, 15/hr), exponential backoff after denials, same-resource cooldown
Risk classification: 4-tier risk scoring (Low/Medium/High/Critical) with path patterns, exec multiplier, post-denial multiplier
Auto-deny: Configurable never-approve list for critical resources (/etc/shadow, SSH keys, etc.)
Auto-approve: Configurable auto-approve for low-risk resources (/tmp/**, /proc/self/**)
Justification analysis: Pattern matching for suspicious text (urgency, security bypass, reassurance, authority claims)
UI friction: Mandatory wait timers (0/3/5/10s by risk level), type-to-confirm for CRITICAL risk, risk-colored banners
Persistent audit trail: SQLite permission_audit table with full metadata for all permission decisions
Risk display: Risk level badges, justification warnings, and risk flags shown in banners and requests page
Exec enforcement: LSM bprm_check_security hook with PENDING_EXEC_DENY map for kernel-side binary blocking
Network monitoring: sys_enter_connect tracepoint with sockaddr parsing (AF_INET/AF_INET6) and port-based policy
Legacy open hook: sys_enter_open tracepoint as belt-and-suspenders for rare code paths using legacy open() syscall
SSE connection fix: Single shared EventSource with custom DOM events prevents browser connection pool exhaustion
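As an illustration of the path canonicalization item above, here is a minimal lexical sketch. It is not the project's `normalize_path()` from config.rs; it only shows one way to strip `/proc/self/root/` and `/proc/<pid>/root/` prefixes and resolve `..` components. It assumes absolute input paths and, like the real function, does not resolve symlinks.

```rust
/// Lexically normalize an absolute path:
/// - collapse "//" and "." components
/// - resolve ".." against the components seen so far
/// - treat /proc/self/root and /proc/<pid>/root as "/"
fn normalize_path(path: &str) -> String {
    let mut parts: Vec<&str> = Vec::new();
    for comp in path.split('/') {
        match comp {
            "" | "." => {}
            ".." => {
                parts.pop();
            }
            c => parts.push(c),
        }
        // If the path so far is exactly /proc/<self|pid>/root, restart at "/".
        let is_proc_root = parts.len() == 3
            && parts[0] == "proc"
            && parts[2] == "root"
            && (parts[1] == "self" || parts[1].bytes().all(|b| b.is_ascii_digit()));
        if is_proc_root {
            parts.clear();
        }
    }
    format!("/{}", parts.join("/"))
}
```

With this sketch, `/proc/self/root/etc/shadow` and `/tmp/../etc/shadow` both normalize to `/etc/shadow`, so a deny rule on `/etc/shadow` matches either spelling.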

Architecture: Permission requests use oneshot channels for long-poll IPC. The agent sends a request via the Unix socket; the daemon creates a oneshot channel and broadcasts the request to the dashboard via tokio::sync::broadcast. A human approves or denies in the browser, the decision is sent back via the oneshot, and the agent unblocks immediately. The SSE endpoint uses tokio_stream::StreamExt::merge to combine the two broadcast streams.

Security hardening adds the permissions.rs module with a rate limiter, risk classifier, auto-deny/approve, and justification analyzer. All of these are evaluated before the oneshot channel is created.

Exec enforcement uses a separate PENDING_EXEC_DENY map (not shared with the file PENDING_DENY map) because during execve the kernel internally opens the binary, triggering the file_open LSM hook, which would consume a shared pending entry.

Network enforcement uses the same tracepoint→PENDING_MAP→LSM pattern: sys_enter_connect evaluates port-based policy and sets PENDING_NET_DENY, then LSM socket_connect consumes the entry and returns -ECONNREFUSED to block the connection.
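The long-poll flow can be sketched with the standard library alone. Note the substitution: the daemon uses tokio::sync::oneshot, whereas this sketch uses `std::sync::mpsc` with `recv_timeout` as a stand-in so that it runs without an async runtime. `Decision`, `wait_for_decision`, and `request_permission` are hypothetical names. The fail-secure rule (no answer means deny) matches the auto-deny timeout described in this document.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

#[derive(Debug, PartialEq, Clone, Copy)]
enum Decision {
    Approved,
    Denied,
}

/// Agent side: block until a decision arrives. A timeout or a dropped
/// sender both resolve to Denied (fail-secure).
fn wait_for_decision(rx: mpsc::Receiver<Decision>, timeout: Duration) -> Decision {
    rx.recv_timeout(timeout).unwrap_or(Decision::Denied)
}

/// Daemon side (hypothetical): create the single-use channel, hand the
/// sender to `resolver` (standing in for the dashboard), then wait.
fn request_permission<F>(timeout: Duration, resolver: F) -> Decision
where
    F: FnOnce(mpsc::Sender<Decision>) + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    let handle = thread::spawn(move || resolver(tx));
    let decision = wait_for_decision(rx, timeout);
    let _ = handle.join();
    decision
}
```

In the real daemon the waiting side is an async task, so no thread is blocked while the human decides.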

Project Structure

guardian_shell/
├── Cargo.toml                  # Workspace root
├── .cargo/config.toml          # BPF linker config
├── rust-toolchain.toml         # Nightly Rust (required for eBPF)
├── config.toml                 # Example security policy
├── README.md                   # User-facing docs
├── CLAUDE.md                   # This file
│
├── guardian-common/            # Shared types (no_std for eBPF, std for userspace)
│   ├── Cargo.toml
│   └── src/lib.rs              # FileAccessEvent, NetworkEvent, IPC protocol types, constants
│
├── guardian-ebpf/              # eBPF kernel program (BPF bytecode)
│   ├── Cargo.toml              # Target: bpfel-unknown-none
│   └── src/main.rs             # Tracepoints (openat/open/openat2/execve/connect) + LSM hooks (file_open/bprm_check_security) + cgroup identification
│
├── guardian/                   # Userspace daemon
│   ├── Cargo.toml
│   ├── askama.toml             # Template config
│   ├── templates/              # Phase 5/6: Askama HTML templates
│   │   ├── base.html           # Base layout (nav, head, htmx/Alpine.js, permission banner)
│   │   ├── index.html          # Dashboard overview with status cards + recent events
│   │   ├── events.html         # Live SSE event stream with filtering
│   │   ├── agents.html         # Agent management (list, stop, grant with exec type)
│   │   ├── policy.html         # Policy editor (per-agent allow/deny rules)
│   │   ├── alerts.html         # Alert configuration editor
│   │   └── requests.html       # Phase 6: Permission requests (pending + resolved history)
│   ├── static/                 # Phase 5: Static assets (embedded via rust-embed)
│   │   ├── app.js              # Custom JavaScript
│   │   ├── app.css             # Custom CSS
│   │   ├── htmx.min.js         # htmx library (bundled)
│   │   └── alpine.min.js       # Alpine.js library (bundled)
│   └── src/
│       ├── main.rs             # Entry point, eBPF loading, event loop, IPC server
│       ├── config.rs           # TOML parsing, policy engine, path normalization, alerting + dashboard config
│       ├── permissions.rs      # Permission hardening: rate limiting, risk classification, auto-deny/approve, justification analysis
│       ├── ipc.rs              # IPC server, agent registration, cgroup lifecycle, permission requests
│       ├── alerting/           # Phase 4: Alerting & Integration
│       │   ├── mod.rs          # AlertManager, AlertSender, dedup, dispatch, broadcast
│       │   ├── json_log.rs     # Structured JSONL logging with rotation
│       │   ├── webhook.rs      # Generic HTTP POST webhook
│       │   ├── slack.rs        # Slack Block Kit notifications
│       │   ├── email.rs        # SMTP email alerts
│       │   └── metrics.rs      # Prometheus metrics + HTTP server
│       └── dashboard/          # Phase 5: Web Dashboard
│           ├── mod.rs          # Axum router, static file handler, server startup
│           ├── state.rs        # DashboardState (shared refs to IPC, alerts, event bus)
│           ├── db.rs           # SQLite backend (permission audit trail)
│           └── routes/
│               ├── mod.rs      # Route module declarations
│               ├── pages.rs    # Page handlers (/, /agents, /policy, /alerts, /events, /requests)
│               ├── api.rs      # API handlers (stop, grant, policy update, alerts, reload, permissions)
│               └── sse.rs      # SSE event stream endpoint (merged alert + permission streams)
│
├── guardian-launch/            # Agent launcher with cgroup isolation (Phase 3)
│   ├── Cargo.toml
│   └── src/main.rs             # Creates cgroup, registers, exec's agent
│
├── guardian-ctl/               # CLI for managing agents (Phase 3+6)
│   ├── Cargo.toml
│   └── src/main.rs             # list/stop/grant/request-permission commands
│
├── configs/                    # Preset configuration templates (Phase 4)
│   ├── minimal.toml            # Bare minimum, monitor-only
│   ├── recommended.toml        # Production defaults
│   ├── strict.toml             # Maximum security
│   └── development.toml        # Verbose debugging
│
└── xtask/                      # Build tooling
    ├── Cargo.toml
    └── src/main.rs             # Cross-compiles eBPF program

How It Works

Comm-based agents (Phase 1/2 — backward compatible)

Userspace daemon reads config.toml to learn which processes to monitor
Daemon scans /proc/ to find PIDs matching configured process_name values
Daemon loads eBPF program and populates WATCHED_COMMS + WATCHED_TGIDS maps
eBPF program hooks syscalls + LSM to monitor and enforce policy

Cgroup-based agents (Phase 3 — recommended)

guardian-launch --name <agent> -- <command> creates a cgroup, registers via IPC, exec's agent
Daemon receives registration, populates WATCHED_CGROUPS BPF map with cgroup ID
eBPF program uses bpf_get_current_cgroup_id() — strongest, unspoofable identification
All child processes automatically inherit the cgroup — no PID tracking needed
Resource limits (memory, PIDs, CPU) enforced via cgroup controllers
guardian-ctl provides list/stop/grant (file & exec)/request-permission commands
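For the resource-limit step, the sketch below shows how launcher flags such as `--memory 1G --pids 50` could map onto cgroup v2 controller files. The file names (`memory.max`, `pids.max`, `cpu.max`) and the `<quota> <period>` format of `cpu.max` are standard cgroup v2. The helpers `parse_size` and `limit_writes`, the binary interpretation of `G`, and the `cpu_percent` parameter are assumptions for illustration; guardian-launch may do this differently.

```rust
/// Parse sizes like "512M" or "1G" into bytes (binary units assumed).
fn parse_size(s: &str) -> Option<u64> {
    let s = s.trim();
    let (digits, mult) = match s.chars().last()? {
        'K' | 'k' => (&s[..s.len() - 1], 1u64 << 10),
        'M' | 'm' => (&s[..s.len() - 1], 1u64 << 20),
        'G' | 'g' => (&s[..s.len() - 1], 1u64 << 30),
        _ => (s, 1),
    };
    digits.parse::<u64>().ok()?.checked_mul(mult)
}

/// Build the (file name, contents) pairs to write inside the agent's
/// cgroup directory. Returns None if the memory size does not parse.
fn limit_writes(
    memory: Option<&str>,
    pids: Option<u32>,
    cpu_percent: Option<u32>,
) -> Option<Vec<(&'static str, String)>> {
    let mut out = Vec::new();
    if let Some(m) = memory {
        out.push(("memory.max", parse_size(m)?.to_string()));
    }
    if let Some(p) = pids {
        out.push(("pids.max", p.to_string()));
    }
    if let Some(c) = cpu_percent {
        // cpu.max is "<quota_us> <period_us>"; a 100000us period is used here.
        out.push(("cpu.max", format!("{} 100000", u64::from(c) * 1000)));
    }
    Some(out)
}
```

Each pair would then be written with `std::fs::write` under the agent's cgroup directory, which requires root.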

First Steps on Linux

1. Install Prerequisites

# Install Rust
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

# Install nightly toolchain + rust-src (needed for eBPF)
rustup install nightly
rustup component add rust-src --toolchain nightly

# Install BPF linker
cargo install bpf-linker

# Verify kernel BPF support
grep CONFIG_BPF "/boot/config-$(uname -r)"
# Should show CONFIG_BPF=y and CONFIG_BPF_SYSCALL=y

2. Build

# Build eBPF kernel program (cross-compile to BPF target)
cargo xtask build-ebpf --release

# Build userspace daemon
cargo build --release

3. Test with Comm-Based Agent (Phase 1/2)

Edit config.toml to watch a common process like cat:

[global]
log_level = "info"
mode = "enforce"
pid_rescan_interval = 5
socket_path = "/run/guardian.sock"

[[agents]]
name = "test-cat"
process_name = "cat"
watch_children = true

[agents.file_access]
default = "deny"
allow = ["/tmp/**"]
deny = ["/etc/shadow"]

# Terminal 1:
sudo RUST_LOG=debug target/release/guardian --config config.toml

# Terminal 2:
cat /tmp/somefile       # Should show [ALLOW]
cat /etc/passwd         # In enforce mode: BLOCKED (returns EACCES)
cat /etc/shadow         # Blocked (explicit deny rule)
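The three outcomes above follow from the policy order: deny rules first, then allow rules, then the default. Here is a minimal sketch of that evaluation. The pattern matcher is deliberately simplified (only exact paths and a trailing `/**` are handled) and the function names are hypothetical; the real policy engine lives in guardian/src/config.rs and in the BPF maps.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum Action {
    Allow,
    Deny,
}

/// Simplified matcher: "prefix/**" matches anything below prefix,
/// any other pattern must equal the path exactly.
fn matches(pattern: &str, path: &str) -> bool {
    match pattern.strip_suffix("/**") {
        Some(prefix) => path
            .strip_prefix(prefix)
            .map_or(false, |rest| rest.starts_with('/')),
        None => pattern == path,
    }
}

/// Deny takes precedence over allow; unmatched paths get the default.
fn evaluate(path: &str, allow: &[&str], deny: &[&str], default: Action) -> Action {
    if deny.iter().any(|p| matches(p, path)) {
        return Action::Deny;
    }
    if allow.iter().any(|p| matches(p, path)) {
        return Action::Allow;
    }
    default
}
```

With `allow = ["/tmp/**"]`, `deny = ["/etc/shadow"]`, and `default = "deny"`, this yields Allow for `/tmp/somefile` and Deny for both `/etc/passwd` (default) and `/etc/shadow` (explicit rule).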

3b. Test with Cgroup-Based Agent (Phase 3)

Add a cgroup agent to config.toml:

[[agents]]
name = "test-agent"
identity = "cgroup"

[agents.file_access]
default = "deny"
allow = ["/tmp/**", "/proc/**", "/usr/lib/**", "/lib/**", "/lib64/**"]
deny = ["/etc/shadow"]

# Terminal 1: Start the daemon
sudo RUST_LOG=info target/release/guardian --config config.toml

# Terminal 2: Launch a process with cgroup isolation
sudo target/release/guardian-launch --name test-agent --memory 1G --pids 50 -- bash

# Inside the launched bash shell:
cat /tmp/somefile       # ALLOWED — in the allow list
cat /etc/shadow         # BLOCKED — in the deny list

# Terminal 3: Manage agents
sudo target/release/guardian-ctl list                    # List agents
sudo target/release/guardian-ctl grant -n test-agent \
    -p "/etc/shadow" -d 60                              # Temporary 60s file grant
sudo target/release/guardian-ctl grant -n test-agent \
    -p "/usr/bin/curl" -d 60 -t exec                   # Temporary 60s exec grant
sudo target/release/guardian-ctl stop -n test-agent     # Stop the agent

4. Potential Build Issues to Watch For

bpf-linker fails to install: May need llvm-dev package (sudo apt install llvm-dev on Ubuntu)
eBPF verifier rejects program: Build with --release flag (optimized code passes verifier more reliably). Check error message for specific rejection reason.
"Failed to attach to tracepoint": Kernel needs CONFIG_FTRACE=y and CONFIG_BPF=y; the sys_enter_* syscall tracepoints additionally need CONFIG_FTRACE_SYSCALLS=y. Most modern distros (Ubuntu 20.04+, Fedora 33+) have these.
Permission denied: Must run as root or with CAP_BPF + CAP_PERFMON.
Tracepoint offsets wrong on non-x86_64: The offsets in guardian-ebpf/src/main.rs (lines 219-231) are for x86_64. Verify on your arch by reading: cat /sys/kernel/debug/tracing/events/syscalls/sys_enter_openat/format
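When checking offsets on another architecture, it can help to parse the tracepoint `format` file rather than read it by eye. The sketch below extracts the byte offset of a named field from that file's text. `field_offset` is a hypothetical helper and is not part of the project.

```rust
/// Find the byte offset of `field` in the text of a tracepoint `format`
/// file. Lines of interest look like:
///   field:const char * filename;  offset:24;  size:8;  signed:0;
fn field_offset(format_text: &str, field: &str) -> Option<usize> {
    format_text.lines().find_map(|line| {
        let rest = line.trim().strip_prefix("field:")?;
        let mut parts = rest.split(';').map(str::trim);
        let decl = parts.next()?;
        // The field name is the last token of the C declaration;
        // drop any array suffix such as "[16]".
        let name = decl
            .rsplit(|c: char| c.is_whitespace() || c == '*')
            .next()?;
        let name = name.split('[').next()?;
        if name != field {
            return None;
        }
        parts
            .find_map(|p| p.strip_prefix("offset:"))?
            .parse::<usize>()
            .ok()
    })
}
```

Feed it the contents of the `format` file (for example via `std::fs::read_to_string`) and compare the result for `filename` and `flags` against the constants in guardian-ebpf/src/main.rs.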

Key Design Decisions

Decision	Rationale
Tracepoint + LSM hybrid	Tracepoint captures filename from syscall args (easy). LSM hook blocks access (enforcement). Tracepoint sets PENDING_DENY map entry, LSM reads it. Avoids complex path reading in LSM context.
3-tier identification (cgroup > TGID > comm)	Cgroup is unspoofable (kernel-enforced). TGID tracking catches children. Comm is the fallback. All three checked in eBPF for maximum coverage.
Kernel-side policy evaluation	Deny/allow rules stored in BPF Array maps. Tracepoint evaluates policy in-kernel with bounded loops. Eliminates userspace round-trip for enforcement decisions.
Per-CPU array scratch buffer	eBPF has 512-byte stack limit. `FileAccessEvent` is 292 bytes. Using `PerCpuArray` as a pre-allocated buffer is the standard pattern.
`PerfEventArray` (not `RingBuf`)	Compatible with Linux 5.2+. `RingBuf` is more efficient but needs 5.8+.
Deny-takes-precedence policy	Security best practice. Even if a path matches an allow rule, a deny rule overrides it. Prevents accidental over-permissioning.
`#[repr(C)]` on shared structs	Ensures identical memory layout between BPF target and native target. Without it, Rust may reorder fields differently per target.
Graceful LSM fallback	If LSM attachment fails (kernel doesn't support it), daemon falls back to monitor-only mode instead of crashing.
Cgroup v2 for agent isolation	Unspoofable identity, automatic child tracking via inheritance, resource limits via controllers. Process cannot escape its cgroup.
Launcher + IPC registration	`guardian-launch` creates cgroup, registers with daemon via Unix socket, then exec's agent. Clean separation of concerns.
Length-prefixed JSON IPC	Simple, debuggable protocol over Unix domain socket. Supports agent registration, listing, stopping, and temporary grants.
Temporary grants with expiry	Allow rules added to BPF maps (file) or exec policy (exec) with automatic removal after duration. Both `guardian-ctl grant -t exec` and dashboard support exec grants.
Async alert dispatch	AlertManager runs as tokio task with mpsc channel. Event processors never block on I/O.
Synchronous Prometheus metrics	Counters updated atomically in event processors. Accurate even when alert channel is full.
Per-output severity filters	Webhook gets warnings, Slack/email get critical only. Reduces noise per channel.
JSONL format (one JSON per line)	Easy to grep, tail, pipe to SIEM. No parser state between lines. Industry standard.
Simple TCP metrics server	Avoids axum dependency for Phase 4. Serves Prometheus text format directly.
Hash-based dedup	Same (agent, event_type, path, action) suppressed within window. Prevents alert storms.
Preset config templates	Inspired by Falco: ship working configs for common scenarios. Reduces onboarding friction.
axum + htmx + Alpine.js	Server-rendered HTML with htmx for partial updates, Alpine.js for client-side filtering. No JS build step. ~30KB total frontend.
askama templates	Compile-time template checking catches errors at build time. Zero-allocation rendering.
rust-embed for static files	Single binary deployment. No external file dependencies.
broadcast channel for SSE	Standard tokio pattern. Lagged SSE clients skip events rather than blocking producers.
Manual TOML serialization	Preserves readable config format. Round-tripping through the `toml` crate's serde support loses comments and ordering.
Dashboard behind `enabled` flag	Zero overhead when disabled. No axum server spawned.
Oneshot channel for permission long-poll	Agent blocks on `oneshot::Receiver`, dashboard resolves via `oneshot::Sender`. No polling loops.
SSE stream merging	`tokio_stream::StreamExt::merge` combines alert + permission broadcast streams into single SSE endpoint.
Alpine.js global permission store	Defined in `base.html`, available on every page. Banners appear everywhere without code duplication.
120s auto-deny timeout	Fail-secure: unanswered requests are denied. Prevents agents from hanging indefinitely.
Dual data sources (fetch + SSE)	HTTP fetch catches pre-existing pending requests; SSE delivers new ones in real time.
Userspace path normalization	Quick win for `/proc/self/root/` and `..` bypasses without kernel changes. Not full canonicalization (no symlinks).
Risk-based approval friction	4-tier risk scoring with mandatory wait timers prevents reflexive rubber-stamping of high-risk requests.
Justification pattern matching	Simple string matching flags social engineering patterns (urgency, authority claims). Low false positive rate.
Per-agent rate limiting	Prevents approval fatigue via flood attacks. Exponential backoff on consecutive denials.
SQLite permission audit	Persistent trail survives daemon restarts. Enables future anomaly detection on approval patterns.
openat2 graceful fallback	`load_tracepoint` failure is non-fatal — daemon continues without openat2 coverage on kernels < 5.6.
Separate PENDING_EXEC_DENY map	During execve, kernel internally opens the binary triggering `file_open` LSM. A shared PENDING map would be consumed by the file_open check, so exec enforcement needs its own map.
sys_enter_connect + LSM socket_connect for network enforcement	Tracepoint parses sockaddr, evaluates port-based policy, sets PENDING_NET_DENY. LSM socket_connect blocks with -ECONNREFUSED. Same pattern as file_open enforcement.
Single shared SSE connection	Browser HTTP/1.1 limits (~6 connections per origin). Multiple EventSource instances per page exhausted the pool. Single shared SSE with custom DOM events fixes this.
Legacy `sys_enter_open` hook	Belt-and-suspenders: most code uses `openat`, but rare binaries or direct syscalls may use legacy `open`. Reuses PENDING_DENY and EVENT_BUF maps.
Landlock as primary enforcement for cgroup agents	eBPF tracepoints see path strings (vulnerable to symlinks, TOCTOU). Landlock operates on inodes (immune). Use the right tool for each job: Landlock enforces, eBPF audits.
IPC sandbox config delivery	Daemon sends agent policy to launcher in registration Ack. Avoids config parsing duplication and keeps single source of truth.
Landlock default-deny only	Landlock has no deny rules — it's inherently default-deny. Agents with `file_access.default = "allow"` skip Landlock (incompatible model).
System read paths in Landlock	Common paths (/usr/lib, /etc/resolv.conf, /dev/null, etc.) get read+execute for dynamic linking. Without these, most binaries can't start.
Two security tiers	Cgroup agents get 4-layer defense (Landlock+seccomp+eBPF+cgroup). Comm-based agents get eBPF only. Clear documentation prevents false sense of security.
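As an illustration of the length-prefixed JSON IPC decision, here is a minimal framing sketch. The table does not specify the wire format, so the 4-byte big-endian length prefix, the 1 MiB cap, and the function names are all assumptions. The payload is treated as opaque bytes; in the daemon it would be JSON produced by serde_json.

```rust
use std::io::{self, Read, Write};

/// Hypothetical upper bound on a single message, to avoid huge allocations.
const MAX_FRAME: usize = 1 << 20;

/// Write one frame: 4-byte big-endian length, then the payload.
fn write_frame<W: Write>(w: &mut W, payload: &[u8]) -> io::Result<()> {
    if payload.len() > MAX_FRAME {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "payload too large"));
    }
    w.write_all(&(payload.len() as u32).to_be_bytes())?;
    w.write_all(payload)
}

/// Read one frame, rejecting lengths above MAX_FRAME before allocating.
fn read_frame<R: Read>(r: &mut R) -> io::Result<Vec<u8>> {
    let mut len_buf = [0u8; 4];
    r.read_exact(&mut len_buf)?;
    let len = u32::from_be_bytes(len_buf) as usize;
    if len > MAX_FRAME {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "frame too large"));
    }
    let mut buf = vec![0u8; len];
    r.read_exact(&mut buf)?;
    Ok(buf)
}
```

Both functions work on any `Read`/`Write`, so the same code serves a `UnixStream` in production and an in-memory `Cursor` in tests.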

Known Limitations (Phase 11)

Symlinks bypass eBPF enforcement: eBPF tracepoints see raw path strings, not resolved inodes. Mitigated for cgroup agents by Landlock (inode-level, symlink-immune) and privilege dropping (agent runs as non-root user). Comm-based agents remain vulnerable.
openat2 tracepoint requires kernel 5.6+: Gracefully skipped on older kernels
x86_64 offsets hardcoded: Tracepoint field offsets may differ on aarch64/arm
BPF LSM enforcement optional for cgroup agents: Landlock provides primary enforcement. BPF LSM (CONFIG_BPF_LSM=y) adds a second enforcement layer but is no longer required for security.
Tracepoint-LSM timing dependency: eBPF enforcement relies on tracepoint firing before LSM hook. Not applicable to Landlock (separate enforcement path).
Cgroup requires root: Creating cgroups and running guardian-launch needs root
Cgroup v2 required: Cgroup-based identification requires cgroup v2 (default on modern distros)
Comm-based agents have limited security: No Landlock, no seccomp hardening. Use cgroup agents for production.
SIGHUP reload doesn't update alerting outputs: Alerting config changes still require daemon restart
No webhook retry logic: Failed webhook/Slack/email sends are logged and dropped
Email password stored in plaintext config: Use file permissions to protect config
Dashboard policy changes don't update BPF maps: Require daemon restart or SIGHUP
Config write-back loses comments: Dashboard saves config as clean TOML
Dashboard uses custom CSS: htmx and Alpine.js are bundled locally via rust-embed; no CDN or internet required
Landlock requires Linux 5.13+: Gracefully skipped on older kernels. Network filtering requires 6.7+.
Landlock incompatible with default = "allow": Agents with permissive default skip Landlock sandbox.
Seccomp filter is x86_64 only: Syscall numbers hardcoded for x86_64 in guardian-launch
UDP not enforced by Landlock: Only TCP connect is filtered. UDP sendto() without prior connect() bypasses both Landlock and eBPF.
DNS unmonitored: DNS resolution happens before connect(). No domain-based policy possible.
Privilege drop requires SUDO_UID or --user: Direct root login without sudo can't auto-detect target user
CSRF protection requires HX-Request header: Non-htmx browser forms without auth token will be rejected

Build Notes

The log_level field in GlobalConfig triggers a dead_code warning since env_logger uses RUST_LOG env var. This is intentional for future use.

Roadmap for Future Phases

Phase 2: Enforcement + Exec Monitoring ✅ DONE

LSM BPF file_open hook for kernel-level blocking
sys_enter_execve tracepoint for command execution monitoring
Process tree tracking via sched_process_fork / sched_process_exit
Periodic PID rescanning via tokio interval
Kernel-side policy evaluation with deny/allow rules in BPF maps

Phase 3: Advanced Identity & Access ✅ DONE

Cgroup-based agent identification via bpf_get_current_cgroup_id() in eBPF
Guardian Launcher (guardian-launch): cgroup creation, resource limits, IPC registration
Guardian Ctl (guardian-ctl): list/stop/grant CLI for agent management
Unix socket IPC for launcher-daemon communication (/run/guardian.sock)
Time-based access windows (file and exec) with automatic cleanup on expiry
Resource limits via cgroup v2 controllers (memory, PIDs, CPU)
3-tier eBPF identification: cgroup ID → TGID → comm name (backward compatible)
Cgroup lifecycle: automatic cleanup when agent exits (cgroup becomes empty)
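The 3-tier identification order can be sketched in userspace as follows. Hash sets stand in for the WATCHED_CGROUPS, WATCHED_TGIDS, and WATCHED_COMMS BPF maps; `Watched`, `MatchedBy`, `comm16`, and `identify` are hypothetical names. The 16-byte comm buffer reflects the kernel's TASK_COMM_LEN.

```rust
use std::collections::HashSet;

#[derive(Debug, PartialEq)]
enum MatchedBy {
    Cgroup,
    Tgid,
    Comm,
}

/// Stand-ins for the three watch maps.
struct Watched {
    cgroups: HashSet<u64>,
    tgids: HashSet<u32>,
    comms: HashSet<[u8; 16]>,
}

/// Convert a process name to a NUL-padded 16-byte comm (max 15 chars).
fn comm16(name: &str) -> [u8; 16] {
    let mut buf = [0u8; 16];
    let bytes = name.as_bytes();
    let n = bytes.len().min(15);
    buf[..n].copy_from_slice(&bytes[..n]);
    buf
}

/// Check the strongest identifier first: cgroup ID, then TGID, then comm.
fn identify(w: &Watched, cgroup_id: u64, tgid: u32, comm: &[u8; 16]) -> Option<MatchedBy> {
    if w.cgroups.contains(&cgroup_id) {
        Some(MatchedBy::Cgroup)
    } else if w.tgids.contains(&tgid) {
        Some(MatchedBy::Tgid)
    } else if w.comms.contains(comm) {
        Some(MatchedBy::Comm)
    } else {
        None
    }
}
```

Returning which tier matched is useful for logging, because a comm-only match is the weakest and the easiest to spoof.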

Phase 4: Alerting & Integration ✅ DONE

Structured JSON logging with SIEM-compatible JSONL format and size-based rotation
Webhook alerts via HTTP POST with JSON payload and auth headers
Slack notifications with Block Kit formatting and severity-colored messages
Email notifications via async SMTP (lettre) with STARTTLS
Prometheus metrics on HTTP endpoint (file events, exec events, alerts sent/dropped)
Alert dedup/throttling with configurable time window and rate limits
Config validation CLI (--validate-config) for pre-deployment checks
SIGHUP config reload for hot-reloading agent policies
Preset configs in configs/ (minimal, recommended, strict, development)
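The alert dedup item can be illustrated with a small hash-keyed window. This is a sketch under assumptions, not the AlertManager code: `Dedup` is a hypothetical type, timestamps are passed in as plain seconds to keep it testable, and a suppressed alert does not extend the window.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Suppresses repeats of the same (agent, event_type, path, action)
/// within `window_secs`.
struct Dedup {
    window_secs: u64,
    last_sent: HashMap<u64, u64>, // key hash -> time the alert was last sent
}

impl Dedup {
    fn new(window_secs: u64) -> Self {
        Dedup { window_secs, last_sent: HashMap::new() }
    }

    /// Hash the four identifying fields into one dedup key.
    fn key(agent: &str, event_type: &str, path: &str, action: &str) -> u64 {
        let mut h = DefaultHasher::new();
        (agent, event_type, path, action).hash(&mut h);
        h.finish()
    }

    /// Returns true if the alert should be dispatched now.
    fn should_send(&mut self, key: u64, now_secs: u64) -> bool {
        match self.last_sent.get(&key) {
            Some(&t) if now_secs.saturating_sub(t) < self.window_secs => false,
            _ => {
                self.last_sent.insert(key, now_secs);
                true
            }
        }
    }
}
```

A production version would also evict old keys periodically so the map does not grow without bound.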

Phase 5: Dashboard & UI ✅ DONE

Web dashboard embedded in guardian binary (axum + htmx + Alpine.js)
Live event stream via SSE with severity/action filtering
Agent management: view configured agents, stop cgroup agents, grant temporary access
Policy editor: edit file access and exec rules per agent, save to disk
Alert configuration: toggle and configure all alerting outputs from browser
Status overview: auto-refreshing mode/agent/event/blocked cards
Config reload: reload config from dashboard UI
Prometheus metrics endpoint integrated into dashboard server (/metrics)
Single binary: templates compiled in, static files embedded via rust-embed

Phase 6: Interactive Permission Requests ✅ DONE

Interactive permission requests via guardian-ctl request-permission
Long-poll IPC with tokio::sync::oneshot channels (agent blocks waiting for human)
Real-time dashboard notifications: permission banners on every page via SSE
Dedicated /requests page: pending requests table + resolved history audit trail
SSE stream merging: alert events + permission events on single SSE connection
Approve/deny with grant duration: 1 min to 1 hour configurable grant duration
120-second auto-deny timeout: fail-secure, unanswered requests denied
Exec grant support: temporary grants for both file access and exec commands
Alpine.js permission store: global store with countdown timer, badge counter
Resolved history: last 100 resolved requests with full metadata

Phase 7: Security Hardening (Partial) ✅ DONE

Based on docs/security-improvements-research.md:

7a: Critical Security Fixes (partial):

Userspace path normalization (normalize_path()) in config.rs + main.rs event loop
openat2 tracepoint hook in eBPF (closes openat2 syscall bypass)
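Deferred from 7a: LSM file_open with bpf_d_path(), LSM bprm_check_security, dynamic linker detection, inode_rename/inode_unlink hooks. Dynamic linker detection and the inode hooks were later delivered in Phase 8, and exec enforcement via bprm_check_security with PENDING_EXEC_DENY is described in the summary at the top of this document.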

7c: Approval Hardening (complete):

Per-agent rate limiting (3/min, 15/hr, exponential backoff, same-resource cooldown)
Risk classification with 4-tier scoring (Low/Medium/High/Critical)
Auto-deny for never-approve resources
Auto-approve for low-risk resources with configurable duration
Justification text analysis (urgency, security bypass, reassurance, authority claims)
Mandatory wait timers in UI (0/3/5/10s by risk level)
Type-to-confirm for CRITICAL risk resources
Risk-colored permission banners with justification warnings
Persistent SQLite audit trail for all permission decisions
/api/permissions/audit endpoint for querying audit history
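The per-agent limits (3 per minute, 15 per hour) can be sketched as a sliding window over request timestamps. This illustrates only the two window checks; exponential backoff and the same-resource cooldown are left out, and `RateLimiter` here is a hypothetical type, not the one in permissions.rs.

```rust
use std::collections::VecDeque;

/// Sliding-window limiter for one agent. Times are seconds since an
/// arbitrary epoch and must be passed in non-decreasing order.
struct RateLimiter {
    per_min: usize,
    per_hour: usize,
    history: VecDeque<u64>, // timestamps of accepted requests
}

impl RateLimiter {
    fn new(per_min: usize, per_hour: usize) -> Self {
        RateLimiter { per_min, per_hour, history: VecDeque::new() }
    }

    /// Returns true and records the request if both limits allow it.
    fn try_request(&mut self, now: u64) -> bool {
        // Forget requests that are an hour old or more.
        while let Some(&t) = self.history.front() {
            if now.saturating_sub(t) >= 3600 {
                self.history.pop_front();
            } else {
                break;
            }
        }
        let last_minute = self
            .history
            .iter()
            .filter(|&&t| now.saturating_sub(t) < 60)
            .count();
        if last_minute >= self.per_min || self.history.len() >= self.per_hour {
            return false;
        }
        self.history.push_back(now);
        true
    }
}
```

Rejected requests are not recorded, so a flood of denied attempts does not push the window forward.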

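Deferred at the time: Phase 7b (network monitoring) and Phase 7d (advanced hardening: inode deny map, content hashing, io_uring blocking, mmap_file LSM, anomaly detection). Of these, network monitoring, io_uring blocking (Phase 8a seccomp filter), and anomaly detection (Phase 8d) are listed elsewhere in this document as delivered.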

Phase 8: Security Fixes ✅ DONE

Based on docs/security/security-fixes.md and docs/security/security-limitations.md:

8a: Critical Security (P0+P1):

BPF map capacity increased from 256 to 1024 entries (MAX_POLICY_RULES = 1024)
Seccomp filter in guardian-launch blocks io_uring (syscalls 425-427) and memfd_create (319) with EPERM
Inode LSM hooks: inode_rename, inode_unlink, inode_link with PENDING maps and tracepoints for rename/unlink/hardlink enforcement
Path truncation handling: status_flags field in FileAccessEvent, EVENT_FLAG_TRUNCATED flag, deny-by-default for truncated paths

8b: Exec Hardening + Dashboard Security (P2):

Dynamic linker detection: DYNAMIC_LINKERS BPF map, reads argv[1] for real binary behind ld-linux
execveat tracepoint: detects AT_EMPTY_PATH flag (memfd_create + execveat attack vector)
Strict enforcement mode: mode = "strict" bails on any LSM load/attach failure
Default exec deny for /memfd: paths: unconditionally blocks exec of memfd-backed binaries
Dashboard authentication: optional auth_token in config, Bearer header or ?token= query param

8c: Approval Hardening (P3):

Risk-based configurable timeouts: RiskTimeoutConfig with per-level timeout settings (60/120/180/300s defaults)
CLI permission approval: guardian-ctl pending/approve/deny commands + IPC message handlers
Grant accumulation limits: GrantAccumulator tracking 24h cumulative grant durations, max_grant_total_secs config
Improved justification analysis: weighted scoring (per-pattern weights), graduated risk bumps (score >= 8 -> +2, >= 3 -> +1)
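The graduated risk bump and the per-level timeouts can be written down compactly. In the sketch below the thresholds (score >= 8 adds two levels, score >= 3 adds one) and the default values (60/120/180/300s timeouts, 0/3/5/10s wait timers) come from this document; mapping them to Low/Medium/High/Critical in ascending order is an assumption, and the type and function names are hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Risk {
    Low,
    Medium,
    High,
    Critical,
}

impl Risk {
    /// Raise the level by `steps`, saturating at Critical.
    fn bump(self, steps: u8) -> Risk {
        let idx = (self as u8).saturating_add(steps).min(3);
        [Risk::Low, Risk::Medium, Risk::High, Risk::Critical][idx as usize]
    }

    /// Default auto-deny timeout per level (ordering assumed).
    fn timeout_secs(self) -> u64 {
        match self {
            Risk::Low => 60,
            Risk::Medium => 120,
            Risk::High => 180,
            Risk::Critical => 300,
        }
    }

    /// Mandatory UI wait before approval is possible (ordering assumed).
    fn wait_secs(self) -> u64 {
        match self {
            Risk::Low => 0,
            Risk::Medium => 3,
            Risk::High => 5,
            Risk::Critical => 10,
        }
    }
}

/// Apply the justification score: >= 8 adds two levels, >= 3 adds one.
fn apply_justification_score(base: Risk, score: u32) -> Risk {
    let steps = if score >= 8 {
        2
    } else if score >= 3 {
        1
    } else {
        0
    };
    base.bump(steps)
}
```

For example, a Low-risk path with a justification score of 8 is treated as High, which under these assumed defaults means a 5-second wait timer and a 180-second timeout.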

8d: Polish (P4):

Configurable fail-closed mode: fail_closed: true per agent, FAIL_CLOSED_CGROUPS BPF map
SIGHUP reload: agent policies and permissions config reloaded (alerting outputs still require restart)
Anomaly detection: hourly background task checking rubber-stamping (>90% approval), high-volume agents, deny-then-approve persistence patterns
SQLite query methods: approval_rate_24h(), high_volume_agents_24h(), agents_with_deny_then_approve()

Phase 9: Network Enforcement ✅ DONE

LSM socket_connect hook blocks denied connections at kernel level (-ECONNREFUSED)
Port-based BPF maps: NET_DENY_PORTS, NET_ALLOW_PORTS, NET_DEFAULT_ACTION, NET_CGROUP_DEFAULT_ACTION
PENDING_NET_DENY map: tracepoint→PENDING→LSM pattern for network enforcement
populate_net_enforcement_maps() loads port policy from config into BPF maps
BLOCKED status in process_net_event() for enforce mode

Phase 10: Hardened Cgroup Agents ✅ DONE

Creative architectural shift: use Landlock LSM as primary enforcement, eBPF as audit layer.

Landlock sandbox in guardian-launch: inode-level file access control (Linux 5.13+). Resolves symlinks at VFS layer. Default-deny model. TCP connect filtering (kernel 6.7+).
Expanded seccomp: Blocks mount (165-166), namespace escape (setns, unshare), chroot, pivot_root, new mount API (428-433, 442)
PR_SET_NO_NEW_PRIVS: Prevents SUID escalation, required by Landlock
IPC SandboxConfig: Daemon sends agent policy in registration Ack response. Launcher builds Landlock + seccomp from it.
Two security tiers: Cgroup = hardened (Landlock+seccomp+eBPF+cgroup), Comm = limited (eBPF only)
Every CRITICAL and HIGH vulnerability (symlinks, TOCTOU, io_uring, rename/hardlink) is mitigated for cgroup agents

Phase 11: Security Hardening & Performance ✅ DONE

Based on comprehensive code audit (docs/security/comprehensive-code-analysis.md):

11a: Critical Security Fixes:

PENDING map overflow fail-closed (per-CPU overflow arrays + 4x map size)
Privilege dropping in guardian-launch (fixes Landlock+exec on SELinux)
Grant accumulation limit enforcement (check before oneshot send)

11b: High Security Fixes:

Privilege drop mandatory on SELinux (bail instead of warn)
Landlock default-allow returns error (not silent Ok)
CSRF protection for dashboard (HX-Request header validation)
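The HX-Request check can be sketched as a pure function over the request method and headers. htmx sends `HX-Request: true` on the requests it issues. Requiring the value to be exactly `true` (rather than merely present) is an assumption, and `csrf_ok` is a hypothetical helper rather than the dashboard's actual middleware.

```rust
/// Returns true if the request passes the CSRF check.
/// State-changing methods must carry an `HX-Request: true` header;
/// other methods are not checked.
fn csrf_ok(method: &str, headers: &[(&str, &str)]) -> bool {
    let state_changing = matches!(
        method.to_ascii_uppercase().as_str(),
        "POST" | "PUT" | "DELETE"
    );
    if !state_changing {
        return true;
    }
    headers
        .iter()
        .any(|(name, value)| name.eq_ignore_ascii_case("HX-Request") && *value == "true")
}
```

The check works because a plain cross-origin HTML form cannot attach custom headers, so a forged form submission arrives without `HX-Request` and is rejected.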

11c: Performance & Reliability:

O(1) agent lookup via comm_cache HashMap (was O(N) per event)
IPC socket 30-second read timeout (prevents stall attacks)
Rate limiter TTL cleanup (1 hour, prevents unbounded memory)
Grant accumulator TTL cleanup (24 hours, hourly background task)
BPF grant removal logged at WARN (was DEBUG)
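The grant accumulation fix (check before the decision is sent) and the 24-hour TTL cleanup can be sketched together. The type below is a hypothetical model, not the project's GrantAccumulator: timestamps are plain seconds, and `max_total_secs` mirrors the config key of the same name.

```rust
use std::collections::HashMap;

const WINDOW_SECS: u64 = 24 * 60 * 60;

/// Tracks cumulative grant durations per agent over a 24h window.
struct GrantAccumulator {
    max_total_secs: u64,
    grants: HashMap<String, Vec<(u64, u64)>>, // agent -> (granted_at, duration)
}

impl GrantAccumulator {
    fn new(max_total_secs: u64) -> Self {
        GrantAccumulator { max_total_secs, grants: HashMap::new() }
    }

    /// Sum of grant durations issued to `agent` within the last 24 hours.
    fn total(&self, agent: &str, now: u64) -> u64 {
        self.grants.get(agent).map_or(0, |g| {
            g.iter()
                .filter(|(at, _)| now.saturating_sub(*at) < WINDOW_SECS)
                .map(|(_, d)| *d)
                .sum::<u64>()
        })
    }

    /// Check the limit FIRST and record only on success. The caller sends
    /// the approval to the agent only if this returns true.
    fn try_grant(&mut self, agent: &str, duration: u64, now: u64) -> bool {
        if self.total(agent, now).saturating_add(duration) > self.max_total_secs {
            return false;
        }
        self.grants.entry(agent.to_string()).or_default().push((now, duration));
        true
    }

    /// Periodic cleanup: drop expired grants and empty agents.
    fn cleanup(&mut self, now: u64) {
        for g in self.grants.values_mut() {
            g.retain(|(at, _)| now.saturating_sub(*at) < WINDOW_SECS);
        }
        self.grants.retain(|_, g| !g.is_empty());
    }
}
```

Keeping the check and the record in one method makes it hard to reintroduce the original bug, where the limit was evaluated only after the agent had already been unblocked.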

11d: Usability & Platform:

Default cgroup agent config auto-created on registration
Debian/Ubuntu dynamic linker multiarch paths
Dashboard form defaults updated for Fedora + Debian system paths
Landlock system paths: /usr/libexec, /sbin, /snap, /var
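
Not every system path exists on every distribution (/snap, for instance, is often absent on Fedora), and a Landlock path rule needs an open file descriptor for the path it covers. One way to handle this is to filter the candidate list down to paths present on the host before building rules. This is a sketch under that assumption; existing_paths and EXTRA_SYSTEM_PATHS are hypothetical names, and the launcher may handle missing paths differently.

```rust
use std::path::{Path, PathBuf};

/// The additional system paths named in this section.
const EXTRA_SYSTEM_PATHS: &[&str] = &["/usr/libexec", "/sbin", "/snap", "/var"];

/// Keeps only the candidates that exist on this host, preserving order.
fn existing_paths(candidates: &[&str]) -> Vec<PathBuf> {
    candidates
        .iter()
        .map(|s| Path::new(s))
        .filter(|p| p.exists())
        .map(|p| p.to_path_buf())
        .collect()
}
```

Calling existing_paths(EXTRA_SYSTEM_PATHS) then gives the subset that can safely be turned into Landlock rules on the current machine.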

Dependency Versions

Crate	Version	Purpose
aya	0.13	eBPF userspace library
aya-ebpf	0.1	eBPF kernel-side library
aya-log	0.2	Log forwarding from eBPF to userspace
aya-log-ebpf	0.1	Log macros for eBPF programs
tokio	1	Async runtime for event processing
serde	1	Config deserialization + IPC
serde_json	1	IPC message serialization
toml	0.8	TOML config parsing
clap	4	CLI argument parsing
anyhow	1	Error handling with context
bytes	1	Perf buffer byte management
log	0.4	Logging facade
env_logger	0.11	Log output
libc	0.2	Unix system calls (kill, etc.)
reqwest	0.12	HTTP client for webhook/Slack alerts
lettre	0.11	Async SMTP email transport
chrono	0.4	ISO 8601 timestamps
prometheus	0.13	Prometheus metrics counters + encoding
axum	0.8	HTTP framework for dashboard
askama	0.12	Compile-time HTML templates
askama_axum	0.4	Askama + axum integration
rust-embed	8	Embed static files in binary
tower-http	0.6	HTTP middleware (CORS)
seccompiler	0.4	Seccomp BPF filter for blocking io_uring/memfd_create/mount/namespace
landlock	0.4	Landlock LSM for inode-level file access control (symlink-immune)
tokio-stream	0.1	Stream adapters for SSE broadcast

Code Quality Notes

All source files have extensive inline comments explaining eBPF concepts, Rust patterns, and security rationale, because the user is learning all three simultaneously
guardian/src/config.rs has 15 unit tests covering path matching, policy evaluation, identity, and path normalization
guardian/src/permissions.rs has 6 unit tests for rate limiting, risk classification, auto-deny/approve, and justification analysis
guardian/src/main.rs has 4 unit tests for flag decoding and comm conversion
The tests in guardian/ can only run on Linux (aya dependency)
guardian-common tests pass on any platform

Security Best Practices Implemented

Default-deny policy model
Deny rules override allow rules
Config validation warns about overly permissive patterns
Sensitive paths (SSH keys, cloud creds, .env) in default deny list
eBPF program never blocks syscalls in Phase 1 (fail-open for safety)
On error, the eBPF program returns 0 so that it does not interfere with the system
Documented that config file should be root-owned and not world-writable
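
The first two rules (default-deny, with deny overriding allow) reduce to a short evaluation order: check deny rules first, then allow rules, then fall through to deny. The sketch below illustrates that order with simple prefix matching on path components. It is not the matcher in guardian/src/config.rs; the names Decision, under, and evaluate are hypothetical, and the real config may support richer patterns.

```rust
#[derive(Debug, PartialEq)]
enum Decision {
    Allow,
    Deny,
}

/// True when `path` equals `prefix` or lies beneath it on a component
/// boundary, so "/a/b" does not match "/a/bc".
fn under(path: &str, prefix: &str) -> bool {
    let prefix = prefix.trim_end_matches('/');
    path == prefix
        || path
            .strip_prefix(prefix)
            .map_or(false, |rest| rest.starts_with('/'))
}

/// Deny rules win over allow rules; anything unmatched is denied.
fn evaluate(path: &str, allow: &[&str], deny: &[&str]) -> Decision {
    if deny.iter().any(|d| under(path, d)) {
        return Decision::Deny;
    }
    if allow.iter().any(|a| under(path, a)) {
        return Decision::Allow;
    }
    Decision::Deny
}
```

With allow = ["/home/agent/project"] and deny = ["/home/agent/project/.env"], a source file inside the project is allowed, the .env file is denied despite the broader allow rule, and an unrelated path such as /etc/passwd is denied by default.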
