docs: add directory-specific AGENTS.md files and improve root guidelines (#124)

benvinegar · web-flow · commit 55acd3066dc4 · 2026-02-22T15:32:03.000-05:00
diff --git a/AGENTS.md b/AGENTS.md
@@ -1,242 +1,81 @@
 # Baudbot — Agent Guidelines
 
-Baudbot is hardened infrastructure for running always-on AI agents. Source is admin-owned; agents run from deployed copies.
+Baudbot is hardened infrastructure for running always-on AI agents.
 
-## Repo Layout
+Use this file for **repo-wide** guidance. For directory-specific rules, use the nearest nested `AGENTS.md`:
+- [`bin/AGENTS.md`](bin/AGENTS.md)
+- [`pi/extensions/AGENTS.md`](pi/extensions/AGENTS.md)
+- [`slack-bridge/AGENTS.md`](slack-bridge/AGENTS.md)
 
-```
-bin/                        security & operations scripts
-  baudbot                   CLI (attach, sessions, update, deploy)
-  deploy.sh                 deploys a prepared source tree → agent runtime
-  update-release.sh         temp checkout → git-free /opt release snapshot → deploy
-  rollback-release.sh       rollback to previous or specified /opt release snapshot
-  security-audit.sh         security posture audit
-  setup-firewall.sh         iptables per-UID egress allowlist
-  baudbot-safe-bash          shell command deny list (installed to /usr/local/bin)
-  baudbot-docker             Docker wrapper (blocks privilege escalation)
-  harden-permissions.sh     filesystem hardening (runs on boot)
-  scan-extensions.mjs       extension static analysis
-  redact-logs.sh            secret scrubber for session logs
-  prune-session-logs.sh     retention cleanup for old pi session logs
-  config.sh                 interactive secrets/config setup
-  broker-register.mjs       Slack broker workspace registration CLI
-  doctor.sh                 system health checks
-  uninstall.sh              clean removal of baudbot
-  test.sh                   runs all test suites
-  baudbot-firewall.service  systemd unit for firewall persistence
-  baudbot.service           systemd unit for agent process
-  ci/                       CI integration scripts
-    droplet.sh              ephemeral DigitalOcean droplet lifecycle (create/destroy/ssh)
-    setup-ubuntu.sh         Ubuntu droplet: prereqs + setup + tests
-    setup-arch.sh           Arch Linux droplet: prereqs + setup + tests
-  lib/                      shared shell helpers sourced by CLI/release scripts
-    shell-common.sh         strict mode + shared logging/error/root-check helpers
-    paths-common.sh         shared path constants (bb_init_paths, bb_refresh_release_paths)
-    release-common.sh       shared update/rollback helpers
-    deploy-common.sh        deploy/runtime helper functions
-    doctor-common.sh        doctor status/check formatting helpers
-    json-common.sh          shared JSON field extraction helper (jq)
-    baudbot-runtime.sh      runtime/status/session/attach helper module for bin/baudbot
-hooks/
-  pre-commit                blocks agent from modifying security files in git
-pi/
-  extensions/               source of truth for pi agent extensions
-    tool-guard.ts           🔒 tool call interception (blocks dangerous patterns)
-    tool-guard.test.mjs     🔒 86 tests for tool-guard
-    heartbeat.ts            periodic health check loop
-    auto-name.ts            session naming
-    control.ts              inter-session communication
-    idle-compact.ts         compact context during idle periods (default 25% threshold)
-    ...
-  skills/                   source of truth for agent skill templates
-    control-agent/          orchestration agent
-      HEARTBEAT.md          health check checklist (deployed to ~/.pi/agent/)
-      memory/               seed files for persistent memory
-    dev-agent/              coding agent
-    sentry-agent/           monitoring/triage agent
-  settings.json             pi agent settings
-slack-bridge/
-  bridge.mjs                Slack ↔ agent bridge (legacy Socket Mode)
-  broker-bridge.mjs         Slack ↔ agent bridge (broker pull mode — preferred)
-  security.mjs              🔒 content wrapping, rate limiting, auth
-  security.test.mjs         🔒 tests for security module
-setup.sh                    one-time system setup (creates user, firewall, etc.)
-start.sh                    agent launcher (deployed to ~/runtime/start.sh)
-```
-
-🔒 = security-critical files. Protected at runtime (read-only perms + tool-guard blocks writes).
+## How Baudbot works
 
-See [CONFIGURATION.md](CONFIGURATION.md) for all env vars and how to obtain them.
+Baudbot is a persistent, team-facing coding agent system. It connects to Slack, receives work requests from developers, and autonomously executes coding tasks (branch, code, test, PR, CI) on a Linux server.
 
-## Architecture: Source / Runtime Separation
+**Runtime model:** A long-running **control agent** stays connected to Slack, triages incoming requests, and delegates work to ephemeral **dev agents** that each run in isolated git worktrees. A **sentry agent** handles on-demand incident triage. All agents run as an unprivileged `baudbot_agent` Unix user.
 
-Live execution is release/runtime-based (`/opt/baudbot` + `baudbot_agent` runtime).
-
-Live operations are now release-based under `/opt/baudbot` (git-free):
-
-```
-/opt/baudbot/
-├── releases/<sha>/          immutable snapshot (no .git)
-├── current -> releases/<sha>   active release symlink
-└── previous -> releases/<sha>  previous release symlink (for rollback)
+```text
+Slack
+   ↓
+slack-bridge (broker pull-mode or legacy Socket Mode)
+   ↓
+control-agent (always-on, manages todo/routing/Slack threads)
+   ├── dev-agent(s) — ephemeral coding workers in isolated worktrees
+   └── sentry-agent — incident triage (Sentry alerts)
+         ↓
+git commits → PRs → CI feedback → thread updates back to Slack
 ```
 
-`baudbot update` flow:
-1) clone target ref into `/tmp/baudbot-update.*`
-2) run preflight checks in temp checkout
-3) publish git-free snapshot to `/opt/baudbot/releases/<sha>`
-4) deploy runtime files from snapshot
-5) restart + health check
-6) atomically switch `/opt/baudbot/current`
+**Deployment model:** Admin-managed source (this repo) is deployed as immutable, git-free release snapshots under `/opt/baudbot`. The agent runtime (`/home/baudbot_agent`) receives deployed extensions, skills, and bridge code. Updates and rollbacks are atomic symlink switches. See `docs/architecture.md` for full details.
 
-`baudbot rollback previous|<sha>` re-deploys an existing snapshot and flips `current`/`previous` without network access.
+## What this repo contains (high-level)
 
-Agent runtime layout:
-```
-/home/baudbot_agent/
-├── runtime/
-│   ├── start.sh                deployed launcher
-│   ├── bin/                    harden-permissions.sh, redact-logs.sh, prune-session-logs.sh
-│   └── slack-bridge/           deployed bridge
-├── .pi/agent/
-│   ├── extensions/             deployed extensions
-│   ├── skills/                 agent-owned (can modify freely)
-│   ├── HEARTBEAT.md            periodic health check checklist (admin-managed)
-│   ├── memory/                 persistent agent memory (agent-owned, survives deploys)
-│   ├── baudbot-version.json     deploy version (git SHA, timestamp)
-│   └── baudbot-manifest.json    SHA256 hashes of all deployed files
-├── workspace/                  project repos + git worktrees
-└── .config/.env                secrets (600 perms)
-```
+- `bin/` — operational shell CLI, deploy/update/rollback, security and health scripts
+- `pi/extensions/` — tool extensions and runtime behaviors deployed into agent runtime
+- `pi/skills/` — agent personas and behavior (`SKILL.md` defines each agent's identity, rules, and tools)
+  - `control-agent/` — orchestration/triage persona + persistent memory seeds
+  - `dev-agent/` — coding worker persona
+  - `sentry-agent/` — incident triage persona
+- `pi/settings.json` — pi agent settings
+- `slack-bridge/` — Slack integration bridges + security module
+- `docs/` — architecture/operations/security documentation
+- `test/` — vitest wrappers for shell scripts, integration, and legacy Node tests
+- `hooks/` — git hooks (security-critical `pre-commit` protecting admin-managed files)
+- `.github/` — CI workflows, PR template, issue templates
+- `.env.schema` — canonical schema for all environment variables (see `CONFIGURATION.md`)
+- `bootstrap.sh`, `setup.sh`, `install.sh`, `start.sh` — bootstrap installer, system setup, interactive install, and runtime launcher
 
-## Development Workflow
+## Core workflow
 
 ```bash
-# First-time install (interactive — handles everything)
-sudo ./install.sh
-
-# Edit source files in this repository checkout
+# JS/TS + shell tests + lint
+npm test
+npm run lint
 
-# For source-only changes (extensions/skills/bridge), deploy directly:
+# Source-only deploy (extensions/skills/bridge changes)
 ./bin/deploy.sh
 
-# For operational updates from git (recommended for live bot):
+# Live operational update/rollback
 sudo baudbot update
-
-# Roll back live bot to previous snapshot if needed:
 sudo baudbot rollback previous
-
-# Register a server with Slack broker (after OAuth callback)
-sudo baudbot broker register --broker-url https://broker.example.com --workspace-id T0123ABCD --registration-token <token>
-
-# Rotate an API key after setup (prompts hidden input)
-sudo baudbot env set ANTHROPIC_API_KEY --restart
-
-# Optional: use external secret source instead of ~/.baudbot/.env
-sudo baudbot env backend set-command 'your-secret-tool export baudbot-prod'
-sudo baudbot env sync --restart
-
-# Launch agent directly (debug/dev)
-sudo -u baudbot_agent ~/runtime/start.sh
-
-# Or in tmux
-tmux new-window -n baudbot 'sudo -u baudbot_agent ~/runtime/start.sh'
 ```
 
-## Slack broker pull-mode notes
-
-- Broker delivery is now pull-based. Registration is callback-free:
-  - `sudo baudbot broker register --broker-url ... --workspace-id T... --registration-token ...`
-- After a successful broker registration, always restart to load new keys:
-  - `sudo baudbot restart`
-- The runtime starts `broker-bridge.mjs` automatically when `SLACK_BROKER_*` vars are present.
-- Quick troubleshooting when Slack replies stop:
-  - `sudo -u baudbot_agent tmux ls` (check `slack-bridge` session exists)
-  - `sudo baudbot attach --tmux slack-bridge` (bridge logs)
-  - `sudo journalctl -u baudbot.service -n 200 --no-pager` (startup/runtime errors)
-- For local/semi-integration tests that spawn `slack-bridge/broker-bridge.mjs`, keep `libsodium-wrappers-sumo` available from root install (`npm install` at repo root).
-
-## Running Tests
-
-```bash
-# All tests (unified Vitest runner)
-npm test
-
-# Only JS/TS tests
-npm run test:js
-
-# Only shell/security script tests
-npm run test:shell
-
-# JS/TS coverage
-npm run test:coverage
-
-# Lint (Biome + ShellCheck)
-npm run lint
-```
-
-Add new test files to `vitest.config.mjs` (and shell wrappers under `test/` as needed) — don't scatter test invocations across CI or docs.
-
-## Conventions
-
-- Security functions must be pure, testable modules (no side effects, no env vars at module scope).
-- All security code must have tests before merging.
-- Run `bin/security-audit.sh --deep` after any security-relevant changes.
-- Keep shell CLIs thin: move reusable logic to `bin/lib/*.sh`, and source shared helpers (`shell-common.sh`, `paths-common.sh`, `release-common.sh`, `deploy-common.sh`, `doctor-common.sh`) instead of duplicating logging/error/root-check patterns.
-- For shell scripts, standardize on `bb_enable_strict_mode` and shared helper functions (`bb_log`, `bb_die`, etc.) rather than ad-hoc wrappers.
-- Protected files (`tool-guard.ts`, `security.mjs`, their tests) are deployed read-only. The agent cannot modify them at runtime.
-- New integrations get their own subdirectory (e.g. `discord-bridge/`).
-- Extensions are deployed from `pi/extensions/` → agent's `~/.pi/agent/extensions/`.
-- Skills are deployed from `pi/skills/` → agent's `~/.pi/agent/skills/`.
-- Agent commits operational learnings to its own skills dir (not back to source).
-- **When changing behavior, update all docs.** Check and update: `README.md`, relevant pages in `docs/`, `CONFIGURATION.md`, skill files (`pi/skills/*/SKILL.md`), and `AGENTS.md`. Inline code examples in docs must match the actual implementation.
-- **Prefer distro-agnostic commands, but prioritize reliability.** Scripts should work on both Arch and Ubuntu (and standard Linux), but distro-specific branches are allowed when they improve setup reliability/UX. If you use distro-specific logic, include graceful fallbacks (or clear prereq docs), keep behavior consistent across distros, and add tests. Continue using portable shell patterns (`grep -E`, POSIX-compatible tools) wherever possible.
-
-## Testing on Droplets
-
-Use `bin/ci/droplet.sh` to spin up ephemeral DigitalOcean droplets for testing setup, install, or shell changes on real Linux. Requires `DO_API_TOKEN` env var.
-
-```bash
-# Generate a throwaway SSH key
-ssh-keygen -t ed25519 -f /tmp/ci_key -N "" -q
-
-# Create a droplet (Ubuntu or Arch)
-eval "$(bin/ci/droplet.sh create my-test ubuntu-24-04-x64 /tmp/ci_key.pub)"
-# → sets DROPLET_ID, DROPLET_IP, SSH_KEY_ID
-
-# Or Arch (custom image):
-eval "$(bin/ci/droplet.sh create my-test 217410218 /tmp/ci_key.pub)"
-
-# Wait for SSH, upload source, run a CI script
-bin/ci/droplet.sh wait-ssh "$DROPLET_IP" /tmp/ci_key
-tar czf /tmp/baudbot-src.tar.gz --exclude=node_modules --exclude=.git .
-scp -i /tmp/ci_key /tmp/baudbot-src.tar.gz "root@$DROPLET_IP:/tmp/"
-bin/ci/droplet.sh run "$DROPLET_IP" /tmp/ci_key bin/ci/setup-ubuntu.sh
-
-# Or SSH in for manual poking
-ssh -i /tmp/ci_key "root@$DROPLET_IP"
-
-# Clean up when done (~$0.003/run)
-bin/ci/droplet.sh destroy "$DROPLET_ID" "$SSH_KEY_ID"
-```
+## Non-negotiable guardrails
 
-Droplets take ~15s to create, ~10s for SSH, ~90s for full setup+tests. Always destroy after — they cost ~$0.003 per run but add up if forgotten.
+**Hard constraints (enforced by pre-commit hook or CI):**
+- Never commit directly to `main`; use feature branches + PRs.
+- Security-critical files are protected by `hooks/pre-commit` — the agent cannot modify them at runtime.
+- Security-sensitive changes MUST include or update tests.
+- Do NOT weaken runtime hardening (permissions, least privilege, egress restrictions).
 
-The CI scripts (`bin/ci/setup-ubuntu.sh`, `bin/ci/setup-arch.sh`) run the bootstrap flow (`bootstrap.sh` → `baudbot install`) with simulated input, verify the result, then run the full test suite. Use them as-is or SSH in and test manually.
+**Strong defaults:**
+- When behavior changes, update docs in the same PR (`README.md`, `docs/*`, `CONFIGURATION.md`, and relevant `AGENTS.md` files).
+- Prefer distro-agnostic shell; distro-specific branches are acceptable when reliability improves.
 
-## Security Notes
+## Tests and quality gates
 
-- `tool-guard.ts` is a policy/guidance layer: it blocks many risky writes/bash patterns and provides safety-interruption reasoning, but it is not a hard sandbox boundary by itself.
-- `baudbot-safe-bash` (root-owned, `/usr/local/bin/`) is a second deny-list layer at the shell level; hard containment still comes from OS permissions and runtime hardening.
-- The firewall (`setup-firewall.sh`) restricts `baudbot_agent`'s network egress to an allowlist.
-- `/proc` is mounted with `hidepid=2` — agent can only see its own processes.
-- Secrets in `~/.config/.env` are `600` perms, never committed.
-- Session logs are auto-pruned on startup (14-day retention) and auto-redacted for API keys/tokens.
+Before merge, run at minimum: `npm run lint` and `npm test`.
 
-## Git Workflow
+## Git / PR expectations
 
-- **Never commit directly to `main`.** All changes go on feature branches with PRs.
-- One branch per todo/task. Branch names: `<gh-username>/<description>` (e.g. `youruser/add-uninstall-script`).
-- Use `gh pr create` to open PRs (not the GitHub API with tokens).
-- Concise, action-oriented commit messages: `security: add rate limiting to bridge API`
-- Prefix with area: `security:`, `bridge:`, `deploy:`, `docs:`, `arch:`, `tests:`
+- Use concise commit messages with area prefix (`security:`, `bridge:`, `deploy:`, `docs:`, `tests:`, ...).
+- If a change is scoped to a subdirectory with its own `AGENTS.md`, follow that local file first.
diff --git a/bin/AGENTS.md b/bin/AGENTS.md
@@ -0,0 +1,50 @@
+# bin/ — Agent Guidelines
+
+Scope: shell CLI and operational scripts under `bin/`.
+
+## Focus areas
+
+- CLI entrypoint (`baudbot`) and runtime helpers
+- deploy/update/rollback flows
+- security audit and firewall scripts
+- install/uninstall operational scripts
+- CI infrastructure (`bin/ci/`)
+- JS tooling: `broker-register.mjs`, `scan-extensions.mjs` (and their tests)
+- systemd unit files: `baudbot.service`, `baudbot-firewall.service`
+
+## Rules
+
+- Keep CLIs thin; move reusable logic into `bin/lib/*.sh`.
+- Reuse shared helpers (`shell-common.sh`, `paths-common.sh`, `release-common.sh`, etc.) instead of duplicating constants or logging/error patterns.
+- Prefer portable shell patterns; distro-specific branches are acceptable when reliability improves.
+- Any security-relevant shell change must include/adjust tests.
+
+## Critical files
+
+Treat as security-critical:
+- `baudbot-safe-bash` — runtime command-blocking wrapper
+- `harden-permissions.sh` — filesystem permission lockdown
+- `setup-firewall.sh` — network egress lockdown
+- `security-audit.sh` — security posture audit
+- `scan-extensions.mjs` — static analysis scanner for extensions
+- `redact-logs.sh` — secret redaction from session logs
+
+## Notes
+
+- Shared helpers in `bin/lib/` have co-located test files (e.g. `deploy-common.test.sh`, `json-common.test.sh`). Update tests when changing helpers.
+- For JS files in `bin/`, also run `npm run lint:js` and `npm run test:js`.
+
+## Validation
+
+Run before finishing shell work:
+
+```bash
+npm run lint:shell
+npm run test:shell
+```
+
+For security-sensitive updates, also run:
+
+```bash
+bin/security-audit.sh --deep
+```
diff --git a/pi/extensions/AGENTS.md b/pi/extensions/AGENTS.md
@@ -0,0 +1,36 @@
+# pi/extensions/ — Agent Guidelines
+
+Scope: extension code under `pi/extensions/`.
+
+## Focus areas
+
+- Tool extensions and runtime behaviors
+- Session-control and orchestration helpers
+- Safety/policy logic and stateful agent features
+
+## Rules
+
+- Keep extension behavior deterministic and testable.
+- Avoid module-scope side effects for security-sensitive code.
+- Preserve compatibility with deployed runtime assumptions (`~/.pi/agent/...`).
+- If changing extension behavior, update related skill/docs references where relevant.
+
+## Subdirectory extensions
+
+Some extensions have their own package scope:
+- `kernel/` — on-kernel cloud browser execution (has own `package.json`)
+- `agentmail/` — email integration (has own `package.json`)
+- `email-monitor/` — email monitoring
+
+## Critical files
+
+Treat these as security-critical and require strong justification + tests:
+- `tool-guard.ts`
+- `tool-guard.test.mjs`
+
+## Validation
+
+```bash
+npm run lint:js
+npm run test:js
+```
diff --git a/slack-bridge/AGENTS.md b/slack-bridge/AGENTS.md