Guidance for AI agents and developers working on promptfoo/crabcode.
crabcode is a tmux-based multi-workspace manager for multi-repo development. The primary CLI is the Bash script in src/crabcode; it creates Git worktrees, manages tmux panes, saves/restores WIP state, integrates with Slack/Linear, and supports multiple coding agents.
Codex is the default agent, but the codebase intentionally supports both Codex and Claude through the agent abstraction helpers in src/crabcode. Preserve that abstraction and avoid hardcoding one agent's session paths, CLI flags, or config file layout in new features.
# Run all local tests
make test
# Run Bats unit tests only
make test-unit
# Run integration tests
make test-integration
# Run integration or e2e tests in Docker
make test-docker
make test-e2e
# Lint the main shell script
make lint
# Install the local CLI to /usr/local/bin
make install

For focused unit coverage while iterating on the agent abstraction, run:
./tests/bats/bin/bats tests/unit/test_agent_helpers.bats

- src/crabcode: main CLI implementation and all workspace, agent, sync, Slack, WIP, and plugin commands.
- Makefile: test, lint, Docker test, and install entrypoints.
- .github/workflows/test.yml: Bats + Shellcheck CI with a required CI Success aggregate check.
- .github/workflows/release-please.yml: release automation driven by Conventional Commit subjects.
- tests/unit/: Bats tests for shell helper behavior.
- tests/run.sh and tests/e2e/: integration and Dockerized end-to-end test harnesses.
- tests/MANUAL.md: manual QA notes for scenarios that are hard to fully automate.

- Keep shell changes compatible with set -e and the existing Bash style in src/crabcode.
- Prefer extending the agent helper functions (get_agent_type, agent_*_for_type, agent_sync_*) instead of branching directly on claude or codex throughout command handlers.
- When adding repo-level agent instructions to generated workspaces, write to AGENTS.md for Codex mode and .claude/CLAUDE.md for Claude mode through the existing helper functions.
- Keep user-facing command output concise and consistent with the existing colored status helpers (info, success, warn, error).
- Do not commit generated runtime state from ~/.crabcode, .local/, or test artifacts.
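
The agent-abstraction guidance above can be illustrated with a minimal sketch. Both function names below are hypothetical: they follow the agent_*_for_type naming pattern, but the real helpers in src/crabcode may use different names and signatures, so check the script before reusing anything here. The point is only that the agent-specific path lives in one helper and callers never branch on claude or codex themselves.

```bash
# Hypothetical helper following the agent_*_for_type pattern; not
# necessarily the name or signature used in src/crabcode.
agent_instructions_path_for_type() {
  local agent_type="$1"
  case "$agent_type" in
    codex)  printf '%s\n' "AGENTS.md" ;;
    claude) printf '%s\n' ".claude/CLAUDE.md" ;;
    *)
      printf 'unknown agent type: %s\n' "$agent_type" >&2
      return 1
      ;;
  esac
}

# Hypothetical caller: writes repo-level instructions into a workspace
# without knowing which agent is active.
write_agent_instructions() {
  local workspace_dir="$1" agent_type="$2" content="$3"
  local rel_path target
  # The explicit "|| return 1" keeps the failure path predictable under set -e.
  rel_path="$(agent_instructions_path_for_type "$agent_type")" || return 1
  target="$workspace_dir/$rel_path"
  mkdir -p "$(dirname "$target")"
  printf '%s\n' "$content" > "$target"
}
```

With this shape, supporting another agent would mean adding one case arm in the helper instead of touching every command handler.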

- Run make lint after editing src/crabcode.
- Run make test-unit for helper/command logic changes.
- Run make test-integration or make test-docker when worktree, tmux, or filesystem behavior changes.
- Run make test-e2e for risky changes to workspace lifecycle, agent startup, or plugin orchestration when Docker coverage is practical.
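
As a rough illustration of the validation checklist above, the sketch below maps kinds of change to the make targets listed there. The change labels (script-edit, helper-logic, and so on) and both function names are invented for this example and are not part of crabcode. The sketch only prints a plan and runs nothing.

```bash
# Map one hypothetical change label to the make target the checklist names.
check_for_change() {
  case "$1" in
    script-edit)                     printf '%s\n' "make lint" ;;
    helper-logic)                    printf '%s\n' "make test-unit" ;;
    # make test-docker is the listed alternative for this category.
    worktree|tmux|filesystem)        printf '%s\n' "make test-integration" ;;
    # Only worthwhile when Docker coverage is practical.
    lifecycle|agent-startup|plugins) printf '%s\n' "make test-e2e" ;;
    *)
      printf 'unknown change kind: %s\n' "$1" >&2
      return 1
      ;;
  esac
}

# Print each needed target once, in the order the labels were given.
plan_checks() {
  local kind target seen=""
  for kind in "$@"; do
    target="$(check_for_change "$kind")" || return 1
    case "$seen" in
      *"|$target|"*) ;;  # already planned
      *) seen="$seen|$target|"; printf '%s\n' "$target" ;;
    esac
  done
}
```

For example, `plan_checks script-edit helper-logic tmux` prints make lint, make test-unit, and make test-integration on separate lines.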

- Do not commit directly to main; use a feature branch and open a PR.
- Use Conventional Commit subjects (feat:, fix:, docs:, test:, refactor:, build:, ci:, chore:) so release-please can infer changelog entries and version bumps.
- Keep PR descriptions specific about user-facing command changes and list the validation commands you ran.
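
To make the commit-subject rule concrete, here is a small sketch of a local check against the types listed above. The function name is hypothetical and this is not a hook shipped with the repository. Accepting an optional scope such as fix(wip): and a breaking-change marker (!) is an assumption taken from the general Conventional Commits format, not from this document.

```bash
# Hypothetical helper: succeeds when a commit subject starts with one of the
# listed types, an optional (scope), an optional "!", then ": " and some text.
is_conventional_subject() {
  local subject="$1"
  local pattern='^(feat|fix|docs|test|refactor|build|ci|chore)(\([a-zA-Z0-9_./-]+\))?!?: .+'
  [[ "$subject" =~ $pattern ]]
}

# Usage: check the subject line of the most recent commit.
# if ! is_conventional_subject "$(git log -1 --pretty=%s)"; then
#   echo "commit subject is not a Conventional Commit" >&2
# fi
```

The pattern is kept in a variable and left unquoted on the right-hand side of =~ so that Bash treats it as a regular expression and not as a literal string.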