AGENTS.md

Project-wide guidance for any code-writing agent (Codex, OpenAI agent runtime, Cursor, Aider, Cline, Continue, …). Claude Code reads this file too, in addition to CLAUDE.md.

Tooling infrastructure. On top of this file, the repo ships agent-specific wiring so every major coding agent has the same rules, skills, and permissions:

  • .claude/ — Claude Code settings, path-scoped rules, skills (/verify, /add-tool, /add-provider, /add-agent, /debug-loop, /release-check), subagents (code-reviewer, test-runner, docs-auditor), and safe/deny permissions.
  • .codex/ — Codex config.toml, execpolicy rules/default.rules, custom agents (forge_explorer, forge_reviewer, forge_test_runner), guardrail hooks, and Codex-native skills.
  • .cursor/rules/ — Cursor *.mdc rules (always-applied core + path-scoped TypeScript/testing/tools/models/security/UI rules).
  • .agents/skills/ — portable skills following the open agentskills.io spec. Any compliant agent can read these without vendor-specific wiring.

If you update a workflow (e.g. the verify chain), update it in .agents/skills/verify/SKILL.md first, then mirror the change into .claude/skills/verify/SKILL.md and .codex/skills/verify/SKILL.md if they diverge.

Agentic Coding Flywheel. Forge follows a plan-heavy, bead-driven workflow for any non-trivial change. See FLYWHEEL.md for the full methodology. In brief:

  • Plan space: .flywheel/plans/ (iterated markdown plans)
  • Bead space: .beads/beads.jsonl (self-contained work units)
  • Code space: src/ + test/ (implementation)

Skills: /plan, /plan-synthesize, /plan-to-beads, /polish-beads (run 4–6×), /fresh-eyes, /dedupe-beads, /idea-wizard, /deep-review, /reality-check, /landing, /de-slopify. Subagents: bead-polisher, plan-synthesizer, skill-refiner (Claude); forge_bead_polisher, forge_plan_synthesizer (Codex). The 8 canonical operators live in .flywheel/operators/.

Post-compaction ritual. When an agent gets confused, send: "Reread AGENTS.md, CLAUDE.md, and FLYWHEEL.md so they're still fresh in your mind." This is the single most common intervention in the methodology.

Conform to the OpenAI AGENTS.md convention: this file is a flat Markdown cheat-sheet that answers the questions "where am I, what can I run, and what shouldn't I break?".


1. Project identity

Forge is a TypeScript CLI runtime for local-first agentic software engineering. Node 20+. Ships via npm and a multi-arch Docker image.

  • Entry point: bin/forge.js → dist/cli/index.js
  • Orchestrator: src/core/orchestrator.ts
  • Agentic loop: src/core/loop.ts

2. Commands you'll run

npm ci --ignore-scripts
npm run build          # tsc + copy-assets
npm test               # vitest; 249 tests must pass
npm run typecheck
npm run lint
npm run format         # writes
npm run format:check   # reads, for CI
npm run test:coverage
./bin/forge.js doctor

Always end a change with:

npm run format && npm run lint && npm run build && npm test

3. Layout

| Path | Purpose |
| --- | --- |
| src/cli/ | commander-based CLI, REPL, raw-mode input editor |
| src/core/ | orchestrator, agentic loop, mode policy, validation gate |
| src/agents/ | planner, architect, executor, reviewer, debugger, memory |
| src/models/ | providers (ollama/openai/anthropic/llamacpp/vllm/lmstudio), router, adapter, catalog |
| src/tools/ | 18 tools |
| src/permissions/ | risk + interactive permission manager |
| src/sandbox/ | path-safe fs + command risk classifier |
| src/persistence/ | tasks/sessions/conversations/events + SQLite |
| src/memory/ | hot/warm/cold/learning |
| src/ui/ | HTTP + WS dashboard + static app |
| src/mcp/ | MCP bridge |
| test/unit/ | vitest unit tests |
| docs/ | ARCHITECTURE, INSTALL, SETUP, metrics |
| .github/workflows/ | ci, release, nightly |
| docker/ | Dockerfile + compose |

4. Rules

Must

  • Keep npm test at 100% pass.
  • Respect the state machine in src/persistence/tasks.ts.
  • Gate every new tool through requestPermission.
  • Classify model ids through src/models/local-catalog.ts (don't hand-roll regexes in a new provider).
  • Add a unit test for any new logic in src/core, src/agents, or src/tools.
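
The "gate every new tool through requestPermission" rule can be pictured as a wrapper that refuses to run a tool body until the gate approves it. This is a minimal sketch, not Forge's actual code: the `Risk` levels, the `PermissionRequest` and `ToolResult` shapes, the `gated` helper, and the signature assumed for `requestPermission` are all hypothetical.

```typescript
// Hypothetical shapes: the real requestPermission signature may differ.
type Risk = "low" | "medium" | "high" | "critical";

interface PermissionRequest {
  readonly tool: string;
  readonly risk: Risk;
  readonly summary: string;
}

type RequestPermission = (req: PermissionRequest) => Promise<boolean>;

interface ToolResult {
  readonly ok: boolean;
  readonly output: string;
}

// Wraps a tool body so it can only run after the gate approves the call.
function gated<A>(
  tool: string,
  risk: Risk,
  requestPermission: RequestPermission,
  run: (args: A) => Promise<string>,
): (args: A) => Promise<ToolResult> {
  return async (args: A): Promise<ToolResult> => {
    const allowed = await requestPermission({
      tool,
      risk,
      summary: `${tool} ${JSON.stringify(args)}`,
    });
    if (!allowed) {
      // A denial is an expected outcome, so it is returned rather than thrown.
      return { ok: false, output: `permission denied for ${tool}` };
    }
    return { ok: true, output: await run(args) };
  };
}
```

Because the gate is passed in as a parameter, a unit test can supply a stub that always approves or always denies, with no interactive prompt involved.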

Must not

  • Bypass the permission system.
  • Introduce network calls in tests — use vi.mock.
  • Log credentials. Use src/security/redact.ts.
  • Add dependencies without a clear reason.
  • Rename exported APIs without updating every caller + the docs.

Good defaults

  • Prefer readonly / immutable data flow.
  • Prefer function modules over classes unless state really needs encapsulation (see the provider classes for the accepted shape).
  • Prefer explicit Result<T, E>-style shapes over thrown errors for expected failures. Throw only for programmer errors.
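
A sketch of what a Result<T, E>-style shape might look like, assuming a discriminated union on `ok`. The `ok`/`err` helpers, `parseTurnCap`, `TurnCapError`, and the 1..1000 range are hypothetical names and values chosen for illustration; the repo's own Result type may be shaped differently.

```typescript
type Result<T, E> =
  | { readonly ok: true; readonly value: T }
  | { readonly ok: false; readonly error: E };

const ok = <T>(value: T): Result<T, never> => ({ ok: true, value });
const err = <E>(error: E): Result<never, E> => ({ ok: false, error });

// Hypothetical error shape for a hypothetical parser.
interface TurnCapError {
  readonly kind: "not-an-integer" | "out-of-range";
  readonly input: string;
}

// Bad user input is an expected failure, so it is returned, not thrown.
function parseTurnCap(input: string): Result<number, TurnCapError> {
  const n = Number(input);
  if (input.trim() === "" || !Number.isInteger(n)) {
    return err<TurnCapError>({ kind: "not-an-integer", input });
  }
  if (n < 1 || n > 1000) {
    return err<TurnCapError>({ kind: "out-of-range", input });
  }
  return ok(n);
}

// An unhandled union member is a programmer error, which is the case that throws.
function assertNever(x: never): never {
  throw new Error(`unreachable: ${JSON.stringify(x)}`);
}

function explain(r: Result<number, TurnCapError>): string {
  if (r.ok) return `cap=${r.value}`;
  switch (r.error.kind) {
    case "not-an-integer":
      return `"${r.error.input}" is not an integer`;
    case "out-of-range":
      return `"${r.error.input}" is outside 1..1000`;
    default:
      return assertNever(r.error.kind);
  }
}
```

Callers have to check `ok` before touching `value`, so the compiler forces the failure path to be handled at the call site.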

5. Testing

Vitest. Patterns to copy:

  • Stubbing callModel: see test/unit/executor-loop.test.ts.
  • Stubbing providers: see test/unit/adapter.test.ts.
  • Tempdir + cleanup: see test/unit/validation-gate.test.ts.

6. CI quick reference

9 parallel jobs on every PR:

  1. 🎨 format (prettier --check)
  2. 🧹 lint (eslint)
  3. 🧠 typecheck (tsc --noEmit)
  4. 🧪 test (matrix: ubuntu + macOS × Node 20 + 22)
  5. 📈 coverage
  6. 🏗️ build (full npm run build)
  7. 🐳 docker-build (catches Dockerfile drift)
  8. 🔐 audit (npm audit, informational)
  9. 📊 pipeline status (aggregates + fails if any required job failed)
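
The aggregation done by the pipeline status job can be sketched as a small pure function. The result strings match the values GitHub Actions reports for a job (`success`, `failure`, `cancelled`, `skipped`); the `Job` shape, the `required` flag, and the choice to treat anything other than `success` as a failure for required jobs are assumptions, not a description of the actual workflow.

```typescript
type JobResult = "success" | "failure" | "cancelled" | "skipped";

interface Job {
  readonly name: string;
  readonly result: JobResult;
  // Informational jobs such as audit would be marked not required.
  readonly required: boolean;
}

// Returns the names of required jobs that did not succeed.
// Assumption: cancelled and skipped required jobs also count as failures.
function failedRequired(jobs: readonly Job[]): string[] {
  return jobs
    .filter((job) => job.required && job.result !== "success")
    .map((job) => job.name);
}

// The pipeline passes only when no required job is in the failed list.
function pipelinePasses(jobs: readonly Job[]): boolean {
  return failedRequired(jobs).length === 0;
}
```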

Release (tag v*): 6-stage pipeline — gate, artifacts, docker publish to GHCR, signed manifest + GH release, npm publish (provenance), status.

7. Performance posture

The product runs on personal machines, often alongside Ollama. Keep it lean:

  • No UI framework (app shell is vanilla JS + CSS, < 100 KB).
  • No synchronous disk reads on REPL redraw or UI poll paths.
  • The default executor turn cap lives in src/core/mode-policy.ts.
  • Watchers are ref-counted so multiple surfaces share one file watcher.
  • Providers do 1.5 s availability probes, not long timeouts.
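
The ref-counted watcher idea from the list above can be sketched as follows: the first subscriber for a directory opens the underlying watcher, later subscribers share it, and the last unsubscribe closes it. `createSharedWatcher`, `OpenWatcher`, and the other names are hypothetical; this is one possible shape under those assumptions, not the repo's implementation.

```typescript
type Listener = (path: string) => void;
type Unsubscribe = () => void;

// Hypothetical factory: opens one underlying watcher and returns its closer.
type OpenWatcher = (dir: string, emit: Listener) => () => void;

interface Subscription {
  readonly listener: Listener;
}

interface Entry {
  readonly subs: Set<Subscription>; // subs.size is the reference count
  readonly close: () => void;
}

function createSharedWatcher(open: OpenWatcher) {
  const entries = new Map<string, Entry>();

  return function watch(dir: string, listener: Listener): Unsubscribe {
    let entry = entries.get(dir);
    if (!entry) {
      // First subscriber: open the real watcher and fan events out to all subs.
      const subs = new Set<Subscription>();
      const close = open(dir, (p) => subs.forEach((s) => s.listener(p)));
      entry = { subs, close };
      entries.set(dir, entry);
    }
    const current = entry;
    const sub: Subscription = { listener };
    current.subs.add(sub);

    let active = true;
    return () => {
      if (!active) return; // calling unsubscribe twice is a no-op
      active = false;
      current.subs.delete(sub);
      if (current.subs.size === 0) {
        // Last subscriber gone: release the underlying watcher.
        current.close();
        entries.delete(dir);
      }
    };
  };
}
```

In real use the `open` callback would presumably wrap a platform watcher such as Node's `fs.watch` and return a function that calls its `close()`. Keeping it injectable means the sharing logic can be tested without touching the file system.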

8. Security posture

  • All paths resolved to realpath + confined to project root.
  • All tool invocations classified by risk × sideEffect and gated.
  • Shell commands classified before execution; critical is hard-blocked.
  • Credentials via src/keychain/ (macOS keychain / libsecret / DPAPI).
  • Prompt injection fenced by src/security/injection.ts.
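
The first rule above (resolve to realpath, then confine to the project root) can be sketched with Node's standard `fs` and `path` modules. The `confine` helper and its decision to reject nonexistent paths are assumptions made for illustration; the code in src/sandbox/ may behave differently. Synchronous calls are used here only for brevity; on the REPL redraw or UI poll paths mentioned in section 7, the async `fs.promises.realpath` would be the better fit.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Resolves `candidate` against `root`, follows symlinks, and returns the real
// path only if it stays inside the real root. Returns null otherwise.
function confine(root: string, candidate: string): string | null {
  const realRoot = fs.realpathSync(root);
  const resolved = path.resolve(realRoot, candidate);

  let real: string;
  try {
    real = fs.realpathSync(resolved);
  } catch {
    // Assumption for this sketch: paths that do not exist are rejected.
    return null;
  }

  // path.relative yields "" for the root itself, a ".."-prefixed path for
  // anything above it, and an absolute path across Windows drives.
  const rel = path.relative(realRoot, real);
  const escapes =
    rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel);
  return escapes ? null : real;
}
```

Comparing real paths rather than the raw input is what defeats both `../` traversal and symlinks that point outside the project.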

9. Useful anchors

10. When uncertain

Read the failing test first. Then read src/core/loop.ts. Then ask. Surfacing "I don't know what this invariant is" is better than guessing and breaking a test six commits later.