AGENTS.md

Project-wide guidance for any code-writing agent (Codex, OpenAI agent runtime, Cursor, Aider, Cline, Continue, …). Claude Code reads this file in addition to CLAUDE.md.

Tooling infrastructure. On top of this file, the repo ships agent-specific wiring so every major coding agent has the same rules, skills, and permissions:

.claude/ — Claude Code settings, path-scoped rules, skills (/verify, /add-tool, /add-provider, /add-agent, /debug-loop, /release-check), subagents (code-reviewer, test-runner, docs-auditor), and safe/deny permissions.

.codex/ — Codex config.toml, execpolicy rules/default.rules, custom agents (forge_explorer, forge_reviewer, forge_test_runner), guardrail hooks, and Codex-native skills.

.cursor/rules/ — Cursor *.mdc rules (always-applied core + path-scoped TypeScript/testing/tools/models/security/UI rules).

.agents/skills/ — portable skills following the open agentskills.io spec. Any compliant agent can read these without vendor-specific wiring.

If you update a workflow (e.g. the verify chain), update it in .agents/skills/verify/SKILL.md first, then mirror the change into .claude/skills/verify/SKILL.md and .codex/skills/verify/SKILL.md so the three copies stay in sync.

Agentic Coding Flywheel. Forge follows a plan-heavy, bead-driven workflow for any non-trivial change. See FLYWHEEL.md for the full methodology. In brief:

Plan space → .flywheel/plans/ (iterated markdown plans)

Bead space → .beads/beads.jsonl (self-contained work units)

Code space → src/ + test/ (implementation)

Skills: /plan, /plan-synthesize, /plan-to-beads, /polish-beads (run 4–6×), /fresh-eyes, /dedupe-beads, /idea-wizard, /deep-review, /reality-check, /landing, /de-slopify. Subagents: bead-polisher, plan-synthesizer, skill-refiner (Claude); forge_bead_polisher, forge_plan_synthesizer (Codex). The 8 canonical operators live in .flywheel/operators/.

Post-compaction ritual. When an agent gets confused, send: "Reread AGENTS.md, CLAUDE.md, and FLYWHEEL.md so they're still fresh in your mind." This is the single most common intervention in the methodology.

This file follows the OpenAI AGENTS.md convention: a flat Markdown cheat-sheet that answers "where am I, what can I run, and what shouldn't I break?".

1. Project identity

Forge is a TypeScript CLI runtime for local-first agentic software engineering. Node 20+. Ships via npm and a multi-arch Docker image.

Entry point: bin/forge.js → dist/cli/index.js
Orchestrator: src/core/orchestrator.ts
Agentic loop: src/core/loop.ts

2. Commands you'll run

npm ci --ignore-scripts
npm run build          # tsc + copy-assets
npm test               # vitest; 249 tests must pass
npm run typecheck
npm run lint
npm run format         # rewrites files in place
npm run format:check   # check only, used in CI
npm run test:coverage
./bin/forge.js doctor

Always end a change with:

npm run format && npm run lint && npm run build && npm test

3. Layout

| Path | Purpose |
| --- | --- |
| `src/cli/` | commander-based CLI, REPL, raw-mode input editor |
| `src/core/` | orchestrator, agentic loop, mode policy, validation gate |
| `src/agents/` | planner, architect, executor, reviewer, debugger, memory |
| `src/models/` | providers (ollama/openai/anthropic/llamacpp/vllm/lmstudio), router, adapter, catalog |
| `src/tools/` | 18 tools |
| `src/permissions/` | risk + interactive permission manager |
| `src/sandbox/` | path-safe fs + command risk classifier |
| `src/persistence/` | tasks/sessions/conversations/events + SQLite |
| `src/memory/` | hot/warm/cold/learning |
| `src/ui/` | HTTP + WS dashboard + static app |
| `src/mcp/` | MCP bridge |
| `test/unit/` | vitest unit tests |
| `docs/` | ARCHITECTURE, INSTALL, SETUP, metrics |
| `.github/workflows/` | ci, release, nightly |
| `docker/` | Dockerfile + compose |

4. Rules

Must

Keep npm test at a 100% pass rate.
Respect the state machine in src/persistence/tasks.ts.
Gate every new tool through requestPermission.
Classify model ids through src/models/local-catalog.ts (don't hand-roll regexes in a new provider).
Add a unit test for any new logic in src/core, src/agents, or src/tools.
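
The permission rule in the list above could be enforced by wrapping each tool so its body cannot run without an approval. This is a sketch under assumptions: the real requestPermission signature is not shown in this file, so `PermissionRequest`, `Decision`, `Risk`, and `gateTool` are hypothetical names used only for illustration.

```typescript
// Hypothetical shapes: the real types live in src/permissions/ and may differ.
type Risk = "low" | "medium" | "high";
type Decision = "allow" | "deny";

interface PermissionRequest {
  tool: string;
  risk: Risk;
  summary: string;
}

// Assumed signature for illustration; check the actual requestPermission first.
type RequestPermission = (req: PermissionRequest) => Promise<Decision>;

type ToolResult<T> = { ok: true; value: T } | { ok: false; error: string };

// Wraps a tool so its body never runs unless the permission check allows it.
function gateTool<A, T>(
  name: string,
  risk: Risk,
  requestPermission: RequestPermission,
  run: (args: A) => Promise<T>,
): (args: A) => Promise<ToolResult<T>> {
  return async (args) => {
    const decision = await requestPermission({
      tool: name,
      risk,
      summary: `${name}(${JSON.stringify(args)})`,
    });
    if (decision !== "allow") {
      return { ok: false, error: `permission denied for ${name}` };
    }
    return { ok: true, value: await run(args) };
  };
}
```

Injecting `requestPermission` as a parameter keeps the wrapper testable with a stub that always allows or always denies, without any interactive prompt.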

Must not

Bypass the permission system.
Introduce network calls in tests — use vi.mock.
Log credentials. Use src/security/redact.ts.
Add dependencies without a clear reason.
Rename exported APIs without updating every caller + the docs.

Good defaults

Prefer readonly / immutable data flow.
Prefer function modules over classes unless state really needs encapsulation (see the provider classes for the accepted shape).
Prefer explicit Result<T, E>-style shapes over thrown errors for expected failures. Throw only for programmer errors.
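
One way to read the Result<T, E> default above, as a minimal sketch: the repo's actual Result type is not shown here, so the `ok`/`err` helpers, `parseTurnCap`, and `ParseError` below are hypothetical and exist only to illustrate the shape.

```typescript
// A discriminated union for expected failures; callers check `ok` before using `value`.
type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };

function ok<T>(value: T): Result<T, never> {
  return { ok: true, value };
}

function err<E>(error: E): Result<never, E> {
  return { ok: false, error };
}

// Hypothetical example: parsing a turn cap from user-supplied text.
type ParseError = { kind: "not-a-number" | "out-of-range"; input: string };

function parseTurnCap(input: string): Result<number, ParseError> {
  const n = Number(input);
  if (input.trim() === "" || !Number.isInteger(n)) {
    return err({ kind: "not-a-number", input });
  }
  if (n < 1) return err({ kind: "out-of-range", input });
  return ok(n);
}

// Programmer errors (a state that should be impossible) still throw.
function assertNever(x: never): never {
  throw new Error(`unreachable: ${JSON.stringify(x)}`);
}

function explain(r: Result<number, ParseError>): string {
  if (r.ok) return `cap=${r.value}`;
  switch (r.error.kind) {
    case "not-a-number":
      return `"${r.error.input}" is not an integer`;
    case "out-of-range":
      return `"${r.error.input}" must be >= 1`;
    default:
      return assertNever(r.error.kind);
  }
}
```

Bad user input is an expected failure, so it comes back as a value the caller has to handle. Reaching the `default` branch would mean the code itself is wrong, so it throws.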

5. Testing

Vitest. Patterns to copy:

Stubbing callModel: see test/unit/executor-loop.test.ts.
Stubbing providers: see test/unit/adapter.test.ts.
Tempdir + cleanup: see test/unit/validation-gate.test.ts.
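
For orientation, a tempdir-plus-cleanup helper can be as small as the sketch below. It is not copied from test/unit/validation-gate.test.ts: `withTempDir`, `demo`, and the `forge-test-` prefix are hypothetical, and only Node's built-in fs, os, and path modules are used.

```typescript
import { existsSync, mkdtempSync, rmSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Runs `fn` against a fresh temporary directory and removes it afterwards,
// even when `fn` throws, so a failing test does not leak files.
async function withTempDir<T>(fn: (dir: string) => Promise<T> | T): Promise<T> {
  const dir = mkdtempSync(join(tmpdir(), "forge-test-"));
  try {
    return await fn(dir);
  } finally {
    rmSync(dir, { recursive: true, force: true });
  }
}

// Example use: the directory exists inside the callback and is gone after it.
async function demo(): Promise<boolean> {
  let seen = "";
  await withTempDir((dir) => {
    seen = dir;
    writeFileSync(join(dir, "fixture.txt"), "hello");
  });
  return existsSync(seen); // false once cleanup has run
}
```

In a vitest file the same create and remove steps can be split across beforeEach and afterEach instead of a wrapper function.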

6. CI quick reference

9 parallel jobs on every PR:

🎨 format (prettier --check)
🧹 lint (eslint)
🧠 typecheck (tsc --noEmit)
🧪 test (matrix: ubuntu + macOS × Node 20 + 22)
📈 coverage
🏗️ build (full npm run build)
🐳 docker-build (catches Dockerfile drift)
🔐 audit (npm audit, informational)
📊 pipeline status (aggregates + fails if any required job failed)

Release (tag v*): 6-stage pipeline — gate, artifacts, docker publish to GHCR, signed manifest + GH release, npm publish (provenance), status.

7. Performance posture

The product runs on personal machines, often alongside Ollama. Keep it lean:

No UI framework (app shell is vanilla JS + CSS, < 100 KB).
No synchronous disk reads on REPL redraw or UI poll paths.
Executor turns are capped by default; the cap lives in src/core/mode-policy.ts.
Watchers are ref-counted so multiple surfaces share one file watcher.
Providers do 1.5 s availability probes, not long timeouts.
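
The ref-counted watcher point above can be pictured with the sketch below. It is an illustration, not Forge's implementation: `createSharedWatchers`, `Opener`, and `Handle` are hypothetical names, and the `open` callback stands in for whatever starts the real file watcher.

```typescript
type Listener = (path: string) => void;
type Unsubscribe = () => void;

interface Handle {
  close(): void;
}

// `open` is injected so the sketch stays independent of any watcher library.
type Opener = (path: string, emit: () => void) => Handle;

function createSharedWatchers(open: Opener) {
  const entries = new Map<string, { handle: Handle; listeners: Set<Listener> }>();

  function subscribe(path: string, listener: Listener): Unsubscribe {
    let entry = entries.get(path);
    if (!entry) {
      // First subscriber for this path: start the one underlying watcher.
      const listeners = new Set<Listener>();
      const handle = open(path, () => listeners.forEach((l) => l(path)));
      entry = { handle, listeners };
      entries.set(path, entry);
    }
    const current = entry;
    // Wrap so the same function subscribed twice still counts as two references.
    const wrapped: Listener = (p) => listener(p);
    current.listeners.add(wrapped);

    let released = false;
    return () => {
      if (released) return; // a double unsubscribe is harmless
      released = true;
      current.listeners.delete(wrapped);
      if (current.listeners.size === 0) {
        // Last subscriber gone: stop the watcher and forget the path.
        current.handle.close();
        entries.delete(path);
      }
    };
  }

  return { subscribe };
}
```

Each surface keeps only its own unsubscribe function, so no caller needs to know whether it is the last one using the watcher.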

8. Security posture

All paths resolved to realpath + confined to project root.
All tool invocations classified by risk × sideEffect and gated.
Shell commands classified before execution; critical is hard-blocked.
Credentials via src/keychain/ (macOS keychain / libsecret / DPAPI).
Prompt injection fenced by src/security/injection.ts.
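
The first rule above (realpath, then confine to the project root) can be sketched with Node's built-in fs and path modules. `confineToRoot` is a hypothetical helper, not the function in src/sandbox/, and it assumes the target already exists because realpath fails on missing paths.

```typescript
import { realpathSync } from "node:fs";
import { isAbsolute, relative, resolve, sep } from "node:path";

// Returns the canonical absolute path if `candidate` stays inside `root`, else null.
function confineToRoot(root: string, candidate: string): string | null {
  const realRoot = realpathSync(root);
  let real: string;
  try {
    // Resolve against the root, then follow symlinks so a link cannot escape.
    real = realpathSync(resolve(realRoot, candidate));
  } catch {
    return null; // missing or unreadable path: treat as not allowed
  }
  const rel = relative(realRoot, real);
  // ".." or "../x" climbs out of the root; an absolute `rel` means another drive.
  const escapes = rel === ".." || rel.startsWith(".." + sep) || isAbsolute(rel);
  return escapes ? null : real;
}
```

Comparing with `relative` after canonicalising both sides avoids the prefix bug where `/project-evil` passes a naive `startsWith("/project")` check.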

9. Useful anchors

Why the agentic loop works: docs/ARCHITECTURE.md §2
State machine diagram: docs/ARCHITECTURE.md §3
Mode caps table: docs/ARCHITECTURE.md §4
Provider routing: docs/ARCHITECTURE.md §6
Dev setup: docs/SETUP.md
Install: docs/INSTALL.md

10. When uncertain

Read the failing test first. Then read src/core/loop.ts. Then ask. Surfacing "I don't know what this invariant is" is better than guessing and breaking a test six commits later.
