A local-first, plan-first, multi-agent, and programmable software-engineering runtime.
Not an assistant. A runtime. Forge brings its own scheduler, sandbox, permission system, state machine, agentic loop, memory layers, and plugin ecosystem. You pick the model. You approve the actions. Everything is inspectable, replayable, and yours.
Install · Dev setup · Architecture · Releases & versioning · Demo walkthrough · Wiki Page · NPM Package · License
- At a glance
- Why Forge
- Quick start
- The agentic loop (with diagrams)
- Task state machine
- Executor — iterative tool-use loop
- Memory layers
- Provider routing & auto-adaptation
- Safety model
- Modes
- CLI reference
- Filesystem layout
- Skills · Instructions · MCP
- Run in a container
- CI/CD pipeline
- Architecture map
- Development
- License
Forge is a local-first, plan-first, multi-agent, and programmable software-engineering runtime. Unlike Claude Code or OpenAI Codex, Forge is local-first infrastructure, not a hosted assistant. It brings its own scheduler, sandbox, permission system, state machine, agentic loop, memory layers, and plugin ecosystem. You pick & host the model. You approve the actions. Everything is inspectable, replayable, and yours.
| Metric | Value | Reproducer |
|---|---|---|
| `forge doctor` cold-start | 173 ms | `time node bin/forge.js doctor --no-banner` |
| `forge --help` cold-start | 238 ms | `time node bin/forge.js --help` |
| UI shell · zero CDN | 89 KB uncompressed | `wc -c src/ui/public/app.js` |
| Provider probe timeout | 1.5 s | `src/models/openai.ts#isAvailable` |
| Model providers (auto-detected) | 6 | ollama · lmstudio · vllm · llama.cpp · openai-compat · anthropic |
| Model families classified | 41 | Llama / Qwen / DeepSeek / Gemma / Phi / Mistral / Codestral / … |
| Built-in agents | 6 | planner · architect · executor · reviewer · debugger · memory |
| Tools available to agents | 18 | read · write · edit · grep · glob · run_command · git · web · … |
| CLI subcommands · slash commands | 24 · 55 | `forge --help` · `/help` in REPL |
| Modes | 9 | fast · balanced · heavy · plan · execute · audit · debug · architect · offline-safe |
| Tests | 548 / 97 files · 100% passing · ~5.5 s wall-clock | `npx vitest run` |
| CI jobs · release stages | 9 · 6 | `.github/workflows/` |
| Container image | ~355 MB · multi-arch · non-root · HEALTHCHECK | `docker pull ghcr.io/hoangsonw/forge-agentic-coding-cli:latest` |
Most "AI coding tools" are thin chat wrappers over a cloud API. Forge is engineering infrastructure with first-class:
mindmap
root((Forge))
Local-first
Auto-detect Ollama / LM Studio / vLLM / llama.cpp
Model-family auto-adapt
Offline-safe mode
Agentic
6 role-typed agents
Iterative tool-use executor
Validation gate (typecheck/lint)
Bounded retries + diagnose
Controllable
Default-deny permissions
Path-realpath-confined sandbox
Risk-classified shell
OS-keychain credentials
Inspectable
Tasks JSON Β· Sessions JSONL Β· Events JSONL
Prompt-hashed, replayable
Concurrent-writer-safe
Extensible
Markdown skills
MCP connectors
Pluggable agents + tools
Performant
REPL cold-start 238 ms
UI shell 89 KB Β· zero CDN
Providers probe in 1.5 s
- Local-first. Forge auto-detects Ollama, LM Studio, vLLM, and llama.cpp on their default ports. Cloud (Anthropic / OpenAI / LocalAI / Together / Groq / Azure) is opt-in, not required.
- Agentic but controllable. Every action is classified (risk × side-effect × sensitivity), gated by a permission system, and logged with a reproducible prompt hash.
- Inspectable. Sessions JSONL, tasks JSON, events JSONL. Two processes can edit the same conversation concurrently (POSIX `O_APPEND` + `mkdir` lockfile).
- Mode-driven. 9 explicit modes — each carries enforceable budgets (max executor turns, max validation retries, `allowMutations`, `maxAutoRisk`).
- Extensible. Drop a Markdown file in `~/.forge/skills/`. Add an `Agent`. Wire an MCP connector. No rebuild required.
- Performant. `forge doctor` cold-starts in 173 ms. The UI shell is a single 89 KB JavaScript file with zero CDN dependencies. Providers are probed in parallel with a 1.5 s timeout.
- Open source. MIT license. No telemetry, no phoning home, no hidden backdoors. You get the whole stack. Unlike hosted assistants, Forge is fully inspectable, replayable, and yours.
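The concurrent-writer guarantee rests on two classic POSIX facts: `mkdir` is atomic, and same-filesystem appends opened with `O_APPEND` do not interleave. A minimal sketch in Node — the helper names here are illustrative, not Forge's actual API:

```typescript
import { mkdirSync, rmdirSync, appendFileSync } from "node:fs";

// mkdir() is atomic: the first process to create the lock directory wins,
// everyone else gets EEXIST. This is the portable poor-man's mutex.
function withLock<T>(lockDir: string, fn: () => T): T {
  try {
    mkdirSync(lockDir); // fails if another writer currently holds the lock
  } catch {
    throw new Error("lock held by another process");
  }
  try {
    return fn();
  } finally {
    rmdirSync(lockDir); // release
  }
}

// The "a" flag opens with O_APPEND, so each JSONL line lands as one
// uninterleaved write even with several concurrent appenders.
function appendEvent(file: string, event: object): void {
  appendFileSync(file, JSON.stringify(event) + "\n");
}
```

Plain appends need no lock at all; a `withLock`-style guard is only required for read-modify-write operations such as compaction.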
Tip
Unlike Claude Code or OpenAI Codex, Forge is not a hosted assistant. It's local-first infrastructure. You pick & host the model. You approve the actions. Everything is inspectable, replayable, and yours.
# Option 1 β npm (global):
npm install -g @hoangsonw/forge
forge doctor           # green checks + role→model mapping
forge run "explain this repo"
# Option 2 β Docker:
docker run --rm -it \
-v forge-home:/data -v "$PWD:/workspace" \
ghcr.io/hoangsonw/forge-agentic-coding-cli:latest forge run "explain this repo"
# Option 3 β full stack (forge + ollama + dashboard):
docker compose -f docker/docker-compose.yml up -d
# open http://127.0.0.1:7823

| Requirement | Minimum | Notes |
|---|---|---|
| Node.js | ≥ 20 (22 tested) | Enforced via `package.json#engines`. Not needed if you use Docker. |
| OS | macOS · Linux · Windows (WSL recommended) | better-sqlite3 ships prebuilds for darwin-x64, darwin-arm64, linux-x64, linux-arm64, win32-x64 — no compile step. |
| Disk | ~150 MB for node_modules; state under `~/.forge` grows with history | Override via `FORGE_HOME`. |
| RAM | Forge ~100 MB; your local model consumes its own RAM/VRAM | `forge doctor` cold-starts in ~170 ms. |
| Docker (alt path) | ≥ 25 | Multi-arch (amd64, arm64) image on GHCR. Zero host Node needed. |
| At least one model source | Ollama · LM Studio · vLLM · llama.cpp · Anthropic · OpenAI-compatible | `forge doctor` tells you which are reachable. |
Runtime npm dependencies (13, zero optional): @modelcontextprotocol/sdk, better-sqlite3 (native, prebuilt), chalk, cli-table3, commander, dotenv, ora, prompts, semver, undici, ws, yaml, zod. No Python, Rust, or Go toolchain.
Recommended (not required): ripgrep (fast grep tool path), git (diff/status tools + project-root detection), $EDITOR (used when you pick "Edit" on a plan).
See docs/INSTALL.md for per-OS notes and docs/SETUP.md for contributor setup.
Three surfaces, one runtime.
REPL (Interactive Terminal) Mode
REPL.mp4
CLI (Headless, One-shot run) Mode
CLI.mp4
Web UI Dashboard
UI.mp4
Every non-trivial task flows through the same pipeline. Nothing escapes it — no hidden shortcut, no "just this once" bypass.
flowchart LR
classDef step fill:#0f172a,stroke:#38bdf8,color:#f1f5f9,rx:4,ry:4
classDef gate fill:#1e1b4b,stroke:#a78bfa,color:#ede9fe,rx:4,ry:4
classDef term fill:#14532d,stroke:#10b981,color:#d1fae5,rx:4,ry:4
classDef fail fill:#450a0a,stroke:#f87171,color:#fee2e2,rx:4,ry:4
IN([user prompt]):::step --> CLASSIFY[classify]:::step
CLASSIFY --> PLAN[plan · DAG]:::step
PLAN --> VALID{valid plan?}:::gate
VALID -->|no| FIX[auto-fix]:::step --> VALID
VALID -->|yes| APPROVE{user approves?}:::gate
APPROVE -->|edit| PLAN
APPROVE -->|cancel| CANCEL([cancelled]):::fail
APPROVE -->|yes| EXEC[execute]:::step
EXEC --> STEP[next step]:::step
STEP --> TOOLS[iterative tool use]:::step
TOOLS --> VGATE{validation gate?}:::gate
VGATE -->|fail + budget| TOOLS
VGATE -->|fail + exhausted| RETRY{retries?}:::gate
VGATE -->|ok| DONE{more steps?}:::gate
RETRY -->|yes| STEP
RETRY -->|no| DIAG[diagnose]:::step --> FAIL([failed]):::fail
DONE -->|yes| STEP
DONE -->|no| VERIFY[reviewer]:::step
VERIFY --> VSUM{approves?}:::gate
VSUM -->|no| STEP
VSUM -->|yes| COMP([completed]):::term
Source: src/core/loop.ts. Retry cap is 3, then the
debugger agent diagnoses before the task is marked failed.
forge run "fix the failing login test" --mode heavy
→ classified: bugfix · complexity=moderate · risk=low
→ plan: 4 steps (analyze → locate → patch → run_tests)
→ approve? [y/n/edit]
→ executor: turn 1 → read_file src/auth/login.ts
  turn 2 → grep "issuedAt" in src
  turn 3 → apply_patch src/auth/login.ts
  turn 4 → run_command "npm test -- auth.login"
→ validate: typecheck ✓ lint ✓
→ reviewer: approved
→ ✓ Done. Files changed: src/auth/login.ts
Every task lives in exactly one of 10 statuses. Transitions are
enforced by LEGAL_TRANSITIONS β illegal moves throw state_invalid
with the legal-next list in recoveryHint.
stateDiagram-v2
[*] --> draft
draft --> planned: planner output
draft --> cancelled
planned --> approved: user approves
planned --> cancelled
planned --> blocked
approved --> scheduled
approved --> cancelled
scheduled --> running
scheduled --> cancelled
scheduled --> blocked
running --> verifying
running --> failed
running --> blocked
running --> cancelled
verifying --> completed
verifying --> failed
verifying --> running: reviewer bounces
completed --> draft: forge resume
failed --> draft: forge resume
blocked --> draft: forge resume
blocked --> cancelled
cancelled --> draft: forge resume
completed --> [*]
failed --> [*]
cancelled --> [*]
Source: src/persistence/tasks.ts#LEGAL_TRANSITIONS.
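The transition map can be encoded as a plain lookup table. The sketch below is reconstructed from the state diagram above, not copied from the actual `src/persistence/tasks.ts` source:

```typescript
type Status =
  | "draft" | "planned" | "approved" | "scheduled" | "running"
  | "verifying" | "completed" | "failed" | "blocked" | "cancelled";

// One entry per status; transitions mirror the diagram above.
const LEGAL_TRANSITIONS: Record<Status, Status[]> = {
  draft: ["planned", "cancelled"],
  planned: ["approved", "cancelled", "blocked"],
  approved: ["scheduled", "cancelled"],
  scheduled: ["running", "cancelled", "blocked"],
  running: ["verifying", "failed", "blocked", "cancelled"],
  verifying: ["completed", "failed", "running"],
  completed: ["draft"],
  failed: ["draft"],
  blocked: ["draft", "cancelled"],
  cancelled: ["draft"],
};

// Illegal moves throw with a recoveryHint naming the legal next states.
function transition(from: Status, to: Status): Status {
  if (!LEGAL_TRANSITIONS[from].includes(to)) {
    const err = new Error(`state_invalid: ${from} -> ${to}`) as Error & {
      recoveryHint?: string;
    };
    err.recoveryHint = `legal next states: ${LEGAL_TRANSITIONS[from].join(", ")}`;
    throw err;
  }
  return to;
}
```

Keeping transitions as data (rather than scattered `if`s) is what makes the "illegal moves throw" guarantee auditable at a glance.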
Each plan step runs a bounded model↔tool conversation, not a one-shot call. The model sees every tool result and can adapt within the same step — retry with different args, switch tools, or signal done.
sequenceDiagram
autonumber
participant L as loop.ts
participant E as executor.ts
participant M as model
participant T as tool
participant V as validator
L->>E: runStep(step)
loop up to maxExecutorTurns (mode-capped)
E->>M: prompt + schema (JSON-mode)
M-->>E: { actions[], summary, done? }
alt done && no failures
E-->>L: completed
else has actions
E->>T: execute each action
T-->>E: stdout / stderr / exitCode / error
E->>E: digest + append user turn
end
end
opt step wrote files & mode enables gate
loop up to maxValidationRetries
E->>V: typecheck / lint / tsc
alt passes
E-->>L: completed
else fails
E->>M: VALIDATION_FAILED · <output>
M-->>E: corrective actions
E->>T: execute
end
end
end
E-->>L: { toolResults, summary, filesChanged, completed }
Mode caps — read directly from src/core/mode-policy.ts:
| Mode | maxExecutorTurns | maxValidationRetries | allowMutations | maxAutoRisk |
|---|---|---|---|---|
| fast | 2 | 0 | ✅ | low |
| balanced | 4 | 1 | ✅ | medium |
| heavy | 8 | 2 | ✅ | high |
| plan | 0–1 | 0 | ❌ | low |
| execute | 4 | 1 | ✅ | medium |
| audit | 3 | 0 | ❌ | low |
| debug | 6 | 2 | ✅ | medium |
| architect | 3 | 1 | ✅ | medium |
| offline-safe | 3 | 1 | ✅ | medium |
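Because the caps are data, enforcement is a loop bound rather than a prompt instruction. A simplified sketch — synchronous for brevity (the real executor is async), and the shape is an assumption, not the actual `src/core/mode-policy.ts` API:

```typescript
interface ModePolicy {
  maxExecutorTurns: number;
  maxValidationRetries: number;
  allowMutations: boolean;
  maxAutoRisk: "low" | "medium" | "high";
}

// A few rows from the table above, as data.
const POLICIES: Record<string, ModePolicy> = {
  fast:     { maxExecutorTurns: 2, maxValidationRetries: 0, allowMutations: true,  maxAutoRisk: "low" },
  balanced: { maxExecutorTurns: 4, maxValidationRetries: 1, allowMutations: true,  maxAutoRisk: "medium" },
  plan:     { maxExecutorTurns: 1, maxValidationRetries: 0, allowMutations: false, maxAutoRisk: "low" },
};

// The budget is checked before each turn, so an over-eager model can never
// run past the cap no matter what it emits.
function runStep(mode: string, turn: (n: number) => { done: boolean }): boolean {
  const policy = POLICIES[mode] ?? POLICIES.balanced;
  for (let n = 1; n <= policy.maxExecutorTurns; n++) {
    if (turn(n).done) return true; // model signalled completion within budget
  }
  return false; // budget exhausted: caller escalates to retry / diagnose
}
```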
Four tiers with distinct retention and access cost:
flowchart TB
classDef hot fill:#450a0a,stroke:#f87171,color:#fee2e2,rx:4,ry:4
classDef warm fill:#451a03,stroke:#fb923c,color:#ffedd5,rx:4,ry:4
classDef cold fill:#0c4a6e,stroke:#38bdf8,color:#e0f2fe,rx:4,ry:4
classDef learn fill:#14532d,stroke:#10b981,color:#d1fae5,rx:4,ry:4
Q[retrieve.ts · query] --> H["Hot<br/>current-session facts<br/>src/memory/hot.ts"]:::hot
Q --> W["Warm<br/>recent tasks · SQLite<br/>src/memory/warm.ts"]:::warm
Q --> C["Cold<br/>project files · grep · AST<br/>src/memory/cold.ts"]:::cold
Q --> L["Learning<br/>patterns + confidence<br/>src/memory/learning.ts"]:::learn
H -.clear on task end.-> X([evict])
W -.age out after N days.-> X
L -.decay if unreinforced.-> L
- Hot — in-process per-task facts, cleared at task end.
- Warm — SQLite index of recent task metadata; powers "what was I doing yesterday" queries.
- Cold — lazy file/grep/AST index scoped to `projectRoot`. No background indexer; populated on demand.
- Learning — patterns keyed by `intent:scope` with confidence that evolves on success/failure. The planner reads the top-K patterns before producing every plan (see `src/agents/planner.ts#learnedPatternBlock`).
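The Learning tier's confidence mechanics can be illustrated with a toy update rule. Only the `intent:scope` keying and top-K retrieval come from the text above; the bounded additive reinforcement below is an assumption for illustration:

```typescript
interface Pattern {
  key: string;        // e.g. "bugfix:src/auth" — intent:scope
  advice: string;
  confidence: number; // kept in [0, 1]
}

// Asymmetric update: failures cost more than successes earn, so a pattern
// that stops working decays quickly. (Illustrative rule, not Forge's.)
function reinforce(p: Pattern, succeeded: boolean): Pattern {
  const delta = succeeded ? 0.1 : -0.2;
  return { ...p, confidence: Math.min(1, Math.max(0, p.confidence + delta)) };
}

// What a planner would read before producing a plan: the K most trusted patterns.
function topK(patterns: Pattern[], k: number): Pattern[] {
  return [...patterns].sort((a, b) => b.confidence - a.confidence).slice(0, k);
}
```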
flowchart LR
classDef local fill:#0c4a6e,stroke:#38bdf8,color:#e0f2fe,rx:4,ry:4
classDef hosted fill:#3f1d5c,stroke:#a78bfa,color:#ede9fe,rx:4,ry:4
classDef route fill:#1e293b,stroke:#f1f5f9,color:#f1f5f9,rx:4,ry:4
ROUTER[router.ts Β· resolveModel]:::route
ADAPT[adapter.ts Β· resolveLocalModel]:::route
CB[circuit-breaker]:::route
RL[rate-limit]:::route
CACHE[prompt cache]:::route
COST[USD cost ledger]:::route
subgraph LOCAL[Local runtimes · auto-detected]
OLL["ollama<br/>:11434"]:::local
LMS["lmstudio<br/>:1234"]:::local
VLL["vllm<br/>:8000"]:::local
LCP["llamacpp<br/>:8080"]:::local
end
subgraph HOSTED[Hosted Β· opt-in]
ANT["anthropic"]:::hosted
OAI["openai-compat<br/>(OpenAI / Azure / LocalAI / Together / Groq / Fireworks)"]:::hosted
end
ROUTER --> ADAPT --> OLL & LMS & VLL & LCP
ROUTER --> ANT & OAI
ROUTER --> CB & RL & CACHE & COST
If your configured model isn't pulled on the provider, Forge picks the
best-fit installed model for each role via
src/models/local-catalog.ts +
src/models/adapter.ts. Cached per process,
warns once, never refuses to route.
| Runtime | Default endpoint | Override |
|---|---|---|
| Ollama | `http://127.0.0.1:11434` | `OLLAMA_ENDPOINT` |
| LM Studio | `http://127.0.0.1:1234/v1` | `LMSTUDIO_ENDPOINT` |
| vLLM | `http://127.0.0.1:8000/v1` | `VLLM_ENDPOINT` |
| llama.cpp server | `http://127.0.0.1:8080/v1` | `LLAMACPP_ENDPOINT` |
| OpenAI-compatible | env-configured | `OPENAI_BASE_URL` + `OPENAI_API_KEY` |
| Anthropic | hosted | `ANTHROPIC_API_KEY` |
| Role | Families preferred |
|---|---|
| architect / reviewer / debugger | Llama 3.x / 4.x, Mixtral, Command-R+, DeepSeek V3/R1, Mistral-Large |
| planner | Qwen 2.5/3, Llama 3.x, DeepSeek V3, Gemma 3, Mistral-Nemo, Command-R, Phi 4 |
| executor (code specialists) | DeepSeek-Coder, Qwen 2.5-Coder, CodeLlama, Codestral, StarCoder, Granite-Code, WizardCoder |
| fast | Phi 3/4, Gemma 2, TinyLlama, SmolLM, MiniCPM |
Unknown models are accepted too — Forge rates them as generic executors rather than refusing to route.
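Best-fit routing when the configured model isn't pulled might look like the following sketch. The family-preference table and function names are illustrative assumptions, not the real `src/models/local-catalog.ts`:

```typescript
type Role = "planner" | "code" | "fast";

// Ordered preference per role; hypothetical subset of the family table above.
const FAMILY_PREFERENCES: Record<Role, string[]> = {
  planner: ["qwen2.5", "llama3", "gemma"],
  code: ["deepseek-coder", "qwen2.5-coder", "codellama"],
  fast: ["phi", "gemma", "tinyllama"],
};

function resolveLocalModel(role: Role, configured: string, installed: string[]): string {
  if (installed.includes(configured)) return configured; // configured model wins
  for (const family of FAMILY_PREFERENCES[role]) {
    const hit = installed.find((m) => m.startsWith(family));
    if (hit) return hit; // best-fit by family preference order
  }
  return installed[0]; // unknown models still route as generic executors
}
```

Caching this per process and warning once (as the text above describes) keeps the fallback silent on the hot path while still surfacing the mismatch.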
The agentic loop is cheap for the runtime but expensive for the model. Every step is a multi-turn tool-use conversation that returns strict JSON. Small models struggle with this in recognisable ways — please pick the right tool for the job.
| Work you want to do | Safe local floor | What fails below the floor |
|---|---|---|
| Pure chat ("explain closures") | any 3B instruct (phi-3:mini, gemma-3:2b) | fine β conversation fast-path bypasses tool use entirely |
| Summarize a file, explain a snippet | 7B instruct (qwen2.5:7b, llama3.1:8b) | summary is a line of "I read the file" instead of content |
| Single-file edits / small features | 7B+ code specialist (deepseek-coder:6.7b, qwen2.5-coder:7b) | picks wrong tool (run_command to write files), splits "create empty + edit" patterns, escalates to ask_user on tool errors |
| Multi-file refactors, new features | 14B+ code specialist or a hosted frontier model | plan quality drops; step IDs get inconsistent; validation retries exhausted |
| Architecture-level changes | hosted (Claude Opus/Sonnet, GPT-4 class) realistically | budgets blow out; changes go off-plan |
Forge ships with defences so a small model fails loudly instead of
silently corrupting files: the executor prompt spells out step-type →
tool mappings, ask_user rejects empty/too-short questions as
non-retryable, edit_file handles "create empty then fill" gracefully,
parent directories auto-create, provider warm-up is explicit, and the
router streams prose without jsonMode for narrator/conversation
paths. The result is that a small model will often tell you it can't
finish a task; it will rarely write the wrong code into a file.
If in doubt: configure a code specialist for the code role, keep
something lighter for fast, and set ANTHROPIC_API_KEY or
OPENAI_API_KEY as a fallback — the router uses the hosted provider
automatically when the local one fails or trips its circuit breaker.
forge config set models.code deepseek-coder:6.7b
forge config set models.planner qwen2.5:7b
forge config set models.fast phi3:mini
export ANTHROPIC_API_KEY=sk-…     # optional fallback

Forge treats safety as load-bearing. These invariants are enforced in code, not convention:
flowchart TB
classDef ask fill:#1e1b4b,stroke:#a78bfa,color:#ede9fe,rx:4,ry:4
classDef allow fill:#14532d,stroke:#10b981,color:#d1fae5,rx:4,ry:4
classDef deny fill:#450a0a,stroke:#f87171,color:#fee2e2,rx:4,ry:4
REQ[tool invocation] --> CLASSIFY[classify risk × sideEffect × sensitivity]
CLASSIFY --> SANDBOX{path in sandbox? / cmd allow-listed?}
SANDBOX -->|no| BLOCK[hard-block Β· sandbox_violation]:::deny
SANDBOX -->|yes| GATE{risk × sideEffect}
GATE -->|low · read| AUTO[auto-allow]:::allow
GATE -->|med · write| ASK[ask user]:::ask
GATE -->|high · execute / network| STRICT[ask even with --skip-permissions]:::ask
ASK --> FLAGS{session flags?}
FLAGS -->|--allow-shell / --allow-files etc.| AUTO
FLAGS -->|--non-interactive| DENY[deny silently]:::deny
FLAGS -->|else| PROMPT[interactive prompt]
PROMPT -->|allow| AUTO
PROMPT -->|deny| DENY
AUTO --> EXEC[execute] --> TRUST[trust calibration<br/>auto-allow after N confirmations<br/>src/permissions/manager.ts]
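Collapsed to a decision function, the gate above might read as follows. The type names and exact ordering are assumptions modeled on the diagram, not the real `src/permissions/manager.ts`:

```typescript
type Risk = "low" | "medium" | "high";
type SideEffect = "read" | "write" | "execute" | "network";
type Decision = "auto-allow" | "ask" | "ask-always";

function gate(risk: Risk, sideEffect: SideEffect, skipPermissions: boolean): Decision {
  // High-risk or execute/network actions always prompt,
  // even when the user passed --skip-permissions.
  if (risk === "high" || sideEffect === "execute" || sideEffect === "network") {
    return "ask-always";
  }
  if (risk === "low" && sideEffect === "read") return "auto-allow";
  // Medium ground (e.g. writes): default deny unless the session opted in.
  return skipPermissions ? "auto-allow" : "ask";
}
```

The point of a pure function here is testability: every (risk, sideEffect, flag) combination can be enumerated in a unit test, which is what makes "default deny" an invariant rather than a habit.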
| Invariant | Where |
|---|---|
| Instruction precedence: System Safety > Page Rules > Mode Rules > Approved Plan > Project Defaults > User Preferences | `src/prompts/assembler.ts` |
| Permission model = default deny | `src/permissions/manager.ts` |
| `--skip-permissions` skips routine prompts only; critical/destructive still ask | `src/permissions/risk.ts` |
| Retry cap = 3, then debugger escalates | `src/core/loop.ts` |
| Hard limits: maxSteps=50 · maxToolCalls=100 · maxRuntimeSeconds=600 | `src/config/schema.ts` |
| Untrusted content (web / MCP / retrieved) fenced as data, never instructions | `src/security/injection.ts` |
| Secrets redacted before every log, session entry, and prompt | `src/security/redact.ts` |
| Scoped filesystem sandbox; symlink-escape-proof via realpath | `src/sandbox/fs.ts` |
| Destructive shell commands blocked (`rm -rf /`, `sudo`, fork bombs, curl-to-shell) | `src/sandbox/shell.ts` |
| Credentials in OS keychain (macOS / libsecret / DPAPI) + AES-GCM fallback | `src/keychain/` |
| Release artefacts: SHA-256 + Ed25519 signature verification | `src/release/` |
flowchart LR
classDef ro fill:#1e293b,stroke:#64748b,color:#cbd5e1,rx:4,ry:4
classDef rw fill:#0c4a6e,stroke:#38bdf8,color:#e0f2fe,rx:4,ry:4
classDef big fill:#3f1d5c,stroke:#a78bfa,color:#ede9fe,rx:4,ry:4
FAST[fast · 2 turns]:::rw
BAL[balanced · 4 turns · default]:::rw
HEAVY[heavy · 8 turns · 2 validate retries]:::big
PLAN[plan · 0 turns · no mutations]:::ro
EXEC[execute · 4 turns]:::rw
AUDIT[audit · 3 turns · no mutations]:::ro
DEBUG[debug · 6 turns · 2 validate retries]:::rw
ARCH[architect · 3 turns]:::big
OFFLINE[offline-safe · 3 turns · never hosted]:::rw
Each mode is an enforceable budget — not a hint to the model. See
src/core/mode-policy.ts.
▶ See each surface in action in DEMO.md — REPL walkthrough,
`forge run` one-shots, and the web dashboard.
24 subcommands. Full surface:
forge # REPL (default)
forge init # create ~/.forge + project .forge
forge run "<prompt>" # full agentic loop
forge plan "<prompt>" # plan-only
forge execute "<prompt>" # auto-approve + execute
forge resume [taskId] # resume any prior task (any status)
forge status # runtime state
forge doctor # health check + role→model mapping
forge task list|search|delete # task history (SQLite-indexed); delete prompts (or -y)
forge session list|replay <id> # session JSONL inspection
forge model list # probe all providers
forge config get|set|path # configuration
forge mcp list|add|remove # MCP connections
forge skills list|new # skill management
forge agents list # custom agents
forge permissions reset|list # permission grants
forge daemon start|stop|status # optional background process
forge memory {hot|warm|cold} # memory inspection
forge cost # USD spend ledger
forge ui start # local dashboard at :7823
forge bundle {pack|unpack} # offline bundles
forge container up|down # compose wrapper
forge update [--check|--force] # self-update (REPL also checks on start, cache-gated)
forge migrate # DB migrations
forge changelog # local changelog view
forge dev # dev helpers
forge web {search|fetch} # web tools
forge spec {new|show|diff} # spec-driven development
--mode <m> fast|balanced|heavy|plan|execute|audit|debug|architect|offline-safe
--yes auto-approve plan
--skip-permissions skip routine prompts (high-risk still asked)
--allow-files pre-approve file writes for this session
--allow-shell pre-approve shell for this session
--allow-network pre-approve network tools
--allow-web pre-approve web search/fetch/browse
--allow-mcp pre-approve MCP tool calls
--strict confirm every action
--non-interactive deny all prompts silently (CI mode)
--deterministic fixed temperatures for reproducibility
--trace full trace (implies --debug)
--no-banner omit startup banner
flowchart TB
classDef g fill:#18181b,stroke:#f59e0b,color:#fef3c7,rx:4,ry:4
classDef p fill:#0c4a6e,stroke:#38bdf8,color:#e0f2fe,rx:4,ry:4
subgraph GLOBAL["~/.forge (global)"]
G1["config.json"]:::g
G2["instructions.md"]:::g
G3["skills/*.md"]:::g
G4["agents/*.md"]:::g
G5["mcp/*"]:::g
G6["models/"]:::g
G7["logs/forge.log"]:::g
G8["global/index.db — SQLite"]:::g
G9["projects/<hash>/tasks · sessions · events"]:::g
end
subgraph PROJECT["./.forge (per-project)"]
P1["config.json"]:::p
P2["instructions.md"]:::p
P3["skills/ (override global)"]:::p
P4["agents/"]:::p
P5["mcp/"]:::p
end
Paths resolved via src/config/xdg.ts — respects
XDG_* env vars on Linux.
---
name: conventional-commit
description: Enforce Conventional Commits in every commit message.
triggers: [commit, git]
---
When writing commit messages, use Conventional Commits:
feat(scope): …
fix(scope): …
refactor(scope): …

Drop into `~/.forge/skills/` (global) or `./.forge/skills/` (project).
Project skills override global.
Both ~/.forge/instructions.md and ./.forge/instructions.md are
layered into every prompt via src/prompts/assembler.ts.
Precedence is: System Safety > Page > Mode > Plan > Project > User.
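Layered assembly with that precedence can be sketched as ordered concatenation, highest-precedence layer first so weaker layers cannot override it. The layer names follow the precedence list above; the function shape is an assumption, not the actual `src/prompts/assembler.ts`:

```typescript
// Precedence order: System Safety > Page > Mode > Plan > Project > User.
const LAYER_ORDER = ["system-safety", "page", "mode", "plan", "project", "user"] as const;
type Layer = (typeof LAYER_ORDER)[number];

function assemble(layers: Partial<Record<Layer, string>>): string {
  // Emit strongest layer first; absent layers are simply skipped.
  return LAYER_ORDER
    .map((name) => layers[name])
    .filter((text): text is string => Boolean(text))
    .join("\n\n");
}
```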
forge mcp list
forge mcp add <name> --transport stdio --command "β¦"
forge mcp add <name> --transport http --url https://… --auth oauth2-pkce
forge mcp status

Both stdio and HTTP-stream transports supported. OAuth 2.0 + PKCE or
API key auth. Tokens stored in the OS keychain.
Single hardened image (non-root, HEALTHCHECK, OCI labels, ~355 MB) that serves both CLI and UI.
▶ Dashboard demo — `forge ui start` driving a full task end-to-end (plan approval, streamed model output, follow-up thread). More in DEMO.md.
# Pull (multi-arch: linux/amd64 + linux/arm64):
docker pull ghcr.io/hoangsonw/forge-agentic-coding-cli:latest
# One-shot CLI:
docker run --rm -it -v forge-home:/data -v "$PWD:/workspace" \
ghcr.io/hoangsonw/forge-agentic-coding-cli:latest forge run "explain this repo"
# Dashboard:
docker run --rm -p 7823:7823 -v forge-home:/data \
ghcr.io/hoangsonw/forge-agentic-coding-cli:latest forge ui start --bind 0.0.0.0
# Full stack (forge + ollama + UI):
docker compose -f docker/docker-compose.yml up -d
# or: podman-compose -f docker/docker-compose.yml up -d

Stack topology:
flowchart LR
classDef c fill:#0c4a6e,stroke:#38bdf8,color:#e0f2fe,rx:4,ry:4
classDef v fill:#18181b,stroke:#f59e0b,color:#fef3c7,rx:4,ry:4
OLLAMA["ollama<br/>:11434 · healthcheck"]:::c
UI["forge-ui<br/>:7823 · healthcheck · restart unless-stopped"]:::c
CORE["forge-core<br/>(on-demand via compose run)"]:::c
FH[forge-home Β· named volume]:::v
OM[ollama-models Β· named volume]:::v
OLLAMA --> OM
UI --> FH
CORE --> FH
UI --> OLLAMA
CORE --> OLLAMA
Full install guide: docs/INSTALL.md.
flowchart LR
classDef pass fill:#14532d,stroke:#10b981,color:#d1fae5,rx:4,ry:4
classDef gate fill:#1e1b4b,stroke:#a78bfa,color:#ede9fe,rx:4,ry:4
PR[PR / push] --> FMT["format"]:::pass
PR --> LINT["lint"]:::pass
PR --> TYPE["typecheck"]:::pass
PR --> TEST["test matrix<br/>Ubuntu + macOS × Node 20 + 22"]:::pass
TEST --> COV["coverage"]:::pass
TYPE --> BUILD["build"]:::pass
BUILD --> DOCKER["docker-build"]:::pass
PR --> AUDIT["audit"]:::pass
FMT & LINT & TYPE & TEST & BUILD & DOCKER & AUDIT & COV --> STATUS["pipeline status<br/>GH step summary · fails if any required job failed"]:::gate
flowchart LR
classDef gate fill:#1e1b4b,stroke:#a78bfa,color:#ede9fe,rx:4,ry:4
classDef ship fill:#451a03,stroke:#fb923c,color:#ffedd5,rx:4,ry:4
TAG[git tag v*] --> GATE["pre-release gate<br/>build + full test suite"]:::gate
GATE --> ART["artifacts<br/>5 tarball targets"]:::ship
GATE --> DOCKP["docker publish<br/>multi-arch → ghcr.io"]:::ship
ART --> MAN["manifest + gh-release<br/>ed25519-signed"]:::ship
MAN --> NPM["npm publish<br/>--provenance --access public"]:::ship
GATE & ART & DOCKP & MAN & NPM --> RSUM["release status"]:::gate
Workflows: .github/workflows/ci.yml,
.github/workflows/release.yml,
.github/workflows/nightly.yml.
Full versioning & release playbook (SemVer policy, channels, signing,
hotfix flow, rollback, built-in updater): RELEASES.md.
flowchart TB
classDef surface fill:#0f172a,stroke:#38bdf8,color:#f1f5f9,rx:6,ry:6
classDef core fill:#082f49,stroke:#38bdf8,color:#e0f2fe,rx:6,ry:6
classDef agent fill:#1e293b,stroke:#a78bfa,color:#ede9fe,rx:6,ry:6
classDef io fill:#0f172a,stroke:#10b981,color:#d1fae5,rx:6,ry:6
classDef store fill:#18181b,stroke:#f59e0b,color:#fef3c7,rx:6,ry:6
subgraph S[User surfaces]
CLI["CLI (commander)"]:::surface
REPL["REPL (raw-mode editor)"]:::surface
UI["Dashboard (HTTP + WS)"]:::surface
end
ORCH["Orchestrator · src/core/orchestrator.ts"]:::core
LOOP["Agentic loop · src/core/loop.ts"]:::core
CLS["Classifier"]:::core
subgraph A[Agents · src/agents]
PL[planner]:::agent
AR[architect]:::agent
EX[executor]:::agent
RV[reviewer]:::agent
DB[debugger]:::agent
ME[memory]:::agent
end
subgraph I[I/O surfaces]
TOOLS["18 tools · src/tools"]:::io
MODELS["6 providers · src/models"]:::io
PERM["Permissions"]:::io
SAND["Sandbox (fs + shell)"]:::io
MCP["MCP bridge"]:::io
end
subgraph P[Durable state]
TASKS[tasks/*.json]:::store
SESS[sessions/*.jsonl]:::store
CONV[conversations/*.jsonl]:::store
IDX[SQLite index]:::store
MEM["memory/{hot,warm,cold,learning}"]:::store
end
CLI --> ORCH
REPL --> ORCH
UI --> ORCH
ORCH --> CLS --> LOOP
LOOP --> PL --> EX --> RV
RV --> LOOP
LOOP --> AR & DB & ME
EX --> TOOLS
TOOLS --> PERM & SAND & MCP
PL --> MODELS
EX --> MODELS
LOOP --> TASKS & SESS & CONV & IDX
ME --> MEM
Full map with every subsystem explained: docs/ARCHITECTURE.md.
xychart-beta
title "Executor turns per mode (hard runtime cap)"
x-axis ["plan", "fast", "audit", "architect", "offline-safe", "balanced", "execute", "debug", "heavy"]
y-axis "turns" 0 --> 8
bar [1, 2, 3, 3, 3, 4, 4, 6, 8]
git clone https://github.com/hoangsonww/Forge-Agentic-Coding-CLI && cd Forge-Agentic-Coding-CLI
npm install
npm run build # tsc + copy-assets
npm test # 548 tests across 97 files; all must pass
./bin/forge.js doctor

| Task | Command |
|---|---|
| Build | npm run build |
| Watch | npm run build:watch |
| Tests | npm test |
| One test file | npx vitest run test/unit/<file>.test.ts |
| Coverage | npm run test:coverage |
| Typecheck | npm run typecheck |
| Lint / format | npm run lint · npm run format · npm run format:check |
| Metrics | bash scripts/metrics.sh |
| Docker | docker build -f docker/Dockerfile -t forge/core:dev . |
| REPL | ./bin/forge.js |
| Dashboard | ./bin/forge.js ui start |
Full guide: docs/SETUP.md.
| Target | Measured | How |
|---|---|---|
| `forge --help` cold-start | 238 ms | `time node bin/forge.js --help` |
| `forge doctor` cold-start | 173 ms | `time node bin/forge.js doctor --no-banner` |
| UI `app.js` uncompressed | 89 KB | `wc -c src/ui/public/app.js` |
| Landing `index.html` | 25 KB, self-contained, zero CDN | `wc -c index.html` |
| Full test suite | ~3.3 s wall-clock | `npx vitest run` |
| Container image | ~355 MB multi-arch non-root | `docker images` |
If you're a code-writing agent (Claude Code, Codex, Cursor, Aider, Cline, Continue, β¦) working in this repo, start here:
- `CLAUDE.md` — Claude Code / Claude-family context
- `AGENTS.md` — the OpenAI `AGENTS.md` convention (used by Codex and most others)
Both files carry: canonical commands, hot paths, conventions, performance posture, security posture, and pre-completion checklist.
MIT. See LICENSE for more details.
Son Nguyen · sonnguyenhoang.com · github.com/hoangsonww
Thank you for checking out Forge! If you have any questions, feedback, or want to contribute, please open an issue or a pull request.