Codebase Map

Last updated: 2026-02-26 · Doc version: 2.0.0

Read this first. Every file, class, and key function — one reference.

Package: `closed_claw/`

`init.py`

Empty package marker.

`cli.py`

Purpose: argparse CLI entrypoints for every user-facing command.

Symbol	Type	What it does
`cmd_init`	function	Creates DB, initializes graph wiring
`cmd_doctor`	function	Validates env: provider config, API key presence, sqlite/langgraph health
`cmd_run`	function	Runs a single task through the full coordinator graph
`cmd_agents`	function	Lists registered agents with manifest details
`cmd_agent`	function	Shows one agent in full detail (manifest + skill.md + memory count)
`cmd_runs`	function	Lists run history from DB
`cmd_audit`	function	Lists audit events from DB
`cmd_runlog`	function	Streams JSONL events for a specific run
`cmd_tools`	function	Lists supported tools; optionally shows one agent's allowlist
`cmd_delete_agent`	function	Removes a single agent from registry + capsule dir
`cmd_delete_all_agents`	function	Removes all agents from registry + capsule dirs
`cmd_setup`	function	Delegates to interactive `run_setup_wizard`
`cmd_cancel_run`	function	Sets a cancellation flag in the run log to stop a running task
`cmd_web`	function	Launches FastAPI web dashboard on `http://127.0.0.1:7860`
`_build_parser`	function	Constructs the argparse tree; register new commands here
`_migrate_legacy_agents`	function	Auto-migrates agent entrypoints to v14 on `init`/`run`
`cmd_rewrite_entrypoints`	function	Overwrites every capsule's `entrypoint.py` with the current shim template (use after bumping `CLOSED_CLAW_ENTRYPOINT_VERSION`)
`_sync_registry_index`	function	Syncs `agents/registry.json` from DB state
`main`	function	CLI entry (`python -m closed_claw.cli`) — opens menu if no subcommand

Subcommands: setup, init, run, agents, agent, runs, audit, runlog, cancel-run, doctor, tools, delete-agent, delete-all-agents, menu, web

How to add a CLI command: add cmd_<name>(args: Namespace) -> int then register a subparser in _build_parser().

`compat.py`

Purpose: Thin Pydantic v1/v2 compatibility shim.

Symbol	What it does
`BaseModel`	Re-export of `pydantic.BaseModel`
`Field`	Re-export of `pydantic.Field`

Always import BaseModel/Field from here, not directly from pydantic.

`config.py`

Purpose: Environment-backed runtime Settings loading.

Symbol	Type	What it does
`Settings`	dataclass	All runtime config as typed fields; source of truth for all paths
`Settings.from_env()`	classmethod	Reads `.env` + `os.environ`; call this at CLI entry, never at module init
`_load_dotenv`	function	Parses `.env` file to dict (no dependencies)
`_getenv`	function	Reads from `os.environ` first, then dotenv, then falls back to default

Key fields in Settings (34 total):

db_path — path to registry.db
agents_dir — root of all agent capsule dirs
run_logs_dir — where JSONL run logs go
llm_provider — siemens | openai | gemini | claude (default: siemens)
llm_model — provider-specific model string (default: qwen3-30b-a3b-instruct-2507 for siemens)
llm_timeout_sec — LLM call timeout in seconds
create_approval_mode / api_approval_mode — interactive | approve | deny | web
low_confidence_threshold — float; below this triggers create/reuse gate
siemens_api_key / openai_api_key / gemini_api_key / anthropic_api_key — provider keys
siemens_base_url / openai_base_url / gemini_base_url / anthropic_base_url — provider base URLs
extra_allowed_paths — filesystem sandbox roots for tool execution
subtask_max_attempts (default 2) — retry count for failed subtasks
max_tool_calls_per_agent (default 50) — intent count limit per agent run
max_agents_per_run (default 10) — cap on agents created/used in one run
max_subtasks_per_phase (default 4) — cap on subtasks per discovery/execution phase

`interactive.py`

Purpose: Rich-powered terminal interactive menu.

Symbol	Type	What it does
`run_main_menu`	function	Infinite loop rendering numbered options (setup/init/doctor/run/inspect/delete)

Called by main() in cli.py when no subcommand is given.

`setup_wizard.py`

Purpose: Guided provider/key setup with live verification before writing .env.

Symbol	Type	What it does
`run_setup_wizard`	function	Prompts for provider → model → API key → verifies → writes `.env`

Package: `closed_claw/agents/`

`factory.py`

Purpose: Creates agent capsule directories and manages the registry.json flat index.

Symbol	Type	What it does
`ENTRYPOINT_TEMPLATE`	str constant	Template for `entrypoint.py` (currently v14). Thin shim that `import`s and calls `closed_claw.runtime.agent_loop.main`. v14 replaced the v13 per-capsule loop body with a shared module so fixes land in every capsule via `cli rewrite-entrypoints` (no per-capsule code edit). Loop config (`agent_loop_max_steps`, `agent_loop_max_consecutive_errors`) is read from `config` at run time. Auto-migrated on `cli init` / `cli run`.
`AgentFactory`	class	Creates capsule dirs: `manifest.json`, `skill.md`, `memory.db`, `entrypoint.py`, `logs/`
`AgentFactory.create_capsule(name, description, ..., skill_content, skill_ids)`	method	Writes capsule files to `agents_dir/<agent_id>/`; updates `registry.json`; stores `skill_ids` in manifest
`AgentFactory.save_registry_index(path, manifests)`	static method	Overwrites `registry.json` with a fresh list

Version detection: The entrypoint template version tag (CLOSED_CLAW_ENTRYPOINT_VERSION=14) is read by _migrate_legacy_agents in cli.py to auto-upgrade older capsules. The shared loop body lives at closed_claw/runtime/agent_loop.py — see the closed_claw/runtime/ package section below.

Package: `closed_claw/coordinator/`

`state.py`

Purpose: Shared coordinator state shapes.

Symbol	What it is
`CoordinatorState`	`TypedDict` — all possible keys in graph state dict
`Candidate`	`TypedDict` — `{agent_id, score, reason}` entry in candidate lists

`graph.py`

Purpose: Builds and wires the LangGraph StateGraph.

Symbol	What it does
`build_graph(settings)`	Instantiates all sub-components, creates `StateGraph(dict)`, adds all nodes + edges, compiles and returns the runnable graph

Node order (edges):

ingest_task → decompose_task → execute_task_pool → validate_outputs
  → update_registry_and_audit → synthesize_final_response → END

To add a node: add it to CoordinatorNodes, then call graph.add_node(...) and graph.add_edge(...) in build_graph.

`nodes.py`

Purpose: All coordinator graph node implementations. ~1650 lines — the heart of the system.

Symbol	Type	What it does
`CoordinatorNodes`	class	Holds all node methods plus shared sub-components
`__init__`	method	Wires settings, registry, reranker, embedder, runner, factory, approval_gate, audit, tool_executor
`ingest_task`	async node	Validates task string, sets run_id/session_id, initializes state
`decompose_task`	async node	Uses LLM/heuristic to split task into atomic subtasks with role tags and dependencies (discovery phase)
`execute_task_pool`	async node	Two-phase execution: discovery phase → execution phase. Dependency-aware loop: resolves agents per role tag, runs them via `AgentRunner`, handles api_call_intent / tool_call_intent mid-flight
`validate_outputs`	async node	Checks all subtasks completed OK; marks failures
`update_registry_and_audit`	async node	Updates agent metrics (usage, success rate, latency); writes audit rows
`synthesize_final_response`	async node	LLM synthesis — calls LLM to merge all subtask results into a coherent final response string
`_handle_api_call_intent`	method	Routes api_call_intent through `ApprovalGate` and circuit breaker
`_handle_tool_call_intent`	method	Validates allowlist + executes via `ToolExecutor`
`_acquire_agent_for_role`	method	Embeds role description → semantic search → rerank → reuse or create
`_create_agent`	method	Calls `AgentFactory.create_capsule`, generates skill.md + tools_allowlist + skill_ids via LLM/heuristic
`_compose_system_prompt`	method	Loads base skill modules (Layer 1) + role overlay (Layer 2) into a single system prompt string
`_request_config_for_agent`	method	Builds `config` dict for `CoordinatorRequest` — includes `llm`, `system_prompt` (composed), and `tool_registry`
`_prepare_phase_pool`	method	Prepares a subtask pool for a given phase (discovery or execution)
`_execute_phase_pool`	method	Runs all subtasks in a phase pool with dependency-awareness
`_run_single_subtask`	method	Executes one subtask: acquires agent, runs it, handles retries. Runs `_verify_subtask_tool_execution` + `_verify_acceptance_criteria` before marking the subtask `completed`
`_find_reusable_capability_agent`	method	Reuse gate: semantic-similarity-primary search; role-tag overlap + required-tools subset as sanity filters; below-threshold candidates skipped; success-rate-weighted ranking
`_should_skip_discovery`	static method	Decides whether to skip discovery based on which subtask pools are populated (used by the `execute_task_pool` simple-classification fast path)
`_verify_subtask_tool_execution`	static method	Filesystem-side check that any tool actually ran for a tool-requiring subtask
`_verify_acceptance_criteria`	static method	Lightweight content check: matches eight canonical "I gave up" failure phrasings (safety net for the agent's self-reported `status`) plus the `$HOME` / `~` rejection for "absolute path" criteria
`_merge`	static method	Shallow-merges two state dicts
`_emit_runlog`	method	Writes a named event to the run's JSONL log

Legacy/unwired methods (not connected to the graph): embed_task, semantic_search, llm_rerank, human_gate_if_low_confidence, decide_reuse_or_create, create_agent_if_needed, dispatch_agents_async, approval_gate_for_api_calls, continue_or_deny_api_path, failure_recovery

Package: `closed_claw/embeddings/`

`provider.py`

Purpose: Text-to-vector embedding with graceful fallback.

Symbol	What it does
`EmbeddingProvider`	Wraps sentence-transformers if available; returns zero-vector of correct dim otherwise
`EmbeddingProvider.embed(text)`	Returns `list[float]` of length `embedding_dim`

Package: `closed_claw/observability/`

`runlog.py`

Purpose: Per-run JSONL event logging.

Symbol	What it does
`RunLogger`	Writes `{"event": ..., "payload": ..., "ts": ...}` lines to `.closed_claw/runs/<run_id>.jsonl`
`RunLogger.emit(event, payload)`	Appends one event line

Package: `closed_claw/policy/`

`approval.py`

Purpose: Human-in-the-loop approval gate for paid API calls / low-confidence agent creation.

Symbol	What it does
`ApprovalRequest`	Pydantic model — call type, provider, endpoint, cost, reason, session_id
`ApprovalDecision`	Pydantic model — approved bool, operator, timestamp, note
`ApprovalGate`	Prompts user with timeout; returns `ApprovalDecision`
`ApprovalGate.prompt(req, operator)`	Blocks on stdin with `timeout_sec` deadline

Gate modes (set via CLI flags or env vars):

interactive — shows prompt and waits
approve — auto-approves (CI / scripts)
deny — auto-denies
web — publishes to in-memory queue; web UI resolves via /api/approvals/*/decide

Web mode functions: get_pending_approvals(), resolve_approval(), _web_prompt()

`audit.py`

Purpose: Structured audit event persistence.

Symbol	What it does
`AuditStore`	Writes rows to `audit_events` table in `registry.db`
`AuditStore.record(event_type, agent_id, session_id, payload)`	Inserts one audit row

Package: `closed_claw/registry/`

`schema.sql`

Purpose: SQLite DDL. Tables:

agents — agent manifests (JSON blob + indexed fields)
agent_vectors — sqlite-vec virtual table for cosine search
runs — run history
agent_compositions — multi-agent composition records
provider_circuit_breakers — failure tracking per provider

Note: audit_events table is created dynamically by AuditStore._init_tables(), not in this DDL file.

`store.py`

Purpose: SQLite-backed CRUD and semantic vector search for agents and runs.

Symbol	What it does
`AgentManifest`	Pydantic model — full agent identity, metrics, tools, embedding. Includes `skill_ids: list[str]` for base skill module selection.
`SearchCandidate`	Dataclass — `{agent_id, score, description}` from vector search
`RegistryStore`	All DB operations
`RegistryStore.upsert_agent(manifest)`	Insert or update agent + vector
`RegistryStore.search_agents(vector, limit)`	Cosine similarity search via sqlite-vec
`RegistryStore.get_agent(agent_id)`	Fetch one `AgentManifest`
`RegistryStore.list_agents(limit)`	Fetch N agents ordered by usage
`RegistryStore.delete_agent(agent_id)`	Remove agent + vector row
`RegistryStore.record_run(...)`	Insert a run record
`RegistryStore.update_agent_metrics(...)`	Update usage_count, success_rate, avg_latency

`search.py`

Purpose: Reranking, task plan generation, and agent profile generation.

Symbol	What it does
`_BASE_SKILL_IDS`	`list[str]` constant — canonical list of available base skill module IDs; intended to match files in `agents/skills/` (those files don't exist on disk yet; see `_BASE_SKILL_DESCRIPTIONS`)
`_BASE_SKILL_DESCRIPTIONS`	`dict[str, str]` constant — short faithful scope description per skill ID; serves as the ground truth the profile-gen prompt sees until per-skill `.md` files exist
`_build_tool_catalog(allowed)`	Injects description + args_schema + side_effects per tool into the profile-gen prompt (Fix A — closes hallucinated-capability bug where bare names let the LLM invent tool constraints)
`_build_skill_catalog()`	Mirror of `_build_tool_catalog` for skill IDs — same hallucination-closing pattern, applied to skill modules
`_CANONICAL_ROLE_RULES` / `_derive_canonical_tags(task, description, tools)`	Deterministic regex-based tagging (`fs.list`, `db.read`, `tool.terminal`, ...) appended to LLM-generated tags so reuse tag-overlap doesn't depend on LLM tag stability
`RerankerProtocol`	Protocol interface — `.rerank(task, candidates) -> list[Candidate]`
`HeuristicReranker`	Keyword overlap + recency scoring; used internally as fallback when LLM calls fail
`LLMReranker`	Uses configured LLM provider to rerank candidates
`build_reranker(settings)`	Builds an `LLMReranker` using the configured provider
`generate_task_plan(settings, task, *, phase, discovery_results)`	Decomposes task text into subtask list via LLM/heuristic; supports `phase="discovery"` and `phase="execution"` with `discovery_results`
`generate_agent_profile(settings, task, supported_tools, fallback_tools)`	Generates `skill_md`, `tools_allowlist`, and `skill_ids` for a new agent
`_normalize_profile_payload(payload, task)`	Validates and normalises LLM-returned profile dict; filters `skill_ids` to valid `_BASE_SKILL_IDS` members; appends `_derive_canonical_tags(...)` output to `tags`

Package: `closed_claw/runtime/`

`protocol.py`

Purpose: All JSON-line protocol message models.

Symbol	Direction	What it is
`CoordinatorRequest`	Coordinator → Agent	Task, context, artifacts, config
`ApiCallIntent`	Agent → Coordinator	Request to call a paid external API
`ApiCallDecision`	Coordinator → Agent	Approval/denial for `ApiCallIntent`
`ToolCallIntent`	Agent → Coordinator	Request to run a tool
`ToolCallResult`	Coordinator → Agent	Tool execution result/error
`AgentResponse`	Agent → Coordinator	Final response: status, result, memory_updates, artifacts, metrics
`AgentMetrics`	(nested in AgentResponse)	latency_ms
`parse_agent_line(line)`	utility	Parses a raw stdout line into `ApiCallIntent

`runner.py`

Purpose: Subprocess lifecycle management for agent capsules.

Symbol	What it does
`AgentRuntimeError`	Exception raised on agent protocol violations or subprocess errors
`AgentRunner`	Launches `entrypoint.py`, drives JSON-line I/O loop, enforces retries and intent count limit
`AgentRunner.run_agent(agent_dir, request, on_intent)`	Sends `CoordinatorRequest`, loops on intents until `AgentResponse`, returns it

on_intent is a callback (intent) -> str that the coordinator uses to handle ApiCallIntent/ToolCallIntent and return the JSON-encoded response line.

`agent_loop.py`

Purpose: Shared ReAct loop body imported by every capsule's entrypoint.py shim (v14). Centralising it means a fix lands in every capsule without rewriting per-capsule code — cli rewrite-entrypoints bumps the shim version.

Symbol	What it does
`CLOSED_CLAW_ENTRYPOINT_VERSION`	int constant (currently `14`) read by `_migrate_legacy_agents`
`main()`	Subprocess entry. Reads `CoordinatorRequest` from stdin, runs the ReAct loop (observe → think → act), prints `AgentResponse` to stdout
`append_memory(memory_db, session_id, content)`	Persists one episodic memory row

Final-action protocol: the agent's final JSON action accepts an optional status field — {"type":"final","status":"ok"|"error","result":...,"artifacts":[]}. Emit status="error" only when the task cannot be completed with the available tools; the coordinator treats this as a subtask failure and propagates to the run's top-level status.

Package: `closed_claw/tools/`

`executor.py`

Purpose: Sandboxed tool execution.

Symbol	What it does
`SUPPORTED_TOOLS`	`list[str]` — canonical tool names
`TOOL_REGISTRY`	`dict` — name → description + args_schema (used for CLI display + LLM prompt)
`ToolExecutionError`	Exception for tool-level failures
`ToolExecutor`	Executes tools within allowed filesystem roots
`ToolExecutor.execute(tool, args)`	Dispatches to the right tool impl; returns `dict` result
`tool_registry_for_allowlist(allowlist)`	Returns filtered `TOOL_REGISTRY` subset for an agent

Tool implementations (all inside ToolExecutor):

_run_terminal(cmd, timeout_s) — subprocess shell command
_run_http_api(method, url, ...) — HTTP request via requests
_run_web_fetch(url, ...) — webpage fetch + text extraction
_run_file_io(op, path, ...) — list/read/write/append within allowed roots
_run_python_exec(code, timeout_s) — executes Python snippet in subprocess
_run_sql_query(db_path, query, params) — SELECT-only SQLite query

Package: `closed_claw/web/`

`server.py`

Purpose: FastAPI web dashboard + REST API. Launched via python -m closed_claw.cli web → http://127.0.0.1:7860.

API endpoints (27):

Method	Path	Description
GET	`/`	HTML dashboard (ui.html)
GET	`/api/health`	Health check
POST	`/api/health/verify`	Live-test LLM connection
POST	`/api/settings/apikey`	Verify + save API key
GET	`/api/system`	System metadata
GET	`/api/system/status`	Connection status
GET	`/api/settings`	Configurable settings catalog
POST	`/api/settings`	Save settings to .env
GET	`/api/agents`	List agents
GET	`/api/agents/{agent_id}`	Get single agent detail
DELETE	`/api/agents/{agent_id}`	Delete agent
POST	`/api/agents/delete-bulk`	Bulk delete agents
GET	`/api/skills`	List base skill modules
GET	`/api/skills/{skill_id}`	Get skill content
GET	`/api/runs`	List runs
GET	`/api/runs/enriched`	Enriched run list with analysis
GET	`/api/runs/{run_id}/analysis`	Detailed run analysis
GET	`/api/runlog/{run_id}`	Get run log events
GET	`/api/runlog/{run_id}/stream`	SSE stream of run events
GET	`/api/audit`	List audit events
GET	`/api/tools`	List tools + schemas
POST	`/api/tools/{tool_name}/test`	Sandbox tool test
GET	`/api/stats`	Dashboard stats
POST	`/api/runs`	Create/start a run from web UI
GET	`/api/runs/active`	List active runs
GET	`/api/runs/{run_id}/status`	Live run status
GET	`/api/approvals/pending`	Pending web-mode approvals
POST	`/api/approvals/{id}/decide`	Resolve approval
POST	`/api/runs/{run_id}/cancel`	Cancel run
GET	`/api/events/stream`	Global SSE dashboard stream
POST	`/api/init`	Init system from web

`ui.html`

Single-page HTML UI served by server.py. Full dashboard with agent management, run creation, settings editor, approval queue, and SSE event streaming.

Generated Directories (runtime, not in source control)

`agents/skills/`

Shared base skill library. Each file is a Markdown document describing tool usage patterns for one capability domain. These are not agent-specific — they are composed into any agent's system prompt based on manifest.skill_ids.

File	Domain
`terminal.md`	Shell execution, process management, packages
`python_scripting.md`	Python code via `python_exec` tool
`git.md`	Git workflow via `terminal` tool
`file_system.md`	`file_io` tool patterns
`web_http.md`	`http_api` and `web_fetch` tool patterns
`sql_databases.md`	`sql_query` tool, SELECT-only SQLite
`data_analysis.md`	Cross-tool data pipeline patterns

To add a module: create the .md file here, add its name to _BASE_SKILL_IDS, and add a short faithful scope description to _BASE_SKILL_DESCRIPTIONS (both in registry/search.py). The description is the only ground truth the profile-gen prompt sees while the .md files don't exist on disk.

`agents/<agent_id>/`

File	Purpose
`manifest.json`	`AgentManifest` — identity, tools, metrics, tags, skill_ids, embedding
`skill.md`	Role overlay (Layer 2) — agent-specific identity, decision rules, output format
`memory.db`	Per-agent SQLite episodic memory (key/value store)
`entrypoint.py`	Thin shim (v14) that imports and calls `closed_claw.runtime.agent_loop.main`; the loop body itself lives in that shared module
`logs/`	Per-run output artifacts

`agents/registry.json`

Flat JSON array of all registered agent manifests (human-readable mirror of DB).

`.closed_claw/registry.db`

Main SQLite database. Tables: agents, agent_vectors, runs, agent_compositions, provider_circuit_breakers. The audit_events table is created by AuditStore._init_tables() (not in schema.sql DDL).

`.closed_claw/runs/<run_id>.jsonl`

One file per run. Each line is a JSON event: {event, payload, ts}.

Test Coverage Map

Test file	Covers
`tests/integration/test_flow.py`	Full coordinator graph end-to-end
`tests/unit/test_protocol.py`	Protocol model parse/serialize
`tests/unit/test_manifest.py`	AgentManifest validation
`tests/unit/test_approval.py`	ApprovalGate modes (incl. web)
`tests/unit/test_config_env.py`	Settings.from_env() loading
`tests/unit/test_registry_audit.py`	RegistryStore + AuditStore operations
`tests/unit/test_tools_executor.py`	ToolExecutor sandboxing
`tests/unit/test_reranker_config.py`	Reranker selection logic
`tests/unit/test_agent_profile_generation.py`	generate_agent_profile
`tests/unit/test_agent_entrypoint_fallback.py`	Agent entrypoint fallback behavior
`tests/unit/test_coordinator_retry_loop.py`	Coordinator retry handling
`tests/unit/test_coordinator_two_phase_pool.py`	Two-phase task pool execution
`tests/unit/test_delete_all_agents.py`	delete-all-agents CLI command
`tests/unit/test_setup_wizard.py`	Setup wizard flow
`tests/unit/test_guardrails.py`	Guardrail enforcement (max_tool_calls, max_agents, etc.)

FilesExpand file tree

CODEBASE_MAP.md

Latest commit

History

CODEBASE_MAP.md

File metadata and controls

Codebase Map

Package: closed_claw/

__init__.py

cli.py

compat.py

config.py

interactive.py

setup_wizard.py

Package: closed_claw/agents/

factory.py

Package: closed_claw/coordinator/

state.py

graph.py

nodes.py

Package: closed_claw/embeddings/

provider.py

Package: closed_claw/observability/

runlog.py

Package: closed_claw/policy/

approval.py

audit.py

Package: closed_claw/registry/

schema.sql

store.py

search.py

Package: closed_claw/runtime/

protocol.py

runner.py

agent_loop.py

Package: closed_claw/tools/

executor.py

Package: closed_claw/web/

server.py

ui.html