Ghost uses three storage backends: PostgreSQL for structured state, a GCS-backed data directory (/mnt/data) for Ghost internal runtime state, and a GCS-backed workspace directory (/mnt/workspace) for per-agent working files. When GCS is unavailable, both directories fall back to the local data/ directory.
Artifact versioning (bare Git repos at /mnt/repos/) is handled by the Artifact System — see artifact-system.md.
Ghost connects to PostgreSQL via DATABASE_URL, which is required. If it is unset, the server cannot connect, or migrations fail, Ghost exits at startup; there is no file-only fallback.
Migrations are embedded in internal/store/pgstore/migrations/ and run automatically on startup.
agents — one row per agent instance. Static config (soul, goal, tools, agent) lives in the archetype markdown; this table holds runtime/dynamic state.
| Column | Type | Purpose |
|---|---|---|
| id | TEXT PK | Agent instance ID |
| archetype | TEXT | Archetype name |
| name | TEXT | Display name |
| owner_id | TEXT | Owner (empty string if unowned) |
| world_id | TEXT | World this agent belongs to |
| kind | TEXT | persistent or dynamic |
| status | TEXT | running or stopped |
| config_overrides | JSONB | Per-instance config overrides |
| metadata | JSONB | Arbitrary metadata |
| created_at | TIMESTAMPTZ | Creation time |
| updated_at | TIMESTAMPTZ | Last update |
| last_active_at | TIMESTAMPTZ | Last activity (nullable) |
Indexes on world_id, archetype, and owner_id (partial).
memories — dialogue summaries used for long-term agent memory.
| Column | Type | Purpose |
|---|---|---|
| id | BIGSERIAL PK | Auto-increment ID |
| agent_id | TEXT FK | → agents(id) ON DELETE CASCADE |
| content | TEXT | Dialogue summary text |
| turn_number | INT | Conversation turn |
| token_estimate | INT | Estimated tokens (for budget trimming) |
| created_at | TIMESTAMPTZ | Creation time |
Indexes on (agent_id, created_at DESC) and (agent_id, turn_number DESC).
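The token_estimate column is described as supporting budget trimming. One plausible reading is sketched below in Python, for illustration only: walk an agent's memories newest first and keep rows until a token budget is used up. The Memory type, the trim_to_budget name, and the newest-first policy are all assumptions; the document does not describe how Ghost actually trims.

```python
from typing import NamedTuple


class Memory(NamedTuple):
    # Hypothetical type mirroring a subset of the memories columns.
    turn_number: int
    content: str
    token_estimate: int


def trim_to_budget(memories: list[Memory], budget: int) -> list[Memory]:
    """Keep the most recent memories whose token estimates fit in the budget."""
    kept: list[Memory] = []
    used = 0
    # Newest first, the same order the (agent_id, turn_number DESC) index serves.
    for m in sorted(memories, key=lambda m: m.turn_number, reverse=True):
        if used + m.token_estimate > budget:
            break  # Assumed policy: stop at the first memory that does not fit.
        kept.append(m)
        used += m.token_estimate
    return kept
```

With a budget of 80 and three memories estimated at 50, 30, and 40 tokens for turns 1 to 3, this keeps turns 3 and 2 and drops turn 1.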
agent_skills — tracks which skills each agent has acquired and whether they are active.
| Column | Type | Purpose |
|---|---|---|
| agent_id | TEXT FK | → agents(id) ON DELETE CASCADE |
| skill_name | TEXT | Skill identifier |
| active | BOOLEAN | Whether the skill is currently active |
| acquired_at | TIMESTAMPTZ | When the agent acquired the skill |
Primary key: (agent_id, skill_name).
tasks — work items assigned to agents. Created by orchestrator agents, executed by worker agents.
| Column | Type | Purpose |
|---|---|---|
| id | TEXT PK | Auto-generated UUID |
| world_id | TEXT | World this task belongs to |
| owner_id | TEXT FK | → agents(id), assigned agent |
| created_by | TEXT FK | → agents(id), agent that created it |
| title | TEXT | Short task title |
| description | TEXT | Detailed task description |
| status | TEXT | todo, wip, cancelled, or finished |
| priority | INT | Higher = more important (default 0) |
| metadata | JSONB | Arbitrary metadata |
| created_at | TIMESTAMPTZ | Creation time |
| updated_at | TIMESTAMPTZ | Last update |
Indexes on (owner_id, status) and world_id.
knowledge — world-level shared knowledge entries accessible to agents via permissions. See knowledge.md for the full design.
| Column | Type | Purpose |
|---|---|---|
| id | TEXT PK | Human-readable slug (e.g. company-handbook) |
| world_id | TEXT | World this entry belongs to |
| title | TEXT | Display name |
| content | TEXT | Knowledge body (markdown) |
| created_at | TIMESTAMPTZ | Creation time |
| updated_at | TIMESTAMPTZ | Last modification |
Index on world_id.
knowledge_access — per-agent permission grants on knowledge entries.
| Column | Type | Purpose |
|---|---|---|
| knowledge_id | TEXT FK | → knowledge(id) ON DELETE CASCADE |
| agent_id | TEXT | Agent granted access |
| permission | TEXT | read or write |
| granted_at | TIMESTAMPTZ | When access was granted |
Primary key: (knowledge_id, agent_id, permission). Index on agent_id.
token_usage_daily — pre-aggregated daily token usage buckets, one row per (agent, provider, model, day).
| Column | Type | Purpose |
|---|---|---|
| agent_id | TEXT | Agent that made the LLM calls |
| provider | TEXT | LLM provider (e.g. anthropic) |
| model | TEXT | Model name (e.g. claude-sonnet-4-20250514) |
| day | DATE | Calendar day (UTC) |
| input_tokens | BIGINT | Sum of input tokens for the day |
| output_tokens | BIGINT | Sum of output tokens for the day |
| calls | BIGINT | Number of LLM calls for the day |
Primary key: (agent_id, provider, model, day). Index on day.
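The bucketing implied by this primary key can be sketched in memory. The Python below is an illustration only, not Ghost's writer: new_buckets and record_call are hypothetical names, and rejecting naive timestamps is an assumption made so that the UTC day is unambiguous.

```python
from collections import defaultdict
from datetime import date, datetime, timedelta, timezone

BucketKey = tuple[str, str, str, date]


def new_buckets() -> defaultdict:
    # Each value is [input_tokens, output_tokens, calls].
    return defaultdict(lambda: [0, 0, 0])


def record_call(buckets, agent_id: str, provider: str, model: str,
                at: datetime, input_tokens: int, output_tokens: int) -> BucketKey:
    """Fold one LLM call into its (agent, provider, model, UTC day) bucket."""
    if at.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    key = (agent_id, provider, model, at.astimezone(timezone.utc).date())
    row = buckets[key]
    row[0] += input_tokens
    row[1] += output_tokens
    row[2] += 1
    return key


# 20:00 at UTC-5 on Jan 1 is 01:00 UTC on Jan 2, so the call lands in the Jan 2 bucket.
buckets = new_buckets()
late = datetime(2025, 1, 1, 20, 0, tzinfo=timezone(timedelta(hours=-5)))
key = record_call(buckets, "agent-1", "anthropic", "claude-sonnet-4-20250514", late, 100, 20)
```

In PostgreSQL the same effect would typically come from an upsert on the primary key, but the document does not show the statement Ghost uses.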
Agent records, memories, skills, knowledge, tasks, schedules, token usage, and related features use the tables above. There is no DB-less mode.
Events, world state, and workspace files still use the filesystem under /mnt/data and the workspace root.
When world_id is set in the world YAML (e.g. world0), Ghost derives its per-world roots from it: DataDir becomes /mnt/data/{world_id}, the workspace root /mnt/workspace/{world_id}, the prompt-debug root /mnt/prompt_debug/{world_id}, and the artifact-repo root /mnt/repos/{world_id}. Each default is skipped if the corresponding environment variable (GHOST_DATA_DIR, WORKSPACE_DIR, GHOST_PROMPT_DEBUG_DIR, or ARTIFACT_REPOS_DIR) is already set. Local Docker mounts the repo config/ tree at /app/config. To load a world at boot, optionally set GHOST_CONFIG=/app/config/<file>.yaml in the environment; otherwise load a world at runtime via the admin API or dashboard.
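The precedence described above (explicit environment variable first, world_id default second) can be sketched as follows. This is a Python illustration under stated assumptions, not the Go engine's code: resolve_roots is a hypothetical name, and treating an empty variable as unset is an assumption.

```python
import os
from typing import Mapping, Optional

# (environment variable, base path) pairs taken from the paragraph above.
_ROOTS = {
    "data": ("GHOST_DATA_DIR", "/mnt/data"),
    "workspace": ("WORKSPACE_DIR", "/mnt/workspace"),
    "prompt_debug": ("GHOST_PROMPT_DEBUG_DIR", "/mnt/prompt_debug"),
    "repos": ("ARTIFACT_REPOS_DIR", "/mnt/repos"),
}


def resolve_roots(world_id: str, env: Optional[Mapping[str, str]] = None) -> dict:
    """Return each root, preferring an explicit env var over the world_id default."""
    env = os.environ if env is None else env
    roots = {}
    for name, (var, base) in _ROOTS.items():
        override = env.get(var, "")
        if override:
            roots[name] = override  # Assumption: an empty value counts as unset.
        elif world_id:
            roots[name] = f"{base}/{world_id}"
        # With neither set, the root is omitted; Ghost's other defaults would apply.
    return roots
```

For example, resolve_roots("world0", {"WORKSPACE_DIR": "/tmp/ws"}) keeps /tmp/ws for the workspace and uses the /mnt/.../world0 paths for the other three roots.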
The data directory (default data/, or an explicit path from env / world_id above) holds Ghost internal runtime state.
/mnt/data/
├── events/
│   └── events.jsonl     # Event WAL (append-only JSONL)
└── world/
    └── state.json       # Shared world state (JSON key-value map)
Agent memory, skills, tasks, and related state live in PostgreSQL only. agents/{id}/ may still contain legacy files from older runs; they are not written in normal operation when Ghost starts with a database.
| Path | Purpose | Writer |
|---|---|---|
| events/events.jsonl | Append-only event WAL, replayed on startup | internal/events/memory.go (WithWAL) |
| world/state.json | Shared world state (JSON key-value map) | internal/world/state.go |
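The append-only JSONL pattern behind events/events.jsonl can be sketched as follows. This is a generic Python illustration, not the code in internal/events/memory.go: append_event and replay_events are hypothetical names, the event fields are up to the caller, and calling fsync after every write is an assumption about durability.

```python
import json
import os


def append_event(path: str, event: dict) -> None:
    """Append one event as a single JSON line."""
    parent = os.path.dirname(path)
    if parent:
        os.makedirs(parent, exist_ok=True)
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(event, separators=(",", ":")) + "\n")
        f.flush()
        os.fsync(f.fileno())  # Assumed: force the line to disk before returning.


def replay_events(path: str) -> list[dict]:
    """Read every event back in write order; a missing file means no events."""
    if not os.path.exists(path):
        return []
    events = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                events.append(json.loads(line))
    return events
```

On startup a store built this way would call replay_events once to rebuild its in-memory state, then call append_event for each new event.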
With world_id set, the per-world root is /mnt/workspace/{world_id}/; each agent then uses /mnt/workspace/{world_id}/{agent_id}/. Without world_id, the legacy layout /mnt/workspace/{agent_id}/ applies when the workspace root is /mnt/workspace.
Each agent gets an isolated workspace directory under that root. This is the working directory for the agent's sandbox and for toolbox commands. Cross-agent access is supported.
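The two layouts can be expressed as one small path builder. The Python below is a sketch only: agent_workspace is a hypothetical name, and rejecting IDs that contain a path separator is an assumption added to keep the result inside the root.

```python
import posixpath


def agent_workspace(base: str, agent_id: str, world_id: str = "") -> str:
    """Build an agent's workspace path for the per-world or legacy layout."""
    if not agent_id or "/" in agent_id or agent_id in (".", ".."):
        raise ValueError(f"invalid agent_id: {agent_id!r}")
    if world_id:
        # Per-world layout: {base}/{world_id}/{agent_id}
        return posixpath.join(base, world_id, agent_id)
    # Legacy layout: {base}/{agent_id}
    return posixpath.join(base, agent_id)
```

Calling agent_workspace("/mnt/workspace", "agent-7", "world0") yields /mnt/workspace/world0/agent-7, and omitting world_id yields /mnt/workspace/agent-7.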
Shared read-only directories (e.g. site templates) are exposed to all agents via the WORKSPACE_SHARED_DIRS config. Agents access them through workspace.read("sites-template/..."). See agent-workspace.md for the full workspace design: layout, GCS mounts, shared directories, per-agent isolation, toolbox integration, and security.
Ghost can optionally publish events to an external message bus. The event store is always in-memory (backed by the WAL on disk); these publishers send a copy externally.
| Publisher | Env vars | Code |
|---|---|---|
| Google Pub/Sub | GIS_EVENT_PUBLISHER=pubsub, GIS_PUBSUB_PROJECT, GIS_PUBSUB_TOPIC | internal/events/pubsub.go |
| NATS | GIS_EVENT_PUBLISHER=nats, GIS_NATS_URL, GIS_NATS_STREAM | internal/events/nats.go |
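Selecting a publisher from these variables might look like the sketch below. It is a Python illustration, not the logic in internal/events: publisher_settings is a hypothetical name, and treating every listed variable as required is an assumption the table does not confirm.

```python
from typing import Mapping, Optional

# Variables listed per publisher in the table above.
_PUBLISHER_VARS = {
    "pubsub": ("GIS_PUBSUB_PROJECT", "GIS_PUBSUB_TOPIC"),
    "nats": ("GIS_NATS_URL", "GIS_NATS_STREAM"),
}


def publisher_settings(env: Mapping[str, str]) -> Optional[dict]:
    """Return the selected publisher and its settings, or None if none is selected."""
    kind = env.get("GIS_EVENT_PUBLISHER", "")
    if not kind:
        return None  # No external publisher; events stay in memory plus the WAL.
    if kind not in _PUBLISHER_VARS:
        raise ValueError(f"unknown GIS_EVENT_PUBLISHER: {kind!r}")
    missing = [v for v in _PUBLISHER_VARS[kind] if not env.get(v)]
    if missing:
        # Assumption: each listed variable is required for its publisher.
        raise ValueError(f"{kind} publisher is missing: {', '.join(missing)}")
    return {"kind": kind, **{v: env[v] for v in _PUBLISHER_VARS[kind]}}
```

Passing os.environ at startup would surface a misconfigured publisher before any event is published.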
These directories are read at startup and never written to at runtime:
| Directory | Content | Baked into image |
|---|---|---|
| architect/worlds/{world}/ | World definitions (world.md, laws.md, ghost-whisper.md, archetypes) | No. Ghost uses /app/architect from gcsfuse in Kubernetes (bizs-*-shared-resource/architect) or a read-only bind mount of gis/architect in local Docker Compose |
| architect/skills/ | Skill definitions (SKILL.md + optional config) | No (same as worlds) |
| architect/templates/ | Business artifact templates (used by the Artifact System) | No. Artifact System uses TEMPLATES_DIR (bind mount or gcsfuse architect/templates), not the Ghost image |
| config/ | World config YAML (world-config.yaml, etc.) | No (mounted) |
| toolbox/ | Toolbox manifests, binaries, and Dockerfiles | No (mounted) |
In production, containers use a mix of PVCs and gcsfuse mounts. Agent workspaces and bare repos are on PVCs. gcsfuse is used for Ghost internal data and shared read-only resources. Additional gcsfuse mounts use the numbered pattern GCS_BUCKET_2 / GCS_BUCKET_MOUNT_PATH_2 / GCSFUSE_OPTIONS_2, and optionally _3, _4, … up to _9.
| Mount point | Container | Backend | Purpose |
|---|---|---|---|
| /mnt/workspace | Toolbox-Devtools | PVC | Per-agent workspace (read-write) |
| /mnt/repos | Toolbox-Devtools + Artifact System | PVC | Bare git repos |
| /mnt/shared-readonly | Toolbox-Devtools | gcsfuse | Shared read-only resources (site templates, cursor config) |
| /mnt/data (Ghost) | Ghost | gcsfuse | Ghost internal data |
| /mnt/prompt_debug | Ghost | gcsfuse | Optional; prompt debug logs |
| /app/architect | Ghost | gcsfuse | Full architect tree (Kubernetes); prefix architect/ on bizs-*-shared-resource (read-only) |
Local Docker Compose: Ghost bind-mounts the repo’s architect/ directory to /app/architect (:ro) instead of using a second gcsfuse mount.
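The numbered GCS_BUCKET_2 to GCS_BUCKET_9 pattern described above could be enumerated as sketched below. This is a Python illustration rather than the logic in entrypoint.sh: numbered_mounts is a hypothetical name, and skipping gaps in the numbering (instead of stopping at the first unset number) is an assumption.

```python
from typing import Mapping


def numbered_mounts(env: Mapping[str, str]) -> list[dict]:
    """Collect GCS_BUCKET_N mounts for N = 2..9, skipping unset numbers."""
    mounts = []
    for n in range(2, 10):
        bucket = env.get(f"GCS_BUCKET_{n}", "")
        if not bucket:
            continue  # Assumed: a gap in the numbering does not end the scan.
        mounts.append({
            "bucket": bucket,
            "mount_path": env.get(f"GCS_BUCKET_MOUNT_PATH_{n}", ""),
            "options": env.get(f"GCSFUSE_OPTIONS_{n}", ""),
        })
    return mounts
```

Each returned entry carries the three values a gcsfuse invocation would need; suffixes above _9 are ignored, matching the stated upper bound.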
When GCS is not configured (GCS_BUCKET env var is empty), entrypoint.sh skips gcsfuse. The Go engine uses DataDir (default ./data) for everything, and WorkspaceDir falls back to DataDir. This preserves backward compatibility for local development without GCS.
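The fallback chain can be sketched in a few lines. The Python below is an illustration, not the Go engine's code: local_dirs is a hypothetical name, and reading the directories from GHOST_DATA_DIR and WORKSPACE_DIR is inferred from the environment variables named earlier in this document.

```python
from typing import Mapping


def local_dirs(env: Mapping[str, str]) -> dict:
    """Pick DataDir and WorkspaceDir, with WorkspaceDir falling back to DataDir."""
    data_dir = env.get("GHOST_DATA_DIR") or "./data"
    workspace_dir = env.get("WORKSPACE_DIR") or data_dir
    return {
        # An empty or missing GCS_BUCKET means gcsfuse is skipped.
        "use_gcsfuse": bool(env.get("GCS_BUCKET")),
        "data_dir": data_dir,
        "workspace_dir": workspace_dir,
    }
```

With an empty environment this yields ./data for both directories and no gcsfuse mount, which is the local development case described above.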