Operations: Overview

How to run ZeroClaw in production. The surface is intentionally small: one binary, one config file, one SQLite workspace. Most "operations" is "systemd and journald".

This section covers:

The shape of a deployment

A typical always-on ZeroClaw install is:

zeroclaw service (systemd / launchctl / Windows Service)
  ├── zeroclaw daemon                 - the long-running process
  │   ├── gateway listener (:42617)   - REST / WebSocket / webhooks
  │   ├── channel pollers             - Telegram, IMAP, Nostr relays, etc.
  │   ├── channel listeners           - Discord / Slack / Matrix / WebSocket
  │   ├── cron scheduler              - scheduled SOPs and jobs
  │   └── agent loop (per session)    - provider call + tool execution
  ├── SQLite workspace                - ~/.zeroclaw/workspace/
  ├── config.toml                     - ~/.zeroclaw/config.toml
  ├── tool-receipts log               - ~/.zeroclaw/workspace/receipts/
  └── platform logs                   - journald / launchctl / Event Log

Everything except the binary can move: the workspace path is configurable, config paths resolve per environment (Homebrew vs. bootstrap vs. XDG), and log destinations are platform-native by default.

What to monitor

Four signals matter:

1. Service liveness

Is the process running?

# Linux
systemctl --user is-active zeroclaw

# macOS
launchctl list | grep -c com.zeroclaw.daemon

# Windows
sc query ZeroClaw | findstr STATE

If it's dying repeatedly, check Troubleshooting → Daemon keeps restarting.
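A monitoring script can wrap the three checks above in one function. The following is a minimal sketch, not part of ZeroClaw: `is_service_active` and its injectable `run` parameter are hypothetical names, and how each command's output is read (exit code for `systemctl`, label match for `launchctl`, `RUNNING` on the `STATE` line for `sc`) is an assumption based on how those tools normally behave.

```python
import subprocess
import sys


def _run(cmd):
    """Run a command and return (exit_code, stdout). Hypothetical helper."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.returncode, proc.stdout


def is_service_active(platform=sys.platform, run=_run):
    """Return True if the ZeroClaw service looks alive on this platform."""
    if platform.startswith("linux"):
        # `systemctl is-active` exits 0 only when the unit is active.
        code, _ = run(["systemctl", "--user", "is-active", "zeroclaw"])
        return code == 0
    if platform == "darwin":
        # Same idea as `launchctl list | grep -c com.zeroclaw.daemon`:
        # the job label appears in the listing when the job is loaded.
        _, out = run(["launchctl", "list"])
        return "com.zeroclaw.daemon" in out
    if platform.startswith("win"):
        # Same idea as `sc query ZeroClaw | findstr STATE`.
        _, out = run(["sc", "query", "ZeroClaw"])
        return any(
            "STATE" in line and "RUNNING" in line for line in out.splitlines()
        )
    raise ValueError(f"unsupported platform: {platform}")
```

Passing `run` explicitly makes the function easy to exercise in tests without a real service manager.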

2. Channel health

Are channels connected? The gateway exposes /health/channels:

curl -s http://localhost:42617/health/channels | jq
{
  "telegram": {"status": "connected", "last_event_ago_secs": 12},
  "discord":  {"status": "connected", "last_event_ago_secs": 4},
  "email":    {"status": "polling",   "next_poll_in_secs": 42},
  "matrix":   {"status": "disconnected", "error": "401 Unauthorized"}
}

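Alert when a push-based channel reports any status other than "connected". Polling channels report "polling" between fetches, as email does in the output above, so that status is not a failure.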
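The check on push-based channels can be sketched as a small filter over the `/health/channels` response. This is an illustration only: `unhealthy_push_channels` and `PUSH_CHANNELS` are hypothetical names, and the set of push-based channel keys is an assumption drawn from the "channel listeners" row of the deployment diagram (only `discord` and `matrix` appear as keys in the sample output).

```python
import json

# Assumed keys for channels that hold a live connection; adjust to match
# the channels you have configured.
PUSH_CHANNELS = {"discord", "slack", "matrix", "websocket"}


def unhealthy_push_channels(health, push_channels=PUSH_CHANNELS):
    """Return {channel: reason} for push channels that are not connected."""
    problems = {}
    for name, info in health.items():
        if name in push_channels and info.get("status") != "connected":
            # Prefer the reported error; fall back to the raw status.
            problems[name] = info.get("error", info.get("status", "unknown"))
    return problems


# The sample response shown above, parsed as a monitoring script would.
sample = json.loads("""
{
  "telegram": {"status": "connected", "last_event_ago_secs": 12},
  "discord":  {"status": "connected", "last_event_ago_secs": 4},
  "email":    {"status": "polling",   "next_poll_in_secs": 42},
  "matrix":   {"status": "disconnected", "error": "401 Unauthorized"}
}
""")
print(unhealthy_push_channels(sample))  # {'matrix': '401 Unauthorized'}
```

Filtering by channel name keeps the email poller's "polling" status from raising a false alarm.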

3. Provider reliability

Are LLM calls succeeding? /health/providers:

curl -s http://localhost:42617/health/providers | jq
{
  "claude": {"ok": true,  "last_latency_ms": 1240, "error_rate_1h": 0.0},
  "local":  {"ok": true,  "last_latency_ms": 3890, "error_rate_1h": 0.0}
}

For fallback chains, the meta-provider reports its current working child.
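A provider check can apply thresholds to the fields in the `/health/providers` response. The sketch below is a hedged example: `failing_providers` is a hypothetical helper, and the default thresholds (5% hourly error rate, 10 s latency) are illustrative assumptions, not values recommended by ZeroClaw.

```python
def failing_providers(health, max_error_rate=0.05, max_latency_ms=10_000):
    """Return {provider: [reasons]} for providers that breach a threshold."""
    problems = {}
    for name, info in health.items():
        reasons = []
        if not info.get("ok", False):
            reasons.append("ok is false")
        if info.get("error_rate_1h", 0.0) > max_error_rate:
            reasons.append(f"error_rate_1h={info['error_rate_1h']}")
        if info.get("last_latency_ms", 0) > max_latency_ms:
            reasons.append(f"last_latency_ms={info['last_latency_ms']}")
        if reasons:
            problems[name] = reasons
    return problems


# The sample response shown above: both providers are healthy.
sample_providers = {
    "claude": {"ok": True, "last_latency_ms": 1240, "error_rate_1h": 0.0},
    "local": {"ok": True, "last_latency_ms": 3890, "error_rate_1h": 0.0},
}
print(failing_providers(sample_providers))  # {}
```

Collecting every breached threshold, rather than stopping at the first, gives the alert enough context to act on.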

4. Tool-call volume and blocks

/metrics/tools (Prometheus format):

zeroclaw_tool_calls_total{tool="shell",outcome="success"} 342
zeroclaw_tool_calls_total{tool="shell",outcome="blocked"} 4
zeroclaw_tool_calls_total{tool="shell",outcome="denied"} 2
zeroclaw_tool_calls_total{tool="file_write",outcome="success"} 89

Blocks and denials are worth looking at: if the agent is repeatedly hitting the same policy block, either your policy is wrong or your agent is misbehaving.
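To spot a tool that is blocked or denied unusually often, the counters can be reduced to a per-tool ratio. This is a minimal sketch that assumes the exact line shape shown above (labels in `tool`, `outcome` order); `blocked_ratio` is a hypothetical helper, and a general-purpose Prometheus text parser would be more robust against other label orders.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   zeroclaw_tool_calls_total{tool="shell",outcome="blocked"} 4
LINE = re.compile(
    r'^zeroclaw_tool_calls_total\{tool="([^"]+)",outcome="([^"]+)"\}'
    r"\s+(\d+(?:\.\d+)?)$"
)


def blocked_ratio(metrics_text):
    """Return {tool: share of calls that were blocked or denied}."""
    counts = defaultdict(lambda: defaultdict(float))
    for line in metrics_text.splitlines():
        match = LINE.match(line.strip())
        if match:
            tool, outcome, value = match.groups()
            counts[tool][outcome] += float(value)

    ratios = {}
    for tool, outcomes in counts.items():
        total = sum(outcomes.values())
        rejected = outcomes.get("blocked", 0.0) + outcomes.get("denied", 0.0)
        ratios[tool] = rejected / total if total else 0.0
    return ratios


sample_metrics = """
zeroclaw_tool_calls_total{tool="shell",outcome="success"} 342
zeroclaw_tool_calls_total{tool="shell",outcome="blocked"} 4
zeroclaw_tool_calls_total{tool="shell",outcome="denied"} 2
zeroclaw_tool_calls_total{tool="file_write",outcome="success"} 89
"""
ratios = blocked_ratio(sample_metrics)
```

With the sample counters, `shell` has 6 rejected calls out of 348 and `file_write` has none. Because these are cumulative counters, alerting on the change in the ratio between scrapes is usually more informative than alerting on the absolute value.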

Capacity

A single ZeroClaw instance can handle:

  • Multiple concurrent conversations across all channels
  • Tool calls at whatever rate the provider and sandbox allow
  • Long-running agent loops (tool chains of 20+ calls)

Scale out by running one instance per workspace. Don't run two daemons on the same workspace: SQLite's single-writer model will produce lock contention and can ultimately corrupt the workspace.

For multi-tenant hosting, see the proposal in #2765 (closed, historical), which describes the architecture for in-process multi-workspace routing.

Backups

What to back up:

  • ~/.zeroclaw/config.toml: contains channel credentials (encrypted if using secrets store)
  • ~/.zeroclaw/workspace/*.db: SQLite conversation memory
  • ~/.zeroclaw/secrets.key: master key for the encrypted secrets store (if used). Without it, the config's secrets are unrecoverable.
  • ~/.zeroclaw/workspace/receipts/: tool-receipts log

A plain tar czf zeroclaw-$(date +%F).tar.gz ~/.zeroclaw covers everything, including the cache directory unless you exclude it. Restic, borg, or Duplicacy work fine for incremental backups.

Do not back up ~/.zeroclaw/workspace/cache/: it's regenerable and can be large.
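A backup that follows both rules above (archive everything under ~/.zeroclaw, skip the cache) can be sketched with Python's standard `tarfile` module. `backup_zeroclaw` is a hypothetical helper, not a ZeroClaw command, and the archive name simply mirrors the `zeroclaw-$(date +%F).tar.gz` pattern from the tar example.

```python
import tarfile
from datetime import date
from pathlib import Path


def backup_zeroclaw(home, dest_dir):
    """Archive `home` (e.g. ~/.zeroclaw) into dest_dir, skipping workspace/cache/."""
    home = Path(home).expanduser()
    root = home.name  # top-level directory name inside the archive
    cache_prefix = f"{root}/workspace/cache"
    archive = Path(dest_dir) / f"zeroclaw-{date.today().isoformat()}.tar.gz"

    def skip_cache(member):
        # Returning None drops the entry; for a directory this also
        # prevents tarfile from descending into it.
        if member.name == cache_prefix or member.name.startswith(cache_prefix + "/"):
            return None
        return member

    with tarfile.open(str(archive), "w:gz") as tar:
        tar.add(str(home), arcname=root, filter=skip_cache)
    return archive
```

Write the archive to a directory outside `home` so it does not try to include itself. Stopping the service before archiving avoids copying a SQLite file in the middle of a write.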

Updates

The service does not auto-update. Subscribe to the release feed (GitHub releases or the Discord #releases channel; see Contributing → Communication). Typical update procedure:

  1. Read the release notes
  2. Back up ~/.zeroclaw/
  3. Update the binary (brew upgrade, bootstrap re-run, or cargo install --force)
  4. zeroclaw service restart
  5. Verify /health/* endpoints return green

If the new version requires config migrations, the startup log emits a warning and the binary usually auto-migrates. After upgrading, run zeroclaw config list to spot-check values, and run zeroclaw config migrate to apply any pending schema migrations manually.

See also