
Cloud Mode

Overview

Cloud mode (cloud_mode = true in config) enables Docker-backed terminal sessions with full lifecycle management. In this mode, the runner orchestrates containers instead of spawning local PTY processes.

How It Works

gRPC Routing

When cloud mode is enabled, gRPC session management RPCs (CreateSession, TerminateSession, ListSessions) return FAILED_PRECONDITION. Clients must use the REST API for session CRUD.

The AttachSession RPC continues to work — it routes through CloudSessionManager which attaches to the Docker exec instance instead of a local PTY.

REST API Takes Over

All session lifecycle operations go through the REST API:

Operation        Endpoint
---------------  -----------------------------------
Create session   POST /api/v1/projects/{id}/sessions
List sessions    GET /api/v1/sessions
Stop session     POST /api/v1/sessions/{id}/stop
Resume session   POST /api/v1/sessions/{id}/resume
Delete session   DELETE /api/v1/sessions/{id}
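As a sketch, the table above maps onto a small client-side routing helper. The helper and its names are hypothetical; only the HTTP methods and paths come from the table:

```python
# Illustrative mapping of session lifecycle operations to REST routes.
# Methods and paths mirror the table above; everything else is a sketch.
ROUTES = {
    "create": ("POST",   "/api/v1/projects/{project_id}/sessions"),
    "list":   ("GET",    "/api/v1/sessions"),
    "stop":   ("POST",   "/api/v1/sessions/{session_id}/stop"),
    "resume": ("POST",   "/api/v1/sessions/{session_id}/resume"),
    "delete": ("DELETE", "/api/v1/sessions/{session_id}"),
}

def route(op: str, **ids: str) -> tuple[str, str]:
    """Return the (HTTP method, path) pair for a lifecycle operation."""
    method, template = ROUTES[op]
    return method, template.format(**ids)
```

For example, `route("stop", session_id="abc")` yields `("POST", "/api/v1/sessions/abc/stop")`.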

Initialization

relay-runner init --cloud

This creates:

  1. TLS certificates (CA, server cert)
  2. Data directory structure:
    <data_dir>/
    ├── projects/     # Git clones
    ├── workspaces/   # Git worktrees for session isolation
    ├── sessions/     # Session metadata
    └── relay.db      # SQLite database (WAL mode)
    
  3. SQLite database with migrations applied
  4. Bearer token for API authentication
  5. Config with cloud_mode = true

Session Lifecycle

  1. Create project — POST /api/v1/projects triggers a git clone
  2. Wait for ready — monitor via SSE project_state_changed event
  3. Create session — POST /api/v1/projects/{id}/sessions with a profile
  4. Container flow:
    • Pull image (if needed)
    • Create container with resource limits and labels
    • Create dedicated network for isolation
    • Start container with idle entrypoint
    • Create git worktree for the session
    • Exec attach with TTY
  5. Attach terminal — gRPC AttachSession for interactive I/O
  6. Stop/Resume — container is stopped/restarted; PTY is restartable without container recreation
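The flow above can be sketched as a minimal provisioning helper. This is illustrative only: `request` stands in for an injected HTTP client, and a polling loop stands in for the SSE subscription.

```python
def provision_session(request, git_url: str, profile: str, image: str) -> str:
    """Drive steps 1-3 of the lifecycle above (hypothetical helper).

    `request(method, path, body)` is an injected transport returning
    parsed JSON; a real client would subscribe to SSE instead of polling.
    """
    # 1. Create project -- triggers a git clone on the runner.
    project = request("POST", "/api/v1/projects", {"git_url": git_url})

    # 2. Wait for ready -- polling here; SSE project_state_changed in practice.
    while request("GET", f"/api/v1/projects/{project['id']}", None)["state"] != "Ready":
        pass

    # 3. Create session -- the image is chosen here, not in the profile.
    session = request(
        "POST",
        f"/api/v1/projects/{project['id']}/sessions",
        {"profile": profile, "image": image},
    )
    return session["id"]
```

Steps 4 to 6 then happen server-side (container flow) and over gRPC (AttachSession).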

Docker Orchestration

DockerOrchestrator (runner/src/docker.rs) manages containers via the bollard crate (async Docker client).

Container Configuration

  • Resource limits from [docker] config (memory, CPU, swap)
  • Seccomp profile (configurable, or Docker default)
  • Labels for tracking (relay.managed=true, relay.session_id=<uuid>)
  • Idle entrypoint — container stays alive, exec provides the interactive shell
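The bullets above roughly correspond to the create-container payload sent over the Docker Engine API. The sketch below is an assumption about the shape of that payload, not the runner's actual code; field names follow the Engine API, and `sleep infinity` is one common way to implement an idle entrypoint:

```python
def container_config(session_id: str, image: str,
                     memory_bytes: int, cpus: float) -> dict:
    """Illustrative Docker create-container body for one session."""
    return {
        "Image": image,
        # Idle entrypoint: keep the container alive; exec supplies the shell.
        "Entrypoint": ["sleep", "infinity"],
        # Tracking labels, per the bullet above.
        "Labels": {
            "relay.managed": "true",
            "relay.session_id": session_id,
        },
        "HostConfig": {
            # Resource limits from the [docker] config section.
            "Memory": memory_bytes,
            # The Engine API expresses CPU limits in units of 1e-9 CPUs.
            "NanoCpus": int(cpus * 1_000_000_000),
        },
    }
```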

Docker Socket Proxy

In the Docker Compose setup, the runner does not access the Docker socket directly. A tecnativa/docker-socket-proxy restricts API access to safe operations only:

  • Allowed: containers, images, exec, info, version, events, ping
  • Denied: auth, secrets, networks, volumes, build, swarm, etc.
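Conceptually, the proxy filters requests by the first path segment of the Docker API URL. The model below is hypothetical (the real proxy is configured via environment variables, not code like this); only the allow/deny lists come from the bullets above:

```python
# First path segments of Docker API endpoints the proxy permits,
# taken from the "Allowed" bullet above.
ALLOWED = {"containers", "images", "exec", "info", "version", "events", "ping"}

def proxy_allows(path: str) -> bool:
    """Return True if the Docker API path's first segment is allowlisted."""
    segments = path.lstrip("/").split("/")
    # Skip a version prefix such as /v1.43/containers/json.
    if segments and segments[0].startswith("v1."):
        segments = segments[1:]
    return bool(segments) and segments[0] in ALLOWED
```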

Cloud Session Manager

CloudSessionManager (runner/src/cloud_session_manager.rs) coordinates:

  • Per-project concurrency limits via tokio::sync::Semaphore
  • State machine enforcement (Starting → Running → Stopped / Failed)
  • SQLite persistence of every state transition
  • Git worktree creation for each session
  • VT snapshot management for reconnect
  • Crash recovery: reconciliation at startup marks dead sessions as stopped
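A sketch of the transition table such a state machine might enforce. Only Starting → Running → Stopped / Failed comes from the text; the Stopped → Starting edge on resume is an assumption:

```python
# Hypothetical transition table; edges beyond Starting -> Running ->
# Stopped/Failed are assumptions based on the stop/resume behavior.
TRANSITIONS = {
    "Starting": {"Running", "Failed"},
    "Running":  {"Stopped", "Failed"},
    "Stopped":  {"Starting"},   # resume restarts the container
    "Failed":   set(),          # terminal
}

def check_transition(current: str, target: str) -> None:
    """Raise if the requested state change is not a legal edge."""
    if target not in TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition {current} -> {target}")
```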

SQLite Persistence

Two domain entities in the Store (runner/src/store.rs):

Entity    Key fields
--------  ----------------------------------------------------------------
Project   id, name, git_url, state (Cloning/Ready/Failed)
Session   id, project_id, branch, image, container_id, state (Starting/Running/Stopped/Failed), profile
  • WAL mode for concurrent async readers
  • Foreign keys enforced
  • Optimistic concurrency via updated_at WHERE clause
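The optimistic-concurrency bullet can be illustrated with a deliberately simplified schema (the real store in runner/src/store.rs has more columns): an update only wins if updated_at still holds the value the writer last read.

```python
import sqlite3

def stop_session(conn, session_id: str, seen_updated_at: str, now: str) -> bool:
    """Optimistic update: succeeds only if nobody wrote since our read."""
    cur = conn.execute(
        "UPDATE sessions SET state = 'Stopped', updated_at = ? "
        "WHERE id = ? AND updated_at = ?",
        (now, session_id, seen_updated_at),
    )
    return cur.rowcount == 1   # False => a concurrent writer won

# Minimal illustrative schema and fixture.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # enforced, per the bullets above
conn.execute("CREATE TABLE sessions (id TEXT PRIMARY KEY, state TEXT, updated_at TEXT)")
conn.execute("INSERT INTO sessions VALUES ('s1', 'Running', 't0')")
```

A second writer that still holds the stale timestamp gets rowcount 0 and must re-read before retrying.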

Reconciliation

On startup, before accepting connections:

  1. Dead sessions (DB says Running, Docker says dead) → mark Stopped
  2. Orphan containers (Docker has it, DB doesn't) → force remove
  3. Orphan networks (no attached containers) → remove

See Session Management: Reconciliation and ADR-009.

Configuration

cloud_mode = true
rest_port = 8080
data_dir = "/var/lib/relay-data"

[docker]
socket_path = "/var/run/docker.sock"
default_image = "ubuntu:24.04"
allowed_images = []
memory_limit = "2g"
cpu_limit = 2.0

[git]
clone_timeout_secs = 300
allowed_hosts = []

[[profiles]]
name = "claude-default"
# init_command = "claude"
# env_vars = { "NODE_ENV" = "production" }

The Docker image is specified at session creation, not in the profile. See domain model §5 for image resolution chain.

See Server Deployment for the full config reference.