Maestro's headless mode is the embedding-oriented JSON-over-stdio contract used by native TUIs and external control planes such as EvalOps Chat.
Transport rules:
- stdin carries one JSON object per line into Maestro
- stdout emits one JSON object per line back to the client
- stderr is diagnostics only and is not part of the protocol contract
- startup failures should emit a fatal error protocol message on stdout when the headless transport has been requested, with human diagnostics on stderr
The protocol is versioned. The runtime sends the version in ready and
hello_ok, and clients may send their version in hello.
Current version: 2026-04-02
Source of truth:
- generated constants: packages/contracts/src/headless-protocol-generated.ts
- runtime message shapes: src/cli/headless-protocol.ts
- transport implementation: src/cli/headless.ts
- hosted runner contract: docs/protocols/hosted-runner-contract.md
- hosted runner retention: docs/protocols/hosted-runner-retention.md
- conformance suite: docs/protocols/headless-conformance.md
Compatibility expectations:
- treat unknown fields as additive
- reject unknown message type values unless your client intentionally ignores them
- compare protocol_version during handshake when you require exact compatibility
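A minimal client-side sketch of these expectations follows. The helper name classifyLine, the KNOWN_TYPES list, and the result shape are illustrative assumptions, not part of Maestro; the set of handled types is a small subset of the messages named in this document.

```typescript
// Hypothetical client helper; none of these names come from Maestro itself.
const CLIENT_PROTOCOL_VERSION = "2026-04-02";

// Message types this example client chooses to handle (an illustrative subset).
const KNOWN_TYPES = new Set(["hello_ok", "ready", "response_chunk", "response_end", "error"]);

type LineResult =
  | { kind: "message"; type: string; body: Record<string, unknown> }
  | { kind: "rejected"; reason: string };

function classifyLine(line: string, requireExactVersion: boolean): LineResult {
  let parsed: unknown;
  try {
    parsed = JSON.parse(line);
  } catch {
    return { kind: "rejected", reason: "not valid JSON" };
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return { kind: "rejected", reason: "not a JSON object" };
  }
  const body = parsed as Record<string, unknown>;
  const type = body["type"];
  // Unknown message types are rejected rather than guessed at.
  if (typeof type !== "string" || !KNOWN_TYPES.has(type)) {
    return { kind: "rejected", reason: `unknown type: ${String(type)}` };
  }
  // The version is compared only on the handshake messages that carry it.
  if (requireExactVersion && (type === "hello_ok" || type === "ready")) {
    if (body["protocol_version"] !== CLIENT_PROTOCOL_VERSION) {
      return { kind: "rejected", reason: "protocol_version mismatch" };
    }
  }
  // Unknown fields stay in body untouched: they are treated as additive.
  return { kind: "message", type, body };
}
```

A client that intentionally ignores unknown types could map the rejected result to a no-op instead of an error.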
When Maestro runs through maestro hosted-runner, the HTTP server exposes a
Platform attach-fencing endpoint:
GET /.well-known/evalops/remote-runner/identity
The endpoint returns only runtime identity needed by Platform's headless gateway:
{
"protocol_version": "evalops.remote-runner.identity.v1",
"runner_session_id": "mrs_123",
"owner_instance_id": "pod_123",
"ready": true,
"draining": false
}
runner_session_id comes from --runner-session-id, MAESTRO_RUNNER_SESSION_ID,
or REMOTE_RUNNER_SESSION_ID. owner_instance_id comes from
--owner-instance-id, MAESTRO_REMOTE_RUNNER_OWNER_INSTANCE_ID, or
REMOTE_RUNNER_OWNER_INSTANCE_ID.
This surface intentionally omits workspace, organization, user, token, and
prompt metadata. Platform compares the returned session and owner generation
with the control-plane session before proxying attach traffic. ready=false or
draining=true means the runtime should not receive new attach requests yet.
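The fencing comparison can be sketched as a pure function. The RunnerIdentity fields mirror the response shown above; the ExpectedSession record and the mayProxyAttach name are assumptions made for illustration and do not describe Platform's actual gateway code.

```typescript
// Shape of the identity response documented above.
interface RunnerIdentity {
  protocol_version: string;
  runner_session_id: string;
  owner_instance_id: string;
  ready: boolean;
  draining: boolean;
}

// Hypothetical control-plane record; the field names are assumptions for this sketch.
interface ExpectedSession {
  runnerSessionId: string;
  ownerInstanceId: string;
}

// Returns true only when the runtime matches the expected session and owner
// and is currently willing to receive new attach requests.
function mayProxyAttach(identity: RunnerIdentity, expected: ExpectedSession): boolean {
  if (identity.protocol_version !== "evalops.remote-runner.identity.v1") return false;
  if (identity.runner_session_id !== expected.runnerSessionId) return false;
  if (identity.owner_instance_id !== expected.ownerInstanceId) return false;
  return identity.ready && !identity.draining;
}
```

Keeping the check free of I/O makes it easy to test separately from the HTTP call that fetches the identity document.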
Hosted runner mode also exposes a local drain/snapshot hook for Platform:
POST /.well-known/evalops/remote-runner/drain
The request body is optional:
{
"reason": "ttl_expired",
"requested_by": "platform",
"export_paths": ["."]
}
The endpoint immediately marks the runtime as draining, stops active headless
runtime work, flushes Maestro session and memory state, validates that every
export_paths entry stays inside the hosted workspace root, and writes a
snapshot manifest. The manifest directory comes from --snapshot-root,
MAESTRO_REMOTE_RUNNER_SNAPSHOT_ROOT, or
REMOTE_RUNNER_SNAPSHOT_ROOT; the default is
.maestro/runner-snapshots under the workspace root.
{
"protocol_version": "evalops.remote-runner.drain.v1",
"status": "drained",
"manifest_path": "/workspace/.maestro/runner-snapshots/mrs_123-2026-04-23T00_00_00_000Z.json",
"manifest": {
"protocol_version": "evalops.remote-runner.snapshot-manifest.v1",
"runner_session_id": "mrs_123",
"workspace_id": "workspace_123",
"agent_run_id": "run_123",
"maestro_session_id": "session_123",
"reason": "ttl_expired",
"requested_by": "platform",
"created_at": "2026-04-23T00:00:00.000Z",
"workspace_root": "/workspace",
"runtime": {
"flush_status": "completed",
"session_id": "session_123",
"session_file": "/workspace/.maestro/agent/sessions/session.jsonl",
"protocol_version": "2026-04-02",
"cursor": 42
},
"workspace_export": {
"mode": "local_path_contract",
"paths": [
{
"input": ".",
"path": "/workspace",
"relative_path": ".",
"type": "directory"
}
]
},
"work_continuity": {
"protocol_version": "evalops.remote-runner.work-continuity.v1",
"codex_subagent_schema_version": "evalops.maestro.codex.subagent-workgraph.v1",
"active_tool_count": 0,
"tracked_tool_count": 0,
"pending_request_count": 0,
"codex_subagent_tool_call_ids": [],
"codex_subagent_child_run_ids": [],
"codex_subagent_thread_ids": []
},
"platform_evidence": {
"protocol_version": "evalops.remote-runner.platform-evidence.v1",
"event_type": "hosted_runner_drain_manifest_recorded",
"runner_session_id": "mrs_123",
"workspace_id": "workspace_123",
"agent_run_id": "run_123",
"maestro_session_id": "session_123",
"status": "drained",
"runtime_flush_status": "completed",
"manifest_path": "/workspace/.maestro/runner-snapshots/mrs_123-2026-04-23T00_00_00_000Z.json",
"manifest_protocol_version": "evalops.remote-runner.snapshot-manifest.v1",
"created_at": "2026-04-23T00:00:00.000Z",
"work_continuity": {
"protocol_version": "evalops.remote-runner.work-continuity.v1",
"codex_subagent_schema_version": "evalops.maestro.codex.subagent-workgraph.v1",
"active_tool_count": 0,
"tracked_tool_count": 0,
"pending_request_count": 0,
"codex_subagent_tool_call_count": 0,
"codex_subagent_child_run_count": 0,
"codex_subagent_thread_count": 0,
"codex_subagent_edge_count": 0,
"codex_subagent_tool_call_ids": [],
"codex_subagent_child_run_ids": [],
"codex_subagent_thread_ids": []
},
"retention": {
"policy_version": "evalops.remote-runner.retention.v1",
"control_plane_metadata_visibility": "operator",
"runtime_snapshot_visibility": "internal",
"redaction_required_before_external_persistence": [
"runtime_snapshot",
"runtime_logs"
]
},
"evidence_refs": [
"remote-runner://sessions/mrs_123/drain#manifest",
"maestro://headless/sessions/session_123#drain",
"platform-agent-run:run_123"
]
},
"retention_policy": {
"policy_version": "evalops.remote-runner.retention.v1",
"managed_by": "platform",
"visibility": {
"control_plane_metadata": "operator",
"workspace_export": "tenant",
"runtime_snapshot": "internal",
"runtime_logs": "operator"
},
"redaction": {
"required_before_external_persistence": [
"runtime_snapshot",
"runtime_logs"
],
"forbidden_plaintext": [
"provider_credentials",
"tool_secrets",
"attach_tokens",
"artifact_access_tokens",
"raw_environment"
]
}
},
"snapshot": {
"protocolVersion": "2026-04-02",
"session_id": "session_123",
"cursor": 42,
"last_init": null,
"state": {
"connection_count": 0,
"subscriber_count": 0,
"connections": [],
"pending_requests": [],
"pending_approvals": [],
"pending_client_tools": [],
"pending_mcp_elicitations": [],
"pending_user_inputs": [],
"pending_tool_retries": [],
"tracked_tools": [],
"active_tools": [],
"active_utility_commands": [],
"active_file_watches": [],
"is_ready": true,
"is_responding": false
}
}
}
}
status=interrupted means Maestro entered draining mode and wrote a manifest,
but runtime flush failed before completion. Platform should treat the manifest
as a partial handoff record, stop sending attach traffic, and decide whether to
retry drain or terminate the pod. Maestro does not upload to GCS or require a
Cloud Storage mount; Platform/deploy own artifact upload, retention, and any
future resume controller behavior. The visibility and redaction rules for those
uploaded artifacts live in Hosted Runner Retention.
The runtime flush status is the field that controls restore readiness:
completed is attachable, failed is an interrupted restore, and skipped
means no runtime activity was persisted. Older local manifests that used
interrupted for the runtime flush status are treated as failed.
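The mapping from runtime flush status to restore readiness can be sketched as follows. The function names and the RestoreReadiness labels are illustrative assumptions; only the status values and the legacy interrupted-to-failed rule come from the text above.

```typescript
type FlushStatus = "completed" | "failed" | "skipped";

// Labels chosen for this sketch; they are not protocol values.
type RestoreReadiness = "attachable" | "interrupted_restore" | "nothing_persisted";

// Normalizes a raw runtime.flush_status value, treating the legacy
// "interrupted" value from older local manifests as "failed".
function normalizeFlushStatus(raw: string): FlushStatus | undefined {
  if (raw === "interrupted") return "failed";
  if (raw === "completed" || raw === "failed" || raw === "skipped") return raw;
  return undefined; // unrecognized value: the caller decides how to handle it
}

function restoreReadiness(raw: string): RestoreReadiness | undefined {
  switch (normalizeFlushStatus(raw)) {
    case "completed":
      return "attachable";
    case "failed":
      return "interrupted_restore";
    case "skipped":
      return "nothing_persisted";
    default:
      return undefined;
  }
}
```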
When Platform has already restored workspace artifacts and a prior snapshot manifest into the workspace, the Rust hosted runner can seed its runtime state from that manifest at startup:
- MAESTRO_REMOTE_RUNNER_RESTORE_MANIFEST
- REMOTE_RUNNER_RESTORE_MANIFEST
Relative manifest paths resolve under MAESTRO_WORKSPACE_ROOT. Startup
validates the manifest protocol and workspace export paths against the current
workspace before binding the HTTP server. A restored runner keeps the new
runner_session_id for Platform identity, restores the logical Maestro session
id and cursor from the manifest, returns the restored state from
GET /api/headless/sessions/:id/state, and emits an initial SSE reset envelope
with reason restored_from_snapshot.
Only manifests with runtime.flush_status=completed report ready=true and
accept new controller/viewer attachments. failed or skipped manifests still
preserve the logical session id, cursor, and last snapshot for inspection, but
identity and runtime snapshots stay not-ready with last_error populated so
Platform can retry, quarantine, or terminate without silently attaching clients
to a partial restore.
Restore is deliberately local and provider-neutral. Maestro does not download artifacts, pick a provider object, or decide retention policy; Platform/deploy must hydrate the workspace and pass the manifest path before the runner starts.
Typical controller flow:
- Client starts Maestro in headless mode.
- Client sends hello.
- Maestro replies with hello_ok.
- Maestro emits ready.
- Client optionally sends init.
- Client sends prompt.
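The flow above can be sketched as a small client-side tracker that flags out-of-order steps. This models only the typical sequence listed here; the HandshakeTracker class is hypothetical, and the runtime itself may tolerate orderings this sketch rejects.

```typescript
type FlowStep = "hello" | "hello_ok" | "ready" | "init" | "prompt";

// For each step, the steps that may immediately precede it in the typical flow.
// undefined stands for "nothing has happened yet". init is optional, and
// repeated prompts are allowed because prompt starts or continues a run.
const ALLOWED_PREDECESSORS: Record<FlowStep, ReadonlyArray<FlowStep | undefined>> = {
  hello: [undefined],
  hello_ok: ["hello"],
  ready: ["hello_ok"],
  init: ["ready"],
  prompt: ["ready", "init", "prompt"],
};

class HandshakeTracker {
  private last: FlowStep | undefined = undefined;

  // Records a step and returns true if it fits the typical flow at this point.
  record(step: FlowStep): boolean {
    if (ALLOWED_PREDECESSORS[step].indexOf(this.last) === -1) return false;
    this.last = step;
    return true;
  }
}
```

A client might call record for each message it sends or receives and log a diagnostic when it returns false.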
Minimal hello:
{
"type": "hello",
"protocol_version": "2026-04-02",
"client_info": {
"name": "evalops-chat",
"version": "0.1.0"
},
"role": "controller"
}
Handshake acknowledgement:
{
"type": "hello_ok",
"protocol_version": "2026-04-02",
"connection_id": "conn_123",
"client_protocol_version": "2026-04-02",
"role": "controller",
"server_capabilities": {
"server_requests": ["approval", "client_tool", "mcp_elicitation", "user_input", "tool_retry"],
"utility_operations": ["command_exec", "file_search", "file_watch", "file_read"],
"raw_agent_events": true,
"connection_roles": ["controller", "viewer"]
}
}
Initial runtime state:
{
"type": "ready",
"protocol_version": "2026-04-02",
"model": "claude-opus-4-6",
"provider": "anthropic",
"executor_type": "live",
"session_id": null
}
Connection roles:
- controller: may send prompts, approvals, utility commands, and shutdown
- viewer: read-only subscriber role
Negotiated client capabilities in hello.capabilities:
- server_requests: supported request classes for approval and control-plane callbacks
- utility_operations: command_exec, file_search, file_watch, file_read
- raw_agent_events: opt into raw internal agent events
Advertised server capabilities in hello_ok.server_capabilities:
- server_requests: request classes this Maestro runtime can emit when a client advertises support
- utility_operations: utility operations this Maestro runtime can host for clients
- raw_agent_events: whether raw internal agent events are available as an opt-in stream
- connection_roles: connection roles the runtime understands
Optional notification opt-outs in hello.opt_out_notifications:
- status
- heartbeat
- connection_info
- compaction
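A hello that negotiates capabilities and opts out of some notifications might be built as sketched below. The shape of hello.capabilities is assumed to mirror hello_ok.server_capabilities, and the client name, version, and chosen values are placeholders; the generated contract files listed earlier remain the authority on the real message shape.

```typescript
// Assumed to mirror the field shapes of hello_ok.server_capabilities.
interface ClientCapabilities {
  server_requests?: string[];
  utility_operations?: string[];
  raw_agent_events?: boolean;
}

interface HelloMessage {
  type: "hello";
  protocol_version: string;
  client_info: { name: string; version: string };
  role: "controller" | "viewer";
  capabilities?: ClientCapabilities;
  opt_out_notifications?: string[];
}

// Serializes a hello as a single JSON line suitable for writing to stdin.
function buildHello(optOut: string[]): string {
  const hello: HelloMessage = {
    type: "hello",
    protocol_version: "2026-04-02",
    client_info: { name: "example-client", version: "0.0.0" }, // placeholder identity
    role: "controller",
    capabilities: {
      server_requests: ["approval", "user_input"],
      utility_operations: ["file_read"],
      raw_agent_events: false,
    },
    opt_out_notifications: optOut,
  };
  return JSON.stringify(hello);
}

const helloLine = buildHello(["heartbeat", "status"]);
```

JSON.stringify without an indent argument never emits newlines, which keeps the message on one line as the transport rules require.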
Messages sent by the client:
- hello: handshake and capability negotiation
- init: runtime configuration such as system_prompt, append_system_prompt, thinking_level, and approval_mode
- prompt: starts or continues a run; supports attachments
- interrupt: requests clean cancellation of the active run
- cancel: alias for interrupt
- shutdown: graceful process termination
- tool_response: resolves a legacy approval-gated tool call via call_id
- client_tool_result: returns structured content for a client-executed tool
- server_request_response: resolves a server_request using request_id and request_type
Supported server_request_response.request_type values:
- approval
- client_tool
- mcp_elicitation
- user_input
- tool_retry
Utility requests sent by the client:
- utility_command_start
- utility_command_terminate
- utility_command_stdin
- utility_command_resize
- utility_file_search
- utility_file_read
- utility_file_watch_start
- utility_file_watch_stop
Messages sent by Maestro:
- hello_ok: handshake acknowledgement
- ready: runtime-ready event with protocol version, active model/provider, and executor_type
  - executor_type=live means the runtime is backed by an external or local model provider; executor_type=replay means the session is driven by a deterministic scripted scenario and should be visibly badged as replay by clients and control planes
- session_info: current session_id, cwd, and git_branch
- connection_info: current connection graph, controller lease, and subscriber state
- compaction: session compaction summary and token counts
- response_start
- response_chunk: streamed text or thinking; is_thinking=true marks reasoning content
- response_end: final usage and execution telemetry
response_end is the authoritative place to read:
- usage
  - input_tokens
  - output_tokens
  - cache_read_tokens
  - cache_write_tokens
  - total_tokens
  - total_cost_usd
  - model_id
  - provider
- tools_summary
  - tools_used
  - calls_succeeded
  - calls_failed
  - summary_labels
- duration_ms
- ttft_ms
Tool and server-request events:
- tool_call
- tool_start
- tool_output
- tool_end
- client_tool_request
- server_request
- server_request_resolved
Supported server_request.resolution values:
- approved
- denied
- completed
- failed
- answered
- retried
- skipped
- aborted
- cancelled
Supported server_request.resolved_by values:
- user
- policy
- client
- runtime
Supported server_request_response.decision_action values for tool retries:
- retry
- skip
- abort
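Answering a tool_retry request could look like the sketch below. It uses only the field names mentioned in this document (request_id, request_type, decision_action); the real message may require additional fields, so treat the shape as an assumption and check src/cli/headless-protocol.ts before relying on it.

```typescript
type RetryAction = "retry" | "skip" | "abort";

// Builds a single-line server_request_response for a tool retry decision.
// toolRetryResponse is a hypothetical helper name.
function toolRetryResponse(requestId: string, action: RetryAction): string {
  return JSON.stringify({
    type: "server_request_response",
    request_id: requestId, // echoes the request_id from the server_request
    request_type: "tool_retry",
    decision_action: action,
  });
}

const retryLine = toolRetryResponse("req_123", "retry");
```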
Utility events:
- utility_command_started
- utility_command_resized
- utility_command_output
- utility_command_exited
- utility_file_search_results
- utility_file_read_result
- utility_file_watch_started
- utility_file_watch_event
- utility_file_watch_stopped
Status and diagnostic events:
- status: human-readable runtime status
- error: recoverable or fatal error classification
- raw_agent_event: full internal agent event stream when the client negotiated raw_agent_events=true
Supported error_type values:
- transient
- fatal
- tool
- cancelled
- protocol
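A client can validate error_type against this list before acting on it. In the sketch below, treating an unrecognized classification as a reason to stop is a conservative choice made for the example, not documented Maestro behavior; parseErrorType and shouldStopClient are hypothetical names.

```typescript
const ERROR_TYPES = ["transient", "fatal", "tool", "cancelled", "protocol"] as const;
type ErrorType = (typeof ERROR_TYPES)[number];

// Returns the value as an ErrorType when it is one of the documented values.
function parseErrorType(value: unknown): ErrorType | undefined {
  return ERROR_TYPES.find((candidate) => candidate === value);
}

// Decides whether the client should stop after an error message.
function shouldStopClient(message: Record<string, unknown>): boolean {
  if (message["type"] !== "error") return false;
  const errorType = parseErrorType(message["error_type"]);
  // Assumption for this sketch: unknown classifications are handled like fatal ones.
  return errorType === undefined || errorType === "fatal";
}
```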
Integration notes:
- Treat response_chunk as append-only.
- Persist response_end.usage and response_end.tools_summary instead of reconstructing totals from streamed chunks.
- Use init instead of shell-interpolating system prompts or approval mode.
- Viewer connections are intentionally limited; use controller for active orchestration.
- Prefer server_request / server_request_response for new control-plane integrations. tool_call / tool_response remains for legacy approval compatibility.
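The first two points can be sketched as an accumulator that appends chunks and takes totals only from response_end. The content field name on response_chunk is an assumption (this document names only is_thinking), and ResponseAccumulator is a hypothetical class.

```typescript
class ResponseAccumulator {
  text = "";
  thinking = "";
  usage: unknown = undefined;
  toolsSummary: unknown = undefined;
  finished = false;

  handle(message: Record<string, unknown>): void {
    if (message["type"] === "response_chunk") {
      // "content" is an assumed field name for the streamed text.
      const content = message["content"];
      const piece = typeof content === "string" ? content : "";
      // Chunks are append-only: earlier text is never rewritten.
      if (message["is_thinking"] === true) {
        this.thinking += piece;
      } else {
        this.text += piece;
      }
    } else if (message["type"] === "response_end") {
      // Totals come from response_end rather than being derived from chunks.
      this.usage = message["usage"];
      this.toolsSummary = message["tools_summary"];
      this.finished = true;
    }
  }
}
```

Storing usage and tools_summary as opaque values lets new fields pass through unchanged when the protocol adds them.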
Managed EvalOps deployments can mirror the headless runtime surface onto the
shared platform event bus without enabling user training telemetry. Set
MAESTRO_EVENT_BUS_URL or EVALOPS_NATS_URL to publish typed CloudEvents to
JetStream subjects that match platform's maestro.* event catalog.
The shared publisher lives in @evalops/ai/telemetry and currently emits:
- maestro.sessions.session.started|suspended|resumed|closed
- maestro.events.approval_hit
- maestro.events.sandbox_violation
- maestro.events.firewall_block
- maestro.events.tool_call.attempted|completed
- maestro.events.prompt_variant.selected
- maestro.events.context.learned
- maestro.events.skill.invoked|succeeded|failed
- maestro.events.eval.scored
Use recordMaestroLearnedContext when a Maestro coding session learns a
durable, evidence-backed fact that future agents should be able to recall. The
event must include a stable learning_id, statement, claim-family
dimension, confidence score/reason, supporting evidence, and the normal
org/user/workspace/session/run correlation. Cerebro projects this event into a
learned-context document Thing plus an agent-authored Fact, so agents can query
it later without treating it as connector source truth.
Use maestroCorrelationToChronicleMetadata(correlation) when handing the same
run/session identity to Chronicle or agentd capture. It emits the stable
metadata keys consumed by Platform Chronicle evidence and Cerebro's Chronicle
consumer, including organization_id, user_id, workspace_id,
maestro_session_id, agent_run_id, agent_run_step_id,
tool_execution_id, trace_id, task_id, and source_issue.
For end-to-end Platform traceability, managed launchers should set the org and user identity environment variables before starting Maestro:
- MAESTRO_EVALOPS_ORG_ID or EVALOPS_ORGANIZATION_ID
- MAESTRO_EVALOPS_USER_ID, EVALOPS_USER_ID, or MAESTRO_USER_ID
- MAESTRO_EVALOPS_WORKSPACE_ID or EVALOPS_WORKSPACE_ID
- MAESTRO_SESSION_ID, MAESTRO_AGENT_RUN_ID, and, for tool-level spans, MAESTRO_AGENT_RUN_STEP_ID
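A launcher-side sketch of resolving these variables follows. It assumes the alternatives are tried in the order listed and that empty strings count as unset; neither precedence rule is confirmed by this document, and firstSet and resolveIdentity are hypothetical names.

```typescript
type EnvMap = Record<string, string | undefined>;

// Returns the first non-empty value among the named variables.
function firstSet(env: EnvMap, names: string[]): string | undefined {
  for (const variableName of names) {
    const value = env[variableName];
    if (value !== undefined && value !== "") return value;
  }
  return undefined;
}

// Resolution order is an assumption: variables are tried in the listed order.
function resolveIdentity(env: EnvMap) {
  return {
    organizationId: firstSet(env, ["MAESTRO_EVALOPS_ORG_ID", "EVALOPS_ORGANIZATION_ID"]),
    userId: firstSet(env, ["MAESTRO_EVALOPS_USER_ID", "EVALOPS_USER_ID", "MAESTRO_USER_ID"]),
    workspaceId: firstSet(env, ["MAESTRO_EVALOPS_WORKSPACE_ID", "EVALOPS_WORKSPACE_ID"]),
    sessionId: firstSet(env, ["MAESTRO_SESSION_ID"]),
    agentRunId: firstSet(env, ["MAESTRO_AGENT_RUN_ID"]),
    agentRunStepId: firstSet(env, ["MAESTRO_AGENT_RUN_STEP_ID"]),
  };
}
```

Passing the environment in as a plain map, rather than reading a global, keeps the function usable in tests and in non-Node hosts.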
Maestro copies those values into CloudEvent extensions and OpenTelemetry span
attributes (evalops.organization_id, enduser.id, evalops.workspace_id,
maestro.session_id, and maestro.agent_run_id). Platform traces normalizes
those attributes into first-class trace fields, and Cerebro imports the same
event correlation into org/user/session/run/tool graph nodes.
When a session uses an EvalOps managed model provider, Maestro also forwards the
same content-free join keys to llm-gateway request metadata. The gateway
metadata includes agent_id, workspace_id, objective_id, run_id,
agent_run_id, agent_run_step_id, session_id, maestro_session_id,
trace_id, turn_id, tool_call_id, workload, and surface when those
values are available from stored managed-agent identity or the environment. This
lets Platform's AgentRuntime operating ledger attach model usage to the same run
as tool execution, approval, trace, and timeline evidence without copying raw
prompts or responses into operator surfaces.
For direct Cerebro MCP access through the EvalOps plugin, set
MAESTRO_PLATFORM_MCP_URL, MAESTRO_EVALOPS_AGENT_MCP_URL, or the manifest
form MAESTRO_PLATFORM_MCP_MANIFEST_URL, then grant scopes with
MAESTRO_CEREBRO_MCP_SCOPES or MAESTRO_PLATFORM_MCP_SCOPES. The URL can be
the public app base URL, the /mcp endpoint, or
/.well-known/evalops/agent-mcp.json; Maestro normalizes those forms to the
HTTP MCP endpoint. Use cerebro:read for recall-only agents and add
cerebro:assert only for agents allowed to write explicit learned facts.
Maestro forwards X-EvalOps-Workspace-Id, X-EvalOps-Session-Id,
X-EvalOps-Agent-Id, X-EvalOps-Agent-Run-Id, trace/request IDs, and scopes so
Cerebro can attribute every query and assertion to the user/org session.
The publisher conformance fixture used by Platform can be regenerated from the
same shared publisher with
tsx scripts/generate-maestro-publisher-conformance-fixture.ts.
The GitHub agent worker also records task session start/close events through
this shared publisher with MAESTRO_SURFACE=github-agent and task correlation
attributes, so platform subscribers can join worker runs to issue/PR work.
The Rust Ambient Agent daemon publishes session start/suspend/resume/close
events with source=maestro.ambient-agent when the same event-bus NATS
environment is configured.
It also publishes plan-level routing and outcome events:
- maestro.ambient_agent.routing.selected
- maestro.ambient_agent.plan.cost_limited
- maestro.ambient_agent.plan.completed
These include the session correlation block plus repository, upstream event ID, task type, complexity, selected provider/model/tier, estimated cost, and final success/cost metadata when available. Maestro web surfaces the same model tier and Platform bus readiness fields in the fleet dashboard.
MAESTRO_TELEMETRY continues to control local training and diagnostic
telemetry. Audit-bus publishing is controlled separately with
MAESTRO_EVENT_BUS; set it to 0 or false to suppress bus writes even when
managed EvalOps routing is active.
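The separation between the two switches can be sketched as a gating function. This sketch treats a configured bus URL as the only enabling signal and assumes the first non-empty URL variable wins; how managed EvalOps routing enables publishing is not modeled, and resolveBusUrl is a hypothetical name.

```typescript
type BusEnv = Record<string, string | undefined>;

// Returns the bus URL to publish to, or undefined when publishing is off.
function resolveBusUrl(env: BusEnv): string | undefined {
  const flag = (env["MAESTRO_EVENT_BUS"] ?? "").trim().toLowerCase();
  // MAESTRO_EVENT_BUS=0 or false suppresses bus writes regardless of the URL.
  if (flag === "0" || flag === "false") return undefined;
  for (const variableName of ["MAESTRO_EVENT_BUS_URL", "EVALOPS_NATS_URL"]) {
    const url = env[variableName];
    if (url !== undefined && url !== "") return url;
  }
  return undefined;
}
```

MAESTRO_TELEMETRY is deliberately absent from this function, reflecting that it controls a different surface.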
To verify a live JetStream route without relying on best-effort runtime
publishing, run bun run smoke:event-bus with MAESTRO_EVENT_BUS_URL or
EVALOPS_NATS_URL configured. The smoke test publishes a single
maestro.sessions.session.started CloudEvent and fails on connection or
publish errors.
For the larger remote-attach and control-plane architecture, see the companion design document: docs/design/HEADLESS_CONTROL_PLANE.md.