dotPilot Agent Control Plane Experience

Summary

dotPilot is a desktop-first control plane for local-first agent operations. It must let an operator manage agent profiles, provider toolchains, sessions, files, tools, approvals, telemetry, evaluation, and local runtimes from one workbench.

The product must support coding sessions, but it must not be limited to coding. The same architecture and UI should support research, analysis, orchestration, review, and operator workflows.

Scope

In Scope

desktop workbench shell
provider toolchain management for Codex, Claude Code, and GitHub Copilot
session composition for one or many agents
agent profiles and reusable roles
repository tree, file viewing, attachments, tool-call visibility, and Git workflows
MCP/tool federation and repo intelligence
local runtime selection through LLamaSharp and ONNX Runtime
telemetry, evaluation, replay, and policy-aware audit trails

Out Of Scope

implementing the actual runtime in this task
replacing the current Uno shell with a different product layout
adding MLXSharp in the first product wave

Product Rules

dotPilot must remain a desktop-first operator workbench, not only a prompt window.
The existing shell direction must be preserved:
- left navigation and workspace tree
- central session/chat surface
- right inspector or activity pane
- dedicated agent-builder/profile surface
A session must be able to use:
- one provider agent
- many provider agents
- a mixed provider plus local-model composition
The operator must be able to see which provider toolchains are installed, authenticated, outdated, misconfigured, or unavailable.
A session must expose plan, execute, and review states explicitly.
Files, screenshots, logs, diffs, and generated artifacts must be attachable and inspectable from the workbench.
Tool calls, approvals, and diffs must never be hidden behind opaque provider output.
Git flows must remain in-app for common operations:
- status
- diff
- stage
- commit
- history
- compare
- branch or worktree selection
Local runtime support must share the same event and session model as remote provider sessions.
Telemetry and evaluation must be first-class:
- OpenTelemetry-first runtime traces
- quality and safety evaluations through Microsoft.Extensions.AI.Evaluation*
- replay and export for session history
MLXSharp must not be planned into the first roadmap wave.
GitHub backlog items must describe product capability directly and must not mention blocked competitor names.

Primary Operator Flow

flowchart LR
  Open["Open dotPilot"]
  Check["Open Toolchain Center"]
  Configure["Verify provider versions, auth, settings"]
  Profile["Create or edit agent profile"]
  Compose["Create session and choose participating agents"]
  Work["Browse files, attach context, run plan/execute/review"]
  Approve["Approve tool calls, commands, writes, MCP actions"]
  Inspect["Inspect diffs, artifacts, telemetry, and evaluations"]
  Resume["Pause, resume, branch, or replay session"]

  Open --> Check --> Configure --> Profile --> Compose --> Work --> Approve --> Inspect --> Resume

Session Lifecycle

stateDiagram-v2
  [*] --> Plan
  Plan --> Execute
  Execute --> Review
  Execute --> Paused
  Review --> Execute
  Review --> Completed
  Execute --> Failed
  Paused --> Execute
  Failed --> Review

Main Behaviour

Toolchain and Provider Setup

The operator opens the settings or toolchain center.
The app detects whether each provider is installed and reachable.
The app shows:
- version
- auth status
- health status
- update availability
- configuration errors
The operator can run a connection test before starting a session.

Session Composition

The operator starts a new session.
The operator chooses one or more participating agents.
Each selected agent can bind to:
- a provider CLI or SDK-backed provider
- a local model runtime
The operator can pick role templates such as:
- coding
- research
- analyst
- reviewer
- operator
- orchestrator

Session Execution

The operator can browse a repo tree and open files inline.
The operator can attach files, folders, logs, screenshots, and diffs.
The session surface must show:
- conversation output
- tool calls
- approvals
- diffs
- artifacts
- branch or workspace context

Review and Audit

The operator can inspect agent-generated changes before accepting them.
The operator can inspect tool-call history and session events.
The operator can replay or export the session for later inspection.

Telemetry and Evaluation

The runtime emits OpenTelemetry-friendly traces, metrics, and logs.
The operator can inspect a local telemetry view.
Evaluations can score:
- relevance
- groundedness
- completeness
- task adherence
- tool-call accuracy
- safety metrics where configured

Edge and Failure Flows

Provider Missing or Outdated

If a provider is not installed, the toolchain center must show that state before session creation.
If a provider is installed but stale, the app must show a warning and available update action.
If auth is missing, the app must not silently fail during the first live session turn.

Mixed Session with Partial Availability

If one selected agent is unavailable, the operator must be told which agent failed and why.
The operator can remove or replace the failing agent without recreating the entire session conceptually.

Approval Pause

When a session reaches an approval gate, it must move to a paused state.
The operator must be able to resume the same session after approval.

Local Runtime Failure

If a local runtime is incompatible with the selected model, the operator must see a compatibility error rather than silent degraded behavior.

Telemetry or Evaluation Disabled

The app must continue to function if optional trace export backends are not configured.
The app must surface which evaluation metrics are active and which are unavailable in the current environment.

Verification Strategy

Documentation and Planning Verification

docs/Architecture.md reflects the same boundaries described here.
docs/ADR/ADR-0001-agent-control-plane-architecture.md records the architectural choice and trade-offs.
docs/Features/control-plane-domain-model.md captures the reusable issue #22 contract relationships for agents, sessions, fleets, providers, runtimes, approvals, artifacts, telemetry, and evaluations.
docs/Features/runtime-communication-contracts.md captures the shared result/problem language from issue #23 for runtime public boundaries.
GitHub issues map back to the capabilities and flows in this spec.

Future Product Verification

Uno.UITests cover the workbench shell, toolchain center, session composition, approvals, and Git flows.
provider-independent runtime and session tests use an in-repo deterministic test client so CI can validate agent flows without external provider CLIs.
tests that require real Codex, Claude Code, or GitHub Copilot toolchains run only when the matching toolchain is available in the environment.
UI tests cover each feature's visible interactive elements plus at least one complete operator flow through the affected surface.
integration tests cover provider adapters, session persistence, replay, and orchestration flows.
local runtime smoke tests cover LLamaSharp and ONNX Runtime.
evaluation harness tests exercise transcript scoring and regression detection.

Definition of Done

The repository contains:
- updated governance reflecting the product direction
- updated architecture documentation
- an ADR for the control-plane architecture
- this executable feature spec
- a GitHub issue backlog that tracks the approved roadmap as epics plus child issues
The issue backlog is detailed enough that implementation can proceed feature by feature without re-inventing the scope.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

dotPilot Agent Control Plane Experience

Summary

Scope

In Scope

Out Of Scope

Product Rules

Primary Operator Flow

Session Lifecycle

Main Behaviour

Toolchain and Provider Setup

Session Composition

Session Execution

Review and Audit

Telemetry and Evaluation

Edge and Failure Flows

Provider Missing or Outdated

Mixed Session with Partial Availability

Approval Pause

Local Runtime Failure

Telemetry or Evaluation Disabled

Verification Strategy

Documentation and Planning Verification

Future Product Verification

Definition of Done

FilesExpand file tree

agent-control-plane-experience.md

Latest commit

History

agent-control-plane-experience.md

File metadata and controls

dotPilot Agent Control Plane Experience

Summary

Scope

In Scope

Out Of Scope

Product Rules

Primary Operator Flow

Session Lifecycle

Main Behaviour

Toolchain and Provider Setup

Session Composition

Session Execution

Review and Audit

Telemetry and Evaluation

Edge and Failure Flows

Provider Missing or Outdated

Mixed Session with Partial Availability

Approval Pause

Local Runtime Failure

Telemetry or Evaluation Disabled

Verification Strategy

Documentation and Planning Verification

Future Product Verification

Definition of Done