Proof

Overview

Proof in ADE is intentional, not auto-captured. The agent does computer use however it wants: Claude's computer_use, the Codex shell, a scripted browser, whatever. ADE does not wrap, proxy, or observe external tools. When the agent (or the user) decides that a moment deserves evidence, the agent runs the ade proof CLI or promotes an ADE Browser scratch observation with ade browser proof. Those commands are the intentional proof interface.

The old system sat upstream of the agent and tried to normalize every backend. It carried a readiness model, a policy surface (off/auto/enabled), per-phase coverage requirements, an artifact broker, an auto-observer, and a separate tool-delivery path. All of that is gone. What stays is a tiny CLI, a single SQLite table, and a drawer in the UI.

The result: one interface for all models, no backend matrix, no coverage math. A proof set is a handful of captioned screenshots a reviewer can skim in under a minute.

Runtime ownership

Proof storage and the broker are owned by the ADE runtime (ade serve) that owns the project. Artifacts on disk live under the runtime machine's .ade/artifacts/computer-use/ directory; the SQLite rows live in that runtime's .ade/ade.db. For local projects that is the user's machine; for remote projects it is the remote machine. The desktop renderer and the headless ADE CLI both call into the broker over JSON-RPC; nothing about the proof pipeline lives in the renderer or in a separate host process.

That means: proof captured during a remote-runtime session lives on the remote host. The desktop drawer fetches preview bytes through the same SSH-tunneled JSON-RPC channel as the rest of the remote project surface; raw artifact files are not synced back to the desktop machine, and proof is only viewable while the runtime that captured it is reachable.


CLI reference

Common subcommands under ade proof print a JSON summary on success and exit non-zero on failure. Use ade help proof for the complete current flag list.

ade proof capture

Take a screenshot now and file it as proof for the current session.

ade proof capture [--caption "<text>"] [--owner-kind chat|lane] [--owner-id <id>]
  • --caption — short free-text label. Prominent in the drawer grid.
  • Owner flags — override inferred owner (see below). Rarely needed.

Example:

ade proof capture --caption "logged in as admin"
ade proof capture --caption "order #1234 submitted, confirmation visible"

Exit codes: 0 success, 2 capture failed (screencapture unavailable, unsupported OS), 3 owner could not be resolved.
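An agent wrapper can branch on these exit codes instead of parsing stderr. The sketch below is illustrative only: interpretCaptureExit, suggestNextStep, and the outcome labels are hypothetical names, not part of ADE, and any exit code other than the three documented ones is treated as an unknown failure.

```typescript
// Hypothetical helper, not ADE source: maps the documented
// `ade proof capture` exit codes to a label a wrapper can act on.
type CaptureOutcome =
  | "filed"
  | "capture_failed"
  | "owner_unresolved"
  | "unknown_failure";

function interpretCaptureExit(exitCode: number): CaptureOutcome {
  switch (exitCode) {
    case 0:
      return "filed"; // proof was captured and recorded
    case 2:
      return "capture_failed"; // screencapture unavailable or unsupported OS
    case 3:
      return "owner_unresolved"; // no env var and no owner flags
    default:
      return "unknown_failure"; // assumption: anything else is undocumented
  }
}

// Suggested follow-ups are this sketch's opinion, not documented behavior.
function suggestNextStep(outcome: CaptureOutcome): string {
  switch (outcome) {
    case "filed":
      return "continue";
    case "capture_failed":
      return "produce the image another way and use ade proof attach";
    case "owner_unresolved":
      return "retry with --owner-kind and --owner-id";
    default:
      return "surface the error to the user";
  }
}
```

A wrapper would pass the child process's exit status to interpretCaptureExit and log suggestNextStep(...) when the outcome is not "filed".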

ade proof attach

Promote an existing image, video, or browser trace file to proof. Useful for headless-browser screenshots, Playwright traces rendered as PNG, or anything the agent produced out-of-band.

ade proof attach <path> [--caption "<text>"] [--title "<text>"] [--owner-kind ...] [--owner-id ...]

The CLI infers the proof kind from the file extension:

Extension                                                Inferred kind
.png, .jpg/.jpeg, .webp, .gif, .heic/.heif, .tif/.tiff   screenshot
.mov, .mp4, .m4v, .webm                                  video_recording
.zip, .har                                               browser_trace
anything else                                            browser_verification
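The extension mapping above can be sketched as a small lookup with a fallback. This is a minimal illustration, not the CLI's actual code: inferProofKind and KIND_BY_EXTENSION are hypothetical names, and matching extensions case-insensitively is an assumption.

```typescript
// Hypothetical sketch of extension-based kind inference (not ADE source).
type ProofKind =
  | "screenshot"
  | "video_recording"
  | "browser_trace"
  | "browser_verification";

const KIND_BY_EXTENSION = new Map<string, ProofKind>([
  [".png", "screenshot"], [".jpg", "screenshot"], [".jpeg", "screenshot"],
  [".webp", "screenshot"], [".gif", "screenshot"], [".heic", "screenshot"],
  [".heif", "screenshot"], [".tif", "screenshot"], [".tiff", "screenshot"],
  [".mov", "video_recording"], [".mp4", "video_recording"],
  [".m4v", "video_recording"], [".webm", "video_recording"],
  [".zip", "browser_trace"], [".har", "browser_trace"],
]);

function inferProofKind(filePath: string): ProofKind {
  // Look only at the final path segment so dots in directory names are ignored.
  const baseName = filePath.split(/[\\/]/).pop() ?? "";
  const dot = baseName.lastIndexOf(".");
  // Assumption: extensions are compared case-insensitively.
  const ext = dot > 0 ? baseName.slice(dot).toLowerCase() : "";
  // Anything not in the table falls back to browser_verification.
  return KIND_BY_EXTENSION.get(ext) ?? "browser_verification";
}
```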

Example:

ade proof attach /tmp/playwright-run/checkout-success.png --caption "checkout flow completes on Firefox"

The file is copied into .ade/artifacts/computer-use/; the original is left in place. Internally attach calls the same ingest_computer_use_artifacts RPC tool with backendStyle: "manual" and backendName: "ade-cli".

ade proof list

Print the proof set for the current session as JSON.

ade proof list [--owner-kind chat|lane] [--owner-id <id>] [--limit <n>]

With no arguments, it lists proof for the inferred session. This is primarily for agents to check what they have already captured.

Other proof commands

  • ade proof status --text shows capture/backend capabilities.
  • ade proof record --seconds <n> records a short video proof where supported.
  • ade proof launch, ade proof interact, and ade proof environment are lower-level computer-use helpers for capture workflows.
  • ade proof ingest --input-json ... ingests externally produced artifacts directly through the proof broker.

Owner inference

The CLI resolves the owner of a capture from environment variables set by the desktop app when it spawns an agent subprocess:

Env var                Owner kind   Precedence
ADE_CHAT_SESSION_ID    chat         highest
ADE_LANE_ID            lane         lowest

Agents spawned inside ADE pick up the right owner automatically. If more than one var is set — e.g. a chat also has a lane — the highest-precedence kind wins.

If no env var is set and no --owner-kind/--owner-id flags are passed, ade proof capture exits with code 3. This is deliberate: an un-owned proof has no home in the UI.
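The resolution order described in this section can be sketched as follows. This is an illustration under stated assumptions, not the CLI's code: inferOwner, Owner, and OwnerFlags are hypothetical names, the environment is passed in as a plain object, and requiring both owner flags together before they override the environment is an assumption.

```typescript
// Hypothetical sketch of owner inference (not ADE source).
type CliOwnerKind = "chat" | "lane";

interface Owner {
  kind: CliOwnerKind;
  id: string;
}

interface OwnerFlags {
  ownerKind?: CliOwnerKind; // value of --owner-kind, if passed
  ownerId?: string; // value of --owner-id, if passed
}

// Documented exit code when no owner can be resolved.
const OWNER_UNRESOLVED_EXIT_CODE = 3;

function inferOwner(
  env: Record<string, string | undefined>,
  flags: OwnerFlags = {},
): Owner | null {
  // Explicit flags override the inferred owner.
  // Assumption: both flags must be present to take effect.
  if (flags.ownerKind && flags.ownerId) {
    return { kind: flags.ownerKind, id: flags.ownerId };
  }
  // Highest precedence: chat session.
  const chatId = env["ADE_CHAT_SESSION_ID"];
  if (chatId) return { kind: "chat", id: chatId };
  // Lowest precedence: lane.
  const laneId = env["ADE_LANE_ID"];
  if (laneId) return { kind: "lane", id: laneId };
  // Caller should exit with OWNER_UNRESOLVED_EXIT_CODE.
  return null;
}
```

A caller would exit with OWNER_UNRESOLVED_EXIT_CODE when inferOwner returns null, which mirrors the "un-owned proof has no home" rule.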

Explicit owner on RPC tools

The screenshot_environment, record_environment, ingest_computer_use_artifacts, get_environment_info, interact_gui, and list_computer_use_artifacts JSON-RPC tools accept explicit ownerKind + ownerId fields. resolveComputerUseOwners in apps/ade-cli/src/adeRpcServer.ts is the single normalizer:

  • Canonical kinds: lane, chat_session, automation_run, github_pr, linear_issue.
  • Friendly aliases: chat → chat_session, pr → github_pr. Any other value raises a JsonRpcError(invalidParams) with an "Unsupported proof ownerKind" message.

Explicit owners are added in addition to the session identity inferred from ADE_* env vars, so an agent can attach the same artifact to its current chat plus a specific PR in one call.
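The normalization and merge behavior above might look roughly like this. It is a sketch, not the body of resolveComputerUseOwners: normalizeOwnerKind, mergeOwners, and OwnerRef are hypothetical names, a plain Error stands in for JsonRpcError(invalidParams), and de-duplicating identical owners is an assumption.

```typescript
// Hypothetical sketch of owner-kind normalization (not ADE source).
const CANONICAL_KINDS = [
  "lane",
  "chat_session",
  "automation_run",
  "github_pr",
  "linear_issue",
] as const;

type CanonicalOwnerKind = (typeof CANONICAL_KINDS)[number];

interface OwnerRef {
  kind: CanonicalOwnerKind;
  id: string;
}

const KIND_ALIASES = new Map<string, CanonicalOwnerKind>([
  ["chat", "chat_session"],
  ["pr", "github_pr"],
]);

function normalizeOwnerKind(raw: string): CanonicalOwnerKind {
  const canonical = CANONICAL_KINDS.find((kind) => kind === raw);
  if (canonical) return canonical;
  const aliased = KIND_ALIASES.get(raw);
  if (aliased) return aliased;
  // Stand-in for JsonRpcError(invalidParams) in the real server.
  throw new Error(`Unsupported proof ownerKind: ${raw}`);
}

// Explicit owners are appended to the inferred session owners.
// Assumption: exact duplicates (same kind and id) are dropped.
function mergeOwners(
  inferred: OwnerRef[],
  explicit: { ownerKind: string; ownerId: string }[],
): OwnerRef[] {
  const merged: OwnerRef[] = [];
  const seen = new Set<string>();
  const add = (owner: OwnerRef): void => {
    const key = `${owner.kind}:${owner.id}`;
    if (!seen.has(key)) {
      seen.add(key);
      merged.push(owner);
    }
  };
  inferred.forEach((owner) => add(owner));
  explicit.forEach((e) =>
    add({ kind: normalizeOwnerKind(e.ownerKind), id: e.ownerId }),
  );
  return merged;
}
```

With an inferred chat_session owner and an explicit pr owner, the result carries both links, which matches the "current chat plus a specific PR in one call" case.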


Storage

Images live on disk under the project's .ade/ scaffold on the runtime host:

<runtime host>/<project root>/.ade/artifacts/computer-use/<uuid>.<ext>

(Path will move to .ade/artifacts/proof/ in a future phase.)

Metadata is a single SQLite row per capture in computer_use_artifacts, with ownership links in computer_use_artifact_links. The new system uses only a small subset of the columns the table carries today: id, kind, uri, mime_type, caption, and created_at, plus the owner link row. kind is screenshot for captures; attached files are normalized by extension to screenshot, video_recording, browser_trace, or browser_verification.
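As a rough picture of how the on-disk path and the metadata row relate, the sketch below builds an artifact path from the documented layout and models the relevant columns as a type. The names ProofArtifactRow and artifactPath are hypothetical, and the column types (strings, nullable caption, ISO timestamp) are assumptions; only the column names and the directory layout come from this document.

```typescript
// Hypothetical model of the columns named above (types are assumptions).
interface ProofArtifactRow {
  id: string;
  kind:
    | "screenshot"
    | "video_recording"
    | "browser_trace"
    | "browser_verification";
  uri: string;
  mime_type: string;
  caption: string | null;
  created_at: string; // assumed to be an ISO-8601 timestamp
}

// Builds <project root>/.ade/artifacts/computer-use/<id>.<ext>
// using forward slashes; the project root is on the runtime host.
function artifactPath(projectRoot: string, id: string, ext: string): string {
  const root = projectRoot.replace(/\/+$/, ""); // drop trailing slashes
  const suffix = ext.replace(/^\./, ""); // accept "png" or ".png"
  return `${root}/.ade/artifacts/computer-use/${id}.${suffix}`;
}

// Example row for a plain capture; all values are made up for illustration.
const exampleRow: ProofArtifactRow = {
  id: "example-id",
  kind: "screenshot",
  uri: artifactPath("/work/app", "example-id", "png"),
  mime_type: "image/png",
  caption: "logged in as admin",
  created_at: "2025-01-01T00:00:00.000Z",
};
```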

There is no retention policy — captures persist until the project is cleaned up. Disk is the budget; nothing ages out automatically. For remote-runtime projects, the disk being filled is the remote host's, not the desktop machine's.

ADE browser-agent observations are intentionally not proof. ade browser observe and post-action browser observations write scratch PNG/JSON files under .ade/cache/browser-observations/. By default they include a bounded DOM element list plus console/network diagnostics for agent targeting, and they are pruned to the latest 3 observations per tab; --map adds a numbered visual UI map. DOM elements carry short-lived handles such as obs-...:e:3 so agents can click/fill/press/wait without another hit-test, including same-origin iframe/open-shadow-root targets when the observation captured that context.

Sessions and traces are scratch state too. ade browser session start --tab <id> only creates a reusable tab-targeting handle for repeated agent actions; session observations and traces are still scratch state until promoted. ade browser trace --tab <id> or ade browser trace --browser-session <id> exposes the bounded per-tab action log for debugging but remains scratch state.

Promote only reviewer-facing checkpoints into proof, through one of:

  • ade browser proof --tab <id> --caption "..."
  • ade browser proof --browser-session <id> --caption "..."
  • the shorthand ade browser session proof <session-id> --caption "..."
  • the lower-level ade proof attach / ade proof ingest commands

The proof broker allows ADE's cache root as an import source, so browser scratch PNGs can be promoted without copying them elsewhere.
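The default pruning rule for scratch observations (keep the latest 3 per tab) can be sketched as below. This is a minimal illustration, not ADE's implementation: the Observation shape, its field names, and selectObservationsToPrune are all hypothetical, and ordering by a numeric capture timestamp is an assumption.

```typescript
// Hypothetical shape for a scratch observation (not ADE source).
interface Observation {
  tabId: string;
  capturedAt: number; // assumed: epoch milliseconds
  file: string; // scratch file path
}

// Returns the observations that fall outside the newest `keepPerTab`
// entries for their tab; the caller would delete their scratch files.
function selectObservationsToPrune(
  observations: Observation[],
  keepPerTab: number = 3,
): Observation[] {
  const byTab = new Map<string, Observation[]>();
  observations.forEach((obs) => {
    const list = byTab.get(obs.tabId) ?? [];
    list.push(obs);
    byTab.set(obs.tabId, list);
  });

  const stale: Observation[] = [];
  byTab.forEach((list) => {
    // Sort a copy newest-first so the input array is left untouched.
    const newestFirst = list
      .slice()
      .sort((a, b) => b.capturedAt - a.capturedAt);
    newestFirst.slice(keepPerTab).forEach((obs) => stale.push(obs));
  });
  return stale;
}
```

Promoted proof is unaffected by a rule like this because promotion copies the artifact into proof storage; only the scratch cache is pruned.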


Drawer UI

Proof surfaces in two places:

  • Chat — the proof drawer below the composer shows a thumbnail grid for the current session. Captions are rendered in full below each thumbnail, not as hover tooltips. Click to preview at full size.
  • Lane and PR review — linked proof can be surfaced alongside lane work and PR closeout.

Review controls (accept / reject / annotate) remain as first-class actions on each proof.


For agents

When an agent session starts inside ADE, the system prompt includes a short priming directive:

When you reach a checkpoint worth showing — a login succeeds, a form submits, an error reproduces, a test passes — run ade proof capture --caption "<short description>". Captions are what reviewers skim; write them like a teammate is reading them.

A good proof set is three to eight captures with captions a reviewer can read in one pass. Avoid dumping a screenshot after every click. Avoid captions like "screenshot 3"; prefer the exact state being proven.


Not supported

  • Cinematic post-processing. No before/after stitching, no annotated overlays — deferred.
  • Cross-device sync. Proof records replicate via cr-sqlite, but the image files do not — proof is viewable only on the device that produced it.
  • Auto-capture. The old proof observer is gone. Nothing watches the agent and files screenshots for it.

Headless-browser screenshots are supported — use ade proof attach with the output file path.

proof capture, proof record, proof environment, proof launch, and proof interact set preferHeadless: true on the CLI plan: the connection layer drops to headless mode unless --socket is explicitly passed. This lets agent subprocesses capture proof without depending on the machine runtime endpoint being live; visual proof state still flows back to the broker on the next reconcile.
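The connection choice described above can be pictured as a small decision function. This is a sketch under assumptions, not the CLI's connection layer: chooseConnectionMode and the mode labels are hypothetical, the document does not say whether --socket takes a value (only the bare flag is checked here), and behavior for other subcommands is not specified, so it is reported as "unspecified".

```typescript
// Hypothetical sketch of the preferHeadless decision (not ADE source).
type ConnectionMode = "headless" | "socket" | "unspecified";

// Subcommands the document lists as setting preferHeadless: true.
const PREFER_HEADLESS_SUBCOMMANDS = [
  "capture",
  "record",
  "environment",
  "launch",
  "interact",
];

function chooseConnectionMode(
  subcommand: string,
  args: string[],
): ConnectionMode {
  // An explicit --socket always wins over the headless preference.
  if (args.indexOf("--socket") !== -1) return "socket";
  if (PREFER_HEADLESS_SUBCOMMANDS.indexOf(subcommand) !== -1) {
    return "headless";
  }
  // Assumption: other subcommands follow rules this document does not state.
  return "unspecified";
}
```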


Architecture

  agent (any model, any runtime host)
      │
      │  shell invocation
      ▼
  ade proof capture --caption "…"
      │
      │  JSON-RPC over ~/.ade/sock/ade.sock when socket-backed
      ▼
  proof action (runtime: ade serve)
      │
      ├── screencapture  ─► <runtime host>/.ade/artifacts/computer-use/<uuid>.png
      │
      └── computerUseArtifactBrokerService
              │
              │  SQLite insert into <runtime host>/.ade/ade.db
              ▼
          computer_use_artifacts + …_artifact_links
                                     │
                                     ▼
                          drawer UI (renderer reads via
                          window.ade.proof.* → preload →
                          local or remote runtime RPC)

The broker (apps/desktop/src/main/services/computerUse/computerUseArtifactBrokerService.ts) is the only ingest path — both the ade proof CLI and any in-process call go through it. The same module is loaded by the desktop main process for local projects and by the standalone ade serve runtime for headless / remote use. Supporting modules in the same directory:

  • controlPlane.ts builds owner snapshots + backend status for the UI.
  • localComputerUse.ts reports macOS-only proof-capture capabilities (screencapture, app launch, GUI interaction). Reflects the runtime host's environment, not the desktop machine's.
  • agentBrowserArtifactAdapter.ts parses agent-browser output into ComputerUseArtifactInput[].
  • syntheticToolResult.ts produces tool-result stubs for the Claude compaction path.

Every piece upstream of the CLI is the agent's own business. Every piece downstream is a thin line to disk, a broker insert, and the drawer. No backend abstraction, no policy engine, no observer — the proof observer was deleted with this rebuild, along with ComputerUsePolicy and the Settings > Computer Use panel.