docs(rfc): Computer Use for opencode #33082

Open
EtienneLescot wants to merge 2 commits into anomalyco:dev from EtienneLescot:rfc/computer-use

EtienneLescot commented Jun 20, 2026

Issue for this PR

Related: #20490, #20917, #26772, #30755, #32945

This is a docs-only RFC PR seeking design alignment per CONTRIBUTING.md ("UI / core features must go through a design review with the core team before implementation"). No code changes.

I noticed #20490 was closed by @rekram1-node with "closing due to issue spam" and #32945 explicitly asks to "agree on the shape here before a PR." So I'm opening this as a draft and would rather have the shape agreed in-thread than push implementation.

Type of change

  • Documentation

What does this PR do?

Adds rfcs/computer-use.md — a single design document for a computer-use feature in opencode (screenshot + mouse + keyboard, multi-OS).

Key proposals (one paragraph each):

  • Core-native, experimental. Tool surface, permission gate, and event-stream integration in Core, behind computer_use.enabled: false. Not a plugin.
  • Provider-agnostic with four autodetected adapters: Anthropic native (computer_use), OpenAI native (computer-use-preview), UI-TARS-1.5-7B native (local Apache-2.0, beats Claude 3.7 and CUA on OSWorld), and a generic-vision fallback.
  • Multi-OS backends: Linux X11 / Wayland (/dev/uinput + ydotoold) / XWayland, macOS (Accessibility), Windows (UIPI documented), headless CI.
  • Permission model: one prompt per session with a max_steps_per_task: 50 budget. Per-action prompts were considered and rejected as impractical UX.
  • opencode computer-use --local: a one-line CLI to bootstrap vLLM + UI-TARS-1.5-7B locally, with a pre-flight hardware check.
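As a rough illustration of the "four autodetected adapters" bullet above, adapter selection could be sketched as a lookup on the provider and model identifiers. The detection rules, the `Adapter` enum, and `pick_adapter` are all hypothetical names invented for this sketch; the RFC file defines the real ordering (which is itself listed as an open question).

```python
from enum import Enum


class Adapter(Enum):
    """The four adapter paths named in the PR description."""
    ANTHROPIC_NATIVE = "anthropic-native"
    OPENAI_NATIVE = "openai-native"
    UI_TARS_NATIVE = "ui-tars-native"
    GENERIC_VISION = "generic-vision"


def pick_adapter(provider: str, model: str) -> Adapter:
    """Pick an adapter from provider/model strings.

    Assumption: matching on substrings is only a placeholder heuristic;
    the RFC may detect adapters differently.
    """
    p = provider.lower()
    m = model.lower()
    if "ui-tars" in m:
        return Adapter.UI_TARS_NATIVE
    if p == "anthropic":
        return Adapter.ANTHROPIC_NATIVE
    if p == "openai" and "computer-use" in m:
        return Adapter.OPENAI_NATIVE
    # Anything else falls through to the generic-vision path.
    return Adapter.GENERIC_VISION
```

Under these assumed rules, an unrecognised vision model never fails detection; it simply lands on the generic fallback.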

Full rationale, tool surface, per-OS failure modes, and phased delivery are in the RFC file.
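To make the proposed permission model concrete, here is a minimal sketch of a session-level gate: disabled by default (mirroring `computer_use.enabled: false`), approved once per session, and capped at `max_steps_per_task` actions per task. The class and method names are hypothetical and chosen only for illustration; they are not taken from the RFC or from opencode's code.

```python
from dataclasses import dataclass


class StepBudgetExceeded(RuntimeError):
    """Raised when a task tries to exceed its step budget."""


@dataclass
class ComputerUseSession:
    enabled: bool = False          # mirrors computer_use.enabled: false
    max_steps_per_task: int = 50   # budget quoted in the PR description
    approved: bool = False         # set by the single per-session prompt
    steps_used: int = 0

    def approve(self) -> None:
        """Record the user's one-time approval for this session."""
        self.approved = True

    def start_task(self) -> None:
        """Reset the step counter at the start of each task."""
        self.steps_used = 0

    def authorize_step(self) -> None:
        """Allow one screenshot/mouse/keyboard action or raise."""
        if not self.enabled:
            raise PermissionError("computer use is disabled")
        if not self.approved:
            raise PermissionError("session has not been approved")
        if self.steps_used >= self.max_steps_per_task:
            raise StepBudgetExceeded("step budget exhausted for this task")
        self.steps_used += 1
```

In this sketch the user is prompted once (`approve()`), after which each action only consumes budget; exhausting the budget stops the task rather than prompting again.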

How did you verify your code works?

Docs-only PR — no code.

I drafted this after reading the four open issues above, the UI-TARS-1.5 model card and paper, and opencode's own CONTEXT.md and AGENTS.md. The RFC cross-references each.

I'd particularly like feedback on the open questions in the doc (which adapter ordering, what triggers graduation out of experimental, grid-overlay default for generic VLMs).

Checklist

  • I have tested my changes locally (rendered the markdown to confirm structure)
  • I have not included unrelated changes in this PR

Proposes a first-class, experimental Core-native computer-use feature
for opencode: agent-driven screenshot + mouse + keyboard control of
the host desktop. Provider-agnostic core with four adapter paths:

- Anthropic native (computer_use typed block)
- OpenAI native (computer-use-preview)
- UI-TARS-1.5-7B native (Apache-2.0, local, SOTA on OSWorld)
- Generic vision fallback for any other VLM

Multi-OS backends (X11, Wayland, macOS, Windows, headless/CI),
batch-session permission model, audit via the existing event stream,
and a one-line 'opencode computer-use --local' setup command for
the UI-TARS path. Includes pre-flight hardware check and detailed
failure handling for /dev/uinput, UIPI, and Accessibility perms.

Cross-references anomalyco#20490, anomalyco#20917, anomalyco#26772, anomalyco#30755, anomalyco#32945.

Status: Draft, seeking maintainer sign-off on tool surface and
permission model before implementation.
@github-actions github-actions Bot added and then removed the needs:compliance label ("This means the issue will auto-close after 2 hours.") on Jun 20, 2026
@github-actions

Contributor

Thanks for updating your PR! It now meets our contributing guidelines. 👍

@EtienneLescot EtienneLescot marked this pull request as ready for review June 20, 2026 10:15