docs(rfc): Computer Use for opencode #33082
Open
EtienneLescot wants to merge 2 commits into
Proposes a first-class, experimental Core-native computer-use feature for opencode: agent-driven screenshot + mouse + keyboard control of the host desktop.

Provider-agnostic core with four adapter paths:
- Anthropic native (computer_use typed block)
- OpenAI native (computer-use-preview)
- UI-TARS-1.5-7B native (Apache-2.0, local, SOTA on OSWorld)
- Generic vision fallback for any other VLM

Multi-OS backends (X11, Wayland, macOS, Windows, headless/CI), batch-session permission model, audit via the existing event stream, and a one-line 'opencode computer-use --local' setup command for the UI-TARS path. Includes pre-flight hardware check and detailed failure handling for /dev/uinput, UIPI, and Accessibility perms.

Cross-references anomalyco#20490, anomalyco#20917, anomalyco#26772, anomalyco#30755, anomalyco#32945.

Status: Draft, seeking maintainer sign-off on tool surface and permission model before implementation.
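To make the "provider-agnostic core with four adapter paths" idea concrete, here is a minimal sketch of how a core might route a model to one of the four paths. Everything in it is an assumption for illustration: the `AdapterKind` names, the `pickAdapter` function, and the matching rules are hypothetical and do not come from the RFC, which lists adapter ordering as an open question.

```typescript
// Hypothetical adapter identifiers, one per path named in the summary above.
type AdapterKind =
  | "anthropic-native"
  | "openai-native"
  | "ui-tars-native"
  | "generic-vision";

// Illustrative routing only. The matching rules and their order are
// assumptions, not the RFC's selection logic.
function pickAdapter(providerId: string, modelId: string): AdapterKind {
  const model = modelId.toLowerCase();
  // A local UI-TARS model is matched by name, whatever provider serves it.
  if (model.includes("ui-tars")) return "ui-tars-native";
  // The OpenAI native path is tied to the computer-use-preview model.
  if (providerId === "openai" && model.includes("computer-use-preview")) {
    return "openai-native";
  }
  // Assumes every Anthropic model takes the native computer_use path.
  if (providerId === "anthropic") return "anthropic-native";
  // Any other vision-language model falls back to the generic path.
  return "generic-vision";
}
```

Keeping the fallback as the final branch means an unrecognized provider still gets a working, if less capable, path rather than an error.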
Thanks for updating your PR! It now meets our contributing guidelines. 👍
Issue for this PR
Related: #20490, #20917, #26772, #30755, #32945
This is a docs-only RFC PR seeking design alignment per `CONTRIBUTING.md` ("UI / core features must go through a design review with the core team before implementation"). No code changes.

I noticed #20490 was closed by @rekram1-node with "closing due to issue spam" and #32945 explicitly asks to "agree on the shape here before a PR." So I'm opening this as a draft and would rather have the shape agreed in-thread than push implementation.
Type of change
What does this PR do?
Adds `rfcs/computer-use.md`, a single design document for a computer-use feature in opencode (screenshot + mouse + keyboard, multi-OS).

Key proposals (one paragraph each):

- Core-native and experimental, off by default via `computer_use.enabled: false`. Not a plugin.
- Four adapter paths: Anthropic native (`computer_use`), OpenAI native (`computer-use-preview`), UI-TARS-1.5-7B native (local, Apache-2.0, beats Claude 3.7 and CUA on OSWorld), and a generic-vision fallback.
- Multi-OS backends: X11, Wayland (`/dev/uinput` + `ydotoold`) / XWayland, macOS (Accessibility), Windows (UIPI documented), headless CI.
- Batch-session permission model with a `max_steps_per_task: 50` budget. Rejected per-action prompts as impractical UX.
- `opencode computer-use --local`: a one-line CLI to bootstrap vLLM + UI-TARS-1.5-7B locally, with a pre-flight hardware check.

Full rationale, tool surface, per-OS failure modes, and phased delivery are in the RFC file.
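As a rough illustration of how the default-off flag and the per-task step budget could interact, here is a minimal sketch. The `ComputerUseConfig` shape, the `createBatchSession` helper, and the idea that one user approval covers the whole session are assumptions made for this example. Only the `enabled: false` default and the `max_steps_per_task: 50` value come from the proposal, and placing both keys in one object is itself a guess.

```typescript
// Hypothetical config shape. Only the two defaults are taken from the proposal.
interface ComputerUseConfig {
  enabled: boolean;
  max_steps_per_task: number;
}

const defaultConfig: ComputerUseConfig = {
  enabled: false,
  max_steps_per_task: 50,
};

// Hypothetical session helper: a single up-front approval covers a batch of
// actions, and each action consumes one step from the per-task budget.
function createBatchSession(config: ComputerUseConfig, approvedByUser: boolean) {
  let stepsUsed = 0;
  return {
    // Returns true and consumes a step if the action may run, false otherwise.
    tryStep(): boolean {
      if (!config.enabled || !approvedByUser) return false;
      if (stepsUsed >= config.max_steps_per_task) return false;
      stepsUsed += 1;
      return true;
    },
    // Steps left before the budget is exhausted.
    remaining(): number {
      return Math.max(0, config.max_steps_per_task - stepsUsed);
    },
  };
}

// Usage: with the defaults the feature is off, so no step is ever granted.
const session = createBatchSession(defaultConfig, true);
const allowed: boolean = session.tryStep();
```

The budget gives the user one approval prompt per task while still bounding how far an agent can run unattended, which is the trade-off the proposal makes against per-action prompts.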
How did you verify your code works?
Docs-only PR — no code.
I drafted this after reading the four open issues above, the UI-TARS-1.5 model card and paper, and opencode's own `CONTEXT.md` and `AGENTS.md`. The RFC cross-references each.

I'd particularly like feedback on the open questions in the doc (which adapter ordering, what triggers graduation out of experimental, grid-overlay default for generic VLMs).
Checklist