Skip to content

microsoft/shell-use

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

117 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

shell-use

Warning

Work in progress. shell-use is still being built out, so commands and behavior may change between releases & installation instructions may not yet work.

shell-use is a rust powered cli for controlling, inspecting, testing, and recording shell sessions and terminal apps. It supports all standard terminal actions (send keys, mouse clicks) & user actions (screenshot, record sessions), & testing (matches screenshot, contains text). shell-use supports Windows, Linux, & macOS and it supports a wide range of shells (see Supported shells).

Install

homebrew (macOS/linux)

brew tap microsoft/shell-use https://github.com/microsoft/shell-use
brew install shell-use

winget (windows)

winget install Microsoft.ShellUse

download from releases

Download the latest release.

Quick start

Run a command and check the result:

shell-use open                  # start a shell session (auto-starts the daemon)
shell-use submit "echo hello"   # type the command, press Enter
shell-use wait command          # block until it finishes
shell-use expect text "hello"   # assert it showed up
shell-use expect exit-code 0    # assert it exited 0
shell-use close

Drive a full-screen TUI the same way:

shell-use run vim file.txt
shell-use wait idle             # let the screen settle
shell-use press i
shell-use type "some text"
shell-use press Escape : w q Enter
shell-use wait exit

Built for agents

shell-use is an AI native CLI. Point yours at the built-in docs and it can serve itself the rest:

  • shell-use agent-context prints versioned JSON for every command, flag, enum, default, and exit code. It is generated from the CLI, so it cannot drift from the real surface.
  • shell-use usage prints a one-screen cheatsheet.
  • shell-use skill prints the full workflow guide (SKILL.md).

Skill quick start

# ~/.agents or ~/.claude instead to install it for every repo.
mkdir -p .agents/skills/shell-use && shell-use skill > .agents/skills/shell-use/SKILL.md   # copilot / codex
mkdir -p .claude/skills/shell-use && shell-use skill > .claude/skills/shell-use/SKILL.md   # copilot / claude

Each command returns a stable exit code (see Exit codes), so an agent can tell an assertion failure from a missing session without scraping text.

Command reference

Global flags: --session <name> (env SHELL_USE_SESSION, default default), --json for machine-readable output, and --verbose/-v to log PTY traffic (see Debugging).

Session & lifecycle

Command Description
open [--shell S] [--cols N --rows N] [--cwd D] [--env K=V] Spawn a shell session.
run <program> [args...] Spawn a session running a program directly.
sessions List active sessions.
close [--all] Close the current session (or all).
daemon status / daemon stop Inspect / stop the daemon.

Inspection

Command Description
state cwd, size, cursor, last command + exit code, text snapshot.
text [--full] Plain text of the viewport (or scrollback).
screenshot [-o file.svg] [--full] Terminal text to stdout, or a crisp full-color SVG image (svg-term-style window) to a file.
cells X Y [W H] Per-cell attributes (char, fg, bg, flags).
get command|output|exit-code|cwd|cursor|size Structured getters.

Input

Command Description
type "text" Type literal text.
submit ["text"] Type then press the shell return key.
press <Key...> Named keys, e.g. press Escape : w q Enter, press Ctrl+C.
keys "Control+a" A single key combo.
mouse click X Y / mouse click --on-text "OK" [--clicks N] Click by coords or label.
mouse move|down|up|drag|scroll ... Full mouse control.

PTY

Command Description
resize COLS ROWS Resize the PTY and emulator.
write <data> Write raw bytes (no return key).
signal INT|TERM|KILL / kill Signal / kill the child.

Wait

Command Description
wait text "T" [--regex --full --not --timeout MS] Until text is (not) visible.
wait idle Until the screen stops changing.
wait command Until the current command finishes.
wait exit Until the session exits.

Expect (exit 0 = pass, 1 = fail)

Command Description
expect text "T" [--regex --full --no-strict --not --fg C --bg C --timeout MS] Visibility + optional color.
expect exit-code N Last command's exit code.
expect output "T" [--regex] Last command's captured output.
expect snapshot NAME [-u] [--include-colors] Compare against __snapshots__/NAME.snap.

Colors accept ANSI-256 (9), hex (#ff0000), or rgb (255,0,0).

Screenshots

Screenshots render a snapshot of the session in the current terminal by default, but can render an SVG using the -o output flag

full-color SVG screenshot of a TUI rendered by shell-use

Recording

Every session records automatically from the moment it opens, in the standard asciinema v2 cast format.

Command Description
get-recording [session] Print the session's recording (cast) to stdout.
shell-use get-recording > demo.cast   # capture the current session's recording
asciinema play demo.cast              # replay it
agg demo.cast demo.gif                # render a GIF

Live monitor

Watch a live session in a second terminal while an agent drives it. Both share the same daemon. monitor takes over an alternate screen and streams the session in full color at ~20fps; press q, Esc, or Ctrl-C to detach.

shell-use-demo.mp4
Command Description
monitor Attach a live, full-color framed view of the session (--session selects which).
shell-use --session work monitor   # watch the 'work' session live

It needs an interactive terminal (exit 2 otherwise) and an existing session (exit 3 if none). The view reads only the shared screen state, so watching never blocks the commands the agent is running, and resizing the window just re-fits the frame.

Agents

Command Description
usage Compact command cheatsheet.
agent-context Versioned JSON describing every command, flag, enum, default, and the exit-code taxonomy (generated from the CLI, so it can't drift).
skill Long-form workflow guide (SKILL.md).

Exit codes

Every command returns a stable exit code so an agent can branch on the failure class without parsing text:

Code Meaning
0 success
1 assertion or wait condition not met (expect/wait)
2 usage / invalid argument
3 no active session (run open/run first)
4 daemon or IPC error
5 internal error

With --json, failures also carry a "kind" field (assertion/usage/no_session/internal).

Supported shells

  • bash
  • zsh
  • fish
  • PowerShell (powershell and pwsh)
  • xonsh
  • elvish
  • nushell
  • cmd

Comparison

shell-use shares its shape with two related tools, both linked in the table:

shell-use tui-use terminal-use
Language Rust TypeScript/Node Rust
Emulator alacritty xterm (headless) alacritty
Primary target shells and TUI apps REPLs / debuggers / TUI apps TUI apps
Shell command tracking (OSC 133/633) βœ… command boundaries, exit codes, cwd ❌ ❌
Testing / snapshots βœ… expect text / output / exit-code / snapshot ❌ ❌
Color & per-cell attributes βœ… fg/bg, ANSI-256/hex/rgb, cells ❌ plain text (+ highlights) via PNG
Image screenshots βœ… SVG ❌ βœ… PNG
Built-in recording βœ… always-on asciinema cast + GIF ❌ ❌
Live monitor view βœ… ❌ βœ…
Stable exit-code taxonomy for agents βœ… ❌ ❌
Runtime native Node.js native
Platforms Windows + Unix Windows + Unix Linux / macOS

The difference is testing: shell-use tracks command boundaries and exit codes across every supported shell, then adds assertions, snapshots, color checks, and an always-on recording on top of the usual drive-and-read loop. It ships as one self-contained native binary with no runtime to install, where a Node tool like tui-use pulls in a full Node.js runtime and its native modules.

Debugging

By default the daemon writes no log. Start it with --verbose to record every byte read from and written to the PTY, plus lifecycle events, to ~/.shell-use/<session>.log:

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos are subject to those third-party's policies.