Example task files and Dockerfiles for desktest.
A simple test that opens a text file in gedit, adds a line, and saves.
Uses the folder app deploy type with a local application directory.
desktest run examples/gedit-save.json
desktest run examples/gedit-save.json --monitor # Watch live at http://localhost:7860
desktest interactive examples/gedit-save.json

A spreadsheet test that enters values and a formula in LibreOffice Calc.
Uses the docker_image app type with a pre-built custom image.
# Build the custom image first
docker build -t tent-libreoffice:latest -f examples/Dockerfile.libreoffice .
# Run the test
desktest run examples/libreoffice-calc.json
# Or interactively
desktest interactive examples/libreoffice-calc.json

A minimal Electron todo app that demonstrates testing Electron applications.
Uses the folder app deploy type with electron: true for Node.js support.
# Build the electron Docker image first
docker build -t desktest-desktop:latest docker/
docker build -f docker/Dockerfile.electron -t desktest-desktop:electron docker/
# Run the test
desktest run examples/electron-todo.json

See ELECTRON_QUICKSTART.md for a complete guide to testing Electron apps.
A harder test that exercises multi-app coordination: curl a CSV from a local HTTP server in a terminal, open it in gedit, find-and-replace ERROR→FIXED, save, and verify with grep. Tests app switching, dialog navigation, terminal interaction, and multi-step evaluation (4 metrics).
desktest run examples/multi-app-terminal-gedit.json
desktest run examples/multi-app-terminal-gedit.json --monitor

Tests basic text editing on macOS inside a Tart VM. Requires Apple Silicon, Tart, and a golden image prepared with desktest init-macos.
desktest run examples/macos-textedit.json --config config.json

Deploys and tests an Electron todo app inside a Tart VM. Requires the desktest-macos-electron:latest golden image.
desktest run examples/macos-electron.json --config config.json

Same TextEdit test, but using macos_native mode, which runs directly on the host macOS desktop with no VM isolation. Useful for quick local iteration without setting up Tart. Requires a local desktop session (not SSH) with Accessibility, Automation, and Screen Recording permissions granted.
desktest run examples/macos-native-textedit.json --config config.json

See docs/macos-support.md for the full macOS testing guide.
Tests basic Windows Calculator interaction inside a QEMU/KVM VM. Requires a Linux host with KVM, QEMU, and a golden image prepared with desktest init-windows.
desktest run examples/windows-calculator.json --config config.json

See dev-docs/windows-ci-guide.md for the full Windows testing guide.
Dockerfile.libreoffice shows how to create a compatible custom image.
Custom images must include these packages for desktest to work:
| Category | Packages |
|---|---|
| Display | xvfb, x11vnc, xfce4, xfce4-terminal |
| Tools | scrot, xdotool, ffmpeg |
| Accessibility | at-spi2-core, libatspi2.0-0 |
| Python | python3, python3-pyautogui, python3-xlib, python3-pyatspi, python3-pyperclip |
| Clipboard | xclip |
| D-Bus | dbus, dbus-x11 |
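
The package list above can be compared against a Dockerfile before building. The following is a minimal sketch, not part of desktest: `REQUIRED_PACKAGES` and `missing_packages` are hypothetical names, and the check is a plain token search, so it will not recognize version-pinned entries such as `xvfb=1.2` or packages pulled in by a base image.

```shell
#!/usr/bin/env bash
# Hypothetical pre-build check: print every required package whose name never
# appears as a whitespace-separated token in the given Dockerfile.
# The list is copied from the table above.
REQUIRED_PACKAGES="xvfb x11vnc xfce4 xfce4-terminal scrot xdotool ffmpeg at-spi2-core libatspi2.0-0 python3 python3-pyautogui python3-xlib python3-pyatspi python3-pyperclip xclip dbus dbus-x11"

missing_packages() {
  local dockerfile="$1" pkg
  for pkg in $REQUIRED_PACKAGES; do
    # Split the file into one token per line, then look for an exact,
    # whole-line, literal match so "xfce4" is not satisfied by "xfce4-terminal".
    if ! tr -s '[:space:]' '\n' < "$dockerfile" | grep -xF -- "$pkg" > /dev/null; then
      echo "$pkg"
    fi
  done
}
```

For example, `missing_packages examples/Dockerfile.libreoffice` prints one missing package name per line, and prints nothing when every name is found. desktest's own startup validation remains the authoritative check.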
Custom images must also create ~/.Xauthority for the tester user. Without it, PyAutoGUI will crash with Xlib.error.XauthError. Add this after USER tester:
RUN touch /home/tester/.Xauthority

You must also copy the helper scripts from docker/:
- docker/get-a11y-tree.py → /usr/local/bin/get-a11y-tree
- docker/execute-action.py → /usr/local/bin/execute-action
- docker/entrypoint.sh → /usr/local/bin/entrypoint.sh
desktest validates custom images at startup. If a required dependency is missing, it exits with code 2 and a clear error message.
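
In a script or CI job, the exit code described above can be handled separately from other failures. This is a minimal sketch: `run_task` is a hypothetical wrapper, and only the meaning of exit code 2 comes from this document. The meaning of any other non-zero code is not specified here, so the wrapper just reports it.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around `desktest run` that calls out exit code 2,
# which this README associates with a missing image dependency.
run_task() {
  local status=0
  desktest run "$@" || status=$?
  case "$status" in
    0) ;;  # success, nothing to report
    2) echo "desktest exited with code 2: the custom image may be missing a required dependency" >&2 ;;
    *) echo "desktest exited with code $status" >&2 ;;
  esac
  # Pass the original exit code through to the caller.
  return "$status"
}
```

Usage mirrors the plain command, for example `run_task examples/libreoffice-calc.json`.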
# Validate a task file without running
desktest validate examples/libreoffice-calc.json

Any example can be run with --qa to enable bug reporting. The agent will complete its task while also watching for application bugs:
desktest run examples/gedit-save.json --qa

Bug reports are written as markdown files in desktest_artifacts/bugs/. Each report includes a summary, reproduction steps, screenshot references, and diagnostic evidence gathered via bash commands.
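
After a `--qa` run, the reports can be skimmed from the terminal. The sketch below assumes the report files use a `.md` extension (the document only says they are markdown) and that the first non-empty line is a useful title. `list_bug_reports` is a hypothetical helper, not a desktest command.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each bug report path followed by its first
# non-empty line. Defaults to the directory named in this README.
list_bug_reports() {
  local dir="${1:-desktest_artifacts/bugs}" report
  for report in "$dir"/*.md; do
    [ -e "$report" ] || continue  # the glob matched nothing
    # -m1 stops at the first match; -v with this pattern skips blank lines.
    printf '%s: %s\n' "$report" "$(grep -m1 -v '^[[:space:]]*$' "$report")"
  done
}
```

Calling `list_bug_reports` with no argument reads `desktest_artifacts/bugs/` relative to the current directory.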
Any example can be run with the --monitor flag to open a real-time web dashboard:
# Single test with live dashboard
desktest run examples/gedit-save.json --monitor
# Suite with progress tracking
desktest suite examples/ --monitor
# Custom port
desktest run examples/gedit-save.json --monitor --monitor-port 8080

Open http://localhost:7860 in your browser (or the port passed to --monitor-port) to watch the agent's screenshots, thoughts, and actions stream in as each step completes. The dashboard uses the same UI as desktest review.
See src/task.rs for the full schema definition. Key fields:
{
  "schema_version": "1.0",
  "id": "unique-test-id",
  "instruction": "What the agent should do",
  "completion_condition": "Optional: when the agent should consider the task done",
  "app": { "type": "appimage|folder|docker_image|vnc_attach|macos_tart|macos_native|windows_vm|windows_native", "..." : "..." },
  "config": [ { "type": "execute|copy|open|sleep", "..." : "..." } ],
  "evaluator": {
    "mode": "llm|programmatic|hybrid",
    "metrics": [ { "type": "file_exists|command_output|...", "..." : "..." } ]
  },
  "timeout": 120
}
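
When editing several task files against this schema, it can help to validate them in one pass. The sketch below loops the `desktest validate` command shown earlier over a directory. `validate_all` is a hypothetical helper, and it assumes `desktest validate` exits non-zero for an invalid task file, which this document does not state explicitly.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run `desktest validate` on every .json file in a
# directory and return non-zero if any file fails.
# Assumption: `desktest validate` signals an invalid file via its exit code.
validate_all() {
  local dir="${1:-examples}" task failed=0
  for task in "$dir"/*.json; do
    [ -e "$task" ] || continue  # the glob matched nothing
    if desktest validate "$task"; then
      echo "ok:   $task"
    else
      echo "FAIL: $task"
      failed=$((failed + 1))
    fi
  done
  # Succeed only when no file failed validation.
  [ "$failed" -eq 0 ]
}
```

Running `validate_all examples` prints one line per task file, and its exit status can gate a CI step.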