From 6abff315dc4aada9ddcd45a7cbe9496136bebac4 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Micha=C5=82=20Pierzcha=C5=82a?= Date: Wed, 25 Mar 2026 19:01:37 +0100 Subject: [PATCH 01/13] docs: tighten agent-device skill defaults --- skills/agent-device/SKILL.md | 35 ++++++++++++++---- .../references/bootstrap-install.md | 2 +- skills/agent-device/references/exploration.md | 37 +++++++++++++++---- 3 files changed, 57 insertions(+), 17 deletions(-) diff --git a/skills/agent-device/SKILL.md b/skills/agent-device/SKILL.md index 2008beb4a..e6c2ff7c2 100644 --- a/skills/agent-device/SKILL.md +++ b/skills/agent-device/SKILL.md @@ -5,29 +5,48 @@ description: Automates interactions for Apple-platform apps (iOS, tvOS, macOS) a # agent-device -Use this skill as a router. +Use this skill as a router with mandatory defaults. Read this file first. For normal device tasks, then load `references/bootstrap-install.md` and `references/exploration.md` before acting. + +## Default operating rules + +- Start conservative. Prefer read-only inspection before mutating the UI. +- Use plain `snapshot` when the task is to verify what text or structure is currently visible on screen. +- Use `snapshot -i` only when you need interactive refs such as `@e3` for a requested action or targeted query. +- Avoid speculative mutations. You may take the smallest reversible UI action needed to unblock inspection or complete the requested task, such as dismissing a popup, closing an alert, or clearing an unintended surface. +- Do not browse the web or use external sources unless the user explicitly asks. +- Re-snapshot after meaningful UI changes instead of reusing stale refs. +- Prefer `@ref` or selector targeting over raw coordinates. +- Keep the loop short: `open` -> inspect/act -> verify if needed -> `close`. + +## Default flow + +1. Choose the correct target and open the app or session you want to work on. +2. Start with plain `snapshot` if the goal is to read or verify what is visible. +3. 
Escalate to `snapshot -i` only if you need refs for interactive exploration or a requested action. +4. Use `get`, `is`, or `find` before mutating the UI when a read-only command can answer the question. +5. End by capturing proof if needed, then `close`. ## QA modes - Open-ended bug hunt with reporting: use [../dogfood/SKILL.md](../dogfood/SKILL.md). - Pass/fail QA from acceptance criteria: stay in this skill, start with [references/bootstrap-install.md](references/bootstrap-install.md), then use the QA loop in [references/exploration.md](references/exploration.md). -## Mental model +## Required references -- First choose the correct target and open the app or session you want to work on. -- Then inspect the current UI with `snapshot -i` and pick targets from the actual UI state. -- Act with `press`, `fill`, `get`, `is`, `wait`, or `find`. -- Re-snapshot after meaningful UI changes instead of reusing stale refs. -- End by capturing proof if needed, then `close`. +- For every normal device task, after reading this file, load [references/bootstrap-install.md](references/bootstrap-install.md) first, then [references/exploration.md](references/exploration.md), before acting. +- Load additional references only when their scope is needed. ## Decision rules - Use plain `snapshot` when you need to verify whether text is visible. - Use `snapshot -i` mainly for interactive exploration and choosing refs. +- Use `get`, `is`, or `find` when they can answer the question without changing UI state. - Use `fill` to replace text. - Use `type` to append text. +- Use the smallest unblock action first when transient UI blocks inspection, but do not navigate, search, or enter new text just to make the UI reveal data unless the user asked for that interaction. +- Do not use external lookups to compensate for missing on-screen data unless the user asked for them. 
+- If the needed information is not exposed on screen, say that plainly instead of compensating with extra navigation, text entry, or web search. - Prefer `@ref` or selector targeting over raw coordinates. -- Keep the default loop short: `open` -> explore/act -> optional debug or verify -> `close`. ## Choose a reference diff --git a/skills/agent-device/references/bootstrap-install.md b/skills/agent-device/references/bootstrap-install.md index 181272799..4ed81646a 100644 --- a/skills/agent-device/references/bootstrap-install.md +++ b/skills/agent-device/references/bootstrap-install.md @@ -22,7 +22,7 @@ Do not start acting before you have pinned the correct target and opened an `app ```bash agent-device ensure-simulator --platform ios --device "iPhone 17 Pro" --boot agent-device open MyApp --platform ios --device "iPhone 17 Pro" --relaunch -agent-device snapshot -i +agent-device snapshot agent-device close ``` diff --git a/skills/agent-device/references/exploration.md b/skills/agent-device/references/exploration.md index 43cc0ab2e..ed53707b6 100644 --- a/skills/agent-device/references/exploration.md +++ b/skills/agent-device/references/exploration.md @@ -4,16 +4,33 @@ Open this file when the app or screen is already running and you need to discover the UI, choose targets, read state, wait for conditions, or perform normal interactions. +## Read-only first + +- If the question is what text, labels, or structure is visible on screen, start with plain `snapshot`. +- Escalate to `snapshot -i` only when you need refs such as `@e3` for interactive exploration or a requested action. +- Prefer `get`, `is`, or `find` before mutating the UI when a read-only command can answer the question. +- You may take the smallest reversible UI action needed to unblock inspection, such as dismissing a popup, closing an alert, or backing out of an unintended surface. +- Do not type or fill text just to make hidden information easier to access unless the user asked for that interaction. 
+- Do not use external sources to infer missing UI state unless the user explicitly asked. +- If the answer is not visible or exposed in the UI, report that gap instead of compensating with search, navigation, or text entry. + +## Decision shortcut + +- User asks what is visible on screen: `snapshot` +- User asks for exact text from a known target: `get text` +- User asks you to tap, type, or choose an element: `snapshot -i`, then act +- UI does not expose the answer: say so plainly; do not browse or force the app into a new state unless asked + ## Main commands to reach for first - `snapshot` -- `snapshot -i` -- `press` -- `fill` - `get` - `is` -- `wait` - `find` +- `wait` +- `snapshot -i` +- `press` +- `fill` ## Most common mistake to avoid @@ -23,9 +40,7 @@ Do not treat `@ref` values as durable after navigation or dynamic updates. Re-sn ```bash agent-device open Settings --platform ios -agent-device snapshot -i -agent-device press @e3 -agent-device wait visible 'label="Privacy & Security"' 3000 +agent-device snapshot agent-device get text 'label="Privacy & Security"' agent-device close ``` @@ -34,7 +49,7 @@ agent-device close - Use plain `snapshot` when you only need to verify whether visible text or structure is on screen. - Use `snapshot -i` when you need refs such as `@e3` for interactive exploration. -- Treat large text-surface lines in `snapshot -i` as discovery output. If a node shows preview/truncation metadata, use `get text @ref` to expand the actual text after you choose the surface. +- Treat large text-surface lines in `snapshot -i` as discovery output. If a node shows preview or truncation metadata, use `get text @ref` only after you have already decided that `snapshot -i` is needed for that surface. - Use `snapshot -i -s "Camera"` or `snapshot -i -s @e3` when you want a smaller, scoped result. Example: @@ -74,6 +89,7 @@ agent-device is visible 'id="camera_settings_anchor"' - Use `fill` to replace text in an editable field. 
- Use `type` to append text to the current insertion point. +- Do not use `fill` or `type` just to make the app reveal information that is not currently visible unless the user asked for that interaction. ## Query and sync rules @@ -109,6 +125,11 @@ Anti-hallucination rules: - Discover them first with `devices`, `open`, `snapshot -i`, `find`, or `session list`. - If refs drift after navigation, re-snapshot or switch to selectors instead of guessing. +Common failure pattern to avoid: + +- Wrong for a visible-text question: `snapshot -i` -> `get text @ref` -> web search -> type into a search box +- Right: `snapshot` first; if the text is not visible or exposed, report that directly + Canonical QA loop: ```bash From 90dc9ab11c19561e293e9c3df3b95a6295b5f744 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Micha=C5=82=20Pierzcha=C5=82a?= Date: Wed, 25 Mar 2026 19:06:13 +0100 Subject: [PATCH 02/13] docs: split exploration skill loops --- skills/agent-device/references/exploration.md | 19 +++++++++++++++++-- 1 file changed, 17 insertions(+), 2 deletions(-) diff --git a/skills/agent-device/references/exploration.md b/skills/agent-device/references/exploration.md index ed53707b6..e29ca3b2f 100644 --- a/skills/agent-device/references/exploration.md +++ b/skills/agent-device/references/exploration.md @@ -36,15 +36,30 @@ Open this file when the app or screen is already running and you need to discove Do not treat `@ref` values as durable after navigation or dynamic updates. Re-snapshot after the UI changes, and switch to selectors when the flow must stay stable. 
-## Canonical loop +## Common loops + +### Interactive exploration loop ```bash agent-device open Settings --platform ios -agent-device snapshot +agent-device snapshot -i +agent-device press @e3 +agent-device wait visible 'label="Privacy & Security"' 3000 agent-device get text 'label="Privacy & Security"' agent-device close ``` +### Screen verification loop + +```bash +agent-device open Settings --platform ios +agent-device snapshot +agent-device press 'label="Privacy & Security"' +agent-device diff snapshot +agent-device snapshot +agent-device close +``` + ## Snapshot choices - Use plain `snapshot` when you only need to verify whether visible text or structure is on screen. From 57eca7e16bbd8fa4b8b8e262300fc436a7bd84e1 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Micha=C5=82=20Pierzcha=C5=82a?= Date: Wed, 25 Mar 2026 19:08:30 +0100 Subject: [PATCH 03/13] docs: require interactive snapshots before actions --- skills/agent-device/references/exploration.md | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/skills/agent-device/references/exploration.md b/skills/agent-device/references/exploration.md index e29ca3b2f..1ab00ca0c 100644 --- a/skills/agent-device/references/exploration.md +++ b/skills/agent-device/references/exploration.md @@ -8,6 +8,7 @@ Open this file when the app or screen is already running and you need to discove - If the question is what text, labels, or structure is visible on screen, start with plain `snapshot`. - Escalate to `snapshot -i` only when you need refs such as `@e3` for interactive exploration or a requested action. +- If you intend to `press`, `fill`, or otherwise interact, start with `snapshot -i` and fall back to plain `snapshot` only if interactive refs are unavailable. - Prefer `get`, `is`, or `find` before mutating the UI when a read-only command can answer the question. 
- You may take the smallest reversible UI action needed to unblock inspection, such as dismissing a popup, closing an alert, or backing out of an unintended surface. - Do not type or fill text just to make hidden information easier to access unless the user asked for that interaction. @@ -53,8 +54,8 @@ agent-device close ```bash agent-device open Settings --platform ios -agent-device snapshot -agent-device press 'label="Privacy & Security"' +agent-device snapshot -i +agent-device press @e3 agent-device diff snapshot agent-device snapshot agent-device close @@ -63,7 +64,7 @@ agent-device close ## Snapshot choices - Use plain `snapshot` when you only need to verify whether visible text or structure is on screen. -- Use `snapshot -i` when you need refs such as `@e3` for interactive exploration. +- Use `snapshot -i` when you need refs such as `@e3` for interactive exploration or for an intended interaction. - Treat large text-surface lines in `snapshot -i` as discovery output. If a node shows preview or truncation metadata, use `get text @ref` only after you have already decided that `snapshot -i` is needed for that surface. - Use `snapshot -i -s "Camera"` or `snapshot -i -s @e3` when you want a smaller, scoped result. From ba7013b24f5c45f83dd4fb1062ff8cdef38bb8e7 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Micha=C5=82=20Pierzcha=C5=82a?= Date: Wed, 25 Mar 2026 19:11:42 +0100 Subject: [PATCH 04/13] docs: clarify exploration loops are examples --- skills/agent-device/references/exploration.md | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/skills/agent-device/references/exploration.md b/skills/agent-device/references/exploration.md index 1ab00ca0c..888249968 100644 --- a/skills/agent-device/references/exploration.md +++ b/skills/agent-device/references/exploration.md @@ -37,7 +37,9 @@ Open this file when the app or screen is already running and you need to discove Do not treat `@ref` values as durable after navigation or dynamic updates. 
Re-snapshot after the UI changes, and switch to selectors when the flow must stay stable. -## Common loops +## Common example loops + +These are examples, not required exact sequences. Adapt them to the app, state, and task at hand. ### Interactive exploration loop @@ -53,11 +55,10 @@ agent-device close ### Screen verification loop ```bash -agent-device open Settings --platform ios -agent-device snapshot -i -agent-device press @e3 -agent-device diff snapshot +agent-device open MyApp --platform ios +# perform the necessary actions to reach the state you need to verify agent-device snapshot +# verify whether the expected element or text is present agent-device close ``` From 03495aadc14f0cbe00289be0f1b0e3a6f936adb7 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Micha=C5=82=20Pierzcha=C5=82a?= Date: Wed, 25 Mar 2026 19:14:14 +0100 Subject: [PATCH 05/13] docs: simplify exploration anti-pattern wording --- skills/agent-device/references/exploration.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/skills/agent-device/references/exploration.md b/skills/agent-device/references/exploration.md index 888249968..aba72f2f2 100644 --- a/skills/agent-device/references/exploration.md +++ b/skills/agent-device/references/exploration.md @@ -142,10 +142,10 @@ Anti-hallucination rules: - Discover them first with `devices`, `open`, `snapshot -i`, `find`, or `session list`. - If refs drift after navigation, re-snapshot or switch to selectors instead of guessing. -Common failure pattern to avoid: +Avoid this escalation path for visible-text questions: -- Wrong for a visible-text question: `snapshot -i` -> `get text @ref` -> web search -> type into a search box -- Right: `snapshot` first; if the text is not visible or exposed, report that directly +- Do not jump from `snapshot -i` to `get text @ref`, then to web search, then to typing into a search box just to force the app to reveal the answer. +- Start with `snapshot`. 
If the text is not visible or exposed, report that directly. Canonical QA loop: From 7702e24601a9e380c1d88fe4a99308bc634c9352 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Micha=C5=82=20Pierzcha=C5=82a?= Date: Wed, 25 Mar 2026 19:17:27 +0100 Subject: [PATCH 06/13] docs: reframe bootstrap setup examples --- skills/agent-device/references/bootstrap-install.md | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/skills/agent-device/references/bootstrap-install.md b/skills/agent-device/references/bootstrap-install.md index 4ed81646a..fef46183e 100644 --- a/skills/agent-device/references/bootstrap-install.md +++ b/skills/agent-device/references/bootstrap-install.md @@ -17,13 +17,15 @@ Open this file when you still need to choose the right target, start the right s Do not start acting before you have pinned the correct target and opened an `app` session. In mixed-device environments, always pass `--device`, `--udid`, or `--serial`. -## Canonical loop +## Common starting points + +These are examples, not required exact sequences. Use the smallest setup flow that matches the task. 
+ +### Boot a simulator and open an app ```bash agent-device ensure-simulator --platform ios --device "iPhone 17 Pro" --boot agent-device open MyApp --platform ios --device "iPhone 17 Pro" --relaunch -agent-device snapshot -agent-device close ``` ## Choose the right starting point From f7423c043c20ef95460a093b9d556100ee86d952 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Micha=C5=82=20Pierzcha=C5=82a?= Date: Wed, 25 Mar 2026 19:22:07 +0100 Subject: [PATCH 07/13] docs: clarify deterministic bootstrap routing --- skills/agent-device/SKILL.md | 21 ++++++++++++------- .../references/bootstrap-install.md | 6 +++++- 2 files changed, 18 insertions(+), 9 deletions(-) diff --git a/skills/agent-device/SKILL.md b/skills/agent-device/SKILL.md index e6c2ff7c2..ff4970401 100644 --- a/skills/agent-device/SKILL.md +++ b/skills/agent-device/SKILL.md @@ -5,7 +5,7 @@ description: Automates interactions for Apple-platform apps (iOS, tvOS, macOS) a # agent-device -Use this skill as a router with mandatory defaults. Read this file first. For normal device tasks, then load `references/bootstrap-install.md` and `references/exploration.md` before acting. +Use this skill as a router with mandatory defaults. Read this file first. If target, app, or session readiness is uncertain, load `references/bootstrap-install.md` first. Once the app session is open and stable, use `references/exploration.md` for inspection and interaction. ## Default operating rules @@ -16,24 +16,28 @@ Use this skill as a router with mandatory defaults. Read this file first. For no - Do not browse the web or use external sources unless the user explicitly asks. - Re-snapshot after meaningful UI changes instead of reusing stale refs. - Prefer `@ref` or selector targeting over raw coordinates. +- Ensure the correct target is pinned and an app session is open before interacting. - Keep the loop short: `open` -> inspect/act -> verify if needed -> `close`. ## Default flow -1. 
Choose the correct target and open the app or session you want to work on. -2. Start with plain `snapshot` if the goal is to read or verify what is visible. -3. Escalate to `snapshot -i` only if you need refs for interactive exploration or a requested action. -4. Use `get`, `is`, or `find` before mutating the UI when a read-only command can answer the question. -5. End by capturing proof if needed, then `close`. +1. Decide whether the correct target, app install, and app session are already ready. +2. If readiness is uncertain, or there is no simulator, device, app install, or open app session yet, load [references/bootstrap-install.md](references/bootstrap-install.md) and establish that deterministically. +3. Once the app session is open and stable, load [references/exploration.md](references/exploration.md). +4. Start with plain `snapshot` if the goal is to read or verify what is visible. +5. Escalate to `snapshot -i` only if you need refs for interactive exploration or a requested action. +6. Use `get`, `is`, or `find` before mutating the UI when a read-only command can answer the question. +7. End by capturing proof if needed, then `close`. ## QA modes - Open-ended bug hunt with reporting: use [../dogfood/SKILL.md](../dogfood/SKILL.md). - Pass/fail QA from acceptance criteria: stay in this skill, start with [references/bootstrap-install.md](references/bootstrap-install.md), then use the QA loop in [references/exploration.md](references/exploration.md). -## Required references +## Deterministic routing -- For every normal device task, after reading this file, load [references/bootstrap-install.md](references/bootstrap-install.md) first, then [references/exploration.md](references/exploration.md), before acting. +- Load [references/bootstrap-install.md](references/bootstrap-install.md) when target, install, open, or session readiness is uncertain, especially in sandbox or cloud environments. 
+- Load [references/exploration.md](references/exploration.md) once the app session is open and stable. - Load additional references only when their scope is needed. ## Decision rules @@ -43,6 +47,7 @@ Use this skill as a router with mandatory defaults. Read this file first. For no - Use `get`, `is`, or `find` when they can answer the question without changing UI state. - Use `fill` to replace text. - Use `type` to append text. +- If there is no simulator, no app install, or no open app session yet, switch to `bootstrap-install.md` instead of improvising setup steps. - Use the smallest unblock action first when transient UI blocks inspection, but do not navigate, search, or enter new text just to make the UI reveal data unless the user asked for that interaction. - Do not use external lookups to compensate for missing on-screen data unless the user asked for them. - If the needed information is not exposed on screen, say that plainly instead of compensating with extra navigation, text entry, or web search. diff --git a/skills/agent-device/references/bootstrap-install.md b/skills/agent-device/references/bootstrap-install.md index fef46183e..860c715ad 100644 --- a/skills/agent-device/references/bootstrap-install.md +++ b/skills/agent-device/references/bootstrap-install.md @@ -2,7 +2,7 @@ ## When to open this file -Open this file when you still need to choose the right target, start the right session, install or relaunch the app, or pin automation to one device before interacting. +Open this file when you still need to choose the right target, start the right session, install or relaunch the app, or pin automation to one device before interacting. This is the deterministic setup layer for sandbox, cloud, or other environments where install paths, device state, or app readiness may be uncertain. 
## Main commands to reach for first @@ -17,6 +17,10 @@ Open this file when you still need to choose the right target, start the right s Do not start acting before you have pinned the correct target and opened an `app` session. In mixed-device environments, always pass `--device`, `--udid`, or `--serial`. +## Deterministic setup rule + +If there is no simulator, no app install, no open app session, or any uncertainty about where the app should come from, stay in this file and use deterministic setup commands or bootstrap scripts first. Do not improvise install paths or app-launch flows while exploring. + ## Common starting points These are examples, not required exact sequences. Use the smallest setup flow that matches the task. From 0346b15db46b54f600327aae73469a4bc811608d Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Micha=C5=82=20Pierzcha=C5=82a?= Date: Wed, 25 Mar 2026 19:25:15 +0100 Subject: [PATCH 08/13] docs: add app discovery to bootstrap flow --- skills/agent-device/references/bootstrap-install.md | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/skills/agent-device/references/bootstrap-install.md b/skills/agent-device/references/bootstrap-install.md index 860c715ad..1b4c546eb 100644 --- a/skills/agent-device/references/bootstrap-install.md +++ b/skills/agent-device/references/bootstrap-install.md @@ -7,6 +7,7 @@ Open this file when you still need to choose the right target, start the right s ## Main commands to reach for first - `devices` +- `apps` - `ensure-simulator` - `open` - `install` or `reinstall` @@ -21,6 +22,12 @@ Do not start acting before you have pinned the correct target and opened an `app If there is no simulator, no app install, no open app session, or any uncertainty about where the app should come from, stay in this file and use deterministic setup commands or bootstrap scripts first. Do not improvise install paths or app-launch flows while exploring. 
+## If `open` fails + +- If `open ` fails, or you are not sure which app name is available on the target, run `agent-device apps` first and choose from the discovered app list instead of guessing. +- Use `apps --platform ` together with `--device`, `--udid`, or `--serial` when target selection matters. +- Once you have the correct app name, retry `open` with that exact value. + ## Common starting points These are examples, not required exact sequences. Use the smallest setup flow that matches the task. From 3f87f45278e05ed6d8a557d60be1cbccc2a54a65 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Micha=C5=82=20Pierzcha=C5=82a?= Date: Wed, 25 Mar 2026 19:26:46 +0100 Subject: [PATCH 09/13] docs: require loading bootstrap and exploration --- skills/agent-device/SKILL.md | 15 ++++++++------- .../agent-device/references/bootstrap-install.md | 2 ++ 2 files changed, 10 insertions(+), 7 deletions(-) diff --git a/skills/agent-device/SKILL.md b/skills/agent-device/SKILL.md index ff4970401..2250473be 100644 --- a/skills/agent-device/SKILL.md +++ b/skills/agent-device/SKILL.md @@ -5,7 +5,7 @@ description: Automates interactions for Apple-platform apps (iOS, tvOS, macOS) a # agent-device -Use this skill as a router with mandatory defaults. Read this file first. If target, app, or session readiness is uncertain, load `references/bootstrap-install.md` first. Once the app session is open and stable, use `references/exploration.md` for inspection and interaction. +Use this skill as a router with mandatory defaults. Read this file first. For normal device tasks, always load `references/bootstrap-install.md` and `references/exploration.md` before acting. Use bootstrap to confirm or establish deterministic setup. Use exploration for UI inspection, interaction, and verification once the app session is open. ## Default operating rules @@ -21,9 +21,9 @@ Use this skill as a router with mandatory defaults. Read this file first. If tar ## Default flow -1. 
Decide whether the correct target, app install, and app session are already ready. -2. If readiness is uncertain, or there is no simulator, device, app install, or open app session yet, load [references/bootstrap-install.md](references/bootstrap-install.md) and establish that deterministically. -3. Once the app session is open and stable, load [references/exploration.md](references/exploration.md). +1. Load [references/bootstrap-install.md](references/bootstrap-install.md) and [references/exploration.md](references/exploration.md) before acting on a normal device task. +2. Use bootstrap first to confirm or establish the correct target, app install, and open app session. +3. Once the app session is open and stable, use exploration for inspection, interaction, and verification. 4. Start with plain `snapshot` if the goal is to read or verify what is visible. 5. Escalate to `snapshot -i` only if you need refs for interactive exploration or a requested action. 6. Use `get`, `is`, or `find` before mutating the UI when a read-only command can answer the question. @@ -34,10 +34,11 @@ Use this skill as a router with mandatory defaults. Read this file first. If tar - Open-ended bug hunt with reporting: use [../dogfood/SKILL.md](../dogfood/SKILL.md). - Pass/fail QA from acceptance criteria: stay in this skill, start with [references/bootstrap-install.md](references/bootstrap-install.md), then use the QA loop in [references/exploration.md](references/exploration.md). -## Deterministic routing +## Required references -- Load [references/bootstrap-install.md](references/bootstrap-install.md) when target, install, open, or session readiness is uncertain, especially in sandbox or cloud environments. -- Load [references/exploration.md](references/exploration.md) once the app session is open and stable. 
+- For every normal device task, after reading this file, load [references/bootstrap-install.md](references/bootstrap-install.md) first, then [references/exploration.md](references/exploration.md), before acting. +- Use bootstrap to confirm or establish deterministic setup, especially in sandbox or cloud environments. +- Use exploration once the app session is open and stable. - Load additional references only when their scope is needed. ## Decision rules diff --git a/skills/agent-device/references/bootstrap-install.md b/skills/agent-device/references/bootstrap-install.md index 1b4c546eb..b650d2f40 100644 --- a/skills/agent-device/references/bootstrap-install.md +++ b/skills/agent-device/references/bootstrap-install.md @@ -22,6 +22,8 @@ Do not start acting before you have pinned the correct target and opened an `app If there is no simulator, no app install, no open app session, or any uncertainty about where the app should come from, stay in this file and use deterministic setup commands or bootstrap scripts first. Do not improvise install paths or app-launch flows while exploring. +After setup is confirmed or completed, move to `exploration.md` before doing UI inspection or interaction. + ## If `open` fails - If `open ` fails, or you are not sure which app name is available on the target, run `agent-device apps` first and choose from the discovered app list instead of guessing. 
From 6bb1cef493f66016ba0cd1bb7009d6a238999c80 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Micha=C5=82=20Pierzcha=C5=82a?= Date: Wed, 25 Mar 2026 19:33:30 +0100 Subject: [PATCH 10/13] docs: tighten bootstrap setup guidance --- .../references/bootstrap-install.md | 37 +++++++++++++++---- .../agent-device/references/remote-tenancy.md | 12 ++++++ 2 files changed, 41 insertions(+), 8 deletions(-) diff --git a/skills/agent-device/references/bootstrap-install.md b/skills/agent-device/references/bootstrap-install.md index b650d2f40..649fdd4d5 100644 --- a/skills/agent-device/references/bootstrap-install.md +++ b/skills/agent-device/references/bootstrap-install.md @@ -9,8 +9,8 @@ Open this file when you still need to choose the right target, start the right s - `devices` - `apps` - `ensure-simulator` -- `open` - `install` or `reinstall` +- `open` - `close` - `session list` @@ -41,13 +41,34 @@ agent-device ensure-simulator --platform ios --device "iPhone 17 Pro" --boot agent-device open MyApp --platform ios --device "iPhone 17 Pro" --relaunch ``` +### Install an app artifact, then open it + +```bash +agent-device install com.example.app ./build/app.apk --platform android --serial emulator-5554 +agent-device open com.example.app --platform android --serial emulator-5554 +``` + +```bash +agent-device install com.example.app ./build/MyApp.app --platform ios --device "iPhone 17 Pro" +agent-device open MyApp --platform ios --device "iPhone 17 Pro" +``` + +## Install guidance + +- Use `install ` when the app may already be installed and you do not need a fresh-state reset. +- Use `reinstall ` when you explicitly need uninstall plus install as one deterministic step. +- Supported binary formats: + - Android: `.apk` and `.aab` + - iOS: `.app` and `.ipa` +- For iOS `.ipa` files, `` is used as the bundle id or bundle name hint when the archive contains multiple app bundles. 
+- After install or reinstall, use `open ` with the installed app name or package/bundle identifier, not the artifact path. + ## Choose the right starting point - iOS local QA: prefer simulators unless the task explicitly requires physical hardware. - iOS in mixed simulator and device environments: run `ensure-simulator` first, then keep using `--device` or `--udid`. - TV targets: use `--target tv` together with `--platform` when the task is for tvOS or Android TV rather than phone or tablet surfaces. - Android binary flow: use `install` or `reinstall` for `.apk` or `.aab`, then open by installed package name. -- Android React Native plus Metro flow: `reinstall ` first, then `open --remote-config --relaunch`. - macOS desktop app flow: use `open --platform macos`. Only load [macos-desktop.md](macos-desktop.md) if a desktop surface or macOS-specific behavior matters. TV example: @@ -110,8 +131,6 @@ export AGENT_DEVICE_PLATFORM=ios export AGENT_DEVICE_SESSION_LOCK=strip agent-device open MyApp --relaunch -agent-device snapshot -i -agent-device close ``` - `AGENT_DEVICE_SESSION` plus `AGENT_DEVICE_PLATFORM` provides the default binding. 
@@ -126,10 +145,7 @@ Android emulator variant:
 export AGENT_DEVICE_SESSION=qa-android
 export AGENT_DEVICE_PLATFORM=android

-agent-device reinstall MyApp /path/to/app-debug.apk --serial emulator-5554
 agent-device --session-lock reject open com.example.myapp --relaunch
-agent-device snapshot -i
-agent-device close --shutdown
 ```

 ## Scoped discovery
@@ -170,7 +186,12 @@ agent-device replay -u ./session.ad --session auth

 ```bash
 agent-device reinstall MyApp /path/to/app-debug.apk --platform android --serial emulator-5554
-agent-device open com.example.myapp --remote-config ./agent-device.remote.json --relaunch
+agent-device open com.example.myapp --platform android --serial emulator-5554
+```
+
+```bash
+agent-device install com.example.app ./build/MyApp.ipa --platform ios --device "iPhone 17 Pro"
+agent-device open MyApp --platform ios --device "iPhone 17 Pro"
 ```

 Do not use `open --relaunch` on Android.
diff --git a/skills/agent-device/references/remote-tenancy.md b/skills/agent-device/references/remote-tenancy.md
index 36e355a12..b9e927ca2 100644
--- a/skills/agent-device/references/remote-tenancy.md
+++ b/skills/agent-device/references/remote-tenancy.md
@@ -67,6 +67,18 @@ curl -sS "${AGENT_DEVICE_DAEMON_BASE_URL}/rpc" \
 - Direct JSON-RPC callers can authenticate with request params, `Authorization: Bearer `, or `x-agent-device-token`.
 - Prefer an auth hook such as `AGENT_DEVICE_HTTP_AUTH_HOOK` when the host needs caller validation or tenant injection.

+## Remote Metro-backed launch
+
+Use this when the agent must launch a remote React Native app through a checked-in `--remote-config` profile rather than a purely local bootstrap flow.
+
+```bash
+agent-device open com.example.myapp --remote-config ./agent-device.remote.json --relaunch
+```
+
+- This is the main remote Metro-backed launch path for sandbox or cloud agents.
+- For Android React Native relaunch flows, install or reinstall the APK first, then relaunch by installed package name.
+- Do not use `open --relaunch`; remote runtime hints are applied through the installed app sandbox.
+
 ## Lease lifecycle

 Use JSON-RPC methods on `POST /rpc`:

From cb27cac642e01c68933c3ba14be5d22df9a6287d Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Micha=C5=82=20Pierzcha=C5=82a?=
Date: Wed, 25 Mar 2026 19:35:36 +0100
Subject: [PATCH 11/13] docs: prefer open before install on first attempt

---
 skills/agent-device/references/bootstrap-install.md | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/skills/agent-device/references/bootstrap-install.md b/skills/agent-device/references/bootstrap-install.md
index 649fdd4d5..c9b4ee0ca 100644
--- a/skills/agent-device/references/bootstrap-install.md
+++ b/skills/agent-device/references/bootstrap-install.md
@@ -24,6 +24,13 @@ If there is no simulator, no app install, no open app session, or any uncertaint

 After setup is confirmed or completed, move to `exploration.md` before doing UI inspection or interaction.

+## First attempt rule
+
+- If the user asks to test an app and does not provide an install artifact or explicit install instruction, try `open ` first.
+- If `open ` fails, run `agent-device apps` and retry with a discovered app name before considering install steps.
+- Do not install or reinstall on the first attempt unless the user explicitly asks for installation or provides a concrete artifact path or URL.
+- When installation is required from a known location, prefer a checked-in shell script or other deterministic bootstrap command over ad hoc path guessing.
+
 ## If `open` fails

 - If `open ` fails, or you are not sure which app name is available on the target, run `agent-device apps` first and choose from the discovered app list instead of guessing.
From 3907bb3a6b86a17695902c3b34a05c5b157e967b Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Micha=C5=82=20Pierzcha=C5=82a?=
Date: Wed, 25 Mar 2026 19:44:25 +0100
Subject: [PATCH 12/13] docs: simplify agent-device skill routing

---
 skills/agent-device/SKILL.md | 6 ++----
 .../agent-device/references/bootstrap-install.md | 15 ++++++---------
 skills/agent-device/references/exploration.md | 8 ++++++--
 skills/agent-device/references/remote-tenancy.md | 2 +-
 skills/agent-device/references/verification.md | 3 +--
 5 files changed, 16 insertions(+), 18 deletions(-)

diff --git a/skills/agent-device/SKILL.md b/skills/agent-device/SKILL.md
index 2250473be..d32e6cf08 100644
--- a/skills/agent-device/SKILL.md
+++ b/skills/agent-device/SKILL.md
@@ -54,11 +54,9 @@ Use this skill as a router with mandatory defaults. Read this file first. For no
 - If the needed information is not exposed on screen, say that plainly instead of compensating with extra navigation, text entry, or web search.
 - Prefer `@ref` or selector targeting over raw coordinates.

-## Choose a reference
+## Additional references

-- Pick target device, install, open, or manage sessions: [references/bootstrap-install.md](references/bootstrap-install.md)
-- Need to discover UI, pick refs, wait, query, or interact: [references/exploration.md](references/exploration.md)
 - Need logs, network, alerts, permissions, or failure triage: [references/debugging.md](references/debugging.md)
 - Need screenshots, diff, recording, replay maintenance, or perf data: [references/verification.md](references/verification.md)
 - Need desktop surfaces, menu bar behavior, or macOS-specific interaction rules: [references/macos-desktop.md](references/macos-desktop.md)
-- Need to connect to a remote `agent-device` daemon over HTTP or use tenant leases: [references/remote-tenancy.md](references/remote-tenancy.md)
+- Need remote HTTP transport, `--remote-config` launches, or tenant leases on a remote macOS host: [references/remote-tenancy.md](references/remote-tenancy.md)
diff --git a/skills/agent-device/references/bootstrap-install.md b/skills/agent-device/references/bootstrap-install.md
index c9b4ee0ca..276235a47 100644
--- a/skills/agent-device/references/bootstrap-install.md
+++ b/skills/agent-device/references/bootstrap-install.md
@@ -9,8 +9,8 @@ Open this file when you still need to choose the right target, start the right s
 - `devices`
 - `apps`
 - `ensure-simulator`
-- `install` or `reinstall`
 - `open`
+- `install` or `reinstall`
 - `close`
 - `session list`

@@ -35,7 +35,7 @@ After setup is confirmed or completed, move to `exploration.md` before doing UI

 - If `open ` fails, or you are not sure which app name is available on the target, run `agent-device apps` first and choose from the discovered app list instead of guessing.
 - Use `apps --platform ` together with `--device`, `--udid`, or `--serial` when target selection matters.
-- Once you have the correct app name, retry `open` with that exact value.
+- Once you have the correct app name, retry `open` with that exact discovered value.

 ## Common starting points

@@ -48,27 +48,26 @@ agent-device ensure-simulator --platform ios --device "iPhone 17 Pro" --boot
 agent-device open MyApp --platform ios --device "iPhone 17 Pro" --relaunch
 ```

-### Install an app artifact, then open it
+### Install an app artifact

 ```bash
 agent-device install com.example.app ./build/app.apk --platform android --serial emulator-5554
-agent-device open com.example.app --platform android --serial emulator-5554
 ```

 ```bash
 agent-device install com.example.app ./build/MyApp.app --platform ios --device "iPhone 17 Pro"
-agent-device open MyApp --platform ios --device "iPhone 17 Pro"
 ```

 ## Install guidance

 - Use `install ` when the app may already be installed and you do not need a fresh-state reset.
 - Use `reinstall ` when you explicitly need uninstall plus install as one deterministic step.
+- Keep install and open as separate phases. Do not turn them into one default command flow.
 - Supported binary formats:
   - Android: `.apk` and `.aab`
   - iOS: `.app` and `.ipa`
 - For iOS `.ipa` files, `` is used as the bundle id or bundle name hint when the archive contains multiple app bundles.
-- After install or reinstall, use `open ` with the installed app name or package/bundle identifier, not the artifact path.
+- After install or reinstall, later use `open ` with the exact discovered or known package/bundle identifier, not the artifact path.

 ## Choose the right starting point

@@ -189,16 +188,14 @@ agent-device replay -u ./session.ad --session auth
 - Once the correct target and session are pinned, move to [exploration.md](exploration.md).
 - If opening, startup, permissions, or logs become the blocker, switch to [debugging.md](debugging.md).

-## Install and open examples
+## Install examples

 ```bash
 agent-device reinstall MyApp /path/to/app-debug.apk --platform android --serial emulator-5554
-agent-device open com.example.myapp --platform android --serial emulator-5554
 ```

 ```bash
 agent-device install com.example.app ./build/MyApp.ipa --platform ios --device "iPhone 17 Pro"
-agent-device open MyApp --platform ios --device "iPhone 17 Pro"
 ```

 Do not use `open --relaunch` on Android.
diff --git a/skills/agent-device/references/exploration.md b/skills/agent-device/references/exploration.md
index aba72f2f2..b1ea3c448 100644
--- a/skills/agent-device/references/exploration.md
+++ b/skills/agent-device/references/exploration.md
@@ -22,16 +22,20 @@ Open this file when the app or screen is already running and you need to discove
 - User asks you to tap, type, or choose an element: `snapshot -i`, then act
 - UI does not expose the answer: say so plainly; do not browse or force the app into a new state unless asked

-## Main commands to reach for first
+## Read-only commands

 - `snapshot`
 - `get`
 - `is`
 - `find`
-- `wait`
+
+## Interaction commands
+
 - `snapshot -i`
 - `press`
 - `fill`
+- `type`
+- `wait`

 ## Most common mistake to avoid

diff --git a/skills/agent-device/references/remote-tenancy.md b/skills/agent-device/references/remote-tenancy.md
index b9e927ca2..3247692e2 100644
--- a/skills/agent-device/references/remote-tenancy.md
+++ b/skills/agent-device/references/remote-tenancy.md
@@ -2,7 +2,7 @@

 ## When to open this file

-Open this file only for remote daemon HTTP flows that require explicit daemon URL setup, authentication, lease allocation, or tenant-scoped command admission.
+Open this file for remote daemon HTTP flows, including `--remote-config` launches, that let an agent running in a Linux sandbox talk to another `agent-device` instance on a remote macOS host in order to control devices that are not available locally.
This file covers daemon URL setup, authentication, lease allocation, and tenant-scoped command admission.

 ## Main commands to reach for first

diff --git a/skills/agent-device/references/verification.md b/skills/agent-device/references/verification.md
index 2bb5d5fec..25a99b254 100644
--- a/skills/agent-device/references/verification.md
+++ b/skills/agent-device/references/verification.md
@@ -20,9 +20,8 @@ Do not use verification tools as the first exploration step. First get the app i

 ```bash
 agent-device open Settings --platform ios
+# after using exploration to reach the state you want to verify
 agent-device snapshot -i
-agent-device press @e5
-agent-device diff snapshot -i
 agent-device screenshot /tmp/settings-proof.png
 agent-device close
 ```

From 1ba41721bf11a086ce0aacf8a9b47e866cc7eb62 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Micha=C5=82=20Pierzcha=C5=82a?=
Date: Wed, 25 Mar 2026 19:47:30 +0100
Subject: [PATCH 13/13] docs: simplify weak-model skill guidance

---
 .../references/bootstrap-install.md | 12 ++++-----
 .../agent-device/references/macos-desktop.md | 3 +--
 .../agent-device/references/remote-tenancy.md | 27 ++++++++++---------
 .../agent-device/references/verification.md | 2 +-
 4 files changed, 22 insertions(+), 22 deletions(-)

diff --git a/skills/agent-device/references/bootstrap-install.md b/skills/agent-device/references/bootstrap-install.md
index 276235a47..74423900a 100644
--- a/skills/agent-device/references/bootstrap-install.md
+++ b/skills/agent-device/references/bootstrap-install.md
@@ -4,16 +4,18 @@

 Open this file when you still need to choose the right target, start the right session, install or relaunch the app, or pin automation to one device before interacting. This is the deterministic setup layer for sandbox, cloud, or other environments where install paths, device state, or app readiness may be uncertain.

-## Main commands to reach for first
+## Open-first path

 - `devices`
 - `apps`
 - `ensure-simulator`
 - `open`
-- `install` or `reinstall`
-- `close`
 - `session list`

+## Install path
+
+- `install` or `reinstall`
+
 ## Most common mistake to avoid

 Do not start acting before you have pinned the correct target and opened an `app` session. In mixed-device environments, always pass `--device`, `--udid`, or `--serial`.
@@ -24,15 +26,13 @@ If there is no simulator, no app install, no open app session, or any uncertaint

 After setup is confirmed or completed, move to `exploration.md` before doing UI inspection or interaction.

-## First attempt rule
+## Open-first rule

 - If the user asks to test an app and does not provide an install artifact or explicit install instruction, try `open ` first.
 - If `open ` fails, run `agent-device apps` and retry with a discovered app name before considering install steps.
 - Do not install or reinstall on the first attempt unless the user explicitly asks for installation or provides a concrete artifact path or URL.
 - When installation is required from a known location, prefer a checked-in shell script or other deterministic bootstrap command over ad hoc path guessing.

-## If `open` fails
-
 - If `open ` fails, or you are not sure which app name is available on the target, run `agent-device apps` first and choose from the discovered app list instead of guessing.
 - Use `apps --platform ` together with `--device`, `--udid`, or `--serial` when target selection matters.
 - Once you have the correct app name, retry `open` with that exact discovered value.
diff --git a/skills/agent-device/references/macos-desktop.md b/skills/agent-device/references/macos-desktop.md
index d86051590..e55f1e3eb 100644
--- a/skills/agent-device/references/macos-desktop.md
+++ b/skills/agent-device/references/macos-desktop.md
@@ -21,8 +21,7 @@ Do not treat every macOS surface the same.
Use the normal `app` surface when you

 ```bash
 agent-device open TextEdit --platform macos
-agent-device snapshot -i
-agent-device fill @e3 "desktop smoke test"
+agent-device snapshot
 agent-device close
 ```

diff --git a/skills/agent-device/references/remote-tenancy.md b/skills/agent-device/references/remote-tenancy.md
index 3247692e2..8d5f7fa01 100644
--- a/skills/agent-device/references/remote-tenancy.md
+++ b/skills/agent-device/references/remote-tenancy.md
@@ -6,6 +6,7 @@ Open this file for remote daemon HTTP flows, including `--remote-config` launche

 ## Main commands to reach for first

+- `agent-device open --remote-config --relaunch`
 - `AGENT_DEVICE_DAEMON_BASE_URL=...`
 - `AGENT_DEVICE_DAEMON_AUTH_TOKEN=...`
 - `curl ... agent_device.lease.allocate`
@@ -17,7 +18,19 @@ Open this file for remote daemon HTTP flows, including `--remote-config` launche

 Do not run a tenant-isolated command without matching `tenant`, `run`, and `lease` scope. Admission checks require all three to line up.

-## Canonical loop
+## Preferred remote launch path
+
+Use this when the agent needs the simplest remote control flow: a Linux sandbox agent talks over HTTP to `agent-device` on a remote macOS host and launches the target app through a checked-in `--remote-config` profile.
+
+```bash
+agent-device open com.example.myapp --remote-config ./agent-device.remote.json --relaunch
+```
+
+- This is the preferred remote launch path for sandbox or cloud agents.
+- For Android React Native relaunch flows, install or reinstall the APK first, then relaunch by installed package name.
+- Do not use `open --relaunch`; remote runtime hints are applied through the installed app sandbox.
+
+## Lease flow example

 ```bash
 export AGENT_DEVICE_DAEMON_BASE_URL=http://mac-host.example:4310
@@ -67,18 +80,6 @@ curl -sS "${AGENT_DEVICE_DAEMON_BASE_URL}/rpc" \
 - Direct JSON-RPC callers can authenticate with request params, `Authorization: Bearer `, or `x-agent-device-token`.
 - Prefer an auth hook such as `AGENT_DEVICE_HTTP_AUTH_HOOK` when the host needs caller validation or tenant injection.

-## Remote Metro-backed launch
-
-Use this when the agent must launch a remote React Native app through a checked-in `--remote-config` profile rather than a purely local bootstrap flow.
-
-```bash
-agent-device open com.example.myapp --remote-config ./agent-device.remote.json --relaunch
-```
-
-- This is the main remote Metro-backed launch path for sandbox or cloud agents.
-- For Android React Native relaunch flows, install or reinstall the APK first, then relaunch by installed package name.
-- Do not use `open --relaunch`; remote runtime hints are applied through the installed app sandbox.
-
 ## Lease lifecycle

 Use JSON-RPC methods on `POST /rpc`:
diff --git a/skills/agent-device/references/verification.md b/skills/agent-device/references/verification.md
index 25a99b254..94560956e 100644
--- a/skills/agent-device/references/verification.md
+++ b/skills/agent-device/references/verification.md
@@ -21,7 +21,7 @@ Do not use verification tools as the first exploration step. First get the app i
 ```bash
 agent-device open Settings --platform ios
 # after using exploration to reach the state you want to verify
-agent-device snapshot -i
+agent-device snapshot
 agent-device screenshot /tmp/settings-proof.png
 agent-device close
 ```