docs: improve agent discoverability for observability (#431)

thymikee · web-flow · commit d6b8b12acf57 · 2026-04-21T09:48:45.000-04:00
* 0.12.9

* docs: improve agent discoverability for observability

* docs: add debugging and profiling guide
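
The new debugging guide recommends using `logs path` and grepping the file rather than loading whole logs into agent context. A minimal sketch of that pattern follows. The helper name `grep_session_log`, the default pattern, and the line cap are illustrative assumptions and not part of `agent-device`.

```shell
#!/usr/bin/env bash
# Sketch: extract only matching lines from a session log file.
# Assumption: the log is a plain-text file; pattern and cap are illustrative.

# grep_session_log LOG_FILE [PATTERN] [MAX_LINES]
# Prints the last MAX_LINES lines of LOG_FILE that match PATTERN (case-insensitive),
# each prefixed with its line number in the file.
grep_session_log() {
  local log_file="$1"
  local pattern="${2:-error|exception|fatal}"
  local max_lines="${3:-20}"

  if [ ! -f "$log_file" ]; then
    echo "log file not found: $log_file" >&2
    return 1
  fi

  # -i: ignore case, -n: show line numbers, -E: extended regex.
  # grep exits 1 when nothing matches; treat that as an empty result.
  grep -inE -- "$pattern" "$log_file" | tail -n "$max_lines" || true
}
```

Hypothetical usage, assuming `agent-device logs path` prints only the log file path on stdout: `grep_session_log "$(agent-device logs path)" "timeout" 10`.
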
diff --git a/README.md b/README.md
@@ -8,7 +8,7 @@
 
 # agent-device
 
-`agent-device` is a CLI for UI automation on iOS, tvOS, macOS, Android, and AndroidTV. It is designed for agent-driven workflows: inspect the UI, act on it deterministically, and keep that work session-aware and replayable.
+`agent-device` is a CLI for UI automation and app observability on iOS, tvOS, macOS, Android, and AndroidTV. It is built for agent-driven workflows: inspect the UI, interact deterministically, collect logs/network/perf evidence when behavior breaks, and keep the whole flow session-aware and replayable.
 
 If you know Vercel's [agent-browser](https://github.com/vercel-labs/agent-browser), this project applies the same broad idea to mobile apps and devices.
 
@@ -19,18 +19,26 @@ If you know Vercel's [agent-browser](https://github.com/vercel-labs/agent-browse
 - Give agents a practical way to understand mobile UI state through structured snapshots.
 - Keep automation flows token-efficient enough for real agent loops.
 - Make common interactions reliable enough for repeated automation runs.
+- Make debugging evidence easy to collect through logs, network inspection, and performance snapshots.
 - Keep automation grounded in sessions, selectors, and replayable flows instead of one-off scripts.
 
 ## Core Ideas
 
 - Sessions: open a target once, interact within that session, then close it cleanly.
 - Snapshots: inspect the current accessibility tree in a compact form and get current-screen refs for exploration.
 - Refs vs selectors: use refs for discovery, use selectors for durable replay and assertions.
+- Observability: collect session logs, inspect recent HTTP traffic with `network dump`, and sample CPU/memory with `perf`.
 - Tests: run deterministic `.ad` scripts as a light e2e test suite.
 - Replay scripts: save `.ad` flows with `--save-script`, replay one script with `replay`, or run a folder/glob as a serial suite with `test`.
   `test` supports metadata-aware retries up to 3 additional attempts, per-test timeouts, flaky pass reporting, and runner-managed artifacts under `.agent-device/test-artifacts` by default. Each attempt writes `replay.ad` and `result.txt`; failed attempts also keep copied logs and artifacts when available.
 - Human docs vs agent skills: docs explain the system for people; skills provide compact operating guidance for agents.
 
+## Complementary Tooling
+
+Use `agent-device` for on-device UI automation, screenshots/recordings, app logs, network inspection, and performance snapshots.
+
+When the task needs the React component tree, props, state, hooks, or render profiling, pair it with the complementary [`agent-react-devtools`](https://github.com/callstackincubator/agent-react-devtools) project. The two tools address different layers of the same debugging workflow.
+
 ## Command Flow
 
 The canonical loop is:
diff --git a/package.json b/package.json
@@ -1,7 +1,7 @@
 {
   "name": "agent-device",
-  "version": "0.12.8",
-  "description": "Unified control plane for physical and virtual devices via an agent-driven CLI.",
+  "version": "0.12.9",
+  "description": "Agent-driven CLI for mobile UI automation, network inspection, and performance diagnostics across iOS, Android, tvOS, and macOS.",
   "license": "MIT",
   "author": "Callstack",
   "homepage": "https://agent-device.dev/",
@@ -128,11 +128,20 @@
     "agent",
     "device",
     "cli",
+    "automation",
     "adb",
     "simctl",
     "devicectl",
     "ios",
-    "android"
+    "android",
+    "tvos",
+    "macos",
+    "react-native",
+    "observability",
+    "diagnostics",
+    "network",
+    "profiling",
+    "performance"
   ],
   "dependencies": {
     "fast-xml-parser": "^5.5.10",
diff --git a/skills/agent-device/SKILL.md b/skills/agent-device/SKILL.md
@@ -1,6 +1,6 @@
 ---
 name: agent-device
-description: Automates interactions for Apple-platform apps (iOS, tvOS, macOS) and Android devices. Use when navigating apps, taking snapshots/screenshots, tapping, typing, scrolling, or extracting UI info across mobile, TV, and desktop targets.
+description: Automates interactions for Apple-platform apps (iOS, tvOS, macOS) and Android devices. Use when navigating apps, taking snapshots/screenshots, tapping, typing, scrolling, extracting UI info, or collecting logs, network inspection, and perf snapshots across mobile, TV, and desktop targets.
 ---
 
 # agent-device
@@ -71,3 +71,4 @@ Use this skill as a router with mandatory defaults. Read this file first. For no
 - Need desktop surfaces, menu bar behavior, or macOS-specific interaction rules: [references/macos-desktop.md](references/macos-desktop.md)
 - Need remote HTTP transport, `connect --remote-config`, or tenant leases on a remote macOS host: [references/remote-tenancy.md](references/remote-tenancy.md)
   This includes remote React Native runs where `agent-device` now prepares Metro locally and manages the local Metro companion tunnel automatically.
+- Need the React component tree, props, state, hooks, or render profiling: pair `agent-device` with the complementary [`agent-react-devtools`](https://github.com/callstackincubator/agent-react-devtools) project when available.
diff --git a/skills/agent-device/references/debugging.md b/skills/agent-device/references/debugging.md
@@ -4,6 +4,8 @@
 
 Open this file when the task turns into failure triage, logs, network inspection, permission prompts, setup trouble, or unstable session behavior.
 
+If the debugging task needs the React component tree, props, state, hooks, or render profiling, pair `agent-device` with the complementary [`agent-react-devtools`](https://github.com/callstackincubator/agent-react-devtools) project instead of trying to infer those internals from the accessibility tree or app logs alone.
+
 ## Main commands to reach for first
 
 - `logs clear --restart`
diff --git a/skills/agent-device/references/exploration.md b/skills/agent-device/references/exploration.md
@@ -20,6 +20,7 @@ Open this file when the app or screen is already running and you need to discove
 - User asks what is visible on screen: `snapshot`
 - User asks for exact text from a known target: `get text`
 - User asks you to tap, type, or choose an element: `snapshot -i`, then act
+- User asks for the React component tree, props/state/hooks, or render profiling: pair `agent-device` with the complementary [`agent-react-devtools`](https://github.com/callstackincubator/agent-react-devtools) project
 - React Native dev or debug build shows warning/error UI: capture enough evidence to identify it, dismiss it if it is not the requested behavior, then continue the flow and report it in the summary
 - The on-screen keyboard is blocking the next step: `keyboard dismiss`; on iOS do this only while an app session is active, and use `keyboard status|get` only on Android
 - UI does not expose the answer: say so plainly; do not browse or force the app into a new state unless asked
diff --git a/skills/agent-device/references/verification.md b/skills/agent-device/references/verification.md
@@ -129,5 +129,6 @@ agent-device perf --json
 - `startup` is command round-trip timing around `open`.
 - It is not true first-frame or first-interactive telemetry.
 - Android app sessions also expose `memory` (`dumpsys meminfo`) and `cpu` (`dumpsys cpuinfo`) snapshots when the session has an app package context.
-- Apple app sessions on macOS and iOS simulators also expose `memory` and `cpu` process snapshots when the session has an app bundle ID.
-- `fps` is still unavailable, and physical iOS devices still leave `memory` and `cpu` unavailable in this release.
+- Apple app sessions on macOS, iOS simulators, and physical iOS devices also expose `memory` and `cpu` process snapshots when the session has an app bundle ID.
+- On physical iOS devices, sampling uses a short `xcrun xctrace` Activity Monitor capture, so keep the device unlocked and connected, and keep the app active in the foreground while sampling.
+- `fps` is still unavailable in this release.
diff --git a/skills/dogfood/SKILL.md b/skills/dogfood/SKILL.md
@@ -166,6 +166,7 @@ agent-device --session {SESSION} close
 - Re-snapshot after any mutation (navigation, modal, list update, form submit).
 - Use `fill` for clear-then-type semantics; use `type` for incremental typing behavior checks.
 - Keep logs optional and targeted: enable/read app logs only when useful for diagnosis.
+- If the issue appears rooted in React internals rather than device/app runtime behavior, pair `agent-device` with the complementary [`agent-react-devtools`](https://github.com/callstackincubator/agent-react-devtools) project for component-tree or render-profiling inspection.
 - Never read source code of the app under test; findings must come from observed runtime behavior.
 - Write each issue immediately to avoid losing evidence.
 - Never delete screenshots/videos/report artifacts during a session.
diff --git a/website/docs/docs/_meta.json b/website/docs/docs/_meta.json
@@ -14,6 +14,11 @@
     "type": "file",
     "label": "Quick Start"
   },
+  {
+    "name": "debugging-profiling",
+    "type": "file",
+    "label": "Debugging & Profiling"
+  },
   {
     "name": "client-api",
     "type": "file",
diff --git a/website/docs/docs/debugging-profiling.md b/website/docs/docs/debugging-profiling.md
@@ -0,0 +1,78 @@
+---
+title: Debugging & Profiling
+---
+
+# Debugging & Profiling
+
+Use `agent-device` when the task moves past UI automation and you need runtime evidence from the app or device layer.
+
+## What `agent-device` covers well
+
+- Session app logs for targeted debugging windows
+- Network inspection from recent HTTP(s) entries in app logs via `network dump`
+- Performance snapshots with `perf`/`metrics`
+- Screenshots, recordings, and replayable repro flows
+
+## What to pair it with
+
+If the task needs the React component tree, props, state, hooks, or render profiling, pair `agent-device` with the complementary [`agent-react-devtools`](https://github.com/callstackincubator/agent-react-devtools) project.
+
+`agent-device` is centered on the device and app runtime layer. `agent-react-devtools` is the better fit for React internals.
+
+## Fast path
+
+```bash
+agent-device open MyApp --platform ios
+agent-device logs clear --restart
+agent-device network dump 25 --include headers
+agent-device perf --json
+agent-device logs path
+```
+
+Use this flow when you need a clean repro window with logs, recent network activity, and a quick perf sample from the active app session.
+
+## Core commands
+
+### Logs
+
+```bash
+agent-device logs start
+agent-device logs stop
+agent-device logs clear --restart
+agent-device logs path
+agent-device logs doctor
+agent-device logs mark "before submit"
+```
+
+- Logging is off by default; enable it only for focused debugging windows.
+- Prefer `logs clear --restart` for clean repro loops.
+- Use `logs path` and then grep the file instead of loading whole logs into agent context.
+
+### Network inspection
+
+```bash
+agent-device network dump 25
+agent-device network dump 25 --include headers
+agent-device network dump 25 --include all
+```
+
+- `network dump` parses recent HTTP(s) entries from the session app log.
+- `network log` is an alias for `network dump`.
+- Parsed results depend on what the app emits into the platform log backend.
+
+### Performance snapshots
+
+```bash
+agent-device perf --json
+agent-device metrics --json
+```
+
+- `perf` returns session-scoped startup timing and, where supported, CPU and memory samples.
+- Startup is measured around the `open` command; it is not first-frame instrumentation.
+- CPU and memory availability depends on platform and whether the active session is bound to an app/package.
+
+## Where to go deeper
+
+- Full command reference: [Commands](/docs/commands)
+- Typed client observability APIs: [Typed Client](/docs/client-api)
+- Session behavior and lifecycle: [Sessions](/docs/sessions)
diff --git a/website/docs/docs/introduction.md b/website/docs/docs/introduction.md
@@ -9,22 +9,26 @@ title: Introduction
 - Accessibility snapshots for UI understanding
 - Deterministic interactions (tap, type, scroll)
 - Session-aware workflows and replay
+- Session logs and network inspection for debugging broken flows
+- Performance snapshots with `perf`/`metrics`, including CPU and memory data where supported
 
-If you know `agent-browser`, this is the mobile-native counterpart for iOS/Android UI automation.
+If you know `agent-browser`, this is the mobile-native counterpart for iOS/Android UI automation and app-level observability.
 For exploratory QA and bug-hunting workflows, see `skills/dogfood/SKILL.md` in this repository.
+For React component trees, props/state/hooks, and render profiling, pair it with the complementary [`agent-react-devtools`](https://github.com/callstackincubator/agent-react-devtools) project.
 
 ## What it’s good at
 
-- Capturing structured UI state for LLMs
-- Driving common UI actions with refs or semantic selectors
-- Replaying flows for regression checks
+- Exploring and driving app flows on real devices and simulators
+- Collecting debugging evidence through logs, network traffic, screenshots, recordings, and performance snapshots
+- Replaying successful flows as lightweight regression checks
 
 ## Platform support highlights
 
 - iOS core runner commands: `snapshot`, `snapshot --diff`, `diff snapshot`, `wait`, `click`, `fill`, `get`, `is`, `find`, `press`, `long-press`, `focus`, `type`, `scroll`, `back`, `home`, `rotate`, `app-switcher`, `open` (app), `close`, `screenshot`, `apps`, `appstate`, `install`, `install-from-source`, `reinstall`, `trigger-app-event`.
 - iOS `appstate` is session-scoped on the selected target device.
 - iOS/tvOS simulator-only: `settings`, `push`, `clipboard`.
 - Apple simulators and macOS desktop app sessions: `alert`, `pinch`.
+- Session diagnostics: `logs` and `network dump` are available for debugging active app sessions, with network inspection based on recent HTTP(s) entries captured in the session app log.
 - Session performance metrics: `perf`/`metrics` is available on iOS, macOS, and Android. Startup timing comes from `open` command round-trip duration. Android app sessions and Apple app sessions on macOS, iOS simulators, or connected iOS devices also expose CPU and memory snapshots when an app identifier is available in the session.
 - iOS `record` supports simulators and physical devices.
   - Simulators use native `simctl io ... recordVideo`.
@@ -43,6 +47,12 @@ For exploratory QA and bug-hunting workflows, see `skills/dogfood/SKILL.md` in t
 3. iOS uses XCTest runner for snapshots and input on simulators and physical devices.
 4. Android uses ADB-based tooling.
 
+## Complementary React tooling
+
+`agent-device` is intentionally centered on the device/app layer: UI automation, screenshots/recordings, app logs, network inspection, and performance sampling.
+
+When a debugging workflow needs React internals such as the component tree, props, state, hooks, or render profiling, use the complementary `agent-react-devtools` project alongside `agent-device` rather than as a replacement.
+
 ## Example
 
 ```bash
diff --git a/website/docs/index.md b/website/docs/index.md
@@ -3,7 +3,7 @@ pageType: home
 
 hero:
   name: Control iOS and Android devices with AI agents.
-  tagline: agent-device is a token-efficient, lightweight CLI for iOS and Android device automation. Gives your agents "eyes" to see and "hands" to interact with the UI.
+  tagline: agent-device is a token-efficient CLI for mobile UI automation and app observability. It gives agents structured UI access, deterministic interactions, and built-in logs, network inspection, and perf metrics when the happy path breaks.
   actions:
     - theme: brand
       text: Get Started
@@ -19,10 +19,12 @@ features:
     details: Accessibility trees give agents a complete view of the UI while keeping output compact.
   - title: Interactions that just work
     details: Tap, swipe, scroll, focus, and type with precise coordinates or semantic finders.
+  - title: Built-in observability
+    details: Collect session logs, inspect recent HTTP traffic with network dump, and sample CPU and memory metrics with perf.
   - title: Session and replay
     details: Open apps, switch apps in-session, and replay recorded actions to reproduce flows across platforms.
   - title: Visual verification
     details: Capture full-resolution screenshots and video recordings for reporting and visual checks.
-  - title: Traceable automation
-    details: Collect trace logs for XCTest to debug flaky interactions.
+  - title: Complementary React internals
+    details: Pair agent-device with agent-react-devtools when you need the React component tree, props, state, hooks, or render profiling.
 ---
diff --git a/website/rspress.config.ts b/website/rspress.config.ts
@@ -6,9 +6,10 @@ export default withCallstackPreset(
     context: __dirname,
     docs: {
       title: 'agent-device',
-      description: 'CLI to control iOS and Android devices for AI agents',
+      description:
+        'CLI for mobile UI automation, logs, network inspection, and performance diagnostics for AI agents',
       editUrl: 'https://github.com/callstackincubator/agent-device/edit/main/website',
-      rootUrl: 'https://oss.callstack.com/agent-device',
+      rootUrl: 'https://incubator.callstack.com/agent-device',
       rootDir: 'docs',
       icon: '/logo.svg',
       logoLight: '/logo-light.svg',