feat: add agent-device dogfood skill package

thymikee · thymikee · commit 7081be099bf1 · 2026-02-25T16:23:38.000+01:00
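
The dogfood skill added below leaves three small bookkeeping steps to the agent: slugifying the session name (for example `settings-ios`), zero-padding issue numbers into the `{NNN}` form used in artifact names, and reconciling the severity counts in `report.md`. A minimal shell sketch of those steps follows. The helper names `slugify_session`, `issue_id`, and `count_severity` are hypothetical and not part of `agent-device`, and the sketch assumes the report keeps the `| **Severity** | <level> |` row format from the template.

```shell
# Hypothetical helpers for the dogfood workflow; none of these are agent-device commands.

# Build a session slug such as "settings-ios" from an app name and a platform.
slugify_session() {
  printf '%s-%s' "$1" "$2" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[^a-z0-9]\{1,\}/-/g' -e 's/^-//' -e 's/-$//'
}

# Zero-pad an issue number to the {NNN} form used in artifact file names.
# Expects a plain decimal without leading zeros (printf would read "08" as octal).
issue_id() {
  printf '%03d' "$1"
}

# Count issues of one severity by matching per-issue Severity rows in the report.
# Assumes rows look like "| **Severity** | high |"; the unfilled template row
# ("critical / high / medium / low") does not match any single level.
count_severity() {
  grep -c -i -e "^| \*\*Severity\*\* | $2 |" -- "$1" || true
}
```

For example, `slugify_session Settings ios` yields `settings-ios`, and `issue_id 7` yields `007`, which lines up with artifact names such as `issue-007-step-1.png`.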
diff --git a/README.md b/README.md
@@ -35,6 +35,7 @@ npx agent-device open SampleApp
 ```
 
 The skill is also accessible on [ClawHub](https://clawhub.ai/okwasniewski/agent-device).
+For structured exploratory QA workflows, use the dogfood skill at [skills/dogfood/SKILL.md](skills/dogfood/SKILL.md).
 
 ## Quick Start
 
diff --git a/skills/agent-device/SKILL.md b/skills/agent-device/SKILL.md
@@ -6,6 +6,7 @@ description: Automates interactions for iOS simulators/devices and Android emula
 # Mobile Automation with agent-device
 
 For exploration, use snapshot refs. For deterministic replay, use selectors.
+For structured exploratory QA bug hunts and reporting, use [../dogfood/SKILL.md](../dogfood/SKILL.md).
 
 ## Start Here (Read This First)
 
diff --git a/skills/dogfood/SKILL.md b/skills/dogfood/SKILL.md
@@ -0,0 +1,183 @@
+---
+name: dogfood
+description: Systematically explore and test a mobile app on iOS/Android with agent-device to find bugs, UX issues, and other problems. Use when asked to "dogfood", "QA", "exploratory test", "find issues", "bug hunt", or "test this app" on mobile. Produces a structured report with reproducible evidence (screenshots, optional repro videos, and detailed steps) for every issue.
+allowed-tools: Bash(agent-device:*), Bash(npx agent-device:*)
+---
+
+# Dogfood (agent-device)
+
+Systematically explore a mobile app, find issues, and produce a report with full reproduction evidence for every finding.
+
+## Setup
+
+Only the **Target app** is required. Everything else has sensible defaults.
+
+| Parameter | Default | Example override |
+|-----------|---------|-----------------|
+| **Target app** | _(required)_ | `Settings`, `com.example.app`, deep link URL |
+| **Platform** | Infer from user context; otherwise ask (`ios` or `android`) | `--platform ios` |
+| **Session name** | Slugified app/platform (for example `settings-ios`) | `--session my-session` |
+| **Output directory** | `./dogfood-output/` | `Output directory: /tmp/mobile-qa` |
+| **Scope** | Full app | `Focus on onboarding and profile` |
+| **Authentication** | None | `Sign in to user@example.com` |
+
+If the user gives enough context to start, begin immediately with the defaults. Ask follow-up questions only when a required detail is missing (for example, platform or credentials).
+
+Prefer the direct `agent-device` binary when available; otherwise fall back to `npx agent-device`.
+
+## Workflow
+
+```
+1. Initialize    Set up session, output dirs, report file
+2. Launch/Auth   Open app and sign in if needed
+3. Orient        Capture initial snapshot and map navigation
+4. Explore       Systematically test flows and states
+5. Document      Record reproducible evidence per issue
+6. Wrap up       Reconcile summary, close session
+```
+
+### 1. Initialize
+
+```bash
+mkdir -p {OUTPUT_DIR}/screenshots {OUTPUT_DIR}/videos
+cp {SKILL_DIR}/templates/dogfood-report-template.md {OUTPUT_DIR}/report.md
+```
+
+### 2. Launch/Auth
+
+Start a named session and launch the target app:
+
+```bash
+agent-device --session {SESSION} open {TARGET_APP} --platform {PLATFORM}
+agent-device --session {SESSION} snapshot -i
+```
+
+If login is required:
+
+```bash
+agent-device --session {SESSION} snapshot -i  # note the refs for the email, password, and submit controls; @e1-@e3 below are placeholders
+agent-device --session {SESSION} fill @e1 "{EMAIL}"
+agent-device --session {SESSION} fill @e2 "{PASSWORD}"
+agent-device --session {SESSION} press @e3
+agent-device --session {SESSION} wait 1000
+agent-device --session {SESSION} snapshot -i
+```
+
+For OTP/email codes: ask the user, wait for input, then continue.
+
+### 3. Orient
+
+Capture initial evidence and navigation anchors:
+
+```bash
+agent-device --session {SESSION} screenshot {OUTPUT_DIR}/screenshots/initial.png
+agent-device --session {SESSION} snapshot -i
+```
+
+Map top-level navigation, tabs, and key workflows before deep testing.
+
+### 4. Explore
+
+Read [references/issue-taxonomy.md](references/issue-taxonomy.md) for severity/category calibration.
+
+Strategy:
+
+- Move through each major app area (tabs, drawers, settings pages).
+- Test core journeys end-to-end (create, edit, delete, submit, recover).
+- Validate edge states (empty/error/loading/offline/permissions denied).
+- Use `snapshot -i` after UI transitions to avoid stale refs.
+- Periodically run `logs path` and inspect the app log when behavior looks suspicious.
+
+Useful commands per screen:
+
+```bash
+agent-device --session {SESSION} snapshot -i
+agent-device --session {SESSION} screenshot {OUTPUT_DIR}/screenshots/{screen-name}.png
+agent-device --session {SESSION} appstate
+agent-device --session {SESSION} logs path
+```
+
+### 5. Document Issues (Repro-First)
+
+Explore and document in one pass. When you find an issue, stop and fully capture evidence before continuing.
+
+#### Interactive/behavioral issues
+
+Use video + step screenshots:
+
+1. Start recording:
+
+```bash
+agent-device --session {SESSION} record start {OUTPUT_DIR}/videos/issue-{NNN}-repro.mp4
+```
+
+2. Reproduce with visible pacing. Capture each step:
+
+```bash
+agent-device --session {SESSION} screenshot {OUTPUT_DIR}/screenshots/issue-{NNN}-step-1.png
+sleep 1
+# perform action
+sleep 1
+agent-device --session {SESSION} screenshot {OUTPUT_DIR}/screenshots/issue-{NNN}-step-2.png
+```
+
+3. Capture final broken state:
+
+```bash
+sleep 2
+agent-device --session {SESSION} screenshot {OUTPUT_DIR}/screenshots/issue-{NNN}-result.png
+```
+
+4. Stop recording:
+
+```bash
+agent-device --session {SESSION} record stop
+```
+
+5. Append the issue to the report immediately, with numbered steps and screenshot references.
+
+#### Static/on-load issues
+
+A single screenshot is sufficient; no video is required:
+
+```bash
+agent-device --session {SESSION} screenshot {OUTPUT_DIR}/screenshots/issue-{NNN}.png
+```
+
+Set **Repro Video** to `N/A` in the report.
+
+### 6. Wrap Up
+
+Target 5-10 well-evidenced issues, then finish:
+
+1. Reconcile the severity counts in the `report.md` summary table with the issues actually documented.
+2. Close session:
+
+```bash
+agent-device --session {SESSION} close
+```
+
+3. Report total issues, severity breakdown, and highest-risk findings.
+
+## Guidance
+
+- Repro quality matters more than issue count.
+- Use refs (`@eN`) for fast exploration, and selectors for deterministic replay or assertions when needed.
+- Re-snapshot after any mutation (navigation, modal, list update, form submit).
+- Use `fill` for clear-then-type semantics; use `type` for incremental typing behavior checks.
+- Keep logs optional and targeted: enable/read app logs only when useful for diagnosis.
+- Never read source code of the app under test; findings must come from observed runtime behavior.
+- Write each issue immediately to avoid losing evidence.
+- Never delete screenshots/videos/report artifacts during a session.
+
+## References
+
+| Reference | When to Read |
+|-----------|--------------|
+| [references/issue-taxonomy.md](references/issue-taxonomy.md) | Start of session; severity/categories/checklist |
+
+## Templates
+
+| Template | Purpose |
+|----------|---------|
+| [templates/dogfood-report-template.md](templates/dogfood-report-template.md) | Copy into output directory as the report file |
diff --git a/skills/dogfood/references/issue-taxonomy.md b/skills/dogfood/references/issue-taxonomy.md
@@ -0,0 +1,83 @@
+# Issue Taxonomy (Mobile)
+
+Reference for categorizing issues found during mobile dogfooding.
+
+## Severity Levels
+
+| Severity | Definition |
+|----------|------------|
+| **critical** | Blocks a core workflow, causes data loss, or crashes the app or leaves it in a freeze loop |
+| **high** | Major feature broken or unusable, no practical workaround |
+| **medium** | Feature works with notable friction or partial failure; workaround exists |
+| **low** | Minor cosmetic or polish issue |
+
+## Categories
+
+### Visual / UI
+
+- Layout broken, clipped, overlapped, or unreadable text
+- Safe-area/notch overlap issues
+- Incorrect dark/light appearance rendering
+- Missing assets/icons
+- Animation glitches or flicker
+
+### Functional
+
+- Buttons/controls do nothing or trigger wrong action
+- Flows fail (create/edit/delete/submit)
+- Navigation dead-ends or wrong destination
+- State loss after background/foreground transitions
+- Deep link opens wrong screen or fails
+
+### UX
+
+- Confusing hierarchy or navigation labels
+- Missing loading/progress feedback
+- Unclear error handling or no recovery affordance
+- Excessive steps for common tasks
+- Inconsistent behavior between similar screens
+
+### Content
+
+- Typos, incorrect copy, placeholder text
+- Wrong labels/help text
+- Truncated text with no affordance
+- Inconsistent terminology across screens
+
+### Performance
+
+- Slow startup or route transitions
+- Input lag or gesture jank
+- Scroll hitches/frame drops
+- Notable battery/thermal symptoms during basic usage
+
+### Diagnostics / Logs
+
+- Native crashes or repeated fatal exceptions
+- Repeated warnings correlated with broken behavior
+- Unhandled runtime errors visible during repro
+
+### Permissions / Platform
+
+- Permission prompt flow broken or loops forever
+- Denied permissions not handled gracefully
+- Platform-specific regressions (iOS-only or Android-only)
+- Background/foreground lifecycle regressions
+
+### Accessibility
+
+- Missing labels or incorrect accessibility names
+- Focus order/navigation issues for assistive tech
+- Low contrast, or text that becomes unreadable when scaled
+- Touch targets too small for reliable interaction
+
+## Exploration Checklist
+
+1. Visual scan: capture screenshot; verify layout/safe areas/text/icon rendering.
+2. Interactions: press controls, open menus/modals, validate expected response.
+3. Forms/input: test valid/invalid/empty/boundary input.
+4. Navigation: traverse all top-level sections and return paths.
+5. App states: loading/empty/error/offline/permission-denied/background-resume.
+6. Logs/diagnostics: inspect app logs when behavior is suspicious.
+7. Platform parity: verify critical flows on each requested platform.
+8. Accessibility basics: labels, touch target sizes, readability/contrast.
diff --git a/skills/dogfood/templates/dogfood-report-template.md b/skills/dogfood/templates/dogfood-report-template.md
@@ -0,0 +1,52 @@
+# Dogfood Report: {APP_NAME}
+
+| Field | Value |
+|-------|-------|
+| **Date** | {DATE} |
+| **Platform** | {PLATFORM} |
+| **Target App** | {TARGET_APP} |
+| **Session** | {SESSION_NAME} |
+| **Scope** | {SCOPE} |
+
+## Summary
+
+| Severity | Count |
+|----------|-------|
+| Critical | 0 |
+| High | 0 |
+| Medium | 0 |
+| Low | 0 |
+| **Total** | **0** |
+
+## Issues
+
+<!-- Copy this block for each issue found. Interactive issues need video + step screenshots. Static issues can be screenshot-only (Repro Video = N/A). -->
+
+### ISSUE-001: {Short title}
+
+| Field | Value |
+|-------|-------|
+| **Severity** | critical / high / medium / low |
+| **Category** | visual / functional / ux / content / performance / diagnostics / permissions / accessibility |
+| **Screen / Route** | {screen where issue was found} |
+| **Repro Video** | {path to video, or N/A for static issues} |
+
+**Description**
+
+{What is wrong, what was expected, and what actually happened.}
+
+**Repro Steps**
+
+1. Open {screen/entry point}
+   ![Step 1](screenshots/issue-001-step-1.png)
+
+2. {Action}
+   ![Step 2](screenshots/issue-001-step-2.png)
+
+3. {Action}
+   ![Step 3](screenshots/issue-001-step-3.png)
+
+4. **Observe:** {broken behavior}
+   ![Result](screenshots/issue-001-result.png)
+
+---
diff --git a/website/docs/docs/introduction.md b/website/docs/docs/introduction.md
@@ -11,6 +11,7 @@ title: Introduction
 - Session-aware workflows and replay
 
 If you know `agent-browser`, this is the mobile-native counterpart for iOS/Android UI automation.
+For exploratory QA and bug-hunting workflows, see `skills/dogfood/SKILL.md` in this repository.
 
 ## What it’s good at