feat: port dogfood skill to agent-device #133
Merged
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,183 @@ | ||
| --- | ||
| name: dogfood | ||
| description: Systematically explore and test a mobile app on iOS/Android with agent-device to find bugs, UX issues, and other problems. Use when asked to "dogfood", "QA", "exploratory test", "find issues", "bug hunt", or "test this app" on mobile. Produces a structured report with reproducible evidence: screenshots, optional repro videos, and detailed steps for every issue. | ||
| allowed-tools: Bash(agent-device:*), Bash(npx agent-device:*) | ||
| --- | ||
|
|
||
| # Dogfood (agent-device) | ||
|
|
||
| Systematically explore a mobile app, find issues, and produce a report with full reproduction evidence for every finding. | ||
|
|
||
| ## Setup | ||
|
|
||
| Only the **Target app** is required. Everything else has sensible defaults. | ||
|
|
||
| | Parameter | Default | Example override | | ||
| |-----------|---------|-----------------| | ||
| | **Target app** | _(required)_ | `Settings`, `com.example.app`, deep link URL | | ||
| | **Platform** | Infer from user context; otherwise ask (`ios` or `android`) | `--platform ios` | | ||
| | **Session name** | Slugified app/platform (for example `settings-ios`) | `--session my-session` | | ||
| | **Output directory** | `./dogfood-output/` | `Output directory: /tmp/mobile-qa` | | ||
| | **Scope** | Full app | `Focus on onboarding and profile` | | ||
| | **Authentication** | None | `Sign in to user@example.com` | | ||
|
|
||
| If the user gives enough context to start, begin immediately with defaults. Ask follow-up questions only when a required detail is missing (for example, platform or credentials). | ||
|
|
||
| Prefer the direct `agent-device` binary when available; otherwise use `npx agent-device`. | ||
|
|
||
| ## Workflow | ||
|
|
||
| ``` | ||
| 1. Initialize Set up session, output dirs, report file | ||
| 2. Launch/Auth Open app and sign in if needed | ||
| 3. Orient Capture initial snapshot and map navigation | ||
| 4. Explore Systematically test flows and states | ||
| 5. Document Record reproducible evidence per issue | ||
| 6. Wrap up Reconcile summary, close session | ||
| ``` | ||
|
|
||
| ### 1. Initialize | ||
|
|
||
| ```bash | ||
| mkdir -p {OUTPUT_DIR}/screenshots {OUTPUT_DIR}/videos | ||
| cp {SKILL_DIR}/templates/dogfood-report-template.md {OUTPUT_DIR}/report.md | ||
| ``` | ||
|
|
||
| ### 2. Launch/Auth | ||
|
|
||
| Start a named session and launch the target app: | ||
|
|
||
| ```bash | ||
| agent-device --session {SESSION} open {TARGET_APP} --platform {PLATFORM} | ||
| agent-device --session {SESSION} snapshot -i | ||
| ``` | ||
|
|
||
| If login is required: | ||
|
|
||
| ```bash | ||
| agent-device --session {SESSION} snapshot -i | ||
| agent-device --session {SESSION} fill @e1 "{EMAIL}" | ||
| agent-device --session {SESSION} fill @e2 "{PASSWORD}" | ||
| agent-device --session {SESSION} press @e3 | ||
| agent-device --session {SESSION} wait 1000 | ||
| agent-device --session {SESSION} snapshot -i | ||
| ``` | ||
|
|
||
| For OTP/email codes: ask the user, wait for input, then continue. | ||
|
|
||
| ### 3. Orient | ||
|
|
||
| Capture initial evidence and navigation anchors: | ||
|
|
||
| ```bash | ||
| agent-device --session {SESSION} screenshot {OUTPUT_DIR}/screenshots/initial.png | ||
| agent-device --session {SESSION} snapshot -i | ||
| ``` | ||
|
|
||
| Map top-level navigation, tabs, and key workflows before deep testing. | ||
|
|
||
| ### 4. Explore | ||
|
|
||
| Read [references/issue-taxonomy.md](references/issue-taxonomy.md) for severity/category calibration. | ||
|
|
||
| Strategy: | ||
|
|
||
| - Move through each major app area (tabs, drawers, settings pages). | ||
| - Test core journeys end-to-end (create, edit, delete, submit, recover). | ||
| - Validate edge states (empty/error/loading/offline/permissions denied). | ||
| - Use `snapshot -i` after UI transitions to avoid stale refs. | ||
| - Periodically capture `logs path` and inspect the app log when behavior looks suspicious. | ||
|
|
||
| Useful commands per screen: | ||
|
|
||
| ```bash | ||
| agent-device --session {SESSION} snapshot -i | ||
| agent-device --session {SESSION} screenshot {OUTPUT_DIR}/screenshots/{screen-name}.png | ||
| agent-device --session {SESSION} appstate | ||
| agent-device --session {SESSION} logs path | ||
| ``` | ||
|
|
||
| ### 5. Document Issues (Repro-First) | ||
|
|
||
| Explore and document in one pass. When you find an issue, stop and fully capture evidence before continuing. | ||
|
|
||
| #### Interactive/behavioral issues | ||
|
|
||
| Use video + step screenshots: | ||
|
|
||
| 1. Start recording: | ||
|
|
||
| ```bash | ||
| agent-device --session {SESSION} record start {OUTPUT_DIR}/videos/issue-{NNN}-repro.mp4 | ||
| ``` | ||
|
|
||
| 2. Reproduce with visible pacing. Capture each step: | ||
|
|
||
| ```bash | ||
| agent-device --session {SESSION} screenshot {OUTPUT_DIR}/screenshots/issue-{NNN}-step-1.png | ||
| sleep 1 | ||
| # perform action | ||
| sleep 1 | ||
| agent-device --session {SESSION} screenshot {OUTPUT_DIR}/screenshots/issue-{NNN}-step-2.png | ||
| ``` | ||
|
|
||
| 3. Capture final broken state: | ||
|
|
||
| ```bash | ||
| sleep 2 | ||
| agent-device --session {SESSION} screenshot {OUTPUT_DIR}/screenshots/issue-{NNN}-result.png | ||
| ``` | ||
|
|
||
| 4. Stop recording: | ||
|
|
||
| ```bash | ||
| agent-device --session {SESSION} record stop | ||
| ``` | ||
|
|
||
| 5. Append the issue to the report immediately, with numbered steps and screenshot references. | ||
|
|
||
| #### Static/on-load issues | ||
|
|
||
| A single screenshot is sufficient; no video is required: | ||
|
|
||
| ```bash | ||
| agent-device --session {SESSION} screenshot {OUTPUT_DIR}/screenshots/issue-{NNN}.png | ||
| ``` | ||
|
|
||
| Set **Repro Video** to `N/A` in the report. | ||
|
|
||
| ### 6. Wrap Up | ||
|
|
||
| Target 5-10 well-evidenced issues, then finish: | ||
|
|
||
| 1. Reconcile summary severity counts in `report.md`. | ||
| 2. Close session: | ||
|
|
||
| ```bash | ||
| agent-device --session {SESSION} close | ||
| ``` | ||
|
|
||
| 3. Report total issues, severity breakdown, and highest-risk findings. | ||
|
|
||
| ## Guidance | ||
|
|
||
| - Repro quality matters more than issue count. | ||
| - Use refs (`@eN`) for fast exploration; use selectors when deterministic replay assertions are needed. | ||
| - Re-snapshot after any mutation (navigation, modal, list update, form submit). | ||
| - Use `fill` for clear-then-type semantics; use `type` for incremental typing behavior checks. | ||
| - Keep logs optional and targeted: enable/read app logs only when useful for diagnosis. | ||
| - Never read source code of the app under test; findings must come from observed runtime behavior. | ||
| - Write each issue immediately to avoid losing evidence. | ||
| - Never delete screenshots/videos/report artifacts during a session. | ||
|
|
||
| ## References | ||
|
|
||
| | Reference | When to Read | | ||
| |-----------|--------------| | ||
| | [references/issue-taxonomy.md](references/issue-taxonomy.md) | Start of session; severity/categories/checklist | | ||
|
|
||
| ## Templates | ||
|
|
||
| | Template | Purpose | | ||
| |----------|---------| | ||
| | [templates/dogfood-report-template.md](templates/dogfood-report-template.md) | Copy into output directory as the report file | | ||
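
The Setup defaults and the Initialize step in the file above could be scripted roughly as follows. This is a minimal sketch and not part of the PR: `slugify_session` and `init_output_dir` are hypothetical helper names, and the slug rule (lowercase, runs of non-alphanumeric characters collapsed to `-`) is an assumption inferred from the `settings-ios` example.

```bash
# Hypothetical helper: derive a session name such as "settings-ios"
# from the target app and platform. The exact slug rule is an assumption.
slugify_session() {
  local app="$1" platform="$2"
  printf '%s-%s' "$app" "$platform" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -E 's/[^a-z0-9]+/-/g; s/^-+//; s/-+$//'
}

# Hypothetical helper: create the output layout from step 1 and copy the
# report template when it exists under the skill directory.
init_output_dir() {
  local output_dir="$1" skill_dir="$2"
  local template="$skill_dir/templates/dogfood-report-template.md"
  mkdir -p "$output_dir/screenshots" "$output_dir/videos"
  if [ -f "$template" ]; then
    cp "$template" "$output_dir/report.md"
  fi
}
```

For example, `slugify_session "Settings" ios` would give a session name that can be passed to `--session`, and `init_output_dir ./dogfood-output "$SKILL_DIR"` prepares the default output directory.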
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,83 @@ | ||
| # Issue Taxonomy (Mobile) | ||
|
|
||
| Reference for categorizing issues found during mobile dogfooding. | ||
|
|
||
| ## Severity Levels | ||
|
|
||
| | Severity | Definition | | ||
| |----------|------------| | ||
| | **critical** | Blocks a core workflow, causes data loss, or puts the app in a crash or freeze loop | | ||
| | **high** | Major feature broken or unusable, no practical workaround | | ||
| | **medium** | Feature works with notable friction or partial failure; workaround exists | | ||
| | **low** | Minor cosmetic or polish issue | | ||
|
|
||
| ## Categories | ||
|
|
||
| ### Visual / UI | ||
|
|
||
| - Layout broken, clipped, overlapped, or unreadable text | ||
| - Safe-area/notch overlap issues | ||
| - Incorrect dark/light appearance rendering | ||
| - Missing assets/icons | ||
| - Animation glitches or flicker | ||
|
|
||
| ### Functional | ||
|
|
||
| - Buttons/controls do nothing or trigger wrong action | ||
| - Flows fail (create/edit/delete/submit) | ||
| - Navigation dead-ends or wrong destination | ||
| - State loss after background/foreground transitions | ||
| - Deep link opens wrong screen or fails | ||
|
|
||
| ### UX | ||
|
|
||
| - Confusing hierarchy or navigation labels | ||
| - Missing loading/progress feedback | ||
| - Unclear error handling or no recovery affordance | ||
| - Excessive steps for common tasks | ||
| - Inconsistent behavior between similar screens | ||
|
|
||
| ### Content | ||
|
|
||
| - Typos, incorrect copy, placeholder text | ||
| - Wrong labels/help text | ||
| - Truncated text with no affordance to view the full text | ||
| - Inconsistent terminology across screens | ||
|
|
||
| ### Performance | ||
|
|
||
| - Slow startup or route transitions | ||
| - Input lag or gesture jank | ||
| - Scroll hitches/frame drops | ||
| - Notable battery/thermal symptoms during basic usage | ||
|
|
||
| ### Diagnostics / Logs | ||
|
|
||
| - Native crashes or repeated fatal exceptions | ||
| - Repeated warnings correlated with broken behavior | ||
| - Unhandled runtime errors visible during repro | ||
|
|
||
| ### Permissions / Platform | ||
|
|
||
| - Permission prompt flow broken or loops forever | ||
| - Denied permissions not handled gracefully | ||
| - Platform-specific regressions (iOS-only or Android-only) | ||
| - Background/foreground lifecycle regressions | ||
|
|
||
| ### Accessibility | ||
|
|
||
| - Missing labels or incorrect accessibility names | ||
| - Focus order/navigation issues for assistive tech | ||
| - Low contrast or unreadable text scaling | ||
| - Touch targets too small for reliable interaction | ||
|
|
||
| ## Exploration Checklist | ||
|
|
||
| 1. Visual scan: capture screenshot; verify layout/safe areas/text/icon rendering. | ||
| 2. Interactions: press controls, open menus/modals, validate expected response. | ||
| 3. Forms/input: test valid/invalid/empty/boundary input. | ||
| 4. Navigation: traverse all top-level sections and return paths. | ||
| 5. App states: loading/empty/error/offline/permission-denied/background-resume. | ||
| 6. Logs/diagnostics: inspect app logs when behavior is suspicious. | ||
| 7. Platform parity: verify critical flows on each requested platform. | ||
| 8. Accessibility basics: labels, touch target sizes, readability/contrast. |
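
To keep labels consistent with the taxonomy above, an issue's severity and category could be checked before it is written to the report. The sketch below is illustrative only: `is_allowed` and `validate_issue_labels` are hypothetical helpers, and the short category names are assumed to follow the **Category** field of the report template rather than the full section headings.

```bash
# Severity levels from the taxonomy; category short names are an assumption
# based on the report template's Category field.
SEVERITIES="critical high medium low"
CATEGORIES="visual functional ux content performance diagnostics permissions accessibility"

# Hypothetical helper: succeed if $2 is exactly one of the
# space-separated words in $1.
is_allowed() {
  local allowed="$1" value="$2"
  # Reject empty values and values containing whitespace up front.
  case "$value" in
    ""|*[[:space:]]*) return 1 ;;
  esac
  case " $allowed " in
    *" $value "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: validate one issue's labels before recording it.
validate_issue_labels() {
  local severity="$1" category="$2"
  is_allowed "$SEVERITIES" "$severity" || { echo "unknown severity: $severity" >&2; return 1; }
  is_allowed "$CATEGORIES" "$category" || { echo "unknown category: $category" >&2; return 1; }
}
```

A call such as `validate_issue_labels high functional` succeeds, while an unlisted label such as `blocker` fails with a message on stderr.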
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,52 @@ | ||
| # Dogfood Report: {APP_NAME} | ||
|
|
||
| | Field | Value | | ||
| |-------|-------| | ||
| | **Date** | {DATE} | | ||
| | **Platform** | {PLATFORM} | | ||
| | **Target App** | {TARGET_APP} | | ||
| | **Session** | {SESSION_NAME} | | ||
| | **Scope** | {SCOPE} | | ||
|
|
||
| ## Summary | ||
|
|
||
| | Severity | Count | | ||
| |----------|-------| | ||
| | Critical | 0 | | ||
| | High | 0 | | ||
| | Medium | 0 | | ||
| | Low | 0 | | ||
| | **Total** | **0** | | ||
|
|
||
| ## Issues | ||
|
|
||
| <!-- Copy this block for each issue found. Interactive issues need video + step screenshots. Static issues can be screenshot-only (Repro Video = N/A). --> | ||
|
|
||
| ### ISSUE-001: {Short title} | ||
|
|
||
| | Field | Value | | ||
| |-------|-------| | ||
| | **Severity** | critical / high / medium / low | | ||
| | **Category** | visual / functional / ux / content / performance / diagnostics / permissions / accessibility | | ||
| | **Screen / Route** | {screen where issue was found} | | ||
| | **Repro Video** | {path to video, or N/A for static issues} | | ||
|
|
||
| **Description** | ||
|
|
||
| {What is wrong, what was expected, and what actually happened.} | ||
|
|
||
| **Repro Steps** | ||
|
|
||
| 1. Open {screen/entry point} | ||
|  | ||
|
|
||
| 2. {Action} | ||
|  | ||
|
|
||
| 3. {Action} | ||
|  | ||
|
|
||
| 4. **Observe:** {broken behavior} | ||
|  | ||
|
|
||
| --- |
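
The Summary table in this template has to be reconciled by hand at wrap-up. One way to cross-check it is to count the per-issue severity rows, as in the sketch below. `count_severities` is a hypothetical helper, and it assumes each recorded issue keeps the template's `| **Severity** | <value> |` row with exactly one severity value filled in.

```bash
# Hypothetical helper: print one "severity count" line per level for a
# filled-in report. The unfilled template row
# ("critical / high / medium / low") is not counted.
count_severities() {
  local report="$1" level count
  for level in critical high medium low; do
    # grep -c exits non-zero when the count is 0; "|| true" keeps the "0".
    count="$(grep -Eic "^\| \*\*Severity\*\* \| ${level} \|" "$report" || true)"
    echo "$level $count"
  done
}
```

Running `count_severities "$OUTPUT_DIR/report.md"` prints four lines that can be compared against the Summary table before the session is closed.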