
Commit 995d6bd

Generalize input source from PR-only to PR-or-issue
- Drop the runtime-code gate. Test/repro steps are the source of truth, not the surrounding code diff. The skill no longer fetches `gh pr diff --name-only` and no longer SKIPs on docs-only inputs.
- Inputs now accept a GitHub URL pointing at a PR (`/pull/N`) or an issue (`/issues/N`). URL path disambiguates kind. Bare numbers rejected (PR/issue share the GitHub number namespace).
- Per-source heading anchors with whole-body fallback:
  - PR -> `### Tests` / `### Test` / `## Tests`
  - issue -> `## Action Performed:` / `## Repro` / `## Steps to reproduce` / `## Reproduction Steps`
- Issue platform-checklist used as a real platform-restriction signal (filled boxes denote where the bug reproduces). Aliases: `iOS: App` ≡ `iOS: Native`, `Android: App` ≡ `Android: Native`.
- Manifest envelope changed from `pr: <num>` to `source: {kind, number, url, title}`. New per-flow `expected` field populated from issue `## Expected Result:` blocks.
- Run-output dir generalized from `<pr-num>/` to `<source-kind>-<source-num>/` (e.g. `pr-89475/`, `issue-89855/`).
- New exit code `8 BAD_INPUT` for malformed / non-PR-non-issue URLs. Removed exit `2 SKIP` and the docs-only error-handling row.
- Allowed-tools: dropped `Bash(gh pr diff *)`, added `Bash(gh issue view *)` and `Bash(gh api *)`.

Skill name `agent-device-pr-media` no longer matches the broadened scope; rename flagged as a follow-up.
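The input rules in this commit message (URL path decides the kind, bare numbers are rejected, exit `8 BAD_INPUT` on anything else, run dir named `<source-kind>-<source-num>/`) can be illustrated with a minimal Python sketch. This is not the skill's actual implementation: the names `parse_source`, `run_dir_name`, `BadInput`, and `main`, the regular expression, and the restriction to `https://github.com/` URLs are all assumptions made for illustration. The `title` field of the manifest envelope is omitted because it would require a network call.

```python
import re
import sys
from dataclasses import dataclass

EXIT_BAD_INPUT = 8  # exit code named in the commit message

# Hypothetical pattern: only github.com PR or issue URLs are accepted.
_SOURCE_URL = re.compile(
    r"^https://github\.com/[^/\s]+/[^/\s]+/"
    r"(?P<kind>pull|issues)/(?P<number>\d+)(?:[/?#].*)?$"
)


@dataclass(frozen=True)
class Source:
    kind: str    # "pr" or "issue"
    number: int
    url: str


class BadInput(ValueError):
    """Raised for bare numbers and for URLs that are neither PR nor issue."""


def parse_source(arg: str) -> Source:
    """Classify a source URL by its path: /pull/N -> pr, /issues/N -> issue."""
    url = arg.strip()
    match = _SOURCE_URL.match(url)
    if match is None:
        raise BadInput(f"not a PR or issue URL: {arg!r}")
    kind = "pr" if match.group("kind") == "pull" else "issue"
    return Source(kind=kind, number=int(match.group("number")), url=url)


def run_dir_name(source: Source) -> str:
    """Build the <source-kind>-<source-num> directory name."""
    return f"{source.kind}-{source.number}"


def main(argv: list) -> int:
    """Return the process exit code instead of calling sys.exit directly."""
    try:
        source = parse_source(argv[0] if argv else "")
    except BadInput as err:
        print(f"BAD_INPUT: {err}", file=sys.stderr)
        return EXIT_BAD_INPUT
    print(run_dir_name(source))
    return 0
```

Under these assumptions, `main(["https://github.com/Expensify/App/issues/89855"])` prints `issue-89855` and returns `0`, while `main(["89475"])` returns `8` because a bare number cannot be attributed to a PR or an issue.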
1 parent 7a76ca8 commit 995d6bd

2 files changed

Lines changed: 106 additions & 47 deletions


.claude/skills/agent-device-pr-media/ISSUES.md

Lines changed: 18 additions & 1 deletion
@@ -34,7 +34,7 @@
 - **Concurrent runs / locking**: single-user assumption. Latest run wins. Two simultaneous runs against the same PR cache dir = undefined behavior.
 - **Idempotency**: two runs on the same PR may produce different videos (LLM-driven Phase 1 may take different paths). Acceptable for PR evidence - demonstrates the flow works, byte-identity not required.
 - **Test data accumulation**: accounts/expenses/workspaces created during runs accumulate in the test backend; rely on periodic test-account reset.
-- **Self-demo paradox**: the skill's own PR (`#89475`) is `.claude/`-only and would correctly hit the `SKIP: no runtime code changed` gate. First real demo target should be a PR with `#### Test case N:` headers (e.g. PR #89743 - 5 explicit flows).
+- **Self-demo path**: the skill's own PR (`#89475`) is `.claude/`-only but the skill no longer gates on code changes - it would parse the Tests section and try to drive whatever steps are there. Demo target choice should reflect what the steps actually exercise on the device.
 
 ## Convention adopted (v1)
 
@@ -43,6 +43,23 @@
 - **Per-flow artifact**: one MP4 per flow per platform (or one PNG for verify-only single-step flows). Not one big MP4.
 - **Persistent cache**: `~/.cache/agent-device-pr-media/<pr-num>/<run-ts>/`. Survives reboots; latest-run-wins.
 
+## Generalize input source: PR or issue (added 2026-05-07)
+
+- **Removed**: the runtime-code gate. The skill no longer fetches `gh pr diff --name-only` and no longer skips PRs whose diff is `.claude/`-only / docs-only. The user's directive: test steps are the source of truth, not the surrounding code; the skill should be smart enough to extract them from whatever GitHub Markdown body it's pointed at.
+- **Inputs broadened** from "PR number or URL" to "Source URL (PR or issue)". URL path disambiguates kind: `/pull/N` → PR, `/issues/N` → issue. Bare numbers rejected (PR/issue share namespace).
+- **Source-aware parsing**:
+  - PR body → anchor on `### Tests` (with `### Test`, `## Tests` fallbacks).
+  - Issue body → anchor on `## Action Performed:` (with `## Repro`, `## Steps to reproduce`, `## Reproduction Steps` fallbacks).
+  - No anchor match → pass whole body to LLM. The anchor list is a token-cost optimization, not a hard contract.
+- **Issue platform-checklist is a real signal** (unlike PR's, where every box is always checked as "tested on"). The issue template's `## Platforms:` checkboxes denote where the bug reproduces. The platform resolver now uses this when the source is an issue. Aliases: `iOS: App` ≡ `iOS: Native`, `Android: App` ≡ `Android: Native`. mWeb / Windows / MacOS variants stay out of scope.
+- **Issues are typically single-flow.** Bug reports describe one repro path. Multi-flow segmentation logic stays for PRs.
+- **`expected` field added to per-flow manifest** (issues only) - populated from `## Expected Result:`. The driver MAY use it as a final-state assertion target.
+- **New exit code `8 BAD_INPUT`** for malformed / non-PR-non-issue source URLs. Replaces the old exit `2 SKIP` behavior.
+- **Run-output dir generalized** from `<pr-num>/` to `<source-kind>-<source-num>/` (e.g. `pr-89475/`, `issue-89855/`).
+- **Skill name not yet aligned with broadened scope.** Directory + cache path + cross-links still say `agent-device-pr-media`. Rename to something like `agent-device-flow-evidence` or `agent-device-test-recorder` is a candidate follow-up.
+
+Reference: https://github.com/Expensify/App/issues/89855 (and 6 other recent bug-tagged issues sampled 2026-05-07) - all follow the `## Action Performed:` + `## Expected Result:` + `## Platforms:` template consistently.
+
 ## Phase 1 cache (added 2026-05-07)
 
 - **Problem**: Phase 1 (LLM-driven exploration) is the expensive part; Phase 2 is just `agent-device replay`. Re-running on a PR whose Tests steps haven't changed wastes Phase 1's full cost on every invocation.
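The source-aware parsing and platform-checklist rules recorded in the ISSUES.md notes above could be sketched roughly as follows. The anchor lists and the two aliases come from the notes; everything else is an assumption for illustration: the helper names (`section_under`, `extract_steps`, `checked_platforms`), case-insensitive exact heading matching, ending a section at the next heading of the same or a higher level, the `- [x] Name` checkbox syntax, and normalizing aliases toward the `Native` spelling.

```python
import re

# Heading anchors per source kind, in priority order (taken from the notes).
ANCHORS = {
    "pr": ["### Tests", "### Test", "## Tests"],
    "issue": [
        "## Action Performed:",
        "## Repro",
        "## Steps to reproduce",
        "## Reproduction Steps",
    ],
}

# The notes treat these names as equivalent; picking "Native" as the
# canonical spelling is an arbitrary choice made for this sketch.
PLATFORM_ALIASES = {"iOS: App": "iOS: Native", "Android: App": "Android: Native"}

# Assumed checkbox syntax: "- [x] Platform name".
_CHECKED = re.compile(r"^\s*[-*]\s*\[[xX]\]\s*(.+?)\s*$")


def section_under(body, heading):
    """Return the text under `heading`, or None if the heading is absent."""
    lines = body.splitlines()
    level = len(heading) - len(heading.lstrip("#"))
    for i, line in enumerate(lines):
        if line.strip().lower() != heading.lower():
            continue
        section = []
        for nxt in lines[i + 1:]:
            marker = re.match(r"^(#+)\s", nxt)
            # Stop at the next heading of the same or a higher level.
            if marker and len(marker.group(1)) <= level:
                break
            section.append(nxt)
        return "\n".join(section).strip()
    return None


def extract_steps(kind, body):
    """Try each anchor in order; fall back to the whole body if none match."""
    for anchor in ANCHORS[kind]:
        section = section_under(body, anchor)
        if section:
            return section
    return body


def checked_platforms(issue_body):
    """List the ticked entries under '## Platforms:', with aliases merged."""
    section = section_under(issue_body, "## Platforms:") or ""
    found = []
    for line in section.splitlines():
        match = _CHECKED.match(line)
        if match:
            name = PLATFORM_ALIASES.get(match.group(1), match.group(1))
            if name not in found:
                found.append(name)
    return found
```

In this sketch the whole-body fallback mirrors the note that the anchor list is a token-cost optimization rather than a hard contract: a body with no recognized heading is returned unchanged. The same `section_under` helper could also read `## Expected Result:` to populate the per-flow `expected` field.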
