
Commit abe4411

Lykhoyda and claude authored
feat(skills): creating-actions — guided authoring of reusable Maestro actions (#272)
* feat(skills): creating-actions — guided authoring of reusable Maestro actions

New skill walking the agent through the full action-authoring contract: inventory-dedup scan before authoring, creation-path choice (recorder / direct YAML / maestro_generate), selector grounding, a required ASCII flow diagram (screens + transitions annotated with exact testIDs and ${PARAMS}, embedded in the YAML header with glyph-first lines so parseM7Header cannot misread a diagram line as metadata — a bare "# status: ..." line demonstrably overwrites the field), the M7 header contract, pre-replay validation, and replay-to-promote via cdp_run_action.

Ships references/m7-header-reference.md (full field glossary, parser behavior, lifecycle, failure codes) and examples/add-product-to-cart.yaml — validated end-to-end against the real toolchain: parseM7Header (all 11 fields round-trip with the diagram embedded), learned-actions.mjs (inventory lists exact metadata + synthesizes the replay command), and Maestro check_flow_syntax (valid).

Skill tested RED→GREEN with subagents: baseline run produced no diagram and no dedup scan; with the skill loaded, a confined agent followed all steps and produced a toolchain-clean action on an unseen flow.

Routed from using-rn-dev-agent (decision tree + skill map, count 7→8) and cross-linked from rn-testing's M7 section; registered in plugin.json; changeset: rn-dev-agent-plugin minor.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(cdp): expose params in maestro_run + cdp_run_action MCP schemas

Codex review on PR #272 (P2, verified real): both handlers accept params (GH #116 — threaded to maestro as -e KEY=VALUE on first attempt AND post-repair retry, run-action.ts:107/258/447), but the zod registrations omitted the field. zod strips unknown keys by default, so parameter bindings were silently dropped at the tool-call layer and a parameterised action failed at runtime with unset ${VAR} placeholders — breaking the call shape the new creating-actions skill (and the pre-existing commands/run-action.md) document.

TDD: pr-272-params-schema-wiring.test.js pins both registrations (failed before the schema fix, passes after). Key-format validation (/^[A-Z_][A-Z0-9_]*$/) remains in the maestro_run handler. Full suite 1915/1915. dist rebuilt. Changeset: rn-dev-agent-cdp minor.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
1 parent 8ca1f0e commit abe4411

11 files changed

Lines changed: 356 additions & 2 deletions


Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+---
+"rn-dev-agent-plugin": minor
+---
+
+New `creating-actions` skill — guided authoring of reusable Maestro actions.
+
+Walks the agent through the full authoring contract: inventory-dedup scan before authoring (via `learned-actions.mjs`), creation-path choice (recorder vs direct YAML vs `maestro_generate`), selector grounding (never invent a testID), a **required ASCII flow diagram** of screens/transitions annotated with exact testIDs and `${PARAMS}` (embedded in the YAML header — glyph-first lines so the M7 parser can't misread a diagram line as metadata, which would otherwise silently overwrite fields like `status`), the M7 header contract, pre-replay validation (header round-trip through the inventory parser, placeholder↔params coverage, selector audit), and replay-to-promote via `cdp_run_action` (never hand-set `active`). Ships with a full M7 field reference (`references/m7-header-reference.md`) and a toolchain-validated worked example (`examples/add-product-to-cart.yaml` — verified against `parseM7Header`, `learned-actions.mjs`, and Maestro's syntax checker). Routed from `using-rn-dev-agent` (decision tree + skill map) and cross-linked from `rn-testing`'s M7 section.
Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+---
+"rn-dev-agent-cdp": minor
+---
+
+Expose `params` in the `maestro_run` and `cdp_run_action` MCP tool schemas.
+
+Both handlers have accepted `params` since GH #116 (forwarded to maestro as `-e KEY=VALUE` on the first attempt AND the post-repair retry), but the zod registrations omitted the field — and zod strips unknown keys by default, so a caller's parameter bindings were **silently dropped** at the tool-call layer and a parameterised action failed at runtime with unset `${VAR}` placeholders. Found by Codex review on PR #272 (the new `creating-actions` skill recommends `cdp_run_action({ actionId, params, trigger })`, which was un-callable as advertised; `commands/run-action.md` documented the same call shape). Key-format validation (`/^[A-Z_][A-Z0-9_]*$/`) stays in the handler. Wiring test pins both registrations.
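Reviewer note: the failure mode described in this changeset can be imitated without zod. The sketch below is a dependency-free approximation, not the project's code; `stripUnknownKeys` and `toMaestroEnvArgs` are hypothetical names, and only the key regex and the `-e KEY=VALUE` shape come from the changeset text.

```javascript
// Illustrative only: a dependency-free imitation of "strip unknown keys".
// stripUnknownKeys and toMaestroEnvArgs are hypothetical names, not project code.
const PARAM_KEY = /^[A-Z_][A-Z0-9_]*$/; // key format quoted in the changeset

// Keep only the keys a schema declares, the way a stripping parser would.
function stripUnknownKeys(input, declaredKeys) {
  const out = {};
  for (const key of declaredKeys) {
    if (key in input) out[key] = input[key];
  }
  return out;
}

// Turn validated params into `-e KEY=VALUE` argument pairs.
function toMaestroEnvArgs(params = {}) {
  const args = [];
  for (const [key, value] of Object.entries(params)) {
    if (!PARAM_KEY.test(key)) throw new Error(`invalid param key: ${key}`);
    args.push('-e', `${key}=${value}`);
  }
  return args;
}

const call = { actionId: 'add-product-to-cart', params: { PRODUCT_ID: '7' } };

// Before the fix: `params` is not declared, so it never reaches the handler.
const before = stripUnknownKeys(call, ['actionId', 'trigger']);
// After the fix: `params` is declared and survives parsing.
const after = stripUnknownKeys(call, ['actionId', 'trigger', 'params']);

console.log(toMaestroEnvArgs(before.params)); // []
console.log(toMaestroEnvArgs(after.params)); // [ '-e', 'PRODUCT_ID=7' ]
```

In the first case nothing is rejected and nothing is logged, which is why the drop was silent: the flow simply ran with its placeholders unset.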

.claude-plugin/plugin.json

Lines changed: 2 additions & 1 deletion
@@ -26,7 +26,8 @@
     "./skills/rn-testing",
     "./skills/rn-debugging",
     "./skills/rn-best-practices",
-    "./skills/rn-setup"
+    "./skills/rn-setup",
+    "./skills/creating-actions"
   ],
   "agents": [
     "./agents/rn-tester.md",

scripts/cdp-bridge/dist/index.js

Lines changed: 2 additions & 0 deletions
Generated file; diff not rendered by default.

scripts/cdp-bridge/src/index.ts

Lines changed: 2 additions & 0 deletions
@@ -898,6 +898,7 @@ trackedTool(
     appId: z.string().optional().describe('App bundle ID (auto-detected from app.json)'),
     appFile: z.string().optional().describe('iOS only — path to a built .app/.ipa for maestro-runner to reinstall on clearState. Auto-resolved from the flow appId when omitted (GH#201).'),
     timeoutMs: z.number().int().min(5000).max(300000).default(120000).describe('Execution timeout in ms'),
+    params: z.record(z.string(), z.string()).optional().describe('GH #116: parameter bindings forwarded as -e KEY=VALUE for ${KEY} placeholders in the flow. Keys must match /^[A-Z_][A-Z0-9_]*$/ (validated in the handler).'),
   },
   createMaestroRunHandler(),
 );
@@ -1158,6 +1159,7 @@ trackedTool(
     timeoutMs: z.number().optional().describe('Maestro execution timeout per attempt (ms). Default 120_000.'),
     trigger: z.enum(['agent', 'ci', 'human']).optional().describe('RunRecord trigger annotation. Default "agent". CI calls should pass "ci".'),
     forceReload: z.boolean().optional().describe('GH #173: when true (default), acknowledge any human edit to the YAML as the new baseline before running so downstream repair does not abort with STALE_TARGET. Pass false for the strict Phase 129 "respect external edits" behavior (useful for CI replays of fixed baselines).'),
+    params: z.record(z.string(), z.string()).optional().describe('Parameter bindings for the action\'s ${VAR} placeholders, forwarded to maestro as -e KEY=VALUE on the first attempt AND the post-repair retry (GH #116). Keys must match /^[A-Z_][A-Z0-9_]*$/ (validated in maestro_run).'),
   },
   // GH #186: supply a CDP-backed live-route reader so the route-drift guard is
   // actually active. Without this the handler defaulted getLiveRoute to a no-op
Lines changed: 32 additions & 0 deletions
@@ -0,0 +1,32 @@
+import { test } from 'node:test';
+import assert from 'node:assert/strict';
+import { readFileSync } from 'node:fs';
+import { fileURLToPath } from 'node:url';
+import { dirname, resolve } from 'node:path';
+
+const __dirname = dirname(fileURLToPath(import.meta.url));
+const indexSrc = readFileSync(resolve(__dirname, '../../src/index.ts'), 'utf8');
+
+// PR #272 review (Codex P2): both handlers accept `params` (GH #116 — forwarded
+// as -e KEY=VALUE on first attempt AND post-repair retry), but the MCP zod
+// registrations omitted the field. zod strips unknown keys by default, so a
+// caller's params were SILENTLY DROPPED at the tool-call layer and a
+// parameterised action failed with unset placeholders. These tests pin the
+// schema exposure so the registration can't regress behind the handlers again.
+
+/** Slice the trackedTool('<name>', ...) registration block out of index.ts. */
+function registrationBlock(toolName) {
+  const start = indexSrc.indexOf(`'${toolName}',`);
+  assert.notEqual(start, -1, `registration for ${toolName} not found`);
+  const rest = indexSrc.slice(start);
+  const next = rest.indexOf('trackedTool(', 1);
+  return next === -1 ? rest : rest.slice(0, next);
+}
+
+test('PR#272 maestro_run registration exposes params as a string record', () => {
+  assert.match(registrationBlock('maestro_run'), /params:\s*z\.record\(z\.string\(\), z\.string\(\)\)\.optional\(\)/);
+});
+
+test('PR#272 cdp_run_action registration exposes params as a string record', () => {
+  assert.match(registrationBlock('cdp_run_action'), /params:\s*z\.record\(z\.string\(\), z\.string\(\)\)\.optional\(\)/);
+});

skills/creating-actions/SKILL.md

Lines changed: 176 additions & 0 deletions
@@ -0,0 +1,176 @@
+---
+name: creating-actions
+description: This skill should be used when the user asks to "create an action", "save this flow as an action", "make this replayable", "record a reusable action", "author a Maestro flow as an action", "add a login/setup action", or when a verified UI walk should be persisted under .rn-agent/actions/ so future sessions can replay it with maestro.
+---
+
+# creating-actions — Author a Reusable Maestro Action
+
+An **action** is a parameterised Maestro flow at `<project>/.rn-agent/actions/<id>.yaml` with an M7 metadata header, replayable via `/rn-dev-agent:run-action` or `cdp_run_action` (auto-repair-aware). A well-authored action turns a ~minutes interactive walk into a ~seconds deterministic replay. Authoring one well means: **dedup first, ground every selector in evidence, design the flow as an ASCII diagram before writing YAML, validate, then replay to promote.**
+
+## When to Use
+
+- A verified flow is worth replaying: login, navigation prologue, multi-step setup, locale/theme switching, data seeding.
+- `/rn-dev-agent:test-feature` verification passed and the walk should be persisted.
+- The user asks to make a flow replayable / save an action.
+
+**When NOT to author an action:**
+- A one-off check — use `maestro_run` with `inlineYaml` and throw it away.
+- An existing action already covers the flow — extend or parameterise it instead of forking a near-duplicate.
+- The flow spans two apps — actions are single-`appId` by contract.
+
+## Step 0 — Scan the Inventory First (dedup)
+
+Before authoring anything, check what already exists:
+
+```bash
+node "${CLAUDE_PLUGIN_ROOT}/scripts/learned-actions.mjs" --json --section b \
+  --workspace-root "$PWD" --memory-cwd "$PWD" --filter <keyword>
+```
+
+(or `/rn-dev-agent:list-learned-actions <keyword>`). If a match covers the goal, replay it. If a near-match exists (same flow, hardcoded values), parameterise THAT action with `${VAR}` placeholders rather than creating a sibling — duplicate actions rot independently and split the repair history.
+
+## Step 1 — Pick the Creation Path
+
+| Situation | Path |
+|---|---|
+| About to walk the flow live on a device anyway | **Recorder**: `cdp_record_test_start` → drive UI (`cdp_interact` / `device_*`) → `cdp_record_test_stop` → `cdp_record_test_save_as_action` (writes header + sidecar; pass `intent`/`tags`/`mutates`/`produces` yourself — the recorder cannot infer them) |
+| Flow and selectors already known (prior exploration, existing test) | **Direct authoring** — Steps 2–6 below |
+| Structured steps in hand, want generated YAML | `maestro_generate` — then verify the M7 header per Step 4 and continue with Steps 5–6 |
+
+The recorder path still benefits from Steps 3 (diagram) and 5–6 (validate, replay to promote): add the diagram to the generated YAML's header before first replay.
+
+## Step 2 — Ground Every Selector (never invent a testID)
+
+Collect evidence for every element the flow will touch:
+
+- `cdp_component_tree(filter="<screen-or-component>")` per screen — filtered, never the full tree
+- `device_snapshot` for what is actually on the native screen
+- `grep -r 'testID=' src/` for static discovery
+- `cdp_nav_graph` / `cdp_navigation_state` for exact route names (used in `expectedRouteSequence`)
+
+If an element the flow needs has **no testID, stop and add one to the app source first** (see the rn-testing skill for testID conventions). Text-based selectors break on i18n and copy edits; an action built on them generates repair churn.
+
+## Step 3 — Draw the ASCII Flow Diagram (required, before any YAML)
+
+Map the flow screen-by-screen with the exact selectors. This is the design-review artifact: parameter gaps, missing assertion anchors, and wrong start-state assumptions are cheap to fix here and expensive to fix after the YAML exists.
+
+Canonical format — one `[RouteName]` box line per screen, `│`-prefixed arrow lines for interactions, each labelled with the exact selector; one **anchor** (the `assertVisible` proving arrival) per screen; `${PARAMS}` marked where caller data flows in:
+
+```
+[any screen]
+    │ launchApp (stopApp: false)
+    │ tapOn tab-home
+
+[Home] anchor: product-list
+    │ scrollUntilVisible product-card-${PRODUCT_ID} (if off-screen)
+    │ tapOn product-add-btn-${PRODUCT_ID}
+
+[Home] cart-badge increments
+    │ tapOn tab-cart
+
+[Cart] anchor: cart-list
+       verify: cart-item-${PRODUCT_ID}
+```
+
+Review the diagram against Step-2 evidence before continuing:
+- [ ] Every selector in the diagram exists in the gathered tree/snapshot
+- [ ] Every transition has an anchor on the destination screen
+- [ ] Everything caller-variable is a `${PARAM}`; everything else is fixed
+- [ ] The entry assumption is explicit (works from any screen? requires login?)
+
+**Embed the diagram in the YAML header** (below the M7 block) so the action documents itself and repair reviews can see intended structure. Safety rules for embedding — the M7 parser trims each comment line and treats a leading `word: value` as metadata:
+
+1. Every line starts with `#` — a fully blank line ends the header block.
+2. Every diagram line's content starts with a **non-letter glyph** (`[`, `│`, `(`; indentation is NOT enough). A line like `# status: shows spinner` would silently **overwrite the action's `status` metadata**.
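Reviewer note: the two embedding rules above are mechanical enough to lint before the first replay. A minimal sketch follows, assuming only what the rules state; `lintDiagramLines` is a hypothetical helper and does not reproduce the real `parseM7Header`.

```javascript
// Hypothetical lint for diagram lines embedded in an action header.
// It mirrors the two rules above; it is not the real parseM7Header.
function lintDiagramLines(lines) {
  const problems = [];
  lines.forEach((line, i) => {
    const trimmed = line.trim();
    // Rule 1: every line must be a comment; a blank line ends the header.
    if (!trimmed.startsWith('#')) {
      problems.push(`line ${i + 1}: not #-prefixed, the header block ends here`);
      return;
    }
    // Rule 2: after the '#', content must not start with a letter, because
    // a leading `word: value` is treated as metadata.
    const content = trimmed.slice(1).trim();
    if (/^[A-Za-z]/.test(content)) {
      problems.push(`line ${i + 1}: "${content}" starts with a letter`);
    }
  });
  return problems;
}

const sample = [
  '# [Home]  anchor: product-list',
  '#   │ tapOn tab-cart',
  '# status: shows spinner',
  '',
];

console.log(lintDiagramLines(sample));
```

The first two sample lines pass because their content starts with `[` and `│`; the third is the `status` overwrite case from rule 2, and the fourth is the blank line from rule 1.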
+
+## Step 4 — Write the YAML
+
+```yaml
+appId: com.example.shop
+---
+# id: add-product-to-cart
+# intent: From any screen, add product PRODUCT_ID to the cart and verify it landed.
+# tags: [cart, add, smoke]
+# mutates: true
+# status: experimental
+# params: [PRODUCT_ID]
+# appId: com.example.shop
+#
+# [diagram from Step 3 — every line #-prefixed, glyph-first]
+- launchApp:
+    stopApp: false
+- tapOn:
+    id: "tab-home"
+- assertVisible:
+    id: "product-list"
+# ... steps mirror the diagram 1:1
+```
+
+Contract rules (violations break replay, repair, or inventory):
+
+- **id / filename**: lower-case kebab-case `^[a-z0-9][a-z0-9-]*$`; file is `.rn-agent/actions/<id>.yaml`.
+- **Header**: `id`, `intent`, `tags`, `mutates`, `status` are the 5 inventory keys — a missing `mutates` renders as `?` in `/list-learned-actions`. Always `status: experimental` at creation; promotion to `active` is earned by a clean replay, never hand-set. Full field glossary (incl. `produces`, `expectedRouteSequence`, `author`): `references/m7-header-reference.md`.
+- **Params**: keys match `[A-Z_][A-Z0-9_]*`; every `${VAR}` in the steps is listed in `# params`, and vice versa. The inventory scanner counts `${...}` occurrences **anywhere in the file, comments included** — so the diagram may mark real step params as `${PRODUCT_ID}`, but prose (e.g. the `intent` line) uses bare names, and no comment may mention a `${VAR}` the steps don't use.
+- **Body**: `launchApp: { stopApp: false }` self-bootstrap (works cold or warm, preserves login); conditional prologues via `runFlow: { when: { visible: ... } }`; `waitForAnimationToEnd` after transitions; the diagram's anchor `assertVisible` after each screen change; `scrollUntilVisible` for potentially off-screen targets.
+- **Never `clearState: true`** on an Expo Dev Client build — it wipes the Metro URL and strands the launcher (GH #8).
+- Do **not** hand-write the sidecar (`.rn-agent/state/<id>.state.json`) — it is created lazily on first load/replay.
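Reviewer note: the id, filename and param-key rules in this list reduce to two regular expressions and one string comparison. The sketch below is an illustration under that reading; `checkActionNaming` is a hypothetical name, and the regexes are the ones quoted in the skill, anchored at both ends.

```javascript
// Hypothetical naming check built from the two regexes quoted in the skill.
const ACTION_ID = /^[a-z0-9][a-z0-9-]*$/;
const PARAM_KEY = /^[A-Z_][A-Z0-9_]*$/;

function checkActionNaming(id, fileName, paramKeys) {
  const errors = [];
  if (!ACTION_ID.test(id)) {
    errors.push(`id "${id}" is not lower-case kebab-case`);
  }
  // The file under .rn-agent/actions/ must be named after the id.
  if (fileName !== `${id}.yaml`) {
    errors.push(`file "${fileName}" should be "${id}.yaml"`);
  }
  for (const key of paramKeys) {
    if (!PARAM_KEY.test(key)) {
      errors.push(`param key "${key}" does not match the param key regex`);
    }
  }
  return errors;
}

console.log(checkActionNaming('add-product-to-cart', 'add-product-to-cart.yaml', ['PRODUCT_ID'])); // []
console.log(checkActionNaming('Add_Product', 'add-product.yaml', ['productId']).length); // 3
```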
+
+Copy-adapt the complete worked example: `examples/add-product-to-cart.yaml`.
+
+## Step 5 — Validate Before First Replay
+
+1. **Header parses + inventory lists it**: re-run the Step-0 command with `--filter <id>` — confirm `intent`, `tags`, `mutates`, `status` come back exactly as written (not `?`). This also proves the embedded diagram didn't corrupt the header.
+2. **Placeholder coverage**: `grep -o '\${[A-Z_][A-Z0-9_]*}' <file>` over the steps ↔ `# params` list, both directions.
+3. **Selector audit**: every `id:` in the YAML appears in the Step-2 evidence.
+4. **Syntax**: if the maestro MCP server is connected, `check_flow_syntax` on the body; otherwise the first replay doubles as the syntax check.
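Reviewer note: the placeholder coverage check in item 2 has to hold in both directions, which a single grep does not show directly. A minimal sketch, assuming the whole file is scanned (comments included, as the contract rules describe for the inventory scanner); `placeholderCoverage` is a hypothetical helper, not part of the toolchain.

```javascript
// Hypothetical two-way coverage check between ${VAR} placeholders and `# params`.
// The whole text is scanned, comments included.
function placeholderCoverage(yamlText, declaredParams) {
  const used = new Set();
  for (const match of yamlText.matchAll(/\$\{([A-Z_][A-Z0-9_]*)\}/g)) {
    used.add(match[1]);
  }
  const declared = new Set(declaredParams);
  return {
    // Placeholders in the file that `# params` does not list.
    undeclared: [...used].filter((key) => !declared.has(key)).sort(),
    // Entries in `# params` that no placeholder uses.
    unused: [...declared].filter((key) => !used.has(key)).sort(),
  };
}

// Single-quoted strings keep the ${...} text literal.
const yamlText = [
  '# params: [PRODUCT_ID, COUPON]',
  '#   │ tapOn product-add-btn-${PRODUCT_ID} (x${QTY})',
  '- tapOn:',
  '    id: "product-add-btn-${PRODUCT_ID}"',
].join('\n');

console.log(placeholderCoverage(yamlText, ['PRODUCT_ID', 'COUPON']));
// { undeclared: [ 'QTY' ], unused: [ 'COUPON' ] }
```

Here `QTY` appears only in a comment and still counts as used, which is the phantom-parameter case listed under Common Mistakes.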
+
+## Step 6 — Replay to Promote
+
+Replay through the orchestrator — not raw `maestro_run` — so the run is recorded and auto-repair-aware:
+
+```
+cdp_run_action({ actionId: "<id>", params: { PRODUCT_ID: "7" }, trigger: "agent" })
+```
+
+- First clean pass auto-promotes `experimental → active` and materialises the sidecar.
+- Verify the outcome by **state, not pixels**: `cdp_store_state`, `expect_redux` / `expect_route` / `expect_visible_by_testid`.
+- `mutates: true` actions leave residue — clean up between runs or use timestamp-suffixed param values so repeated replays stay deterministic.
+- Exercise the variable branch (e.g. one on-screen and one off-screen `PRODUCT_ID`) before trusting the action.
+- **No device available?** Leave `status: experimental` and say so explicitly — never hand-promote.
+
+After any later auto-repair or manual selector edit, **update the embedded diagram** to match — a stale diagram misleads the next repair review.
+
+## Quick Reference
+
+| What | Where / Rule |
+|---|---|
+| Action file | `<project>/.rn-agent/actions/<id>.yaml` |
+| Sidecar (auto-created) | `<project>/.rn-agent/state/<id>.state.json` |
+| id regex | `^[a-z0-9][a-z0-9-]*$` |
+| param key regex | `[A-Z_][A-Z0-9_]*` |
+| Inventory / dedup | `scripts/learned-actions.mjs` or `/rn-dev-agent:list-learned-actions` |
+| Replay | `cdp_run_action` / `/rn-dev-agent:run-action <id> -e KEY=VAL` |
+| Lifecycle | `experimental` → (clean replay) → `active`; repair demotes back to `experimental`; `deprecated` = never replay |
+| Repair budget | 3 auto-repairs per rolling 24h per action |
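Reviewer note: the Lifecycle and Repair budget rows describe a small state machine plus a rolling-window counter. The sketch below models only what those two rows say; the event names (`clean-replay`, `repair`) and both function names are assumptions made for illustration, and the orchestrator's real bookkeeping may differ.

```javascript
// Hypothetical model of the Lifecycle and Repair budget rows.
// Event names and function names are illustrative, not the orchestrator's API.
const DAY_MS = 24 * 60 * 60 * 1000;
const REPAIR_BUDGET = 3; // auto-repairs per rolling 24h per action

function nextStatus(status, event) {
  if (status === 'deprecated') return 'deprecated'; // never replayed, never promoted
  if (event === 'clean-replay') return 'active'; // promotion is earned by a clean replay
  if (event === 'repair') return 'experimental'; // a repair demotes until the next clean replay
  return status;
}

function canAutoRepair(repairTimestampsMs, nowMs) {
  // Count only repairs that fall inside the rolling 24h window.
  const recent = repairTimestampsMs.filter((t) => nowMs - t < DAY_MS);
  return recent.length < REPAIR_BUDGET;
}

const HOUR_MS = 60 * 60 * 1000;
const now = 10 * DAY_MS;

console.log(nextStatus('experimental', 'clean-replay')); // active
console.log(nextStatus('active', 'repair')); // experimental
console.log(canAutoRepair([now - HOUR_MS, now - 2 * HOUR_MS, now - 3 * HOUR_MS], now)); // false
console.log(canAutoRepair([now - HOUR_MS, now - 2 * HOUR_MS, now - 25 * HOUR_MS], now)); // true
```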
+
+## Common Mistakes
+
+| Mistake | Consequence |
+|---|---|
+| Skipping the Step-0 inventory scan | Duplicate actions that drift apart; repair history split |
+| Inventing a testID that "should" exist | `SELECTOR_NOT_FOUND` on first replay; wasted repair budget |
+| Hardcoding values instead of `${PARAMS}` | Action only replays one scenario; near-duplicates multiply |
+| `status: active` at creation | Unvalidated flow treated as production-quality by replay-first routing |
+| Diagram line starting with a bare `word:` | Silently overwrites M7 metadata (e.g. `status`) |
+| Blank (non-`#`) line inside the header | Parser stops early; later M7 keys ignored |
+| `${VAR}` in a comment that no step uses | Inventory synthesizes a phantom `-e VAR=...`; replay pre-flight demands a param the flow ignores |
+| `clearState: true` on Dev Client | App strands on the Dev Client launcher (GH #8) |
+| Raw `maestro_run` for a saved action | No RunRecord, no auto-repair, no promotion |
+| Hand-writing the sidecar | Stale `lastSeenMtimeMs` → false `EXTERNAL_EDIT` repair refusals |
+
+## Related
+
+- **rn-testing skill** — Maestro step patterns, timing rules, testID conventions, auth/permission pre-flights
+- **`references/m7-header-reference.md`** — every M7 field with semantics, parser behavior, lifecycle transitions
+- **`examples/add-product-to-cart.yaml`** — complete worked example with embedded diagram
+- **`/rn-dev-agent:run-action`** — replay-side pre-flight (mutates confirmation, appId match, param coverage)
