Merge pull request #156 from Lykhoyda/fix/issue-116-run-action-params-plumbing

Lykhoyda · web-flow · commit 211100ff6812 · 2026-05-13T12:00:39.000+02:00
fix(gh-116): wire cdp_run_action into /run-action slash command via params plumbing
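
For orientation before the diff, here is a minimal TypeScript sketch of the params plumbing this commit describes: `-e KEY=VALUE` pairs are parsed into a `params` object, keys are checked against the pattern, and the pairs are appended to a base argv. Only `PARAM_KEY_RE` and the `-e KEY=VALUE` argv shape come from the diff. The helper names `parseEFlags` and `buildArgv`, the `ParseResult` type, and the error wording are hypothetical and are not part of the PR.

```typescript
// Key pattern copied from the diff (maestro-run.ts).
const PARAM_KEY_RE = /^[A-Z_][A-Z0-9_]*$/;

// Hypothetical result type for this sketch only.
type ParseResult =
  | { ok: true; params: Record<string, string> }
  | { ok: false; error: string };

/** Hypothetical helper: turn ["TITLE=Buy milk", ...] into a params object. */
function parseEFlags(pairs: string[]): ParseResult {
  const params: Record<string, string> = {};
  for (const pair of pairs) {
    const eq = pair.indexOf("=");
    if (eq <= 0) {
      return { ok: false, error: `expected KEY=VALUE, got '${pair}'` };
    }
    const key = pair.slice(0, eq);
    if (!PARAM_KEY_RE.test(key)) {
      return { ok: false, error: `invalid param key '${key}'` };
    }
    // Split on the first '=' only, so values may themselves contain '='.
    params[key] = pair.slice(eq + 1);
  }
  return { ok: true, params };
}

/** Hypothetical helper: append `-e KEY=VALUE` pairs after the base argv. */
function buildArgv(baseArgs: string[], params: Record<string, string>): string[] {
  const out = [...baseArgs];
  for (const [key, value] of Object.entries(params)) {
    // Each pair is a single argv entry; no shell ever parses it.
    out.push("-e", `${key}=${value}`);
  }
  return out;
}
```

Because the argv is handed to `execFile` rather than a shell, a value such as `$(rm -rf /)` would reach the subprocess as literal text. The key pattern exists to stop a key from being mistaken for a flag or from breaking the `KEY=VALUE` join.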
diff --git a/.claude-plugin/marketplace.json b/.claude-plugin/marketplace.json
@@ -9,7 +9,7 @@
     {
       "name": "rn-dev-agent",
       "description": "AI agent that fully tests React Native features on simulator/emulator — navigates the app, verifies UI, walks user flows, and confirms internal state.",
-      "version": "0.44.42",
+      "version": "0.44.43",
       "source": "./",
       "category": "mobile-development",
       "homepage": "https://github.com/Lykhoyda/rn-dev-agent"
diff --git a/.claude-plugin/plugin.json b/.claude-plugin/plugin.json
@@ -1,6 +1,6 @@
 {
   "name": "rn-dev-agent",
-  "version": "0.44.42",
+  "version": "0.44.43",
   "description": "AI agent that fully tests React Native features on simulator/emulator — navigates the app, verifies UI, walks user flows, and confirms internal state.",
   "author": {
     "name": "Anton Lykhoyda",
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -4,6 +4,43 @@ All notable changes to rn-dev-agent will be documented in this file.
 
 Format follows [Keep a Changelog](https://keepachangelog.com/).
 
+## [0.44.43] — 2026-05-13
+
+### Added (GH #116 — wire cdp_run_action into /run-action slash command)
+
+- **`maestro_run` now accepts `params: Record<string, string>`**, which is
+  forwarded to maestro-runner as `-e KEY=VALUE` argv pairs. Keys must
+  match `[A-Z_][A-Z0-9_]*` (Maestro env-style convention); anything
+  else is refused at the handler boundary so a hostile payload can't
+  become a shell-injectable flag. Values must be strings. Since the
+  invocation uses `execFile` (not `exec`), each `KEY=VALUE` pair is its
+  own argv entry, so shell metacharacters are inert by construction.
+- **`cdp_run_action` forwards `params`** through to both the first
+  `maestro_run` call AND the post-repair retry, so a parameterised flow
+  replays identically after auto-repair.
+- **`/rn-dev-agent:run-action` slash command** is rewritten to call
+  `cdp_run_action` via MCP rather than shelling out to maestro-runner
+  directly. User invocations of `run-action wizard-create-task -e
+  TITLE=...` now benefit from auto-repair, structured RunRecords, and
+  the GH #120 per-phase timing. The slash command still parses args
+  locally (positional + `-e` + `--platform` + `--dry-run` + new
+  `--no-auto-repair`) but delegates execution to the MCP tool.
+- `--dry-run` keeps the bash-only path since `cdp_run_action` always
+  executes.
+- 6 new handler tests cover: malformed-key refusal (5 shell-injection
+  shapes), non-string-value refusal, well-formed key acceptance,
+  cdp_run_action's params forwarding to the first maestro_run call,
+  end-to-end params threading via a real temp project fixture, and
+  params persistence into the post-repair retry path.
+  Suite: 1312 → 1318 passing.
+
+### Note
+
+Step #4 of issue #116 ("Live smoke: replay wizard-create-task with
+-e TITLE=foo end-to-end on a booted simulator") is left for a
+maintainer-driven verification — it requires a live simulator with the
+test app and is outside the scope of this code-only PR.
+
 ## [0.44.42] — 2026-05-13
 
 ### Hardened (GH #113 — saveAction precondition becomes a runtime soft-assertion)
diff --git a/commands/run-action.md b/commands/run-action.md
@@ -1,8 +1,8 @@
 ---
 command: run-action
-description: Execute a learned Maestro flow ("action") by name with optional -e KEY=VALUE parameters. Looks the flow up via scripts/learned-actions.mjs (same inventory as /rn-dev-agent:list-learned-actions), then replays it with maestro-runner. Counterpart to /list-learned-actions — list discovers, run executes.
-argument-hint: <action-name> [-e KEY=VALUE ...] [--platform ios|android] [--dry-run]
-allowed-tools: Bash, Read, Glob
+description: Execute a learned Maestro flow ("action") by name with optional -e KEY=VALUE parameters. Looks the flow up via scripts/learned-actions.mjs (same inventory as /rn-dev-agent:list-learned-actions), then replays it via cdp_run_action — auto-repair-aware orchestration with structured RunRecords (GH #116). Counterpart to /list-learned-actions — list discovers, run executes.
+argument-hint: <action-name> [-e KEY=VALUE ...] [--platform ios|android] [--no-auto-repair] [--dry-run]
+allowed-tools: Bash, Read, Glob, mcp__plugin_rn-dev-agent_cdp__cdp_run_action
 ---
 
 Execute the learned action: $ARGUMENTS
@@ -23,23 +23,28 @@ without `.yaml`). Substring + case-insensitive — `task-create` will match
 The first positional arg is the action name (required). Subsequent args are
 passed through to `maestro-runner` verbatim:
 
-- `-e KEY=VALUE` — environment variable for `${KEY}` placeholders in the flow
+- `-e KEY=VALUE` — environment variable for `${KEY}` placeholders in the flow. Keys must match `[A-Z_][A-Z0-9_]*` (Maestro convention) — anything else is rejected by `cdp_run_action` / `maestro_run` (GH #116).
 - `--platform <ios|android>` — target device (auto-detected from booted device if omitted)
-- `--dry-run` — print the resolved replay command without executing it
+- `--no-auto-repair` — opt out of `cdp_repair_action` retry on `SELECTOR_NOT_FOUND` (default: auto-repair on)
+- `--dry-run` — print the resolved replay command without executing it (bash-only path; bypasses `cdp_run_action`)
 
 Example calls:
 
 ```
 /rn-dev-agent:run-action wizard-create-task -e TITLE="Buy milk" -e PRIORITY=high -e TAG=feature -e DESC="Test"
 /rn-dev-agent:run-action mark-all-done --platform android
 /rn-dev-agent:run-action wizard-create-task --dry-run -e TITLE=foo -e PRIORITY=low -e TAG=bug -e DESC=test
+/rn-dev-agent:run-action mark-all-done --no-auto-repair    # surface the raw failure without patching
 ```
 
 ## Protocol
 
 1. **Parse arguments.** First word of `$ARGUMENTS` is the action name. Detect
-   `--platform`, `--dry-run`, and collect every `-e KEY=VALUE` pair. Treat
-   anything else as a passthrough flag.
+   `--platform`, `--dry-run`, `--no-auto-repair`, and collect every
+   `-e KEY=VALUE` pair into a `params` object (key must match
+   `[A-Z_][A-Z0-9_]*`; reject malformed early — `cdp_run_action` will
+   refuse them anyway, but catching at parse time gives a clearer
+   error). Treat anything else as a passthrough flag.
 
 2. **Resolve the action via the script** (single source of truth — never glob
    `.rn-agent/actions/` directly):
@@ -84,25 +89,78 @@ Example calls:
    - If both are booted, stop and ask the user to pass `--platform`.
    - If neither, stop and tell the user to boot a device.
 
-5. **Build the replay command**:
-   ```bash
-   FLOW_PATH=$(echo "$RESULT" | jq -r '.sections.flows.items[0].path')
-   CMD=(maestro-runner --platform "$PLATFORM" test)
-   for KV in "${E_FLAGS[@]}"; do CMD+=(-e "$KV"); done
-   CMD+=("$FLOW_PATH")
+5. **Build the call** to `cdp_run_action` from the parsed args. The action
+   id is the inventory match's `flow` field (filename without `.yaml`).
+   Convert the `-e KEY=VALUE` array into a `params` object:
+   ```js
+   {
+     actionId: "<flow-name>",
+     platform: "<ios|android>",          // omit to auto-detect
+     params: { TITLE: "Buy milk", PRIORITY: "high", ... },
+     autoRepair: !noAutoRepair,          // default true; --no-auto-repair flips to false
+     trigger: "agent"                    // or "human" / "ci" based on context
+   }
+   ```
+   If `--dry-run`, do NOT call `cdp_run_action`. Print the resolved call
+   shape (the JSON args object as above) plus the would-be Maestro CLI
+   `maestro-runner --platform <PLATFORM> test -e K=V ... <FLOW_PATH>` and
+   stop. The `cdp_run_action` tool always executes, so a separate
+   bash-print path is necessary for dry-run.
+
+6. **Execute via MCP**:
+   ```
+   cdp_run_action({ actionId, platform, params, autoRepair, trigger })
+   ```
+   Read the returned envelope's `data` field. Shape (matches
+   `scripts/cdp-bridge/src/tools/run-action.ts`):
+   ```
+   {
+     ok: true | false,
+     data: {
+       actionId,
+       passed: boolean,                 // happy path: true
+       autoRepair: {
+         attempted: boolean,
+         outcome: 'skipped' | 'passed' | 'failed' | 'refused',
+         refusedReason?: 'USER_DISABLED' | 'NOT_REPAIRABLE_KIND' | 'EDITED_SINCE_LOAD'
+                       | 'BUDGET_EXHAUSTED' | 'NO_CANDIDATE',
+         phases?: { firstAttemptMs, repairMs?, retryMs? },
+         diff?: string                  // patch summary when outcome === 'passed'
+       },
+       durationMs,
+       flowFile,
+       firstAttemptOutput?: string,     // first 500 chars of maestro stdout/stderr
+       retryOutput?: string,            // present iff retriedAfterRepair === true
+       retriedAfterRepair?: boolean
+     }
+   }
    ```
-   If `--dry-run`, print `${CMD[*]}` and stop. Otherwise execute it.
-
-6. **Execute and report**:
-   - Stream the output (don't swallow); capture exit code.
-   - On success: report `passed/total` commands and `duration_ms` from the
-     report JSON (`reports/<timestamp>/report.json`), plus the full path so
-     the user can inspect screenshots and the hierarchy XML.
-   - On failure: extract the failing command from the report JSON and the
-     associated `assets/flow-000/cmd-NNN-after.png` screenshot path. Diagnose
-     in three lines max — point at the most likely cause (stale testID,
-     iOS keyboard digraph drop per `feedback_maestro_patterns.md` item 9,
-     auth state lost, etc.) — DO NOT auto-edit the flow.
+   The persisted RunRecord lands in the sidecar at
+   `<project>/.rn-agent/state/<actionId>.state.json` as a side effect
+   of `cdp_run_action`. Read it from that file, not from
+   `data.runRecord` (which is not present in the response).
+
+   Branch on `data.autoRepair.outcome`:
+   - **`outcome === 'skipped'`** with `attempted: false`: happy path —
+     report `✅ <flow-name> passed in <durationMs>ms` and stop.
+   - **`outcome === 'passed'`** with `attempted: true` and
+     `retriedAfterRepair: true`: repaired-and-passed. Report
+     `🩹 <flow-name> failed, repaired, then passed` and, if
+     `data.autoRepair.diff` is present, print the one-line patch summary.
+     Suggest that the user run `git diff .rn-agent/actions/<id>.yaml` to inspect.
+   - **`outcome === 'failed'`**: post-repair retry still failed —
+     `data.retryOutput` carries the trailing maestro output for
+     diagnosis.
+   - **`outcome === 'refused'`** with `refusedReason`: auto-repair declined
+     (user disabled, failure kind not repairable, file edited since load,
+     repair budget exhausted, or no candidate). Surface the refused reason
+     verbatim. DO NOT edit the flow yourself; suggest re-running with
+     `--no-auto-repair` to see the raw failure or running `cdp_repair_action` manually.
+
+   In every failure case, diagnose in three lines max: point at the most
+   likely cause (stale testID, iOS keyboard digraph drop per
+   `feedback_maestro_patterns.md` item 9, auth state lost, etc.). DO
+   NOT auto-edit the flow.
 
 ## Output
 
diff --git a/scripts/cdp-bridge/dist/tools/maestro-run.js b/scripts/cdp-bridge/dist/tools/maestro-run.js
@@ -9,6 +9,10 @@ import { resolveBundleId, readExpoSlug } from '../project-config.js';
 import { chooseMaestroDispatch, shouldWarnFallback } from './maestro-dispatch.js';
 import { buildMaestroFlow, parseAndValidateFlow, isValidBundleId, MaestroValidationError, } from '../domain/maestro-validator.js';
 const execFile = promisify(execFileCb);
+/** GH #116: Maestro env-style key pattern. Refuses anything that could
+ *  syntactically be confused with a flag (`--`, `-e`) or break the
+ *  KEY=VALUE join (`=`, space, control chars). Strict; documented. */
+const PARAM_KEY_RE = /^[A-Z_][A-Z0-9_]*$/;
 function resolvePlatform(override) {
     if (override === 'ios' || override === 'android')
         return override;
@@ -24,6 +28,21 @@ function resolveAppId(override, platform) {
 }
 export function createMaestroRunHandler() {
     return async (args) => {
+        // GH #116: validate params shape FIRST so a malformed payload is rejected
+        // regardless of platform / dispatch-tier availability. CI envs without
+        // maestro-runner or Maestro CLI would otherwise short-circuit at
+        // chooseMaestroDispatch before reaching the validator.
+        if (args.params) {
+            for (const [key, value] of Object.entries(args.params)) {
+                if (!PARAM_KEY_RE.test(key)) {
+                    return failResult(`Refusing to run Maestro: invalid param key '${String(key).slice(0, 60)}' ` +
+                        `— must match ${PARAM_KEY_RE.source} (GH #116).`);
+                }
+                if (typeof value !== 'string') {
+                    return failResult(`Refusing to run Maestro: param '${key}' has non-string value (GH #116).`);
+                }
+            }
+        }
         const platform = resolvePlatform(args.platform);
         if (!platform) {
             return failResult('Cannot determine platform. Pass platform or open a device session first.');
@@ -83,8 +102,21 @@ export function createMaestroRunHandler() {
             throw err;
         }
         const timeout = args.timeoutMs ?? 120_000;
+        // GH #116: build the final argv. Start with the dispatch tier's
+        // base args, then append `-e KEY=VALUE` pairs for any supplied
+        // params. Validation already ran at the top of the handler so by
+        // this point every key matches PARAM_KEY_RE and every value is a
+        // string — no need to re-check.
+        const baseArgs = dispatch.buildArgs(platform, flowFile);
+        const paramArgs = [];
+        if (args.params) {
+            for (const [key, value] of Object.entries(args.params)) {
+                paramArgs.push('-e', `${key}=${value}`);
+            }
+        }
+        const finalArgs = [...baseArgs, ...paramArgs];
         try {
-            const { stdout, stderr } = await execFile(dispatch.binPath, dispatch.buildArgs(platform, flowFile), { timeout, encoding: 'utf8' });
+            const { stdout, stderr } = await execFile(dispatch.binPath, finalArgs, { timeout, encoding: 'utf8' });
             const output = (stdout + '\n' + stderr).trim();
             const passed = !output.includes('FAILED') && !output.includes('Error:');
             const meta = {
diff --git a/scripts/cdp-bridge/dist/tools/run-action.js b/scripts/cdp-bridge/dist/tools/run-action.js
@@ -145,6 +145,7 @@ export function createRunActionHandler(deps = {}) {
                 flowPath: action.filePath,
                 platform: args.platform,
                 timeoutMs,
+                params: args.params,
             });
             const firstAttemptMs = Date.now() - tBeforeFirst;
             const firstEnv = parseEnvelope(firstResult, 'maestro_run');
@@ -277,6 +278,7 @@ export function createRunActionHandler(deps = {}) {
                 flowPath: reloadedAction.filePath,
                 platform: args.platform,
                 timeoutMs,
+                params: args.params,
             });
             const retryMs = Date.now() - tBeforeRetry;
             const retryEnv = parseEnvelope(retryResult, 'maestro_run');
diff --git a/scripts/cdp-bridge/package.json b/scripts/cdp-bridge/package.json
@@ -1,6 +1,6 @@
 {
   "name": "rn-dev-agent-cdp",
-  "version": "0.38.37",
+  "version": "0.38.38",
   "type": "module",
   "main": "dist/index.js",
   "scripts": {
diff --git a/scripts/cdp-bridge/src/tools/maestro-run.ts b/scripts/cdp-bridge/src/tools/maestro-run.ts
@@ -23,8 +23,23 @@ interface MaestroRunArgs {
   platform?: 'ios' | 'android';
   appId?: string;
   timeoutMs?: number;
+  /**
+   * GH #116: per-flow parameter bindings forwarded as `-e KEY=VALUE`
+   * pairs to the maestro-runner subprocess. Keys must match
+   * /^[A-Z_][A-Z0-9_]*$/ (Maestro's documented env-style convention) —
+   * any other key shape is refused so a malformed/hostile payload can't
+   * become a shell-injectable flag. Values are NOT quoted; each
+   * `KEY=VALUE` pair is passed as its own argv entry, so shell
+   * metacharacters are inert by construction (execFile, not exec).
+   */
+  params?: Record<string, string>;
 }
 
+/** GH #116: Maestro env-style key pattern. Refuses anything that could
+ *  syntactically be confused with a flag (`--`, `-e`) or break the
+ *  KEY=VALUE join (`=`, space, control chars). Strict; documented. */
+const PARAM_KEY_RE = /^[A-Z_][A-Z0-9_]*$/;
+
 function resolvePlatform(override?: string): 'ios' | 'android' | null {
   if (override === 'ios' || override === 'android') return override;
   const session = getActiveSession();
@@ -39,6 +54,26 @@ function resolveAppId(override?: string, platform?: string): string {
 
 export function createMaestroRunHandler(): (args: MaestroRunArgs) => Promise<ToolResult> {
   return async (args) => {
+    // GH #116: validate params shape FIRST so a malformed payload is rejected
+    // regardless of platform / dispatch-tier availability. CI envs without
+    // maestro-runner or Maestro CLI would otherwise short-circuit at
+    // chooseMaestroDispatch before reaching the validator.
+    if (args.params) {
+      for (const [key, value] of Object.entries(args.params)) {
+        if (!PARAM_KEY_RE.test(key)) {
+          return failResult(
+            `Refusing to run Maestro: invalid param key '${String(key).slice(0, 60)}' ` +
+            `— must match ${PARAM_KEY_RE.source} (GH #116).`,
+          );
+        }
+        if (typeof value !== 'string') {
+          return failResult(
+            `Refusing to run Maestro: param '${key}' has non-string value (GH #116).`,
+          );
+        }
+      }
+    }
+
     const platform = resolvePlatform(args.platform);
     if (!platform) {
       return failResult(
@@ -102,10 +137,24 @@ export function createMaestroRunHandler(): (args: MaestroRunArgs) => Promise<Too
 
     const timeout = args.timeoutMs ?? 120_000;
 
+    // GH #116: build the final argv. Start with the dispatch tier's
+    // base args, then append `-e KEY=VALUE` pairs for any supplied
+    // params. Validation already ran at the top of the handler so by
+    // this point every key matches PARAM_KEY_RE and every value is a
+    // string — no need to re-check.
+    const baseArgs = dispatch.buildArgs(platform, flowFile);
+    const paramArgs: string[] = [];
+    if (args.params) {
+      for (const [key, value] of Object.entries(args.params)) {
+        paramArgs.push('-e', `${key}=${value}`);
+      }
+    }
+    const finalArgs = [...baseArgs, ...paramArgs];
+
     try {
       const { stdout, stderr } = await execFile(
         dispatch.binPath,
-        dispatch.buildArgs(platform, flowFile),
+        finalArgs,
         { timeout, encoding: 'utf8' },
       );
 
diff --git a/scripts/cdp-bridge/src/tools/run-action.ts b/scripts/cdp-bridge/src/tools/run-action.ts
@@ -95,6 +95,14 @@ export interface RunActionArgs {
    * 'ci'; human-driven invocations 'human'.
    */
   trigger?: 'agent' | 'ci' | 'human';
+  /**
+   * GH #116: per-flow parameter bindings forwarded to maestro_run as
+   * `-e KEY=VALUE` pairs. Keys must match Maestro's env-style convention
+   * `/^[A-Z_][A-Z0-9_]*$/`; validation enforced in maestro_run itself.
+   * Passed through unchanged on both first attempt AND post-repair retry
+   * so a parameterised flow can be replayed identically after repair.
+   */
+  params?: Record<string, string>;
 }
 
 interface MaestroEnvelope {
@@ -213,6 +221,7 @@ export function createRunActionHandler(deps: RunActionDeps = {}) {
         flowPath: action.filePath,
         platform: args.platform,
         timeoutMs,
+        params: args.params,
       });
       const firstAttemptMs = Date.now() - tBeforeFirst;
       const firstEnv = parseEnvelope(firstResult, 'maestro_run');
@@ -367,6 +376,7 @@ export function createRunActionHandler(deps: RunActionDeps = {}) {
         flowPath: reloadedAction.filePath,
         platform: args.platform,
         timeoutMs,
+        params: args.params,
       });
       const retryMs = Date.now() - tBeforeRetry;
       const retryEnv = parseEnvelope(retryResult, 'maestro_run');
diff --git a/scripts/cdp-bridge/test/unit/gh-116-run-action-params.test.js b/scripts/cdp-bridge/test/unit/gh-116-run-action-params.test.js
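
For reference, the outcome branching in step 6 of the `commands/run-action.md` protocol above could look roughly like this in TypeScript. The field names (`actionId`, `passed`, `durationMs`, `autoRepair.outcome`, `refusedReason`, `diff`, `retryOutput`) follow the response shape documented in that step. The `summarize` function, the trimmed-down `RunActionData` interface, and the wording of the `❌` and `⚠️` lines are hypothetical and only illustrate the control flow; they are not code from this PR.

```typescript
type Outcome = "skipped" | "passed" | "failed" | "refused";

// Trimmed-down view of the `data` field documented in step 6.
interface RunActionData {
  actionId: string;
  passed: boolean;
  durationMs: number;
  autoRepair: {
    attempted: boolean;
    outcome: Outcome;
    refusedReason?: string;
    diff?: string;
  };
  retryOutput?: string;
}

/** Hypothetical helper: map a cdp_run_action result to a short report. */
function summarize(data: RunActionData): string {
  const { actionId, autoRepair } = data;
  switch (autoRepair.outcome) {
    case "skipped":
      // The protocol treats 'skipped' as the happy path. The `passed`
      // check is a defensive assumption of this sketch.
      return data.passed
        ? `✅ ${actionId} passed in ${data.durationMs}ms`
        : `❌ ${actionId} failed (auto-repair skipped)`;
    case "passed": {
      const head = `🩹 ${actionId} failed, repaired, then passed`;
      return autoRepair.diff ? `${head}\n${autoRepair.diff}` : head;
    }
    case "failed":
      // retryOutput carries the trailing maestro output for diagnosis.
      return `❌ ${actionId} still failed after repair\n${data.retryOutput ?? ""}`.trimEnd();
    case "refused":
      // Surface the refused reason verbatim, as the protocol asks.
      return `⚠️ ${actionId}: auto-repair refused (${autoRepair.refusedReason ?? "unknown"})`;
  }
}
```

A caller would pass the `data` field of the returned envelope straight into `summarize` and print the result. Nothing in the sketch edits the flow file.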
