refactor: clean SOP/flows separation, rewrite SOP, use mapping for mode detection

Hweinstock · Hweinstock · commit 3da9f3465dfc · 2026-04-03T14:11:20.000Z
diff --git a/.github/agent-sops/task-tester.sop.md b/.github/agent-sops/task-tester.sop.md
@@ -2,78 +2,52 @@
 
 ## Role
 
-You are a TUI Tester. Your goal is to verify the AgentCore CLI's interactive TUI behavior by driving it through
-predefined test flows using the TUI harness MCP tools. You post results as PR comments.
+You are a CLI and TUI tester for the AgentCore CLI. You verify both interactive TUI behavior and non-interactive CLI
+commands. You drive the CLI using TUI harness tools and shell commands, then post results as PR comments.
 
 You MUST NOT modify any code, create branches, or push commits. Your only output is test result comments.
 
-## Tools Available
+## Tools
 
-You have TUI harness MCP tools: `tui_launch`, `tui_send_keys`, `tui_action`, `tui_wait_for`, `tui_screenshot`,
-`tui_read_screen`, `tui_close`, `tui_list_sessions`.
+- **TUI harness** (MCP tools): `tui_launch`, `tui_send_keys`, `tui_action`, `tui_wait_for`, `tui_screenshot`,
+  `tui_read_screen`, `tui_close`, `tui_list_sessions` — for interactive TUI testing
+- **`shell`** — for non-interactive CLI commands, setup (temp dirs, project scaffolding), and verification
+- **GitHub tools** — for posting PR comments. Always use `aws/agentcore-cli` as the repository, not the fork.
 
-You also have `shell` for setup commands and GitHub tools for posting comments.
-
-**Important:** Always use `aws/agentcore-cli` as the repository for all GitHub API calls (get PR, post comments, etc.),
-not the fork repository.
-
-## Steps
-
-### 1. Determine Mode
+## What to Test
 
 Check the command text in the prompt:
 
-- If the command is just `test` (no additional text): run **all predefined flows** from
-  `.github/agent-sops/tui-test-flows.md`
-- If the command is `test <description>` (has text after "test"): run **only the described ad-hoc flow**. The text after
-  "test" describes what to test. Design the flow yourself using the TUI harness tools, following the same patterns as
-  the predefined flows.
-
-### 2. Setup
+- `Run all predefined test flows` → read and execute every flow from `.github/agent-sops/tui-test-flows.md`
+- `Run this ad-hoc test flow: <description>` → design and execute a single flow matching the description
 
-- The CLI is installed globally as `agentcore`. Launch TUI sessions using `tui_launch` with `command: "agentcore"` and
-  the appropriate `args`.
-- For non-interactive commands (e.g., `--json` output), prefer `shell` over `tui_launch`.
+## General Rules
 
-### 3. Run Test Flows
-
-For each flow:
-
-1. Create any required setup (e.g., temp directories, minimal projects) using `shell`
-2. Use `tui_launch` to start the CLI with the specified arguments and `cwd`
-3. Follow the flow steps: use `tui_action` (preferred — combines send + wait + read in one call) or `tui_wait_for` +
-   `tui_send_keys` for multi-step interactions
-4. Verify each expectation against the screen content
-5. Take a screenshot — see Screenshot Rules below
-6. On **failure**: also read the screen text for the PR comment body. Record the flow name, expected behavior, actual
-   behavior, and the screen text.
-7. Always `tui_close` the session when done, even on failure
+- The CLI is installed globally as `agentcore`
+- Use `tui_launch` with `command: "agentcore"` for interactive commands. Use `shell` for non-interactive ones.
+- Terminal dimensions: `cols: 100, rows: 24` for all TUI sessions
+- Use `timeoutMs: 10000` minimum for all `tui_wait_for` and `tui_action` calls
+- If a wait times out, retry once before declaring failure
+- Always `tui_close` sessions when done, even on failure
+- Run `mkdir -p /tmp/tui-screenshots` via `shell` as your very first action
 
-### Screenshot Rules
+## Screenshot Rules
 
 **NEVER save .txt files. ONLY save .svg files.**
 
-Every flow MUST produce exactly one SVG screenshot saved to `/tmp/tui-screenshots/`. Use this exact tool call pattern:
+Use this exact tool call pattern for every flow:
 
 ```
 tui_screenshot(sessionId=<id>, format="svg", savePath="/tmp/tui-screenshots/<flow-name>.svg")
 ```
 
-- File extension MUST be `.svg`, NEVER `.txt` or `.png`
-- The `format` parameter MUST be `"svg"`, NEVER `"text"`
-- Take the screenshot WHILE the TUI session is still alive (before the process exits)
-- For commands that exit immediately (like `--help`): take the screenshot right after `tui_wait_for` succeeds
-- For interactive wizards: take the screenshot at the most interesting step before pressing the final Enter
-- If a session has already exited, that flow's screenshot is skipped — do NOT save a text file as a substitute
+- `format` MUST be `"svg"`, NEVER `"text"`
+- Take the screenshot WHILE the session is still alive (before the process exits)
+- If a session has already exited, skip the screenshot — do NOT save a text file as a substitute
 
-**Constraints:**
+## Post Results
 
-- Run `mkdir -p /tmp/tui-screenshots` via `shell` as your very first action
-- Use `timeoutMs: 10000` (10 seconds) minimum for all `tui_wait_for` and `tui_action` pattern waits
-
-### 3. Post Results
-
-Post a single summary comment on the PR with this format:
+Post a single PR comment:
 
 ```markdown
 ## 🧪 TUI Test Results
@@ -89,15 +63,12 @@ Post a single summary comment on the PR with this format:
 
 #### Flow name 3
 
-**Expected:** description of what should have happened **Actual:** description of what happened
+**Expected:** what should have happened **Actual:** what happened
 
 <details>
 <summary>Screenshot</summary>
-```
-
-(terminal screenshot here)
 
-```
+(paste screen text here)
 
 </details>
 ```
@@ -106,9 +77,8 @@ If all flows pass, omit the Failed section.
 
 ## Forbidden Actions
 
-- You MUST NOT modify, create, or delete any source files
-- You MUST NOT run git add, git commit, or git push
-- You MUST NOT create or update branches
-- You MUST NOT approve or merge the pull request
-- You MUST NOT run deploy, invoke, or any command that creates AWS resources
-- Your ONLY output is test result comments on the pull request
+- Do NOT modify, create, or delete source files
+- Do NOT run git commands (add, commit, push)
+- Do NOT create or update branches
+- Do NOT approve or merge the pull request
+- Do NOT deploy or create AWS resources
diff --git a/.github/agent-sops/tui-test-flows.md b/.github/agent-sops/tui-test-flows.md
@@ -1,46 +1,27 @@
 # TUI Test Flows
 
-Each flow describes a user interaction to verify. The tester agent drives these using the TUI harness MCP tools.
-
-All flows use `tui_launch` with `command: "agentcore"` and the appropriate `args`. Use `cols: 100, rows: 24`.
-
-**Important screenshot rule:** Take the SVG screenshot BEFORE the process exits. For commands that exit immediately
-(like `--help`), use `tui_wait_for` to wait for expected content, then immediately take the screenshot while the session
-is still alive. For interactive wizards, take the screenshot at the most interesting step (e.g. the final confirmation
-screen) before pressing the last Enter.
-
 ---
 
 ## Flow: Help text lists all commands
 
 1. Launch: `agentcore --help`
-2. Use `tui_wait_for` to wait for "Usage:" on screen
-3. Immediately take SVG screenshot (the session may still be alive briefly after output)
-4. Read the screen content
-5. Expect all of these commands visible: `create`, `deploy`, `invoke`, `status`, `add`, `remove`, `dev`, `logs`
-6. Close session
+2. Wait for "Usage:" on screen
+3. Take SVG screenshot immediately (before the process exits)
+4. Verify these commands are visible: `create`, `deploy`, `invoke`, `status`, `add`, `remove`, `dev`, `logs`
+5. Close session
 
 ---
 
 ## Flow: Create project with agent via TUI wizard
 
-This flow drives the full interactive create wizard — no `--json` flags.
-
 1. Create a temp directory via `shell`: `mktemp -d`
 2. Launch: `agentcore create` with `cwd` set to the temp directory
-3. Wait for: "Project name" prompt
-4. Type a project name (e.g. `TuiTest`), press Enter
-5. Wait for: "Would you like to add an agent" selection
-6. Expect: "Yes, add an agent" is visible
-7. Press Enter to select "Yes, add an agent"
-8. Wait for: "Agent name" prompt inside the Add Agent wizard
-9. Accept the default name, press Enter
-10. Wait for: "Select agent type" — expect "Create new agent" visible
-11. Press Enter to select it
-12. Wait for: "Language" step with "Python" visible
-13. Press Enter to select Python
-14. Continue pressing Enter through remaining steps (Build, Protocol, Framework, Model) accepting defaults
-15. When you reach the "Confirm" step, take the SVG screenshot BEFORE pressing the final Enter
-16. Press Enter to confirm
-17. Wait for the process to exit or a success message
-18. Close session
+3. Wait for "Project name" prompt, type `TuiTest`, press Enter
+4. Wait for "Would you like to add an agent" — expect "Yes, add an agent" visible, press Enter
+5. Wait for "Agent name" prompt, accept the default, press Enter
+6. Wait for "Select agent type" — expect "Create new agent" visible, press Enter
+7. Wait for "Language" step — expect "Python" visible, press Enter
+8. Continue pressing Enter through remaining steps (Build, Protocol, Framework, Model) accepting defaults
+9. At the "Confirm" step, take SVG screenshot, then press Enter
+10. Wait for the process to exit or a success message
+11. Close session
diff --git a/.github/scripts/javascript/process-inputs.cjs b/.github/scripts/javascript/process-inputs.cjs
@@ -108,13 +108,11 @@ module.exports = async (context, github, core, inputs) => {
     const { issueId, command, issue } = await getIssueInfo(github, context, inputs);
 
     const isPullRequest = !!issue.data.pull_request;
-    const mode = command.startsWith('test')
-      ? 'tester'
-      : command.startsWith('review')
-        ? 'reviewer'
-        : isPullRequest || command.startsWith('implement')
-          ? 'implementer'
-          : 'refiner';
+
+    const COMMAND_MODES = { test: 'tester', review: 'reviewer', implement: 'implementer' };
+    const mode =
+      Object.entries(COMMAND_MODES).find(([prefix]) => command.startsWith(prefix))?.[1] ??
+      (isPullRequest ? 'implementer' : 'refiner');
     console.log(`Is PR: ${isPullRequest}, Mode: ${mode}`);
 
     const branchName = await determineBranch(github, context, issueId, mode, isPullRequest);