Commit 63e366f

feat: Appguide support (#22)
* initial support for appguide

* feat: polish CLI output — fun spinner verbs, step counter, cleaner logs

  - Replace static "Reasoning…" spinner with randomly rotating fun verbs (Brewing, Cogitating, Pondering, etc.) that change every 2.5s
  - Add step counter to spinner detail: (1/30 · vision · thinking on · model)
  - Move verbose debug output behind MCP_DEBUG=1 flag:
    - Episodic memory status bullets
    - AppGuide injection/active bullets
    - "Pulling UI state" / "Consulting agent" bullets
    - LLM reasoning text (streaming and static)
  - Remove misleading static 0/30 progress bar from goal box

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: AppGuide planner integration, long press support, vision and flow improvements

  - Thread AppGuide through planner and orchestrator for app-aware goal decomposition
  - Add find_and_long_press meta-tool with vision and DOM mode support
  - Migrate appium_click calls to appium_gesture for consistency
  - Improve vision coordinate scaling with async screen size fetch
  - Add natural language long_press step parsing in YAML flows
  - Enhance preprocessor with appId tracking for AppGuide
  - Update prompts with AppGuide context injection
  - Various fixes across MCP client, device session, and flow execution

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: flows taps

* feat: pre-download WDA in CI before iOS simulator runs

  Downloads prebuilt WebDriverAgentRunner via authenticated GitHub API (5000/hr limit) and sets APPIUM_MCP_WDA_APP_PATH so appium-mcp skips the in-process download entirely. Applied to both root action.yml (marketplace) and github-action/action.yml.

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: boot ios simulator
* fix: iOS sim boot
* fix: build error
* fix: update to latest mcp server
* fix: device picker in CI when flag is already given
* fix: try adding appium-mcp as dependency
* fix: action yml opts
* fix: actions yml for ios
* fix: skip the ios yaml

---------

Co-authored-by: Srinivasan Sekar <srinivasan.sekar1990@gmail.com>
1 parent d0451e3 commit 63e366f

46 files changed

Lines changed: 1505 additions & 624 deletions

.github/workflows/action-test.yml

Lines changed: 4 additions & 4 deletions
@@ -8,7 +8,7 @@ on:
   push:
     branches: [main]
     paths:
-      - 'github-action/**'
+      - 'action.yml'
       - '.github/workflows/action-test.yml'
       - 'flows/**'
   workflow_dispatch:
@@ -39,7 +39,7 @@ jobs:
       - uses: actions/checkout@v4
 
       # Use the local action definition (same repo, same commit)
-      - uses: ./github-action
+      - uses: ./
         id: run
         with:
           flow: ${{ github.event.inputs.flow || 'flows/youtube.yaml' }}
@@ -61,7 +61,7 @@ jobs:
     steps:
       - uses: actions/checkout@v4
 
-      - uses: ./github-action
+      - uses: ./
         id: run
         with:
           goal: 'Open YouTube app and verify the home feed is visible'
@@ -84,7 +84,7 @@ jobs:
     steps:
       - uses: actions/checkout@v4
 
-      - uses: ./github-action
+      - uses: ./
         id: run
         with:
           flow: ${{ github.event.inputs.flow || 'flows/youtube.yaml' }}

.github/workflows/layer3-branch-test.yml

Lines changed: 6 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -30,7 +30,7 @@ jobs:
3030
steps:
3131
- uses: actions/checkout@v4
3232

33-
- uses: ./github-action
33+
- uses: ./
3434
id: run
3535
with:
3636
use-local-build: 'true'
@@ -55,7 +55,7 @@ jobs:
5555
steps:
5656
- uses: actions/checkout@v4
5757

58-
- uses: ./github-action
58+
- uses: ./
5959
id: run
6060
with:
6161
use-local-build: 'true'
@@ -75,17 +75,19 @@ jobs:
7575
ios-flow:
7676
name: iOS — YAML flow
7777
runs-on: macos-14
78-
if: github.event_name == 'pull_request' || (github.event_name == 'workflow_dispatch' && inputs.platform == 'ios')
78+
if: false
7979

8080
steps:
8181
- uses: actions/checkout@v4
8282

83-
- uses: ./github-action
83+
- uses: ./
8484
id: run
8585
with:
8686
use-local-build: 'true'
8787
flow: ${{ inputs.flow || 'flows/wdio.yaml' }}
8888
platform: ios
89+
ios-device-type: simulator
90+
mcp-debug: 'true'
8991
provider: gemini
9092
agent-mode: vision
9193
api-key: ${{ secrets.LLM_API_KEY }}

CHANGELOG.md

Lines changed: 13 additions & 13 deletions
@@ -2,24 +2,24 @@
 
 ### Features
 
-* add action.yml at repo root for GitHub Marketplace publishing ([#20](https://github.com/AppiumTestDistribution/AppClaw/issues/20)) ([c007399](https://github.com/AppiumTestDistribution/AppClaw/commit/c007399fa670273058cd51e65f0fd68323ccb3be))
+- add action.yml at repo root for GitHub Marketplace publishing ([#20](https://github.com/AppiumTestDistribution/AppClaw/issues/20)) ([c007399](https://github.com/AppiumTestDistribution/AppClaw/commit/c007399fa670273058cd51e65f0fd68323ccb3be))
 
 ## 1.0.0 (2026-04-16)
 
 ### Features
 
-* integrate ai-sdk-ollama for LLM support and update configuration ([#9](https://github.com/AppiumTestDistribution/AppClaw/issues/9)) ([c6794d7](https://github.com/AppiumTestDistribution/AppClaw/commit/c6794d718a37ef690c09f5fb006c8994c78e361b))
-* parallel testing support and screen recording for SDK ([#16](https://github.com/AppiumTestDistribution/AppClaw/issues/16)) ([7d14e7b](https://github.com/AppiumTestDistribution/AppClaw/commit/7d14e7b760c41783c61f1227c037e1b28d184a5c))
-* strict playground tap matching, waitUntil pre-check, faster vision assert ([59b8c29](https://github.com/AppiumTestDistribution/AppClaw/commit/59b8c299bf20c9232d89bbbb4d93a9ef600cca2b))
-* vision improvements — drag support, screenshot optimization, an… ([#7](https://github.com/AppiumTestDistribution/AppClaw/issues/7)) ([8cfbcb4](https://github.com/AppiumTestDistribution/AppClaw/commit/8cfbcb483fce0dec531ad8c21c8cd93d5743d62f))
+- integrate ai-sdk-ollama for LLM support and update configuration ([#9](https://github.com/AppiumTestDistribution/AppClaw/issues/9)) ([c6794d7](https://github.com/AppiumTestDistribution/AppClaw/commit/c6794d718a37ef690c09f5fb006c8994c78e361b))
+- parallel testing support and screen recording for SDK ([#16](https://github.com/AppiumTestDistribution/AppClaw/issues/16)) ([7d14e7b](https://github.com/AppiumTestDistribution/AppClaw/commit/7d14e7b760c41783c61f1227c037e1b28d184a5c))
+- strict playground tap matching, waitUntil pre-check, faster vision assert ([59b8c29](https://github.com/AppiumTestDistribution/AppClaw/commit/59b8c299bf20c9232d89bbbb4d93a9ef600cca2b))
+- vision improvements — drag support, screenshot optimization, an… ([#7](https://github.com/AppiumTestDistribution/AppClaw/issues/7)) ([8cfbcb4](https://github.com/AppiumTestDistribution/AppClaw/commit/8cfbcb483fce0dec531ad8c21c8cd93d5743d62f))
 
 ### Bug Fixes
 
-* add semantic-release for automated versioning and npm publishing ([#19](https://github.com/AppiumTestDistribution/AppClaw/issues/19)) ([66c73a6](https://github.com/AppiumTestDistribution/AppClaw/commit/66c73a677e763112c4fab80dd29301f3d2071532))
-* ci ([#10](https://github.com/AppiumTestDistribution/AppClaw/issues/10)) ([dfcd62f](https://github.com/AppiumTestDistribution/AppClaw/commit/dfcd62fa083d673c98fc0c381820c7dd58d36818))
-* DOM locator resolution, vision assert parsing, and appium-mcp coordinate scaling ([9272c36](https://github.com/AppiumTestDistribution/AppClaw/commit/9272c36b65e7bd996b730bb6d67d0fa6fee9518a))
-* read CLI version from package.json instead of hardcoded string ([#14](https://github.com/AppiumTestDistribution/AppClaw/issues/14)) ([fcb3a64](https://github.com/AppiumTestDistribution/AppClaw/commit/fcb3a6417ddc48d72d246bc9fd5dd1438020635d))
-* screenshot parsing ([e449a23](https://github.com/AppiumTestDistribution/AppClaw/commit/e449a2341fc67e193f1519bae16d4cace878bcfc))
-* scroll-aware stuck detection, press_enter tool, and post-done verification ([c03bbe4](https://github.com/AppiumTestDistribution/AppClaw/commit/c03bbe4222ce7fd7bba6867f7d1e59ac5ef3c8ee))
-* terminal UI ([294a780](https://github.com/AppiumTestDistribution/AppClaw/commit/294a780113d8afdb99b80cf57b47db5b3fe12dc2))
-* terminal view ([42c0e75](https://github.com/AppiumTestDistribution/AppClaw/commit/42c0e75e2d8a28c569b6511891628c1b98380cc3))
+- add semantic-release for automated versioning and npm publishing ([#19](https://github.com/AppiumTestDistribution/AppClaw/issues/19)) ([66c73a6](https://github.com/AppiumTestDistribution/AppClaw/commit/66c73a677e763112c4fab80dd29301f3d2071532))
+- ci ([#10](https://github.com/AppiumTestDistribution/AppClaw/issues/10)) ([dfcd62f](https://github.com/AppiumTestDistribution/AppClaw/commit/dfcd62fa083d673c98fc0c381820c7dd58d36818))
+- DOM locator resolution, vision assert parsing, and appium-mcp coordinate scaling ([9272c36](https://github.com/AppiumTestDistribution/AppClaw/commit/9272c36b65e7bd996b730bb6d67d0fa6fee9518a))
+- read CLI version from package.json instead of hardcoded string ([#14](https://github.com/AppiumTestDistribution/AppClaw/issues/14)) ([fcb3a64](https://github.com/AppiumTestDistribution/AppClaw/commit/fcb3a6417ddc48d72d246bc9fd5dd1438020635d))
+- screenshot parsing ([e449a23](https://github.com/AppiumTestDistribution/AppClaw/commit/e449a2341fc67e193f1519bae16d4cace878bcfc))
+- scroll-aware stuck detection, press_enter tool, and post-done verification ([c03bbe4](https://github.com/AppiumTestDistribution/AppClaw/commit/c03bbe4222ce7fd7bba6867f7d1e59ac5ef3c8ee))
+- terminal UI ([294a780](https://github.com/AppiumTestDistribution/AppClaw/commit/294a780113d8afdb99b80cf57b47db5b3fe12dc2))
+- terminal view ([42c0e75](https://github.com/AppiumTestDistribution/AppClaw/commit/42c0e75e2d8a28c569b6511891628c1b98380cc3))

action.yml

Lines changed: 130 additions & 8 deletions
@@ -52,6 +52,40 @@ inputs:
     required: false
     default: '500'
 
+  # ── Debug ────────────────────────────────────────────────────────────────────
+  mcp-debug:
+    description: 'Enable MCP debug logging (MCP_DEBUG=1). Default: false'
+    required: false
+    default: 'false'
+  mcp-timeout-ms:
+    description: 'MCP request timeout in milliseconds. Default: 300000'
+    required: false
+    default: '300000'
+  llm-thinking:
+    description: 'Enable LLM extended thinking: on or off. Default: off'
+    required: false
+    default: 'off'
+
+  # ── iOS device ───────────────────────────────────────────────────────────────
+  ios-device-type:
+    description: 'iOS device type: simulator or real. Default: simulator'
+    required: false
+    default: 'simulator'
+
+  # ── iOS simulator ────────────────────────────────────────────────────────────
+  device-udid:
+    description: 'Explicit device/simulator UDID to target. Leave empty to let AppClaw auto-detect.'
+    required: false
+    default: ''
+  ios-simulator-name:
+    description: 'iOS simulator device model to boot (e.g. "iPhone 16", "iPhone 15 Pro"). Default: iPhone 16'
+    required: false
+    default: 'iPhone 16'
+  ios-simulator-os:
+    description: 'iOS version to use when multiple runtimes are available (e.g. "18.4", "17.5"). Default: latest available'
+    required: false
+    default: ''
+
   # ── Android emulator ─────────────────────────────────────────────────────────
   android-api-level:
     description: 'Android emulator API level. Default: 33 (Android 13)'
@@ -191,7 +225,7 @@ runs:
         LLM_PROVIDER: ${{ inputs.provider }}
         LLM_API_KEY: ${{ inputs.api-key }}
         LLM_MODEL: ${{ inputs.model }}
-        LLM_THINKING: 'off'
+        LLM_THINKING: ${{ inputs.llm-thinking }}
         AGENT_MODE: ${{ inputs.agent-mode }}
         MAX_STEPS: ${{ inputs.max-steps }}
         STEP_DELAY: ${{ inputs.step-delay }}
@@ -212,7 +246,7 @@ runs:
         LLM_PROVIDER: ${{ inputs.provider }}
         LLM_API_KEY: ${{ inputs.api-key }}
         LLM_MODEL: ${{ inputs.model }}
-        LLM_THINKING: 'off'
+        LLM_THINKING: ${{ inputs.llm-thinking }}
         AGENT_MODE: ${{ inputs.agent-mode }}
         MAX_STEPS: ${{ inputs.max-steps }}
         STEP_DELAY: ${{ inputs.step-delay }}
@@ -243,7 +277,7 @@ runs:
         LLM_PROVIDER: ${{ inputs.provider }}
         LLM_API_KEY: ${{ inputs.api-key }}
         LLM_MODEL: ${{ inputs.model }}
-        LLM_THINKING: 'off'
+        LLM_THINKING: ${{ inputs.llm-thinking }}
         AGENT_MODE: ${{ inputs.agent-mode }}
         MAX_STEPS: ${{ inputs.max-steps }}
         STEP_DELAY: ${{ inputs.step-delay }}
@@ -265,7 +299,7 @@ runs:
         LLM_PROVIDER: ${{ inputs.provider }}
         LLM_API_KEY: ${{ inputs.api-key }}
         LLM_MODEL: ${{ inputs.model }}
-        LLM_THINKING: 'off'
+        LLM_THINKING: ${{ inputs.llm-thinking }}
         AGENT_MODE: ${{ inputs.agent-mode }}
         MAX_STEPS: ${{ inputs.max-steps }}
         STEP_DELAY: ${{ inputs.step-delay }}
@@ -279,6 +313,88 @@ runs:
           disable-animations: true
           script: appclaw "${{ inputs.goal }}" --platform android
 
+    # ── iOS — pre-download WebDriverAgent ────────────────────────────────────
+    - name: Download prebuilt WebDriverAgent for iOS simulator
+      if: inputs.platform == 'ios' && inputs.cloud-provider == ''
+      shell: bash
+      env:
+        GH_TOKEN: ${{ github.token }}
+      run: |
+        # Resolve latest WDA version via GitHub API (authenticated = 5000/hr, no rate-limit risk)
+        WDA_VERSION=$(curl -fsSL \
+          -H "Authorization: Bearer ${GH_TOKEN}" \
+          -H "Accept: application/vnd.github+json" \
+          "https://api.github.com/repos/appium/WebDriverAgent/releases/latest" \
+          | python3 -c "import sys,json; print(json.load(sys.stdin)['tag_name'].lstrip('v'))")
+
+        if [ -z "$WDA_VERSION" ]; then
+          echo "::error::Could not resolve latest WDA version from GitHub"
+          exit 1
+        fi
+
+        ARCH=$(uname -m) # arm64 on macos-14 (Apple Silicon), x86_64 otherwise
+        URL="https://github.com/appium/WebDriverAgent/releases/download/v${WDA_VERSION}/WebDriverAgentRunner-Build-Sim-${ARCH}.zip"
+
+        echo "Downloading prebuilt WDA v${WDA_VERSION} for ${ARCH}..."
+        curl -fsSL "${URL}" -o /tmp/wda.zip
+        unzip -q /tmp/wda.zip -d /tmp/wda
+
+        WDA_APP="/tmp/wda/WebDriverAgentRunner-Runner.app"
+        if [ ! -d "$WDA_APP" ]; then
+          echo "::error::WebDriverAgentRunner-Runner.app not found after extraction"
+          ls -la /tmp/wda/
+          exit 1
+        fi
+
+        echo "APPIUM_MCP_WDA_APP_PATH=${WDA_APP}" >> $GITHUB_ENV
+        echo "WDA pre-downloaded: ${WDA_APP}"
+
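Reviewer sketch (not part of the commit): the URL construction in the step above can be read as a small pure function. The following Python illustration assumes the release asset naming shown in the diff (`WebDriverAgentRunner-Build-Sim-<arch>.zip`); the function name `wda_download_url` and the tag `v9.0.0` are hypothetical and used only for demonstration.

```python
def wda_download_url(tag_name: str, arch: str) -> str:
    """Build the prebuilt simulator WDA asset URL for a release tag.

    Mirrors the shell step: strip leading 'v' characters from the tag,
    then re-add a single 'v' in the download path.
    """
    version = tag_name.lstrip("v")  # same normalisation as the python3 -c one-liner
    return (
        "https://github.com/appium/WebDriverAgent/releases/download/"
        f"v{version}/WebDriverAgentRunner-Build-Sim-{arch}.zip"
    )


# Hypothetical tag and architecture, for illustration only.
print(wda_download_url("v9.0.0", "arm64"))
```

Because the tag is normalised first, a tag with or without the leading `v` yields the same URL, which is presumably why the step strips it before interpolating.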
+    # ── iOS — boot simulator ─────────────────────────────────────────────────
+    - name: Boot iOS simulator
+      if: inputs.platform == 'ios' && inputs.cloud-provider == '' && inputs.ios-device-type == 'simulator'
+      shell: bash
+      env:
+        SIM_NAME: ${{ inputs.ios-simulator-name }}
+        SIM_OS: ${{ inputs.ios-simulator-os }}
+      run: |
+        xcrun simctl list devices available -j > /tmp/simctl_devices.json
+
+        UDID=$(python3 <<'EOF'
+        import json, os, re, sys
+        sim_name = os.environ.get('SIM_NAME', 'iPhone 16').lower()
+        sim_os = os.environ.get('SIM_OS', '').strip()
+        data = json.load(open('/tmp/simctl_devices.json'))
+        candidates = []
+        for runtime, devs in data['devices'].items():
+            if 'iOS' not in runtime:
+                continue
+            # Extract version from runtime key, e.g. "com.apple.CoreSimulator.SimRuntime.iOS-18-4" → "18.4"
+            m = re.search(r'iOS[- ]([\d][\d.-]+)', runtime, re.IGNORECASE)
+            ver = m.group(1).replace('-', '.') if m else ''
+            if sim_os and not ver.startswith(sim_os):
+                continue
+            for d in devs:
+                if d.get('isAvailable') and sim_name in d.get('name', '').lower():
+                    candidates.append((ver, d['udid']))
+        if not candidates:
+            sys.exit(1)
+        # Pick highest iOS version
+        candidates.sort(key=lambda x: [int(p) for p in x[0].split('.') if p.isdigit()], reverse=True)
+        print(candidates[0][1])
+        EOF
+        )
+
+        if [ -z "$UDID" ]; then
+          echo "::error::No available iOS simulator matching name='${SIM_NAME}' os='${SIM_OS}'"
+          xcrun simctl list devices available
+          exit 1
+        fi
+
+        echo "Booting simulator $UDID (${SIM_NAME})"
+        xcrun simctl boot "$UDID" 2>/dev/null || true # already Booted is OK
+        xcrun simctl bootstatus "$UDID" -b # block until fully booted
+        echo "IOS_SIMULATOR_UDID=$UDID" >> "$GITHUB_ENV"
+
     # ── iOS — YAML flow ───────────────────────────────────────────────────────
     - name: Run YAML flow on iOS simulator
       if: inputs.platform == 'ios' && inputs.cloud-provider == '' && inputs.flow != ''
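Reviewer sketch (not part of the commit): the simulator-selection heredoc in the "Boot iOS simulator" step can be exercised offline by lifting it into a function that takes the parsed `xcrun simctl list devices available -j` payload. The function name `pick_simulator`, the `sample` data, and the short UDIDs below are hypothetical; the matching and sorting logic follows the diff.

```python
import re


def pick_simulator(devices_by_runtime, sim_name="iPhone 16", sim_os=""):
    """Return the UDID of the newest matching iOS simulator, or None."""
    wanted = sim_name.lower()
    candidates = []
    for runtime, devs in devices_by_runtime.items():
        if "iOS" not in runtime:
            continue  # skip tvOS, watchOS, etc.
        # "com.apple.CoreSimulator.SimRuntime.iOS-18-4" -> "18.4"
        m = re.search(r"iOS[- ]([\d][\d.-]+)", runtime, re.IGNORECASE)
        ver = m.group(1).replace("-", ".") if m else ""
        if sim_os and not ver.startswith(sim_os):
            continue
        for d in devs:
            # Substring match, as in the diff: "iPhone 16" also matches "iPhone 16 Pro".
            if d.get("isAvailable") and wanted in d.get("name", "").lower():
                candidates.append((ver, d["udid"]))
    if not candidates:
        return None
    # Highest iOS version first; ties keep their listing order (stable sort).
    candidates.sort(
        key=lambda c: [int(p) for p in c[0].split(".") if p.isdigit()],
        reverse=True,
    )
    return candidates[0][1]


# Hypothetical payload shaped like the "devices" object from simctl.
sample = {
    "com.apple.CoreSimulator.SimRuntime.iOS-17-5": [
        {"name": "iPhone 16", "udid": "AAAA", "isAvailable": True},
    ],
    "com.apple.CoreSimulator.SimRuntime.iOS-18-4": [
        {"name": "iPhone 16", "udid": "BBBB", "isAvailable": True},
        {"name": "iPhone 16 Pro", "udid": "CCCC", "isAvailable": True},
    ],
    "com.apple.CoreSimulator.SimRuntime.tvOS-18-4": [
        {"name": "Apple TV", "udid": "DDDD", "isAvailable": True},
    ],
}

print(pick_simulator(sample))               # newest runtime wins
print(pick_simulator(sample, sim_os="17"))  # restricted to 17.x
```

Two behaviours are worth noting when reviewing: the name filter is a substring match, so within one runtime the first listed device that contains the requested name is chosen, and the OS filter is a prefix match, so `"18"` accepts any 18.x runtime.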
@@ -287,12 +403,15 @@ runs:
         LLM_PROVIDER: ${{ inputs.provider }}
         LLM_API_KEY: ${{ inputs.api-key }}
         LLM_MODEL: ${{ inputs.model }}
-        LLM_THINKING: 'off'
+        LLM_THINKING: ${{ inputs.llm-thinking }}
         AGENT_MODE: ${{ inputs.agent-mode }}
         MAX_STEPS: ${{ inputs.max-steps }}
         STEP_DELAY: ${{ inputs.step-delay }}
         PLATFORM: ios
-        DEVICE_TYPE: simulator
+        DEVICE_TYPE: ${{ inputs.ios-device-type }}
+        DEVICE_UDID: ${{ inputs.device-udid || env.IOS_SIMULATOR_UDID }}
+        MCP_DEBUG: ${{ inputs.mcp-debug == 'true' && '1' || '0' }}
+        MCP_TIMEOUT_MS: ${{ inputs.mcp-timeout-ms }}
       run: appclaw --flow "${{ inputs.flow }}" --platform ios
 
     # ── iOS — natural language goal ───────────────────────────────────────────
@@ -303,12 +422,15 @@ runs:
         LLM_PROVIDER: ${{ inputs.provider }}
         LLM_API_KEY: ${{ inputs.api-key }}
         LLM_MODEL: ${{ inputs.model }}
-        LLM_THINKING: 'off'
+        LLM_THINKING: ${{ inputs.llm-thinking }}
         AGENT_MODE: ${{ inputs.agent-mode }}
         MAX_STEPS: ${{ inputs.max-steps }}
         STEP_DELAY: ${{ inputs.step-delay }}
         PLATFORM: ios
-        DEVICE_TYPE: simulator
+        DEVICE_TYPE: ${{ inputs.ios-device-type }}
+        DEVICE_UDID: ${{ inputs.device-udid || env.IOS_SIMULATOR_UDID }}
+        MCP_DEBUG: ${{ inputs.mcp-debug == 'true' && '1' || '0' }}
+        MCP_TIMEOUT_MS: ${{ inputs.mcp-timeout-ms }}
       run: appclaw "${{ inputs.goal }}" --platform ios
 
     # ── Report ────────────────────────────────────────────────────────────────
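Reviewer sketch (not part of the commit): the new iOS environment entries rely on two GitHub Actions expression idioms, `a || b` as a fallback when `a` is empty and `cond && '1' || '0'` as a ternary. The Python model below shows how those entries would resolve under that reading; the function `resolve_ios_env` and its dictionary arguments are hypothetical stand-ins for the `inputs` and `env` contexts.

```python
def resolve_ios_env(inputs: dict, env: dict) -> dict:
    """Model how the iOS steps derive their environment from inputs."""
    return {
        "DEVICE_TYPE": inputs.get("ios-device-type", "simulator"),
        # inputs.device-udid || env.IOS_SIMULATOR_UDID:
        # an empty string is falsy, so the booted simulator's UDID is used.
        "DEVICE_UDID": inputs.get("device-udid") or env.get("IOS_SIMULATOR_UDID", ""),
        # inputs.mcp-debug == 'true' && '1' || '0':
        # GitHub compares strings case-insensitively, hence lower().
        "MCP_DEBUG": "1" if str(inputs.get("mcp-debug", "false")).lower() == "true" else "0",
        "MCP_TIMEOUT_MS": inputs.get("mcp-timeout-ms", "300000"),
    }


# No explicit UDID: fall back to the one exported by the boot step.
auto = resolve_ios_env({"mcp-debug": "true"}, {"IOS_SIMULATOR_UDID": "BBBB"})
print(auto["DEVICE_UDID"], auto["MCP_DEBUG"])

# Explicit UDID wins over the environment value.
pinned = resolve_ios_env({"device-udid": "CCCC"}, {"IOS_SIMULATOR_UDID": "BBBB"})
print(pinned["DEVICE_UDID"], pinned["MCP_DEBUG"])
```

One consequence visible in the model: with `ios-device-type: real` the boot step is skipped, so `IOS_SIMULATOR_UDID` is never set and `DEVICE_UDID` is empty unless `device-udid` is supplied.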
