Commit 359de57

rtibbles and claude committed
Add agentic tooling for emulator interaction
- AGENTS.md: Guide for Claude agents to autonomously manage the emulator, build/install the app, and interact with the UI
- scripts/cdp_helper.py: Chrome DevTools Protocol helper to inspect and click WebView DOM elements (invisible to uiautomator)
- .claude/commands/screenshot.md: /project:screenshot command for the visual inspect-act loop
- CLAUDE.md: Wire in AGENTS.md via @import

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 287956e commit 359de57

21 files changed

Lines changed: 1152 additions & 206 deletions

.claude/commands/screenshot.md

Lines changed: 87 additions & 0 deletions
@@ -0,0 +1,87 @@
You are interacting with an Android emulator running Kolibri, a WebView-based app. Your goal is to inspect the current screen state and, if needed, interact with the UI to accomplish the user's task.

## Step 1: Capture the screen

```bash
mkdir -p /tmp/claude
adb exec-out screencap -p > /tmp/claude/screenshot.png
```

Then read the screenshot image at `/tmp/claude/screenshot.png` to see what's on screen.

## Step 2: Inspect the UI

Kolibri is a WebView app. There are two separate tools for inspecting the UI, and you need to use the right one:

### For WebView content (Kolibri UI: buttons, text, forms, navigation):
```bash
python3 scripts/cdp_helper.py dump
```
This uses Chrome DevTools Protocol to list all visible DOM elements with their `text`, `id`, `classes`, and `role`. This is what Maestro sees when using `androidWebViewHierarchy: devtools`.
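
For orientation, a debuggable Android WebView typically exposes its DevTools endpoint on an abstract Unix socket named `webview_devtools_remote_<pid>`, which `adb forward` can map to a local TCP port. The sketch below is not the contents of `scripts/cdp_helper.py`; the function names, port 9222, and the sample input are assumptions made for illustration only.

```python
from __future__ import annotations

import re

# Assumption: the socket follows the usual WebView naming convention
# "webview_devtools_remote_<pid>"; the real helper may discover it differently.
SOCKET_RE = re.compile(r"@(webview_devtools_remote_\d+)")


def find_devtools_sockets(proc_net_unix: str) -> list[str]:
    """Return unique WebView DevTools socket names found in /proc/net/unix output."""
    seen: list[str] = []
    for name in SOCKET_RE.findall(proc_net_unix):
        if name not in seen:
            seen.append(name)
    return seen


def forward_command(socket_name: str, port: int = 9222) -> list[str]:
    """Build the adb command that maps the socket to a local TCP port."""
    return ["adb", "forward", f"tcp:{port}", f"localabstract:{socket_name}"]


# Hypothetical sample of what `adb shell cat /proc/net/unix` might contain.
sample = (
    "0000000000000000: 00000002 00000000 00010000 0001 01 31337 "
    "@webview_devtools_remote_4242\n"
    "0000000000000000: 00000002 00000000 00010000 0001 01 31338 @jdwp-control\n"
)
sockets = find_devtools_sockets(sample)
print(forward_command(sockets[0]))
```

Once the port is forwarded, the DevTools HTTP endpoint (for example `http://localhost:9222/json`) lists the inspectable pages together with their WebSocket debugger URLs.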

You can also click WebView elements directly:
```bash
python3 scripts/cdp_helper.py click "CONTINUE"
python3 scripts/cdp_helper.py click "EXPLORE"
```
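
As a rough picture of what a click-by-text could look like at the protocol level, the sketch below builds a CDP `Runtime.evaluate` request whose expression finds an element by its text and clicks it. This is an assumption about one possible approach, not the actual logic of `cdp_helper.py`; the function names and the exact-match, leaf-element rule are hypothetical.

```python
import json


def click_by_text_expression(label: str) -> str:
    """Build JavaScript that clicks the first leaf element whose trimmed text equals `label`."""
    # json.dumps yields a safely quoted JavaScript string literal.
    target = json.dumps(label)
    return (
        "(() => {"
        f" const want = {target};"
        " const el = [...document.querySelectorAll('*')]"
        ".find(e => e.childElementCount === 0 && e.textContent.trim() === want);"
        " if (!el) return false;"
        " el.click();"
        " return true;"
        " })()"
    )


def runtime_evaluate_message(expression: str, msg_id: int = 1) -> str:
    """Wrap an expression in a CDP Runtime.evaluate request (sent over the page WebSocket)."""
    return json.dumps({
        "id": msg_id,
        "method": "Runtime.evaluate",
        "params": {"expression": expression, "returnByValue": True},
    })


message = runtime_evaluate_message(click_by_text_expression("CONTINUE"))
print(message)
```

The message would be sent over the page's WebSocket debugger URL; the reply's result value indicates whether a matching element was found.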

Or run arbitrary JavaScript:
```bash
python3 scripts/cdp_helper.py js "document.title"
```

### For native Android UI (system dialogs, permission prompts, toasts):
```bash
adb shell uiautomator dump /sdcard/window_dump.xml && adb shell cat /sdcard/window_dump.xml
```
Use this when you see a native Android dialog (e.g. "Allow notifications?", permission requests). These are **not** visible to CDP. Parse the XML to find elements by `text` and `bounds`, then tap using `adb shell input tap <x> <y>` with coordinates derived from bounds center.
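
Finding an element by `text` in the dump can be sketched with the standard library's XML parser. The sample XML below is a hypothetical, heavily trimmed dump, and `find_bounds_by_text` is an illustrative name rather than part of the repository.

```python
import xml.etree.ElementTree as ET


def find_bounds_by_text(dump_xml: str, text: str):
    """Return the `bounds` attribute of the first node whose `text` matches, else None."""
    # The chained adb command prints a status line before the XML, so skip to the first tag.
    start = dump_xml.find("<")
    if start == -1:
        return None
    root = ET.fromstring(dump_xml[start:])
    for node in root.iter("node"):
        if node.get("text") == text:
            return node.get("bounds")
    return None


# Hypothetical sample; real dumps carry many more attributes per node.
sample_dump = """status line printed by uiautomator
<hierarchy rotation="0">
  <node text="" class="android.widget.FrameLayout" bounds="[0,0][1080,2400]">
    <node text="Allow" class="android.widget.Button" bounds="[120,1400][960,1540]" />
    <node text="Don't allow" class="android.widget.Button" bounds="[120,1560][960,1700]" />
  </node>
</hierarchy>"""

print(find_bounds_by_text(sample_dump, "Allow"))
```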

**How to tell which tool to use:** If the screenshot shows a system dialog with rounded corners overlaying the app, use uiautomator. For everything else (Kolibri's own UI), use CDP.

## Step 3: Check recent logs (if needed)

```bash
adb logcat -d -t 50
```

For Kolibri-specific logs:
```bash
adb logcat -d -t 50 -s python.stdout:V python.stderr:V KolibriWebView:V KolibriServer:V
```
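
For reference, `-d` dumps the log and exits, `-t 50` limits output to the most recent 50 lines, and `-s` silences every tag that is not listed explicitly. A small sketch of assembling the same command from a tag list follows; `logcat_command` and `KOLIBRI_TAGS` are hypothetical names, not part of the repository.

```python
def logcat_command(tags, lines=50, priority="V"):
    """Build an `adb logcat` argument list that dumps recent lines for the given tags."""
    cmd = ["adb", "logcat", "-d", "-t", str(lines)]
    if tags:
        # -s sets the default filter to silent, so only the listed tags are shown.
        cmd.append("-s")
        cmd.extend(f"{tag}:{priority}" for tag in tags)
    return cmd


# Tag names taken from the command above.
KOLIBRI_TAGS = ["python.stdout", "python.stderr", "KolibriWebView", "KolibriServer"]
print(" ".join(logcat_command(KOLIBRI_TAGS)))
```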

## Step 4: Interact with the UI

### WebView elements (preferred)
Use the CDP helper to click by text, which avoids coordinate math entirely:
```bash
python3 scripts/cdp_helper.py click "Button Text"
```

### Native elements (system dialogs only)
Derive tap coordinates from uiautomator `bounds="[left,top][right,bottom]"`:
- x = (left + right) / 2
- y = (top + bottom) / 2

```bash
adb shell input tap <x> <y>
```
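
The center-of-bounds arithmetic above can be sketched as a small helper. The names `bounds_center` and `tap_command` are illustrative, and the example bounds are made up.

```python
from __future__ import annotations

import re

# Matches the uiautomator format "[left,top][right,bottom]".
BOUNDS_RE = re.compile(r"\[(-?\d+),(-?\d+)\]\[(-?\d+),(-?\d+)\]")


def bounds_center(bounds: str) -> tuple[int, int]:
    """Return the integer center (x, y) of a uiautomator bounds string."""
    match = BOUNDS_RE.fullmatch(bounds.strip())
    if match is None:
        raise ValueError(f"unrecognized bounds: {bounds!r}")
    left, top, right, bottom = map(int, match.groups())
    return (left + right) // 2, (top + bottom) // 2


def tap_command(bounds: str) -> list[str]:
    """Build the adb command that taps the center of the given bounds."""
    x, y = bounds_center(bounds)
    return ["adb", "shell", "input", "tap", str(x), str(y)]


print(tap_command("[120,1400][960,1540]"))
```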

### Other interactions
```bash
adb shell input text "<text>"    # Type (encode spaces as %s)
adb shell input swipe 540 1500 540 500 300    # Scroll down
adb shell input keyevent 4    # BACK
adb shell input keyevent 66    # ENTER
```
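
The space-encoding rule for `input text` can be sketched as follows. Both function names are hypothetical; the extra `shlex.quote` step is an assumption about how one might guard against the device shell interpreting special characters, not something the source prescribes.

```python
import shlex


def encode_for_input_text(text: str) -> str:
    """Replace spaces with %s, which `adb shell input text` types as a space."""
    return text.replace(" ", "%s")


def input_text_command(text: str) -> list[str]:
    """Build the adb command that types `text` into the focused field."""
    # Quote the encoded argument because adb passes it through the device shell.
    return ["adb", "shell", "input", "text", shlex.quote(encode_for_input_text(text))]


print(input_text_command("hello world"))
```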

## Step 5: Verify

After every interaction, take another screenshot and read it to confirm the action had the intended effect. Repeat the inspect-act loop until the task is complete.
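
The inspect-act loop can be pictured as a bounded loop over three callables. This is only a schematic with hypothetical names (`capture`, `act`, `is_done`, `max_steps`); in practice the capture step is the screenshot plus CDP or uiautomator inspection, and the decision about what to do next is made by the agent rather than by code.

```python
from typing import Callable


def inspect_act_loop(capture: Callable[[], str],
                     act: Callable[[str], None],
                     is_done: Callable[[str], bool],
                     max_steps: int = 10) -> bool:
    """Capture state, stop if done, otherwise act and capture again; True on success."""
    for _ in range(max_steps):
        state = capture()
        if is_done(state):
            return True
        act(state)
    # Out of steps: check the final state once more before giving up.
    return is_done(capture())


# Fake screens stand in for real screenshots in this demonstration.
screens = iter(["welcome", "setup", "home"])
actions = []
result = inspect_act_loop(
    capture=lambda: next(screens),
    act=actions.append,
    is_done=lambda state: state == "home",
)
print(result, actions)
```

Bounding the number of steps keeps a misread screen from turning into an endless retry loop.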

## Workflow summary

1. Screenshot -> Read image -> Inspect (CDP for WebView, uiautomator for native) -> Understand state
2. Decide action -> Click via CDP or tap via adb -> Screenshot again -> Verify
3. Repeat until done

Always read the screenshot image visually: the CDP dump shows text content but not layout, and uiautomator cannot see inside the WebView.

.pre-commit-config.yaml

Lines changed: 6 additions & 15 deletions
@@ -1,28 +1,19 @@
 repos:
-- repo: https://github.com/python/black
-  rev: 22.3.0
+- repo: https://github.com/astral-sh/ruff-pre-commit
+  rev: v0.9.6
   hooks:
-  - id: black
-    types_or: [ python, pyi ]
-- repo: https://github.com/pycqa/flake8
-  rev: 4.0.1
-  hooks:
-  - id: flake8
+  - id: ruff
+    args: [--fix]
+  - id: ruff-format
 - repo: https://github.com/pre-commit/pre-commit-hooks
-  rev: v4.1.0
+  rev: v6.0.0
   hooks:
   - id: trailing-whitespace
   - id: check-yaml
     args: ['--allow-multiple-documents']
   - id: check-added-large-files
   - id: debug-statements
   - id: end-of-file-fixer
-
-- repo: https://github.com/asottile/reorder_python_imports
-  rev: v2.6.0
-  hooks:
-  - id: reorder-python-imports
-
 - repo: local
   hooks:
   - id: spotless
