Commit b3ad24f

feat: add Android keyboard status and dismiss command (#144)
1 parent 3c6be3c commit b3ad24f

13 files changed

Lines changed: 508 additions & 1 deletion

README.md

Lines changed: 8 additions & 0 deletions
@@ -17,6 +17,7 @@ The project is in early development and considered experimental. Pull requests a
 - Core commands: `open`, `back`, `home`, `app-switcher`, `press`, `long-press`, `focus`, `type`, `fill`, `scroll`, `scrollintoview`, `wait`, `alert`, `screenshot`, `close`, `reinstall`, `push`, `trigger-app-event`.
 - Inspection commands: `snapshot` (accessibility tree), `diff snapshot` (structural baseline diff), `appstate`, `apps`, `devices`.
 - Clipboard commands: `clipboard read`, `clipboard write <text>`.
+- Keyboard commands: `keyboard status|get|dismiss` (Android).
 - Performance command: `perf` (alias: `metrics`) returns a metrics JSON blob for the active session; startup timing is currently sampled.
 - App logs and traffic inspection: `logs path` returns session log metadata; `logs start` / `logs stop` stream app output; `logs clear` truncates session app logs; `logs clear --restart` resets and restarts stream in one step; `logs doctor` checks readiness; `logs mark` writes timeline markers; `network dump` parses recent HTTP(s) entries from session logs.
 - Device tooling: `adb` (Android), `simctl`/`devicectl` (iOS via Xcode).
@@ -152,6 +153,7 @@ agent-device scrollintoview @e42
 - `trace start`, `trace stop`
 - `logs path`, `logs start`, `logs stop`, `logs clear`, `logs clear --restart`, `logs doctor`, `logs mark` (session app log file for grep; iOS simulator + iOS device + Android)
 - `clipboard read`, `clipboard write <text>` (iOS simulator + Android)
+- `keyboard [status|get|dismiss]` (Android emulator/device)
 - `network dump [limit] [summary|headers|body|all]`, `network log ...` (best-effort HTTP(s) parsing from session app log)
 - `settings wifi|airplane|location on|off`
 - `settings appearance light|dark|toggle`
@@ -418,6 +420,12 @@ Clipboard:
 - Supported on Android emulator/device and iOS simulator.
 - iOS physical devices currently return `UNSUPPORTED_OPERATION` for clipboard commands.

+Keyboard:
+- `keyboard status` (or `keyboard get`) reports Android keyboard visibility and best-effort input type classification (`text`, `number`, `email`, `phone`, `password`, `datetime`).
+- `keyboard dismiss` issues Android back keyevent only when keyboard is visible, then verifies hidden state.
+- Works with an active session device or explicit selectors (`--platform`, `--device`, `--udid`, `--serial`).
+- Supported on Android emulator/device.
+
 ## Debug

 - **App logs (token-efficient):** Logging is off by default in normal flows. Enable it on demand when debugging. With an active session, run `logs path` to get path + state metadata (e.g. `<state-dir>/sessions/<session>/app.log`). Run `logs start` to stream app output to that file; use `logs stop` to stop. Run `logs clear` to truncate `app.log` (and remove rotated `app.log.N` files) before a new repro window. Run `logs doctor` for tool/runtime checks and `logs mark "step"` to insert timeline markers. Grep the file when you need to inspect errors (e.g. `grep -n "Error\|Exception" <path>`) instead of pulling full logs into context. Supported on iOS simulator, iOS physical device, and Android.
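
Reviewer note: the README describes a best-effort input type classification, but the classifier itself is not among the files shown here. The sketch below is one plausible way to map an Android `inputType` integer onto the documented labels. The numeric constants mirror `android.text.InputType`; the function name `classifyInputType`, the `KeyboardType` alias, and the `'unknown'` fallback are assumptions made for illustration, not the commit's code.

```typescript
// Constants copied from android.text.InputType.
const TYPE_MASK_CLASS = 0x0000000f;
const TYPE_MASK_VARIATION = 0x00000ff0;
const TYPE_CLASS_TEXT = 0x1;
const TYPE_CLASS_NUMBER = 0x2;
const TYPE_CLASS_PHONE = 0x3;
const TYPE_CLASS_DATETIME = 0x4;
const TEXT_VARIATION_EMAIL_ADDRESS = 0x20;
const TEXT_VARIATION_PASSWORD = 0x80;
const TEXT_VARIATION_VISIBLE_PASSWORD = 0x90;
const TEXT_VARIATION_WEB_EMAIL_ADDRESS = 0xd0;
const TEXT_VARIATION_WEB_PASSWORD = 0xe0;
const NUMBER_VARIATION_PASSWORD = 0x10;

// Labels from the README, plus a hypothetical 'unknown' fallback.
type KeyboardType =
  | 'text' | 'number' | 'email' | 'phone' | 'password' | 'datetime' | 'unknown';

// Hypothetical helper: classify by input class first, then refine by variation.
function classifyInputType(inputType: number): KeyboardType {
  const inputClass = inputType & TYPE_MASK_CLASS;
  const variation = inputType & TYPE_MASK_VARIATION;
  switch (inputClass) {
    case TYPE_CLASS_TEXT:
      if (
        variation === TEXT_VARIATION_PASSWORD ||
        variation === TEXT_VARIATION_VISIBLE_PASSWORD ||
        variation === TEXT_VARIATION_WEB_PASSWORD
      ) {
        return 'password';
      }
      if (
        variation === TEXT_VARIATION_EMAIL_ADDRESS ||
        variation === TEXT_VARIATION_WEB_EMAIL_ADDRESS
      ) {
        return 'email';
      }
      return 'text';
    case TYPE_CLASS_NUMBER:
      return variation === NUMBER_VARIATION_PASSWORD ? 'password' : 'number';
    case TYPE_CLASS_PHONE:
      return 'phone';
    case TYPE_CLASS_DATETIME:
      return 'datetime';
    default:
      return 'unknown';
  }
}
```

Under these assumptions, `classifyInputType(0x81)` (text class with the password variation) yields `'password'`, and flag bits above `0xfff` are ignored because only the class and variation masks are consulted.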

skills/agent-device/SKILL.md

Lines changed: 3 additions & 0 deletions
@@ -138,6 +138,8 @@ agent-device is visible 'id="anchor"'
 agent-device appstate
 agent-device clipboard read
 agent-device clipboard write "token"
+agent-device keyboard status
+agent-device keyboard dismiss
 agent-device perf --json
 agent-device network dump [limit] [summary|headers|body|all]
 agent-device push <bundle|package> <payload.json|inline-json>
@@ -169,6 +171,7 @@ agent-device batch --steps-file /tmp/batch-steps.json --json
 - Use `fill` for clear-then-type semantics; use `type` for focused append typing.
 - iOS `appstate` is session-scoped; Android `appstate` is live foreground state.
 - Clipboard helpers: `clipboard read` / `clipboard write <text>` are supported on Android and iOS simulators; iOS physical devices are not supported yet.
+- Android keyboard helpers: `keyboard status|get|dismiss` report keyboard visibility/type and dismiss via keyevent when visible.
 - `network dump` is best-effort and parses HTTP(s) entries from the session app log file.
 - Biometric settings: iOS simulator supports `settings faceid|touchid <match|nonmatch|enroll|unenroll>`; Android supports `settings fingerprint <match|nonmatch>` where runtime tooling is available.
 - For AndroidTV/tvOS selection, always pair `--target` with `--platform` (`ios`, `android`, or `apple` alias); target-only selection is invalid.
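
Reviewer note: the docs say `keyboard dismiss` sends a back keyevent only when the keyboard is visible and then verifies the hidden state. The body of `dismissAndroidKeyboard` is not shown in this view, so the following is only a minimal sketch of that check, act, verify flow. The `KeyboardDriver` interface, the `dismissKeyboardSketch` name, and the `maxAttempts` retry limit are hypothetical; the result fields (`attempts`, `wasVisible`, `dismissed`, `visible`) are the ones the dispatcher returns.

```typescript
// Hypothetical abstraction over the device; a real driver would shell out to adb.
interface KeyboardDriver {
  isKeyboardVisible(): Promise<boolean>;
  pressBack(): Promise<void>;
}

interface DismissResult {
  attempts: number;    // how many back keyevents were sent
  wasVisible: boolean; // visibility before any keyevent
  dismissed: boolean;  // true only if this call actually hid the keyboard
  visible: boolean;    // visibility after the last check
}

// Sketch: never press back when the keyboard is already hidden, because a
// back keyevent would then navigate the app instead of closing the keyboard.
async function dismissKeyboardSketch(
  driver: KeyboardDriver,
  maxAttempts = 2, // assumed retry budget, not taken from the commit
): Promise<DismissResult> {
  let visible = await driver.isKeyboardVisible();
  const wasVisible = visible;
  let attempts = 0;
  while (visible && attempts < maxAttempts) {
    await driver.pressBack();
    attempts += 1;
    visible = await driver.isKeyboardVisible(); // verify hidden state
  }
  return { attempts, wasVisible, dismissed: wasVisible && !visible, visible };
}
```

The guard on visibility is the design point worth noting: on Android the back key is overloaded, so sending it unconditionally could pop the current screen.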

src/core/__tests__/capabilities.test.ts

Lines changed: 7 additions & 0 deletions
@@ -56,6 +56,12 @@ test('simulator-only iOS commands with Android support reject iOS devices', () =
   }
 });

+test('keyboard command is Android-only', () => {
+  assert.equal(isCommandSupportedOnDevice('keyboard', iosSimulator), false, 'keyboard on iOS sim');
+  assert.equal(isCommandSupportedOnDevice('keyboard', iosDevice), false, 'keyboard on iOS device');
+  assert.equal(isCommandSupportedOnDevice('keyboard', androidDevice), true, 'keyboard on Android');
+});
+
 test('swipe supports iOS simulator, iOS device, and Android', () => {
   assert.equal(isCommandSupportedOnDevice('swipe', iosSimulator), true, 'swipe on iOS sim');
   assert.equal(isCommandSupportedOnDevice('swipe', iosDevice), true, 'swipe on iOS device');
@@ -127,6 +133,7 @@ test('tvOS follows iOS capability matrix by device kind', () => {
   for (const cmd of ['pinch', 'push', 'settings', 'alert']) {
     assert.equal(isCommandSupportedOnDevice(cmd, tvOsSimulator), true, `${cmd} on tvOS simulator`);
   }
+  assert.equal(isCommandSupportedOnDevice('keyboard', tvOsSimulator), false, 'keyboard on tvOS simulator');
 });

 test('unknown commands default to supported', () => {

src/core/capabilities.ts

Lines changed: 1 addition & 0 deletions
@@ -22,6 +22,7 @@ const COMMAND_CAPABILITY_MATRIX: Record<string, CommandCapability> = {
   boot: { ios: { simulator: true, device: true }, android: { emulator: true, device: true, unknown: true } },
   click: { ios: { simulator: true, device: true }, android: { emulator: true, device: true, unknown: true } },
   clipboard: { ios: { simulator: true }, android: { emulator: true, device: true, unknown: true } },
+  keyboard: { ios: {}, android: { emulator: true, device: true, unknown: true } },
   close: { ios: { simulator: true, device: true }, android: { emulator: true, device: true, unknown: true } },
   fill: { ios: { simulator: true, device: true }, android: { emulator: true, device: true, unknown: true } },
   diff: { ios: { simulator: true, device: true }, android: { emulator: true, device: true, unknown: true } },
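
Reviewer note: the new `keyboard` row uses an empty `ios: {}` object to mark every iOS device kind as unsupported. The lookup function is not part of this hunk, so the sketch below only illustrates how such a matrix could be read, assuming a missing flag means "unsupported" and, per the existing test name, that unknown commands default to supported. The type shapes and the `isSupportedSketch` name are guesses for illustration and may differ from the real `isCommandSupportedOnDevice`.

```typescript
type SketchPlatform = 'ios' | 'android';
type SketchKind = 'simulator' | 'emulator' | 'device' | 'unknown';

// Assumed shape, inferred from the rows visible in the diff.
interface SketchCapability {
  ios: { simulator?: boolean; device?: boolean };
  android: { emulator?: boolean; device?: boolean; unknown?: boolean };
}

interface SketchDevice {
  platform: SketchPlatform;
  kind: SketchKind;
}

// Two rows copied from the diff for demonstration.
const SKETCH_MATRIX: Record<string, SketchCapability> = {
  clipboard: { ios: { simulator: true }, android: { emulator: true, device: true, unknown: true } },
  keyboard: { ios: {}, android: { emulator: true, device: true, unknown: true } },
};

function isSupportedSketch(command: string, device: SketchDevice): boolean {
  const capability = SKETCH_MATRIX[command];
  // Commands without a row are treated as supported.
  if (!capability) return true;
  // An absent flag (as in `ios: {}`) reads as undefined, which is not true.
  const byKind: Partial<Record<SketchKind, boolean>> = capability[device.platform];
  return byKind[device.kind] === true;
}
```

With this reading, adding a platform later is a one-line change to the row, and the dispatcher's own platform check serves as a second line of defense.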

src/core/dispatch.ts

Lines changed: 35 additions & 0 deletions
@@ -6,7 +6,9 @@ import { listAndroidDevices } from '../platforms/android/devices.ts';
 import {
   appSwitcherAndroid,
   backAndroid,
+  dismissAndroidKeyboard,
   ensureAdb,
+  getAndroidKeyboardState,
   homeAndroid,
   pushAndroidNotification,
   readAndroidClipboardText,
@@ -461,6 +463,39 @@ export async function dispatchCommand(
       else await writeAndroidClipboardText(device, text);
       return { action, textLength: Array.from(text).length };
     }
+    case 'keyboard': {
+      if (device.platform !== 'android') {
+        throw new AppError('UNSUPPORTED_OPERATION', 'keyboard is currently supported only on Android');
+      }
+      const action = (positionals[0] ?? 'status').toLowerCase();
+      if (action !== 'status' && action !== 'get' && action !== 'dismiss') {
+        throw new AppError('INVALID_ARGS', 'keyboard requires a subcommand: status, get, or dismiss');
+      }
+      if (positionals.length > 1) {
+        throw new AppError('INVALID_ARGS', 'keyboard accepts at most one subcommand argument');
+      }
+      if (action === 'dismiss') {
+        const result = await dismissAndroidKeyboard(device);
+        return {
+          platform: 'android',
+          action: 'dismiss',
+          attempts: result.attempts,
+          wasVisible: result.wasVisible,
+          dismissed: result.dismissed,
+          visible: result.visible,
+          inputType: result.inputType,
+          type: result.type,
+        };
+      }
+      const state = await getAndroidKeyboardState(device);
+      return {
+        platform: 'android',
+        action: 'status',
+        visible: state.visible,
+        inputType: state.inputType,
+        type: state.type,
+      };
+    }
     case 'settings': {
       const [setting, state, target, mode, appBundleId] = positionals;
       const permissionOptions =
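
Reviewer note: the argument handling in the `keyboard` case has two behaviors that are easy to miss: an omitted subcommand defaults to `status`, and `get` is an alias whose response still reports `action: 'status'`. The standalone sketch below restates that validation as a pure function so the edge cases can be exercised in isolation. `parseKeyboardAction` and `SketchArgsError` are hypothetical names; the commit itself throws `AppError` inline.

```typescript
type KeyboardAction = 'status' | 'dismiss';

// Stand-in for the project's AppError('INVALID_ARGS', ...); hypothetical class.
class SketchArgsError extends Error {
  readonly code = 'INVALID_ARGS';
}

// Mirrors the validation order in the diff: subcommand check first, then arity.
function parseKeyboardAction(positionals: string[]): KeyboardAction {
  const raw = (positionals[0] ?? 'status').toLowerCase();
  if (raw !== 'status' && raw !== 'get' && raw !== 'dismiss') {
    throw new SketchArgsError('keyboard requires a subcommand: status, get, or dismiss');
  }
  if (positionals.length > 1) {
    throw new SketchArgsError('keyboard accepts at most one subcommand argument');
  }
  // 'get' collapses into 'status', matching the action field in the response.
  return raw === 'dismiss' ? 'dismiss' : 'status';
}
```

For example, `parseKeyboardAction([])` and `parseKeyboardAction(['GET'])` both return `'status'`, while `parseKeyboardAction(['dismiss', 'now'])` throws with code `INVALID_ARGS`.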

src/daemon/handlers/__tests__/session.test.ts

Lines changed: 106 additions & 0 deletions
@@ -838,6 +838,112 @@ test('clipboard requires an active session or explicit device selector', async (
   }
 });

+test('keyboard requires an active session or explicit device selector', async () => {
+  const sessionStore = makeSessionStore();
+  const response = await handleSessionCommands({
+    req: {
+      token: 't',
+      session: 'default',
+      command: 'keyboard',
+      positionals: ['status'],
+      flags: {},
+    },
+    sessionName: 'default',
+    logPath: path.join(os.tmpdir(), 'daemon.log'),
+    sessionStore,
+    invoke: noopInvoke,
+  });
+
+  assert.ok(response);
+  assert.equal(response?.ok, false);
+  if (response && !response.ok) {
+    assert.equal(response.error.code, 'INVALID_ARGS');
+    assert.match(response.error.message, /keyboard requires an active session or an explicit device selector/i);
+  }
+});
+
+test('keyboard dismiss supports explicit selector without active session', async () => {
+  const sessionStore = makeSessionStore();
+  const selectedDevice: SessionState['device'] = {
+    platform: 'android',
+    id: 'emulator-5554',
+    name: 'Pixel Emulator',
+    kind: 'emulator',
+    booted: true,
+  };
+
+  const response = await handleSessionCommands({
+    req: {
+      token: 't',
+      session: 'default',
+      command: 'keyboard',
+      positionals: ['dismiss'],
+      flags: { platform: 'android', serial: 'emulator-5554' },
+    },
+    sessionName: 'default',
+    logPath: path.join(os.tmpdir(), 'daemon.log'),
+    sessionStore,
+    invoke: noopInvoke,
+    ensureReady: async () => {},
+    resolveTargetDevice: async () => selectedDevice,
+    dispatch: async (device, command, positionals) => {
+      assert.equal(device.id, 'emulator-5554');
+      assert.equal(command, 'keyboard');
+      assert.deepEqual(positionals, ['dismiss']);
+      return { platform: 'android', action: 'dismiss', dismissed: true, visible: false };
+    },
+  });
+
+  assert.ok(response);
+  assert.equal(response?.ok, true);
+  if (response && response.ok) {
+    assert.equal(response.data?.platform, 'android');
+    assert.equal(response.data?.action, 'dismiss');
+    assert.equal(response.data?.dismissed, true);
+    assert.equal(response.data?.visible, false);
+  }
+});
+
+test('keyboard rejects unsupported iOS simulator devices', async () => {
+  const sessionStore = makeSessionStore();
+  const sessionName = 'ios-sim-session';
+  sessionStore.set(
+    sessionName,
+    makeSession(sessionName, {
+      platform: 'ios',
+      id: 'sim-1',
+      name: 'iPhone 17 Pro',
+      kind: 'simulator',
+      booted: true,
+    }),
+  );
+
+  const response = await handleSessionCommands({
+    req: {
+      token: 't',
+      session: sessionName,
+      command: 'keyboard',
+      positionals: ['status'],
+      flags: {},
+    },
+    sessionName,
+    logPath: path.join(os.tmpdir(), 'daemon.log'),
+    sessionStore,
+    invoke: noopInvoke,
+    ensureReady: async () => {},
+    dispatch: async () => {
+      throw new Error('dispatch should not run for unsupported targets');
+    },
+  });
+
+  assert.ok(response);
+  assert.equal(response?.ok, false);
+  if (response && !response.ok) {
+    assert.equal(response.error.code, 'UNSUPPORTED_OPERATION');
+    assert.match(response.error.message, /keyboard is not supported on this device/i);
+  }
+});
+
 test('clipboard read uses active session device', async () => {
   const sessionStore = makeSessionStore();
   const sessionName = 'ios-sim-session';

src/daemon/handlers/session.ts

Lines changed: 14 additions & 0 deletions
@@ -806,6 +806,20 @@ export async function handleSessionCommands(params: {
     });
   }

+  if (command === 'keyboard') {
+    return await runSessionOrSelectorDispatch({
+      req,
+      sessionName,
+      logPath,
+      sessionStore,
+      ensureReady,
+      resolveDevice,
+      dispatch,
+      command: 'keyboard',
+      positionals: req.positionals ?? [],
+    });
+  }
+
   if (command === 'perf') {
     const session = sessionStore.get(sessionName);
     if (!session) {
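
Reviewer note: `runSessionOrSelectorDispatch` is called here but not defined in this hunk. Judging only from the three new tests, its observable contract for `keyboard` appears to be: fail with `INVALID_ARGS` when there is neither a session nor a selector, fail with `UNSUPPORTED_OPERATION` before dispatch when the device cannot run the command, and otherwise forward to `dispatch`. The sketch below models that contract with injected dependencies. Every name in it is hypothetical, and the choice to let an explicit selector take precedence over a session device is an assumption the tests do not settle.

```typescript
type SketchFlags = Record<string, unknown>;
type SketchPayload = Record<string, unknown>;
interface SketchTarget { platform: 'ios' | 'android'; id: string; }

type SketchResponse =
  | { ok: true; data: SketchPayload }
  | { ok: false; error: { code: string; message: string } };

interface SketchDeps {
  sessionDevice?: SketchTarget; // device of the active session, if any
  flags: SketchFlags;           // parsed CLI flags
  resolveDevice(flags: SketchFlags): Promise<SketchTarget>;
  isSupported(command: string, device: SketchTarget): boolean;
  dispatch(device: SketchTarget, command: string, positionals: string[]): Promise<SketchPayload>;
}

// Selector flags listed in the README.
const SELECTOR_FLAGS = ['platform', 'device', 'udid', 'serial'];

async function runKeyboardSketch(positionals: string[], deps: SketchDeps): Promise<SketchResponse> {
  const hasSelector = SELECTOR_FLAGS.some((flag) => deps.flags[flag] !== undefined);
  let device: SketchTarget;
  if (hasSelector) {
    device = await deps.resolveDevice(deps.flags); // assumed precedence: selector first
  } else if (deps.sessionDevice) {
    device = deps.sessionDevice;
  } else {
    return {
      ok: false,
      error: {
        code: 'INVALID_ARGS',
        message: 'keyboard requires an active session or an explicit device selector',
      },
    };
  }
  // Capability gate runs before dispatch so unsupported targets never reach adb.
  if (!deps.isSupported('keyboard', device)) {
    return {
      ok: false,
      error: { code: 'UNSUPPORTED_OPERATION', message: 'keyboard is not supported on this device' },
    };
  }
  return { ok: true, data: await deps.dispatch(device, 'keyboard', positionals) };
}
```

Keeping the capability gate ahead of `dispatch` matches the third test, whose `dispatch` stub throws if it is ever reached for an iOS simulator.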
