Merged
32 commits
5be447c
feat: expose semantic MCP tools
thymikee May 26, 2026
9ee4461
docs: remove semantic mcp prd
thymikee May 26, 2026
283adaa
refactor: deepen semantic command surface
thymikee May 26, 2026
75d2e0b
refactor: add mcp execution seam
thymikee May 26, 2026
1ad69e9
refactor: deepen command grammar
thymikee May 26, 2026
28efe81
refactor: remove legacy command definitions
thymikee May 26, 2026
690292e
refactor: collapse semantic cli wrappers
thymikee May 26, 2026
1696678
refactor: remove local mcp placeholders
thymikee May 26, 2026
0608720
refactor: derive semantic cli routing
thymikee May 26, 2026
e0a45dd
refactor: trim mcp status metadata
thymikee May 26, 2026
0cb8a3b
refactor: derive semantic input contracts
thymikee May 26, 2026
5c0ce25
refactor: split semantic grammar modules
thymikee May 26, 2026
8772818
refactor: derive batch input schema
thymikee May 26, 2026
3482b57
refactor: centralize cli command schema catalog
thymikee May 26, 2026
bcb4d57
refactor: share semantic cli output projections
thymikee May 26, 2026
3fad3a8
refactor: remove legacy cli output paths
thymikee May 26, 2026
9b63d0b
refactor: consolidate command interface surface
thymikee May 27, 2026
b44923f
docs: align command contract wording
thymikee May 27, 2026
e20e3de
refactor: split command projection from cli grammar
thymikee May 27, 2026
a0d8672
refactor: trim projection exports
thymikee May 27, 2026
57dded0
fix: satisfy fallow command contract audit
thymikee May 27, 2026
accce16
refactor: structure public batch steps
thymikee May 27, 2026
c9af4d8
chore: clean batch architecture references
thymikee May 27, 2026
bde63ef
fix: keep legacy cli batch steps working
thymikee May 27, 2026
dab7d8c
fix: serialize mcp batches
thymikee May 27, 2026
88492f2
chore: tighten command surface cleanup
thymikee May 27, 2026
40f589b
fix: serialize mcp stdin requests
thymikee May 27, 2026
0f9c1ef
chore: keep mcp config out of command contracts
thymikee May 27, 2026
634c513
fix: project structured batch targets
thymikee May 27, 2026
1f6136f
chore: harden command input typing
thymikee May 27, 2026
6a1d335
fix: project maestro backend for replay tests
thymikee May 28, 2026
d841ac8
fix: preserve session mcp request options
thymikee May 28, 2026
53 changes: 38 additions & 15 deletions AGENTS.md
@@ -52,12 +52,29 @@ Single-context repo. Read `CONTEXT.md` for domain language and testing/architect
- Keep modules small for agent context safety:
- target <= 300 LOC per implementation file when practical.
- if a file grows past 500 LOC, plan/extract focused submodules before adding new behavior.
- exception: generated files, schema/fixture snapshots, and integration test aggregations.
- if a file grows past 1,000 LOC, treat it as architecture debt unless it is generated data, a fixture snapshot, or an integration test aggregation.
- long guidance/data tables should live behind focused modules instead of sharing a file with parser/runtime logic.
- prefer deep modules over mechanical splits: extract when it improves locality for a concept callers already need, not just to reduce line count.

## Context Management
- Optimize for one-pass agent reads. A module that requires reading many siblings to understand one change is usually too shallow; a module that hides one concept behind a small interface is usually worth keeping.
- Start with the owning module, then one shared helper, then one downstream caller or adapter. Broaden only when the contract crosses that edge.
- Use targeted symbol searches before opening large files. For files over 500 LOC, search for the relevant type/function/section first, then read a bounded range.
- Do not add unrelated exports just to make tests easier. Test through the public interface when possible; if that is awkward, consider whether the module's interface is too shallow.
- When adding new guidance, examples, schemas, or command metadata, decide whether it belongs in the command surface, CLI grammar, CLI help, MCP projection, or daemon runtime before editing.
- Prefer updating existing domain vocabulary in `CONTEXT.md` when naming a new durable module concept. Do not coin parallel names in docs, tests, and code.

## Routing
- Keep `src/daemon.ts` as a thin router.
- Keep command names and daemon routing groups centralized in `src/command-catalog.ts`; do not re-create command string sets in handlers or request policy modules.
- Keep CLI/client positional grammar in `src/command-codecs.ts` and its `src/command-codecs/*` command-family modules. CLI commands, typed client methods, and daemon interaction adapters should reuse these codecs instead of duplicating selector/ref/positionals parsing.
- Keep command input/output contracts in the command modules:
- command surface and shared schemas: `src/commands/command-surface.ts`, `src/commands/command-contract.ts`, `src/commands/command-input.ts`
- typed client command execution: `src/commands/client-command-contracts.ts`
- command families: `src/commands/interaction-command-contracts.ts`, `src/commands/batch-command.ts`, with other typed client contracts in `src/commands/client-command-contracts.ts`
- CLI positional/flag grammar: `src/commands/cli-grammar.ts` and `src/commands/cli-grammar/*`
- typed input to daemon request projection: `src/commands/command-projection.ts`
- CLI/client/runtime output projection: `src/commands/cli-output.ts`, `src/commands/client-output.ts`, `src/commands/runtime-output.ts`
- Do not reintroduce CLI-shaped command adapters or schemas as a second source of truth. CLI, Node.js, and MCP should project from command contracts.
- Keep `src/daemon/request-router.ts` as request orchestration: auth, diagnostics scope, request admission, locking, handler chain, and fallback dispatch.
- Put request policies in focused request modules:
- tenant/lease/selector/lock admission: `src/daemon/request-admission.ts`
@@ -111,17 +128,18 @@ Single-context repo. Read `CONTEXT.md` for domain language and testing/architect

## Adding a New CLI Flag

A new snapshot/command flag touches up to 7 files in a fixed order. Follow this checklist:
A new snapshot/command flag touches only the layers that need to understand it. Follow this checklist in order:

1. `src/utils/command-schema.ts`: add to `CliFlags` type, `FLAG_DEFINITIONS` array, and the relevant `*_FLAGS` constant (e.g. `SNAPSHOT_FLAGS`). Update the command's `usageOverride` string.
2. `src/utils/snapshot.ts` (or the relevant options type): add to `SnapshotOptions` or equivalent.
3. `src/client-types.ts`: add to `CaptureSnapshotOptions` (or equivalent public options type) **and** `InternalRequestOptions`.
4. `src/client-normalizers.ts`: map the public option name to the internal flag name in `buildFlags`.
5. `src/daemon/context.ts`: add to `DaemonCommandContext` type and `contextFromFlags` function.
6. `src/core/dispatch-context.ts`: add to `DispatchContext` when the flag flows into platform dispatch, then thread it through the relevant dispatcher module.
7. `src/cli/commands/<command>.ts`: pass the flag from `flags.*` to the client call.
1. `src/utils/cli-flags.ts`: add to `CliFlags`, `FLAG_DEFINITIONS`, and the relevant exported flag group (e.g. `SNAPSHOT_FLAGS`). Add the flag to `CLI_COMMAND_OVERRIDES` in `src/utils/cli-command-overrides.ts` for each command that supports it; command names/descriptions come from command contracts unless CLI help needs a specific override.
2. `src/commands/cli-grammar/*`: read the CLI flag into command input when the CLI accepts it.
3. `src/commands/command-projection.ts` and command-family projection helpers: write the input into the daemon request only if the flag affects daemon execution.
4. `src/commands/*-command-contracts.ts`: add or update the command input schema only if the option should be available through Node.js or MCP as structured input.
5. `src/client-types.ts`: update the public typed client option only when the Node.js interface exposes the option.
6. `src/client-normalizers.ts`: update daemon flag normalization only when the request still needs a public-to-internal option translation.
7. `src/daemon/context.ts` and `src/core/dispatch-context.ts`: add the field only when it flows into platform dispatch.
8. Handler/platform modules: thread the option only after the command surface, grammar, and projection prove it belongs there.

Command-only flags (like `find --first`) that don't flow to the platform layer only need steps 1 and the handler file.
Command-only flags (like `find --first`) that do not flow to the platform layer usually stop at steps 1-3.
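
The checklist above can be sketched for a command-only flag, assuming a simplified `find --first` path through steps 1-3. Every name below except `FLAG_DEFINITIONS` is hypothetical, and even `FLAG_DEFINITIONS` uses an assumed shape; the sketch shows only the direction of data flow (flag definition, CLI grammar, daemon request projection), not the repository's actual modules.

```typescript
// Hypothetical sketch: these types and functions are illustrative assumptions,
// not the real exports of src/utils/cli-flags.ts or src/commands/*.

type FlagDefinition = { name: string; kind: "boolean" };

// Step 1 (assumed shape): declare the flag once and list it in a per-command group.
const FLAG_DEFINITIONS: FlagDefinition[] = [{ name: "first", kind: "boolean" }];
const FIND_FLAGS: readonly string[] = ["first"];

// Typed command input produced by the CLI grammar (hypothetical).
type FindInput = { query: string; first: boolean };

// Daemon request shape (hypothetical).
type DaemonRequest = {
  command: string;
  positionals: string[];
  flags: Record<string, boolean>;
};

// Step 2: the CLI grammar reads argv into typed command input.
function readFindInput(argv: string[]): FindInput {
  const input: FindInput = { query: "", first: false };
  for (const arg of argv) {
    if (!arg.startsWith("--")) {
      input.query = arg; // treat the bare argument as the positional query
      continue;
    }
    const name = arg.slice(2);
    const known = FLAG_DEFINITIONS.some((def) => def.name === name);
    if (!known || !FIND_FLAGS.includes(name)) {
      throw new Error(`Unknown flag for find: ${arg}`);
    }
    if (name === "first") input.first = true;
  }
  return input;
}

// Step 3: the projection writes the flag into the daemon request only when set.
function projectFindRequest(input: FindInput): DaemonRequest {
  const flags: Record<string, boolean> = {};
  if (input.first) flags.first = true;
  return { command: "find", positionals: [input.query], flags };
}
```

In this sketch the flag never reaches a dispatch context or platform module, which is the case the note about steps 1-3 describes.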

## Hard Rules
- Use process helpers from `src/utils/exec.ts` for TypeScript process execution: `runCmd`, `runCmdStreaming`, `runCmdSync`, `runCmdBackground`, and `runCmdDetached`. Do not import raw `spawn`/`spawnSync` outside `src/utils/exec.ts`; add or extend an exec helper instead. Plain `.mjs` packaging fixtures that cannot import TypeScript helpers should keep child-process usage local and prefer `execFile`/`execFileSync` over spawn.
@@ -190,7 +208,7 @@ Command-only flags (like `find --first`) that don't flow to the platform layer o

## Testing Matrix
- Docs/skills only: no tests required unless a more specific rule below applies.
- CLI help/guidance changes in `src/utils/command-schema.ts`: run `pnpm exec vitest run src/utils/__tests__/args.test.ts`.
- CLI help/guidance changes in `src/utils/cli-help.ts`, `src/utils/cli-command-overrides.ts`, or `src/utils/command-schema.ts`: run `pnpm exec vitest run src/utils/__tests__/args.test.ts`.
- SkillGym prompt/assertion changes: run `pnpm test:skillgym:case <case-id>`; the script builds local CLI help first. For broad validation, use `pnpm test:skillgym`; append `-- --tag fixture-smoke` or `-- --tag skill-guidance` when validating one suite group.
- Non-TS, no behavior impact: no tests unless requested.
- Keep tests behavioral; do not assert shapes or cases TypeScript already proves.
@@ -208,6 +226,7 @@ Command-only flags (like `find --first`) that don't flow to the platform layer o
- Do not run integration tests by default.
- Do not inspect both iOS and Android codepaths unless task requires both.
- Prefer targeted `git diff -- <paths>` over broad file reads during review.
- Keep long help prose in `src/utils/cli-help.ts`; keep flag definitions in `src/utils/cli-flags.ts`; keep CLI-specific command usage/flag metadata in `src/utils/cli-command-overrides.ts`.
- Prefer `snapshot -i`, `find`, and scoped selectors over repeated full snapshot dumps when exploring Apple desktop UIs.
- Keep PR summaries short and scoped.

@@ -222,9 +241,10 @@ Command-only flags (like `find --first`) that don't flow to the platform layer o
- Changing `tsconfig.lib.json`/build tooling without running `pnpm check:tooling`; declaration generation is stricter than `tsc --noEmit`.

## Docs & Skills
- Versioned CLI help is the agent-facing source of truth. Put workflow guidance in `src/utils/command-schema.ts` help topics and assert important copy in `src/utils/__tests__/args.test.ts`.
- Versioned CLI help is the agent-facing source of truth. Put workflow guidance and help-topic prose in `src/utils/cli-help.ts`, keep flag definitions in `src/utils/cli-flags.ts`, keep CLI command overrides in `src/utils/cli-command-overrides.ts`, and assert important copy in `src/utils/__tests__/args.test.ts`.
- Keep parser schema and help rendering separate: `src/utils/command-schema.ts` composes contract-derived command schemas with CLI overrides; `src/utils/cli-help.ts` owns help topics and usage rendering.
- Skills are thin routers. Keep `skills/**/SKILL.md` focused on when to use the skill, version gating, which `agent-device help <topic>` page to read, and a short default loop. Do not duplicate full CLI manuals in skills.
- For behavior/CLI surface changes, update the versioned help instructions in `src/utils/command-schema.ts` and assert important help copy in `src/utils/__tests__/args.test.ts`. Also update `README.md` and relevant `website/docs/**` when user-facing docs need it.
- For behavior/CLI surface changes, update the versioned help instructions in `src/utils/cli-help.ts` or the CLI command metadata in `src/utils/cli-command-overrides.ts`, then assert important help copy in `src/utils/__tests__/args.test.ts`. Also update `README.md` and relevant `website/docs/**` when user-facing docs need it.
- For behavior/CLI surface changes and command-planning guidance changes, write or update a SkillGym case in `test/skillgym/suites/agent-device-smoke-suite.ts` that captures the expected agent command plan.
- Do not update `skills/**/SKILL.md` for command behavior or workflow guidance unless the user explicitly asks; skills must route to versioned CLI help instead of carrying behavior details.
- Keep SkillGym cases behavioral and command-planning oriented. Prefer prompts that assert the user-visible contract and expected command family over brittle exact output, but forbid known bad patterns.
@@ -245,6 +265,7 @@ Command-only flags (like `find --first`) that don't flow to the platform layer o

## Key Files
- CLI parse + formatting: `src/bin.ts`, `src/cli.ts`, `src/utils/args.ts`
- CLI help + option metadata: `src/utils/cli-help.ts`, `src/utils/cli-flags.ts`, `src/utils/cli-command-overrides.ts`, `src/utils/command-schema.ts`, `src/utils/cli-option-schema.ts`
- Daemon client transport: `src/daemon-client.ts`
- Daemon state/store: `src/daemon/session-store.ts`
- Selector DSL and matching: `src/daemon/selectors.ts`
@@ -254,7 +275,9 @@ Command-only flags (like `find --first`) that don't flow to the platform layer o
- Handler context helpers: `src/daemon/context.ts`, `src/daemon/device-ready.ts`
- Request routing/policy: `src/daemon/request-router.ts`, `src/daemon/request-admission.ts`, `src/daemon/request-generic-dispatch.ts`
- Dispatcher + capability map: `src/core/dispatch.ts`, `src/core/dispatch-context.ts`, `src/core/dispatch-interactions.ts`, `src/core/capabilities.ts`
- Command catalog + positional codecs: `src/command-catalog.ts`, `src/command-codecs.ts`, `src/command-codecs/*`
- Command catalog + command surface: `src/command-catalog.ts`, `src/commands/command-surface.ts`, `src/commands/command-contract.ts`, `src/commands/client-command-contracts.ts`
- CLI grammar: `src/commands/cli-grammar.ts`, `src/commands/cli-grammar/*`
- Daemon request projection: `src/commands/command-projection.ts`
- Platform backends: `src/platforms/ios/*`, `ios-runner/*`, `src/platforms/android/*`

## Pull Requests
1 change: 1 addition & 0 deletions CONTEXT.md
@@ -13,6 +13,7 @@
- Target: selected automation destination, such as mobile, tv, or desktop.
- Modality: broad supported device family, such as mobile, tv, or desktop.
- Session: daemon-owned state for a selected target and opened app or surface.
- Command surface: catalog of public command identity, interface exposure, adapter policy, and shared command metadata across CLI, Node.js, MCP, and batch entrypoints.

## Testing Principles

2 changes: 1 addition & 1 deletion README.md
@@ -83,7 +83,7 @@ Snapshots assign refs like `@e1`, `@e2`, and `@e3` to elements on the current sc

## Next Steps

- **Set up your agent**: run the CLI from Cursor, Codex, Claude Code, Windsurf, or another agent terminal. For skills, rules, MCP discovery, and client-specific setup, see [AI Agent Setup](https://incubator.callstack.com/agent-device/docs/agent-setup).
- **Set up your agent**: run the CLI from Cursor, Codex, Claude Code, Windsurf, or another agent terminal. For skills, rules, direct MCP tools, and client-specific setup, see [AI Agent Setup](https://incubator.callstack.com/agent-device/docs/agent-setup).
- **Try the sample app**: clone the repo and run the bundled Expo fixture when you want a guided first dogfood run with screenshots, replay, and performance evidence. See [Quick Start](https://incubator.callstack.com/agent-device/docs/quick-start).
- **Go deeper**: use [Commands](https://incubator.callstack.com/agent-device/docs/commands), [Replay & E2E](https://incubator.callstack.com/agent-device/docs/replay-e2e), and [Debugging & Profiling](https://incubator.callstack.com/agent-device/docs/debugging-profiling) for production workflows.
