Commit c51b1e2

aidandaly24, claude, and jesseturner21 authored
feat: add TUI agent harness with MCP server (#548)
* feat: add TUI harness core library

  Headless terminal harness using node-pty + @xterm/headless that spawns real CLI processes in a PTY and reads screen state programmatically.

  Core components:
  - TuiSession: spawn, sendKeys, sendSpecialKey, readScreen, waitFor, close
  - SettlingMonitor: text-content comparison to filter cursor blink
  - Screen reader: viewport/scrollback reading, numbered output
  - Key map: named keys to escape sequence mapping
  - Session manager: global registry with process-exit cleanup
  - Availability check: graceful skip when node-pty is missing

  waitFor() throws WaitForTimeoutError on timeout (not silent return). launch() races settle vs process exit, throws LaunchError on crash.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: add MCP server for TUI harness

  7 MCP tools exposed via stdio transport for AI agents to drive the TUI: tui_launch, tui_send_keys, tui_read_screen, tui_wait_for, tui_screenshot, tui_close, tui_list_sessions.

  Session map with max 10 concurrent sessions. tui_wait_for catches WaitForTimeoutError and returns {found: false} (not an MCP error). tui_launch defaults to AgentCore CLI when no command specified.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: add dependencies and build config for TUI harness

  - devDependencies: @xterm/headless, @modelcontextprotocol/sdk
  - optionalDependencies: node-pty (native addon, graceful skip)
  - esbuild: second entry point for mcp-harness bundle
  - vitest: new 'tui' project with fileParallelism: false
  - .mcp.json: MCP server discovery for Claude Code
  - package.json: bin entry + test:tui script

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: add TUI harness documentation

  AGENTS.md: TUI harness section with MCP tool reference, complete 27-step create wizard example (verified against real TUI), screen identification markers table, screenshot format, error recovery patterns, navigation patterns, and known limitations.

  TESTING.md: TUI integration test guide with TuiSession API reference, ScreenState type, special keys list, waitFor vs settling guidance, WaitForTimeoutError output example, and LaunchError handling.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor: consolidate TUI harness into src/tui-harness/

  Move harness code from two separate directories (src/test-utils/tui-harness/ and src/mcp-harness/) into a single src/tui-harness/ directory with lib/ and mcp/ subdirectories. Also cleans up dead code in tools.ts, derives SpecialKey type from SPECIAL_KEY_VALUES array (single source of truth), and fixes cross-boundary import of createMinimalProjectDir.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: make TUI harness build opt-in for dev only

  The harness is dev-only tooling for AI agents and integration tests. It should not ship to end users who install the CLI.

  - Gate MCP harness esbuild behind BUILD_HARNESS=1 env var
  - Remove agent-tui-harness bin entry from package.json
  - Add !dist/mcp-harness to files array (npm publish exclusion)
  - Remove node-pty from optionalDependencies (stays in devDependencies)
  - Add build:harness script, update test:tui to use it
  - Update AGENTS.md to reference npm run build:harness

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: move TUI harness guide from AGENTS.md to docs/tui-harness.md

  Reduces AGENTS.md context overhead for agents that don't need TUI harness details. Leaves a one-line pointer to the full guide in docs/.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* style: fix prettier formatting in AGENTS.md

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* style: rename TuiSession.ts to kebab-case tui-session.ts

  Aligns with the kebab-case convention used by other files in src/tui-harness/lib/.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Jesse Turner <ajesstur@amazon.com>
Co-authored-by: Jesse Turner <57651174+jesseturner21@users.noreply.github.com>
1 parent c2c646c commit c51b1e2

22 files changed

Lines changed: 3785 additions & 3 deletions

.mcp.json

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+{
+  "mcpServers": {
+    "tui-harness": {
+      "command": "node",
+      "args": ["dist/mcp-harness/index.mjs"]
+    }
+  }
+}

AGENTS.md

Lines changed: 6 additions & 1 deletion
@@ -49,7 +49,7 @@ Note: CDK L3 constructs are in a separate package `@aws/agentcore-cdk`.
 
 ## Primitives Architecture
 
-All resource types (agent, memory, identity, gateway, mcp-tool) are modeled as **primitives** self-contained classes
+All resource types (agent, memory, identity, gateway, mcp-tool) are modeled as **primitives** -- self-contained classes
 in `src/cli/primitives/` that own the full add/remove lifecycle for their resource type.
 
 Each primitive extends `BasePrimitive` and implements: `add()`, `remove()`, `previewRemove()`, `getRemovable()`,
@@ -117,3 +117,8 @@ See `docs/TESTING.md` for details.
 
 - Always look for existing types before creating a new type inline.
 - Re-usable constants must be defined in a constants file in the closest sensible subdirectory.
+
+## TUI Harness
+
+See `docs/tui-harness.md` for the full TUI harness usage guide (MCP tools, screen markers, examples, and error
+recovery).

docs/TESTING.md

Lines changed: 223 additions & 0 deletions
@@ -6,6 +6,7 @@
 npm test              # Run unit tests
 npm run test:watch    # Run tests in watch mode
 npm run test:integ    # Run integration tests
+npm run test:tui      # Run TUI integration tests (builds first)
 npm run test:all      # Run all tests (unit + integ)
 ```
 
@@ -125,12 +126,234 @@ Review the changes in `src/assets/__tests__/__snapshots__/` before committing.
 - Contents of all template files (CDK, Python frameworks, MCP, static assets)
 - Any file addition or removal
 
+## TUI Integration Tests
+
+TUI integration tests run the full CLI binary inside a pseudo-terminal (PTY) and verify screen output, keyboard
+navigation, and end-to-end wizard flows.
+
+> **Note:** TUI tests require `node-pty` (native addon). If node-pty is not installed, TUI tests are automatically
+> skipped.
+
+### Running TUI Tests
+
+```bash
+npm run test:tui              # Builds first, then runs TUI tests
+npx vitest run --project tui  # Skip build (use when build is fresh)
+```
+
+### Test Organization
+
+```
+integ-tests/tui/
+├── setup.ts                # Global setup: availability check, afterAll cleanup
+├── helpers.ts              # createMinimalProjectDir, common test setup
+├── harness.test.ts         # TuiSession self-tests (spawn, send, read)
+├── navigation.test.ts      # Screen navigation flows
+├── create-flow.test.ts     # Create wizard end-to-end
+├── add-flow.test.ts        # Add resource flows
+└── deploy-screen.test.ts   # Deploy screen rendering
+```
+
+### Writing a TUI Flow Test
+
+Below is a complete example showing the typical pattern for a TUI flow test:
+
+```typescript
+import { isAvailable } from '../../src/test-utils/tui-harness/index.js';
+import { TuiSession } from '../../src/test-utils/tui-harness/index.js';
+import { createMinimalProjectDir } from './helpers.js';
+import { afterEach, describe, expect, it } from 'vitest';
+
+describe.skipIf(!isAvailable)('my TUI flow', () => {
+  let session: TuiSession;
+
+  afterEach(async () => {
+    await session?.close();
+  });
+
+  it('navigates to the add screen', async () => {
+    // createMinimalProjectDir makes a temp dir with agentcore config (~10ms)
+    const { dir, cleanup } = await createMinimalProjectDir({ hasAgents: true });
+
+    try {
+      // Launch the CLI TUI in the project directory
+      session = await TuiSession.launch({
+        command: 'node',
+        args: ['../../dist/cli/index.mjs'],
+        cwd: dir,
+      });
+
+      // Wait for the HelpScreen to render
+      await session.waitFor('Commands');
+
+      // Navigate: type 'add' to filter, then Enter
+      await session.sendKeys('add');
+      await session.sendSpecialKey('enter');
+
+      // Verify we reached the AddScreen
+      await session.waitFor('agent');
+      const screen = session.readScreen();
+      expect(screen.lines.join('\n')).toContain('agent');
+    } finally {
+      await cleanup();
+    }
+  });
+});
+```
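One detail worth checking in the flow test: the session is launched with `cwd: dir`, so `node` would normally resolve a relative script path such as `'../../dist/cli/index.mjs'` against the temp project directory, not against the test file. The following is a minimal sketch of one way to avoid that, assuming the tests are started from the repository root and that the harness does not resolve `args` on its own. `resolveCliEntry` is a hypothetical helper and not part of the harness.

```typescript
import * as path from 'node:path';

// Hypothetical helper: builds an absolute path to the bundled CLI entry point.
// Assumes `repoRoot` is the repository root, for example process.cwd() when
// the tests are started through an npm script.
function resolveCliEntry(repoRoot: string): string {
  return path.resolve(repoRoot, 'dist', 'cli', 'index.mjs');
}

// Usage sketch inside a test (TuiSession.launch is the documented harness API):
//   session = await TuiSession.launch({
//     command: 'node',
//     args: [resolveCliEntry(process.cwd())],
//     cwd: dir,
//   });
```

An absolute path keeps the launch independent of whichever temp directory `createMinimalProjectDir` happens to return.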
+
+Key points:
+
+- **`describe.skipIf(!isAvailable)`** -- gracefully skips when `node-pty` is missing.
+- **`afterEach` with `session?.close()`** -- always clean up PTY processes.
+- **`createMinimalProjectDir`** -- fast temp directory setup (no `npm install`).
+- **`try/finally` with `cleanup()`** -- always remove temp directories.
+
+### TuiSession API Quick Reference
+
+| Method                                 | Returns                | Description                                                                                  |
+| -------------------------------------- | ---------------------- | -------------------------------------------------------------------------------------------- |
+| `TuiSession.launch(options)`           | `Promise<TuiSession>`  | Spawn CLI in PTY. Throws `LaunchError` if process exits during startup.                      |
+| `session.sendKeys(text, waitMs?)`      | `Promise<ScreenState>` | Type text, wait for screen to settle, return screen.                                         |
+| `session.sendSpecialKey(key, waitMs?)` | `Promise<ScreenState>` | Send special key (enter, tab, escape, etc.), wait, return screen.                            |
+| `session.readScreen(options?)`         | `ScreenState`          | Read current screen (synchronous). Options: `{ includeScrollback?, numbered? }`.             |
+| `session.waitFor(pattern, timeoutMs?)` | `Promise<ScreenState>` | Wait for text/regex on screen. **Throws `WaitForTimeoutError` on timeout** (default 5000ms). |
+| `session.close(signal?)`               | `Promise<CloseResult>` | Close session. Returns exit code, signal, final screen.                                      |
+| `session.info`                         | `SessionInfo`          | Session metadata: sessionId, pid, dimensions, alive status.                                  |
+| `session.alive`                        | `boolean`              | Whether the PTY process is still running.                                                    |
+
+### ScreenState Shape
+
+```typescript
+interface ScreenState {
+  lines: string[];                            // Each line of terminal text
+  cursor: { x: number; y: number };           // Cursor position
+  dimensions: { cols: number; rows: number }; // Terminal size
+  bufferType: 'normal' | 'alternate';         // Active buffer
+}
+```
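Tests often need the screen as a single string or need to locate the row that holds a given label. The sketch below shows two small helpers built only on the `ScreenState` shape documented here. `screenText` and `findLine` are hypothetical names for illustration and are not exported by the harness.

```typescript
// Mirrors the ScreenState shape documented in this guide.
interface ScreenState {
  lines: string[];
  cursor: { x: number; y: number };
  dimensions: { cols: number; rows: number };
  bufferType: 'normal' | 'alternate';
}

// Hypothetical helper: joins the screen into one string, dropping trailing
// whitespace on each row and trailing blank rows at the bottom.
function screenText(screen: ScreenState): string {
  return screen.lines
    .map((line) => line.replace(/\s+$/, ''))
    .join('\n')
    .replace(/\s+$/, '');
}

// Hypothetical helper: index of the first row matching the pattern, or -1.
// Pass regexes without the global flag, since `test` is stateful with /g.
function findLine(screen: ScreenState, pattern: string | RegExp): number {
  return screen.lines.findIndex((line) =>
    typeof pattern === 'string' ? line.includes(pattern) : pattern.test(line)
  );
}
```

With helpers like these, an assertion such as `expect(screenText(screen)).toContain('agent')` reads the same way as the `lines.join('\n')` form used in the flow test.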
+
+### Special Keys
+
+The following special keys can be passed to `session.sendSpecialKey()`:
+
+`enter`, `tab`, `escape`, `backspace`, `delete`, `space`, `up`, `down`, `left`, `right`, `home`, `end`, `pageup`,
+`pagedown`, `ctrl+c`, `ctrl+d`, `ctrl+q`, `ctrl+g`, `ctrl+a`, `ctrl+e`, `ctrl+w`, `ctrl+u`, `ctrl+k`, `f1` through
+`f12`.
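For readers unfamiliar with how named keys reach a PTY, the sketch below maps a few of these names to the byte sequences terminals conventionally use. It is an illustrative subset under the assumption of standard ANSI/VT sequences; `KEY_SEQUENCES` and `keyToSequence` are hypothetical names, and the harness's own key map may use different values.

```typescript
// Illustrative subset only: conventional ANSI/VT sequences for a few keys.
// The harness's real key map is not shown in this guide and may differ.
const KEY_SEQUENCES: Record<string, string> = {
  enter: '\r',
  tab: '\t',
  escape: '\x1b',
  backspace: '\x7f',
  up: '\x1b[A',
  down: '\x1b[B',
  right: '\x1b[C',
  left: '\x1b[D',
};

// Hypothetical helper: translates a key name into the bytes written to the PTY.
function keyToSequence(key: string): string {
  const named = KEY_SEQUENCES[key];
  if (named !== undefined) return named;

  // ctrl+<letter> is the letter's char code minus 96 (ctrl+a -> 0x01).
  const letter = /^ctrl\+([a-z])$/.exec(key)?.[1];
  if (letter) return String.fromCharCode(letter.charCodeAt(0) - 96);

  throw new Error(`Unknown special key: ${key}`);
}
```

Throwing on unknown names, as this sketch does, surfaces typos such as `'return'` immediately instead of silently sending nothing.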
+
+### Key Concepts
+
+#### waitFor vs Settling
+
+- **Settling** (automatic after `sendKeys`/`sendSpecialKey`): Waits for screen text to stop changing. Good for most
+  screens. Fails on spinner/animation screens because text changes continuously.
+- **waitFor**: Polls for a specific text pattern. Use for: (a) async operations with spinners, (b) confirming you
+  reached the right screen, (c) any case where you need a specific pattern before proceeding.
+- **Rule of thumb**: Use `waitFor` when waiting for an async result (project creation, deployment). Use
+  `sendKeys`/`sendSpecialKey` (which auto-settle) for navigating between static screens.
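The difference between the two modes can be made concrete with a small model of settling. The sketch below treats the screen as settled once its text has stayed identical for a quiet period, which is why a spinner never settles. It is an assumption-laden illustration: `Snapshot`, `isSettled`, and the `quietMs` parameter are hypothetical, and the real `SettlingMonitor` may work differently.

```typescript
// Hypothetical model of a screen sample: when it was taken and its text.
interface Snapshot {
  atMs: number;
  text: string;
}

// Returns true when the latest text has been unchanged for at least quietMs.
// Only text is compared, so cursor blink alone never resets the timer.
function isSettled(snapshots: Snapshot[], nowMs: number, quietMs: number): boolean {
  let lastText: string | undefined;
  let stableSince = nowMs;
  for (const snap of snapshots) {
    if (snap.text !== lastText) {
      // Text changed: restart the quiet period from this sample.
      lastText = snap.text;
      stableSince = snap.atMs;
    }
  }
  return lastText !== undefined && nowMs - stableSince >= quietMs;
}
```

Under this model a static menu sampled at 0, 50, and 100 ms is settled by 300 ms with a 200 ms quiet period, while a spinner that redraws every 100 ms keeps resetting `stableSince` and needs `waitFor` instead.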
+
+#### waitFor Throws on Timeout
+
+`waitFor()` throws `WaitForTimeoutError` when the pattern is not found within the timeout. The error includes:
+
+- The pattern that was not found
+- How long it waited
+- The full screen content at timeout
+
+This means tests fail fast with useful diagnostics. You do not need to check a `found` boolean.
+
+#### WaitForTimeoutError Output
+
+When `waitFor()` times out, the thrown `WaitForTimeoutError` produces a message like this:
+
+```
+WaitForTimeoutError: waitFor("created successfully") timed out after 5000ms.
+Screen content:
+AgentCore Create
+
+Creating project...
+⠋ Installing dependencies
+```
+
+The error message includes the full non-blank screen content at the time of the timeout. This makes it straightforward
+to diagnose why the expected pattern was not found -- was the screen still loading? Did the test land on the wrong
+screen? Was there a typo in the pattern?
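To show how a message of that shape could be assembled, here is a minimal sketch. It assumes "non-blank" means blank rows are trimmed from the top and bottom of the screen while interior blank rows are kept, which is what the sample output suggests. `formatTimeoutMessage` is a hypothetical function, not the harness's implementation.

```typescript
// Treats undefined or whitespace-only rows as blank.
const isBlank = (line: string | undefined): boolean =>
  line === undefined || line.trim() === '';

// Hypothetical formatter: builds the message body shown in the sample output.
// Assumption: leading and trailing blank rows are dropped, interior ones kept.
function formatTimeoutMessage(
  pattern: string | RegExp,
  timeoutMs: number,
  lines: string[]
): string {
  let start = 0;
  let end = lines.length;
  while (start < end && isBlank(lines[start])) start++;
  while (end > start && isBlank(lines[end - 1])) end--;

  // Strings are quoted; regexes print in their /source/flags form.
  const shown = typeof pattern === 'string' ? JSON.stringify(pattern) : String(pattern);
  return [
    `waitFor(${shown}) timed out after ${timeoutMs}ms.`,
    'Screen content:',
    ...lines.slice(start, end),
  ].join('\n');
}
```

The `WaitForTimeoutError:` prefix in the sample is the error name that the runtime prepends when printing the error, so it is not part of the string this sketch builds.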
+
+If you need to inspect the error properties programmatically (for example, to log additional context or make assertions
+on the screen state), you can catch the error directly:
+
+```typescript
+import { WaitForTimeoutError } from '../../src/test-utils/tui-harness/index.js';
+
+try {
+  await session.waitFor('expected text', 3000);
+} catch (err) {
+  if (err instanceof WaitForTimeoutError) {
+    console.log(err.pattern); // 'expected text'
+    console.log(err.elapsed); // ~3000
+    console.log(err.screen); // ScreenState with full content
+  }
+  throw err;
+}
+```
+
+#### createMinimalProjectDir
+
+Creates a temp directory that AgentCore recognizes as a project in ~10ms (no npm install). Use it when your test needs a
+project context:
+
+```typescript
+const { dir, cleanup } = await createMinimalProjectDir({
+  projectName: 'mytest', // optional, defaults to 'testproject'
+  hasAgents: true, // optional, adds a sample agent
+});
+```
+
+Always call `cleanup()` when done (in `finally` or `afterEach`).
+
+#### LaunchError
+
+`TuiSession.launch()` throws `LaunchError` when the spawned process exits before the screen settles. Common causes
+include a missing binary, a crash on startup, or an invalid working directory.
+
+The error includes the following diagnostic properties:
+
+- `command` -- the executable that was launched
+- `args` -- the arguments passed to the command
+- `cwd` -- the working directory used for the spawned process
+- `exitCode` -- the process exit code (or `null` if terminated by signal)
+- `screen` -- the `ScreenState` captured at the time of exit
+
+You can assert that a launch fails with `LaunchError`:
+
+```typescript
+import { LaunchError, TuiSession } from '../../src/test-utils/tui-harness/index.js';
+
+it('throws LaunchError for missing binary', async () => {
+  await expect(TuiSession.launch({ command: 'nonexistent-binary' })).rejects.toThrow(LaunchError);
+});
+
+// Or if you need to inspect the error:
+it('provides diagnostics in LaunchError', async () => {
+  try {
+    await TuiSession.launch({ command: 'node', args: ['missing-file.js'] });
+  } catch (err) {
+    if (err instanceof LaunchError) {
+      console.log(err.command); // 'node'
+      console.log(err.exitCode); // 1
+      console.log(err.screen); // ScreenState at time of crash
+    }
+    throw err;
+  }
+});
+```
+
 ## Configuration
 
 Test configuration is in `vitest.config.ts` using Vitest projects:
 
 - **unit** project: `src/**/*.test.ts` (includes snapshot tests)
 - **integ** project: `integ-tests/**/*.test.ts`
+- **tui** project: `integ-tests/tui/**/*.test.ts` (TUI integration tests)
 - Test timeout: 120 seconds
 - Hook timeout: 120 seconds
 