fix: improve maestro test output by thymikee · Pull Request #647 · callstackincubator/agent-device

thymikee · 2026-06-01T15:05:54Z

Summary

Improve Maestro replay test output so CI logs are easier to scan and failures carry enough context to investigate.

Before, replay suite rows printed noisy absolute paths, millisecond durations, and a FLAKY status inline with the main results. Failed and flaky cases also lacked enough step-level context for a human to quickly see what failed.

After this change:

- Result rows use the Maestro YAML name as the title when available, falling back to the file basename.
- Failures include both the readable title and the source file, e.g. FAIL "Bottom Tabs - Dynamic" in bottom-tabs-dynamic.yml.
- Durations render in seconds, e.g. 17.5s instead of 17492ms.
- Retried passes are reported as PASS in the main list and summarized separately as flaky.
- Flaky timing output separates passed-attempt time from total retry time.
- Failed tests print replay step telemetry from replay-timing.ndjson, including command names like tapOn / assertVisible, line numbers, durations, timing payloads, and highlighted failed steps.
- test --verbose prints the same per-test step timings for all runnable tests, not only failures.
- test --verbose no longer enables the noisy debug/daemon log stream; --debug and -v still do.
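
The row-shaping rules above can be sketched as follows. This is only an illustration under stated assumptions, not the code from this PR: the `SuiteCase` and `Attempt` types, their field names, and the helper functions are hypothetical, and the exact wording of the PASS and flaky lines is a guess. Only the FAIL "title" in file shape and the seconds format come from the description.

```typescript
// Hypothetical shapes: field names are assumptions, not agent-device's real types.
type Attempt = { passed: boolean; durationMs: number };
type SuiteCase = {
  file: string; // absolute or relative path to the replay file
  name?: string; // optional name taken from the Maestro YAML
  attempts: Attempt[]; // one entry per try, in execution order
};

// Strip directories so rows show "flow.yml" instead of an absolute path.
function fileBasename(path: string): string {
  const parts = path.split(/[\\/]/);
  return parts[parts.length - 1] || path;
}

// Render milliseconds as seconds with one decimal place.
function formatSeconds(ms: number): string {
  return `${(ms / 1000).toFixed(1)}s`;
}

// Sum the duration of every attempt, including failed retries.
function totalMs(c: SuiteCase): number {
  return c.attempts.reduce((sum, a) => sum + a.durationMs, 0);
}

// Main list row: a case whose final attempt passed is PASS, even after retries.
function formatRow(c: SuiteCase): string {
  const title = c.name ?? fileBasename(c.file);
  const last = c.attempts[c.attempts.length - 1];
  if (last !== undefined && last.passed) {
    return `PASS ${title} (${formatSeconds(last.durationMs)})`;
  }
  return `FAIL "${title}" in ${fileBasename(c.file)} (${formatSeconds(totalMs(c))})`;
}

// Separate flaky summary: only for cases that passed after at least one retry.
function formatFlaky(c: SuiteCase): string | undefined {
  const last = c.attempts[c.attempts.length - 1];
  if (c.attempts.length < 2 || last === undefined || !last.passed) {
    return undefined;
  }
  const title = c.name ?? fileBasename(c.file);
  return `FLAKY ${title}: passed in ${formatSeconds(last.durationMs)}, ${formatSeconds(totalMs(c))} total`;
}
```

Keeping the flaky summary in its own function mirrors the described split: the main list stays a plain PASS/FAIL column, while retry cost is reported once, separately, with passed-attempt time and total time side by side.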

Details

Touched 12 files. Scope stayed within Maestro/replay test discovery, suite result shaping, CLI test rendering, CLI debug/verbose handling, and focused tests/help copy.
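
For the CLI test rendering part, step telemetry printing might look roughly like the sketch below. The record shape of replay-timing.ndjson is not shown in this PR description, so the `StepTiming` fields (`command`, `line`, `durationMs`, `ok`) are assumptions made for illustration, and the timing payloads mentioned in the summary are left out. The `>` marker stands in for whatever highlighting the real output uses.

```typescript
// Hypothetical record for one line of replay-timing.ndjson (fields are assumed).
type StepTiming = {
  command: string; // e.g. "tapOn" or "assertVisible"
  line: number; // line number of the step in the replay file
  durationMs: number;
  ok: boolean;
};

// NDJSON is one JSON document per line; blank lines are skipped.
function parseNdjson(text: string): StepTiming[] {
  return text
    .split("\n")
    .map((l) => l.trim())
    .filter((l) => l.length > 0)
    .map((l) => JSON.parse(l) as StepTiming);
}

// One output line per step; failed steps get a leading ">" so they stand out.
function renderSteps(steps: StepTiming[]): string[] {
  return steps.map((s) => {
    const marker = s.ok ? " " : ">";
    const seconds = `${(s.durationMs / 1000).toFixed(1)}s`;
    return `${marker} ${s.command} (line ${s.line}) ${seconds}`;
  });
}
```

A caller would read the file contents, pass them to `parseNdjson`, and print the result of `renderSteps` under the failed test row, or under every runnable test when verbose output is requested.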

Validation

Verified with formatting, TypeScript/lint checks, fallow audit, and focused Vitest coverage for CLI output/help, diagnostics behavior, replay test discovery, and replay suite retry behavior.

Commands run:

pnpm format
pnpm check:quick
pnpm check:fallow --base origin/main
pnpm exec vitest run src/__tests__/cli-network.test.ts src/__tests__/cli-diagnostics.test.ts src/utils/__tests__/args.test.ts src/daemon/handlers/__tests__/session-test-discovery.test.ts src/daemon/handlers/__tests__/session-test-suite.test.ts

github-actions · 2026-06-01T15:07:50Z

Size Report

Metric	Base	Current	Diff
JS raw	1.1 MB	1.1 MB	-39 B
JS gzip	358.6 kB	359.0 kB	+354 B
npm tarball	459.4 kB	458.5 kB	-945 B
npm unpacked	1.5 MB	1.5 MB	-3.3 kB

Startup median (7 runs, lower is better):

Scenario	Base	Current	Diff
CLI --version	29.4 ms	28.0 ms	-1.5 ms
CLI --help	43.2 ms	43.2 ms	-0.1 ms

Top changed chunks:

Chunk	Raw diff	Gzip diff
`dist/src/2415.js`	-2.9 kB	-745 B
`dist/src/session.js`	+289 B	+209 B
`dist/src/cli.js`	+167 B	+74 B
`dist/src/interaction.js`	-165 B	-56 B
`dist/src/1352.js`	+34 B	+18 B

github-actions · 2026-06-01T16:24:04Z

PR Preview Action v1.8.1
Preview removed because the pull request was closed.
2026-06-01 16:24 UTC

thymikee force-pushed the codex/maestro-test-output branch 8 times, most recently from 8645167 to 388e28c on June 1, 2026 16:08

fix: improve maestro test output

87aa82e

thymikee force-pushed the codex/maestro-test-output branch from 388e28c to 87aa82e on June 1, 2026 16:17

thymikee merged commit 4f8e0af into main Jun 1, 2026
18 checks passed

thymikee deleted the codex/maestro-test-output branch June 1, 2026 16:23

thymikee merged 1 commit into main from codex/maestro-test-output
