fix: improve maestro test output #647
Merged
Summary
Improve Maestro replay test output so CI logs are easier to scan and failures carry enough context to investigate.
Before, replay suite rows printed noisy absolute paths, millisecond durations, and a `FLAKY` status inline with the main results. Failed and flaky cases also lacked enough step-level context for a human to quickly see what failed.
After this change:
- Rows show `name` titles when available, falling back to file basenames, for example `FAIL "Bottom Tabs - Dynamic" in bottom-tabs-dynamic.yml`
- Durations print in seconds: `17.5s` instead of `17492ms`
- Flaky tests show as `PASS` in the main list and are summarized separately as flaky
- Failed and flaky tests print step timings from `replay-timing.ndjson`, including command names like `tapOn`/`assertVisible`, line numbers, durations, timing payloads, and highlighted failed steps
- `test --verbose` prints the same per-test step timings for all runnable tests, not only failures
- `test --verbose` no longer enables the noisy debug/daemon log stream; `--debug` and `-v` still do
Details
Touched 12 files. Scope stayed within Maestro/replay test discovery, suite result shaping, CLI test rendering, CLI debug/verbose handling, and focused tests/help copy.
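For reviewers who want a concrete picture of the row format described in the Summary, here is a minimal sketch of the rendering rules. It is not the code from this PR: the `SuiteResult` shape, the helper names (`basename`, `formatDuration`, `formatRow`, `formatFlakySummary`), the single-space separators, and the wording of the flaky summary line are all assumptions made for illustration.

```typescript
// Hypothetical result shape; the real types in this PR are not shown here.
type Status = "pass" | "fail" | "flaky";

interface SuiteResult {
  file: string; // path to the replay/flow file, possibly absolute
  name?: string; // optional title taken from the flow's `name` field
  status: Status;
  durationMs: number;
}

// Drop directories so rows do not print noisy absolute paths.
function basename(path: string): string {
  const parts = path.split(/[\\/]/);
  return parts[parts.length - 1] ?? path;
}

// Render milliseconds as seconds with one decimal place.
function formatDuration(ms: number): string {
  return `${(ms / 1000).toFixed(1)}s`;
}

// One row per test: title when available, basename otherwise.
function formatRow(result: SuiteResult): string {
  const file = basename(result.file);
  // Flaky tests eventually passed, so the main list reports them as PASS.
  const label = result.status === "fail" ? "FAIL" : "PASS";
  const title = result.name ? `"${result.name}" in ${file}` : file;
  return `${label} ${title} ${formatDuration(result.durationMs)}`;
}

// Flaky tests are reported once, separately from the main list.
function formatFlakySummary(results: SuiteResult[]): string | undefined {
  const flaky = results
    .filter((r) => r.status === "flaky")
    .map((r) => r.name ?? basename(r.file));
  return flaky.length > 0
    ? `Flaky (${flaky.length}): ${flaky.join(", ")}`
    : undefined;
}
```

Under these assumptions, a failed result with name `Bottom Tabs - Dynamic`, a file path ending in `bottom-tabs-dynamic.yml`, and a duration of 17492 ms renders as `FAIL "Bottom Tabs - Dynamic" in bottom-tabs-dynamic.yml 17.5s`. Keeping the flaky summary in its own function means the main list only ever needs two labels.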
Validation
Verified with formatting, TypeScript/lint checks, fallow audit, and focused Vitest coverage for CLI output/help, diagnostics behavior, replay test discovery, and replay suite retry behavior.
Commands run:
- `pnpm format`
- `pnpm check:quick`
- `pnpm check:fallow --base origin/main`
- `pnpm exec vitest run src/__tests__/cli-network.test.ts src/__tests__/cli-diagnostics.test.ts src/utils/__tests__/args.test.ts src/daemon/handlers/__tests__/session-test-discovery.test.ts src/daemon/handlers/__tests__/session-test-suite.test.ts`