Commit a48ed54

kevinccbsg and claude authored
feat(cli): add Tests: summary line, Failed tests block, and MOCK prefix on contract lines (#7)
* docs: add spec for twd-cli test summary output

  Spec proposes a final, grep-friendly `Tests: N passed, M failed, K skipped` line, adds a `MOCK ` prefix to contract-validation lines to disambiguate them from test-result glyphs, and replaces console.time with a manual duration delta folded into the summary line.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: add implementation plan for test summary output

  Six TDD tasks: formatDuration helper, formatTestSummary + Failed tests block, MOCK prefix on contract lines, integration into runTests, and a manual smoke test against test-example-app.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(cli): add formatDuration helper for m:ss.SSS output

* feat(cli): add formatTestSummary for grep-friendly Tests: line

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(cli): add formatFailedTestsBlock for end-of-log failure list

* feat(cli): prefix contract-validation lines with MOCK

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(cli): print Tests: summary line and Failed tests block

  Wire formatTestSummary and formatFailedTestsBlock into the runTests orchestrator, replacing console.time/timeEnd with a startedAt timestamp and durationMs calculation.

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(cli): print Tests: summary after browser close, remove stale time mocks

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 839a747 commit a48ed54

10 files changed

Lines changed: 1112 additions & 8 deletions

docs/superpowers/plans/2026-05-20-test-summary-output.md

Lines changed: 690 additions & 0 deletions
Large diffs are not rendered by default.
Lines changed: 105 additions & 0 deletions
@@ -0,0 +1,105 @@
1+
# twd-cli Test Summary Output — Design Spec
2+
3+
**Date:** 2026-05-20
4+
**Status:** Proposed
5+
6+
## Purpose
7+
8+
Make the final output of `twd-cli run` self-describing: at a glance, a developer (or an AI agent piping the output through `grep`) should be able to tell **how many tests passed, how many failed, how many were skipped** — without parsing per-test lines or running the suite again.
9+
10+
Today the run ends with a mock-validation summary like:
11+
12+
```
13+
Mocks validated: 128 | Errors: 7 | Warnings: 0 | Skipped: 80
14+
```
15+
16+
That line is about *mocks*, not *tests*. There is no equivalent line for test results. Users reading the tail of the log have to scroll back and visually count `✓ should ...` lines, and they may confuse the yellow `✗ … mock "fetchCart"` contract-warning lines with failing tests (same glyph, similar position).
17+
18+
## Problem (real session)
19+
20+
While running a long suite headless via `npm run test:ci`, the consuming agent re-ran the suite ~5 times trying to confirm "did all tests pass?" because:
21+
22+
1. No final `Tests: N passed, M failed, K skipped` line exists.
23+
2. The yellow `✗` glyph used for *mock contract validation failures* looks identical to a failed test marker.
24+
3. ANSI color codes broke naive `grep "✓ should"` patterns, so attempts to count from the log returned 0.
25+
26+
Each re-run was ~1:23, so the cost of "I can't tell if it passed" was ~7 minutes of wall time.
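
Item 3 is easy to reproduce. The sketch below is illustrative only: the sample log lines are made up, `stripAnsi` and `countMatches` are hypothetical helpers, and it assumes the runner wraps just the glyph in color codes. Under that assumption the reset sequence sits between the glyph and the test name, so a literal `✓ should` search finds nothing until the escape codes are removed.

```javascript
// Hypothetical helper: removes ANSI SGR color sequences such as "\x1b[32m".
const stripAnsi = (s) => s.replace(/\x1b\[[0-9;]*m/g, '');

// Made-up sample log. Assumption: only the glyph is wrapped in color codes.
const rawLog = [
  '  \x1b[32m✓\x1b[0m should render',
  '  \x1b[32m✓\x1b[0m should submit form',
  '  \x1b[31m✗\x1b[0m should show error',
].join('\n');

// Counts the lines that contain the literal needle, like `grep -c`.
const countMatches = (text, needle) =>
  text.split('\n').filter((line) => line.includes(needle)).length;

// Raw log: "\x1b[0m" separates the glyph from " should", so nothing matches.
const naiveCount = countMatches(rawLog, '✓ should');
// Stripped log: the glyph and the test name are adjacent again.
const strippedCount = countMatches(stripAnsi(rawLog), '✓ should');

console.log({ naiveCount, strippedCount });
```

A stable, uncolored label removes the need for this kind of preprocessing on the consumer side.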
27+
28+
## Scope
29+
30+
**In scope:**
31+
- A final, single-line test summary printed after all tests complete.
32+
- Visual disambiguation between *test result* lines and *mock contract validation* lines.
33+
- A machine-friendly summary line (stable format, easy to grep without ANSI gymnastics).
34+
35+
**Out of scope:**
36+
- Changing the per-test output format itself.
37+
- Reworking the mock-validation summary line (the line that exists today is fine — it just needs to not be the *only* summary).
38+
- A `--summary` / quiet reporter mode — deferred to a follow-up.
39+
- JUnit XML / JSON reporter output — deferred to a follow-up.
40+
41+
## Proposed Solution
42+
43+
### 1. Add a final test summary line
44+
45+
After all tests finish (and after the mock-validation summary), print:
46+
47+
```
48+
Tests: 74 passed, 0 failed, 0 skipped (74 total) in 1:23.193
49+
```
50+
51+
Format requirements:
52+
- One line.
53+
- Stable label `Tests:` at the start so it's grep-friendly.
54+
- Colors only on the count digits (green for passed, red for failed if > 0, yellow for skipped if > 0). The label `Tests:` and the words `passed` / `failed` / `skipped` stay uncolored so `grep "^Tests:"` works regardless of ANSI handling.
55+
- Duration in the same `m:ss.SSS` format the runner shows today.
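
To show what these format requirements buy a consumer, here is a minimal sketch of a parser for the proposed line. It is not part of twd-cli: `parseTestSummary`, `SUMMARY_RE`, and the sample output are hypothetical, and the regex assumes the exact wording shown in the example above.

```javascript
// Hypothetical consumer-side parser; not part of twd-cli.
const stripAnsi = (s) => s.replace(/\x1b\[[0-9;]*m/g, '');

// Assumes the exact wording of the proposed summary line.
const SUMMARY_RE =
  /^Tests: (\d+) passed, (\d+) failed, (\d+) skipped \((\d+) total\) in (\d+:\d{2}\.\d{3})$/;

function parseTestSummary(output) {
  for (const rawLine of output.split('\n')) {
    // Color codes may wrap the digits, so strip them before matching.
    const match = SUMMARY_RE.exec(stripAnsi(rawLine));
    if (match) {
      const [, passed, failed, skipped, total, duration] = match;
      return {
        passed: Number(passed),
        failed: Number(failed),
        skipped: Number(skipped),
        total: Number(total),
        duration,
      };
    }
  }
  return null; // no summary line found
}

// Made-up tail of a run, with the passed count colored green.
const sample = [
  'Mocks validated: 128 | Errors: 7 | Warnings: 0 | Skipped: 80',
  'Tests: \x1b[32m74\x1b[0m passed, 0 failed, 0 skipped (74 total) in 1:23.193',
].join('\n');

const parsedSummary = parseTestSummary(sample);
console.log(parsedSummary);
```

Because the label and the words are never colored, a consumer that only needs a yes/no answer can skip the parser and match on `^Tests:` plus `0 failed`.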
56+
57+
**Duration source.** Today `src/index.js` uses `console.time('Total Test Time')` / `console.timeEnd(...)` to print `Total Test Time: 1:23.193` as its own line. That call's output is not capturable as a value. Replace it with a manual `Date.now()` delta captured around the same span (start before `page.goto`, end after `runner.runAll()` returns), formatted to the same `m:ss.SSS` string. The standalone `Total Test Time:` line is removed; the duration appears only on the `Tests:` line. This keeps the log to one canonical timing line.
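
A minimal sketch of the replacement, assuming nothing about the real `runTests` internals: `measure` and `toMinSecMillis` are hypothetical names, and the injectable `now` parameter exists only so the example is deterministic. The point is that a `Date.now()` delta yields a value that can be formatted and placed on the `Tests:` line, which `console.timeEnd` cannot provide.

```javascript
// Hypothetical formatter for the m:ss.SSS shape described above.
function toMinSecMillis(ms) {
  const whole = Math.max(0, Math.round(ms)); // clamp negatives, drop fractions
  const minutes = Math.floor(whole / 60000);
  const seconds = Math.floor((whole % 60000) / 1000);
  const millis = whole % 1000;
  return `${minutes}:${String(seconds).padStart(2, '0')}.${String(millis).padStart(3, '0')}`;
}

// Hypothetical wrapper: measures an async span with a clock delta and
// returns the elapsed time as a value instead of printing it.
async function measure(work, now = Date.now) {
  const startedAt = now();
  const result = await work();
  return { result, durationMs: now() - startedAt };
}

// Fake clock so the example is repeatable: start at 1000 ms, end at 84193 ms.
const fakeTicks = [1_000, 84_193];
measure(async () => 'done', () => fakeTicks.shift()).then(({ durationMs }) => {
  console.log(`Tests: ... in ${toMinSecMillis(durationMs)}`);
});
```

Passing the clock in as a parameter is a design choice for testability only; calling `Date.now()` directly, as the paragraph above describes, works the same way at runtime.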
58+
59+
When there are failures, also print a `Failed tests:` block with just the test names (no stack traces — those already appear inline above), so the developer can see the names at the end of the log without scrolling.
60+
61+
### 2. Disambiguate mock-validation lines from test result lines
62+
63+
The current mock contract output (`src/contractReport.js`) uses `✓` for passing mocks, `✗` for failing ones, and `⚠` for warnings. The `✗` glyph collides visually with the `✗` used for failed tests in the suite tree printed by `reportResults` (`twd-js/runner-ci`). Color helps in warn-mode contract failures (yellow) but not in error-mode (red, the same as test failures), and color is fragile under `grep`/CI log viewers.
64+
65+
**Decision:** add a `MOCK ` prefix to every line that comes out of `contractReport.js`. The existing glyph assignments stay (`✓` pass, `✗` fail, `⚠` warning); they are correct *within* the contract report, and the prefix is what distinguishes contract lines from test-result lines.
66+
67+
Example before:
68+
```
69+
✗ GET /v1/carts/{cart_id} (200) — mock "fetchCart" — in "Checkout New — Redis ID Flow > ..."
70+
```
71+
72+
Example after:
73+
```
74+
MOCK ✗ GET /v1/carts/{cart_id} (200) — mock "fetchCart" — in "Checkout New — Redis ID Flow > ..."
75+
```
76+
77+
Apply the prefix uniformly to all four line kinds the report can emit: pass (`✓`), fail (`✗`), warning (`⚠`), and skipped (`ℹ`). Indentation already exists; the prefix sits between the indentation and the glyph.
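
The placement rule can be sketched as a small helper. This is an illustration only: `withMockPrefix` is a hypothetical function, and an implementation could just as well edit the template strings in `contractReport.js` directly rather than post-process lines.

```javascript
// Hypothetical helper, not part of contractReport.js: inserts "MOCK "
// between a line's leading whitespace and the rest of the line.
function withMockPrefix(line) {
  const indent = line.match(/^\s*/)[0]; // always matches, possibly empty
  return `${indent}MOCK ${line.slice(indent.length)}`;
}

// Made-up contract line with two spaces of indentation.
const before = '  ✗ GET /v1/carts/{cart_id} (200)';
const after = withMockPrefix(before);

console.log(after);
```

The indentation is preserved as-is, so nested report lines keep their alignment and only the first visible token changes.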
78+
79+
## Exit Code Behavior
80+
81+
No change. Exit code already reflects test failures plus `mode: "error"` contract failures (`src/index.js:101,119`).
82+
83+
**Interplay with the `Tests:` line.** The new `Tests:` summary counts test outcomes *only* (pass/fail/skip from `testStatus`). A run can legitimately exit non-zero while `Tests:` reads `0 failed` — that means every test passed but at least one mock failed contract validation in `error` mode. The mock summary line (`Mocks validated: … | Errors: N | …`) and the contract report block above it are the canonical place to see contract failures; the `Tests:` line is not retroactively edited to fold them in.
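
The case described above can be made concrete with a small sketch. The field names mirror the shapes visible elsewhere in this commit (`status`, `mode`, `validation.valid`), but `summarizeRun` and the way it combines the two signals are assumptions, not a copy of `src/index.js`.

```javascript
// Hypothetical sketch: the combination logic is an assumption, not
// copied from src/index.js.
function summarizeRun({ testStatus, contractResults }) {
  const failedTests = testStatus.filter((t) => t.status === 'fail').length;
  // Only contract failures in "error" mode are assumed to affect the exit code.
  const hasContractErrors = contractResults.some(
    (r) => r.mode === 'error' && !r.validation.valid,
  );
  return {
    failedTests,
    hasContractErrors,
    exitCode: failedTests > 0 || hasContractErrors ? 1 : 0,
  };
}

// Every test passes, but one mock fails validation in error mode.
const run = summarizeRun({
  testStatus: [
    { id: '1', status: 'pass' },
    { id: '2', status: 'pass' },
  ],
  contractResults: [
    { mode: 'error', validation: { valid: false } },
    { mode: 'warn', validation: { valid: false } },
  ],
});

console.log(run); // failedTests is 0, yet exitCode is 1
```

A consumer that needs the full verdict should therefore read the exit code (or both summary lines), not the `Tests:` line alone.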
84+
85+
## Testing Strategy
86+
87+
- Unit test the summary formatter directly: given a `testStatus` array with a known mix (e.g. 3 pass, 1 fail, 1 skip) and a duration value, assert the `Tests:` line matches the expected format. Keep this layer pure (no Puppeteer) so the format is easy to lock down.
88+
- Unit test the failed-tests block: given a `testStatus` array with two failures and a `handlers` array, assert both names appear under `Failed tests:` in the order the suite produced them.
89+
- Extend the existing `contractReport.test.js` to assert every emitted line starts with `MOCK ` (after any leading whitespace). Cover all four line kinds: pass, fail, warning, skipped.
90+
- Verify `grep "^Tests:"` against a raw run (ANSI included) returns exactly one line — i.e. the label is not wrapped in escape sequences. (The count digits themselves may carry color codes; the label must not.)
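
The last check can also be expressed without a shell. The sketch below is a hypothetical stand-in for `grep "^Tests:"`: `grepTestsLabel` and both sample outputs are made up, and the second sample is a deliberate counter-example showing what happens if the label itself is colored.

```javascript
// Equivalent of `grep "^Tests:"` on raw output; no ANSI stripping on purpose.
const grepTestsLabel = (raw) =>
  raw.split('\n').filter((line) => line.startsWith('Tests:'));

const green = (s) => `\x1b[32m${s}\x1b[0m`;

// Label left plain, digits colored: the shape the spec asks for.
const goodRun = [
  'Browser closed.',
  '',
  `Tests: ${green(3)} passed, 0 failed, 0 skipped (3 total) in 0:01.250`,
].join('\n');

// Counter-example: coloring the label puts an escape sequence at column 0.
const badRun = [
  'Browser closed.',
  '',
  `${green('Tests:')} 3 passed, 0 failed, 0 skipped (3 total) in 0:01.250`,
].join('\n');

const goodMatches = grepTestsLabel(goodRun).length;
const badMatches = grepTestsLabel(badRun).length;

console.log({ goodMatches, badMatches });
```

Asserting on the raw string, rather than on ANSI-stripped output, is what guards the requirement that the label stays uncolored.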
91+
92+
## Benefits
93+
94+
- **Faster developer feedback:** one line at the end answers "did it pass?" — no scrolling, no counting.
95+
- **AI-agent friendly:** stable, grep-able summary line. Avoids re-running long suites just to confirm a result.
96+
- **Less confusion between mocks and tests:** the `MOCK ` prefix removes the "is that a test failure or a mock warning?" question.
97+
98+
## Notes / Open Questions
99+
100+
- Should the failed-test block at the end include the file path + line number for each failure, or just the test name? (Stack traces already appear inline above.) Default for the implementation plan: **just the test name**, mirroring what the per-test line shows. Revisit if it proves too thin.
101+
102+
## Follow-up Work (Out of Scope Here)
103+
104+
- **`--summary` / quiet reporter.** A mode that suppresses per-request mock log lines (which dominate output for large suites) and prints only RUN/PASS/FAIL per test, the `Tests:` line, the mock-validation summary line, and the contract report path. Likely shaped as a `twd.config.json` field (`reporter: "summary"`) for consistency with how other twd-cli behavior is configured, not a CLI flag.
105+
- **`--json` reporter** for CI dashboards. The summary-line work in this spec makes this trivial later.

src/contractReport.js

Lines changed: 4 additions & 4 deletions
@@ -47,7 +47,7 @@ export function printContractReport(output) {
4747

4848
if (!result.validation.valid) {
4949
errorCount += result.validation.errors.length;
50-
console.log(failColor(` ✗ ${result.method} ${result.matchedPath} (${result.status}) — ${formatMockLabel(result)}`));
50+
console.log(failColor(` MOCK ✗ ${result.method} ${result.matchedPath} (${result.status}) — ${formatMockLabel(result)}`));
5151
for (const err of result.validation.errors) {
5252
console.log(detailColor(` → ${err.path}: ${err.message}`));
5353
}
@@ -56,12 +56,12 @@ export function printContractReport(output) {
5656
hasContractErrors = true;
5757
}
5858
} else if (result.validation.warnings.length === 0) {
59-
console.log(green(` ✓ ${result.method} ${result.matchedPath} (${result.status}) — ${formatMockLabel(result)}`));
59+
console.log(green(` MOCK ✓ ${result.method} ${result.matchedPath} (${result.status}) — ${formatMockLabel(result)}`));
6060
}
6161

6262
for (const warning of result.validation.warnings) {
6363
warningCount++;
64-
console.log(yellow(` ⚠ ${result.method} ${result.matchedPath} (${result.status}) — ${formatMockLabel(result)}`));
64+
console.log(yellow(` MOCK ⚠ ${result.method} ${result.matchedPath} (${result.status}) — ${formatMockLabel(result)}`));
6565
console.log(yellow(` ${warning.message}`));
6666
console.log('');
6767
}
@@ -71,7 +71,7 @@ export function printContractReport(output) {
7171
if (skipped.length > 0) {
7272
console.log(dim('Skipped:'));
7373
for (const skip of skipped) {
74-
console.log(dim(` ℹ "${skip.alias}" — ${skip.url}`));
74+
console.log(dim(` MOCK ℹ "${skip.alias}" — ${skip.url}`));
7575
console.log(dim(` ${skip.reason === 'urlRegex mock' ? 'Regex URL pattern' : 'No matching path in any spec'}`));
7676
}
7777
console.log('');

src/formatDuration.js

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+export function formatDuration(ms) {
+  const totalSeconds = Math.floor(ms / 1000);
+  const minutes = Math.floor(totalSeconds / 60);
+  const seconds = totalSeconds % 60;
+  const millis = ms % 1000;
+  return `${minutes}:${String(seconds).padStart(2, '0')}.${String(millis).padStart(3, '0')}`;
+}

src/index.js

Lines changed: 13 additions & 2 deletions
@@ -7,6 +7,7 @@ import { loadContracts, validateMocks } from './contracts.js';
77
import { printContractReport } from './contractReport.js';
88
import { generateContractMarkdown } from './contractMarkdown.js';
99
import { buildTestPath } from './buildTestPath.js';
10+
import { formatTestSummary, formatFailedTestsBlock } from './testSummary.js';
1011

1112
export async function runTests() {
1213
let browser;
@@ -29,7 +30,6 @@ export async function runTests() {
2930
});
3031

3132
const page = await browser.newPage();
32-
console.time('Total Test Time');
3333

3434
// Register mock collector for contract validation
3535
const collectedMocks = new Map();
@@ -46,6 +46,7 @@ export async function runTests() {
4646
}
4747

4848
// Navigate to your development server
49+
const startedAt = Date.now();
4950
console.log(`Navigating to ${config.url} ...`);
5051
await page.goto(config.url);
5152

@@ -80,6 +81,8 @@ export async function runTests() {
8081
return { handlers: Array.from(handlers.values()), testStatus };
8182
}, config.retryCount);
8283

84+
const durationMs = Date.now() - startedAt;
85+
8386
console.log(`Tests to report: ${testStatus.length}`);
8487

8588
// Display results in console
@@ -99,7 +102,6 @@ export async function runTests() {
99102

100103
// Exit with appropriate code
101104
let hasFailures = testStatus.some(test => test.status === 'fail');
102-
console.timeEnd('Total Test Time');
103105

104106
// Enrich collected mocks with full test path names
105107
for (const [, mock] of collectedMocks) {
@@ -158,6 +160,15 @@ export async function runTests() {
158160
await browser.close();
159161
console.log('Browser closed.');
160162

163+
console.log('');
164+
console.log(formatTestSummary({ testStatus, durationMs }));
165+
const failedBlock = formatFailedTestsBlock({ testStatus, handlers });
166+
if (failedBlock) {
167+
for (const line of failedBlock.split('\n')) {
168+
console.log(line);
169+
}
170+
}
171+
161172
return hasFailures;
162173

163174
} catch (error) {

src/testSummary.js

Lines changed: 32 additions & 0 deletions
@@ -0,0 +1,32 @@
+import { formatDuration } from './formatDuration.js';
+
+const green = (s) => `\x1b[32m${s}\x1b[0m`;
+const red = (s) => `\x1b[31m${s}\x1b[0m`;
+const yellow = (s) => `\x1b[33m${s}\x1b[0m`;
+
+export function formatTestSummary({ testStatus, durationMs }) {
+  const passed = testStatus.filter((t) => t.status === 'pass').length;
+  const failed = testStatus.filter((t) => t.status === 'fail').length;
+  const skipped = testStatus.filter((t) => t.status === 'skip').length;
+  const total = testStatus.length;
+
+  const passedStr = `${green(passed)} passed`;
+  const failedStr = `${failed > 0 ? red(failed) : '0'} failed`;
+  const skippedStr = `${skipped > 0 ? yellow(skipped) : '0'} skipped`;
+
+  return `Tests: ${passedStr}, ${failedStr}, ${skippedStr} (${total} total) in ${formatDuration(durationMs)}`;
+}
+
+export function formatFailedTestsBlock({ testStatus, handlers }) {
+  const failures = testStatus.filter((t) => t.status === 'fail');
+  if (failures.length === 0) return null;
+
+  const handlersById = new Map(handlers.map((h) => [h.id, h]));
+  const lines = ['Failed tests:'];
+  for (const failure of failures) {
+    const handler = handlersById.get(failure.id);
+    const name = handler ? handler.name : failure.id;
+    lines.push(` ${red('✗')} ${name}`);
+  }
+  return lines.join('\n');
+}

tests/contractReport.test.js

Lines changed: 60 additions & 0 deletions
@@ -210,6 +210,66 @@ describe('printContractReport', () => {
210210
expect(logs).toContain('mock "getPets" — in "Cart > should load items"');
211211
});
212212

213+
it('prefixes every glyph-led line with MOCK ', () => {
214+
const output = {
215+
results: [
216+
// pass
217+
{
218+
alias: 'getPets',
219+
url: '/api/v1/pets',
220+
method: 'GET',
221+
status: 200,
222+
specSource: './openapi.json',
223+
matchedPath: '/v1/pets',
224+
mode: 'warn',
225+
validation: { valid: true, errors: [], warnings: [] },
226+
},
227+
// fail
228+
{
229+
alias: 'createPet',
230+
url: '/api/v1/pets',
231+
method: 'POST',
232+
status: 201,
233+
specSource: './openapi.json',
234+
matchedPath: '/v1/pets',
235+
mode: 'warn',
236+
validation: {
237+
valid: false,
238+
errors: [{ path: 'response.id', message: 'expected integer, got string', keyword: 'type' }],
239+
warnings: [],
240+
},
241+
},
242+
// warning
243+
{
244+
alias: 'serverError',
245+
url: '/api/v1/pets',
246+
method: 'GET',
247+
status: 500,
248+
specSource: './openapi.json',
249+
matchedPath: '/v1/pets',
250+
mode: 'warn',
251+
validation: {
252+
valid: true,
253+
errors: [],
254+
warnings: [{ type: 'UNMATCHED_STATUS', message: 'Status 500 not documented' }],
255+
},
256+
},
257+
],
258+
skipped: [
259+
{ alias: 'untracked', url: '/whatever', reason: 'No matching path in any spec' },
260+
],
261+
};
262+
263+
printContractReport(output);
264+
265+
const lines = consoleSpy.mock.calls.map((c) => stripAnsi(c[0]));
266+
const glyphLines = lines.filter((l) => /^\s*(MOCK\s+)?[✓✗⚠ℹ]/.test(l));
267+
expect(glyphLines.length).toBeGreaterThanOrEqual(4);
268+
for (const line of glyphLines) {
269+
expect(line).toMatch(/^\s*MOCK [✓✗⚠ℹ]/);
270+
}
271+
});
272+
213273
it('prints occurrence suffix when occurrence > 1', () => {
214274
const output = {
215275
results: [

tests/formatDuration.test.js

Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
+import { describe, it, expect } from 'vitest';
+import { formatDuration } from '../src/formatDuration.js';
+
+describe('formatDuration', () => {
+  it('formats zero as 0:00.000', () => {
+    expect(formatDuration(0)).toBe('0:00.000');
+  });
+
+  it('formats sub-second durations with leading zero minutes/seconds', () => {
+    expect(formatDuration(123)).toBe('0:00.123');
+  });
+
+  it('formats single-digit seconds with a leading zero', () => {
+    expect(formatDuration(5_678)).toBe('0:05.678');
+  });
+
+  it('formats the spec example (83.193s) as 1:23.193', () => {
+    expect(formatDuration(83_193)).toBe('1:23.193');
+  });
+
+  it('formats a long duration past 10 minutes', () => {
+    expect(formatDuration(754_567)).toBe('12:34.567');
+  });
+
+  it('pads milliseconds to three digits', () => {
+    expect(formatDuration(60_007)).toBe('1:00.007');
+  });
+});

tests/runTests.test.js

Lines changed: 29 additions & 2 deletions
@@ -58,8 +58,6 @@ describe("runTests", () => {
5858
vi.clearAllMocks();
5959
vi.mocked(loadConfig).mockReturnValue({ ...defaultMockConfig });
6060
consoleSpy = vi.spyOn(console, 'log').mockImplementation(() => {});
61-
vi.spyOn(console, 'time').mockImplementation(() => {});
62-
vi.spyOn(console, 'timeEnd').mockImplementation(() => {});
6361
});
6462

6563
afterEach(() => {
@@ -244,4 +242,33 @@ describe("runTests", () => {
244242
expect(entries[0].alias).toBe('getPhoto');
245243
expect(entries[0].occurrence).toBe(1);
246244
});
245+
246+
it("should print the Tests: summary line and Failed tests block", async () => {
247+
const testStatus = [
248+
{ id: '1', status: 'pass' },
249+
{ id: '2', status: 'fail', error: 'boom' },
250+
{ id: '3', status: 'skip' },
251+
];
252+
const handlers = [
253+
{ id: '1', name: 'should render', type: 'test' },
254+
{ id: '2', name: 'should submit form', type: 'test' },
255+
{ id: '3', name: 'should show error', type: 'test' },
256+
];
257+
const page = createMockPage({ handlers, testStatus });
258+
const browser = createMockBrowser(page);
259+
vi.mocked(puppeteer.launch).mockResolvedValue(browser);
260+
261+
await runTests();
262+
263+
const stripAnsi = (s) => s.replace(/\x1b\[[0-9;]*m/g, '');
264+
const logs = consoleSpy.mock.calls.map((c) => stripAnsi(String(c[0])));
265+
266+
const summaryLine = logs.find((l) => l.startsWith('Tests:'));
267+
expect(summaryLine).toBeDefined();
268+
expect(summaryLine).toMatch(/^Tests: 1 passed, 1 failed, 1 skipped \(3 total\) in \d+:\d{2}\.\d{3}$/);
269+
270+
const failedHeader = logs.find((l) => l === 'Failed tests:');
271+
expect(failedHeader).toBeDefined();
272+
expect(logs.some((l) => l.includes('should submit form'))).toBe(true);
273+
});
247274
});
