feat(cli): add Tests: summary line, Failed tests block, and MOCK prefix on contract lines (#7)

kevinccbsg · claude · web-flow · commit a48ed545f99a · 2026-05-20T19:33:58.000+02:00
* docs: add spec for twd-cli test summary output

Spec proposes a final, grep-friendly `Tests: N passed, M failed, K
skipped` line, adds a `MOCK ` prefix to contract-validation lines to
disambiguate them from test-result glyphs, and replaces console.time
with a manual duration delta folded into the summary line.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: add implementation plan for test summary output

Six TDD tasks: formatDuration helper, formatTestSummary + Failed tests
block, MOCK prefix on contract lines, integration into runTests, and a
manual smoke test against test-example-app.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(cli): add formatDuration helper for m:ss.SSS output

* feat(cli): add formatTestSummary for grep-friendly Tests: line

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(cli): add formatFailedTestsBlock for end-of-log failure list

* feat(cli): prefix contract-validation lines with MOCK

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(cli): print Tests: summary line and Failed tests block

Wire formatTestSummary and formatFailedTestsBlock into the runTests
orchestrator, replacing console.time/timeEnd with a startedAt timestamp
and durationMs calculation.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(cli): print Tests: summary after browser close, remove stale time mocks

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
diff --git a/docs/superpowers/plans/2026-05-20-test-summary-output.md b/docs/superpowers/plans/2026-05-20-test-summary-output.md
diff --git a/docs/superpowers/specs/2026-05-20-test-summary-output.md b/docs/superpowers/specs/2026-05-20-test-summary-output.md
@@ -0,0 +1,105 @@
+# twd-cli Test Summary Output — Design Spec
+
+**Date:** 2026-05-20
+**Status:** Proposed
+
+## Purpose
+
+Make the final output of `twd-cli run` self-describing: at a glance, a developer (or an AI agent piping the output through `grep`) should be able to tell **how many tests passed, how many failed, how many were skipped** — without parsing per-test lines or running the suite again.
+
+Today the run ends with a mock-validation summary like:
+
+```
+Mocks validated: 128 | Errors: 7 | Warnings: 0 | Skipped: 80
+```
+
+That line is about *mocks*, not *tests*. There is no equivalent line for test results. Users reading the tail of the log have to scroll back and visually count `✓ should ...` lines, and they may confuse the yellow `✗ … mock "fetchCart"` contract-warning lines with failing tests (same glyph, similar position).
+
+## Problem (real session)
+
+While running a long suite headless via `npm run test:ci`, the consuming agent re-ran the suite ~5 times trying to confirm "did all tests pass?" because:
+
+1. No final `Tests: N passed, M failed, K skipped` line exists.
+2. The yellow `✗` glyph used for *mock contract validation failures* looks identical to a failed test marker.
+3. ANSI color codes broke naive `grep "✓ should"` patterns, so attempts to count from the log returned 0.
+
+Each re-run was ~1:23, so the cost of "I can't tell if it passed" was ~7 minutes of wall time.
+
+## Scope
+
+**In scope:**
+- A final, single-line test summary printed after all tests complete.
+- Visual disambiguation between *test result* lines and *mock contract validation* lines.
+- A machine-friendly summary line (stable format, easy to grep without ANSI gymnastics).
+
+**Out of scope:**
+- Changing the per-test output format itself.
+- Reworking the mock-validation summary line (the line that exists today is fine — it just needs to not be the *only* summary).
+- A `--summary` / quiet reporter mode — deferred to a follow-up.
+- JUnit XML / JSON reporter output — deferred to a follow-up.
+
+## Proposed Solution
+
+### 1. Add a final test summary line
+
+After all tests finish (and after the mock-validation summary), print:
+
+```
+Tests: 74 passed, 0 failed, 0 skipped (74 total) in 1:23.193
+```
+
+Format requirements:
+- One line.
+- Stable label `Tests:` at the start so it's grep-friendly.
+- Colors only on the count digits (green for passed, red for failed if > 0, yellow for skipped if > 0). The label `Tests:` and the words `passed` / `failed` / `skipped` stay uncolored so `grep "^Tests:"` works regardless of ANSI handling.
+- Duration in the same `m:ss.SSS` format the runner shows today.
+
+**Duration source.** Today `src/index.js` uses `console.time('Total Test Time')` / `console.timeEnd(...)` to print `Total Test Time: 1:23.193` as its own line. That call's output is not capturable as a value. Replace it with a manual `Date.now()` delta captured around the same span (start before `page.goto`, end after `runner.runAll()` returns), formatted to the same `m:ss.SSS` string. The standalone `Total Test Time:` line is removed; the duration appears only on the `Tests:` line. This keeps the log to one canonical timing line.
+
+When there are failures, also print a `Failed tests:` block with just the test names (no stack traces — those already appear inline above), so the developer can see the names at the end of the log without scrolling.
+
+### 2. Disambiguate mock-validation lines from test result lines
+
+The current mock contract output (`src/contractReport.js`) uses `✓` for passing mocks, `✗` for failing ones, and `⚠` for warnings. The `✗` glyph collides visually with the `✗` used for failed tests in the suite tree printed by `reportResults` (`twd-js/runner-ci`). Color helps in warn-mode contract failures (yellow) but not in error-mode (red — same as test failures), and color is fragile under `grep`/CI log viewers.
+
+**Decision:** add a `MOCK ` prefix to every line that comes out of `contractReport.js`. The existing glyph assignments stay (`✓` pass, `✗` fail, `⚠` warning) — they are correct *within* the contract report; the prefix is what distinguishes contract lines from test-result lines.
+
+Example before:
+```
+  ✗ GET /v1/carts/{cart_id} (200) — mock "fetchCart" — in "Checkout New — Redis ID Flow > ..."
+```
+
+Example after:
+```
+  MOCK ✗ GET /v1/carts/{cart_id} (200) — mock "fetchCart" — in "Checkout New — Redis ID Flow > ..."
+```
+
+Apply the prefix uniformly to all four line kinds the report can emit: pass (`✓`), fail (`✗`), warning (`⚠`), and skipped (`ℹ`). Indentation already exists; the prefix sits between the indentation and the glyph.
+
+## Exit Code Behavior
+
+No change. Exit code already reflects test failures plus `mode: "error"` contract failures (`src/index.js:101,119`).
+
+**Interplay with the `Tests:` line.** The new `Tests:` summary counts test outcomes *only* (pass/fail/skip from `testStatus`). A run can legitimately exit non-zero while `Tests:` reads `0 failed` — that means every test passed but at least one mock failed contract validation in `error` mode. The mock summary line (`Mocks validated: … | Errors: N | …`) and the contract report block above it are the canonical place to see contract failures; the `Tests:` line is not retroactively edited to fold them in.
+
+## Testing Strategy
+
+- Unit test the summary formatter directly: given a `testStatus` array with a known mix (e.g. 3 pass, 1 fail, 1 skip) and a duration value, assert the `Tests:` line matches the expected format. Keep this layer pure (no Puppeteer) so the format is easy to lock down.
+- Unit test the failed-tests block: given a `testStatus` array with two failures and a `handlers` array, assert both names appear under `Failed tests:` in the order the suite produced them.
+- Extend the existing `contractReport.test.js` to assert every emitted line starts with `MOCK ` (after any leading whitespace). Cover all four line kinds: pass, fail, warning, skipped.
+- Verify `grep "^Tests:"` against a raw run (ANSI included) returns exactly one line — i.e. the label is not wrapped in escape sequences. (The count digits themselves may carry color codes; the label must not.)
+
+## Benefits
+
+- **Faster developer feedback:** one line at the end answers "did it pass?" — no scrolling, no counting.
+- **AI-agent friendly:** stable, grep-able summary line. Avoids re-running long suites just to confirm a result.
+- **Less confusion between mocks and tests:** the `MOCK ` prefix removes the "is that a test failure or a mock warning?" question.
+
+## Notes / Open Questions
+
+- Should the failed-test block at the end include the file path + line number for each failure, or just the test name? (Stack traces already appear inline above.) Default for the implementation plan: **just the test name**, mirroring what the per-test line shows. Revisit if it proves too thin.
+
+## Follow-up Work (Out of Scope Here)
+
+- **`--summary` / quiet reporter.** A mode that suppresses per-request mock log lines (which dominate output for large suites) and prints only RUN/PASS/FAIL per test, the `Tests:` line, the mock-validation summary line, and the contract report path. Likely shaped as a `twd.config.json` field (`reporter: "summary"`) for consistency with how other twd-cli behavior is configured, not a CLI flag.
+- **`--json` reporter** for CI dashboards. The summary-line work in this spec makes this trivial later.
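
Reviewer note on the "Exit Code Behavior" section of the spec: the case where the run exits non-zero while `Tests:` reads `0 failed` may be easier to see as a small sketch. `summarizeOutcome` is a hypothetical helper and is not part of `src/index.js`; it only restates the rule the spec describes, assuming contract failures in `error` mode reach the caller as a boolean similar to the `hasContractErrors` flag in `contractReport.js`.

```javascript
// Hypothetical helper: restates the spec's rule, not the actual src/index.js logic.
// Assumption: error-mode contract failures arrive as a single boolean.
function summarizeOutcome({ testStatus, hasContractErrors }) {
  // The Tests: line counts test outcomes only.
  const failed = testStatus.filter((t) => t.status === 'fail').length;
  // The exit status also folds in contract failures.
  const exitsNonZero = failed > 0 || hasContractErrors;
  return { failed, exitsNonZero };
}

// Every test passed, but one mock failed contract validation in error mode.
const outcome = summarizeOutcome({
  testStatus: [
    { id: '1', status: 'pass' },
    { id: '2', status: 'pass' },
  ],
  hasContractErrors: true,
});

console.log(outcome); // { failed: 0, exitsNonZero: true }
```

In this situation the `Mocks validated: … | Errors: N | …` line is where the failure shows up, which matches the spec's statement that the `Tests:` line is not edited to include contract errors.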
diff --git a/src/contractReport.js b/src/contractReport.js
@@ -47,7 +47,7 @@ export function printContractReport(output) {
 
       if (!result.validation.valid) {
         errorCount += result.validation.errors.length;
-        console.log(failColor(`  ✗ ${result.method} ${result.matchedPath} (${result.status}) — ${formatMockLabel(result)}`));
+        console.log(failColor(`  MOCK ✗ ${result.method} ${result.matchedPath} (${result.status}) — ${formatMockLabel(result)}`));
         for (const err of result.validation.errors) {
           console.log(detailColor(`    → ${err.path}: ${err.message}`));
         }
@@ -56,12 +56,12 @@ export function printContractReport(output) {
           hasContractErrors = true;
         }
       } else if (result.validation.warnings.length === 0) {
-        console.log(green(`  ✓ ${result.method} ${result.matchedPath} (${result.status}) — ${formatMockLabel(result)}`));
+        console.log(green(`  MOCK ✓ ${result.method} ${result.matchedPath} (${result.status}) — ${formatMockLabel(result)}`));
       }
 
       for (const warning of result.validation.warnings) {
         warningCount++;
-        console.log(yellow(`  ⚠ ${result.method} ${result.matchedPath} (${result.status}) — ${formatMockLabel(result)}`));
+        console.log(yellow(`  MOCK ⚠ ${result.method} ${result.matchedPath} (${result.status}) — ${formatMockLabel(result)}`));
         console.log(yellow(`    ${warning.message}`));
         console.log('');
       }
@@ -71,7 +71,7 @@ export function printContractReport(output) {
   if (skipped.length > 0) {
     console.log(dim('Skipped:'));
     for (const skip of skipped) {
-      console.log(dim(`  ℹ "${skip.alias}" — ${skip.url}`));
+      console.log(dim(`  MOCK ℹ "${skip.alias}" — ${skip.url}`));
       console.log(dim(`    ${skip.reason === 'urlRegex mock' ? 'Regex URL pattern' : 'No matching path in any spec'}`));
     }
     console.log('');
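
Reviewer note: with the `MOCK ` prefix in place, a log consumer can separate contract lines from test-result lines without relying on color. The sketch below is one way to do that. `classifyLine` is a hypothetical helper, not part of twd-cli, and the assumption that test-result lines start with optional indentation followed by `✓` or `✗` comes from the spec's description of the suite tree, not from the `twd-js/runner-ci` source.

```javascript
// Same ANSI-stripping pattern used in the runTests test added by this PR.
const stripAnsi = (s) => s.replace(/\x1b\[[0-9;]*m/g, '');

// Hypothetical classifier for a single log line.
function classifyLine(rawLine) {
  const line = stripAnsi(rawLine);
  // Contract report lines: indentation, the MOCK prefix, then a glyph.
  if (/^\s*MOCK [✓✗⚠ℹ]/.test(line)) return 'mock';
  // Assumed shape of test-result lines: indentation, then a pass/fail glyph.
  if (/^\s*[✓✗]/.test(line)) return 'test';
  return 'other';
}

console.log(classifyLine('\x1b[31m  MOCK ✗ GET /v1/carts/{cart_id} (200)\x1b[0m')); // mock
console.log(classifyLine('  ✗ should submit form')); // test
console.log(classifyLine('Browser closed.')); // other
```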
diff --git a/src/formatDuration.js b/src/formatDuration.js
@@ -0,0 +1,7 @@
+export function formatDuration(ms) {
+  const totalSeconds = Math.floor(ms / 1000);
+  const minutes = Math.floor(totalSeconds / 60);
+  const seconds = totalSeconds % 60;
+  const millis = ms % 1000;
+  return `${minutes}:${String(seconds).padStart(2, '0')}.${String(millis).padStart(3, '0')}`;
+}
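
Reviewer note: two behaviors of `formatDuration` that the tests in this PR do not cover are worth spelling out. Minutes are not rolled over into hours, and a fractional input would leak a decimal into the millisecond field. The `Date.now()` delta used in `src/index.js` is always an integer, so the second point only matters if the timing source changes later. The sketch copies the function without `export` so it runs standalone; `formatDurationRounded` is a hypothetical wrapper, not something this PR adds.

```javascript
// Copy of the function added in src/formatDuration.js, minus `export`.
function formatDuration(ms) {
  const totalSeconds = Math.floor(ms / 1000);
  const minutes = Math.floor(totalSeconds / 60);
  const seconds = totalSeconds % 60;
  const millis = ms % 1000;
  return `${minutes}:${String(seconds).padStart(2, '0')}.${String(millis).padStart(3, '0')}`;
}

// Hypothetical wrapper: round first so fractional inputs keep the m:ss.SSS shape.
const formatDurationRounded = (ms) => formatDuration(Math.round(ms));

console.log(formatDuration(3_600_000)); // 60:00.000 (an hour is shown as 60 minutes)
console.log(formatDuration(1234.5)); // 0:01.234.5 (fractional input breaks the shape)
console.log(formatDurationRounded(1234.5)); // 0:01.235
```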
diff --git a/src/index.js b/src/index.js
@@ -7,6 +7,7 @@ import { loadContracts, validateMocks } from './contracts.js';
 import { printContractReport } from './contractReport.js';
 import { generateContractMarkdown } from './contractMarkdown.js';
 import { buildTestPath } from './buildTestPath.js';
+import { formatTestSummary, formatFailedTestsBlock } from './testSummary.js';
 
 export async function runTests() {
   let browser;
@@ -29,7 +30,6 @@ export async function runTests() {
     });
 
     const page = await browser.newPage();
-    console.time('Total Test Time');
 
     // Register mock collector for contract validation
     const collectedMocks = new Map();
@@ -46,6 +46,7 @@ export async function runTests() {
     }
 
     // Navigate to your development server
+    const startedAt = Date.now();
     console.log(`Navigating to ${config.url} ...`);
     await page.goto(config.url);
 
@@ -80,6 +81,8 @@ export async function runTests() {
       return { handlers: Array.from(handlers.values()), testStatus };
     }, config.retryCount);
 
+    const durationMs = Date.now() - startedAt;
+
     console.log(`Tests to report: ${testStatus.length}`);
 
     // Display results in console
@@ -99,7 +102,6 @@ export async function runTests() {
 
     // Exit with appropriate code
     let hasFailures = testStatus.some(test => test.status === 'fail');
-    console.timeEnd('Total Test Time');
 
     // Enrich collected mocks with full test path names
     for (const [, mock] of collectedMocks) {
@@ -158,6 +160,15 @@ export async function runTests() {
     await browser.close();
     console.log('Browser closed.');
 
+    console.log('');
+    console.log(formatTestSummary({ testStatus, durationMs }));
+    const failedBlock = formatFailedTestsBlock({ testStatus, handlers });
+    if (failedBlock) {
+      for (const line of failedBlock.split('\n')) {
+        console.log(line);
+      }
+    }
+
     return hasFailures;
 
   } catch (error) {
diff --git a/src/testSummary.js b/src/testSummary.js
@@ -0,0 +1,32 @@
+import { formatDuration } from './formatDuration.js';
+
+const green = (s) => `\x1b[32m${s}\x1b[0m`;
+const red = (s) => `\x1b[31m${s}\x1b[0m`;
+const yellow = (s) => `\x1b[33m${s}\x1b[0m`;
+
+export function formatTestSummary({ testStatus, durationMs }) {
+  const passed = testStatus.filter((t) => t.status === 'pass').length;
+  const failed = testStatus.filter((t) => t.status === 'fail').length;
+  const skipped = testStatus.filter((t) => t.status === 'skip').length;
+  const total = testStatus.length;
+
+  const passedStr = `${green(passed)} passed`;
+  const failedStr = `${failed > 0 ? red(failed) : '0'} failed`;
+  const skippedStr = `${skipped > 0 ? yellow(skipped) : '0'} skipped`;
+
+  return `Tests: ${passedStr}, ${failedStr}, ${skippedStr} (${total} total) in ${formatDuration(durationMs)}`;
+}
+
+export function formatFailedTestsBlock({ testStatus, handlers }) {
+  const failures = testStatus.filter((t) => t.status === 'fail');
+  if (failures.length === 0) return null;
+
+  const handlersById = new Map(handlers.map((h) => [h.id, h]));
+  const lines = ['Failed tests:'];
+  for (const failure of failures) {
+    const handler = handlersById.get(failure.id);
+    const name = handler ? handler.name : failure.id;
+    lines.push(`  ${red('✗')} ${name}`);
+  }
+  return lines.join('\n');
+}
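
Reviewer note: since the stated goal is a summary that tools can consume, here is a minimal sketch of how a consumer might read the `Tests:` line back into numbers. `parseTestsLine` is a hypothetical helper, not part of this PR. The regular expression is derived from the template string in `formatTestSummary` and accepts any run of whitespace after the label.

```javascript
// Same ANSI-stripping pattern used in the runTests test added by this PR.
const stripAnsi = (s) => s.replace(/\x1b\[[0-9;]*m/g, '');

// Hypothetical consumer-side parser for the summary line.
function parseTestsLine(rawLine) {
  const match = stripAnsi(rawLine).match(
    /^Tests:\s+(\d+) passed, (\d+) failed, (\d+) skipped \((\d+) total\) in (\d+:\d{2}\.\d{3})$/
  );
  if (!match) return null; // not a Tests: summary line
  const [, passed, failed, skipped, total, duration] = match;
  return {
    passed: Number(passed),
    failed: Number(failed),
    skipped: Number(skipped),
    total: Number(total),
    duration, // kept as the m:ss.SSS string
  };
}

// Sample line with the color codes formatTestSummary would emit for 3/1/1.
const sample =
  'Tests: \x1b[32m3\x1b[0m passed, \x1b[31m1\x1b[0m failed, \x1b[33m1\x1b[0m skipped (5 total) in 0:01.234';
const parsed = parseTestsLine(sample);
console.log(parsed);
```

Because the label itself carries no escape sequences, the `^Tests:` anchor matches on the raw line too; stripping ANSI is only needed to read the colored digits.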
diff --git a/tests/contractReport.test.js b/tests/contractReport.test.js
@@ -210,6 +210,66 @@ describe('printContractReport', () => {
     expect(logs).toContain('mock "getPets" — in "Cart > should load items"');
   });
 
+  it('prefixes every glyph-led line with MOCK ', () => {
+    const output = {
+      results: [
+        // pass
+        {
+          alias: 'getPets',
+          url: '/api/v1/pets',
+          method: 'GET',
+          status: 200,
+          specSource: './openapi.json',
+          matchedPath: '/v1/pets',
+          mode: 'warn',
+          validation: { valid: true, errors: [], warnings: [] },
+        },
+        // fail
+        {
+          alias: 'createPet',
+          url: '/api/v1/pets',
+          method: 'POST',
+          status: 201,
+          specSource: './openapi.json',
+          matchedPath: '/v1/pets',
+          mode: 'warn',
+          validation: {
+            valid: false,
+            errors: [{ path: 'response.id', message: 'expected integer, got string', keyword: 'type' }],
+            warnings: [],
+          },
+        },
+        // warning
+        {
+          alias: 'serverError',
+          url: '/api/v1/pets',
+          method: 'GET',
+          status: 500,
+          specSource: './openapi.json',
+          matchedPath: '/v1/pets',
+          mode: 'warn',
+          validation: {
+            valid: true,
+            errors: [],
+            warnings: [{ type: 'UNMATCHED_STATUS', message: 'Status 500 not documented' }],
+          },
+        },
+      ],
+      skipped: [
+        { alias: 'untracked', url: '/whatever', reason: 'No matching path in any spec' },
+      ],
+    };
+
+    printContractReport(output);
+
+    const lines = consoleSpy.mock.calls.map((c) => stripAnsi(c[0]));
+    const glyphLines = lines.filter((l) => /^\s*(MOCK\s+)?[✓✗⚠ℹ]/.test(l));
+    expect(glyphLines.length).toBeGreaterThanOrEqual(4);
+    for (const line of glyphLines) {
+      expect(line).toMatch(/^\s*MOCK [✓✗⚠ℹ]/);
+    }
+  });
+
   it('prints occurrence suffix when occurrence > 1', () => {
     const output = {
       results: [
diff --git a/tests/formatDuration.test.js b/tests/formatDuration.test.js
@@ -0,0 +1,28 @@
+import { describe, it, expect } from 'vitest';
+import { formatDuration } from '../src/formatDuration.js';
+
+describe('formatDuration', () => {
+  it('formats zero as 0:00.000', () => {
+    expect(formatDuration(0)).toBe('0:00.000');
+  });
+
+  it('formats sub-second durations with leading zero minutes/seconds', () => {
+    expect(formatDuration(123)).toBe('0:00.123');
+  });
+
+  it('formats single-digit seconds with a leading zero', () => {
+    expect(formatDuration(5_678)).toBe('0:05.678');
+  });
+
+  it('formats the spec example (83.193s) as 1:23.193', () => {
+    expect(formatDuration(83_193)).toBe('1:23.193');
+  });
+
+  it('formats a long duration past 10 minutes', () => {
+    expect(formatDuration(754_567)).toBe('12:34.567');
+  });
+
+  it('pads milliseconds to three digits', () => {
+    expect(formatDuration(60_007)).toBe('1:00.007');
+  });
+});
diff --git a/tests/runTests.test.js b/tests/runTests.test.js
@@ -58,8 +58,6 @@ describe("runTests", () => {
     vi.clearAllMocks();
     vi.mocked(loadConfig).mockReturnValue({ ...defaultMockConfig });
     consoleSpy = vi.spyOn(console, 'log').mockImplementation(() => {});
-    vi.spyOn(console, 'time').mockImplementation(() => {});
-    vi.spyOn(console, 'timeEnd').mockImplementation(() => {});
   });
 
   afterEach(() => {
@@ -244,4 +242,33 @@ describe("runTests", () => {
     expect(entries[0].alias).toBe('getPhoto');
     expect(entries[0].occurrence).toBe(1);
   });
+
+  it("should print the Tests: summary line and Failed tests block", async () => {
+    const testStatus = [
+      { id: '1', status: 'pass' },
+      { id: '2', status: 'fail', error: 'boom' },
+      { id: '3', status: 'skip' },
+    ];
+    const handlers = [
+      { id: '1', name: 'should render', type: 'test' },
+      { id: '2', name: 'should submit form', type: 'test' },
+      { id: '3', name: 'should show error', type: 'test' },
+    ];
+    const page = createMockPage({ handlers, testStatus });
+    const browser = createMockBrowser(page);
+    vi.mocked(puppeteer.launch).mockResolvedValue(browser);
+
+    await runTests();
+
+    const stripAnsi = (s) => s.replace(/\x1b\[[0-9;]*m/g, '');
+    const logs = consoleSpy.mock.calls.map((c) => stripAnsi(String(c[0])));
+
+    const summaryLine = logs.find((l) => l.startsWith('Tests:'));
+    expect(summaryLine).toBeDefined();
+    expect(summaryLine).toMatch(/^Tests: 1 passed, 1 failed, 1 skipped \(3 total\) in \d+:\d{2}\.\d{3}$/);
+
+    const failedHeader = logs.find((l) => l === 'Failed tests:');
+    expect(failedHeader).toBeDefined();
+    expect(logs.some((l) => l.includes('should submit form'))).toBe(true);
+  });
 });
diff --git a/tests/testSummary.test.js b/tests/testSummary.test.js

Original file line number	Diff line number	Diff line change
`@@ -47,7 +47,7 @@ export function printContractReport(output) {`
`47`	`47`
`48`	`48`	`if (!result.validation.valid) {`
`49`	`49`	`errorCount += result.validation.errors.length;`
`50`		- console.log(failColor(` ✗ ${result.method} ${result.matchedPath} (${result.status}) — ${formatMockLabel(result)}`));
	`50`	+ console.log(failColor(` MOCK ✗ ${result.method} ${result.matchedPath} (${result.status}) — ${formatMockLabel(result)}`));
`51`	`51`	`for (const err of result.validation.errors) {`
`52`	`52`	console.log(detailColor(` → ${err.path}: ${err.message}`));
`53`	`53`	`}`
`@@ -56,12 +56,12 @@ export function printContractReport(output) {`
`56`	`56`	`hasContractErrors = true;`
`57`	`57`	`}`
`58`	`58`	`} else if (result.validation.warnings.length === 0) {`
`59`		- console.log(green(` ✓ ${result.method} ${result.matchedPath} (${result.status}) — ${formatMockLabel(result)}`));
	`59`	+ console.log(green(` MOCK ✓ ${result.method} ${result.matchedPath} (${result.status}) — ${formatMockLabel(result)}`));
`60`	`60`	`}`
`61`	`61`
`62`	`62`	`for (const warning of result.validation.warnings) {`
`63`	`63`	`warningCount++;`
`64`		- console.log(yellow(` ⚠ ${result.method} ${result.matchedPath} (${result.status}) — ${formatMockLabel(result)}`));
	`64`	+ console.log(yellow(` MOCK ⚠ ${result.method} ${result.matchedPath} (${result.status}) — ${formatMockLabel(result)}`));
`65`	`65`	console.log(yellow(` ${warning.message}`));
`66`	`66`	`console.log('');`
`67`	`67`	`}`
`@@ -71,7 +71,7 @@ export function printContractReport(output) {`
`71`	`71`	`if (skipped.length > 0) {`
`72`	`72`	`console.log(dim('Skipped:'));`
`73`	`73`	`for (const skip of skipped) {`
`74`		- console.log(dim(` ℹ "${skip.alias}" — ${skip.url}`));
	`74`	+ console.log(dim(` MOCK ℹ "${skip.alias}" — ${skip.url}`));
`75`	`75`	console.log(dim(` ${skip.reason === 'urlRegex mock' ? 'Regex URL pattern' : 'No matching path in any spec'}`));
`76`	`76`	`}`
`77`	`77`	`console.log('');`