refactor(12/12): add snapshot test fixtures and benchmarks (#330)
This is **PR 12 of 12**, the final PR in the stacked series that decouples the rendering pipeline from MCP transport. Depends on PR 11.
Adds the snapshot test suite and performance benchmarks that validate the entire rendering pipeline end-to-end. The diff is large in line count, but almost all of it is test fixtures (expected output files) and benchmark scripts.
The snapshot test infrastructure captures the rendered output of tool invocations and compares it against expected fixtures. This provides regression protection for the rendering pipeline -- any change to event formatting, diagnostic grouping, or output ordering will be caught by a fixture mismatch.
**Test harness** (`src/snapshot-tests/`):
- `harness.ts`: Core test runner that invokes tools with mock executors and captures rendered output
- `fixture-io.ts`: Reads/writes fixture files, handles normalization (timestamps, paths, UUIDs)
- `flowdeck-fixture-io.ts`: Flowdeck-specific fixture handling
- `normalize.ts`: Output normalization for stable comparisons across environments
- `resource-harness.ts`: Resource-specific snapshot testing
**Fixtures**: Expected output files for each tool covering success, error, and edge case scenarios. These serve as living documentation of what each tool's output looks like.
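For reviewers unfamiliar with the approach, the normalize-then-compare idea can be sketched roughly as follows. This is a minimal illustration, not the code in `normalize.ts` or `fixture-io.ts`: the function names (`normalizeOutput`, `compareWithFixture`), the placeholder tokens (`<HOME>`, `<TIMESTAMP>`, `<UUID>`), and the regular expressions are all assumptions made for the example.

```typescript
// Hypothetical sketch: names, placeholders, and patterns are illustrative only
// and are not taken from the harness in this PR.

/** Replace environment-specific values with stable placeholder tokens. */
function normalizeOutput(raw: string, homeDir: string): string {
  // ISO 8601 timestamps such as 2024-05-01T12:34:56.789Z
  const isoTimestamp =
    /\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?(?:Z|[+-]\d{2}:\d{2})/g;
  // Canonical 8-4-4-4-12 hexadecimal UUIDs
  const uuid =
    /[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}/gi;

  // Skip path replacement when no home directory is supplied.
  const withPaths =
    homeDir.length > 0 ? raw.split(homeDir).join("<HOME>") : raw;

  return withPaths
    .replace(isoTimestamp, "<TIMESTAMP>")
    .replace(uuid, "<UUID>");
}

interface SnapshotResult {
  matches: boolean;
  /** 1-based line number of the first difference, or null when equal. */
  firstDiffLine: number | null;
}

/** Compare normalized output with the expected fixture, line by line. */
function compareWithFixture(actual: string, expected: string): SnapshotResult {
  const actualLines = actual.split("\n");
  const expectedLines = expected.split("\n");
  const lineCount = Math.max(actualLines.length, expectedLines.length);

  for (let i = 0; i < lineCount; i++) {
    if (actualLines[i] !== expectedLines[i]) {
      return { matches: false, firstDiffLine: i + 1 };
    }
  }
  return { matches: true, firstDiffLine: null };
}
```

Normalizing before comparison is what lets the same fixture pass on different machines and at different times; the comparison itself can then stay a plain line-by-line equality check.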
Performance benchmarks for the rendering pipeline and xcodebuild parsing:
- Parser throughput: lines/second for xcodebuild output parsing
- Render session performance: events/second for text and JSON strategies
- End-to-end tool invocation timing
These benchmarks establish baselines and can be run in CI to catch performance regressions.
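As a rough picture of what a throughput measurement of this kind looks like, here is a minimal timing-loop sketch. It is not one of the benchmark scripts in this PR: `measureThroughput`, `BenchResult`, and the stand-in `classifyLine` function are hypothetical names, and the real parser and render session would take the place of the stand-in work function.

```typescript
// Hypothetical sketch: a generic timing loop, not the PR's benchmark scripts.

interface BenchResult {
  iterations: number;
  elapsedMs: number;
  /** Items processed per second (Infinity if the clock did not advance). */
  perSecond: number;
}

/**
 * Run `work` once per item and report throughput.
 * The clock is injectable so a higher-resolution timer can be supplied;
 * it defaults to Date.now, which has millisecond resolution.
 */
function measureThroughput<T>(
  items: readonly T[],
  work: (item: T) => void,
  now: () => number = () => Date.now(),
): BenchResult {
  const start = now();
  for (const item of items) {
    work(item);
  }
  const elapsedMs = now() - start;
  const perSecond =
    elapsedMs > 0 ? (items.length / elapsedMs) * 1000 : Infinity;
  return { iterations: items.length, elapsedMs, perSecond };
}

// Stand-in for a real line parser (assumption for illustration only).
function classifyLine(line: string): "error" | "warning" | "other" {
  if (line.includes("error:")) return "error";
  if (line.includes("warning:")) return "warning";
  return "other";
}

// Example usage: measure lines/second over synthetic input.
const sampleLines: string[] = Array.from({ length: 10_000 }, (_, i) =>
  i % 100 === 0 ? "main.swift:1:1: warning: unused variable" : "CompileSwift normal arm64",
);
const result = measureThroughput(sampleLines, (line) => {
  classifyLine(line);
});
console.log(`parsed ${result.iterations} lines in ${result.elapsedMs} ms`);
```

The same helper could be pointed at events instead of lines to obtain an events/second figure for each render strategy. Absolute numbers from a loop like this vary by machine, which fits treating them as informational baselines.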
This PR is large by line count but low in conceptual complexity. The fixture files are auto-generated expected outputs. The benchmark scripts are straightforward timing loops. The meaningful code is the ~500 lines of test harness infrastructure.
- PR 1-11/12: All code and configuration changes
- **PR 12/12** (this PR): Snapshot tests and benchmarks
- [ ] `npx vitest run` passes -- snapshot tests match expected fixtures
- [ ] `npx vitest run --config vitest.snapshot.config.ts` runs snapshot suite specifically
- [ ] Benchmarks execute without errors (performance numbers are informational)