
Commit b051c2b

refactor(tests): modular mock-free test suite with cross-platform CI (#13)
* Extract three module globals into RuntimeSingletons container for atomic test swap
* Create centralized test harness with singleton isolation and capture
* Fragment monolithic test into per-module unit suites with snapshots
* Add test runner script, module loader bootstrap, and E2E suite
* Update package metadata, scripts, and contributing docs for test suite
* Enable cross-platform E2E tests (Windows, Linux, macOS)
  - Remove the Windows exclusion guard from E2E test step in CI workflow
  - The process-isolated child-process harness uses only cross-platform Node.js APIs (spawn, readline, stdio pipes) and is verified to work on Windows
  - Add .gitattributes to enforce LF line endings on snapshot golden files
  - Add normalizeEOL() helper in snapshot tests for Windows CRLF handling
  - Add E2E_TIMEOUT_MS env var support for configurable test timeouts
  - Add engines.node >=22 to package.json
  - Add tsconfig.json for type checking
  - Update CI documentation in CONTRIBUTING.md

  Closes #12
* Commit package-lock.json for reproducible CI installs
* Move type+audit checks into matrix rows, remove gate job
* chore: upgrade SDK peerDependencies to v0.78.1
* refactor(notebook): use TypeBox schemas for typed tool parameters
* refactor(notebook/rehydration): use CustomEntry generic type
* fix(spawn): widen message types for SDK v0.78.1 content union
* fix(spawn/renderer): add null-safety and type casts for SDK v0.78.1
* test: fix mocks and helpers for SDK v0.78.1 API surface
* test: fix spawn tests for SDK v0.78.1 type changes
* test: fix notebook tests for widened content access
* Rename PytestHarness to ProcessHarness
* Use real AbortSignal in mock test context
* Guard infinite loop in findPackageRoot with maxDepth
* Fix Node version comment for module.register availability
* Add missing-entry error test for register-loader
* Document single-active-harness lifecycle constraint
* Move render snapshots from __snapshots__ to tests/snapshots/
* Add try/catch recovery to SpawnFrameScheduler flush
* Tag noop scheduler with sentinel marker
* Update test harness lifecycle constraints comment
* Wrap property tests with harness isolation
* Split oversized spawn test file into focused modules
* Wrap runtime-singletons test with harness isolation
* Tighten assertions to test invariants instead of disjunctions
* Fix E2E mock: add model field for spawn ctx.model check
* Fix misleading JSDoc on __setSingletons
* Add isNoopScheduler with import-order guard in test harness
* Remove fragile spread pattern from snapshot test harness
* Preserve writeContext alongside writeLock during in-flight singleton swaps
* Add regression test for non-reentrant saveNotebookPage across singleton swaps
* Align peer dependency ranges with tested SDK baseline
* Convert pty-harness.ts and basic.test.ts to tab indentation
* Resolve TypeBox from project dependency tree via exports map
* Add regression tests for TypeBox resolution in register-loader
1 parent 31f0b94 commit b051c2b

49 files changed

Lines changed: 9444 additions & 3945 deletions

.gitattributes

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+tests/snapshots/**/*.txt text eol=lf
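
The rule above keeps the golden files LF-only in the repository, and the commit message mentions a `normalizeEOL()` helper for checkouts that still end up with CRLF. The helper itself is not visible in this view of the commit, so the following is only a minimal sketch of what such a helper might look like, assuming it does nothing more than fold `\r\n` into `\n` before comparison. `matchesGolden` is a hypothetical name added for illustration.

```typescript
// Hypothetical sketch; the real normalizeEOL() in the snapshot tests is not shown in this diff.
function normalizeEOL(text: string): string {
	// Fold Windows CRLF line endings into LF; LF-only input is returned unchanged.
	return text.replace(/\r\n/g, "\n");
}

// Hypothetical helper: compare rendered output with a golden file,
// ignoring whether either side was read with CRLF line endings.
function matchesGolden(rendered: string, golden: string): boolean {
	return normalizeEOL(rendered) === normalizeEOL(golden);
}
```

Normalizing both sides on read means the comparison does not depend on the local `core.autocrlf` setting, while `.gitattributes` keeps the committed files themselves consistent.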

.github/workflows/test.yml

Lines changed: 80 additions & 0 deletions
@@ -0,0 +1,80 @@
+# Cross-platform CI for pi-agenticoding
+#
+# Runs the full unit suite on Linux, macOS, and Windows
+# on the minimum Node.js version required by pi coding agent. Snapshot
+# tests verify TUI render output against golden files.
+
+name: test
+
+permissions:
+  contents: read
+
+concurrency:
+  group: ${{ github.workflow }}-${{ github.ref }}
+  cancel-in-progress: true
+
+on:
+  push:
+    branches: [main]
+    paths-ignore: ['*.md', '**/docs/**']
+  pull_request:
+    branches: [main]
+    paths-ignore: ['*.md', '**/docs/**']
+
+jobs:
+  # ── Cross-platform test matrix ──────────────────────────────────────
+  # Node 22 (minimum) is tested only on Linux — the primary platform and the only one
+  # guaranteed to have the oldest toolchain. macOS and Windows test Node 24 (latest)
+  # to catch regressions in the newest runtime. This asymmetry is intentional: it
+  # balances CI cost with meaningful coverage while ensuring the minimum version works
+  # correctly on the platform most likely to encounter toolchain edge cases.
+  test:
+    runs-on: ${{ matrix.os }}
+    strategy:
+      fail-fast: false # report every combination, don't cancel
+      matrix:
+        include:
+          - os: ubuntu-latest
+            node-version: "22" # minimum version on primary platform
+          - os: ubuntu-latest
+            node-version: "24" # latest on primary platform
+          - os: macos-latest
+            node-version: "24" # latest on macOS
+          - os: windows-latest
+            node-version: "24" # latest on Windows
+    steps:
+      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
+
+      - uses: actions/setup-node@49933ea5288caeca8642d1e84afbd3f7d6820020 # v4.4.0
+        with:
+          node-version: ${{ matrix.node-version }}
+          cache: "npm"
+
+      - run: npm ci
+
+      # Uniform pre-flight checks — type errors and security issues on every platform
+      - name: Type check
+        run: npx tsc --noEmit
+
+      - name: Security audit
+        run: npm audit --audit-level=moderate
+
+      # Unit suite (unit tests + snapshot tests + property-based tests)
+      - name: Unit tests
+        run: npm test
+
+      # E2E tests — process-isolated child-process harness (stdin/stdout, no PTY).
+      # Verified cross-platform: runs on Linux, macOS, and Windows.
+      # See https://github.com/agenticoding/pi-agenticoding/issues/12
+      - name: E2E tests
+        run: npm run test:e2e
+
+      # Upload test results for debugging — artifacts available for 30 days.
+      - name: Upload test results
+        if: always()
+        uses: actions/upload-artifact@v4
+        with:
+          name: test-results-${{ matrix.os }}-node-${{ matrix.node-version }}
+          path: |
+            tests/snapshots/
+          retention-days: 30
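
The E2E step in this workflow runs with whatever timeout the suite picks, and the commit message mentions `E2E_TIMEOUT_MS` env var support for configurable timeouts. How the suite parses that variable is not shown here. The sketch below is one plausible way to do it, assuming the value is a positive integer number of milliseconds and that anything else falls back to a default. The function name `resolveE2ETimeout` and the 30000 ms default are assumptions, not values taken from the commit.

```typescript
// Hypothetical sketch; the function name and the default value are assumptions.
const DEFAULT_E2E_TIMEOUT_MS = 30000;

function resolveE2ETimeout(
	env: Record<string, string | undefined>,
	fallbackMs: number = DEFAULT_E2E_TIMEOUT_MS,
): number {
	const raw = env["E2E_TIMEOUT_MS"];
	// Unset or blank: use the fallback.
	if (raw === undefined || raw.trim() === "") return fallbackMs;
	const parsed = Number(raw);
	// Reject NaN, Infinity, zero, negative, and fractional values.
	return Number.isInteger(parsed) && parsed > 0 ? parsed : fallbackMs;
}
```

In a Node.js test file this would typically be called as `resolveE2ETimeout(process.env)`. Taking the environment as a parameter keeps the parsing logic testable without mutating the real process environment.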

.gitignore

Lines changed: 1 addition & 2 deletions
@@ -142,8 +142,7 @@ vite.config.js.timestamp-*
 vite.config.ts.timestamp-*
 .vite/
 
-# Lockfiles (library package — consumers manage their own)
-package-lock.json
+# package-lock.json committed for reproducible CI installs (excluded from publish)
 
 # Agenticoding local config (credentials, API keys)
 .chunkhound.json

CONTRIBUTING.md

Lines changed: 27 additions & 3 deletions
@@ -7,14 +7,15 @@ Welcome! This project welcomes focused, well-validated contributions. Use coding
 - **Use code research first** — understand the surrounding module responsibilities before editing.
 - **Make minimal changes** — prefer targeted edits that reuse existing mechanisms.
 - **Match existing patterns** — keep naming, lifecycle hooks, tool contracts, and TUI behavior consistent with the current code.
-- **Preserve context-management semantics** — changes to `spawn`, `ledger`, or `handoff` should keep the agent workflow predictable across session resets and compaction.
+- **Preserve context-management semantics** — changes to `spawn`, `notebook`, or `handoff` should keep the agent workflow predictable across session resets and compaction.
+- **Use static imports only for `spawn/renderer.ts`** — it registers the frame scheduler into the singleton container at module evaluation time. Switching to `await import()` will silently break test isolation because the test harness cannot overwrite the singleton before registration.
 - **AI-agent generated contributions are welcome** — include enough human intent and validation context in the PR for reviewers to trust the result.
 
 ## Suggested Workflow
 
 1. **Research the area**
-   - Identify the relevant primitive: spawn, ledger, handoff, watchdog, or extension wiring.
-   - Read nearby tests in `agenticoding.test.ts` before changing behavior.
+   - Identify the relevant primitive: spawn, notebook, handoff, watchdog, or extension wiring.
+   - Read the relevant suite in `tests/unit/` before changing behavior.
 
 2. **Plan the smallest safe change**
    - Reuse existing state and lifecycle hooks when possible.
@@ -38,6 +39,29 @@ Before submitting, check that your change:
 - Handles reset, cancellation, and stale-session cases where relevant.
 - Keeps docs aligned with the package version and installed behavior.
 
+## Tests
+
+- `npm test` — runs the unit suite under `tests/unit/` via the in-repo Node test runner.
+- `npm run test:snapshots:check` — runs only the render-snapshot tests; fails on any drift in `tests/snapshots/`.
+- `npm run test:snapshots:update` — rewrites the golden files in `tests/snapshots/` after an intentional render change. Review the diff carefully: snapshot updates are the only signal that catches unintended UI regressions.
+- `npm run test:e2e` — runs the process-isolated end-to-end suite under `tests/e2e/`.
+
+## CI
+
+Pull requests are automatically tested via GitHub Actions. A cross-platform matrix runs on every push and PR:
+
+| OS | Node | Runs |
+|---|---|---|
+| Ubuntu | 22 (minimum) | Type check, security audit, unit tests, E2E tests |
+| Ubuntu | 24 | Type check, security audit, unit tests, E2E tests |
+| macOS | 24 | Unit tests, E2E tests |
+| Windows | 24 | Unit tests, E2E tests |
+
+Node 22 (minimum) is tested only on Linux — the primary platform and the only one guaranteed to have the oldest toolchain. macOS and Windows test Node 24 (latest) to catch regressions in the newest runtime while balancing CI cost.
+
+Snapshot golden files in `tests/snapshots/` are stored with LF line endings (enforced by `.gitattributes`). The `normalizeEOL` helper in the snapshot test file normalizes `\r\n` to `\n` on read, so Windows developers get correct comparisons even if their working tree has CRLF. If you update snapshots, the CI matrix validates them on all platforms.
+The E2E suite runs on all platforms including Windows (verified in issue #12).
+
 ## Community
 
 Use GitHub Issues for bug reports and feature requests. Keep discussions concrete: describe the agent workflow you expected, what happened instead, and any reproduction steps.
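
The static-import rule for `spawn/renderer.ts` added in the first hunk is an ordering constraint: a module that registers into a shared container when it is evaluated must be evaluated before the test harness installs its replacement. The sketch below models one plausible reading of that problem with plain function calls standing in for module evaluation. All names here are hypothetical, including the shape of the container, and none of it is the project's real API.

```typescript
// Hypothetical names throughout; this models the ordering problem only.
interface FrameScheduler {
	readonly name: string;
}

// Stand-in for a container that holds process-wide singletons.
class SingletonContainer {
	scheduler: FrameScheduler | null = null;
}

const singletons = new SingletonContainer();

// Stands in for the side effect that runs when the renderer module is evaluated.
function evaluateRendererModule(): void {
	singletons.scheduler = { name: "real" };
}

// Stands in for the test harness swapping in an isolated scheduler.
function installTestScheduler(): void {
	singletons.scheduler = { name: "test" };
}

// Static import: evaluation happens at load time, so the later harness swap wins.
function staticImportOrder(): string {
	evaluateRendererModule();
	installTestScheduler();
	return singletons.scheduler!.name;
}

// Dynamic import: evaluation is deferred and overwrites the harness swap.
function dynamicImportOrder(): string {
	installTestScheduler();
	evaluateRendererModule();
	return singletons.scheduler!.name;
}
```

Under this reading, the deferred registration replaces the harness's scheduler without raising any error, which would explain why the failure is described as silent.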
