Run npm test to execute the project's automated tests. Dedicated scripts run
common subsets of the suite:
npm test # runs all tests
npm run test-core # core game logic
npm run test-bench # bench-related unit/integration tests (Mocha only)
npm run test-bench-smoke # browser benchmark smoke gate (Playwright + E2E harness)
npm run bench-smoke # fast benchmark smoke gate (short dev-loop default)
npm run bench-performance # standalone perf bench (smoke profile by default)
npm run bench-performance-smoke # explicit perf smoke profile
npm run bench-history # history stress bench (smoke profile by default)
npm run bench-history-smoke # explicit history smoke profile
npm run bench-long-session # long-session benchmark gate (smoke profile by default)
npm run bench-long-session-smoke # explicit long-session smoke profile
npm run bench-performance-soak # long perf soak run (explicit opt-in)
npm run bench-history-soak # long history soak run (explicit opt-in)
npm run bench-long-session-soak # long replay/memory/event-queue soak run
npm run test-workflow # GitHub workflow helpers
npm run test-tools # command line tools
npm run test-offline-tools # offline asset tooling
npm run test-editor # editor-related tests
npm run test:changed # infer the smallest safe Mocha subset from git changes
npm run coverage-editor # 100% coverage for editor modules
npm run test-mcp-smoke # MCP stdio smoke test (requires start-https)
npm run typecheck:critical # targeted checkJs guard for runtime-critical modules
npm run release-readiness # release checklist gate (strict by default)
Categories map to the glob patterns defined in scripts/runTests.js.
npm run test:changed resolves its comparison base in this order: explicit
--base=<ref>, current branch upstream, origin/HEAD, then known default
branch names. Add --print-selection (or --dry-run) to print the resolved
base ref, changed files, inferred categories, and Mocha args without running
guards or tests.
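The resolution order above can be sketched as a simple first-match chain. This is an illustration only: `resolveBaseRef`, its input shape, and the `main`/`master` candidate list are hypothetical names chosen here, and the real selection logic may differ.

```javascript
// Hypothetical sketch of the documented base-ref resolution order.
// Inputs are plain values so the ordering is easy to see; a real script
// would obtain them from git.
function resolveBaseRef({ explicitBase, upstream, originHead, existingBranches = [] }) {
  // Assumed default-branch candidates; the actual list is not stated in this document.
  const knownDefaults = ['main', 'master'];

  if (explicitBase) return { ref: explicitBase, source: '--base' };
  if (upstream) return { ref: upstream, source: 'upstream' };
  if (originHead) return { ref: originHead, source: 'origin/HEAD' };

  const fallback = knownDefaults.find((name) => existingBranches.includes(name));
  return fallback ? { ref: fallback, source: 'default-branch' } : null;
}
```

An explicit `--base=<ref>` always wins, which keeps local runs reproducible when the branch has no upstream configured.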
The maintained subset scripts (test-core, test-bench-unit,
test-workflow, test-tools, test-offline-tools, and test-editor) all go
through scripts/runTests.js, so they share the same runtime-global guard,
critical typecheck guard, and runtime budget reporting as npm test.
Tests that require significant manual setup or large downloads belong in
excluded-tests.md. No tests are currently excluded.
The tests require no special environment variables. A minimal lemmings object
is created and temporary files are written under your operating system's temp
directory.
Benchmark scripts default to short smoke settings so local perf checks stay within a quick dev-loop budget. Use explicit soak mode for long runs:
npm run bench-performance -- --soak
npm run bench-history -- --soak
npm run bench-smoke -- --soak
test-bench and bench-* intentionally have different semantics:
- test-bench: runs deterministic Mocha tests under test/*bench*.test.js.
- bench-*: runs live browser benchmarks (requires a local HTTPS server and browser automation support).
bench-history is the replay-invariant guardrail for compression/rewind work:
- It runs random seek/replay probes and fails on replay-hash divergence (HISTORY_REQUIRE_REPLAY_PARITY, defaults to true).
- It fails when bounded history retention is not enabled (HISTORY_REQUIRE_BOUNDED_RETENTION, defaults to true).
- Non-smoke profiles also require cold compaction activity (HISTORY_REQUIRE_COLD_COMPACTION, defaults to true for default/soak).
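As a rough sketch of how these defaults could be resolved, assuming the flags are read from environment variables and that `0`, `false`, `no`, and `off` count as disabled (the accepted spellings are an assumption, as are the helper names `envFlag` and `historyRequirements`):

```javascript
// Parse a boolean-like environment value, falling back to a default when unset.
function envFlag(env, name, defaultValue) {
  const raw = env[name];
  if (raw === undefined || raw === '') return defaultValue;
  // Assumed set of "disabled" spellings; the real parser may accept others.
  return !['0', 'false', 'no', 'off'].includes(String(raw).trim().toLowerCase());
}

// Hypothetical resolver for the three guardrails described above.
function historyRequirements(env, profile) {
  // Cold compaction is only required by default for the default/soak profiles.
  const nonSmoke = profile === 'default' || profile === 'soak';
  return {
    replayParity: envFlag(env, 'HISTORY_REQUIRE_REPLAY_PARITY', true),
    boundedRetention: envFlag(env, 'HISTORY_REQUIRE_BOUNDED_RETENTION', true),
    coldCompaction: envFlag(env, 'HISTORY_REQUIRE_COLD_COMPACTION', nonSmoke),
  };
}
```

Calling `historyRequirements(process.env, 'smoke')` with no overrides would leave only the replay-parity and bounded-retention checks enabled.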
bench-hotpaths now reports percentile and allocation diagnostics per section:
- avgMs, p50Ms, p95Ms, p99Ms, worstMs
- allocBytesAvg, allocBytesP95, allocBytesWorst
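For readers interpreting these fields, the sketch below shows one way such a summary could be derived from raw samples. It assumes nearest-rank percentiles; the percentile method bench-hotpaths actually uses is not stated here, and `summarizeSection` is a hypothetical helper.

```javascript
// Nearest-rank percentile over an ascending-sorted array (assumed method).
function percentile(sorted, p) {
  if (sorted.length === 0) return 0;
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.min(sorted.length - 1, Math.max(0, rank - 1))];
}

// Hypothetical per-section summary using the field names listed above.
function summarizeSection(durationsMs, allocBytes) {
  const ms = [...durationsMs].sort((a, b) => a - b);
  const bytes = [...allocBytes].sort((a, b) => a - b);
  const mean = (xs) => (xs.length ? xs.reduce((sum, x) => sum + x, 0) / xs.length : 0);
  return {
    avgMs: mean(ms),
    p50Ms: percentile(ms, 50),
    p95Ms: percentile(ms, 95),
    p99Ms: percentile(ms, 99),
    worstMs: ms.length ? ms[ms.length - 1] : 0,
    allocBytesAvg: mean(bytes),
    allocBytesP95: percentile(bytes, 95),
    allocBytesWorst: bytes.length ? bytes[bytes.length - 1] : 0,
  };
}
```

With small sample counts the p95/p99 values collapse toward the worst sample, so they are most informative on longer runs.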
For render experiments, use canonical query flags in non-default runs and keep rollback ready:
- offscreenPresent=true: requests the Canvas2D staging plus drawImage present-path experiment when supported.
- workerOffscreen=true: requests the worker/offscreen path; the runtime falls back automatically when unsupported.
Runtime diagnostics now expose capability matrix and rollout-flag snapshots
through window.__E2E__.getDiagnostics() / window.__E2E__.getState():
- capabilities.webMidi, capabilities.offscreenCanvas, capabilities.imageBitmap, capabilities.worker.
- capabilities.renderPaths for deterministic fallback selection. Diagnostics use presentPathSupported / drawimage_present.
- rolloutFlags for staged rollout / emergency rollback state.
Rollout and rollback query toggles:
- rollbackAll=1: disables all high-risk rollout flags.
- rollbackRenderPresent=1: disables offscreen/worker present-path experiments.
- rollbackHistoryCodec=1: disables cold history compression/dedupe.
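The precedence between experiment flags and rollback toggles can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the returned field names, and the idea that the history codec is on unless rolled back are all hypothetical; only the query parameter names come from this document.

```javascript
// Hypothetical resolver: rollback toggles always override experiment requests.
function resolveRollouts(search) {
  const params = new URLSearchParams(search);
  const is = (name, value) => params.get(name) === value;

  const rollbackAll = is('rollbackAll', '1');
  const rollbackRender = rollbackAll || is('rollbackRenderPresent', '1');
  const rollbackCodec = rollbackAll || is('rollbackHistoryCodec', '1');

  return {
    offscreenPresent: !rollbackRender && is('offscreenPresent', 'true'),
    workerOffscreen: !rollbackRender && is('workerOffscreen', 'true'),
    // Assumption: cold history compression/dedupe is enabled unless rolled back.
    historyCodec: !rollbackCodec,
  };
}
```

For example, `?offscreenPresent=true&rollbackRenderPresent=1` resolves to the default present path, which is what makes a rollback link safe to share during an incident.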
bench-long-session enforces thresholds for:
- replay-hash integrity
- heap growth and heap churn proxies
- sound-event queue ratio and queue-growth bounds
- history span growth and trigger-count drift
Release gates are defined in release-readiness.md and
validated by npm run release-readiness. Override strictness via
LEMMINGS_RELEASE_READINESS_STRICT=false when validating checklist structure
without requiring all items checked.
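The difference between strict and structure-only validation might look like the sketch below. It assumes the checklist uses Markdown task-list items (`- [ ]` / `- [x]`), which this document does not confirm, and `checkReadiness` is a hypothetical name rather than the project's actual gate.

```javascript
// Hypothetical checklist gate: strict unless the override is exactly "false".
function checkReadiness(markdown, env) {
  const strict = env.LEMMINGS_RELEASE_READINESS_STRICT !== 'false';
  const lines = markdown.split('\n');
  // Assumed checklist syntax: Markdown task-list items.
  const items = lines.filter((line) => /^\s*[-*] \[[ xX]\]/.test(line));
  const unchecked = items.filter((line) => /^\s*[-*] \[ \]/.test(line));
  const structureOk = items.length > 0;
  return {
    strict,
    total: items.length,
    unchecked: unchecked.length,
    // Non-strict mode only validates that a checklist exists.
    pass: structureOk && (!strict || unchecked.length === 0),
  };
}
```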
Runtime boot/query presets use these profile IDs:
- classic
- midi
- editor
- e2e
- perf
Legacy profile=gameplay links are normalized to classic.
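A minimal sketch of the legacy normalization, assuming the profile is read from a `profile` query parameter and that unknown values fall back to `classic` (the fallback behaviour and the `normalizeProfile` helper are assumptions):

```javascript
const PROFILE_IDS = ['classic', 'midi', 'editor', 'e2e', 'perf'];
// Legacy alias documented above; a Map avoids accidental prototype lookups.
const LEGACY_ALIASES = new Map([['gameplay', 'classic']]);

// Hypothetical normalizer for the runtime boot/query profile.
function normalizeProfile(search, fallback = 'classic') {
  const raw = new URLSearchParams(search).get('profile');
  if (raw === null) return fallback;
  const id = LEGACY_ALIASES.get(raw) ?? raw;
  // Assumption: unrecognized IDs fall back instead of throwing.
  return PROFILE_IDS.includes(id) ? id : fallback;
}
```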
Privacy-first analytics is opt-in and local-only by default. See
analytics.md for consent defaults, event schema constraints,
local buffer export/import, optional managed beacon settings, and hard/runtime
kill switches.
Run npm run check-undefined before npm test to catch low-cost JS hygiene
regressions early. GitHub Actions also runs git diff --check against the
changed lines after npm run format to catch trailing-whitespace and EOF
blank-line issues without a custom baseline.
npm test now reports total runtime and supports optional guardrails for local
suite budgets:
- LEMMINGS_TEST_ENFORCE_BUDGET=true: fail when the runtime budget is exceeded.
- LEMMINGS_TEST_BUDGET_MS=<ms>: override the default 180000ms budget.
- npm run test:budget: convenience wrapper with enforcement enabled.
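The interaction between the two variables can be sketched like this. `evaluateBudget` is a hypothetical helper, and treating an invalid or non-positive override as "use the default" is an assumption rather than documented behaviour.

```javascript
const DEFAULT_BUDGET_MS = 180000;

// Hypothetical budget check: reports always, fails only when enforcement is on.
function evaluateBudget(elapsedMs, env) {
  const parsed = Number.parseInt(env.LEMMINGS_TEST_BUDGET_MS ?? '', 10);
  // Assumption: invalid or non-positive overrides fall back to the default.
  const budgetMs = Number.isFinite(parsed) && parsed > 0 ? parsed : DEFAULT_BUDGET_MS;
  const enforce = env.LEMMINGS_TEST_ENFORCE_BUDGET === 'true';
  const exceeded = elapsedMs > budgetMs;
  return { budgetMs, exceeded, exitCode: enforce && exceeded ? 1 : 0 };
}
```

Without enforcement an over-budget run is still flagged as exceeded but exits with code 0, so the report stays informational on slow machines.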
To cover the main CI static/test gates locally:
npm run lint
git diff --check origin/master...HEAD
npm run check-undefined
npm test
Playwright defaults to https://localhost:8080. Override the origin with
LEMMINGS_E2E_BASE_URL when validating another same-machine or LAN URL:
npm run test-e2e -- e2e/service-worker.spec.js
$env:LEMMINGS_E2E_BASE_URL = "https://127.0.0.1:8080"
npm run test-e2e -- e2e/service-worker.spec.js
Remove-Item Env:\LEMMINGS_E2E_BASE_URL
The service-worker smoke asserts same-origin scope instead of literal localhost, so the same spec should pass for localhost and non-localhost origins served by the configured HTTPS server.
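The same-origin idea can be illustrated with the standard `URL` class. Both helpers below are hypothetical; they only show why comparing origins, rather than matching the string "localhost", lets one spec cover several hosts.

```javascript
// Hypothetical: resolve the E2E origin from the environment, with the documented default.
function resolveBaseUrl(env) {
  return new URL(env.LEMMINGS_E2E_BASE_URL || 'https://localhost:8080').origin;
}

// Hypothetical: a service-worker scope passes when it shares the base URL's origin.
function isSameOriginScope(scopeUrl, baseUrl) {
  return new URL(scopeUrl).origin === new URL(baseUrl).origin;
}
```

With `LEMMINGS_E2E_BASE_URL` set to `https://127.0.0.1:8080`, a scope of `https://127.0.0.1:8080/` passes while `https://localhost:8080/` does not, because the hosts differ even when they point at the same machine.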
Disposable local screenshots are documented in
playwright-tests.md. Start npm run start-https first
when invoking the capture CLI directly:
npm run capture:e2e:midi
npm run capture:e2e:editor
npm run capture:e2e:procgen
npm run capture:e2e:game-hud
Output stays under ignored temp/e2e-captures/.
Milestone work should leave a short, reproducible evidence note before issue
closeout. Keep working artifacts under ignored temp/ and commit only the code,
tests, and durable docs needed by the feature.
Use this compact format for each lane or issue group:
Issues:
Commands:
Temp artifacts:
GitHub closeout:
Skipped checks:
Unrelated failures:
Follow-up risks:
The Capture matrix below is the standard milestone map for the current editor, procgen, solver, MIDI, and validation work. Run the narrow lane checks while a lane is in progress, then run the standard gate before final closeout.
| Area | Issues | Capture matrix | Required checkpoint commands | Disposable evidence |
|---|---|---|---|---|
| MIDI polish | #957 | midi-transport, midi-source-browser, midi-track-workspace, midi-clip-library, midi-inspector, midi-learn, midi-record, midi-output-status; desktop/tablet/mobile when layout changes | npm run test-e2e -- e2e/midi-ui.spec.js; npm run capture:e2e:midi -- --viewport=desktop --json; npm run capture:e2e:midi -- --viewport=tablet --json; npm run capture:e2e:midi -- --viewport=mobile --json | temp/e2e-captures/, temp/midi-lane-*.md |
| Editor productization | #954 | states shell, canvas-palette-inspector, validation, save-import-export, playtest; desktop required, tablet/mobile when controls change | npm run test-editor; npm run test-e2e:harness; npm run capture:e2e:editor -- --viewport=desktop --json | temp/e2e-captures/, temp/editor-lane-*.md |
| Procgen productization | #955 | states overview, frontier, newest-pieces; desktop required for seed review, targeted mobile only for shell/layout changes | npm run test-e2e -- e2e/procgen.spec.js; npm run capture:e2e:procgen -- --viewport=desktop --json; npm run bench-procgen-soak; npm run test-bench-unit | temp/e2e-captures/, temp/procgen-lane-*.md |
| Solver platform | #956 | no standalone screenshot gate until a UI/editor advisory surface changes; capture editor advisory or solver failure surfaces only when they are user-visible | npm test; focused test/solver*.test.js; npm run test-e2e:harness when runtime adapters or editor advisory flows change; MCP checks after solver tools change | temp/solver-lane-*.md, optional temp/solver-failures/ |
| Validation and closeout | #953 | capture docs and runner drift are checked by tests; do not create committed galleries or manifests | npm run format; npm run check-undefined; npm run lint; npm run typecheck:critical; npm test; npm run test-bench-unit | temp/milestone-integration-*.md |
GitHub issue closeout comments should cite the commands that actually ran, any
relevant ignored artifact paths, and any skipped checks with the concrete
reason. Do not close an issue on visual captures alone when behavior changed;
pair captures with a deterministic unit, harness, or E2E check where available.
Keep capture output disposable: do not create committed galleries or manifests.
Use npm run release-readiness only when the release checklist or release gate
scripts change; it is not the issue closeout checklist.