Unit tests are required for all features. This is not optional. See PRINCIPLES_AND_GOALS.md for why.
- Every feature must have unit tests. Before a feature is considered complete, it must have tests that verify its behavior.
- Refactor over complex tests. If writing a test requires:
  - Extensive mocking or setup
  - Deep knowledge of unrelated systems
  - More than a few lines of boilerplate

  ...then refactor the code instead of writing a more complex test. Hard-to-test code is hard-to-maintain code.
- Tests document behavior. Tests show how code is meant to be used. If a test is hard to read, the API is probably hard to use.
- Tests enable safe refactoring. The test suite should give confidence that changes don't break existing behavior. If you can't refactor safely, add more tests first.
| Layer | What to test | Example |
|---|---|---|
| Schemas | Validation, factory functions, edge cases | PersonaSchema.safeParse(), createPersona() |
| Utilities | Pure functions, transformations | resolvePath(), findPersona() |
| CLI commands | Command execution, structured results | executeCommandLine('ls') returns expected result |
| Components | Rendering, user interactions | ContactList renders items, handles selection |
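As a rough sketch of what a test for the Schemas row could look like, the block below uses a hand-written stand-in for a schema. The names `Persona`, `safeParsePersona`, and the two-argument `createPersona` are hypothetical and only mirror the shape of the `PersonaSchema.safeParse()` and `createPersona()` entries in the table; the project's real schemas and signatures may differ. Plain checks replace Vitest's `expect` so the snippet runs on its own.

```typescript
// Hypothetical stand-in for a schema; the real PersonaSchema may look different.
interface Persona {
  id: string;
  name: string;
}

type ParseResult =
  | { success: true; data: Persona }
  | { success: false; error: string };

// Validates unknown input and returns a structured result instead of throwing.
function safeParsePersona(input: unknown): ParseResult {
  if (typeof input !== "object" || input === null) {
    return { success: false, error: "expected an object" };
  }
  const { id, name } = input as Record<string, unknown>;
  if (typeof id !== "string" || id.length === 0) {
    return { success: false, error: "id must be a non-empty string" };
  }
  if (typeof name !== "string" || name.trim().length === 0) {
    return { success: false, error: "name must be a non-empty string" };
  }
  return { success: true, data: { id, name: name.trim() } };
}

// Factory function (assumed signature): builds a valid Persona or throws.
function createPersona(name: string, id: string): Persona {
  const result = safeParsePersona({ id, name });
  if (!result.success) {
    throw new Error(result.error);
  }
  return result.data;
}

// Minimal assertion helper so the sketch needs no test runner.
function check(condition: boolean, message: string): void {
  if (!condition) throw new Error(`failed: ${message}`);
}

// Validation, factory function, and edge cases, as listed in the Schemas row.
check(safeParsePersona({ id: "p1", name: "Ada" }).success, "valid input parses");
check(!safeParsePersona({ id: "p1", name: "   " }).success, "blank name is rejected");
check(!safeParsePersona(null).success, "null is rejected");
check(createPersona("  Ada ", "p1").name === "Ada", "factory trims the name");
```

In a Vitest file each `check` line would typically become its own `it(...)` case with an `expect(...)` assertion, which keeps failures easy to locate.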
Unit tests are feedback on your design. If code is hard to test, that's a signal the design is getting too complex or convoluted. Use this feedback:
- Easy to test → probably easy to understand, change, and reuse
- Hard to test → probably too coupled, doing too much, or poorly factored
Write the test first (or at least think about how you'd test it) before the implementation gets complicated. If you can't imagine a simple test, simplify the design.
If you find yourself struggling to test something, ask:
- Is this unit doing too much? Split it into smaller pieces.
- Are there too many dependencies? Inject them or extract pure logic.
- Is the state management tangled? Separate state from behavior.
- Am I testing implementation details? Test behavior, not internals.
The answer is almost always "refactor the code" rather than "write a more elaborate test."
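To illustrate the "inject them or extract pure logic" question, here is a minimal sketch with made-up names (`Contact`, `isStale`, `isStaleNow`, and the 30-day default are assumptions, not project code). A version that called `Date.now()` internally would need a mocked clock to test; passing the current time as a parameter makes the logic a pure function that a few one-line checks can cover.

```typescript
// Hypothetical example; none of these names come from the project.
interface Contact {
  name: string;
  lastSeen: number; // epoch milliseconds
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Pure logic: the clock is injected as `now`, so no mocking is needed in tests.
function isStale(contact: Contact, now: number, maxAgeDays = 30): boolean {
  return now - contact.lastSeen > maxAgeDays * DAY_MS;
}

// Thin wrapper keeps the convenient call site for production code.
function isStaleNow(contact: Contact): boolean {
  return isStale(contact, Date.now());
}

// Minimal assertion helper so the sketch needs no test runner.
function check(condition: boolean, message: string): void {
  if (!condition) throw new Error(`failed: ${message}`);
}

const now = 1_700_000_000_000; // fixed timestamp makes the checks deterministic
check(isStale({ name: "a", lastSeen: now - 31 * DAY_MS }, now), "31 days old is stale");
check(!isStale({ name: "b", lastSeen: now - 30 * DAY_MS }, now), "exactly 30 days is not stale");
check(isStale({ name: "c", lastSeen: now - 2 * DAY_MS }, now, 1), "custom max age is respected");
check(!isStaleNow({ name: "d", lastSeen: Date.now() }), "a contact seen just now is fresh");
```

The wrapper stays trivial enough to need little testing of its own, while all branching lives in the function that takes plain inputs.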
Code optimized for performance may have a structure that's harder to test (e.g., inlined logic, avoiding allocations). This is acceptable only if:
- Explicitly marked — Comment explains why this code is performance-critical
- Well bounded — Small and focused on one thing
- Isolated — Minimal dependencies, clear interface
Such code still needs tests, but they may test at the boundary (inputs → outputs) rather than internals. The isolation requirement ensures that hard-to-test performance code doesn't spread throughout the codebase.
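A small sketch of how the three conditions might look in practice, using an invented hot-path function (`countFields` and its import scenario are hypothetical). The comment marks it as performance-critical, the function does one thing, and it has no dependencies. The checks only compare inputs to outputs, partly against a simpler reference implementation, without reaching into the loop.

```typescript
// PERF (hypothetical): assumed to run once per row during large imports, so it
// avoids the array allocation of split() and scans character codes in one loop.
// Only the first character of `delimiter` is used.
function countFields(line: string, delimiter = ","): number {
  if (line.length === 0) return 0;
  const delim = delimiter.charCodeAt(0);
  let count = 1;
  for (let i = 0; i < line.length; i++) {
    if (line.charCodeAt(i) === delim) count++;
  }
  return count;
}

// Simple, obviously-correct reference used only by the tests.
function countFieldsReference(line: string, delimiter = ","): number {
  return line === "" ? 0 : line.split(delimiter).length;
}

// Minimal assertion helper so the sketch needs no test runner.
function check(condition: boolean, message: string): void {
  if (!condition) throw new Error(`failed: ${message}`);
}

// Boundary tests: inputs in, outputs out, no knowledge of the loop.
check(countFields("") === 0, "empty line has no fields");
check(countFields("a") === 1, "single field");
check(countFields("a,b,c") === 3, "three fields");
check(countFields(",,") === 3, "empty fields still count");
check(countFields("a;b", ";") === 2, "custom delimiter");

// The optimized version should agree with the reference on every sample.
for (const sample of ["", "x", "x,y", ",", "a,,b,", "no delimiter here"]) {
  check(countFields(sample) === countFieldsReference(sample), `matches reference for "${sample}"`);
}
```

Comparing against a slow reference is one way to keep confidence in optimized code without coupling the tests to its internal structure.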
- Phase 1: Unit Tests with Vitest
- Phase 2: BDD Tests with Cucumber/Gherkin + Playwright
- Phase 3: Storybook for UI Components (partial - all stories created)
- Phase 4: Documentation Updates
- vitest
- @vitest/coverage-v8
- @vitest/ui
- @testing-library/react
- @testing-library/jest-dom
- @testing-library/user-event
- jsdom
- vitest.config.ts - Vitest configuration with jsdom environment, coverage thresholds (80%)
- tests/setup.ts - Global test setup with jest-dom matchers, localStorage mock, crypto.randomUUID mock
- tests/helpers/store-factory.ts - TinyBase store factory with sample data
- tests/helpers/render-with-providers.tsx - React Testing Library wrapper with TinyBase Provider
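The contents of tests/setup.ts are not shown here, so the following is only a guess at the kind of mocks it describes: an in-memory replacement for localStorage and a deterministic stand-in for crypto.randomUUID. `MemoryStorage` and `createUuidMock` are hypothetical names, and the class covers only a subset of the browser Storage API.

```typescript
// Hypothetical in-memory replacement for localStorage (subset of the Storage API).
class MemoryStorage {
  private items = new Map<string, string>();

  get length(): number {
    return this.items.size;
  }

  getItem(key: string): string | null {
    const value = this.items.get(key);
    return value === undefined ? null : value;
  }

  setItem(key: string, value: string): void {
    this.items.set(key, String(value));
  }

  removeItem(key: string): void {
    this.items.delete(key);
  }

  clear(): void {
    this.items.clear();
  }
}

// Hypothetical deterministic stand-in for crypto.randomUUID: a counter
// formatted like a UUID, so generated ids are stable across test runs.
function createUuidMock(): () => string {
  let counter = 0;
  return () => {
    counter += 1;
    const suffix = ("000000000000" + counter).slice(-12);
    return `00000000-0000-4000-8000-${suffix}`;
  };
}

// Minimal assertion helper so the sketch needs no test runner.
function check(condition: boolean, message: string): void {
  if (!condition) throw new Error(`failed: ${message}`);
}

const storage = new MemoryStorage();
storage.setItem("theme", "dark");
check(storage.getItem("theme") === "dark", "stored value is returned");
check(storage.getItem("missing") === null, "missing key returns null");
storage.clear();
check(storage.length === 0, "clear empties the store");

const nextId = createUuidMock();
check(nextId() === "00000000-0000-4000-8000-000000000001", "first id is predictable");
check(nextId() !== nextId(), "ids are unique");
```

In a Vitest setup file, objects like these are commonly installed with `vi.stubGlobal`; whether this project does it that way is not stated in this document.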
| File | Tests | Status |
|---|---|---|
| tests/unit/schemas/base.test.ts | ~30 | Passing |
| tests/unit/schemas/persona.test.ts | ~40 | Passing |
| tests/unit/schemas/contact.test.ts | ~45 | Passing |
| tests/unit/schemas/group.test.ts | ~40 | Passing |
| tests/unit/schemas/file.test.ts | ~35 | Passing |
| tests/unit/schemas/typeIndex.test.ts | ~30 | Passing |
| tests/unit/schemas/preferences.test.ts | ~32 | Passing |
| File | Tests | Status |
|---|---|---|
| tests/unit/utils/validation.test.ts | ~30 | Passing |
| tests/unit/utils/storeExport.test.ts | ~35 | Passing |
| tests/unit/utils/typeIndex.test.ts | ~25 | Passing |
| tests/unit/utils/settings.test.ts | ~24 | Passing |
| File | Tests | Status |
|---|---|---|
| tests/unit/components/ContactList.test.tsx | 19 | Passing |
"test": "vitest",
"test:ui": "vitest --ui",
"test:coverage": "vitest --coverage",
"test:run": "vitest run"

Total: 392 tests passing (includes CLI registry test for exit command)
- @storybook/react-vite (via npx storybook@latest init)
- @storybook/addon-a11y
- @storybook/addon-interactions
- @storybook/test
- .storybook/main.ts - Storybook main config
- .storybook/preview.tsx - Global decorators with TinyBase Provider and sample data
| Component | Stories | Variants |
|---|---|---|
| ContactList.stories.tsx | 6 | Default, WithSelection, Empty, MixedTypes, OnlyAgents, ManyContacts |
| ContactForm.stories.tsx | 4 | CreateNew, EditExisting, CreateAgent, EditAgent |
| PersonaList.stories.tsx | 6 | Default, WithSelection, Empty, SinglePersona, ManyPersonas, NoDefault |
| PersonaForm.stories.tsx | 2 | CreateNew, EditExisting |
| GroupList.stories.tsx | 6 | Default, WithSelection, Empty, MixedTypes, OnlyOrganizations, ManyGroups |
| GroupForm.stories.tsx | 3 | CreateNew, EditOrganization, EditTeam |
| MembershipManager.stories.tsx | 5 | WithMembers, NoMembers, AllMembersAdded, NoContacts, ManyMembers |
| FileMetadataPanel.stories.tsx | 6 | TextFile, ImageFile, NoMetadata, LargeFile, WithMultipleAuthors, JsonFile |
"storybook": "storybook dev -p 6006",
"storybook:build": "storybook build"

Total: 8 components with 38 story variants
- @playwright/test, playwright-bdd; npx playwright install (chromium)
- playwright.config.ts - Playwright config with defineBddConfig, baseURL; no webServer (start dev server manually)
- tests/features/*.feature - app, cli-contacts, cli-personas, cli-navigation, contacts, personas
- tests/features/steps/common.steps.ts, cli.steps.ts
- playwright-report/ - HTML report; open with npx playwright show-report or playwright-report/index.html
- test-results/ - .last-run.json, per-test folders (traces, screenshots)
- .features-gen/ - generated Playwright specs from npx bddgen (gitignored)
See docs/testing/bdd-tests.md for full details.
Testing boundary: The same doc defines what is automated vs manual: app shell, tab navigation, Contacts/Personas UI, CLI help/clear/contact/persona/navigation are covered by BDD in both browser and terminal (Scenario Outline); CLI group/file/files/config/data/typeindex, Groups UI, forms, file browser, settings, export/import, and future features (pod connect, sync, ACL) are left for manual verification until scenarios are added.
"test:e2e": "npx bddgen && playwright test",
"test:e2e:headed": "npx bddgen && playwright test --headed",
"test:bdd": "npx bddgen && playwright test"

- docs/testing/README.md - Testing overview, quick start, where results are stored (unit, BDD, Storybook)
- docs/testing/unit-tests.md - Unit test commands and layout
- docs/testing/bdd-tests.md - BDD commands, where BDD test results are stored (playwright-report, test-results, .features-gen), layout, manual server, manual steps
- docs/testing/storybook.md - Storybook commands and layout
- README.md - Testing section with commands and manual BDD steps; link to docs/testing/
- docs/TEST_PLAN.md - Status and phases marked complete
- npm test - All unit tests pass (392 tests)
- npm run test:coverage - Coverage meets 80% threshold
- npm run storybook - All stories render without errors
- npm run test:e2e - All BDD scenarios pass (run with server on 5173; see docs/testing/bdd-tests.md)
- Documentation is clear and complete (docs/testing/)