Test Plan & QA — Architecture Advisor

Phase 5 of 7 · Status: 🔬 In progress. 62 Vitest tests (unit, component, integration, and an axe-core a11y suite), three model-integrity guards, and a Playwright real-browser E2E suite (smoke, share deep-link, a11y, keyboard) all run in CI, which also gates a bundle-size budget and a production-dependency audit. The real-browser a11y run gates full WCAG A/AA, including color-contrast, in both themes. Open: the UAT script is written but not yet run with participants. This document covers the test strategy, the current inventory, the acceptance-criteria traceability matrix, and the honest gap list.

Primary references: Build Spec Section 14 (acceptance criteria), the SRS (FR/NFR), and the charter Section 11 (Definition of Done & quality gates).


1. Objectives & scope

Goal: prove the tool is correct, transparent, reproducible, and accessible — not merely that it renders. The differentiator of Architecture Advisor is an auditable scoring model, so the model is tested to the number, and the documentation is machine-checked against the implementation.

In scope: the scoring engine, anti-pattern detection, exporters (ADR/report/CSV/JSON/share), i18n completeness, the model↔docs↔config consistency, the four-step UI flow, accessibility, and client-side security/performance.

Out of scope: backend/API/database testing (there is none — the app is pure client-side), load testing of servers (static hosting), and cross-browser matrices beyond evergreen browsers.


2. Test strategy — the layers

A deliberately bottom-heavy pyramid: the model logic is pure and deterministic, so most assurance lives in fast unit tests and the cross-document guards; UI and human-judgement checks sit on top.

| Layer | Tooling | What it protects | State |
|---|---|---|---|
| L0 · Model guards | Node scripts (no deps), CI | The docs, the reference model, and src/config cannot drift apart | ✅ Done |
| L1 · Unit | Vitest | Scoring math, anti-patterns, exporters, i18n | ✅ Done |
| L2 · Component/Integration | Vitest + Testing Library | The 4-step flow, reactivity, override panel + redistribution, radar, command palette, manual & A/B compare overlays | ✅ Mostly done |
| L3 · System / E2E | Playwright (chromium) | Full journeys in a real browser: smoke, share-URL deep-link, structural a11y, keyboard | ✅ Done |
| L4 · Accessibility | vitest-axe + Playwright + @axe-core/playwright | Names/roles/ARIA (jsdom + real browser), keyboard, and full color-contrast (real browser, both themes); all automated | ✅ Done |
| L5 · UAT | Scripted scenarios (uat-script.md) | Real architects/newcomers confirm usefulness & clarity | ⏳ Script ready (not run) |
| L6 · Security | npm audit --omit=dev in CI (Section 8) | Client-side injection, storage, dependencies | ✅ Gated in CI |
| L7 · Performance | Bundle-size budget guard in CI (Section 9) | Bundle budget, first paint, instant recompute | ✅ Gated in CI |

3. Current inventory (what runs today)

3.1 Automated unit suite — npm run test (Vitest)

62 tests across 11 files, all green:

| File | Cases | Covers |
|---|---|---|
| src/lib/scoring.test.ts | 22 | Fixtures A–C, equal-weight fallback, 500 seeded random invariants, requirement scenarios (AC-6/AC-7), contribution reconciliation (FR-REC-4), expert override & lock, all 25 preset targets (SRS Section 5.3), qaFit defaulting |
| src/lib/antiPatternEngine.test.ts | 8 | Distributed monolith, premature microservices, and the other rules (Model Data Sheet Section 5) |
| src/lib/exports.test.ts | 8 | generateAdr (MADR), generateReport, buildC4, scenario JSON round-trip, share-URL round-trip (AC-14) |
| src/App.test.tsx | 4 | Integration: preset & single-factor reactivity (AC-2), language toggle (AC-13), and weight-override redistribution end-to-end |
| src/components/RadarPanel.test.tsx | 3 | Component: D1 ranking + single top pick, option toggle, dimension switch (AC-12) |
| src/components/SensitivityCard.test.tsx | 3 | Component: flip sentence + robust fallback, max-3 flips (AC-11) |
| src/components/QaOverridePanel.test.tsx | 4 | Component: edit → lock, clamp 0–100, unlock, clear-all |
| src/components/CommandPalette.test.tsx | 3 | Component: closed renders nothing; filter by query; run on click / Enter |
| src/components/overlays.test.tsx | 4 | Component: ManualBook + ScenarioCompare (A/B): hidden when closed, labelled dialog + close when open |
| src/a11y.test.tsx | 2 | Accessibility (AC-15): axe-core WCAG A/AA on the composed app + Expert/override panel; caught & fixed an unlabeled file input |
| src/i18n/dict.test.ts | 1 | Dictionary completeness: every key has EN and ID |

Component/integration tests render via a small src/test/render.tsx helper that wraps the unit under test in the i18n provider with the language pinned. (Vitest runs with css: false, so assertions target accessible names, roles, and unique strings — not CSS-driven guided/expert visibility.)
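The "500 seeded random invariants" and "equal-weight fallback" rows in the table above can be illustrated with a small property-style check. This is a minimal sketch, not the project's code: `seededRandom`, `normalizeWeights`, `composite`, and `checkInvariants` are hypothetical names, and the 0 to 5 fit scale and the 0 to 2 factor levels are assumptions inferred from the fixtures quoted in this document.

```typescript
// Illustrative only: these names and scales are assumptions,
// not the real API of src/lib/scoring.ts.

// Small deterministic LCG so a failing case can be replayed from its seed.
function seededRandom(seed: number): () => number {
  let state = seed % 4294967296;
  return () => {
    state = (state * 1664525 + 1013904223) % 4294967296;
    return state / 4294967296; // value in [0, 1)
  };
}

// Normalise raw weights to sum to 1; if every raw weight is 0,
// fall back to equal weights.
function normalizeWeights(raw: number[]): number[] {
  const total = raw.reduce((s, w) => s + w, 0);
  if (total === 0) return raw.map(() => 1 / raw.length);
  return raw.map((w) => w / total);
}

// Composite score = sum of weight * fit over all quality attributes.
function composite(weights: number[], fit: number[]): number {
  return weights.reduce((s, w, i) => s + w * fit[i], 0);
}

// Runs `iterations` random cases and returns a description of each violation.
function checkInvariants(seed: number, iterations: number): string[] {
  const rand = seededRandom(seed);
  const failures: string[] = [];
  for (let i = 0; i < iterations; i++) {
    const n = 3 + Math.floor(rand() * 6); // 3..8 quality attributes
    const raw: number[] = [];
    const fit: number[] = [];
    for (let j = 0; j < n; j++) {
      raw.push(Math.floor(rand() * 3)); // 0..2; all-zero rows exercise the fallback
      fit.push(rand() * 5); // assumed 0..5 fit scale
    }
    const weights = normalizeWeights(raw);
    const sum = weights.reduce((s, w) => s + w, 0);
    const score = composite(weights, fit);
    if (Math.abs(sum - 1) > 1e-9) failures.push(`#${i}: weights sum to ${sum}`);
    if (weights.some((w) => w < 0)) failures.push(`#${i}: negative weight`);
    if (score < -1e-9 || score > 5 + 1e-9) failures.push(`#${i}: composite ${score} out of range`);
  }
  return failures;
}

const invariantFailures = checkInvariants(42, 500);
console.log(invariantFailures.length === 0 ? "all invariants hold" : invariantFailures.join("\n"));
```

Seeding the generator is the design point: a random test that fails only sometimes is hard to act on, whereas a fixed seed makes every run identical and every failure replayable.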

3.2 Model-integrity guards — L0 (run in CI on every push/PR)

| Guard | Asserts |
|---|---|
| scripts/verify-model.mjs | The reference model reproduces the math, the fixtures, and all 25 preset targets |
| scripts/cross-check-docs.mjs | The docs agree with each other and with the prototype (qaFit vectors, influence matrix, presets, option names, EN/ID parity); 12 checks |
| scripts/check-app-config.mjs | src/config/* mirrors the documented model (no app↔doc drift) |
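As an illustration of what an EN/ID parity check involves, the sketch below reports every key that lacks one of the two languages. The dictionary shape (`Record<string, Entry>`) and the function name `missingTranslations` are assumptions made for this example; the real guard and the real dictionary may be structured differently.

```typescript
// Hypothetical dictionary shape: one entry per key, one string per language.
type Entry = { en?: string; id?: string };

// Returns one message per missing or blank translation.
function missingTranslations(dict: Record<string, Entry>): string[] {
  const problems: string[] = [];
  for (const key of Object.keys(dict)) {
    const entry = dict[key];
    if (!entry.en || entry.en.trim() === "") problems.push(`${key}: missing EN`);
    if (!entry.id || entry.id.trim() === "") problems.push(`${key}: missing ID`);
  }
  return problems;
}

// Sample data, invented for the example.
const sampleDict: Record<string, Entry> = {
  "toolbar.export": { en: "Export", id: "Ekspor" },
  "toolbar.share": { en: "Share" }, // ID deliberately omitted
};

console.log(missingTranslations(sampleDict));
```

A check of this kind fails on blank strings as well as absent keys, since an empty translation is just as visible to the user as a missing one.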

3.3 End-to-end — npm run test:e2e (Playwright, real chromium)

Real-browser journeys against the dev server at the /architecture-advisor/ sub-path. 6 pass (all gating):

| Spec | Covers |
|---|---|
| e2e/smoke.spec.ts | The 4-step flow loads; a preset recomputes the recommendation (AC-2); the primary export downloads a .md (MADR) |
| e2e/share.spec.ts | AC-14 end to end: Share copies a #s=… deep link to the clipboard; opening it restores the exact recommendation |
| e2e/a11y.spec.ts | Full WCAG A/AA incl. color-contrast (axe, real engine) in Guided/dark + Expert/light + override panel; keyboard operability |

3.4 CI pipelines (.github/workflows/)

  • ci.yml: on push/PR, runs check-app-config → lint → test → build → size (bundle budget, L7) → audit:prod (production-dependency audit, L6).
  • e2e.yml: installs the chromium browser and runs test:e2e (L3).
  • docs-integrity.yml: runs verify-model + cross-check-docs on doc/model changes.
  • deploy.yml: build + publish to GitHub Pages on main.

The model guards are intentionally dependency-free Node scripts so they run identically on a laptop and in CI, and never rot behind a test framework upgrade.
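The core of a drift guard is a two-way comparison between what the docs state and what the config contains. The sketch below shows that idea using only built-in language features; `findDrift`, the `PresetTargets` shape, and the preset names are hypothetical and not taken from the real scripts.

```typescript
// Hypothetical shape: preset name -> expected top D1 option.
type PresetTargets = Record<string, string>;

function has(obj: PresetTargets, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(obj, key);
}

// Reports entries that are missing on either side or that disagree.
function findDrift(documented: PresetTargets, configured: PresetTargets): string[] {
  const drift: string[] = [];
  for (const name of Object.keys(documented)) {
    if (!has(configured, name)) {
      drift.push(`${name}: documented but missing from config`);
    } else if (configured[name] !== documented[name]) {
      drift.push(`${name}: docs say "${documented[name]}", config says "${configured[name]}"`);
    }
  }
  for (const name of Object.keys(configured)) {
    if (!has(documented, name)) drift.push(`${name}: in config but undocumented`);
  }
  return drift;
}

// Sample data, invented for the example.
const docs: PresetTargets = { internalTool: "Monolith", regulatedScale: "Modular Monolith" };
const config: PresetTargets = { internalTool: "Monolith", regulatedScale: "Microservices" };

// A real guard would exit non-zero when this list is non-empty.
console.log(findDrift(docs, config));
```

Checking in both directions matters: a one-way comparison would miss an option added to the config but never documented.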


4. Acceptance-criteria traceability (Build Spec Section 14)

Each criterion maps to its verification method. Automated = covered by a test/guard that fails the build on regression; Manual = still checked by hand on the release checklist (Section 6).

| # | Acceptance criterion (abridged) | Verified by | Status |
|---|---|---|---|
| 1 | install/dev/test/build clean; CI present | ci.yml | ✅ Automated |
| 2 | Any factor change instantly updates weights/rankings/charts/analyses | App.test (preset + single factor → verdict recomputes) | ✅ Automated |
| 3 | Defaults → D1 top Monolith; timeToMarket highest | scoring.test Fixture A | ✅ Automated |
| 4 | team2/dist2/scale2/devops2/ttm0 → D1 top Microservices | scoring.test Fixture B | ✅ Automated |
| 5 | domain2/team0/ttm0 → Modular Monolith; D4 Hexagonal/Clean = 5.0 | scoring.test Fixture C | ✅ Automated |
| 6 | async2/realtime2 → D2 Event-driven/Streaming; scalability+perf lead | scoring.test (AC-6) | ✅ Automated |
| 7 | consistency2 → dataConsistency dominates; D3 Single shared DB | scoring.test (AC-7) | ✅ Automated |
| 8 | Microservices + Single shared DB → distributed monolith warning | antiPatternEngine.test | ✅ Automated |
| 9 | team0/devops0 + Microservices → premature microservices warning | antiPatternEngine.test | ✅ Automated |
| 10 | Contribution table reconciles exactly to the composite | scoring.test (FR-REC-4) | ✅ Automated |
| 11 | Sensitivity names a flipping factor or correctly says "robust" | SensitivityCard.test | ✅ Automated |
| 12 | Radar overlays top options; compare 2–3 options | RadarPanel.test (toggle + dimension switch) | ✅ Automated |
| 13 | Language toggle updates all strings; dark mode fully styled | dict.test (keys) + App.test (toggle); dark mode manual | 🟡 Partial |
| 14 | Share link round-trips; Export ADR = valid MADR | exports.test + e2e/share.spec (deep-link) + e2e/smoke.spec (ADR download) | ✅ Automated |
| 15 | Keyboard-operable; accessible names; AA contrast both themes | a11y.test + e2e/a11y.spec (axe incl. color-contrast + keyboard, both themes, real browser) | ✅ Automated |
| 16 | Every QA/factor/option/rule/template in config + documented | check-app-config + cross-check-docs | ✅ Automated |

Summary: 15/16 fully automated, 1 partial (AC-13 — dark-mode styling completeness is still eyeballed; the dark theme is otherwise axe-clean). Nothing fully manual.
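Criteria 8 and 9 describe rule-style checks, which lend themselves to a small data-driven sketch. The version below is an assumption-laden illustration: the `Selection`, `Factors`, and `Rule` types, the rule ids, and the reading of "team0/devops0" as "both factors at level 0" are all hypothetical, and "Database per service" is an invented option name used only as a contrast case.

```typescript
// Hypothetical types: the chosen option per dimension and the factor levels (assumed 0..2).
interface Selection { d1: string; d3: string }
interface Factors { team: number; devops: number }

interface Rule {
  id: string;
  applies: (s: Selection, f: Factors) => boolean;
}

const rules: Rule[] = [
  {
    // AC-8: Microservices on top of one shared database.
    id: "distributed-monolith",
    applies: (s) => s.d1 === "Microservices" && s.d3 === "Single shared DB",
  },
  {
    // AC-9, read here as: both team and devops at level 0 (an assumption).
    id: "premature-microservices",
    applies: (s, f) => s.d1 === "Microservices" && f.team === 0 && f.devops === 0,
  },
];

// Returns the ids of every rule that fires for this selection.
function detectAntiPatterns(s: Selection, f: Factors): string[] {
  return rules.filter((r) => r.applies(s, f)).map((r) => r.id);
}

console.log(detectAntiPatterns({ d1: "Microservices", d3: "Single shared DB" }, { team: 0, devops: 0 }));
```

Keeping each rule as data (an id plus a predicate) means a test can assert on ids rather than on warning text, so wording and translations can change without breaking the suite.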


5. How to run

npm run test                       # Vitest unit/component/a11y suite (watch: npm run test:watch)
node scripts/verify-model.mjs      # model math + fixtures + 25 preset targets
node scripts/cross-check-docs.mjs  # docs agree with each other + the prototype
node scripts/check-app-config.mjs  # src/config mirrors the documented model
npm run lint && npm run build      # types + lint + production build
npm run size                       # bundle-size budget (after build) — L7
npm run audit:prod                 # production-dependency audit (high/critical) — L6

# E2E (real browser) — one-time browser download, then run:
npx playwright install chromium
npm run test:e2e                   # smoke, share deep-link, structural a11y, keyboard — L3

CI runs the equivalent on every push/PR (ci.yml + e2e.yml + docs-integrity.yml); a green checkmark is the merge gate, and deploy.yml re-runs the unit tests + build before publishing.


6. Release checklist (manual, until L2–L4 automate it)

Run before tagging a release, in both light and dark themes and at 360 px width:

  • Change a factor → weights, rankings, radar, and analyses update with no reload (AC-2).
  • Open the sensitivity card → it names a flipping factor or says "robust," and is correct (AC-11).
  • Toggle radar options and the dimension selector → overlays and ranking update (AC-12).
  • Switch EN↔ID → no untranslated string anywhere; switch theme → everything styled (AC-13).
  • Tab through the whole flow → every control reachable, visible focus, sensible order (AC-15).
  • Export ADR / report / CSV / JSON, Print/PDF, Share link, Import setup → all succeed (AC-14).
  • npm run build size is within budget (Section 9); no console errors on load.

7. Accessibility (L4) & UAT (L5)

Accessibility — WCAG 2.1 AA (NFR + AC-15), automated: names/roles/ARIA via vitest-axe (axe-core) in src/a11y.test.tsx (jsdom), plus full color-contrast + keyboard in a real browser via Playwright + @axe-core/playwright in e2e/a11y.spec.ts, across both themes. The axe run caught and fixed an unlabeled file input; the contrast pass drove the tertiary-token, light-success-green, and dimmed-opacity (off chips / hidden rows) adjustments to clear AA.

  • Keyboard: every interactive control operable, logical tab order, visible focus (already styled via :focus-visible in index.css), no traps; overlays (palette, manual) are escapable.
  • Names/roles: segmented controls use role="radiogroup"/radio; icon-only buttons have aria-label; the radar <svg> has role="img" + label.
  • Contrast: AA for text in both themes (design tokens chosen for this; spot-check after token edits).

UAT — scripted scenarios for two personas, ≥3 participants each, success = task completed unaided + self-reported clarity ≥4/5:

  1. Newcomer (Guided): "You're building a small internal tool — what should you use and why?" Expectations: reaches a recommendation, can explain the top driver in their own words.
  2. Architect (Expert): "Justify a Modular Monolith over Microservices for a regulated, high-scale product." Expectations: uses the contribution bars + sensitivity + close-call, exports an ADR.

8. Security verification (L6)

Pure client-side, no backend/accounts/secrets — the surface is the browser and the dependencies.

  • Injection: all user/derived text rendered via React (escaped). Audit any dangerouslySet… / innerHTML (the prototype mockup uses innerHTML with non-user template strings only; the React app must not interpolate user input into HTML).
  • Persisted/URL state: localStorage + URL-hash state is validated on read (corrupt/stale snapshots are treated as empty — see the isScenario() guard) so a hostile hash can't crash or mislead the app.
  • Dependencies (CI-gated): npm run audit:prod (npm audit --omit=dev --audit-level=high) runs after the build; production deps (React + fontsource + tabler icons) report 0 vulnerabilities. The known dev-only Vite/esbuild dev-server advisory does not ship and is excluded; track it and bump Vite/Vitest when a non-breaking fix lands.
  • Supply chain: package-lock.json committed; CI uses npm ci.
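The "validated on read" point above can be sketched as a type guard wrapped around a defensive parse. Only the `#s=` prefix and the name `isScenario()` come from this document; the `Scenario` shape, the 0 to 2 factor levels, and the JSON-plus-percent-encoding payload format are assumptions chosen to keep the example self-contained.

```typescript
// Hypothetical scenario shape; the real one is not specified in this document.
interface Scenario {
  lang: "en" | "id";
  factors: Record<string, number>; // assumed levels 0..2
}

// Structural check on untrusted input: accept only the exact expected shape.
function isScenario(value: unknown): value is Scenario {
  if (typeof value !== "object" || value === null || Array.isArray(value)) return false;
  const v = value as Record<string, unknown>;
  if (v["lang"] !== "en" && v["lang"] !== "id") return false;
  const factors = v["factors"];
  if (typeof factors !== "object" || factors === null || Array.isArray(factors)) return false;
  const f = factors as Record<string, unknown>;
  return Object.keys(f).every((k) => f[k] === 0 || f[k] === 1 || f[k] === 2);
}

// Assumed payload format: percent-encoded JSON after the "#s=" prefix.
function encodeHash(s: Scenario): string {
  return "#s=" + encodeURIComponent(JSON.stringify(s));
}

// Returns null for anything corrupt, stale, or hostile, so the caller can start empty.
function readHash(hash: string): Scenario | null {
  if (hash.indexOf("#s=") !== 0) return null;
  try {
    const parsed: unknown = JSON.parse(decodeURIComponent(hash.slice(3)));
    return isScenario(parsed) ? parsed : null;
  } catch {
    return null; // malformed percent-encoding or invalid JSON
  }
}

const original: Scenario = { lang: "id", factors: { team: 2, ttm: 0 } };
console.log(readHash(encodeHash(original))); // same content as `original`
console.log(readHash("#s=%7Bnot-json")); // null
```

Returning `null` rather than throwing keeps the failure path identical for every kind of bad input, which is what lets the app treat a corrupt snapshot as simply empty.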

9. Performance verification (L7)

  • Bundle budget (CI-gated): npm run size (scripts/check-bundle-size.mjs) asserts gzip JS ≤120 kB / CSS ≤25 kB (currently ~110 / ~19 with React 19). No chart/diagram library ships; all visuals are hand-built SVG (see DECISIONS.md).
  • Recompute: changing a factor recomputes the full model synchronously (pure functions, no async) — perceptibly instant; verified by the 500-iteration invariant test running in ms.
  • First paint: dark theme applied pre-paint (inline script); fonts font-display: swap.
  • No layout thrash: SVGs are static markup, not re-laid-out per frame.
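The bundle-budget item above reduces to a comparison of measured gzip sizes against fixed limits. The sketch below shows only that comparison, using the limits and approximate current sizes quoted in this section. The names `checkBudget`, `GzipSizes`, and `Budget` are hypothetical, the step that actually gzips the build output is omitted, and treating 1 kB as 1024 bytes is an assumption.

```typescript
const KB = 1024; // assumption: the budget counts 1 kB as 1024 bytes

interface GzipSizes { jsBytes: number; cssBytes: number }
interface Budget { jsKB: number; cssKB: number }

// Limits quoted in this section.
const budget: Budget = { jsKB: 120, cssKB: 25 };

// Returns one message per asset type that exceeds its limit.
function checkBudget(sizes: GzipSizes, limits: Budget): string[] {
  const over: string[] = [];
  if (sizes.jsBytes > limits.jsKB * KB) {
    over.push(`JS ${(sizes.jsBytes / KB).toFixed(1)} kB exceeds ${limits.jsKB} kB`);
  }
  if (sizes.cssBytes > limits.cssKB * KB) {
    over.push(`CSS ${(sizes.cssBytes / KB).toFixed(1)} kB exceeds ${limits.cssKB} kB`);
  }
  return over;
}

// Approximate current sizes quoted in this section (~110 / ~19).
const current: GzipSizes = { jsBytes: 110 * KB, cssBytes: 19 * KB };
console.log(checkBudget(current, budget)); // [] means within budget
```

A guard built this way fails the moment a dependency pushes the bundle over the line, which is earlier and cheaper than noticing a slow first paint in production.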

10. Gaps & roadmap (honest)

| Gap | Impact | Plan |
|---|---|---|
| UAT not yet executed (L5) | Real-user clarity unproven | Run uat-script.md with ≥3 per persona before v1.1 |
| Component/integration (L2) is broad but not exhaustive | A few minor affordances still ride the release checklist | Add cases opportunistically as components change |

11. Definition of Done (testing gate)

A change is "done" when: the unit suite and all three guards pass; lint and build are clean; new model/config/doc changes keep cross-check-docs and check-app-config green; any new UI affordance is covered by the release checklist (Section 6); and no acceptance criterion regresses.


| Version | Date | Notes |
|---|---|---|
| 0.1 | 2026-06-18 | Initial test plan: strategy, current inventory (39 tests + 3 guards + CI), AC traceability, manual checklist, security/perf/UAT/a11y approach, and the L2–L7 gap roadmap. |
| 0.2 | 2026-06-20 | First L2 component/integration tests landed (9 cases via src/test/render.tsx): App reactivity (AC-2) + language (AC-13), RadarPanel (AC-12), SensitivityCard (AC-11). Inventory 39→48; automated AC 11→14 of 16. |
| 0.3 | 2026-06-20 | L4 accessibility automated with vitest-axe (axe-core), WCAG A/AA, on the composed app + Expert/override panel (src/a11y.test.tsx, 2 cases); caught & fixed an unlabeled file input (Toolbar). Inventory 48→50; AC-15 manual→partial (contrast/keyboard still manual). |
| 0.4 | 2026-06-20 | Extended L2 to the override panel + redistribution, command palette, and the manual / A/B-compare overlays. Inventory 50→62. |
| 0.5 | 2026-06-20 | L3 E2E (Playwright: smoke, share deep-link, structural a11y, keyboard) + real-browser keyboard for AC-15; L6/L7 gated in CI (audit:prod, bundle-size budget); L5 UAT script added (uat-script.md). Full color-contrast AA tracked as a test.fixme. |
| 0.6 | 2026-06-20 | Full color-contrast AA remediated and gated in the real browser (tertiary tokens, light success green, off-chip/hidden-row opacity); e2e/a11y.spec now asserts color-contrast in both themes (no fixme); L4 ✅; AC-15 ✅ automated (15/16 AC automated). |