✨ Align local TDD dynamic content filtering #274

Merged: Robdel12 merged 3 commits into main from rd/tdd-dynamic-content-cli on May 23, 2026
Robdel12 commented May 23, 2026

Why

Local TDD was using a much thinner view of dynamic content than cloud review.

In the cloud, Vizzly has access to honeydiff metadata, historical hot spot analysis, and user-confirmed dynamic regions. That lets cloud review treat things like timestamps, generated copy bands, avatar slots, and other known noisy areas differently from real layout changes.

The local TDD workflow did not have that full context. Even when the cloud knew a region was dynamic, the local server could still surface the change as a normal diff. The result was a broken-feeling loop: you would run TDD locally, get a wall of diffs, and then see cloud review behave differently because it had better metadata.

This PR makes the CLI side consume and apply that dynamic-content context so local TDD and cloud review are making the same kind of decision.

What Changed

Baseline downloads now persist dynamic metadata

When the app returns bundled TDD baseline context, the CLI now saves:

  • historical hot spot analysis to .vizzly/hotspots.json
  • confirmed dynamic regions to .vizzly/regions.json
  • the same metadata when downloading by explicit baseline build
  • fallback dynamic metadata when downloading by explicit comparison id

That comparison-id fallback matters because older comparison payloads may not include the new dynamic_content field directly. In that case, the CLI asks the comparison context endpoint for the history payload and persists the dynamic regions/hotspots from there.
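
As a rough illustration of that fallback, here is a minimal sketch. Only the `dynamic_content` field and the two `.vizzly/` paths come from this description; the `hotspots` and `regions` keys and both function names are assumptions made for the example, not the CLI's actual code.

```javascript
// Sketch only. `dynamic_content` and the two .vizzly paths come from the PR
// description; the `hotspots`/`regions` keys and both function names are
// hypothetical.

// True when the comparison payload predates the dynamic_content field, so the
// caller should fetch the comparison context (history) payload instead.
function needsContextFallback(comparison) {
  return comparison == null || comparison.dynamic_content == null;
}

// Turn whichever metadata source was used into a list of files to write.
// Missing pieces are skipped rather than written as empty files.
function planMetadataFiles(dynamic) {
  const files = [];
  if (dynamic?.hotspots) {
    files.push({
      path: '.vizzly/hotspots.json',
      contents: JSON.stringify(dynamic.hotspots, null, 2),
    });
  }
  if (dynamic?.regions) {
    files.push({
      path: '.vizzly/regions.json',
      contents: JSON.stringify(dynamic.regions, null, 2),
    });
  }
  return files;
}
```

A caller would check `needsContextFallback(comparison)` first, fetch the context payload only when it returns `true`, and pass whichever source it ends up with to `planMetadataFiles`.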

Local honeydiff runs now request the needed metadata

The CLI comparison path now asks honeydiff for:

  • diff clusters
  • merged clusters
  • SSIM/perceptual score
  • GMSD

The important one for this PR is SSIM. Confirmed dynamic regions should not become a blanket escape hatch. They are safe only when the changed pixels are in expected regions and the overall image remains structurally similar.

Confirmed-region filtering now mirrors cloud semantics

Before this PR, the initial local implementation checked whether a diff cluster intersected a confirmed region. During review, that turned out to be too permissive.

Cloud confirmed-region matching uses center proximity, then weights coverage by changed pixels. This PR updates the CLI to follow that same shape:

  • normalize honeydiff bounding boxes
  • match clusters to confirmed regions by center distance
  • weight region coverage by pixelCount
  • fall back to estimated pixels from the bounding box when needed
  • require at least 90% confirmed-region coverage
  • require SSIM of at least 0.95
  • fail closed when SSIM is unavailable

This means a huge unmatched change will not be hidden just because a small piece of it touches a confirmed region. It also means a structural layout shift inside or near a dynamic area still fails locally, matching the cloud behavior we actually want.
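
The rules above can be sketched roughly as follows. The 0.9 coverage and 0.95 SSIM thresholds and the `pixelCount` weighting come from this description; the object shapes, the function names, and the 50 px center-distance tolerance are assumptions for illustration, and bounding-box normalization is left out. The real matching rule lives in `src/tdd/core/region-coverage.js` and may differ.

```javascript
// Thresholds stated in the PR description.
const MIN_REGION_COVERAGE = 0.9;
const MIN_SSIM = 0.95;

// Assumed shapes: cluster = { boundingBox: { x, y, width, height }, pixelCount? }
// and region = { x, y, width, height }.
function centerOf(box) {
  return { x: box.x + box.width / 2, y: box.y + box.height / 2 };
}

// Prefer the reported changed-pixel count; otherwise estimate from the box area.
function clusterPixels(cluster) {
  if (Number.isFinite(cluster.pixelCount) && cluster.pixelCount > 0) {
    return cluster.pixelCount;
  }
  return cluster.boundingBox.width * cluster.boundingBox.height;
}

// Fraction of changed pixels that belong to clusters whose center is close to
// a confirmed region's center. `maxCenterDistance` is a hypothetical tolerance.
function confirmedCoverage(clusters, regions, maxCenterDistance = 50) {
  let total = 0;
  let covered = 0;
  for (const cluster of clusters) {
    const pixels = clusterPixels(cluster);
    total += pixels;
    const c = centerOf(cluster.boundingBox);
    const matched = regions.some((region) => {
      const r = centerOf(region);
      return Math.hypot(c.x - r.x, c.y - r.y) <= maxCenterDistance;
    });
    if (matched) covered += pixels;
  }
  return total === 0 ? 0 : covered / total;
}

// Both gates must pass; a missing or non-numeric SSIM fails closed.
function canAutoFilter(clusters, regions, ssim) {
  if (typeof ssim !== 'number' || Number.isNaN(ssim)) return false;
  if (ssim < MIN_SSIM) return false;
  return confirmedCoverage(clusters, regions) >= MIN_REGION_COVERAGE;
}
```

Weighting by pixels rather than by cluster count is what keeps one large unmatched cluster from being outvoted by several small matched ones.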

Honeydiff is upgraded to the latest release

This branch also upgrades @vizzly-testing/honeydiff from the locked 0.10.1 install to 0.10.3, the current latest dist-tag on npm.

Keeping the CLI on the newest honeydiff release matters here because this local TDD path now depends directly on honeydiff's richer metadata: merged clusters, SSIM/perceptual score, and GMSD. The app PR upgrades honeydiff to the same version so local and cloud are comparing with the same package release.

User Impact

Local TDD should stop producing wildly different results from cloud review for known dynamic content.

The intended workflow becomes:

  1. Download baselines through local TDD.
  2. Receive the same dynamic-content context cloud review already knows about.
  3. Run local comparisons with the same high-level filtering rules.
  4. See diffs that are much closer to what cloud review will consider meaningful.

That makes local TDD useful again for pages with dynamic text, generated areas, or recurring noisy regions.

Safety Notes

This is intentionally conservative:

  • dynamic regions only auto-filter when coverage is high enough
  • SSIM must also pass
  • missing SSIM means no region auto-approval
  • hot spot filtering still runs after confirmed-region filtering
  • real pixel diffs still fail when they do not meet the dynamic-content rules

The tests cover the important failure mode from review: a cluster can intersect a confirmed region but still fail if its center does not match the confirmed region the way cloud expects.
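
To make that failure mode concrete, here is a small sketch with made-up coordinates. The function names and the 50 px tolerance are assumptions for the example and are not taken from the actual test files.

```javascript
// Axis-aligned overlap test: the older, too-permissive check.
function intersects(a, b) {
  return (
    a.x < b.x + b.width &&
    b.x < a.x + a.width &&
    a.y < b.y + b.height &&
    b.y < a.y + a.height
  );
}

// Distance between box centers: the basis of the center-proximity check.
function centerDistance(a, b) {
  const ax = a.x + a.width / 2;
  const ay = a.y + a.height / 2;
  const bx = b.x + b.width / 2;
  const by = b.y + b.height / 2;
  return Math.hypot(ax - bx, ay - by);
}

// Illustrative values: a confirmed timestamp strip and a large diff cluster
// that clips only its right edge.
const region = { x: 0, y: 0, width: 100, height: 20 };
const cluster = { x: 90, y: 10, width: 400, height: 300 };

const overlaps = intersects(cluster, region); // true: the boxes touch
const centersMatch = centerDistance(cluster, region) <= 50; // false: centers are far apart
```

Under intersection alone this cluster would have been filtered; under center matching it stays a real diff.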

Verification

  • npm view @vizzly-testing/honeydiff version dist-tags --json (confirmed that latest is 0.10.3)
  • npm ls @vizzly-testing/honeydiff --depth=0
  • npx biome check src/tdd/tdd-service.js src/tdd/services/comparison-service.js src/tdd/core/region-coverage.js tests/tdd/tdd-service.test.js tests/tdd/services/comparison-service.test.js tests/tdd/core/region-coverage.test.js
  • node --test --test-concurrency=1 --test-reporter=spec tests/tdd/core/region-coverage.test.js tests/tdd/services/comparison-service.test.js tests/tdd/tdd-service.test.js
  • npm run build

Match local confirmed-region filtering to cloud behavior by using center-based region matching, pixel-weighted coverage, and SSIM gates. Persist bundled and fallback dynamic metadata during baseline downloads so local TDD can use the same context as cloud review.

Use the latest released honeydiff package for local TDD comparisons so the CLI runs against the newest diff metadata behavior.

Robdel12 marked this pull request as ready for review May 23, 2026 19:17
Keep React and React DOM in the published dependency set because the packed CLI loads the SSR report renderer at runtime.

vizzly-testing Bot commented May 23, 2026

Vizzly - Visual Test Results

CLI Reporter - 2 changes need review
Status            Count
Passed            17
Meaningful diffs  2
Auto-approved     17
Meaningful diffs needing review (2)

dashboard-mixed-state · Firefox · 375×892 · 28.7% diff

fullscreen-viewer · Firefox · 375×667 · 78.5% diff

CLI TUI - Approved

5 comparisons, no changes detected.

rd/tdd-dynamic-content-cli · 0e9914f8

Robdel12 merged commit e12aa5c into main May 23, 2026
29 of 30 checks passed
Robdel12 deleted the rd/tdd-dynamic-content-cli branch May 23, 2026 19:51