| 1 | +# Visual Regression Testing |
| 2 | + |
| 3 | +Pixel-level visual regression is the outermost geometry-fidelity layer in GraphCompose. It renders a PDF to one PNG per page and diffs each page against a committed baseline. |
| 4 | + |
| 5 | +It complements — does not replace — [layout snapshot testing](./layout-snapshot-testing.md): |
| 6 | + |
| 7 | +1. unit tests validate isolated layout math |
| 8 | +2. **layout snapshots** validate the resolved document tree (coordinates, page spans, layer/order) — structural geometry |
| 9 | +3. **visual regression** validates the rendered pixels — font shape, colour, anti-aliasing, glyph fallback |
| 10 | +4. human inspection of the PDF remains the final eye |
| 11 | + |
| 12 | +Reach for visual regression when the failure you care about is *pixel-level* rather than *geometry-level*: the layout snapshot still matches but the PDF looks wrong (wrong font, wrong colour, missing glyph, anti-aliasing drift). |
| 13 | + |
| 14 | +## Pixel vs semantic — which layer? |
| 15 | + |
| 16 | +| You want to catch… | Use | |
| 17 | +|---|---| |
| 18 | +| A node moved / page break shifted / sibling order changed | layout snapshot (semantic) | |
| 19 | +| The rendered PDF stops matching pixel-for-pixel: fonts, colours, glyphs | **visual regression (pixel)** | |
| 20 | +| A specific layout-math rule | a focused unit test | |
| 21 | + |
| 22 | +The semantic layer is cheap, deterministic, and cross-platform stable. The pixel layer is precise but sensitive to platform font rendering (see [Cross-platform tolerance](#cross-platform-tolerance) below). A flagship template or a preset you publish to others deserves both. |
| 23 | + |
| 24 | +## Public API |
| 25 | + |
| 26 | +The harness is `com.demcha.compose.testing.visual.PdfVisualRegression` (`@since 1.6.9`), a sibling to the semantic `com.demcha.compose.testing.layout.*` helpers. It ships in the main artifact, so library consumers use the exact helpers GraphCompose uses in its own tests. |
| 27 | + |
| 28 | +### Assert a committed baseline |
| 29 | + |
| 30 | +```java |
| 31 | +import com.demcha.compose.GraphCompose; |
| 32 | +import com.demcha.compose.document.api.DocumentPageSize; |
| 33 | +import com.demcha.compose.document.api.DocumentSession; |
| 34 | +import com.demcha.compose.testing.visual.PdfVisualRegression; |
| 35 | +import org.junit.jupiter.api.Test; |
| 36 | + |
| 37 | +class InvoiceVisualParityTest { |
| 38 | + |
| 39 | + @Test |
| 40 | + void invoiceRendersPixelIdentical() throws Exception { |
| 41 | + byte[] pdfBytes; |
| 42 | + try (DocumentSession document = GraphCompose.document() |
| 43 | + .pageSize(DocumentPageSize.A4) |
| 44 | + .margin(22, 22, 22, 22) |
| 45 | + .create()) { |
| 46 | + template.compose(document, spec); // template and spec are fixtures defined elsewhere in the test |
| 47 | + pdfBytes = document.toPdfBytes(); |
| 48 | + } |
| 49 | + |
| 50 | + PdfVisualRegression.standard() |
| 51 | + .assertMatchesBaseline("invoice_standard", pdfBytes); |
| 52 | + } |
| 53 | +} |
| 54 | +``` |
| 55 | + |
| 56 | +`assertMatchesBaseline(name, pdfBytes)` renders every page, compares against `<name>-page-N.png` under the baseline root, and throws `AssertionError` if any page exceeds the configured budget. On failure it writes `<name>-page-N.actual.png` and `<name>-page-N.diff.png` next to the baseline for inspection. |
| 57 | + |
| 58 | +### Configure the harness |
| 59 | + |
| 60 | +`PdfVisualRegression` is immutable; every setter returns a copy. |
| 61 | + |
| 62 | +| Setter | Default | Meaning | |
| 63 | +|---|---|---| |
| 64 | +| `baselineRoot(Path)` | `src/test/resources/visual-baselines` | where baselines and diff sidecars live | |
| 65 | +| `renderScale(float)` | `1.0` | render scale multiplier (`2.0` = retina); must be `> 0` | |
| 66 | +| `perPixelTolerance(int)` | `6` | allowed per-channel delta (`0..255`) before a pixel counts as mismatched | |
| 67 | +| `mismatchedPixelBudget(long)` | `0` | mismatched pixels tolerated per page before the assertion fails | |
| 68 | + |
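To make the interplay of `perPixelTolerance` and `mismatchedPixelBudget` concrete, here is a minimal sketch of a tolerance-based pixel comparison. It is not the library's `ImageDiff`. The class `ToleranceDiffSketch` and its methods are hypothetical, and comparing all four ARGB channels and rejecting images of different sizes are assumptions rather than documented behaviour.

```java
import java.awt.image.BufferedImage;

/** Hypothetical illustration only; the real harness may differ in detail. */
final class ToleranceDiffSketch {

    /** Counts pixels where at least one channel differs by more than the tolerance. */
    static long countMismatches(BufferedImage expected, BufferedImage actual, int perPixelTolerance) {
        if (expected.getWidth() != actual.getWidth() || expected.getHeight() != actual.getHeight()) {
            // Assumption: size changes are treated as a hard failure.
            throw new IllegalArgumentException("image dimensions differ");
        }
        long mismatched = 0;
        for (int y = 0; y < expected.getHeight(); y++) {
            for (int x = 0; x < expected.getWidth(); x++) {
                if (exceeds(expected.getRGB(x, y), actual.getRGB(x, y), perPixelTolerance)) {
                    mismatched++;
                }
            }
        }
        return mismatched;
    }

    /** True when any of the four 8-bit channels differs by more than the tolerance. */
    static boolean exceeds(int argbA, int argbB, int tolerance) {
        for (int shift = 0; shift <= 24; shift += 8) {
            int a = (argbA >>> shift) & 0xFF;
            int b = (argbB >>> shift) & 0xFF;
            if (Math.abs(a - b) > tolerance) {
                return true;
            }
        }
        return false;
    }

    /** A page passes when the mismatch count does not exceed the budget. */
    static boolean withinBudget(long mismatched, long budget) {
        return mismatched <= budget;
    }
}
```

Read this way, the tolerance decides which pixels count as different and the budget decides how many such pixels a page may contain before it fails.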
| 69 | +### Diff images directly |
| 70 | + |
| 71 | +For ad-hoc comparison, render pages and call `ImageDiff` yourself: |
| 72 | + |
| 73 | +```java |
| 74 | +List<BufferedImage> pages = PdfVisualRegression.standard().renderPages(pdfBytes); |
| 75 | +ImageDiff.Result diff = ImageDiff.compare(expectedPng, pages.get(0), 6); |
| 76 | +assertThat(diff.withinBudget(0)).isTrue(); |
| 77 | +``` |
| 78 | + |
| 79 | +## Approve mode (blessing baselines) |
| 80 | + |
| 81 | +On the first run no baseline exists yet. Run with the approve flag to write the current renders as the baseline: |
| 82 | + |
| 83 | +```bash |
| 84 | +./mvnw test -Dtest=InvoiceVisualParityTest -Dgraphcompose.visual.approve=true |
| 85 | +``` |
| 86 | + |
| 87 | +The system-property name is exposed as `PdfVisualRegression.APPROVE_PROPERTY`; the environment variable `GRAPHCOMPOSE_VISUAL_APPROVE=true` works as a fallback. In approve mode the harness writes baselines and skips the diff assertion — so **never enable it in CI verification**, only when you have reviewed the new render and intend to re-bless. |
| 88 | + |
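One way to enforce the "never in CI" rule is a small guard that runs before the visual tests. The sketch below is hypothetical and not part of the harness: `ApproveModeSketch` and its methods are invented names. It assumes the system property wins and the environment variable is consulted only when the property is absent, and that your CI provider sets a `CI` environment variable.

```java
/** Hypothetical guard; names and precedence are assumptions, not the library's code. */
final class ApproveModeSketch {

    static final String APPROVE_PROPERTY = "graphcompose.visual.approve";
    static final String APPROVE_ENV = "GRAPHCOMPOSE_VISUAL_APPROVE";

    /** Pure resolution logic: property first, environment variable as fallback. */
    static boolean approveRequested(String propertyValue, String envValue) {
        if (propertyValue != null) {
            return Boolean.parseBoolean(propertyValue.trim());
        }
        return envValue != null && Boolean.parseBoolean(envValue.trim());
    }

    /** Reads the real process settings. */
    static boolean approveRequested() {
        return approveRequested(System.getProperty(APPROVE_PROPERTY), System.getenv(APPROVE_ENV));
    }

    /** Throws when approve mode is on while a CI marker variable is present. */
    static void failIfApprovingOnCi(boolean approve, String ciEnvValue) {
        if (approve && ciEnvValue != null && !ciEnvValue.isEmpty()) {
            throw new IllegalStateException(
                    "Approve mode must not run in CI: baselines would be overwritten without review");
        }
    }
}
```

A test suite could call `ApproveModeSketch.failIfApprovingOnCi(ApproveModeSketch.approveRequested(), System.getenv("CI"))` from a `@BeforeAll` hook.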
| 89 | +## Where files live |
| 90 | + |
| 91 | +- committed baselines: `<baselineRoot>/<name>-page-N.png` |
| 92 | +- mismatch artifacts (normal runs): `<baselineRoot>/<name>-page-N.actual.png` and `<name>-page-N.diff.png` (mismatched pixels red, matching pixels greyscale) |
| 93 | + |
| 94 | +Use a flat `name`, or pre-create nested baseline directories — the harness creates the baseline root but not intermediate folders. |
| 95 | + |
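If you prefer nested names such as `reports/monthly_invoice`, a small helper can pre-create the intermediate folders before the first approve run. This is a sketch using only the JDK. `BaselineDirsSketch` is a hypothetical name, and it assumes baseline files are resolved as `<baselineRoot>/<name>-page-N.png` as described above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper; not part of the published harness. */
final class BaselineDirsSketch {

    /**
     * Creates the directory that will contain the baseline files for a name,
     * including any missing intermediate folders, and returns it.
     */
    static Path ensureParentDirs(Path baselineRoot, String name) throws IOException {
        Path parent = baselineRoot.resolve(name).getParent();
        if (parent == null) {
            // Flat name under an empty root: nothing to create beyond the root itself.
            parent = baselineRoot;
        }
        return Files.createDirectories(parent);
    }
}
```

Calling `ensureParentDirs(Path.of("src", "test", "resources", "pdf-baselines"), "reports/monthly_invoice")` creates `pdf-baselines/reports` and is a no-op when it already exists.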
| 96 | +## Cross-platform tolerance |
| 97 | + |
| 98 | +PDFBox font rasterization drifts slightly across platforms (different system fonts, different rasterizer). A baseline recorded on Windows will not match Linux CI pixel-for-pixel. |
| 99 | + |
| 100 | +The `standard()` defaults are strict (tolerance `6`, budget `0`) — good for same-platform, deterministic renders. For baselines that must survive a Windows-author → Linux-CI round trip, loosen both. GraphCompose's own CV / cover-letter parity tests calibrate to: |
| 101 | + |
| 102 | +```java |
| 103 | +PdfVisualRegression.standard() |
| 104 | + .perPixelTolerance(8) // absorb sub-pixel anti-aliasing drift |
| 105 | + .mismatchedPixelBudget(50_000) // ~glyph edges across a full A4 page |
| 106 | + .assertMatchesBaseline(slug, pdfBytes); |
| 107 | +``` |
| 108 | + |
| 109 | +Tune these to your fonts and page density: too tight and CI flakes on anti-aliasing noise; too loose and a real regression slips through. Start from the values above and tighten only as far as CI stays stable. |
| 110 | + |
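It helps to see a budget as a share of the page rather than a raw count. The sketch below estimates that share for an A4 page. It assumes `renderScale` `1.0` maps one PDF point to one pixel (the usual PDFBox convention of 72 DPI) and rounds page dimensions to whole pixels, so the real rasterizer may differ by a pixel. `PixelBudgetSketch` is a hypothetical helper, not part of the harness.

```java
/** Hypothetical arithmetic helper for reasoning about pixel budgets. */
final class PixelBudgetSketch {

    // A4 is 210 mm x 297 mm; one PDF point is 1/72 inch.
    static final double A4_WIDTH_PT = 210.0 / 25.4 * 72.0;
    static final double A4_HEIGHT_PT = 297.0 / 25.4 * 72.0;

    /** Approximate pixel count of one rendered A4 page at the given scale. */
    static long pagePixels(double renderScale) {
        long width = Math.round(A4_WIDTH_PT * renderScale);
        long height = Math.round(A4_HEIGHT_PT * renderScale);
        return width * height;
    }

    /** Share of the page that may mismatch before the assertion fails. */
    static double budgetFraction(long budget, double renderScale) {
        return (double) budget / pagePixels(renderScale);
    }

    /** Keeps the same share of the page when the render scale changes. */
    static long scaledBudget(long budgetAtScaleOne, double renderScale) {
        return Math.round(budgetAtScaleOne * renderScale * renderScale);
    }
}
```

Under these assumptions an A4 page at scale `1.0` has about 501,000 pixels, so a budget of `50_000` tolerates roughly 10% of the page. Because pixel count grows with the square of the scale, the equivalent budget at `renderScale(2.0f)` would be about `200_000`.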
| 111 | +## Using visual regression in downstream projects |
| 112 | + |
| 113 | +Library consumers use the same published helpers: |
| 114 | + |
| 115 | +```java |
| 116 | +import com.demcha.compose.testing.visual.PdfVisualRegression; |
| 117 | + |
| 118 | +PdfVisualRegression.standard() |
| 119 | + .baselineRoot(Path.of("src", "test", "resources", "pdf-baselines")) |
| 120 | + .perPixelTolerance(8) |
| 121 | + .mismatchedPixelBudget(50_000) |
| 122 | + .assertMatchesBaseline("reports/monthly_invoice", pdfBytes); |
| 123 | +``` |
| 124 | + |
| 125 | +`PublicVisualApiDogfoodTest` in this repository drives exactly this consumer workflow end-to-end and proves the published surface is sufficient without any package-private access. |
| 126 | + |
| 127 | +## When not to use pixel regression |
| 128 | + |
| 129 | +- when structural geometry is what you care about → use a [layout snapshot](./layout-snapshot-testing.md) (cheaper, cross-platform stable) |
| 130 | +- when a small unit test proves the same rule more directly |
| 131 | +- as the *only* gate when CI runs on a different OS from the one where baselines were recorded; pair it with semantic snapshots and a sensible tolerance budget |
| 132 | + |
| 133 | +## Examples in this repository |
| 134 | + |
| 135 | +- `CvV2VisualParityTest`, `CoverLetterV2VisualParityTest` — preset parity with Windows-baseline / Linux-CI calibration |
| 136 | +- `ShapeContainerVisualRegressionTest` — engine primitive fidelity |
| 137 | +- `TableRowSpanDemoTest` — table rendering |
| 138 | +- `PdfVisualRegressionTest` — the harness's own unit tests |
| 139 | +- `PublicVisualApiDogfoodTest` — consumer-surface dogfood |