|
| 1 | +# High-Precision Mandelbrot/Julia via Viewport-Aware Compilation |
| 2 | + |
| 3 | +**Date:** 2026-03-06 |
| 4 | +**Status:** Approved |
| 5 | + |
| 6 | +## Problem |
| 7 | + |
| 8 | +The current Mandelbrot/Julia GPU implementation uses single-precision floats |
| 9 | +(GLSL `float`, 23-bit mantissa, ~7 decimal digits). Interactive deep-zoom |
| 10 | +becomes unusable past ~10^6x magnification as coordinates become |
| 11 | +indistinguishable and the image pixelates. |
| 12 | + |
| 13 | +## Goal |
| 14 | + |
| 15 | +Enable interactive deep-zoom rendering of Mandelbrot/Julia sets beyond 32-bit |
| 16 | +float limitations, with no changes to the user-facing expression API. |
| 17 | + |
| 18 | +## Design |
| 19 | + |
| 20 | +### User-Facing API (unchanged) |
| 21 | + |
| 22 | +``` |
| 23 | +Mandelbrot(x + yi, 256) |
| 24 | +Julia(x + yi, -0.7 + 0.27i, 256) |
| 25 | +``` |
| 26 | + |
| 27 | +No new functions. The user writes the same expressions as today. |
| 28 | + |
| 29 | +### Compile API Extension |
| 30 | + |
| 31 | +The `compile()` method accepts an optional `hints` object with a nested |
| 32 | +`viewport` field: |
| 33 | + |
| 34 | +```typescript |
| 35 | +compile(expr, { |
| 36 | + realOnly: true, |
| 37 | + hints: { |
| 38 | + viewport: { center: [0.3, 0.1], radius: 0.001 } |
| 39 | + } |
| 40 | +}) |
| 41 | +``` |
| 42 | + |
| 43 | +When `hints` is absent (or `hints.viewport` is absent), the compiler uses the |
| 44 | +current single-precision strategy (backward compatible). The `hints` object is |
| 45 | +open-ended — `viewport` is the initial field, leaving room for non-viewport |
| 46 | +hints later without ambiguity about what `center` and `radius` refer to. |
| 47 | + |
| 48 | +### CompilationResult Extension |
| 49 | + |
| 50 | +```typescript |
| 51 | +interface CompilationResult { |
| 52 | + success: boolean; |
| 53 | + run: Function; |
| 54 | + code?: string; |
| 55 | + |
| 56 | + // Cheap staleness check -- plain numbers, no closures |
| 57 | + staleWhen?: { |
| 58 | + radiusBelow?: number; // recompile when zoomed deeper than this |
| 59 | + radiusAbove?: number; // recompile when zoomed out past this |
| 60 | + centerDistance?: number; // recompile when center moves more than this |
| 61 | + }; |
| 62 | + |
| 63 | + // Scalar uniforms the shader needs |
| 64 | + uniforms?: Record<string, number>; |
| 65 | + |
| 66 | + // Texture data the shader needs (e.g., reference orbit) |
| 67 | + // Separated from uniforms because the upload path is fundamentally |
| 68 | + // different (createTexture + sampler vs uniform1f) |
| 69 | + textures?: Record<string, { |
| 70 | + data: Float32Array; |
| 71 | + width: number; |
| 72 | + height: number; |
| 73 | + format: 'r32f' | 'rg32f' | 'rgba32f'; |
| 74 | + }>; |
| 75 | +} |
| 76 | +``` |
| 77 | + |
| 78 | +`staleWhen` is plain serializable data. The plot engine checks it with a few |
| 79 | +number comparisons per viewport change -- trivially cheap. |
| 80 | + |
| 81 | +`uniforms` and `textures` are separated because their GPU upload paths are |
| 82 | +fundamentally different: scalar uniforms use `uniform1f`/`uniform1i`, while |
| 83 | +textures need `createTexture` + sampler setup + dimension metadata. The |
| 84 | +`format` field tells the plot engine the internal format to use when creating |
| 85 | +the texture (e.g., `gl.RG32F` for WebGL2). The reference orbit uses `rg32f`: |
| 86 | +each texel holds one orbit point as (re, im), which is the natural mapping. |
| 87 | +The plot engine uploads both without knowing what the data represents. |
| 88 | + |
| 89 | +### Precision Strategy Selection |
| 90 | + |
| 91 | +The compiler picks the strategy based on `hints.viewport.radius`: |
| 92 | + |
| 93 | +| Radius | Strategy | GPU cost | staleWhen | |
| 94 | +| ------------ | ---------------- | -------- | ----------------------------------------------------- | |
| 95 | +| > 10^-6 | Single float | 1x | `{ radiusBelow: 1e-6 }` | |
| 96 | +| 10^-6..10^-14| Emulated double | ~5x | `{ radiusBelow: 1e-14, radiusAbove: 1e-5 }` | |
| 97 | +| < 10^-14 | Perturbation | ~1.2x | `{ radiusAbove: 1e-5, radiusBelow: radius * 0.01, centerDistance: radius * 2.0 }` | |
| 98 | + |
| 99 | +When zooming out from perturbation, `radiusAbove: 1e-5` jumps directly to |
| 100 | +single-float, skipping the emulated-double tier. The brief passage through the |
| 101 | +10^-6 to 10^-5 range with single-float precision is acceptable for a transient |
| 102 | +zoom level and avoids a pointless intermediate recompilation. |
| 103 | + |
| 104 | +When zooming deeper within the perturbation tier, `radiusBelow: radius * 0.01` |
| 105 | +triggers recompilation after 100x further magnification. This is needed because |
| 106 | +the reference orbit's BigDecimal precision is calibrated to the zoom level at |
| 107 | +which it was computed — zooming significantly deeper requires recomputing the |
| 108 | +orbit at higher precision (more BigDecimal digits) and potentially more |
| 109 | +iterations. |
| 110 | + |
| 111 | +The perturbation `centerDistance` is set to `radius * 2.0` (two viewport |
| 112 | +widths). Perturbation theory tolerates reasonable center drift since deltas |
| 113 | +simply grow and glitch detection handles the worst cases. A conservative |
| 114 | +threshold avoids excessive recompilation during interactive panning. This can |
| 115 | +be tightened later if glitch frequency becomes a problem. |
| 116 | + |
| 117 | +When no hints are provided, single-float is used with no `staleWhen`. |
| 118 | + |
| 119 | +### Three GLSL Preambles |
| 120 | + |
| 121 | +**1. Single float** -- already exists (`GPU_FRACTAL_PREAMBLE_GLSL`). |
| 122 | + |
| 123 | +**2. Emulated double** -- new (~200 lines). Uses "double-single" (float-float) |
| 124 | +arithmetic where a number is stored as `vec2(hi, lo)` with `value = hi + lo`: |
| 125 | + |
| 126 | +```glsl |
| 127 | +vec2 ds_add(vec2 a, vec2 b); // Knuth TwoSum |
| 128 | +vec2 ds_mul(vec2 a, vec2 b); // Dekker split + TwoProduct |
| 129 | +vec2 ds_sqr(vec2 a); // optimized self-multiply |
| 130 | +
|
| 131 | +float _fractal_mandelbrot_dp(vec4 c, int maxIter) { |
| 132 | + // c.xy = hi parts of (re, im), c.zw = lo parts |
| 133 | + // Uses ds_add/ds_mul for z -> z^2 + c iteration |
| 134 | +} |
| 135 | +``` |
| 136 | + |
| 137 | +This gives ~48 bits of mantissa (~14 decimal digits) from two 32-bit floats. |
| 138 | +The Dekker/Knuth algorithms are well-established and widely used in GPU |
| 139 | +double-emulation. |
| 140 | + |
| 141 | +**3. Perturbation** -- new. Standard single-float arithmetic on small deltas: |
| 142 | + |
| 143 | +```glsl |
| 144 | +uniform sampler2D _refOrbit; // reference orbit Z_n as texture |
| 145 | +uniform int _refOrbitLen; |
| 146 | +
|
| 147 | +float _fractal_mandelbrot_pt(vec2 delta_c, int maxIter) { |
| 148 | + // delta_{n+1} = 2 * Z_n * delta_n + delta_n^2 + delta_c |
| 149 | + // Z_n fetched from _refOrbit texture row by row |
| 150 | + // Glitch detection: if |delta| > |Z|, rebase using emulated double |
| 151 | +} |
| 152 | +``` |
| 153 | + |
| 154 | +Perturbation theory: instead of iterating z -> z^2 + c per pixel, pick a |
| 155 | +reference point C0 at viewport center, compute its full orbit Z_0..Z_N once |
| 156 | +on the CPU at arbitrary precision, then for each pixel compute only the delta |
| 157 | +from that orbit. Since delta is the difference between nearby points, it stays |
| 158 | +small and single-float is sufficient. |
| 159 | + |
| 160 | +### Reference Orbit Computation |
| 161 | + |
| 162 | +When the compiler selects perturbation, it computes the reference orbit on the |
| 163 | +CPU using `BigDecimal` (the engine's existing arbitrary-precision library): |
| 164 | + |
| 165 | +```typescript |
| 166 | +// Inside the compiler, when perturbation is selected: |
| 167 | +const orbit = computeReferenceOrbit(center, maxIter); |
| 168 | +// Pack orbit into a texture (2 floats per texel: re, im) |
| 169 | +const orbitLen = orbit.length / 2; |
| 170 | +const texWidth = Math.min(orbitLen, 4096); |
| 171 | +const texHeight = Math.ceil(orbitLen / texWidth); |
| 172 | +result.textures = { |
| 173 | + _refOrbit: { |
| 174 | + data: new Float32Array(orbit), // [re0, im0, re1, im1, ...] |
| 175 | + width: texWidth, |
| 176 | + height: texHeight, |
| 177 | + format: 'rg32f', // 2 floats per texel: (re, im) per orbit point |
| 178 | + }, |
| 179 | +}; |
| 180 | +result.uniforms = { _refOrbitLen: orbitLen }; |
| 181 | +``` |
| 182 | + |
| 183 | +The plot engine uploads `textures` and `uniforms` to the GPU without knowing |
| 184 | +what they represent. |
| 185 | + |
| 186 | +### Async Boundary for Orbit Computation |
| 187 | + |
| 188 | +At deep zoom levels, iterating a BigDecimal Mandelbrot orbit for tens of |
| 189 | +thousands of iterations can take 100ms+. The `compile()` method itself remains |
| 190 | +**synchronous** — it returns `CompilationResult` directly. The async boundary |
| 191 | +lives in the **plot engine**, which is responsible for calling `compile()` off |
| 192 | +the main thread: |
| 193 | + |
| 194 | +- **Recommended**: the plot engine calls `compile()` inside a |
| 195 | + `requestIdleCallback` or `setTimeout`, then swaps the shader when ready |
| 196 | +- **Alternative**: the plot engine uses a Web Worker for the compile call |
| 197 | + (BigDecimal is pure JS, no DOM dependencies, worker-friendly) |
| 198 | + |
| 199 | +The compute engine does not impose an async API because: |
| 200 | +1. Most compilations (single-float, emulated double) are instant |
| 201 | +2. The plot engine already manages the stale-while-revalidate pattern and |
| 202 | + knows when to schedule heavy work |
| 203 | +3. Forcing a Promise return for the common fast path adds unnecessary |
| 204 | + complexity |
| 205 | + |
| 206 | +### Plot Engine Protocol |
| 207 | + |
| 208 | +```typescript |
| 209 | +let compiled = ce.compile(expr, { hints: { viewport } }); |
| 210 | + |
| 211 | +function onViewportChange(newViewport) { |
| 212 | + // Cheap staleness check (a few number comparisons) |
| 213 | + if (compiled.staleWhen) { |
| 214 | + const s = compiled.staleWhen; |
| 215 | + const stale = |
| 216 | + (s.radiusBelow && newViewport.radius < s.radiusBelow) || |
| 217 | + (s.radiusAbove && newViewport.radius > s.radiusAbove) || |
| 218 | + (s.centerDistance && |
| 219 | + dist(newViewport.center, oldCenter) > s.centerDistance); |
| 220 | + |
| 221 | + if (stale) { |
| 222 | + // Async recompile -- keep rendering old shader meanwhile |
| 223 | + recompileAsync(expr, { hints: { viewport: newViewport } }).then(c => { |
| 224 | + compiled = c; |
| 225 | + uploadUniforms(c.uniforms); |
| 226 | + uploadTextures(c.textures); |
| 227 | + }); |
| 228 | + } |
| 229 | + } |
| 230 | + render(compiled); |
| 231 | +} |
| 232 | +``` |
| 233 | + |
| 234 | +The protocol is generic: any compiled function can declare staleness and |
| 235 | +request uniforms/textures. Non-fractal functions ignore hints and return no |
| 236 | +`staleWhen`. |
| 237 | + |
| 238 | +**Debouncing note:** during active gestures (pinch-zoom, scroll), the viewport |
| 239 | +may cross staleness thresholds transiently. The plot engine should debounce |
| 240 | +recompilation — e.g., only fire after the viewport has been stable for ~100ms |
| 241 | +or after the gesture ends. This avoids wasted recompilations when the user |
| 242 | +zooms through a threshold and immediately reverses. This is a plot engine |
| 243 | +implementation detail, not a compute engine concern. |
| 244 | + |
| 245 | +### Glitch Detection (Perturbation) |
| 246 | + |
| 247 | +When the perturbation delta grows too large relative to the reference orbit |
| 248 | +value (|delta| > |Z|), the approximation breaks down ("glitch"). The shader |
| 249 | +detects this and rebases to absolute coordinates using emulated double |
| 250 | +arithmetic from the double-single preamble. |
| 251 | + |
| 252 | +This means the **perturbation preamble literally includes the ds_\* functions** |
| 253 | +from the emulated-double preamble. Phase 6 (perturbation shader) depends on |
| 254 | +Phase 2-3 (emulated double) not just conceptually but because the perturbation |
| 255 | +GLSL code calls `ds_add`, `ds_mul`, etc. directly for rebase operations. |
| 256 | + |
| 257 | +### What Stays the Same |
| 258 | + |
| 259 | +- `Mandelbrot(c, maxIter)` and `Julia(z, c, maxIter)` signatures unchanged |
| 260 | +- CPU evaluation unchanged (already uses JS `number` doubles; BigDecimal only |
| 261 | + used for reference orbit computation in perturbation mode) |
| 262 | +- Single-float GPU path unchanged |
| 263 | +- All existing tests pass without modification |
| 264 | +- Non-fractal expression compilation unaffected |
| 265 | + |
| 266 | +## Implementation Phases |
| 267 | + |
| 268 | +### Dependency graph |
| 269 | + |
| 270 | +``` |
| 271 | +Phase 1 (types) ──> Phase 2-3 (emulated double) ──> Phase 4 (strategy selection) |
| 272 | + ╲ |
| 273 | +Phase 1 (types) ──> Phase 5 (reference orbit) ───> Phase 6 (perturbation) |
| 274 | +``` |
| 275 | + |
| 276 | +Phases 2-3 and Phase 5 are **independent** and can be developed in parallel. |
| 277 | +Phase 6 depends on both (perturbation shader uses ds_* functions for glitch |
| 278 | +rebase and reference orbit data for the iteration). |
| 279 | + |
| 280 | +### Phase 1: Extend compile API |
| 281 | + |
| 282 | +Add `hints` (with nested `viewport`) to compile options. Add `staleWhen`, |
| 283 | +`uniforms`, and `textures` to the compilation result type. No behavioral |
| 284 | +change yet; all functions ignore hints. |
| 285 | + |
| 286 | +**Files:** `src/compute-engine/compilation/types.ts` |
| 287 | + |
| 288 | +### Phase 2: Emulated double arithmetic |
| 289 | + |
| 290 | +GLSL double-single helper functions: `ds_add`, `ds_sub`, `ds_mul`, `ds_sqr`, |
| 291 | +`ds_div`, `ds_sqrt`, `ds_cmp`. Implement as a new preamble constant |
| 292 | +`GPU_DS_ARITHMETIC_PREAMBLE_GLSL` (and WGSL variant). |
| 293 | + |
| 294 | +**Files:** `src/compute-engine/compilation/gpu-target.ts` |
| 295 | + |
| 296 | +### Phase 3: Emulated double Mandelbrot/Julia |
| 297 | + |
| 298 | +`_fractal_mandelbrot_dp` and `_fractal_julia_dp` preamble functions using the |
| 299 | +ds_* helpers. New preamble constant `GPU_FRACTAL_DP_PREAMBLE_GLSL`. |
| 300 | + |
| 301 | +**Files:** `src/compute-engine/compilation/gpu-target.ts` |
| 302 | + |
| 303 | +### Phase 4: Strategy selection |
| 304 | + |
| 305 | +When compiling `Mandelbrot`/`Julia` with viewport hints, select strategy based |
| 306 | +on `hints.viewport.radius`. Set `staleWhen` on the result. Emit the correct |
| 307 | +preamble and compile to the corresponding helper function call. |
| 308 | + |
| 309 | +**Coordinate passing:** the emulated-double path changes how coordinates reach |
| 310 | +the fractal function. The current single-float path passes `vec2(x, y)`. The |
| 311 | +emulated-double path must split each coordinate into hi/lo pairs and pass |
| 312 | +`vec4(re_hi, im_hi, re_lo, im_lo)` to `_fractal_mandelbrot_dp`. This means |
| 313 | +Phase 4 must also emit a coordinate-conversion wrapper or modify the shader |
| 314 | +entry point to perform the split (using `ds_split` from the ds_* preamble). |
| 315 | + |
| 316 | +**Files:** `src/compute-engine/compilation/gpu-target.ts`, |
| 317 | +`src/compute-engine/compilation/compile-expression.ts` |
| 318 | + |
| 319 | +### Phase 5: Reference orbit computation (parallelizable with Phases 2-3) |
| 320 | + |
| 321 | +CPU-side `BigDecimal` Mandelbrot iteration. Pack orbit into `Float32Array` and |
| 322 | +return via `textures`. Used when perturbation strategy is selected in Phase 6. |
| 323 | + |
| 324 | +**Files:** `src/compute-engine/library/fractals.ts` (or new |
| 325 | +`src/compute-engine/compilation/fractal-orbit.ts`) |
| 326 | + |
| 327 | +### Phase 6: Perturbation shader (depends on Phases 2-3 and 5) |
| 328 | + |
| 329 | +`_fractal_mandelbrot_pt` and `_fractal_julia_pt` preambles. Reference orbit |
| 330 | +read from texture. Glitch detection with rebase to emulated double (calls |
| 331 | +ds_* functions from Phase 2 directly). |
| 332 | + |
| 333 | +**Files:** `src/compute-engine/compilation/gpu-target.ts` |
| 334 | + |
| 335 | +### Phase 7 (future): Series approximation |
| 336 | + |
| 337 | +Skip early iterations via Taylor expansion of the perturbation formula. Major |
| 338 | +speedup for ultra-deep zooms but adds significant mathematical complexity. |
| 339 | +Deferred -- the system works without it, just slower at extreme depths. |
| 340 | + |
| 341 | +## Testing Strategy |
| 342 | + |
| 343 | +- Unit tests for ds_* arithmetic accuracy (compare against BigDecimal) |
| 344 | +- Snapshot tests for generated GLSL at each precision level |
| 345 | +- Integration test: compile Mandelbrot with various hint radii, verify correct |
| 346 | + strategy selection and staleWhen values |
| 347 | +- Regression: all existing fractal compilation tests unchanged |
0 commit comments