| 1 | +# Demosaic Algorithm Benchmark |
| 2 | + |
| 3 | +Comparison of darktable's existing demosaic methods (PPG, AMaZE, RCD, LMMSE, AMaZE+dual, RCD+dual) against the **Menon (2007)** and **ARI (Monno 2015)** implementations added in PR #20800, across multiple datasets and ISO conditions. |
| 4 | + |
| 5 | +Reproducible scripts, the Python reference implementation, and edge-image output PNGs are included. |
| 6 | + |
| 7 | +--- |
| 8 | + |
| 9 | +## 1. Kodak-24 CPSNR (dB, average, low/mid/high ISO) |
| 10 | + |
| 11 | +Measured on the Kodak True Color set (kodim01–24) with additive Gaussian noise on the CFA input: |
| 12 | + |
| 13 | +- low: σ=0 (clean) |
| 14 | +- mid: σ=0.02 (~ISO 1600 equivalent) |
| 15 | +- high: σ=0.05 (~ISO 6400 equivalent) |
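The noise setup and the CPSNR figure can be sketched as follows. This is a minimal illustration, not the code in `bench_final_all.py`: the RGGB layout, the [0, 1] value range, the absence of clipping, and the helper names (`bayer_mosaic`, `add_gaussian_noise`, `cpsnr`) are all assumptions made for the example.

```python
import math
import random

def bayer_mosaic(rgb):
    """Sample a CFA from an RGB image (nested lists of [r, g, b]).
    Assumes an RGGB pattern; the benchmark's actual pattern is not stated."""
    cfa = []
    for y, row in enumerate(rgb):
        out = []
        for x, px in enumerate(row):
            if y % 2 == 0:
                ch = 0 if x % 2 == 0 else 1   # even rows: R G R G ...
            else:
                ch = 1 if x % 2 == 0 else 2   # odd rows:  G B G B ...
            out.append(px[ch])
        cfa.append(out)
    return cfa

def add_gaussian_noise(cfa, sigma, seed=0):
    """Add zero-mean Gaussian noise with standard deviation sigma to each CFA sample."""
    rng = random.Random(seed)
    return [[v + rng.gauss(0.0, sigma) for v in row] for row in cfa]

def cpsnr(ref, test, peak=1.0):
    """Colour PSNR in dB: one MSE pooled over all pixels and all three channels."""
    sq_err = 0.0
    count = 0
    for row_ref, row_test in zip(ref, test):
        for px_ref, px_test in zip(row_ref, row_test):
            for a, b in zip(px_ref, px_test):
                sq_err += (a - b) ** 2
                count += 1
    mse = sq_err / count
    return math.inf if mse == 0 else 10.0 * math.log10(peak * peak / mse)
```

With this definition, a uniform error of 0.1 on a [0, 1] scale gives 20 dB, which is a convenient sanity check before running on real images.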
| 16 | + |
| 17 | +| method | low | mid | high | |
| 18 | +|---|---|---|---| |
| 19 | +| **ari** | **39.939** | 33.256 | 26.412 | |
| 20 | +| amaze | 39.134 | 32.954 | 26.282 | |
| 21 | +| amaze+dual | 38.759 | **33.373** | 26.464 | |
| 22 | +| rcd | 36.907 | 32.404 | 26.417 | |
| 23 | +| rcd+dual | 36.584 | 32.707 | **26.586** | |
| 24 | +| menon_r1 | 39.078 | 32.687 | 25.902 | |
| 25 | +| menon_r0 | 38.447 | 32.453 | 25.804 | |
| 26 | + |
| 27 | +**Observations**: |
| 28 | +- At low ISO, ari is top by +0.8 dB over amaze. |
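| 29 | +- At mid/high ISO, the dual-based methods win (amaze+dual at mid, rcd+dual at high); ari degrades to parity. |
| 30 | +- menon_r1 sits within 0.1 dB of amaze at low ISO (39.078 vs 39.134) but trails it by about 0.3–0.4 dB at mid and high ISO, so it shows no clear advantage at any ISO. |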
| 31 | + |
| 32 | +--- |
| 33 | + |
| 34 | +## 2. Chroma zipper (5 edge-heavy Kodak images, mean) |
| 35 | + |
| 36 | +Measured on kodim01, 08, 13, 19, 20 — the images with the most high-frequency edge content. Metric: Δ = |Laplace(R−G) deviation| + |Laplace(B−G) deviation|. Reported inside an edge mask (top 15 % gradient), both local mean and local peak. |
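One possible reading of the metric, written out as code. Everything here is an interpretation rather than the implementation in `bench_zipper_all_omp.py`: "deviation" is taken to mean the difference between the Laplacian of the demosaicked colour-difference plane and that of the ground truth, the edge mask is built from the ground-truth green gradient, and only the mean is computed (the exact definition of "local peak" is not given). All function names are hypothetical.

```python
def laplacian(img):
    """4-neighbour Laplacian on a 2D list; border pixels are left at 0."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = (img[y - 1][x] + img[y + 1][x]
                         + img[y][x - 1] + img[y][x + 1] - 4.0 * img[y][x])
    return out

def chroma_diff(rgb, ch):
    """Colour-difference plane: channel ch (0 = R, 2 = B) minus green."""
    return [[px[ch] - px[1] for px in row] for row in rgb]

def edge_mask(gt, keep=0.15):
    """Interior pixels whose green gradient magnitude is in the top `keep` fraction.
    Using the ground-truth green channel for the gradient is an assumption."""
    h, w = len(gt), len(gt[0])
    grad = {}
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gt[y][x + 1][1] - gt[y][x - 1][1]
            gy = gt[y + 1][x][1] - gt[y - 1][x][1]
            grad[(y, x)] = (gx * gx + gy * gy) ** 0.5
    ranked = sorted(grad, key=grad.get, reverse=True)
    return set(ranked[:max(1, int(len(ranked) * keep))])

def cz_mean(gt, test, keep=0.15):
    """Mean over the edge mask of |dLap(R-G)| + |dLap(B-G)| against ground truth."""
    mask = edge_mask(gt, keep)
    total = 0.0
    for ch in (0, 2):
        lap_gt = laplacian(chroma_diff(gt, ch))
        lap_test = laplacian(chroma_diff(test, ch))
        total += sum(abs(lap_test[y][x] - lap_gt[y][x]) for (y, x) in mask)
    return total / len(mask)
```

A perfect reconstruction scores 0 under this definition, and an isolated chroma error raises the score through the Laplacian response at the pixel and its four neighbours, which is why the metric is sensitive to alternating zipper patterns along edges.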
| 37 | + |
| 38 | +| method | cz_mean ↓ | cz_peak ↓ | CPSNR (ref) | |
| 39 | +|---|---|---|---| |
| 40 | +| **ari** | **0.0620** | **0.0803** | 38.405 | |
| 41 | +| amaze | 0.0725 | 0.0901 | 36.892 | |
| 42 | +| amaze+dual | 0.0730 | 0.0907 | — | |
| 43 | +| menon_r1 | 0.0696 | 0.1375 | 36.942 | |
| 44 | +| menon_r0 | 0.0921 | 0.1533 | 35.942 | |
| 45 | +| rcd | 0.1151 | 0.1453 | 34.093 | |
| 46 | +| rcd+dual | 0.1146 | 0.1436 | — | |
| 47 | + |
| 48 | +**Observations**: |
| 49 | +- ari has the smallest chroma zipper on both mean and peak. Consistent with the paper's design goal (per-pixel adaptive RI+MLRI selection). |
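| 50 | +- menon_r1's mean (0.0696) is competitive, but both Menon variants have worst-tier peaks (0.1375 and 0.1533): the artifact spikes locally. |
| 51 | +- RCD has the worst cz_mean (0.1151), which is visible to the eye as well; on cz_peak, menon_r0 (0.1533) is slightly worse than rcd (0.1453). |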
| 52 | + |
| 53 | +Visual comparison PNGs: [`viewer_data_paper_vs_simple/iso_low/`](viewer_data_paper_vs_simple/iso_low/) — directories `__gt__`, `c_port_q2`, `py_ref`, `amaze`, `rcd`, `menon_r0`, `menon_r1` × kodim01/08/13/19/20. |
| 54 | + |
| 55 | +--- |
| 56 | + |
| 57 | +## 3. SIDD Medium real raw (20 scenes across 5 phones, per-scene 3×3 CCM fit) |
| 58 | + |
| 59 | +Uses SIDD Medium's clean reference (gt_raw) and the matching noisy raw (noisy_raw) for 20 scenes (phones G4, GP, IP, N6, S6; 4 per phone). Each raw is demosaicked and compared against gt_srgb. A per-scene 3×3 CCM + bias fit is applied before CPSNR to absorb sensor colour-space differences, so that the number reflects demosaic behaviour rather than pipeline mismatch. |
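A per-scene 3×3 CCM + bias fit amounts to an affine least-squares problem. The sketch below solves it with the normal equations in plain Python; it is an illustration under assumptions, not the code in `bench_sidd_final.py`, and the names `fit_ccm_bias` and `apply_ccm_bias` are hypothetical. It also assumes the pixel set is not degenerate (at least four affinely independent colours), otherwise the elimination divides by zero.

```python
def fit_ccm_bias(src, dst):
    """Least-squares fit of dst ~ A @ src + b over lists of (r, g, b) tuples.
    Returns a 3x4 matrix: one row per output channel, last column is the bias."""
    # Normal equations (X^T X) W = X^T Y with X rows = [r, g, b, 1].
    xtx = [[0.0] * 4 for _ in range(4)]
    xty = [[0.0] * 3 for _ in range(4)]
    for s, d in zip(src, dst):
        x = (s[0], s[1], s[2], 1.0)
        for i in range(4):
            for j in range(4):
                xtx[i][j] += x[i] * x[j]
            for k in range(3):
                xty[i][k] += x[i] * d[k]
    # Gauss-Jordan elimination with partial pivoting on the augmented 4x7 system.
    aug = [xtx[i] + xty[i] for i in range(4)]
    for col in range(4):
        piv = max(range(col, 4), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(4):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    # aug[i][4 + k] is the weight of input i for output k; transpose to 3x4.
    return [[aug[i][4 + k] for i in range(4)] for k in range(3)]

def apply_ccm_bias(m, px):
    """Apply the fitted 3x4 affine colour transform to one (r, g, b) pixel."""
    x = (px[0], px[1], px[2], 1.0)
    return tuple(sum(m[k][i] * x[i] for i in range(4)) for k in range(3))
```

Fitting per scene means each method's output is mapped onto gt_srgb as well as an affine transform allows before CPSNR is taken, so residual error is mostly spatial (demosaic) rather than colorimetric.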
| 60 | + |
| 61 | +| method | iso_clean CPSNR | iso_noise CPSNR | |
| 62 | +|---|---|---| |
| 63 | +| **ari** | **40.354** | 28.197 | |
| 64 | +| amaze+dual | 39.976 | 28.118 | |
| 65 | +| rcd+dual | 39.927 | **28.202** | |
| 66 | +| amaze | 38.299 | 27.651 | |
| 67 | +| rcd | 38.424 | 27.730 | |
| 68 | +| menon_r1 | 37.675 | 27.422 | |
| 69 | +| menon_r0 | 37.755 | 27.268 | |
| 70 | + |
| 71 | +**Observations**: |
| 72 | +- iso_clean (low-noise real raw): ari leads by +0.38 dB over amaze+dual. |
| 73 | +- iso_noise (real high-ISO noise): rcd+dual (28.202) and ari (28.197) are effectively tied, 0.005 dB apart. |
| 74 | +- Menon is bottom-tier even on clean real raw. |
| 75 | + |
| 76 | +### Chroma zipper (SIDD edges) |
| 77 | + |
| 78 | +| method | iso_clean cz_mean | iso_clean cz_peak | |
| 79 | +|---|---|---| |
| 80 | +| **ari** | **0.0407** | 0.2127 | |
| 81 | +| amaze+dual | 0.0442 | **0.1179** | |
| 82 | +| rcd+dual | 0.0444 | 0.1450 | |
| 83 | +| rcd | 0.0519 | 0.1621 | |
| 84 | +| amaze | 0.0520 | 0.1610 | |
| 85 | +| menon_r0 | 0.0484 | 0.1701 | |
| 86 | +| menon_r1 | 0.0497 | 0.1622 | |
| 87 | + |
| 88 | +--- |
| 89 | + |
| 90 | +## 4. Processing time (GCC 15.2 + OpenMP) |
| 91 | + |
| 92 | +### Kodak 0.4 MP (kodim19, 4 threads, internal timing) |
| 93 | + |
| 94 | +| method | time | Mpix/s | |
| 95 | +|---|---|---| |
| 96 | +| menon_r0 | 0.029s | 13.5 | |
| 97 | +| menon_r1 | 0.028s | 14.3 | |
| 98 | +| rcd | 0.039s | 10.1 | |
| 99 | +| amaze | 0.062s | 6.4 | |
| 100 | +| rcd+dual | 0.090s | 4.4 | |
| 101 | +| amaze+dual | 0.112s | 3.5 | |
| 102 | +| **ari** | **1.6s** | **0.25** | |
| 103 | + |
| 104 | +### SIDD 15.86 MP (16 threads) |
| 105 | + |
| 106 | +| method | time | |
| 107 | +|---|---| |
| 108 | +| rcd / amaze / menon | 0.5 – 1.5s | |
| 109 | +| rcd+dual / amaze+dual | 2 – 3s | |
| 110 | +| **ari (standalone)** | **53s** | |
| 111 | +| **ari (darktable pipeline, mire1.cr2 18.8 MP)** | **113s** | |
| 112 | + |
| 113 | +### ari optimization progression (SIDD 15.86 MP, 16 threads) |
| 114 | + |
| 115 | +| state | time | |
| 116 | +|---|---| |
| 117 | +| Initial C port (MSVC /O2, no OpenMP) | 180s | |
| 118 | +| + box-call reduction + integral reuse (phase 1) | 94s | |
| 119 | +| + case A paired box-sum sharing | 69s | |
| 120 | +| + case D separable 5×5 Gaussian | 60s | |
| 121 | +| + rolling sum (no integral image) | 53s (float 2-buffer; the double col_acc variant is 2.4× slower) | |
| 122 | +| + row-major vertical pass | **53s (current)** | |
| 123 | + |
| 124 | +Attempted but rejected (documented as reverts in commit history): |
| 125 | +- **Rolling sum with double col_acc + copy-back**: 2.4× slower at 15 MP because the double-precision intermediate doubles the memory bandwidth. The float-only variant (adopted) avoids this. |
| 126 | +- **Fused `box(A*B*C)`**: eliminates the prep pass but serializes the integral-image scan loop (the accumulator has a dependency chain), which defeats auto-vectorization; the net result was about 12 % slower. |
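The rolling-sum idea can be illustrated in one dimension. This is only a sketch of the general technique in Python, not the C code from the PR: the function names are hypothetical, and clipping the window at the borders is an assumption about boundary handling.

```python
def box_sum_naive(row, r):
    """Reference: sum over the window [i - r, i + r], clipped to the row."""
    n = len(row)
    return [sum(row[max(0, i - r):min(n, i + r + 1)]) for i in range(n)]

def box_sum_rolling(row, r):
    """Running window sum: add the sample entering on the right and drop the
    one leaving on the left, so each output costs O(1) regardless of r."""
    n = len(row)
    out = [0.0] * n
    acc = sum(row[:min(n, r + 1)])      # window for i = 0 covers [0, r]
    for i in range(n):
        out[i] = acc
        if i + r + 1 < n:               # sample entering the next window
            acc += row[i + r + 1]
        if i - r >= 0:                  # sample leaving the next window
            acc -= row[i - r]
    return out
```

Unlike an integral image, the accumulator never grows beyond one window's worth of samples, which is one plausible reason a single-precision accumulator is adequate here.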
| 127 | + |
| 128 | +### Iteration-count sensitivity (ari with q = iter override) |
| 129 | + |
| 130 | +| iter | Kodak-24 CPSNR | cz_mean | Kodak time | |
| 131 | +|---|---|---|---| |
| 132 | +| 3 | 37.687 | 0.0827 | 0.72s | |
| 133 | +| 5 | 39.088 | 0.0686 | 1.08s | |
| 134 | +| 7 | 39.647 | 0.0644 | 1.50s | |
| 135 | +| 9 | 39.862 | 0.0628 | 1.89s | |
| 136 | +| **11** | **39.939** | **0.0620** | 2.37s | |
| 137 | + |
| 138 | +Paper default is iter=11. iter=7 captures ~87 % of the CPSNR gain from iter=3 to iter=11 (1.96 of 2.25 dB) in ~63 % of the iter=11 time (1.50s vs 2.37s, a ~37 % saving), but this does not close the speed gap to the other methods. |
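The trade-off can be recomputed directly from the sweep table. The snippet below uses only the table's values; measuring "quality" as the share of the iter=3 to iter=11 CPSNR gain is one reasonable choice, and the `tradeoff` helper is introduced here purely for illustration.

```python
# iter -> (Kodak-24 CPSNR in dB, Kodak time in seconds), copied from the table above
sweep = {
    3: (37.687, 0.72),
    5: (39.088, 1.08),
    7: (39.647, 1.50),
    9: (39.862, 1.89),
    11: (39.939, 2.37),
}

def tradeoff(it, base=3, full=11):
    """Return (share of the base->full CPSNR gain, share of the full run time)."""
    gain = (sweep[it][0] - sweep[base][0]) / (sweep[full][0] - sweep[base][0])
    time_share = sweep[it][1] / sweep[full][1]
    return gain, time_share

for it in sorted(sweep):
    gain, time_share = tradeoff(it)
    print(f"iter={it:2d}: {gain:6.1%} of the gain, {time_share:6.1%} of the time")
```

For iter=7 this gives roughly 87 % of the gain at roughly 63 % of the time, i.e. a saving of about 37 %.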
| 139 | + |
| 140 | +--- |
| 141 | + |
| 142 | +## 5. When to use each method |
| 143 | + |
| 144 | +| use case | recommended | |
| 145 | +|---|---| |
| 146 | +| interactive darkroom preview | rcd (default) / amaze | |
| 147 | +| high-ISO raw | rcd+dual / amaze+dual | |
| 148 | +| low-ISO stills, archival work, pixel-peeping | ari (if 1–2 minutes per 24 MP is acceptable) | |
| 149 | +| fastest at acceptable quality | menon_r1 (though no margin over amaze in practice) | |
| 150 | + |
| 151 | +--- |
| 152 | + |
| 153 | +## Reproduction |
| 154 | + |
| 155 | +### Building the standalone harness (MSYS2 UCRT64 + GCC) |
| 156 | + |
| 157 | +```bash |
| 158 | +cd demosaic_test |
| 159 | +bash build_gcc_omp.sh |
| 160 | +``` |
| 161 | + |
| 162 | +### Running benchmarks |
| 163 | + |
| 164 | +```bash |
| 165 | +# Full Kodak-24, low/mid/high ISO |
| 166 | +python bench_final_all.py --iso low |
| 167 | +python bench_final_all.py --iso mid |
| 168 | +python bench_final_all.py --iso high |
| 169 | + |
| 170 | +# Chroma zipper detail |
| 171 | +python bench_zipper_all_omp.py |
| 172 | + |
| 173 | +# SIDD real raw |
| 174 | +python bench_sidd_final.py |
| 175 | + |
| 176 | +# ari iteration-count sweep |
| 177 | +python bench_ari_iter_sweep.py |
| 178 | + |
| 179 | +# viewer PNGs (5 edge images × 7 methods) |
| 180 | +python _paper_vs_simple_save.py |
| 181 | +python _paper_vs_simple_add_amaze_rcd.py |
| 182 | +python _paper_vs_simple_add_menon.py |
| 183 | +``` |
| 184 | + |
| 185 | +### Building darktable on Windows (MSYS2 UCRT64) |
| 186 | + |
| 187 | +```bash |
| 188 | +# Required: UCRT64 env, pointing to gcc.exe/g++.exe directly, excluding mingw64 |
| 189 | +cmake -G Ninja \ |
| 190 | + -DCMAKE_C_COMPILER=/c/msys64/ucrt64/bin/gcc.exe \ |
| 191 | + -DCMAKE_CXX_COMPILER=/c/msys64/ucrt64/bin/g++.exe \ |
| 192 | + -DCMAKE_IGNORE_PATH=/c/msys64/mingw64 \ |
| 193 | + -DCMAKE_INSTALL_PREFIX=/c/msys64/opt/darktable \ |
| 194 | + -DCMAKE_BUILD_TYPE=Release \ |
| 195 | + .. |
| 196 | +MSYSTEM=UCRT64 ninja install |
| 197 | +``` |
| 198 | + |
| 199 | +--- |
| 200 | + |
| 201 | +## Included materials |
| 202 | + |
| 203 | +- [_paper_ari_ref.py](_paper_ari_ref.py) — Python paper-exact reference of ARI (structurally matches Monno's MATLAB code) |
| 204 | +- [_paper_ri_ref.py](_paper_ri_ref.py) — RI-only reference |
| 205 | +- [refs/matlab/](refs/matlab/) — Authors' original MATLAB code (RI / MLRI / ARI generations) |
| 206 | +- [viewer_data_paper_vs_simple/iso_low/](viewer_data_paper_vs_simple/iso_low/) — PNG outputs, 5 edge images × 7 methods |
| 207 | +- [viewer.html](viewer.html) — HTML viewer for split-comparison of the above PNGs |
| 208 | +- bench_*.py — reproducible scripts |