Commit ae9ec40

dregsist and claude committed
Add demosaic benchmarks (BENCHMARK.md / BENCHMARK_JA.md)
Accompanies darktable PR darktable-org#20800 (Menon 2007 + ARI Monno 2015 demosaic).

Contents:
- BENCHMARK.md / BENCHMARK_JA.md: full writeup, English + Japanese
- bench_*.py: reproducible scripts for Kodak-24 (low/mid/high ISO), SIDD medium (iso_clean/iso_noise), chroma zipper, iteration sweep
- _paper_ari_ref.py / _paper_ri_ref.py: Python paper-exact references
- test_*.c / test_amaze.cc: standalone C test harnesses
- viewer_data_paper_vs_simple/: 5 Kodak edge images x 7 methods PNGs
- refs/matlab/: authors' original MATLAB code (RI/MLRI/ARI)
- dt_tests_dtbuild/mire1_*.jpg: end-to-end darktable integration

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
0 parents  commit ae9ec40

151 files changed

Lines changed: 9231 additions & 0 deletions

BENCHMARK.md

Lines changed: 208 additions & 0 deletions
@@ -0,0 +1,208 @@
# Demosaic Algorithm Benchmark

Comparison of darktable's existing demosaic methods (PPG, AMaZE, RCD, LMMSE, AMaZE+dual, RCD+dual) against the **Menon (2007)** and **ARI (Monno 2015)** implementations added in PR #20800, across multiple datasets and ISO conditions.

Reproducible scripts, the Python reference implementation, and edge-image output PNGs are included.

---

## 1. Kodak-24 CPSNR (dB, average, low/mid/high ISO)

Measured on the Kodak True Color set (kodim01–24) with additive Gaussian noise on the CFA input:

- low: σ=0 (clean)
- mid: σ=0.02 (~ISO 1600 equivalent)
- high: σ=0.05 (~ISO 6400 equivalent)

| method | low | mid | high |
|---|---|---|---|
| **ari** | **39.939** | 33.256 | 26.412 |
| amaze | 39.134 | 32.954 | 26.282 |
| amaze+dual | 38.759 | **33.373** | 26.464 |
| rcd | 36.907 | 32.404 | 26.417 |
| rcd+dual | 36.584 | 32.707 | **26.586** |
| menon_r1 | 39.078 | 32.687 | 25.902 |
| menon_r0 | 38.447 | 32.453 | 25.804 |

**Observations**:
- At low ISO, ari leads amaze by 0.8 dB.
- At mid and high ISO the dual variants lead (amaze+dual at mid, rcd+dual at high), and ari drops to within 0.2 dB of them.
- menon_r1 is within 0.1 dB of amaze at low ISO but trails it by 0.27 dB at mid and 0.38 dB at high, so Menon shows no advantage at any ISO.
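
The noise model and metric used in this section can be sketched as follows. This is a minimal illustration, not the code in `bench_final_all.py`: the RGGB layout, the [0, 1] value range, the clipping after noise, the fixed seed, and the optional border crop are all assumptions made here for the sake of a runnable example.

```python
import numpy as np


def mosaic_rggb(rgb):
    """Sample an H x W x 3 image onto a single-channel RGGB Bayer CFA (assumed layout)."""
    cfa = np.empty(rgb.shape[:2], dtype=rgb.dtype)
    cfa[0::2, 0::2] = rgb[0::2, 0::2, 0]  # R at even rows, even columns
    cfa[0::2, 1::2] = rgb[0::2, 1::2, 1]  # G on the red rows
    cfa[1::2, 0::2] = rgb[1::2, 0::2, 1]  # G on the blue rows
    cfa[1::2, 1::2] = rgb[1::2, 1::2, 2]  # B at odd rows, odd columns
    return cfa


def add_gaussian_noise(cfa, sigma, seed=0):
    """Add zero-mean Gaussian noise (sigma on a [0, 1] scale), then clip to [0, 1]."""
    if sigma == 0:
        return cfa.copy()
    rng = np.random.default_rng(seed)
    return np.clip(cfa + rng.normal(0.0, sigma, cfa.shape), 0.0, 1.0)


def cpsnr(reference, result, peak=1.0, border=0):
    """Colour PSNR in dB: one MSE pooled over all pixels and all three channels."""
    if border:
        reference = reference[border:-border, border:-border]
        result = result[border:-border, border:-border]
    diff = reference.astype(np.float64) - result.astype(np.float64)
    mse = np.mean(diff ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)


# Noise levels from the list above
SIGMA = {"low": 0.0, "mid": 0.02, "high": 0.05}
```

A run for one image would mosaic the ground truth, call `add_gaussian_noise(cfa, SIGMA["mid"])`, demosaic the result with the method under test, and pass the output to `cpsnr` together with the ground truth.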

---

## 2. Chroma zipper (5 edge-heavy Kodak images, mean)

Measured on kodim01, 08, 13, 19, 20, the images with the most high-frequency edge content. Metric: Δ = |Laplace(R−G) deviation| + |Laplace(B−G) deviation|. Reported inside an edge mask (top 15 % gradient), as both local mean and local peak.

| method | cz_mean ↓ | cz_peak ↓ | CPSNR (ref) |
|---|---|---|---|
| **ari** | **0.0620** | **0.0803** | 38.405 |
| amaze | 0.0725 | 0.0901 | 36.892 |
| amaze+dual | 0.0730 | 0.0907 | |
| menon_r1 | 0.0696 | 0.1375 | 36.942 |
| menon_r0 | 0.0921 | 0.1533 | 35.942 |
| rcd | 0.1151 | 0.1453 | 34.093 |
| rcd+dual | 0.1146 | 0.1436 | |

**Observations**:
- ari has the smallest chroma zipper on both mean and peak, consistent with the paper's design goal (per-pixel adaptive RI+MLRI selection).
- Menon's mean is acceptable (menon_r1 is second only to ari), but its peak is in the worst tier alongside rcd, meaning it spikes locally.
- RCD has the highest cz_mean of the tested methods and the second-highest cz_peak after menon_r0; the zipper is visible to the eye as well.
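
One way to read the metric definition in this section is sketched below. Several details are assumptions rather than facts taken from `bench_zipper_all_omp.py`: "deviation" is taken to mean the difference from the ground-truth Laplacian, the Laplacian is the plain 4-neighbour kernel, the edge mask is built from the gradient of the channel mean of the ground truth, and only the mean variant is shown because the window used for the "local peak" is not stated in the text.

```python
import numpy as np


def laplace(img):
    """4-neighbour Laplacian on the interior; the one-pixel border is left at 0."""
    out = np.zeros_like(img, dtype=np.float64)
    out[1:-1, 1:-1] = (img[:-2, 1:-1] + img[2:, 1:-1] + img[1:-1, :-2]
                       + img[1:-1, 2:] - 4.0 * img[1:-1, 1:-1])
    return out


def edge_mask(rgb, top_fraction=0.15):
    """Boolean mask of pixels whose gradient magnitude lies in the top fraction."""
    luma = rgb.mean(axis=2)  # assumed luma proxy: plain channel mean
    gy, gx = np.gradient(luma)
    magnitude = np.hypot(gx, gy)
    return magnitude >= np.quantile(magnitude, 1.0 - top_fraction)


def chroma_zipper_mean(reference, result, top_fraction=0.15):
    """Mean of |dLap(R-G)| + |dLap(B-G)| inside the reference edge mask."""
    ref = reference.astype(np.float64)
    res = result.astype(np.float64)
    delta = np.zeros(ref.shape[:2])
    for c in (0, 2):  # R-G first, then B-G
        dev = laplace(res[..., c] - res[..., 1]) - laplace(ref[..., c] - ref[..., 1])
        delta += np.abs(dev)
    return float(delta[edge_mask(ref, top_fraction)].mean())
```

The colour-difference planes are used because zipper artifacts show up as alternating errors in R−G and B−G along edges, which a Laplacian responds to strongly while ignoring smooth colour shifts.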

Visual comparison PNGs: [`viewer_data_paper_vs_simple/iso_low/`](viewer_data_paper_vs_simple/iso_low/), with directories `__gt__`, `c_port_q2`, `py_ref`, `amaze`, `rcd`, `menon_r0`, `menon_r1` × kodim01/08/13/19/20.

---

## 3. SIDD medium real raw (20 scenes: 5 phones × 4 frames, per-scene 3×3 CCM fit)

Uses SIDD Medium's clean reference (gt_raw) and matching noisy raw (noisy_raw) for 20 scenes (G4, GP, IP, N6, S6, 4 frames each), demosaicked and compared against gt_srgb. A per-scene 3×3 CCM + bias fit is applied before CPSNR to absorb sensor colour-space differences, so the number reflects demosaic behaviour rather than pipeline mismatch.
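
The per-scene 3×3 CCM + bias fit can be sketched as an ordinary least-squares problem. This is an illustration under assumptions, not the code in `bench_sidd_final.py`: it assumes the demosaicked output is fitted to gt_srgb over all pixels, with no clipping, tone curve, or outlier rejection; the function names are hypothetical.

```python
import numpy as np


def fit_ccm_bias(source, target):
    """Least-squares fit of target ~= source @ M + b; returns the 4 x 3 stack [M; b]."""
    src = source.reshape(-1, 3).astype(np.float64)
    tgt = target.reshape(-1, 3).astype(np.float64)
    # The extra column of ones carries the bias term.
    design = np.hstack([src, np.ones((src.shape[0], 1))])
    params, *_ = np.linalg.lstsq(design, tgt, rcond=None)
    return params


def apply_ccm_bias(source, params):
    """Apply the fitted 3 x 3 matrix and bias to an H x W x 3 image."""
    flat = source.reshape(-1, 3).astype(np.float64)
    return (flat @ params[:3] + params[3]).reshape(source.shape)
```

CPSNR would then be computed between `apply_ccm_bias(demosaicked, params)` and the reference. Because the fit is redone per scene and per method, it removes any global colour mismatch but cannot hide spatial errors such as zipper or false colour.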

| method | iso_clean CPSNR | iso_noise CPSNR |
|---|---|---|
| **ari** | **40.354** | 28.197 |
| amaze+dual | 39.976 | 28.118 |
| rcd+dual | 39.927 | **28.202** |
| amaze | 38.299 | 27.651 |
| rcd | 38.424 | 27.730 |
| menon_r1 | 37.675 | 27.422 |
| menon_r0 | 37.755 | 27.268 |

**Observations**:
- iso_clean (low-noise real raw): ari leads by +0.38 dB over amaze+dual.
- iso_noise (real high-ISO noise): rcd+dual (28.202) and ari (28.197) are effectively tied, 0.005 dB apart.
- Menon is bottom-tier even on clean real raw.

### Chroma zipper (SIDD edges)

| method | iso_clean cz_mean | iso_clean cz_peak |
|---|---|---|
| **ari** | **0.0407** | 0.2127 |
| amaze+dual | 0.0442 | **0.1179** |
| rcd+dual | 0.0444 | 0.1450 |
| rcd | 0.0519 | 0.1621 |
| amaze | 0.0520 | 0.1610 |
| menon_r0 | 0.0484 | 0.1701 |
| menon_r1 | 0.0497 | 0.1622 |

---

## 4. Processing time (GCC 15.2 + OpenMP)

### Kodak 0.4 MP (kodim19, 4 threads, internal timing)

| method | time | Mpix/s |
|---|---|---|
| menon_r0 | 0.029s | 13.5 |
| menon_r1 | 0.028s | 14.3 |
| rcd | 0.039s | 10.1 |
| amaze | 0.062s | 6.4 |
| rcd+dual | 0.090s | 4.4 |
| amaze+dual | 0.112s | 3.5 |
| **ari** | **1.6s** | **0.25** |
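
The Mpix/s column follows from the pixel count and the measured time. A small worked example, assuming the standard 768 × 512 Kodak frame (about 0.39 MP in either orientation) and using the rounded times from the table:

```python
KODAK_PIXELS = 768 * 512  # assumed Kodak frame size, about 0.39 MP


def mpix_per_second(pixels, seconds):
    """Throughput in megapixels per second."""
    return pixels / seconds / 1e6


rcd = mpix_per_second(KODAK_PIXELS, 0.039)
ari = mpix_per_second(KODAK_PIXELS, 1.6)
print(f"rcd {rcd:.1f} Mpix/s, ari {ari:.2f} Mpix/s, ratio {rcd / ari:.0f}x")
# prints: rcd 10.1 Mpix/s, ari 0.25 Mpix/s, ratio 41x
```

Small differences from the table in other rows are expected, since the listed times are rounded to the millisecond.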

### SIDD 15.86 MP (16 threads)

| method | time |
|---|---|
| rcd / amaze / menon | 0.5 – 1.5s |
| rcd+dual / amaze+dual | 2 – 3s |
| **ari (standalone)** | **53s** |
| **ari (darktable pipeline, mire1.cr2 18.8 MP)** | **113s** |

### ari optimization progression (SIDD 15.86 MP, 16 threads)

| state | time |
|---|---|
| Initial C port (MSVC /O2, no OpenMP) | 180s |
| + box-call reduction + integral reuse (phase 1) | 94s |
| + case A paired box-sum sharing | 69s |
| + case D separable 5×5 Gaussian | 60s |
| + rolling sum (no integral image) | 53s (float 2-buffer; the double col_acc variant is 2.4× slower) |
| + row-major vertical pass | **53s (current)** |

Attempted but rejected (documented as reverts in commit history):
- **Rolling sum with double col_acc + copy-back**: 2.4× slower at 15 MP because the double-precision intermediate doubles the memory bandwidth. The adopted float-only variant avoids this.
- **Fused `box(A*B*C)`**: eliminates the prep pass but serializes the integral-image scan loop (the accumulator has a dependency chain), which defeats auto-vectorization, for a net 12 % slowdown.
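
To illustrate the difference between the two box-filter strategies named in the progression table, here is a one-dimensional sketch of a rolling sum next to a prefix-sum (integral) version. It is not a port of the C code: the function names are hypothetical, and the clipped-window border handling is an assumption made for the example.

```python
def box_sum_rolling(values, radius):
    """Sliding-window sum of width 2*radius+1, with the window clipped at both ends."""
    n = len(values)
    out = [0.0] * n
    acc = sum(values[:min(radius + 1, n)])  # window centred on index 0
    for i in range(n):
        out[i] = acc
        enter = i + radius + 1  # sample entering the window for index i + 1
        leave = i - radius      # sample leaving the window
        if enter < n:
            acc += values[enter]
        if leave >= 0:
            acc -= values[leave]
    return out


def box_sum_integral(values, radius):
    """Same result via a prefix-sum array (the 1-D analogue of an integral image)."""
    n = len(values)
    prefix = [0.0]
    for v in values:
        prefix.append(prefix[-1] + v)
    return [prefix[min(n, i + radius + 1)] - prefix[max(0, i - radius)]
            for i in range(n)]


print(box_sum_rolling([1, 2, 3, 4, 5], 1))  # [3, 6, 9, 12, 9]
```

The rolling version keeps only a window-sized accumulator and needs no full-size prefix buffer, whereas the prefix-sum version stores running totals over the whole row. Whether that is the exact reason the float-only rolling sum won in the C port is not stated in the text.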

### Iteration-count sensitivity (ari with q = iter override)

| iter | Kodak-24 CPSNR | cz_mean | Kodak time |
|---|---|---|---|
| 3 | 37.687 | 0.0827 | 0.72s |
| 5 | 39.088 | 0.0686 | 1.08s |
| 7 | 39.647 | 0.0644 | 1.50s |
| 9 | 39.862 | 0.0628 | 1.89s |
| **11** | **39.939** | **0.0620** | 2.37s |

Paper default is iter=11. iter=7 captures ~87 % of the CPSNR gain between iter=3 and iter=11 in ~63 % of the iter=11 time (about 37 % less), but this does not close the speed gap to the other methods.
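
The quality and time shares for each iteration count can be recomputed from the table. The sketch below measures quality as the share of the CPSNR gain between the first and last rows, which is one reasonable reading of "quality" here and is an assumption of this example.

```python
# (iter, Kodak-24 CPSNR in dB, Kodak time in s), copied from the table above
SWEEP = [(3, 37.687, 0.72), (5, 39.088, 1.08), (7, 39.647, 1.50),
         (9, 39.862, 1.89), (11, 39.939, 2.37)]


def tradeoff(sweep, it):
    """Return (share of the first-to-last CPSNR gain, share of the last row's time)."""
    rows = {i: (psnr, t) for i, psnr, t in sweep}
    lo_psnr, _ = rows[sweep[0][0]]
    hi_psnr, hi_t = rows[sweep[-1][0]]
    psnr, t = rows[it]
    return (psnr - lo_psnr) / (hi_psnr - lo_psnr), t / hi_t


for it, _, _ in SWEEP:
    gain, time_share = tradeoff(SWEEP, it)
    print(f"iter={it:2d}: {gain:4.0%} of the gain, {time_share:4.0%} of the time")
```

For iter=7 this gives about 87 % of the gain at about 63 % of the iter=11 time, that is, roughly 37 % less time.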

---

## 5. When to use each method

| use case | recommended |
|---|---|
| interactive darkroom preview | rcd (default) / amaze |
| high-ISO raw | rcd+dual / amaze+dual |
| low-ISO stills, archival work, pixel-peeping | ari (if 1–2 minutes per 24 MP is acceptable) |
| fastest at acceptable quality | menon_r1 (though no margin over amaze in practice) |

---

## Reproduction

### Building the standalone harness (MSYS2 UCRT64 + GCC)

```bash
cd demosaic_test
bash build_gcc_omp.sh
```

### Running benchmarks

```bash
# Full Kodak-24, low/mid/high ISO
python bench_final_all.py --iso low
python bench_final_all.py --iso mid
python bench_final_all.py --iso high

# Chroma zipper detail
python bench_zipper_all_omp.py

# SIDD real raw
python bench_sidd_final.py

# ari iteration-count sweep
python bench_ari_iter_sweep.py

# viewer PNGs (5 edge images × 7 methods)
python _paper_vs_simple_save.py
python _paper_vs_simple_add_amaze_rcd.py
python _paper_vs_simple_add_menon.py
```

### Building darktable on Windows (MSYS2 UCRT64)

```bash
# Required: UCRT64 env, pointing to gcc.exe/g++.exe directly, excluding mingw64
cmake -G Ninja \
  -DCMAKE_C_COMPILER=/c/msys64/ucrt64/bin/gcc.exe \
  -DCMAKE_CXX_COMPILER=/c/msys64/ucrt64/bin/g++.exe \
  -DCMAKE_IGNORE_PATH=/c/msys64/mingw64 \
  -DCMAKE_INSTALL_PREFIX=/c/msys64/opt/darktable \
  -DCMAKE_BUILD_TYPE=Release \
  ..
MSYSTEM=UCRT64 ninja install
```

---

## Included materials

- [_paper_ari_ref.py](_paper_ari_ref.py): Python paper-exact reference of ARI (structurally matches Monno's MATLAB code)
- [_paper_ri_ref.py](_paper_ri_ref.py): RI-only reference
- [refs/matlab/](refs/matlab/): authors' original MATLAB code (RI / MLRI / ARI generations)
- [viewer_data_paper_vs_simple/iso_low/](viewer_data_paper_vs_simple/iso_low/): PNG outputs, 5 edge images × 7 methods
- [viewer.html](viewer.html): HTML viewer for split-comparison of the above PNGs
- bench_*.py: reproducible scripts
