LJpeg: support predictor modes 2–7#963
LJpeg: support predictor modes 2–7#963da-phil wants to merge 8 commits intodarktable-org:developfrom
Conversation
Summary: rawspeed's LJpeg decoder previously supported only predictor mode 1 (left neighbor). ITU-T T.81 defines seven predictor modes, and several camera manufacturers — notably DJI (Mavic 3S, Mavic 3 Pro, etc.) and Blackmagic — use mode 6 in their DNG files. This change adds support for all seven modes and handles the associated tile geometry those cameras use. Predictor modes 2–7 * Added a computePrediction(mode, Ra, Rb, Rc) helper that computes the seven ITU-T T.81 predictions. Arithmetic is done in int32_t to avoid overflow in modes 4–6, where Ra + Rb - Rc can transiently exceed the 16-bit range; the final result wraps via uint16_t cast per the standard. * decodeRowN() is extended to accept predMode and prevStripe. For mode 1, the existing fast path (no memory access to the previous row) is preserved. For modes 2–7, the three 2D neighbors (Ra = left, Rb = above, Rc = above-left) are looked up from prevStripe; at the start of a row the above-left neighbor is not available, so Rc = Rb per the JPEG specification. * decodeN() now tracks isFirstRow per restart interval. The first row of each interval always uses predictor mode 1 (per T.81: the standard mandates horizontal prediction for the first row). For subsequent rows, prevStripe is a CroppedArray2DRef into the already-decoded portion of the output image. No extra allocation is needed. Inverted tile reshape (LJpegDecoder) Inverted tile reshape DJI DNGs present tiles in a non-standard geometry: the JPEG SOF3 frame is wider than the declared tile (e.g. 8000×1500 for a 4000×3000 tile). Each JPEG row encodes two tile rows concatenated side-by-side — a widthPack = jpegFrameDim.x / tileW packing. This is the inverse of the Adobe-style reshape rawspeed already handles (MCU > 1×1). Design trade-offs: Separate decode-then-deinterleave vs. in-place: Reusing the existing LJpegDecompressor in-place would require threading the tile geometry inversion through all MCU addressing logic. Instead, the inverted-reshape path decodes into a temporary RawImage at the JPEG frame dimensions, then copies/deinterleaves rows into the real output image. The extra allocation is bounded by a single tile's worth of data and avoids touching the hot-path decompressor. widthPack validation: The factor is computed from the ratio jpegFrameDim.x / maxRes.x and validated against widthPack * jpegFrameDim.y == maxRes.y before decoding begins. An upper bound of 4 is enforced as a sanity check. The inverted reshape is restricted to single-component LJpeg (N_COMP == 1), which is the only case seen in practice for this layout. Fuzz harness The LJpegDecompressor fuzz entry point is updated to source a predictorMode byte from the fuzzer input, covering the new code paths.
Eliminate per-pixel predMode branch (performance) decodeRowN gains a bool Use2DPred template parameter. The mode-1 (left-neighbor) path and the 2D-predictor path are now separate compile-time instantiations selected with if constexpr. The runtime if (predMode == 1) that previously executed for every pixel in every tile is gone entirely. Mode-1 files — the overwhelming majority of existing DNG content — are completely unaffected at the generated-code level; the compiler produces the same tight loop as before. For modes 2–7, the computePrediction switch is also folded away since predictorMode is now only evaluated once at the per-row dispatch in decodeN(), not per-pixel. The first-column Rc check is also simplified: stripeCol >= MCUSize.x (computed per-pixel) is replaced with mcuIdx > 0 (a loop-index comparison available for free), and the redundant stripeRow alias is removed. Replace scalar deinterleave with memcpy (performance) In the inverted-reshape path (LJpegDecoder), the per-pixel operator() copy loop is replaced by a single std::memcpy per row-segment. The previous loop also carried a data-dependent bounds check (srcCol + col < jpegFrameDim.x) that blocked auto-vectorisation; the check is provably redundant given the dimension validation already performed, so it is removed. memcpy allows the compiler to emit an optimal vector store.
|
The proposed diff is not cd <path/to/repo/checkout> # NOTE: use your own path here
unzip clang-format.patch.zip
git stash # Temporairly stash away any preexisting diff
git apply clang-format.patch # Apply the diff
git add -u # Stage changed files
git commit -m "Applying clang-format" # Commit the patch
git push
git stash pop # Unstast preexisting diff
rm clang-format.patch.zip clang-format.patch |
|
@LebedevRI @kmilos do I need to update any other meta data in the repo for "full camera integration"? |
Some DNG files encode 2-component tiles (e.g., 592×158) as JPEG frames with the component dimension packed horizontally into a narrower, taller JPEG (592×79, 2 components). The JPEG SOF reports half the tile height, with each row encoding two consecutive output rows' worth of data — components are always interleaved horizontally, so the effective decoded width per JPEG row is jpegFrameDim.x × N_COMP = 1184.
LJpegDecompressor.cpp — allow MCU{1,2} construction and dispatch
The constructor's MCU allowlist and decode()'s dispatch table gain {1,2}. This is needed so the decompressor can be instantiated when the caller passes MCU{1,2} (it will never be used for actual pixel output after the decoder fix below, but the validation and dispatch must not reject it at an early stage).
LJpegDecoder.cpp — detect vertical MCU and route to the inverted reshape path
Two complementary changes:
* In the standard (non-inverted) path, after computing MCUSize = maxRes / jpegFrameDim, gate the direct-decode branch on MCUSize.x >= MCUSize.y. When MCUSize is purely vertical (e.g., {1,2}), fall through instead of decoding in-place.
* Generalize the inverted reshape path to handle N_COMP > 1:
* effectiveJpegWidth = jpegFrameDim.x × N_COMP replaces jpegFrameDim.x everywhere in width calculations.
* The temporary decode buffer is sized effectiveJpegWidth × jpegFrameDim.y.
* MCU is set to {N_COMP, 1} (horizontal interleaving), not hardcoded {1,1}.
* The N_COMP != 1 guard is removed.
* widthPack is derived from effectiveJpegWidth / maxRes.x, so for the 2-comp example: 1184 / 592 = 2, matching the tile's 158 / 79 = 2× height ratio.
This correctly reconstructs the full-width, full-height tile from the narrower, shorter JPEG frame, yielding proper CFA Bayer output instead of the full-image purple/orange vertical stripe pattern that occurred when the two components were written to separate rows.
8ad4f39 to
d8577b1
Compare
|
Have you tested your implementation against all predictor modes? I've added a sample set in #334 (comment) |
|
@da-phil Apart from the ljpeg decoding, I don't think anything else is needed for "base" support of these. After that, there is a known issue (and current workaround in darktable, not rawspeed) for the GainMap in OpCodeList1 (to be applied straight after deciding) which is for vignetting in DJI raws. |
|
@da-phil Note also #475 (comment) |
(stupid github app, first it showed my comment two times and when deleting the 2nd one, both disappeared) Yes, I did test it on your test pictures, see PR description, all images can be decoded successfully now. Can you recommend me to look at other test images, e.g. from other camera manufacturers like Blackmagic? |
What does this mean in practice for this PR? I understand that you're going to do a hard cut once the other RAW decoder lib achieved parity and stability to be used by darktable, but how far is the progress, is it tracked somewhere? |
|
You can find a couple of Blackmagic samples on raw.pixls.us As this change is not trivial, only @LebedevRI can comment. The rewrite can be found at https://github.com/darktable-org/rawspeed.rs with an issue containing the roadmap and progress. |
I like the fast progress of "software corrosion" nowadays, and the humor of rust folks ;) Will try some Blackmagic RAW samples in the meantime, thanks! Regarding roadmap & progress: looks like feature parity is far away in the future so far... I wonder if @LebedevRI is looking for contributors to accelerate feature parity. |
|
The proposed diff is not cd <path/to/repo/checkout> # NOTE: use your own path here
unzip clang-format.patch.zip
git stash # Temporairly stash away any preexisting diff
git apply clang-format.patch # Apply the diff
git add -u # Stage changed files
git commit -m "Applying clang-format" # Commit the patch
git push
git stash pop # Unstast preexisting diff
rm clang-format.patch.zip clang-format.patch |
ca973e9 to
aeb1719
Compare
|
Can I assume that it is okay to ignore the output of the ClangCTUStaticAnalysis & fuzzer build jobs, as they seem to discover a lot of things which are independent of my changes? Depending on how much longer we want to maintain this repo, I'd be willing to contribute to the investigation of the issues in a separate PR. |
Blackmagic CinemaDNG files don't include TIFF Make/Model tags — they only have UniqueCameraModel (tag 50708), which is valid per the DNG spec. In decodeMetaDataInternal(), the code called getID() unconditionally. Instead of calling getID() in a try/catch (which logged a spurious error), check for MAKE+MODEL existence first. If present, call getID() as before. If absent, fall back to UNIQUECAMERAMODEL for both make and model fields — mirroring the approach already used in checkSupportInternal().
4ceecda to
a167a44
Compare
|
The proposed diff is not cd <path/to/repo/checkout> # NOTE: use your own path here
unzip clang-format.patch.zip
git stash # Temporairly stash away any preexisting diff
git apply clang-format.patch # Apply the diff
git add -u # Stage changed files
git commit -m "Applying clang-format" # Commit the patch
git push
git stash pop # Unstast preexisting diff
rm clang-format.patch.zip clang-format.patch |
607fc8a to
cb0c24e
Compare
|
Current state: Also Blackmagic Camera DNG RAWs are decoded, with one caveat:
On libjpeg-turbo versions < 3.0, the error is now "Unsupported JPEG data precision 12" (much clearer). EDIT: due to the current unavailability of libjpeg-turbo 3.0+ for the project and the slight scope creep of this PR, I decided to remove the still experimental 12bit lossless JPEG path and only provide the comprehensive error msg instead. |
cb0c24e to
a37c196
Compare
|
|
||
| // Scan markers after SOI. Stop after a reasonable number of bytes. | ||
| const auto limit = | ||
| std::min(remaining, static_cast<ByteStream::size_type>(1024)); |
a37c196 to
5efdb0e
Compare
5efdb0e to
2e27a99
Compare
Some Blackmagic CinemaDNG files (Pocket Cinema Camera 4K files (3)/(4), Micro Cinema Camera (1)) use SOF1 (Extended Sequential DCT) at 12-bit precision but label tiles as TIFF compression=7 (lossless JPEG). The lossless JPEG decoder hit a DQT marker and threw "Not a valid RAW file." On libjpeg-turbo 2.1.5, the error is now "Unsupported JPEG data precision 12" (much clearer). Future work: on systems with libjpeg-turbo 3.0+, we can enable full 12-bit lossy JPEG decoding.
2e27a99 to
8e10cee
Compare
|
Update: due to the current unavailability of libjpeg-turbo 3.0+ for the project, and the slight scope creep of this PR, I decided to remove the still experimental 12bit lossless JPEG path and only provide the comprehensive error msg instead. |
This PR is supposed to address #189 and #258 and was created with the help of Claude.
Summary
rawspeed's LJpeg decoder previously supported only predictor mode 1 (left neighbor). ITU-T T.81 defines seven predictor modes, and several camera manufacturers — notably DJI (Mavic 3S, Mavic 3 Pro, etc.) and Blackmagic — use mode 6 in their DNG files. This change adds support for all seven modes and handles the associated tile geometry those cameras use.
Predictor modes 2–7
Inverted tile reshape
DJI DNGs present tiles in a non-standard geometry: the JPEG SOF3 frame is wider than the declared tile (e.g. 8000×1500 for a 4000×3000 tile). Each JPEG row encodes two tile rows concatenated side-by-side — a widthPack = jpegFrameDim.x / tileW packing. This is the inverse of the Adobe-style reshape rawspeed already handles (MCU > 1×1).
Design trade-offs
widthPackvalidation: The factor is computed from the ratio jpegFrameDim.x / maxRes.x and validated against widthPack * jpegFrameDim.y == maxRes.y before decoding begins. An upper bound of 4 is enforced as a sanity check.Fuzz harness
The LJpegDecompressor fuzz entry point is updated to source a predictorMode byte from the fuzzer input, covering the new code paths.
Testing
Additionally to sucessfully testing my DJI Mavic Air 3S DNG files I also downloaded this tarball (shared by @cytrinox here) containing DNG images with various encodings and tested all with the following result:
Fixes #189
Fixes #258
Fixes #191
In order to also fix #190, we need to add a decoding path for 12bit lossy JPEG encoding, this is going to be future work.