Skip to content

LJpeg: support predictor modes 2–7#963

Open
da-phil wants to merge 8 commits intodarktable-org:developfrom
da-phil:add_ljpeg_predictor_modes_2_7
Open

LJpeg: support predictor modes 2–7#963
da-phil wants to merge 8 commits intodarktable-org:developfrom
da-phil:add_ljpeg_predictor_modes_2_7

Conversation

@da-phil
Copy link
Copy Markdown
Contributor

@da-phil da-phil commented Apr 18, 2026

This PR is supposed to address #189 and #258 and was created with the help of Claude.

Summary

rawspeed's LJpeg decoder previously supported only predictor mode 1 (left neighbor). ITU-T T.81 defines seven predictor modes, and several camera manufacturers — notably DJI (Mavic 3S, Mavic 3 Pro, etc.) and Blackmagic — use mode 6 in their DNG files. This change adds support for all seven modes and handles the associated tile geometry those cameras use.

Predictor modes 2–7

  • Added a computePrediction(mode, Ra, Rb, Rc) helper that computes the seven ITU-T T.81 predictions. Arithmetic is done in int32_t to avoid overflow in modes 4–6, where Ra + Rb - Rc can transiently exceed the 16-bit range; the final result wraps via uint16_t cast per the standard.
  • decodeRowN() is extended to accept predMode and prevStripe. For mode 1, the existing fast path (no memory access to the previous row) is preserved. For modes 2–7, the three 2D neighbors (Ra = left, Rb = above, Rc = above-left) are looked up from prevStripe; at the start of a row the above-left neighbor is not available, so Rc = Rb per the JPEG specification.
  • decodeN() now tracks isFirstRow per restart interval. The first row of each interval always uses predictor mode 1 (per T.81: the standard mandates horizontal prediction for the first row). For subsequent rows, prevStripe is a CroppedArray2DRef into the already-decoded portion of the output image. No extra allocation is needed. Inverted tile reshape (LJpegDecoder)

Inverted tile reshape

DJI DNGs present tiles in a non-standard geometry: the JPEG SOF3 frame is wider than the declared tile (e.g. 8000×1500 for a 4000×3000 tile). Each JPEG row encodes two tile rows concatenated side-by-side — a widthPack = jpegFrameDim.x / tileW packing. This is the inverse of the Adobe-style reshape rawspeed already handles (MCU > 1×1).

Design trade-offs

  • Separate decode-then-deinterleave vs. in-place: Reusing the existing LJpegDecompressor in-place would require threading the tile geometry inversion through all MCU addressing logic. Instead, the inverted-reshape path decodes into a temporary RawImage at the JPEG frame dimensions, then copies/deinterleaves rows into the real output image. The extra allocation is bounded by a single tile's worth of data and avoids touching the hot-path decompressor.
  • widthPack validation: The factor is computed from the ratio jpegFrameDim.x / maxRes.x and validated against widthPack * jpegFrameDim.y == maxRes.y before decoding begins. An upper bound of 4 is enforced as a sanity check.
  • The inverted reshape is restricted to single-component LJpeg (N_COMP == 1), which is the only case seen in practice for this layout.

Fuzz harness

The LJpegDecompressor fuzz entry point is updated to source a predictorMode byte from the fuzzer input, covering the new code paths.

Testing

Additionally to sucessfully testing my DJI Mavic Air 3S DNG files I also downloaded this tarball (shared by @cytrinox here) containing DNG images with various encodings and tested all with the following result:

  • cfaraw_16bit_tiled_pred1_2comp_doublewidth.dng ✔️
  • cfaraw_16bit_tiled_pred3_2comp_normalwidth.dng ✔️
  • cfaraw_16bit_tiled_pred4_2comp_doublewidth.dng ✔️
  • cfaraw_16bit_tiled_pred4_2comp_normalwidth.dng ✔️
  • cfaraw_16bit_tiled_pred5_2comp_doublewidth.dng ✔️
  • cfaraw_16bit_tiled_pred5_2comp_normalwidth.dng ✔️
  • cfaraw_16bit_tiled_pred6_1comp_doublewidth.dng ✔️
  • cfaraw_16bit_tiled_pred7_2comp_doublewidth.dng ✔️
  • cfaraw_16bit_tiled_pred7_2comp_normalwidth.dng ✔️

Fixes #189
Fixes #258
Fixes #191

In order to also fix #190, we need to add a decoding path for 12bit lossy JPEG encoding, this is going to be future work.

da-phil added 2 commits April 18, 2026 11:43
Summary:

rawspeed's LJpeg decoder previously supported only predictor mode 1 (left neighbor). ITU-T T.81 defines seven predictor modes, and several camera manufacturers — notably DJI (Mavic 3S, Mavic 3 Pro, etc.) and Blackmagic — use mode 6 in their DNG files. This change adds support for all seven modes and handles the associated tile geometry those cameras use.

Predictor modes 2–7

* Added a computePrediction(mode, Ra, Rb, Rc) helper that computes the seven ITU-T T.81 predictions. Arithmetic is done in int32_t to avoid overflow in modes 4–6, where Ra + Rb - Rc can transiently exceed the 16-bit range; the final result wraps via uint16_t cast per the standard.
* decodeRowN() is extended to accept predMode and prevStripe. For mode 1, the existing fast path (no memory access to the previous row) is preserved. For modes 2–7, the three 2D neighbors (Ra = left, Rb = above, Rc = above-left) are looked up from prevStripe; at the start of a row the above-left neighbor is not available, so Rc = Rb per the JPEG specification.
* decodeN() now tracks isFirstRow per restart interval. The first row of each interval always uses predictor mode 1 (per T.81: the standard mandates horizontal prediction for the first row). For subsequent rows, prevStripe is a CroppedArray2DRef into the already-decoded portion of the output image. No extra allocation is needed.
Inverted tile reshape (LJpegDecoder)

Inverted tile reshape

DJI DNGs present tiles in a non-standard geometry: the JPEG SOF3 frame is wider than the declared tile (e.g. 8000×1500 for a 4000×3000 tile). Each JPEG row encodes two tile rows concatenated side-by-side — a widthPack = jpegFrameDim.x / tileW packing. This is the inverse of the Adobe-style reshape rawspeed already handles (MCU > 1×1).

Design trade-offs:

Separate decode-then-deinterleave vs. in-place: Reusing the existing LJpegDecompressor in-place would require threading the tile geometry inversion through all MCU addressing logic. Instead, the inverted-reshape path decodes into a temporary RawImage at the JPEG frame dimensions, then copies/deinterleaves rows into the real output image. The extra allocation is bounded by a single tile's worth of data and avoids touching the hot-path decompressor.
widthPack validation: The factor is computed from the ratio jpegFrameDim.x / maxRes.x and validated against widthPack * jpegFrameDim.y == maxRes.y before decoding begins. An upper bound of 4 is enforced as a sanity check.
The inverted reshape is restricted to single-component LJpeg (N_COMP == 1), which is the only case seen in practice for this layout.
Fuzz harness

The LJpegDecompressor fuzz entry point is updated to source a predictorMode byte from the fuzzer input, covering the new code paths.
Eliminate per-pixel predMode branch (performance)

decodeRowN gains a bool Use2DPred template parameter. The mode-1 (left-neighbor) path and the 2D-predictor path are now separate compile-time instantiations selected with if constexpr. The runtime if (predMode == 1) that previously executed for every pixel in every tile is gone entirely. Mode-1 files — the overwhelming majority of existing DNG content — are completely unaffected at the generated-code level; the compiler produces the same tight loop as before. For modes 2–7, the computePrediction switch is also folded away since predictorMode is now only evaluated once at the per-row dispatch in decodeN(), not per-pixel.

The first-column Rc check is also simplified: stripeCol >= MCUSize.x (computed per-pixel) is replaced with mcuIdx > 0 (a loop-index comparison available for free), and the redundant stripeRow alias is removed.

Replace scalar deinterleave with memcpy (performance)

In the inverted-reshape path (LJpegDecoder), the per-pixel operator() copy loop is replaced by a single std::memcpy per row-segment. The previous loop also carried a data-dependent bounds check (srcCol + col < jpegFrameDim.x) that blocked auto-vectorisation; the check is provably redundant given the dimension validation already performed, so it is removed. memcpy allows the compiler to emit an optimal vector store.
@da-phil da-phil requested a review from LebedevRI as a code owner April 18, 2026 10:00
@github-actions
Copy link
Copy Markdown

The proposed diff is not clang-formatted.
To make this check pass, download the following patch
(via browser, you must be logged-in in order for this URL to work),
(NOTE: save it into the repo checkout dir for the snippet to work)
https://github.com/darktable-org/rawspeed/actions/runs/24602278749/artifacts/6510287982
... and run:

cd <path/to/repo/checkout> # NOTE: use your own path here
unzip clang-format.patch.zip
git stash # Temporairly stash away any preexisting diff
git apply clang-format.patch # Apply the diff
git add -u # Stage changed files
git commit -m "Applying clang-format" # Commit the patch
git push
git stash pop # Unstast preexisting diff
rm clang-format.patch.zip clang-format.patch

@da-phil
Copy link
Copy Markdown
Contributor Author

da-phil commented Apr 18, 2026

@LebedevRI @kmilos do I need to update any other meta data in the repo for "full camera integration"?
I own a DJI Mavic air 3S and could gather data myself if necessary.
For the other cameras this PR enables within rawspeed, we would need to ask other contributors.

Comment thread src/librawspeed/decompressors/LJpegDecoder.cpp Fixed
Comment thread src/librawspeed/decompressors/LJpegDecoder.cpp Fixed
Some DNG files encode 2-component tiles (e.g., 592×158) as JPEG frames with the component dimension packed horizontally into a narrower, taller JPEG (592×79, 2 components). The JPEG SOF reports half the tile height, with each row encoding two consecutive output rows' worth of data — components are always interleaved horizontally, so the effective decoded width per JPEG row is jpegFrameDim.x × N_COMP = 1184.

LJpegDecompressor.cpp — allow MCU{1,2} construction and dispatch

The constructor's MCU allowlist and decode()'s dispatch table gain {1,2}. This is needed so the decompressor can be instantiated when the caller passes MCU{1,2} (it will never be used for actual pixel output after the decoder fix below, but the validation and dispatch must not reject it at an early stage).

LJpegDecoder.cpp — detect vertical MCU and route to the inverted reshape path

Two complementary changes:

* In the standard (non-inverted) path, after computing MCUSize = maxRes / jpegFrameDim, gate the direct-decode branch on MCUSize.x >= MCUSize.y. When MCUSize is purely vertical (e.g., {1,2}), fall through instead of decoding in-place.

* Generalize the inverted reshape path to handle N_COMP > 1:
  * effectiveJpegWidth = jpegFrameDim.x × N_COMP replaces jpegFrameDim.x everywhere in width calculations.
  * The temporary decode buffer is sized effectiveJpegWidth × jpegFrameDim.y.
  * MCU is set to {N_COMP, 1} (horizontal interleaving), not hardcoded {1,1}.
  * The N_COMP != 1 guard is removed.
  * widthPack is derived from effectiveJpegWidth / maxRes.x, so for the 2-comp example: 1184 / 592 = 2, matching the tile's 158 / 79 = 2× height ratio.

This correctly reconstructs the full-width, full-height tile from the narrower, shorter JPEG frame, yielding proper CFA Bayer output instead of the full-image purple/orange vertical stripe pattern that occurred when the two components were written to separate rows.
@da-phil da-phil force-pushed the add_ljpeg_predictor_modes_2_7 branch from 8ad4f39 to d8577b1 Compare April 18, 2026 11:28
@cytrinox
Copy link
Copy Markdown
Contributor

Have you tested your implementation against all predictor modes? I've added a sample set in #334 (comment)

@kmilos
Copy link
Copy Markdown
Collaborator

kmilos commented Apr 18, 2026

@da-phil Apart from the ljpeg decoding, I don't think anything else is needed for "base" support of these. After that, there is a known issue (and current workaround in darktable, not rawspeed) for the GainMap in OpCodeList1 (to be applied straight after deciding) which is for vignetting in DJI raws.

@kmilos
Copy link
Copy Markdown
Collaborator

kmilos commented Apr 19, 2026

@da-phil Note also #475 (comment)

@da-phil
Copy link
Copy Markdown
Contributor Author

da-phil commented Apr 19, 2026

Have you tested your implementation against all predictor modes? I've added a sample set in #334 (comment)

(stupid github app, first it showed my comment two times and when deleting the 2nd one, both disappeared)

Yes, I did test it on your test pictures, see PR description, all images can be decoded successfully now.

Can you recommend me to look at other test images, e.g. from other camera manufacturers like Blackmagic?

@da-phil
Copy link
Copy Markdown
Contributor Author

da-phil commented Apr 19, 2026

@da-phil Note also #475 (comment)

What does this mean in practice for this PR? I understand that you're going to do a hard cut once the other RAW decoder lib achieved parity and stability to be used by darktable, but how far is the progress, is it tracked somewhere?

@kmilos
Copy link
Copy Markdown
Collaborator

kmilos commented Apr 19, 2026

You can find a couple of Blackmagic samples on raw.pixls.us

As this change is not trivial, only @LebedevRI can comment. The rewrite can be found at https://github.com/darktable-org/rawspeed.rs with an issue containing the roadmap and progress.

@da-phil
Copy link
Copy Markdown
Contributor Author

da-phil commented Apr 19, 2026

You can find a couple of Blackmagic samples on raw.pixls.us

As this change is not trivial, only @LebedevRI can comment. The rewrite can be found at https://github.com/darktable-org/rawspeed.rs with an issue containing the roadmap and progress.

I like the fast progress of "software corrosion" nowadays, and the humor of rust folks ;)

Will try some Blackmagic RAW samples in the meantime, thanks!

Regarding roadmap & progress: looks like feature parity is far away in the future so far... I wonder if @LebedevRI is looking for contributors to accelerate feature parity.

@github-actions
Copy link
Copy Markdown

The proposed diff is not clang-formatted.
To make this check pass, download the following patch
(via browser, you must be logged-in in order for this URL to work),
(NOTE: save it into the repo checkout dir for the snippet to work)
https://github.com/darktable-org/rawspeed/actions/runs/24627775015/artifacts/6517749030
... and run:

cd <path/to/repo/checkout> # NOTE: use your own path here
unzip clang-format.patch.zip
git stash # Temporairly stash away any preexisting diff
git apply clang-format.patch # Apply the diff
git add -u # Stage changed files
git commit -m "Applying clang-format" # Commit the patch
git push
git stash pop # Unstast preexisting diff
rm clang-format.patch.zip clang-format.patch

@da-phil da-phil force-pushed the add_ljpeg_predictor_modes_2_7 branch from ca973e9 to aeb1719 Compare April 19, 2026 11:30
@da-phil
Copy link
Copy Markdown
Contributor Author

da-phil commented Apr 19, 2026

Can I assume that it is okay to ignore the output of the ClangCTUStaticAnalysis & fuzzer build jobs, as they seem to discover a lot of things which are independent of my changes?

Depending on how much longer we want to maintain this repo, I'd be willing to contribute to the investigation of the issues in a separate PR.

Comment thread src/librawspeed/decompressors/LJpegDecoder.cpp Fixed
Comment thread src/librawspeed/decompressors/LJpegDecoder.cpp Fixed
da-phil added 2 commits April 19, 2026 14:31
Blackmagic CinemaDNG files don't include TIFF Make/Model tags — they only have UniqueCameraModel (tag 50708), which is valid per the DNG spec. In decodeMetaDataInternal(), the code called getID() unconditionally.

Instead of calling getID() in a try/catch (which logged a spurious error), check for MAKE+MODEL existence first. If present, call getID() as before. If absent, fall back to UNIQUECAMERAMODEL for both make and model fields — mirroring the approach already used in checkSupportInternal().
@da-phil da-phil force-pushed the add_ljpeg_predictor_modes_2_7 branch from 4ceecda to a167a44 Compare April 19, 2026 12:34
@github-actions
Copy link
Copy Markdown

The proposed diff is not clang-formatted.
To make this check pass, download the following patch
(via browser, you must be logged-in in order for this URL to work),
(NOTE: save it into the repo checkout dir for the snippet to work)
https://github.com/darktable-org/rawspeed/actions/runs/24631353348/artifacts/6518826737
... and run:

cd <path/to/repo/checkout> # NOTE: use your own path here
unzip clang-format.patch.zip
git stash # Temporairly stash away any preexisting diff
git apply clang-format.patch # Apply the diff
git add -u # Stage changed files
git commit -m "Applying clang-format" # Commit the patch
git push
git stash pop # Unstast preexisting diff
rm clang-format.patch.zip clang-format.patch

@da-phil da-phil force-pushed the add_ljpeg_predictor_modes_2_7 branch from 607fc8a to cb0c24e Compare April 19, 2026 14:33
@da-phil
Copy link
Copy Markdown
Contributor Author

da-phil commented Apr 19, 2026

Current state:

Also Blackmagic Camera DNG RAWs are decoded, with one caveat:

  • SOF3 (lossless JPEG) files decode without any change
  • SOF1 (Extended Sequential DCT + DQT) require 12bit lossy JPEG decoding, which is not supported on libjepg-turbo < 3.0 versions.

On libjpeg-turbo versions < 3.0, the error is now "Unsupported JPEG data precision 12" (much clearer).
On systems with libjpeg-turbo 3.0+, the HAVE_JPEG_12BIT path will automatically enable full 12-bit lossy JPEG decoding.
I still need to test the HAVE_JPEG_12BIT path, as my ubuntu 24.04 also still uses ibjpeg-turbo 2.1.5, such as the current ci setup for rawspeed.

EDIT: due to the current unavailability of libjpeg-turbo 3.0+ for the project and the slight scope creep of this PR, I decided to remove the still experimental 12bit lossless JPEG path and only provide the comprehensive error msg instead.
Supporting 12bit lossless JPEG encoded DNGs from Blackmagic cameras will be work for a future PR.

@da-phil da-phil force-pushed the add_ljpeg_predictor_modes_2_7 branch from cb0c24e to a37c196 Compare April 19, 2026 14:46

// Scan markers after SOI. Stop after a reasonable number of bytes.
const auto limit =
std::min(remaining, static_cast<ByteStream::size_type>(1024));
@da-phil da-phil force-pushed the add_ljpeg_predictor_modes_2_7 branch from a37c196 to 5efdb0e Compare April 19, 2026 20:50
@da-phil da-phil force-pushed the add_ljpeg_predictor_modes_2_7 branch from 5efdb0e to 2e27a99 Compare April 19, 2026 21:05
Some Blackmagic CinemaDNG files (Pocket Cinema Camera 4K files (3)/(4), Micro Cinema Camera (1)) use SOF1 (Extended Sequential DCT) at 12-bit precision but label tiles as TIFF compression=7 (lossless JPEG). The lossless JPEG decoder hit a DQT marker and threw "Not a valid RAW file."

On libjpeg-turbo 2.1.5, the error is now "Unsupported JPEG data precision 12" (much clearer).

Future work: on systems with libjpeg-turbo 3.0+, we can enable full 12-bit lossy JPEG decoding.
@da-phil da-phil force-pushed the add_ljpeg_predictor_modes_2_7 branch from 2e27a99 to 8e10cee Compare April 20, 2026 21:59
@da-phil
Copy link
Copy Markdown
Contributor Author

da-phil commented Apr 20, 2026

Update: due to the current unavailability of libjpeg-turbo 3.0+ for the project, and the slight scope creep of this PR, I decided to remove the still experimental 12bit lossless JPEG path and only provide the comprehensive error msg instead.
Supporting 12bit lossless JPEG encoded DNGs from Blackmagic cameras will be work for a future PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

4 participants