
jxl: Support fine-grained metadata filtering instead of all-or-nothing by filtering Exif data in memory #19056

Closed
dholth wants to merge 9 commits into darktable-org:master from dholth:patch-1

Conversation

@dholth

dholth commented Jul 8, 2025


Add dt_exif_write_blob_to_buffer() so we can send jxl.c exactly the metadata it should write, instead of handling metadata after the fact. Update jxl.c to support this mode of operation, guarded by Exiv2 version checks.

For .jpeg, dt_exif_write_blob() adds metadata after the compressed image has been written:

jpeg_write_image(...); 
dt_exif_write_blob(..., filename); // Add metadata to file

Exiv2 is able to add metadata to .jpeg after the fact, but jpeg-xl expects it to be added by the encoder, and the Exif data comes before the image codestream. By filtering the metadata in memory first, we can pass exactly the filtered result to the encoder, which resolves the all-or-nothing behavior.
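For context, the JPEG XL container stores Exif in an `Exif` box whose payload starts with a 4-byte big-endian offset to the TIFF header, followed by the TIFF data itself. Below is a minimal sketch of preparing such a payload from an in-memory blob, assuming the blob may still carry the JPEG-style `Exif\0\0` prefix. `make_jxl_exif_payload` is a hypothetical helper written for illustration; it is not the function added by this PR, nor part of libjxl or Exiv2.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper (not darktable's API): build the payload of a
// JPEG XL "Exif" box from an in-memory Exif blob.
std::vector<uint8_t> make_jxl_exif_payload(const uint8_t *blob, size_t len)
{
  // JPEG APP1 segments prefix the TIFF data with "Exif\0\0"; the JXL box does not.
  static const uint8_t jpeg_prefix[6] = { 'E', 'x', 'i', 'f', 0, 0 };
  if(blob && len >= sizeof(jpeg_prefix)
     && std::memcmp(blob, jpeg_prefix, sizeof(jpeg_prefix)) == 0)
  {
    blob += sizeof(jpeg_prefix);
    len -= sizeof(jpeg_prefix);
  }

  // Nothing left to embed: return an empty payload so the caller can skip the box.
  if(!blob || len == 0) return {};

  // 4-byte big-endian offset to the TIFF header; 0 means it follows immediately.
  std::vector<uint8_t> payload(4, 0);
  payload.insert(payload.end(), blob, blob + len);
  return payload;
}
```

The resulting bytes are what one would hand to the encoder's box API (in libjxl, `JxlEncoderAddBox`) before the codestream is written, so any filtering has to happen before this point.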

Working through some noise in the PR.

AI-assisted.

@kmilos

kmilos commented Jul 8, 2025

Contributor

I'm not convinced this is still a good idea. We have no clue how much other software out there that could possibly consume JXLs produced by dt actually supports brotli compressed metadata...

Moreover, not every exiv2 >= 0.28 build necessarily supports brotli, as it is optional.

@dholth

dholth commented Jul 8, 2025

Author

Most jxl software probably supports compressed boxes, since cjxl from the reference library produces them by default.

I looked into this after noticing no exif metadata in jxl exports, but it turns out that checking every single box in the "which metadata to export" preferences does include exif, and the xmp metadata.

@kmilos

kmilos commented Jul 9, 2025

Contributor

And one more point: the dt packages for Windows must use exiv2 0.27.x for the time being.

@parafin

parafin commented Jul 9, 2025

Member

Out of interest, how much space does this save? It's a minor note, but by compressing the metadata, you're making it impossible to use grep to find some strings in the metadata (like camera model).

@dholth

dholth commented Jul 9, 2025

Author

jxlinfo shows the size of each box in the container. I hope few people are using grep to find the word "Sony" or "Panasonic" in a (megabytes) file. Agree that it's more profitable to look into whether exiv2 can edit a jxl BMFF file or otherwise fix the "metadata isn't written unless all boxes checked" problem, by adding metadata in the same way as jpg export.

@parafin

parafin commented Jul 9, 2025

Member

> jxlinfo shows the size of each box in the container.

So, could you show some comparison before and after your change? As you say, it's a "(megabytes)" file, so what percentage of the total file size will this obfuscation save?

@dholth

dholth commented Jul 9, 2025

Author

I am more interested in allowing jpegxl to include exif even if all boxes are not checked in export->preferences.

There is a write_image(..., exif, exif_len, ...) function. Where is the code that clears exif unless all boxes are checked, yet apparently allows exif when exporting jpeg through the similar write_image in jpeg.c?

@dholth

dholth commented Jul 9, 2025

Author

The code to ignore exif unless all boxes are checked is https://github.com/darktable-org/darktable/blob/master/src/imageio/imageio.c#L1427-L1434

One way to re-use the existing Exiv2 code for jpegxl, and allow some boxes to be un-checked, could be to create a temporary sidecar file in the exv format:

std::unique_ptr<Exiv2::Image> image(Exiv2::ImageFactory::create(Exiv2::ImageType::exv, WIDEN(path)));

The exv format looks like Exiv2's file format for exif data without image data.

Metadata edit functions happen on the exv format.

After all metadata activity is done, image is encoded with pre-filtered metadata blobs from the exv file.
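To make the difference concrete, here is a minimal sketch contrasting the current all-or-nothing check with a per-category one. The flag names and both functions are hypothetical, invented for illustration only; they are not darktable's actual export flags and do not reproduce the code at the linked lines.

```cpp
#include <cstdint>

// Hypothetical preference bits, one per checkbox in the export dialog.
constexpr uint32_t META_EXIF = 1u << 0;
constexpr uint32_t META_TAGS = 1u << 1;
constexpr uint32_t META_GEO  = 1u << 2;
constexpr uint32_t META_ALL  = META_EXIF | META_TAGS | META_GEO;

// All-or-nothing: the encoder only receives Exif when every box is checked,
// because nothing filters the blob before it is embedded.
bool embed_exif_all_or_nothing(uint32_t flags)
{
  return (flags & META_ALL) == META_ALL;
}

// Filtered: Exif is embedded whenever the Exif box itself is checked;
// the remaining categories are stripped from the blob beforehand.
bool embed_exif_filtered(uint32_t flags)
{
  return (flags & META_EXIF) != 0;
}
```

With Exif and tags checked but geo data unchecked, the first function returns false (no Exif at all in the jxl), while the second returns true and leaves it to the filtering step to drop the geo entries.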

@dholth dholth changed the title jxl.c: Brotli-compress metadata jxl.c: improve jpeg-xl metadata handling Jul 9, 2025
@victoryforce

Collaborator

@dholth First of all, thank you for the suggestion to improve darktable and for your contribution.

So far, the discussion here has been about the practical value of this change, its pros versus cons. This is actually the main thing, because if we come to the conclusion that this change is impractical, then there is no point in discussing the implementation.

But, I also have a question about the specific implementation...

Yes, I already saw the disclaimer that you didn't even compile the code. That's okay, of course it will compile, because the change is really trivial. But the question is what will be the behavior of this code if darktable is linked with libjxl which does not have built-in brotli support.

Will the Exif box be written uncompressed? Will the Exif box not be written at all? Will darktable crash?

@dholth

dholth commented Jul 10, 2025

Author

I was able to compile the code, you have a great developer experience.

libjxl does not compile without brotli.

I've changed this commit to include a separate ".exv" file next to the jpeg-xl. Suppose the EXV "metadata sidecar" code were moved to the top of dt_imageio_export_with_flags in imageio.c. Instead of writing the image and then editing it several times to include metadata, only the metadata would be written at this point.

Then run dt_exif_xmp_attach_export against the exv file; this appears to be where the metadata filtering happens.

After that, encode the image data with final metadata pulled from the exv file.

@ralfbrown ralfbrown added the scope: DAM managing files, collections, archiving, metadata, etc. label Jul 11, 2025
@ralfbrown

Collaborator

Since the Exif-handling code is in C++, have you considered using an in-memory iostream (e.g. an ostrstream) instead of a disk file for the temporary storage? That would be faster, especially if the image lives on a network drive.
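A minimal sketch of that idea, using `std::ostringstream` (the non-deprecated relative of `ostrstream`) to collect metadata bytes in memory and hand them back as a buffer. `buffer_metadata` and the chunked input are assumptions made for illustration; the real code would more likely go through Exiv2's own I/O classes (Exiv2 provides an in-memory `MemIo`), which this sketch does not use.

```cpp
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: concatenate metadata chunks in memory instead of
// writing them to a temporary file on disk.
std::vector<uint8_t> buffer_metadata(const std::vector<std::vector<uint8_t>> &chunks)
{
  // Binary mode so no byte is translated on any platform.
  std::ostringstream out(std::ios::binary);
  for(const auto &chunk : chunks)
  {
    out.write(reinterpret_cast<const char *>(chunk.data()),
              static_cast<std::streamsize>(chunk.size()));
  }

  // Copy the accumulated bytes out of the stream's internal string.
  const std::string bytes = out.str();
  return std::vector<uint8_t>(bytes.begin(), bytes.end());
}
```

Nothing touches the filesystem here, so the cost is independent of where the exported image lives.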

@dholth

dholth commented Jul 12, 2025

Author

The code now moves the metadata handling to a temporary exv file and then copies it into the image. This works to get the exif data into the jpeg-xl image; the xmp data doesn't appear to make it into the exv file for whatever reason.

@TurboGit TurboGit added the controversial this raises concerns, don't move on the technical work before reaching a consensus label Jul 17, 2025
@github-actions


This pull request has been marked as stale due to inactivity for the last 60 days. It will be automatically closed in 300 days if no update occurs. Please verify it has no conflicts with the master branch and rebase if needed. Mention it now if you need help or give permission to other people to finish your work.

@dholth dholth changed the title jxl.c: improve jpeg-xl metadata handling jxl: Support fine-grained metadata filtering instead of all-or-nothing by Jun 18, 2026
@dholth dholth changed the title jxl: Support fine-grained metadata filtering instead of all-or-nothing by jxl: Support fine-grained metadata filtering instead of all-or-nothing by filtering Exif data in memory Jun 18, 2026
@dholth

dholth commented Jun 18, 2026

Author

Superseded by #21344 which cleans things up, writes filtered exif to memory, and only advertises XMP support if Exiv2 is new enough.

@dholth dholth closed this Jun 18, 2026

Labels

controversial (this raises concerns, don't move on the technical work before reaching a consensus)
no-pr-activity
scope: DAM (managing files, collections, archiving, metadata, etc.)
