Commit c6f218d

Strip embedded XMP in TIFF/BMFF and update docs
Add support to detect and strip existing embedded XMP from TIFF and BMFF inputs and tighten BMFF behavior, plus related docs and tests.

Code: introduce helpers to remove IFD entries (remove_tiff_ifd_entry_tag, remove_ifd_entry_tag), convert parsed payloads to TIFF offsets (payload_to_tiff_offset), and detect BMFF meta entries that declare XMP items (bmff_meta_declares_xmp_item). Update bmff metadata collection to optionally require the OpenMeta transfer marker and add logic to inspect and strip existing XMP from EXIF/GPS/SubIFD/page IFDs when requested; reject BMFF strip mode if a foreign metadata meta box (without OpenMeta marker) is found. Adjust many parsing and merging code paths to preserve/strip entries correctly.

Tests: add little-endian write helpers and numerous test-build helpers to construct TIFF/BigTIFF/DNG/BMFF fixtures (preview, subIFD, EXIF/GPS carriers, etc.) to validate the new strip/merge behaviors.

CMake/docs: print install prefix/libdir and computed CMake package dir in OpenMeta config summary, and document the exported CMake package path in sphinx/docs. Also add a large "Competitor Parity Checklist" section to metadata_transfer_plan.md describing remaining parity work and priorities.
1 parent 7fcb284 commit c6f218d
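
As a rough illustration of what removing an IFD entry by tag involves, here is a minimal Python sketch. It is not the commit's C++ implementation: `remove_ifd_entry` and `make_entry` are hypothetical names, and the sketch assumes a standalone classic (non-BigTIFF) little-endian IFD block with 12-byte entries. Tag 700 (`XMLPacket`) is the standard TIFF tag for embedded XMP.

```python
import struct

XMP_TAG = 700  # TIFF tag 0x02BC (XMLPacket), the usual carrier for embedded XMP


def make_entry(tag: int, typ: int, count: int, value: int) -> bytes:
    """Build one 12-byte classic TIFF IFD entry (little-endian). Hypothetical helper."""
    return struct.pack("<HHII", tag, typ, count, value)


def remove_ifd_entry(ifd: bytes, tag: int) -> bytes:
    """Return a copy of a classic little-endian IFD with every entry for `tag` dropped.

    Layout assumed: 2-byte entry count, count * 12-byte entries,
    4-byte offset of the next IFD.
    """
    (count,) = struct.unpack_from("<H", ifd, 0)
    entries_end = 2 + count * 12
    if len(ifd) < entries_end + 4:
        raise ValueError("truncated IFD")
    kept = []
    for i in range(count):
        start = 2 + i * 12
        entry = ifd[start:start + 12]
        (entry_tag,) = struct.unpack_from("<H", entry, 0)
        if entry_tag != tag:
            kept.append(entry)
    next_ifd = ifd[entries_end:entries_end + 4]
    # Re-emit the count so it matches the surviving entries.
    return struct.pack("<H", len(kept)) + b"".join(kept) + next_ifd


if __name__ == "__main__":
    ifd = (
        struct.pack("<H", 2)
        + make_entry(256, 3, 1, 64)        # ImageWidth
        + make_entry(XMP_TAG, 1, 10, 200)  # XMP packet stored out of line
        + struct.pack("<I", 0)             # no next IFD
    )
    stripped = remove_ifd_entry(ifd, XMP_TAG)
    print(len(ifd), len(stripped))
```

The sketch only rewrites the IFD block. In a real file the removed entry's out-of-line payload would stay behind as unreferenced bytes unless the writer also reclaims it, and any offsets that point past the IFD would need to be checked.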

6 files changed

Lines changed: 4428 additions & 1439 deletions

cmake/OpenMetaReport.cmake

Lines changed: 15 additions & 0 deletions
@@ -18,6 +18,16 @@ function(openmeta_report_dependency name enabled found provider)
 endfunction()
 
 function(openmeta_print_config_summary)
+    set(_openmeta_pkg_install_dir "")
+    if(DEFINED _openmeta_pkg_dir AND NOT "${_openmeta_pkg_dir}" STREQUAL "")
+        if(IS_ABSOLUTE "${_openmeta_pkg_dir}")
+            set(_openmeta_pkg_install_dir "${_openmeta_pkg_dir}")
+        else()
+            set(_openmeta_pkg_install_dir
+                "${CMAKE_INSTALL_PREFIX}/${_openmeta_pkg_dir}")
+        endif()
+    endif()
+
     message(STATUS "OpenMeta configuration summary:")
     message(STATUS " Generator: ${CMAKE_GENERATOR}")
     if(CMAKE_CONFIGURATION_TYPES)
@@ -36,6 +46,11 @@ function(openmeta_print_config_summary)
     else()
         message(STATUS " CMAKE_PREFIX_PATH: <empty>")
     endif()
+    message(STATUS " Install prefix: ${CMAKE_INSTALL_PREFIX}")
+    message(STATUS " Install libdir: ${CMAKE_INSTALL_LIBDIR}")
+    if(_openmeta_pkg_install_dir)
+        message(STATUS " CMake package dir: ${_openmeta_pkg_install_dir}")
+    endif()
     if(OPENMETA_DEPS_REPOS_ROOT)
         message(STATUS " OPENMETA_DEPS_REPOS_ROOT: ${OPENMETA_DEPS_REPOS_ROOT}")
     endif()
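
The package-directory resolution added to `openmeta_print_config_summary` can be restated outside CMake. The following Python sketch is illustrative only: `resolve_package_dir` is a hypothetical name, and it assumes POSIX-style paths (CMake's `IS_ABSOLUTE` also recognizes Windows drive paths, which this sketch ignores).

```python
import posixpath


def resolve_package_dir(install_prefix: str, pkg_dir: str) -> str:
    """Mirror the summary logic for the printed CMake package dir.

    An unset or empty package dir yields "" (nothing is printed), an
    absolute dir passes through unchanged, and a relative dir is
    appended to the install prefix.
    """
    if not pkg_dir:
        return ""
    if posixpath.isabs(pkg_dir):
        return pkg_dir
    return f"{install_prefix}/{pkg_dir}"


if __name__ == "__main__":
    print(resolve_package_dir("/usr", "lib/x86_64-linux-gnu/cmake/OpenMeta"))
    print(resolve_package_dir("/usr", "/opt/openmeta/lib/cmake/OpenMeta"))
```

Like the CMake code, the sketch joins with a plain `/` and does not normalize the result, so a prefix with a trailing slash would produce a doubled separator.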

docs/development.md

Lines changed: 5 additions & 0 deletions
@@ -1384,5 +1384,10 @@ cmake --install build --prefix /tmp/openmeta-install
 ls /tmp/openmeta-install/share/doc/OpenMeta/html/index.html
 ```
 
+The exported CMake package is installed under
+`${CMAKE_INSTALL_LIBDIR}/cmake/OpenMeta`. On Unix this may resolve to a
+multiarch path such as `lib/x86_64-linux-gnu/cmake/OpenMeta` when the install
+prefix is `/usr`.
+
 When both `OPENMETA_BUILD_SPHINX_DOCS=ON` and `OPENMETA_BUILD_DOCS=ON`, the
 Doxygen HTML output is installed under `share/doc/OpenMeta/doxygen/html`.

docs/metadata_transfer_plan.md

Lines changed: 157 additions & 0 deletions
@@ -607,6 +607,163 @@ Exit criteria:
 3. Spend follow-up effort on deeper parity lanes only after the writer
 baseline is trustworthy.
 
+## Competitor Parity Checklist
+
+This section is a practical tracking view for the remaining gap against
+general-purpose metadata competitors.
+
+Estimated remaining work packages:
+- to reach normal still-image workflow parity close to `Exiv2`: about `8`
+major work packages
+- to reach broader overall parity closer to `ExifTool`: about `12-15`
+major work packages
+
+These are not release-percentage numbers. They are rough planning counts for
+distinct parity workstreams that still matter after the current public writer
+contract work.
+
+### Package Status
+
+Status legend:
+- `Done`: public contract and regression coverage are good enough that this is
+no longer a primary parity blocker
+- `Partial`: real support exists, but competitor-visible limits still remain
+- `Missing`: still a clear parity gap
+
+| Work package | Why it matters for parity | Status | Remaining package count | Main target families |
+| --- | --- | --- | --- | --- |
+| Public writer contract for primary targets | Competitors feel predictable on preserve/replace behavior; OpenMeta still needs that same trust level across all first-class targets | `Partial` | `1` | `TIFF`, `DNG`, `PNG`, `WebP`, `JP2`, `JXL`, `HEIF/AVIF/CR3` |
+| General EXIF / IPTC / XMP sync engine | One of the biggest remaining gaps for general editing adoption | `Partial` | `1-2` | Cross-cutting |
+| Compare-backed release validation | Needed to defend parity claims with repeatable read-back and compare gates | `Partial` | `1` | Cross-cutting |
+| MakerNote rewrite trust | Read parity is strong, but rewrite guarantees still trail mature tools | `Partial` | `1` | `JPEG`, `TIFF`, `DNG`, RAW-derived lanes |
+| TIFF / DNG deeper rewrite guarantees | Important for serious export/edit trust on camera-originated files | `Partial` | `1` | `TIFF`, `DNG` |
+| BMFF writer depth beyond current bounded contract | Needed for stronger `HEIF/AVIF/CR3` parity beyond the current metadata-only `meta` model | `Partial` | `1` | `HEIF`, `AVIF`, `CR3` |
+| Modern container read-depth follow-through | Remaining visible read gaps are mostly here | `Partial` | `1` | `HEIF/AVIF`, `JXL` |
+| Long-tail native format semantics | Matters more against `ExifTool` than against `Exiv2` | `Partial` | `2-3` | `RAF`, `X3F`, `CRW/CIFF`, `Photoshop IRB` |
+| EXR target decision | Current EXR target is real but still architecturally bounded | `Partial` | `1` | `EXR` |
+| JUMBF / C2PA deeper semantics | Current support is bounded and useful, but not full trust-policy parity | `Partial` | `1-2` | `JPEG`, `PNG`, `WebP`, `JXL`, `BMFF` |
+| Full arbitrary metadata editing parity | Mature competitors expose a broader open-ended editor surface | `Missing` | Strategic / out of scope | Cross-cutting |
+
+### Format-Family Gap Map
+
+This map is intentionally coarse. It answers where the main remaining work
+still sits after the current public regression and writer-contract work.
+
+| Format family | Read parity | Transfer/write parity | Main remaining competitor gap |
+| --- | --- | --- | --- |
+| `JPEG` | `Strong` | `Strong` | needs continued compare-backed hardening and deeper mixed-bundle parity, not a new baseline writer |
+| `TIFF` | `Strong` | `Partial` | deeper rewrite guarantees for more existing-graph and tail-preservation cases |
+| `DNG` | `Strong` | `Partial` | more predictable preserve/merge behavior across target modes and raw `SubIFD` chains |
+| `PNG` | `Strong` | `Partial` | stable unmanaged-chunk preservation contract and broader compare-backed validation |
+| `WebP` | `Strong` | `Partial` | stable unrelated-chunk preservation contract and broader compare-backed validation |
+| `JP2` | `Strong` | `Partial` | stronger managed-box preservation guarantees and more roundtrip validation |
+| `JXL` | `Strong` on current lanes | `Partial` | more explicit box-preservation guarantees and deeper `brob` realtype follow-through |
+| `HEIF / AVIF / CR3` | `Strong` on tracked lanes | `Partial` | BMFF writer depth and deeper scene/relation semantics beyond the bounded current model |
+| `EXR` | `Bounded but real` | `Bounded but real` | still needs an explicit long-term decision between stable bounded target vs rewrite/edit path |
+| `RAF / X3F` | `Partial` | not a main writer lane | deeper native semantics beyond embedded-TIFF follow paths |
+| `CRW / CIFF` | `Partial` | bounded | legacy native depth still trails mature tools |
+| `Photoshop IRB` | `Partial` | bounded preservation | interpreted subset still smaller than mature tools |
+| `JUMBF / C2PA` | `Partial` | bounded | deeper semantics, trust-policy behavior, and signed rewrite parity remain out of scope |
+
+### Practical Readout
+
+If OpenMeta stops after the current writer-contract work, it can already argue
+that it is close to competitor parity on the main tracked still-image targets.
+
+To get materially closer to `Exiv2`, the remaining work is mostly:
+- finish stable writer guarantees for the first-class target family
+- finish the broader sync policy
+- harden compare-backed release validation
+- improve TIFF/DNG/BMFF rewrite trust
+
+To get materially closer to `ExifTool`, OpenMeta also needs:
+- more long-tail native format depth
+- broader general editing behavior
+- deeper `JUMBF/C2PA` semantics
+- a clearer answer for `EXR`
+
+### Execution Order
+
+Use this as the practical delivery order for the remaining parity work.
+
+Priority legend:
+- `Now`: should be in the next active delivery slice
+- `Next`: should start after the `Now` slice is stable
+- `Later`: important for broader parity, but not the next blocker
+
+| Work package | Priority | Why this order |
+| --- | --- | --- |
+| Public writer contract for primary targets | `Now` | This is the core trust gap that still keeps OpenMeta below mature writer parity |
+| General EXIF / IPTC / XMP sync engine | `Now` | This is still one of the biggest adoption blockers for general edit workflows |
+| Compare-backed release validation | `Now` | Parity claims remain weaker until compare-backed gates are release-facing instead of mostly API-facing |
+| TIFF / DNG deeper rewrite guarantees | `Now` | This is the highest-risk writer lane for serious still-image export confidence |
+| BMFF writer depth beyond current bounded contract | `Next` | `HEIF/AVIF/CR3` are already real targets, but the bounded writer model still needs more depth for stronger parity |
+| MakerNote rewrite trust | `Next` | Important for trust, but it benefits from the writer-contract and validation work landing first |
+| Modern container read-depth follow-through | `Next` | Visible gap, but less urgent than finishing the current writer baseline |
+| EXR target decision | `Next` | Needs an explicit product choice, but should follow the main writer-contract stabilization work |
+| Long-tail native format semantics | `Later` | Matters more for broad `ExifTool` parity than for the first still-image writer milestone |
+| JUMBF / C2PA deeper semantics | `Later` | Current bounded behavior is already useful; deeper trust semantics should wait until the core writer contract is stable |
+| Full arbitrary metadata editing parity | `Later` | Strategic follow-up, not part of the next parity-closing milestone |
+
+Suggested delivery sequence:
+1. Finish the stable writer contract for the first-class target family.
+2. Finish the broader sync-policy layer and compare-backed release validation.
+3. Harden the two highest-risk writer lanes: `TIFF/DNG` and bounded `BMFF`.
+4. Revisit MakerNote rewrite trust after the first-class writer contract is stable.
+5. Spend follow-up time on modern-container depth, `EXR`, and long-tail native semantics only after the main writer baseline is defendable.
+
+### Now Slice Implementation Board
+
+This board turns the current `Now` slice into concrete delivery checklists.
+
+The sync item here means the bounded next-slice policy completion needed for
+practical writer parity. It does not mean full arbitrary EXIF/IPTC/XMP sync
+parity across every workflow.
+
+#### 1. Public Writer Contract For Primary Targets
+
+- [ ] document final preserve-vs-replace behavior for existing embedded XMP on `TIFF`, `DNG`, `PNG`, `WebP`, `JP2`, `JXL`, and bounded `BMFF`
+- [ ] document final preserve-vs-replace behavior for destination sidecars across embedded-only, sidecar-only, and dual-write flows
+- [ ] lock explicit unmanaged-metadata preservation rules for unrelated chunks, boxes, items, and tails per target family
+- [ ] add compare-backed read-back gates for each first-class target instead of relying mainly on API-shape regression coverage
+- [ ] make CLI and Python surfaces describe the same writeback behavior and path-derivation rules as the C++ helper
+- [ ] reduce remaining target differences to documented limits instead of accidental implementation details
+
+#### 2. Bounded EXIF / IPTC / XMP Sync Layer
+
+- [ ] publish one final precedence table for source embedded XMP, destination embedded XMP, and destination sidecar XMP
+- [ ] lock conflict behavior for generated EXIF-to-XMP and IPTC-to-XMP projections when existing XMP is also present
+- [ ] lock canonical generated-XMP writeback behavior for embedded-only, sidecar-only, and dual-write flows
+- [ ] lock namespace preservation and canonicalization rules for managed vs unmanaged XMP content
+- [ ] add regression cases for mixed embedded-plus-sidecar destination states across the primary target family
+- [ ] document the explicit non-goals of this bounded sync layer so it is not confused with full arbitrary sync parity
+
+#### 3. Compare-Backed Release Validation
+
+- [ ] promote the current primary-target roundtrip checks into explicit release-facing compare gates
+- [ ] add compare-backed validation for `TIFF`, `DNG`, `PNG`, `WebP`, `JP2`, `JXL`, and bounded `BMFF` target outputs
+- [ ] cover embedded-only, sidecar-only, and dual-write XMP flows in release-facing compare validation
+- [ ] add compare-backed validation for explicit sidecar-base overrides and destination-sidecar cleanup behavior
+- [ ] gate the primary writer family on deterministic read-back of managed metadata after edit/apply
+- [ ] keep public parity claims tied to compare-backed evidence instead of only unit or smoke coverage
+
+#### 4. TIFF / DNG Deeper Rewrite Guarantees
+
+- [ ] lock rewrite guarantees for classic TIFF and BigTIFF root IFD, `ExifIFD`, preview chains, and bounded `SubIFD` replacement
+- [ ] lock explicit DNG behavior for `ExistingTarget`, `TemplateTarget`, and `MinimalFreshScaffold`
+- [ ] add compare-backed roundtrip gates for preview chains, raw `SubIFD` merge behavior, and `DNGVersion` persistence
+- [ ] document preserve-vs-replace guarantees for existing auxiliary IFD chains and downstream tails
+- [ ] harden read-back and rewrite tests around mixed existing metadata carriers on camera-originated TIFF/DNG files
+- [ ] define the bounded edge of TIFF/DNG rewrite depth clearly enough that hosts know what is guaranteed and what is not
+
+#### Done-When Readout
+
+- [ ] the first-class target family has one explicit public writer contract
+- [ ] the bounded sync-policy layer is documented and regression-gated
+- [ ] release-facing compare validation covers the main still-image writer set
+- [ ] `TIFF/DNG` rewrite guarantees are strong enough to stop being a primary parity blocker
+- [ ] the next work slice can move to bounded `BMFF` depth instead of still backfilling the writer baseline
+
 ## Postponed Work
 
 Still out of scope for the current milestone:

docs/sphinx/build.rst

Lines changed: 5 additions & 0 deletions
@@ -21,6 +21,11 @@ Install
 
 cmake --install build --prefix /opt/openmeta
 
+The exported CMake package is installed under
+``${CMAKE_INSTALL_LIBDIR}/cmake/OpenMeta``. On Unix this can be a multiarch
+path such as ``lib/x86_64-linux-gnu/cmake/OpenMeta`` when the install prefix is
+``/usr``.
+
 Options
 -------
