Commit 091027e

Add interpretation, concepts, and EXIF value names
Introduce structured metadata interpretation and cross-family concept resolution plus human-readable EXIF/TIFF/DNG numeric value names. New public APIs: openmeta/metadata_interpretation.h, openmeta/metadata_concepts.h, and openmeta/exif_value_names.h (with implementations and tests). Update metadata_query to add SourceProcessing match and margin support, extend vendor_raw_processing groups, and add LibRaw orientation conflict reporting in libraw_adapter. Wire new sources/tests into CMake and update docs/quick start, host integration, and stability pages to document the new helper contracts. These changes provide iterable interpretation records, bounded concept resolution for orientation/date/time/color/GPS, and stable labels for common enum-like numeric tag values to aid host UIs and transfer/inspection workflows.
1 parent 7e09289 commit 091027e

32 files changed

Lines changed: 4375 additions & 170 deletions

CHANGES.md

Lines changed: 18 additions & 0 deletions
@@ -97,6 +97,24 @@ Changes compared with `0.4.8`.
 interpretation helpers for user-facing labels, clockwise rotation degrees,
 mirrored-state checks, width/height-swap checks, and rotation-only fallbacks,
 plus matching thin Python wrappers.
+- Added `openmeta/exif_value_names.h` with stable labels for common
+EXIF/TIFF/DNG enum-like numeric values, plus matching thin Python wrappers.
+- Added `openmeta/metadata_interpretation.h` with query-backed structured
+interpretation records for host/UI code that wants normalized semantic
+records instead of raw query candidates.
+- Added `openmeta/metadata_concepts.h` with first bounded cross-family concept
+resolution for orientation, date/time, color/profile, and GPS candidates
+across EXIF, XMP, IPTC, ICC, and PNG text where applicable, including source
+families, candidate source entries, preferred entries, normalized compare
+keys, parsed date/time fields, combined GPS date/time candidates, and
+same-role conflict flags.
+- Semantic crop queries now expose canonical border-margin candidates for
+parseable border/padding XMP text, DNG masked-area candidates, and Phase
+One/Leaf geometry margins.
+- Vendor RAW/source-processing classification now distinguishes source-private
+preview, face-geometry, computational, thermal, and stitch/panorama buckets
+in addition to the existing color, white-balance, geometry, storage,
+lens-correction, raw-data, sensor, and private-table groups.
 - Added focused regression coverage for compatible-file versus rendered-image
 transfer safety: compatible mode keeps serializable source RAW/camera-specific
 metadata, while rendered mode drops source-specific metadata and uses

CMakeLists.txt

Lines changed: 6 additions & 0 deletions
@@ -339,6 +339,7 @@ set(OPENMETA_SOURCES
 src/openmeta/exif_makernote_samsung.cc
 src/openmeta/exif_makernote_sony.cc
 src/openmeta/exif_makernote_tag_names.cc
+src/openmeta/exif_value_names.cc
 src/openmeta/exr_adapter.cc
 src/openmeta/exr_decode.cc
 src/openmeta/exif_tag_names.cc
@@ -355,6 +356,8 @@ set(OPENMETA_SOURCES
 src/openmeta/libraw_adapter.cc
 src/openmeta/meta_key.cc
 src/openmeta/metadata_capabilities.cc
+src/openmeta/metadata_concepts.cc
+src/openmeta/metadata_interpretation.cc
 src/openmeta/metadata_query.cc
 src/openmeta/metadata_transfer.cc
 src/openmeta/meta_store.cc
@@ -762,12 +765,15 @@ if(OPENMETA_BUILD_TESTS)
 tests/exr_adapter_test.cc
 tests/exr_decode_test.cc
 tests/exif_tiff_decode_test.cc
+tests/exif_value_names_test.cc
 tests/geotiff_decode_test.cc
 tests/iptc_iim_decode_test.cc
 tests/interop_export_test.cc
 tests/dji_app4_decode_test.cc
 tests/makernote_decode_test.cc
 tests/metadata_capabilities_test.cc
+tests/metadata_concepts_test.cc
+tests/metadata_interpretation_test.cc
 tests/metadata_query_test.cc
 tests/metadata_transfer_api_test.cc
 tests/flir_fff_decode_test.cc

docs/api_stability.md

Lines changed: 4 additions & 1 deletion
@@ -23,7 +23,10 @@ different status.
 | `ExportNameStyle::Canonical` and `ExportNameStyle::XmpPortable` | `openmeta/interop_export.h` | Stable | Stable naming modes for key-space-aware and portable exports. |
 | `ExportNameStyle::FlatHost` | `openmeta/interop_export.h` | Stable | Stable v1 flat host naming contract. See [flat_host_mapping.md](flat_host_mapping.md). |
 | EXIF/TIFF orientation helpers: `interpret_exif_orientation(...)`, `exif_orientation_name(...)`, `exif_orientation_rotation_degrees_cw(...)`, `exif_orientation_rotation_only(...)` | `openmeta/orientation.h` | Stable | Small utility contract for user-facing orientation labels, clockwise rotation degrees, mirrored-state detection, dimension-swap detection, and rotation-only fallbacks. Python exposes the same helpers through thin scalar/dictionary wrappers. |
-| Semantic metadata query: `query_metadata(...)`, `query_crop_metadata(...)`, focused query helpers, and `metadata_query_fuzzy_search_available()` | `openmeta/metadata_query.h` | Experimental | Query contract for inspection matches plus normalized candidates. Current coverage includes crop/active-area, exposure/gain, white balance, color, lens correction, orientation, and RAW-processing metadata across standard tags, selected DNG tags, fuzzy XMP paths, and vendor RAW-processing classification. Matches report `exact_match`, `fuzzy_match`, and `fuzzy_score` so tools can label exact results separately from RapidFuzz near-miss hits. `OPENMETA_ENABLE_RAPIDFUZZ=ON` adds optional near-miss XMP/property-path scoring. Grouped candidates include `matrix_set`, `vector_set`, and `table` shapes for related non-crop metadata, including RAW black/white levels, linearization, CFA/sensor layout, source geometry, and raw-storage identifiers. Python `Document` and `TransferSourceSnapshot` mirror this as thin dictionary-returning wrappers. |
+| EXIF/TIFF/DNG numeric value names: `exif_tag_numeric_value_name(...)` and focused helpers | `openmeta/exif_value_names.h` | Stable | Small helper contract for common enum-like TIFF/EXIF/DNG numeric values such as compression, photometric interpretation, planar configuration, exposure program, metering mode, light source, flash, color space, white balance, scene capture type, gain control, CFA layout, and DNG calibration illuminants. Unknown values return an empty string and remain lossless numeric metadata. |
+| Semantic metadata query: `query_metadata(...)`, `query_crop_metadata(...)`, focused query helpers, and `metadata_query_fuzzy_search_available()` | `openmeta/metadata_query.h` | Experimental | Query contract for inspection matches plus normalized candidates. Current coverage includes crop/active-area/border margins, exposure/gain, white balance, color, lens correction, orientation, and RAW/source-processing metadata across standard tags, selected DNG tags, fuzzy XMP paths, and vendor RAW-processing classification. Matches report `exact_match`, `fuzzy_match`, and `fuzzy_score` so tools can label exact results separately from RapidFuzz near-miss hits. `OPENMETA_ENABLE_RAPIDFUZZ=ON` adds optional near-miss XMP/property-path scoring. Grouped candidates include `matrix_set`, `vector_set`, and `table` shapes for related non-crop metadata, including RAW black/white levels, linearization, CFA/sensor layout, source geometry, raw-storage identifiers, and source-processing buckets. Python `Document` and `TransferSourceSnapshot` mirror this as thin dictionary-returning wrappers. |
+| Structured metadata interpretation records: `interpret_metadata(...)`, `interpret_metadata_query(...)` | `openmeta/metadata_interpretation.h` | Experimental | Thin structured projection over semantic query candidates. Records carry query class, semantic kind, normalized shape, confidence, source entry ids, and normalized origin/size/rect/margins/value arrays where available. Current scope covers orientation, geometry/crop/border, exposure/gain, color/white-balance, lens-correction, and RAW/source-processing records. Python `Document` and `TransferSourceSnapshot` expose matching dictionary wrappers. |
+| Cross-family concept resolution: `resolve_metadata_concepts(...)`, `resolve_metadata_concept(...)` | `openmeta/metadata_concepts.h` | Experimental | First bounded resolver for duplicated host-facing concepts. Current scope reports candidates, candidate source entries, source families, preferred entries, normalized numeric/text keys, normalized date/time fields, and same-role conflicts for orientation, date/time, color/profile, and GPS evidence across EXIF, XMP, IPTC, ICC, and PNG text where applicable. GPS date/time is combined from `GPSDateStamp` plus `GPSTimeStamp` when both entries exist. It is intended for inspection UI and host policy decisions; it does not rewrite metadata or hide ambiguity. Python `Document` and `TransferSourceSnapshot` expose matching dictionary wrappers. |
 | Vendor RAW-processing summaries: `vendor_raw_processing_from_store(...)`, `classify_vendor_raw_processing_field(...)` | `openmeta/vendor_raw_processing.h` | Experimental | Conservative grouped source-RAW/source-processing field summaries for decoded Sony, Canon, Nikon, Fujifilm, Pentax, Panasonic, Olympus, Kodak, Minolta, Sigma, Samsung, Ricoh, Apple, DJI, Google, FLIR, Casio, Sanyo, KyoceraRaw, Reconyx, HP, JVC, GE, Motorola, Nintendo, and Microsoft MakerNotes, including vendor-private, computational, thermal, preview, face-geometry, stitch/panorama, Apple computational capture/HDR/motion, DJI pose/thermal, Google HDR+/shot-log, and FLIR radiometric/raw-value buckets. Intended for audit/UI and rendered-transfer safety decisions, not for writing vendor RAW/source-processing values into rendered targets. |
 | Transfer safety audit: `transfer_safety_audit_from_store(...)` | `openmeta/metadata_transfer.h` | Experimental | Preflight summary of source entries and entries filtered or invalidated by `TransferSafetyMode`, including Sony/Canon/Nikon/Fujifilm/Pentax/Panasonic/Olympus/Kodak/Minolta/Sigma/Samsung/Ricoh/Apple/DJI/Google/FLIR/Casio/Sanyo/KyoceraRaw/Reconyx/HP/JVC/GE/Motorola/Nintendo/Microsoft RAW/source-processing buckets. Intended for diagnostics and host UI before preparing rendered-image transfers. |
 | Raw-carrier passthrough audit: `raw_carrier_passthrough_audit_from_snapshot(...)` | `openmeta/metadata_transfer.h` | Experimental | Diagnostic preflight for opt-in raw carriers. Reports candidate carriers and primary block reasons such as missing payload, target incompatibility, safety filtering, content-bound C2PA, explicit profile policy, missing decoded-entry links, or unsupported carrier kind. Hosts can call it directly before enabling snapshot passthrough. |
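
The value-name row in the hunk above states that unknown values return an empty string and stay lossless numeric metadata. A minimal Python sketch of that contract follows. It is not OpenMeta's API: `VALUE_NAMES`, `value_name`, and `display_value` are hypothetical names, and the label strings are illustrative. Only the tag ids and numeric codes (TIFF `Compression` 0x0103, EXIF `MeteringMode` 0x9207) come from the TIFF/EXIF specifications.

```python
# Hypothetical table keyed by (tag id, numeric value). The labels are
# illustrative and are not the strings OpenMeta actually returns.
VALUE_NAMES = {
    (0x0103, 1): "Uncompressed",            # TIFF Compression
    (0x0103, 5): "LZW",
    (0x0103, 7): "JPEG",
    (0x9207, 2): "CenterWeightedAverage",   # EXIF MeteringMode
    (0x9207, 3): "Spot",
    (0x9207, 5): "Pattern",
}


def value_name(tag, value):
    """Return a label for a known (tag, value) pair, or "" when unknown."""
    return VALUE_NAMES.get((tag, value), "")


def display_value(tag, value):
    """Format a value for a UI while always keeping the original number."""
    name = value_name(tag, value)
    return f"{value} ({name})" if name else str(value)
```

Because the empty string signals "no label", a host can fall back to the raw number without losing information, which is the behaviour `display_value` demonstrates.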

docs/development.md

Lines changed: 19 additions & 4 deletions
@@ -18,7 +18,7 @@ model should stay compact:
 | --- | --- | --- |
 | Decoding | Find metadata carriers and decode EXIF, XMP, IPTC, ICC, Photoshop IRB, JUMBF/C2PA, EXR, and related blocks into `MetaStore` entries. | High, about 98-100% for the current target scope. |
 | Interpretation | Normalize names and values, group entries by meaning, and classify source-bound data such as RAW crop, color, lens-correction, sensor, and vendor-private fields. | Medium-high, about 80%. |
-| Query | Find entries by name, fuzzy term, or semantic group, for example crop/border/active-area, exposure/gain, color/WB, orientation, lens-correction, and RAW-processing fields across standard and vendor metadata. | Low, about 25-30%. |
+| Query | Find entries by name, fuzzy term, or semantic group, then expose normalized query candidates, structured interpretation records, and bounded cross-family concept resolutions for crop/border/active-area, exposure/gain, color/WB, orientation, date/time, GPS, lens-correction, and RAW/source-processing fields across standard and vendor metadata. | Medium, about 50-60%. |
 | Creation | Build fresh metadata entries from host-provided values. | Medium, about 55-65%. |
 | Editing | Modify existing logical metadata entries while preserving valid surrounding structure. | Medium, about 60-70%. |
 | Transfer | Move metadata between files using explicit compatible-file or rendered-image safety policies. | Medium-high, about 80-85%. |
@@ -41,13 +41,15 @@ RAW-processing queries.
 Crop queries include DNG crop tags, `ActiveArea`, Phase One/Leaf raw geometry,
 and fuzzy crop/border-style XMP property paths. The non-crop queries expose
 per-entry value candidates and reuse standard tag names, selected DNG tags,
-fuzzy XMP paths, and vendor RAW-processing classification where applicable.
+fuzzy XMP paths, canonical border-margin parsing, and vendor RAW-processing
+classification where applicable.
 They also append grouped candidates for related DNG color matrix/calibration/
 reduction/forward matrix tags, DNG white-balance vector tags, and
 lens-correction table groups. RAW-processing queries add conservative groups
 for black/white levels, linearization tables, CFA/sensor layout, source
-geometry, and raw-storage identifiers. Grouped candidates use `matrix_set`,
-`vector_set`, and `table` value shapes. When
+geometry, raw-storage identifiers, and source-private processing buckets.
+Grouped candidates use `matrix_set`, `vector_set`, and `table` value shapes.
+When
 `OPENMETA_ENABLE_RAPIDFUZZ=ON`, the same query helpers also use RapidFuzz to
 score near-miss XMP/property paths; default builds keep the deterministic
 substring/tag matcher only. Each raw match reports `exact_match`,
@@ -56,6 +58,19 @@ matches from near-miss search hits.
 Python `Document` and `TransferSourceSnapshot` mirror this as thin wrappers
 returning the same match/candidate dictionary shape.
 
+For code that wants an iterable semantic record stream instead of raw query
+matches, use `openmeta/metadata_interpretation.h`. It projects query candidates
+into records with query class, semantic kind, normalized shape, confidence,
+source entries, and normalized geometry/value arrays where available.
+
+For cross-family duplicated concepts, use `openmeta/metadata_concepts.h`.
+It currently resolves orientation, date/time, color/profile, and GPS into
+candidate lists with candidate source entries, source families, preferred
+entries, normalized compare keys, parsed date/time fields, and same-role
+conflict flags. This is deliberately an inspection/policy surface; host code
+still decides whether a conflict should be shown, ignored, or corrected during
+editing/transfer.
+
 ## Build Prerequisites
 
 - CMake `>= 3.20`
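
The concept-resolution paragraph added in the hunk above mentions candidate lists, source families, preferred entries, normalized compare keys, and same-role conflict flags. The following Python sketch shows one way those pieces could fit together. Everything in it is an assumption for illustration: `Candidate`, `resolve`, the role names, and especially the family-order preference rule are hypothetical and are not taken from `openmeta/metadata_concepts.h`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    family: str       # source family, e.g. "exif" or "xmp"
    role: str         # hypothetical role label, e.g. "primary"
    entry_id: int     # id of the source entry the value came from
    compare_key: str  # normalized value used only for comparison


def resolve(candidates, family_order=("exif", "xmp", "iptc")):
    """Group candidates by role, pick a preferred entry, and flag conflicts.

    The preference rule (first family in family_order wins) is an assumed
    host policy, not the library's documented behaviour.
    """
    def rank(candidate):
        if candidate.family in family_order:
            return family_order.index(candidate.family)
        return len(family_order)

    by_role = {}
    for candidate in candidates:
        by_role.setdefault(candidate.role, []).append(candidate)

    resolved = {}
    for role, group in by_role.items():
        resolved[role] = {
            "preferred": min(group, key=rank),
            "families": sorted({c.family for c in group}),
            # Same-role conflict: candidates disagree after normalization.
            "conflict": len({c.compare_key for c in group}) > 1,
        }
    return resolved
```

The sketch only reports the disagreement and leaves every candidate in place, in line with the documented intent that host code decides whether a conflict is shown, ignored, or corrected.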

docs/host_integration.md

Lines changed: 23 additions & 0 deletions
@@ -31,7 +31,10 @@ Use the narrowest public API that matches your host:
 | EXR writer | `build_exr_attribute_batch_from_file(...)` |
 | Host-owned metadata object model | `visit_metadata(...)` |
 | Host metadata inspection/search UI | `openmeta/metadata_query.h` focused query helpers |
+| Structured interpreted metadata records | `openmeta/metadata_interpretation.h` |
+| Cross-family concept conflicts | `openmeta/metadata_concepts.h` |
 | User-facing orientation display | `openmeta/orientation.h` |
+| Common EXIF/TIFF/DNG value labels | `openmeta/exif_value_names.h` |
 | JPEG/JXL/WebP/PNG/JP2/BMFF encoder path | `prepare_metadata_for_target_file(...)` + adapter view or backend emitter |
 | Adobe DNG SDK objects/files | `dng_sdk_adapter.h` |
 
@@ -40,6 +43,19 @@ building a separate fuzzy layer. They report source entries, confidence, value
 shape, exact/fuzzy match provenance, and normalized candidates while preserving
 ambiguity.
 
+For host code that wants a simpler iterable result, use
+`metadata_interpretation.h`. It keeps the same semantic vocabulary as query but
+returns structured records with query class, normalized shape, source entries,
+confidence, and normalized geometry/value arrays.
+
+For host code that needs to reconcile duplicated concepts across metadata
+families, use `metadata_concepts.h`. It reports orientation, date/time,
+color/profile, and GPS candidates with source families, preferred entries, and
+same-role conflict flags. Date/time candidates include parsed date/time fields
+when the source value is recognizable, and GPS timestamps combine
+`GPSDateStamp` with `GPSTimeStamp` when both are present. Treat this as an
+inspection and policy input rather than an automatic metadata rewrite decision.
+
 ## Adapter Classes
 
 OpenMeta splits host integration surfaces deliberately:
@@ -57,6 +73,13 @@ OpenMeta splits host integration surfaces deliberately:
 - orientation utility:
 `orientation.h` for EXIF/TIFF labels, rotation degrees, mirrored-state
 checks, and width/height-swap checks
+- value-name utility:
+`exif_value_names.h` for common EXIF/TIFF/DNG enum-style numeric labels
+- structured interpretation utility:
+`metadata_interpretation.h` for query-backed semantic records
+- concept-resolution utility:
+`metadata_concepts.h` for cross-family orientation, date/time, color/profile,
+and GPS conflict inspection
 
 ## 1. Read Into `MetaStore`
 
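
The host-integration text above says GPS timestamps combine `GPSDateStamp` with `GPSTimeStamp` when both are present. As a rough illustration of that rule, the Python sketch below assumes the EXIF encodings (`GPSDateStamp` as a `YYYY:MM:DD` string, `GPSTimeStamp` as three rationals for hour, minute, and second in UTC). The function name `combine_gps_datetime` and the `(numerator, denominator)` tuple representation are hypothetical and do not reflect how OpenMeta implements or exposes this.

```python
from datetime import datetime, timezone
from fractions import Fraction


def combine_gps_datetime(date_stamp, time_stamp):
    """Combine EXIF-style GPS date and time values into one UTC datetime.

    date_stamp: "YYYY:MM:DD" string, or None when the entry is absent.
    time_stamp: three (numerator, denominator) pairs, or None when absent.
    Returns None when either entry is missing or cannot be parsed.
    Fractional hours or minutes are truncated in this sketch.
    """
    if date_stamp is None or time_stamp is None:
        return None
    try:
        year, month, day = (int(part) for part in date_stamp.strip().split(":"))
        hours, minutes, seconds = (Fraction(n, d) for n, d in time_stamp)
        whole_seconds = int(seconds)
        microseconds = int((seconds - whole_seconds) * 1_000_000)
        return datetime(year, month, day, int(hours), int(minutes),
                        whole_seconds, microseconds, tzinfo=timezone.utc)
    except (ValueError, TypeError, ZeroDivisionError):
        return None
```

Returning `None` instead of guessing keeps a missing or malformed half from producing a misleading combined candidate; a host could still show the individual entries separately.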
