Commit 13584ce

Add transfer hints to metadata concepts
Introduce transfer hint support for metadata concept candidates to help hosts distinguish safe, source-bound, rendered-unsafe, and target-image-spec-dependent values. Adds MetadataConceptTransferHint enum and new fields (transfer_hint, compatible_file_safe, rendered_image_safe, requires_target_image_spec, source_bound) to MetadataConceptCandidate, plus metadata_concept_transfer_hint_name(). Implements assign_transfer_hint logic and sets hints during concept resolution. Exposes hints and booleans in the Python bindings and adds the corresponding enum. Expands metadata query and vendor RAW-processing term matching to cover more synonyms and source-processing cases. Adds focused tests, updates docs to describe transfer hints, and bumps version to 0.4.10 with CHANGES.md entry.
1 parent 73782e3 commit 13584ce
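
As a rough illustration of how a host might consume the new fields before a rendered-image transfer, the sketch below partitions candidate dictionaries by `rendered_image_safe`. It does not call OpenMeta: the four hint names and the boolean field names come from the commit message, but the assumption that Python dictionary keys match the C++ field names, the `concept` key, the helper name `split_for_rendered_transfer`, and the sample values are all hypothetical.

```python
# Hint names as listed in the commit message and docs.
KNOWN_HINTS = {
    "safe",
    "source_bound",
    "rendered_unsafe",
    "requires_target_image_spec",
}


def split_for_rendered_transfer(candidates):
    """Partition candidate dicts into (transferable, held_back).

    Hypothetical host-side helper: a candidate is kept only when its hint
    is one of the known names and its rendered_image_safe flag is true.
    Anything else, including unknown hints, is held back for review.
    """
    transferable, held_back = [], []
    for candidate in candidates:
        hint = candidate.get("transfer_hint")
        rendered_ok = candidate.get("rendered_image_safe", False)
        if hint in KNOWN_HINTS and rendered_ok:
            transferable.append(candidate)
        else:
            held_back.append(candidate)
    return transferable, held_back


# Hand-written sample data, not real library output. The "concept" key
# and the specific hint assigned to each concept are assumptions.
SAMPLE = [
    {"concept": "date_time", "transfer_hint": "safe",
     "compatible_file_safe": True, "rendered_image_safe": True},
    {"concept": "raw_processing", "transfer_hint": "source_bound",
     "compatible_file_safe": True, "rendered_image_safe": False},
    {"concept": "geometry", "transfer_hint": "requires_target_image_spec",
     "compatible_file_safe": True, "rendered_image_safe": False},
]

transferable, held_back = split_for_rendered_transfer(SAMPLE)
print("transfer:", [c["concept"] for c in transferable])
print("hold back:", [(c["concept"], c["transfer_hint"]) for c in held_back])
```

Treating unknown hints as held back keeps the sketch conservative if later versions add hint values; a real host would apply its own policy.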

21 files changed

Lines changed: 603 additions & 70 deletions

CHANGES.md

Lines changed: 24 additions & 0 deletions
@@ -1,5 +1,29 @@
 # OpenMeta Changes

+## 0.4.10 - 2026-05-15
+
+Changes compared with `0.4.9`.
+
+### Added
+
+- Added host-facing transfer hints to metadata concept candidates so callers
+can distinguish generally safe concepts from source-bound, rendered-unsafe,
+and target-image-spec-dependent metadata before transfer.
+- Python concept dictionaries now expose the same transfer hint, compatible-file
+safety, rendered-image safety, source-bound, and target-image-spec fields.
+- Added focused tests for concept transfer hints, color/geometry conflict
+reporting, compatible-file preservation, rendered-image filtering, and
+transfer-critical MakerNote classification.
+
+### Changed
+
+- Expanded transfer-critical MakerNote/source-processing classification for
+RAW crop/active-area/border names, source color transforms and WB terms, lens
+correction/shading/distortion terms, raw black/white/linearization terms, and
+multi-frame/computational capture state.
+- Updated the public interpretation and host-integration docs to describe the
+concept transfer hints and current interpretation readiness.
+
 ## 0.4.9 - 2026-04-28

 Changes compared with `0.4.8`.

VERSION

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-0.4.9
+0.4.10

docs/api_stability.md

Lines changed: 1 addition & 1 deletion
@@ -26,7 +26,7 @@ different status.
 | EXIF/TIFF/DNG numeric value names: `exif_tag_numeric_value_name(...)` and focused helpers | `openmeta/exif_value_names.h` | Stable | Small helper contract for common enum-like TIFF/EXIF/DNG numeric values such as compression, photometric interpretation, planar configuration, exposure program, metering mode, light source, flash, color space, white balance, scene capture type, gain control, CFA layout, and DNG calibration illuminants. Unknown values return an empty string and remain lossless numeric metadata. |
 | Semantic metadata query: `query_metadata(...)`, `query_crop_metadata(...)`, focused query helpers, and `metadata_query_fuzzy_search_available()` | `openmeta/metadata_query.h` | Experimental | Query contract for inspection matches plus normalized candidates. Current coverage includes crop/active-area/border margins, exposure/gain, white balance, color, lens correction, orientation, and RAW/source-processing metadata across standard tags, selected DNG tags, fuzzy XMP paths, and vendor RAW-processing classification. Matches report `exact_match`, `fuzzy_match`, and `fuzzy_score` so tools can label exact results separately from RapidFuzz near-miss hits. `OPENMETA_ENABLE_RAPIDFUZZ=ON` adds optional near-miss XMP/property-path scoring. Grouped candidates include `matrix_set`, `vector_set`, and `table` shapes for related non-crop metadata, including RAW black/white levels, linearization, CFA/sensor layout, source geometry, raw-storage identifiers, and source-processing buckets. Python `Document` and `TransferSourceSnapshot` mirror this as thin dictionary-returning wrappers. |
 | Structured metadata interpretation records: `interpret_metadata(...)`, `interpret_metadata_query(...)` | `openmeta/metadata_interpretation.h` | Experimental | Thin structured projection over semantic query candidates. Records carry query class, semantic kind, normalized shape, confidence, source entry ids, and normalized origin/size/rect/margins/value arrays where available. Current scope covers orientation, geometry/crop/border, exposure/gain, color/white-balance, lens-correction, and RAW/source-processing records. Python `Document` and `TransferSourceSnapshot` expose matching dictionary wrappers. |
-| Cross-family concept resolution: `resolve_metadata_concepts(...)`, `resolve_metadata_concept(...)` | `openmeta/metadata_concepts.h` | Experimental | First bounded resolver for duplicated host-facing concepts. Current scope reports candidates, candidate source entries, source families, preferred entries, normalized numeric/text keys, full normalized value vectors, normalized date/time fields, date/time precision, timezone kind, normalized geometry fields, and same-role conflicts for orientation, date/time, color/profile, GPS, geometry, lens-correction, and RAW-processing evidence across EXIF, XMP, IPTC, ICC, PNG text, and query-backed interpretation records where applicable. Geometry candidates cover crop, active area, border, and sensor geometry with canonical origin, size, rect, and margin fields when available. Color/white-balance, lens-correction, and RAW-processing concepts preserve grouped matrix/vector/table values for host inspection; they do not make source-bound values safe to serialize into rendered targets. GPS date/time is combined from `GPSDateStamp` plus `GPSTimeStamp` when both entries exist, and GPS altitude candidates expose altitude-reference code plus below-sea-level state when reference metadata is present. It is intended for inspection UI and host policy decisions; it does not rewrite metadata or hide ambiguity. Python `Document` and `TransferSourceSnapshot` expose matching dictionary wrappers. |
+| Cross-family concept resolution: `resolve_metadata_concepts(...)`, `resolve_metadata_concept(...)` | `openmeta/metadata_concepts.h` | Experimental | First bounded resolver for duplicated host-facing concepts. Current scope reports candidates, candidate source entries, source families, preferred entries, normalized numeric/text keys, full normalized value vectors, transfer hints, normalized date/time fields, date/time precision, timezone kind, normalized geometry fields, and same-role conflicts for orientation, date/time, color/profile, GPS, geometry, lens-correction, and RAW-processing evidence across EXIF, XMP, IPTC, ICC, PNG text, and query-backed interpretation records where applicable. Geometry candidates cover crop, active area, border, and sensor geometry with canonical origin, size, rect, and margin fields when available. Candidate transfer hints distinguish `safe`, `source_bound`, `rendered_unsafe`, and `requires_target_image_spec` evidence, with compatible-file and rendered-image safety booleans. Color/white-balance, lens-correction, and RAW-processing concepts preserve grouped matrix/vector/table values for host inspection; they do not make source-bound values safe to serialize into rendered targets. GPS date/time is combined from `GPSDateStamp` plus `GPSTimeStamp` when both entries exist, and GPS altitude candidates expose altitude-reference code plus below-sea-level state when reference metadata is present. It is intended for inspection UI and host policy decisions; it does not rewrite metadata or hide ambiguity. Python `Document` and `TransferSourceSnapshot` expose matching dictionary wrappers. |
 | Vendor RAW-processing summaries: `vendor_raw_processing_from_store(...)`, `classify_vendor_raw_processing_field(...)` | `openmeta/vendor_raw_processing.h` | Experimental | Conservative grouped source-RAW/source-processing field summaries for decoded Sony, Canon, Nikon, Fujifilm, Pentax, Panasonic, Olympus, Kodak, Minolta, Sigma, Samsung, Ricoh, Apple, DJI, Google, FLIR, Casio, Sanyo, KyoceraRaw, Reconyx, HP, JVC, GE, Motorola, Nintendo, and Microsoft MakerNotes, including vendor-private, computational, thermal, preview, face-geometry, stitch/panorama, Apple computational capture/HDR/motion, DJI pose/thermal, Google HDR+/shot-log, pixel-shift/multi-shot/composite/auto-lighting source processing, and FLIR radiometric/raw-value buckets. Intended for audit/UI and rendered-transfer safety decisions, not for writing vendor RAW/source-processing values into rendered targets. |
 | Transfer safety audit: `transfer_safety_audit_from_store(...)` | `openmeta/metadata_transfer.h` | Experimental | Preflight summary of source entries and entries filtered or invalidated by `TransferSafetyMode`, including Sony/Canon/Nikon/Fujifilm/Pentax/Panasonic/Olympus/Kodak/Minolta/Sigma/Samsung/Ricoh/Apple/DJI/Google/FLIR/Casio/Sanyo/KyoceraRaw/Reconyx/HP/JVC/GE/Motorola/Nintendo/Microsoft RAW/source-processing buckets. Intended for diagnostics and host UI before preparing rendered-image transfers. |
 | Raw-carrier passthrough audit: `raw_carrier_passthrough_audit_from_snapshot(...)` | `openmeta/metadata_transfer.h` | Experimental | Diagnostic preflight for opt-in raw carriers. Reports candidate carriers and primary block reasons such as missing payload, target incompatibility, safety filtering, content-bound C2PA, explicit profile policy, missing decoded-entry links, or unsupported carrier kind. Hosts can call it directly before enabling snapshot passthrough. |

docs/development.md

Lines changed: 6 additions & 6 deletions
@@ -17,8 +17,8 @@ model should stay compact:
 | Area | Purpose | Readiness |
 | --- | --- | --- |
 | Decoding | Find metadata carriers and decode EXIF, XMP, IPTC, ICC, Photoshop IRB, JUMBF/C2PA, EXR, and related blocks into `MetaStore` entries. | High, about 98-100% for the current target scope. |
-| Interpretation | Normalize names and values, group entries by meaning, and classify source-bound data such as RAW crop, color, lens-correction, sensor, and vendor-private fields. | Medium-high, about 82%. |
-| Query | Find entries by name, fuzzy term, or semantic group, then expose normalized query candidates, structured interpretation records, and bounded cross-family concept resolutions for crop/border/active-area, exposure/gain, color/WB, orientation, date/time, GPS, lens-correction, and RAW/source-processing fields across standard and vendor metadata. | Medium, about 63-70%. |
+| Interpretation | Normalize names and values, group entries by meaning, and classify source-bound data such as RAW crop, color, lens-correction, sensor, computational capture state, and vendor-private fields. | Medium-high, about 83%. |
+| Query | Find entries by name, fuzzy term, or semantic group, then expose normalized query candidates, structured interpretation records, bounded cross-family concept resolutions, transfer hints, and conflict flags for crop/border/active-area, exposure/gain, color/WB, orientation, date/time, GPS, lens-correction, and RAW/source-processing fields across standard and vendor metadata. | Medium, about 66-72%. |
 | Creation | Build fresh metadata entries from host-provided values. | Medium, about 55-65%. |
 | Editing | Modify existing logical metadata entries while preserving valid surrounding structure. | Medium, about 60-70%. |
 | Transfer | Move metadata between files using explicit compatible-file or rendered-image safety policies. | Medium-high, about 80-85%. |
@@ -69,10 +69,10 @@ lens-correction, and RAW-processing into candidate lists with candidate source
 entries, source families, preferred entries, normalized compare keys, parsed
 date/time fields, date/time precision, timezone kind, GPS altitude-reference
 state, canonical geometry origin/size/rect/margins, full normalized value
-vectors for grouped matrix/vector/table records, and same-role conflict flags.
-This is deliberately an inspection/policy surface; host code still decides
-whether a conflict should be shown, ignored, or corrected during
-editing/transfer.
+vectors for grouped matrix/vector/table records, transfer hints, compatible and
+rendered safety booleans, and same-role conflict flags. This is deliberately
+an inspection/policy surface; host code still decides whether a conflict should
+be shown, ignored, or corrected during editing/transfer.

 ## Build Prerequisites

docs/host_integration.md

Lines changed: 4 additions & 1 deletion
@@ -62,7 +62,10 @@ precision and timezone-kind fields. GPS timestamps combine `GPSDateStamp` with
 whether `GPSAltitudeRef` marked the height as below sea level. Treat this as an
 inspection and policy input rather than an automatic metadata rewrite decision;
 source-bound color, lens, and RAW-processing values still need rendered-transfer
-safety filtering.
+safety filtering. Each candidate also carries a transfer hint:
+`safe`, `source_bound`, `rendered_unsafe`, or
+`requires_target_image_spec`, plus `compatible_file_safe` and
+`rendered_image_safe` booleans for host UI and preflight policy.

 ## Adapter Classes

docs/interpretation_status.md

Lines changed: 7 additions & 8 deletions
@@ -5,7 +5,7 @@ meaningful interpretation. Interpretation means that decoded entries have
 stable names, typed values, semantic groups, query shapes, and transfer-safety
 classification that host applications can use directly.

-Current overall status: **medium-high, about 82%** for the public target scope.
+Current overall status: **medium-high, about 83%** for the public target scope.
 This is intentionally lower than decode coverage. Decode parity only proves
 that metadata carriers and entries are visible; interpretation also requires
 human-readable meaning and safe cross-format behavior.
@@ -35,12 +35,12 @@ explicit outcome:
 | IPTC-IIM and portable XMP | IPTC datasets and XMP properties decode into typed entries, and bounded EXIF/IPTC-to-XMP projection exists for transfer/writeback. | Medium-high, about 75-85%. | Full MWG-style reconciliation of duplicated EXIF/XMP/IPTC concepts is still bounded. |
 | Orientation | EXIF/TIFF orientation query, LibRaw flip mapping, generic orientation helpers for index, rotation degrees, mirrored state, dimension swap, rotation-only fallback, human-readable labels, and EXIF-vs-XMP conflict reporting in the LibRaw bridge. | High, about 90-95%. | Higher-level policy for resolving container and host pixel-orientation state remains host-specific. |
 | Geometry, crop, active area, and borders | DNG crop/active-area/masked-area tags, Phase One/Leaf geometry, canonical border margins, vendor RAW-processing geometry buckets, and fuzzy crop/border-style paths are queryable. | Medium-high, about 85-90%. | More vendor-specific normalized rectangles and stronger output contracts for ambiguous multi-tag geometry. |
-| Color, white balance, and matrices | DNG color/calibration/reduction/forward matrix groups, white-balance vector groups, ICC metadata, RAW color/source-processing safety buckets, and cross-family concept candidates with full grouped value vectors are identified. | Medium-high, about 78-86%. | Deeper camera/vendor color science interpretation is intentionally conservative, especially for rendered-image transfer. |
-| Lens correction and RAW processing | Lens-correction groups, black/white levels, linearization, CFA/sensor layout, raw-storage identifiers, vendor RAW/source-processing buckets, and concept candidates with grouped table/vector values are classified for query and transfer safety. | Medium-high, about 74-82%. | Long-tail per-model correction tables and richer numeric normalization. |
-| Vendor MakerNotes | Broad MakerNote naming and source-processing classification exists for common vendors and several live computational/thermal vendors. Unknown entries remain lossless and source-private subgroups distinguish preview, face geometry, computational, thermal, stitch/panorama, pixel-shift, multi-shot, composite, and auto-lighting processing data. | Medium-high, about 77-86%. | ExifTool-style long-tail print conversions, encrypted/custom settings, and per-model private tables. |
+| Color, white balance, and matrices | DNG color/calibration/reduction/forward matrix groups, white-balance vector groups, ICC metadata, RAW color/source-processing safety buckets, transfer hints, and cross-family concept candidates with full grouped value vectors are identified. | Medium-high, about 79-87%. | Deeper camera/vendor color science interpretation is intentionally conservative, especially for rendered-image transfer. |
+| Lens correction and RAW processing | Lens-correction groups, black/white levels, linearization, CFA/sensor layout, raw-storage identifiers, vendor RAW/source-processing buckets, transfer hints, and concept candidates with grouped table/vector values are classified for query and transfer safety. | Medium-high, about 76-84%. | Long-tail per-model correction tables and richer numeric normalization. |
+| Vendor MakerNotes | Broad MakerNote naming and source-processing classification exists for common vendors and several live computational/thermal vendors. Unknown entries remain lossless and source-private subgroups distinguish preview, face geometry, computational, thermal, stitch/panorama, pixel-shift, multi-shot, composite, auto-lighting, RAW crop/active-area, source color-transform, lens-correction, and raw-level processing data. | Medium-high, about 79-87%. | ExifTool-style long-tail print conversions, encrypted/custom settings, and per-model private tables. |
 | BMFF item graph, HEIF/AVIF/CR3, JUMBF, and C2PA | BMFF derived fields, item-info rows, bounded relations, primary-linked roles, aux semantics, and draft C2PA/JUMBF structural fields are exposed. | Medium, about 60-70%. | Full BMFF scene modeling and full C2PA manifest/policy semantics. |
 | Photoshop IRB | Raw resources are preserved and a bounded interpreted subset is decoded for fixed-layout resources. | Medium, about 60-70%. | Broader resource-specific interpretation. |
-| Semantic query/search and records | Query helpers expose raw matches, confidence, provenance, value shapes, normalized candidates, canonical crop/active-area rectangles, border margins, source-processing buckets, optional RapidFuzz near-miss matching, structured interpretation records, and bounded cross-family concept resolution for orientation, date/time, color/profile, GPS, geometry, lens-correction, and RAW-processing with parsed date/time fields, timezone/precision classification, combined GPS timestamps, GPS altitude-reference state, canonical geometry origin/size/rect/margins, full grouped value vectors, and tolerance-aware GPS conflicts. | Medium, about 63-70%. | Richer host policy hints and more long-tail per-model concept aliases. |
+| Semantic query/search and records | Query helpers expose raw matches, confidence, provenance, value shapes, normalized candidates, canonical crop/active-area rectangles, border margins, source-processing buckets, optional RapidFuzz near-miss matching, structured interpretation records, and bounded cross-family concept resolution for orientation, date/time, color/profile, GPS, geometry, lens-correction, and RAW-processing with parsed date/time fields, timezone/precision classification, combined GPS timestamps, GPS altitude-reference state, canonical geometry origin/size/rect/margins, full grouped value vectors, transfer hints, rendered/compatible safety booleans, and tolerance-aware GPS/color/geometry conflicts. | Medium, about 66-72%. | More long-tail per-model concept aliases and clearer user-facing policy messages. |
 | Transfer-safety classification | Compatible-file versus rendered-image safety policies classify source-specific image geometry, color/profile, RAW-processing, MakerNote, JUMBF/C2PA, and vendor-private data. | High, about 85-90%. | More user-facing diagnostics and additional per-family policy tests. |

 ## Competitor Position
@@ -57,9 +57,8 @@ outputs.

 ## Next Interpretation Priorities

-1. Add richer host policy hints to concept candidates so inspection UIs can
-distinguish portable facts, source-bound facts, and target-owned facts
-without reimplementing transfer-safety logic.
+1. Turn concept transfer hints into higher-level user-facing diagnostics for
+transfer previews and GUI workflows.
 2. Expand GPS policy beyond current coordinate tolerance and altitude-reference
 state, including unit/reference presentation and cross-family timestamp
 reconciliation.

docs/quick_start.md

Lines changed: 5 additions & 2 deletions
@@ -168,8 +168,11 @@ RAW-processing. It returns candidate source entries, source families, preferred
 entries, normalized date/time fields where available, date/time precision,
 timezone kind, GPS altitude-reference state, canonical geometry
 origin/size/rect/margins, full normalized value vectors for grouped
-matrix/vector/table concepts, and same-role conflict flags so host UI can show
-ambiguity instead of guessing silently.
+matrix/vector/table concepts, transfer hints, compatible/rendered safety
+booleans, and same-role conflict flags so host UI can show ambiguity instead
+of guessing silently. Transfer hints use `safe`, `source_bound`,
+`rendered_unsafe`, or `requires_target_image_spec` to separate portable facts
+from source RAW/correction data and target-owned image facts.

 For user-facing orientation display, use `openmeta/orientation.h` instead of
 showing only the numeric EXIF/TIFF index. `interpret_exif_orientation(...)`
