Commit a7c8575

Group vendor MakerNote/RAW query candidates
Add per-family grouped semantic query candidates for vendor MakerNote/RAW fields (white-balance, color, raw-storage, sensor, source-processing). Implement vendor-family key mapping and grouping helpers in metadata_query.cc, append grouped candidates for relevant query kinds, and preserve grouped interpretation records. Add unit tests covering grouping behavior, update documentation and Sphinx sources to describe the feature, update CHANGES.md, and bump VERSION to 0.4.13.
1 parent b72854e commit a7c8575
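
The grouping step described in the commit message can be pictured with a minimal sketch. It is not the code in `metadata_query.cc`: the types `GroupKind`, `ClassifiedField`, and `GroupedCandidate` and the function `group_vendor_fields` are hypothetical names invented for illustration. It assumes a grouped candidate is only worth emitting when at least two related fields share a vendor family and kind.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration; these are NOT OpenMeta's real types.
enum class GroupKind { WhiteBalance, Color, RawStorage, Sensor, SourceProcessing };

struct ClassifiedField {
    std::string vendor_family;  // e.g. "canon" (assumed family key format)
    GroupKind kind;             // semantic bucket assigned by classification
    std::uint32_t entry_id;     // id of the source metadata entry
};

struct GroupedCandidate {
    std::string vendor_family;
    GroupKind kind;
    std::vector<std::uint32_t> source_entry_ids;
};

// Buckets fields by (vendor family, kind) and keeps only buckets that hold
// at least two related fields (an assumption of this sketch).
std::vector<GroupedCandidate> group_vendor_fields(
    const std::vector<ClassifiedField>& fields) {
    std::map<std::pair<std::string, GroupKind>, std::vector<std::uint32_t>> buckets;
    for (const ClassifiedField& f : fields) {
        buckets[std::make_pair(f.vendor_family, f.kind)].push_back(f.entry_id);
    }

    std::vector<GroupedCandidate> out;
    for (auto& [key, ids] : buckets) {
        if (ids.size() < 2) {
            continue;  // a single field stays a plain per-entry match
        }
        out.push_back({key.first, key.second, std::move(ids)});
    }
    return out;
}
```

Using an ordered map keyed on the family/kind pair keeps the output order deterministic, which makes grouping behavior easier to pin down in unit tests.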

15 files changed

Lines changed: 349 additions & 33 deletions

CHANGES.md

Lines changed: 11 additions & 0 deletions
@@ -1,5 +1,16 @@
# OpenMeta Changes

+## 0.4.13 - 2026-05-18
+
+Changes compared with `0.4.12`.
+
+### Added
+
+- Added per-family grouped semantic query candidates for vendor MakerNote/RAW
+  white-balance, color, raw-storage, sensor, and source-processing fields so
+  structured interpretation and concept resolution can expose table/vector/set
+  records instead of only per-entry bucket matches.
+
## 0.4.12 - 2026-05-18

Changes compared with `0.4.11`.

VERSION

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-0.4.12
+0.4.13

docs/api_stability.md

Lines changed: 2 additions & 2 deletions
@@ -24,8 +24,8 @@ different status.
| `ExportNameStyle::FlatHost` | `openmeta/interop_export.h` | Stable | Stable v1 flat host naming contract. See [flat_host_mapping.md](flat_host_mapping.md). |
| EXIF/TIFF orientation helpers: `interpret_exif_orientation(...)`, `exif_orientation_name(...)`, `exif_orientation_rotation_degrees_cw(...)`, `exif_orientation_rotation_only(...)` | `openmeta/orientation.h` | Stable | Small utility contract for user-facing orientation labels, clockwise rotation degrees, mirrored-state detection, dimension-swap detection, and rotation-only fallbacks. Python exposes the same helpers through thin scalar/dictionary wrappers. |
| EXIF/TIFF/DNG numeric value names: `exif_tag_numeric_value_name(...)` and focused helpers | `openmeta/exif_value_names.h` | Stable | Small helper contract for common enum-like TIFF/EXIF/DNG numeric values such as compression, photometric interpretation, planar configuration, exposure program, metering mode, light source, flash, color space, white balance, scene capture type, gain control, CFA layout, and DNG calibration illuminants. Unknown values return an empty string and remain lossless numeric metadata. |
-| Semantic metadata query: `query_metadata(...)`, `query_crop_metadata(...)`, focused query helpers, and `metadata_query_fuzzy_search_available()` | `openmeta/metadata_query.h` | Experimental | Query contract for inspection matches plus normalized candidates. Current coverage includes crop/active-area/border margins, exposure/gain, white balance, color, lens correction, orientation, and RAW/source-processing metadata across standard tags, selected DNG tags, fuzzy XMP paths, and vendor RAW-processing classification. Matches report `exact_match`, `fuzzy_match`, and `fuzzy_score` so tools can label exact results separately from RapidFuzz near-miss hits. `OPENMETA_ENABLE_RAPIDFUZZ=ON` adds optional near-miss XMP/property-path scoring. Grouped candidates include `matrix_set`, `vector_set`, and `table` shapes for related non-crop metadata, including RAW black/white levels, linearization, CFA/sensor layout, source geometry, raw-storage identifiers, and source-processing buckets. Python `Document` and `TransferSourceSnapshot` mirror this as thin dictionary-returning wrappers. |
-| Structured metadata interpretation records: `interpret_metadata(...)`, `interpret_metadata_query(...)` | `openmeta/metadata_interpretation.h` | Experimental | Thin structured projection over semantic query candidates. Records carry query class, semantic kind, normalized shape, confidence, source entry ids, and normalized origin/size/rect/margins/value arrays where available. Current scope covers orientation, geometry/crop/border, exposure/gain, color/white-balance, lens-correction, and RAW/source-processing records. Python `Document` and `TransferSourceSnapshot` expose matching dictionary wrappers. |
+| Semantic metadata query: `query_metadata(...)`, `query_crop_metadata(...)`, focused query helpers, and `metadata_query_fuzzy_search_available()` | `openmeta/metadata_query.h` | Experimental | Query contract for inspection matches plus normalized candidates. Current coverage includes crop/active-area/border margins, exposure/gain, white balance, color, lens correction, orientation, and RAW/source-processing metadata across standard tags, selected DNG tags, fuzzy XMP paths, and vendor RAW-processing classification. Matches report `exact_match`, `fuzzy_match`, and `fuzzy_score` so tools can label exact results separately from RapidFuzz near-miss hits. `OPENMETA_ENABLE_RAPIDFUZZ=ON` adds optional near-miss XMP/property-path scoring. Grouped candidates include `matrix_set`, `vector_set`, and `table` shapes for related non-crop metadata, including RAW black/white levels, linearization, CFA/sensor layout, source geometry, raw-storage identifiers, source-processing buckets, and per-family vendor MakerNote/RAW white-balance, color, raw-storage, sensor, and source-processing groups. Python `Document` and `TransferSourceSnapshot` mirror this as thin dictionary-returning wrappers. |
+| Structured metadata interpretation records: `interpret_metadata(...)`, `interpret_metadata_query(...)` | `openmeta/metadata_interpretation.h` | Experimental | Thin structured projection over semantic query candidates. Records carry query class, semantic kind, normalized shape, confidence, source entry ids, and normalized origin/size/rect/margins/value arrays where available. Current scope covers orientation, geometry/crop/border, exposure/gain, color/white-balance, lens-correction, RAW/source-processing records, and grouped vendor-family table/vector records where classification supports them. Python `Document` and `TransferSourceSnapshot` expose matching dictionary wrappers. |
| Cross-family concept resolution: `resolve_metadata_concepts(...)`, `resolve_metadata_concept(...)` | `openmeta/metadata_concepts.h` | Experimental | First bounded resolver for duplicated host-facing concepts. Current scope reports candidates, candidate source entries, source families, preferred entries, normalized numeric/text keys, full normalized value vectors, transfer hints, normalized date/time fields, date/time precision, timezone kind, normalized geometry fields, and same-role conflicts for orientation, date/time, color/profile, GPS, geometry, lens-correction, and RAW-processing evidence across EXIF, XMP, IPTC, ICC, PNG text, and query-backed interpretation records where applicable. Geometry candidates cover crop, active area, border, and sensor geometry with canonical origin, size, rect, and margin fields when available. Candidate transfer hints distinguish `safe`, `source_bound`, `rendered_unsafe`, and `requires_target_image_spec` evidence, with compatible-file and rendered-image safety booleans. Color/white-balance, lens-correction, and RAW-processing concepts preserve grouped matrix/vector/table values for host inspection; they do not make source-bound values safe to serialize into rendered targets. GPS date/time is combined from `GPSDateStamp` plus `GPSTimeStamp` when both entries exist, and GPS altitude candidates expose altitude-reference code plus below-sea-level state when reference metadata is present; `metadata_concept_gps_altitude_reference_name(...)` provides a stable display token for the reference code. It is intended for inspection UI and host policy decisions; it does not rewrite metadata or hide ambiguity. Python `Document` and `TransferSourceSnapshot` expose matching dictionary wrappers. |
| Transfer concept diagnostics: `transfer_concept_diagnostics_from_store(...)`, `transfer_concept_diagnostic_message(...)` | `openmeta/metadata_transfer.h` | Experimental | Preflight view over concept candidates for `TransferSafetyMode`. Each diagnostic reports concept kind/role, transfer hint, keep/drop/requires-target-image-spec action, reason token, severity token, default message text, conflict flag, source entries, compatible/rendered safety booleans, and GPS altitude-reference presentation fields. Intended for UI previews and host policy messages before calling `prepare_metadata_for_target(...)`; it does not replace the actual transfer filter. Python `Document` and `TransferSourceSnapshot` expose `transfer_concept_diagnostics(...)` dictionaries with `severity_name` and `message` fields. |
| Vendor RAW-processing summaries: `vendor_raw_processing_from_store(...)`, `classify_vendor_raw_processing_field(...)` | `openmeta/vendor_raw_processing.h` | Experimental | Conservative grouped source-RAW/source-processing field summaries for decoded Sony, Canon, Nikon, Fujifilm, Pentax, Panasonic, Olympus, Kodak, Minolta, Sigma, Samsung, Ricoh, Apple, DJI, Google, FLIR, Casio, Sanyo, KyoceraRaw, Reconyx, HP, JVC, GE, Motorola, Nintendo, and Microsoft MakerNotes, including vendor-private, computational, thermal, preview, face-geometry, stitch/panorama, Apple computational capture/HDR/motion, DJI pose/thermal, Google HDR+/shot-log, pixel-shift/multi-shot/composite/auto-lighting source processing, and FLIR radiometric/raw-value buckets. Direct field classification also recognizes decoded Phase One/Leaf RAW-processing tags; use the dedicated Phase One/Leaf helpers for normalized geometry and processing summaries. Intended for audit/UI and rendered-transfer safety decisions, not for writing vendor RAW/source-processing values into rendered targets. |

docs/development.md

Lines changed: 6 additions & 4 deletions
@@ -17,8 +17,8 @@ model should stay compact:
| Area | Purpose | Readiness |
| --- | --- | --- |
| Decoding | Find metadata carriers and decode EXIF, XMP, IPTC, ICC, Photoshop IRB, JUMBF/C2PA, EXR, and related blocks into `MetaStore` entries. | High, about 98-100% for the current target scope. |
-| Interpretation | Normalize names and values, group entries by meaning, and classify source-bound data such as RAW crop, color, lens-correction, sensor, computational capture state, and vendor-private fields. | Medium-high, about 83%. |
-| Query | Find entries by name, fuzzy term, or semantic group, then expose normalized query candidates, structured interpretation records, bounded cross-family concept resolutions, transfer hints, and conflict flags for crop/border/active-area, exposure/gain, color/WB, orientation, date/time, GPS, lens-correction, and RAW/source-processing fields across standard and vendor metadata. | Medium, about 66-72%. |
+| Interpretation | Normalize names and values, group entries by meaning, and classify source-bound data such as RAW crop, color, lens-correction, sensor, computational capture state, and vendor-private fields. | Medium-high, about 84%. |
+| Query | Find entries by name, fuzzy term, or semantic group, then expose normalized query candidates, structured interpretation records, bounded cross-family concept resolutions, transfer hints, and conflict flags for crop/border/active-area, exposure/gain, color/WB, orientation, date/time, GPS, lens-correction, and RAW/source-processing fields across standard and vendor metadata. | Medium, about 68-74%. |
| Creation | Build fresh metadata entries from host-provided values. | Medium, about 55-65%. |
| Editing | Modify existing logical metadata entries while preserving valid surrounding structure. | Medium, about 60-70%. |
| Transfer | Move metadata between files using explicit compatible-file or rendered-image safety policies. | Medium-high, about 80-85%. |
@@ -45,8 +45,10 @@ fuzzy XMP paths, canonical border-margin parsing, and vendor RAW-processing
classification where applicable.
They also append grouped candidates for related DNG color matrix/calibration/
reduction/forward matrix tags, DNG white-balance vector tags, and
-lens-correction table groups. RAW-processing queries add conservative groups
-for black/white levels, linearization tables, CFA/sensor layout, source
+lens-correction table groups. Vendor-classified MakerNote/RAW fields can also
+form per-family grouped candidates for white balance, color, raw-storage,
+sensor, and source-processing records. RAW-processing queries add conservative
+groups for black/white levels, linearization tables, CFA/sensor layout, source
geometry, raw-storage identifiers, and source-private processing buckets.
Grouped candidates use `matrix_set`, `vector_set`, and `table` value shapes.
When

docs/host_integration.md

Lines changed: 3 additions & 0 deletions
@@ -607,6 +607,9 @@ geometry/storage, lens correction, raw-data, sensor-calibration,
computational, thermal, preview/face geometry, stitch/panorama geometry, or
vendor-private table metadata. Use it to audit transfer safety decisions and
host UI, not as a rendered-output write source.
+The same classification feeds semantic query and interpretation records as
+per-family grouped table/vector candidates when multiple related vendor fields
+are present.

```cpp
#include "openmeta/vendor_raw_processing.h"

docs/interpretation_status.md

Lines changed: 4 additions & 4 deletions
@@ -35,12 +35,12 @@ explicit outcome:
| IPTC-IIM and portable XMP | IPTC datasets and XMP properties decode into typed entries, and bounded EXIF/IPTC-to-XMP projection exists for transfer/writeback. | Medium-high, about 75-85%. | Full MWG-style reconciliation of duplicated EXIF/XMP/IPTC concepts is still bounded. |
| Orientation | EXIF/TIFF orientation query, LibRaw flip mapping, generic orientation helpers for index, rotation degrees, mirrored state, dimension swap, rotation-only fallback, human-readable labels, and EXIF-vs-XMP conflict reporting in the LibRaw bridge. | High, about 90-95%. | Higher-level policy for resolving container and host pixel-orientation state remains host-specific. |
| Geometry, crop, active area, and borders | DNG crop/active-area/masked-area tags, Phase One/Leaf geometry, canonical border margins, vendor RAW-processing geometry buckets, and fuzzy crop/border-style paths are queryable. | Medium-high, about 85-90%. | More vendor-specific normalized rectangles and stronger output contracts for ambiguous multi-tag geometry. |
-| Color, white balance, and matrices | DNG color/calibration/reduction/forward matrix groups, white-balance vector groups, ICC metadata, RAW color/source-processing safety buckets, transfer hints, and cross-family concept candidates with full grouped value vectors are identified. | Medium-high, about 79-87%. | Deeper camera/vendor color science interpretation is intentionally conservative, especially for rendered-image transfer. |
-| Lens correction and RAW processing | Lens-correction groups, black/white levels, linearization, CFA/sensor layout, raw-storage identifiers, vendor RAW/source-processing buckets, transfer hints, transfer diagnostics, and concept candidates with grouped table/vector values are classified for query and transfer safety. | Medium-high, about 78-85%. | Long-tail per-model correction tables and richer numeric normalization. |
-| Vendor MakerNotes | Broad MakerNote naming and source-processing classification exists for common vendors and several live computational/thermal vendors. Unknown entries remain lossless and source-private subgroups distinguish preview, face geometry, computational, thermal, stitch/panorama, pixel-shift, multi-shot, composite, auto-lighting, RAW crop/active-area, source color-transform, lens-correction, raw-level processing data, and Phase One/Leaf RAW-processing fields handled by direct classification plus dedicated normalized helpers. | Medium-high, about 80-88%. | ExifTool-style long-tail print conversions, encrypted/custom settings, and per-model private tables. |
+| Color, white balance, and matrices | DNG color/calibration/reduction/forward matrix groups, white-balance vector groups, ICC metadata, RAW color/source-processing safety buckets, transfer hints, per-family grouped vendor color/WB candidates, and cross-family concept candidates with full grouped value vectors are identified. | Medium-high, about 80-88%. | Deeper camera/vendor color science interpretation is intentionally conservative, especially for rendered-image transfer. |
+| Lens correction and RAW processing | Lens-correction groups, black/white levels, linearization, CFA/sensor layout, raw-storage identifiers, vendor RAW/source-processing buckets, per-family vendor raw-storage/sensor/source-processing table candidates, transfer hints, transfer diagnostics, and concept candidates with grouped table/vector values are classified for query and transfer safety. | Medium-high, about 80-87%. | Long-tail per-model correction tables and richer numeric normalization. |
+| Vendor MakerNotes | Broad MakerNote naming and source-processing classification exists for common vendors and several live computational/thermal vendors. Unknown entries remain lossless and source-private subgroups distinguish preview, face geometry, computational, thermal, stitch/panorama, pixel-shift, multi-shot, composite, auto-lighting, RAW crop/active-area, source color-transform, lens-correction, raw-level processing data, and Phase One/Leaf RAW-processing fields handled by direct classification plus dedicated normalized helpers. Classified multi-field vendor groups now surface as grouped query/interpretation candidates where safe to expose structurally. | Medium-high, about 82-89%. | ExifTool-style long-tail print conversions, encrypted/custom settings, and per-model private tables. |
| BMFF item graph, HEIF/AVIF/CR3, JUMBF, and C2PA | BMFF derived fields, item-info rows, bounded relations, primary-linked roles, aux semantics, and draft C2PA/JUMBF structural fields are exposed. | Medium, about 60-70%. | Full BMFF scene modeling and full C2PA manifest/policy semantics. |
| Photoshop IRB | Raw resources are preserved and a bounded interpreted subset is decoded for fixed-layout resources. | Medium, about 60-70%. | Broader resource-specific interpretation. |
-| Semantic query/search and records | Query helpers expose raw matches, confidence, provenance, value shapes, normalized candidates, canonical crop/active-area rectangles, border margins, source-processing buckets, optional RapidFuzz near-miss matching, structured interpretation records, and bounded cross-family concept resolution for orientation, date/time, color/profile, GPS, geometry, lens-correction, and RAW-processing with parsed date/time fields, timezone/precision classification, combined GPS timestamps, GPS altitude-reference state and display token, canonical geometry origin/size/rect/margins, full grouped value vectors, transfer hints, rendered/compatible safety booleans, and tolerance-aware GPS/color/geometry conflicts. | Medium, about 68-74%. | More long-tail per-model concept aliases and richer localized policy wording. |
+| Semantic query/search and records | Query helpers expose raw matches, confidence, provenance, value shapes, normalized candidates, canonical crop/active-area rectangles, border margins, per-family grouped vendor records, source-processing buckets, optional RapidFuzz near-miss matching, structured interpretation records, and bounded cross-family concept resolution for orientation, date/time, color/profile, GPS, geometry, lens-correction, and RAW-processing with parsed date/time fields, timezone/precision classification, combined GPS timestamps, GPS altitude-reference state and display token, canonical geometry origin/size/rect/margins, full grouped value vectors, transfer hints, rendered/compatible safety booleans, and tolerance-aware GPS/color/geometry conflicts. | Medium, about 70-76%. | More long-tail per-model concept aliases and richer localized policy wording. |
| Transfer-safety classification | Compatible-file versus rendered-image safety policies classify source-specific image geometry, color/profile, RAW-processing, MakerNote, JUMBF/C2PA, and vendor-private data, with concept-level diagnostics that report keep/drop/requires-target-image-spec actions, severity, and default message text before prepare. | High, about 88-92%. | More per-family policy tests and optional host localization hooks. |

## Competitor Position

docs/quick_start.md

Lines changed: 6 additions & 0 deletions
@@ -153,6 +153,12 @@ Every raw query match carries `exact_match`, `fuzzy_match`, and `fuzzy_score`
fields, so inspection UI can label near-miss results separately from exact
tag/name hits.

+For vendor MakerNote/RAW fields, the query layer also builds conservative
+per-family grouped candidates for related white-balance, color, raw-storage,
+sensor, and source-processing records. These records are for inspection and
+safe-transfer policy, not for writing source RAW transforms into rendered
+outputs.
+
Call `metadata_query_fuzzy_search_available()` when a UI wants to expose that
the stronger fuzzy matcher is compiled in.

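
The "inspection, not writing" rule in the quick-start text above can be pictured as a small host-side filter. This is only a sketch under stated assumptions: `TargetKind`, `GroupedRecord`, its two safety booleans, and `writable_labels` are hypothetical names for illustration and are not OpenMeta's API; the idea that a host consults a per-target safety flag before serializing anything is likewise an assumption.

```cpp
#include <string>
#include <vector>

// Hypothetical host-side types; field names are illustrative only.
enum class TargetKind { CompatibleFile, RenderedImage };

struct GroupedRecord {
    std::string label;              // display label used by the host UI
    bool safe_for_compatible_file;  // assumed per-target safety flag
    bool safe_for_rendered_image;   // assumed per-target safety flag
};

// Returns the labels of records the host would allow into the given target.
// Records that fail the check stay available for inspection but are never
// written.
std::vector<std::string> writable_labels(
    const std::vector<GroupedRecord>& records, TargetKind target) {
    std::vector<std::string> out;
    for (const GroupedRecord& r : records) {
        const bool ok = (target == TargetKind::RenderedImage)
                            ? r.safe_for_rendered_image
                            : r.safe_for_compatible_file;
        if (ok) {
            out.push_back(r.label);
        }
    }
    return out;
}
```

Keeping the decision in one function lets a UI show every grouped record while the write path only ever sees the filtered list.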