Add text/plain validation support with error codes and test vectors #9
Open
erik-sv wants to merge 5 commits into
Add full rubric evaluation pipeline for the C2PA asset conformance program's composable rubric framework.

json-formula-rs:
- Add normalize_expression() for bare true/false/null keyword rewriting
- Add arg_count() helper and $argN parameterized named expression support in register_expression() with globals injection and save/restore pattern

profile-evaluator-rs:
- Add evaluate_rubric_conformance() for whole-crJSON conformance evaluation with failIfMatched support and true/false trait bucketing
- Add evaluate_rubric_signals() for per-manifest signal detection with inception/transformation grouping, ingredient index resolution, assertedBy extraction, and mimeType derivation
- Support both report_text (profiles) and reportText (rubrics) field names
- 36 golden fixture tests matching upstream Python reference evaluator output (1 documented deviation: startsWith array projection in ii2i)

c2pa-validate CLI:
- Add -rubric, -rubric-dir, -rubric-mode (conformance/signals), -emit-crjson, -crjson, -rubric-strict flags
- Remove crJSON evaluation bail that blocked rubric eval on crJSON inputs
- Add rubric_results to CrJsonValidationReport for structured output

Test fixtures:
- 5 rubric YAML files from c2pa-org/conformance PR #324
- 18 golden test scenarios (54 files) from upstream test suite
When --rubric is used with binary assets, the CLI now extracts crJSON and runs rubric evaluation even when trust verification fails. This supports the conformance program onboarding workflow, where products use self-signed certificates before receiving program-issued certs.

The pipeline attempts trust verification first, then falls back to reading with verify_trust disabled. The crJSON validationResults still reflect the untrusted state, so the trusted_success trait fails as expected while all other conformance traits can be evaluated.

Also fix structured JSON output for CrJsonValidation report items so rubric results are properly serialized instead of null.
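The fallback described in this commit can be sketched roughly as follows. This is an illustration only: `CrJson`, `ReadError`, `extract_crjson`, and `self_signed_reader` are hypothetical names, not the c2pa-rs or c2pa-validate API, and the sketch assumes that only a trust failure (not an unreadable asset) triggers the second read.

```rust
// Hypothetical stand-ins for illustration; not the c2pa-rs or c2pa-validate API.
#[derive(Debug, Clone, PartialEq)]
struct CrJson {
    /// Whether the read that produced this crJSON verified trust.
    trusted: bool,
    body: String,
}

#[derive(Debug, PartialEq)]
enum ReadError {
    Untrusted,
    Unreadable,
}

/// Try a trust-verifying read first. Only when that fails because of trust
/// (assumption), retry with trust verification disabled.
fn extract_crjson<F>(read: F) -> Result<CrJson, ReadError>
where
    F: Fn(bool) -> Result<CrJson, ReadError>,
{
    match read(true) {
        Err(ReadError::Untrusted) => read(false),
        other => other,
    }
}

/// Simulates an asset signed with a self-signed certificate: the trusted
/// read fails, the untrusted read succeeds and is marked as untrusted.
fn self_signed_reader(verify_trust: bool) -> Result<CrJson, ReadError> {
    if verify_trust {
        Err(ReadError::Untrusted)
    } else {
        Ok(CrJson { trusted: false, body: "{}".to_string() })
    }
}
```

Calling `extract_crjson(self_signed_reader)` yields crJSON with `trusted` set to false, which mirrors the commit's point that a trust-related trait would still fail while the remaining traits can be evaluated.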
…andling

Move $argN injection and bare-keyword normalization from json-formula-rs to profile-evaluator-rs. This keeps json-formula-rs upstream-compatible while supporting parameterized named expressions in rubric evaluation.

json-formula-rs retains only register_function() and globals_mut() as additions over upstream. All expression preprocessing now runs in the evaluator layer at registration time.
Add 6 manifest.text.* well-formedness codes (corruptedWrapper, multipleWrappers, invalidMagic, unsupportedVersion, lengthMismatch, emptyManifest) to all four rubric YAMLs. These codes enable well_formed_success to catch text wrapper validation failures.

Add text test vectors in testfiles/encypher-assets/document/text/:
- signed_test.txt: valid signed text (positive vector)
- signed_test_txt.json: crJSON extracted from signed text
- double_wrapper.txt: duplicate wrapper (negative)
- corrupted_wrapper.txt: truncated JUMBF (negative)
- bad_magic.txt: wrong magic bytes (negative)
- bad_version.txt: unsupported version byte (negative)
Summary
Adds text/plain format support to the conformance validation tool, including format-specific error codes and negative test vectors.
Depends on:
Changes
Text error codes: manifest.text.missing, manifest.text.bad_magic, manifest.text.bad_version, manifest.text.corrupted_wrapper, manifest.text.double_wrapper. Added to all rubric YAML files so text assets are evaluated alongside other formats.
Test vectors: Five text files covering the signed happy path and four failure modes (bad magic bytes, unsupported version, corrupted wrapper structure, duplicate wrapper). These serve as regression fixtures for text validation.
c2pa-rs submodule update: Points to the commit that includes the TextIO asset handler, enabling native text/plain reading and writing in the validation pipeline.
Context
C2PA Section A.7 defines text as a supported asset type. The c2pa-text reference implementation encodes JUMBF manifest bytes as invisible Unicode Variation Selectors. This PR enables the conformance tool to validate text assets the same way it validates images, audio, and video.
Test plan
cargo test passes with text_io available in the vendored c2pa-rs
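For readers unfamiliar with the encoding mentioned under Context, here is a minimal sketch of how bytes can be carried as invisible Unicode Variation Selectors. The byte-to-code-point mapping (bytes 0 to 15 to U+FE00..U+FE0F, bytes 16 to 255 to U+E0100..U+E01EF) is an assumption based on the two variation selector blocks Unicode defines, and the function names are hypothetical. This is not the c2pa-text implementation, and it omits the wrapper fields (magic, version, length) that the error codes above refer to.

```rust
/// Map one byte to a variation selector code point (assumed mapping).
fn byte_to_selector(b: u8) -> char {
    let cp = if b < 16 {
        0xFE00 + b as u32 // Variation Selectors block: VS1..VS16
    } else {
        0xE0100 + (b as u32 - 16) // Variation Selectors Supplement: VS17..VS256
    };
    char::from_u32(cp).expect("code point lies in a variation selector block")
}

/// Inverse mapping; returns None for any character that is not a selector.
fn selector_to_byte(c: char) -> Option<u8> {
    match c as u32 {
        cp @ 0xFE00..=0xFE0F => Some((cp - 0xFE00) as u8),
        cp @ 0xE0100..=0xE01EF => Some((cp - 0xE0100 + 16) as u8),
        _ => None,
    }
}

/// Append the payload to the visible text as invisible selectors.
fn embed(visible: &str, payload: &[u8]) -> String {
    let mut out = String::from(visible);
    out.extend(payload.iter().map(|&b| byte_to_selector(b)));
    out
}

/// Recover the payload by collecting every selector found in the text.
fn extract(text: &str) -> Vec<u8> {
    text.chars().filter_map(selector_to_byte).collect()
}
```

Round-tripping `extract(&embed("hello", &[0, 15, 16, 255]))` returns the original four bytes while the visible prefix stays unchanged. A real validator would additionally check the wrapper structure, which is where failure modes such as bad magic bytes or a duplicate wrapper would be detected.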