feat(classification): enforce valid class vocabulary with re-prompt/retry (#356) by rstrahan · Pull Request #371 · aws-solutions-library-samples/accelerated-intelligent-document-processing-on-aws

rstrahan · 2026-06-22T20:32:10Z

Summary

Closes #356.

Page-level classification (multimodalPageLevelClassification) can return a class label that is not in the configured vocabulary — e.g. predicting receipt when only invoice, w2, and check are valid. This is especially common with smaller/cheaper models. Previously the code detected this but only logged a warning and used the invalid label anyway.

This PR adds a deterministic validation + retry guardrail:

After the model returns a class, validate it against the configured vocabulary.
If invalid, re-prompt the model — re-sending the original content with an appended correction listing the allowed classes. (At temperature=0, re-sending an identical request would return the identical wrong answer, so the correction is load-bearing — this is a single-turn re-prompt, not a blind retry.)
Retry up to maxValidationRetries.
If exhausted, assign invalidClassFallback (default unclassified) and flag the page with a validation_error in classification metadata. The document keeps processing — no hard failure.

New config keys (under `classification:`)

Key	Default	Description
`enforceValidClasses`	`true`	Validate + retry on out-of-vocabulary classes. `false` = legacy "warn and use as-is".
`maxValidationRetries`	`2`	Re-prompts after the initial attempt. `0` disables retries.
`invalidClassFallback`	`unclassified`	Class assigned when retries are exhausted.

All three are editable in the Configuration UI.

⚠️ Behavior change on upgrade

Enforcement is on by default, so an out-of-vocabulary prediction that previously passed through unchanged is now corrected or coerced to unclassified. Set enforceValidClasses: false to restore the prior behavior. This was a deliberate decision (stronger correctness guarantee out of the box); calling it out here for reviewer sign-off.

Design notes

Chose a set-membership check + single-turn re-prompt over a generic pydantic-validation framework. The classification "schema" is just an enum of strings; the broader "reuse pydantic everywhere / Discovery" idea from the issue is out of scope (the author flagged it as separate too).
Metering is aggregated across all retry attempts so token usage isn't lost.
Holistic (textbasedHolisticClassification) is not covered by this loop yet — noted as a follow-up.

Changes

Core: validation/retry loop + _build_validation_retry_content helper in classify_page_bedrock
Config model: 3 fields + string-tolerant validators on ClassificationConfig
Defaults: base-classification.yaml (inherited by config_library samples via merge)
Config UI: 3 ConfigSchema entries in patterns/unified/template.yaml with dependsOn wiring
Tests: 5 service tests (retry→valid, exhausted→fallback, valid-first no-retry, enforcement-off legacy, metering aggregation) + 2 config-model tests
Notebook: notebooks/misc/classification-valid-class-enforcement.ipynb (deterministic mock-driven demo of all 3 scenarios)
Docs: user (docs/classification.md), developer README, CHANGELOG with migration note

Testing

cd lib/idp_common_pkg && pytest tests/unit/classification tests/unit/config/test_config_models.py → 84 passed, 1 skipped
ruff check and ruff format --check clean on all changed files
CloudFormation template parses; all 3 schema entries well-formed
Notebook logic verified end-to-end against the real service

Note: 6 failures in tests/unit/config/test_configuration_sync.py are pre-existing on develop (they test extraction.temperature sync, unrelated to this change).

🤖 Generated with Claude Code

…etry (#356) Page-level classification (multimodalPageLevelClassification) now validates the model's predicted class against the configured class vocabulary and re-prompts the model on out-of-vocabulary predictions, retrying up to a configurable limit. Because classification runs at temperature=0, the retry appends a correction message listing the allowed classes (a single-turn re-prompt) rather than re-sending an identical request. When retries are exhausted the page is assigned a configurable fallback class and flagged with a validation_error in metadata; the document keeps processing (no hard failure). Three new classification config keys control it (on by default): - enforceValidClasses (default true) - maxValidationRetries (default 2) - invalidClassFallback (default unclassified) Behavior change on upgrade: enforcement is on by default, so out-of-vocabulary predictions that previously passed through unchanged are now corrected or coerced to the fallback. Set enforceValidClasses: false to restore legacy "warn and use as-is" behavior. - Add fields + validators to ClassificationConfig - Add defaults to base-classification.yaml (inherited by config_library samples) - Add ConfigSchema entries to patterns/unified/template.yaml for the Config UI - Add validation/retry loop and _build_validation_retry_content helper - Add unit tests (retry-then-valid, exhausted-fallback, valid-first, legacy, metering aggregation) and config-model default/parsing tests - Add demo notebook notebooks/misc/classification-valid-class-enforcement.ipynb - Update user docs, developer README, and CHANGELOG Holistic packet classification is not covered by this loop yet. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

rstrahan merged commit 9fe710c into develop Jun 22, 2026
2 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat(classification): enforce valid class vocabulary with re-prompt/retry (#356)#371

feat(classification): enforce valid class vocabulary with re-prompt/retry (#356)#371
rstrahan merged 1 commit into
developfrom
feature/classification-enforce-valid-classes

rstrahan commented Jun 22, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

rstrahan commented Jun 22, 2026

Summary

New config keys (under classification:)

⚠️ Behavior change on upgrade

Design notes

Changes

Testing

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

New config keys (under `classification:`)