Commit 8fa3a60

Merge branch 'develop' v0.5.1
2 parents 930fc54 + 2eb084f commit 8fa3a60

240 files changed

Lines changed: 22816 additions & 4432 deletions

CHANGELOG.md

Lines changed: 44 additions & 0 deletions
@@ -5,6 +5,50 @@ SPDX-License-Identifier: MIT-0
 
 ## [Unreleased]
 
+## [0.5.1]
+
+### Added
+
+- **Scalable Document List and Test Executions** — Comprehensive redesign to eliminate UI and backend bottlenecks when working with thousands of documents. ([#203](https://github.com/aws-solutions-library-samples/accelerated-intelligent-document-processing-on-aws/issues/203))
+  - **TypeDateIndex GSI on TrackingTable**: New DynamoDB Global Secondary Index (`ItemType` + `InitialEventTime`) enables efficient queries by item type (document, testrun, testset) sorted by time, replacing full table scans. Includes 20 projected attributes for list-view rendering without base table fetches.
+  - **GSI Attribute Backfill Mechanism**: Robust Step Functions state machine with parallel scan workers that automatically backfills `ItemType` and `HITLPendingReview` attributes on existing items during stack upgrades. Features timeout-safe continuation, idempotent conditional updates, and automatic trigger via CloudFormation Custom Resource.
+  - **GSI-Based Document List Resolver**: New `listDocuments` Lambda resolver queries the TypeDateIndex GSI with server-side pagination (`limit`/`nextToken`).
+  - **`getDocumentCount` API**: New efficient count query using GSI `Select: 'COUNT'` for accurate document totals without fetching data.
+  - **UI Document List Rewrite**: Eliminated the N+1 query pattern (shard queries → individual `getDocument` per document). Now uses a single paginated `listDocuments` GSI query for all time periods. First page renders immediately with incremental background loading of remaining pages.
+  - **Subscription Optimization**: `onUpdateDocument` events now use subscription data directly instead of triggering individual `getDocument` API calls, eliminating thousands of redundant requests during active processing.
+  - **GSI-Based Test Runs Query**: Replaced full table scan in `get_test_runs()` and `get_test_runs_by_date_range()` with GSI query + BatchGetItem pattern for efficient test run listing with all fields (including Context, ConfigVersion).
+  - **GSI-Based Test Sets Query**: Replaced full table scan in `get_test_sets()` with GSI query + BatchGetItem pattern, avoiding scanning the entire TrackingTable (which includes all documents) just to find ~10 test sets.
+  - **`ItemType` Written on All Creation Paths**: All document, test run, and test set creation paths (DynamoDB service, AppSync resolvers, test runners, dataset deployers) now write `ItemType` and `InitialEventTime` for immediate GSI indexing.
+  - **Improved Error Messages**: Document list errors now show the actual failure reason (e.g., Lambda throttling, timeout details) instead of generic "please try again" messages.
+
+- **GraphQL Type Generation & Unit Testing** — Replaced 60+ hand-written GraphQL query/mutation/subscription files with auto-generated types via `@graphql-codegen`, added typed AWSJSON parsers with unit tests (vitest + jsdom), and integrated a CI codegen-check to prevent type drift.
+
+- **Third-Party Model Support** — Added Meta Llama 4 Maverick 17B, Llama 4 Scout 17B, Google Gemma 3 27B IT, and NVIDIA Nemotron Nano 12B v2 VL as selectable models across all pipeline stages (OCR, Classification, Extraction, Assessment, Summarization, Evaluation, Discovery, Agents, Rule Validation). Includes per-token pricing configuration and EU region fallback mappings for Llama 4 models. ([#217](https://github.com/aws-solutions-library-samples/accelerated-intelligent-document-processing-on-aws/issues/217))
+
+- **Load Test Config Version Support** — Added `--config-version` parameter to the `idp-cli load-test` command, enabling load tests to target a specific configuration version. Files uploaded during load tests now include `config-version` S3 metadata, consistent with the `process` command behavior.
+
+- **Deploy Failure Root Cause Analysis** — Enhanced `idp-cli deploy` failure reporting to recursively analyze nested stack events and identify actual root causes. Previously, failures in nested stacks showed only a generic "Embedded stack was not successfully created" message. Now displays a structured "Root Cause Analysis" section with the specific resource, type, and error message from the nested stack that caused the failure, along with cascade failure counts.
+
+- **MCP Server** — Added additional tool to MCP Server for retrieving results of the processed document from the IDP system.
+
+
+### Changed
+
+- **OCR Benchmark Config Optimization** — Optimized `config_library/unified/ocr-benchmark` configuration with targeted field descriptions, explicit model/prompt/OCR settings, and corrected date format (YYYY-MM-DD to match ground truth). Improved overall extraction accuracy from 51.5% to 75.2% on the full 293-document benchmark at equivalent cost (~$2.62). Classification remains 100% across all 9 document classes. ([#220](https://github.com/aws-solutions-library-samples/accelerated-intelligent-document-processing-on-aws/pull/220))
+
+- **GraphQL Type Generation & Unit Testing** — Replaced 60+ hand-written GraphQL query/mutation/subscription files with auto-generated types via `@graphql-codegen`, added typed AWSJSON parsers with unit tests (vitest + jsdom), and integrated a CI codegen-check to prevent type drift.
+
+### Fixed
+
+- **AgentCore Gateway Manager** — Fixed the issue where gateway was not getting deleted once stack is deleted.
+
+- **Configuration Page Error Display** — Fixed `[object Object]` error message when configuration loading fails (e.g., due to Lambda throttling) by properly extracting error messages from Amplify GraphQL error responses.
+
+### Templates
+- us-west-2: `https://s3.us-west-2.amazonaws.com/aws-ml-blog-us-west-2/artifacts/genai-idp/idp-main_0.5.1.yaml`
+- us-east-1: `https://s3.us-east-1.amazonaws.com/aws-ml-blog-us-east-1/artifacts/genai-idp/idp-main_0.5.1.yaml`
+- eu-central-1: `https://s3.eu-central-1.amazonaws.com/aws-ml-blog-eu-central-1/artifacts/genai-idp/idp-main_0.5.1.yaml`
+
 ## [0.5.0]
 
 ### Added
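
Reviewer note: the `limit`/`nextToken` pagination that the changelog describes for `listDocuments` can be sketched roughly as below. This is only an illustration of the loop a client might run, not the resolver's actual code. The page shape (`items`, `nextToken`), the `ObjectKey` attribute, and the in-memory `fake_query_page` stand-in are all assumptions made for the example; a real implementation would query the TypeDateIndex GSI instead.

```python
from typing import Callable, Dict, Iterator, Optional

# Assumed page shape for illustration: {"items": [...], "nextToken": str or None}
Page = Dict[str, object]


def iter_documents(
    query_page: Callable[[int, Optional[str]], Page],
    limit: int = 50,
) -> Iterator[dict]:
    """Yield items one page at a time, following nextToken until it runs out."""
    token: Optional[str] = None
    while True:
        page = query_page(limit, token)
        yield from page["items"]
        token = page.get("nextToken")
        if not token:
            break


def fake_query_page(limit: int, next_token: Optional[str]) -> Page:
    """Hypothetical in-memory stand-in for one paginated GSI query."""
    data = [{"ObjectKey": f"doc-{i}", "ItemType": "document"} for i in range(7)]
    start = int(next_token) if next_token else 0
    end = start + limit
    # Return a token only while more items remain, mirroring server-side paging.
    return {"items": data[start:end], "nextToken": str(end) if end < len(data) else None}


keys = [doc["ObjectKey"] for doc in iter_documents(fake_query_page, limit=3)]
print(keys)
```

Because the generator yields the first page before requesting the next, a caller can render early results right away, which matches the "first page renders immediately" behavior described in the changelog.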

Makefile

Lines changed: 29 additions & 1 deletion
@@ -118,7 +118,12 @@ lint-cicd:
 		echo -e "$(RED)ERROR: UI build failed$(NC)"; \
 		exit 1; \
 	fi
-
+
+	@if ! make codegen-check; then \
+		echo -e "$(RED)ERROR: GraphQL codegen check failed$(NC)"; \
+		exit 1; \
+	fi
+
 	@echo -e "$(GREEN)All code quality checks passed!$(NC)"
 
 # Validate AWS CodeBuild buildspec files
@@ -194,6 +199,29 @@ ui-build:
 	@echo "Checking UI build"
 	cd src/ui && npm ci --prefer-offline --no-audit && npm run build
 
+# Verify generated GraphQL types and operations are up-to-date
+codegen:
+	@cd src/ui && npm run codegen
+	@echo -e "$(GREEN)✅ GraphQL types regenerated. Don't forget to commit the changes.$(NC)"
+
+codegen-check:
+	@echo "Checking if GraphQL codegen output is up-to-date..."
+	@cd src/ui && npm ci --prefer-offline --no-audit && npm run codegen
+	@if ! git diff --quiet src/ui/src/graphql/generated/; then \
+		if [ -n "$$CI" ] || [ -n "$$GITHUB_ACTIONS" ]; then \
+			echo -e "$(RED)ERROR: Generated GraphQL files are out of date!$(NC)"; \
+			echo -e "$(YELLOW)Run 'make codegen' and commit the updated files.$(NC)"; \
+			git diff --stat src/ui/src/graphql/generated/; \
+			exit 1; \
+		else \
+			echo -e "$(YELLOW)Generated GraphQL files were out of date — auto-updated.$(NC)"; \
+			git diff --stat src/ui/src/graphql/generated/; \
+			echo -e "$(YELLOW)Please commit the changes above.$(NC)"; \
+		fi \
+	else \
+		echo -e "$(GREEN)✅ GraphQL codegen output is up-to-date$(NC)"; \
+	fi
+
 commit: lint test
 	$(info Generating commit message...)
 	export COMMIT_MESSAGE="$(shell kiro-cli chat --no-interactive --trust-all-tools "Understand pending local git change and changes to be committed, then infer a commit message. Return this commit message only on a single line." | grep ">" | tail -n 1 | sed 's/\x1b\[[0-9;]*m//g')" && \
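
Reviewer note: the branching in the new `codegen-check` target is easier to follow when written out as a small decision function. The sketch below mirrors the shell logic from the hunk above (fail only when running in CI, otherwise warn). The function name and the returned messages are made up for illustration and are not part of the Makefile.

```python
from typing import Mapping, Tuple


def codegen_check_outcome(has_diff: bool, env: Mapping[str, str]) -> Tuple[int, str]:
    """Return (exit_code, message) the way the codegen-check recipe branches.

    has_diff stands in for `git diff --quiet` reporting changes under
    src/ui/src/graphql/generated/ after codegen has been re-run.
    """
    if not has_diff:
        return 0, "GraphQL codegen output is up-to-date"
    # `[ -n "$CI" ] || [ -n "$GITHUB_ACTIONS" ]` is true for any non-empty value.
    if env.get("CI") or env.get("GITHUB_ACTIONS"):
        return 1, "Generated GraphQL files are out of date; run 'make codegen' and commit"
    return 0, "Generated GraphQL files were out of date and were auto-updated; please commit"


print(codegen_check_outcome(True, {"CI": "true"}))
print(codegen_check_outcome(True, {}))
```

One consequence worth noting: on a developer machine the target regenerates the files and exits successfully, so drift only blocks the build when `CI` or `GITHUB_ACTIONS` is set.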

VERSION

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-0.5.0
+0.5.1

config_library/pricing.yaml

Lines changed: 29 additions & 0 deletions
@@ -606,6 +606,35 @@ pricing:
       - name: outputTokens
         price: "2.66E-6"
 
+  - name: bedrock/us.meta.llama4-maverick-17b-instruct-v1:0
+    units:
+      - name: inputTokens
+        price: "2.4E-7"
+      - name: outputTokens
+        price: "9.7E-7"
+
+  - name: bedrock/us.meta.llama4-scout-17b-instruct-v1:0
+    units:
+      - name: inputTokens
+        price: "1.7E-7"
+      - name: outputTokens
+        price: "6.6E-7"
+
+  - name: bedrock/google.gemma-3-27b-it
+    units:
+      - name: inputTokens
+        price: "2.3E-7"
+      - name: outputTokens
+        price: "3.8E-7"
+
+  - name: bedrock/nvidia.nemotron-nano-12b-v2
+    units:
+      - name: inputTokens
+        price: "2.0E-7"
+      - name: outputTokens
+        price: "6.0E-7"
+
+
 # ---------------------------------------------------------------------------
 # AWS Lambda Pricing (US East - N. Virginia)
 # ---------------------------------------------------------------------------
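
Reviewer note: a quick way to sanity-check the new per-token entries is to price a sample request by hand. The sketch below copies two of the entries from the hunk above into a dictionary and assumes the prices are USD per token, which the diff does not state explicitly. The `estimate_cost` helper is hypothetical and is not part of the accelerator.

```python
from decimal import Decimal

# Prices copied from the pricing.yaml hunk; the unit (USD per token) is an assumption.
PRICES = {
    "bedrock/us.meta.llama4-maverick-17b-instruct-v1:0": {
        "inputTokens": "2.4E-7",
        "outputTokens": "9.7E-7",
    },
    "bedrock/us.meta.llama4-scout-17b-instruct-v1:0": {
        "inputTokens": "1.7E-7",
        "outputTokens": "6.6E-7",
    },
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Multiply token counts by the per-token prices and sum the two parts."""
    units = PRICES[model]
    # Decimal parses the scientific-notation strings exactly, avoiding float rounding.
    return (
        Decimal(units["inputTokens"]) * input_tokens
        + Decimal(units["outputTokens"]) * output_tokens
    )


cost = estimate_cost(
    "bedrock/us.meta.llama4-maverick-17b-instruct-v1:0",
    input_tokens=1_000_000,
    output_tokens=100_000,
)
print(f"Estimated cost: {cost}")
```

Under these assumptions, one million input tokens plus one hundred thousand output tokens on Llama 4 Maverick comes to 0.24 + 0.097 = 0.337.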

config_library/unified/ocr-benchmark/README.md

Lines changed: 35 additions & 3 deletions
@@ -20,14 +20,46 @@ The OCR Benchmark dataset contains diverse document types with ground truth JSON
 | **REAL_ESTATE** | Real estate transaction data | transactions[], transactionsByCity[] |
 | **SHIFT_SCHEDULE** | Employee scheduling | title, facility, employees[] with shifts |
 
+## Benchmark Results
+
+Evaluated on the full 293-document dataset using IDP Accelerator v0.5.0 (pattern-2, pipeline mode). Evaluation methods are identical across all configs for apples-to-apples comparison.
+
+| Metric | Previous Config | This Config (Nova 2 Lite) | With Sonnet 4.6 |
+|--------|----------------|---------------------------|------------------|
+| **Overall Accuracy** | 51.5% | 75.2% | 91.2% |
+| **Classification Accuracy** | 100% | 100% | 100% |
+| **Total Cost (293 docs)** | $2.60 | $2.62 | $9.73 |
+| **Cost per Document** | ~$0.009 | ~$0.009 | ~$0.033 |
+
+### Per-Class Extraction Accuracy
+
+| Class | Previous | This Config (Nova) | With Sonnet |
+|-------|----------|-------------------|-------------|
+| DELIVERY_NOTE (8) | 89.5% | 98.9% | 99.4% |
+| PETITION_FORM (51) | 74.7% | 96.7% | 98.4% |
+| COMMERCIAL_LEASE_AGREEMENT (52) | 75.5% | 96.3% | 98.5% |
+| SHIFT_SCHEDULE (18) | 68.9% | 95.7% | 96.0% |
+| REAL_ESTATE (59) | 80.6% | 91.4% | 98.9% |
+| BANK_CHECK (52) | 82.6% | 86.1% | 97.0% |
+| EQUIPMENT_INSPECTION (11) | 60.8% | 83.6% | 97.1% |
+| CREDIT_CARD_STATEMENT (11) | 53.1% | 74.7% | 82.3% |
+| GLOSSARY (31) | 68.0% | 67.3% | 95.0% |
+
+### Models Used
+
+- **Classification**: Nova 2 Lite (`us.amazon.nova-2-lite-v1:0`)
+- **Extraction**: Nova 2 Lite (`us.amazon.nova-2-lite-v1:0`)
+- **OCR**: Textract (Layout feature)
+
+To use Sonnet 4.6 for extraction, change `extraction.model` to `us.anthropic.claude-sonnet-4-6-20250929-v1:0`.
 
 ## Processing Mode
 
 **Default Mode**: Pipeline (use_bda: false). Set use_bda: true for BDA mode.
 
 ## Validation Level
 
-**Level**: 2 - Minimal Testing
+**Level**: 3 - Benchmarked
 
-- **Testing Evidence**: This configuration has been lightly tested with the RealKIE-FCC-Verified Dataset.
-- **Known Limitations**: Performance may vary - consider this configuration a starting point. We welome Pull Requests to improve the accuracy.
+- **Testing Evidence**: Evaluated on the full 293-document OmniAI OCR Benchmark dataset with per-class accuracy breakdown. Evaluation methods identical to previous config for fair comparison.
+- **Known Limitations**: GLOSSARY class has lower accuracy (67.3%) due to OCR challenges with single-digit numbers. Upgrading extraction model to Claude Sonnet 4.6 improves overall accuracy to 91.2% at higher cost.
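
Reviewer note: the "Cost per Document" row and the accuracy gain quoted in the changelog follow directly from the totals in the Benchmark Results table. The short check below reproduces that arithmetic using only figures from the README hunk. The dictionary layout and variable names are just for illustration.

```python
DOCS = 293  # full benchmark size stated in the README

# (total cost in USD, overall accuracy in percent), copied from the results table
results = {
    "previous": (2.60, 51.5),
    "nova_2_lite": (2.62, 75.2),
    "sonnet_4_6": (9.73, 91.2),
}

# Cost per document is the total cost divided by the number of documents.
cost_per_doc = {name: round(total / DOCS, 3) for name, (total, _) in results.items()}

# Accuracy gain of the new config over the previous one, in percentage points.
gain_points = round(results["nova_2_lite"][1] - results["previous"][1], 1)

print(cost_per_doc)
print(gain_points)
```

Rounded to three decimals, the per-document costs agree with the table (about $0.009, $0.009, and $0.033), and the improvement from 51.5% to 75.2% is 23.7 percentage points.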
