- Replaced Amazon SageMaker A2I (Augmented AI) with a built-in HITL review system integrated directly into the Web UI
- **Persona-Based Access Control**:
  - **Admin**: Full access to all documents, can skip reviews, release review locks, and manage users
  - **Reviewer**: Access limited to documents pending HITL review, can claim and complete section reviews
- **Review Workflow Features**:
  - Start Review button to claim document ownership and prevent concurrent edits
  - Section-level review with inline JSON editing and visual document viewer
  - Mark Section Review Complete to approve individual sections
  - Skip All Reviews (Admin only) to bypass pending reviews and continue workflow
  - Release Review to unlock document for other reviewers
- **Real-time Status Updates**: HITL Status, Review Status, Review Owner, and Reviewed By fields update in real time across all user sessions via GraphQL subscriptions
- See [Human-in-the-Loop Review Documentation](./docs/human-review.md) for detailed workflow information
- **Note**: This is Phase 1 of the HITL process updates. In upcoming phases, we are working to deliver further improvements to human review capabilities, including the ability to update document classification and extraction and to resubmit documents for incremental processing, as part of a holistic approach to human reviews.
- **User Management**
  - New User Management page for Admin users to create and manage additional Admin & Reviewer accounts
  - Cognito user groups (Admin, Reviewer) for role-based access control
  - Automatic user synchronization with Cognito

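The claim, release, and skip rules listed above can be pictured with a small in-memory model. This is only an illustrative sketch: `ReviewDocument` and the four helper functions are hypothetical names, and the assumption that a reviewer may release only their own lock (while an Admin may release any lock) is inferred from the persona descriptions, not taken from the implementation.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class ReviewDocument:
    """Hypothetical stand-in for a document that is pending HITL review."""
    pending_sections: Set[str]
    review_owner: Optional[str] = None
    reviewed_by: Set[str] = field(default_factory=set)


def start_review(doc: ReviewDocument, user: str) -> None:
    """Claim the document so that no other user can edit it concurrently."""
    if doc.review_owner not in (None, user):
        raise PermissionError(f"document already claimed by {doc.review_owner}")
    doc.review_owner = user


def complete_section(doc: ReviewDocument, user: str, section: str) -> None:
    """Mark one section as reviewed; only the current owner may do this."""
    if doc.review_owner != user:
        raise PermissionError("claim the document with start_review first")
    doc.pending_sections.discard(section)
    doc.reviewed_by.add(user)


def release_review(doc: ReviewDocument, user: str, role: str) -> None:
    """Unlock the document (assumption: owner or any Admin may release)."""
    if doc.review_owner != user and role != "Admin":
        raise PermissionError("only the owner or an Admin can release a review")
    doc.review_owner = None


def skip_all_reviews(doc: ReviewDocument, role: str) -> None:
    """Bypass every pending section review; Admin only."""
    if role != "Admin":
        raise PermissionError("Skip All Reviews is restricted to Admin users")
    doc.pending_sections.clear()
    doc.review_owner = None
```

In this model a second `start_review` call by a different user raises `PermissionError` until the owner or an Admin calls `release_review`.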
- **RVL-CDIP-N-MP-Packets Test Set Auto-Deployment**
  - Automatically deploys 500 multi-page packet PDFs from HuggingFace dataset (https://huggingface.co/datasets/jordyvl/rvl_cdip_n_mp) during stack deployment
  - **Multi-Document Packets**: Each of 500 packets contains 2-10 distinct subdocuments of different types for comprehensive splitting and classification testing
  - **Packet Statistics**: 7,330 total pages across 2,027 document sections with average of 14.7 pages and 4.1 sections per packet
  - **Ground Truth Included**: Page-level classification and document boundary information for each packet. Extraction ground truth is not included.
  - **Evaluation Capabilities**: Enables testing of page-level classification accuracy, document splitting accuracy, and split order preservation. Does NOT enable testing of extraction accuracy since there is no extraction ground truth for this data set
  - Test set available in Test Studio UI alongside RealKIE-FCC-Verified and OmniAI-OCR-Benchmark datasets
  - Corresponding configs available in Configuration Library
  - Ideal for evaluating document splitting and classification accuracy in complex multi-document scenarios

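The per-packet averages in the packet statistics follow directly from the stated totals. A quick arithmetic check, using only the numbers given in this entry:

```python
# Totals quoted in the changelog entry for the RVL-CDIP-N-MP-Packets test set.
TOTAL_PACKETS = 500
TOTAL_PAGES = 7330
TOTAL_SECTIONS = 2027

pages_per_packet = TOTAL_PAGES / TOTAL_PACKETS        # 14.66
sections_per_packet = TOTAL_SECTIONS / TOTAL_PACKETS  # 4.054

print(f"{pages_per_packet:.1f} pages per packet")        # 14.7 pages per packet
print(f"{sections_per_packet:.1f} sections per packet")  # 4.1 sections per packet
```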
### Changed

- **HITL Configuration**
  - HITL is now disabled by default in the configuration
  - Users must explicitly enable HITL in the Configuration page (Assessment & HITL Configuration section) to trigger human review workflows
  - Removed `EnableHITL` and `PrivateWorkteamArn` CloudFormation parameters

### Changed

- **Lambda Layers Architecture for Improved Build Efficiency**
  - Replaced bundled `idp_common` package dependencies in individual Lambda functions with three shared Lambda Layers
  - **Three Specialized Layers**:
    - `base` layer: Core functionality with docs_service and image extras
    - `reporting` layer: Reporting and analytics dependencies
    - `agents` layer: Agent-related dependencies
  - **Key Benefits**:
    - Reduced SAM build times by eliminating redundant dependency installation across 50+ Lambda functions
    - Layer content-based hashing ensures layers are only rebuilt when actual contents change
    - Automatic removal of Lambda runtime packages (boto3, botocore, etc.) reduces layer sizes by ~100MB
    - Layer zips cached locally and in S3, skipping uploads when content hasn't changed
  - **Build System Integration**: `publish.py` automatically builds, hashes, and uploads layers before SAM builds

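Content-based hashing, as mentioned under Key Benefits, can be sketched as follows. This is a minimal illustration and not the code in `publish.py`: the function names, the choice of SHA-256, and the marker-file approach are all assumptions.

```python
import hashlib
from pathlib import Path


def hash_directory(root: Path) -> str:
    """Return a SHA-256 digest over relative file paths and file contents."""
    digest = hashlib.sha256()
    # Sort so that the digest does not depend on filesystem listing order.
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


def needs_rebuild(root: Path, marker: Path) -> bool:
    """Compare the current digest with the one stored in `marker`.

    `marker` must live outside `root`, otherwise writing it would change
    the digest on the next run.
    """
    current = hash_directory(root)
    previous = marker.read_text().strip() if marker.exists() else None
    if current == previous:
        return False  # contents unchanged, the cached layer zip can be reused
    marker.write_text(current)
    return True
```

Because the digest depends only on paths and bytes, touching a file without changing it does not trigger a rebuild, unlike a timestamp-based check.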
- **Enhanced publish.py Performance and Logging**
  - **Consistent Logging Helpers**: Added 8 standardized logging methods (including `log_phase`, `log_task`, `log_detail`, `log_success`, `log_cached`, `log_warning`, `log_error`) for uniform output formatting with colored icons and thread prefixes
  - **Timed S3 Uploads**: Added `upload_to_s3_with_timer()` helper with spinner animation, elapsed time display, and optimized `TransferConfig` for multi-threaded multipart uploads
  - **AWS CLI Config Library Sync**: Replaced boto3 ThreadPoolExecutor-based config library upload (~60 lines) with `aws s3 sync` command for built-in concurrency, delta sync (skip unchanged files), and simpler code
  - **Timing Breakdown Summary**: End-of-build summary shows the 4 most time-consuming steps and their percentages of total build time for build optimization insights
  - **Phase Headers**: Major build phases now display with clear `═══` separator lines and emojis for visual clarity

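A timing breakdown of the kind described above can be produced from a plain mapping of step names to durations. The sketch below is illustrative only: `timing_summary`, the step names, and the output format are hypothetical and not taken from `publish.py`.

```python
def timing_summary(timings: dict[str, float], top_n: int = 4) -> list[str]:
    """Return the `top_n` slowest steps with their share of the total time."""
    total = sum(timings.values())
    if total <= 0:
        return []
    ranked = sorted(timings.items(), key=lambda item: item[1], reverse=True)
    return [
        f"{name}: {seconds:.1f}s ({seconds / total:.0%})"
        for name, seconds in ranked[:top_n]
    ]


# Hypothetical step durations in seconds.
example = {
    "sam build": 50.0,
    "layers": 30.0,
    "s3 upload": 15.0,
    "config sync": 4.0,
    "validation": 1.0,
}
for line in timing_summary(example):
    print(line)
```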
- **AppSync Resolvers Extracted to Nested Stack for Improved Template Modularity**
  - Refactored main CloudFormation template by extracting 130 AppSync resources into new nested stack architecture
  - **Extracted Components**:
    - Created `nested/appsync/template.yaml` containing GraphQLSchema, AppSyncServiceRole, Lambda resolver functions, LogGroups, DataSources, and Resolvers
    - Moved related Lambda functions from `src/lambda/` to `nested/appsync/src/lambda/` with colocated template definitions
    - Relocated GraphQL schema from `src/api/` to `nested/appsync/src/api/`
  - **Main Template Optimization**: Reduced resource count by keeping only core infrastructure (GraphQLApi, GraphQLApiLogGroup, AppSyncCwlRole, WAF resources, background worker functions)
  - **Build System Integration**: Updated `publish.py` to build nested stack in parallel with patterns
  - **Impact**: Main template now more manageable and faster to navigate, nested stack enables modular development of AppSync resources, parallel builds reduce overall build time
  - Moved `options/bda-lending-project` and `options/bedrockkb` into `nested/` directory for simplified project organization
  - All CloudFormation nested stacks now located in single `nested/` directory alongside `appsync`, `bda-lending-project`, and `bedrockkb`
  - Updated build system to build only two categories concurrently (nested + patterns) instead of three (nested + patterns + options)
  - **Breaking Change**: Directory paths changed - `options/` → `nested/`. Existing work-in-progress branches will have merge conflicts in directory structure.

### Fixed

- **Fixed page_indices Reset Bug in Multi-Section Documents**
  - Fixed an issue where every section in a document packet had `page_indices` starting from 0 instead of reflecting its actual position in the original document. Indices are now pre-calculated during classification, which has access to the global minimum page ID, and stored in `section.attributes` for the extraction step to use.

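The idea behind the `page_indices` fix can be shown with a short sketch. The dictionary shape (`page_ids`, `attributes`) and the function name are assumptions made for illustration; only the approach (offsets relative to the global minimum page ID, stored on the section) comes from the entry above.

```python
def assign_page_indices(sections: list[dict]) -> None:
    """Store document-relative page indices on each section.

    Offsets are computed against the minimum page ID across ALL sections,
    so a later section keeps its real position instead of restarting at 0.
    """
    all_ids = [int(pid) for section in sections for pid in section["page_ids"]]
    if not all_ids:
        return
    global_min = min(all_ids)
    for section in sections:
        indices = [int(pid) - global_min for pid in section["page_ids"]]
        section.setdefault("attributes", {})["page_indices"] = indices


# Hypothetical packet with two sections and 1-based page IDs.
packet = [
    {"page_ids": ["1", "2"]},
    {"page_ids": ["3", "4", "5"]},
]
assign_page_indices(packet)
print(packet[1]["attributes"]["page_indices"])  # [2, 3, 4]
```

Computing the offset per section (against each section's own minimum) would reproduce the bug: the second section would come out as `[0, 1, 2]`.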
- **Metering Table Added Requests**
  - Added requests count to Bedrock metering data to track API request metrics

- **IDP CLI Stack Parameter Preservation During Updates**
  - Fixed bug where `idp-cli deploy` command was resetting ALL stack parameters to their default values during updates, even when users only intended to change specific parameters

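One common way to avoid resetting untouched parameters during a CloudFormation stack update is to pass `UsePreviousValue` for every parameter the user did not override. The sketch below shows that pattern under stated assumptions: `build_update_parameters` is a hypothetical helper, `LogLevel` is a made-up parameter name, and this is not necessarily how `idp-cli` implements the fix.

```python
def build_update_parameters(existing_keys: list[str], overrides: dict[str, str]) -> list[dict]:
    """Build a CloudFormation `Parameters` list for a stack update.

    Parameters the user overrides get a new value; every other existing
    parameter keeps its current value through `UsePreviousValue`.
    """
    params = []
    for key in existing_keys:
        if key in overrides:
            params.append({"ParameterKey": key, "ParameterValue": str(overrides[key])})
        else:
            params.append({"ParameterKey": key, "UsePreviousValue": True})
    # Overrides for parameters that are new in the updated template.
    for key, value in overrides.items():
        if key not in existing_keys:
            params.append({"ParameterKey": key, "ParameterValue": str(value)})
    return params


# `LogLevel` is a hypothetical parameter used only for this example.
params = build_update_parameters(
    existing_keys=["Pattern2Configuration", "LogLevel"],
    overrides={"LogLevel": "DEBUG"},
)
print(params)
```

The resulting list has the shape expected by the `Parameters` argument of the CloudFormation `UpdateStack` API.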
### Upgrade Notes

- **⚠️ IMPORTANT: Upgrading from v0.4.11 or earlier**
  - **Complete all pending HITL workflows before upgrading**: Any documents waiting in SageMaker A2I human review loops will be orphaned as A2I resources are deleted during the upgrade
  - **Re-enable HITL after upgrade**: If you previously set the `EnableHITL=true` CloudFormation parameter, you must now enable HITL through the Configuration page in the Web UI (Assessment & HITL Configuration → Enable HITL)
  - **User migration**: Existing Cognito users will need to be assigned to Admin or Reviewer groups for HITL access

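For the user migration step, group assignment can be scripted against the Cognito user pool. The following is a sketch under assumptions: `assign_groups` is a hypothetical helper, the group names `Admin` and `Reviewer` are taken from this changelog and may differ in a deployed pool, and the client is expected to be a boto3 `cognito-idp` client (whose `admin_add_user_to_group` call is used here).

```python
VALID_GROUPS = {"Admin", "Reviewer"}  # assumed to match the deployed group names


def assign_groups(cognito_client, user_pool_id: str, assignments: dict[str, str]) -> int:
    """Add each user to the given group and return the number of assignments.

    `cognito_client` is expected to behave like boto3.client("cognito-idp").
    """
    for username, group in assignments.items():
        if group not in VALID_GROUPS:
            raise ValueError(f"unknown group {group!r} for user {username!r}")
    for username, group in assignments.items():
        cognito_client.admin_add_user_to_group(
            UserPoolId=user_pool_id,
            Username=username,
            GroupName=group,
        )
    return len(assignments)


# Example usage (requires boto3 and AWS credentials; values are placeholders):
#   import boto3
#   client = boto3.client("cognito-idp")
#   assign_groups(client, "<user-pool-id>", {"alice@example.com": "Reviewer"})
```

All group names are validated before the first API call, so a typo does not leave the pool in a partially migrated state.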
- **Lending Package Configuration Support for Pattern-2**
  - Added new `lending-package-sample` configuration to Pattern-2, providing comprehensive support for lending and financial document processing workflows
  - New default configuration for Pattern-2 stack deployments, optimized for loan applications, mortgage processing, and financial verification documents
-  - Previous `rvl-cdip-sample` configuration remains available by selecting `rvl-cdip-package-sample` for the `Pattern2Configuration` parameter when deploying or updating stacks
+  - Previous `rvl-cdip-sample` configuration remains available by selecting `rvl-cdip` for the `Pattern2Configuration` parameter when deploying or updating stacks

- **Text Confidence View for Document Pages**
  - Added support for displaying OCR text confidence data through new `TextConfidenceUri` field