Releases: aws-solutions-library-samples/accelerated-intelligent-document-processing-on-aws
v0.5.9
[0.5.9]
Added
- Policy Discovery & Rule Validation Policy Classification: Upload a regulatory document (e.g., an NCCI Medicare policy manual) and automatically extract structured validation rules from it. A new "Policy Discovery" tab in the Discovery page walks you through the process, and the extracted rules feed directly into the rule validation workflow.
  - A new policy classification step runs before rule validation, matching each document against your configured `policy_classes` using regex patterns on document names and page content. Only matching policy rules are evaluated, so unrelated rules are skipped automatically.
  - The configuration key `rule_classes` has been renamed to `policy_classes` for clarity. Existing configs will need to update this key.
  - The Schema Builder now has dedicated support for editing policy classes with policy-specific labels, and extraction-only settings are hidden when editing policy schemas.
  - A "Policy Discovery" section has been added to Discovery Configuration in the UI, letting you choose the model, temperature, and prompts used for Policy Discovery.
  - The legacy `rule-extraction` configuration preset has been removed. Use Policy Discovery on the Discovery tab instead; it writes extracted rules directly into the active config's `policy_classes`.
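The key rename and the regex-based policy classification step can be sketched as follows. This is a minimal illustration, not the accelerator's implementation: the shape of each `policy_classes` entry (`name`, `name_patterns`, `content_patterns`) and both helper names are assumptions made for the example. Only the `rule_classes` to `policy_classes` rename and the matching on document names and page content come from the notes above.

```python
import re


def migrate_config(config: dict) -> dict:
    """Return a copy of config with the legacy rule_classes key renamed to policy_classes."""
    migrated = dict(config)
    if "rule_classes" in migrated and "policy_classes" not in migrated:
        migrated["policy_classes"] = migrated.pop("rule_classes")
    return migrated


def matching_policy_classes(config: dict, document_name: str, pages: list) -> list:
    """Names of policy classes whose patterns match the document name or any page text.

    The entry fields used here are hypothetical, chosen only for illustration.
    """
    matches = []
    for policy in config.get("policy_classes", []):
        name_hit = any(
            re.search(pattern, document_name, re.IGNORECASE)
            for pattern in policy.get("name_patterns", [])
        )
        content_hit = any(
            re.search(pattern, page, re.IGNORECASE)
            for pattern in policy.get("content_patterns", [])
            for page in pages
        )
        if name_hit or content_hit:
            matches.append(policy["name"])
    return matches


legacy = {
    "rule_classes": [
        {"name": "ncci_medicare", "name_patterns": [r"ncci"], "content_patterns": [r"medicare\s+policy"]},
        {"name": "lending", "name_patterns": [r"loan"], "content_patterns": [r"mortgage"]},
    ]
}
config = migrate_config(legacy)
print(matching_policy_classes(config, "claim-0042.pdf", ["Subject to Medicare policy, chapter 1."]))
```

Rules attached to classes that are not returned would simply be skipped by the validation step.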
- Document-level Download button on the Document Details page: A new Download dropdown in the Document Details header lets users pull every output artifact for a document in a single click, packaged as a ZIP. Three scopes are offered:
  - Download All (ZIP): document attributes, metering, summary, evaluation & rule-validation reports, per-section predictions, baselines (when available), per-page text/confidence, and optionally per-page images and/or the source document (checkboxes).
  - Download Predictions (ZIP): all section result JSONs plus a self-describing `manifest.json`.
  - Download Baselines (ZIP): all baseline section result JSONs (shown only when an evaluation baseline is available).
  - Bucket-mirrored ZIP layout: files are organised under top-level `output/`, `baseline/`, and `input/` folders that preserve the real S3 key structure, so the archive can be diffed with a direct `aws s3 sync` of the same buckets.
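To make the bucket-mirrored layout concrete, here is a small sketch that packs objects into a ZIP under the three top-level folders. It works on in-memory bytes instead of reading from S3, and the helper name and the sample S3 keys are hypothetical.

```python
import io
import zipfile

ALLOWED_FOLDERS = {"output", "baseline", "input"}


def build_mirrored_zip(objects: dict) -> bytes:
    """Pack {(folder, s3_key): body} into a ZIP laid out as <folder>/<s3_key>."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for (folder, key), body in sorted(objects.items()):
            if folder not in ALLOWED_FOLDERS:
                raise ValueError(f"unexpected top-level folder: {folder}")
            # Keep the S3 key untouched below the folder so paths line up with the bucket.
            archive.writestr(f"{folder}/{key}", body)
    return buffer.getvalue()


# Hypothetical keys, for illustration only.
objects = {
    ("output", "doc-1.pdf/sections/1/result.json"): b'{"fields": {}}',
    ("baseline", "doc-1.pdf/sections/1/result.json"): b'{"fields": {}}',
    ("input", "doc-1.pdf"): b"%PDF-",
}
names = zipfile.ZipFile(io.BytesIO(build_mirrored_zip(objects))).namelist()
print(names)
```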
- Headless REST API mode with VPC-secured deployment for GovCloud: a first-party Jobs REST API for programmatic document submission and status tracking, plus an optional VPC-secured deployment that keeps the API off the public internet. Makes end-to-end GovCloud deployment viable without the UI/AppSync stack, and gives Commercial customers a supported alternative to direct S3 uploads for machine-to-machine integrations.
  - Jobs REST API (new `src/lambda/api_handler/`, `src/lambda/job_tracker/`, `src/lambda/batch_pre_processor/`):
    - `POST /jobs`: creates a job record and returns a presigned POST URL for the input zip (1-hour expiry, content-type pinned to `application/zip`, 5 GB content-length cap).
    - `GET /jobs/{job_id}`: returns overall status (`PENDING_UPLOAD` / `IN_PROGRESS` / `SUCCEEDED` / `PARTIALLY_SUCCEEDED` / `FAILED` / `ABORTED`), per-file status map, and, on success, a presigned GET URL for `results.zip`. `SUCCEEDED` is gated on `results.zip` actually being present in the output bucket to avoid racing callers into a 404.
    - OAuth2 `client_credentials` auth via a dedicated Cognito User Pool + Resource Server (`idp-api/jobs.read`, `idp-api/jobs.write` scopes). Separate from the existing web-UI Cognito pool.
    - Per-client job ownership (M1): each job records its creating Cognito principal (`sub` / `client_id`) as `CreatedBy`. `GET /jobs/{job_id}` returns HTTP 404 (not 403, to avoid existence-leak) when the caller's principal doesn't match the job's owner. Legacy job records written before this field existed remain readable by any authenticated caller. Behavior change: `GET /jobs/{job_id}` on a non-existent job now correctly returns 404; previously returned 400 (a pre-existing response-code bug in the API handler).
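The ownership and status-gating rules for `GET /jobs/{job_id}` can be summarised in a few lines. The sketch below is not the handler's code: the function name, the `status` field, and the choice to report `IN_PROGRESS` while `results.zip` is still missing are assumptions. The 404 behaviour, the `CreatedBy` field, and the legacy-record exception follow the description above.

```python
def get_job_response(job, caller_principal, results_zip_exists):
    """Return (http_status, body) for a job lookup under the rules described above."""
    if job is None:
        return 404, {"message": "Job not found"}

    owner = job.get("CreatedBy")
    # Legacy records without CreatedBy stay readable by any authenticated caller.
    if owner is not None and owner != caller_principal:
        # 404 rather than 403, so callers cannot probe for other clients' job IDs.
        return 404, {"message": "Job not found"}

    status = job["status"]
    if status == "SUCCEEDED" and not results_zip_exists:
        # Assumption: keep reporting IN_PROGRESS until results.zip is downloadable.
        status = "IN_PROGRESS"
    return 200, {"status": status}


job = {"status": "SUCCEEDED", "CreatedBy": "client-a"}
print(get_job_response(job, "client-a", results_zip_exists=True))
print(get_job_response(job, "client-b", results_zip_exists=True))
```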
  - Private API Gateway + bastion tunneling:
    - `AWS::Serverless::Api` with `EndpointConfiguration: PRIVATE` bound to a customer-supplied `ApiGatewayVpcEndpointId` and a resource policy that denies all traffic not originating from that VPC endpoint.
    - Optional `DeployBastionHost=true` spins up an SSM-reachable `t3.small` EC2 with IMDSv2 required, encrypted EBS via a dedicated rotating KMS key, and no inbound SSH.
    - `scripts/bastion.sh <STACK_NAME>` sets up a local SSH tunnel for dev-time API access; `scripts/get_api_token.sh <STACK_NAME>` fetches an OAuth2 bearer token.
  - Safe zip extraction in `batch_pre_processor` (M2 + M3):
    - `MAX_UNCOMPRESSED_BYTES` (default 20 GiB, env-configurable) and `MAX_ENTRIES` (default 10,000) bounds checked pre-flight before any uploads begin. Bound violations write a terminal `FAILED` marker to the job record so the API surfaces the failure.
    - Per-entry streaming via `zipfile.ZipFile.open()` + `s3.upload_fileobj()`: no more loading whole entries into Lambda memory.
    - Per-entry failure isolation: one bad file is marked `FAILED` and the rest of the batch still uploads and advances through the pipeline; the job converges to `PARTIALLY_SUCCEEDED` / `FAILED` / `SUCCEEDED` as appropriate.
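A rough sketch of the pre-flight bounds check and per-entry streaming follows. It is an illustration under assumptions, not the `batch_pre_processor` source: the helper names, reading both limits from environment variables of the same name, and the `fake_upload` stand-in for `s3.upload_fileobj()` are all hypothetical.

```python
import io
import os
import zipfile

# Defaults follow the release notes; the environment variable names are assumptions.
MAX_UNCOMPRESSED_BYTES = int(os.environ.get("MAX_UNCOMPRESSED_BYTES", 20 * 1024**3))
MAX_ENTRIES = int(os.environ.get("MAX_ENTRIES", 10_000))


class ZipBoundsError(Exception):
    """Raised when an archive exceeds the pre-flight limits."""


def preflight(archive, max_bytes=MAX_UNCOMPRESSED_BYTES, max_entries=MAX_ENTRIES):
    """Check entry count and declared uncompressed size before extracting anything."""
    entries = [info for info in archive.infolist() if not info.is_dir()]
    if len(entries) > max_entries:
        raise ZipBoundsError(f"{len(entries)} entries exceeds limit of {max_entries}")
    total = sum(info.file_size for info in entries)
    if total > max_bytes:
        raise ZipBoundsError(f"{total} uncompressed bytes exceeds limit of {max_bytes}")
    return entries


def upload_entries(archive, entries, upload_fileobj):
    """Stream each entry to upload_fileobj(stream, key); one failure does not stop the rest."""
    statuses = {}
    for info in entries:
        try:
            with archive.open(info) as stream:  # file-like object, decompressed lazily
                upload_fileobj(stream, info.filename)
            statuses[info.filename] = "SUCCEEDED"
        except Exception:
            statuses[info.filename] = "FAILED"
    return statuses


def fake_upload(stream, key):
    """Stand-in for an S3 upload: drains the stream in chunks, fails for one key."""
    if key == "bad.pdf":
        raise IOError("simulated upload failure")
    while stream.read(64 * 1024):
        pass


buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as writer:
    writer.writestr("good.pdf", b"x" * 1000)
    writer.writestr("bad.pdf", b"y" * 10)

with zipfile.ZipFile(buffer) as archive:
    statuses = upload_entries(archive, preflight(archive), fake_upload)
print(statuses)
```

With one entry failed and one succeeded, the job in this example would converge to `PARTIALLY_SUCCEEDED`.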
  - New CFN parameters (all default to off/empty, fully backward-compatible):
    - `EnableHeadless` (bool): turns on the Jobs REST API.
    - `DeployInVPC` (bool): places all IDP Lambdas in customer-supplied private subnets with a customer-supplied security group.
    - `VpcId`, `PrivateSubnetIds`, `ApiGatewayVpcEndpointId`, `LambdaSecurityGroupId`, `ApiStageName`: customer-supplied networking.
    - `DeployBastionHost`, `BastionHostSubnetId`, `BastionHostSecurityGroupId`: optional dev-access bastion.
    - CloudFormation console UX: the 11 new parameters are grouped into two dedicated `AWS::CloudFormation::Interface` sections ("Headless API Deployment (required for GovCloud)" and "Headless API Deployment - Bastion Host (optional, requires VPC Secured Mode)") with friendlier `ParameterLabels` and rewritten `Description` text. Each description now explicitly states when the parameter is required, what the default behavior is (no Jobs API / no Lambda VPC placement / no bastion EC2 unless explicitly enabled), and which companion parameters it depends on. This ensures Quick-Start users who click the README's "Launch Stack" button see clear opt-in sections rather than assuming the bastion host or Jobs API is always deployed.
  - CFN fail-fast validation (H1): new `Rules:` block entries catch misconfiguration at stack create / update time with clear `AssertDescription` errors, instead of failing deep in resource provisioning:
    - `HeadlessRequiresVPC`: `EnableHeadless=true` requires `DeployInVPC=true` + non-empty `VpcId` / `ApiGatewayVpcEndpointId` / `LambdaSecurityGroupId`.
    - `BastionRequiresVPC`: `DeployBastionHost=true` requires `DeployInVPC=true` + non-empty bastion subnet / SG.
    - Plus defense-in-depth on the two API-gated Lambdas: `VpcConfig` is wrapped in `!If [DeployInVPC, …, AWS::NoValue]` so even if the Rules block is ever relaxed, the Lambdas won't fail to create on empty `!Ref` values.
  - CLI (`idp-cli`):
    - `--headless` now auto-sets the `EnableHeadless=true` stack parameter; they were always used together.
    - `idp-cli deploy --headless --from-code . --stack-name <NEW>` no longer requires `--admin-email`. The headless template strips the UI Cognito pool and has no `AdminEmail` parameter; passing it through produced `ValidationError: Parameters: [AdminEmail] do not exist in the template`. The parameter is now skipped and dropped with a note. Non-headless new-stack creation still requires `--admin-email`.
  - Publish pipeline fixes that make headless-to-GovCloud deploys work:
    - `cfn-lint` in headless mode now lints `idp-headless.yaml` and skips commercial-only templates (`idp-main.yaml`, `nested/appsync`), which contain `AWS::AppSync::*` / `AWS::CloudFront::*` resources that don't exist in `us-gov-*` regions. Fixes `E3006 Resource type … does not exist`.
    - E/W classification in `_validate_cfn_lint` now uses `^E\d{4}` / `^W\d{4}` regex anchors. Previously the substring `":E"` also matched resource prefixes like `AWS::EC2::`, inflating warning-severity lines to errors.
    - `WorkflowStateChangeRule` JobTracker target moved from a conditional `Arn` field (flagged `E3003 'Arn' is a required property`) to a conditional full-target dict via `!If`.
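The E/W misclassification is easy to reproduce. The sketch below contrasts the old substring test with the anchored patterns; the sample lint lines are made up for illustration, and it assumes each line starts with the rule ID, which is what the `^` anchors imply.

```python
import re

ERROR_RE = re.compile(r"^E\d{4}")
WARNING_RE = re.compile(r"^W\d{4}")


def old_is_error(line: str) -> bool:
    """Previous behaviour as described: any ':E' substring counts as an error."""
    return ":E" in line


def severity(line: str) -> str:
    """Anchored classification: only a leading rule ID decides the severity."""
    if ERROR_RE.match(line):
        return "error"
    if WARNING_RE.match(line):
        return "warning"
    return "other"


# Hypothetical lint output lines.
warning_line = "W3005 Obsolete DependsOn on resource AWS::EC2::Instance"
error_line = "E3003 'Arn' is a required property"

print(old_is_error(warning_line), severity(warning_line))
print(severity(error_line))
```

The warning line trips the substring check because `AWS::EC2::` contains `:E`, while the anchored pattern classifies it correctly.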
  - Documentation:
    - New `docs/govcloud-batch-api.md`: REST API reference with schemas, OAuth flow, bastion tunneling setup, and an Authorization model section covering per-client ownership and multi-client behavior.
    - New `docs/govcloud-architecture.md`, `docs/govcloud-operations.md`, `docs/vpc-secured-mode.md`.
    - Overhauled `docs/govcloud-deployment.md` with a deployment-variant matrix (Vanilla / Headless API / Headless + VPC / Headless + VPC + Bastion).
  - End-to-end test script: `scripts/e2e_test_headless.py <STACK_NAME> <PATH_TO_FILE>` exercises the full flow (OAuth → POST /jobs → presigned upload → status poll → download results).
- Managed configuration upload rejection: `idp-cli config upload` now rejects configuration files with `managed: true` to prevent users from accidentally creating stack-managed configurations that would be overwritten on stack updates. All user-uploaded configurations automatically have `managed: false` set, ensuring they persist across stack lifecycle events.
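The upload rule amounts to a small guard. A minimal sketch, assuming the configuration has already been parsed into a dictionary; the function and exception names are hypothetical and not part of `idp-cli`.

```python
class ManagedConfigError(ValueError):
    """Raised when a user tries to upload a stack-managed configuration."""


def prepare_upload(config: dict) -> dict:
    """Reject managed configs and force managed=False on everything else."""
    if config.get("managed") is True:
        raise ManagedConfigError(
            "Configurations with 'managed: true' are overwritten on stack updates; "
            "remove the flag before uploading."
        )
    prepared = dict(config)
    prepared["managed"] = False
    return prepared


print(prepare_upload({"classes": []}))
```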
Fixed
- Evaluation markdown/report rendering resilience — two defensive fixes that keep evaluation and test-results pages from crashing when upstream data is non-numeric or empty.
Security
Hardening response to security review - Highlights:
- **Stored XSS defense-in-depth (fronten...
v0.5.8
[0.5.8]
Added
- Excluded-class feature (skip static instruction / legal / boilerplate pages): Government forms and similar packages often bundle static informational pages (legal warnings, fee instructions, tax notices, oaths) alongside the pages that carry applicant data. Mark a document class with `x-aws-idp-exclude-from-processing: true` and all downstream stages (extraction, assessment, summarization, rule validation, evaluation) skip sections classified as that class, making zero LLM calls on boilerplate pages.
  - Optional `x-aws-idp-exclusion-reason` ("instructions", "legal", "cover-page", …) surfaces as a grey `Skipped: <reason>` badge in the UI Sections panel and as an "Excluded Sections (Not Evaluated)" table in the evaluation markdown report.
  - Configurable via the UI Configuration Editor → Document Schema → select a document-type class → "Exclude from Processing" checkbox + "Exclusion Reason" input.
  - New end-to-end sample config at `config_library/unified/ds11-passport-application/` with a matching DS-11 U.S. Passport Application PDF fixture and a standalone demo notebook (`notebooks/usecase-specific-examples/ds11-passport-application/`).
  - Additive: classes without the new flag behave exactly as before.
  - See `docs/classification.md#excluding-static-pages-eg-instructions-legal-boilerplate`.
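As an illustration of the excluded-class feature, the sketch below splits classified sections into those to process and those to skip. The two `x-aws-idp-*` keys come from the notes above; the use of `$id` as the class name, the section dictionaries, and the helper name are assumptions for the example.

```python
EXCLUDE_FLAG = "x-aws-idp-exclude-from-processing"
REASON_KEY = "x-aws-idp-exclusion-reason"

# Hypothetical class schemas; only the two x-aws-idp-* keys are taken from the release notes.
classes = [
    {"$id": "PassportApplication", "type": "object", "properties": {"applicant_name": {"type": "string"}}},
    {"$id": "Instructions", "type": "object", EXCLUDE_FLAG: True, REASON_KEY: "instructions"},
]

sections = [
    {"section_id": "1", "classification": "Instructions"},
    {"section_id": "2", "classification": "PassportApplication"},
]


def split_sections(sections, classes):
    """Return (section IDs to process, [(section ID, reason)] for skipped sections)."""
    by_name = {schema["$id"]: schema for schema in classes}
    to_process, skipped = [], []
    for section in sections:
        schema = by_name.get(section["classification"], {})
        if schema.get(EXCLUDE_FLAG) is True:
            skipped.append((section["section_id"], schema.get(REASON_KEY)))
        else:
            to_process.append(section["section_id"])
    return to_process, skipped


print(split_sections(sections, classes))
```

Classes without the flag fall through to normal processing, which matches the additive behaviour described above.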
Changed
- UI dependency cleanup (eliminated 11 of 12 npm deprecation warnings): Replaced deprecated `@aws-sdk/*` packages with `@smithy/*` equivalents, removed unused Babel plugins, migrated ESLint 8→9 (flat config), upgraded Prettier 2→3, and upgraded jsdom 26→29. Added `"type": "module"` to `package.json`. Also added `caughtErrors: 'none'` to ESLint config to stop flagging unused catch clause variables. Added `FORCE=1` arg to `make ui-lint` to force re-run despite checksum match.
- Headless deployment documentation generalized: headless mode is no longer documented as a GovCloud-only capability. New `docs/headless-deployment.md` is the canonical guide covering headless deployment for both Commercial and GovCloud regions (API-only / pipeline integrations, organizational restrictions on UI-layer services, cost optimization, and GovCloud, where headless mode is required).
Templates
- us-west-2: https://s3.us-west-2.amazonaws.com/aws-ml-blog-us-west-2/artifacts/genai-idp/idp-main_0.5.8.yaml
- us-east-1: https://s3.us-east-1.amazonaws.com/aws-ml-blog-us-east-1/artifacts/genai-idp/idp-main_0.5.8.yaml
- eu-central-1: https://s3.eu-central-1.amazonaws.com/aws-ml-blog-eu-central-1/artifacts/genai-idp/idp-main_0.5.8.yaml
v0.5.7
[0.5.7]
Added
- Claude Opus 4.7 Model Support: Added `anthropic.claude-opus-4-7` (and `:1m` context variant) across all `us`, `eu`, and `global` inference profiles. Includes unified template enums, UI model dropdowns, cachepoint support, EU region mappings, pricing entries, and documentation updates.
- Add Documents to Existing Test Sets: New "Add Documents" action in Test Studio allows incrementally adding documents (with ground truth) to an existing test set. Supports both "From Existing Files" (S3 pattern) and "From Upload" (ZIP) sources. Key features:
  - Automatic baseline filtering: When using the Input Bucket, files without matching baseline/ground truth data are automatically excluded rather than failing the operation, with a result message reporting counts (e.g., "Added 8 of 12 files (4 excluded - no baseline data)")
  - Time filter: Optional "Modified after" filter with presets (Last 1 hour, 4 hours, 24 hours, 7 days, 30 days) and a custom date/time picker, available in both new test set creation and add-documents flows
  - Idempotent: Re-adding an existing document overwrites it; file counts are always recounted from S3 for accuracy
  - UPDATING status: Test sets show a transient "Updating..." badge while documents are being added
- Creating Custom Test Sets Guide: New tutorial-style documentation (`docs/creating-custom-test-sets.md`) walking through the end-to-end workflow for creating custom test sets with ground truth data from scratch: configure for max accuracy, discover document schema, process samples, review/edit predictions, save evaluation baselines, register test sets, and run comparative test executions to evaluate cost vs. accuracy tradeoffs. Referenced from `docs/demo-videos.md`.
- Configuration Version Tracking Across All Analytics Tables: Added `config_version` field to all analytics tables (metering, document_evaluations, section_evaluations, attribute_evaluations, and document_sections_*) to enable comprehensive tracking and analytics per configuration version. All Glue tables now include a `config_version` column, and all Parquet files store the configuration version used for each document. Enables direct filtering and comparison queries without complex JOINs: users can query "show me W2 documents processed with config v2.1" or "compare accuracy for configs v2.0 vs v2.1" with simple WHERE clauses. Supports cost analysis, A/B testing, quality comparison, and data lineage tracking. Documents without a config version default to "default".
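The kind of query this enables can be tried locally. The sketch below uses an in-memory SQLite table as a stand-in for the Athena/Glue tables; apart from `config_version` and the `document_evaluations` table name, the columns and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical column set; only config_version is taken from the release notes.
conn.execute(
    "CREATE TABLE document_evaluations ("
    "document_id TEXT, document_class TEXT, config_version TEXT, accuracy REAL)"
)
conn.executemany(
    "INSERT INTO document_evaluations VALUES (?, ?, ?, ?)",
    [
        ("a.pdf", "W2", "v2.0", 0.90),
        ("b.pdf", "W2", "v2.1", 0.95),
        ("c.pdf", "W2", "v2.1", 0.97),
        ("d.pdf", "Invoice", "default", 0.80),
    ],
)

# "Show me W2 documents processed with config v2.1"
w2_v21 = [
    row[0]
    for row in conn.execute(
        "SELECT document_id FROM document_evaluations "
        "WHERE document_class = ? AND config_version = ? ORDER BY document_id",
        ("W2", "v2.1"),
    )
]

# "Compare accuracy for configs v2.0 vs v2.1"
accuracy_by_version = dict(
    conn.execute(
        "SELECT config_version, AVG(accuracy) FROM document_evaluations "
        "WHERE config_version IN ('v2.0', 'v2.1') GROUP BY config_version"
    )
)
print(w2_v21, accuracy_by_version)
```

Both questions reduce to a single-table filter on `config_version`, with no JOIN against a separate configuration table.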
Fixed
- Incorrect global inference profile IDs for Knowledge Base model: Fixed `global.anthropic.claude-haiku-4-5-v1:0` and `global.anthropic.claude-sonnet-4-5-v1:0` in the `KnowledgeBaseModelId` CloudFormation parameter dropdown. These shortened IDs were invalid and caused `ResourceNotFoundException` when used. Corrected to `global.anthropic.claude-haiku-4-5-20251001-v1:0` and `global.anthropic.claude-sonnet-4-5-20250929-v1:0` per the AWS Bedrock inference profiles documentation. (#286)
- Application Inference Profile IAM permissions: Added `application-inference-profile/*` ARN pattern to `bedrock:InvokeModel` IAM policies across all templates (root, appsync, multi-doc-discovery, and sample templates). PR #236 previously fixed only `patterns/unified/template.yaml`; this completes the fix for all Lambda execution roles. Also added `bedrock:GetInferenceProfile` read permission to support prompt caching resolution. (#272)
- Prompt caching with application inference profiles: Fixed `<<CACHEPOINT>>` tags being stripped when using Bedrock application inference profile ARNs as model IDs. The cachepoint check now resolves inference profile ARNs to their underlying foundation model via the `GetInferenceProfile` API, enabling prompt caching for profiles that wrap supported models (Claude, Nova). Results are cached to avoid repeated API calls, with graceful fallback if the API call fails. (#272)
- Chat with document uses hardcoded US model ID: Fixed "Chat with document" feature failing in non-US regions (e.g., `eu-west-1`) with "The provided model identifier is invalid" error. The backend Lambda's `get_summarization_model()` fallback was hardcoded to `us.amazon.nova-pro-v1:0`. Added `get_default_model_for_region()` helper that selects the appropriate region-prefixed model (`eu.amazon.nova-pro-v1:0` for EU, `us.amazon.nova-pro-v1:0` for US) based on `AWS_REGION`. (#282)
- BDA activation modal checking wrong version config: Fixed the "Activate Version" flow incorrectly checking the currently selected version's `use_bda` flag (`mergedConfig?.use_bda`) instead of the target version being activated. This caused the BDA sync confirmation modal to appear (or not appear) based on the wrong version's configuration. The fix fetches and inspects the target version's actual config before deciding whether to show the modal. Also added a `fetchVersions()` refresh after BDA sync operations to keep BDA project ARN metadata up to date in the versions list.
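The region-aware fallback from the "Chat with document" fix can be pictured roughly like this. The function name and the two model IDs come from the entry above; the signature, the `eu-` prefix test, and the `us-east-1` default when `AWS_REGION` is unset are assumptions, and the real helper may cover more regions.

```python
import os


def get_default_model_for_region(region=None):
    """Pick a region-prefixed Nova Pro model ID instead of a hardcoded US one."""
    # Assumption: fall back to us-east-1 when AWS_REGION is not set.
    region = region or os.environ.get("AWS_REGION", "us-east-1")
    prefix = "eu" if region.startswith("eu-") else "us"
    return f"{prefix}.amazon.nova-pro-v1:0"


print(get_default_model_for_region("eu-west-1"))
print(get_default_model_for_region("us-west-2"))
```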
Templates
- us-west-2: https://s3.us-west-2.amazonaws.com/aws-ml-blog-us-west-2/artifacts/genai-idp/idp-main_0.5.7.yaml
- us-east-1: https://s3.us-east-1.amazonaws.com/aws-ml-blog-us-east-1/artifacts/genai-idp/idp-main_0.5.7.yaml
- eu-central-1: https://s3.eu-central-1.amazonaws.com/aws-ml-blog-eu-central-1/artifacts/genai-idp/idp-main_0.5.7.yaml
v0.5.6
[0.5.6]
Added
- Custom Model Fine-Tuning: Fine-tune Amazon Nova 2 models (Lite and Pro) for document classification and extraction using your own labeled Test Sets. The end-to-end workflow (validate data, generate training data, train via Bedrock, and deploy an on-demand custom model endpoint) is driven from a new Custom Models page in the Web UI. Custom models can then be selected in any configuration version for classification and/or extraction. Available to Admin and Author roles. Note: currently requires deployment in `us-east-1`. See `docs/custom-model-finetuning.md`.
- External SAML/OIDC Identity Provider Federation: Optional support for federating authentication through an external SAML or OIDC identity provider via Amazon Cognito. Enables organizations to use existing enterprise identity providers (PingOne, Okta, Microsoft Entra ID, etc.) for single sign-on. All federation functionality is opt-in through 12 new CloudFormation parameters; leaving them empty results in zero additional resources and identical behavior to existing Cognito-native authentication. See `docs/external-idp.md`.
- Private Network Deployment: Deploy the IDP Accelerator in fully private / air-gapped environments. New `AppSyncVisibility` parameter (`GLOBAL` | `PRIVATE`) makes the AppSync API accessible only from inside the VPC. All processing Lambda functions (21 across 3 templates) are conditionally placed in customer VPC subnets with an HTTPS-only security group. Includes a separate VPC endpoint CloudFormation template (`scripts/vpc-endpoints.yaml`) with 16 interface endpoints (AppSync, Bedrock, SQS, DynamoDB, S3, Lambda, SSM, KMS, STS, Textract, and more) and per-endpoint creation flags to skip pre-existing endpoints. All features are off by default, so existing deployments are completely unaffected. See `docs/deployment-private-network.md`.
- Enhanced Information Panels: Added comprehensive help content to the Information (ⓘ) panel on every page in the Web UI. Each panel now includes a feature summary, list of key capabilities, and "Learn more" links to relevant docs-site documentation pages. Created new panels for 8 pages that previously had none (Pricing, Capacity Planning, Custom Models, Discovery, User Management, Test Studio), and enriched the existing 7 panels with fuller descriptions and documentation links.
Changed
- Removed Claude Sonnet 4:1m and Sonnet 4.5:1m model variants: The 1M context window beta for Claude Sonnet 4 (`claude-sonnet-4-20250514-v1:0:1m`) and Sonnet 4.5 (`claude-sonnet-4-5-20250929-v1:0:1m`) is being retired effective April 30, 2026. These `:1m` model variants have been removed from all enum lists, UI dropdowns, quota code mappings, pricing, and documentation. Users needing 1M context windows should migrate to Claude Sonnet 4.6 (`claude-sonnet-4-6:1m`), where the 1M context window is generally available (GA).
- Default extraction model updated to `us.anthropic.claude-sonnet-4-6` (was `us.anthropic.claude-sonnet-4-20250514-v1:0`) in system defaults.
- Error Analyzer system prompt improvements: Added strategy for large batches, priority ordering, and error classification guidance.
- Error Analyzer settings: Replaced duplicate inline cache with the shared cache from the common monitoring package.
- Shared CloudWatch Logs: Extracted log search logic from the Error Analyzer into a reusable library in the common monitoring package.
- Enhanced CI/CD Automated Testing: Enhanced GitLab CI/CD pipeline smoke tests with parallel test execution (8 tests running concurrently with fail-fast behavior), deeper verification (extraction fields, classification results, rule statistics), and added new tests: multi-document concurrent processing (Test 4), Test Studio evaluation with metrics validation (Test 7), agentic extraction with large table validation of 532 fund items (Test 8), single-document discovery (Test 9), and multi-document discovery (Test 10).
Fixed
- Fixed agentic extraction crash (`TypeError: unsupported format string passed to NoneType.__format__`) when table parsing stats contain `None` values for `avg_confidence` or `parse_success_rate`.
- Fixed agentic extraction `map_table_to_schema` producing phantom empty rows from non-matching tables (e.g. account_summary rows prepended to transaction_details), causing list item ordering to be shifted by several positions.
- Error Analyzer model selection: The agent was using the Chat Companion's model instead of its own configured model.
- Error Analyzer log processing — Fixed early termination that stopped searching after the first Lambda function with errors; now searches all relevant log groups.
- Error Analyzer log truncation — Fixed handling of long log messages to trim them rather than skip them entirely.
- Reprocess from Document Details — Fixed config version not being passed when reprocessing a document from the Document Details page (showed "N/A" instead of the selected version).
- Analytics Agent date awareness — Injected current UTC date/time into the analytics agent system prompt so the LLM can correctly handle relative-time queries (e.g., "show me today's documents", "what was processed this week").
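The agentic extraction `TypeError` listed first in this section is a standard Python pitfall: applying a numeric format spec to `None`. The sketch below reproduces the error and shows one possible guard; the `format_stat` helper and the `n/a` placeholder are illustrative choices, not the project's actual fix.

```python
def format_stat(value, spec=".2f", missing="n/a"):
    """Format a numeric stat, substituting a placeholder when the value is None."""
    return missing if value is None else format(value, spec)


stats = {"avg_confidence": None, "parse_success_rate": 0.9}

try:
    f"{stats['avg_confidence']:.2f}"  # what crashes when a stat is None
except TypeError as exc:
    error = str(exc)

summary = ", ".join(f"{name}={format_stat(value)}" for name, value in stats.items())
print(error)
print(summary)
```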
Templates
- us-west-2: https://s3.us-west-2.amazonaws.com/aws-ml-blog-us-west-2/artifacts/genai-idp/idp-main_0.5.6.yaml
- us-east-1: https://s3.us-east-1.amazonaws.com/aws-ml-blog-us-east-1/artifacts/genai-idp/idp-main_0.5.6.yaml
- eu-central-1: https://s3.eu-central-1.amazonaws.com/aws-ml-blog-eu-central-1/artifacts/genai-idp/idp-main_0.5.6.yaml
v0.5.5
[0.5.5]
Added
- Multi-Document Discovery: New capability to automatically discover document classes from a collection of documents. Instead of manually defining document schemas one at a time, users point to a folder of mixed documents and the system automatically identifies document types, clusters similar documents, generates JSON Schemas with field definitions for each type, and saves them to a configuration version, ready for immediate use in the processing pipeline. Available from the Web UI, CLI (`idp-cli discover-multidoc`), and SDK (`client.discovery.run_multi_doc()`).
  - Web UI: New "Multi-Document" tab on the Discovery page with job submission form (config version selector, bucket selector, S3 prefix input, zip upload), jobs table with search/filter/sort/pagination, and detailed job results page with pipeline progress, expandable JSON schemas, config deep-links, and Quality Review Report
  - CLI: `idp-cli discover-multidoc --dir ./samples/ -o ./schemas/` with Rich progress bars, results table, and reflection report
  - SDK: `client.discovery.run_multi_doc(document_dir="./samples/")` with typed `MultiDocDiscoveryResult` response model
  - Input Modes: S3 path (select bucket + prefix), zip upload (presigned URL), or local directory (CLI/SDK)
  - Configuration Integration: Discovered classes are saved directly to the selected config version's `classes` array in DynamoDB, immediately available for document processing without manual schema creation
- Prompt Preview: New "Prompt Preview" tab in the Configuration page lets you preview the actual prompts sent to the LLM for each processing step (Classification, Extraction, Assessment, Summarization). Config-derived placeholders are filled in with real values (class names, cleaned JSON Schema), while document-specific placeholders are shown as highlighted markers. Includes token estimates, copy-to-clipboard, and a substitution details panel showing the exact schema sent to the LLM. Helps optimize document class schemas and prompt templates.
- IDP CLI `chat` Command & SDK `ChatOperation`: Interactive Agent Companion Chat from the terminal and programmatic SDK access. Runs the same multi-agent orchestrator as the Web UI locally, with real-time streaming and multi-turn conversation support. Includes Analytics Agent, Error Analyzer Agent, and optionally Code Intelligence Agent (`--enable-code-intelligence`). Available as `idp-cli chat --stack-name <stack>` for interactive use, `--prompt` flag for single-shot scripting, and `client.chat.send_message()` in the Python SDK. See `docs/idp-cli.md#chat`.
- Per-Class Extraction Model Override: New JSON Schema extension allows overriding the global `extraction.model` on a per-document-class basis. Useful when certain document types benefit from a different model (e.g., a more powerful model for complex financial forms, a faster/cheaper model for simple documents). Classes without the extension continue to use the global default. Works with both traditional and agentic extraction modes. See `docs/extraction.md`, Per-Class Extraction Model Override section.
- Chandra OCR Lambda Hook Sample: New `GENAIIDP-chandra-ocr-hook` sample in `samples/lambda-hook-inference/` that integrates Datalab Chandra OCR 2 with the LambdaHook feature for high-quality OCR. Supports 90+ languages, math, tables, forms, and handwriting. Uses the Datalab hosted async API (`/api/v1/convert`) with configurable output format (markdown/json/html) and conversion mode (fast/balanced/accurate). Includes standalone SAM template, local test script, and deployment instructions. See `docs/lambda-hook-inference.md`, Chandra OCR Integration section.
- Average Cost Per Page Metric: Test results and test comparison views now display an "Avg Cost/Page" metric, calculated from total cost and page counts in the cost breakdown. Also included in CSV and JSON exports from the comparison view.
- Wildcard pattern support for delete-documents: `idp-cli delete-documents` and `client.batch.delete_documents()` now accept a `--pattern` / `pattern` parameter for fnmatch-style wildcard matching (e.g. `"batch-123/*.pdf"`, `"*invoice*"`). Combines with `--status-filter` to delete e.g. all failed invoices across batches.
- Agentic Extraction Hardening: Improved robustness, observability, and table parsing for agentic extraction:
  - Pre-flight OCR & schema analysis with adaptive guidance strength (RECOMMENDED → STRONGLY_RECOMMENDED → MANDATORY) ensures table parsing tool is used for large tables
  - Deterministic Markdown table parser with lookahead recovery, auto-merge of split tables, and configurable `max_empty_line_gap`
  - Post-extraction completeness validation against schema constraints with detailed shortfall reporting
  - Processing report with tool usage decisions, completeness checks, and root cause diagnostics (new UI tab + CloudWatch logs)
  - Thread-safe state management via `contextvars.ContextVar`; deprecated review agent (config fields preserved as no-ops)
  - Bug fixes: `patch_buffer_data` slice correction, confidence assessment loop fix, row-based parse success metric, NoneType guard in completeness check
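For the wildcard support in `delete-documents` described above, the selection logic can be approximated with Python's `fnmatch` module. The document records, status values, and helper name below are hypothetical; the two patterns are the examples given in the entry.

```python
from fnmatch import fnmatchcase

# Hypothetical document records; real keys and statuses may differ.
documents = [
    {"key": "batch-123/invoice-1.pdf", "status": "FAILED"},
    {"key": "batch-123/w2.pdf", "status": "COMPLETED"},
    {"key": "batch-456/invoice-2.pdf", "status": "FAILED"},
    {"key": "batch-456/invoice-3.pdf", "status": "COMPLETED"},
]


def select_for_deletion(documents, pattern, status_filter=None):
    """Keys matching the wildcard pattern and, if given, the status filter."""
    return [
        doc["key"]
        for doc in documents
        if fnmatchcase(doc["key"], pattern)
        and (status_filter is None or doc["status"] == status_filter)
    ]


print(select_for_deletion(documents, "batch-123/*.pdf"))
print(select_for_deletion(documents, "*invoice*", status_filter="FAILED"))
```

Note that `*` in fnmatch-style patterns also matches `/`, so `*invoice*` reaches across batch prefixes.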
Fixed
- Headless deployment fails with `ConfigurationPreset` AllowedValues error and `GraphQLApi.Arn` reference error: Added `lending-package-sample-govcloud` to the base template AllowedValues and ConfigurationMap, and auto-detect GovCloud region (`us-gov-*`) for headless template transform instead of relying on a missing or hardcoded flag. Also added Discovery resources (BlueprintOptimization, MultiDocDiscovery, DiscoveryProcessor, etc.) to headless removal list to fix `GraphQLApi.Arn` unresolved reference error.
- `delete-documents` fails with DynamoDB errors: Fixed two bugs in `get_documents_by_batch()`: (1) passing empty `ExpressionAttributeNames={}` when no status filter caused `ValidationException`, and (2) using low-level DynamoDB client type descriptors (`{"S": "..."}`) with the high-level Table resource caused `begins_with` operand type mismatch. Rewrote to use the high-level `Table.scan()` API with `boto3.dynamodb.conditions.Attr`.
Templates
- us-west-2: https://s3.us-west-2.amazonaws.com/aws-ml-blog-us-west-2/artifacts/genai-idp/idp-main_0.5.5.yaml
- us-east-1: https://s3.us-east-1.amazonaws.com/aws-ml-blog-us-east-1/artifacts/genai-idp/idp-main_0.5.5.yaml
- eu-central-1: https://s3.eu-central-1.amazonaws.com/aws-ml-blog-eu-central-1/artifacts/genai-idp/idp-main_0.5.5.yaml
v0.5.4
[0.5.4]
Added
- MLflow Experiment Tracking Integration: Optional integration with Amazon SageMaker MLflow for automated test run logging. When enabled (`EnableMLflow=true`), every Test Studio run automatically logs metrics (accuracy, cost, field-level scores), configuration parameters (model IDs, temperatures, inference settings), and artifacts (full config snapshots, class definitions, cost breakdowns) to an MLflow tracking server. Uses fire-and-forget async invocation, so it never blocks or delays test results. Zero resources created when disabled. See `docs/mlflow-integration.md`.
- BDA Blueprint Optimization: Automatically improves BDA extraction accuracy using the `InvokeBlueprintOptimizationAsync` API. When discovery includes a ground truth file and `enable_blueprint_optimization: true` is set, the system optimizes the BDA blueprint by comparing extraction results against ground truth, evaluates before/after metrics, and updates the blueprint schema if improved. Disabled by default. See `docs/discovery.md`, Blueprint Optimization section.
- idp_common API Reference & Documentation: Added `docs/idpcommon-api-reference.md` covering all 22 modules, created 6 missing module READMEs (discovery, schema, image, s3, utils, metrics), updated core data model docs to match current code, fixed `IDPConfig` lazy-loading bug in `__init__.py`, and integrated into docs-site sidebar.
- Consolidated publish and headless deploy into `idp-cli`: All build/publish/deploy functionality now available through the CLI, deprecating standalone scripts:
  - `publish.py` and `publish.sh` are deprecated; use `idp-cli publish` instead. `publish.py` remains as a thin backward-compatibility wrapper. `publish.sh` has been removed.
  - `scripts/generate_govcloud_template.py` is deprecated; use `idp-cli publish --headless` or `idp-cli deploy --headless` instead. The script remains as a thin wrapper.
  - New `--template-file` option on `idp-cli deploy` for deploying from a local CloudFormation template file produced by a previous `idp-cli publish`.
  - `idp-cli deploy --headless` (without `--from-code`) now downloads the published template, transforms to headless with GovCloud config defaults, uploads to S3, and deploys, all in one command.
Fixed
-
HITL review start overwrites document sections — Fixed the Start Review action to update only the Review Status and Review Owner fields, preserving all existing document sections and other fields.
-
Evaluation schema error for free-form objects — Stickler mapper now detects and skips unevaluable object schemas (e.g., objects with
additionalPropertiesbut no definedproperties, and arrays of such objects) instead of raising validation errors. -
Full document reprocess not re-running OCR — Fixed bug where clicking "Reprocess" in the UI reused stale OCR results from the previous run instead of re-executing OCR with the current configuration. The reprocess resolver now deletes previous output data from S3 before queuing, preventing the OCR function's retry-safe recovery from reinstalling old results.
-
Agentic extraction timeout on long documents — Fixed repeated Lambda timeouts when agentic extraction exceeds the 15-minute limit on large documents (e.g., 25-page brokerage statements with 600+ holdings). Added incremental S3 checkpointing that saves extraction state after each tool call — covers both the extraction tools path (`extraction_tool`, `apply_json_patches`, `make_buffer_data_final_extraction`) and the buffer tools path (`patch_buffer_data`) that the agent uses for very large batched extractions. The checkpoint format tracks which state was saved (`current_extraction` vs `intermediate_extraction` buffer) so the correct resume path is used. On Step Function retry, the Lambda loads the checkpoint and the agent resumes from where it left off rather than restarting from scratch. No CloudFormation or Step Function changes required — the existing `Sandbox.Timedout` retry mechanism now makes incremental progress. Only active when agentic extraction is enabled; standard extraction is unaffected.
-
Agentic extraction fails on Bedrock InternalServerException without retrying — Fixed `InternalServerException` errors (transient Bedrock server-side errors) causing immediate Lambda failure after only botocore's fast 7 retries, bypassing the application-level retry decorator (50 retries with 5s→1800s exponential backoff). Root cause: `InternalServerException` and `InternalServerError` were missing from all three retry layers — the `async_exponential_backoff_retry` decorator's `DEFAULT_RETRYABLE_ERRORS` set (`bedrock_utils.py`), the `BedrockClient._invoke_with_retry()` retryable errors list (`bedrock/client.py`), and the Step Functions ExtractionStep Retry `ErrorEquals` list (`workflow.asl.json`). All three layers now include these transient errors, providing proper exponential backoff retry at the application level and Lambda-level retry via Step Functions as a safety net.
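The application-level layer of that fix can be pictured with a minimal sketch. Everything named here (`RETRYABLE_ERROR_CODES`, `ServiceError`, `retry_with_backoff`) is hypothetical and stands in for the project's `async_exponential_backoff_retry` decorator, whose actual signature and error set are not shown in these notes. The sketch only illustrates the stated behaviour: retry when the error code is in the retryable set, doubling the delay from 5s up to a 1800s cap.

```python
import asyncio
import functools

# Illustrative set only; the real DEFAULT_RETRYABLE_ERRORS contents are not listed in these notes.
RETRYABLE_ERROR_CODES = {"ThrottlingException", "InternalServerException", "InternalServerError"}


class ServiceError(Exception):
    """Stand-in for a service client error that carries an error code."""

    def __init__(self, code):
        super().__init__(code)
        self.code = code


def backoff_delays(initial=5.0, maximum=1800.0, retries=50):
    """Delay before each retry: doubles from `initial`, capped at `maximum`."""
    return [min(initial * 2 ** attempt, maximum) for attempt in range(retries)]


def retry_with_backoff(initial=5.0, maximum=1800.0, retries=50, sleep=asyncio.sleep):
    """Retry an async function on retryable error codes only."""

    def decorator(func):
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            for delay in backoff_delays(initial, maximum, retries):
                try:
                    return await func(*args, **kwargs)
                except ServiceError as exc:
                    if exc.code not in RETRYABLE_ERROR_CODES:
                        raise  # non-transient errors fail immediately
                    await sleep(delay)
            # Final attempt after the last delay; any error propagates.
            return await func(*args, **kwargs)

        return wrapper

    return decorator
```

If an error code is missing from the set, the first failure propagates straight to the caller, which matches the symptom described in the fix.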
Templates
- us-west-2: https://s3.us-west-2.amazonaws.com/aws-ml-blog-us-west-2/artifacts/genai-idp/idp-main_0.5.4.yaml
- us-east-1: https://s3.us-east-1.amazonaws.com/aws-ml-blog-us-east-1/artifacts/genai-idp/idp-main_0.5.4.yaml
- eu-central-1: https://s3.eu-central-1.amazonaws.com/aws-ml-blog-eu-central-1/artifacts/genai-idp/idp-main_0.5.4.yaml
v0.5.3
[0.5.3]
Added
-
Discovery UX Enhancements — Major improvements to the Discovery experience:
- Multi-Section Package Discovery — New "Multi-Section Package" discovery mode with PDF page thumbnail preview, color-coded page ranges, and parallel job creation. Users define page ranges to discover multiple classes from a single PDF. Each range creates an independent discovery job.
- ✨ AI Auto-Detect Sections — "Auto-detect sections" button uses a configurable LLM prompt (`discovery.auto_split`) to automatically identify document boundaries and pre-fill page ranges with document type labels.
- Discovery Mode Selector — Tile-based mode choice between "Single Section Document" (with optional ground truth) and "Multi-Section Package" (with page ranges). Ground truth and page ranges are mutually exclusive.
- Class Name Hints — Document type labels (from auto-detect or manual entry) are passed as class name hints to guide the discovery LLM's `$id` and `x-aws-idp-document-type` output.
- Real-time Job Monitoring — Live progress messages, elapsed time counters, phased upload status ("Creating jobs..." → "Uploading..." → "Refreshing..."), discovered class name badges, and expandable error details with user-friendly messages.
- Jobs Table UX — Search/filter, time range selector, pagination, resizable columns, column preferences, multi-select delete, config version hyperlinks, and page range badges on multi-section jobs.
- S3 Upload Race Condition Fix — Replaced hardcoded `time.sleep(30)` with smart S3 polling using exponential backoff (2s–10s, 60s timeout).
- New GraphQL APIs — `autoDetectSections` mutation, `pageRanges`/`pageLabels` on `uploadDiscoveryDocument`, `pageRange`/`discoveredClassName`/`statusMessage` on job types, `deleteDiscoveryJob` mutation.
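The S3 upload race condition fix in the list above replaces a fixed sleep with polling. A minimal sketch of that pattern follows; `wait_until` and its parameters are hypothetical names, and only the numbers (2s initial delay, 10s cap, 60s timeout) come from the release note.

```python
import time


def wait_until(check, initial_delay=2.0, max_delay=10.0, timeout=60.0,
               sleep=time.sleep, clock=time.monotonic):
    """Poll `check()` until it returns True or `timeout` seconds have elapsed.

    The delay doubles after each failed check, capped at `max_delay`.
    Returns True on success and False on timeout.
    """
    deadline = clock() + timeout
    delay = initial_delay
    while True:
        if check():
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            return False
        sleep(min(delay, remaining))  # never sleep past the deadline
        delay = min(delay * 2, max_delay)
```

In practice `check` might wrap a boto3 `head_object` call and return False while the object is not yet visible; that wiring is an assumption and is not shown in the release notes. Unlike `time.sleep(30)`, the caller returns as soon as the object appears and gets an explicit failure if it never does.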
-
Discovery CLI & SDK Enhancements — New capabilities in `idp-cli discover` and `client.discovery` that bring parity with the Web UI's Discovery features:
- Class Name Hints — `--class-hint` (CLI) / `class_name_hint=` (SDK) to pre-label discovered classes, guiding the LLM's `$id` output.
- Multi-Section Page Ranges — `--page-range "1-3" --page-label "W2 Form"` (CLI, repeatable) / `discovery.run_multi_section(page_ranges=[...])` (SDK) to discover multiple document classes from a single multi-page PDF.
- AI Auto-Detect Sections — `--auto-detect` / `--detect-only` (CLI) / `discovery.auto_detect_sections()` (SDK) to automatically identify document section boundaries using LLM analysis, then optionally discover each section.
- BDA Sync Command — New `idp-cli config-sync-bda` command and `client.config.sync_bda()` SDK method for explicit bidirectional synchronization between IDP configuration classes and BDA blueprints. Supports `--direction` (bidirectional, bda-to-idp, idp-to-bda) and `--mode` (replace, merge).
- New Models — `AutoDetectResult`, `AutoDetectSection`, `ConfigSyncBdaResult`, `page_range` field on `DiscoveryResult`.
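To make the repeatable `--page-range` / `--page-label` pairing concrete, here is a small sketch of how such values could be turned into sections. The `Section` type and both helper functions are hypothetical, and the rules they enforce (1-based inclusive ranges, a bare number meaning a single page, one label per range) are assumptions rather than documented CLI behaviour.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Section:
    start: int  # first page, 1-based inclusive (assumed)
    end: int    # last page, 1-based inclusive (assumed)
    label: str  # passed on as a class name hint


def parse_page_range(text):
    """Parse "1-3" into (1, 3); a bare "5" is treated as the single page (5, 5)."""
    start_text, separator, end_text = text.partition("-")
    start = int(start_text)
    end = int(end_text) if separator else start
    if start < 1 or end < start:
        raise ValueError(f"invalid page range: {text!r}")
    return start, end


def build_sections(page_ranges, page_labels):
    """Pair each range with its label, in the order they were given."""
    if len(page_ranges) != len(page_labels):
        raise ValueError("each page range needs exactly one label")
    return [Section(*parse_page_range(r), label)
            for r, label in zip(page_ranges, page_labels)]
```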
-
IDP SDK & CLI Overhaul — Major refactoring of the SDK and CLI for a cleaner, more maintainable architecture:
- `IDPClient` entry point — Single public interface with typed namespace access (`client.batch`, `client.stack`, `client.config`, `client.manifest`, `client.testing`). CLI commands now route through `IDPClient` instead of importing internal modules, ensuring consistent behavior across CLI, Web UI, and programmatic access.
- Typed return models — SDK operations return Pydantic models instead of raw dictionaries, enabling IDE auto-complete and type checking.
- Enhanced config validation — Manifest and config validation reports deprecated/unknown fields; config upload detects whether a version exists and handles creation vs. update correctly.
- Enhanced stack operations — Deploy and delete commands support in-progress detection, live monitoring, cancel-update, and failure analysis.
- Private API boundaries — Internal modules renamed from `core/` to `_core/` with lint rules enforcing the boundary.
-
IDP MCP Connector — Local package that bridges coding assistants like Cline and Kiro to the IDP MCP Server with automatic Cognito authentication and dynamic tool discovery.
-
ALB+S3 VPC Hosting Mode — Alternative web UI hosting using Application Load Balancer with S3 VPC Interface Endpoint for environments that require VPC-based hosting (private networks, regulated environments, corporate networks without internet-facing CDN access). (#245)
- New `WebUIHosting` parameter (`CloudFront` | `ALB`) with conditional resource creation — CloudFront and ALB resources are mutually exclusive
- ALB hosting nested stack (`nested/alb-hosting/template.yaml`) with ALB, S3 Interface VPC Endpoint, security groups, custom resource Lambdas for VPC CIDR lookup and target registration
- TLS 1.3 enforcement, access logging, scoped VPC endpoint policy (`s3:GetObject`/`s3:ListBucket` only), and multi-CIDR security group ingress management
- Self-signed certificate generation script (`scripts/generate_self_signed_cert.sh`) for demo/testing
- New documentation: `docs/alb-hosting.md` — prerequisites, deployment steps, security considerations, troubleshooting, CloudFront vs ALB comparison
-
`make help` target — Added `make help` with categorized, auto-generated descriptions for all 33 Makefile targets; updated CONTRIBUTING.md to match.
-
Test Studio Field-Level Metrics — Test results now display per-field extraction performance in an interactive table showing Field Name, Accuracy, Precision, Recall, TP, FP, TN, FN. Metrics are searchable, sortable, and paginated in an expandable section. Enables identification of low-performing fields and tracking improvements after configuration changes.
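For readers unfamiliar with how the displayed columns relate, the sketch below derives Accuracy, Precision, and Recall from the four confusion-matrix counts using the standard definitions. The function name and the choice to return 0.0 when a denominator is zero are assumptions; the definitions Stickler actually applies may differ in edge cases.

```python
def field_metrics(tp, fp, tn, fn):
    """Standard accuracy, precision, and recall from confusion-matrix counts."""

    def ratio(numerator, denominator):
        # Assumed convention: report 0.0 rather than fail on an empty denominator.
        return numerator / denominator if denominator else 0.0

    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
    }
```

A field with high precision but low recall is usually being left empty too often, while the reverse suggests the extractor is filling it with wrong values.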
-
Stickler Bulk Aggregation for Test Studio — Test Studio now uses Stickler's `BulkStructuredModelEvaluator` with `aggregate_from_comparisons()` for accurate metric aggregation across multiple documents. Each document is evaluated with `include_confusion_matrix=True`, results are stored in S3, and aggregated when viewing test results. Eliminates Athena queries for new data, improving accuracy, consistency, and cost-effectiveness.
-
RBAC Security Hardening — Comprehensive audit and hardening of GraphQL API authorization against the documented RBAC permission matrix:
- Query-level `@aws_auth` directives — Added server-side role enforcement to 20+ GraphQL queries that were previously open to all authenticated users. Configuration, pricing, capacity, discovery, test studio, config library, and agent query system queries now enforce role restrictions at the AppSync schema level (e.g., Reviewer cannot access configuration, discovery, test studio, or pricing queries).
- Admin-only enforcement for "Save as Version" / "Save as Default" — The `updateConfiguration` resolver now checks caller role and rejects non-Admin users attempting `saveAsVersion` or `saveAsDefault` operations, which were previously only blocked in the UI.
- Server-side RBAC filtering in `listDocumentsByDateRange` — Added reviewer-only document filtering and config-version scope filtering to the date range resolver, matching the existing `listDocuments` GSI resolver pattern. Updated CloudFormation template with `USERS_TABLE_NAME` environment variable and DynamoDB IAM permissions.
- Updated RBAC documentation (`docs/rbac.md`) — Complete mutation and query authorization tables, AppSync `@aws_auth` + `@aws_iam` limitation documented, all previously missing API entries added.
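The admin-only check described for `updateConfiguration` amounts to a server-side guard that runs before the write. A minimal sketch, assuming the caller's roles arrive as a collection of group names; the function name and the exception type are hypothetical and this is not the resolver's actual code.

```python
# Operations the release note says are restricted to Admin.
ADMIN_ONLY_OPERATIONS = {"saveAsVersion", "saveAsDefault"}


def authorize_config_update(caller_groups, operation):
    """Raise PermissionError when a non-Admin caller attempts an admin-only operation."""
    if operation in ADMIN_ONLY_OPERATIONS and "Admin" not in caller_groups:
        raise PermissionError(f"{operation} requires the Admin role")
```

Enforcing this in the resolver rather than only in the UI means a caller who talks to the GraphQL API directly is rejected as well.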
-
Threat Model Documentation — Comprehensive threat model for the GenAI IDP Accelerator covering architecture overview, STRIDE analysis, feature-specific threats (agent analysis, companion chat, knowledge base, Lambda hooks, MCP integration, RBAC, reporting, SDK/CLI, web UI), risk assessment matrix, AI-generated threat analysis, implementation guide, and Threat Composer JSON export.
-
Managed Configuration Versions — Pre-deployed test sets now have dedicated stack-managed config versions (`managed: true`) that are automatically created and overwritten on stack updates. Save and delete are disabled for managed versions in the UI and API. Test Studio auto-selects the matching config version when a test set is selected, replacing the hardcoded mapping.
-
Removed older Claude models from Configuration UI picklists (3.x, 4.0, 4.1). Haiku 4.5, Sonnet 4.5, Sonnet 4.6, Opus 4.5, and Opus 4.6 are available for selection in the UI. Existing configurations using older versions still work.
Changed
- SDK & CLI: Renamed processing commands for clarity — Old names are deprecated (emit `DeprecationWarning`) but remain available for backward compatibility:
  - `client.batch.run()` → `client.batch.process()`
  - `client.batch.rerun()` → `client.batch.reprocess()` (same for `client.document.rerun()` → `.reprocess()`)
  - `idp-cli run-inference` → `idp-cli process`
  - `idp-cli rerun-inference` → `idp-cli reprocess`
- SDK: `stack.delete()` now waits by default — The `wait` parameter defaults to `True` (previously fire-and-forget). Pass `wait=False` to restore the old behavior.
- MCP: Renamed `docs/mcp-integration.md` to `docs/mcp-server.md` for clarity.
- MCP: Renamed Lambda function `agentcore_analytics_processor` to `agentcore_mcp_handler` to better reflect its role as the MCP protocol handler (not just analytics).
  - CloudFormation resource `AgentCoreAnalyticsLambdaFunction` → `AgentCoreMCPHandlerFunction`
  - CloudFormation resource `AgentCoreAnalyticsLambdaLogGroup` → `AgentCoreMCPHandlerLogGroup`
  - Lambda FunctionName: `${StackName}-agentcore-analytics` → `${StackName}-agentcore-mcp-handler`
  - Source directory: `src/lambda/agentcore_analytics_processor/` → `src/lambda/agen...
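The renamed SDK methods in this release keep their old names as deprecated aliases. The general pattern looks like the sketch below; `BatchOperations` is a hypothetical stand-in and not the SDK's actual class, and the return value is a placeholder.

```python
import warnings


class BatchOperations:
    """Hypothetical stand-in for an SDK namespace with a renamed method."""

    def process(self, source):
        # Placeholder for the real work.
        return f"processing {source}"

    def run(self, source):
        """Deprecated alias kept for backward compatibility."""
        warnings.warn(
            "run() is deprecated; use process() instead",
            DeprecationWarning,
            stacklevel=2,  # attribute the warning to the caller's line
        )
        return self.process(source)
```

Python hides `DeprecationWarning` by default outside of `__main__` and test runners, so callers who want to find remaining uses of the old names can run with `python -W error::DeprecationWarning`.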
v0.5.2
[0.5.2]
Added
-
Multi-tenancy with Role-Based Access Control (RBAC) — 4-role model (Admin, Author, Reviewer, Viewer) with server-side AppSync auth directives, server-side Reviewer document filtering, and UI adaptation. Admin has full access; Author can edit config and process documents but cannot manage users or delete config versions; Viewer has read-only access (editors, save buttons, and edit mode all disabled); Reviewer sees only HITL-pending documents. Non-admin roles can be scoped to specific use cases via `allowedConfigVersions`. See `docs/rbac.md`.
-
Standard Class Catalog — When adding a new document class in the Schema Builder, users can now choose between Custom Class (define from scratch) and Standard Class (import from a catalog of 35 pre-built document types). Standard classes are derived from AWS BDA standard blueprints and include common document types like Invoice, Receipt, W-2, Bank Statement, Payslip, US Driver License, US Passport, various tax forms (1040, 941, 940, W-9, 1098, 1099), insurance cards, birth/death/marriage certificates, and more. Each standard class comes with a complete extraction schema including attributes, descriptions, and nested types. Imported classes are fully editable. Run `make classes-from-bda` to refresh the catalog from the BDA API.
-
Documentation Site — Added a hosted documentation site built with Astro Starlight, auto-deployed to GitHub Pages. Provides full-text search (Pagefind), sidebar navigation organized by topic, dark/light mode, and a professional landing page — all sourced directly from the existing `docs/` markdown files with zero content duplication. Browse at aws-solutions-library-samples.github.io/accelerated-intelligent-document-processing-on-aws.
-
Discovery accessible from CLI and SDK — Discovery can now be run programmatically via the IDP SDK (`client.discovery.run()`) and CLI (`idp-cli discover`), enabling users with many document classes to automate schema generation without the Web UI. Supports both modes: without ground truth (exploratory) and with ground truth (optimized). (#228)
Changed
-
Sync to BDA no longer auto-activates the config version — Previously, performing "Sync to BDA" would automatically set the current config version as active. Since each config version now has its own BDA project, auto-activation is unnecessary. Users can manually choose which version to activate via the Versions table. The "Sync to BDA" confirmation modal text has been updated accordingly.
-
Removed `Bedrock Data Automation (BDA) Project ARN` CloudFormation parameter — The deploy-time `Pattern1BDAProjectArn` parameter has been removed as it was redundant with the per-config-version BDA project management already available in the Web UI, CLI, and GraphQL API. BDA projects are now managed entirely post-deployment: enable `use_bda: true` in your configuration, then use "Sync to BDA" to create or link a BDA project, or "Sync from BDA" to import from any existing BDA project. This simplifies the deployment experience (one fewer parameter) and better aligns the CloudFormation interface with the system's actual architecture. Existing deployed stacks are unaffected — runtime BDA project ARN resolution reads from DynamoDB per-version tracking, not from the CloudFormation parameter. Also removed the unused `nested/bda-lending-project/` directory (dead code not referenced by any template) and the legacy `BDA_PROJECT_ARN` environment variable fallback from the sync resolver.
Fixed
-
CLI: Remove deprecated `--pattern` references — Updated `idp-cli.md` and CLI code to reflect the unified pattern architecture. Removed `--pattern` from all deploy and config command examples/options.
-
Discovery no longer injects default config classes into target version — Previously, running Discovery on a configuration version would merge all classes from the `default` version into the target version alongside the newly discovered class. Now Discovery only adds/updates the discovered class within the target version's own class list, keeping the version's classes exactly as the user curated them.
-
Documentation: Comprehensive review and cleanup — Fixed outdated references, broken links, and missing content across documentation files.
-
Inference Profile pricing ARN truncation in UI — Fixed pricing display and cost breakdown truncation for Bedrock Application Inference Profile ARNs containing multiple `/` characters (e.g., `bedrock/arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/088k6ehrxpci`). The UI was splitting on all `/` separators instead of preserving the full ARN, causing the profile ID to be dropped in the Pricing page display, Test Studio cost breakdowns, and CSV exports. Backend pricing lookup was not affected. (#237)
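The truncation described in the last fix comes from splitting a `service/model` key on every `/`. The sketch below shows the difference in Python; the UI itself is not Python, and `split_model_key` is a hypothetical helper used only to illustrate splitting on the first separator.

```python
def split_model_key(key):
    """Split "service/model-id" on the first "/" only, so ARNs keep their own slashes."""
    service, _, model_id = key.partition("/")
    return service, model_id


key = "bedrock/arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/088k6ehrxpci"

# Splitting on every "/" yields three parts, so code that expects two drops the profile ID.
naive_parts = key.split("/")

# Splitting once keeps the full ARN, including the trailing profile ID.
service, model_id = split_model_key(key)
```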
Templates
- us-west-2: https://s3.us-west-2.amazonaws.com/aws-ml-blog-us-west-2/artifacts/genai-idp/idp-main_0.5.2.yaml
- us-east-1: https://s3.us-east-1.amazonaws.com/aws-ml-blog-us-east-1/artifacts/genai-idp/idp-main_0.5.2.yaml
- eu-central-1: https://s3.eu-central-1.amazonaws.com/aws-ml-blog-eu-central-1/artifacts/genai-idp/idp-main_0.5.2.yaml
v0.5.1
[0.5.1]
Added
-
Scalable Document List and Test Executions — Comprehensive redesign to eliminate UI and backend bottlenecks when working with thousands of documents. (#203)
- TypeDateIndex GSI on TrackingTable: New DynamoDB Global Secondary Index (`ItemType` + `InitialEventTime`) enables efficient queries by item type (document, testrun, testset) sorted by time, replacing full table scans. Includes 20 projected attributes for list-view rendering without base table fetches.
- GSI Attribute Backfill Mechanism: Robust Step Functions state machine with parallel scan workers that automatically backfills `ItemType` and `HITLPendingReview` attributes on existing items during stack upgrades. Features timeout-safe continuation, idempotent conditional updates, and automatic trigger via CloudFormation Custom Resource.
- GSI-Based Document List Resolver: New `listDocuments` Lambda resolver queries the TypeDateIndex GSI with server-side pagination (`limit`/`nextToken`).
- `getDocumentCount` API: New efficient count query using GSI `Select: 'COUNT'` for accurate document totals without fetching data.
- UI Document List Rewrite: Eliminated the N+1 query pattern (shard queries → individual `getDocument` per document). Now uses a single paginated `listDocuments` GSI query for all time periods. First page renders immediately with incremental background loading of remaining pages.
- Subscription Optimization: `onUpdateDocument` events now use subscription data directly instead of triggering individual `getDocument` API calls, eliminating thousands of redundant requests during active processing.
- GSI-Based Test Runs Query: Replaced full table scan in `get_test_runs()` and `get_test_runs_by_date_range()` with GSI query + BatchGetItem pattern for efficient test run listing with all fields (including Context, ConfigVersion).
- GSI-Based Test Sets Query: Replaced full table scan in `get_test_sets()` with GSI query + BatchGetItem pattern, avoiding scanning the entire TrackingTable (which includes all documents) just to find ~10 test sets.
- `ItemType` Written on All Creation Paths: All document, test run, and test set creation paths (DynamoDB service, AppSync resolvers, test runners, dataset deployers) now write `ItemType` and `InitialEventTime` for immediate GSI indexing.
- Improved Error Messages: Document list errors now show the actual failure reason (e.g., Lambda throttling, timeout details) instead of generic "please try again" messages.
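A count query with `Select: 'COUNT'` still pages: DynamoDB returns a partial `Count` plus a `LastEvaluatedKey` whenever a single response reaches the 1 MB query limit, so the totals have to be summed. The sketch below shows that loop. `count_items` is a hypothetical helper, and `query` is any callable that accepts the keyword arguments of boto3's `Table.query`, so the example runs without AWS access. The index and attribute names are taken from the notes above.

```python
def count_items(query, item_type, index_name="TypeDateIndex"):
    """Sum `Count` across all pages of a DynamoDB Query that uses Select='COUNT'."""
    total = 0
    start_key = None
    while True:
        kwargs = {
            "IndexName": index_name,
            "KeyConditionExpression": "ItemType = :t",
            "ExpressionAttributeValues": {":t": item_type},
            "Select": "COUNT",  # return counts only, no item data
        }
        if start_key:
            kwargs["ExclusiveStartKey"] = start_key  # resume where the last page ended
        page = query(**kwargs)
        total += page["Count"]
        start_key = page.get("LastEvaluatedKey")
        if not start_key:
            return total
```

With boto3 this could be called as `count_items(table.query, "document")`, where `table` is a DynamoDB `Table` resource; how the project's resolver is actually written is not shown in these notes.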
-
GraphQL Type Generation & Unit Testing — Replaced 60+ hand-written GraphQL query/mutation/subscription files with auto-generated types via `@graphql-codegen`, added typed AWSJSON parsers with unit tests (vitest + jsdom), and integrated a CI codegen-check to prevent type drift.
-
Third-Party Model Support — Added Meta Llama 4 Maverick 17B, Llama 4 Scout 17B, Google Gemma 3 27B IT, and NVIDIA Nemotron Nano 12B v2 VL as selectable models across all pipeline stages (OCR, Classification, Extraction, Assessment, Summarization, Evaluation, Discovery, Agents, Rule Validation). Includes per-token pricing configuration and EU region fallback mappings for Llama 4 models. (#217)
-
Load Test Config Version Support — Added `--config-version` parameter to the `idp-cli load-test` command, enabling load tests to target a specific configuration version. Files uploaded during load tests now include `config-version` S3 metadata, consistent with the `process` command behavior.
-
Deploy Failure Root Cause Analysis — Enhanced `idp-cli deploy` failure reporting to recursively analyze nested stack events and identify actual root causes. Previously, failures in nested stacks showed only a generic "Embedded stack was not successfully created" message. Now displays a structured "Root Cause Analysis" section with the specific resource, type, and error message from the nested stack that caused the failure, along with cascade failure counts.
-
MCP Server — Added an additional tool to the MCP Server for retrieving the results of a processed document from the IDP system.
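The nested-stack root cause analysis added to `idp-cli deploy` in this release can be pictured as a recursive walk over stack events. The sketch below is an illustration under assumptions, not the CLI's implementation: `find_root_causes` is a hypothetical function, and `get_events` is an injected callable that returns event dictionaries with the field names CloudFormation uses in `DescribeStackEvents` responses.

```python
STACK_TYPE = "AWS::CloudFormation::Stack"


def find_root_causes(stack_id, get_events, _seen=None):
    """Return the failed leaf resources, descending into failed nested stacks."""
    seen = set() if _seen is None else _seen
    if stack_id in seen:  # guard against visiting a stack twice
        return []
    seen.add(stack_id)

    causes = []
    for event in get_events(stack_id):
        if not event["ResourceStatus"].endswith("_FAILED"):
            continue
        physical_id = event.get("PhysicalResourceId")
        if event["ResourceType"] == STACK_TYPE:
            if physical_id == stack_id:
                continue  # the stack's own status event, not a resource failure
            nested = find_root_causes(physical_id, get_events, seen) if physical_id else []
            if nested:
                causes.extend(nested)  # report the deeper cause instead of the wrapper
                continue
        causes.append({
            "stack": stack_id,
            "resource": event["LogicalResourceId"],
            "type": event["ResourceType"],
            "reason": event.get("ResourceStatusReason", ""),
        })
    return causes
```

When a nested stack fails but none of its own events explain why, the sketch falls back to reporting the nested stack resource itself, so the caller always gets at least the generic message.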
Changed
-
OCR Benchmark Config Optimization — Optimized `config_library/unified/ocr-benchmark` configuration with targeted field descriptions, explicit model/prompt/OCR settings, and corrected date format (YYYY-MM-DD to match ground truth). Improved overall extraction accuracy from 51.5% to 75.2% on the full 293-document benchmark at equivalent cost (~$2.62). Classification remains 100% across all 9 document classes. (#220)
-
GraphQL Type Generation & Unit Testing — Replaced 60+ hand-written GraphQL query/mutation/subscription files with auto-generated types via `@graphql-codegen`, added typed AWSJSON parsers with unit tests (vitest + jsdom), and integrated a CI codegen-check to prevent type drift.
Fixed
-
AgentCore Gateway Manager — Fixed an issue where the gateway was not deleted when the stack was deleted.
-
Configuration Page Error Display — Fixed `[object Object]` error message when configuration loading fails (e.g., due to Lambda throttling) by properly extracting error messages from Amplify GraphQL error responses.
Templates
- us-west-2: https://s3.us-west-2.amazonaws.com/aws-ml-blog-us-west-2/artifacts/genai-idp/idp-main_0.5.1.yaml
- us-east-1: https://s3.us-east-1.amazonaws.com/aws-ml-blog-us-east-1/artifacts/genai-idp/idp-main_0.5.1.yaml
- eu-central-1: https://s3.eu-central-1.amazonaws.com/aws-ml-blog-eu-central-1/artifacts/genai-idp/idp-main_0.5.1.yaml
v0.5.0
[0.5.0]
Added
-
Unified Pattern — Merged Pattern-1 (BDA) and Pattern-2 (Pipeline) into a single deployment. Switch between BDA and Pipeline processing modes at runtime using the `use_bda` configuration toggle — no redeployment needed. Use Test Studio to compare accuracy and cost across both modes to find the optimal approach for your documents. See the Migration Guide for upgrade instructions.
-
Rule Validation for BDA mode — Rule validation (business rule checking) is now available in both BDA and Pipeline modes. Previously it was Pipeline-only.
-
Fake W-2 Tax Form Test Set Auto-Deployment — New pre-deployed benchmark test set with 2,000 synthetically generated US W-2 tax form images and structured ground truth, sourced from HuggingFace (`singhsays/fake-w2-us-tax-form-dataset`, originally from Kaggle under CC0: Public Domain license). Features 45 ground truth fields per document covering employer info (EIN, name, address), employee info (SSN, name, address), federal wages/taxes (boxes 1-8), compensation codes (boxes 12a-d), checkboxes (box 13), and state/local taxes (boxes 15-20). Includes both clean and noisy image variants for testing OCR robustness. Ideal for benchmarking W-2 extraction accuracy, evaluating image quality impact on processing, and testing structured form data extraction at scale.
-
AWS Profile Support for CLI — Added optional `--profile` parameter to specify AWS credentials profile. Can be placed anywhere in the command. Automatically applies to all AWS SDK calls.
-
Enhanced `status` CLI/MCP Command with Advanced Search, Filtering, and Analytics — Added PK substring search (`--batch-id` now matches partial batch identifiers across multiple batches), `--object-status` filter for searching by processing status (COMPLETED, FAILED, etc.), `--get-time` flag for timing statistics (processing, queue, total time with min/max outlier tracking), `--include-metering` flag for Lambda GB-seconds usage and cost estimates, and `--show-details` flag for detailed document information. Introduces `TrackingTableSearcher` class for flexible DynamoDB tracking table queries. Fully backward compatible with existing usage.
-
Added Replace/Merge sync modes for BDA synchronization — Both "Sync from BDA" and "Sync to BDA" now support two modes: Replace (default) aligns the target to match the source exactly, removing items not in the source; Merge adds source items to the target without removing existing items. The UI modal now always shows a mode selection and ARN input (pre-filled for linked projects).
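The difference between the two sync modes is easiest to see on plain dictionaries keyed by class name. The sketch below is illustrative only: `sync_classes` is a hypothetical function, and letting the source definition win when a name exists on both sides in Merge mode is an assumption the release note does not confirm.

```python
def sync_classes(source, target, mode="replace"):
    """Return the target's classes after syncing from source.

    replace: the result matches the source exactly (items only in target are removed).
    merge:   source items are added to the target; existing target items are kept.
    """
    if mode == "replace":
        return dict(source)
    if mode == "merge":
        merged = dict(target)
        merged.update(source)  # assumption: source wins on name conflicts
        return merged
    raise ValueError(f"unknown sync mode: {mode!r}")
```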
Deprecated
-
Pattern-1 (BDA) and Pattern-2 (Pipeline) separate deployments — Replaced by the Unified Pattern. Existing stacks are automatically upgraded. See the Migration Guide for details.
-
Pattern-3 (UDOP + Bedrock) — Pattern-3 is no longer available as a deployment option. If you are currently using Pattern-3 with a SageMaker UDOP endpoint, do not upgrade to v0.5.x without first testing in a non-production environment. You can use the Lambda Inference Hooks feature (introduced in v0.4.15) to call your existing SageMaker UDOP endpoint from the unified pattern's classification step via a custom Lambda function.
Changed
-
Switched `idp_sdk` pyproject.toml to auto-discovery — Replaced explicit subpackage listing with `setuptools.packages.find` using `include = ["idp_sdk*"]` so new subpackages are automatically included without manual pyproject.toml updates.
-
Resilient Test Set Deployment — Graceful Degradation on Download Failures — All test set deployer Lambdas (RealKIE-FCC, OmniAI-OCR-Benchmark, DocSplit-Poly-Seq) now handle download failures gracefully instead of causing CloudFormation stack rollbacks. When a dataset source (HuggingFace) is unreachable or a download fails, the deployer creates a FAILED test set record in DynamoDB with a descriptive error message visible in the Test Studio UI, and sends `cfnresponse.SUCCESS` to CloudFormation so the stack deployment continues. Previously failed deployments are automatically retried on the next stack update. This ensures transient third-party service outages never block IDP infrastructure deployment.
-
Replaced PyMuPDF (AGPL-3.0) with pypdfium2 (Apache-2.0/BSD-3-Clause) for PDF rendering — Resolves license incompatibility with the project's MIT-0 license. pypdfium2 provides equivalent PDF-to-image rendering using PDFium engine. Page rendering is now performed sequentially before parallel OCR processing to ensure thread-safety.
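The ordering described in the PDF rendering change (render pages one at a time, then run OCR in parallel) can be sketched as follows. `process_document`, `render_page`, and `ocr_page` are hypothetical names, and the two callables are stand-ins supplied by the caller. The sketch does not call pypdfium2 and only illustrates keeping the rendering step out of the thread pool.

```python
from concurrent.futures import ThreadPoolExecutor


def process_document(page_count, render_page, ocr_page, max_workers=4):
    """Render pages sequentially, then OCR the rendered images in parallel."""
    # Rendering stays on the calling thread because the renderer is treated
    # as not thread-safe.
    images = [render_page(index) for index in range(page_count)]

    # OCR calls are independent, so they can run concurrently.
    # Executor.map returns results in page order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(ocr_page, images))
```

The trade-off is that rendering no longer overlaps with OCR, in exchange for never having two threads inside the renderer at once.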
Fixed
-
Fixed "Sync from BDA" not removing IDP classes absent from BDA project — Previously, "Sync from BDA" only added new classes from the BDA project without removing classes that weren't in BDA. Now defaults to "Replace" mode which fully aligns the config version's classes with the BDA project, removing classes not present in BDA. A new "Merge" mode is also available to preserve the legacy additive behavior.
-
Fixed insufficient Lambda memory for Extraction, Assessment, and Evaluation functions in unified pattern template — Increased MemorySize from 512 MB (Extraction, Assessment) and 1024 MB (Evaluation) to 4096 MB to match all other document processing Lambda functions, preventing potential out-of-memory errors during document processing. (#205)
-
Fixed DOCX processing to extract text from embedded images and correct page splitting — DOCX files with embedded images (e.g., `<w:drawing>` elements) now have image content OCR'd and included in the extracted text instead of being silently skipped. Page splitting now uses DOCX metadata (explicit page breaks, image display dimensions from `wp:extent`, section properties) instead of inaccurate height estimates, producing correct page boundaries.
Templates
- us-west-2: https://s3.us-west-2.amazonaws.com/aws-ml-blog-us-west-2/artifacts/genai-idp/idp-main_0.5.0.yaml
- us-east-1: https://s3.us-east-1.amazonaws.com/aws-ml-blog-us-east-1/artifacts/genai-idp/idp-main_0.5.0.yaml
- eu-central-1: https://s3.eu-central-1.amazonaws.com/aws-ml-blog-eu-central-1/artifacts/genai-idp/idp-main_0.5.0.yaml