# GenAI IDP Accelerator Cost Considerations
Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. SPDX-License-Identifier: MIT-0
INFORMATION DOCUMENT
This document provides conceptual guidance on the cost factors to consider when using the GenAI Intelligent Document Processing (GenAIIDP) Accelerator solution. It outlines the primary contributors to cost and offers guidance on cost optimization across the different processing patterns.
The primary cost drivers for the GenAI IDP Accelerator solution include:

**Pattern 1 (Bedrock Data Automation):**
- BDA Processing: The main cost component for Pattern 1, charged per document processed.
- Amazon Bedrock: Used for summarization (if enabled).

**Pattern 2 (Textract + Bedrock):**
- Amazon Textract: Costs based on the number of pages processed.
- Amazon Bedrock: Costs based on the models used, input tokens processed, and output tokens generated.

**Pattern 3 (Textract + SageMaker + Bedrock):**
- Amazon Textract: Costs based on the number of pages processed.
- Amazon SageMaker: Costs based on the instance type used and running time.
- Amazon Bedrock: Costs for extraction and optional summarization.

**Shared infrastructure (all patterns):**
- Amazon S3: Costs based on the amount of data stored and storage duration.
- Amazon DynamoDB: Costs based on the stored document metadata.
- AWS Lambda: Costs based on request count, duration, and memory usage.
- AWS Step Functions: Costs based on state transitions for workflow orchestration.
- Amazon SQS: Costs based on message count for document queue management.
- Amazon CloudWatch: Costs for logs and metrics.
- Amazon Cognito: Costs based on monthly active users.
- AWS AppSync: Costs based on GraphQL API queries.
- Bedrock Knowledge Base: Costs for queries and storage if this optional feature is used.
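As a rough illustration of how the page- and token-based drivers combine, the sketch below estimates a per-document cost for a Textract-plus-Bedrock pattern. All prices are placeholder assumptions for illustration, not current AWS rates; always check the official pricing pages.

```python
# Back-of-envelope cost model for one document in a Textract + Bedrock pattern.
# All prices below are PLACEHOLDER ASSUMPTIONS, not actual AWS pricing.

TEXTRACT_PER_PAGE = 0.0015      # assumed $/page (basic text detection)
BEDROCK_PER_1K_INPUT = 0.003    # assumed $/1K input tokens
BEDROCK_PER_1K_OUTPUT = 0.015   # assumed $/1K output tokens

def estimate_document_cost(pages: int, input_tokens: int, output_tokens: int) -> float:
    """Estimate the per-document cost from page count and token usage."""
    ocr = pages * TEXTRACT_PER_PAGE
    llm = (input_tokens / 1000) * BEDROCK_PER_1K_INPUT \
        + (output_tokens / 1000) * BEDROCK_PER_1K_OUTPUT
    return round(ocr + llm, 6)

# A 10-page document with 8,000 input and 1,000 output tokens:
cost = estimate_document_cost(pages=10, input_tokens=8000, output_tokens=1000)  # 0.054
```

Plugging in your own document mix and real pricing gives a quick sanity check before running volume tests.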
Strategies for optimizing these costs:

**Right-size your model selection:**
- Use simpler models for routine document processing
- Reserve more powerful models for complex documents requiring higher accuracy
**Configure OCR features appropriately:**
- Only enable Textract features you need (e.g., TABLES, FORMS, SIGNATURES)
- Select processing options based on document requirements
**Implement prompt caching:**
- The solution supports prompt caching to significantly reduce costs when processing similar documents
- Especially effective when using few-shot examples, as these can be cached across invocations
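To see why caching the shared prefix (system prompt plus few-shot examples) matters, the sketch below compares input-token cost with and without a cached prefix. The base rate and the 90% cache-read discount are assumptions for illustration; actual Bedrock prompt-caching rates vary by model.

```python
# Illustrative savings from prompt caching when a large shared prefix
# (system prompt + few-shot examples) repeats across documents.
# The base rate and 90% cache-read discount are ASSUMPTIONS, not AWS pricing.

PER_1K_INPUT = 0.003          # assumed base $/1K input tokens
CACHE_READ_DISCOUNT = 0.90    # assumed discount on cache-read tokens

def input_cost(cached_tokens: int, fresh_tokens: int, use_cache: bool) -> float:
    """Input-token cost for one invocation, with or without a cached prefix."""
    cached_rate = PER_1K_INPUT * (1 - CACHE_READ_DISCOUNT) if use_cache else PER_1K_INPUT
    return (cached_tokens / 1000) * cached_rate + (fresh_tokens / 1000) * PER_1K_INPUT

# 5,000-token cached prefix + 2,000 fresh document tokens, per invocation:
without = input_cost(5000, 2000, use_cache=False)   # ≈ 0.021
with_cache = input_cost(5000, 2000, use_cache=True)  # ≈ 0.0075
```

The larger the reusable prefix relative to the per-document content, the bigger the win.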
**Optimize document preprocessing:**
- Compress images before processing to reduce token costs
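One way to gauge the savings: Anthropic's published rule of thumb estimates image tokens as roughly width × height / 750, so downscaling a page image by 2x in each dimension cuts its token count by roughly 4x. A minimal sketch, treating the formula as an approximation only:

```python
# Approximate image token count for Claude-family models on Bedrock,
# using Anthropic's published rule of thumb: tokens ≈ (width * height) / 750.
# Treat this as an estimate; actual tokenization is model-specific.

def image_tokens(width: int, height: int) -> int:
    return round((width * height) / 750)

full = image_tokens(2550, 3300)    # 300-DPI letter-size scan → 11220 tokens
scaled = image_tokens(1275, 1650)  # same page downscaled 2x → 2805 tokens
```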
**Implement tiered storage:**
- Move older processed documents to S3 Infrequent Access or Glacier
- Implement lifecycle policies based on document age
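A lifecycle configuration covering both bullets might look like the following sketch; the `processed/` prefix and transition days are assumptions to adapt to your own bucket layout and retention needs.

```json
{
  "Rules": [
    {
      "ID": "tier-processed-documents",
      "Filter": { "Prefix": "processed/" },
      "Status": "Enabled",
      "Transitions": [
        { "Days": 30, "StorageClass": "STANDARD_IA" },
        { "Days": 90, "StorageClass": "GLACIER" }
      ]
    }
  ]
}
```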
**Monitor and alert on costs:**
- Set up AWS Budgets to track spending
- Create alerts for unusual processing volumes
**Optimize knowledge base usage (if used):**
- Limit knowledge base queries to essential use cases
- Implement caching for common queries
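For in-process caching of repeated queries, even a standard-library memoizer helps. In this sketch `cached_kb_query` is a hypothetical stand-in for the actual Knowledge Base retrieval call; the caching wrapper is the point.

```python
# Sketch of in-process caching for repeated knowledge base queries.
# The query body below is a HYPOTHETICAL placeholder for the real
# Bedrock Knowledge Base retrieval call.

from functools import lru_cache

calls = {"count": 0}

@lru_cache(maxsize=256)
def cached_kb_query(question: str) -> str:
    calls["count"] += 1               # each cache miss costs a real KB query
    return f"answer to: {question}"   # placeholder for the real retrieval

cached_kb_query("What is the refund policy?")
cached_kb_query("What is the refund policy?")  # served from cache, no KB charge
```

In a multi-instance deployment, a shared cache (e.g., an external key-value store) would be needed instead of a per-process memoizer.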
The GenAI IDP Accelerator solution includes a built-in cost estimation feature in the web UI that calculates and displays the actual processing costs for each document. This feature:
- Tracks and displays costs per service/API used during document processing
- Breaks down costs by input tokens, output tokens, and page processing
- Shows the total estimated cost for each document processed
- Enables per-page cost analysis for detailed cost monitoring
- Uses service pricing from the solution configuration, which can be modified to reflect any pricing variations or special agreements
This real-time cost tracking helps you monitor actual usage patterns and optimize costs based on real-world usage.
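The kind of aggregation this feature performs can be sketched as follows; the entry format and service names here are illustrative, not the solution's actual data model.

```python
# Sketch of rolling per-service cost entries up into a document total
# and a per-page figure. Entry format and service names are ILLUSTRATIVE.

from collections import defaultdict

def summarize_costs(entries, pages: int):
    """entries: list of (service, cost) pairs recorded during processing."""
    by_service = defaultdict(float)
    for service, cost in entries:
        by_service[service] += cost
    total = sum(by_service.values())
    return dict(by_service), total, total / pages

entries = [("textract", 0.015), ("bedrock/input", 0.024), ("bedrock/output", 0.015)]
by_service, total, per_page = summarize_costs(entries, pages=10)  # total ≈ 0.054
```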
In addition to the built-in cost tracking, consider using these AWS tools:
- AWS Cost Explorer: Analyze and visualize your costs and usage over time
- AWS Budgets: Set custom budgets and receive alerts when costs exceed thresholds
- AWS Cost and Usage Reports: Generate detailed reports on your AWS costs and usage
Amazon Bedrock Application Inference Profiles let you tag Bedrock model invocations with custom cost-allocation tags (e.g., project, team, or migration program identifiers). Because all Bedrock calls in the GenAI IDP Accelerator are driven by model IDs in the configuration, you can enable cost attribution without any code changes — just create an inference profile and update the configuration.
By default, Bedrock usage appears in AWS Cost Explorer as a single line item per model. Application inference profiles enable you to:
- Attribute Bedrock costs to specific projects, teams, or migration programs using AWS cost-allocation tags
- Track costs per workload when multiple applications share the same AWS account
- Support MAP (Migration Acceleration Program) tagging, e.g., `map-migrated: migDNDBZMXMLZ`
- Separate cost reporting across different IDP document processing pipelines
To create an application inference profile:

1. Open the Amazon Bedrock Console → Inference → Inference profiles
2. Click **Create inference profile**
3. Select the foundation model currently used in your IDP configuration (e.g., `us.anthropic.claude-3-7-sonnet-20250219-v1:0`)
4. Add your cost-allocation tags, for example:
   - `map-migrated: migDNDBZMXMLZ`
   - `project: my-idp-workload`
5. Note the Application Inference Profile ARN after creation; it will look like:
   `arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/abcdef123456`
💡 Tip: Create separate inference profiles for each processing stage (classification, extraction, assessment) if you need per-stage cost breakdowns.
1. In the IDP web UI, go to **View/Edit Configuration**
2. Toggle to **JSON View** or **YAML View**
3. Find the `model` or `model_id` fields in the relevant processing step sections, for example:
   - Classification: `classification.model_id`
   - Extraction: `extraction.model_id`
   - Assessment: `assessment.model`
   - Summarization: `summarization.model`
4. Replace the standard model ID or cross-region inference profile ID with the new Application Inference Profile ARN

   Before:

   ```yaml
   model_id: us.anthropic.claude-3-7-sonnet-20250219-v1:0
   ```

   After:

   ```yaml
   model_id: arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/abcdef123456
   ```

5. Click **Save Changes**
⚠️ Note: Application inference profiles are region-specific. If you are using cross-region inference profiles (e.g., `us.*` or `eu.*` prefixes) for multi-region routing, be aware that an application inference profile pins invocations to the region where it was created. See the EU Region Model Support doc for details on cross-region behavior.
After processing documents with the updated configuration:
- Go to AWS Cost Explorer
- Group by your cost-allocation tag (e.g., `map-migrated`)
- Bedrock invocations made through the application inference profile will now appear under the tagged allocation
📝 Note: Cost allocation tags may take up to 24 hours to appear in Cost Explorer after first use. You must also activate the tags in the Billing console if they are user-defined tags.
The GenAI IDP Accelerator solution is designed to provide cost transparency and efficiency. However, actual costs will depend on your specific implementation, document characteristics, and processing needs. Always refer to the official AWS pricing pages for the most current pricing information for all services used.