| title | MCP Server |
|---|
Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. SPDX-License-Identifier: MIT-0
The GenAI IDP solution provides MCP (Model Context Protocol) integration that lets external applications such as Amazon Quick Suite access IDP functionality through Amazon Bedrock AgentCore Gateway. Third-party applications can then query processed document data and run analytics operations through natural language interfaces.
The MCP integration exposes IDP capabilities to external applications through the following features:
- Analytics Gateway: Provides natural language access to processed document analytics data
- Secure Authentication: Uses Amazon Cognito OAuth 2.0 for external application access
- MCP Protocol: Implements the Model Context Protocol for standardized tool integration
- Real-time Queries: Enables external applications to query document processing results in real-time
- Extensible Architecture: Designed to support additional IDP functionality in future releases
External applications can integrate with the IDP system through the AgentCore Gateway by:
- Authentication: Obtaining OAuth tokens from the IDP's Cognito User Pool
- Gateway Connection: Connecting to the AgentCore Gateway endpoint
- Tool Discovery: Discovering available analytics tools via MCP protocol
- Query Execution: Executing natural language queries against processed document data
External App → Cognito Auth → AgentCore Gateway → Analytics Lambda → IDP Data
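The tool discovery and query execution steps above can be sketched as JSON-RPC request bodies. This is a minimal sketch rather than the solution's client code: the helper names `build_tools_list_request` and `build_tool_call_request` are hypothetical, and the `jsonrpc` and `id` envelope fields are an assumption based on the JSON-RPC 2.0 framing that MCP uses (the examples in this document show only `method` and `params`).

```python
import json
from itertools import count

# Monotonic request IDs; the "id" field is part of the assumed JSON-RPC envelope.
_request_ids = count(1)


def build_tools_list_request() -> dict:
    """Build the body of an MCP tool discovery request."""
    return {"jsonrpc": "2.0", "id": next(_request_ids), "method": "tools/list"}


def build_tool_call_request(name: str, arguments: dict) -> dict:
    """Build the body of an MCP tool invocation for the named tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


if __name__ == "__main__":
    print(json.dumps(build_tools_list_request(), indent=2))
    print(json.dumps(
        build_tool_call_request(
            "search", {"query": "How many documents were processed last month?"}
        ),
        indent=2,
    ))
```

Either body would be sent as an HTTPS POST to the gateway endpoint with a Bearer token in the Authorization header.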
The MCP integration is controlled by the EnableMCP parameter:
Enable MCP Integration:
EnableMCP: 'true' # Default value
Disable MCP Integration:
EnableMCP: 'false'
When enabled, the stack automatically creates:
- AgentCore Gateway Manager Lambda function
- AgentCore Analytics Lambda function
- External App Client in Cognito User Pool
- Required IAM roles and policies
- AgentCore Gateway resource
- MCP Content Bucket for document uploads
When disabled, these resources are not created, reducing deployment complexity and costs.
The AgentCore Gateway provides five integrated tools for document processing and analytics:
The search tool answers natural language queries about document analytics and system information.
Input Schema:
{
"query": {
"type": "string",
"description": "Natural language question about processed documents or analytics data"
}
}
Output Schema:
{
"success": "boolean",
"query": "string",
"result": "string"
}
Example Request:
{
"method": "tools/call",
"params": {
"name": "search",
"arguments": {
"query": "How many documents were processed last month?"
}
}
}
Example Response:
{
"success": true,
"query": "How many documents were processed last month?",
"result": "1,250 documents were processed in the last month with a 98.5% success rate."
}
The process tool processes documents from S3 or from base64 content. To process documents via S3:
- Upload documents to the MCPContentBucket (available in CloudFormation stack outputs):
aws s3 cp documents/ s3://<MCPContentBucket>/documents/ --recursive
- Call the process tool with the S3 URI pointing to your uploaded documents
- The tool queues documents for processing through the IDP pipeline
Alternatively, process documents via base64 content by providing the encoded content directly to the tool.
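The base64 path can be sketched as follows. This is an illustrative helper, not part of the solution: the function name `build_process_request_from_file` is hypothetical, and it only assembles the request body using the `content`, `name`, and `prefix` arguments of the process tool.

```python
import base64
from pathlib import Path


def build_process_request_from_file(path: str, prefix: str = "mcp-batch") -> dict:
    """Encode a local file as base64 and wrap it in a process tool call."""
    file_path = Path(path)
    # The tool expects base64 text, so decode the encoded bytes to an ASCII string.
    encoded = base64.b64encode(file_path.read_bytes()).decode("ascii")
    return {
        "method": "tools/call",
        "params": {
            "name": "process",
            "arguments": {
                "content": encoded,
                "name": file_path.name,  # filename with extension, e.g. invoice.pdf
                "prefix": prefix,
            },
        },
    }
```

Because the whole file travels inside the request body, this path suits single documents; batches are better uploaded to the bucket and referenced by S3 URI.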
Input Schema:
{
"location": {
"type": "string",
"description": "S3 URI for batch processing (e.g., 's3://mcp-content-bucket/documents/'). Optional if content is provided."
},
"content": {
"type": "string",
"description": "Base64-encoded document content for single document processing. Optional if location is provided."
},
"name": {
"type": "string",
"description": "Document filename with extension (e.g., 'invoice.pdf'). Required if content is provided."
},
"prefix": {
"type": "string",
"description": "Optional batch ID prefix (default: 'mcp-batch')"
}
}
Output Schema:
{
"success": "boolean",
"batch_id": "string",
"documents_queued": "integer",
"message": "string"
}
Example Request (S3 Location):
{
"method": "tools/call",
"params": {
"name": "process",
"arguments": {
"location": "s3://mcp-content-bucket/documents/",
"prefix": "batch-001"
}
}
}
Example Request (Base64 Content):
{
"method": "tools/call",
"params": {
"name": "process",
"arguments": {
"content": "JVBERi0xLjQKJeLjz9MNCjEgMCBvYmo...",
"name": "invoice.pdf",
"prefix": "mcp-batch"
}
}
}
Example Response:
{
"success": true,
"batch_id": "mcp-batch-20250124-143000",
"documents_queued": 5,
"message": "Successfully queued 5 documents for processing"
}
The reprocess tool reprocesses documents starting from the classification or extraction step.
Input Schema:
{
"step": {
"type": "string",
"enum": ["classification", "extraction"],
"description": "Pipeline step to reprocess from"
},
"document_ids": {
"type": "string",
"description": "Comma-separated list of document IDs to reprocess (alternative to batch_id)"
},
"batch_id": {
"type": "string",
"description": "Batch ID to get document IDs from (alternative to document_ids)"
},
"region": {
"type": "string",
"description": "AWS region (optional)"
}
}
Output Schema:
{
"success": "boolean",
"batch_id": "string",
"documents_queued": "integer",
"step": "string",
"message": "string"
}
Example Request:
{
"method": "tools/call",
"params": {
"name": "reprocess",
"arguments": {
"step": "extraction",
"batch_id": "mcp-batch-20250124-143000"
}
}
}
Example Response:
{
"success": true,
"batch_id": "mcp-batch-20250124-143000",
"documents_queued": 5,
"step": "extraction",
"message": "Successfully queued 5 documents for extraction reprocessing"
}
The get_results tool retrieves processing results and extracted metadata for all documents in a batch.
Input Schema:
{
"batch_id": {
"type": "string",
"description": "Batch identifier (e.g., 'mcp-batch-20250124-143022'). Required to identify which batch to retrieve metadata from."
},
"section_id": {
"type": "integer",
"description": "Section number within documents (default: 1). Use for multi-section documents like healthcare packages."
},
"limit": {
"type": "integer",
"description": "Maximum documents to return per page (default: 10, max: 100)."
},
"next_token": {
"type": "string",
"description": "Pagination token from previous request for retrieving next page of results."
}
}
Output Schema:
{
"success": "boolean",
"batch_id": "string",
"section_id": "integer",
"count": "integer",
"total_in_batch": "integer",
"documents": "array",
"next_token": "string (optional)",
"message": "string"
}
Example Request:
{
"method": "tools/call",
"params": {
"name": "get_results",
"arguments": {
"batch_id": "mcp-batch-20250124-143022",
"section_id": 1,
"limit": 10
}
}
}
Example Response:
{
"success": true,
"batch_id": "mcp-batch-20250124-143022",
"section_id": 1,
"count": 1,
"total_in_batch": 1,
"documents": [
{
"document_id": "mcp-batch-20250124-143022/document-001.pdf",
"document_class": "invoice",
"fields": {
"vendor_info": {
"name": "<vendor_name>",
"address": "<vendor_address>",
"tax_id": "<tax_id>"
},
"line_items": [
{"description": "<item_description>", "amount": "<amount>"},
{"description": "<item_description>", "amount": "<amount>"}
],
"total_amount": "<total>",
"invoice_date": "<date>"
},
"confidence": {
"vendor_info": {
"name": 0.98,
"address": 0.95,
"tax_id": 1.0
},
"total_amount": 0.99,
"invoice_date": 0.97
},
"page_count": 1,
"status": "COMPLETED"
}
],
"message": "Retrieved results for 1 document"
}
The status tool reports batch and document processing status.
Input Schema:
{
"batch_id": {
"type": "string",
"description": "Batch identifier (e.g., 'mcp-batch-20250124-143000')"
},
"options": {
"type": "object",
"description": "Optional status parameters",
"properties": {
"detailed": {
"type": "boolean",
"description": "Include per-document details (default: false)"
},
"include_errors": {
"type": "boolean",
"description": "Include error details (default: true)"
}
}
},
"region": {
"type": "string",
"description": "AWS region (optional)"
}
}
Output Schema:
{
"success": "boolean",
"batch_id": "string",
"status": {
"total": "integer",
"completed": "integer",
"in_progress": "integer",
"failed": "integer",
"queued": "integer"
},
"progress": {
"percentage": "number"
},
"all_complete": "boolean"
}
Example Request:
{
"method": "tools/call",
"params": {
"name": "status",
"arguments": {
"batch_id": "mcp-batch-20250124-143000",
"options": {
"detailed": true
}
}
}
}
Example Response:
{
"success": true,
"batch_id": "mcp-batch-20250124-143000",
"status": {
"total": 5,
"completed": 3,
"in_progress": 1,
"failed": 0,
"queued": 1
},
"progress": {
"percentage": 60.0
},
"all_complete": false
}
The get_results tool returns paginated results:
- Default page size: 10 documents
- Maximum page size: 100 documents
- Use next_token to retrieve subsequent pages
- total_in_batch shows the complete batch size
- Per-document data is accurate for the current page only
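The pagination rules above can be sketched as a generator that follows next_token until the batch is exhausted. This is a minimal sketch under stated assumptions: `iter_batch_documents` and the `call_tool` callable are hypothetical names, and `call_tool(name, arguments)` is assumed to send a tools/call request and return the tool's parsed output as a dict.

```python
from typing import Callable, Iterator


def iter_batch_documents(
    call_tool: Callable[[str, dict], dict],
    batch_id: str,
    section_id: int = 1,
    limit: int = 100,
) -> Iterator[dict]:
    """Yield every document in a batch, one get_results page at a time."""
    next_token = None
    while True:
        arguments = {"batch_id": batch_id, "section_id": section_id, "limit": limit}
        if next_token:
            # Only send next_token when a previous page supplied one.
            arguments["next_token"] = next_token
        page = call_tool("get_results", arguments)
        if not page.get("success"):
            raise RuntimeError(page.get("message", "get_results failed"))
        yield from page.get("documents", [])
        next_token = page.get("next_token")
        if not next_token:
            break  # no token means this was the last page
```

Passing the transport in as a callable keeps the loop independent of how the request is actually sent, which also makes it easy to exercise with a stub.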
For multi-section documents (e.g., lending packages):
- Section 1: Primary extraction results
- Sections 2+: Additional document types within the same file
- Use the section_id parameter to retrieve specific sections
Confidence scores mirror the field structure exactly:
- Flat fields: Confidence is a numeric value (0.0-1.0)
- Nested objects: Confidence is nested with the same structure as fields
- Array fields: Confidence scores are not provided for array items (e.g., line_items)
- Null values: Fields with null values have a confidence score of 0.0
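Because confidence mirrors the field structure, the two objects can be walked together. The following is a sketch only: `low_confidence_fields` is a hypothetical helper, and the 0.9 threshold is an arbitrary illustrative value, not a setting of the solution.

```python
def low_confidence_fields(fields: dict, confidence: dict, threshold: float = 0.9, path: str = "") -> list:
    """Return (dotted_path, score) pairs for scalar fields scoring below threshold."""
    flagged = []
    for key, value in fields.items():
        field_path = f"{path}.{key}" if path else key
        score = confidence.get(key)
        if isinstance(value, dict):
            # Nested objects carry a nested confidence object with the same keys.
            nested = score if isinstance(score, dict) else {}
            flagged.extend(low_confidence_fields(value, nested, threshold, field_path))
        elif isinstance(value, list):
            continue  # array items have no confidence scores
        elif isinstance(score, (int, float)) and score < threshold:
            flagged.append((field_path, score))
    return flagged
```

Null-valued fields surface naturally here, since their confidence of 0.0 falls below any positive threshold.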
Example:
{
"fields": {
"vendor_info": {"tax_id": "<tax_id>"},
"line_items": [{"description": "<item>", "amount": "<amount>"}]
},
"confidence": {
"vendor_info": {"tax_id": 1.0},
"line_items": null
}
}
- AgentCore Gateway Manager Lambda
  - Creates and manages the AgentCore Gateway
  - Handles CloudFormation custom resource lifecycle
  - Configures JWT authorization using Cognito
- AgentCore MCP Handler Lambda
  - Implements MCP protocol following AgentCore schema
  - Processes natural language queries via search_genaiidp tool
  - Translates queries to appropriate backend operations
  - Returns structured responses in natural language
- AgentCore Gateway
  - Amazon Bedrock AgentCore Gateway resource
  - Routes requests between external applications and MCP handler Lambda
  - Handles authentication and authorization
- External Application requests access token from Cognito
- Cognito User Pool validates credentials and returns JWT token
- External Application calls AgentCore Gateway with Bearer token
- AgentCore Gateway validates JWT token against Cognito
- Analytics Lambda processes the request and returns results
The Analytics Lambda has read-only access to:
- Analytics Database: Glue catalog with processed document metadata
- Reporting Bucket: S3 bucket containing analytics data and query results
- Configuration Tables: DynamoDB tables with system configuration
- Tracking Tables: DynamoDB tables with processing status
The MCP Server uses Amazon Cognito OAuth 2.0 for secure authentication:
- External applications obtain JWT tokens from the Cognito User Pool
- AgentCore Gateway validates JWT tokens on every request
- Tokens include scopes (openid, email, profile) for fine-grained access control
- Token expiration and refresh mechanisms prevent unauthorized access
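Since tokens expire, a client usually caches the access token and requests a new one shortly before expiry. The sketch below is one way to do that, not the solution's implementation: `CachedToken`, `fetch_token`, and the 60-second safety margin are illustrative assumptions, and `fetch_token` is assumed to return the parsed token endpoint response, including the standard OAuth 2.0 `expires_in` field.

```python
import time
from typing import Callable, Optional


class CachedToken:
    """Cache an access token and fetch a new one shortly before it expires."""

    def __init__(
        self,
        fetch_token: Callable[[], dict],
        clock: Callable[[], float] = time.monotonic,
        skew_seconds: float = 60.0,
    ) -> None:
        self._fetch_token = fetch_token
        self._clock = clock
        self._skew = skew_seconds  # refresh this many seconds early
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._token is None or now >= self._expires_at:
            payload = self._fetch_token()
            self._token = payload["access_token"]
            # A missing expires_in is treated as already expired (fetch every time).
            self._expires_at = now + float(payload.get("expires_in", 0)) - self._skew
        return self._token
```

For the client credentials flow, `fetch_token` would simply repeat the token request, so no separate refresh step is needed in this sketch.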
The AgentCore Analytics Lambda operates with least-privilege IAM permissions:
- Read-only access to DynamoDB tracking and configuration tables
- Read-only access to S3 analytics and reporting buckets
- No write permissions to input or output buckets
- Scoped permissions prevent access to resources outside the IDP stack
- Service role restricts Lambda execution to authorized operations only
Document processing through the MCP Server follows secure S3 access patterns:
- Input documents from S3 are processed through the standard IDP pipeline
- Base64-encoded documents are uploaded to a temporary MCP bucket with restricted access
- Temporary files are automatically cleaned up after processing
- All S3 operations use IAM role credentials (no long-lived access keys)
- Bucket policies restrict access to the IDP stack's execution roles
Data security is maintained throughout the MCP integration:
- In Transit: All communication between external applications and AgentCore Gateway uses HTTPS/TLS
- At Rest: DynamoDB tables and S3 buckets use AWS-managed encryption keys
- JWT Tokens: Signed with Cognito's private keys and validated using public keys
- Sensitive Data: Client secrets are stored securely in AWS Secrets Manager and rotated regularly
The stack creates a dedicated S3 bucket for MCP document uploads:
- Bucket Name: MCPContentBucket (available in CloudFormation stack outputs)
- Purpose: Upload documents for processing via the process tool
- Access: Accessible through the MCP Server tools with proper authentication
- Usage: Provide the S3 URI (e.g., s3://mcp-content-bucket/documents/) to the process tool
- Cleanup: Temporary files are automatically managed by the IDP pipeline
Example Workflow:
- Upload documents to MCPContentBucket via S3 console or AWS CLI
- Use the process tool with the S3 URI pointing to MCPContentBucket
- Documents are processed through the standard IDP pipeline
- Results are available in the output bucket
The IDP solution creates a Cognito User Pool with:
- Domain: Auto-generated unique domain (e.g., stack-name-timestamp.auth.region.amazoncognito.com)
- Password Policy: Configurable security requirements
- User Management: Admin-managed user creation
- OAuth Flows: Authorization code flow for external applications; client credentials flow for machine-to-machine (M2M) integrations (no user login required)
When MCP is enabled, the stack creates two separate Cognito User Pool Clients with different OAuth flows. Cognito does not allow mixing client_credentials and authorization_code flows on the same client, so each integration type requires its own dedicated client.
| CloudFormation Resource | Client Name | OAuth Flow | Purpose |
|---|---|---|---|
| ExternalAppClient | external-app-client | authorization_code | External apps, QuickSight integration |
| MCPConnectorClient | mcp-connector-client | client_credentials | MCP Connector machine-to-machine (M2M) auth, no user login required |
The ExternalAppClient is used for external applications requiring user-based login (e.g., Amazon QuickSight).
Client Configuration:
- Client Name: external-app-client
- Client Secret: Generated automatically
- Auth Flows: USER_PASSWORD_AUTH, ADMIN_USER_PASSWORD_AUTH, REFRESH_TOKEN_AUTH
- OAuth Flows: Authorization code flow
- OAuth Scopes: openid, email, profile
- Callback URLs:
  - CloudFront distribution URL
  - Quick Suite OAuth callback
  - Cognito User Pool domain
- Stack Outputs: MCPClientId, MCPClientSecret
The MCPConnectorClient is used by AI coding assistants (Cline, Amazon Q, etc.) that connect to the IDP MCP server via machine-to-machine (M2M) OAuth — authentication happens automatically in the background without any user login prompt.
Client Configuration:
- Client Name: mcp-connector-client
- Client Secret: Generated automatically
- OAuth Flows: client_credentials (machine-to-machine (M2M); no user login or browser redirect required)
- OAuth Scopes: idp-mcp-connector/access
- Stack Outputs: MCPConnectorClientId, MCPConnectorClientSecret
Note: Use MCPConnectorClientId / MCPConnectorClientSecret for MCP Connector configuration. These credentials use the machine-to-machine (M2M) client_credentials OAuth flow, meaning the connector authenticates directly using its client ID and secret; no user login or browser is involved. The MCPClientId / MCPClientSecret outputs are reserved for QuickSight and other external apps that use the authorization_code flow (user-interactive login).
Each client type uses a different OAuth flow for token acquisition:
MCP Connector — Client Credentials Flow (machine-to-machine (M2M): the connector authenticates using its client ID and secret directly, with no user login or browser redirect):
curl -X POST <MCPTokenURL> \
-H "Content-Type: application/x-www-form-urlencoded" \
-d "grant_type=client_credentials&scope=idp-mcp-connector/access" \
-u "<MCPConnectorClientId>:<MCPConnectorClientSecret>"
External App / QuickSight (Authorization Code Flow, user-interactive):
# Step 1: Get authorization code
<MCPAuthorizationURL>?\
response_type=code&\
client_id=<MCPClientId>&\
redirect_uri=CALLBACK_URL&\
scope=openid+email+profile
# Step 2: Exchange code for tokens
curl -X POST <MCPTokenURL> \
-H "Content-Type: application/x-www-form-urlencoded" \
-d "grant_type=authorization_code&client_id=<MCPClientId>&client_secret=<MCPClientSecret>&code=AUTH_CODE&redirect_uri=CALLBACK_URL"
When MCP integration is enabled, the CloudFormation stack provides the following outputs required for external application integration:
- MCPContentBucket: S3 bucket for uploading documents to process via MCP tools
  - Use this bucket to upload documents before calling the process tool
  - Provide the S3 URI from this bucket to the process tool's location parameter
- MCPServerEndpoint: The HTTPS endpoint for the MCP Server
  - The AgentCore Gateway URL for MCP protocol communication
  - Required for external applications to connect to the gateway via MCP protocol
The stack provides separate output parameters for each Cognito client:
MCP Connector (client_credentials — use for AI coding assistants):
- MCPConnectorClientId: Cognito client ID for the MCP Connector (machine-to-machine (M2M) client_credentials flow, no user login required)
  - Use this when configuring the IDP MCP Connector package
  - Required for client_credentials token requests
- MCPConnectorClientSecret: Cognito client secret for the MCP Connector (machine-to-machine (M2M) client_credentials flow)
  - Use this when configuring the IDP MCP Connector package
  - Should be securely stored (e.g., in environment variables or a secrets manager)
External App / QuickSight (authorization_code — use for user-facing applications):
- MCPClientId: Cognito client ID for the External App Client (QuickSight / authorization code flow)
  - Use this for Amazon QuickSight and other external applications requiring user login
  - Used in OAuth authorization code flows
- MCPClientSecret: Cognito client secret for the External App Client (QuickSight / authorization code flow)
  - Use this for Amazon QuickSight and other external applications requiring user login
  - Should be securely stored and rotated regularly
Shared authentication parameters:
- MCPUserPool: Cognito User Pool ID
  - Required for token validation and user management
  - Used by both clients
- MCPTokenURL: OAuth token endpoint URL
  - Format: https://domain-name.auth.region.amazoncognito.com/oauth2/token
  - Used for obtaining access tokens via both OAuth flows
- MCPAuthorizationURL: OAuth authorization endpoint URL
  - Format: https://domain-name.auth.region.amazoncognito.com/oauth2/authorize
  - Used for initiating OAuth authorization code flows (External App / QuickSight only)
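The outputs listed above can also be read programmatically instead of being copied from the console. The following is a sketch under assumptions: the helper names are hypothetical, the stack name is supplied by the caller, and boto3 with valid AWS credentials is assumed to be available when `load_mcp_outputs` is called.

```python
# Outputs needed for the M2M (client_credentials) integration described above.
REQUIRED_M2M_OUTPUTS = (
    "MCPServerEndpoint",
    "MCPTokenURL",
    "MCPConnectorClientId",
    "MCPConnectorClientSecret",
)


def outputs_to_dict(outputs: list) -> dict:
    """Flatten CloudFormation's [{OutputKey, OutputValue}, ...] list into a dict."""
    return {item["OutputKey"]: item["OutputValue"] for item in outputs}


def load_mcp_outputs(stack_name: str, region_name=None) -> dict:
    """Fetch stack outputs and check that the MCP Connector outputs are present."""
    import boto3  # imported lazily so outputs_to_dict works without the AWS SDK

    cfn = boto3.client("cloudformation", region_name=region_name)
    stack = cfn.describe_stacks(StackName=stack_name)["Stacks"][0]
    outputs = outputs_to_dict(stack.get("Outputs", []))
    missing = [key for key in REQUIRED_M2M_OUTPUTS if key not in outputs]
    if missing:
        # These outputs only exist when the stack was deployed with EnableMCP 'true'.
        raise KeyError(f"Stack {stack_name} is missing MCP outputs: {missing}")
    return outputs
```

Treat the returned client secret like any other credential and avoid logging or persisting the dictionary as-is.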
This example uses the MCP Connector client credentials (MCPConnectorClientId / MCPConnectorClientSecret) for machine-to-machine (M2M) authentication — the application authenticates directly using its client ID and secret, with no user login or browser redirect involved.
import requests
import json
# Configuration from CloudFormation outputs
GATEWAY_URL = "<MCPServerEndpoint>" # From stack outputs
CLIENT_ID = "<MCPConnectorClientId>" # From stack outputs (M2M client_credentials client)
CLIENT_SECRET = "<MCPConnectorClientSecret>" # From stack outputs (M2M client_credentials client)
TOKEN_URL = "<MCPTokenURL>" # From stack outputs
MCP_BUCKET = "<MCPContentBucket>" # From stack outputs
# Get access token via client_credentials flow
token_response = requests.post(
TOKEN_URL,
headers={"Content-Type": "application/x-www-form-urlencoded"},
data={
"grant_type": "client_credentials",
"scope": "idp-mcp-connector/access"
},
auth=(CLIENT_ID, CLIENT_SECRET)
)
token_response.raise_for_status()  # Fail fast if Cognito rejects the token request
access_token = token_response.json()["access_token"]
# Process documents from MCP bucket
process_request = {
"method": "tools/call",
"params": {
"name": "process",
"arguments": {
"location": f"s3://{MCP_BUCKET}/documents/",
"prefix": "batch-001"
}
}
}
response = requests.post(
GATEWAY_URL,
headers={
"Authorization": f"Bearer {access_token}",
"Content-Type": "application/json"
},
json=process_request
)
result = response.json()
print(f"Processing result: {result}")
For Amazon QuickSight integration, configure the MCP connection using the External App Client outputs (authorization code flow). These are separate from the MCP Connector credentials.
- MCP Server: Use MCPServerEndpoint output value
- Client ID: Use MCPClientId output value (External App Client, authorization code flow)
- Client Secret: Use MCPClientSecret output value (External App Client, authorization code flow)
- Token URL: Use MCPTokenURL output value
- Authorization URL: Use MCPAuthorizationURL output value
- Content Bucket: Use MCPContentBucket output value for document uploads
Do not use MCPConnectorClientId / MCPConnectorClientSecret for QuickSight. Those are for the MCP Connector's machine-to-machine (M2M) client_credentials flow (no user login) and will not work with the authorization_code flow required by QuickSight, which expects a user login redirect.