This guide provides detailed operational information for DevOps engineers who manage the infrastructure. It assumes you have already read the main README and understand the basic deployment process.
- Development: Uses RemovalPolicy.DESTROY for easy cleanup
- Production: Uses RemovalPolicy.RETAIN for data protection
- All Environments: Enable X-Ray tracing and CloudWatch logging
# Standard deployment
npm run deploy <environment>
# Hotswap for faster iterations
npm run deploy:hotswap <environment>
# Get stack outputs
npm run outputs <environment>
# Destroy stack
npm run destroy <environment>
All resources follow: ${appName}-${environment}-resourceName
Examples:
- payerSyncOnboarder-dev-stack (CloudFormation stack)
- payerSyncOnboarder-dev-adyen-webhook-handler (Lambda function)
- payerSyncOnboarder-dev-adyen-lem-api-key (Secrets Manager secret)
- payerSyncOnboarder-dev-adyen-webhook-access-logs (S3 bucket)
- payerSyncOnboarder-dev-adyen-webhook-bus (EventBridge bus)
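As a rough illustration of the naming convention above, the following shell sketch assembles a resource name from its three parts. The function name resource_name is hypothetical and not part of the project; only the pattern and the example names come from this guide.

```bash
# resource_name is a hypothetical helper, not part of the project.
# It joins its arguments as ${appName}-${environment}-resourceName.
resource_name() {
  local app_name="${1:-}" environment="${2:-}" resource="${3:-}"
  if [ -z "$app_name" ] || [ -z "$environment" ] || [ -z "$resource" ]; then
    echo "usage: resource_name <appName> <environment> <resourceName>" >&2
    return 1
  fi
  printf '%s-%s-%s\n' "$app_name" "$environment" "$resource"
}

# Example: derive the log group of the dev webhook handler.
# echo "/aws/lambda/$(resource_name payerSyncOnboarder dev adyen-webhook-handler)"
```

A helper like this can keep ad hoc scripts consistent with the names the stack creates, instead of retyping them per environment.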
- Constructs: Reusable components for common patterns
- Environment Separation: Clear environment-specific configurations
- Tagging: Comprehensive resource tagging for cost and security
- Security: CDK Nag for compliance and security checks
One-time setup per AWS account/region:
bash scripts/setup-apigw-logs-role.sh
This creates the required IAM role for API Gateway logging.
Enabled for all Lambda functions:
- Distributed tracing across services
- Performance monitoring and bottleneck identification
- Request flow visualization
- Error tracking and debugging
Lambda metrics:
- Invocation Counts: Monitor function call volumes
- Duration: Track performance and identify bottlenecks
- Error Rates: Alert on function failures
- Throttles: Monitor concurrency limits
API Gateway metrics:
- Request Counts: Track API usage patterns
- Latency: Monitor response times
- 4XX/5XX Errors: Alert on client and server errors
- Cache Hit Rates: Optimize caching strategies
EventBridge metrics:
- Event Delivery: Monitor success/failure rates
- Rule Evaluation: Track rule matching and routing
- Target Invocation: Monitor Lambda processor success rates
- Performance: Track event processing latency
RDS metrics:
- CPU Utilization: Monitor database performance
- Connections: Track connection pool usage
- Storage: Monitor disk space and I/O
- Replication Lag: Monitor Multi-AZ replication
SQS dead-letter queue metrics:
- Message Counts: Monitor failed event processing
- Age of Oldest Message: Track processing delays
- Visibility Timeout: Monitor message processing times
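Metrics like these can be queried with aws cloudwatch get-metric-statistics, which expects full timestamps for its time window. The sketch below shows one way to compute such a window. The helper name metric_window is hypothetical, and the sketch assumes GNU date (the -d flag); BSD/macOS date uses different flags.

```bash
# metric_window is a hypothetical helper, not part of the project.
# Prints "<start> <end>" as UTC ISO 8601 timestamps covering the last N minutes.
# Assumes GNU date; the -d flag is not available in BSD/macOS date.
metric_window() {
  local minutes="${1:-60}"
  local start end
  start="$(date -u -d "${minutes} minutes ago" +%Y-%m-%dT%H:%M:%SZ)" || return 1
  end="$(date -u +%Y-%m-%dT%H:%M:%SZ)"
  printf '%s %s\n' "$start" "$end"
}

# Example (requires AWS credentials):
# set -- $(metric_window 60)
# aws cloudwatch get-metric-statistics \
#   --namespace AWS/Lambda --metric-name Errors \
#   --dimensions Name=FunctionName,Value=payerSyncOnboarder-dev-adyen-webhook-handler \
#   --start-time "$1" --end-time "$2" --period 300 --statistics Sum
```

Using UTC timestamps with second precision avoids the case where start and end collapse to the same calendar date.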
# Check webhook endpoint status
curl -I https://your-api-id.execute-api.region.amazonaws.com/prod/adyen/webhook
# Monitor webhook logs
aws logs tail /aws/lambda/payerSyncOnboarder-{env}-adyen-webhook-handler --follow
- Monitor HMAC validation failures
- Track signature verification success rates
- Alert on suspicious webhook patterns
# Monitor EventBridge logs
aws logs tail /aws/events/adyen-webhook-bus-{env} --follow
# Check processor logs
aws logs tail /aws/lambda/payerSyncOnboarder-{env}-standard-notification-handler --follow
# Check database metrics
aws cloudwatch get-metric-statistics \
--namespace AWS/RDS \
--metric-name CPUUtilization \
--dimensions Name=DBInstanceIdentifier,Value=payerSyncOnboarder-{env}-onboarding-reporting \
--start-time "$(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ)" \
--end-time "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
--period 300 \
--statistics Average
# Monitor database initialization
aws logs tail /aws/lambda/payerSyncOnboarder-{env}-db-init-custom-resource --follow
# Check initialization status
aws cloudformation describe-stack-events \
--stack-name payerSyncOnboarder-{env}-stack \
--query 'StackEvents[?ResourceType==`AWS::CloudFormation::CustomResource`]'
# Test reporting endpoints
curl -H "Authorization: Bearer YOUR_JWT_TOKEN" \
https://your-api-id.execute-api.region.amazonaws.com/reporting/schema
# Monitor reporting logs
aws logs tail /aws/lambda/payerSyncOnboarder-{env}-reporting-handler --follow
- Monitor response times for reporting queries
- Track database query performance
- Alert on slow reporting operations
- Infrastructure Recovery: Use CDK to redeploy
- Data Recovery: Restore from DynamoDB point-in-time recovery
- Database Recovery: Use RDS automated backups and snapshots
- Configuration Recovery: Secrets automatically restored from AWS Secrets Manager
- Webhook Recovery: Replay events from S3 storage if needed
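As an illustration of the DynamoDB data recovery step, the sketch below wraps the AWS CLI point-in-time restore command. The helper name restore_table_pitr is hypothetical, and because this guide does not name the DynamoDB tables, both table names are placeholders. Setting DRY_RUN=1 prints the command instead of running it.

```bash
# restore_table_pitr is a hypothetical helper; the guide does not name the
# DynamoDB tables, so both arguments are placeholders chosen by the caller.
# With DRY_RUN=1 the command is printed instead of executed.
restore_table_pitr() {
  local source_table="${1:-}" target_table="${2:-}"
  if [ -z "$source_table" ] || [ -z "$target_table" ]; then
    echo "usage: restore_table_pitr <source-table> <target-table>" >&2
    return 1
  fi
  # Point-in-time recovery restores into a new table, not in place.
  set -- aws dynamodb restore-table-to-point-in-time \
    --source-table-name "$source_table" \
    --target-table-name "$target_table" \
    --use-latest-restorable-time \
    --region us-east-2
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Example: preview the restore command without running it.
# DRY_RUN=1 restore_table_pitr <source-table> <restored-table>
```

Because the restore creates a new table, the application or stack configuration still has to be pointed at the restored table afterwards.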
- Update environment configuration
- Deploy with new environment name: npm run deploy <new-env>
- Configure monitoring and alerts
- Set up webhook endpoints in Adyen dashboard
- Update documentation
- Export any needed data from DynamoDB and RDS
- Destroy stack: npm run destroy <environment>
- Clean up any manual resources (S3 buckets, etc.)
- Remove webhook endpoints from Adyen dashboard
- Update documentation
# Check current Internet Gateway count
aws ec2 describe-internet-gateways --region us-east-2 --query 'length(InternetGateways)'
# Delete unused stacks to free up resources
aws cloudformation delete-stack --stack-name unused-stack-name --region us-east-2
# Delete conflicting bucket
aws s3 rb s3://bucket-name --force --region us-east-2
# Or use a different bucket name in the CDK stack
# Check database initialization logs (last hour; get-log-events would also require --log-stream-name)
aws logs tail "/aws/lambda/payerSyncOnboarder-{env}-db-init-custom-resource" \
--region us-east-2 \
--since 1h
# Check HMAC secret in Secrets Manager
aws secretsmanager get-secret-value \
--secret-id payerSyncOnboarder-{env}-adyen-hmac-secret \
--region us-east-2
# Check EventBridge logs (last hour; get-log-events would also require --log-stream-name)
aws logs tail "/aws/events/adyen-webhook-bus-{env}" \
--region us-east-2 \
--since 1h
# Check DLQ message counts
aws sqs get-queue-attributes \
--queue-url "https://sqs.region.amazonaws.com/account/payerSyncOnboarder-{env}-standard-notification-dlq" \
--attribute-names ApproximateNumberOfMessages \
--region us-east-2
- Increase timeout in CDK stack
- Optimize function code
- Check external API response times
- Monitor RDS metrics
- Check connection pool usage
- Optimize database queries
- Monitor API Gateway metrics
- Check Lambda cold starts
- Optimize function memory allocation
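One way to check Lambda cold starts is to look at the REPORT lines in the function logs: a REPORT line that includes an Init Duration field belongs to an invocation that initialized a new execution environment. The sketch below counts such lines from log text on standard input. The helper name cold_start_summary is hypothetical.

```bash
# cold_start_summary is a hypothetical helper, not part of the project.
# It reads Lambda log lines on stdin and counts REPORT lines; those that
# contain "Init Duration:" are treated as cold starts.
cold_start_summary() {
  awk '
    /REPORT RequestId:/ { total++; if ($0 ~ /Init Duration:/) cold++ }
    END { printf "%d cold starts out of %d invocations\n", cold, total }
  '
}

# Example (requires AWS credentials):
# aws logs tail /aws/lambda/payerSyncOnboarder-dev-reporting-handler --since 1h | cold_start_summary
```

A high share of cold starts on a latency-sensitive function is one signal that memory allocation or provisioned concurrency is worth revisiting.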
# Rotate secrets
aws secretsmanager rotate-secret \
--secret-id payerSyncOnboarder-{env}-adyen-lem-api-key \
--region us-east-2
# Check IAM access logs
aws cloudtrail lookup-events \
--lookup-attributes AttributeKey=EventName,AttributeValue=GetSecretValue \
--region us-east-2
- Monitor SSL certificate expiration
- Check TLS version compliance
- Alert on SSL configuration changes
- Monitor Lambda execution costs
- Track RDS instance usage
- Monitor S3 storage costs
- Check EventBridge event volumes
- Use Lambda provisioned concurrency for critical functions
- Implement RDS read replicas for reporting queries
- Optimize S3 lifecycle policies
- Monitor and adjust resource allocations
Set up alarms for:
- Lambda error rates > 5%
- API Gateway 5XX errors > 1%
- RDS CPU utilization > 80%
- SQS DLQ message count > 0
- EventBridge delivery failures
- Database connection count > 80% of the configured maximum
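The percentage thresholds above compare a failure count against a total. As a minimal sketch of that comparison, for example when spot-checking numbers pulled from CloudWatch by hand, the helper below succeeds only when the rate is strictly greater than the threshold. The name exceeds_rate is hypothetical, and the sketch is not a substitute for CloudWatch alarms.

```bash
# exceeds_rate is a hypothetical helper, not part of the project.
# Exit status 0 when failures/total, as a percentage, is strictly greater
# than threshold_pct; exit status 1 otherwise (including when total is 0).
exceeds_rate() {
  local failures="${1:-0}" total="${2:-0}" threshold_pct="${3:-0}"
  awk -v f="$failures" -v n="$total" -v t="$threshold_pct" \
    'BEGIN { if (n <= 0) exit 1; exit ((f * 100 > t * n) ? 0 : 1) }'
}

# Example: 6 errors in 100 invocations against the 5% Lambda threshold.
# if exceeds_rate 6 100 5; then echo "Lambda error rate above 5%"; fi
```

The comparison multiplies both sides instead of dividing, which avoids division by zero and keeps exact boundary cases such as 5 out of 100 at 5% from triggering.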
Configure SNS topics for:
- Critical infrastructure failures
- Security incidents
- Performance degradation
- Cost threshold alerts