feat(typescript):airflow-lambda-dynamodb-approval#1207
feat(typescript):airflow-lambda-dynamodb-approval#1207architec wants to merge 5 commits intoaws-samples:mainfrom
Conversation
| import { AwsMwaaCdkStack } from '../lib/aws-mwaa-cdk-stack'; | ||
|
|
||
| const app = new cdk.App(); | ||
| new AwsMwaaCdkStack(app, 'AwsMwaaCdkStack'); |
…odb-approval - Add lib/aws-mwaa-cdk-stack.ts with complete MWAA infrastructure - Update .gitignore to track TypeScript source files in lib/ - Stack includes VPC, S3, Lambda, DynamoDB, and MWAA environment - Addresses PR aws-samples#1207 review feedback
kaiz-io
left a comment
There was a problem hiding this comment.
Thanks for the MWAA example - the core concept (Airflow + Lambda + DynamoDB approval) is a useful pattern.
A few changes needed before this is ready to merge, mostly around IAM best practices. Since this is a public samples repo, people copy these patterns directly into production.
Summary of requested changes:
- Use CDK grant methods instead of manual IAM policies where possible (
grantInvoke,grantReadWriteData). This follows CDK best practices and automatically scopes permissions to specific resources. - Remove wildcard
resources: ['*']on Lambda permissions - scope to the demo function ARN. - Remove redundant S3 policy -
dagsBucket.grantReadWrite()already covers it. - Fix SQS ARN - uses
*for account ID instead of${this.account}. - Remove overly broad logs resource -
arn:aws:logs:...:*matches everything. - Update Lambda runtime from Python 3.9 to 3.12+ (CDK Nag AwsSolutions-L1).
- Change
example_dag.pyschedule from hourly toNone- auto-scheduling costs money if left deployed. - Remove
webserver.expose_config: 'True'- exposes sensitive config in the web UI.
The DynamoDB and approval DAG code looks good. Happy to re-review once these are addressed!
| statements: [ | ||
| new iam.PolicyStatement({ | ||
| effect: iam.Effect.ALLOW, | ||
| actions: [ |
There was a problem hiding this comment.
Uses * for the account ID. Should be ${this.account}:
`arn:aws:sqs:${this.region}:${this.account}:airflow-celery-*`| destinationKeyPrefix: 'dags', | ||
| }); | ||
|
|
||
| // 4. Create Demo Lambda Function for MWAA testing (original simple demo) |
There was a problem hiding this comment.
Python 3.9 is outdated (CDK Nag rule AwsSolutions-L1). Use lambda.Runtime.PYTHON_3_13 or at minimum PYTHON_3_12.
| 'sqs:GetQueueUrl', | ||
| 'sqs:ReceiveMessage', | ||
| 'sqs:SendMessage' | ||
| ], |
There was a problem hiding this comment.
This manual S3 policy is redundant - line 289 already calls dagsBucket.grantReadWrite(mwaaExecutionRole) which covers these permissions. You can remove this entire statement block.
| enabled: true, | ||
| logLevel: 'INFO' | ||
| }, | ||
| schedulerLogs: { |
There was a problem hiding this comment.
expose_config: 'True' exposes the full Airflow configuration (including connection strings and env vars) in the web UI. Even for a demo, this is a bad pattern to teach. Consider removing or setting to 'False'.
| LambdaExecutionPolicy: new iam.PolicyDocument({ | ||
| statements: [ | ||
| new iam.PolicyStatement({ | ||
| effect: iam.Effect.ALLOW, |
There was a problem hiding this comment.
arn:aws:logs:${this.region}:${this.account}:* matches every log group in the account. The first three resource entries already cover MWAA log groups. Remove this last entry.
| // 5. Create Security Group for MWAA | ||
| const mwaaSecurityGroup = new ec2.SecurityGroup(this, 'MwaaSecurityGroup', { | ||
| vpc, | ||
| description: 'Security group for MWAA environment', |
There was a problem hiding this comment.
This manual policy is correctly scoped to the table ARN, but CDK has a built-in grant method that does the same thing more idiomatically:
approvalTable.grantReadWriteData(mwaaExecutionRole);This automatically handles the table ARN and index ARNs with least-privilege permissions.
| sid: 'DynamoDBAccessPermissions' | ||
| }) | ||
| ] | ||
| }) |
There was a problem hiding this comment.
This grants lambda:InvokeFunction, lambda:GetFunction, and lambda:ListFunctions on * (every Lambda in the account). Since this is a samples repo, people will copy this pattern verbatim.
Use CDK's grant method instead to scope to the specific function:
demoLambdaFunction.grantInvoke(mwaaExecutionRole);If ListFunctions is needed for the lambda_invoke_dag.py, you can add a scoped statement just for that action.
| 'example_dag', | ||
| default_args=default_args, | ||
| description='A simple example DAG for MWAA', | ||
| schedule_interval=timedelta(hours=1), |
There was a problem hiding this comment.
This runs every hour automatically. For a demo example, this will cost money continuously if someone deploys and forgets about it. The other two DAGs correctly use schedule=None (manual trigger). Change this to:
schedule_interval=None,
Shows how to use Airflow with Lambda, and DynamoDB for human approval.
Airflow calls Lambda
Airflow adds entry to DDB, waiting for value change from DDB to continue in workflow
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.