Skip to content

feat(typescript):airflow-lambda-dynamodb-approval#1207

Open
architec wants to merge 5 commits intoaws-samples:mainfrom
architec:airflow-lambda-dynamodb-approval
Open

feat(typescript):airflow-lambda-dynamodb-approval#1207
architec wants to merge 5 commits intoaws-samples:mainfrom
architec:airflow-lambda-dynamodb-approval

Conversation

@architec
Copy link
Copy Markdown
Contributor

Shows how to use Airflow with Lambda, and DynamoDB for human approval.

Airflow calls Lambda
Airflow adds entry to DDB, waiting for value change from DDB to continue in workflow


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@architec architec changed the title feat:(typescript/airflow-lambda-dynamodb-approval) feat(typescript):airflow-lambda-dynamodb-approval Sep 3, 2025
Copy link
Copy Markdown
Contributor

@kaiz-io kaiz-io left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

REQUEST CHANGES: Missing critical infrastructure file

The main CDK stack file (lib/aws-mwaa-cdk-stack.ts) is missing from this PR.

import { AwsMwaaCdkStack } from '../lib/aws-mwaa-cdk-stack';

const app = new cdk.App();
new AwsMwaaCdkStack(app, 'AwsMwaaCdkStack');
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is this file?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kaiz-io Fixed. Added lib/aws-mwaa-cdk-stack.ts file in commit 267dabc.

@kaiz-io kaiz-io added the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. label Oct 9, 2025
kaiz-io and others added 3 commits October 9, 2025 14:37
…odb-approval

- Add lib/aws-mwaa-cdk-stack.ts with complete MWAA infrastructure
- Update .gitignore to track TypeScript source files in lib/
- Stack includes VPC, S3, Lambda, DynamoDB, and MWAA environment
- Addresses PR aws-samples#1207 review feedback
Copy link
Copy Markdown
Contributor

@kaiz-io kaiz-io left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the MWAA example - the core concept (Airflow + Lambda + DynamoDB approval) is a useful pattern.

A few changes needed before this is ready to merge, mostly around IAM best practices. Since this is a public samples repo, people copy these patterns directly into production.

Summary of requested changes:

  1. Use CDK grant methods instead of manual IAM policies where possible (grantInvoke, grantReadWriteData). This follows CDK best practices and automatically scopes permissions to specific resources.
  2. Remove wildcard resources: ['*'] on Lambda permissions - scope to the demo function ARN.
  3. Remove redundant S3 policy - dagsBucket.grantReadWrite() already covers it.
  4. Fix SQS ARN - uses * for account ID instead of ${this.account}.
  5. Remove overly broad logs resource - arn:aws:logs:...:* matches everything.
  6. Update Lambda runtime from Python 3.9 to 3.12+ (CDK Nag AwsSolutions-L1).
  7. Change example_dag.py schedule from hourly to None - auto-scheduling costs money if left deployed.
  8. Remove webserver.expose_config: 'True' - exposes sensitive config in the web UI.

The DynamoDB and approval DAG code looks good. Happy to re-review once these are addressed!

statements: [
new iam.PolicyStatement({
effect: iam.Effect.ALLOW,
actions: [
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uses * for the account ID. Should be ${this.account}:

`arn:aws:sqs:${this.region}:${this.account}:airflow-celery-*`

destinationKeyPrefix: 'dags',
});

// 4. Create Demo Lambda Function for MWAA testing (original simple demo)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Python 3.9 is outdated (CDK Nag rule AwsSolutions-L1). Use lambda.Runtime.PYTHON_3_13 or at minimum PYTHON_3_12.

'sqs:GetQueueUrl',
'sqs:ReceiveMessage',
'sqs:SendMessage'
],
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This manual S3 policy is redundant - line 289 already calls dagsBucket.grantReadWrite(mwaaExecutionRole) which covers these permissions. You can remove this entire statement block.

enabled: true,
logLevel: 'INFO'
},
schedulerLogs: {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

expose_config: 'True' exposes the full Airflow configuration (including connection strings and env vars) in the web UI. Even for a demo, this is a bad pattern to teach. Consider removing or setting to 'False'.

LambdaExecutionPolicy: new iam.PolicyDocument({
statements: [
new iam.PolicyStatement({
effect: iam.Effect.ALLOW,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

arn:aws:logs:${this.region}:${this.account}:* matches every log group in the account. The first three resource entries already cover MWAA log groups. Remove this last entry.

// 5. Create Security Group for MWAA
const mwaaSecurityGroup = new ec2.SecurityGroup(this, 'MwaaSecurityGroup', {
vpc,
description: 'Security group for MWAA environment',
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This manual policy is correctly scoped to the table ARN, but CDK has a built-in grant method that does the same thing more idiomatically:

approvalTable.grantReadWriteData(mwaaExecutionRole);

This automatically handles the table ARN and index ARNs with least-privilege permissions.

sid: 'DynamoDBAccessPermissions'
})
]
})
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This grants lambda:InvokeFunction, lambda:GetFunction, and lambda:ListFunctions on * (every Lambda in the account). Since this is a samples repo, people will copy this pattern verbatim.

Use CDK's grant method instead to scope to the specific function:

demoLambdaFunction.grantInvoke(mwaaExecutionRole);

If ListFunctions is needed for the lambda_invoke_dag.py, you can add a scoped statement just for that action.

'example_dag',
default_args=default_args,
description='A simple example DAG for MWAA',
schedule_interval=timedelta(hours=1),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This runs every hour automatically. For a demo example, this will cost money continuously if someone deploys and forgets about it. The other two DAGs correctly use schedule=None (manual trigger). Change this to:

schedule_interval=None,

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants