|
| 1 | +# Lambda S3 Download |
| 2 | + |
| 3 | +This pattern deploys a Lambda function that downloads a file from a URL and uploads it to an S3 bucket using multipart upload. It streams the file through `/tmp` in configurable chunks, so it can handle files larger than the function's memory and ephemeral storage. |
| 4 | + |
| 5 | +Important: this application uses various AWS services, and costs apply once usage exceeds the Free Tier. See the [AWS Pricing page](https://aws.amazon.com/pricing/) for details. You are responsible for any AWS costs incurred. No warranty is implied in this example. |
| 6 | + |
| 7 | +## Requirements |
| 8 | + |
| 9 | +* [Create an AWS account](https://portal.aws.amazon.com/gp/aws/developer/registration/index.html) if you do not already have one and log in. The IAM user that you use must have sufficient permissions to make necessary AWS service calls and manage AWS resources. |
| 10 | +* [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html) installed and configured |
| 11 | +* [Git](https://git-scm.com/book/en/v2/Getting-Started-Installing-Git) installed |
| 12 | +* [AWS Serverless Application Model](https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/serverless-sam-cli-install.html) (AWS SAM) installed |
| 13 | + |
| 14 | +## Deployment Instructions |
| 15 | + |
| 16 | +1. Create a new directory, navigate to that directory in a terminal and clone the GitHub repository: |
| 17 | + ``` |
| 18 | + git clone https://github.com/aws-samples/serverless-patterns |
| 19 | + ``` |
| 20 | +1. Change directory to the pattern directory: |
| 21 | + ``` |
| 22 | + cd serverless-patterns/lambda-s3-download |
| 23 | + ``` |
| 24 | +1. Build the application: |
| 25 | + ``` |
| 26 | + sam build |
| 27 | + ``` |
| 28 | +1. Deploy the application: |
| 29 | + ``` |
| 30 | + sam deploy --guided |
| 31 | + ``` |
| 32 | +1. During the prompts: |
| 33 | + * Enter a stack name |
| 34 | + * Enter the desired AWS Region |
| 35 | + * Enter the target S3 bucket name (the bucket must already exist) |
| 36 | + * Allow SAM CLI to create IAM roles with the required permissions |
| 37 | +
|
| 38 | + After you have run `sam deploy --guided` once and saved the arguments to a configuration file (`samconfig.toml`), you can run `sam deploy` in future to reuse those defaults. |
| 39 | +
|
| 40 | +1. Note the outputs from the SAM deployment process. These contain the resource names and/or ARNs which are used for testing. |
| 41 | +
|
| 42 | +## How it works |
| 43 | +
|
| 44 | +The Lambda function: |
| 45 | +
|
| 46 | +1. Receives a download URL and filename via the event payload |
| 47 | +2. Initiates an S3 multipart upload with SHA256 checksums |
| 48 | +3. Streams the file from the URL in chunks (default 128 MB), writing each chunk to `/tmp` and uploading it as a multipart part |
| 49 | +4. Cleans up each chunk from `/tmp` after uploading to stay within the 10 GB ephemeral storage limit |
| 50 | +5. Completes the multipart upload and returns the S3 object checksum |
| 51 | +6. If any step fails, aborts the multipart upload to avoid orphaned parts |
| 52 | +
|
| 53 | +The function is configured with a 15-minute timeout, 1 GB memory, and 10 GB ephemeral storage. |
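
The chunk loop described above can be sketched as follows. This is a minimal illustration, not the pattern's actual handler: `stream_parts`, `upload_part`, and `abort` are hypothetical names, and the two callbacks stand in for the S3 multipart calls (with boto3 these would presumably be `upload_part` and `abort_multipart_upload`). The sketch also assumes `source.read(n)` returns a full chunk unless the stream has ended.

```python
import io
import os
import tempfile
from typing import BinaryIO, Callable, Dict, List


def stream_parts(
    source: BinaryIO,
    chunk_size: int,
    upload_part: Callable[[int, str], str],
    abort: Callable[[], None],
    tmp_dir: str = tempfile.gettempdir(),
) -> List[Dict[str, object]]:
    """Copy `source` one chunk at a time through a scratch file.

    `upload_part(part_number, path)` is a hypothetical callback that uploads
    the scratch file and returns an identifier (for S3, the part's ETag).
    `abort()` is a hypothetical callback invoked when any step fails.
    """
    parts: List[Dict[str, object]] = []
    part_number = 1  # multipart part numbers start at 1
    try:
        while True:
            chunk = source.read(chunk_size)
            if not chunk:
                break  # end of stream
            fd, path = tempfile.mkstemp(dir=tmp_dir)
            try:
                with os.fdopen(fd, "wb") as scratch:
                    scratch.write(chunk)
                etag = upload_part(part_number, path)
            finally:
                # Remove the scratch file whether or not the upload succeeded,
                # so at most one chunk occupies the scratch directory.
                os.remove(path)
            parts.append({"PartNumber": part_number, "ETag": etag})
            part_number += 1
    except Exception:
        abort()  # avoid leaving orphaned parts behind
        raise
    return parts
```

The returned list has the shape that a "complete multipart upload" call expects (part number plus ETag per part). Because each scratch file is deleted before the next chunk is read, disk usage stays at roughly one chunk regardless of the total file size.
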
| 54 | +
|
| 55 | +## Testing |
| 56 | +
|
| 57 | +Invoke the Lambda function with a test event: |
| 58 | +
|
| 59 | +```bash |
| 60 | +aws lambda invoke \ |
| 61 | + --function-name FUNCTION_NAME \ |
| 62 | + --cli-binary-format raw-in-base64-out \ |
| 63 | + --payload '{ |
| 64 | + "download_url": "https://example.com/file.zip", |
| 65 | + "download_filename": "file.zip" |
| 66 | + }' \ |
| 67 | + response.json |
| 68 | +``` |
| 69 | + |
| 70 | +Optional event parameters: |
| 71 | + |
| 72 | +| Parameter | Description | Default | |
| 73 | +|---|---|---| |
| 74 | +| `target_bucket` | S3 bucket name (overrides the deployed parameter) | Value from template parameter | |
| 75 | +| `target_bucket_region` | S3 bucket region | Lambda's region | |
| 76 | +| `chunk_size_mb` | Size of each download chunk in MB (clamped between 5 and 5120) | 128 | |
| 77 | + |
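
The clamping of `chunk_size_mb` could look like the sketch below. The helper name `resolve_chunk_size_mb` and the fallback to the default for non-numeric input are assumptions made for illustration; only the bounds (5 and 5120) and the default (128) come from the table above. The bounds match the S3 multipart part-size limits of 5 MiB and 5 GiB.

```python
MIN_CHUNK_MB = 5       # lower bound from the table above
MAX_CHUNK_MB = 5120    # upper bound from the table above
DEFAULT_CHUNK_MB = 128


def resolve_chunk_size_mb(event: dict) -> int:
    """Return the chunk size in MB, clamped to [MIN_CHUNK_MB, MAX_CHUNK_MB].

    Falling back to the default on missing or non-numeric input is an
    assumption of this sketch, not documented behavior of the pattern.
    """
    raw = event.get("chunk_size_mb", DEFAULT_CHUNK_MB)
    try:
        requested = int(raw)
    except (TypeError, ValueError):
        requested = DEFAULT_CHUNK_MB
    return max(MIN_CHUNK_MB, min(MAX_CHUNK_MB, requested))
```

For example, an event with `"chunk_size_mb": 1` would resolve to 5, and one with `"chunk_size_mb": 99999` would resolve to 5120.
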
| 78 | +## Known Limitations |
| 79 | + |
| 80 | +- Lambda enforces a 15-minute maximum timeout. If the download and upload together take longer than that, the function is terminated mid-stream and the multipart upload is left incomplete. Consider adding an [S3 lifecycle rule](https://docs.aws.amazon.com/AmazonS3/latest/userguide/mpu-abort-incomplete-mpu-lifecycle-config.html) to the target bucket to abort incomplete multipart uploads automatically. |
| 81 | +- The `download_filename` should be a flat filename (e.g. `file.zip`). If it contains slashes (e.g. `path/to/file.zip`), the temporary file path in `/tmp` will include subdirectories that may not exist, causing a write failure. |
| 82 | + |
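
One way a caller could guard against the slash limitation is to reduce `download_filename` to its final path component before invoking the function. The helper below is a hypothetical sketch and not part of the pattern; note that it also changes the name sent in the event, so use it only if a flat name is acceptable for the resulting S3 object.

```python
import posixpath


def flat_filename(download_filename: str) -> str:
    """Return only the last path component of `download_filename`.

    Hypothetical helper: rejects names that have no usable final component
    (for example "dir/" or ".."), since those cannot name a file.
    """
    # Treat backslashes as separators too, so "a\\b.zip" becomes "b.zip".
    name = posixpath.basename(download_filename.replace("\\", "/"))
    if name in ("", ".", ".."):
        raise ValueError(f"not a usable filename: {download_filename!r}")
    return name
```

With this helper, `path/to/file.zip` becomes `file.zip`, which can be written directly under `/tmp` without creating subdirectories.
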
| 83 | +## Cleanup |
| 84 | + |
| 85 | +1. Delete the stack: |
| 86 | + ```bash |
| 87 | + aws cloudformation delete-stack --stack-name STACK_NAME |
| 88 | + ``` |
| 89 | +1. Confirm the stack has been deleted: |
| 90 | + ```bash |
| 91 | + aws cloudformation list-stacks --query "StackSummaries[?contains(StackName,'STACK_NAME')].StackStatus" |
| 92 | + ``` |
| 93 | +---- |
| 94 | +Copyright 2025 Amazon.com, Inc. or its affiliates. All Rights Reserved. |
| 95 | + |
| 96 | +SPDX-License-Identifier: MIT-0 |