
compress S3 uploads with GZIP before COPY to Redshift #253

Merged
tobiascadee merged 1 commit into main from gzip-s3-upload on Apr 2, 2026

@tobiascadee
Contributor

Summary

  • Changes the S3 file extension from .csv to .csv.gz; smart_open auto-compresses on write when the URI ends in .gz
  • Adds GZIP keyword to the Redshift COPY statement so Redshift decompresses on load

Reduces S3 upload size and transfer time, which is typically the bottleneck for large batches.
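
As a rough illustration of the write side described in the summary, the sketch below gzips CSV rows into an in-memory buffer with the standard library. It is a stand-in only: the target writes through smart_open to S3, whereas here `gzip.open` plays that role, and `s3_key_for_batch` and `write_gzipped_csv` are hypothetical names that do not come from this PR.

```python
import csv
import gzip
import io


def s3_key_for_batch(prefix: str, batch_id: str) -> str:
    """Hypothetical helper: build an object key with the .csv.gz extension."""
    return f"{prefix.rstrip('/')}/{batch_id}.csv.gz"


def write_gzipped_csv(rows, fileobj) -> None:
    """Write rows as CSV through a gzip stream into a binary file object.

    Assumption: stdlib gzip stands in for the compression that smart_open
    applies when the target URI ends in .gz.
    """
    with gzip.open(fileobj, "wt", encoding="utf-8", newline="") as text:
        csv.writer(text).writerows(rows)


if __name__ == "__main__":
    rows = [["id", "name"]] + [[str(i), "example"] for i in range(1000)]
    buf = io.BytesIO()
    write_gzipped_csv(rows, buf)
    compressed = buf.getvalue()
    raw = gzip.decompress(compressed)
    print(s3_key_for_batch("s3://my-bucket/stage", "batch-0001"))
    print(f"raw={len(raw)} bytes, gzipped={len(compressed)} bytes")
```

The size difference depends heavily on the data; repetitive CSV like the sample rows compresses far better than high-entropy values.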

Note: Users with custom copy_options in their config should ensure they don't already include GZIP to avoid a duplicate keyword error.
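
The note above implies that GZIP is added regardless of what copy_options already contains. One way to guard against the duplicate keyword, sketched here purely as an assumption and not as what this PR implements, is to append GZIP only when no existing option mentions it. `with_gzip` and `build_copy_sql` are hypothetical names, and the COPY string is simplified for illustration.

```python
import re


def with_gzip(copy_options):
    """Return a copy of the options with GZIP appended unless already present."""
    options = list(copy_options)
    # Match GZIP as a whole word, case-insensitively, in any option string.
    has_gzip = any(re.search(r"\bGZIP\b", opt, re.IGNORECASE) for opt in options)
    if not has_gzip:
        options.append("GZIP")
    return options


def build_copy_sql(table, s3_uri, iam_role, copy_options):
    """Hypothetical, simplified COPY builder; real code should quote identifiers."""
    opts = " ".join(with_gzip(copy_options))
    return f"COPY {table} FROM '{s3_uri}' IAM_ROLE '{iam_role}' {opts}"


if __name__ == "__main__":
    print(build_copy_sql(
        "analytics.events",
        "s3://my-bucket/stage/batch-0001.csv.gz",
        "arn:aws:iam::123456789012:role/example",
        ["CSV", "gzip"],
    ))
```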

Test plan

  • Run target against a real Redshift cluster and verify records load correctly
  • Check in S3 that the uploaded files end in .csv.gz and are valid gzip
  • Verify existing custom copy_options configs still work
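
For the "valid gzip" item in the test plan, a check along these lines could be run on the bytes of a downloaded object. This is a minimal sketch using only the standard library; `is_valid_gzip` is a hypothetical helper, and fetching the object from S3 is left out.

```python
import gzip
import zlib


def is_valid_gzip(data: bytes) -> bool:
    """Return True if data has the gzip magic bytes and decompresses cleanly."""
    # Every gzip member starts with the two magic bytes 0x1f 0x8b.
    if data[:2] != b"\x1f\x8b":
        return False
    try:
        gzip.decompress(data)
    except (OSError, EOFError, zlib.error):
        # Covers bad headers, truncated streams, and corrupt deflate data.
        return False
    return True


if __name__ == "__main__":
    sample = gzip.compress(b"id,name\r\n1,a\r\n")
    print(is_valid_gzip(sample))
    print(is_valid_gzip(b"id,name\r\n1,a\r\n"))
```

Decompressing the whole payload in memory is fine for a spot check; for very large files a streaming read would be the safer choice.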

🤖 Generated with Claude Code

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@tobiascadee tobiascadee requested a review from a team as a code owner April 2, 2026 11:32
@tobiascadee tobiascadee enabled auto-merge (squash) April 2, 2026 12:16
@tobiascadee tobiascadee merged commit 53b84a0 into main Apr 2, 2026
9 checks passed