Summary
The initial snapshot of certain tables fails during the ConsolidateQRepPartitions step with a ZSTD decompression error when AVRO-staged data is loaded into Snowflake
via COPY INTO. PeerDB retries the failing COPY INTO indefinitely against the same corrupt stage files and never re-uploads fresh data on retry.
Environment
- PeerDB version: stable-v0.36.12
- Source: RDS PostgreSQL 16.8
- Destination: Snowflake
- Deployment: Docker Compose (self-hosted on EC2)
Steps to Reproduce
- Create a CDC mirror from PostgreSQL to Snowflake
- Include a table with ~2M+ rows and a JSONB column
- Initial snapshot partitions are read from Postgres successfully (131K rows/batch, multiple parallel partitions)
- All partitions are staged as AVRO files in @PEERDB_INTERNAL.peerdb_stage_clone...
- ConsolidateQRepPartitions runs COPY INTO with FILE_FORMAT=(TYPE=AVRO), PURGE=TRUE
Error
failed to copy stage to destination: failed to handle append mode: failed to run COPY INTO command:
100079 (22000): Invalid data encountered during decompression for file: 'not_file',
compression type used: 'ZSTD', cause: 'Data corruption detected'
Key Observations
- Other tables in the same mirror work fine: one with 9M+ rows completed successfully with an identical configuration
- The error occurs in the consolidate step (ConsolidateQRepPartitions), not the partition replication step: data is read from Postgres and staged to Snowflake successfully, but the COPY INTO from the stage to the destination table fails
- Retries always fail because PeerDB retries the COPY INTO against the same corrupt stage files rather than re-uploading fresh AVRO data
- Clearing the stage manually (REMOVE @PEERDB_INTERNAL.peerdb_stage...) does not help: the next retry still fails, which suggests the files are already corrupt when they are written
- Warehouse size is not the cause — tested with X-SMALL and MEDIUM, same result
- Fresh database/schema doesn't help — reproduced after dropping and recreating the entire Snowflake database
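To help narrow down whether the staged files are structurally damaged before Snowflake reads them, here is a minimal Python sketch (stdlib only) that walks an Avro object container file, reports the codec named in its header, and checks each block's length and sync marker. It assumes a staged file has been downloaded locally (for example with Snowflake's GET command) and that the blocks use Avro's "zstandard" codec with standard zstd frames. The function name check_avro_container and the file name in the usage note are hypothetical. It does not decompress anything, so a clean result does not rule out corruption inside a zstd frame.

```python
import io

AVRO_MAGIC = b"Obj\x01"                 # Avro object container file magic
ZSTD_FRAME_MAGIC = b"\x28\xb5\x2f\xfd"  # zstd frame magic 0xFD2FB528, little-endian


def read_long(stream):
    """Decode one Avro long (zigzag-encoded varint) from the stream."""
    shift = 0
    result = 0
    while True:
        byte = stream.read(1)
        if not byte:
            raise EOFError("unexpected end of file inside a varint")
        result |= (byte[0] & 0x7F) << shift
        if not byte[0] & 0x80:
            break
        shift += 7
    return (result >> 1) ^ -(result & 1)


def read_bytes(stream):
    """Read an Avro bytes/string value: a long length followed by that many bytes."""
    length = read_long(stream)
    if length < 0:
        raise ValueError("negative length")
    data = stream.read(length)
    if len(data) != length:
        raise EOFError("unexpected end of file inside a value")
    return data


def read_header(stream):
    """Return (metadata dict, 16-byte sync marker) from the file header."""
    if stream.read(4) != AVRO_MAGIC:
        raise ValueError("not an Avro object container file")
    meta = {}
    while True:
        count = read_long(stream)
        if count == 0:
            break
        if count < 0:            # negative count is followed by a byte size
            read_long(stream)
            count = -count
        for _ in range(count):
            key = read_bytes(stream).decode("utf-8")
            meta[key] = read_bytes(stream)
    sync = stream.read(16)
    if len(sync) != 16:
        raise EOFError("file ends before the sync marker")
    return meta, sync


def check_avro_container(stream):
    """Walk all data blocks; return (codec, intact block count, problems)."""
    meta, sync = read_header(stream)
    codec = meta.get("avro.codec", b"null").decode("ascii")
    problems = []
    blocks = 0
    while True:
        if not stream.read(1):   # clean end of file
            break
        stream.seek(-1, io.SEEK_CUR)
        try:
            read_long(stream)                # record count, unused here
            payload = read_bytes(stream)     # compressed block data
        except (EOFError, ValueError):
            problems.append(f"block {blocks}: truncated or malformed")
            break
        if stream.read(16) != sync:
            problems.append(f"block {blocks}: sync marker mismatch")
            break
        if codec == "zstandard" and payload[:4] != ZSTD_FRAME_MAGIC:
            problems.append(f"block {blocks}: missing zstd frame magic")
        blocks += 1
    return codec, blocks, problems
```

Usage, with a hypothetical local file name: open the downloaded file in binary mode, e.g. `with open("staged_partition.avro", "rb") as f: print(check_avro_container(f))`. A non-empty problems list on a freshly staged file would support the theory that the files are corrupt at write time rather than on the Snowflake side.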