fix(source-s3): Consume CDK concurrent source fix #78291
devin-ai-integration[bot] wants to merge 1 commit into
@@ -356,6 +356,7 @@ This connector utilizes the open source [Unstructured](https://unstructured-io.g
[markdownlint-fix] reported by reviewdog 🐶
Deploy preview for airbyte-docs ready!
Deployed with vercel-action
↪️ Triggering Reason: This linked Green P1 issue has a draft fix PR with CI passing, while a newer human-authored S3 PR is handling expanded CDK/state-throttle follow-up. AI review should verify whether this original fix PR should continue, be superseded, or be closed in favor of the newer path. https://github.com/airbytehq/oncall/issues/12663
Reviewing PR for connector safety and quality.
AI PR Review is starting for this PR. Session: https://app.devin.ai/sessions/918de40026fd41c490b65156fe64e1e3
AI PR Review Report
Review Action: REQUEST CHANGES

🔧 Remediation Required
Required Actions

📋 PR Details
Connector & PR Info
Connector(s):

Risk Level
Level: 4 - High
Risk Level is reported for downstream consumers (e.g. auto-merge policy, reviewer routing). It does not change the review action: APPROVE here means "no blocking objection," not "safe to merge unattended."

Review Action Details
REQUEST CHANGES - The Live / E2E Tests enforced gate failed because

🔍 Gate Evaluation Details
Gate-by-Gate Analysis
Detailed trigger evidence:

📚 Evidence Consulted
Evidence

❓ How to Respond
Providing Context or Justification
You can add explanations that the bot will see on the next review:

Option 1: PR Description (recommended)

    ## AI PR Review Justification
    ### {Gate Name}
    [Your explanation here]

Option 2: PR Comment

After adding your response, re-run

Note: Justifications provide context for the bot to evaluate. For some gates (like the Live / E2E Tests gate), a sufficient justification can lead to PASS. For other gates, justifications help explain the situation but may still require escalation if the gate cannot be remediated.
What

Resolves https://github.com/airbytehq/oncall/issues/12663:

A `source-s3` large full refresh can start a stream but fail to emit a terminal stream status if the concurrent file-based read path starves partition readers. This bumps `source-s3` from `4.15.4` to `4.15.5` and consumes `airbyte-cdk` `7.19.2`, which includes the concurrent partition-generator cap for the file-based read path.

Requested by API User via `/ai-fix`.

How
- Update the `source-s3` lockfile so `airbyte-cdk` resolves to `7.19.2`.
- Bump `pyproject.toml` to `4.15.5` and add the docs changelog entry.

Review guide

- `airbyte-integrations/connectors/source-s3/poetry.lock`
- `airbyte-integrations/connectors/source-s3/unit_tests/v4/test_source.py`
- `airbyte-integrations/connectors/source-s3/metadata.yaml`
- `docs/integrations/sources/s3.md`

User Impact
Large `source-s3` full refresh syncs should be less likely to hang after emitting `STARTED` without a terminal stream status. No schema, spec, stream, or state format changes are included.

Declarative-First Evaluation

`source-s3` is a Python CDK file-based connector (`language:python`, `cdk:python-file-based`), not a manifest-only or low-code declarative connector. The fix is a runtime dependency update, so declarative manifest alternatives do not apply.

Breaking Change Evaluation
This is not a breaking change: it does not change schemas, primary keys, cursors, connector spec fields, streams, or state format. Standard patch versioning is used.
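
For reviewers who want a mental model of the partition-generator cap mentioned under What, the following is a minimal sketch of the general idea, not the CDK's implementation. It assumes a shared worker pool in which generator tasks and reader tasks compete for threads. The names `read_streams`, `num_workers`, and `max_generators` are hypothetical and do not correspond to any `airbyte-cdk` signature.

```python
import queue
from concurrent.futures import ThreadPoolExecutor


def read_streams(streams, num_workers=4, max_generators=2):
    """Read every partition of every stream through one shared pool.

    `streams` maps a stream name to a list of partitions; each partition is
    a list of records. At most `max_generators` generator tasks run at once,
    so at least one worker always stays available for reader tasks.
    Returns (records, peak_running_generators).
    """
    if not 1 <= max_generators < num_workers:
        raise ValueError("cap must leave at least one worker for readers")

    events = queue.Queue()
    pending = list(streams.items())
    records = []
    running_generators = 0
    outstanding = 0  # submitted tasks that have not reported completion
    peak = 0

    def generate(name, partitions):
        try:
            for partition in partitions:
                events.put(("partition", name, partition))
        finally:
            events.put(("generator_done", name, None))

    def read(name, partition):
        try:
            for record in partition:
                events.put(("record", name, record))
        finally:
            events.put(("reader_done", name, None))

    with ThreadPoolExecutor(max_workers=num_workers) as pool:

        def start_generators():
            nonlocal running_generators, outstanding, peak
            # Only start new generators while we are under the cap.
            while pending and running_generators < max_generators:
                name, partitions = pending.pop(0)
                pool.submit(generate, name, partitions)
                running_generators += 1
                outstanding += 1
                peak = max(peak, running_generators)

        start_generators()
        while outstanding:
            kind, name, payload = events.get()
            if kind == "partition":
                pool.submit(read, name, payload)
                outstanding += 1
            elif kind == "record":
                records.append((name, payload))
            else:
                outstanding -= 1
                if kind == "generator_done":
                    running_generators -= 1
                    start_generators()

    return records, peak
```

The sketch only shows the capping mechanic. It uses an unbounded queue, so it does not reproduce the starvation itself, which in a real system depends on how the queue and pool are throttled.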
Test Coverage

- `test_airbyte_cdk_limits_concurrent_partition_generators`, which fails without the CDK runtime support for `max_concurrent_partition_generators` and passes with `airbyte-cdk` `7.19.2`.

Test plan

- `cd /home/ubuntu/repos/airbyte/airbyte-integrations/connectors/source-s3 && poetry run pytest unit_tests/ -x`
- `cd /home/ubuntu/repos/airbyte/airbyte-integrations/connectors/source-s3 && poetry run poe check-ruff-lint`
- `cd /home/ubuntu/repos/airbyte/airbyte-integrations/connectors/source-s3 && poetry run mypy unit_tests/v4/test_source.py --follow-imports=skip`
- `cd /home/ubuntu/repos/airbyte-python-cdk && poetry run pytest unit_tests/sources/streams/concurrent/test_concurrent_read_processor.py -q -k 'max_concurrent_partition_generators or concurrent_limit'`
- `cd /home/ubuntu/repos/airbyte-python-cdk && poetry run ruff check airbyte_cdk/sources/concurrent_source/concurrent_source.py airbyte_cdk/sources/concurrent_source/concurrent_read_processor.py unit_tests/sources/streams/concurrent/test_concurrent_read_processor.py`
- `poetry run poe check-mypy` was also attempted for the full connector, but it fails on existing source-s3 typing/stub issues outside this change (for example missing `airbyte`/`jsonschema` stubs and legacy type errors in source files).

Can this PR be safely reverted and rolled back?

Devin session
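
The symptom described under User Impact (a stream emits `STARTED` but never a terminal status) can be checked in captured sync output. The sketch below is one hedged way to do that. It assumes each message is a dict shaped like an Airbyte protocol trace message, with the status under `trace.stream_status`, and it treats `COMPLETE` and `INCOMPLETE` as the terminal statuses. The helper name `streams_missing_terminal_status` is hypothetical, and the field names should be verified against the protocol version in use.

```python
from typing import Any, Iterable, Mapping, Set

# Assumption: these are the terminal stream statuses.
TERMINAL_STATUSES = {"COMPLETE", "INCOMPLETE"}


def streams_missing_terminal_status(
    messages: Iterable[Mapping[str, Any]],
) -> Set[str]:
    """Return names of streams that emitted STARTED but no terminal status."""
    started: Set[str] = set()
    finished: Set[str] = set()
    for message in messages:
        if message.get("type") != "TRACE":
            continue
        trace = message.get("trace", {})
        if trace.get("type") != "STREAM_STATUS":
            continue
        stream_status = trace.get("stream_status", {})
        name = stream_status.get("stream_descriptor", {}).get("name")
        status = stream_status.get("status")
        if name is None:
            continue
        if status == "STARTED":
            started.add(name)
        elif status in TERMINAL_STATUSES:
            finished.add(name)
    return started - finished
```

An empty result for a large full refresh would be consistent with the fix working, though it does not prove the starvation path was exercised.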