fix: re-poll for fresh download URL before fetching async job records #973

Closed
devin-ai-integration[bot] wants to merge 2 commits into main from
devin/1775057411-fix-sas-token-expiry-async-download

Conversation

@devin-ai-integration
Contributor

Summary

Fixes an issue where download URLs with short TTLs (e.g. Azure Blob Storage SAS tokens with 10-minute expiry) can expire between poll-completion and the actual download in AsyncHttpJobRepository. This is triggered when many concurrent async streams cause a significant delay between update_jobs_status storing the polling response and fetch_records extracting the download URL from it.

The fix adds a _refresh_download_url call at the start of fetch_records() that re-polls the API to get a fresh response before extracting download targets.
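The control flow described above can be sketched as follows. This is a minimal illustration, not the CDK's code: `SketchJobRepository`, the `poll` and `download` callables, and the `download_url` response key are all hypothetical stand-ins; only the method names `fetch_records` and `_refresh_download_url` come from this PR description.

```python
from typing import Callable, Dict, Iterator, List

# Hypothetical callable types; the real CDK uses requester/retriever objects.
PollFn = Callable[[str], Dict[str, str]]   # job_id -> polling response
DownloadFn = Callable[[str], List[dict]]   # url -> downloaded records


class SketchJobRepository:
    """Stand-in for an async job repository (assumed shape, not the CDK class)."""

    def __init__(self, poll: PollFn, download: DownloadFn) -> None:
        self._poll = poll
        self._download = download
        self._polling_response: Dict[str, Dict[str, str]] = {}

    def update_job_status(self, job_id: str) -> None:
        # Store the polling response at the moment the job is seen as done.
        self._polling_response[job_id] = self._poll(job_id)

    def _refresh_download_url(self, job_id: str) -> None:
        # Re-poll so the stored response carries a newly signed URL.
        self._polling_response[job_id] = self._poll(job_id)

    def fetch_records(self, job_id: str) -> Iterator[dict]:
        # Refresh first, then extract the download target from the new response.
        self._refresh_download_url(job_id)
        url = self._polling_response[job_id]["download_url"]
        yield from self._download(url)
```

In this sketch the URL stored by `update_job_status` is never used for the download, however long the gap between the two calls.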

Reported in: https://github.com/airbytehq/oncall/issues/11749 (Bing Ads connector, 29 concurrent streams, SAS token 403 errors)

Review & Testing Checklist for Human

  • Unconditional re-poll tradeoff: The re-poll happens on every fetch_records call with no staleness check. This adds one extra HTTP request per job download for all async connectors, not just those with expiring URLs. Verify this overhead is acceptable, or consider gating behind a time-elapsed threshold or a flag.
  • Re-poll failure behavior: If _get_validated_polling_response raises during the refresh (transient network error, API 5xx), the download will now fail even though a cached response exists. Consider whether the refresh should be best-effort (try/except falling back to cached response).
  • Run the full async retriever test suite to confirm no regressions in other async connector patterns (e.g., Salesforce bulk API, SendGrid exports).
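The two alternatives raised in the checklist (gating the re-poll behind a time-elapsed threshold, and treating the refresh as best-effort) could look roughly like the sketch below. `GatedRefresher`, `max_age_seconds`, and the injected `clock` are hypothetical names introduced for illustration and are not part of this PR or the CDK. The broad `except Exception` is a simplification; a real implementation would likely catch only transient errors.

```python
import time
from typing import Callable, Dict, Tuple

PollFn = Callable[[str], Dict[str, str]]  # job_id -> polling response


class GatedRefresher:
    """Hypothetical helper: re-poll only when the cached response is old."""

    def __init__(
        self,
        poll: PollFn,
        max_age_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._poll = poll
        self._max_age = max_age_seconds
        self._clock = clock
        # job_id -> (time stored, polling response)
        self._cache: Dict[str, Tuple[float, Dict[str, str]]] = {}

    def store(self, job_id: str, response: Dict[str, str]) -> None:
        self._cache[job_id] = (self._clock(), response)

    def response_for_download(self, job_id: str) -> Dict[str, str]:
        stored_at, cached = self._cache[job_id]
        if self._clock() - stored_at < self._max_age:
            # Recent enough: skip the extra HTTP request.
            return cached
        try:
            fresh = self._poll(job_id)
        except Exception:
            # Best-effort refresh: fall back to the cached response.
            return cached
        self.store(job_id, fresh)
        return fresh
```

With a 10-minute SAS token, a threshold well below 600 seconds would avoid the extra request for fast pipelines while still refreshing for delayed ones. The fallback keeps the pre-fix behavior when the refresh itself fails.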

Notes

  • This is a CDK-level change that affects all connectors using AsyncHttpJobRepository. The Bing Ads connector (manifest-only, uses source-declarative-manifest base image) will pick up the fix automatically once a new CDK version is released and the base image is rebuilt.
  • The test mocks sequential polling responses (stale URL → fresh URL) and asserts only the fresh URL is used for download.
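The test strategy in the last note can be sketched with `unittest.mock`. This is not the PR's actual test: `fetch_with_refresh` is a hypothetical stand-in for the patched `fetch_records` flow, and the URLs and `download_url` key are made up for illustration.

```python
from typing import Callable, Dict, List
from unittest.mock import Mock


def fetch_with_refresh(
    poll: Callable[[str], Dict[str, str]],
    download: Callable[[str], List[dict]],
    job_id: str,
) -> List[dict]:
    """Hypothetical stand-in: re-poll, then download from the new URL."""
    response = poll(job_id)
    return download(response["download_url"])


def test_only_fresh_url_is_downloaded() -> None:
    stale = "https://example.invalid/report?sig=stale"
    fresh = "https://example.invalid/report?sig=fresh"
    # side_effect returns one response per call: stale first, then fresh.
    poll = Mock(side_effect=[{"download_url": stale}, {"download_url": fresh}])
    download = Mock(return_value=[{"id": 1}])

    # First poll: what the status update would have cached earlier.
    cached = poll("job-1")
    assert cached["download_url"] == stale

    # Second poll happens inside the fetch; only its URL may be downloaded.
    records = fetch_with_refresh(poll, download, "job-1")

    assert records == [{"id": 1}]
    download.assert_called_once_with(fresh)
    assert poll.call_count == 2
```

`Mock(side_effect=[...])` hands back the listed responses in order, which is what makes the stale-then-fresh sequence easy to simulate.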

Link to Devin session: https://app.devin.ai/sessions/e3c1004bcc834f40a854c0c489a70b98

Download URLs (e.g. Azure Blob Storage SAS tokens) may expire between
poll-completion and actual download when many concurrent streams delay
record fetching. This adds a re-poll step in fetch_records() to ensure
the download URL is still valid before attempting the download.

Co-Authored-By: bot_apk <apk@cognition.ai>
@devin-ai-integration
Contributor Author

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

@github-actions

github-actions bot commented Apr 1, 2026

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

Testing This CDK Version

You can test this version of the CDK using the following:

# Run the CLI from this branch:
uvx 'git+https://github.com/airbytehq/airbyte-python-cdk.git@devin/1775057411-fix-sas-token-expiry-async-download#egg=airbyte-python-cdk[dev]' --help

# Update a connector to use the CDK from this branch ref:
cd airbyte-integrations/connectors/source-example
poe use-cdk-branch devin/1775057411-fix-sas-token-expiry-async-download

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

  • /autofix - Fixes most formatting and linting issues
  • /poetry-lock - Updates poetry.lock file
  • /test - Runs connector tests with the updated CDK
  • /prerelease - Triggers a prerelease publish with default arguments
  • /poe build - Regenerate git-committed build artifacts, such as the pydantic models which are generated from the manifest JSON schema in YAML.
  • /poe <command> - Runs any poe command in the CDK environment

Co-Authored-By: bot_apk <apk@cognition.ai>
@github-actions

github-actions bot commented Apr 1, 2026

PyTest Results (Fast)

3 976 tests  +1   3 965 ✅ +1   7m 40s ⏱️ +8s
    1 suites ±0      11 💤 ±0 
    1 files   ±0       0 ❌ ±0 

Results for commit c71c8ee. ± Comparison against base commit 69cd63d.

@github-actions

github-actions bot commented Apr 1, 2026

PyTest Results (Full)

3 979 tests  +1   3 967 ✅ +1   11m 15s ⏱️ -5s
    1 suites ±0      12 💤 ±0 
    1 files   ±0       0 ❌ ±0 

Results for commit c71c8ee. ± Comparison against base commit 69cd63d.

@pnilan
Contributor

I don't want a CDK fix. Closing
