Skip to content

fix: suppress misleading backoff library log in user_defined_backoff_handler#981

Merged
sophiecuiy merged 3 commits intomainfrom
devin/1775496581-fix-misleading-backoff-log-message
Apr 6, 2026
Merged

fix: suppress misleading backoff library log in user_defined_backoff_handler#981
sophiecuiy merged 3 commits intomainfrom
devin/1775496581-fix-misleading-backoff-log-message

Conversation

@devin-ai-integration
Copy link
Copy Markdown
Contributor

@devin-ai-integration devin-ai-integration bot commented Apr 6, 2026

Summary

The backoff library's default logger produces a misleading message when user_defined_backoff_handler retries:

Backing off _send(...) for 0.0s (UserDefinedBackoffException: HTTP Status Code: 429...)

The 0.0s is the backoff library's interval parameter (set to 0), not the actual sleep time. The real sleep happens in the on_backoff callback (sleep_on_ratelimit) and is typically 60+ seconds. This generates confusion and incorrect bug reports from users.

Fix: Pass logger=None to backoff.on_exception() to suppress the library's default log handler, and improve the custom sleep_on_ratelimit callback to emit a single clear message:

UserDefinedBackoffException: Rate limit exceeded (HTTP 429). Retrying in 61 seconds.

No changes to retry logic or sleep behavior — only logging is affected.

Resolves https://github.com/airbytehq/airbyte-internal-issues/issues/16135
Related to https://github.com/airbytehq/oncall/issues/11837

Updates since last revision

  • Prepended UserDefinedBackoffException: to both log messages in sleep_on_ratelimit per reviewer feedback, so the error source is immediately visible in logs.
  • Fixed if exc.response:exc.response is not None: requests.Response.__bool__ returns False for status >= 400, so the old check (pre-existing bug) would skip the HTTP status code branch on 429 responses.
  • Log message now reports the actual sleep duration (retry_after + 1) instead of just retry_after, so the logged time matches the real sleep.
  • Added kwargs.pop("logger", None) before passing logger=None to backoff.on_exception() to prevent TypeError if a caller ever passes logger in kwargs.

Review & Testing Checklist for Human

  • exc.response is not None behavior change: The old code used if exc.response: which was falsey for HTTP error responses (including 429). The fix means the status code will now appear in logs where it previously did not. Confirm this is the desired behavior.
  • Logged sleep time includes +1 buffer: Users will now see e.g. "Retrying in 61 seconds" (for a 60s backoff) instead of the old "Sleeping for 60 seconds". Confirm this is acceptable — it's more accurate but may look slightly odd.
  • Response content no longer logged during retries: The old code logged raw Response Content: {exc.response.content!r} on every retry attempt. The new code omits this in favor of a cleaner user-facing message. Verify this trade-off is acceptable.
  • Test plan: Trigger a 429 from any connector using a custom backoff strategy (e.g., source-amazon-seller-partner) and confirm: (1) no more "Backing off _send(...) for 0.0s" messages appear, (2) the new "UserDefinedBackoffException: Rate limit exceeded (HTTP 429). Retrying in {N} seconds." message appears instead, (3) retry/sleep behavior is unchanged.

Notes

  • No existing unit tests cover the logging output of user_defined_backoff_handler. The change is logging-only and isolated to one function.
  • The other backoff handlers (default_backoff_handler, http_client_default_backoff_handler, rate_limit_default_backoff_handler) are not affected — they use backoff.expo with real intervals so their default log messages are accurate.

Link to Devin session: https://app.devin.ai/sessions/2ced27cd22bd409f845816cd10a0c5f4

…handler

The backoff library's default logger reported 'Backing off _send(...) for 0.0s'
because user_defined_backoff_handler uses interval=0 (actual sleep happens in
the on_backoff callback). This caused confusion and incorrect bug reports.

Changes:
- Pass logger=None to backoff.on_exception() to suppress the misleading default
  log message that reports the interval (0s) instead of the actual sleep time
- Improve sleep_on_ratelimit log to show HTTP status code and actual retry time:
  'Rate limit exceeded (HTTP 429). Retrying in {N} seconds.'
- Format elapsed time in log_give_up for consistency

Co-Authored-By: bot_apk <apk@cognition.ai>
@devin-ai-integration
Copy link
Copy Markdown
Contributor Author

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 6, 2026

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

💡 Show Tips and Tricks

Testing This CDK Version

You can test this version of the CDK using the following:

# Run the CLI from this branch:
uvx 'git+https://github.com/airbytehq/airbyte-python-cdk.git@devin/1775496581-fix-misleading-backoff-log-message#egg=airbyte-python-cdk[dev]' --help

# Update a connector to use the CDK from this branch ref:
cd airbyte-integrations/connectors/source-example
poe use-cdk-branch devin/1775496581-fix-misleading-backoff-log-message

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

  • /autofix - Fixes most formatting and linting issues
  • /poetry-lock - Updates poetry.lock file
  • /test - Runs connector tests with the updated CDK
  • /prerelease - Triggers a prerelease publish with default arguments
  • /poe build - Regenerate git-committed build artifacts, such as the pydantic models which are generated from the manifest JSON schema in YAML.
  • /poe <command> - Runs any poe command in the CDK environment
📚 Show Repo Guidance

Helpful Resources

📝 Edit this welcome message.

@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 6, 2026

PyTest Results (Fast)

4 012 tests  ±0   4 001 ✅ ±0   7m 41s ⏱️ +2s
    1 suites ±0      11 💤 ±0 
    1 files   ±0       0 ❌ ±0 

Results for commit 30132a2. ± Comparison against base commit 5b14f41.

♻️ This comment has been updated with latest results.

@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 6, 2026

PyTest Results (Full)

4 015 tests  ±0   4 003 ✅ ±0   11m 4s ⏱️ -8s
    1 suites ±0      12 💤 ±0 
    1 files   ±0       0 ❌ ±0 

Results for commit 30132a2. ± Comparison against base commit 5b14f41.

♻️ This comment has been updated with latest results.

@sophiecuiy
Copy link
Copy Markdown
Contributor

enhance logs to inform user that the error comes from UserDefinedBackoffException.
append UserDefinedBackoffException: to the front of the error log

… feedback

Co-Authored-By: bot_apk <apk@cognition.ai>
@devin-ai-integration
Copy link
Copy Markdown
Contributor Author

sophiecuiy Done — prepended UserDefinedBackoffException: to both log messages in sleep_on_ratelimit. The logs now read:

UserDefinedBackoffException: Rate limit exceeded (HTTP 429). Retrying in 60 seconds.

Pushed in 789ce9d.

@sophiecuiy sophiecuiy marked this pull request as ready for review April 6, 2026 21:02
Copilot AI review requested due to automatic review settings April 6, 2026 21:02
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adjusts retry logging for the user_defined_backoff_handler to avoid misleading backoff library messages (reporting 0.0s interval) and to provide a clearer, single log line describing the actual retry delay.

Changes:

  • Updates sleep_on_ratelimit to log a clearer “rate limit exceeded” message including the retry delay.
  • Tweaks the give-up log message formatting for elapsed time.
  • Passes logger=None to backoff.on_exception() to suppress the library’s default backoff logging.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

…og, kwargs conflict

Co-Authored-By: bot_apk <apk@cognition.ai>
@sophiecuiy
Copy link
Copy Markdown
Contributor

/ai-prove-fix

@devin-ai-integration
Copy link
Copy Markdown
Contributor Author

Proof of Fix

Ran a proof script that exercises user_defined_backoff_handler with a real requests.Response (status 429) and patched time.sleep:

1. requests.Response falsiness confirmed

bool(response) = False   <-- This is why `if exc.response:` was wrong for 429
response is not None = True   <-- Correct check (used in this PR)

2. Log output (new behavior)

UserDefinedBackoffException: Rate limit exceeded (HTTP 429). Retrying in 61 seconds.
UserDefinedBackoffException: Rate limit exceeded (HTTP 429). Retrying in 61 seconds.
  • No "Backing off _send(...) for 0.0s" message (logger=None suppresses it)
  • HTTP status code 429 appears (thanks to is not None fix)
  • Logged time 61s matches actual time.sleep(61) call

3. Sleep calls

[1] sleep(61) — custom handler (retry_after=60 + 1)
[2] sleep(0)  — backoff library (no-op, interval=0)
[3] sleep(61) — custom handler
[4] sleep(0)  — backoff library (no-op)

Function succeeded after 3 attempts (2 retries + 1 success).

Before vs After

Old New
Message Backing off _send(...) for 0.0s (UserDefinedBackoffException: HTTP Status Code: 429...) UserDefinedBackoffException: Rate limit exceeded (HTTP 429). Retrying in 61 seconds.
Reported time 0.0s (misleading) 61s (accurate)
HTTP status Buried in exception string Clear (HTTP 429)
429 response logged? No — if exc.response: was False for 429 Yes — exc.response is not None is True
Full proof script
import logging, sys, time
from unittest.mock import MagicMock, patch
from requests import Response
from airbyte_cdk.sources.streams.http.exceptions import UserDefinedBackoffException
from airbyte_cdk.sources.streams.http.rate_limiting import user_defined_backoff_handler

real_response = Response()
real_response.status_code = 429
real_response._content = b"Too many requests"

call_count = 0
def mock_send(request, kwargs):
    global call_count
    call_count += 1
    if call_count <= 2:
        raise UserDefinedBackoffException(backoff=60, request=MagicMock(), response=real_response)
    return MagicMock()

decorated = user_defined_backoff_handler(max_tries=3)(mock_send)
with patch("airbyte_cdk.sources.streams.http.rate_limiting.time.sleep") as mock_sleep:
    result = decorated(MagicMock(), {})
    for i, call in enumerate(mock_sleep.call_args_list):
        print(f"sleep({call.args[0]})")

@sophiecuiy sophiecuiy merged commit 4aaafcf into main Apr 6, 2026
27 of 28 checks passed
@sophiecuiy sophiecuiy deleted the devin/1775496581-fix-misleading-backoff-log-message branch April 6, 2026 22:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants