fix(source-github): fix silent error swallowing in exception handlers#76124
Conversation
🤖 Devin AI EngineerI'll be helping with this pull request! Here's what you should know: ✅ I will automatically:
Note: I can only respond to comments from users who have write access to this repository. ⚙️ Control Options:
|
👋 Greetings, Airbyte Team Member!Here are some helpful tips and reminders for your convenience. 💡 Show Tips and TricksPR Slash CommandsAirbyte Maintainers (that's you!) can execute the following slash commands on your PR:
📚 Show Repo GuidanceHelpful Resources
|
|
Deploy preview for airbyte-docs ready! ✅ Preview Built with commit 85d707c. |
|
Patrick Nilan (pnilan)
left a comment
There was a problem hiding this comment.
API Source PR Review: fix(source-github): fix silent error swallowing in exception handlers
Connector: source-github (Custom Python)
Linked Issue: airbytehq/oncall#11907
Verdict: APPROVED WITH CHANGES
Total Issues: 4 (P0: 1, P1: 1, P2: 1, P3: 1)
Issue Analysis
The linked issue accurately identifies three distinct silent error-swallowing bugs in streams.py. All three root cause analyses are correct:
- ContributorActivity — Indentation-level mismatch causes
else: raise eto pair with the outerifinstead of the innerif status_code == ACCEPTED. ✅ Correctly diagnosed. - GithubStreamABC —
andinstead oforprevents proper short-circuit evaluation. Withand, ifelacks_exception, accessinge._exceptionin the second operand raisesAttributeError. ✅ Correctly diagnosed. - Releases — Bare
except Exception:catches everything including unexpected errors. ✅ Correctly diagnosed, low severity.
The PR's approach (targeted fixes at each site) is the right strategy. All three code fixes are logically correct.
Findings by Priority
P0 - Critical (Must fix before merge)
1. PR has merge conflicts — needs rebase
- Files:
metadata.yaml,pyproject.toml,streams.py,docs/integrations/sources/github.md - Category: Completeness
- Problem: The PR branch is based on a pre-#76080 master. GitHub reports
mergeStateStatus: DIRTY,mergeable: CONFLICTING. Specific conflicts:streams.pyline ~1814-1815: PR branch still hasexcept MessageRepresentationAirbyteTracedErrors as e:while master already hasexcept AirbyteTracedException as e:(fixed by PR #76080).MessageRepresentationAirbyteTracedErrorsno longer exists in the CDK, so theexceptclause on the PR branch would never match at runtime.- Version: Master is already at
2.1.17(from PR #76080). The PR bumps2.1.16 → 2.1.17, which conflicts. After rebase, the bump should be2.1.17 → 2.1.18. - Changelog: Master already has a
2.1.17entry for PR #76080. The PR's2.1.17changelog entry will conflict.
- Fix: Rebase onto current master. After rebase:
- The
MessageRepresentationAirbyteTracedErrorsreference will resolve toAirbyteTracedException - Bump version to
2.1.18inmetadata.yamlandpyproject.toml - Update changelog entry to
2.1.18
- The
P1 - High Priority
2. Test for Bug 2 (and → or) may not directly exercise the guard clause
- File:
unit_tests/test_stream.py—test_github_stream_abc_read_records_reraises_when_no_exception_attr - Category: Test Quality
- Problem: This test mocks a 500 response and expects
AirbyteTracedExceptionto be raised. However, when the CDK wraps HTTP errors intoAirbyteTracedException, it typically sets_exceptionon the exception object. If_exceptionIS set with a valid.response, the guard clause (if not hasattr(e, "_exception") or ...) evaluates toFalseand doesn't fire — the test then relies on the existingelse: raise eat line ~188 (SERVER_ERROR for non-WorkflowRuns streams) to propagate, which is pre-existing behavior. - Evidence: The test docstring says _"should re-raise when AirbyteTracedException has no exception attribute" — but a 500 mock response will likely produce an exception WITH
_exceptionset. - Suggestion: To directly test the
orfix, construct anAirbyteTracedExceptionwithout_exceptionand verify it propagates. For example, mock the stream's internal send method to raise a bareAirbyteTracedException()without the_exceptionattribute.
P2 - Medium Priority
3. Second Bug 2 test exercises pre-existing behavior, not the fix
- File:
unit_tests/test_stream.py—test_github_stream_abc_read_records_reraises_for_unhandled_status - Category: Test Quality
- Problem: This test uses status 422. With 422, the exception will have
_exceptionand_exception.responseset, so the guard clause passes through (evaluates toFalse). The 422 falls through all theif/elifchecks and hits the finalelse: raise eat line ~188. This path already worked on master before theand→orfix. Valid as a regression test, but doesn't exercise the changed code. - Suggestion: Consider adding a test where
_exceptionexists but_exception.responseisNoneor missing — this would exercise the second part of theorcondition.
P3 - Low Priority
4. Redundant exception type in Bug 3 except tuple
- File:
streams.py:828 - Category: Code Organization
- Problem:
binascii.Erroris a subclass ofValueError, soexcept (ValueError, struct.error, binascii.Error)is technically redundant. However, listing both explicitly documents the intended exception sources. - Suggestion: No change needed — the explicit listing improves readability.
Positive Observations
- All three fixes are logically correct — The
else: raise eindentation in Bug 1, theorshort-circuit in Bug 2, and the narrowed exception tuple in Bug 3 are all sound fixes. - Good test coverage for Bug 1 and Bug 3 —
test_contributor_activity_reraises_non_accepted_statusexercises the ContributorActivity path end-to-end, and the Releases tests cover both the "now propagates" and "still catches expected errors" directions. - Clean, minimal diff — Tightly scoped to the three reported issues with appropriate version bump, changelog, and docs.
- Excellent PR description — The review guide with indentation analysis and recommended checklist is very thorough.
Review Checklist
- PR resolves all three instances from the linked issue
- All three code fixes are logically correct
- PR needs rebase — merge conflicts due to stale base (P0)
- Follows CDK patterns for custom Python connector
- Unit tests cover Bug 1 and Bug 3 well
- Bug 2 test doesn't directly exercise the
and→orfix (P1) - Error cases tested (401 for ContributorActivity, TypeError for Releases)
- No schema changes
- No breaking changes (behavioral change is intentional — errors now surface instead of being silently swallowed)
- Version bump appropriate (patch for bug fix)
- Version needs update after rebase (2.1.17 → 2.1.18)
Fix three bugs where exceptions were silently swallowed: 1. ContributorActivity.read_records(): else/raise was paired with outer if instead of inner if, causing non-ACCEPTED status codes to be silently swallowed. 2. GithubStreamABC.read_records(): guard used 'and' instead of 'or', causing AttributeError when _exception is missing and failing to re-raise when response is missing. 3. Releases._extract_database_id_from_node_id(): bare except Exception replaced with specific (ValueError, struct.error, binascii.Error). Co-Authored-By: bot_apk <apk@cognition.ai>
Co-Authored-By: bot_apk <apk@cognition.ai>
Co-Authored-By: bot_apk <apk@cognition.ai>
7d956aa to
85d707c
Compare
|
Thanks for the thorough review! I've addressed all findings: P0 — Rebase + version conflicts: Rebased onto current master (which includes PR #76080). Version is now P1 — Test for Bug 2 doesn't exercise the guard clause: Rewrote P2 — Second Bug 2 test exercises pre-existing behavior: Rewrote P3 — Redundant All 64 tests pass locally (excluding the pre-existing Link to Devin session: https://app.devin.ai/sessions/e7b3b318b64e4770a309c7aaadce8de9 |
What
Fixes three silent error-swallowing bugs in the source-github connector's exception handling logic.
Resolves https://github.com/airbytehq/oncall/issues/11907
Related: https://github.com/airbytehq/oncall/issues/11869
How
Bug 1:
ContributorActivity.read_records()(streams.py ~line 1831)The
else: raise ewas paired with the outerif hasattr(...)check, not the innerif status_code == ACCEPTED. When an exception had a valid_exception.responsebut a non-ACCEPTED status code, execution fell through without re-raising — silently swallowing the error. Fix: Addedelse: raise eto the innerifblock.Bug 2:
GithubStreamABC.read_records()(streams.py ~line 140)The guard condition used
andinstead ofor:Bug 3:
Releases._extract_database_id_from_node_id()(streams.py ~line 830)Replaced bare
except Exception:withexcept (ValueError, struct.error, binascii.Error):— the specific exceptions thatbase64.urlsafe_b64decodeandstruct.unpackcan raise. Unexpected exceptions (e.g.TypeError) now propagate instead of being silently swallowed.Review guide
source_github/streams.py— all three fixes, plus theimport binasciiadditionunit_tests/test_stream.py— six new tests covering each fix (re-raise behavior + regression for expected catches)metadata.yaml/pyproject.toml— patch version bump 2.1.17 → 2.1.18docs/integrations/sources/github.md— changelog entryRecommended review checklist:
else: raise eindentation in Bug 1 is correctly paired with the innerif(line ~1832). This is the highest-risk change since Python indentation determines control flow — confirm both the new innerelse: raise eand the existing outerelse: raise eare at the correct levels.orshort-circuit logic in Bug 2 is correct(ValueError, struct.error, binascii.Error)covers all realistic failure modes ofbase64.urlsafe_b64decode+struct.unpack_exceptionand (b) with_exceptionbut without.response— verify both paths through theorUpdates since last revision
MessageRepresentationAirbyteTracedErrors→AirbyteTracedExceptionrename from fix(source-github): fix remaining NameError references to removed MessageRepresentationAirbyteTracedErrors #76080 is incorporated.test_github_stream_abc_read_records_reraises_when_no_exception_attr— now constructs a bareAirbyteTracedExceptionwith_exceptiondeleted, patchesHttpStream.read_recordsto raise it. Directly exercises the first branch of theorguard.test_github_stream_abc_read_records_reraises_when_no_response_attr(renamed) — now constructs anAirbyteTracedExceptionwith_exceptionset to a plainException(no.response). Directly exercises the second branch of theorguard.User Impact
Exceptions that were previously silently swallowed will now be properly raised, making sync failures visible to users instead of silently producing incomplete data.
Can this PR be safely reverted and rolled back?
Link to Devin session: https://app.devin.ai/sessions/e7b3b318b64e4770a309c7aaadce8de9