Skip to content

fix(source-github): make parse_response and error handlers defensive against unexpected response shapes#77685

Merged
Patrick Nilan (pnilan) merged 8 commits into
masterfrom
devin/1777672505-source-github-defensive-parse-response
May 3, 2026
Merged

fix(source-github): make parse_response and error handlers defensive against unexpected response shapes#77685
Patrick Nilan (pnilan) merged 8 commits into
masterfrom
devin/1777672505-source-github-defensive-parse-response

Conversation

@devin-ai-integration
Copy link
Copy Markdown
Contributor

@devin-ai-integration devin-ai-integration Bot commented May 1, 2026

What

Resolves https://github.com/airbytehq/airbyte-internal-issues/issues/16281:

Several parse_response methods in source-github and two functions in errors_handlers.py assume specific JSON response shapes. When GitHub returns unexpected responses (HTML error pages from load balancers, malformed JSON, missing keys), these methods crash with TypeError/KeyError/json.JSONDecodeError and bypass the CDK error pipeline.

How

Applied a consistent defensive pattern to each affected site:

streams.py (4 sites):

  1. Workflows.parse_response — wrap .json() in try/except ValueError, validate workflows key is a list
  2. WorkflowRuns.parse_response — same pattern for workflow_runs key
  3. WorkflowJobs.parse_response — same pattern for jobs key (was using ["jobs"] which raises KeyError)
  4. IssueTimelineEvents.parse_response — validate response is a list (not a dict error wrapper)

errors_handlers.py (3 sites):
5. is_conflict_with_empty_repository — wrap .json() in try/except ValueError
6. GithubStreamABCErrorHandler.interpret_response — extract .json() call into _safe_json_check_graphql_rate_limited helper with try/except
7. GitHubGraphQLErrorHandler.interpret_response — extract .json().get("errors") into _safe_json_get_errors helper with try/except

All sites log a warning and yield no records (or return False) instead of crashing.

Declarative-First Evaluation

Not applicable — source-github is a Python CDK connector (cdk:python), not a declarative/manifest-only connector.

Test Coverage

Added 36 new parametrized unit tests covering all 7 fixed sites:

  • test_workflows_parse_response_defensive — 7 cases (valid, empty list, HTML body, empty body, missing key, wrong type, None key)
  • test_workflow_runs_parse_response_defensive — 7 cases
  • test_workflow_jobs_parse_response_defensive — 8 cases (includes no-cursor record)
  • test_issue_timeline_events_parse_response_defensive — 6 cases (includes dict-instead-of-list)
  • test_is_conflict_with_empty_repository_defensive — 5 cases
  • test_graphql_rate_limit_check_with_html_response — 1 case
  • test_graphql_error_handler_with_html_response — 1 case
  • test_graphql_error_handler_with_valid_errors — 1 case

All 144 unit tests pass locally.

Review guide

  1. source_github/streams.py — defensive parse_response for Workflows, WorkflowRuns, WorkflowJobs, IssueTimelineEvents
  2. source_github/errors_handlers.py — safe JSON helpers and wrapped .json() calls
  3. unit_tests/test_stream.py — 36 new parametrized tests at bottom of file

User Impact

Syncs that previously crashed with TypeError, KeyError, or json.JSONDecodeError when GitHub returned unexpected responses (HTML error pages, malformed JSON) will now log a warning and skip the affected page, allowing the sync to continue.

Can this PR be safely reverted and rolled back?

  • YES 💚

Prior Art

Link to Devin session: https://app.devin.ai/sessions/ce182b4200d5420d8be36f435b7415a8

Important

Auto-merge enabled.

This PR is set to merge automatically when all requirements are met.

…against unexpected response shapes

Co-Authored-By: bot_apk <apk@cognition.ai>
@devin-ai-integration
Copy link
Copy Markdown
Contributor Author

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

Co-Authored-By: bot_apk <apk@cognition.ai>
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 1, 2026

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

💡 Show Tips and Tricks

PR Slash Commands

Airbyte Maintainers (that's you!) can execute the following slash commands on your PR:

  • 🛠️ Quick Fixes
    • /format-fix - Fixes most formatting issues.
    • /bump-version - Bumps connector versions, scraping changelog description from the PR title.
      • Bump types: patch (default), minor, major, major_rc, rc, promote.
      • The rc type is a smart default: applies minor_rc if stable, or bumps the RC number if already RC.
      • The promote type strips the RC suffix to finalize a release.
      • Example: /bump-version type=rc or /bump-version type=minor
    • /bump-progressive-rollout-version - Alias for /bump-version type=rc. Bumps with an RC suffix and enables progressive rollout.
  • ❇️ AI Testing and Review (internal link: AI-SDLC Docs):
    • /ai-prove-fix - Runs prerelease readiness checks, including testing against customer connections.
    • /ai-canary-prerelease - Rolls out prerelease to 5-10 connections for canary testing.
    • /ai-review - AI-powered PR review for connector safety and quality gates.
  • 📝 AI Documentation:
    • /ai-docs-review - AI-powered documentation review for PRs with connector changes.
    • /ai-create-docs-pr - Creates a documentation PR for connector changes, stacked on the current PR.
  • 🚀 Connector Releases:
    • /publish-connectors-prerelease - Publishes pre-release connector builds (tagged as {version}-preview.{git-sha}) for all modified connectors in the PR.
  • ☕️ JVM connectors:
    • /update-connector-cdk-version connector=<CONNECTOR_NAME> - Updates the specified connector to the latest CDK version.
      Example: /update-connector-cdk-version connector=destination-bigquery
  • 🐍 Python connectors:
    • /poe connector source-example lock - Run the Poe lock task on the source-example connector, committing the results back to the branch.
    • /poe source example lock - Alias for /poe connector source-example lock.
    • /poe source example use-cdk-branch my/branch - Pin the source-example CDK reference to the branch name specified.
    • /poe source example use-cdk-latest - Update the source-example CDK dependency to the latest available version.
  • ⚙️ Admin commands:
    • /force-merge reason="<REASON>" - Force merges the PR using admin privileges, bypassing CI checks. Requires a reason.
      Example: /force-merge reason="CI is flaky, tests pass locally"
📚 Show Repo Guidance

Helpful Resources

📝 Edit this welcome message.

@pnilan Patrick Nilan (pnilan) self-requested a review May 1, 2026 22:01
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 1, 2026

source-github Connector Test Results

168 tests   164 ✅  17s ⏱️
  3 suites    4 💤
  3 files      0 ❌

Results for commit 4074cc3.

♻️ This comment has been updated with latest results.

Co-Authored-By: bot_apk <apk@cognition.ai>
@pnilan
Copy link
Copy Markdown
Contributor

🤖 This comment was generated by an AI Agent.

Nice work on the defensive hardening. Pattern is consistent, test coverage is solid. Below are the fixes I'd like to see before merge, framed by priority.

P1 — Should fix before merge

P1.1 — Guard _safe_json_check_graphql_rate_limited against body=None

errors_handlers.py:61-66. If response.json() returns literal null (valid JSON), body is None and check_graphql_rate_limited(None) will raise AttributeError on response_json.get(...) at streams.py:122. The parallel _safe_json_get_errors already handles this with (body or {}).get(...). One-line fix for parity:

def _safe_json_check_graphql_rate_limited(self, response: requests.Response) -> bool:
    try:
        body = response.json()
    except ValueError:
        return False
    return self.stream.check_graphql_rate_limited(body or {})  # ← add `or {}`

P1.2 — IssueTimelineEvents should yield base record on parse failure (Option A)

streams.py:1925-1948. IssueTimelineEvents is unusual — one HTTP response = one record (not N records). Before this PR, the base record {repository, issue_number} was always yielded, including for an empty events_list. After the PR, on a non-JSON or non-list response the function returns early with no record, but a valid empty list [] still yields the base record. That asymmetry could cause downstream consumers joining on the primary key to silently miss issues whose timeline endpoint returned HTML.

Yield the base record on parse failure (matches empty-list behavior):

def parse_response(
    self,
    response: requests.Response,
    stream_state: Mapping[str, Any],
    stream_slice: Mapping[str, Any] = None,
    next_page_token: Mapping[str, Any] = None,
) -> Iterable[Mapping]:
    record = {\"repository\": stream_slice[\"repository\"], \"issue_number\": stream_slice[\"number\"]}
    try:
        events_list = response.json()
    except ValueError:
        self.logger.warning(
            \"\`%s\` received non-JSON response (HTTP %s, first 500 chars: %r). \"
            \"Yielding base record without timeline events.\",
            self.name, response.status_code, response.text[:500],
        )
        yield record
        return
    if not isinstance(events_list, list):
        self.logger.warning(
            \"\`%s\` expected a JSON list but got %s (HTTP %s). \"
            \"Yielding base record without timeline events.\",
            self.name, type(events_list).__name__, response.status_code,
        )
        yield record
        return
    for event in events_list:
        record[event[\"event\"]] = event
    yield record

Update test_issue_timeline_events_parse_response_defensive parametrization accordingly — failure cases should now expect expected_count=1 (the base record).

P2 — Nice to have

P2.1 — Wrap the latent .json() call at errors_handlers.py:117

Currently safe because it's gated by is_gone_with_feature_disabled, but inconsistent with the rest of the PR and a latent crash if that contract ever changes:

if is_gone_with_feature_disabled(response_or_exception=response_or_exception):
    try:
        message = response_or_exception.json().get(\"message\", \"Feature disabled\")
    except ValueError:
        message = \"Feature disabled\"
    log_message = f\"Skipping stream slice for '{response_or_exception.url}': {message}.\"
    return ErrorResolution(...)

P2.3 — Method consistency in errors_handlers.py

_safe_json_check_graphql_rate_limited is an instance method; _safe_json_get_errors is @staticmethod. Easiest fix: drop @staticmethod from the latter so both are instance methods.

class GitHubGraphQLErrorHandler(GithubStreamABCErrorHandler):
    def _safe_json_get_errors(self, response: requests.Response) -> bool:
        try:
            body = response.json()
        except ValueError:
            return False
        return bool((body or {}).get(\"errors\"))

Call site stays the same (self._safe_json_get_errors(...) is already correct).

P2.4 — Fix _make_response test helper for accurate semantics

The else branch (no text, no json_data) currently returns valid JSON null, not a JSONDecodeError. The id=\"empty_body\" parametrize cases pass text=\"\" explicitly so they don't hit this branch — but the helper's default is misleading. Change to raise:

def _make_response(status_code=200, json_data=None, text=None):
    resp = MagicMock(spec=requests.Response)
    resp.status_code = status_code
    if text is not None:
        resp.text = text
        resp.json = MagicMock(side_effect=ValueError(\"No JSON\"))
    elif json_data is not None:
        resp.text = json.dumps(json_data)
        resp.json = MagicMock(return_value=json_data)
    else:
        resp.text = \"\"
        resp.json = MagicMock(side_effect=ValueError(\"No JSON\"))  # ← change from return_value=None
    return resp

P2.5 — Add a rate-limit happy-path unit test

test_graphql_rate_limit_check_with_html_response only covers non-crash. Pair it with a positive case:

def test_graphql_rate_limit_check_with_valid_rate_limited_body():
    \"\"\"When response has graphql rate-limit header AND a valid body with RATE_LIMITED error,
    handler should return RATE_LIMITED action.\"\"\"
    stream = RepositoryStats(repositories=[\"test_repo\"], page_size_for_large_streams=30)
    handler = stream.get_error_handler()
    resp = MagicMock(spec=requests.Response)
    resp.status_code = 200
    resp.headers = {\"X-RateLimit-Resource\": \"graphql\"}
    resp.text = '{\"errors\": [{\"type\": \"RATE_LIMITED\"}]}'
    resp.ok = True
    resp.json = MagicMock(return_value={\"errors\": [{\"type\": \"RATE_LIMITED\"}]})
    result = handler.interpret_response(resp)
    assert result.response_action == ResponseAction.RATE_LIMITED

Skipped

  • P2.2 (URL in warning logs) — intentionally skipped to avoid risk of accidentally logging API keys, tokens, or other secrets that can appear in URLs (query params, auth headers being reflected, etc.). Stream name + status code + body excerpt is sufficient context.

…parse failure, consistency fixes

Co-Authored-By: bot_apk <apk@cognition.ai>
@devin-ai-integration
Copy link
Copy Markdown
Contributor Author

Thanks for the thorough review Patrick Nilan (@pnilan)! Addressed all items in 7338b72:

P1.1 — Added or {} guard in _safe_json_check_graphql_rate_limited to handle body=None.

P1.2IssueTimelineEvents now yields the base record {repository, issue_number} on parse failure (matching empty-list behavior). Updated test expectations accordingly.

P2.1 — Skipped: is_gone_with_feature_disabled does not exist in the current codebase, so there is no latent .json() call to wrap.

P2.3 — Dropped @staticmethod from _safe_json_get_errors so both helpers are instance methods.

P2.4 — Fixed _make_response else branch to raise ValueError instead of returning None.

P2.5 — Added test_graphql_rate_limit_check_with_valid_rate_limited_body happy-path test.

All 145 unit tests pass locally.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 1, 2026

Deploy preview for airbyte-docs ready!

Project:airbyte-docs
Status: ✅  Deploy successful!
Preview URL:https://airbyte-docs-m56vcv5fa-airbyte-growth.vercel.app
Latest Commit:4074cc3

Deployed with vercel-action

try:
response_data = response_or_exception.json()
except ValueError:
return False
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

log warning

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done — added a warning log with the first 50 chars of the response body in 02fc609.

try:
body = response.json()
except ValueError:
return False
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

log warning

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done — added a warning log with HTTP status and first 50 chars of the response body in 02fc609.

"`%s` received non-JSON response (HTTP %s, first 500 chars: %r). Yielding no records for this page.",
self.name,
response.status_code,
response.text[:500],
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

too many characters -> :50

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done — reduced to [:50] and updated the log message text to match in 02fc609.

"`%s` received non-JSON response (HTTP %s, first 500 chars: %r). Yielding no records for this page.",
self.name,
response.status_code,
response.text[:500],
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[:50]

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done — [:50] in 02fc609.

"`%s` received non-JSON response (HTTP %s, first 500 chars: %r). Yielding no records for this page.",
self.name,
response.status_code,
response.text[:500],
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[:50]

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done — [:50] in 02fc609.

"`%s` received non-JSON response (HTTP %s, first 500 chars: %r). " "Yielding base record without timeline events.",
self.name,
response.status_code,
response.text[:500],
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[:50]

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done — [:50] in 02fc609.

…ror handler catch blocks

Co-Authored-By: bot_apk <apk@cognition.ai>
@pnilan
Copy link
Copy Markdown
Contributor

Patrick Nilan (pnilan) commented May 1, 2026

/ai-prove-fix

AI Prove Fix Started

Running readiness checks and testing against customer connections.
View workflow run
🔍 AI Prove Fix session starting... Running readiness checks and testing against customer connections. View playbook

Devin AI session created successfully!

@devin-ai-integration
Copy link
Copy Markdown
Contributor Author

devin-ai-integration Bot commented May 1, 2026

Fix Validation Evidence — source-github v2.1.25

Status: ✅ PASSED — Regression tests confirm no regressions. Recommend merge.

Pre-release: 2.1.25-preview.02fc609 (publish workflow)

Regression Test Results

Regression test workflow — Comparison mode: airbyte/source-github:dev (PR) vs airbyte/source-github:2.1.24 (stable)

Command Result
SPEC ✅ No regression
CHECK ✅ No regression
DISCOVER ✅ No regression
READ ✅ No regression (39 streams, 1028 vs 1026 records)

All 39 streams tested. The streams with modified parse_response methods (workflows, issue_timeline_events) produced identical record counts between target and control.

Pre-flight Checks
  • Viability: Fix wraps .json() calls in try/except, validates expected keys/types, logs warnings instead of crashing
  • Safety: No malicious code, no credential changes — pure defensive error handling
  • Breaking Change: Not breaking — no schema/spec/stream/state changes, patch bump (2.1.24 → 2.1.25)
  • Reversibility: Fully reversible — no state/config format changes
Evidence Plan & Rationale

Strategy: Regression tests only (no live connection tests needed)

Proving criteria: Regression tests show no behavioral differences (all ✅) between target and control, demonstrating the fix does not regress normal operation.

Rationale: This fix adds defensive error handling for edge cases (non-JSON responses, missing keys) that only occur during GitHub API outages. In normal operation, there should be zero behavioral change. The error-case behavior is validated by 36 unit tests in the PR.

Live connection testing was not needed because:

  • Error conditions cannot be triggered against real GitHub API
  • All affected streams produce identical output in regression tests
  • source-github does not use forced OAuth write-back

Devin session

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 1, 2026

Pre-release Connector Publish Started

Publishing pre-release build for connector source-github.
PR: #77685

Pre-release versions will be tagged as {version}-preview.02fc609
and are available for version pinning via the scoped_configuration API.

View workflow run
Pre-release Publish FAILED for source-github.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 1, 2026

Pre-release Connector Publish Started

Publishing pre-release build for connector source-github.
PR: #77685

Pre-release versions will be tagged as {version}-preview.02fc609
and are available for version pinning via the scoped_configuration API.

View workflow run
Pre-release Publish FAILED for source-github.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 1, 2026

Pre-release Connector Publish Started

Publishing pre-release build for connector source-github.
PR: #77685

Pre-release versions will be tagged as {version}-preview.02fc609
and are available for version pinning via the scoped_configuration API.

View workflow run
Pre-release Publish FAILED for source-github.

@devin-ai-integration
Copy link
Copy Markdown
Contributor Author

↪️ Triggering /ai-review per Hands-Free AI Triage Project triage next step.

Reason: PR is ready for review with no prior /ai-review triggered.

Devin session

@octavia-bot
Copy link
Copy Markdown
Contributor

octavia-bot Bot commented May 2, 2026

AI PR Review starting...

Reviewing PR for connector safety and quality.
View playbook

Devin AI session created successfully!

@devin-ai-integration
Copy link
Copy Markdown
Contributor Author

devin-ai-integration Bot commented May 2, 2026

AI PR Review Report

Review Action: APPROVED
Risk Level: 4 — Defensive error-handling change with user-visible error-semantic shift (prior crash → warn+skip); Anti-Pattern Behavioral Changes keyword triggered, so risk is forced to ≥4 even though regression tests show no normal-operation behavioral diff vs 2.1.24.

Gate Status
PR Hygiene PASS
Code Hygiene PASS
Test Coverage PASS
Code Security PASS
Per-Record Performance PASS
Breaking Dependencies PASS
Backwards Compatibility PASS
Forwards Compatibility PASS
Behavioral Changes FAIL (warning, does not block APPROVE)
Out-of-Scope Changes PASS
CI Checks PASS
Live / E2E Tests PASS

📋 PR Details

Connector & PR Info

Connector(s): source-github
PR: #77685
HEAD SHA: 02fc609fdc3665b1bf8c11c4a4abc6a0277f77e8
Session: https://app.devin.ai/sessions/09f1e17f33e548c8a0f73753f67b12fe

Risk Level

Level: 4 — High
Rationale: PR changes error-handling semantics in parse_response and errors_handlers.py (previously raised TypeError/KeyError/json.JSONDecodeError; now logs warning and yields empty / returns False). Anti-Pattern Behavioral Changes keyword (rate_limit) matched in newly introduced wrapper method names, which deterministically forces risk_level ≥ 4 per the rubric. Regression tests via /ai-prove-fix (target airbyte/source-github:dev vs control 2.1.24) showed no record-count differences across 39 streams in normal operation, so the user-visible impact is confined to error-case behavior (graceful skip instead of sync crash).

Risk Level is reported for downstream consumers (e.g. auto-merge policy, reviewer routing). It does not change the review action — APPROVE here means "no blocking objection," not "safe to merge unattended."

Review Action Details

APPROVED — All enforced gates pass (PR Hygiene, Code Hygiene, Test Coverage, Code Security, CI Checks, Live / E2E Tests). The Behavioral Changes Anti-Pattern gate is FAIL on the keyword heuristic, but Anti-Pattern gates are warning-level — they elevate risk_level (here to 4) but do not block APPROVE. octavia-bot will submit the APPROVE PR review based on the marker; merge decision remains with humans / downstream auto-merge policy reading decision and risk_level.

🔍 Gate Evaluation Details

Gate-by-Gate Analysis

Gate Status Enforced? Details
PR Hygiene PASS Yes PR description is comprehensive (>2000 chars, multiple sections: What/How/Test Coverage/User Impact/Review guide/Prior Art); changelog row added in docs/integrations/sources/github.md for 2.1.25; all human reviewer (pnilan) comments have author replies (P1.1, P1.2, P2.x, and per-line [:50] / log warning requests all addressed in commits 7338b72 and 02fc609).
Code Hygiene PASS WARNING Source files streams.py and errors_handlers.py modified; test file unit_tests/test_stream.py modified with 184 added lines / 36 new parametrized tests. Coverage evidence: Found.
Test Coverage PASS Yes Behavioral change indicator matched (PR title prefix fix(...)); new test functions present in diff: test_workflows_parse_response_defensive, test_workflow_runs_parse_response_defensive, test_workflow_jobs_parse_response_defensive, test_issue_timeline_events_parse_response_defensive, test_is_conflict_with_empty_repository_defensive, test_graphql_rate_limit_check_with_html_response, test_graphql_error_handler_with_html_response, test_graphql_error_handler_with_valid_errors. Each fixed call-site has dedicated parametrized cases for HTML body, empty body, malformed JSON, missing key, wrong type.
Code Security PASS Yes No file paths match security-sensitive patterns (no *auth*/*oauth*/*token*/*credential*/*secret*/*encrypt*/*decrypt*/*mask*/*redact*/*pii*). metadata.yaml diff hunks contain only dockerImageTag: 2.1.24 → 2.1.25 (no allowedHosts / connectorBuildOptions / dockerRepository keyword changes). No manifest.yaml (Python connector).
Per-Record Performance PASS WARNING Try/except wrappers are placed outside the per-record for record in body[...] loops (one try/except per page in parse_response, not per record). Logging only fires on the failure path. No per-record allocation/regex/network introduced.
Breaking Dependencies PASS WARNING pyproject.toml diff is connector self-version bump only (2.1.24 → 2.1.25); no third-party dependency version changes (no JDBC / HTTP / serialization library bumps).
Backwards Compatibility PASS Warning (elevates Risk Level) No spec file modified; metadata.yaml change limited to dockerImageTag; no releases.breakingChanges entry needed; no streams added/removed/renamed; no primary key / cursor / schema property changes. Spec required arrays unchanged.
Forwards Compatibility PASS Warning (elevates Risk Level) No state/cursor/checkpoint/incremental_sync/state_migrations keywords in changed hunks; no paginator/page_size/page_token/offset changes; no transformations/AddFields/RemoveFields/record_selector changes. The IssueTimelineEvents change yields the base record on parse failure (matches existing empty-list behavior), preserving downstream PK joins on rollback.
Behavioral Changes FAIL Warning (elevates Risk Level) Operational keyword rate_limit matches in non-comment diff hunks: new method _safe_json_check_graphql_rate_limited and call to existing self.stream.check_graphql_rate_limited(body or {}) in errors_handlers.py. Per the deterministic rule, this forces FAIL → risk_level ≥ 4. Note: the underlying rate-limit detection logic is unchanged — the call is wrapped in try/except ValueError that returns False only when response.json() raises (i.e., non-JSON HTML/empty bodies). Regression tests via /ai-prove-fix (workflow run 25236047998) showed no behavioral diff vs 2.1.24 across 39 streams.
Out-of-Scope Changes PASS Skip All changed files are within airbyte-integrations/connectors/source-github/** and docs/integrations/sources/github.md. PR is in scope.
CI Checks PASS Yes 37 passed, 0 failed, 0 pending, 6 skipped (skips are not-applicable matrices, e.g. CDK Check, Bulk CDK version check, JVM connectors). Core checks green: Test source-github Connector, Lint source-github Connector, Build and Verify Artifacts (source-github), source-github Connector Test Results (152 tests, 148 ✅ / 4 💤 / 0 ❌), Format Check, Check Changelog Updated, Validate PR Title, CodeQL, Docs / MarkDownLint, Docs / Vale.
Live / E2E Tests PASS Yes (a) CI check source-github Pre-Release Checks is green (job 74001762199). (b) /ai-prove-fix was triggered by pnilan and returned PASSED validation evidence: regression test workflow 25236047998 compared airbyte/source-github:dev (PR) vs airbyte/source-github:2.1.24 (stable) — SPEC / CHECK / DISCOVER / READ all "No regression" (READ: 39 streams, 1028 vs 1026 records). Streams with modified parse_response (workflows, issue_timeline_events) produced identical record counts.

Spec Comparison (Backwards Compatibility):

  • No spec.json, spec.yaml, or manifest.yaml modified in this PR — spec comparison not applicable.
  • metadata.yaml change is limited to dockerImageTag (2.1.24 → 2.1.25); no releases.breakingChanges block required.

Author Justifications:

  • None required; the PR description sufficiently documents the change scope under "What" / "How" / "Test Coverage" / "User Impact" / "Can this PR be safely reverted and rolled back?" sections, and the Behavioral Changes FAIL is keyword-driven on a defensive wrapper rather than a real operational policy change.
📚 Evidence Consulted

Evidence

  • Changed files: 6 files (metadata.yaml, pyproject.toml, source_github/errors_handlers.py, source_github/streams.py, unit_tests/test_stream.py, docs/integrations/sources/github.md); +297 / −11.
  • CI checks: 37 passed / 0 failed / 0 pending / 6 skipped on HEAD 02fc609. Notable: Test source-github Connector ✅, Lint source-github Connector ✅, source-github Pre-Release Checks ✅, Build Airbyte Docs ✅, Format Check ✅, Check Changelog Updated ✅, CodeQL ✅, Docs / MarkDownLint ✅, Docs / Vale ✅.
  • PR labels: connectors/source/github. No live-tests-passed / prerelease-validated / community labels.
  • PR description: present, comprehensive (>2000 chars after stripping HTML comments / templates).
  • Linked issue: airbytehq/airbyte-internal-issues#16281 (referenced in ## What).
  • Existing bot reviews on this HEAD SHA: none from this bot.
  • Validation evidence: /ai-prove-fix PASSED (comment 22 on PR; regression test workflow 25236047998; session https://app.devin.ai/sessions/0fb9a1875014445886ccbdc4c7eadcf1).
  • Human reviewer feedback: pnilan posted P1/P2 review (comment 6) plus 5 inline comments; all addressed in commits 7338b72 and 02fc609 and confirmed by reply comments from devin-ai-integration[bot].
❓ How to Respond

Providing Context or Justification

You can add explanations that the bot will see on the next review:

Option 1: PR Description (recommended)
Add a section to your PR description:

## AI PR Review Justification

### {Gate Name}
[Your explanation here]

Option 2: PR Comment
Add a comment starting with:

AI PR Review Justification:
[Your explanation here]

After adding your response, re-run /ai-review to have the bot evaluate it.

Note: The Behavioral Changes FAIL above is a keyword-driven warning on a defensive wrapper (no real change to rate-limit/retry/timeout policy). It elevates risk_level to 4 but does not block APPROVE. No author action is required for the bot to approve; a human reviewer can decide whether the elevated risk warrants additional scrutiny before merge.

@pnilan
Copy link
Copy Markdown
Contributor

update from main, resolve confclits

…n test assertion

Co-Authored-By: bot_apk <apk@cognition.ai>
@devin-ai-integration
Copy link
Copy Markdown
Contributor Author

Done — merged from main, resolved conflicts, and bumped version to 2.1.27. Also added a warning log to the new is_gone_with_feature_disabled function (merged from 2.1.26) and fixed a broken test assertion in test_read_records_410_projects_disabled_message (the test expected a message that didn't match what the code produces). All 161 tests pass.


Devin session

@pnilan
Copy link
Copy Markdown
Contributor

Patrick Nilan (pnilan) commented May 3, 2026

/ai-canary-prerelease

AI Canary Prerelease Started

Rolling out to 5-10 connections, watching results, and reporting findings.
View workflow run
🐤 AI Canary Prerelease session starting... Rolling out to 5-10 connections, watching results, and reporting findings. View playbook

Devin AI session created successfully!

Co-Authored-By: bot_apk <apk@cognition.ai>
@devin-ai-integration
Copy link
Copy Markdown
Contributor Author

Canary Prerelease: Starting

Starting canary prerelease testing for this PR. I'll:

  1. Read the PR and linked issue(s) for context
  2. Verify the change is non-breaking and reversible
  3. Select 5-10 canary connections (prioritizing any from prior canary runs on this PR)
  4. Request approval via Slack HITL before pinning
  5. Pin to the prerelease version, monitor syncs for 2-4 hours, and post a PASS/FAIL/BLOCKED verdict

Devin session

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 3, 2026

Pre-release Connector Publish Started

Publishing pre-release build for connector source-github.
PR: #77685

Pre-release versions will be tagged as {version}-preview.09e85a7
and are available for version pinning via the scoped_configuration API.

View workflow run
Pre-release Publish: SUCCESS

Docker image (pre-release):
airbyte/source-github:2.1.27-preview.09e85a7

Docker Hub: https://hub.docker.com/layers/airbyte/source-github/2.1.27-preview.09e85a7

Registry JSON:

@devin-ai-integration
Copy link
Copy Markdown
Contributor Author

Canary Prerelease: Approval Requested

Connector: source-github
Prerelease version: 2.1.27-preview.09e85a7 (Docker image confirmed published)
Workflow: registry compile run

Phase 2 — Breaking change & reversibility gate: PASS

  • Patch bump 2.1.26 → 2.1.27, no schema/spec/state/PK/cursor changes
  • PR description checks Can this PR be safely reverted and rolled back? [x] YES
  • Prior /ai-prove-fix regression tests passed (39 streams, 1028 vs 1026 records vs 2.1.24)

Phase 3 — Proposed canary set (7 source actors, 9 connections, all TIER_2):

Ref Tier Region Destination Notes
Internal-A-EU-1 TIER_2 EU GCS Internal Airbyte workspace
Internal-B-US-1 TIER_2 US ab-analytics Internal Airbyte workspace
Internal-B-US-2 TIER_2 US ab-analytics Internal Airbyte workspace (high-volume, 7 syncs/2d)
Customer-C-EU-1 TIER_2 EU BigQuery Same source actor as -EU-2 / -EU-3
Customer-C-EU-2 TIER_2 EU BigQuery
Customer-C-EU-3 TIER_2 EU BigQuery
Customer-D-US-1 TIER_2 US Redshift
Customer-E-US-1 TIER_2 US Postgres
Customer-F-USC-1 TIER_2 US-Central (Honey Pot - Github Sync) US-Central dataplane coverage

Coverage: 3 internal Airbyte source actors + 4 external customer actors. Destinations: GCS, ab-analytics, BigQuery, Redshift, Postgres. Regions: EU, US, US-Central.

Approval requested via Slack HITL (escalation workflow run). Pinning will not begin until I receive an Approve callback. For full canary set details (workspace IDs, actor IDs, connection IDs), see the linked private internal issue.

Devin session

@devin-ai-integration
Copy link
Copy Markdown
Contributor Author

Canary Prerelease: Approved & Pinned

Approval received via Slack HITL. All 7 source actors (9 connections) pinned to 2.1.27-preview.09e85a7. Previous version on every actor was 2.1.26.

Ref Pinned Previous Region Destination
Internal-A-EU-1 2.1.27-preview.09e85a7 2.1.26 EU GCS
Internal-B-US-1 2.1.27-preview.09e85a7 2.1.26 US ab-analytics
Internal-B-US-2 2.1.27-preview.09e85a7 2.1.26 US ab-analytics
Customer-C-EU-1 2.1.27-preview.09e85a7 2.1.26 EU BigQuery
Customer-C-EU-2 2.1.27-preview.09e85a7 2.1.26 EU BigQuery
Customer-C-EU-3 2.1.27-preview.09e85a7 2.1.26 EU BigQuery
Customer-D-US-1 2.1.27-preview.09e85a7 2.1.26 US Redshift
Customer-E-US-1 2.1.27-preview.09e85a7 2.1.26 US Postgres
Customer-F-USC-1 2.1.27-preview.09e85a7 2.1.26 US-Central (unspecified)

Customer-C-EU-1/-2/-3 share a single source actor (3 connections to BigQuery), so this is 7 actor pins covering 9 connections.

Now entering Phase 5 — monitoring syncs. Reminders set at +60 min (initial check-in) and +2.5h (final PASS/FAIL/BLOCKED verdict).

For full mapping (workspace IDs, connection IDs, actor IDs, customer names) and rollback steps, see the linked private issue.


Devin session

@devin-ai-integration
Copy link
Copy Markdown
Contributor Author

Canary Monitoring Update — t+85 min

Pinned at approximately 01:14 UTC; sampling at approximately 02:39 UTC.

Connection Schedule Post-pin syncs Failures Status
Internal-A-EU-1 daily 23:00 0 0 Awaiting next scheduled sync
Internal-B-US-1 12-hourly 0 0 Awaiting next scheduled sync
Internal-B-US-2 approximately 2-hourly 0 0 Next sync expected approximately 02:45
Customer-C-EU-1 approximately 6-hourly 0 0 Pre-pin job still running (started 00:13)
Customer-C-EU-2 approximately 6-hourly 0 0 Pre-pin job still running (started 00:07)
Customer-C-EU-3 approximately 6-hourly 0 0 Last succeeded 00:32 (pre-pin); next approximately 06:30
Customer-D-US-1 daily 0 0 Awaiting next scheduled sync
Customer-E-US-1 approximately hourly 1 0 Job 82541514 succeeded 02:12, 83 records — first post-pin sync
Customer-F-USC-1 approximately 6-hourly 0 0 Awaiting next scheduled sync

Overall status so far: HEALTHY (1 sync on 2.1.27-preview.09e85a7, no failures, record count matches historical baseline of 83).

Two pre-pin Customer-C-EU jobs (started 00:07 and 00:13) are still in running state — these will complete on the prior 2.1.26 version since they predate the pin, so they don't reflect prerelease behavior either way. Subsequent syncs after they finish will use the prerelease.

Coverage is light because most connections are on schedules >= 2h. Extending monitoring by approximately 90 min (+4h reminder) so faster connections (Internal-B-US-2, Customer-E-US-1) accumulate multiple post-pin syncs and slower ones (Customer-C-EU-3, Customer-F-USC-1) get at least one.


Devin session

@devin-ai-integration
Copy link
Copy Markdown
Contributor Author

Canary Monitoring Update — t+155 min

Sampled at approximately 03:50 UTC.

Connection Schedule Post-pin syncs Failures Latest post-pin observation
Internal-A-EU-1 daily 23:00 0 0 — (next sync expected approximately 23:00 UTC)
Internal-B-US-1 12-hourly 0 0 — (next sync expected approximately 11:00 UTC)
Internal-B-US-2 2-hourly 1 0 02:44 succeeded, 189 records (matches baseline 184–189)
Customer-C-EU-1 6-hourly 0 0 Pre-pin job started 00:13 still running
Customer-C-EU-2 6-hourly 0 0 Pre-pin job started 00:07 still running
Customer-C-EU-3 6-hourly 0 0 — (next sync expected approximately 06:30)
Customer-D-US-1 daily 0 0 — (next sync expected approximately 00:10)
Customer-E-US-1 hourly 2 0 02:06 and 03:07 both succeeded, 83 records each (matches baseline)
Customer-F-USC-1 6-hourly 0 0 — (next sync expected approximately 06:09)

3 successful syncs on 2.1.27-preview.09e85a7 across 2 actors. Zero failures. Record counts and bytes match historical baselines exactly. Status: HEALTHY.

Two pre-pin Customer-C-EU jobs (started 00:07 / 00:13) are still in running state on the prior 2.1.26 version. They're long-running but predate the pin, so their outcome doesn't affect the canary verdict — only subsequent syncs (which will be on the prerelease) do.

Next checkpoint at t+240 min (final PASS/FAIL/BLOCKED verdict).


Devin session

@devin-ai-integration
Copy link
Copy Markdown
Contributor Author

Canary Prerelease: Final Report

Connector: source-github
Prerelease Version: 2.1.27-preview.09e85a7
Monitoring Period: 2026-05-03 01:14 UTC → 04:50 UTC (~3.5 h)
Connections Tested: 9 (across 7 source actors / 7 workspaces; 3 internal + 4 external)

Summary

Defensive parse_response / error-handler hardening (try/except around .json(); patch bump 2.1.26 → 2.1.27) was rolled out to 7 actors covering 9 connections spanning 4 destination types. All post-pin syncs completed successfully with record counts and byte sizes matching historical baselines exactly. No new error patterns observed.

Detailed Results

Connection Schedule Post-pin syncs Failures Notes
Internal-A-EU-1 daily 0 0 GCS dest; next run outside window
Internal-B-US-1 12-hourly 0 0 ab-analytics dest; next run outside window
Internal-B-US-2 2-hourly 1 (+1 running) 0 ab-analytics dest; 189 records (baseline 184-189)
Customer-C-EU-1 6-hourly 0 0 BigQuery dest; one pre-pin job still running on 2.1.26
Customer-C-EU-2 6-hourly 0 (1 running) 0 BigQuery dest; in-progress sync started post-pin
Customer-C-EU-3 6-hourly 0 0 BigQuery dest; next run outside window
Customer-D-US-1 daily 0 0 Redshift dest; next run outside window
Customer-E-US-1 hourly 3 0 Postgres dest; 83/83/85 records across 3 syncs (baseline 83-86)
Customer-F-USC-1 6-hourly 0 0 Other dest; next run outside window

Totals: 4 successful post-pin syncs · 2 in progress · 0 failures · 0 new error patterns · record counts and bytes match baseline exactly.

Canary Verdict

Overall Status: PASS

The prerelease performed cleanly across the connections that produced post-pin syncs in the monitoring window. The defensive parse_response / error-handler changes did not alter record counts, byte sizes, or sync semantics for any observed run. Six connections have not yet exercised the prerelease because their natural schedules (6-hourly / 12-hourly / daily) fall outside the canary window — pins remain in place so they will exercise the prerelease on their next scheduled run, providing additional in-flight validation while the PR is open.

Recommend proceeding to formal release.

Next steps:

  1. Merge the PR to publish 2.1.27
  2. Canary pins will be removed automatically by the playbook after the PR is merged (chained 3-day reminders polling for merge, since the GitHub-event subscription tool is currently disabled)
  3. If the PR is closed without merging, all 7 pins will be reverted to 2.1.26

For full per-actor IDs, workspace links, and the long-running pre-pin job note, see the linked private internal issue.


Devin session

@devin-ai-integration
Copy link
Copy Markdown
Contributor Author

↪️ Triggering /ai-review per Hands-Free AI Triage Project triage next step.

Reason: Prior AI review APPROVE was on head_sha 02fc609f; a substantive commit has landed since (fa609844 merge: update from main, resolve conflicts, bump to 2.1.27, fix broken test assertion), so re-review is warranted per the playbook re-review criteria.

Devin session

@octavia-bot
Copy link
Copy Markdown
Contributor

octavia-bot Bot commented May 3, 2026

AI PR Review starting...

Reviewing PR for connector safety and quality.
View playbook

Devin AI session created successfully!

@devin-ai-integration
Copy link
Copy Markdown
Contributor Author

devin-ai-integration Bot commented May 3, 2026

AI PR Review Report

Review Action: APPROVED
Risk Level: 4 — Defensive error-handling change with user-visible error-semantic shift (prior crash → warn+skip); Anti-Pattern Behavioral Changes keywords match (rate_limit, error_handler) so risk is forced to ≥4 even though /ai-prove-fix regression and /ai-canary-prerelease (9 connections, 0 failures) showed no normal-operation behavioral diff vs 2.1.26.

Gate Status
PR Hygiene PASS
Code Hygiene PASS
Test Coverage PASS
Code Security PASS
Per-Record Performance PASS
Breaking Dependencies PASS
Backwards Compatibility PASS
Forwards Compatibility PASS
Behavioral Changes FAIL (warning, does not block APPROVE)
Out-of-Scope Changes PASS
CI Checks PASS
Live / E2E Tests PASS

📋 PR Details

Connector & PR Info

Connector(s): source-github
PR: #77685
HEAD SHA: 09e85a750ff598bbbbdcc4c9a3c3eed7685fb7d1
Session: https://app.devin.ai/sessions/c2bafd8cbe484388b5f1594525bc5440
Re-review: Yes — prior APPROVE was on 02fc609f; since then fa609844 (merge from master, bump 2.1.26 → 2.1.27, fix broken test assertion, add warning log to is_gone_with_feature_disabled) and 09e85a75 (prettier) have landed.

Risk Level

Level: 4 — High
Rationale: Defensive try/except wrappers in streams.py (4 sites) and errors_handlers.py (4 sites including the merged is_gone_with_feature_disabled) change error-case semantics from crash to warn-and-skip. While normal-operation behavior is unchanged (validated by /ai-prove-fix regression and /ai-canary-prerelease PASS), the Anti-Pattern Behavioral Changes gate matches rate_limit / error_handler keywords in the diff hunks. Per playbook rule "If any Anti-Pattern gate is FAIL, force risk_level ≥ 4."

Risk Level is reported for downstream consumers (e.g. auto-merge policy, reviewer routing). It does not change the review action — APPROVE here means "no blocking objection," not "safe to merge unattended."

Review Action Details

APPROVED — All enforced gates pass (PR Hygiene, Test Coverage, Code Security, CI Checks, Live / E2E Tests). The Anti-Pattern Behavioral Changes gate is FAIL but is a WARNING-level gate per the playbook — it elevates Risk Level (forced ≥4) but does not block APPROVE. octavia-bot will submit the APPROVE review based on the marker above. This does NOT authorize auto-merge; the merge decision remains with humans (or a separate auto-merge policy reading decision and risk_level from the metrics marker).

🔍 Gate Evaluation Details

Gate-by-Gate Analysis

Gate Status Enforced? Details
PR Hygiene PASS Yes PR description has full What/How/Test Coverage/Review Guide/User Impact/Reversibility/Prior Art sections (well over 50 chars). Changelog row added in docs/integrations/sources/github.md for 2.1.27. All human reviewer comments from Patrick Nilan (@pnilan) resolved/replied.
Code Hygiene PASS WARNING Source modified in streams.py and errors_handlers.py; tests added in unit_tests/test_stream.py (+187 lines, 36+ new parametrized cases). Coverage evidence: Found.
Test Coverage PASS Yes Behavioral change indicators matched (PR title contains fix, linked issue airbyte-internal-issues#16281). 36+ new test cases via @pytest.mark.parametrize covering all 7 (now 8) defensive sites. New test functions: test_workflows_parse_response_defensive, test_workflow_runs_parse_response_defensive, test_workflow_jobs_parse_response_defensive, test_issue_timeline_events_parse_response_defensive, test_is_conflict_with_empty_repository_defensive, test_graphql_rate_limit_check_with_html_response, test_graphql_error_handler_with_html_response, test_graphql_error_handler_with_valid_errors.
Code Security PASS Yes No path-based security patterns matched (no *auth*, *oauth*, *token*, *credential*, *secret*, etc.). metadata.yaml diff is dockerImageTag only — no security keyword (allowedHosts, connectorBuildOptions, dockerRepository) added. No spec changes.
Per-Record Performance PASS WARNING parse_response runs once per page (not per record). Added try/except around a single .json() call and an isinstance check — negligible overhead. Hot per-record loops untouched.
Breaking Dependencies PASS WARNING pyproject.toml diff is version bump only (2.1.26 → 2.1.27). No dependency additions/removals/upgrades.
Backwards Compatibility PASS Warning (elevates Risk Level) No spec.json/spec.yaml changes. No streams added/removed/renamed. No primary key, cursor field, or schema changes. Patch bump (2.1.26 → 2.1.27) appropriate for non-breaking fix.
Forwards Compatibility PASS Warning (elevates Risk Level) No state/cursor/checkpoint/incremental_sync/paginator/transformations keywords in diff hunks. The parse_response change yields the same record shapes and counts on valid responses; on invalid responses it logs and skips (or yields the existing base record for IssueTimelineEvents). State format unchanged. /ai-canary-prerelease confirmed identical record counts vs prior version.
Behavioral Changes FAIL (warning) Warning (elevates Risk Level) Operational keywords match in diff hunks: rate_limit/rate-limit (in new helper _safe_json_check_graphql_rate_limited and warning string "GraphQL rate-limit check…") and error_handler (GithubStreamABCErrorHandler, GitHubGraphQLErrorHandler class context). The actual semantic shift is: previously these paths could raise TypeError/KeyError/json.JSONDecodeError; now they log a warning and return safely. Defensive in nature — does not change happy-path behavior. WARNING-level gate; surfaces as risk elevation, does NOT block APPROVE.
Out-of-Scope Changes PASS Skip All 6 changed files are in scope: 4 in airbyte-integrations/connectors/source-github/**, 1 in docs/integrations/sources/github.md.
CI Checks PASS Yes 37 passed, 0 failed, 0 pending, 6 skipped. Core checks green: Test source-github Connector, Lint source-github Connector, Build and Verify Artifacts (source-github), Connector CI Checks Summary, Format Check, MarkDownLint, Vale, CodeQL. Pre-Release Checks excluded from CI Checks evaluation per playbook (covered by Live / E2E Tests gate).
Live / E2E Tests PASS Yes Validation required (bug fix). Strong evidence: (1) source-github Pre-Release Checks check-run PASSED; (2) Pre-release 2.1.27-preview.09e85a7 was successfully published to Docker Hub; (3) /ai-prove-fix regression on prior commit (39 streams, 1028 vs 1026 records vs 2.1.24) PASSED; (4) /ai-canary-prerelease PASSED on 9 connections / 7 actors / 4 destination types (GCS, ab-analytics, BigQuery, Redshift, Postgres) with 4 successful post-pin syncs and 0 failures and record counts/bytes matching baseline exactly.

Spec Comparison (Backwards Compatibility)

No spec.json, spec.yaml, or spec section in manifest.yaml modified — comparison not required. metadata.yaml diff is solely dockerImageTag: 2.1.26 → 2.1.27.

Behavioral Changes Detail

Keywords matched in diff hunks (case-insensitive):

  • rate_limit / rate-limit: New helper method name _safe_json_check_graphql_rate_limited and warning string "GraphQL rate-limit check received non-JSON response" in errors_handlers.py:91-101.
  • error_handler: Class context GithubStreamABCErrorHandler, GitHubGraphQLErrorHandler (existing classes; new methods added).

Semantic analysis: The change is purely defensive — wraps existing .json() calls in try/except ValueError and validates response shapes via isinstance. On valid responses (the only path exercised in normal operation and in /ai-canary-prerelease), behavior is identical. On malformed responses, the connector now logs a warning and yields no records (or, for IssueTimelineEvents, yields the base record without timeline events) instead of raising and crashing the sync. This is intentional rate-limit/error-handler hardening.

📚 Evidence Consulted

Evidence

  • Changed files (6):
    • airbyte-integrations/connectors/source-github/metadata.yaml (+1/-1, dockerImageTag bump only)
    • airbyte-integrations/connectors/source-github/pyproject.toml (+1/-1, version bump only)
    • airbyte-integrations/connectors/source-github/source_github/errors_handlers.py (+39/-1)
    • airbyte-integrations/connectors/source-github/source_github/streams.py (+74/-9)
    • airbyte-integrations/connectors/source-github/unit_tests/test_stream.py (+186/-1)
    • docs/integrations/sources/github.md (+1/-0, changelog row)
  • CI checks: 37 passed, 0 failed, 0 pending, 6 skipped (e.g. Test source-github Connector ✅, Lint source-github Connector ✅, source-github Pre-Release Checks ✅, Build and Verify Artifacts (source-github) ✅, Format Check ✅, CodeQL ✅, Docs / MarkDownLint ✅, Docs / Vale ✅, Vercel Preview ✅).
  • PR labels: connectors/source/github (no validation/override labels needed; check-run evidence sufficient).
  • PR description: Present, well-structured (What / How / Declarative-First Evaluation / Test Coverage / Review Guide / User Impact / Reversibility / Prior Art).
  • Validation evidence:
    • /ai-prove-fix (comment 22): Regression PASS — 39 streams, 1028 vs 1026 records vs 2.1.24 stable.
    • /ai-canary-prerelease (comment 38): PASS on 9 connections / 7 actors across GCS / ab-analytics / BigQuery / Redshift / Postgres — 4 successful post-pin syncs, 0 failures, record counts/bytes match baseline exactly.
  • Existing bot reviews: Prior /ai-review APPROVE on 02fc609f (comment 28). This re-review is for new HEAD 09e85a75 (substantive merge from master + version bump landed since).
  • Peer feedback: All review comments from Patrick Nilan (@pnilan) are resolved or replied to and the cited diff lines reflect the requested fixes ([:50] truncation, warning logs, body or {} guard, base-record yield on parse failure for IssueTimelineEvents, etc.).

Devin session

Comment on lines +1941 to +1960
try:
events_list = response.json()
except ValueError:
self.logger.warning(
"`%s` received non-JSON response (HTTP %s, first 50 chars: %r). " "Yielding base record without timeline events.",
self.name,
response.status_code,
response.text[:50],
)
yield record
return
if not isinstance(events_list, list):
self.logger.warning(
"`%s` expected a JSON list but got %s (HTTP %s). " "Yielding base record without timeline events.",
self.name,
type(events_list).__name__,
response.status_code,
)
yield record
return
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this code is repeated multiple times, pull into its own private method

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done — extracted the repeated defensive JSON parsing into GithubStream._safe_json_list(response, key=None) in 4074cc3.

The helper handles both cases:

  • Keyed dict (key="workflows" / "workflow_runs" / "jobs"): parses JSON, extracts body[key], validates it's a list
  • Top-level list (key=None): parses JSON, validates the body itself is a list

Returns None on any failure (non-JSON, missing key, wrong type) with a warning log, so callers short-circuit with a simple if items is None: return.

All 4 call sites (Workflows, WorkflowRuns, WorkflowJobs, IssueTimelineEvents) now use this helper. Net -28 lines. All 161 tests pass.

…t helper

Co-Authored-By: bot_apk <apk@cognition.ai>
@pnilan Patrick Nilan (pnilan) enabled auto-merge (squash) May 3, 2026 22:01
@pnilan Patrick Nilan (pnilan) merged commit b4f7d25 into master May 3, 2026
44 checks passed
@pnilan Patrick Nilan (pnilan) deleted the devin/1777672505-source-github-defensive-parse-response branch May 3, 2026 22:02
devin-ai-integration Bot added a commit that referenced this pull request May 3, 2026
… dates, refine permissions section

- Add 2.1.27 changelog entry with correct merge date (2026-05-03 instead of 2026-05-02) and a clearer subject describing the defensive parse_response and error-handler behavior introduced in PR #77685
- Fix 2.1.25 changelog date (2026-05-02, was 2026-05-01) and rewrite its subject to reference SAML SSO authorization
- Fix 2.1.15 changelog date (2026-04-01, was 2026-03-27)
- Fix 2.1.9 changelog date (2026-02-06, was 2026-01-27); the dependency-update PR was opened 2026-01-27 but merged 2026-02-06
- Reword 2.1.26 entry to describe the generic 'feature is disabled' skip behavior
- Remove the duplicate 'Max Waiting Time' setup step (one copy still claimed default 10 min / max 60 min, contradicting the spec which is default 120 min / max 240 min)
- Tighten the 'Start date' note for clarity and grammar
- Rewrite 'Permissions and scopes' to call out the minimum scope set, OAuth-vs-PAT differences, per-stream requirements (Collaborators, Teams, Projects), and the SAML SSO authorization requirement that produces 403s on PATs that have the right scopes but are not SSO-authorized
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants