Skip to content

Fixes #29358: fix BigQuery test connection signature, per-project scoping, and engine teardown#29359

Merged
chirag-madlani merged 8 commits into
mainfrom
ices2/bigquery-test-connection-fix
Jun 30, 2026
Merged

Fixes #29358: fix BigQuery test connection signature, per-project scoping, and engine teardown#29359
chirag-madlani merged 8 commits into
mainfrom
ices2/bigquery-test-connection-fix

Conversation

@IceS2

@IceS2 IceS2 commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

Fixes #29358

What

Follow-up to the BaseConnection migration (#28977). The BigQuery test connection crashed with TypeError: 'BigQueryConnection' object cannot be interpreted as an integer, lost per-project testing, and leaked the test-connection engine.

How

  • BigquerySource._test_connection: call the test function with the migrated signature (metadata only) — the old (metadata, engine, service_connection) call pushed the service connection into timeout_seconds, which reached signal.alarm(). Restore per-project testing by scoping the service connection to each project.
  • helper.py: extract clone_connection_for_project (the deepcopy + SingleProjectId scope), reused by get_inspector_details.
  • BigQueryConnection: register engine.dispose via _on_close in _get_client, and wrap test_connection in try/finally: self.close() so each per-project engine is disposed.

Tests

New test_bigquery_test_connection.py (3 tests):

  • per-project clone scoping + deepcopy isolation of the original multi-project connection;
  • _test_connection invokes the test function once per project with metadata only (guards the crash signature);
  • test_connection disposes its engine (guards both the _on_close registration and the finally: self.close()).

Mutation-checked (reverting each fix fails the matching test); ruff clean.

Greptile Summary

This PR fixes three bugs introduced by the BaseConnection migration (#28977) in the BigQuery connector: a TypeError crash when test_connection was called with the old three-argument signature (service connection object fell into timeout_seconds), lost per-project scoping during test connection, and an engine leak after each test run.

  • connection.py: registers engine.dispose via _on_close in _get_client and adds try/finally: self.close() to test_connection so the engine is always released.
  • helper.py: extracts clone_connection_for_project (deep-copy + SingleProjectId scope) and reuses it in both get_inspector_details and the metadata source, eliminating the duplicated deepcopy+assignment pattern.
  • metadata.py: fixes _test_connection to call test_connection_fn(self.metadata) with the new single-argument signature and scopes each call to its own project via clone_connection_for_project.

Confidence Score: 5/5

Safe to merge — all three targeted bugs are fixed with correct logic and each fix is covered by a dedicated regression test.

The changes are narrow and well-scoped: a two-line engine teardown addition in connection.py, a clean refactor that extracts an existing deepcopy pattern in helper.py, and a one-liner signature correction in metadata.py. BaseConnection.close() uses ExitStack with a full reset, so there is no risk of double-disposal or stale callback accumulation. The three new unit tests are mutation-checked and directly guard against regression of the exact bugs being fixed.

No files require special attention.

Important Files Changed

Filename Overview
ingestion/src/metadata/ingestion/source/database/bigquery/connection.py Registers engine.dispose via _on_close and wraps test_connection in try/finally: self.close() — both halves of the engine teardown fix. Logic is correct and consistent with BaseConnection's ExitStack lifecycle.
ingestion/src/metadata/ingestion/source/database/bigquery/helper.py Extracts clone_connection_for_project from get_inspector_details; the refactored impersonation conditional is logically equivalent to the original nested form.
ingestion/src/metadata/ingestion/source/database/bigquery/metadata.py _test_connection now uses clone_connection_for_project and the single-argument test_connection_fn(self.metadata) call — correctly fixes the signature crash and restores per-project scoping.
ingestion/tests/unit/topology/database/test_bigquery_test_connection.py New test file with 3 targeted regression tests covering the crash signature, per-project cloning isolation, and engine disposal — each test is mutation-checked against its specific fix.

Sequence Diagram

%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
    participant MS as BigquerySource._test_connection
    participant H as helper.clone_connection_for_project
    participant SC as get_test_connection_fn
    participant BC as BigQueryConnection.test_connection
    participant GC as _get_client
    participant ES as ExitStack (_on_close)
    participant E as SQLAlchemy Engine

    loop for each project_id
        MS->>H: deepcopy(service_connection) + SingleProjectId(project_id)
        H-->>MS: project_connection (scoped)
        MS->>SC: get_test_connection_fn(project_connection)
        SC-->>MS: bound test_connection_fn
        MS->>BC: test_connection_fn(metadata)
        BC->>GC: self.client (lazy build)
        GC->>E: create_generic_db_connection(...)
        GC->>ES: _on_close(engine.dispose)
        GC-->>BC: engine
        BC->>BC: test_connection_inner(engine)
        Note over BC: try/finally
        BC->>ES: self.close()
        ES->>E: engine.dispose()
        ES->>ES: "reset ExitStack, _client=None"
    end
Loading
%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%%
sequenceDiagram
    participant MS as BigquerySource._test_connection
    participant H as helper.clone_connection_for_project
    participant SC as get_test_connection_fn
    participant BC as BigQueryConnection.test_connection
    participant GC as _get_client
    participant ES as ExitStack (_on_close)
    participant E as SQLAlchemy Engine

    loop for each project_id
        MS->>H: deepcopy(service_connection) + SingleProjectId(project_id)
        H-->>MS: project_connection (scoped)
        MS->>SC: get_test_connection_fn(project_connection)
        SC-->>MS: bound test_connection_fn
        MS->>BC: test_connection_fn(metadata)
        BC->>GC: self.client (lazy build)
        GC->>E: create_generic_db_connection(...)
        GC->>ES: _on_close(engine.dispose)
        GC-->>BC: engine
        BC->>BC: test_connection_inner(engine)
        Note over BC: try/finally
        BC->>ES: self.close()
        ES->>E: engine.dispose()
        ES->>ES: "reset ExitStack, _client=None"
    end
Loading

Reviews (10): Last reviewed commit: "Merge branch 'main' into ices2/bigquery-..." | Re-trigger Greptile

Comment thread ingestion/tests/unit/topology/database/test_bigquery_test_connection.py Outdated
@github-actions

Copy link
Copy Markdown
Contributor

Jest test Coverage

UI tests summary

Lines Statements Branches Functions
Coverage: 63%
63.43% (70702/111461) 45.9% (40629/88509) 47.83% (12375/25872)

@sonarqubecloud

Copy link
Copy Markdown

@sonarqubecloud

Copy link
Copy Markdown

@gitar-bot

gitar-bot Bot commented Jun 30, 2026

Copy link
Copy Markdown
Code Review ✅ Approved 1 resolved / 1 findings

Restores BigQuery test connection functionality by fixing the call signature, enabling per-project scoping, and ensuring proper engine disposal. No issues found after verifying with new mutation-checked regression tests.

✅ 1 resolved
Quality: Test asserts internal call signature via mocked internal fns

📄 ingestion/tests/unit/topology/database/test_bigquery_test_connection.py:66-80
test_test_connection_runs_once_per_project_with_metadata_only mocks the internal functions clone_connection_for_project and get_test_connection_fn, builds the source with object.__new__(BigquerySource), and then asserts on the recorded call_args of those internal calls (e.g. call.args == (source.metadata,)). Per the project's testing philosophy ("Integration over Mocks: test behavior and outcomes, not internal method calls" and "Mocking: Only mock external boundaries, not internal classes"), this couples the test to implementation details rather than observable outcomes. It still serves as a useful regression guard for the crash signature, but it is brittle: any refactor of the internal wiring (even a behavior-preserving one) will break it. Consider asserting on the observable outcome instead — e.g. let the real clone_connection_for_project run and stub only the external boundary (the test-connection step runner / test_connection_steps), then verify each project is probed independently. This is a minor, test-only concern; the production changes themselves are correct.

Options

Display: compact → Showing less information.

Comment with these commands to change:

Compact
gitar display:verbose         

Was this helpful? React with 👍 / 👎 | Gitar

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Ingestion safe to test Add this label to run secure Github workflows on PRs

Projects

None yet

Development

Successfully merging this pull request may close these issues.

BigQuery test connection crashes and leaks engines after BaseConnection migration

3 participants