Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -43,6 +43,7 @@
STARROCKS_PARTITION_DETAILS,
STARROCKS_SHOW_FULL_COLUMNS,
)
from metadata.ingestion.source.database.starrocks.utils import get_table_comment

Check warning on line 46 in ingestion/src/metadata/ingestion/source/database/starrocks/metadata.py

View workflow job for this annotation

GitHub Actions / Unit Tests & Static Checks (3.10)

Type of "get_table_comment" is partially unknown   Type of "get_table_comment" is "(_: Unknown, connection: Unknown, table_name: Unknown, schema: Unknown | None = None, **kw: Unknown) -> dict[str, Unknown | None]" (reportUnknownVariableType)
from metadata.utils.logger import ingestion_logger
from metadata.utils.ssl_manager import SSLManager, check_ssl_and_init

Expand Down Expand Up @@ -258,21 +259,16 @@

return tables

@staticmethod
def get_table_description(schema_name: str, table_name: str, inspector: Inspector) -> Optional[str]:
# pylint: disable=arguments-differ,unused-argument
def get_table_description(self, schema_name: str, table_name: str, inspector: Inspector) -> Optional[str]:
Comment thread
gitar-bot[bot] marked this conversation as resolved.
Comment on lines +262 to +263
Copy link

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

StarRocksSource.get_table_description now returns Optional[str], but the base CommonDbSourceService.get_table_description is annotated to return str. This override widens the return type and can trigger basedpyright/pyright incompatible-override errors. Consider keeping the return annotation as str to match the base signature, or updating the base method annotation to Optional[str] (and any other overrides) if None is a valid return everywhere.

Copilot uses AI. Check for mistakes.
description = None
try:
table_info: dict = inspector.get_table_comment(table_name, schema_name)
table_info = get_table_comment(None, self.connection, table_name, schema=schema_name)

Check warning on line 266 in ingestion/src/metadata/ingestion/source/database/starrocks/metadata.py

View workflow job for this annotation

GitHub Actions / Unit Tests & Static Checks (3.10)

Type of "table_info" is partially unknown   Type of "table_info" is "dict[str, Unknown | None]" (reportUnknownVariableType)
except Exception as exc: # pylint: disable=broad-except
logger.debug(traceback.format_exc())
logger.warning(f"Table description error for table [{schema_name}.{table_name}]: {exc}")
else:
description = table_info.get("text")

if description is None:
return None
if isinstance(description, (list, tuple)) and len(description) > 0:
return description[0]
return description
Comment on lines +262 to 272
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Bug: Override changes @staticmethod to instance method, breaking contract

The parent class CommonDbSourceService.get_table_description is declared as @staticmethod (and @abstractmethod in the base SqlAlchemySource). The override in StarRocksSource changes it to an instance method (self as first parameter). While Python allows this at runtime (calling self.get_table_description(...) will work because self is simply consumed as the first positional arg), it violates the method contract, triggers the arguments-differ pylint warning (hence the suppression comment), and loses the @calculate_execution_time() decorator that the parent applies.

This also means any code that calls the method as a static method (e.g., StarRocksSource.get_table_description(schema, table, inspector)) would fail because self is now expected.

A cleaner approach: keep it as a @staticmethod and pass connection explicitly, or access self.connection but properly override without suppressing arguments-differ.

Suggested fix:

# Keep the @staticmethod contract and pass connection:
@staticmethod
@calculate_execution_time()
def get_table_description(
    schema_name: str, table_name: str, inspector: Inspector
) -> Optional[str]:
    description = None
    try:
        table_info = inspector.get_table_comment(table_name, schema_name)
    except Exception as exc:
        ...
    else:
        description = table_info.get("text")
    return description
# And patch inspector.get_table_comment via the dialect instead.

Was this helpful? React with 👍 / 👎 | Reply gitar fix to apply this suggestion


def _get_columns(self, table_name, schema=None):
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -42,7 +42,7 @@
{"table_name": table_name, "schema": schema},
**kw,
)
for table_comment in rows:
comment = table_comment
for table_comment in rows.mappings():

Check warning on line 45 in ingestion/src/metadata/ingestion/source/database/starrocks/utils.py

View workflow job for this annotation

GitHub Actions / Unit Tests & Static Checks (3.10)

Type of "mappings" is unknown (reportUnknownMemberType)
comment = table_comment["TABLE_COMMENT"]
break
return {"text": comment}
79 changes: 78 additions & 1 deletion ingestion/tests/unit/topology/database/test_starrocks.py
Original file line number Diff line number Diff line change
Expand Up @@ -14,7 +14,7 @@
"""

from unittest import TestCase
from unittest.mock import patch
from unittest.mock import MagicMock, patch

import pytest
from sqlalchemy import types as sqltypes
Expand All @@ -26,6 +26,7 @@
StarRocksSource,
_get_sqlalchemy_type,
)
from metadata.ingestion.source.database.starrocks.utils import get_table_comment

mock_starrocks_config = {
"source": {
Expand Down Expand Up @@ -171,3 +172,79 @@ def test_iceberg_relkind_mapping(self):
from metadata.ingestion.source.database.starrocks.metadata import RELKIND_MAP

assert RELKIND_MAP["ICEBERG"] == TableType.Iceberg


class TestStarRocksGetTableDescription(TestCase):
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Quality: Tests use unittest.TestCase; project prefers plain pytest

Per the project's testing guidelines, new tests should use pytest with plain assert statements and fixtures for setup, rather than inheriting from unittest.TestCase. The new TestStarRocksGetTableDescription and TestGetTableComment classes both extend TestCase. While functional, converting them to plain pytest classes (using @pytest.fixture instead of setUp) would be more consistent with the project's conventions.

Was this helpful? React with 👍 / 👎 | Reply gitar fix to apply this suggestion

"""Tests for get_table_description delegating to utils.get_table_comment"""

@patch("metadata.ingestion.source.database.common_db_source.CommonDbSourceService.test_connection")
def setUp(self, test_connection):
test_connection.return_value = False
self.config = OpenMetadataWorkflowConfig.model_validate(mock_starrocks_config)
self.source = StarRocksSource.create(
mock_starrocks_config["source"],
self.config.workflowConfig.openMetadataServerConfig,
)
thread_id = self.source.context.get_current_thread_id()
self.source._connection_map[thread_id] = MagicMock()

@patch("metadata.ingestion.source.database.starrocks.metadata.get_table_comment")
def test_returns_table_comment(self, mock_get_table_comment):
mock_get_table_comment.return_value = {"text": "审计日志表"}

description = self.source.get_table_description(
schema_name="test_db",
table_name="audit_tbl",
inspector=MagicMock(),
)
assert description == "审计日志表"
mock_get_table_comment.assert_called_once_with(None, self.source.connection, "audit_tbl", schema="test_db")

Comment on lines +191 to +202
Copy link

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests mock starrocks.metadata.get_table_comment, so they won't catch integration failures in the new code path (e.g., incorrect interaction with SQLAlchemy reflection caching / dialect arguments). Consider testing get_table_description by mocking self.source.connection.execute(...) (or testing utils.get_table_comment directly) so the suite fails if the helper invocation or row parsing is wrong.

Copilot uses AI. Check for mistakes.
@patch("metadata.ingestion.source.database.starrocks.metadata.get_table_comment")
def test_returns_none_for_empty_comment(self, mock_get_table_comment):
mock_get_table_comment.return_value = {"text": None}

description = self.source.get_table_description(
schema_name="test_db",
table_name="no_comment_tbl",
inspector=MagicMock(),
)
assert description is None

@patch("metadata.ingestion.source.database.starrocks.metadata.get_table_comment")
def test_returns_none_on_exception(self, mock_get_table_comment):
mock_get_table_comment.side_effect = Exception("connection error")

description = self.source.get_table_description(
schema_name="test_db",
table_name="error_tbl",
inspector=MagicMock(),
)
assert description is None
Comment on lines +177 to +223
Copy link

Copilot AI Apr 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR description mentions adding 4 unit tests (including a missing-table case), but this file currently adds 3 tests and does not include a distinct missing-table scenario. Either add the missing test case (e.g., get_table_comment returning {} / no rows) or update the PR description to match what’s actually covered.

Copilot uses AI. Check for mistakes.


class TestGetTableComment(TestCase):
"""Tests for utils.get_table_comment row parsing"""

def _make_connection(self, rows):
connection = MagicMock()
connection.info = {}
result = MagicMock()
result.mappings.return_value = iter(rows)
connection.execute.return_value = result
return connection
Comment on lines +229 to +235
Copy link

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The unit test mocks connection.execute() as an iterator of plain dicts, but the real implementation iterates SQLAlchemy Result rows (e.g. Row / RowMapping), which can behave differently for key access. To avoid tests passing while production breaks, mock the SQLAlchemy result shape more closely (e.g. return an object with .mappings() yielding dict-like rows, or yield rows exposing ._mapping).

Copilot uses AI. Check for mistakes.
Comment on lines +230 to +235
Copy link

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get_table_comment is @sqlalchemy.engine.reflection.cache-decorated and the decorator expects connection.info to be a real dict (it uses connection.info.setdefault(...) and membership checks). In _make_connection the connection is a bare MagicMock, so connection.info becomes another MagicMock and the cache wrapper can incorrectly treat it as a populated cache and return a MagicMock instead of executing the function. Set connection.info = {} (or pass info_cache={} when calling get_table_comment) to make these unit tests exercise the real row-parsing logic reliably.

Copilot uses AI. Check for mistakes.

def test_returns_comment_from_row(self):
row = {"TABLE_COMMENT": "审计日志表"}
connection = self._make_connection([row])

result = get_table_comment(None, connection, "audit_tbl", schema="test_db")

assert result == {"text": "审计日志表"}

def test_returns_none_when_no_rows(self):
connection = self._make_connection([])

result = get_table_comment(None, connection, "missing_tbl", schema="test_db")

assert result == {"text": None}
Loading