Skip to content

fix(profiler): SAP HANA CREATE_TIME from SYS.TABLES + uppercase catalog matching#27483

Merged
pmbrull merged 11 commits intomainfrom
task/debug-sap-hana-profiler-create-time-miss-53888850
Apr 20, 2026
Merged

fix(profiler): SAP HANA CREATE_TIME from SYS.TABLES + uppercase catalog matching#27483
pmbrull merged 11 commits intomainfrom
task/debug-sap-hana-profiler-create-time-miss-53888850

Conversation

@IceS2
Copy link
Copy Markdown
Contributor

@IceS2 IceS2 commented Apr 17, 2026

Fixes #27439

Summary

  • CREATE_TIME fix: HANA's SYS.M_TABLES lacks CREATE_TIME. Query SYS.TABLES via a second CTE + LEFT JOIN so createDateTime is populated in table profiles.
  • Case sensitivity fix: HANA catalog stores identifiers in uppercase but OM stores them lowercase. Apply .upper() to schema/table names in WHERE clauses so rows actually match.
  • Tests: 5 new unit tests covering uppercase matching, LEFT JOIN NULL handling, CTE structure, nonexistent table, and column metadata labels.

Test plan

  • Unit tests pass (pytest test_table_metric_computer.py — 28/28)
  • E2E validated against HANA Express: libraries_ table profiled with rowCount=7, columnCount=9, sizeInByte=16480, createDateTime=2026-04-16T15:32:08
image

…case catalog matching

HANA's SYS.M_TABLES lacks CREATE_TIME — query SYS.TABLES via CTE + LEFT JOIN.
HANA catalog stores identifiers uppercase but OM stores lowercase — apply .upper() to WHERE filters.
@IceS2 IceS2 requested a review from a team as a code owner April 17, 2026 12:20
Copilot AI review requested due to automatic review settings April 17, 2026 12:20
@github-actions github-actions Bot added Ingestion safe to test Add this label to run secure Github workflows on PRs labels Apr 17, 2026
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes SAP HANA table profiling so createDateTime is correctly populated and table/schema matching works when OpenMetadata stores identifiers in lowercase but HANA system views use uppercase.

Changes:

  • Update SAPHanaTableMetricComputer to pull CREATE_TIME from SYS.TABLES via a second CTE and LEFT JOIN to SYS.M_TABLES.
  • Uppercase schema/table identifiers used in the HANA system-view filters to ensure rows match.
  • Add unit tests covering the new CTE/join structure, uppercase matching, and NULL create-time behavior.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.

File Description
ingestion/src/metadata/profiler/orm/functions/table_metric_computer.py Reworks HANA metrics query to join SYS.M_TABLES with SYS.TABLES and uppercase identifiers for matching.
ingestion/tests/unit/observability/profiler/test_table_metric_computer.py Adds tests validating the new HANA query shape (CTEs + LEFT JOIN) and uppercase filtering behavior.

Comment on lines +353 to +355
assert sql_upper.count("WITH") >= 1 or sql_upper.count("AS (") >= 2 or sql_upper.count("AS \n(") >= 2, (
f"Expected two CTEs in query, got: {sql}"
)
Copy link

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The assertion in this test doesn’t actually verify that there are two CTEs: sql_upper.count("WITH") >= 1 will pass even with a single CTE, and the or chain makes the check very permissive. Consider tightening this to assert the presence of both CTE definitions (e.g., both M_TABLES and TABLES appear in the WITH ... clause, or count CTE AS ( occurrences in a dialect-stable way).

Suggested change
assert sql_upper.count("WITH") >= 1 or sql_upper.count("AS (") >= 2 or sql_upper.count("AS \n(") >= 2, (
f"Expected two CTEs in query, got: {sql}"
)
assert "WITH" in sql_upper, f"Expected WITH clause in query, got: {sql}"
assert "M_TABLES AS (" in sql_upper, (
f"Expected M_TABLES CTE definition in query, got: {sql}"
)
assert "TABLES AS (" in sql_upper, (
f"Expected TABLES CTE definition in query, got: {sql}"
)

Copilot uses AI. Check for mistakes.
Comment on lines +362 to +367
"""When table not found in either system view, execute().first() returns None."""
session = _build_mock_session()
session.execute.return_value.first.return_value = None
computer = _build_computer(session, SAPHanaTableMetricComputer)
result = computer.compute()
assert result is None
Copy link

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test_compute_returns_none_for_nonexistent_table duplicates test_compute_returns_none_when_no_result earlier in the same class (both only assert compute() returns None when execute().first() is None). Consider removing one, or adjusting this one to cover a distinct case not already exercised.

Suggested change
"""When table not found in either system view, execute().first() returns None."""
session = _build_mock_session()
session.execute.return_value.first.return_value = None
computer = _build_computer(session, SAPHanaTableMetricComputer)
result = computer.compute()
assert result is None
"""When table is absent from HANA system views, compute returns None and
still queries using uppercased identifiers expected by the catalog."""
session = _build_mock_session()
session.execute.return_value.first.return_value = None
computer = _build_computer(session, SAPHanaTableMetricComputer)
result = computer.compute()
sql = str(
session.execute.call_args[0][0].compile(
compile_kwargs={"literal_binds": True}
)
)
assert result is None
assert "TEST_SCHEMA" in sql, (
f"Nonexistent-table lookup must use uppercased schema name, got: {sql}"
)
assert "TEST_TABLE" in sql, (
f"Nonexistent-table lookup must use uppercased table name, got: {sql}"
)

Copilot uses AI. Check for mistakes.
session.execute.return_value.first.return_value = mock_result
computer = _build_computer(session, SAPHanaTableMetricComputer)
computer.compute()
sql = str(session.execute.call_args[0][0].compile(compile_kwargs={"literal_binds": True}))
Copy link

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These .compile(compile_kwargs={"literal_binds": True}) lines exceed Black’s default line length (88) and will likely fail the repo’s black --check formatting step (see ingestion/noxfile.py:55). Running Black should wrap these expressions into multiple lines.

Suggested change
sql = str(session.execute.call_args[0][0].compile(compile_kwargs={"literal_binds": True}))
sql = str(
session.execute.call_args[0][0].compile(
compile_kwargs={"literal_binds": True}
)
)

Copilot uses AI. Check for mistakes.
session.execute.return_value.first.return_value = mock_result
computer = _build_computer(session, SAPHanaTableMetricComputer)
computer.compute()
sql = str(session.execute.call_args[0][0].compile(compile_kwargs={"literal_binds": True}))
Copy link

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line is likely not Black-formatted (it exceeds Black’s 88-char line length). Please run the formatter or wrap the str(session.execute...compile(...)) call so black --check passes.

Copilot uses AI. Check for mistakes.
session.execute.return_value.first.return_value = mock_result
computer = _build_computer(session, SAPHanaTableMetricComputer)
computer.compute()
sql = str(session.execute.call_args[0][0].compile(compile_kwargs={"literal_binds": True}))
Copy link

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line likely exceeds Black’s 88-character limit and will fail formatting checks. Consider wrapping the .compile(compile_kwargs={...}) call across lines and re-running black.

Copilot uses AI. Check for mistakes.
@github-actions
Copy link
Copy Markdown
Contributor

The Python checkstyle failed.

Please run make py_format and py_format_check in the root of your repository and commit the changes to this PR.
You can also use pre-commit to automate the Python code formatting.

You can install the pre-commit hooks with make install_test precommit_install.

IceS2 added 2 commits April 17, 2026 16:43
…ack formatting

- Guard against None schema/table name before .upper() in HANA compute
- Tighten CTE assertion to check both M_TABLES and TABLES definitions
- Make nonexistent-table test distinct by verifying uppercase in query
- Apply Black formatting fixes
Copilot AI review requested due to automatic review settings April 17, 2026 15:14
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

Comment thread ingestion/src/metadata/profiler/orm/functions/table_metric_computer.py Outdated
TeddyCr
TeddyCr previously approved these changes Apr 17, 2026
@TeddyCr TeddyCr enabled auto-merge (squash) April 17, 2026 15:26
Copilot AI review requested due to automatic review settings April 19, 2026 07:22
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

Comment on lines +799 to +801
"Missing schema or table name for HANA table metric computation"
)
return None
Copy link

Copilot AI Apr 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The early guard returns None when schema_name/table_name are missing. Elsewhere in this module, missing names fall back to the base computation (see TableMetricComputerFactory.construct() fallback behavior), so returning None here can silently skip profiling. Consider falling back to super().compute() (or raising to trigger factory fallback) and include the actual schema/table values in the log message for debugging.

Suggested change
"Missing schema or table name for HANA table metric computation"
)
return None
"Missing schema or table name for HANA table metric computation. "
"Falling back to base computation with schema_name=%r, table_name=%r",
self.schema_name,
self.table_name,
)
return super().compute()

Copilot uses AI. Check for mistakes.
IceS2 and others added 2 commits April 20, 2026 08:33
…ssing names

- Restore docstring on HANA compute() for consistency with other computers
- Fall back to super().compute() instead of None when schema/table name missing
- Include schema/table values in warning log for debugging
Copilot AI review requested due to automatic review settings April 20, 2026 06:34
@gitar-bot
Copy link
Copy Markdown

gitar-bot Bot commented Apr 20, 2026

Code Review ✅ Approved

Corrects SAP HANA metadata retrieval by mapping CREATE_TIME from SYS.TABLES and normalizing catalog object matching to uppercase. No issues found.

Options

Display: compact → Showing less information.

Comment with these commands to change:

Compact
gitar display:verbose         

Was this helpful? React with 👍 / 👎 | Gitar

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.

@sonarqubecloud
Copy link
Copy Markdown

@pmbrull pmbrull merged commit 6eb88ea into main Apr 20, 2026
53 checks passed
@pmbrull pmbrull deleted the task/debug-sap-hana-profiler-create-time-miss-53888850 branch April 20, 2026 09:17
IceS2 added a commit that referenced this pull request Apr 20, 2026
…og matching (#27483)

* fix(profiler): SAP HANA CREATE_TIME sourced from SYS.TABLES and uppercase catalog matching

HANA's SYS.M_TABLES lacks CREATE_TIME — query SYS.TABLES via CTE + LEFT JOIN.
HANA catalog stores identifiers uppercase but OM stores lowercase — apply .upper() to WHERE filters.

* fix checkstyle

* fix(profiler): address PR review — null guard, tighter assertions, Black formatting

- Guard against None schema/table name before .upper() in HANA compute
- Tighten CTE assertion to check both M_TABLES and TABLES definitions
- Make nonexistent-table test distinct by verifying uppercase in query
- Apply Black formatting fixes

* fix(profiler): tighten HANA CTE assertions and restore compute docstring

Agent-Logs-Url: https://github.com/open-metadata/OpenMetadata/sessions/175f9aa0-30e5-49a2-84af-d4e76fb01a14

Co-authored-by: IceS2 <4912399+IceS2@users.noreply.github.com>

* fix(profiler): normalize HANA SQL assertions for stable CTE checks

Agent-Logs-Url: https://github.com/open-metadata/OpenMetadata/sessions/175f9aa0-30e5-49a2-84af-d4e76fb01a14

Co-authored-by: IceS2 <4912399+IceS2@users.noreply.github.com>

* fix(profiler): add docstring and fallback to super().compute() for missing names

- Restore docstring on HANA compute() for consistency with other computers
- Fall back to super().compute() instead of None when schema/table name missing
- Include schema/table values in warning log for debugging

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Ingestion safe to test Add this label to run secure Github workflows on PRs

Projects

None yet

Development

Successfully merging this pull request may close these issues.

The profiler fails because it queries a CREATE_TIME column from SYS.M_TABLES that doesn't exist

5 participants