feat(data-quality): Add impact ranking for data quality checks #26658 #27346
AshutoshIIT1234 wants to merge 3 commits into open-metadata:main from
Conversation
Implement Data Quality Checks Impact feature for issue open-metadata#26658

- Add scoring model that calculates impact based on:
  - Downstream usage (40% weight)
  - Consumer count (30% weight)
  - Recent incidents (30% weight)
- Add REST API endpoint for impact ranking
- Add frontend components to display impact scores
- Add schema for dataQualityCheckImpact

This helps prioritize which data quality checks are most critical to fix by ranking them based on their business impact.
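The weighted model described above can be sketched in a few lines. The class name, method signature, and the normalization caps below are illustrative assumptions, not the PR's actual implementation:

```java
// Sketch of the weighted impact score described above (weights from the PR
// description: downstream usage 40%, consumer count 30%, recent incidents 30%).
// The normalization caps (100 / 50 / 10) are illustrative assumptions.
public final class ImpactScoreSketch {

  static double compute(int downstreamUsage, int consumerCount, int recentIncidents) {
    // Normalize each raw count into [0, 1] against an assumed cap.
    double d = Math.min(downstreamUsage, 100) / 100.0;
    double c = Math.min(consumerCount, 50) / 50.0;
    double i = Math.min(recentIncidents, 10) / 10.0;
    // Weighted sum scaled to a 0-100 score.
    return 100.0 * (0.4 * d + 0.3 * c + 0.3 * i);
  }
}
```

With a model like this, a check with no downstream usage and no consumers but many recent incidents can still score at most 30% of the maximum, which is what makes the hardcoded-zero metrics flagged later in this review so impactful.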
Hi there 👋 Thanks for your contribution! The OpenMetadata team will review the PR shortly! Once it has been labeled as Let us know if you need any help!
}

public DataQualityReport getDataQualityReport(
💡 Quality: Indentation damage on pre-existing lines (accidental whitespace changes)
Several pre-existing lines had their leading indentation removed (e.g., getDataQualityReport at line 362, searchDataQualityLineage at line 2787, SubjectContext at line 545, getDataQualityTab at line 74). These appear to be accidental whitespace-only changes that break the consistent indentation style of the codebase. They should be reverted to avoid noisy diffs.
Pull request overview
Adds an “Impact Ranking” feature for Data Quality checks, spanning backend search/query logic, a new REST endpoint, and a new Data Quality UI tab to display ranked checks.
Changes:
- UI: Introduces a new Data Quality "Impact" tab and an ImpactRanking table/summary view.
- Service: Adds a GET /v1/dataQuality/testSuites/dataQualityCheckImpact endpoint and supporting repository/search methods.
- Spec: Adds a new JSON schema for dataQualityCheckImpact.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 22 comments.
Show a summary per file
| File | Description |
|---|---|
| openmetadata-ui/src/main/resources/ui/src/rest/testAPI.ts | Adds a client helper and TS types for fetching impact ranking data |
| openmetadata-ui/src/main/resources/ui/src/pages/DataQuality/DataQualityPage.interface.ts | Adds IMPACT tab enum value |
| openmetadata-ui/src/main/resources/ui/src/pages/DataQuality/DataQualityClassBase.ts | Registers the new Impact tab and component |
| openmetadata-ui/src/main/resources/ui/src/components/DataQuality/ImpactRanking/ImpactRanking.component.tsx | New UI component to render impact summaries + ranked table |
| openmetadata-spec/src/main/resources/json/schema/tests/dataQualityCheckImpact.json | New schema intended to define the impact ranking payload |
| openmetadata-service/src/main/java/org/openmetadata/service/search/SearchRepository.java | Adds a helper to search test cases for impact ranking |
| openmetadata-service/src/main/java/org/openmetadata/service/resources/dqtests/TestSuiteResource.java | Adds the REST endpoint to fetch impact ranking |
| openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/TestSuiteRepository.java | Implements ranking/scoring logic and builds the search query |
doc.put("downstreamUsage", 0);
doc.put("consumerCount", 0);
results.add(doc);
downstreamUsage and consumerCount are hardcoded to 0 (and recentIncidents isn’t populated at all), which makes the impact scoring endpoint always produce 0 scores. These metrics need to be computed/fetched before ranking, otherwise the feature is non-functional.
for (Map<String, Object> testCase : testCases) {
  int downstreamUsage =
      ((Number) testCase.getOrDefault("downstreamUsage", 0)).intValue();
  int consumerCount = ((Number) testCase.getOrDefault("consumerCount", 0)).intValue();
  int recentIncidents =
The scoring loop assumes downstreamUsage, consumerCount, and recentIncidents are present in the input docs, but the current retrieval code defaults them to 0. That makes impactScore always 0 and defeats the ranking.
"description": "Fully qualified name of the test suite.",
"$ref": "../type/basic.json#/definitions/fullyQualifiedEntityName"
},
"entityFullyQualifiedName": {
This schema uses entityFullyQualifiedName, while the UI/API currently refer to entityFQN. Align the property names (and required fields) with the actual API payload so generated models and UI types stay consistent.
Suggested change:
- "entityFullyQualifiedName": {
+ "entityFQN": {
dataIndex: 'testCaseStatus',
key: 'testCaseStatus',
render: (status: string) => (
  <Tag color={status === 'Failed' ? 'error' : 'success'}>{status}</Tag>
),
testCaseStatus can be Aborted or Queued as well as Failed/Success, but this render treats everything except Failed as success. Consider mapping each status to an appropriate tag color (e.g., warning/processing for Queued, default for Aborted) to avoid misleading UI.
}

public getDataQualityTab() {
This method declaration lost its indentation (public getDataQualityTab() is no longer aligned with the rest of the class). This is likely to fail Prettier/lint formatting checks; please reformat to match surrounding style.
Suggested change:
- public getDataQualityTab() {
+   public getDataQualityTab() {
.withIndex(Entity.TEST_CASE)
.withQuery(query)
.withFrom(from)
.withSize(size)
.withSortFieldParam("timestamp")
This request sorts by timestamp on the testCase index, but the test_case mapping doesn’t have a root timestamp field (it’s under testCaseResult.timestamp). This sort (and any range filter on timestamp) won’t behave as intended unless you use the correct index/field path.
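If the mapping is as this comment describes, the sort and filters would need to target the nested result fields. A rough sketch of such a query body follows; the field paths are taken from this review's comments, not verified against the live index mapping, and the epoch value is a placeholder:

```json
{
  "query": {
    "bool": {
      "must": [
        { "term":  { "testCaseResult.testCaseStatus": "failed" } },
        { "range": { "testCaseResult.timestamp": { "gte": 1700000000 } } }
      ]
    }
  },
  "sort": [{ "testCaseResult.timestamp": { "order": "desc" } }]
}
```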
import { getTestCaseDetailPath } from '../../utils/RouterUtils';
import {
  DataQualityCheckImpact,
  getDataQualityCheckImpact,
} from '../../rest/testAPI';
The RouterUtils/testAPI imports are using an incorrect relative path for this file location, and getTestCaseDetailPath isn’t exported (the codebase has getTestCaseDetailPagePath). This will fail TypeScript compilation until the import paths/symbols are corrected.
Suggested change:
- import { getTestCaseDetailPath } from '../../utils/RouterUtils';
- import {
-   DataQualityCheckImpact,
-   getDataQualityCheckImpact,
- } from '../../rest/testAPI';
+ import { getTestCaseDetailPagePath as getTestCaseDetailPath } from '../../../utils/RouterUtils';
+ import {
+   DataQualityCheckImpact,
+   getDataQualityCheckImpact,
+ } from '../../../rest/testAPI';
const handleTestCaseClick = (testCaseFQN: string) => {
  history.push(getTestCaseDetailPagePath(testCaseFQN));
};
handleTestCaseClick calls getTestCaseDetailPagePath(...), but that symbol isn’t imported in this file, so this will be a compile-time error. Import the correct RouterUtils helper (and remove the unused one).
StringBuilder query = new StringBuilder();
query.append("{\"query\": {\"bool\": {\"must\": [");

if (testCaseStatus != null && !testCaseStatus.isEmpty()) {
  query.append(String.format("{\"term\": {\"testCaseStatus\": \"%s\"}},", testCaseStatus.toLowerCase()));
}

query.append(
    String.format(
        "{\"range\": {\"timestamp\": {\"gte\": %d}}}}}", thirtyDaysAgo / 1000));
return query.toString();
buildTestCaseImpactQuery starts a JSON query string but never closes the bool.must array/object properly, so it will generate invalid JSON. Build this query with a JSON builder (or fix the missing closing brackets/braces).
Suggested change:
- StringBuilder query = new StringBuilder();
- query.append("{\"query\": {\"bool\": {\"must\": [");
- if (testCaseStatus != null && !testCaseStatus.isEmpty()) {
-   query.append(String.format("{\"term\": {\"testCaseStatus\": \"%s\"}},", testCaseStatus.toLowerCase()));
- }
- query.append(
-     String.format(
-         "{\"range\": {\"timestamp\": {\"gte\": %d}}}}}", thirtyDaysAgo / 1000));
- return query.toString();
+ JsonArrayBuilder mustClauses = Json.createArrayBuilder();
+ if (testCaseStatus != null && !testCaseStatus.isEmpty()) {
+   mustClauses.add(
+       Json.createObjectBuilder()
+           .add(
+               "term",
+               Json.createObjectBuilder()
+                   .add("testCaseStatus", testCaseStatus.toLowerCase())));
+ }
+ mustClauses.add(
+     Json.createObjectBuilder()
+         .add(
+             "range",
+             Json.createObjectBuilder()
+                 .add(
+                     "timestamp",
+                     Json.createObjectBuilder().add("gte", thirtyDaysAgo / 1000))));
+ JsonObjectBuilder query =
+     Json.createObjectBuilder()
+         .add(
+             "query",
+             Json.createObjectBuilder()
+                 .add(
+                     "bool",
+                     Json.createObjectBuilder().add("must", mustClauses)));
+ return query.build().toString();
    "testCaseFullyQualifiedName",
    JsonUtils.extractValue(sourceNode.toString(), FULLY_QUALIFIED_NAME));
doc.put("entityFQN", JsonUtils.extractValue(sourceNode.toString(), "entityFQN"));
doc.put("testCaseStatus", JsonUtils.extractValue(sourceNode.toString(), "testCaseStatus"));
doc.put("timestamp", JsonUtils.extractValue(sourceNode.toString(), "timestamp"));
The fields being extracted here (entityFQN, testCaseStatus, timestamp) are not present at the root of the test_case index documents (they’re nested under testCaseResult.*). As a result, these values will be null unless the query/index is adjusted.
- Fix malformed JSON query in buildTestCaseImpactQuery (missing ] bracket)
- Fix wrong function name import (getTestCaseDetailPagePath)
- Fix wrong import path for RouterUtils
- Compute recentIncidents from grouped test case results instead of hardcoded 0
- Fix property name mismatch (entityFQN -> entityFullyQualifiedName)

Fixes open-metadata#26658
Force-pushed from 2741b5d to 1ddfe68 (Compare)
        schema = @Schema(implementation = List.class)))
})
public List<?> getDataQualityCheckImpact(
    @Context UriInfo uriInfo,
The endpoint returns List<?> and documents the response as List.class, even though a dedicated DataQualityCheckImpact schema/model was added. Please return a typed List<DataQualityCheckImpact> (or DTO) and update the OpenAPI response annotation to the correct item type so clients can generate proper bindings.
<Tooltip title={t('message.impact-score-tooltip')}>
  <Tag color={getImpactColor(score)}>{getImpactLabel(score)}</Tag>
</Tooltip>
Several i18n keys referenced in this component (e.g. label.impact-ranking, label.impact-score, message.impact-score-tooltip, label.average-impact, label.data-quality-checks-by-impact, message.no-data-quality-checks-found) are not present in src/locale/languages/en-us.json, so the UI will render raw keys. Please add the missing keys (and run the i18n sync).
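For reference, the missing entries would be added to en-us.json roughly as below. The key names come from the comment above; the English values are placeholders, not the project's agreed copy:

```json
{
  "label": {
    "average-impact": "Average Impact",
    "data-quality-checks-by-impact": "Data Quality Checks by Impact",
    "impact-ranking": "Impact Ranking",
    "impact-score": "Impact Score"
  },
  "message": {
    "impact-score-tooltip": "Score combining downstream usage, consumer count, and recent incidents.",
    "no-data-quality-checks-found": "No Data Quality Checks found."
  }
}
```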
testCaseFullyQualifiedName: string;
entityFullyQualifiedName: string;
testCaseStatus: TestCaseStatus;
timestamp: number;
timestamp is declared as a number, but the backend currently extracts it as a string from the search response and returns it in the response map as a JSON string. Either return a numeric timestamp from the API (preferred) or update this type to match the actual response shape.
Suggested change:
- timestamp: number;
+ timestamp: string;
query.append("{\"query\": {\"bool\": {\"must\": [");

if (testCaseStatus != null && !testCaseStatus.isEmpty()) {
  query.append(String.format("{\"term\": {\"testCaseStatus\": \"%s\"}},", testCaseStatus.toLowerCase()));
User input (testCaseStatus) is interpolated into the JSON query without escaping. A value containing quotes will break the JSON (and can be used for query injection); escape it (e.g., escapeDoubleQuotes) or build the query via a JSON builder.
Suggested change:
- query.append(String.format("{\"term\": {\"testCaseStatus\": \"%s\"}},", testCaseStatus.toLowerCase()));
+ query.append(
+     String.format(
+         "{\"term\": {\"testCaseStatus\": \"%s\"}},",
+         escapeDoubleQuotes(testCaseStatus.toLowerCase())));
query.append("{\"query\": {\"bool\": {\"must\": [");

if (testCaseStatus != null && !testCaseStatus.isEmpty()) {
  query.append(String.format("{\"term\": {\"testCaseStatus\": \"%s\"}},", testCaseStatus.toLowerCase()));
This term filter targets testCaseStatus, but in the TestCase index mapping the field is testCaseResult.testCaseStatus. Using the wrong field path means the status filter will never match any documents.
Suggested change:
- query.append(String.format("{\"term\": {\"testCaseStatus\": \"%s\"}},", testCaseStatus.toLowerCase()));
+ query.append(
+     String.format(
+         "{\"term\": {\"testCaseResult.testCaseStatus\": \"%s\"}},",
+         testCaseStatus.toLowerCase()));
query.append(
    String.format(
        "{\"range\": {\"timestamp\": {\"gte\": %d}}}]}}", thirtyDaysAgo / 1000));
This range filter targets top-level timestamp, but in the TestCase index mapping the field is testCaseResult.timestamp. Using the wrong field path means the last-30-days filter won’t apply (and may return no results).
Suggested change:
- "{\"range\": {\"timestamp\": {\"gte\": %d}}}]}}", thirtyDaysAgo / 1000));
+ "{\"range\": {\"testCaseResult.timestamp\": {\"gte\": %d}}}]}}",
+     thirtyDaysAgo / 1000));
query.append(
    String.format(
        "{\"range\": {\"timestamp\": {\"gte\": %d}}}]}}", thirtyDaysAgo / 1000));
The JSON built here is malformed: it opens { "query": { "bool": { "must": [ but the final append only closes with }]}}, missing a final }. This will fail query_filter parsing and the endpoint will return empty results.
Suggested change:
- "{\"range\": {\"timestamp\": {\"gte\": %d}}}]}}", thirtyDaysAgo / 1000));
+ "{\"range\": {\"timestamp\": {\"gte\": %d}}}]}}}", thirtyDaysAgo / 1000));
JsonNode sourceNode = JsonUtils.extractValue(jsonNode.toString(), SEARCH_SOURCE);
if (sourceNode != null) {
  String testCaseFQN =
      JsonUtils.extractValue(sourceNode.toString(), FULLY_QUALIFIED_NAME);
  String testCaseId = JsonUtils.extractValue(sourceNode.toString(), ID);
  String entityFQN = JsonUtils.extractValue(sourceNode.toString(), "entityFQN");
  String testCaseStatus = JsonUtils.extractValue(sourceNode.toString(), "testCaseStatus");
  String timestamp = JsonUtils.extractValue(sourceNode.toString(), "timestamp");
This loop repeatedly converts JsonNode to string and re-parses it via JsonUtils.extractValue(...). There are existing overloads that operate directly on JsonNode (used elsewhere in this class), which avoids extra parsing and is less error-prone.
Suggested change:
- JsonNode sourceNode = JsonUtils.extractValue(jsonNode.toString(), SEARCH_SOURCE);
- if (sourceNode != null) {
-   String testCaseFQN =
-       JsonUtils.extractValue(sourceNode.toString(), FULLY_QUALIFIED_NAME);
-   String testCaseId = JsonUtils.extractValue(sourceNode.toString(), ID);
-   String entityFQN = JsonUtils.extractValue(sourceNode.toString(), "entityFQN");
-   String testCaseStatus = JsonUtils.extractValue(sourceNode.toString(), "testCaseStatus");
-   String timestamp = JsonUtils.extractValue(sourceNode.toString(), "timestamp");
+ JsonNode sourceNode = JsonUtils.extractValue(jsonNode, SEARCH_SOURCE);
+ if (sourceNode != null) {
+   String testCaseFQN = JsonUtils.extractValue(sourceNode, FULLY_QUALIFIED_NAME);
+   String testCaseId = JsonUtils.extractValue(sourceNode, ID);
+   String entityFQN = JsonUtils.extractValue(sourceNode, "entityFQN");
+   String testCaseStatus = JsonUtils.extractValue(sourceNode, "testCaseStatus");
+   String timestamp = JsonUtils.extractValue(sourceNode, "timestamp");
int downstreamUsage =
    ((Number) testCase.getOrDefault("downstreamUsage", 0)).intValue();
int consumerCount = ((Number) testCase.getOrDefault("consumerCount", 0)).intValue();
int recentIncidents =
    ((Number) testCase.getOrDefault("recentIncidents", 0)).intValue();
downstreamUsage and consumerCount are inputs to the scoring model here, but they’re currently always 0 in the search results (set in searchTestCasesForImpact). That means the 40%/30% weights never affect the score, which is misleading given the feature description; please populate these metrics or remove them from scoring until available.
query.append(
    String.format(
        "{\"range\": {\"timestamp\": {\"gte\": %d}}}]}}", thirtyDaysAgo / 1000));
🚨 Bug: JSON query still malformed — missing one closing brace
The fix at line 431 changed }}}} to }]}}, which correctly adds the missing ] for the must array. However, it still produces invalid JSON because there are only 2 closing braces after ] when 3 are needed.
The structure opened is: {"query": {"bool": {"must": [ — that's 3 { and 1 [. After the range clause closes its own 3 braces, the remaining closers should be ]}}} (1 ] + 3 }), but the code produces ]}} (1 ] + 2 }).
This means the Elasticsearch query is still malformed and searchTestCasesForImpact will throw a parse error, causing the entire impact ranking feature to return empty results.
Suggested fix:
Change the format string to include the missing brace:
"{\"range\": {\"timestamp\": {\"gte\": %d}}}]}}}"
Or better yet, use a JSON library to build queries instead of string concatenation to prevent these bracket-matching issues.
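The bracket arithmetic above can be verified mechanically. A throwaway balance counter (not part of the PR) shows that the buggy tail leaves one opener unclosed while the corrected tail balances:

```java
// Counts net unclosed '{' / '[' openers in a string; 0 means every opener
// has a matching closer. Used here only to sanity-check the query fragments
// discussed in this review.
public final class BraceBalance {

  static int balance(String s) {
    int open = 0;
    for (char c : s.toCharArray()) {
      if (c == '{' || c == '[') open++;
      if (c == '}' || c == ']') open--;
    }
    return open;
  }
}
```

Feeding it the prefix `{"query": {"bool": {"must": [` plus the `]}}` tail yields 1 (one brace short); with the `]}}}` tail it yields 0.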
doc.put("downstreamUsage", 0);
doc.put("consumerCount", 0);
⚠️ Bug: downstreamUsage and consumerCount always 0, crippling score
While recentIncidents is now correctly computed by counting failed runs (lines 2843-2846), downstreamUsage and consumerCount remain hardcoded to 0 (lines 2858-2859). Since the impact score formula is 0.4 * downstream + 0.3 * consumer + 0.3 * incidents, 70% of the score weight is permanently zeroed out. The ranking will only ever reflect recent incident counts, making the multi-factor scoring model misleading — users see weights advertised for downstream/consumer impact that have no effect.
This was flagged in the previous review and is only partially addressed by this commit.
Suggested fix:
Either:
1. Populate `downstreamUsage` and `consumerCount` with real data from lineage/usage APIs, or
2. Remove these factors from the score formula and UI until they are implemented, to avoid confusing users with always-zero metrics.
Map<String, Object> doc;
if (groupedTestCases.containsKey(testCaseFQN)) {
  doc = groupedTestCases.get(testCaseFQN);
  int runCount = ((Number) doc.getOrDefault("runCount", 0)).intValue();
💡 Edge Case: ClassCastException risk on getOrDefault with Integer 0
At lines 2840 and 2843, doc.getOrDefault("runCount", 0) and doc.getOrDefault("recentIncidents", 0) use an int literal as the default, which is auto-boxed to Integer. Because the map stores Integer values from doc.put("runCount", 1), the (Number) cast currently works, but the pattern is fragile: if anyone changes the stored value type (e.g., to Long), the cast chain could break. This is minor but worth noting for maintainability.
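One way to avoid the getOrDefault-then-cast pattern altogether (a sketch, not the PR's code) is to keep the counters in a typed map and use Map.merge for increments:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: type-safe counter updates, avoiding
// ((Number) map.getOrDefault(key, 0)).intValue() followed by a put.
public final class CounterSketch {

  static void increment(Map<String, Integer> counters, String key) {
    // merge stores 1 on the first sight of the key,
    // otherwise applies Integer::sum to add 1 to the stored value.
    counters.merge(key, 1, Integer::sum);
  }
}
```

This keeps the stored value type in the map's signature, so a later change to Long counters becomes a compile-time error rather than a runtime ClassCastException.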
As suggested, I think the JsonArrayBuilder or JsonUtils should really help to handle dynamic JSON.
Code Review 🚫 Blocked — 4 resolved / 8 findings

Impact ranking for data quality checks is currently blocked by a malformed JSON query and hardcoded zero values for downstream usage metrics. Additionally, accidental indentation changes and a potential ClassCastException require attention.

🚨 Bug: JSON query still malformed — missing one closing brace
📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/TestSuiteRepository.java:431

The fix at line 431 changed }}}} to }]}}, which adds the missing ] for the must array but still leaves the JSON one closing brace short. The structure opened is {"query": {"bool": {"must": [, so the closers after the range clause should be ]}}} rather than ]}}. This means the Elasticsearch query is still malformed and searchTestCasesForImpact will throw a parse error.

Suggested fix: change the format string to include the missing brace, or build the query with a JSON library instead of string concatenation.
Implement Data Quality Checks Impact feature for issue #26658
This helps prioritize which data quality checks are most critical to fix by ranking them based on their business impact.
Describe your changes:
I worked on the Data Quality Checks Impact feature because the issue requested ranking data quality checks by impact to give a clear idea of data quality failure criticality.
Type of change:
Checklist: