
Extended UT Framework for Functional Correctness of query conversion#21060

Closed
nssuresh2007 wants to merge 1 commit into opensearch-project:feature/datafusion from nssuresh2007:feature/datafusion

Conversation

@nssuresh2007

Changes to test the end-to-end DSL query conversion without an OpenSearch cluster. The initial commit adds support for match_all and terms with an aggregation.

Description

This feature introduces a golden-file-based unit test framework for the OpenSearch DSL query executor plugin (sandbox/plugins/dsl-query-executor/). The framework validates the correctness of both the forward conversion path (DSL → RelNode) and the reverse conversion path (RelNode execution results → SearchResponse DSL output) by comparing actual outputs against pre-approved golden files. Each golden file encodes a complete test case: the input DSL, the expected RelNode logical plan, simulated execution rows, and the expected DSL output.
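Based on the required fields enforced by the loader (testName, indexName, indexMapping, inputDsl, expectedRelNodePlan, executionFieldNames, executionRows, expectedOutputDsl, plus planType), a golden file might look roughly like the sketch below; all values are illustrative and not taken from the PR's actual test resources:

```json
{
  "testName": "match_all_hits",
  "indexName": "products",
  "indexMapping": { "properties": { "name": { "type": "keyword" }, "price": { "type": "double" } } },
  "planType": "HITS",
  "inputDsl": { "query": { "match_all": {} } },
  "expectedRelNodePlan": "LogicalProject(name=[$0], price=[$1])\n  LogicalTableScan(table=[[products]])",
  "executionFieldNames": ["name", "price"],
  "executionRows": [["laptop", 999.0], ["phone", 499.0]],
  "expectedOutputDsl": { "hits": { "total": { "value": 2, "relation": "eq" } } }
}
```

Keeping all four parts in one file is what makes each test case self-contained and reviewable in isolation.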

Requirements

  • As a developer, I want each golden file to contain all inputs and expected outputs for a single test case, so that test cases are self-contained and easy to review in code reviews.
  • As a developer, I want the test framework to automatically discover and parse golden files, so that adding a new test case requires only adding a new golden file.
  • As a developer, I want to validate that a given DSL input produces the expected Calcite logical plan, so that regressions in the conversion pipeline are caught automatically.
  • As a developer, I want to validate that simulated execution results are correctly converted back into a SearchResponse, so that regressions in the response building logic are caught automatically.
  • As a developer, I want to verify that the forward and reverse paths are consistent within a golden file, so that the entire DSL → RelNode → execution → DSL output pipeline is validated end-to-end.
  • As a developer, I want golden file test cases covering the hits pipeline scenarios, so that query translation correctness is validated for common query types.
  • As a developer, I want golden file test cases covering the aggregation pipeline scenarios, so that aggregation translation correctness is validated for supported metric and bucket types.
  • As a developer, I want a way to regenerate golden files when intentional changes are made to the conversion logic, so that maintaining golden files does not become a burden.
  • As a developer, I want the golden file tests to run as fast unit tests without requiring a running OpenSearch cluster, so that they can be part of the standard build cycle.
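The regeneration requirement above is typically wired through a JVM flag checked by the updater. A minimal sketch of that gating, assuming a system property named tests.golden.update (the actual property read by GoldenFileUpdater may differ):

```java
public class GoldenUpdateFlag {
    // Hypothetical sketch: gate golden-file regeneration behind a JVM system
    // property so normal test runs only assert and never rewrite files.
    // The property name "tests.golden.update" is an assumption, not
    // necessarily what GoldenFileUpdater.isUpdateMode() actually reads.
    static boolean isUpdateMode() {
        return Boolean.getBoolean("tests.golden.update");
    }

    public static void main(String[] args) {
        // With the property unset this prints false: tests compare against
        // the checked-in golden files instead of rewriting them.
        System.out.println(isUpdateMode());
    }
}
```

A build invocation could then pass -Dtests.golden.update=true to rewrite all golden files in one run, while the default build keeps asserting against the checked-in files.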

Related Issues

Resolves #[Issue number to be closed when this PR is merged]

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@nssuresh2007 nssuresh2007 requested a review from a team as a code owner March 31, 2026 10:51
@github-actions
Contributor

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

🧪 PR contains tests
🔒 No security concerns identified
✅ No TODO sections
🔀 Multiple PR themes

Sub-PR theme: Golden File Framework Infrastructure (POJO, Loader, Updater, CalciteInfra, Design Doc)

Relevant files:

  • sandbox/plugins/dsl-query-executor/src/test/java/org/opensearch/dsl/golden/GoldenTestCase.java
  • sandbox/plugins/dsl-query-executor/src/test/java/org/opensearch/dsl/golden/GoldenFileLoader.java
  • sandbox/plugins/dsl-query-executor/src/test/java/org/opensearch/dsl/golden/GoldenFileUpdater.java
  • sandbox/plugins/dsl-query-executor/src/test/java/org/opensearch/dsl/golden/CalciteTestInfra.java
  • sandbox/plugins/dsl-query-executor/src/test/design.md

Sub-PR theme: Golden File Test Cases and Test Runner

Relevant files:

  • sandbox/plugins/dsl-query-executor/src/test/java/org/opensearch/dsl/golden/DslGoldenFileTests.java
  • sandbox/plugins/dsl-query-executor/src/test/resources/golden/match_all_hits.json
  • sandbox/plugins/dsl-query-executor/src/test/resources/golden/terms_with_avg_aggregation.json

⚡ Recommended focus areas for review

Missing Aggregation Output

The golden file terms_with_avg_aggregation.json has expectedOutputDsl that only contains an empty hits section with no aggregation buckets. The reverse path test for aggregation queries should validate the actual aggregation output (buckets with brand keys, avg_price values, doc_counts), but the expected output appears to be a placeholder that doesn't reflect the actual aggregation results from the executionRows.

import java.util.List;
import java.util.Map;

/**
 * Golden file tests for the DSL query executor plugin.
 *
 * <p>Each test method loads its own specific golden file by name, keeping
 * dependencies explicit. Tests validate the forward path (DSL → RelNode),
 * reverse path (ExecutionResult → SearchResponse), and field name consistency.
 */
public class DslGoldenFileTests extends OpenSearchTestCase {

Update Mode Race Condition

In runForwardPathTest, update mode calls GoldenFileUpdater.update(filePath, actualPlan, tc.getExpectedOutputDsl()) — passing the existing expected output DSL rather than the actual computed output. This means update mode for the forward path test will not regenerate the expectedOutputDsl with real values, potentially writing stale data back to the golden file.

if (GoldenFileUpdater.isUpdateMode()) {
    Path filePath = goldenFilePath(goldenFileName);
    GoldenFileUpdater.update(filePath, actualPlan, tc.getExpectedOutputDsl());
} else {

Mutable Map Comparison

stripNonDeterministicFields mutates tc.getExpectedOutputDsl() directly, which modifies the GoldenTestCase object's internal state. If the same test case object were reused (e.g., in runAllChecks which calls both runReversePathTest and runConsistencyCheck), the stripped map could cause unexpected behavior. Consider working on a defensive copy instead.

Map<String, Object> expectedOutput = tc.getExpectedOutputDsl();
stripNonDeterministicFields(actualOutput);
stripNonDeterministicFields(expectedOutput);

Missing planType Validation

The validate method checks all required fields except planType, which is used in DslGoldenFileTests via QueryPlans.Type.valueOf(tc.getPlanType()). A missing or invalid planType in a golden file will cause a NullPointerException or IllegalArgumentException at test runtime rather than a clear validation error at load time.

private static void validate(GoldenTestCase testCase, Path filePath) {
    requireNonNull(testCase.getTestName(), "testName", filePath);
    requireNonNull(testCase.getIndexName(), "indexName", filePath);
    requireNonNull(testCase.getIndexMapping(), "indexMapping", filePath);
    requireNonNull(testCase.getInputDsl(), "inputDsl", filePath);
    requireNonNull(testCase.getExpectedRelNodePlan(), "expectedRelNodePlan", filePath);
    requireNonNull(testCase.getExecutionFieldNames(), "executionFieldNames", filePath);
    requireNonNull(testCase.getExecutionRows(), "executionRows", filePath);
    requireNonNull(testCase.getExpectedOutputDsl(), "expectedOutputDsl", filePath);
}

Incorrect Expected Output

The expectedOutputDsl for match_all_hits.json shows "hits": [] with "total": {"value": 0}, but executionRows contains 2 rows (laptop and phone). The expected output does not reflect the actual execution rows, suggesting the golden file's expected output is a placeholder and the reverse path test may be asserting against incorrect expected values.

"expectedOutputDsl": {
  "num_reduce_phases": 0,
  "hits": {
    "total": {
      "value": 0,
      "relation": "eq"
    },
    "max_score": 0.0,
    "hits": []
  }
}
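If the golden file were updated to reflect its two executionRows, the expected output would presumably look closer to the following sketch; the _source field name ("name") is illustrative, since the actual shape depends on the executionFieldNames in the file:

```json
"expectedOutputDsl": {
  "num_reduce_phases": 0,
  "hits": {
    "total": { "value": 2, "relation": "eq" },
    "max_score": 0.0,
    "hits": [
      { "_source": { "name": "laptop" } },
      { "_source": { "name": "phone" } }
    ]
  }
}
```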

@github-actions
Contributor

PR Code Suggestions ✨

Explore these optional code suggestions:

Possible issue
Fix stale value passed in forward-path update mode

In update mode for the forward path, tc.getExpectedOutputDsl() is passed as the
output DSL, but this is the old/existing value from the golden file rather than the
actual computed output. The actual output DSL should be preserved from the reverse
path run, or the forward-path update should only update expectedRelNodePlan while
leaving expectedOutputDsl unchanged. Passing the stale expected value will silently
overwrite the file with no real change to expectedOutputDsl.

sandbox/plugins/dsl-query-executor/src/test/java/org/opensearch/dsl/golden/DslGoldenFileTests.java [74-77]

 if (GoldenFileUpdater.isUpdateMode()) {
     Path filePath = goldenFilePath(goldenFileName);
-    GoldenFileUpdater.update(filePath, actualPlan, tc.getExpectedOutputDsl());
+    // Only update the relNodePlan; preserve existing expectedOutputDsl
+    GoldenFileUpdater.update(filePath, actualPlan, GoldenFileLoader.load(filePath).getExpectedOutputDsl());
 } else {
Suggestion importance[1-10]: 6


Why: In update mode for the forward path, tc.getExpectedOutputDsl() passes the old/stale value from the golden file rather than the actual computed output. While this is a real issue, the impact is limited since the forward path test is only meant to update expectedRelNodePlan, and the reverse path test handles expectedOutputDsl updates separately. The improved code re-reads the file which is redundant since tc already has the value.

Low
Guard against null planType causing NPE

tc.getPlanType() may be null if the planType field is missing from the golden file,
since GoldenFileLoader.validate() does not require planType. Calling
QueryPlans.Type.valueOf(null) will throw a NullPointerException. Add a null check or
include planType in the required field validation.

sandbox/plugins/dsl-query-executor/src/test/java/org/opensearch/dsl/golden/DslGoldenFileTests.java [106-107]

-QueryPlans.Type expectedType = QueryPlans.Type.valueOf(tc.getPlanType());
+String planTypeStr = tc.getPlanType();
+if (planTypeStr == null) {
+    throw new IllegalStateException("Golden file " + goldenFileName + " missing required field: planType");
+}
+QueryPlans.Type expectedType = QueryPlans.Type.valueOf(planTypeStr);
 QueryPlans.QueryPlan plan = plans.get(expectedType).get(0);
Suggestion importance[1-10]: 5


Why: planType is not validated as a required field in GoldenFileLoader.validate(), so a null value could cause an NPE when QueryPlans.Type.valueOf(null) is called. Adding a null check or including planType in validation would prevent this, though in practice all golden files in the PR include planType.

Low
Add missing empty-list guard in reverse and consistency helpers

In runReversePathTest and runConsistencyCheck, plans.get(expectedType).get(0) is
called without first checking whether the list is empty, which would throw an
IndexOutOfBoundsException. The emptiness check (assertFalse) is only present in
runForwardPathTest. Add the same guard in the other two helpers.

sandbox/plugins/dsl-query-executor/src/test/java/org/opensearch/dsl/golden/DslGoldenFileTests.java [106-108]

 QueryPlans.Type expectedType = QueryPlans.Type.valueOf(tc.getPlanType());
 List<QueryPlans.QueryPlan> matchingPlans = plans.get(expectedType);
 assertFalse("No " + expectedType + " plan produced for " + goldenFileName, matchingPlans.isEmpty());
+QueryPlans.QueryPlan plan = matchingPlans.get(0);
 
-RelNode relNode = matchingPlans.get(0).relNode();
-
Suggestion importance[1-10]: 5


Why: runReversePathTest and runConsistencyCheck call .get(0) on the plans list without checking if it's empty, which could throw IndexOutOfBoundsException. The forward path has this guard but the other two helpers don't, creating an inconsistency that could produce confusing test failures.

Low
General
Avoid mutating shared expected output map

stripNonDeterministicFields mutates the map returned by tc.getExpectedOutputDsl(),
which is the same map held inside the GoldenTestCase object. If the test is ever
reused or the golden file is loaded once and run multiple times, the expected map
will be permanently modified. Work on a defensive copy instead.

sandbox/plugins/dsl-query-executor/src/test/java/org/opensearch/dsl/golden/DslGoldenFileTests.java [206-210]

 private void stripNonDeterministicFields(Map<String, Object> responseMap) {
     responseMap.remove("took");
     responseMap.remove("timed_out");
     responseMap.remove("_shards");
 }
 
+// In runReversePathTest, use copies:
+// Map<String, Object> expectedOutput = new java.util.LinkedHashMap<>(tc.getExpectedOutputDsl());
+// Map<String, Object> actualOutputCopy = new java.util.LinkedHashMap<>(actualOutput);
+// stripNonDeterministicFields(actualOutputCopy);
+// stripNonDeterministicFields(expectedOutput);
+
Suggestion importance[1-10]: 3


Why: The stripNonDeterministicFields method mutates the map from tc.getExpectedOutputDsl() directly. However, since each test loads its own GoldenTestCase instance via GoldenFileLoader.load(), the mutation doesn't affect other tests. The improved_code snippet is also not a clean replacement of the existing_code, making it harder to apply.

Low

@github-actions
Contributor

❌ Gradle check result for b859be5: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@vinaykpud
Contributor

Thanks @nssuresh2007

This PR adds a "golden file" testing layer that reads JSON files and validates:

  • Forward path: SearchSourceBuilder → SearchSourceConverter.convert() → RelNode.explain() → assert matches expected plan string
  • Reverse path: mock rows → SearchResponseBuilder.build() → assert JSON matches expected output

I think it's worth considering how this fits with what we already have:

  1. Unit tests per converter/translator: these test individual modules for each DSL construct and RelNode construct.
  2. internalClusterTest integration tests (DslQueryIT, DslAggregationIT, DslSortIT, DslProjectIT): these test real execution against a real cluster.

The golden file tests chain SearchSourceConverter + SearchResponseBuilder but still use mocked infrastructure (no cluster), so there's a fair bit of overlap with the existing unit tests. One concern is that the maintenance burden could grow over time: for example, if a RelNode plan shape changes due to a refactor (even though semantics are correct), all golden files would need updating.

A couple of alternatives worth considering:

We could extend the existing internalClusterTest suite for each new query shape. These already test real execution against a single-node cluster:

// In DslQueryIT.java
public void testRangeQuery() {
    SearchResponse resp = executeQuery(indexName,
        "{\"query\":{\"range\":{\"price\":{\"gte\":100}}}}");
    assertEquals(RestStatus.OK, resp.status());
    // validate actual hits from real execution
}

If we want JSON-driven test cases for convenience, we could build that on top of integration test infrastructure. Two options that come to mind:

  1. Java REST QA test : similar to the PPL query tests in https://github.com/opensearch-project/OpenSearch/tree/feature/datafusion/sandbox/qa/analytics-engine-rest. Tests queries against a
    real cluster for a standard set of query shapes.
  2. YAML REST QA test : since the DSL query executor intercepts standard _search via SearchActionFilter, we're essentially testing the search API with different query bodies. YAML REST tests could be a natural fit here with no custom framework needed (same pattern as https://github.com/opensearch-project/OpenSearch/tree/main/rest-api-spec).

What do you think? Happy to discuss which approach makes the most sense here.

@nssuresh2007
Author

@vinaykpud thanks for the feedback.

These extended unit tests are needed to certify the functional correctness of the DSL plugin as a standalone testing framework, without a dependency on an OpenSearch cluster. This requirement is not satisfied by the existing tests:

  1. Unit tests - very class-level focused; they do not verify the functional correctness of the entire query conversion flow.
  2. Integration tests - run against an actual cluster. While integration tests can cover these scenarios, they can still fail when the underlying analytics or backend plugin has issues, since they focus on end-to-end testing. They cannot provide standalone certification of the DSL plugin's functional correctness.

Hence this extended UT framework is necessary to validate that the logic we are adding now is correct. Otherwise, we would only catch issues very late in the development lifecycle, once all downstream plugins are functional and testable. Also note that this is not a replacement for UT or IT; it is complementary to both.

Please let me know if that helps to address your query.

@nssuresh2007 nssuresh2007 changed the title DSL Golden File Testing Framework (to validate query conversion end to end) Extended UT Framework for Functional Correctness of query conversion Apr 3, 2026
@github-actions
Contributor

github-actions Bot commented Apr 5, 2026

PR Code Analyzer ❗

AI-powered 'Code-Diff-Analyzer' found issues on commit b03d5c1.

  • sandbox/plugins/dsl-query-executor/build.gradle:27 (high): New dependency added: 'com.google.guava:guava:${versions.guava}' via custom 'calciteCompile' configuration. Per mandatory rule, all dependency additions must be flagged regardless of apparent legitimacy. Maintainers should verify the resolved artifact matches the expected Guava release.
  • sandbox/plugins/dsl-query-executor/build.gradle:32 (high): New test dependency added: 'com.fasterxml.jackson.core:jackson-databind:${versions.jackson_databind}'. Per mandatory rule, all dependency additions must be flagged. Maintainers should verify the resolved artifact is the expected Jackson Databind release.
  • sandbox/plugins/dsl-query-executor/build.gradle:18 (medium): A custom Gradle configuration 'calciteCompile' is introduced specifically to bypass the existing 'compileClasspath { exclude group: "com.google.guava" }' exclusion rule. Adding the Guava dependency via a separate configuration that is then appended to compileClasspath circumvents the project-wide exclusion policy. While the comment provides a rationale, this pattern warrants review to confirm it does not introduce unwanted Guava classes into production compile or runtime scope.

The table above displays the top 10 most important findings.

Total: 3 | Critical: 0 | High: 2 | Medium: 1 | Low: 0


Pull Requests Author(s): Please update your Pull Request according to the report above.

Repository Maintainer(s): You can bypass diff analyzer by adding label skip-diff-analyzer after reviewing the changes carefully, then re-run failed actions. To re-enable the analyzer, remove the label, then re-run all actions.


⚠️ Note: The Code-Diff-Analyzer helps protect against potentially harmful code patterns. Please ensure you have thoroughly reviewed the changes beforehand.

Thanks.

ExecutionResult result = new ExecutionResult(plan, rows);

// Build SearchResponse
var response = SearchResponseBuilder.build(List.of(result), 0L);
Contributor


For the forward path we'll use SearchSourceBuilder to convert to Calcite and hand off, but for the reverse path, is the plan to get results back as SearchResponseBuilder / SearchResponse to maintain result compatibility?

For SQL / PPL, we might directly get back the rows, hence the question.

Author


but for the reverse path, the plan is to get results as SearchResponseBuilder / SearchResponse to maintain result compatibility?

Yes, that is the plan

Path filePath = goldenFilePath(goldenFileName);
GoldenFileUpdater.update(filePath, actualPlan, tc.getExpectedOutputDsl());
} else {
assertEquals(
Contributor


In the RelNode plan, should we validate the schema as well at each level?

Author

@nssuresh2007 nssuresh2007 Apr 12, 2026


Comment on lines +78 to +87
private static void validate(GoldenTestCase testCase, Path filePath) {
requireNonNull(testCase.getTestName(), "testName", filePath);
requireNonNull(testCase.getIndexName(), "indexName", filePath);
requireNonNull(testCase.getIndexMapping(), "indexMapping", filePath);
requireNonNull(testCase.getInputDsl(), "inputDsl", filePath);
requireNonNull(testCase.getExpectedRelNodePlan(), "expectedRelNodePlan", filePath);
requireNonNull(testCase.getExecutionFieldNames(), "executionFieldNames", filePath);
requireNonNull(testCase.getExecutionRows(), "executionRows", filePath);
requireNonNull(testCase.getExpectedOutputDsl(), "expectedOutputDsl", filePath);
}
Contributor


Instead of this, can we use @NonNull in the GoldenTestCase class itself?

Author


Currently I am not able to find any suitable library providing a NonNull annotation. Also, other places in OpenSearch currently use Objects.requireNonNull only.

stripNonDeterministicFields(actualOutput);
stripNonDeterministicFields(expectedOutput);

assertEquals(
Contributor


Bucket order also might not be deterministic in cases where OpenSearch uses a top-K approach at the shard level, e.g. a terms bucket aggregation with the size parameter.
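One way to keep such assertions stable is to canonicalize bucket order before comparing. A sketch in plain Java (canonicalize is a hypothetical helper, not code from this PR):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

public class BucketComparison {
    // Sketch: sort buckets by their "key" before comparing, since the bucket
    // order produced by a shard-level top-K reduce may not be deterministic.
    static List<Map<String, Object>> canonicalize(List<Map<String, Object>> buckets) {
        List<Map<String, Object>> copy = new ArrayList<>(buckets);
        copy.sort(Comparator.comparing((Map<String, Object> b) -> String.valueOf(b.get("key"))));
        return copy;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> fromShardA = List.of(
            Map.of("key", "phone", "doc_count", 1),
            Map.of("key", "laptop", "doc_count", 1));
        List<Map<String, Object>> fromShardB = List.of(
            Map.of("key", "laptop", "doc_count", 1),
            Map.of("key", "phone", "doc_count", 1));
        // Order-insensitive equality: prints true even though the raw
        // bucket order differs between the two lists.
        System.out.println(canonicalize(fromShardA).equals(canonicalize(fromShardB)));
    }
}
```

The same approach would only be applied to bucket lists whose order is documented as unstable; ordered aggregations should still be compared as-is.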

Author

@nssuresh2007 nssuresh2007 Apr 12, 2026


* Removes non-deterministic fields (took, _shards) from a serialized
* SearchResponse map so that comparisons are stable.
*/
private void stripNonDeterministicFields(Map<String, Object> responseMap) {
Contributor


Will we also have to strip the score, as it won't be supported?

Author

@nssuresh2007 nssuresh2007 Apr 12, 2026


Changes to test the end to end DSL query conversion without OpenSearch cluster.
Currently adds support for match_all and terms with agg for the initial commit.
@nssuresh2007 nssuresh2007 reopened this Apr 12, 2026
@nssuresh2007 nssuresh2007 deleted the branch opensearch-project:feature/datafusion April 12, 2026 14:48
@nssuresh2007 nssuresh2007 deleted the feature/datafusion branch April 12, 2026 14:48
nssuresh2007 added a commit to nssuresh2007/OpenSearch that referenced this pull request Apr 12, 2026
nssuresh2007 added a commit to nssuresh2007/OpenSearch that referenced this pull request Apr 12, 2026
@nssuresh2007
Author

Apologies, since the forked repo branch got deleted by mistake, I raised a new PR after re-creating it and addressed the comments there: #21210

nssuresh2007 added a commit to nssuresh2007/OpenSearch that referenced this pull request Apr 14, 2026
nssuresh2007 added a commit to nssuresh2007/OpenSearch that referenced this pull request Apr 23, 2026
Signed-off-by: Suresh N S <nssuresh@amazon.com>
nssuresh2007 added a commit to nssuresh2007/OpenSearch that referenced this pull request Apr 29, 2026
nssuresh2007 added a commit to nssuresh2007/OpenSearch that referenced this pull request May 5, 2026


4 participants