Skip to content

Commit 7da02d5

Browse files
authored
Carry CalciteFieldFormatCommandIT through the helper-managed index path (#5417)
Same shape as #5407 for CalciteEvalCommandIT. The IT's init() previously created the in-test test_eval index via direct `PUT /test_eval/_doc/N` requests, relying on dynamic mapping. Two problems: 1. The doc PUTs auto-create the index with whatever settings the cluster defaults to. The analytics-engine compatibility path (force-routing on; tests.analytics.parquet_indices=true) needs parquet-backed indices, which TestUtils.createIndexByRestClient applies via TestUtils.makeParquetBacked when the system property is set. Direct PUTs sidestep that helper, so test_eval lands as Lucene-backed and the analytics planner rejects it with "No backend can scan all requested fields on index [test_eval]". All four working tests fail at execution. 2. init() runs before every @test method. The doc PUTs are doc-level idempotent, so re-running was wasteful but not failing. Once we switch to createIndexByRestClient, the index-level PUT is no longer idempotent and re-running throws "resource_already_exists_exception". Both addressed in one change: - test_eval is created via TestUtils.createIndexByRestClient with an explicit mapping (name/title=keyword, age=long). The helper honours tests.analytics.parquet_indices=true and produces a parquet-backed index for the analytics-engine sweep; on the v2 path the helper is a no-op around the index PUT, so behaviour is unchanged. - The whole init body is guarded by TestUtils.isIndexExist — same idempotency idiom that loadIndex uses for predefined fixtures. First @test method provisions; subsequent methods skip. Also pins the projection order on testFieldFormatStringConcatenation. The original query (`source=test_eval | fieldformat greeting = 'Hello ' + name`) had no `| fields` clause and relied on the implicit projection's column order — v2 returns Lucene-source insertion order, analytics returns parquet-storage order (alphabetical), so the assertion only matched on v2 by coincidence. Adding `| fields name, title, age, greeting` makes the assertion deterministic across paths; the existing expected rows (`rows("Alice", "Engineer", 25, "Hello Alice")`) already match this order, so v2 behaviour is preserved. The other four tests already had explicit `| fields ...` clauses, so no change there. No semantic change for the v2 path: the explicit mapping types (keyword, long, keyword) resolve to the same PPL types ("string", "bigint", "string") that dynamic mapping inferred, and fieldformat reads from _source either way. Analytics-route compatibility goes from 1/5 to 4/5 (verified locally against a runTask cluster with analytics-engine + opensearch-sql-plugin). The remaining `testFieldFormatStringConcatenationWithNullFieldToString` needs a `tostring()` UDF on the analytics path — a multi-mode UDF (binary / hex / commas / duration) tracked separately as out of scope. Test plan: - ./gradlew :integ-test:integTest --tests 'org.opensearch.sql.calcite.remote.CalciteFieldFormatCommandIT' -> 5/5 green (v2 path, no regression). - ./gradlew :integ-test:analyticsCompatibilityTest --tests 'org.opensearch.sql.calcite.remote.CalciteFieldFormatCommandIT' -> 4/5 pass; the 5th fails on `tostring`'s missing capability registration, which is the documented out-of-scope category. Signed-off-by: Kai Huang <ahkcs@amazon.com>
1 parent 5d31506 commit 7da02d5

1 file changed

Lines changed: 34 additions & 11 deletions

File tree

integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteFieldFormatCommandIT.java

Lines changed: 34 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -12,6 +12,7 @@
1212
import org.json.JSONObject;
1313
import org.junit.jupiter.api.Test;
1414
import org.opensearch.client.Request;
15+
import org.opensearch.sql.legacy.TestUtils;
1516
import org.opensearch.sql.ppl.PPLIntegTestCase;
1617

1718
public class CalciteFieldFormatCommandIT extends PPLIntegTestCase {
@@ -23,26 +24,48 @@ public void init() throws Exception {
2324

2425
loadIndex(Index.BANK);
2526

26-
// Create test data for string concatenation
27-
Request request1 = new Request("PUT", "/test_eval/_doc/1?refresh=true");
28-
request1.setJsonEntity("{\"name\": \"Alice\", \"age\": 25, \"title\": \"Engineer\"}");
29-
client().performRequest(request1);
27+
// Pre-create test_eval through the helper so the analytics-engine compatibility run
28+
// (tests.analytics.parquet_indices=true) provisions it as a parquet-backed composite
29+
// index. Plain auto-mapping via the doc PUTs would create a Lucene-backed index, which
30+
// the analytics-engine planner cannot scan ("No backend can scan all requested fields").
31+
// Explicit mapping pins types so both v2 (verifySchema "string"/"bigint") and analytics
32+
// paths see the same shape regardless of dynamic-mapping behavior on the parquet engine.
33+
// Guarded by isIndexExist for idempotency — init() runs before each @Test method.
34+
if (!TestUtils.isIndexExist(client(), "test_eval")) {
35+
String testEvalMapping =
36+
"{\"mappings\":{\"properties\":{"
37+
+ "\"name\":{\"type\":\"keyword\"},"
38+
+ "\"age\":{\"type\":\"long\"},"
39+
+ "\"title\":{\"type\":\"keyword\"}}}}";
40+
TestUtils.createIndexByRestClient(client(), "test_eval", testEvalMapping);
3041

31-
Request request2 = new Request("PUT", "/test_eval/_doc/2?refresh=true");
32-
request2.setJsonEntity("{\"name\": \"Bob\", \"age\": 30, \"title\": \"Manager\"}");
33-
client().performRequest(request2);
42+
// Create test data for string concatenation
43+
Request request1 = new Request("PUT", "/test_eval/_doc/1?refresh=true");
44+
request1.setJsonEntity("{\"name\": \"Alice\", \"age\": 25, \"title\": \"Engineer\"}");
45+
client().performRequest(request1);
3446

35-
Request request3 = new Request("PUT", "/test_eval/_doc/3?refresh=true");
36-
request3.setJsonEntity("{\"name\": \"Charlie\", \"age\": null, \"title\": \"Analyst\"}");
37-
client().performRequest(request3);
47+
Request request2 = new Request("PUT", "/test_eval/_doc/2?refresh=true");
48+
request2.setJsonEntity("{\"name\": \"Bob\", \"age\": 30, \"title\": \"Manager\"}");
49+
client().performRequest(request2);
50+
51+
Request request3 = new Request("PUT", "/test_eval/_doc/3?refresh=true");
52+
request3.setJsonEntity("{\"name\": \"Charlie\", \"age\": null, \"title\": \"Analyst\"}");
53+
client().performRequest(request3);
54+
}
3855
}
3956

4057
@Test
4158
public void testFieldFormatStringConcatenation() throws IOException {
59+
// Pin the projection so column order is deterministic across execution paths — the
60+
// analytics-engine route reads parquet schema in storage order, which can differ from the
61+
// v2 / Lucene path's _source-iteration order. Adding an explicit | fields makes the test
62+
// a strict assertion on the fieldformat expression rather than a coincidence of projection
63+
// order.
4264
JSONObject result =
4365
executeQuery(
4466
StringEscapeUtils.escapeJson(
45-
"source=test_eval | fieldformat greeting = 'Hello ' + name"));
67+
"source=test_eval | fieldformat greeting = 'Hello ' + name | fields name, title,"
68+
+ " age, greeting"));
4669
verifySchema(
4770
result,
4871
schema("name", "string"),

0 commit comments

Comments
 (0)