Skip to content

Commit f5ea743

Browse files
committed
spath: Arrow Map response marshalling + parquet-backed test indices
Two complementary changes for closing PPL `spath` parity on the analytics-engine route. Pairs with opensearch-project/OpenSearch#21664 (the analytics-engine `json_extract_all` UDF + ITEM-on-MAP dispatch + MAP capability registrations). Both PRs together take `CalcitePPLSpathCommandIT` on the analytics-engine route from 0 / 16 to 16 / 16 without regressing the v2 / Calcite path (still 16 / 16). ## Pass rate | IT | Route | Before | After | |---|---|---|---| | `CalcitePPLSpathCommandIT` | analytics-engine (`-Dtests.analytics.force_routing=true -Dtests.analytics.parquet_indices=true`) | 0 / 16 | **16 / 16** | | `CalcitePPLSpathCommandIT` | default v2 / Calcite (no flags) | 16 / 16 | 16 / 16 (no regression) | The analytics-route number depends on this PR + #21664 landing together; neither PR alone moves it off 0 / 16. ## Changes 1. **`core/.../ExprValueUtils.java`** — two boundary fixes for Arrow Map values that arrive from the analytics-engine response stream: - `fromObjectValue` gains a FQN-keyed branch for `org.apache.arrow.vector.util.Text` (decoded via `toString()`). Arrow's MapVector / StructVector emit values as `Text` (a UTF-8 byte-buffer wrapper that does NOT implement `CharSequence`), which none of the typed `instanceof` branches recognized. Without this, any UDF returning `Map<Utf8, Utf8>` through the analytics-engine route surfaces as `ExpressionEvaluationException: unsupported object class org.apache.arrow.vector.util.Text`. FQN match keeps `core/` free of an Arrow dependency. - `tupleValue` widens its parameter type to `Map<?, Object>` and normalizes keys via `String.valueOf` so the downstream `LinkedHashMap<String, ExprValue>` isn't polluted with non-String keys (the JDBC response formatter then sees stable String keys). 2. **`integ-test/.../CalcitePPLSpathCommandIT.java`** — refactor `init()` to use `TestUtils.createIndexByRestClient` with an explicit keyword mapping for each of the four test indices (`test_spath`, `test_spath_auto`, `test_spath_cmd`, `test_spath_null`) before the per-doc PUTs. Without an explicit createIndex, the dynamic-mapping route bypasses the `tests.analytics.parquet_indices=true` parquet injection (because the toggle only fires inside `TestUtils.createIndexByRestClient`) and the analytics-engine fragment driver then fails with `UnsupportedOperationException: acquireReader is not supported in EngineBackedIndexer` for any test that reaches the runtime — `testSimpleSpath` was the only test affected before this PR (the other 15 failed earlier at the planner capability check). Idempotency via `TestUtils.isIndexExist` so the cluster-reuse pattern between `@Test` methods keeps working. No change for the v2 / Calcite path (the helper is a no-op for non-parquet runs). 3. **`docs/dev/spath-command-analytics-route-status.md`** — new companion to `docs/dev/ppl-analytics-engine-routing.md` documenting the full spath workstream: per-test baseline, bucket classification, the two repos' changes side-by-side, the design decisions (Map return type vs. stringified JSON, ITEM-on-MAP dispatch, FieldType.MAP capability registration), and the reproduction instructions. Signed-off-by: Kai Huang <ahkcs@amazon.com>
1 parent 1efb6c3 commit f5ea743

3 files changed

Lines changed: 255 additions & 32 deletions

File tree

core/src/main/java/org/opensearch/sql/data/model/ExprValueUtils.java

Lines changed: 23 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -111,6 +111,17 @@ public static ExprValue nullValue() {
111111
return ExprNullValue.of();
112112
}
113113

114+
private static final String ARROW_TEXT_CLASS_NAME = "org.apache.arrow.vector.util.Text";
115+
116+
/**
117+
* Whether {@code o} is an Arrow {@code Text} (the UTF-8 byte-buffer wrapper that arrow's Map /
118+
* Struct / List vectors emit for string values). FQN match keeps {@code core/} free of an Arrow
119+
* dependency.
120+
*/
121+
private static boolean isArrowText(Object o) {
122+
return o != null && ARROW_TEXT_CLASS_NAME.equals(o.getClass().getName());
123+
}
124+
114125
/** Construct ExprValue from Object. */
115126
public static ExprValue fromObjectValue(Object o) {
116127
if (null == o) {
@@ -143,6 +154,18 @@ public static ExprValue fromObjectValue(Object o) {
143154
return new ExprDoubleValue(d);
144155
} else if (o instanceof String) {
145156
return stringValue((String) o);
157+
} else if (isArrowText(o)) {
158+
// Arrow MapVector / StructVector yields values as
159+
// `org.apache.arrow.vector.util.Text` — a UTF-8 byte-buffer wrapper that
160+
// does NOT implement CharSequence and therefore wouldn't match any of the
161+
// typed branches above. `Text.toString()` decodes to a real Java String.
162+
// Matched by FQN rather than instanceof so `core/` doesn't acquire an
163+
// Arrow dependency for one type-system bridge. Without this branch the
164+
// analytics-engine route surfaces `ExpressionEvaluationException:
165+
// unsupported object class org.apache.arrow.vector.util.Text` from any
166+
// UDF returning Map<Utf8, Utf8> (first such UDF is `json_extract_all`
167+
// powering PPL `spath`).
168+
return stringValue(o.toString());
146169
} else if (o instanceof Float f) {
147170
if (!Float.isFinite(f)) {
148171
return LITERAL_NULL;
Lines changed: 164 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,164 @@
1+
# `spath` command on the analytics-engine route — current status
2+
3+
Snapshot of `CalcitePPLSpathCommandIT` against the analytics-engine path
4+
(`-Dtests.analytics.force_routing=true -Dtests.analytics.parquet_indices=true`
5+
on `:integ-test:integTestRemote`), as of 2026-05-14.
6+
7+
## Pass / fail summary
8+
9+
| IT | Run | Pass | Fail | Notes |
10+
|---|---|---|---|---|
11+
| `CalcitePPLSpathCommandIT` | Baseline (upstream/main + coverage-bundle toggles) | 0 / 16 | 16 | 15 rejected at plan time on `JSON_EXTRACT_ALL` capability gap; 1 (`testSimpleSpath`) hit `EngineBackedIndexer.acquireReader` because PUT-created indices weren't parquet-backed |
12+
| `CalcitePPLSpathCommandIT` | After IT-infra refactor only | 1 / 16 | 15 | `testSimpleSpath` passes (uses `JSON_EXTRACT`, already in `STANDARD_PROJECT_OPS`); remaining 15 still on `JSON_EXTRACT_ALL` |
13+
| `CalcitePPLSpathCommandIT` | After Rust UDF + Java wiring | 10 / 16 | 6 | Auto-extract variants pass; 5 ITEM tests blocked on `map_extract` substrait + CHAR(N) key coercion; 1 WHERE test on `Unrecognized field type [MAP]` |
14+
| `CalcitePPLSpathCommandIT` | After ITEM/map_extract/CHAR→VARCHAR coercion + FieldType.MAP + filter capability | **16 / 16** | 0 | Full parity. v2/Calcite default path also 16/16 — no regression. |
15+
16+
Reproduce:
17+
18+
```bash
19+
./gradlew :integ-test:integTestRemote \
20+
-Dtests.rest.cluster=localhost:9200 \
21+
-Dtests.cluster=localhost:9300 \
22+
-Dtests.clustername=runTask \
23+
-Dtests.analytics.force_routing=true \
24+
-Dtests.analytics.parquet_indices=true \
25+
--tests "org.opensearch.sql.calcite.remote.CalcitePPLSpathCommandIT"
26+
```
27+
28+
## Bucket
29+
30+
**Mixed Bucket 3 / M (Rust UDF) and Bucket 1 / S0 (capability registration), plus
31+
a small SQL-plugin-side IT-infrastructure fix.** Closing `spath` parity touches
32+
both halves of the stack — Layer 2 (capability registry) and Layer 3 (DataFusion
33+
runtime) — because `JSON_EXTRACT_ALL` is a PPL-internal UDF with no native
34+
DataFusion equivalent. The fix is shaped like a one-time investment that
35+
unblocks every future map-returning PPL function (the `LinkedHashMap<String, V>`
36+
pattern in `core/.../jsonUDF/`).
37+
38+
## What landed
39+
40+
The diff spans two repos:
41+
42+
### `opensearch-project/OpenSearch` — analytics-engine side
43+
44+
| File | Purpose |
45+
|---|---|
46+
| `sandbox/plugins/analytics-backend-datafusion/rust/src/udf/json_extract_all.rs` *(new, ~550 lines)* | Rust UDF returning `Map<Utf8, Utf8>`; mirrors `JsonExtractAllFunctionImpl` semantics (dot-path flatten, `{}` array marker, `[a, b, c]` merge format, `"null"` literal, malformed → empty map, top-level scalar → NULL). 16 unit tests. |
47+
| `sandbox/plugins/analytics-backend-datafusion/rust/src/udf/mod.rs` | Register `json_extract_all` in `register_all()` + module-info log message. |
48+
| `sandbox/plugins/analytics-backend-datafusion/src/main/resources/opensearch_scalar_functions.yaml` | Substrait extension entries for `json_extract_all(string) -> any1` and `map_extract(map<any1, any2>, any1) -> list<any2>`. |
49+
| `sandbox/libs/analytics-framework/.../ScalarFunction.java` | Add `JSON_EXTRACT_ALL(Category.SCALAR, SqlKind.OTHER_FUNCTION)` enum constant. |
50+
| `sandbox/libs/analytics-framework/.../FieldType.java` | Add `MAP("map")` enum constant and `case MAP -> FieldType.MAP` in `fromSqlTypeName`. |
51+
| `sandbox/plugins/analytics-backend-datafusion/.../JsonFunctionAdapters.java` | Add `JsonExtractAllAdapter` declaring `LOCAL_JSON_EXTRACT_ALL_OP` SqlFunction (return-type placeholder — `AbstractNameMappingAdapter#adapt` preserves the original PPL call's MAP RelDataType). |
52+
| `sandbox/plugins/analytics-backend-datafusion/.../ArrayElementAdapter.java` | Add `LOCAL_MAP_EXTRACT_OP` and a MAP-input branch in `adapt`: route `ITEM(map, key)` to `array_element(map_extract(map, CAST(key AS VARCHAR)), 1)`. The `array_element(…, 1)` wrap projects DataFusion's `List<value>` return down to a scalar; the CHAR→VARCHAR cast satisfies the substrait `any1` type-variable unification rule. |
53+
| `sandbox/plugins/analytics-backend-datafusion/.../DataFusionFragmentConvertor.java` | Register `json_extract_all` and `map_extract` in `ADDITIONAL_SCALAR_SIGS` so isthmus emits the right function reference. |
54+
| `sandbox/plugins/analytics-backend-datafusion/.../DataFusionAnalyticsBackendPlugin.java` | New `MAP_RETURNING_PROJECT_OPS` set (mirrors `ARRAY_RETURNING_PROJECT_OPS`) registered with `FieldType.MAP`. Register every operator in `STANDARD_FILTER_OPS` against `FieldType.MAP` so `where doc.user.name = 'John'` (which references the MAP column through ITEM) survives the filter-rule's field-index-keyed viability check. Adapter binding `JSON_EXTRACT_ALL → JsonExtractAllAdapter`. |
55+
| `sandbox/plugins/analytics-engine/.../ArrowValues.java` | Detect Arrow `MapVector` ahead of the ListVector branch and reassemble entries into a `LinkedHashMap<String, Object>` (Text→String normalization on both keys and values). Without this, the SQL plugin's response marshaller saw the raw Arrow Map layout (a `List<Map{keys, values}>` of entry structs) and rejected the values as `unsupported object class org.apache.arrow.vector.util.Text`. |
56+
| `gradle/run.gradle` | `arrow-flight-rpc` plugin block now also sets `opensearch.experimental.feature.transport.stream.enabled=true` so the analytics-engine + SQL-plugin co-install boots without the duplicate-handler Guice failure. |
57+
58+
### `opensearch-project/sql` — SQL plugin side
59+
60+
| File | Purpose |
61+
|---|---|
62+
| `core/.../data/model/ExprValueUtils.java` | (a) `tupleValue` widens its parameter type to `Map<?, Object>` so keys are normalized to strings regardless of the underlying Arrow `Text` type; (b) `fromObjectValue` gains a FQN-keyed branch for `org.apache.arrow.vector.util.Text` (string-decoded via `toString()`) so the response marshaller doesn't reject Arrow Map values. The FQN match keeps `core/` free of an Arrow dependency. |
63+
| `integ-test/.../calcite/remote/CalcitePPLSpathCommandIT.java` | Refactor `init()` to call `TestUtils.createIndexByRestClient` with an explicit `keyword` mapping for each of the four test indices (`test_spath`, `test_spath_auto`, `test_spath_cmd`, `test_spath_null`). Without explicit createIndex, the dynamic-mapping route bypassed `tests.analytics.parquet_indices=true`'s parquet-injection and `EngineBackedIndexer.acquireReader` failed at runtime. Idempotency via `TestUtils.isIndexExist` so the cluster-reuse pattern between `@Test` methods keeps working. |
64+
65+
## Knock-on coverage
66+
67+
The Rust UDF + capability registration work is reusable across the entire PPL
68+
map-returning function family. The same pattern (MAP_RETURNING_PROJECT_OPS,
69+
ITEM-on-Map dispatch via map_extract+array_element, MAP filter capability) now
70+
exists in the codebase, so a future PPL command that emits a `Map<VARCHAR, V>`
71+
column (e.g. `mapfrom` if ever lowered to a Calcite MAP RelDataType) is a
72+
one-line registration on top.
73+
74+
The `ArrowValues.MapVector` branch and the `ExprValueUtils.Text` branch are
75+
also generic — they unlock any future UDF returning `Map<Utf8, Utf8>` through
76+
the analytics-engine route without further SQL-plugin work.
77+
78+
## What was tricky
79+
80+
- **Field-name alignment between Rust MapBuilder and Arrow Java's MapVector.**
81+
Arrow's Java `MapVector.KEY_NAME` / `VALUE_NAME` are `"key"` / `"value"`
82+
(singular). My initial Rust code used `"keys"` / `"values"` (plural), which
83+
serialized fine through the wire but turned `MapVector.getObject(i)` into a
84+
generic `JsonStringArrayList` of `Map<String, Object>` entries instead of a
85+
proper key-value map. Aligning the Rust `MapFieldNames` to singular fixed
86+
the round-trip.
87+
- **`map_extract` returns a list.** DataFusion's `map_extract(map, key)` returns
88+
`List<value>` because map semantics permit duplicate keys. PPL's
89+
`result.user.name` syntax assumes a scalar, so the ITEM adapter has to wrap
90+
the `map_extract` call in `array_element(…, 1)` to project the list down to
91+
a scalar.
92+
- **CHAR(N) literals don't unify with `any1` substrait type variables.** When
93+
the YAML signature uses `map<any1, any2>, any1` for the key-typed slot, a
94+
Calcite-emitted `'a.b'` CHAR(3) literal doesn't satisfy the type-variable
95+
binding. The adapter has to insert an explicit `CAST(key AS VARCHAR)`
96+
before the `map_extract` call.
97+
- **`MAP` had to be added to `FieldType` before `OpenSearchFilterRule` would
98+
even reach the EQUALS lookup.** Once added, `EQUALS on FieldType.MAP` still
99+
needed its own capability registration — the filter rule keys on the
100+
field's declared type, not the predicate's actual operand types, so without
101+
the explicit MAP entry the registry returned no viable backends.
102+
- **Two competing `:run` invocations stomp on the same data dir.** Repeated
103+
restart attempts during debugging accumulated multiple OpenSearch JVMs all
104+
pointing at `build/testclusters/runTask-0/data/...`, producing
105+
`Failed to initialize Parquet writer: No such file or directory` errors
106+
even though the data was being written. Killing all gradle wrappers + JVMs
107+
and starting one clean cluster cleared it. (Future task-runners should
108+
guard against this — gradle's testCluster bind to the same data dir
109+
regardless of which gradle wrapper started it.)
110+
111+
## Reproduction artefacts
112+
113+
Worktree:
114+
`/Users/ahkcs/IdeaProjects/sql/.claude/worktrees/spath-analytics-route`
115+
(branch `feature/spath-analytics-route`, based off
116+
`feature/ppl-coverage-bundle`).
117+
118+
Run output (analytics-engine route):
119+
`integ-test/build/test-results/integTestRemote/TEST-org.opensearch.sql.calcite.remote.CalcitePPLSpathCommandIT.xml`
120+
121+
Cluster log:
122+
`OpenSearch/build/testclusters/runTask-0/logs/runTask.log`
123+
124+
## Files to read once before sending the PR
125+
126+
The analytics-engine side is the bulk of the change. Open these in order to
127+
review the boundary contracts:
128+
129+
- `OpenSearch/sandbox/plugins/analytics-backend-datafusion/rust/src/udf/json_extract_all.rs`
130+
- `OpenSearch/sandbox/plugins/analytics-backend-datafusion/src/main/resources/opensearch_scalar_functions.yaml` (the `map_extract` and `json_extract_all` entries)
131+
- `OpenSearch/sandbox/plugins/analytics-backend-datafusion/src/main/java/org/opensearch/be/datafusion/ArrayElementAdapter.java` (the MAP branch and CAST)
132+
- `OpenSearch/sandbox/plugins/analytics-backend-datafusion/src/main/java/org/opensearch/be/datafusion/DataFusionAnalyticsBackendPlugin.java` (the `MAP_RETURNING_PROJECT_OPS` set and the MAP filter capability registration)
133+
- `OpenSearch/sandbox/plugins/analytics-engine/src/main/java/org/opensearch/analytics/exec/ArrowValues.java` (MapVector flattening)
134+
- `sql/core/src/main/java/org/opensearch/sql/data/model/ExprValueUtils.java` (Arrow Text marshaller)
135+
- `sql/integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLSpathCommandIT.java` (parquet-backed index init)
136+
137+
## PR shape
138+
139+
For `opensearch-project/OpenSearch`, the natural commit-order is:
140+
141+
1. Add `FieldType.MAP` + `ScalarFunction.JSON_EXTRACT_ALL` enum constants
142+
(framework SPI, no behavior change on its own).
143+
2. Add the `json_extract_all` Rust UDF (drop-in, isolated module).
144+
3. Register the Rust UDF in `udf/mod.rs` and the substrait extension YAML.
145+
4. Wire `JsonExtractAllAdapter` + `MAP_RETURNING_PROJECT_OPS` +
146+
`STANDARD_FILTER_OPS×FieldType.MAP` in the backend plugin.
147+
5. Add the `map_extract`/`array_element` ITEM-on-MAP dispatch in
148+
`ArrayElementAdapter` + YAML entry.
149+
6. `ArrowValues.MapVector` flattening.
150+
7. `gradle/run.gradle``transport.stream.enabled` flag.
151+
152+
For `opensearch-project/sql`, a single PR:
153+
154+
- `ExprValueUtils` — Arrow Text marshaller branch.
155+
- `CalcitePPLSpathCommandIT` — parquet-backed test indices.
156+
157+
## Reference fixture
158+
159+
The reference PR shape for a similar map-returning-function workstream
160+
mirrors this status doc. The fillnull worked example
161+
([`opensearch-project/OpenSearch#21472`](https://github.com/opensearch-project/OpenSearch/pull/21472))
162+
is a strict Bucket-1 / S0 (one-line capability addition); spath is a
163+
mixed bucket that closes the missing infrastructure for any future
164+
map-returning PPL function at once.

integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLSpathCommandIT.java

Lines changed: 68 additions & 32 deletions
Original file line numberDiff line numberDiff line change
@@ -15,9 +15,29 @@
1515
import org.json.JSONObject;
1616
import org.junit.jupiter.api.Test;
1717
import org.opensearch.client.Request;
18+
import org.opensearch.sql.legacy.TestUtils;
1819
import org.opensearch.sql.ppl.PPLIntegTestCase;
1920

2021
public class CalcitePPLSpathCommandIT extends PPLIntegTestCase {
22+
// Pin the doc fields as keyword so the analytics-engine compatibility run
23+
// (tests.analytics.parquet_indices=true) provisions each test index as a
24+
// parquet-backed composite store with parquet-friendly types. Without an
25+
// explicit createIndex, the dynamic-mapping route auto-creates Lucene-only
26+
// indices that DataFusion cannot acquireReader on, so the analytics-engine
27+
// fragment driver fails with `UnsupportedOperationException: acquireReader
28+
// is not supported in EngineBackedIndexer` before any spath function fires.
29+
private static final String SIMPLE_DOC_MAPPING =
30+
"{\"mappings\":{\"properties\":{\"doc\":{\"type\":\"keyword\"}}}}";
31+
32+
private static final String AUTO_DOC_MAPPING =
33+
"{\"mappings\":{\"properties\":{"
34+
+ "\"nested_doc\":{\"type\":\"keyword\"},"
35+
+ "\"array_doc\":{\"type\":\"keyword\"},"
36+
+ "\"merge_doc\":{\"type\":\"keyword\"},"
37+
+ "\"stringify_doc\":{\"type\":\"keyword\"},"
38+
+ "\"empty_doc\":{\"type\":\"keyword\"},"
39+
+ "\"malformed_doc\":{\"type\":\"keyword\"}}}}";
40+
2141
@Override
2242
public void init() throws Exception {
2343
super.init();
@@ -26,48 +46,64 @@ public void init() throws Exception {
2646
loadIndex(Index.BANK);
2747

2848
// Simple JSON docs for path-based extraction
29-
Request request1 = new Request("PUT", "/test_spath/_doc/1?refresh=true");
30-
request1.setJsonEntity("{\"doc\": \"{\\\"n\\\": 1}\"}");
31-
client().performRequest(request1);
49+
if (!TestUtils.isIndexExist(client(), "test_spath")) {
50+
TestUtils.createIndexByRestClient(client(), "test_spath", SIMPLE_DOC_MAPPING);
51+
52+
Request request1 = new Request("PUT", "/test_spath/_doc/1?refresh=true");
53+
request1.setJsonEntity("{\"doc\": \"{\\\"n\\\": 1}\"}");
54+
client().performRequest(request1);
3255

33-
Request request2 = new Request("PUT", "/test_spath/_doc/2?refresh=true");
34-
request2.setJsonEntity("{\"doc\": \"{\\\"n\\\": 2}\"}");
35-
client().performRequest(request2);
56+
Request request2 = new Request("PUT", "/test_spath/_doc/2?refresh=true");
57+
request2.setJsonEntity("{\"doc\": \"{\\\"n\\\": 2}\"}");
58+
client().performRequest(request2);
3659

37-
Request request3 = new Request("PUT", "/test_spath/_doc/3?refresh=true");
38-
request3.setJsonEntity("{\"doc\": \"{\\\"n\\\": 3}\"}");
39-
client().performRequest(request3);
60+
Request request3 = new Request("PUT", "/test_spath/_doc/3?refresh=true");
61+
request3.setJsonEntity("{\"doc\": \"{\\\"n\\\": 3}\"}");
62+
client().performRequest(request3);
63+
}
4064

4165
// Auto-extract mode: flatten rules and edge cases (empty, malformed)
42-
Request autoExtractDoc = new Request("PUT", "/test_spath_auto/_doc/1?refresh=true");
43-
autoExtractDoc.setJsonEntity(
44-
"{\"nested_doc\": \"{\\\"user\\\":{\\\"name\\\":\\\"John\\\"}}\","
45-
+ " \"array_doc\": \"{\\\"tags\\\":[\\\"java\\\",\\\"sql\\\"]}\","
46-
+ " \"merge_doc\": \"{\\\"a\\\":{\\\"b\\\":1},\\\"a.b\\\":2}\","
47-
+ " \"stringify_doc\": \"{\\\"n\\\":30,\\\"b\\\":true,\\\"x\\\":null}\","
48-
+ " \"empty_doc\": \"{}\","
49-
+ " \"malformed_doc\": \"{\\\"user\\\":{\\\"name\\\":\"}");
50-
client().performRequest(autoExtractDoc);
66+
if (!TestUtils.isIndexExist(client(), "test_spath_auto")) {
67+
TestUtils.createIndexByRestClient(client(), "test_spath_auto", AUTO_DOC_MAPPING);
68+
69+
Request autoExtractDoc = new Request("PUT", "/test_spath_auto/_doc/1?refresh=true");
70+
autoExtractDoc.setJsonEntity(
71+
"{\"nested_doc\": \"{\\\"user\\\":{\\\"name\\\":\\\"John\\\"}}\","
72+
+ " \"array_doc\": \"{\\\"tags\\\":[\\\"java\\\",\\\"sql\\\"]}\","
73+
+ " \"merge_doc\": \"{\\\"a\\\":{\\\"b\\\":1},\\\"a.b\\\":2}\","
74+
+ " \"stringify_doc\": \"{\\\"n\\\":30,\\\"b\\\":true,\\\"x\\\":null}\","
75+
+ " \"empty_doc\": \"{}\","
76+
+ " \"malformed_doc\": \"{\\\"user\\\":{\\\"name\\\":\"}");
77+
client().performRequest(autoExtractDoc);
78+
}
5179

5280
// Auto-extract mode: 2-doc index for spath + command (eval/where/stats/sort) tests
53-
Request cmdDoc1 = new Request("PUT", "/test_spath_cmd/_doc/1?refresh=true");
54-
cmdDoc1.setJsonEntity(
55-
"{\"doc\": \"{\\\"user\\\":{\\\"name\\\":\\\"John\\\",\\\"age\\\":30}}\"}");
56-
client().performRequest(cmdDoc1);
81+
if (!TestUtils.isIndexExist(client(), "test_spath_cmd")) {
82+
TestUtils.createIndexByRestClient(client(), "test_spath_cmd", SIMPLE_DOC_MAPPING);
5783

58-
Request cmdDoc2 = new Request("PUT", "/test_spath_cmd/_doc/2?refresh=true");
59-
cmdDoc2.setJsonEntity(
60-
"{\"doc\": \"{\\\"user\\\":{\\\"name\\\":\\\"Alice\\\",\\\"age\\\":25}}\"}");
61-
client().performRequest(cmdDoc2);
84+
Request cmdDoc1 = new Request("PUT", "/test_spath_cmd/_doc/1?refresh=true");
85+
cmdDoc1.setJsonEntity(
86+
"{\"doc\": \"{\\\"user\\\":{\\\"name\\\":\\\"John\\\",\\\"age\\\":30}}\"}");
87+
client().performRequest(cmdDoc1);
88+
89+
Request cmdDoc2 = new Request("PUT", "/test_spath_cmd/_doc/2?refresh=true");
90+
cmdDoc2.setJsonEntity(
91+
"{\"doc\": \"{\\\"user\\\":{\\\"name\\\":\\\"Alice\\\",\\\"age\\\":25}}\"}");
92+
client().performRequest(cmdDoc2);
93+
}
6294

6395
// Auto-extract mode: null input handling (doc 1 establishes mapping, doc 2 has null)
64-
Request nullDoc1 = new Request("PUT", "/test_spath_null/_doc/1?refresh=true");
65-
nullDoc1.setJsonEntity("{\"doc\": \"{\\\"n\\\": 1}\"}");
66-
client().performRequest(nullDoc1);
96+
if (!TestUtils.isIndexExist(client(), "test_spath_null")) {
97+
TestUtils.createIndexByRestClient(client(), "test_spath_null", SIMPLE_DOC_MAPPING);
98+
99+
Request nullDoc1 = new Request("PUT", "/test_spath_null/_doc/1?refresh=true");
100+
nullDoc1.setJsonEntity("{\"doc\": \"{\\\"n\\\": 1}\"}");
101+
client().performRequest(nullDoc1);
67102

68-
Request nullDoc2 = new Request("PUT", "/test_spath_null/_doc/2?refresh=true");
69-
nullDoc2.setJsonEntity("{\"doc\": null}");
70-
client().performRequest(nullDoc2);
103+
Request nullDoc2 = new Request("PUT", "/test_spath_null/_doc/2?refresh=true");
104+
nullDoc2.setJsonEntity("{\"doc\": null}");
105+
client().performRequest(nullDoc2);
106+
}
71107
}
72108

73109
@Test

0 commit comments

Comments
 (0)