Commit cdac62e

Convert byte[] cells to base64 ExprStringValue for IP / BINARY fields
`AnalyticsExecutionEngine.convertRows` reaches `ExprValueUtils.fromObjectValue` for every cell in every row. OpenSearch `ip` and `binary` fields are stored as raw bytes, so once the analytics-engine reader (paired with OpenSearch PR #21681 commit 4) materializes a parquet column, each cell arrives as a Java `byte[]`. The fall-through catch-all previously threw:

    ExpressionEvaluationException: unsupported object class [B
      at ExprValueUtils.fromObjectValue(ExprValueUtils.java:168)
      at AnalyticsExecutionEngine.convertRows(AnalyticsExecutionEngine.java:128)

Every PPL query that even projects an ip column fails on the analytics route with this error, regardless of operator or predicate.

`OpenSearchSchemaBuilder.mapFieldType` collapses both `ip` and `binary` into Calcite `VARBINARY`, so the cell-level type system can't distinguish the two. An earlier draft of this commit gated IP rendering on byte length (4 or 16 bytes → `ExprIpValue` via `InetAddress.getByAddress`), but that misinterprets any 4-byte ASCII payload (e.g. `"INFO"`, `"WARN"`, `"POST"`) as a public IPv4 address: the classic "Norway problem" applied to bytes. With no schema context here we can't safely guess, so default to a base64-encoded `ExprStringValue`. Typed call sites that DO know the column is `ip` should route through `ipValue(String)` explicitly.

Once `OpenSearchSchemaBuilder` splits `ip` and `binary` into separate UDTs (TODO already filed in that file), this branch can render IP cells in dotted-decimal form again without the length-guessing.

Signed-off-by: Kai Huang <ahkcs@amazon.com>
1 parent 4070c28 commit cdac62e

2 files changed

Lines changed: 22 additions & 0 deletions


core/src/main/java/org/opensearch/sql/data/model/ExprValueUtils.java

Lines changed: 10 additions & 0 deletions
@@ -164,6 +164,16 @@ public static ExprValue fromObjectValue(Object o) {
       return timestampValue(((LocalDateTime) o).toInstant(ZoneOffset.UTC));
     } else if (o instanceof TemporalAmount) {
       return intervalValue((TemporalAmount) o);
+    } else if (o instanceof byte[] bytes) {
+      // No type context here: OpenSearch `ip` and `binary` fields both collapse to
+      // VARBINARY in `OpenSearchSchemaBuilder`, so a 4- or 16-byte payload could be
+      // a genuine address or arbitrary binary that happens to be that length (4-byte
+      // ASCII strings like "INFO" or "WARN" are indistinguishable from a public IPv4
+      // address byte-for-byte). Default to a safe, unambiguous encoding. Typed call
+      // sites should route IP cells through {@link #ipValue(String)} explicitly.
+      // TODO: split `ip` and `binary` into separate UDTs upstream so this branch can
+      // render IP cells in dotted form again. See `OpenSearchSchemaBuilder.mapFieldType`.
+      return stringValue(java.util.Base64.getEncoder().encodeToString(bytes));
     } else {
       throw new ExpressionEvaluationException("unsupported object " + o.getClass());
     }
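
To make the ambiguity described in the comment above concrete, here is a minimal standalone sketch that compares the abandoned length-based heuristic with the base64 default. The class and method names (`ByteCellAmbiguityDemo`, `renderByLength`, `renderBase64`) are hypothetical and exist only for illustration; they are not part of the OpenSearch SQL codebase, and the sketch uses JDK classes only.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class ByteCellAmbiguityDemo {

  // Hypothetical reconstruction of the abandoned heuristic: treat any
  // 4- or 16-byte payload as an IP address, everything else as base64.
  static String renderByLength(byte[] cell) {
    if (cell.length == 4 || cell.length == 16) {
      try {
        // getByAddress does no DNS lookup; it only wraps the raw bytes.
        return InetAddress.getByAddress(cell).getHostAddress();
      } catch (UnknownHostException e) {
        // Only thrown for illegal lengths, which the guard above excludes.
        throw new IllegalStateException(e);
      }
    }
    return Base64.getEncoder().encodeToString(cell);
  }

  // Behavior adopted by this commit: always base64, no guessing.
  static String renderBase64(byte[] cell) {
    return Base64.getEncoder().encodeToString(cell);
  }

  public static void main(String[] args) {
    // A `binary` cell that happens to hold a 4-byte ASCII log level.
    byte[] logLevel = "INFO".getBytes(StandardCharsets.US_ASCII);
    System.out.println(renderByLength(logLevel)); // 73.78.70.79
    System.out.println(renderBase64(logLevel));   // SU5GTw==
  }
}
```

Under the length heuristic the ASCII text `INFO` renders as the address `73.78.70.79`, whereas the base64 form `SU5GTw==` cannot be mistaken for an IP address.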

core/src/test/java/org/opensearch/sql/data/model/ExprValueUtilsTest.java

Lines changed: 12 additions & 0 deletions
@@ -244,6 +244,18 @@ public void constructDateAndTimeValue() {
         ExprValueUtils.fromObjectValue("2012-07-07 01:01:01", TIMESTAMP));
   }
 
+  @Test
+  public void fromObjectValue_byte_array_returns_base64_string() {
+    // Without type context the factory can't distinguish `ip` from `binary`
+    // (both collapse to VARBINARY in OpenSearchSchemaBuilder), so every byte[]
+    // is base64-encoded. Typed routing happens at the call site.
+    assertEquals(
+        new ExprStringValue("AQIDBA=="), ExprValueUtils.fromObjectValue(new byte[] {1, 2, 3, 4}));
+    assertEquals(
+        new ExprStringValue("AQIDBAU="),
+        ExprValueUtils.fromObjectValue(new byte[] {1, 2, 3, 4, 5}));
+  }
+
   @Test
   public void fromObjectValue_double_infinity_returns_null() {
     assertTrue(ExprValueUtils.fromObjectValue(Double.POSITIVE_INFINITY).isNull());
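
The test comment says typed routing happens at the call site. As a rough sketch of what that might look like, the helper below turns the raw bytes of a cell that the caller already knows belongs to an `ip` column into textual form. `TypedIpCellSketch` and `toAddressText` are hypothetical names, and the sketch assumes that `ipValue(String)` accepts a textual address, which the commit does not spell out. Only JDK classes are used, so the call to `ipValue` itself is left as a comment.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

final class TypedIpCellSketch {

  // Hypothetical helper for a call site that knows the column type is `ip`.
  // Rejects payloads that are not 4 (IPv4) or 16 (IPv6) bytes long.
  static String toAddressText(byte[] cell) {
    try {
      // getByAddress performs no DNS lookup; it validates the length only.
      return InetAddress.getByAddress(cell).getHostAddress();
    } catch (UnknownHostException e) {
      throw new IllegalArgumentException(
          "expected 4 or 16 bytes, got " + cell.length, e);
    }
  }

  public static void main(String[] args) {
    byte[] cell = {10, 0, 0, 1};
    // Assumption: a typed call site would pass this text to ipValue(String).
    System.out.println(toAddressText(cell)); // 10.0.0.1
  }
}
```

The difference from the abandoned heuristic is where the decision is made: here the caller supplies the type knowledge, so a payload of the wrong length is an error instead of a silent fallback.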
