Skip to content

Commit dc23a73

Browse files
committed
Convert byte[] cells to ExprValue in fromObjectValue for IP / BINARY fields
`AnalyticsExecutionEngine.convertRows` reaches `ExprValueUtils.fromObjectValue` for every cell in every row. OpenSearch `ip` and `binary` fields are stored as raw bytes, so once the analytics-engine reader (post-#21681 commit 4) materializes a parquet column, each cell arrives as a Java `byte[]`. The fall-through catch-all then throws: ExpressionEvaluationException: unsupported object class [B at ExprValueUtils.fromObjectValue(ExprValueUtils.java:168) at AnalyticsExecutionEngine.convertRows(AnalyticsExecutionEngine.java:128) Every PPL query that even projects an ip column fails on the analytics route with this error, regardless of operator or predicate. `OpenSearchSchemaBuilder.mapFieldType` collapses both `ip` and `binary` into Calcite `VARBINARY`, so the cell-level type system can't distinguish the two. Length is the only signal available — IPv4 is exactly 4 bytes, IPv6 is exactly 16, and `InetAddress.getByAddress` rejects every other length, so the heuristic is unambiguous in practice: - 4 or 16 bytes → ExprIpValue (canonical IP string) - any other length → base64-encoded ExprStringValue (binary payload) `InetAddress.getByAddress` may throw `UnknownHostException` only when the length is not 4 or 16; we already gate on that, but the catch is kept as a defensive fall-through to base64 for forward compatibility (e.g. future RFC 6052-style 4or16-byte alternates). Future work: differentiate `ip` and `binary` at the schema layer (separate UDTs in `OpenSearchSchemaBuilder` + matching `schema_coerce.rs` handling) so the type system can route directly without the length heuristic. Tracked by the TODO already in `OpenSearchSchemaBuilder.mapFieldType`. Adds three unit tests in `ExprValueUtilsTest` covering IPv4, IPv6, and non-IP byte arrays. Signed-off-by: Kai Huang <ahkcs@amazon.com>
1 parent a224f19 commit dc23a73

2 files changed

Lines changed: 44 additions & 0 deletions

File tree

core/src/main/java/org/opensearch/sql/data/model/ExprValueUtils.java

Lines changed: 17 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -164,6 +164,23 @@ public static ExprValue fromObjectValue(Object o) {
164164
return timestampValue(((LocalDateTime) o).toInstant(ZoneOffset.UTC));
165165
} else if (o instanceof TemporalAmount) {
166166
return intervalValue((TemporalAmount) o);
167+
} else if (o instanceof byte[] bytes) {
168+
// OpenSearch `ip` and `binary` fields both map to Calcite VARBINARY in
169+
// `OpenSearchSchemaBuilder`, so by the time the cell reaches this factory the
170+
// type system can't distinguish the two. Length is the only signal — IPv4 is
171+
// 4 bytes, IPv6 is 16, and `InetAddress.getByAddress` rejects every other
172+
// length, so we can safely try it first and fall through to base64 for
173+
// genuine `binary` payloads.
174+
// TODO: differentiate `ip` and `binary` upstream (separate UDTs in
175+
// `OpenSearchSchemaBuilder` + `schema_coerce.rs`) and route by type here.
176+
if (bytes.length == 4 || bytes.length == 16) {
177+
try {
178+
return new ExprIpValue(java.net.InetAddress.getByAddress(bytes).getHostAddress());
179+
} catch (java.net.UnknownHostException ignored) {
180+
// Fall through to base64.
181+
}
182+
}
183+
return stringValue(java.util.Base64.getEncoder().encodeToString(bytes));
167184
} else {
168185
throw new ExpressionEvaluationException("unsupported object " + o.getClass());
169186
}

core/src/test/java/org/opensearch/sql/data/model/ExprValueUtilsTest.java

Lines changed: 27 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -244,6 +244,33 @@ public void constructDateAndTimeValue() {
244244
ExprValueUtils.fromObjectValue("2012-07-07 01:01:01", TIMESTAMP));
245245
}
246246

247+
@Test
248+
public void fromObjectValue_ipv4_byte_array_returns_ip_value() {
249+
// 1.2.3.4
250+
byte[] ipv4 = new byte[] {1, 2, 3, 4};
251+
ExprValue v = ExprValueUtils.fromObjectValue(ipv4);
252+
assertTrue(v instanceof ExprIpValue);
253+
assertEquals("1.2.3.4", v.value());
254+
}
255+
256+
@Test
257+
public void fromObjectValue_ipv6_byte_array_returns_ip_value() {
258+
// ::1 (loopback)
259+
byte[] ipv6 = new byte[16];
260+
ipv6[15] = 1;
261+
ExprValue v = ExprValueUtils.fromObjectValue(ipv6);
262+
assertTrue(v instanceof ExprIpValue);
263+
assertEquals("::1", v.value());
264+
}
265+
266+
@Test
267+
public void fromObjectValue_non_ip_byte_array_returns_base64_string() {
268+
// 5-byte payload — not a valid IP length, falls through to base64.
269+
byte[] binary = new byte[] {1, 2, 3, 4, 5};
270+
ExprValue v = ExprValueUtils.fromObjectValue(binary);
271+
assertEquals(new ExprStringValue("AQIDBAU="), v);
272+
}
273+
247274
@Test
248275
public void fromObjectValue_double_infinity_returns_null() {
249276
assertTrue(ExprValueUtils.fromObjectValue(Double.POSITIVE_INFINITY).isNull());

0 commit comments

Comments
 (0)