
Commit 0bb712b

feat(ffi): add literal expression support (#7675)
## Summary

Add support for literal expressions in `vortex-ffi`.

Motivating example: callers need constants in expression trees to create scan predicates that can be pushed down, for example:

```sql
age >= 67
price >= decimal(1512.0)
```

Introduce `vx_scalar`, an opaque FFI handle that exposes a typed Vortex scalar (`DType` + optional `ScalarValue`), and `vx_expression_literal`, which creates literal expression nodes from a scalar handle.

## API Changes

### Scalar:

`vx_scalar_new_*` creates a Rust `Scalar` and returns it as an owned opaque boxed handle (`vx_scalar *`). APIs that take `const vx_scalar *` borrow the handle; `vx_expression_literal` clones the underlying scalar into the expression, so the caller can free the original handle after construction.

```c
vx_scalar *vx_scalar_new_u8(uint8_t value, bool is_nullable);
vx_scalar *vx_scalar_new_utf8(const char *ptr, size_t len, bool is_nullable, vx_error **err);
vx_scalar *vx_scalar_new_null(const vx_dtype *dtype, vx_error **err);
const vx_dtype *vx_scalar_dtype(const vx_scalar *scalar);
bool vx_scalar_is_null(const vx_scalar *scalar);
vx_scalar *vx_scalar_clone(const vx_scalar *scalar);
void vx_scalar_free(vx_scalar *scalar);
```

### Literal expression:

`vx_expression_literal` clones the scalar into the expression, so the original scalar can be freed immediately after expression creation:

```c
vx_expression *root = vx_expression_root();
vx_expression *age = vx_expression_get_item("age", root);
vx_scalar *threshold_scalar = vx_scalar_new_u8(67, false);
vx_expression *threshold = vx_expression_literal(threshold_scalar, &error);
vx_scalar_free(threshold_scalar);
vx_expression *filter = vx_expression_binary(VX_OPERATOR_GTE, age, threshold);
vx_scan_options options = {};
options.filter = filter;
```

## Testing

Verified that the new behavior and functionality work correctly.

AI disclosure: Claude was used to add tests.

---------

Signed-off-by: Dergousov Maksim <dergousovmaxim99@gmail.com>
1 parent 6fa3cae commit 0bb712b

6 files changed

Lines changed: 1328 additions & 6 deletions

vortex-ffi/cinclude/vortex.h

Lines changed: 275 additions & 0 deletions
@@ -489,6 +489,16 @@ typedef struct vx_file vx_file;
  */
 typedef struct vx_partition vx_partition;
 
+/**
+ * A typed scalar value.
+ *
+ * A `vx_scalar` represents a single value with an associated `DType`.
+ * Its value is either null or a `ScalarValue`. Null values are allowed only
+ * when the associated `DType` allows nulls. Non-null values are represented
+ * by `ScalarValue` and interpreted using the `DType`.
+ */
+typedef struct vx_scalar vx_scalar;
+
 /**
  * A scan is a single traversal of a data source with projections and
  * filters. A scan can be consumed only once.
@@ -1105,6 +1115,40 @@ void vx_expression_free(vx_expression *ptr);
  */
 vx_expression *vx_expression_root(void);
 
+/**
+ * Create a literal expression from a scalar.
+ *
+ * Literal expressions are useful for constants in expression trees, especially scan
+ * predicates. For example, a caller can compare a column expression to a scalar
+ * threshold and pass the resulting predicate to `vx_data_source_scan`.
+ *
+ * Example:
+ *
+ * vx_error* error = NULL;
+ * const vx_data_source* data_source = ...;
+ *
+ * vx_expression* root = vx_expression_root();
+ * vx_expression* age = vx_expression_get_item("age", root);
+ *
+ * vx_scalar* threshold_scalar = vx_scalar_new_u8(50, false);
+ * vx_expression* threshold = vx_expression_literal(threshold_scalar, &error);
+ * vx_scalar_free(threshold_scalar);
+ *
+ * vx_expression* predicate = vx_expression_binary(VX_OPERATOR_GTE, age, threshold);
+ * vx_scan_options options = {};
+ * options.filter = predicate;
+ *
+ * vx_scan* scan = vx_data_source_scan(data_source, &options, NULL, &error);
+ *
+ * vx_scan_free(scan);
+ * vx_expression_free(predicate);
+ * vx_expression_free(threshold);
+ * vx_expression_free(age);
+ * vx_expression_free(root);
+ *
+ */
+vx_expression *vx_expression_literal(const vx_scalar *scalar, vx_error **err);
+
 /**
  * Create an expression that selects (includes) specific fields from a child
  * expression. Child expression must have a DTYPE_STRUCT dtype. Errors in
@@ -1252,6 +1296,237 @@ vx_array_iterator *vx_file_scan(const vx_session *session,
  */
 void vx_set_log_level(vx_log_level level);
 
+/**
+ * Free an owned [`vx_scalar`] object.
+ */
+void vx_scalar_free(vx_scalar *ptr);
+
+/**
+ * Clone a borrowed scalar handle.
+ *
+ * The input scalar handle is not consumed. The returned scalar handle must be
+ * released with vx_scalar_free. Returns NULL when given a NULL scalar handle.
+ */
+vx_scalar *vx_scalar_clone(const vx_scalar *scalar);
+
+/**
+ * Return the data type of a scalar.
+ *
+ * The returned data type handle borrows storage from the scalar handle, so its
+ * lifetime is bound to the scalar handle. It MUST NOT be freed separately.
+ * Returns NULL when given a NULL scalar handle.
+ */
+const vx_dtype *vx_scalar_dtype(const vx_scalar *scalar);
+
+/**
+ * Return whether the scalar is a typed null value.
+ *
+ * Returns false when given a NULL scalar handle.
+ */
+bool vx_scalar_is_null(const vx_scalar *scalar);
+
+/**
+ * Create a boolean scalar.
+ */
+vx_scalar *vx_scalar_new_bool(bool value, bool is_nullable);
+
+/**
+ * Create an unsigned 8-bit integer scalar.
+ */
+vx_scalar *vx_scalar_new_u8(uint8_t value, bool is_nullable);
+
+/**
+ * Create an unsigned 16-bit integer scalar.
+ */
+vx_scalar *vx_scalar_new_u16(uint16_t value, bool is_nullable);
+
+/**
+ * Create an unsigned 32-bit integer scalar.
+ */
+vx_scalar *vx_scalar_new_u32(uint32_t value, bool is_nullable);
+
+/**
+ * Create an unsigned 64-bit integer scalar.
+ */
+vx_scalar *vx_scalar_new_u64(uint64_t value, bool is_nullable);
+
+/**
+ * Create a signed 8-bit integer scalar.
+ */
+vx_scalar *vx_scalar_new_i8(int8_t value, bool is_nullable);
+
+/**
+ * Create a signed 16-bit integer scalar.
+ */
+vx_scalar *vx_scalar_new_i16(int16_t value, bool is_nullable);
+
+/**
+ * Create a signed 32-bit integer scalar.
+ */
+vx_scalar *vx_scalar_new_i32(int32_t value, bool is_nullable);
+
+/**
+ * Create a signed 64-bit integer scalar.
+ */
+vx_scalar *vx_scalar_new_i64(int64_t value, bool is_nullable);
+
+/**
+ * Create a 32-bit floating point scalar.
+ */
+vx_scalar *vx_scalar_new_f32(float value, bool is_nullable);
+
+/**
+ * Create a 64-bit floating point scalar.
+ */
+vx_scalar *vx_scalar_new_f64(double value, bool is_nullable);
+
+/**
+ * Create a 16-bit floating point scalar.
+ *
+ * The value is read from raw half-precision bits because C has no portable
+ * half-precision floating point ABI.
+ */
+vx_scalar *vx_scalar_new_f16_bits(uint16_t bits, bool is_nullable);
+
+/**
+ * Create a UTF-8 scalar.
+ *
+ * The byte range is copied into the scalar. A NULL data pointer is allowed only
+ * for an empty byte range. Invalid UTF-8 returns NULL and writes the error
+ * output.
+ */
+vx_scalar *vx_scalar_new_utf8(const char *ptr, size_t len, bool is_nullable, vx_error **err);
+
+/**
+ * Create a binary scalar.
+ *
+ * The byte range is copied into the scalar. A NULL data pointer is allowed only
+ * for an empty byte range. Passing a NULL data pointer for a non-empty byte
+ * range returns NULL and writes the error output.
+ */
+vx_scalar *vx_scalar_new_binary(const uint8_t *ptr, size_t len, bool is_nullable, vx_error **err);
+
+/**
+ * Create a typed null scalar.
+ *
+ * The data type handle is borrowed, not consumed. The returned scalar uses a
+ * nullable copy of that logical type, regardless of the input type's top-level
+ * nullability. A NULL data type handle returns NULL and writes the error output.
+ */
+vx_scalar *vx_scalar_new_null(const vx_dtype *dtype, vx_error **err);
+
+/**
+ * Create a decimal scalar.
+ *
+ * The unscaled value is provided as a signed 8-bit integer. Decimal precision
+ * and scale define the logical decimal type. Invalid decimal metadata or value
+ * overflow returns NULL and writes the error output.
+ */
+vx_scalar *
+vx_scalar_new_decimal_i8(int8_t value, uint8_t precision, int8_t scale, bool is_nullable, vx_error **err);
+
+/**
+ * Create a decimal scalar.
+ *
+ * The unscaled value is provided as a signed 16-bit integer. Decimal precision
+ * and scale define the logical decimal type. Invalid decimal metadata or value
+ * overflow returns NULL and writes the error output.
+ */
+vx_scalar *
+vx_scalar_new_decimal_i16(int16_t value, uint8_t precision, int8_t scale, bool is_nullable, vx_error **err);
+
+/**
+ * Create a decimal scalar.
+ *
+ * The unscaled value is provided as a signed 32-bit integer. Decimal precision
+ * and scale define the logical decimal type. Invalid decimal metadata or value
+ * overflow returns NULL and writes the error output.
+ */
+vx_scalar *
+vx_scalar_new_decimal_i32(int32_t value, uint8_t precision, int8_t scale, bool is_nullable, vx_error **err);
+
+/**
+ * Create a decimal scalar.
+ *
+ * The unscaled value is provided as a signed 64-bit integer. Decimal precision
+ * and scale define the logical decimal type. Invalid decimal metadata or value
+ * overflow returns NULL and writes the error output.
+ */
+vx_scalar *
+vx_scalar_new_decimal_i64(int64_t value, uint8_t precision, int8_t scale, bool is_nullable, vx_error **err);
+
+/**
+ * Create a decimal scalar.
+ *
+ * The unscaled value is read from a 16-byte little-endian signed integer
+ * buffer. Decimal precision and scale define the logical decimal type.
+ * Invalid decimal metadata or value overflow returns NULL and writes the error
+ * output.
+ */
+vx_scalar *vx_scalar_new_decimal_i128_le(const uint8_t *bytes16,
+                                         uint8_t precision,
+                                         int8_t scale,
+                                         bool is_nullable,
+                                         vx_error **err);
+
+/**
+ * Create a decimal scalar.
+ *
+ * The unscaled value is read from a 32-byte little-endian signed integer
+ * buffer. Decimal precision and scale define the logical decimal type.
+ * Invalid decimal metadata or value overflow returns NULL and writes the error
+ * output.
+ */
+vx_scalar *vx_scalar_new_decimal_i256_le(const uint8_t *bytes32,
+                                         uint8_t precision,
+                                         int8_t scale,
+                                         bool is_nullable,
+                                         vx_error **err);
+
+/**
+ * Create a list scalar.
+ *
+ * The element data type handle is borrowed, not consumed. Child scalar handles
+ * are cloned into the list value, so the caller keeps ownership of the handle
+ * array and each scalar in it. A NULL child handle array is allowed only for an
+ * empty list. Child values are validated against the element logical type.
+ */
+vx_scalar *vx_scalar_new_list(const vx_dtype *element_dtype,
+                              const vx_scalar *const *elements,
+                              size_t len,
+                              bool is_nullable,
+                              vx_error **err);
+
+/**
+ * Create a fixed-size list scalar.
+ *
+ * The element data type handle is borrowed, not consumed. The number of child
+ * scalars becomes the fixed-size list width and must fit in a 32-bit unsigned
+ * integer. Child scalar handles are cloned into the list value, so the caller
+ * keeps ownership of the handle array and each scalar in it. A NULL child
+ * handle array is allowed only for an empty list. Child values are validated
+ * against the element logical type.
+ */
+vx_scalar *vx_scalar_new_fixed_size_list(const vx_dtype *element_dtype,
+                                         const vx_scalar *const *elements,
+                                         size_t len,
+                                         bool is_nullable,
+                                         vx_error **err);
+
+/**
+ * Create a struct scalar.
+ *
+ * The struct data type handle is borrowed, not consumed. Field scalar handles
+ * are cloned into the struct value, so the caller keeps ownership of the handle
+ * array and each scalar in it. Field count and field logical types are validated
+ * against the struct logical type. A NULL field handle array is allowed only for
+ * an empty struct value.
+ */
+vx_scalar *vx_scalar_new_struct(const vx_dtype *struct_dtype,
+                                const vx_scalar *const *fields,
+                                size_t len,
+                                vx_error **err);
+
 /**
  * Free an owned [`vx_scan`] object.
  */
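
`vx_scalar_new_decimal_i128_le` takes the unscaled value as a 16-byte little-endian signed integer buffer, which a C caller has to build by hand when the compiler has no 128-bit integer type. The sketch below shows one way to widen a 64-bit unscaled value into such a buffer. The helper `i64_to_i128_le` is hypothetical and not part of `vortex.h`. The sketch also assumes the usual decimal convention that the logical value equals the unscaled value times 10 to the power of minus the scale, so `1512.0` at scale 1 would have the unscaled value 15120.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of vortex.h): widen a signed 64-bit value to a
 * 16-byte little-endian two's-complement buffer. */
static void i64_to_i128_le(int64_t value, uint8_t out[16]) {
    uint64_t bits = (uint64_t)value;
    /* Low 8 bytes: the value itself, least significant byte first. */
    for (int i = 0; i < 8; i++) {
        out[i] = (uint8_t)(bits >> (8 * i));
    }
    /* High 8 bytes: sign extension (all ones for negative values). */
    memset(out + 8, value < 0 ? 0xFF : 0x00, 8);
}

/* Small self-check of the encoding. */
static void self_check(void) {
    uint8_t buf[16];
    i64_to_i128_le(15120, buf); /* 15120 == 0x3B10 */
    assert(buf[0] == 0x10 && buf[1] == 0x3B && buf[2] == 0x00 && buf[15] == 0x00);
    i64_to_i128_le(-1, buf);
    assert(buf[0] == 0xFF && buf[15] == 0xFF);
}

/* Intended use with the declaration from the diff (precision 5 and scale 1
 * are illustrative values only):
 *
 *   uint8_t bytes16[16];
 *   i64_to_i128_le(15120, bytes16);
 *   vx_scalar *price = vx_scalar_new_decimal_i128_le(bytes16, 5, 1, false, &error);
 */
```

Writing the bytes with explicit shifts keeps the result independent of the host byte order, which a plain `memcpy` of the integer would not.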

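`vx_scalar_new_f16_bits` accepts raw half-precision bits, so a caller holding a `float` has to produce those bits first. The following is a minimal sketch of such a conversion, assuming `float` is IEEE 754 binary32 and rounding to nearest with ties to even. The helper `f32_to_f16_bits` is hypothetical and not part of `vortex.h`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of vortex.h): convert a binary32 float to
 * IEEE 754 binary16 bits, rounding to nearest, ties to even. */
static uint16_t f32_to_f16_bits(float value) {
    uint32_t x;
    memcpy(&x, &value, sizeof x); /* reinterpret the bits without aliasing issues */

    uint32_t sign = (x >> 16) & 0x8000u;
    uint32_t exp = (x >> 23) & 0xFFu;
    uint32_t mant = x & 0x7FFFFFu;

    if (exp == 0xFFu) { /* infinity or NaN (NaN keeps a non-zero mantissa) */
        return (uint16_t)(sign | 0x7C00u | (mant ? 0x0200u : 0u));
    }

    int32_t e = (int32_t)exp - 127 + 15; /* re-bias the exponent for binary16 */
    if (e >= 31) {
        return (uint16_t)(sign | 0x7C00u); /* too large: infinity */
    }
    if (e <= 0) { /* result is subnormal or zero in binary16 */
        if (e < -10) {
            return (uint16_t)sign; /* too small: signed zero */
        }
        mant |= 0x800000u; /* make the implicit leading 1 explicit */
        uint32_t shift = (uint32_t)(14 - e);
        uint32_t half = mant >> shift;
        uint32_t rem = mant & ((1u << shift) - 1u);
        uint32_t halfway = 1u << (shift - 1u);
        if (rem > halfway || (rem == halfway && (half & 1u))) {
            half++;
        }
        return (uint16_t)(sign | half);
    }

    uint32_t half = ((uint32_t)e << 10) | (mant >> 13);
    uint32_t rem = mant & 0x1FFFu;
    if (rem > 0x1000u || (rem == 0x1000u && (half & 1u))) {
        half++; /* a carry into the exponent field is the correct rounding */
    }
    return (uint16_t)(sign | half);
}

/* Small self-check against well-known encodings. */
static void self_check(void) {
    assert(f32_to_f16_bits(1.0f) == 0x3C00);
    assert(f32_to_f16_bits(1.5f) == 0x3E00);
    assert(f32_to_f16_bits(-2.0f) == 0xC000);
}

/* Intended use with the declaration from the diff:
 *
 *   vx_scalar *s = vx_scalar_new_f16_bits(f32_to_f16_bits(1.5f), false);
 */
```

Compilers that provide a native half-precision type could skip the helper, but the bit-level route matches the portability concern stated in the header comment.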