Skip to content

Commit 93366d8

Browse files
committed
Support Query Expressions
Closes gh-129
1 parent eddd3bf commit 93366d8

19 files changed

Lines changed: 1416 additions & 81 deletions

.github/workflows/pr-build-workflow.yml

Lines changed: 11 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -14,10 +14,17 @@ jobs:
1414
with:
1515
java-version: 1.11
1616
- name: Install Reindexer
17+
# TODO: Temporarily build reindexer from sources until https://github.com/Restream/reindexer/issues/92 is released.
18+
# run: |
19+
# sudo curl https://repo.reindexer.io/RX-KEY.GPG -o /etc/apt/trusted.gpg.d/reindexer.asc
20+
# echo 'deb https://repo.reindexer.io/ubuntu-noble /' | sudo tee -a /etc/apt/sources.list
21+
# sudo apt-get update
22+
# sudo apt-get install -y reindexer-dev libopenblas-pthread-dev
1723
run: |
18-
sudo curl https://repo.reindexer.io/RX-KEY.GPG -o /etc/apt/trusted.gpg.d/reindexer.asc
19-
echo 'deb https://repo.reindexer.io/ubuntu-noble /' | sudo tee -a /etc/apt/sources.list
20-
sudo apt-get update
21-
sudo apt-get install -y reindexer-dev libopenblas-pthread-dev
24+
curl -L https://github.com/Restream/reindexer/raw/master/dependencies.sh | sudo bash -s
25+
git clone https://github.com/Restream/reindexer
26+
cmake reindexer -Breindexer/build
27+
cmake --build reindexer/build -- -j 4
28+
sudo cmake --install reindexer/build
2229
- name: Build with Maven
2330
run: mvn --batch-mode --update-snapshots verify

README.md

Lines changed: 66 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -305,6 +305,72 @@ InnerJoins can be used as a condition in Where clause:
305305
```
306306
Note that usually Or operator implements short-circuiting for Where conditions: if the previous condition is true the next one is not evaluated. But in case of InnerJoin it works differently: in query1 (from the example above) both InnerJoin conditions are evaluated despite the result of WhereInt. Limit(0) as part of InnerJoin (query3 from the example above) does not join any data - it works like a filter only to verify conditions.
307307

308+
### Query Expressions
309+
310+
#### Functions
311+
Reindexer provides built-in functions that can be used within `WHERE` clauses to enable advanced filtering capabilities beyond simple field comparisons.
312+
313+
##### flat_array_len(field_name)
314+
The `flat_array_len` function returns the length or cardinality of a specified field, making it particularly useful for filtering based on array sizes or field presence.
315+
The `flat_array_len` function can be used in both `SELECT` and `UPDATE` queries.
316+
317+
Behavior by Field Type:
318+
- Array Fields: returns the number of elements in the array
319+
- Scalar Fields (integers, strings, etc.): always returns 1
320+
- Object Fields: always returns 1
321+
- Nested Array Elements: returns the count of occurrences when the field is nested within arrays
322+
323+
Examples:
324+
325+
```java
326+
// Find social media posts with between 10 and 50 comments
327+
// and at least 3 attached media files.
328+
List<Post> posts = db.query("posts", Post.class)
329+
.where(Expression.flatArrayLength("comments"), RANGE, Expression.literal(10, 50))
330+
.where(Expression.flatArrayLength("media"), GE, Expression.literal(3))
331+
.toList();
332+
333+
// Update field 'size' with flat_array_len function.
334+
db.query("posts", Post.class)
335+
.where("id", EQ, 1)
336+
.set("size", Expression.string("flat_array_len(comments)"))
337+
.update();
338+
```
339+
340+
Notes:
341+
- `flat_array_len` function operates efficiently on indexed fields
342+
- Returns 0 if the specified field does not exist in a document
343+
- Supports the following comparison operators: (`=`, `>`, `>=`, `<`, `<=`, `Range`, `Set`)
344+
- Can be used in both `SELECT` and `UPDATE` queries
345+
346+
##### now(unit)
347+
The `now()` function returns the current system timestamp, making it particularly useful for time-based filtering and data synchronization.
348+
This function can be used in both `SELECT` and `UPDATE` queries.
349+
350+
Arguments:
351+
- `sec` - returns timestamp in seconds (default if no argument is provided)
352+
- `msec` - returns timestamp in milliseconds
353+
- `usec` - returns timestamp in microseconds
354+
- `nsec` - returns timestamp in nanoseconds
355+
356+
```java
357+
// Find events that occurred in the past.
358+
List<Event> events = db.query("events", Event.class)
359+
.where(Expression.field("timestamp"), LE, Expression.now(TimeUnit.SECONDS))
360+
.toList();
361+
362+
db.query("items", Item.class)
363+
.where("id", EQ, 42)
364+
.set("updated_at", Expression.string("now(usec)"))
365+
.update();
366+
```
367+
368+
Notes:
369+
- The returned timestamp represents seconds (or subunits) since the Unix epoch (January 1, 1970)
370+
- Time resolution depends on the specified unit - use `nsec` for maximum precision
371+
- All instances of `now()` within a single query share the same value, which is computed at the start of the query execution
372+
- Useful for implementing TTL (Time-To-Live) functionality and audit logging
373+
308374
### Transactions and batch update
309375

310376
Reindexer supports transactions. Transaction are performs atomic namespace update. There are synchronous and

src/main/java/ru/rt/restream/reindexer/Query.java

Lines changed: 112 additions & 59 deletions
Original file line numberDiff line numberDiff line change
@@ -20,12 +20,12 @@
2020
import ru.rt.restream.reindexer.annotations.Hnsw;
2121
import ru.rt.restream.reindexer.annotations.Ivf;
2222
import ru.rt.restream.reindexer.annotations.VecBf;
23-
import ru.rt.restream.reindexer.binding.Consts;
2423
import ru.rt.restream.reindexer.binding.QueryResult;
2524
import ru.rt.restream.reindexer.binding.RequestContext;
2625
import ru.rt.restream.reindexer.binding.TransactionContext;
2726
import ru.rt.restream.reindexer.binding.cproto.ByteBuffer;
2827
import ru.rt.restream.reindexer.binding.cproto.cjson.PayloadType;
28+
import ru.rt.restream.reindexer.expression.Expression;
2929
import ru.rt.restream.reindexer.util.JsonSerializer;
3030
import ru.rt.restream.reindexer.util.Pair;
3131
import ru.rt.restream.reindexer.vector.params.KnnSearchParam;
@@ -36,10 +36,10 @@
3636
import java.util.Collections;
3737
import java.util.Deque;
3838
import java.util.List;
39+
import java.util.Objects;
3940
import java.util.Optional;
4041
import java.util.Spliterator;
4142
import java.util.Spliterators;
42-
import java.util.UUID;
4343
import java.util.stream.Stream;
4444
import java.util.stream.StreamSupport;
4545

@@ -56,8 +56,6 @@
5656
import static ru.rt.restream.reindexer.binding.Consts.MERGE;
5757
import static ru.rt.restream.reindexer.binding.Consts.MODE_ACCURATE_TOTAL;
5858
import static ru.rt.restream.reindexer.binding.Consts.OR_INNER_JOIN;
59-
import static ru.rt.restream.reindexer.binding.Consts.VALUE_BOOL;
60-
import static ru.rt.restream.reindexer.binding.Consts.VALUE_NULL;
6159
import static ru.rt.restream.reindexer.binding.Consts.VALUE_STRING;
6260

6361
/**
@@ -112,6 +110,7 @@ public class Query<T> {
112110
private static final int QUERY_FIELD_SUB_QUERY_CONDITION = 30;
113111
private static final int QUERY_LOCAL = 31;
114112
private static final int QUERY_KNN_CONDITION = 32;
113+
private static final int QUERY_EXPRESSION_CONDITION = 36;
115114

116115
/**
117116
* Condition types.
@@ -403,7 +402,7 @@ public Query<T> where(String indexName, Condition condition, Object... values) {
403402

404403
buffer.putVarUInt32(values.length);
405404
for (Object key : values) {
406-
putValue(key);
405+
buffer.putValue(key);
407406
}
408407

409408
return this;
@@ -444,7 +443,7 @@ public Query<T> where(Query<?> subquery, Condition condition, Object... values)
444443

445444
buffer.putVarUInt32(values.length);
446445
for (Object key : values) {
447-
putValue(key);
446+
buffer.putValue(key);
448447
}
449448

450449
return this;
@@ -472,6 +471,60 @@ public Query<T> where(String indexName, Condition condition, Query<?> subquery)
472471
return this;
473472
}
474473

474+
/**
475+
* Where predicate between {@code left} and {@code right} expressions using {@code condition}.
476+
* Supported combinations:
477+
* <ul>
478+
* <li>{@link Expression#field(String) field} condition {@link Expression#literal(Object...) values}</li>
479+
* <li>{@link Expression#field(String) field} condition {@link Expression#now(TimeUnit) now(unit)}</li>
480+
* <li>{@link Expression#field(String) field} condition {@link Expression#flatArrayLength(String) flat_array_len(field)}</li>
481+
* <li>{@link Expression#field(String) field} condition {@link Expression#subQuery(Query) subquery}</li>
482+
* <li>{@link Expression#field(String) field} condition {@link Expression#field(String) field}</li>
483+
* <li>{@link Expression#flatArrayLength(String) flat_array_len(field)} condition {@link Expression#literal(Object...) values}</li>
484+
* <li>{@link Expression#flatArrayLength(String) flat_array_len(field)} condition {@link Expression#subQuery(Query) subquery}</li>
485+
* <li>{@link Expression#subQuery(Query) subquery} condition {@link Expression#literal(Object...) values}</li>
486+
* <li>{@link Expression#subQuery(Query) subquery} condition {@link Expression#now(TimeUnit) now(unit)}</li>
487+
* <li>{@link Expression#subQuery(Query) subquery} condition {@link Expression#flatArrayLength(String) flat_array_len(field)}</li>
488+
* </ul>
489+
* Example: {@code where(Expression.field("created_at"), Condition.LE, Expression.now())}.
490+
* <p>
491+
* Note: the {@link Expression#string(String) arithmetical string expressions} are not currently supported.
492+
*
493+
* @param left the left operand {@link Expression} to use
494+
* @param condition the {@link Condition} to use
495+
* @param right the right operand {@link Expression} to use
496+
* @return the {@link Query} for further customizations
497+
* @see Expression for factory methods to build different types of expressions
498+
*/
499+
public Query<T> where(Expression left, Condition condition, Expression right) {
500+
Objects.requireNonNull(left, "left expression cannot be null");
501+
Objects.requireNonNull(right, "right expression cannot be null");
502+
if (!left.supports(Expression.Type.CONDITIONAL)) {
503+
throw new IllegalArgumentException(
504+
String.format("Left expression: '%s' cannot be used for select", left));
505+
}
506+
if (!right.supports(Expression.Type.CONDITIONAL)) {
507+
throw new IllegalArgumentException(
508+
String.format("Right expression: '%s' cannot be used for select", right));
509+
}
510+
511+
logBuilder.where(nextOperation, left, condition.code, right);
512+
513+
buffer.putVarUInt32(QUERY_EXPRESSION_CONDITION);
514+
515+
left.serialize(buffer);
516+
517+
buffer.putVarUInt32(nextOperation);
518+
buffer.putVarUInt32(condition.code);
519+
520+
right.serialize(buffer);
521+
522+
nextOperation = OP_AND;
523+
queryCount++;
524+
525+
return this;
526+
}
527+
475528
/**
476529
* The condition are possible only on the vector indexed fields,
477530
* marked with {@link Hnsw}, {@link Ivf}, {@link VecBf} annotations.
@@ -785,7 +838,7 @@ public Query<T> sort(String index, boolean desc, Object... values) {
785838

786839
buffer.putVarUInt32(values.length);
787840
for (Object value : values) {
788-
putValue(value);
841+
buffer.putValue(value);
789842
}
790843

791844
return this;
@@ -803,55 +856,6 @@ public Query<T> fetchCount(int fetchCount) {
803856
return this;
804857
}
805858

806-
private void putValue(Object value) {
807-
if (value == null) {
808-
buffer.putVarUInt32(VALUE_NULL);
809-
} else if (value instanceof Boolean) {
810-
buffer.putVarUInt32(VALUE_BOOL);
811-
if ((Boolean) value) {
812-
buffer.putVarUInt32(1);
813-
} else {
814-
buffer.putVarUInt32(0);
815-
}
816-
} else if (value instanceof Integer) {
817-
buffer.putVarUInt32(Consts.VALUE_INT)
818-
.putVarInt64((Integer) value);
819-
} else if (value instanceof String) {
820-
buffer.putVarUInt32(Consts.VALUE_STRING)
821-
.putVString((String) value);
822-
} else if (value instanceof Long) {
823-
buffer.putVarUInt32(Consts.VALUE_INT_64)
824-
.putVarInt64((Long) value);
825-
} else if (value instanceof Byte) {
826-
buffer.putVarUInt32(Consts.VALUE_INT)
827-
.putVarInt64((Byte) value);
828-
} else if (value instanceof Short) {
829-
buffer.putVarUInt32(Consts.VALUE_INT)
830-
.putVarInt64((Short) value);
831-
} else if (value instanceof Double) {
832-
buffer.putVarUInt32(Consts.VALUE_DOUBLE)
833-
.putDouble((Double) value);
834-
} else if (value instanceof Float) {
835-
Float floatValue = (Float) value;
836-
buffer.putVarUInt32(Consts.VALUE_DOUBLE)
837-
.putDouble(floatValue.doubleValue());
838-
} else if (value instanceof Character) {
839-
Character character = (Character) value;
840-
buffer.putVarUInt32(Consts.VALUE_STRING)
841-
.putVString(character.toString());
842-
} else if (value instanceof UUID) {
843-
buffer.putVarUInt32(Consts.VALUE_UUID)
844-
.putUuid((UUID) value);
845-
} else if (value instanceof Object[]) {
846-
buffer.putVarUInt32(Consts.VALUE_TUPLE);
847-
Object[] objects = (Object[]) value;
848-
buffer.putVarUInt32(objects.length);
849-
for (Object object : objects) {
850-
putValue(object);
851-
}
852-
}
853-
}
854-
855859
/**
856860
* Will execute query, and return stream of items.
857861
* The returned stream must be closed using the {@link Stream#close()} method or
@@ -1122,7 +1126,7 @@ public Query<T> set(String fieldName, Object value) {
11221126
buffer.putVarUInt32(values.size());
11231127
for (Object v : values) {
11241128
buffer.putVarUInt32(0);
1125-
putValue(v);
1129+
buffer.putValue(v);
11261130
}
11271131
} else if (value != null && value.getClass().isArray()) {
11281132
Object[] values = (Object[]) value;
@@ -1137,19 +1141,49 @@ public Query<T> set(String fieldName, Object value) {
11371141
buffer.putVarUInt32(values.length);
11381142
for (Object v : values) {
11391143
buffer.putVarUInt32(0);
1140-
putValue(v);
1144+
buffer.putValue(v);
11411145
}
11421146
} else {
11431147
buffer.putVarUInt32(cmd);
11441148
buffer.putVString(fieldName);
11451149
buffer.putVarUInt32(1);
11461150
buffer.putVarUInt32(0);
1147-
putValue(value);
1151+
buffer.putValue(value);
11481152
}
11491153

11501154
return this;
11511155
}
11521156

1157+
/**
1158+
* Updates indexed field by {@link Expression}.
1159+
* <p>
1160+
* Example: {@code set("created_at", Expression.string("now() - 1 * 24 * 60 * 60"))}.
1161+
* <p>
1162+
* Note: currently, only {@link Expression#string(String)} is supported.
1163+
*
1164+
* @param fieldName the field name to use
1165+
* @param expression the expression to use
1166+
* @return the {@link Query} for further customizations
1167+
* @see Expression#string(String)
1168+
*/
1169+
public Query<T> set(String fieldName, Expression expression) {
1170+
if (expression == null) {
1171+
return set(fieldName, (Object) null);
1172+
}
1173+
if (!expression.supports(Expression.Type.MODIFYING)) {
1174+
throw new IllegalArgumentException(String.format("Expression: '%s' cannot be used for update", expression));
1175+
}
1176+
1177+
logBuilder.set(fieldName, expression);
1178+
1179+
buffer.putVarUInt32(QUERY_UPDATE_FIELD);
1180+
buffer.putVString(fieldName);
1181+
1182+
expression.serialize(buffer);
1183+
1184+
return this;
1185+
}
1186+
11531187
private void setObject(String fieldName, Object value) {
11541188
boolean isArray = false;
11551189
int count = 1;
@@ -1286,4 +1320,23 @@ String getSql() {
12861320
return logBuilder.getSql();
12871321
}
12881322

1323+
/**
1324+
* Returns all used bytes from the {@link ByteBuffer}.
1325+
*
1326+
* @return all used bytes from the {@code ByteBuffer}
1327+
*/
1328+
public byte[] bytes() {
1329+
return buffer.bytes();
1330+
}
1331+
1332+
/**
1333+
* Returns the string representation of the query.
1334+
*
1335+
* @return the string representation of the query
1336+
*/
1337+
@Override
1338+
public String toString() {
1339+
return getSql();
1340+
}
1341+
12891342
}

0 commit comments

Comments
 (0)