Skip to content

Commit 50ac897

Browse files
feat(query): add KQL (Kibana Query Language) support
Adds optional KQL parsing as a thin translation layer at the REST entry point. KQL inputs are parsed and lowered to QueryAst using only existing variants (BoolQuery, FullTextQuery, RangeQuery, FieldPresenceQuery, WildcardQuery, MatchAll, UserInputQuery) — the core enum, visitor traits, tag pruning, and root-search remain unchanged. Wire surface: * Native REST: `?kql=<expr>` on /api/v1/{index}/search, mutually exclusive with the existing `query=` parameter. * Elastic-compat JSON: `{"query": {"kql": {"query": "...", ...}}}` on /api/v1/_elastic/{index}/_search. Documented as a Quickwit-only extension since real Elasticsearch returns parsing_exception. Supported grammar (matches the public Kibana KQL reference): field-value, quoted phrases, bare default-field terms, `*` (match-all fast path), `?`/`*` wildcards, `field:*` exists, `field:>=N` / `>` / `<=` / `<` ranges (numeric literals coerced to JsonLiteral::Number, non-numeric falls back to String), boolean and/or/not (case-insensitive), juxtaposition as implicit AND, parens, `field:(a or b)` value groups with proper field distribution, escape semantics (`\and`, `\:`, `\+`). Safety rails (all return HTTP 400 with specific error messages): * Max KQL input length: 16 KiB (REST layer) * Max parser nesting depth: 64 * Max bare-token length: 1 KiB * Max quoted-phrase length: 4 KiB * `{...}` nested-field syntax rejected (Quickwit has no nested type) * Nested field qualifier inside value group rejected * `query` and `kql` mutually exclusive Observability: * quickwit_kql_parse_total counter * quickwit_kql_parse_failures_total counter * quickwit_kql_parse_duration_seconds histogram * Structured `kql=<bool>`, `tantivy_grammar=<bool>` fields on search log lines so SRE can split KQL vs Tantivy-grammar traffic without parsing raw query strings. OpenAPI: * `query` is now `#[serde(default)]` (semantically optional at the wire layer); utoipa override exposes both `query` and `kql` as `Option<String>` so generated SDK clients no longer encode the obsolete `required: ["query"]` contract. Tests: * 246 unit tests in quickwit-query covering lexer, parser (recursive-descent with depth guard), lowering (with Tantivy-grammar escape handling for default-field deferral), metrics wiring, JSON DSL deserialization, and proptest fuzz (~6k cases) confirming the parser never panics on arbitrary input. * Kibana conformance corpus pinning expected ASTs for each documented KQL idiom + explicit notes on intentional divergences. * REST handler unit tests for `kql`/`query` mutual exclusion, whitespace handling, size caps, and search_fields propagation. * Integration scenarios under rest-api-tests/scenarii/kql_search/ asserting exact hit counts against a known dataset. * Concurrent load harness (load_test.py) mixing happy-path and adversarial shapes; multi-node docker-compose template for distributed root→leaf testing. Line coverage on KQL production code: 95-99% per file; the remaining gaps are defensive code, test panic-guards in let-else patterns, and lazy_counter!/lazy_histogram! macro internals the coverage tool cannot introspect. Files modified outside the new kql/ module: 5 (Cargo.lock, quickwit-cli/src/tool.rs, quickwit-query/Cargo.toml, quickwit-query/src/elastic_query_dsl/mod.rs, quickwit-query/src/lib.rs, quickwit-serve/src/search_api/rest_handler.rs). The core QueryAst enum, QueryAstVisitor, QueryAstTransformer, tag_pruning, and root-search are untouched.
1 parent 9c5a2f2 commit 50ac897

23 files changed

Lines changed: 3539 additions & 7 deletions

quickwit/Cargo.lock

Lines changed: 1 addition & 0 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

quickwit/quickwit-cli/src/tool.rs

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -542,6 +542,7 @@ pub async fn local_search_cli(args: LocalSearchArgs) -> anyhow::Result<()> {
542542
let sort_by: SortBy = args.sort_by_field.map(SortBy::from).unwrap_or_default();
543543
let search_request_query_string = SearchRequestQueryString {
544544
query: args.query,
545+
kql: None,
545546
start_offset: args.start_offset as u64,
546547
max_hits: args.max_hits as u64,
547548
search_fields: args.search_fields,

quickwit/quickwit-query/Cargo.toml

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -29,6 +29,7 @@ rustc-hash = { workspace = true }
2929

3030
quickwit-common = { workspace = true }
3131
quickwit-datetime = { workspace = true }
32+
quickwit-metrics = { workspace = true }
3233
quickwit-proto = { workspace = true }
3334

3435
[dev-dependencies]
Lines changed: 258 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,258 @@
1+
# KQL — Kibana Query Language support
2+
3+
> ⚠️ **Disambiguation.** "KQL" is overloaded in the industry — it refers
4+
> to two unrelated query languages:
5+
>
6+
> - **Kibana Query Language** (this module): a single-expression
7+
> predicate grammar — `level:error and status:>=500` — used by the
8+
> Kibana UI for log search. Public grammar reference:
9+
> <https://www.elastic.co/docs/explore-analyze/query-filter/languages/kql>.
10+
> - **Kusto Query Language** (Microsoft): a pipeline language —
11+
> `Table | where x > 5 | summarize count() by foo | top 10 by ts`
12+
> used by Azure Data Explorer, Log Analytics, Sentinel, Defender.
13+
> **Not implemented here.** If you want Kusto support, propose it
14+
> under a different name (e.g. `kusto`, `kustoql`) to avoid collision
15+
> with this module.
16+
>
17+
> Throughout this codebase, `KQL` / `kql` / `?kql=` always means the
18+
> Kibana variant.
19+
20+
End-user query language for Quickwit, drawn from the public Kibana KQL
21+
grammar referenced above.
22+
23+
This module owns the parser, the AST, the lowering pass to Quickwit's
24+
internal `QueryAst`, and the Prometheus metrics emitted from the parse
25+
path.
26+
27+
## Wire surface
28+
29+
Two ways to send KQL to a running Quickwit cluster.
30+
31+
### 1. Native REST parameter
32+
33+
```bash
34+
curl 'http://<host>:7280/api/v1/<index>/search?kql=level:error+and+status:>=500'
35+
```
36+
37+
The `kql` query parameter is mutually exclusive with the existing `query`
38+
parameter (Tantivy/Lucene-ish grammar). Exactly one must be supplied; both
39+
or neither returns HTTP 400.
40+
41+
POST variant:
42+
43+
```bash
44+
curl -X POST 'http://<host>:7280/api/v1/<index>/search' \
45+
-H 'Content-Type: application/json' \
46+
-d '{"kql": "level:error and service:api", "max_hits": 20}'
47+
```
48+
49+
**Note.** KQL is intentionally **not** exposed via the
50+
`/api/v1/_elastic/<index>/_search` endpoint. That namespace mirrors the
51+
Elasticsearch query DSL, which has no `kql` variant — a real
52+
Elasticsearch cluster rejects `{"query": {"kql": ...}}` with
53+
`parsing_exception`. Keeping the `_elastic/` surface honest means KQL
54+
lives only on the two native paths above.
55+
56+
## Supported grammar
57+
58+
Every form documented in the Kibana KQL reference, modulo the divergences
59+
called out below. The conformance corpus
60+
[`kibana_conformance.rs`](kibana_conformance.rs) pins the expected AST for
61+
each idiom and fails CI on drift.
62+
63+
| Form | Example |
64+
|---|---|
65+
| Field-value match | `level:error` |
66+
| Phrase match | `message:"connection refused"` |
67+
| Bare term against default fields | `refused` |
68+
| Bare phrase against default fields | `"connection refused"` |
69+
| Wildcard value | `service:work*` |
70+
| Field-exists check | `level:*` |
71+
| Match-all | `*` (lowers to `QueryAst::MatchAll`, no automaton work) |
72+
| Boolean AND (explicit) | `level:error and service:api` |
73+
| Boolean AND (juxtaposition) | `level:error service:api` |
74+
| Boolean OR | `level:error or level:warn` |
75+
| Boolean NOT | `not level:error` |
76+
| Parens | `(level:error or level:warn) and service:api` |
77+
| Value group OR | `level:(error or warn)` |
78+
| Value group AND | `tags:(prod and critical)` |
79+
| Range `>=` / `>` / `<=` / `<` | `status:>=500`, `latency_ms:<0.1` |
80+
| Compound range | `status:>=200 and status:<500` |
81+
| Quoted ISO timestamp in range | `@timestamp:<"2025-01-01T00:00:00Z"` |
82+
| Escaped colon in field name | `metric\:count:value` |
83+
| Escaped keyword as field name | `\and:value` |
84+
85+
Precedence: `not` binds tightest, then `and`, then `or` (loosest). Parens
86+
override.
87+
88+
## Intentional divergences from Kibana
89+
90+
| Kibana behavior | Quickwit behavior | Reason |
91+
|---|---|---|
92+
| Unquoted ISO timestamps in range values (`@timestamp:>=2025-01-01T00:00:00Z`) | Requires quotes (`@timestamp:>="2025-01-01T00:00:00Z"`) | Our lexer tokenizes on `:`. Documented in error messages. |
93+
| Nested-field object syntax (`nested:{ name:foo }`) | Rejected with a clear error pointing to flat dotted paths | Quickwit has no nested-field type. |
94+
| `field:(other:value)` — nested field qualifier in value group | Rejected | Silent rebinding would be a wrong-data footgun. Kibana also errors. |
95+
96+
## Safety rails
97+
98+
All limits are hard caps — exceeding them returns HTTP 400 with a specific
99+
error message.
100+
101+
| Limit | Value | Where |
102+
|---|---|---|
103+
| Max KQL string length (REST) | 16,384 bytes | [`rest_handler.rs:MAX_KQL_INPUT_LEN`](../../../quickwit-serve/src/search_api/rest_handler.rs) |
104+
| Max parser nesting depth | 64 | [`parser.rs:MAX_KQL_DEPTH`](parser.rs) |
105+
| Max bare-token length | 1,024 bytes | [`lexer.rs:MAX_BARE_TOKEN_LEN`](lexer.rs) |
106+
| Max quoted-phrase length | 4,096 bytes | [`lexer.rs:MAX_PHRASE_LEN`](lexer.rs) |
107+
108+
Together these close the obvious DoS angles: oversized inputs, pathological
109+
nesting, single-token memory bombs.
110+
111+
## Observability
112+
113+
Prometheus metrics emitted from the parse path under the `quickwit_kql_*`
114+
namespace:
115+
116+
| Metric | Type | Meaning |
117+
|---|---|---|
118+
| `quickwit_kql_parse_total` | counter | Every parse attempt that reaches `KqlQuery::parse_user_query` |
119+
| `quickwit_kql_parse_failures_total` | counter | Subset that returned an error |
120+
| `quickwit_kql_parse_duration_seconds` | histogram | Wall-clock from parse-start to AST-or-error |
121+
122+
Structured tracing fields on every search log line: `kql=true/false`,
123+
`tantivy_grammar=true/false` — lets you split KQL vs. Lucene traffic in
124+
Splunk/Elastic without parsing raw query strings.
125+
126+
## Architecture
127+
128+
KQL is translated eagerly at the REST entry point. There is **no new
129+
variant on `QueryAst`** — the output is built from existing variants
130+
(`BoolQuery`, `FullTextQuery`, `RangeQuery`, `FieldPresenceQuery`,
131+
`WildcardQuery`, `MatchAll`, `UserInputQuery`). Bare default-field values
132+
are wrapped in `UserInputQuery` so the existing search root resolves
133+
them against each index's `default_search_fields` — same deferred-
134+
resolution mechanism the Tantivy-grammar `?query=` path already uses.
135+
136+
```
137+
┌──────────────────────────────────────────────────────────┐
138+
│ REST handler │
139+
│ quickwit-serve/src/search_api/rest_handler.rs │
140+
│ • SearchRequestQueryString { query, kql, ... } │
141+
│ • build_query_ast(kql_text) → kql_to_query_ast(...) │
142+
└─────────────────────────┬────────────────────────────────┘
143+
144+
145+
┌──────────────────────────────────────────────────────────┐
146+
│ kql/ ◀──── you are here │
147+
│ lexer.rs → Token stream, size caps │
148+
│ parser.rs → KqlAst, depth cap │
149+
│ lower.rs → KqlAst → existing QueryAst variants │
150+
│ (Bool / FullText / Range / FieldPresence │
151+
│ / Wildcard / MatchAll / UserInputQuery) │
152+
│ metrics.rs → counters + histogram │
153+
└─────────────────────────┬────────────────────────────────┘
154+
│ QueryAst (no new variant)
155+
156+
┌──────────────────────────────────────────────────────────┐
157+
│ Existing search pipeline — UNCHANGED │
158+
│ quickwit-search/src/root.rs │
159+
│ • UserInputQuery vessels resolve via the existing │
160+
│ deferred-default-field path │
161+
│ • All other variants flow through as-is │
162+
└──────────────────────────────────────────────────────────┘
163+
```
164+
165+
## Testing layers
166+
167+
| Layer | Where | What it proves |
168+
|---|---|---|
169+
| Unit | this crate's `#[cfg(test)]` blocks | Per-function correctness for lexer / parser / lowering / metrics wire-up |
170+
| Conformance | [`kibana_conformance.rs`](kibana_conformance.rs) | Documented Kibana grammar idioms produce the expected `KqlAst` |
171+
| Proptest fuzz | inside `parser.rs::tests::proptest_*` | Parser never panics for arbitrary ASCII or Unicode input (≈6k cases per run) |
172+
| Integration | [`../../../../rest-api-tests/scenarii/kql_search/`](../../../../rest-api-tests/scenarii/kql_search/) | End-to-end through the HTTP stack against a real index — exact `num_hits` per query |
173+
| Load | [`../../../../rest-api-tests/scenarii/kql_search/load_test.py`](../../../../rest-api-tests/scenarii/kql_search/load_test.py) | Throughput + p50/p95/p99 + safety-rail behavior under concurrency |
174+
| Multi-node | [`../../../../rest-api-tests/scenarii/kql_search/docker-compose.cluster.yml`](../../../../rest-api-tests/scenarii/kql_search/docker-compose.cluster.yml) | Distributed root→leaf, PostgreSQL metastore, LocalStack S3 |
175+
176+
Run the integration scenarios:
177+
178+
```bash
179+
cd quickwit/rest-api-tests
180+
python3 run_tests.py --engine quickwit \
181+
--binary <path>/target/debug/quickwit \
182+
--test scenarii/kql_search
183+
```
184+
185+
## Isolation audit — what this feature touches in the rest of the codebase
186+
187+
KQL is implemented as a **thin translation layer at the REST entry
188+
point**, not as a new query AST node. The `QueryAst` enum, the visitor
189+
traits, tag pruning, and root-search are all unchanged.
190+
191+
### New files (pure isolation)
192+
193+
| Path | Purpose |
194+
|---|---|
195+
| `quickwit-query/src/kql/` (this directory) | Lexer, parser, AST, lowering, metrics, conformance corpus |
196+
| `rest-api-tests/scenarii/kql_search/` | YAML scenarios, load test, multi-node compose |
197+
198+
### Existing files modified
199+
200+
| File | Change | Risk to non-KQL traffic |
201+
|---|---|---|
202+
| `quickwit-query/src/lib.rs` | `mod kql;` + `pub use kql::kql_to_query_ast` | None — adds a module and one public function |
203+
| `quickwit-query/Cargo.toml` | Added `quickwit-metrics` dep | None — already a workspace member |
204+
| `quickwit-serve/src/search_api/rest_handler.rs` | Added `kql` field to `SearchRequestQueryString`; new `build_query_ast` helper that calls `kql_to_query_ast`; structured log fields | **One wire-contract change**: `query` was required, now `#[serde(default)]`. Requests with `{}` previously failed at JSON deserialization; now fail at validation with HTTP 400 "either `query` or `kql` must be supplied". OpenAPI schema correctly reports both as optional/nullable. |
205+
| `quickwit-cli/src/tool.rs` | Added `kql: None` to one struct literal that didn't use `..Default::default()` | None |
206+
207+
### Files I did NOT touch
208+
209+
- `quickwit-query/src/query_ast/mod.rs`**the core `QueryAst` enum is unchanged.** No new variant, no new match arms.
210+
- `quickwit-query/src/query_ast/visitor.rs`**`QueryAstVisitor` and `QueryAstTransformer` traits are unchanged.** External visitors keep working without recompilation.
211+
- `quickwit-query/src/elastic_query_dsl/mod.rs`**the ES query DSL enum is unchanged.** KQL is deliberately not exposed under `_elastic/` because real Elasticsearch has no `kql` variant.
212+
- `quickwit-doc-mapper/src/tag_pruning.rs` — unchanged.
213+
- `quickwit-search/src/root.rs` — unchanged.
214+
- All ES DSL variants (`term`, `match`, `range`, `bool`, ...) — unchanged.
215+
- Indexing pipeline, metastore, storage, control plane, cluster, actors — unchanged.
216+
- The Tantivy-grammar `UserInputQuery` lowering path — unchanged (KQL reuses it as a deferred-resolution vessel; no code change to that path).
217+
- On-disk data formats — zero impact.
218+
219+
### What "no effect on main code" actually means
220+
221+
- **End users of the existing `query=` parameter or other ES DSL variants**: no behavior change.
222+
- **Operators / SREs**: a handful of new metrics under `quickwit_kql_*`, no removed metrics, no changes to existing dashboards.
223+
- **Data at rest**: zero impact. KQL translates to the same `QueryAst` types Quickwit already executes.
224+
- **Rust callers of `QueryAst`, `QueryAstVisitor`, `QueryAstTransformer`**: zero source-level breakage. These types are exactly as they were before this feature.
225+
- **Rust callers of `SearchRequestQueryString`**: one new field (`kql: Option<String>`); callers using `..Default::default()` keep working; the one in-workspace site that listed every field (`quickwit-cli/src/tool.rs`) was updated.
226+
227+
This is the wrapper architecture — KQL is added without growing the core
228+
query system's surface. The full-integration variant (a new
229+
`QueryAst::Kql` variant with deferred parsing at root) is also viable and
230+
would have been more ergonomic for cross-index queries with differing
231+
defaults, but it required ~377 lines across 9 files including visitor
232+
trait extensions. The wrapper variant trades a tiny bit of fidelity (the
233+
multi-index error message stays as the existing generic Tantivy-grammar
234+
one) for a substantially smaller, more reviewable change.
235+
236+
## Performance reference
237+
238+
Numbers from a 30-second load test against a debug build, single node, on
239+
a MacBook (the floor — release-mode + a real load generator will go
240+
substantially higher):
241+
242+
- Sustained throughput: **~1,560 req/s** across 14 happy-path shapes + 5
243+
adversarial shapes
244+
- p99 latency under load: **< 30 ms** for every happy-path shape
245+
- p99 latency for adversarial rejects: **< 18 ms** (rejection happens
246+
before any search work)
247+
- Parse-stage cost: **93% of parses < 100 µs**, 100% < 1 ms
248+
- Errors during 47k-request sweep: **0 unexpected statuses**
249+
250+
## Known limitations (not yet implemented)
251+
252+
- Real Kibana frontend has not been pointed at this server. Grammar
253+
matches the public Kibana docs; the conformance corpus pins each
254+
idiom. A standing Kibana → Quickwit smoke test is the next layer.
255+
- The KQL parser is hand-rolled; the Tantivy-grammar path uses
256+
`tantivy::query_grammar`. Two parsers means two maintenance surfaces.
257+
Consolidating either upstream or behind a single grammar is deferred.
258+
- Authenticated / multi-tenant exercising not covered by these tests.
Lines changed: 53 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,53 @@
1+
// Copyright 2021-Present Datadog, Inc.
2+
//
3+
// Licensed under the Apache License, Version 2.0 (the "License");
4+
// you may not use this file except in compliance with the License.
5+
// You may obtain a copy of the License at
6+
//
7+
// http://www.apache.org/licenses/LICENSE-2.0
8+
//
9+
// Unless required by applicable law or agreed to in writing, software
10+
// distributed under the License is distributed on an "AS IS" BASIS,
11+
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12+
// See the License for the specific language governing permissions and
13+
// limitations under the License.
14+
15+
/// Parsed KQL expression. This is an internal representation that lowers to
16+
/// `QueryAst` via `lower_kql_ast`.
17+
#[derive(Debug, Clone, PartialEq, Eq)]
18+
pub(crate) enum KqlAst {
19+
/// Conjunction of subqueries. Empty vector is not produced by the parser.
20+
And(Vec<KqlAst>),
21+
/// Disjunction of subqueries.
22+
Or(Vec<KqlAst>),
23+
/// Negation of a subquery.
24+
Not(Box<KqlAst>),
25+
/// `field:value` clause.
26+
FieldValue { field: String, value: KqlValue },
27+
/// `field:<op><value>` numeric/datetime range bound.
28+
FieldRange {
29+
field: String,
30+
op: RangeOp,
31+
value: String,
32+
},
33+
/// `field:*` — checks whether the field is present on the document.
34+
FieldExists { field: String },
35+
/// Bare value with no field qualifier — matches against default fields.
36+
DefaultValue(KqlValue),
37+
}
38+
39+
#[derive(Debug, Clone, PartialEq, Eq)]
40+
pub(crate) enum KqlValue {
41+
/// Unquoted token. May contain `*` and `?` wildcards.
42+
Literal(String),
43+
/// Double-quoted phrase. Wildcards inside are treated literally.
44+
Phrase(String),
45+
}
46+
47+
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
48+
pub(crate) enum RangeOp {
49+
Gt,
50+
Gte,
51+
Lt,
52+
Lte,
53+
}

0 commit comments

Comments
 (0)