feat(vrl): add drop_on_abort option to VRL transform (#5664)#6472
feat(vrl): add drop_on_abort option to VRL transform (#5664)#6472siva-abstract-security wants to merge 3 commits into
drop_on_abort option to VRL transform (#5664)#6472Conversation
…5664) Adds a `drop_on_abort` boolean to `TransformConfig` (default `true`, matching Vector). When enabled, documents terminated by a VRL `abort` expression are: - silently dropped (no `warn!` log) - routed to a distinct `DocProcessorError::TransformAbort` variant - counted in a new `transform_aborts` counter / `transform_abort` label on `quickwit_doc_processor_docs_total` - excluded from `num_invalid_docs()` (the doc is *intentionally* dropped, not invalid); surfaced via new `num_dropped_docs()` When `drop_on_abort: false`, an `abort` is treated identically to an unexpected VRL runtime error (warn + transform_errors). Unexpected VRL runtime errors (`Terminate::Error`) always warn regardless of the flag. Tests covering edge cases, error conditions, error handling, validations: - `TransformConfig` serde (8 inner cases): default applied on missing field, explicit true/false roundtrip, JSON+YAML parity, unknown field rejected, **wrong-type rejected** (string `"true"` fails to parse) - `VrlProgram` (8 tests): classification across {Abort, Error, Ok} × {drop_on_abort=true,false}, **`abort "with message"` parity with bare abort**, **Display impl surfaces abort reason**, num_bytes preservation, conditional-abort filter use case - `DocProcessor` (6 integration tests): single-abort + drop_on_abort=true, single-abort + drop_on_abort=false, runtime error always counts as transform_error, conditional batch, **mixed-category batch (valid + abort + json-error + docmapper-error in one batch)**, direct counter routing for all variants
There was a problem hiding this comment.
💡 Codex Review
https://github.com/quickwit-oss/quickwit/blob/00b26e3eb38db2742262ce8ad949ee256a8ad586/quickwit-indexing/src/actors/vrl_processing.rs#L71-L73
Stop logging drop_on_abort aborts as errors
When drop_on_abort is enabled, this branch still returns Err(DocProcessorError::TransformAbort), and DocProcessor::process_raw_doc logs every Err with rate_limited_warn! before recording counters. As a result, filter-style VRL scripts using abort still emit warning logs for intentionally dropped documents, despite the new option being documented as a silent drop/no-warn path; special-case TransformAbort before the generic error logging or avoid routing it through the error logging path.
https://github.com/quickwit-oss/quickwit/blob/00b26e3eb38db2742262ce8ad949ee256a8ad586/quickwit-indexing/src/actors/doc_processor.rs#L381
Keep dropped aborts out of indexed-doc totals
Including transform_aborts in num_processed_docs() makes intentionally dropped documents indistinguishable from documents that were actually indexed in aggregate statistics: IndexingStatistics::add_actor_counters adds this value to num_docs, while CLI reporting computes indexed docs as num_docs - num_invalid_docs, and aborts are explicitly excluded from num_invalid_docs(). For a source where VRL filters out documents via abort, the CLI/API can therefore report them as indexed even though no ProcessedDoc was forwarded downstream; expose/subtract dropped docs separately instead of folding them into the indexed total.
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
Two fixes for the `drop_on_abort` patch surfaced by Codex review: 1. **Silent aborts were not actually silent.** Suppressing the `warn!` in `VrlProgram::transform_doc` wasn't enough — `DocProcessor::process_raw_doc` logs every `Err` via `rate_limited_warn!` *after* the transform returns. Adds `DocProcessorError::should_warn()` returning `false` for `TransformAbort` and gates the rate-limited log on it. 2. **Aborted docs were inflating the indexed-doc total.** The previous patch included `transform_aborts` in `num_processed_docs()`. Downstream code (`IndexingStatistics::add_actor_counters`) aggregates that into `num_docs`, and the CLI/API computes `indexed = num_docs - num_invalid_docs`. Since aborts are intentionally excluded from `num_invalid_docs`, they would be reported as indexed. Removes `transform_aborts` from `num_processed_docs` and adds a new `num_dropped_docs` field on `IndexingStatistics` so aborts remain visible in API observations without polluting the indexed math. New / updated tests: - `test_doc_processor_error_should_warn_excludes_transform_abort` — direct unit test for the new `should_warn()` method across all variants. - `test_doc_processor_mixed_category_batch` — extended to also pipe the counters through `IndexingStatistics::add_actor_counters` and assert `num_docs == 3`, `num_invalid_docs == 2`, `num_dropped_docs == 1`, and `num_docs - num_invalid_docs == 1` (the one doc actually indexed). - Existing tests updated for the new `num_processed_docs` semantics (excludes aborts).
|
Thanks @chatgpt-codex-connector — both findings were real and are addressed in c59db5d. P2 #1 — P2 #2 — aborts were inflating the indexed-doc total. Confirmed via |
|
To use Codex here, create a Codex account and connect to github. |
Summary
Closes #5664.
Adds a
drop_on_abortboolean toTransformConfig(defaulttrue, matching Vector's VRL semantics). When enabled, documents terminated by a VRLabortexpression are dropped silently, with no warn log, and surfaced via a distinct counter so operators can distinguish intentional filtering from genuine VRL failures.Behavior
drop_on_abort: true(default)drop_on_abort: falseabortexpression (with or without message)transform_aborts→DocProcessorError::TransformAborttransform_errors→DocProcessorError::TransformTerminate::Error(e.g.parse_json!failure)transform_errorstransform_errorsOk(doc)num_invalid_docs()excludes aborted documents (intentional drops, not invalid); a newnum_dropped_docs()accessor surfaces them. New Prometheus label valuedocs_processed_status="transform_abort"on the existingquickwit_doc_processor_docs_totalmetric — no dashboard breakage.Usage
Design notes
true(Vector parity). Anyone callingabortalready accepts the doc being dropped; they want the noise gone. Strict-mode users can opt back into the old behavior.TransformAbortvariant. Aborts are intentional drops, not errors. Carrying that distinction through the error type letsrecord_errorroute cleanly and keepsnum_invalid_docs()semantically honest.TransformConfig::new()signature unchanged. Defaults the new field; ~5 callers in cli, control-plane, metastore tests stay untouched.Tests — 29 assertions across edge cases, error conditions, error handling, validations
TransformConfigserde (8 inner cases):true/falseroundtrip"true") rejecteddeny_unknown_fieldsTransformConfig::new()defaults to trueVrlProgram(8 tests):drop_on_abort)abort× {drop_on_abort=true → TransformAbort, false → Transform}abort "with reason"parity with bare abortTerminate::Error) always route to Transform regardless of flagDisplayimpl surfaces the abort reason for log readabilityDocProcessorintegration (6 tests through full actor pipeline):DocProcessorCounters::record_errorrouting for all variantsTest plan
cargo test -p quickwit-config --features vrl— 18 passing (incl. 5 new test fns)cargo test -p quickwit-indexing --features vrl— 25 passing (incl. 14 new test fns)cargo clippy -p quickwit-config -p quickwit-indexing --features vrl --tests— cleancargo +nightly fmt --check— cleancargo checkagainst quickwit-cli, control-plane, metastore — no breakagequickwit tool local-ingest --transform-script, REST query returned exactly 3 hits (KEEP-1,KEEP-2,KEEP-3) with thedropfield stripped and body upcased — confirming the silent-drop behavior on the production path.