enhancement(transforms): Add global option to loosen ordering guarantees in stateless transforms + introduce associated metric #25070

Open
ArunPiduguDD wants to merge 10 commits into master from add-unordered-stateless-transform-option

ArunPiduguDD (Contributor) commented Mar 30, 2026

Summary

Introduces a new global configuration option, `preserve_ordering_stateless_transforms`, that allows ordering guarantees to be disabled in concurrently running function transforms. Also adds a corresponding metric, `estimated_concurrent_transform_scheduling_pressure`, to inform users when they should consider using this option.

Detailed Context

In Vector's existing concurrency model, stateless function transforms run concurrently (i.e. a function transform can have multiple threads working on batches of events in parallel). However, the existing implementation still guarantees event ordering: if 1000 events arrive at a transform and are processed across 10 batches/Tasks, they still leave the transform in the order they arrived, even if later batches complete before earlier ones.
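
The ordering guarantee described above can be sketched as a small reorder buffer. This is only an illustration of the behavior, not Vector's implementation: `ReorderBuffer` and its methods are hypothetical names, and the sequence numbers stand in for the order in which batches were submitted.

```rust
use std::collections::BTreeMap;

/// Hypothetical illustration: holds finished batches until every earlier
/// batch has also finished, then emits them in submission order.
struct ReorderBuffer<T> {
    next: usize,                // sequence number of the next batch to emit
    pending: BTreeMap<usize, T>, // finished batches waiting on earlier ones
}

impl<T> ReorderBuffer<T> {
    fn new() -> Self {
        Self { next: 0, pending: BTreeMap::new() }
    }

    /// Record that batch `seq` finished and return every batch that can now
    /// be emitted without violating submission order.
    fn complete(&mut self, seq: usize, output: T) -> Vec<T> {
        self.pending.insert(seq, output);
        let mut ready = Vec::new();
        while let Some(out) = self.pending.remove(&self.next) {
            ready.push(out);
            self.next += 1;
        }
        ready
    }
}
```

If batches 2 and 1 finish before batch 0, `complete` returns nothing for either of them; only when batch 0 finishes are all three emitted, in order 0, 1, 2.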

When the processing latency of events within the transform is both high and variable, this can lead to inefficiencies: as mentioned above, batches that finish later in the queue can be blocked by batches/Tasks scheduled earlier (if the earlier batch is still processing when the later batch finishes).

The effect can be illustrated by this wall-time profile (measured during a benchmark test with 8 CPUs / parallel threads)

[Image: wall-time profile]

In this test the Vector instance was constantly flooded with events, so there are always events waiting to be processed. Multiple threads finish processing their batches, but new batches/Tasks cannot be scheduled on these idle threads because the head Task in the `FuturesOrdered` [queue](https://github.com/vectordotdev/vector/blob/master/src/topology/builder.rs#L1215) is still processing its batch. This leads to inefficient CPU utilization and lower overall throughput.

When switching this to an unordered queue, the transform is not held up by long-running tasks and the overall ingress throughput increases (the graph below shows bytes/second throughput of the ordered vs. unordered queue; the test used the remap processor with many regex rules).

[Image: throughput in bytes/second, ordered vs. unordered queue]
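
The head-of-line blocking effect can be reproduced with a simplified scheduling model. The sketch below is a simulation under stated assumptions, not the code in this PR: it assumes an unlimited backlog of batches, a fixed concurrency limit, and that a slot is freed only when a Task is released from the queue. The function names and the abstract time units are hypothetical.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Ordered model: a Task's slot is released only once it and every earlier
/// Task have finished, so a slow head Task holds back finished Tasks behind it.
fn ordered_makespan(durations: &[u64], limit: usize) -> u64 {
    assert!(limit > 0, "concurrency limit must be positive");
    let mut released: Vec<u64> = Vec::with_capacity(durations.len());
    for (i, &d) in durations.iter().enumerate() {
        // Task i reuses the slot of Task i - limit once that Task is released.
        let start = if i < limit { 0 } else { released[i - limit] };
        let finish = start + d;
        let prev = if i == 0 { 0 } else { released[i - 1] };
        released.push(finish.max(prev));
    }
    released.last().copied().unwrap_or(0)
}

/// Unordered model: a slot is released as soon as any Task finishes.
fn unordered_makespan(durations: &[u64], limit: usize) -> u64 {
    assert!(limit > 0, "concurrency limit must be positive");
    // Min-heap of finish times for the Tasks currently in flight.
    let mut in_flight: BinaryHeap<Reverse<u64>> = BinaryHeap::new();
    let mut makespan: u64 = 0;
    for &d in durations {
        let start = if in_flight.len() < limit {
            0
        } else {
            // Wait for the earliest-finishing Task, whichever it is.
            in_flight.pop().map(|Reverse(t)| t).unwrap_or(0)
        };
        let finish = start + d;
        makespan = makespan.max(finish);
        in_flight.push(Reverse(finish));
    }
    makespan
}
```

With a concurrency limit of 2 and durations `[10, 1, 1, 1, 1, 1]`, the ordered model finishes at time 12 and the unordered model at time 10, because the short Tasks no longer wait for the slow head Task to be released. With uniform durations the two models finish at the same time, which matches the observation that the gain depends on latency being both high and variable.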

To help determine whether enabling this option is needed, this PR also adds a new metric, `estimated_concurrent_transform_scheduling_pressure`, which tracks how many Tasks have completed but remain blocked behind the head Task, preventing new Tasks from being scheduled (the metric is a distribution that ranges from 0 to 1). Because this introduces a shared counter on the transform "hot path", I ran the regression benchmark tests to confirm there are no issues: https://github.com/vectordotdev/vector/actions/runs/24585515449 (note: a few tests failed to run, but this appears to be an unrelated issue, since the same tests also fail to run on the latest merged commit in master).
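
One way such a shared counter could work is sketched below. This is a guess at the general shape, not the PR's implementation: `PressureTracker` and its methods are hypothetical, and normalizing by the concurrency limit to obtain a value between 0 and 1 is an assumption made for illustration.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical tracker for Tasks that have finished but still occupy a slot.
struct PressureTracker {
    completed_waiting: AtomicUsize, // shared counter touched on the hot path
    limit: usize,                   // assumed transform concurrency limit
}

impl PressureTracker {
    fn new(limit: usize) -> Self {
        assert!(limit > 0, "concurrency limit must be positive");
        Self { completed_waiting: AtomicUsize::new(0), limit }
    }

    /// Called by a Task when it finishes processing its batch.
    fn task_completed(&self) {
        self.completed_waiting.fetch_add(1, Ordering::Relaxed);
    }

    /// Called when a finished Task is released from the queue.
    /// Must be paired with an earlier `task_completed` call.
    fn task_released(&self) {
        self.completed_waiting.fetch_sub(1, Ordering::Relaxed);
    }

    /// Fraction of slots held by finished Tasks, capped at 1.0.
    fn sample(&self) -> f64 {
        let waiting = self.completed_waiting.load(Ordering::Relaxed);
        (waiting as f64 / self.limit as f64).min(1.0)
    }
}
```

Under these assumptions, a sample near 0 means finished Tasks are released almost immediately, while a sample near 1 means most slots are held by finished Tasks waiting on the head Task. Relaxed atomics keep the cost of the counter low, which is the concern the regression benchmarks are meant to check.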

Vector configuration

Added a new regression test with the following Vector config

How did you test this PR?

Ran Vector pipelines with the `preserve_ordering_stateless_transforms` option set. Also ran the regression tests.

Change Type

  • Bug fix
  • New feature
  • Dependencies
  • Non-functional (chore, refactoring, docs)
  • Performance

Is this a breaking change?

  • Yes
  • No

Does this PR include user facing changes?

  • Yes. Please add a changelog fragment based on our guidelines.
  • No. A maintainer will apply the no-changelog label to this PR.

References

Notes

  • Please read our Vector contributor resources.
  • Do not hesitate to use @vectordotdev/vector to reach out to us regarding this PR.
  • Some CI checks run only after we manually approve them.
    • We recommend adding a pre-push hook, please see this template.
    • Alternatively, we recommend running the following locally before pushing to the remote branch:
      • make fmt
      • make check-clippy (if there are failures it's possible some of them can be fixed with make clippy-fix)
      • make test
  • After a review is requested, please avoid force pushes to help us review incrementally.
    • Feel free to push as many commits as you want. They will be squashed into one before merging.
    • For example, you can run git merge origin master and git push.
  • If this PR introduces changes to Vector dependencies (modifies Cargo.lock), please run make build-licenses to regenerate the license inventory and commit the changes (if any). More details here.

@github-actions github-actions Bot added the labels `domain: topology` (anything related to Vector's topology code) and `domain: core` (anything related to core crates, i.e. vector-core, core-common, etc.) Mar 30, 2026
@ArunPiduguDD ArunPiduguDD force-pushed the add-unordered-stateless-transform-option branch from 1a4d60d to 72718d3 Compare April 13, 2026 15:45
@github-actions github-actions Bot locked and limited conversation to collaborators Apr 13, 2026
@ArunPiduguDD ArunPiduguDD deleted the add-unordered-stateless-transform-option branch April 13, 2026 15:50
@ArunPiduguDD ArunPiduguDD restored the add-unordered-stateless-transform-option branch April 13, 2026 15:54
@ArunPiduguDD ArunPiduguDD reopened this Apr 13, 2026
@ArunPiduguDD ArunPiduguDD force-pushed the add-unordered-stateless-transform-option branch from 72718d3 to 6957e46 Compare April 13, 2026 15:54
@ArunPiduguDD ArunPiduguDD changed the base branch from master to graphite-base/25070 April 13, 2026 15:57
@ArunPiduguDD ArunPiduguDD force-pushed the add-unordered-stateless-transform-option branch from 6957e46 to 9774502 Compare April 13, 2026 15:58
@ArunPiduguDD ArunPiduguDD changed the base branch from graphite-base/25070 to add-transform-concurrency-blockage-metric April 13, 2026 15:58
@ArunPiduguDD ArunPiduguDD changed the base branch from add-transform-concurrency-blockage-metric to graphite-base/25070 April 13, 2026 19:57
@ArunPiduguDD ArunPiduguDD force-pushed the add-unordered-stateless-transform-option branch from 9774502 to 3829755 Compare April 13, 2026 19:57
@ArunPiduguDD ArunPiduguDD changed the base branch from graphite-base/25070 to master April 13, 2026 19:57
@ArunPiduguDD ArunPiduguDD force-pushed the add-unordered-stateless-transform-option branch from 3829755 to aa3e104 Compare April 14, 2026 23:30
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
@ArunPiduguDD ArunPiduguDD changed the title add global option to loosen ordering guarantees in stateless transforms in exchange for potential performance benefits Add global option to loosen ordering guarantees in stateless transforms + associated metric Apr 14, 2026
@ArunPiduguDD ArunPiduguDD changed the title Add global option to loosen ordering guarantees in stateless transforms + associated metric enhancement(transforms): Add global option to loosen ordering guarantees in stateless transforms + associated metric Apr 14, 2026
@ArunPiduguDD ArunPiduguDD changed the title enhancement(transforms): Add global option to loosen ordering guarantees in stateless transforms + associated metric enhancement(transforms): Add global option to loosen ordering guarantees in stateless transforms + introduce associated metric Apr 14, 2026
…ing pressure metric

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
@ArunPiduguDD ArunPiduguDD force-pushed the add-unordered-stateless-transform-option branch from dea4a7c to 0a73a90 Compare April 14, 2026 23:48
ArunPiduguDD and others added 3 commits April 15, 2026 15:15
…ig until feature lands on master

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added the labels `work in progress` and `domain: external docs` (anything related to Vector's external, public documentation) Apr 20, 2026
@ArunPiduguDD ArunPiduguDD marked this pull request as ready for review April 20, 2026 21:50
@ArunPiduguDD ArunPiduguDD requested review from a team as code owners April 20, 2026 21:50
@ArunPiduguDD ArunPiduguDD requested a review from pront April 20, 2026 21:50
pront (Member) left a comment

Hello, thank you for this contribution! This PR can be split into two dedicated PRs; I would start by introducing `estimated_concurrent_transform_scheduling_pressure` first.

Labels

  • domain: core (anything related to core crates, i.e. vector-core, core-common, etc.)
  • domain: external docs (anything related to Vector's external, public documentation)
  • domain: topology (anything related to Vector's topology code)
  • work in progress

2 participants