fix(observability): do not hold span guards across await points #25482
Open
gwenaskell wants to merge 4 commits into
…it. The test still works because tokio::test runs single-threaded by default, but it is better to fix it than to silence the clippy error
bruceg
approved these changes
May 22, 2026
Member
Makes sense to me, and has a side benefit of reducing the complexity of those long topology builder functions.
pront
approved these changes
May 22, 2026
    error!(message = "Before source started.", %test_id);
    drop(_enter); // don't hold the span guard across an await point
Member
Nit:
Suggested change
-   drop(_enter); // don't hold the span guard across an await point
+   drop(enter); // don't hold the span guard across an await point
    let rx = start_source().await;
    let _enter = span.enter();
Member
Nit:
Suggested change
-   let _enter = span.enter();
+   let enter = span.enter();
pront
reviewed
May 22, 2026
    let server = match source.inner.build(context).await {
        Err(error) => {
            self.errors.push(format!("Source \"{key}\": {error}"));
            return Err(());
Member
Is this behavior change intentional? Previously we collected all errors.
Summary
In the topology builder, the component builder functions enter a tracing span, which carries the tags that are automatically injected into the metrics registered at build time. This entered span guard is held while calling the inner source build function, which is asynchronous. Holding an entered span across an await point is strongly discouraged in the tracing docs: if the awaited future is non-trivial and actually yields to the tokio runtime, the entered span guard is leaked on the current thread's stack and lost if the task is later resumed on a different thread.
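The failure mode can be modeled with the standard library alone. This is only a sketch, not the tracing API or Vector's code: `CURRENT_SPANS`, `Entered`, `enter`, `register_metric`, and the `"my_source"` tag are hypothetical stand-ins, and a hop to another OS thread stands in for an await point after which the task resumes on a different worker. The only assumption carried over from the description above is that the "entered" state lives on the thread, not in the task.

```rust
use std::cell::RefCell;
use std::thread;

thread_local! {
    // Hypothetical stand-in for the per-thread stack of entered spans.
    static CURRENT_SPANS: RefCell<Vec<&'static str>> = RefCell::new(Vec::new());
}

/// Guard marking a span as entered on the current thread until it is dropped.
struct Entered;

fn enter(component_id: &'static str) -> Entered {
    CURRENT_SPANS.with(|s| s.borrow_mut().push(component_id));
    Entered
}

impl Drop for Entered {
    fn drop(&mut self) {
        CURRENT_SPANS.with(|s| {
            s.borrow_mut().pop();
        });
    }
}

/// Models metric registration: the component tag comes from whatever span is
/// entered on the *current* thread, if any.
fn register_metric() -> Option<&'static str> {
    CURRENT_SPANS.with(|s| s.borrow().last().copied())
}

/// Problematic shape: the guard stays alive across the simulated await.
fn build_holding_guard() -> (Option<&'static str>, Option<&'static str>) {
    let _guard = enter("my_source");
    let before = register_metric();
    // Simulated await: the rest of the work resumes on another thread,
    // where the span was never entered.
    let after = thread::spawn(register_metric).join().unwrap();
    (before, after)
}

/// Safer shape: drop the guard before the simulated await and re-enter the
/// span wherever the work resumes.
fn build_reentering() -> (Option<&'static str>, Option<&'static str>) {
    let before = {
        let _guard = enter("my_source");
        register_metric()
    }; // guard dropped here, before the simulated await
    let after = thread::spawn(|| {
        let _guard = enter("my_source");
        register_metric()
    })
    .join()
    .unwrap();
    (before, after)
}

fn main() {
    println!("holding guard: {:?}", build_holding_guard());
    println!("re-entering:   {:?}", build_reentering());
}
```

In the first function the metric registered after the hop has no tag, while the second keeps the tag on both sides because the span is entered again where the work resumes.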
What this means concretely: any source, transform, or sink whose build method actually performs async work that suspends it temporarily will very likely lose all component tags on the metrics registered at build time within that source / sink after the await point.
This includes (the list may not be exhaustive):
Vector configuration
Our E2E tests flagged missing metrics for gcp_pubsub, microsoft_sentinel, and sinks using disk buffers.
utilization was frequently not reported.
How did you test this PR?
Change Type
Is this a breaking change?
Does this PR include user facing changes?
no-changelog label to this PR.
References