fix(sdk): count batch items before enqueue to fix flush/shutdown race by Wassbdr · Pull Request #3558 · open-telemetry/opentelemetry-rust

Wassbdr · 2026-06-18T17:13:04Z

Problem

In the thread-based BatchSpanProcessor and BatchLogProcessor, the producer (on_end() / emit()) does try_send(item) into the data channel first and only then current_batch_size.fetch_add(1). The worker uses a snapshot of current_batch_size (load(Acquire)) as the drain target for force_flush() / shutdown().

If a flush/shutdown drain snapshots the counter in the window between a successful try_send and the fetch_add, it under-counts: the worker drains nothing (or too little) and returns success. As a result:

force_flush() can return before an already-enqueued item is exported.
shutdown() can miss an already-enqueued item and then exit, dropping it.

Fix

Increment current_batch_size before enqueueing, and revert it (fetch_sub) when the send fails (Full or Disconnected). Producer-side only; the drain logic is untouched.

This flips the counter's invariant from "may under-count the channel" (lossy) to "counter is always ≥ channel depth" (an upper bound). A flush/shutdown drain can therefore no longer under-count items already in the queue.

Why the drain logic doesn't need to change

The drain already tolerates a transient over-count (an item counted but not yet enqueued), thanks to the #3441 fix:

batch_limit = max_export_batch_size.min(target - total_exported)
break out of the loop on an empty chunk
fetch_sub only the count actually exported

So when the worker snapshots a target that includes a not-yet-enqueued item, it exports what's in the channel, breaks on the empty chunk, and leaves that item's +1 in the counter to be drained on a later cycle. No underflow, no spin. #3441's drain logic and its regression test are left unchanged.

Race interleavings considered

Original bug (sent but not counted) — eliminated by construction (count precedes send).
Symmetric race (counted but not yet sent) during a flush — drain exports what's present, breaks on empty chunk, the phantom +1 survives and is drained next cycle. The item wasn't "enqueued before the flush" in any happens-before sense, so missing it is within contract.
Shutdown — if try_send Ok happens-before the shutdown() call, the fetch_add (sequenced-before the send) is visible to the worker via the mutex-backed control channel → item is both counted and in the channel → exported.
Queue full / disconnected — fetch_add then send fails then fetch_sub: transient over-count of 1, handled like case 2; dropped-count accounting unchanged.

Tests

batchspanprocessor_drain_handles_counted_but_not_yet_enqueued_spans — drives the drain helper with counter=2 / channel depth=1, asserts 1 exported, counter stays at 1 (no underflow), and the late item is exported on the next cycle.
batchspanprocessor_on_end_reverts_count_when_queue_full / test_batch_log_processor_emit_reverts_count_when_queue_full — overflow the queue while the exporter is blocked, assert dropped items don't remain counted as pending and the counter settles to 0 after flush. (Verified these fail without the queue-full revert.)
batchspanprocessor_all_spans_accounted_for — spans analog of the existing logs stress test (4 threads × 100k), asserting received + dropped == emitted.

cargo test -p opentelemetry_sdk --all-features --lib, cargo clippy --all-targets --all-features -- -Dwarnings, and cargo fmt --all -- --check all pass.

codecov · 2026-06-18T17:21:57Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 83.0%. Comparing base (1b3846c) to head (6508c5c).

Additional details and impacted files

@@           Coverage Diff           @@
##            main   #3558     +/-   ##
=======================================
+ Coverage   82.9%   83.0%   +0.1%     
=======================================
  Files        130     130             
  Lines      27768   27936    +168     
=======================================
+ Hits       23040   23210    +170     
+ Misses      4728    4726      -2

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

lalitb

LGTM with nit comment. Thanks.

Wassbdr · 2026-06-23T02:20:53Z

Thanks @lalitb! Reduced batchspanprocessor_all_spans_accounted_for to 10k spans per thread (40k total) in ee81737 — still exercises the received+dropped==emitted invariant without the 400k-span cost. The targeted race tests cover the race directly.

In the thread-based BatchSpanProcessor and BatchLogProcessor, on_end()/emit() sent the item into the channel first and only then incremented current_batch_size. The worker uses a snapshot of current_batch_size as the drain target for force_flush()/shutdown(). If the snapshot happened in the window between a successful try_send and the fetch_add, the worker drained nothing and returned success, so force_flush() could return before an already-enqueued item was exported and shutdown() could drop it. Increment current_batch_size before enqueueing, and revert the increment when the send fails (queue full or disconnected). The counter is now always an upper bound on the channel depth, so a flush/shutdown drain can no longer under-count items already in the queue. The post-open-telemetry#3441 drain logic (break on empty chunk, fetch_sub of the actually-exported count) already tolerates the transient over-count from an item counted but not yet enqueued, and is left unchanged. Fixes open-telemetry#3453

Addresses review feedback: 40k total spans still exercises the received+dropped==emitted invariant without adding a 400k-span stress test to the normal unit suite. The targeted race tests cover the race directly.

cijothomas · 2026-06-27T13:59:12Z

@Wassbdr can you fix the changelog conflict and this is good to merge.

Wassbdr requested a review from a team as a code owner June 18, 2026 17:13

lalitb reviewed Jun 22, 2026

View reviewed changes

Comment thread opentelemetry-sdk/src/trace/span_processor.rs Outdated

lalitb approved these changes Jun 22, 2026

View reviewed changes

Wassbdr added 2 commits June 24, 2026 12:46

test(sdk): reduce span accounting stress test to 10k spans per thread

6508c5c

Addresses review feedback: 40k total spans still exercises the received+dropped==emitted invariant without adding a 400k-span stress test to the normal unit suite. The targeted race tests cover the race directly.

Wassbdr force-pushed the fix/3453-batch-flush-race branch from ee81737 to 6508c5c Compare June 24, 2026 10:50

cijothomas enabled auto-merge June 27, 2026 13:59

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(sdk): count batch items before enqueue to fix flush/shutdown race#3558

fix(sdk): count batch items before enqueue to fix flush/shutdown race#3558
Wassbdr wants to merge 2 commits into
open-telemetry:mainfrom
Wassbdr:fix/3453-batch-flush-race

Wassbdr commented Jun 18, 2026

Uh oh!

codecov Bot commented Jun 18, 2026 •

edited

Loading

Uh oh!

Uh oh!

lalitb left a comment

Uh oh!

Wassbdr commented Jun 23, 2026

Uh oh!

cijothomas commented Jun 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Conversation

Wassbdr commented Jun 18, 2026

Problem

Fix

Why the drain logic doesn't need to change

Race interleavings considered

Tests

Uh oh!

codecov Bot commented Jun 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Uh oh!

lalitb left a comment

Choose a reason for hiding this comment

Uh oh!

Wassbdr commented Jun 23, 2026

Uh oh!

cijothomas commented Jun 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

codecov Bot commented Jun 18, 2026 •

edited

Loading