Skip to content

fix(sdk): count batch items before enqueue to fix flush/shutdown race#3558

Open
Wassbdr wants to merge 2 commits into
open-telemetry:mainfrom
Wassbdr:fix/3453-batch-flush-race
Open

fix(sdk): count batch items before enqueue to fix flush/shutdown race#3558
Wassbdr wants to merge 2 commits into
open-telemetry:mainfrom
Wassbdr:fix/3453-batch-flush-race

Conversation

@Wassbdr

@Wassbdr Wassbdr commented Jun 18, 2026

Copy link
Copy Markdown

Fixes #3453

Problem

In the thread-based BatchSpanProcessor and BatchLogProcessor, the producer (on_end() / emit()) does try_send(item) into the data channel first and only then current_batch_size.fetch_add(1). The worker uses a snapshot of current_batch_size (load(Acquire)) as the drain target for force_flush() / shutdown().

If a flush/shutdown drain snapshots the counter in the window between a successful try_send and the fetch_add, it under-counts: the worker drains nothing (or too little) and returns success. As a result:

  • force_flush() can return before an already-enqueued item is exported.
  • shutdown() can miss an already-enqueued item and then exit, dropping it.

Fix

Increment current_batch_size before enqueueing, and revert it (fetch_sub) when the send fails (Full or Disconnected). Producer-side only; the drain logic is untouched.

This flips the counter's invariant from "may under-count the channel" (lossy) to "counter is always ≥ channel depth" (an upper bound). A flush/shutdown drain can therefore no longer under-count items already in the queue.

Why the drain logic doesn't need to change

The drain already tolerates a transient over-count (an item counted but not yet enqueued), thanks to the #3441 fix:

  • batch_limit = max_export_batch_size.min(target - total_exported)
  • break out of the loop on an empty chunk
  • fetch_sub only the count actually exported

So when the worker snapshots a target that includes a not-yet-enqueued item, it exports what's in the channel, breaks on the empty chunk, and leaves that item's +1 in the counter to be drained on a later cycle. No underflow, no spin. #3441's drain logic and its regression test are left unchanged.

Race interleavings considered

  1. Original bug (sent but not counted) — eliminated by construction (count precedes send).
  2. Symmetric race (counted but not yet sent) during a flush — drain exports what's present, breaks on empty chunk, the phantom +1 survives and is drained next cycle. The item wasn't "enqueued before the flush" in any happens-before sense, so missing it is within contract.
  3. Shutdown — if try_send Ok happens-before the shutdown() call, the fetch_add (sequenced-before the send) is visible to the worker via the mutex-backed control channel → item is both counted and in the channel → exported.
  4. Queue full / disconnectedfetch_add then send fails then fetch_sub: transient over-count of 1, handled like case 2; dropped-count accounting unchanged.

Tests

  • batchspanprocessor_drain_handles_counted_but_not_yet_enqueued_spans — drives the drain helper with counter=2 / channel depth=1, asserts 1 exported, counter stays at 1 (no underflow), and the late item is exported on the next cycle.
  • batchspanprocessor_on_end_reverts_count_when_queue_full / test_batch_log_processor_emit_reverts_count_when_queue_full — overflow the queue while the exporter is blocked, assert dropped items don't remain counted as pending and the counter settles to 0 after flush. (Verified these fail without the queue-full revert.)
  • batchspanprocessor_all_spans_accounted_for — spans analog of the existing logs stress test (4 threads × 100k), asserting received + dropped == emitted.

cargo test -p opentelemetry_sdk --all-features --lib, cargo clippy --all-targets --all-features -- -Dwarnings, and cargo fmt --all -- --check all pass.

@Wassbdr Wassbdr requested a review from a team as a code owner June 18, 2026 17:13
@codecov

codecov Bot commented Jun 18, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 83.0%. Comparing base (1b3846c) to head (6508c5c).

Additional details and impacted files
@@           Coverage Diff           @@
##            main   #3558     +/-   ##
=======================================
+ Coverage   82.9%   83.0%   +0.1%     
=======================================
  Files        130     130             
  Lines      27768   27936    +168     
=======================================
+ Hits       23040   23210    +170     
+ Misses      4728    4726      -2     

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Comment thread opentelemetry-sdk/src/trace/span_processor.rs Outdated

@lalitb lalitb left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with nit comment. Thanks.

@Wassbdr

Wassbdr commented Jun 23, 2026

Copy link
Copy Markdown
Author

Thanks @lalitb! Reduced batchspanprocessor_all_spans_accounted_for to 10k spans per thread (40k total) in ee81737 — still exercises the received+dropped==emitted invariant without the 400k-span cost. The targeted race tests cover the race directly.

Wassbdr added 2 commits June 24, 2026 12:46
In the thread-based BatchSpanProcessor and BatchLogProcessor, on_end()/emit()
sent the item into the channel first and only then incremented
current_batch_size. The worker uses a snapshot of current_batch_size as the
drain target for force_flush()/shutdown(). If the snapshot happened in the
window between a successful try_send and the fetch_add, the worker drained
nothing and returned success, so force_flush() could return before an
already-enqueued item was exported and shutdown() could drop it.

Increment current_batch_size before enqueueing, and revert the increment when
the send fails (queue full or disconnected). The counter is now always an
upper bound on the channel depth, so a flush/shutdown drain can no longer
under-count items already in the queue. The post-open-telemetry#3441 drain logic (break on
empty chunk, fetch_sub of the actually-exported count) already tolerates the
transient over-count from an item counted but not yet enqueued, and is left
unchanged.

Fixes open-telemetry#3453
Addresses review feedback: 40k total spans still exercises the
received+dropped==emitted invariant without adding a 400k-span stress test
to the normal unit suite. The targeted race tests cover the race directly.
@Wassbdr Wassbdr force-pushed the fix/3453-batch-flush-race branch from ee81737 to 6508c5c Compare June 24, 2026 10:50
@cijothomas

Copy link
Copy Markdown
Member

@Wassbdr can you fix the changelog conflict and this is good to merge.

@cijothomas cijothomas enabled auto-merge June 27, 2026 13:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: BatchSpanProcessor` and BatchLogProcessor can miss already-enqueued items during force_flush() / shutdown()

3 participants