Commit be4dac4

Robert Karp and claude committed
fix: suppress APM error events for receive-loop cancellations during shutdown (5.7.5)
Setting Outcome=Success (5.7.4) was insufficient: Elastic APM captures error events at the DiagnosticSource level before ReceiverWrapper runs, so the error document was already queued regardless of the outcome override.

Registers a one-time Agent.AddFilter(IError) that drops error events whose TransactionId matches a cancelled-receive transaction, preventing them from reaching the APM server.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
1 parent f0e645f commit be4dac4

2 files changed

Lines changed: 34 additions & 2 deletions

docs/CHANGELOG.md

Lines changed: 4 additions & 0 deletions
@@ -4,6 +4,10 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## 5.7.5
+- Fixed
+  - `ApmTransactionManager.OnReceiveCancelled()` now also suppresses the APM error *event* (not just the transaction outcome) for receive-loop cancellations during pod shutdown. Setting `Outcome = Success` (5.7.4) was insufficient because Elastic APM captures error events at the `DiagnosticSource` level (before `ReceiverWrapper` runs), so the error document was already queued. 5.7.5 registers a one-time `Agent.AddFilter(IError)` filter that drops error events whose `TransactionId` belongs to a cancelled receive transaction, preventing them from reaching the APM server.
+
 ## 5.7.4
 - Fixed
   - Prevented `OperationCanceledException` during pod graceful shutdown from being recorded as APM errors. Added `ICancellationAwareTransactionManager`, an optional interface that `ITransactionManager` implementations can implement to react to receive-loop cancellations. `ApmTransactionManager` implements it by setting the current Elastic APM transaction outcome to `Success`, overriding the error state set by the Azure SDK's auto-instrumentation. `ReceiverWrapper` calls `OnReceiveCancelled()` via a runtime cast before logging the shutdown warning.
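
The 5.7.4 entry mentions that `ReceiverWrapper` reaches `OnReceiveCancelled()` through a runtime cast. That call site is not part of this diff, so the following is only a minimal sketch of the pattern, assuming the optional interface exposes a single parameterless `OnReceiveCancelled()` method (as the implementation in this commit suggests). `ITransactionManagerSketch`, `PlainManager`, `AwareManager`, and `ReceiveLoopSketch` are hypothetical stand-ins, not the library's types.

```csharp
using System;

// Usage: only the manager that opts in to the optional interface is notified.
var plain = new PlainManager();
var aware = new AwareManager();

ReceiveLoopSketch.NotifyCancelled(plain);   // no-op: optional interface not implemented
ReceiveLoopSketch.NotifyCancelled(aware);   // calls OnReceiveCancelled()

Console.WriteLine(aware.CancelledCount);    // prints 1

// Hypothetical stand-in for the library's ITransactionManager (members omitted).
public interface ITransactionManagerSketch { }

// Assumed shape of the optional interface described in the 5.7.4 entry.
public interface ICancellationAwareTransactionManager
{
    void OnReceiveCancelled();
}

// A manager that does not care about cancellations.
public sealed class PlainManager : ITransactionManagerSketch { }

// A manager that opts in by also implementing the optional interface.
public sealed class AwareManager : ITransactionManagerSketch, ICancellationAwareTransactionManager
{
    public int CancelledCount { get; private set; }

    public void OnReceiveCancelled() => CancelledCount++;
}

public static class ReceiveLoopSketch
{
    // Returns true when the manager opted in and was notified.
    public static bool NotifyCancelled(ITransactionManagerSketch manager)
    {
        // Runtime cast via pattern matching: managers that do not
        // implement the optional interface are simply skipped.
        if (manager is ICancellationAwareTransactionManager aware)
        {
            aware.OnReceiveCancelled();
            return true;
        }

        return false;
    }
}
```

Keeping the hook on a separate interface means existing `ITransactionManager` implementations continue to compile unchanged, which is presumably why it was made optional.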

src/Ev.ServiceBus.Apm/ApmTransactionManager.cs

Lines changed: 30 additions & 2 deletions
@@ -1,6 +1,8 @@
 using System;
+using System.Collections.Concurrent;
 using System.Collections.Generic;
 using System.Diagnostics;
+using System.Threading;
 using System.Threading.Tasks;
 using Elastic.Apm;
 using Elastic.Apm.Api;
@@ -14,6 +16,15 @@ namespace Ev.ServiceBus.Apm;
 /// </summary>
 public class ApmTransactionManager : ITransactionManager, ICancellationAwareTransactionManager
 {
+    // Tracks transaction IDs for which the ASB ProcessErrorAsync callback fired an OperationCanceledException
+    // (the standard signal that the receive loop is being stopped, most commonly during pod graceful shutdown).
+    // The error filter below suppresses APM error events for these transactions so that
+    // shutdown-induced TaskCanceledException entries do not appear in APM.
+    // Capped at 1000 entries as a safety net: normal pod shutdown produces ~50 entries; in any
+    // edge case where the processor is stopped and restarted mid-lifecycle the cap prevents unbounded growth.
+    private static readonly ConcurrentDictionary<string, byte> _cancelledTransactionIds = new();
+    private const int CancelledTransactionIdCap = 1000;
+    private static int _filterRegistered; // 0 = not registered, 1 = registered
     public async Task RunWithInTransaction(MessageExecutionContext executionContext, Func<Task> transaction)
     {
         if (IsTraceEnabled())
@@ -73,8 +84,25 @@ private static List<SpanLink> GetSpanLinks(string? diagnosticId)
 
     public void OnReceiveCancelled()
     {
-        if (IsTraceEnabled())
-            Agent.Tracer.CurrentTransaction.Outcome = Outcome.Success;
+        if (!IsTraceEnabled())
+            return;
+
+        var tx = Agent.Tracer.CurrentTransaction;
+        tx.Outcome = Outcome.Success;
+        if (_cancelledTransactionIds.Count < CancelledTransactionIdCap)
+            _cancelledTransactionIds.TryAdd(tx.Id, 0);
+
+        // Register once: suppress error events for cancelled-receive transactions before they are sent to APM.
+        // Elastic APM captures error events at the DiagnosticSource level (before ReceiverWrapper runs),
+        // so setting Outcome = Success alone does not prevent error documents from appearing in APM.
+        // Returning null from the filter drops the error event entirely.
+        if (Interlocked.CompareExchange(ref _filterRegistered, 1, 0) == 0)
+        {
+            Agent.AddFilter((IError error) =>
+                error.TransactionId is not null && _cancelledTransactionIds.ContainsKey(error.TransactionId)
+                    ? null
+                    : error);
+        }
     }
 
     private static bool IsTraceEnabled()
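
The two mechanisms in `OnReceiveCancelled()` (a register-once guard built on `Interlocked.CompareExchange`, and a filter that drops an event by returning `null`) can be tried in isolation without the Elastic agent. The sketch below is dependency-free and only illustrates the pattern: `ErrorSketch`, `FilterPipelineSketch`, and `CancelledReceiveSketch` are hypothetical stand-ins, and the pipeline's drop-on-null behaviour is modelled on what the comment in this commit says about the real filter, not on the agent's source. The 1000-entry cap is omitted for brevity.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Usage: many threads report the same cancelled transaction at once.
Parallel.For(0, 100, _ => CancelledReceiveSketch.MarkCancelled("tx-1"));

Console.WriteLine(FilterPipelineSketch.FilterCount);                            // prints 1
Console.WriteLine(FilterPipelineSketch.Apply(new ErrorSketch("tx-1")) is null); // prints True
Console.WriteLine(FilterPipelineSketch.Apply(new ErrorSketch("tx-2")) is null); // prints False

// Hypothetical error event carrying only the field the filter inspects.
public sealed record ErrorSketch(string? TransactionId);

// Hypothetical stand-in for an agent's filter chain:
// a filter that returns null drops the event.
public static class FilterPipelineSketch
{
    private static readonly List<Func<ErrorSketch, ErrorSketch?>> Filters = new();
    private static readonly object Gate = new();

    public static int FilterCount
    {
        get { lock (Gate) return Filters.Count; }
    }

    public static void AddFilter(Func<ErrorSketch, ErrorSketch?> filter)
    {
        lock (Gate) Filters.Add(filter);
    }

    public static ErrorSketch? Apply(ErrorSketch error)
    {
        Func<ErrorSketch, ErrorSketch?>[] snapshot;
        lock (Gate) snapshot = Filters.ToArray();

        ErrorSketch? current = error;
        foreach (var filter in snapshot)
        {
            current = filter(current);
            if (current is null)
                return null; // dropped: later filters never see the event
        }

        return current;
    }
}

public static class CancelledReceiveSketch
{
    private static readonly ConcurrentDictionary<string, byte> CancelledIds = new();
    private static int _filterRegistered; // 0 = not registered, 1 = registered

    public static void MarkCancelled(string transactionId)
    {
        CancelledIds.TryAdd(transactionId, 0);

        // Only the first caller observes 0 and wins the exchange,
        // so the filter is added exactly once even under contention.
        if (Interlocked.CompareExchange(ref _filterRegistered, 1, 0) == 0)
        {
            FilterPipelineSketch.AddFilter(error =>
                error.TransactionId is not null && CancelledIds.ContainsKey(error.TransactionId)
                    ? null
                    : error);
        }
    }
}
```

The guard is lock-free, so the hot path after the first call costs a single atomic compare. One consequence of this shape, in the sketch as in the diff, is that recorded IDs are never removed, which is what the cap in the real code bounds.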
