
Commit 42e2dc9

[Test Optimization] Add test.final_status tag (#8091)
## Summary of changes

Add a new `test.final_status` tag to the final execution span of tests across NUnit, XUnit, and MsTest frameworks. This tag represents the adjusted final outcome of a test for CI pipeline result determination, with values `pass`, `fail`, or `skip`.

Jira: SDTEST-2985

### Core changes:

- Add TestFinalStatus constant to TestTags.cs and FinalStatus property to TestSpanTags.cs
- Add shared CalculateFinalStatus() helper in Common.cs implementing the priority logic
- Implement final status tracking in NUnit (TestOptimizationTestCommand.cs), XUnit (XUnitIntegration.cs), and MsTest (TestMethodAttributeExecuteIntegration.cs)
- Add ATR budget pre-check methods (GetRemainingBudget/GetRemainingAtrBudget) for early termination detection
- Add per-row caching for MsTest parameterized tests to track execution results independently

### Fix:

- Corrected **test.test_management.attempt_to_fix_passed** tag to only be set on Attempt-to-Fix (ATF) tests; previously it was incorrectly set on all retried tests (EFD, ATR, ATF) due to a hardcoded behavior type.

## Reason for change

When retry mechanisms are enabled (ATR, EFD, Attempt to Fix), a single test can run multiple times with different outcomes. Some intermediate outcomes are suppressed to avoid failing CI pipelines. Currently, there is no way to query tests by their final adjusted status to build monitors and alerts for hard failures on default branches.

The test.final_status tag enables customers to:

- Query tests by their effective CI outcome in Datadog
- Build monitors for tests that are truly failing (not just flaky)
- Distinguish between tests that eventually passed vs. those that consistently failed
- Track quarantined/disabled tests separately from actual failures

#### Priority logic:

1. Quarantined/disabled tests → always skip (CI never sees actual result)
2. **ATF tests with any execution failed → fail (test is still flaky, fix didn't work)**
3. Any execution passed → pass (matches CI behavior: one pass = pipeline pass)
4. Last retry is skip/inconclusive AND no pass → skip
5. All executions failed → fail

#### ATF (Attempt to Fix) semantics:

For ATF tests, the goal is to determine if a fix actually resolved a flaky test. Therefore:

- If **any** execution fails (initial or retry), the test is still flaky → `final_status = fail`
- Only if **all** executions pass, the fix worked → `final_status = pass`
- Skip/inconclusive does **not** count as failure (only actual failures count)
- `attempt_to_fix_passed` tag is derived from the same logic for consistency

## Implementation details

#### Shared logic (Common.cs):

- CalculateFinalStatus(anyExecutionPassed, anyExecutionFailed, isSkippedOrInconclusive, testTags) implements the 5-priority determination
- Added `anyExecutionFailed` parameter to support ATF-specific behavior

#### NUnit:

- Extended RetryState struct with InitialExecutionPassed, InitialExecutionFailed, and AnyRetryPassed fields
- Added GetRemainingBudget() to FlakyRetryBehavior for ATR budget exhaustion pre-check
- Set final_status in ExecuteTest() before span close, handling ATR early exit and budget exhaustion
- AllAttemptsPassed only clears on actual failure (not skip) for ATF semantics
- AttemptToFixPassed derived from anyExecutionFailed for consistency

#### XUnit:

- Extended TestCaseMetadata with InitialExecutionPassed, InitialExecutionFailed, and AnyRetryPassed fields
- Added unified GetRemainingAtrBudget() handling both v2 and v3 via Math.Max() strategy
- Updated WriteFinalTagsFromMetadata() with final execution detection (handles stale TotalExecutions on initial EFD)
- Track InitialExecutionFailed on exception path for first execution
- AttemptToFixPassed derived from anyExecutionFailed for consistency

#### MsTest:

- Added per-row caches (InitialExecutionPassedCache, InitialExecutionFailedCache, AnyRetryPassedCache, AllAttemptsPassedCache) using `ConditionalWeakTable<object, ConcurrentDictionary<string, bool>>` to track parameterized test results independently
- Added SetFinalStatusIfApplicable() helper with proper final execution detection
- Added GetRemainingAtrBudget() for budget exhaustion pre-check
- Updated SkipTestMethodExecutor and UnitTestRunnerRunSingleTest* integrations for pre-execution skip paths
- Track InitialExecutionFailed in both exception and non-exception paths
- AttemptToFixPassed derived from anyExecutionFailed for consistency

## Test coverage

#### Unit tests (TestFinalStatusTests.cs):

136 tests covering:

- Priority order verification (quarantined/disabled → ATF-fail → pass → skip → fail)
- Single execution scenarios (pass/fail/skip)
- EFD scenarios (new tests, duration-based retries, slow abort)
- ATR scenarios (initial pass, retry pass, all fail, budget exhaustion)
- ATF scenarios (any failure → fail, all pass → pass, skip semantics)
- ATF skip semantics: skip does NOT count as failure (5 dedicated tests)
- MsTest-specific: parameterized per-row tracking, class/assembly init errors, inconclusive/not-runnable
- XUnit-specific: SkipException handling, null metadata
- NUnit-specific: ITR skipped, attribute skipped, inconclusive
- Mixed feature interactions (EFD+ATR, EFD+ATF, ATR+ATF)
- Edge cases: empty strings, case sensitivity, null tags

#### Integration tests:

- Added TestFinalStatus to span ordering in TestingFrameworkEvpTest.cs for deterministic snapshot verification
1 parent 9281c0d commit 42e2dc9

66 files changed

Lines changed: 2721 additions & 118 deletions

tracer/src/Datadog.Trace/Ci/Tagging/TestSpanTags.cs

Lines changed: 3 additions & 0 deletions
@@ -127,6 +127,9 @@ public TestSpanTags(TestSuiteSpanTags suiteTags, string testName)
     [Tag(TestTags.TestAttemptToFixPassed)]
     public string? AttemptToFixPassed { get; set; }

+    [Tag(TestTags.TestFinalStatus)]
+    public string? FinalStatus { get; set; }
+
     [Tag(CapabilitiesTags.LibraryCapabilitiesTestImpactAnalysis)]
     public string? CapabilitiesTestImpactAnalysis { get; set; }

tracer/src/Datadog.Trace/Ci/Tags/TestTags.cs

Lines changed: 20 additions & 0 deletions
@@ -156,6 +156,21 @@ internal static class TestTags
     /// </summary>
     public const string TestRetryReason = "test.retry_reason";

+    /// <summary>
+    /// Retry reason value for Early Flake Detection
+    /// </summary>
+    public const string TestRetryReasonEfd = "efd";
+
+    /// <summary>
+    /// Retry reason value for Auto Test Retries
+    /// </summary>
+    public const string TestRetryReasonAtr = "atr";
+
+    /// <summary>
+    /// Retry reason value for Attempt to Fix (Test Management)
+    /// </summary>
+    public const string TestRetryReasonAttemptToFix = "attempt_to_fix";
+
     /// <summary>
     /// Test is quarantined flag
     /// </summary>
@@ -185,4 +200,9 @@ internal static class TestTags
     /// Test management enabled flag
     /// </summary>
     public const string TestManagementEnabled = "test.test_management.enabled";
+
+    /// <summary>
+    /// Test final status - the adjusted test outcome for CI pipelines
+    /// </summary>
+    public const string TestFinalStatus = "test.final_status";
 }

tracer/src/Datadog.Trace/Ci/Test.cs

Lines changed: 2 additions & 2 deletions
@@ -555,8 +555,8 @@ public void Close(TestStatus status, TimeSpan? duration, string? skipReason)
         {
             var retryReasonTag = tags.TestRetryReason switch
             {
-                "efd" => MetricTags.CIVisibilityTestingEventTypeRetryReason.EarlyFlakeDetection,
-                "atr" => MetricTags.CIVisibilityTestingEventTypeRetryReason.AutomaticTestRetry,
+                TestTags.TestRetryReasonEfd => MetricTags.CIVisibilityTestingEventTypeRetryReason.EarlyFlakeDetection,
+                TestTags.TestRetryReasonAtr => MetricTags.CIVisibilityTestingEventTypeRetryReason.AutomaticTestRetry,
                 _ => MetricTags.CIVisibilityTestingEventTypeRetryReason.None
             };
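
The hunk above swaps string literals for `const` fields inside a switch expression. This works because C# constant patterns accept any compile-time constant, including `const string` fields. The following is a minimal, standalone sketch of that pattern; `RetryReasonSketch`, `Describe`, and the returned labels are hypothetical stand-ins (the real code maps to `MetricTags` enum values), and only the `"efd"` and `"atr"` values are taken from the diff.

```csharp
#nullable enable
using System;

public static class RetryReasonSketch
{
    // Values copied from the TestTags constants added in this commit.
    public const string Efd = "efd";
    public const string Atr = "atr";

    // Hypothetical labels for illustration only.
    public static string Describe(string? retryReason) => retryReason switch
    {
        Efd => "early_flake_detection", // constant pattern on a const field
        Atr => "automatic_test_retry",
        _ => "none",                    // anything else, including null
    };

    public static void Main()
    {
        Console.WriteLine(Describe("efd"));
        Console.WriteLine(Describe("atr"));
        // With only the two arms shown in the hunk, other reasons hit the discard arm.
        Console.WriteLine(Describe("attempt_to_fix"));
    }
}
```

In this sketch, as in the three-arm switch visible in the hunk, a retry reason of `attempt_to_fix` falls through to the default arm.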

tracer/src/Datadog.Trace/ClrProfiler/AutoInstrumentation/Testing/Common.cs

Lines changed: 49 additions & 3 deletions
@@ -10,6 +10,7 @@
 using System.Threading;
 using Datadog.Trace.Ci;
 using Datadog.Trace.Ci.Net;
+using Datadog.Trace.Ci.Tagging;
 using Datadog.Trace.Ci.Tags;
 using Datadog.Trace.Logging;
 using Datadog.Trace.Util;
@@ -188,7 +189,7 @@ internal static bool SetEarlyFlakeDetectionTestTagsAndAbortReason(Test test, boo
             if (isRetry)
             {
                 testTags.TestIsRetry = "true";
-                testTags.TestRetryReason = "efd";
+                testTags.TestRetryReason = TestTags.TestRetryReasonEfd;
             }
             else
             {
@@ -211,7 +212,7 @@ internal static bool SetFlakyRetryTags(Test test, bool isRetry)
         {
             var testTags = test.GetTags();
             testTags.TestIsRetry = "true";
-            testTags.TestRetryReason = "atr";
+            testTags.TestRetryReason = TestTags.TestRetryReasonAtr;
         }

         return flakyRetryFeature;
@@ -244,7 +245,7 @@ internal static TestOptimizationClient.TestManagementResponseTestPropertiesAttri
             if (isRetry)
             {
                 testTags.TestIsRetry = "true";
-                testTags.TestRetryReason = "attempt_to_fix";
+                testTags.TestRetryReason = TestTags.TestRetryReasonAttemptToFix;
             }
         }

@@ -273,4 +274,49 @@ internal static void CheckFaultyThreshold(Test test, long nTestCases, long tTest
             }
         }
     }
+
+    /// <summary>
+    /// Calculates the final status for a test based on execution results and test management tags.
+    /// Priority order (first match wins):
+    /// 1. Quarantined/disabled -> skip (always mask to skip)
+    /// 2. For ATF tests: any execution failed -> fail (flaky test = fix didn't work)
+    /// 3. Any execution passed -> pass
+    /// 4. Skip/inconclusive AND no pass -> skip
+    /// 5. All executions failed -> fail
+    /// </summary>
+    /// <param name="anyExecutionPassed">True if any execution (initial or retry) passed.</param>
+    /// <param name="anyExecutionFailed">True if any execution (initial or retry) failed.</param>
+    /// <param name="isSkippedOrInconclusive">True if the current/last execution was skip or inconclusive.</param>
+    /// <param name="testTags">The test tags to check for quarantine/disabled/ATF status.</param>
+    /// <returns>The final status string: "pass", "fail", or "skip".</returns>
+    internal static string CalculateFinalStatus(bool anyExecutionPassed, bool anyExecutionFailed, bool isSkippedOrInconclusive, TestSpanTags? testTags)
+    {
+        // Priority 1: Quarantined/disabled tests always mask to skip
+        if (testTags?.IsQuarantined == "true" || testTags?.IsDisabled == "true")
+        {
+            return TestTags.StatusSkip;
+        }
+
+        // Priority 2: For ATF tests, any failure means fix didn't work (test is still flaky)
+        // This must be checked BEFORE anyPassed for ATF tests
+        if (testTags?.IsAttemptToFix == "true" && anyExecutionFailed)
+        {
+            return TestTags.StatusFail;
+        }
+
+        // Priority 3: Any execution passed -> pass (pass takes precedence over skip)
+        if (anyExecutionPassed)
+        {
+            return TestTags.StatusPass;
+        }
+
+        // Priority 4: Skip/inconclusive AND no pass -> skip
+        if (isSkippedOrInconclusive)
+        {
+            return TestTags.StatusSkip;
+        }
+
+        // Priority 5: All executions failed -> fail
+        return TestTags.StatusFail;
+    }
 }
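
To make the priority order concrete, here is a standalone sketch that mirrors the branches of `CalculateFinalStatus` and runs them against a few retry scenarios. It is an illustration, not the tracer's code: `SketchTags` and `FinalStatusSketch` are hypothetical stand-ins for `TestSpanTags` and `Common`, and the literal strings `"pass"`, `"fail"`, and `"skip"` are assumed to match `TestTags.StatusPass`, `StatusFail`, and `StatusSkip`, as the method's doc comment suggests.

```csharp
#nullable enable
using System;

// Hypothetical stand-in exposing only the tag fields the helper reads.
public sealed class SketchTags
{
    public string? IsQuarantined { get; set; }
    public string? IsDisabled { get; set; }
    public string? IsAttemptToFix { get; set; }
}

public static class FinalStatusSketch
{
    // Same branch order as the helper in the diff; first match wins.
    public static string Calculate(bool anyPassed, bool anyFailed, bool lastSkippedOrInconclusive, SketchTags? tags)
    {
        if (tags?.IsQuarantined == "true" || tags?.IsDisabled == "true") return "skip";
        if (tags?.IsAttemptToFix == "true" && anyFailed) return "fail";
        if (anyPassed) return "pass";
        if (lastSkippedOrInconclusive) return "skip";
        return "fail";
    }

    public static void Main()
    {
        var atf = new SketchTags { IsAttemptToFix = "true" };
        var quarantined = new SketchTags { IsQuarantined = "true" };

        // ATR-style run: the first attempt failed, a retry passed.
        Console.WriteLine(Calculate(true, true, false, null));         // pass
        // Same outcomes on an ATF test: one failure means the fix did not work.
        Console.WriteLine(Calculate(true, true, false, atf));          // fail
        // Quarantined test that failed: masked to skip.
        Console.WriteLine(Calculate(false, true, false, quarantined)); // skip
        // ATF test that was only skipped: skip does not count as a failure.
        Console.WriteLine(Calculate(false, false, true, atf));         // skip
        // Every execution failed.
        Console.WriteLine(Calculate(false, true, false, null));        // fail
    }
}
```

The second and fourth calls show why the ATF check sits ahead of the pass check and why it keys on `anyFailed` rather than on the absence of a pass.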

tracer/src/Datadog.Trace/ClrProfiler/AutoInstrumentation/Testing/MsTestV2/SkipTestMethodExecutor.cs

Lines changed: 8 additions & 2 deletions
@@ -8,6 +8,7 @@
 using System.Reflection;
 using System.Threading.Tasks;
 using Datadog.Trace.Ci;
+using Datadog.Trace.Ci.Tags;
 using Datadog.Trace.DuckTyping;

 namespace Datadog.Trace.ClrProfiler.AutoInstrumentation.Testing.MsTestV2;
@@ -37,8 +38,13 @@ protected void ProcessTestMethod(object testMethod)
         if (testMethod.TryDuckCast<ITestMethod>(out var testMethodInfo))
         {
             // Create the skip span
-            MsTestIntegration.OnMethodBegin(testMethodInfo, testMethod.GetType(), isRetry: false)?
-                .Close(TestStatus.Skip, TimeSpan.Zero, _skipReason);
+            var test = MsTestIntegration.OnMethodBegin(testMethodInfo, testMethod.GetType(), isRetry: false);
+            if (test is not null)
+            {
+                // Set final_status = skip for pre-execution skipped tests (ITR/attribute-based skips)
+                test.GetTags().FinalStatus = TestTags.StatusSkip;
+                test.Close(TestStatus.Skip, TimeSpan.Zero, _skipReason);
+            }
         }
     }
