Add --retry-failed-tests-delay option to MTP retry package#7840
Add --retry-failed-tests-delay option to MTP retry package#7840Copilot wants to merge 8 commits into
--retry-failed-tests-delay option to MTP retry package#7840Conversation
Agent-Logs-Url: https://github.com/microsoft/testfx/sessions/b65299ab-c298-41e1-b995-33ae9f177b0b Co-authored-by: Evangelink <11340282+Evangelink@users.noreply.github.com>
--retry-failed-tests-delay option to MTP retry package
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Adds a configurable delay between retry attempts for the retry-failed-tests extension to reduce flakiness for external-system-dependent tests.
Changes:
- Introduces
--retry-failed-tests-delayCLI option with unit-aware timespan parsing and validation. - Applies an inter-attempt
Task.DelayinRetryOrchestratorand strips the option from forwarded child-process args. - Updates help text, package docs, resources, and adds unit tests for option validation.
Show a summary per file
| File | Description |
|---|---|
| test/UnitTests/Microsoft.Testing.Extensions.UnitTests/RetryTests.cs | Adds unit tests for delay option parsing/validation and dependency on --retry-failed-tests. |
| test/UnitTests/MSTestAdapter.PlatformServices.UnitTests/Services/DesktopTestSourceHostTests.cs | Adjusts nullability usage (null!) in mocks. |
| test/IntegrationTests/Microsoft.Testing.Platform.Acceptance.IntegrationTests/HelpInfoAllExtensionsTests.cs | Updates expected help output to include the new delay option. |
| src/Platform/Microsoft.Testing.Extensions.Retry/RetryOrchestrator.cs | Removes delay arg from forwarded args; parses delay option and delays between attempts. |
| src/Platform/Microsoft.Testing.Extensions.Retry/RetryCommandLineOptionsProvider.cs | Registers the new option and validates dependency + timespan format. |
| src/Platform/Microsoft.Testing.Extensions.Retry/Resources/xlf/ExtensionResources.zh-Hant.xlf | Adds new resource strings for delay option description/error (untranslated “new” entries). |
| src/Platform/Microsoft.Testing.Extensions.Retry/Resources/xlf/ExtensionResources.zh-Hans.xlf | Adds new resource strings for delay option description/error (untranslated “new” entries). |
| src/Platform/Microsoft.Testing.Extensions.Retry/Resources/xlf/ExtensionResources.tr.xlf | Adds new resource strings for delay option description/error (untranslated “new” entries). |
| src/Platform/Microsoft.Testing.Extensions.Retry/Resources/xlf/ExtensionResources.ru.xlf | Adds new resource strings for delay option description/error (untranslated “new” entries). |
| src/Platform/Microsoft.Testing.Extensions.Retry/Resources/xlf/ExtensionResources.pt-BR.xlf | Adds new resource strings for delay option description/error (untranslated “new” entries). |
| src/Platform/Microsoft.Testing.Extensions.Retry/Resources/xlf/ExtensionResources.pl.xlf | Adds new resource strings for delay option description/error (untranslated “new” entries). |
| src/Platform/Microsoft.Testing.Extensions.Retry/Resources/xlf/ExtensionResources.ko.xlf | Adds new resource strings for delay option description/error (untranslated “new” entries). |
| src/Platform/Microsoft.Testing.Extensions.Retry/Resources/xlf/ExtensionResources.ja.xlf | Adds new resource strings for delay option description/error (untranslated “new” entries). |
| src/Platform/Microsoft.Testing.Extensions.Retry/Resources/xlf/ExtensionResources.it.xlf | Adds new resource strings for delay option description/error (untranslated “new” entries). |
| src/Platform/Microsoft.Testing.Extensions.Retry/Resources/xlf/ExtensionResources.fr.xlf | Adds new resource strings for delay option description/error (untranslated “new” entries). |
| src/Platform/Microsoft.Testing.Extensions.Retry/Resources/xlf/ExtensionResources.es.xlf | Adds new resource strings for delay option description/error (untranslated “new” entries). |
| src/Platform/Microsoft.Testing.Extensions.Retry/Resources/xlf/ExtensionResources.de.xlf | Adds new resource strings for delay option description/error (untranslated “new” entries). |
| src/Platform/Microsoft.Testing.Extensions.Retry/Resources/xlf/ExtensionResources.cs.xlf | Adds new resource strings for delay option description/error (untranslated “new” entries). |
| src/Platform/Microsoft.Testing.Extensions.Retry/Resources/ExtensionResources.resx | Adds delay option description and invalid-argument error string. |
| src/Platform/Microsoft.Testing.Extensions.Retry/PACKAGE.md | Documents the new delay feature and CLI option usage. |
Copilot's findings
- Files reviewed: 20/20 changed files
- Comments generated: 4
| argIndex = GetOptionArgumentIndex(RetryCommandLineOptionsProvider.RetryFailedTestsDelayOptionName, executableArguments); | ||
| if (argIndex > -1) | ||
| { | ||
| indexToCleanup.Add(argIndex); | ||
| indexToCleanup.Add(argIndex + 1); | ||
| } |
There was a problem hiding this comment.
Added a bounds check in the latest commit: argIndex + 1 is now only added to indexToCleanup when argIndex + 1 < executableArguments.Length.
| if (commandOption.Name == RetryFailedTestsDelayOptionName && !TimeSpanParser.TryParse(arguments[0], out TimeSpan _)) | ||
| { | ||
| return ValidationResult.InvalidTask(ExtensionResources.RetryFailedTestsDelayOptionInvalidArgument); | ||
| } |
There was a problem hiding this comment.
Extended validation in the latest commit: the condition now also rejects delays where delay < TimeSpan.Zero or delay.TotalMilliseconds > int.MaxValue. Added [DataRow("25d")] (~30 days, > int.MaxValue ms) to the invalid test cases. Note that the TimeSpanParser regex (^\d+...) already rejects negative inputs, so the < TimeSpan.Zero guard is an extra defensive layer.
| TimeSpan? retryDelay = _commandLineOptions.TryGetOptionArgumentList(RetryCommandLineOptionsProvider.RetryFailedTestsDelayOptionName, out string[]? retryDelayArgs) | ||
| ? TimeSpanParser.Parse(retryDelayArgs[0]) | ||
| : null; |
There was a problem hiding this comment.
Already addressed in the previous commit — TimeSpanParser.TryParse with a retryDelayArgs is { Length: > 0 } guard is used instead of Parse.
| --retry-failed-tests | ||
| Retry failed tests the given number of times | ||
| --retry-failed-tests-delay | ||
| Add a delay between retries. The delay is expressed as a time value, e.g. 1s, 2.5m, 1h. Default unit is milliseconds. |
There was a problem hiding this comment.
Updated in the latest commit. The description now reads: ...e.g. 200, 1s, 2.5m, 1h... — the bare millisecond example 200 has been added to both the resource string and the test expected values.
Evangelink
left a comment
There was a problem hiding this comment.
Summary
Workflow: Test Expert Reviewer 🧪
Date: 2026-05-11
Repository: microsoft/testfx
Key Findings
-
[Critical — will cause test failure]
HelpInfoAllExtensionsTests.cs: The--helpoutput test was updated for--retry-failed-tests-delay, but theInfo_WithAllExtensionsRegistered_OutputFullInfoContenttest was not. BecauseAssertOutputMatchesLinesperforms sequential line-by-line matching, the test will fail in CI — the actual--infooutput now includes the delay option underRetryCommandLineOptionsProvider, but the expected pattern jumps straight toTerminalTestReporterCommandLineOptionsProvider. Four lines need to be inserted in the pattern. -
[Minor — coverage gap]
RetryTests.cs(IsInvalid_If_InvalidTimeSpan_Is_Provided_For_DelayOption): The whitespace-only input (" ") takes a distinct early-return path inTimeSpanParser.TryParsebut is not included in the[DataRow]set. -
[Moderate — missing behavioral coverage] Neither the new unit tests nor the existing
RetryFailedTestsTests.csintegration tests verify the actual delay behaviour inRetryOrchestrator: thatTask.Delayis applied between retry attempts (not before the first), and that--retry-failed-tests-delayis stripped from the child-process argument list.
Recommendations
- Add the
--retry-failed-tests-delayblock to the--infoexpected-output string inHelpInfoAllExtensionsTests.cs(around line 342, after the--retry-failed-tests-max-testsentry). - Add
[DataRow(" ")]toIsInvalid_If_InvalidTimeSpan_Is_Provided_For_DelayOption. - Add an acceptance or unit test that exercises the orchestrator's delay and arg-stripping logic end-to-end.
Generated by Test Expert Reviewer
🧪 Test quality reviewed by Test Expert Reviewer 🧪
…DataRow Co-authored-by: Evangelink <11340282+Evangelink@users.noreply.github.com>
Evangelink
left a comment
There was a problem hiding this comment.
Summary
Workflow: Expert Code Reviewer
Date: 2026-05-11
Repository: microsoft/testfx
Key Findings
[Defensive] TimeSpanParser.Parse in orchestrator path (RetryOrchestrator.cs:158)
The orchestrator calls TimeSpanParser.Parse (which throws FormatException) rather than the TryParse pattern used elsewhere in the same method. No bounds check on retryDelayArgs before [0] access either, inconsistent with how cmdRetries is guarded at line 98. Low practical risk since validation runs first, but adds a latent uncaught-exception path.
[Resources] Pipe server created before delay fires (RetryOrchestrator.cs:181 / 282)
RetryFailedTestsPipeServer (OS named-pipe listener) is created ~100 lines before the Task.Delay fires. For large delays (minutes/hours), the pipe holds OS resources for the entire wait period before the child process is launched. Moving the delay to the top of the loop body (immediately after attemptCount++) would scope the pipe server to only the window where it's needed.
Positive Observations
- Correctness of the retry-delay logic is sound: delay correctly skips attempt 1 (
attemptCount > 1), fires before each subsequent attempt, and appliesConfigureAwait(false)throughout. - Cancellation token is correctly threaded into
Task.Delay— cancellation during the wait will propagate asOperationCanceledException, which is the expected behavior for this framework. - Argument stripping (
indexToCleanup) correctly removes both the option flag and its value, consistent with all other stripped options. - Validation in
RetryCommandLineOptionsProvideris complete: dependency on--retry-failed-tests, format validation viaTryParse, and test coverage for valid/invalid inputs. TimeSpanParser.TryParsecorrectly rejects empty strings and invalid formats per the new unit tests.
Overall Assessment
The implementation is logically correct and well-structured. Both findings are minor defensive/resource concerns rather than correctness bugs. No threading issues, security vulnerabilities, or public API breaking changes were identified.
🧠 Reviewed by Expert Code Reviewer
🧠 Reviewed by Expert Code Reviewer 🧠
| }; | ||
|
|
||
| // Apply delay between retries if configured (skip first attempt) | ||
| if (attemptCount > 1 && retryDelay is { } delay) |
There was a problem hiding this comment.
[Resources] The delay fires ~100 lines after retryFailedTestsPipeServer is created and has started listening (line 181). For long delays (e.g. --retry-failed-tests-delay 5m), the OS named-pipe handle is held open for the entire wait period before the child process is even spawned.
Moving the delay to the very top of the loop body — immediately after attemptCount++, before any per-attempt resources are allocated — would scope the pipe server's lifetime to only the time it is actually needed:
attemptCount++;
if (attemptCount > 1 && retryDelay is { } delay)
{
await logger.LogDebugAsync($"Waiting {delay:c} before retry attempt {attemptCount}").ConfigureAwait(false);
await Task.Delay(delay, cancellationToken).ConfigureAwait(false);
}
// Cleanup the arguments, create pipe server, etc.Impact: Minor resource waste — pipe handle open during delay period. Not a correctness bug, but avoidable with a small restructure.
There was a problem hiding this comment.
Fixed in the latest commit. The delay block is now at the very top of the loop body (immediately after attemptCount++), before the pipe server and any other per-attempt resources are allocated.
|
|
||
| // Parse the delay once before the loop since command-line options don't change. | ||
| TimeSpan? retryDelay = _commandLineOptions.TryGetOptionArgumentList(RetryCommandLineOptionsProvider.RetryFailedTestsDelayOptionName, out string[]? retryDelayArgs) | ||
| ? TimeSpanParser.Parse(retryDelayArgs[0]) |
There was a problem hiding this comment.
[Defensive] TimeSpanParser.Parse is used here instead of TryParse. Parse throws FormatException on an invalid value; that exception is not caught anywhere in OrchestrateTestHostExecutionAsync, so it would surface as an unhandled crash rather than a graceful failure.
In practice this is safe because ValidateOptionArgumentsAsync already rejected invalid values before this code runs — but the guard at line 98 shows the established pattern for this method:
ApplicationStateGuard.Ensure(cmdRetries is not null);
int userMaxRetryCount = int.Parse(cmdRetries[0], ...);The analogous defensive form here would be:
TimeSpan? retryDelay = null;
if (_commandLineOptions.TryGetOptionArgumentList(RetryCommandLineOptionsProvider.RetryFailedTestsDelayOptionName, out string[]? retryDelayArgs)
&& retryDelayArgs is { Length: > 0 }
&& TimeSpanParser.TryParse(retryDelayArgs[0], out TimeSpan parsedDelay))
{
retryDelay = parsedDelay;
}Impact: If validation is ever bypassed (e.g. internal/tooling callers), an unhandled FormatException escapes the orchestrator.
Suggestion: Use TryParse and add a Length > 0 guard, consistent with how cmdRetries is validated on line 98.
There was a problem hiding this comment.
Fixed in the latest commit. The delay is now parsed using TryParse with a retryDelayArgs is { Length: > 0 } guard, consistent with the cmdRetries pattern on line 98.
Co-authored-by: Evangelink <11340282+Evangelink@users.noreply.github.com>
Addressed all three points:
|
…tion Co-authored-by: Evangelink <11340282+Evangelink@users.noreply.github.com>
Co-authored-by: Evangelink <11340282+Evangelink@users.noreply.github.com>
The retry extension had no way to introduce a wait between retry attempts, making it less useful for tests interacting with external systems (message queues, databases, etc.) where a cooldown period reduces flakiness.
Changes
--retry-failed-tests-delay: accepts a timespan value using the same unit-aware format as--hangdump-timeout(e.g.200,1s,2.5m,1h; default unit is milliseconds)RetryCommandLineOptionsProvider: registers the option, validates it requires--retry-failed-tests, validates the argument viaTimeSpanParser.TryParse, and enforces a valid range (non-negative, no greater thanint.MaxValuemilliseconds — theTask.Delaymaximum)RetryOrchestrator: strips the delay arg from arguments forwarded to child processes (with a bounds check before addingargIndex + 1to the cleanup list); appliesTask.Delayat the top of the retry loop (afterattemptCount++, before any per-attempt resources are allocated), skipping the first attempt; usesTryParsewith aLength > 0guard instead ofParsefor defensive handling200) alongside unit-suffixed examplesRetryFailedTests_WithDelay_StripsDelayFromChildArgs) that verifies the option is correctly stripped from child-process arguments end-to-endUsage