Fix stress pipeline: rename variables to avoid env var injection, simplify runner#4329
Conversation
… warnOnTestFailure - Move STRESS_CONFIG_FILE from dotnet run -e arg to task env: property to fix DotNetCoreCLI@2 leaking the env var value as an app argument, which hit the default switch case and set RunMode.Help. - Rename failOnTestFailure to warnOnTestFailure (default false) with inverted semantics: false = failures fail the run, true = warnings only. All test steps always execute regardless of this setting.
There was a problem hiding this comment.
Pull request overview
This PR fixes Azure DevOps stress test runs accidentally executing in “Help” mode by moving STRESS_CONFIG_FILE injection from dotnet run -e ... into the supported env: block for DotNetCoreCLI@2. It also renames the failure-handling parameter to warnOnTestFailure and aligns continueOnError semantics with that name.
Changes:
- Set
STRESS_CONFIG_FILEviaenv:on theDotNetCoreCLI@2runtasks (instead ofdotnet run -e ...) to prevent argument leakage into the runner. - Replace
failOnTestFailurewithwarnOnTestFailure(defaultfalse) and updatecontinueOnErroraccordingly. - Add
--no-buildto sharedrunArgumentssodotnet runconsistently uses the already-built outputs.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| eng/pipelines/stress/stress-tests-stage.yml | Renames and threads the new warnOnTestFailure parameter into the stage’s OS-specific jobs. |
| eng/pipelines/stress/stress-tests-pipeline.yml | Exposes warnOnTestFailure as a pipeline parameter and passes it to the stress test stage. |
| eng/pipelines/stress/stress-tests-job.yml | Applies warnOnTestFailure to continueOnError and sets STRESS_CONFIG_FILE via task env: for each runtime run. |
- Remove RunMode enum, RunVerify, ExitWithError, and Help mode - Init() returns exit code directly; Main() short-circuits on non-zero - Rename RunStress() to Run() (only remaining path) - Fix GetMdsVersion() to use correct namespace (Microsoft.Data.SqlClient.ThisAssembly) - PrintHelp() dumps received args with quoting for diagnostics
The task's 'command: run' mode puts all 'arguments' content after '--' as application args, ignoring the user-specified '--' separator. This caused dotnet-run flags (--verbosity, --configuration, --no-build, -f) to be passed to the stress runner app instead of to dotnet-run itself. Switch to 'command: custom' + 'custom: run' which passes arguments literally, giving us full control over placement.
1b955b5 to
09a1d25
Compare
…gression dotnet run in SDK 10.0.300 incorrectly forwards its own options (--verbosity, --configuration, --no-build) as application arguments on the CI agent (Ubuntu 22.04). This cannot be reproduced locally with the same SDK version but manifests consistently on hosted agents. Since we build separately anyway (--no-build), switch to dotnet exec which directly invokes the built DLL without any argument parsing ambiguity.
- Remove stale -all and -verify options from PrintHelp (no run modes remain) - Clarify netTestRuntimes comment to note net10.0 is tested but not shipped - Use CmdLine@2 for .NET Framework test loop (dotnet exec cannot run net462 exe)
The dotnet CLI's System.CommandLine reads environment variables named
{COMMAND}ARGUMENTS (e.g. RUNARGUMENTS, BUILDARGUMENTS, TESTARGUMENTS)
and silently injects their content into parsed arguments. Since ADO
exposes all pipeline variables as uppercased env vars, variables named
'runArguments', 'buildArguments', or 'testArguments' caused the stress
runner to receive SDK options in its args[], triggering Help mode.
Changes:
- Rename buildArguments -> dotnetBuildOpts
- Rename testArguments -> stressTestOpts
- Add dotnetRunOpts variable for shared dotnet run options
- Revert dotnet exec back to dotnet run (the real fix is the rename)
- Use ${{ variables.* }} for compile-time expansion
- Document the gotcha in ado-pipelines.instructions.md
- Upgraded the Runner to use System.CommandLine instead of trying to parse arguments by hand.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 53 out of 53 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (1)
src/Microsoft.Data.SqlClient/tests/StressTests/SqlClient.Stress.Runner/Monitor/MonitorLoadUtils.cs:22
mainAssembly.GetType(classname)can return null (and in the current layout it will, sincemainAssemblyis the Monitoring assembly that containsIMonitorLoader, but it does not defineMicrosoft.Data.SqlClient.Test.Stress.MonitorLoader). The subsequentt.GetInterfaces()will throw aNullReferenceException, and enabling monitoring will fail with an unhelpful crash. Add an explicit null check with a clear error so failures are actionable (and ensure the expectedIMonitorLoaderimplementation type exists in the target assembly).
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #4329 +/- ##
==========================================
- Coverage 66.55% 64.64% -1.91%
==========================================
Files 285 280 -5
Lines 43311 66160 +22849
==========================================
+ Hits 28824 42767 +13943
- Misses 14487 23393 +8906
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
|
Reviewers (@mdaigle @cheenamalhotra @benrr101) - Please don't wait for the stress test checks to pass before reviewing - they won't, because the tests don't pass reliably and I'm not going to re-run them hoping to get a green checkmark! This PR isn't concerned with fixing whatever is wrong with the stress tests. It is only concerned with getting the pipeline running reliably. |
I agree the test failures need work and investigation into why are they failing at the first place. Let's also capture that in our backlog. |
benrr101
left a comment
There was a problem hiding this comment.
Approving for pipeline instructions file
Summary
Fixes stress tests running in Help mode on CI instead of executing tests. The root cause was the dotnet CLI's System.CommandLine silently injecting pipeline variable content into application arguments.
Root Cause
The dotnet CLI reads environment variables named
{COMMAND}ARGUMENTS(e.g.RUNARGUMENTS,BUILDARGUMENTS,TESTARGUMENTS) and prepends their content to the parsed arguments for that subcommand. Azure DevOps automatically exposes all pipeline variables as uppercased environment variables, so a pipeline variable namedbuildArgumentsbecameBUILDARGUMENTS, andtestArgumentsbecameTESTARGUMENTS— causing their content to leak into the application'sargs[]and triggering Help mode.This affects all .NET SDK versions (8.0+) and is invisible in
[command]log lines.You can see the pipeline executing the tests now:
They aren't passing, but that's a problem for another PR.
Changes
Pipeline Default Behaviour
Pipeline variable rename (
stress-tests-job.yml)buildArguments→dotnetBuildOptstestArguments→stressTestOptsdotnetRunOptsvariable for shareddotnet runoptions${{ variables.* }}compile-time expansion for all static variables{COMMAND}ARGUMENTSenv var footgunPool name auto-selection (
stress-tests-stage.yml)ADO.Net→ADO-1ES-Poolpublic→ADO-CI-1ES-Pool${{ if eq(variables['System.TeamProject'], 'ADO.Net') }}at compile timeStress runner simplification (
Program.cs)RunMode.RunOne,RunMode.Profile, XML config)PlatformAttribute(unused)RunAllandHelpmodesPipeline structure (
stress-tests-pipeline.yml,stress-tests-stage.yml)netFrameworkTestRuntimes: []for Linux/macOS jobsInstructions update (
.github/instructions/ado-pipelines.instructions.md){COMMAND}ARGUMENTSenv var injection gotchaVerification
env:property correctly setsSTRESS_CONFIG_FILEfor the spawned process