Commit 53edb3f
Move running of smoke tests to Nuke (#8271)
## Summary of changes
Instead of most of the logic for running smoke tests being embedded in
the yaml and bash inside yaml, move the building and running of smoke
tests into Nuke.
## Reason for change
The previous design had some downsides:
- Very tied to Azure DevOps. If we want to migrate to GitLab at some
point, this _should_ make it easier, because Azure DevOps is "doing less"
- Hard to run smoke tests locally. If you wanted to investigate a
scenario, you'd have to decode all the docker, docker compose, and bash
scripts that you needed to run to get something _resembling_ the test
setup.
- There was a lot of duplication, because it's hard to remove that in a
clean way from some of the yaml without creating loads of fine-grained
steps (which have their own difficulties). Moving to C# makes it easy to
(for example) have try-catch blocks, custom retries etc
- Bash in YAML is kind of ewww
## Implementation details
I initially tried to implement this over a year ago, using
TestContainers, but tl;dr: I ran into a bunch of limitations that I
couldn't get past (APIs that we needed, which just didn't exist,
differences between Windows/Linux etc.), so I abandoned it. Until 🤖 made
exploring these things easier again!
The latest approach uses the https://github.com/dotnet/Docker.DotNet/
project, which provides a strongly-typed way to call the docker HTTP API
(which is what TestContainers actually uses under the hood - it even
uses this project). This made it _much_ easier to convert the explicit
steps that we are doing currently in bash/yaml/docker-compose to being
simple C# methods.
At a high level, the implementation roughly follows what we have today,
but it's tied much _less_ to the Azure DevOps infrastructure, as we just
run our Nuke tasks in the same way we do today (i.e. directly on the box
for Windows, in a docker container for Linux).
A high level overview:
- The `GenerateVariables` stage still generates the matrix of variables,
but it only needs to generate a _category_ (e.g.
`LinuxX64Installer`/`WindowsFleetInstallerIis`), and an associated
_scenario_ (the specific test, e.g. `ubuntu, .NET 10, .deb`).
- Renamed the stages (and associated matrices) to make them more
consistent, e.g.
`smoke_<x64|arm64|win|macos>_<installer|nuget|fleet>_tests`. We can
easily tweak this if we prefer.
- To run a test (e.g. locally), use `build.ps1 RunArtifactSmokeTests
-SmokeTestCategory "LinuxX64Installer" -SmokeTestScenario
"someScenario"`
- All of the work for building the images, building/pulling the test
agent, running the smoke tests, running crash tests, and doing snapshot
verification is handled by Nuke. We have automatic retries around all
the parts that could fail (i.e. anything Docker or HTTP related)
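
As a rough illustration of the "automatic retries" idea, here is a minimal sketch of a generic async retry wrapper in C#. It is not the code from this PR: `RetryHelper`, `RetryAsync`, and the parameters `maxAttempts` and `baseDelay` are hypothetical names chosen for the example, and the linear back-off is an assumption.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper (not taken from the PR): retries an async operation
// up to maxAttempts times, waiting a little longer after each failure.
public static class RetryHelper
{
    public static async Task<T> RetryAsync<T>(
        Func<Task<T>> operation,
        int maxAttempts = 3,
        TimeSpan? baseDelay = null)
    {
        if (maxAttempts < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(maxAttempts));
        }

        var delay = baseDelay ?? TimeSpan.FromSeconds(1);

        for (var attempt = 1; ; attempt++)
        {
            try
            {
                return await operation();
            }
            catch (Exception ex) when (attempt < maxAttempts)
            {
                // Only swallow the failure while attempts remain;
                // the final failure propagates to the caller.
                Console.WriteLine(
                    $"Attempt {attempt}/{maxAttempts} failed: {ex.Message}. Retrying...");

                // Linear back-off: 1x, 2x, 3x ... the base delay.
                await Task.Delay(
                    TimeSpan.FromMilliseconds(delay.TotalMilliseconds * attempt));
            }
        }
    }
}
```

A Docker or HTTP call would be passed in as the `operation` delegate, so the retry policy lives in one place rather than being repeated per step, which is hard to achieve in YAML.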
That also means we can delete various things:
- All the old stages in the pipeline
- The old run-snapshot-test.yml
- The entries in the docker-compose (the test-agent is actually still
used in a few places, so those stay)
Also includes a few tiny tweaks and cleanup (commented in the files as
appropriate)
## Test coverage
The same hopefully!? I've run the full suite of tests several times, and
spot checked various of the tests to make sure everything looks ok, and
as far as I can tell, it does! Also temporarily
[modified the snapshots](https://dev.azure.com/datadoghq/dd-trace-dotnet/_build/results?buildId=196876&view=results)
to confirm that doing so causes everything to fail too
## Other details
The _big_ one which I didn't/couldn't easily convert is the macOS smoke
tests. These are written _completely_ differently today, because they
don't run in containers (which means we have to handle a whole bunch of
different issues) and rather just duplicate a whole bunch of logic. It's
_probably_ not worth the effort to port them into Nuke at the moment,
but I'm open to doing it in a follow up if people feel one way or the
other.
The other thing is that I _didn't_ move the "downloading of artifacts"
into the Nuke job, though technically we could, and it would make
running locally even easier. My reason for _not_ doing that was that it
would tie the Nuke side completely to the Azure DevOps side, and if we
rename an artifact in the yaml (for some reason) it's far more likely
we'll forget it on the C# side.
https://datadoghq.atlassian.net/browse/LANGPLAT-8231

parent 10c5701 commit 53edb3f
23 files changed
Lines changed: 3173 additions & 3441 deletions
File tree
- .azure-pipelines
- steps
- tracer/build/_build
- SmokeTests
- docker