Commit 53edb3f

Move running of smoke tests to Nuke (#8271)
## Summary of changes

Instead of most of the logic for running smoke tests being embedded in the yaml and bash inside yaml, move the building and running of smoke tests into Nuke.

## Reason for change

The previous design had some downsides:

- Very tied to Azure DevOps. If we want to migrate to GitLab at some point, this _should_ make it easier, because DevOps is "doing less"
- Hard to run smoke tests locally. If you wanted to investigate a scenario, you'd have to decode all the docker, docker compose, and bash scripts that you needed to run to get something _resembling_ the test setup.
- There was a lot of duplication, because it's hard to remove that in a clean way from some of the yaml without creating loads of fine-grained steps (which have their own difficulties). Moving to C# makes it easy to (for example) have try-catch blocks, custom retries etc.
- Bash in YAML is kind of ewww

## Implementation details

I initially tried to implement this over a year ago, using TestContainers, but tl;dr: I ran into a bunch of limitations that I couldn't get past (APIs that we needed which just didn't exist, differences between Windows/Linux etc.), so I abandoned it. Until 🤖 made exploring these things easier again!

The latest approach uses the https://github.com/dotnet/Docker.DotNet/ project, which provides a strongly-typed way to call the docker HTTP API (which is what TestContainers actually uses under the hood - it even uses this project). This made it _much_ easier to convert the explicit steps that we are doing currently in bash/yaml/docker-compose to being simple C# methods.

At a high level, the implementation roughly follows what we have today, but it's tied much _less_ to the Azure DevOps infrastructure, as we just run our Nuke tasks in the same way we do today (i.e. directly on the box for Windows, in a docker container for Linux).

A high level overview:

- The `GenerateVariables` stage still generates the matrix of variables, but it only needs to generate a _category_ (e.g. `LinuxX64Installer`/`WindowsFleetInstallerIis`), and an associated _scenario_ (the specific test, e.g. `ubuntu, .NET 10, .deb`).
- Renamed the stages (and associated matrices) to make them more consistent, e.g. `smoke_<x64|arm64|win|macos>_<installer|nuget|fleet>_tests`. We can easily tweak this if we prefer
- To run a test (e.g. locally): `build.ps1 RunArtifactSmokeTests -SmokeTestCategory "LinuxX64Installer" -SmokeTestScenario "someScenario"`
- All of the work for building the images, building/pulling the test agent, running the smoke tests, running crash tests, and doing snapshot verification is handled by Nuke. We have automatic retries around all the parts that could fail (i.e. anything docker or HTTP related)

That also means we can delete various things:

- All the old stages in the pipeline
- The old run-snapshot-test.yml
- The entries in the docker-compose (the test-agent is actually still used in a few places, so those stay)

Also includes a few tiny tweaks and cleanup (commented in the files as appropriate)

## Test coverage

The same hopefully!? I've run the full suite of tests several times, and spot-checked several of the tests to make sure everything looks ok, and as far as I can tell, it does! Also temporarily [modified the snapshots](https://dev.azure.com/datadoghq/dd-trace-dotnet/_build/results?buildId=196876&view=results) to confirm that causes everything to fail too

## Other details

The _big_ one which I didn't/couldn't easily convert is the macOS smoke tests. These are written _completely_ differently today, because they don't run in containers (which means we have to handle a whole bunch of different issues) and rather just duplicate a whole bunch of logic. It's _probably_ not worth the effort to port them into Nuke at the moment, but I'm open to doing it in a follow up if people feel one way or the other.

The other thing is that I _didn't_ move the "downloading of artifacts" into the Nuke job, though technically we could, and it would make running locally even easier. My reason for _not_ doing that was that it would then tie the Nuke side completely to the Azure DevOps side, and if we rename an artifact in the yaml (for some reason) it's far more likely we'll forget it on the C# side.

https://datadoghq.atlassian.net/browse/LANGPLAT-823
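
The commit message says there are automatic retries around anything docker or HTTP related. The actual implementation lives in the C# Nuke build and is not shown on this page, so the following is only a language-neutral sketch of that idea in Python. The name `with_retries`, the attempt count, the linear backoff, and the choice of `OSError` as the retryable error are all assumptions made for illustration, not the PR's code.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def with_retries(
    action: Callable[[], T],
    attempts: int = 3,
    delay_seconds: float = 0.0,
    retry_on: Tuple[Type[BaseException], ...] = (OSError,),
) -> T:
    """Run `action`, retrying when it raises one of `retry_on`.

    Hypothetical helper: the real build may use different limits and delays.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")

    for attempt in range(1, attempts + 1):
        try:
            return action()
        except retry_on:
            # Out of attempts: let the last error propagate to the caller.
            if attempt == attempts:
                raise
            # Linear backoff: wait a little longer after each failure.
            time.sleep(delay_seconds * attempt)

    raise AssertionError("unreachable")
```

A flaky step (pulling an image, polling an HTTP endpoint) would be wrapped as `with_retries(lambda: pull_image(name))`, where `pull_image` is again a hypothetical stand-in. Errors outside `retry_on` are not retried, so genuine test failures still surface immediately.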
1 parent 10c5701 commit 53edb3f

23 files changed

Lines changed: 3173 additions & 3441 deletions
Lines changed: 59 additions & 0 deletions
@@ -0,0 +1,59 @@
+parameters:
+  - name: matrixVariable
+    type: string
+
+  - name: artifacts
+    type: object
+    default: []
+
+jobs:
+  - template: update-github-status-jobs.yml
+    parameters:
+      jobs: [test]
+
+  - job: test
+    timeoutInMinutes: 20
+    variables:
+      targetShaId: $[ stageDependencies.merge_commit_id.fetch.outputs['set_sha.sha']]
+      targetBranch: $[ stageDependencies.merge_commit_id.fetch.outputs['set_sha.branch']]
+    strategy:
+      matrix: $[ stageDependencies.generate_variables.generate_variables_job.outputs['generate_variables_step.${{ parameters.matrixVariable }}'] ]
+
+    steps:
+    - template: clone-repo.yml
+      parameters:
+        targetShaId: $(targetShaId)
+        targetBranch: $(targetBranch)
+
+    # Download required artifacts. runner-standalone-* artifacts are tar.gz
+    # archives that need to be extracted; all others download directly.
+    - ${{ each artifact in parameters.artifacts }}:
+      - ${{ if startsWith(artifact, 'runner-standalone-') }}:
+        - template: download-artifact.yml
+          parameters:
+            artifact: ${{ artifact }}
+            path: $(Agent.TempDirectory)
+            patterns: "*.tar.gz"
+        - script: |
+            mkdir -p $(outputDir)
+            tar -xf $(Agent.TempDirectory)/dd-trace-*.tar.gz -C $(outputDir)
+            chmod +x $(outputDir)/dd-trace
+          displayName: extract dd-trace tool to artifacts
+      - ${{ else }}:
+        - template: download-artifact.yml
+          parameters:
+            artifact: ${{ artifact }}
+            path: $(outputDir)
+
+    - template: run-in-docker.yml
+      parameters:
+        build: true
+        baseImage: alpine
+        command: "RunArtifactSmokeTests CheckSmokeTestsForErrors ExtractMetricsFromLogs -SmokeTestCategory $(category) -SmokeTestScenario $(scenario) --Artifacts /project/artifacts"
+        apiKey: $(DD_LOGGER_DD_API_KEY)
+        extraArgs: "-v /var/run/docker.sock:/var/run/docker.sock"
+
+    - publish: artifacts/build_data
+      artifact: _$(System.StageName)_$(Agent.JobName)_logs_$(System.JobAttempt)
+      condition: always()
+      continueOnError: true
Lines changed: 45 additions & 0 deletions
@@ -0,0 +1,45 @@
+parameters:
+  - name: matrixVariable
+    type: string
+
+  - name: artifacts
+    type: object
+    default: []
+
+jobs:
+  - template: update-github-status-jobs.yml
+    parameters:
+      jobs: [test]
+
+  - job: test
+    timeoutInMinutes: 45
+    variables:
+      targetShaId: $[ stageDependencies.merge_commit_id.fetch.outputs['set_sha.sha']]
+      targetBranch: $[ stageDependencies.merge_commit_id.fetch.outputs['set_sha.branch']]
+    strategy:
+      matrix: $[ stageDependencies.generate_variables.generate_variables_job.outputs['generate_variables_step.${{ parameters.matrixVariable }}'] ]
+
+    steps:
+    - template: ensure-docker-ready.yml
+    - template: clone-repo.yml
+      parameters:
+        targetShaId: $(targetShaId)
+        targetBranch: $(targetBranch)
+
+    # Download required artifacts to the output directory
+    - ${{ each artifact in parameters.artifacts }}:
+      - template: download-artifact.yml
+        parameters:
+          artifact: ${{ artifact }}
+          path: $(outputDir)
+
+    - template: install-latest-dotnet-sdk.yml
+    - script: tracer\build.cmd RunArtifactSmokeTests CheckSmokeTestsForErrors ExtractMetricsFromLogs -SmokeTestCategory $(category) -SmokeTestScenario $(scenario) --Artifacts $(outputDir)
+      displayName: Run Nuke smoke test
+      env:
+        DD_LOGGER_DD_API_KEY: $(ddApiKey)
+
+    - publish: artifacts/build_data
+      artifact: _$(System.StageName)_$(Agent.JobName)_logs_$(System.JobAttempt)
+      condition: always()
+      continueOnError: true

.azure-pipelines/steps/run-snapshot-test.yml

Lines changed: 0 additions & 246 deletions
This file was deleted.
