ci(test): parallelize acceptance tests using dynamic matrix #692
tembleking merged 7 commits into master
Run each test file in its own job using a GitHub Actions dynamic matrix. This reduces CI time from ~30 minutes to ~3-5 minutes by running ~210 jobs in parallel instead of 4 sequential jobs.

Changes:
- Sysdig Secure: 89 parallel jobs (was 1 job taking ~30min)
- Sysdig Monitor: 54 parallel jobs (was 1 job taking ~16min)
- IBM Monitor: 41 parallel jobs (was 1 job taking ~21min)
- IBM Secure: 26 parallel jobs (was 1 job taking ~10min)

Each job:
1. Lists test files by build tag
2. Extracts test function names from the file
3. Runs only those tests using the `-run` flag
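The three per-job steps above can be sketched roughly as follows. This is an illustration in Python, not the workflow's actual shell commands: the helper names (`has_build_tag`, `extract_test_names`, `go_test_args`) and the `./...` package path are assumptions made for the example, and the build-tag check is a simple text match rather than a real evaluation of the constraint expression.

```python
import re

# Matches top-level Go test functions such as `func TestAccFoo(t *testing.T)`.
TEST_FUNC = re.compile(r"^func (Test\w+)\(", re.MULTILINE)


def has_build_tag(source: str, tag: str) -> bool:
    """Return True if a //go:build line mentions the tag as a whole word.

    Simplification: the constraint is not evaluated, so a negated tag
    (`!tag`) would also count as a match.
    """
    pattern = re.compile(rf"(?<!\w){re.escape(tag)}(?!\w)")
    return any(
        line.startswith("//go:build") and pattern.search(line) is not None
        for line in source.splitlines()
    )


def extract_test_names(source: str) -> list[str]:
    """Collect the names of all Test* functions declared in one file."""
    return TEST_FUNC.findall(source)


def go_test_args(tag: str, names: list[str], package: str = "./...") -> list[str]:
    """Build a `go test` command line that runs only the given tests.

    The package path is a placeholder; the real workflow may target
    a specific package.
    """
    run = "^(" + "|".join(names) + ")$"
    return ["go", "test", "-tags", tag, "-run", run, package]
```

Anchoring the `-run` pattern with `^(...)$` keeps a test named `TestAccFoo` from also selecting `TestAccFooBar` in another file that shares the same build tag.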
Pull request overview
This PR parallelizes acceptance tests by running each test file in its own GitHub Actions job, reducing CI time from ~30 minutes to ~3-5 minutes. The workflow now dynamically discovers test files by build tag and creates separate jobs for each file.
Changes:
- Replaced sequential acceptance test jobs with dynamic matrix-based parallel execution
- Added list jobs that discover test files via grep and generate matrices
- Removed `TEST_SUITE` environment variable in favor of inline build tags
- Handle empty grep results gracefully to avoid workflow failures
- Add aggregator jobs to collect matrix results for each test suite (useful for configuring required status checks in branch protection)
- Add `max-parallel: 20` to all matrix strategies to prevent API rate limiting
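One way to picture the empty-result handling: the list job turns grep's output into a matrix and also publishes a flag saying whether anything was found, so the matrix job can be skipped instead of being given an empty list. The sketch below is an assumption about the shape of that logic, not the workflow's actual script; the output names `has_tests` and `matrix` are hypothetical.

```python
import json


def matrix_outputs(grep_stdout: str) -> dict[str, str]:
    """Turn newline-separated file paths into job outputs for a dynamic matrix.

    An empty grep result yields an empty file list plus has_tests="false",
    so a downstream job can be skipped rather than handed an empty matrix.
    """
    files = sorted({line.strip() for line in grep_stdout.splitlines() if line.strip()})
    return {
        "has_tests": "true" if files else "false",
        "matrix": json.dumps({"file": files}),
    }
```

A matrix job could then be guarded with an `if:` condition on the hypothetical `has_tests` output and read the list with `fromJSON`.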
The test was using a fixed email address which caused failures when multiple test suites ran in parallel (both monitor and secure suites include this test).
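The idea behind the fix, sketched in Python rather than the provider's Go test code: give every run its own address so two suites creating the same user at the same time cannot collide. The prefix, domain, and helper name here are placeholders, not values from the actual test.

```python
import secrets


def unique_email(prefix: str = "terraform-test", domain: str = "example.com") -> str:
    """Return an address with a random 8-hex-character suffix.

    Each call draws fresh randomness, so concurrent test runs are very
    unlikely to pick the same address.
    """
    return f"{prefix}+{secrets.token_hex(4)}@{domain}"
```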
- Reduce `max-parallel` from 20 to 5 for IBM tests to avoid rate limiting
- Fix stateful rule count test to check `>= 2` instead of exactly 2, avoiding failures when tests run in parallel and create additional rules
Copilot reviewed 3 out of 3 changed files in this pull request and generated 8 comments.
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
e310922 to 68bc69c
IBM tests have rate limiting issues that prevent parallelization. Keep Sysdig Monitor and Secure parallelized with dynamic matrix.
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
## Summary

Parallelize Sysdig acceptance tests by running each test file in its own GitHub Actions job using dynamic matrix strategy. **IBM tests remain sequential due to API rate limiting issues.**

## Changes

| Suite | Before | After |
|-------|--------|-------|
| Sysdig Secure | 1 sequential job | Dynamic matrix (max 20 concurrent), fail-fast |
| Sysdig Monitor | 1 sequential job | Dynamic matrix (max 20 concurrent), fail-fast |
| IBM Monitor | 1 sequential job | No change (sequential) |
| IBM Secure | 1 sequential job | No change (sequential) |

## How it works

1. `list-*-tests` jobs discover test files by searching for build tags (`tf_acc_sysdig_secure`, `tf_acc_sysdig_monitor`)
2. Each file's tests run in parallel using `-run` flag to filter by test name
3. `fail-fast: true` stops the matrix early if any test fails
4. Aggregator jobs (`sysdig-secure-result`, `sysdig-monitor-result`) collect results for required status checks
5. IBM tests remain sequential to avoid API rate limiting (500/504 errors with parallelization)

## Additional changes

- Add `merge_group` trigger to `ci.yml` for merge queue support
- Handle empty grep results gracefully in list jobs
- Fix `data_source_sysdig_user_test` to use random email suffix (avoid collisions in parallel runs)
- Fix `data_source_sysdig_secure_rule_stateful_count_test` to check `rule_count >= 2` instead of exact match (avoid flaky failures)
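The decision an aggregator job such as `sysdig-secure-result` has to make can be sketched as a small function over the upstream job results. This is a guess at the logic, not the workflow's actual condition: it assumes the aggregator sees one result for the list job and one for the matrix as a whole, using the result values GitHub Actions reports (`success`, `failure`, `cancelled`, `skipped`), and that a skipped matrix is acceptable only when the list job itself succeeded.

```python
def suite_passed(list_result: str, matrix_result: str) -> bool:
    """Decide the single status that branch protection would require.

    The list job must succeed; the matrix must either succeed or be
    skipped (for example, because no test files were discovered).
    """
    if list_result != "success":
        return False
    return matrix_result in ("success", "skipped")
```

Collapsing the matrix into one job result means branch protection needs only a single required check per suite, regardless of how many test files the matrix expands to.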