|
| 1 | +# CI Remediation Plan: E2E Tests & Workflow Optimization |
| 2 | + |
| 3 | +**Objective**: Stabilize the E2E testing pipeline by addressing missing browser dependencies, optimizing shard distribution, and fixing flaky tests. |
| 4 | + |
| 5 | +## 1. CI Workflow Updates (`.github/workflows/e2e-tests-split.yml`) |
| 6 | + |
| 7 | +### 1.1 Fix Missing Browser Dependencies in Security Jobs |
| 8 | +The security enforcement jobs for Firefox and WebKit are failing because they lack the Chromium dependency required by the shared test utilities (likely in `fixtures/auth-fixtures` or `utils/` which might depend on Chromium-specific behaviors or default browser contexts during setup). |
| 9 | + |
| 10 | +**Action**: Add the Chromium installation step to `e2e-firefox-security` and `e2e-webkit-security` jobs, mirroring the non-security jobs. |
| 11 | + |
| 12 | +**Implementation Details**: |
| 13 | +```yaml |
| 14 | +# In e2e-firefox-security: |
| 15 | +- name: Install Playwright Chromium |
| 16 | + run: | |
| 17 | + echo "📦 Installing Chromium (required by security-tests dependency)..." |
| 18 | + npx playwright install --with-deps chromium |
| 19 | + EXIT_CODE=$? |
| 20 | + echo "✅ Install command completed (exit code: $EXIT_CODE)" |
| 21 | + exit $EXIT_CODE |
| 22 | +
|
| 23 | +# In e2e-webkit-security: |
| 24 | +- name: Install Playwright Chromium |
| 25 | + run: | |
| 26 | + echo "📦 Installing Chromium (required by security-tests dependency)..." |
| 27 | + npx playwright install --with-deps chromium |
| 28 | + EXIT_CODE=$? |
| 29 | + echo "✅ Install command completed (exit code: $EXIT_CODE)" |
| 30 | + exit $EXIT_CODE |
| 31 | +``` |
| 32 | +
|
| 33 | +### 1.2 Optimize Shard Distribution |
| 34 | +Shard 4 is consistently timing out (>20m) while others finish quickly (4-13m). Reducing the shard count forces a redistribution of tests which effectively rebalances the load. |
| 35 | +
|
| 36 | +**Action**: |
| 37 | +1. Change shard strategy from 4 to 3. |
| 38 | +2. Increase workflow timeout from default (or 20m) to **25 minutes** to accommodate the slightly higher per-shard load. |
| 39 | +
|
| 40 | +**Implementation Details**: |
| 41 | +```yaml |
| 42 | +# In e2e-chromium, e2e-firefox, e2e-webkit jobs: |
| 43 | +timeout-minutes: 25 # Increased for safety |
| 44 | + |
| 45 | +strategy: |
| 46 | + fail-fast: false |
| 47 | + matrix: |
| 48 | + shard: [1, 2, 3] # Reduced from [1, 2, 3, 4] |
| 49 | + total-shards: [3] # Reduced from [4] |
| 50 | +``` |
| 51 | +
|
| 52 | +## 2. Test Stability Fixes |
| 53 | +
|
| 54 | +### 2.1 Fix `certificates.spec.ts` (Core) |
| 55 | +**Issue**: Tests fail when checking for "Empty State OR Table" because `isVisible().catch()` returns false for both during the transitional loading state, even after waiting for loading to complete. |
| 56 | + |
| 57 | +**Solution**: Use Playwright's distinct `expect` assertions with locators combined via `.or()` to allow Playwright's auto-retrying mechanism to handle the state transition. |
| 58 | + |
| 59 | +**Implementation**: |
| 60 | +```typescript |
| 61 | +// Replace explicit boolean checks: |
| 62 | +// const hasEmptyMessage = await emptyCellMessage.isVisible().catch(() => false); |
| 63 | +// const hasTable = await table.isVisible().catch(() => false); |
| 64 | +// expect(hasEmptyMessage || hasTable).toBeTruthy(); |
| 65 | +
|
| 66 | +// With robust locator assertion: |
| 67 | +await expect( |
| 68 | + page.getByRole('table').or(page.getByText(/no.*certificates.*found/i)) |
| 69 | +).toBeVisible({ timeout: 10000 }); |
| 70 | +``` |
| 71 | +*Apply this pattern to lines 104 and 120.* |
| 72 | + |
| 73 | +### 2.2 Fix `proxy-hosts.spec.ts` (Core) |
| 74 | +**Issue**: `waitForModal` failures (undefined selector match). The custom helper is less reliable than direct Playwright assertions, especially when animations or DOM updates are involved. |
| 75 | + |
| 76 | +**Solution**: Replace `waitForModal(page)` with explicit expectations for the dialog visibility. |
| 77 | + |
| 78 | +**Implementation**: |
| 79 | +```typescript |
| 80 | +// Replace: |
| 81 | +// await waitForModal(page); |
| 82 | +
|
| 83 | +// With: |
| 84 | +await expect(page.getByRole('dialog')).toBeVisible(); |
| 85 | +``` |
| 86 | +*Apply to all occurrences in `Create`, `Update`, `Delete` describe blocks.* |
| 87 | + |
| 88 | +### 2.3 Fix `crowdsec-import.spec.ts` (Security) |
| 89 | +**Issue**: Flaky failure on "should handle archive with optional files". The backend likely returns a 500/4xx error intermittently (possibly due to file locking on `acquis.yaml` or state issues from previous tests). |
| 90 | + |
| 91 | +**Solution**: Implement a retry loop for the API request. This handles transient backend locking issues. |
| 92 | + |
| 93 | +**Implementation**: |
| 94 | +```typescript |
| 95 | +// Wrap the request in a retry loop |
| 96 | +await expect(async () => { |
| 97 | + const response = await request.post('/api/v1/admin/crowdsec/import', { |
| 98 | + // ... payload ... |
| 99 | + }); |
| 100 | + expect(response.ok(), `Import failed with status: ${response.status()}`).toBeTruthy(); |
| 101 | + const data = await response.json(); |
| 102 | + expect(data).toHaveProperty('status', 'imported'); |
| 103 | +}).toPass({ |
| 104 | + intervals: [1000, 2000, 5000], |
| 105 | + timeout: 15_000 |
| 106 | +}); |
| 107 | +``` |
| 108 | + |
| 109 | +## 3. Execution Plan |
| 110 | + |
| 111 | +### Phase 1: Test Stability |
| 112 | +1. Modify `tests/core/certificates.spec.ts`. |
| 113 | +2. Modify `tests/core/proxy-hosts.spec.ts`. |
| 114 | +3. Modify `tests/security/crowdsec-import.spec.ts`. |
| 115 | +4. Verification: Run these specific tests locally (using the skill) to ensure they pass consistently. |
| 116 | + |
| 117 | +### Phase 2: Workflow Updates |
| 118 | +1. Modify `.github/workflows/e2e-tests-split.yml`. |
| 119 | +2. Verification: Rely on CI execution (cannot fully simulate GitHub Actions matrix locally). |
| 120 | + |
| 121 | +### Phase 3: Final Verification |
| 122 | +1. Push changes and monitor the full E2E suite. |
0 commit comments