playwright(ci): split per-entity suites + cap retries + reduce dead-time waits by chirag-madlani · Pull Request #29435 · open-metadata/OpenMetadata

chirag-madlani · 2026-06-24T14:41:43Z

Describe your changes:

Cut PR Playwright CI worst-case shard time from ~141 min to an estimated ~95–100 min, and let broken PRs bail in minutes instead of running the full ~2h35m. Three themes: tightened playwright.config.ts (retries: 2→1, maxFailures: 500→50, plus retries: 0 on Login.spec.ts whose tests sleep 3 min and ~2 min on intentional JWT expiry); reduced 19 hard-coded waitForTimeout values across helpers and specs (biggest single win: utils/lineage.ts drag wait 1000→300, hit ~225× per DataAssetLineage run); and split the two parameterized-per-entity giants (CustomProperties.spec.ts at 3,862 lines / 476 tests and DataAssetLineage.spec.ts) into thin PR specs that run only Table — plus Dashboard for advanced search — alongside a new stress Playwright project for the rest, wired into the existing manual postgresql-nightly-e2e.yml / mysql-nightly-e2e.yml via shardIndex: 7. Also migrated 4 ES-indexing retry sleeps to the existing waitForSearchIndexed poll helper and removed 7 unannotated waitForTimeout leaks ESLint should have blocked.

Type of change:

Improvement

High-level design:

Per-entity test redundancy on PR. CustomProperties.spec.ts parameterized the same UI flow across 19 entity types (476 tests, ~193 min test-time on shard 4); DataAssetLineage.spec.ts across 15 (~69 min, with test.setTimeout(5 * 60 * 1000) per test). The second through Nth iterations are identical UI safeguards. Extracted the parameterized bodies into single-source-of-truth helpers under playwright/shared/; PR specs call the helper with one entity, stress specs call with the rest.

Layout

PR: Pages/TableCustomProperty.spec.ts (Table CRUD + name validation, 15 tests); Pages/DashboardAdvanceSearchCustomProperty.spec.ts (Dashboard full incl. 7 advanced-search describes, 44 tests); Pages/Lineage/TableLineage.spec.ts (Table + Column Level + Temp + Settings, 88 tests)
Stress: stress/CustomPropertiesAllEntities.spec.ts (17 entities); stress/Lineage/DataAssetLineageAllEntities.spec.ts (14 entities) — 404 tests across 5 files

Workflow plumbing. New stress Playwright project (testMatch: '**/stress/**'); chromium project's testIgnore adds '**/stress/**'. Both postgresql-nightly-e2e.yml and mysql-nightly-e2e.yml matrices extended with shardIndex: 7 running --project=stress; shardTotal bumped 6→7 (matrix metadata only — --shard=N/5 is hardcoded). The existing nightly/ folder and its workflows are explicitly untouched (different purpose: ingestion).

Alternatives rejected

Moving to the existing nightly/ folder — it's currently orphaned (no project picks it up) and serves a different (ingestion) purpose.
Tag-based split via a @stress grep — file-based separation matches the existing **/nightly/** convention.
Refactoring DataAssetLineage to test fewer edges per entity — out of scope; this PR is a mechanical move only.

Tests:

Use cases covered

Pure test infrastructure: timeout knobs + file split. No new product code; existing test bodies move verbatim into the helper, which is invoked from the new entry files. The Dashboard-only and Table-only PR coverage preserves the per-entity safeguards that historically caught regressions, and the stress project re-runs the full 19/15-entity matrices manually on demand.

Playwright (UI) tests

Not applicable for adding tests — existing tests are reshuffled, not changed.

Manual testing performed

tsc --noEmit on playwright/tsconfig.json — clean for all new files
prettier --check — clean
eslint --fix — clean (auto-removed unused imports introduced by the split)
playwright test --list --project=chromium — 3,720 tests across 265 files. PR verify create lineage for entity count = 1 (Table only); TableCustomProperty 15 tests; Dashboard advanced search 44 tests; TableLineage 88 tests including the Column Level 8×8 matrix.
playwright test --list --project=stress — 404 tests across 5 files. verify create lineage for entity count = 14 (all non-Table entities present).

UI screen recording / screenshots:

Not applicable — no UI changes.

Checklist:

I have read the CONTRIBUTING document.
I have commented on my code, particularly in hard-to-understand areas.

No associated GitHub issue — this is internal CI tooling work, no product behavior change.

🤖 Generated with Claude Code

Greptile Summary

This PR is a CI optimization focused on test infrastructure: it splits two large parameterized specs (CustomProperties.spec.ts at 3,862 lines / 476 tests and DataAssetLineage.spec.ts) into thin PR-side files (Table + Dashboard only) backed by shared helpers under playwright/shared/, with the remaining entity iterations deferred to a new stress Playwright project triggered only by nightly workflows. It also tightens maxFailures from 500→50, disables retries for the JWT-expiry sleeps in Login.spec.ts, reduces 19 waitForTimeout values, and replaces four fixed ES-indexing sleeps with waitForSearchIndexed poll calls.

Split + shared helpers: customPropertiesEntityTests.ts (3,735 lines) and dataAssetLineageEntityTests.ts (261 lines) become the single source of truth; PR specs call them with 1–2 entities, stress specs with the rest; the new stress Playwright project is wired into both nightly workflows via a new shardIndex: 7 matrix row.
Race-condition fixes: searchRBAC.ts now registers waitForResponse before the press('Enter') that triggers the search, eliminating a known race; SearchIndexNestedColumns.spec.ts narrows its waitForResponse predicates to match specific search terms rather than any search/query call; Table.spec.ts converts a bare .count() === 15 assertion to toHaveCount(15) for automatic Playwright retries.
Timeout hardening: taskWorkflow.ts bumps the task-card visibility timeout from 15 s to 45 s with a clear comment about activity-feed lag; searchRBAC.ts adds a 45 s timeout to the RBAC negative assertion to absorb search-index propagation delay.

Confidence Score: 5/5

Pure test infrastructure change with no production code impact; all entity test bodies move verbatim into shared helpers and CI wiring is additive.

All changed files are Playwright tests, shared test helpers, and CI workflow YAML. The split into PR/stress tiers is mechanically correct — shared helpers are verbatim extractions, the new stress project is properly isolated from PR chromium via testIgnore, and shard 7 is correctly routed before the chromium else-branch in both nightly workflows. Race-condition fixes in searchRBAC.ts and SearchIndexNestedColumns.spec.ts are sound improvements. The reduced canvas-stabilization timeouts in lineage.ts are the only area requiring CI observation, but they are annotated and intentional.

No files require special attention; lineage.ts's reduced drag stabilization wait (1000→300 ms) should be monitored on first nightly stress run to confirm it is sufficient on slow CI agents.

Important Files Changed

Filename	Overview
openmetadata-ui/src/main/resources/ui/playwright.config.ts	Adds `stress` project pointing to `/stress/`, adds `/stress/` to chromium's `testIgnore`, reduces `maxFailures` 500→50, and keeps retries at 2 (comment explains why 1 was reverted). All changes look correct.
.github/workflows/mysql-nightly-e2e.yml	Adds shardIndex 7 for the stress project; the if/elif/else ladder correctly routes shard 7 to `--project=stress` before the chromium else branch, so no overlap. `shardTotal: [7]` is metadata only — `--shard=N/5` remains hardcoded.
.github/workflows/postgresql-nightly-e2e.yml	Mirrors mysql-nightly-e2e.yml changes exactly; same correct routing logic for shard 7.
openmetadata-ui/src/main/resources/ui/playwright/shared/customPropertiesEntityTests.ts	New 3,735-line shared helper extracted verbatim from the deleted CustomProperties.spec.ts; exports `ALL_ENTITIES` array and `registerCustomPropertiesEntityTests` function for selective entity filtering by callers.
openmetadata-ui/src/main/resources/ui/playwright/shared/dataAssetLineageEntityTests.ts	New 261-line helper extracted from DataAssetLineage.spec.ts; always creates all 15 entities as edge targets in beforeAll regardless of the `testEntities` filter, which is correct since every entity can appear as a lineage node.
openmetadata-ui/src/main/resources/ui/playwright/e2e/Pages/Lineage/TableLineage.spec.ts	New PR-side lineage spec: calls the shared helper for Table only, then keeps Column Level Lineage, Temp lineage, and Settings modal describes inline (non-parameterized unique coverage). Correct structure.
openmetadata-ui/src/main/resources/ui/playwright/utils/searchRBAC.ts	Fixes two `waitForResponse`-after-action races by registering the listener before `press('Enter')`; adds 45 s timeout to RBAC negative assertion; wraps explore-tree assertions in `toPass` polling. All improvements are correct.
openmetadata-ui/src/main/resources/ui/playwright/e2e/fixtures/pages.ts	Adds `adminPage` fixture following the same `browser.newPage({ storageState }) / page.close()` pattern as the other role fixtures; adds `export { expect }` for convenience. Replaces the deleted `support/fixtures/userPages.ts`.
openmetadata-ui/src/main/resources/ui/playwright/support/fixtures/userPages.ts	Deleted — functionality consolidated into `e2e/fixtures/pages.ts`. The old file used `browser.newContext()` + `context.close()`; new code uses `browser.newPage()` + `page.close()`, which is equivalent and consistent with existing fixtures.
openmetadata-ui/src/main/resources/ui/playwright/utils/lineage.ts	Reduces canvas stabilization waits: drag-before wait 1000→300 ms (hit ~225× per DataAssetLineage run) and node-selection wait 500→200 ms. Both waits are annotated ESLint-suppressions; the 300 ms drag window should be monitored on first nightly stress run.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[PR Playwright Run] --> B{shardIndex}
    B -->|1| C[DataAssetRulesEnabled\nDataAssetRulesDisabled]
    B -->|2-6| D[chromium project\n--shard=N/5\n3720 tests / 265 files]
    B -->|7 - nightly only| E[stress project\n--project=stress\n404 tests / 5 files]

    D --> D1[TableCustomProperty.spec.ts\n15 tests - Table only]
    D --> D2[DashboardAdvanceSearchCustomProperty.spec.ts\n44 tests - Dashboard only]
    D --> D3[Lineage/TableLineage.spec.ts\n88 tests - Table + Column Level + Temp + Settings]
    D --> D4[...other chromium tests...]

    E --> E1[stress/CustomPropertiesAllEntities.spec.ts\n17 entities excl. Table + Dashboard]
    E --> E2[stress/Lineage/DataAssetLineageAllEntities.spec.ts\n14 entities excl. Table]

    D1 & D2 -->|shared helper| SH1[playwright/shared/customPropertiesEntityTests.ts]
    E1 -->|shared helper| SH1
    D3 -->|shared helper| SH2[playwright/shared/dataAssetLineageEntityTests.ts]
    E2 -->|shared helper| SH2

%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%%
flowchart TD
    A[PR Playwright Run] --> B{shardIndex}
    B -->|1| C[DataAssetRulesEnabled\nDataAssetRulesDisabled]
    B -->|2-6| D[chromium project\n--shard=N/5\n3720 tests / 265 files]
    B -->|7 - nightly only| E[stress project\n--project=stress\n404 tests / 5 files]

    D --> D1[TableCustomProperty.spec.ts\n15 tests - Table only]
    D --> D2[DashboardAdvanceSearchCustomProperty.spec.ts\n44 tests - Dashboard only]
    D --> D3[Lineage/TableLineage.spec.ts\n88 tests - Table + Column Level + Temp + Settings]
    D --> D4[...other chromium tests...]

    E --> E1[stress/CustomPropertiesAllEntities.spec.ts\n17 entities excl. Table + Dashboard]
    E --> E2[stress/Lineage/DataAssetLineageAllEntities.spec.ts\n14 entities excl. Table]

    D1 & D2 -->|shared helper| SH1[playwright/shared/customPropertiesEntityTests.ts]
    E1 -->|shared helper| SH1
    D3 -->|shared helper| SH2[playwright/shared/dataAssetLineageEntityTests.ts]
    E2 -->|shared helper| SH2

_{Reviews (19): Last reviewed commit: "playwright(ci): restore retries:2 — flak..." | Re-trigger Greptile}

… PR CI time Drop retries from 2 to 1 and maxFailures from 500 to 50 so a failing test no longer burns up to 15 min per retry (worst case for files using test.setTimeout) and a fundamentally broken PR bails in minutes rather than running the full ~2h35m suite. Replace 7 unannotated waitForTimeout calls in Tasks/ActivityFeed/Stream specs with proper locator waits or remove them where the upstream expect/response already absorbs the wait. Migrate 4 ES-indexing retries in DomainUIInteractions and DataProductAndSubdomains to the existing waitForSearchIndexed helper so they actively poll the search API instead of blind-sleeping 2s × N retries. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ime in CI Reduce 19 hard-coded waits across in-PR-scope helpers and specs without changing what they wait for or removing the annotated reasons. The biggest single win is utils/lineage.ts dragAndDropNode (1000→300ms), which fires ~225 times in DataAssetLineage.spec.ts and recovers ~2.5min on that file alone. Scale used: 2000→500, 1000→300, 500→200, with conservative caps on pipeline-deployment waits (5000→2000) since those have real backend settling time. Promise.race race-cap timeouts (widgetFilters, customizeLandingPage line 513) are left untouched — those are response deadlines, not sleeps. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…s project CustomProperties.spec.ts (3862 lines, 476 tests, ~193 min test-time) and DataAssetLineage.spec.ts (557 lines, ~69 min) ran the same UI flow for 19 and 15 entity types respectively — redundant safeguards on every PR. Split each into a shared helper + thin entry files so PR runs only the representative entity (Table) plus Dashboard's unique advanced-search coverage; the other entities now run in a new `stress` project triggered manually via postgresql-nightly-e2e.yml (workflow_dispatch). Nightly itself and the existing nightly/ folder are untouched. PR runs: - Pages/TableCustomProperty.spec.ts (Table CRUD + name validation) - Pages/DashboardAdvanceSearchCustomProperty.spec.ts (Dashboard full) - Pages/Lineage/TableLineage.spec.ts (Table parameterized + 3 unparameterized describes) Stress runs (manual via workflow_dispatch): - stress/CustomPropertiesAllEntities.spec.ts (17 other entities) - stress/Lineage/DataAssetLineageAllEntities.spec.ts (14 other entities) Estimated max-shard impact: shard 4 drops from ~141 min to ~85 min (CustomProperties wall-time down from ~64 min to ~20 min on PR), shard 6 drops from ~124 min to ~109 min (DataAssetLineage down to Table-only). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…l to 7 Apply the same shard 7 stress addition to mysql-nightly that was added to postgresql-nightly in the prior commit. Also bump shardTotal from 6 to 7 in both files so the matrix metadata matches the actual entry count (shardTotal is unused by the run command — --shard=N/5 is hardcoded — but the stale 6 was misleading). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Two tests in this file sleep on intentional JWT-expiry waits (3 min and ~2 min via waitForTimeout). With the global retries:1, a single flake under one of those tests would relive the same 3- or 2-min sleep before reporting — ~5 min of waste per failed retry. Token-expiry failures are almost always real (config drift / token TTL mismatch), not transient flakes, so retrying buys nothing. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

github-actions · 2026-06-24T14:41:53Z

❌ PR checklist incomplete

This PR cannot be merged until the following are addressed on its linked issue:

No GitHub issue is linked. Link an issue in the Development section of the PR (or add Fixes #12345 to the description). For a same-org cross-repo issue, add Fixes open-metadata/<repo>#123 to the description.

The fields live on the linked issue in the Shipping project (open the issue → right sidebar → Projects). After you set them, re-run this check (or push a commit) — issue/project changes do not re-trigger it automatically.

Maintainers can bypass this check by adding the skip-pr-checks label.

greptile-apps · 2026-06-24T14:52:58Z


    // eslint-disable-next-line playwright/no-wait-for-timeout -- wait to catch any additional monogram requests after save
-    await page.waitForTimeout(2000);
+    await page.waitForTimeout(500);


Observation window may be too short for delayed requests

The purpose of this wait is to catch any additional monogram requests that fire after the save response. The assertion monogramCallCount <= 1 is only meaningful if we wait long enough for late-arriving requests to land. At 500ms the window is 4× shorter than the original 2000ms, so a duplicate request delayed by UI re-renders or debounce cycles that previously would have been caught could now slip past undetected, making the test pass falsely. If this was reduced to speed up CI, consider replacing it with a short waitForResponse or a network-idle guard instead of a fixed sleep.

greptile-apps · 2026-06-24T14:53:00Z


    // eslint-disable-next-line playwright/no-wait-for-timeout -- wait for auth state to be persisted to indexedDB
-    await adminPage.waitForTimeout(2000);
+    await adminPage.waitForTimeout(500);


Auth-state persistence window reduced to 500ms

This sleep is the only guard between waitForFunction(() => indexedDB.databases()) (which only confirms the API exists, not that data has been written) and the storageState snapshot. On slow CI VMs or under memory pressure the browser engine may not have finished flushing the auth cookies/localStorage to the snapshot path within 500ms. A failure here would silently produce an empty or partial auth file, causing every downstream test to fail with a login error rather than a clear auth-setup error. Since this runs once per shard, the cost of keeping it at 1000ms or even 2000ms is negligible compared to the risk.

github-actions · 2026-06-24T15:01:04Z

Jest test Coverage

UI tests summary

Lines	Statements	Branches	Functions
	63.42% (70688/111446)	45.9% (40625/88504)	47.81% (12369/25866)

…ilwind name Main commit 9ef9465 (#29199 "Fixed the colors for context center") renamed the upvote/downvote SVG fill class from `fill-blue-500` to `tw:fill-utility-blue-500` but the matching test regex was not updated. Two assertions in ContextCenterPermission.spec.ts have been failing on every PR since that commit merged because the regex still looks for the old name. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…2e-ci

…metadata/OpenMetadata into optimize-playwright-e2e-ci

…e responses The test 'Table pagination with sorting should works' was failing on both attempts in CI with `expected: 15, received: 0`. Root cause: the test clicks 'Name' to sort, then clicks 'next' to paginate, then asserts row count — but it only waits for `waitForAllLoadersToDisappear` between actions. The loader for the current page can finish before the new page-2 fetch arrives, leaving the assertion to run on an empty table mid-transition. Two changes: - Register `waitForResponse('/api/v1/dataQuality/testCases/search/list?*')` before both the sort click and the next-page click, so the test waits for the new data to land. - Replace `expect(... count() === 15).toBe(15)` with `expect(locator).toHaveCount(15)`, which auto-retries until the table re-renders. Exposed by the move from retries:2 → retries:1; was passing on 3rd attempt before. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

sonarqubecloud · 2026-06-26T09:14:55Z

Quality Gate passed for 'open-metadata-ui'

Issues
1241 New issues
0 Accepted issues

Measures
13 Security Hotspots
0.0% Coverage on New Code
3.4% Duplication on New Code

See analysis details on SonarQube Cloud

… bulk-edit reads The 'Glossary' test was failing both attempts with Expected: "Entity updated" Received: "Entity created" After api-creating a glossary + term, the test immediately opens the bulk-edit table. That table reads the glossary's terms from ES; if the term hasn't been indexed yet, the table comes back empty. Filling row 1 with the term's name then registers as a *new* term creation instead of an *update* of the existing one, so the import status shows "Entity created" instead of "Entity updated". Fix: insert a `waitForSearchIndexed` on the term's FQN against `glossary_term_search_index` immediately after `glossaryTerm.create(...)`, matching the existing pattern (Domains, ObservabilityAlerts, etc.). Exposed by retries:2 → retries:1; previously passed on the 3rd attempt. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…rt flakes 4 of the 7 flake-on-retry tests from the latest CI run are the same ES indexing race as BulkEditEntity:661 (already fixed in 860e875): the test creates entities via API, then immediately opens a bulk-edit or bulk-import view that reads from ES. The ES indexing lag races the test. Added waitForSearchIndexed after each API-create: - BulkEditOperationBadges.spec.ts beforeAll: term + table indexed before any of the 4 tests in the file runs - BulkImport.spec.ts: - Database service test → database_service_search_index - Database Schema test → database_schema_search_index - Table test → table_search_index ContextCenterPermission Archive (1433, 1506) and DataQualityDashboard Data Product (1021) flakes are transient backend slowness — pass on retry; not deterministic test bugs to fix. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…2e-ci

…xing The 'Admin: Complete export-import-validate flow' test was failing on both attempts with: Expected: "5" Received: "4" on the `processed-row` assertion. Root cause: same ES indexing race pattern. The beforeAll creates the table + one test case via API, then `performE2EExportImportFlow` exports the table's test cases via search. Without waiting for the just-created test case to be indexed, the export's CSV is missing the 1 existing test case row, so 4 added rows = 4 processed (not the expected 5 = 1 existing + 4 added). Added `waitForSearchIndexed(... test_case_search_index)` after `table.createTestCase()` in both the Admin and EditAll User describes. Also: re-ran organize-imports + prettier on BulkEditEntity.spec.ts to fix the UI Checkstyle Playwright-lint job — the prior commit's manual edit left the file out of the formatting sequence. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…r grid render Both genuine failures in the latest CI run (BulkImport.spec.ts:154 and :614) failed at `expect('.rdg-cell-details').toContainText(rowStatus)`. From the retry log, the result grid IS populating but slowly — on retry the helper observed `3 × locator resolved to 0 elements` then `16 × locator resolved to 2 elements` while the test expected 6 cells. The assertion was running against a mid-render grid and never caught up within 15s (the default expect timeout). Made the helper a two-step wait: first `toHaveCount(rowStatus.length)` with a 60s timeout so the grid finishes rendering all rows, then the existing `toContainText` against the populated cells. This avoids the race without changing what the assertion checks. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Two waits I tightened in the earlier sweep had non-obvious correctness implications that two reviewers (greptile-apps, gitar-bot) flagged. Reverting both to 2000ms. 1. auth.setup.ts:208 (500→2000ms): This is the only guard between the indexedDB API being available and the storageState snapshot being written. waitForFunction confirms the API exists but not that auth data has been persisted. A miss here corrupts admin.json and cascade-fails every downstream spec that uses `test.use({ storageState: 'playwright/.auth/admin.json' })` — and with retries:1 the blast radius is larger than it would have been before. The 1.5s saved × 6 shards is negligible vs that risk. 2. CustomThemeConfig.spec.ts:207 (500→2000ms): This is the rare wait that IS the test's assertion-observation window: the assertion `monogramCallCount <= 1` is only meaningful for the duration the test sits and watches for late requests. At 500ms a duplicate monogram request arriving 500ms-2000ms after save would not be counted, so the test could pass while the regression it guards against is present (false negative). Added an inline comment so a future me doesn't re-tighten this. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…res/pages `support/fixtures/userPages.ts` defined 7 storage-state-keyed Page fixtures (adminPage, dataConsumerPage, dataStewardPage, ownerPage, editDescriptionPage, editTagsPage, editGlossaryTermPage). All 7 overlapped with `e2e/fixtures/pages.ts`, which already had 8 of the same fixtures (incl. viewOnlyPage) plus an `adminPage`-equivalent override of the default `page` fixture. The only meaningful difference was that userPages used `browser.newContext()` + `context.newPage()` while pages.ts uses `browser.newPage()` — functionally equivalent since `newPage()` creates a context under the hood. Usage breakdown was 66 specs on pages.ts vs 8 specs on userPages.ts, so userPages.ts was the duplicate to remove. Changes: - Added explicit `adminPage` alias to pages.ts (returns same admin-storage page as the default `page` fixture) so migrating specs don't need to rename call-sites - Re-exported `expect` from pages.ts (matches userPages's surface) - Updated 8 specs to import from `e2e/fixtures/pages` instead of `support/fixtures/userPages` - Deleted `support/fixtures/userPages.ts` Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…stomThemeConfig In commit a4a346e I added a multi-line explanatory comment above the 2-second monogram-observation wait, which inadvertently pushed the `// eslint-disable-next-line playwright/no-wait-for-timeout` directive several lines up from the actual `await page.waitForTimeout(2000)` call. `eslint-disable-next-line` only applies to the immediately following line, so the rule fired and the Playwright lint job failed. Moved the disable comment back to be adjacent to the waitForTimeout, with the explanation block kept above it (no other content changed). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…tive assertion ExploreBrowse.spec.ts:255 (service type drill-down): The test calls expandTreeNode('Databases'), then clicks explore-tree-title-mysql. expandTreeNode only waits for loaders to disappear, but the tree's child rows keep animating/repositioning for a beat after that. The subsequent .click() then times out with "waiting for element to be visible, enabled and stable". Added an explicit `await expect(serviceTitle).toBeVisible()` before the click; toBeVisible polls until the element is stable, which lets the click land cleanly. SearchRBAC negative assertion (utils/searchRBAC.ts, exploreShouldShowEntity): The "fully denied user sees neither asset type" test consistently saw 1 element across 18 retries (~15s) — both initial and retry attempts ran ~30s and failed. Root cause: RBAC enforcement against a newly-assigned user role lags the patch call by several seconds while the search-index user doc updates. The default 15s expect-timeout for the toHaveCount(0) assertion isn't enough for the result-card to drop out as the role takes effect. Extended the negative-assertion timeout to 45s. The positive-assertion case (shouldSee=true) keeps the default — it only ever needs to see the entity once, not wait for it to disappear. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…2e-ci

…specificity 3 genuine failures (failed both attempts) from latest CI: DescriptionSuggestion.spec.ts:326 and TagsSuggestion.spec.ts:526 share the same root cause in utils/taskWorkflow.ts:openTaskDetails — the test creates a task via API, then opens the entity's tasks tab and expects the new task card to be visible within 15s. Under Basic-project parallelism the activity-feed UI refresh lags past 15s and the card never appears within the default timeout. Bumped openTaskDetails' toBeVisible timeout from 15s to 45s — healthy runs aren't affected since toBeVisible polls and resolves as soon as the card appears. SearchIndexNestedColumns.spec.ts:109 — `waitForResponse('/api/v1/search/query?*')` was racing with the empty-query suggestion fetch fired when the search box is clicked. The test's listener would resolve on that earlier response and then check for the suggestion dropdown before the table-name query had even fired. Made both waitForResponse patterns predicate-based so they match only the response whose URL contains the encoded search term. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…test After three CI iterations on retries:1 the same root cause kept surfacing in different tests each run (task-feed UI not refreshing within 45s after API task create, RBAC index propagation, search dropdowns racing the empty-query suggestion response, etc.). Each one takes its own diagnosis round and the failing set rotates, so we can't land the PR by fixing them one at a time. Restoring retries:2. The rest of the PR's wins stay: - maxFailures:50 still bails fundamentally broken PRs in minutes rather than running the full ~2h35m suite - The PR↔stress split still cuts ~20% off worst-case PR shard time - Wait-time tightenings stay (the bad ones were already reverted) - ES-indexing-race fixes (waitForSearchIndexed in BulkEdit, BulkImport, TestCaseImportExport, etc.) stay — those were real bugs, not retry-masked flakes - Spec-side fixes (Roles loader, SearchRBAC waitForResponse, exploreTreeCategories poll, Domains setTimeout cap, ContextCenter CSS regex, Table.spec pagination wait, etc.) all stay Net effect: PR still ships meaningful CI improvements; we just don't pretend we can drop a retry for free against a flaky baseline. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

gitar-bot · 2026-06-29T10:01:23Z

Code Review ⚠️ Changes requested 4 resolved / 5 findings

Restructures Playwright CI to reduce shard times by extracting entity-specific tests into shared helpers and introducing a dedicated stress project. Changes requested due to a potential regression where the validateFollowedEntityToWidget helper no longer asserts positive follow outcomes.

⚠️

Bug: validateFollowedEntityToWidget no longer asserts positive case

📄 openmetadata-ui/src/main/resources/ui/playwright/utils/entity.ts:1450-1452 📄 openmetadata-ui/src/main/resources/ui/playwright/e2e/Pages/Domains.spec.ts:370-374 📄 openmetadata-ui/src/main/resources/ui/playwright/e2e/Pages/Domains.spec.ts:455 📄 openmetadata-ui/src/main/resources/ui/playwright/e2e/Pages/Domains.spec.ts:543-547

In the new shared helper validateFollowedEntityToWidget, the isFollowing === true branch only calls await followingWidget.isVisible() and await followingWidget.getByTestId(following-${entity}).isVisible(). Locator.isVisible() resolves to a boolean and performs no assertion (and does not auto-retry), so when the followed entity is absent from the widget the test still passes. This weakens coverage compared to the code it replaces in Domains.spec.ts, which used await expect(followingWidget).toContainText(entity.displayName) — a real, auto-retrying assertion. The negative branch correctly uses expect(...).not.toBeVisible(), making the asymmetry clear. As a result the three migrated Domains follow-flow tests no longer verify that a followed Domain / SubDomain / DataProduct actually appears in the Following widget. Replace the positive-branch isVisible() calls with await expect(followingWidget.getByTestId(following-${entity})).toBeVisible().

Use auto-retrying expect assertions for the positive follow case so a missing entity fails the test.

if (isFollowing) {
  await expect(followingWidget).toBeVisible();
  await expect(
    followingWidget.getByTestId(`following-${entity}`)
  ).toBeVisible();
} else {

✅ 4 resolved

✅ Performance: Stress shard (404 tests, ~unsharded) may become nightly bottleneck

📄 openmetadata-ui/src/main/resources/ui/playwright.config.ts:233-240 📄 .github/workflows/postgresql-nightly-e2e.yml:102-104
The new stress Playwright project runs all of e2e/stress/** (404 tests, including the 14-entity DataAssetLineage matrix each with test.setTimeout(5*60*1000)) in a single matrix entry (shardIndex 7) with no --shard splitting, while the rest of the suite is spread across 5 chromium shards. Since the original DataAssetLineage matrix alone was ~69 min and CustomProperties ~193 min on shard 4, concentrating the full multi-entity coverage on one unsharded shard could make shard 7 the dominant wall-clock cost of the nightly run. Consider sharding the stress project (e.g. --shard=N/M across additional matrix entries) if nightly duration becomes a concern. Acceptable for a manual workflow_dispatch run, hence minor.

✅ Performance: New stress project runs 404 tests unsharded on a single nightly shard

📄 openmetadata-ui/src/main/resources/ui/playwright.config.ts:233-240 📄 .github/workflows/postgresql-nightly-e2e.yml:99-105 📄 .github/workflows/mysql-nightly-e2e.yml:99-105
The new stress project (npx playwright test --project=stress) runs on shardIndex 7 with no --shard flag, while the chromium suite is split across 5 shards. The two parameterized giants that were moved here (CustomProperties ~17 entities and DataAssetLineage ~14 entities, each with test.setTimeout(5 * 60 * 1000) per test) account for the bulk of the original ~193min + ~69min test-time. Concentrating all 404 of these long-running tests into one unsharded job (workers:3) is likely to make shard 7 the dominant bottleneck of the nightly run, potentially exceeding the time it would take if these tests were distributed. Consider sharding the stress project too (e.g. a stressShardIndex/stressShardTotal matrix with --project=stress --shard=N/M) so nightly wall-clock isn't gated by a single slow shard.

✅ Edge Case: Shorter monogram observation window weakens duplicate-request guard

📄 openmetadata-ui/src/main/resources/ui/playwright/e2e/Pages/CustomThemeConfig.spec.ts:204-210
This test exists specifically to catch a regression where the monogram URL is requested more than once after save (expect(monogramUrlCallCount).toBeLessThanOrEqual(1)). The assertion is only meaningful for the duration the test sits and observes network traffic after await saveResponse. Reducing that observation window from 2000ms to 500ms means a duplicate request that arrives between 500ms and 2000ms after save will no longer be counted, so the test can pass while the very regression it guards against is present (false negative). Unlike most of the other waitForTimeout cuts in this PR (which only wait for an expected state to appear), here the wait duration is the test coverage. Consider keeping a longer window here, or replacing the blind wait with an explicit assertion that no second monogram request fires (e.g. wait for network idle / assert on a captured request promise that should reject/timeout).

✅ Edge Case: Reduced auth indexedDB persistence wait may cascade-fail all tests

📄 openmetadata-ui/src/main/resources/ui/playwright/e2e/auth.setup.ts:205-211
The wait for the admin auth state to be persisted to indexedDB before saving storage state was cut from 2000ms to 500ms. Unlike per-spec waits, a miss here corrupts the saved admin.json storage state that nearly every downstream spec (including the new TableCustomProperty/Dashboard/stress specs which test.use({ storageState: 'playwright/.auth/admin.json' })) depends on, so a single under-wait could fail an entire shard rather than one test. Because retries were also reduced to 1 in this PR, the blast radius of an occasional miss is larger. This is speculative (indexedDB persistence is usually fast), but given the cascade risk consider polling for the persisted state rather than a fixed 500ms sleep, or keeping a more conservative value here specifically.

🤖 Prompt for agents

Code Review: Restructures Playwright CI to reduce shard times by extracting entity-specific tests into shared helpers and introducing a dedicated stress project. Changes requested due to a potential regression where the `validateFollowedEntityToWidget` helper no longer asserts positive follow outcomes.

1. ⚠️ Bug: validateFollowedEntityToWidget no longer asserts positive case
   Files: openmetadata-ui/src/main/resources/ui/playwright/utils/entity.ts:1450-1452, openmetadata-ui/src/main/resources/ui/playwright/e2e/Pages/Domains.spec.ts:370-374, openmetadata-ui/src/main/resources/ui/playwright/e2e/Pages/Domains.spec.ts:455, openmetadata-ui/src/main/resources/ui/playwright/e2e/Pages/Domains.spec.ts:543-547

   In the new shared helper `validateFollowedEntityToWidget`, the `isFollowing === true` branch only calls `await followingWidget.isVisible()` and `await followingWidget.getByTestId(`following-${entity}`).isVisible()`. `Locator.isVisible()` resolves to a boolean and performs no assertion (and does not auto-retry), so when the followed entity is absent from the widget the test still passes. This weakens coverage compared to the code it replaces in Domains.spec.ts, which used `await expect(followingWidget).toContainText(entity.displayName)` — a real, auto-retrying assertion. The negative branch correctly uses `expect(...).not.toBeVisible()`, making the asymmetry clear. As a result the three migrated Domains follow-flow tests no longer verify that a followed Domain / SubDomain / DataProduct actually appears in the Following widget. Replace the positive-branch `isVisible()` calls with `await expect(followingWidget.getByTestId(`following-${entity}`)).toBeVisible()`.

   Fix (Use auto-retrying expect assertions for the positive follow case so a missing entity fails the test.):
   if (isFollowing) {
     await expect(followingWidget).toBeVisible();
     await expect(
       followingWidget.getByTestId(`following-${entity}`)
     ).toBeVisible();
   } else {

Options

Display: compact → Showing less information.

Comment with these commands to change:

`Compact`
`gitar display:verbose`

_{Was this helpful? React with 👍 / 👎 | Gitar}

chirag-madlani and others added 5 commits June 24, 2026 16:59

chirag-madlani requested review from a team, akash-jain-10, harshach and tutte as code owners June 24, 2026 14:41

github-actions Bot added safe to test Add this label to run secure Github workflows on PRs UI UI specific issues labels Jun 24, 2026

chirag-madlani temporarily deployed to test June 24, 2026 14:43 — with GitHub Actions Inactive

chirag-madlani had a problem deploying to test June 24, 2026 14:43 — with GitHub Actions Failure

chirag-madlani temporarily deployed to test June 24, 2026 14:43 — with GitHub Actions Inactive

gitar-bot Bot reviewed Jun 24, 2026

View reviewed changes

Comment thread openmetadata-ui/src/main/resources/ui/playwright.config.ts

chirag-madlani had a problem deploying to test June 24, 2026 14:52 — with GitHub Actions Failure

greptile-apps Bot reviewed Jun 24, 2026

View reviewed changes

chirag-madlani had a problem deploying to test June 24, 2026 14:54 — with GitHub Actions Failure

chirag-madlani temporarily deployed to test June 24, 2026 14:54 — with GitHub Actions Inactive

chirag-madlani had a problem deploying to test June 24, 2026 14:54 — with GitHub Actions Failure

chirag-madlani temporarily deployed to test June 24, 2026 14:54 — with GitHub Actions Inactive

ShaileshParmar11 previously approved these changes Jun 24, 2026

View reviewed changes

chirag-madlani temporarily deployed to test June 25, 2026 08:55 — with GitHub Actions Inactive

chirag-madlani had a problem deploying to test June 25, 2026 08:55 — with GitHub Actions Failure

chirag-madlani temporarily deployed to test June 25, 2026 08:55 — with GitHub Actions Inactive

chirag-madlani had a problem deploying to test June 25, 2026 08:55 — with GitHub Actions Failure

chirag-madlani temporarily deployed to test June 25, 2026 08:59 — with GitHub Actions Inactive

chirag-madlani had a problem deploying to test June 25, 2026 08:59 — with GitHub Actions Failure

chirag-madlani and others added 5 commits June 25, 2026 19:20

Merge remote-tracking branch 'origin/main' into optimize-playwright-e…

23fe8bd

…2e-ci

Merge branch 'main' into optimize-playwright-e2e-ci

26e9502

Merge branch 'optimize-playwright-e2e-ci' of https://github.com/open-…

1929108

…metadata/OpenMetadata into optimize-playwright-e2e-ci

chirag-madlani and others added 12 commits June 26, 2026 15:12

Merge remote-tracking branch 'origin/main' into optimize-playwright-e…

383c1ad

…2e-ci

Merge remote-tracking branch 'origin/main' into optimize-playwright-e…

310b277

…2e-ci

chirag-madlani mentioned this pull request Jun 29, 2026

playwright(fix): test reliability — ES indexing, locator strict-mode, waitForResponse races #29585

Open

4 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

playwright(ci): split per-entity suites + cap retries + reduce dead-time waits#29435

playwright(ci): split per-entity suites + cap retries + reduce dead-time waits#29435
chirag-madlani wants to merge 29 commits into
mainfrom
optimize-playwright-e2e-ci

chirag-madlani commented Jun 24, 2026 •

edited by greptile-apps Bot

Loading

Uh oh!

github-actions Bot commented Jun 24, 2026

Uh oh!

Uh oh!

greptile-apps Bot Jun 24, 2026

Uh oh!

Uh oh!

greptile-apps Bot Jun 24, 2026

Uh oh!

github-actions Bot commented Jun 24, 2026 •

edited

Loading

Uh oh!

sonarqubecloud Bot commented Jun 26, 2026

Uh oh!

gitar-bot Bot commented Jun 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

Uh oh!

Conversation

chirag-madlani commented Jun 24, 2026 • edited by greptile-apps Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Describe your changes:

Type of change:

High-level design:

Tests:

Use cases covered

Playwright (UI) tests

Manual testing performed

UI screen recording / screenshots:

Checklist:

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Flowchart

Uh oh!

github-actions Bot commented Jun 24, 2026

❌ PR checklist incomplete

Uh oh!

Uh oh!

greptile-apps Bot Jun 24, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

greptile-apps Bot Jun 24, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot commented Jun 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Jest test Coverage

UI tests summary

Uh oh!

sonarqubecloud Bot commented Jun 26, 2026

Quality Gate passed for 'open-metadata-ui'

Uh oh!

gitar-bot Bot commented Jun 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

chirag-madlani commented Jun 24, 2026 •

edited by greptile-apps Bot

Loading

github-actions Bot commented Jun 24, 2026 •

edited

Loading