Skip to content

ci(playwright): split E2E tests into AI and non-AI groups#522

Open
gkorland wants to merge 2 commits intostagingfrom
ci/split-e2e-ai-tests
Open

ci(playwright): split E2E tests into AI and non-AI groups#522
gkorland wants to merge 2 commits intostagingfrom
ci/split-e2e-ai-tests

Conversation

@gkorland
Copy link
Copy Markdown
Contributor

@gkorland gkorland commented Apr 5, 2026

Summary

Split Playwright E2E tests into two groups based on whether they require LLM secrets (Azure OpenAI API). Non-AI tests always run; AI tests only run when secrets are available. This replaces the blanket skip for Dependabot PRs with granular coverage.

Changes

  • e2e/tests/chat.spec.ts — Tagged 8 of 9 tests with @requires-ai (all except chat controls disabled without database connection)
  • e2e/tests/database.spec.ts — Tagged 8 of 9 tests with @requires-ai (all except invalid connection string -> shows error)
  • e2e/tests/sidebar.spec.ts — No changes (all tests are non-AI)
  • e2e/tests/userProfile.spec.ts — No changes (all tests are non-AI)
  • .github/workflows/playwright.yml:
    • Removed job-level if skip condition
    • Added Check for AI secrets step that detects secret availability
    • Split Playwright run into two steps: Run Playwright tests (no AI) (always) and Run Playwright tests (AI) (conditional)
    • Docker Compose test databases only start when AI tests will run
  • AGENTS.md — Updated Testing and CI/CD sections

Test Split (chromium in CI)

Group Tests Runs When
Non-AI 16 tests (sidebar, userProfile, 1 chat, 1 database) Always
AI (@requires-ai) 16 tests (8 chat, 8 database) Secrets available

Testing

  • Verified YAML is valid
  • Verified --grep @requires-ai and --grep-invert @requires-ai list correct tests via npx playwright test --list
  • Tag counts: 8 in chat.spec.ts, 8 in database.spec.ts, 0 in sidebar/userProfile

Memory / Performance Impact

N/A — CI workflow change only. Dependabot PRs will be faster (no Docker Compose, no AI test wait).

Related Issues

Replaces the blanket skip from #518 with granular test splitting.

Summary by CodeRabbit

  • Tests

    • Marked several E2E tests as AI-dependent so non-AI tests run on every CI run while AI-dependent tests run only when LLM credentials are available.
  • Chores

    • CI workflow split into non-AI and AI test runs; AI tests are conditionally executed when LLM secrets are present to speed up and stabilize regular runs.
  • Documentation

    • Clarified CI E2E behavior and the requirements for AI-dependent tests.

@overcut-ai
Copy link
Copy Markdown

overcut-ai bot commented Apr 5, 2026

Completed Working on "Code Review"

✅ Review submitted successfully: COMMENT. Total comments: 2 across 2 files.

✅ Workflow completed successfully.


👉 View complete log

@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 5, 2026

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Snapshot Warnings

⚠️: No snapshots were found for the head SHA 4ad8f7c.
Ensure that dependencies are being submitted on PR branches and consider enabling retry-on-snapshot-warnings. See the documentation for more information and troubleshooting advice.

Scanned Files

None

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Apr 5, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: caa7d395-073d-44ff-af5e-c36056cc90ef

📥 Commits

Reviewing files that changed from the base of the PR and between 806a53f and 4ad8f7c.

📒 Files selected for processing (4)
  • .github/workflows/playwright.yml
  • AGENTS.md
  • e2e/tests/chat.spec.ts
  • e2e/tests/database.spec.ts
✅ Files skipped from review due to trivial changes (2)
  • AGENTS.md
  • e2e/tests/database.spec.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • .github/workflows/playwright.yml

📝 Walkthrough

Walkthrough

CI workflow now computes whether Azure LLM secrets (AZURE_API_KEY, AZURE_API_BASE, AZURE_API_VERSION) are present, gates AI-dependent setup (Docker Compose and AI tests) on that flag, splits Playwright into a non-AI run (always) and an AI run (only when secrets exist, using !cancelled()), test specs were tagged @requires-ai, and docs updated.

Changes

Cohort / File(s) Summary
Workflow Configuration
​.github/workflows/playwright.yml
Removed job-level Dependabot skip; added a Check for AI secrets step that sets has-ai-secrets when secrets.AZURE_API_KEY, secrets.AZURE_API_BASE, and secrets.AZURE_API_VERSION exist. Docker Compose and AI-dependent setup/exec are gated on has-ai-secrets. Playwright split into a non-AI run (always, --grep-invert \@requires-ai`) and an AI run (only when has-ai-secretsis true,--grep `@requires-ai`and!cancelled()`).
Test Metadata (AI-Dependent)
e2e/tests/chat.spec.ts, e2e/tests/database.spec.ts
Added { tag: '@requires-ai' } metadata to multiple existing tests that depend on LLM-driven DB schema or AI behavior; one chat test removed an unnecessary DB-connect step indicating it is pure UI-only. Test bodies and assertions otherwise unchanged.
Documentation
AGENTS.md
Updated to state that Playwright E2E tests tagged @requires-ai depend on Azure OpenAI secrets to load DB schema; non-@requires-ai E2E tests run unconditionally. Notes that secrets are unavailable for Dependabot and fork PRs.

Sequence Diagram(s)

sequenceDiagram
  participant CI as CI Workflow
  participant S as Secrets Store
  participant DB as Docker Compose DB
  participant Pno as Playwright (non-AI)
  participant Pai as Playwright (AI)

  CI->>S: check AZURE_API_KEY, AZURE_API_BASE, AZURE_API_VERSION
  alt has-ai-secrets == true
    CI->>CI: set has-ai-secrets = true
    CI->>DB: start test databases
    DB-->>CI: DB ready
    CI->>Pno: run Playwright (--grep-invert `@requires-ai`)
    Pno-->>CI: non-AI results
    CI->>Pai: run Playwright (--grep `@requires-ai`, !cancelled())
    Pai-->>CI: AI-test results
  else has-ai-secrets == false
    CI->>CI: set has-ai-secrets = false
    CI->>Pno: run Playwright (--grep-invert `@requires-ai`)
    Pno-->>CI: non-AI results
  end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

With twitching nose I sniff the keys,
Secrets present? I hop with ease,
Some tests leap where AI lights,
Others bound on simpler flights,
A rabbit claps—CI hums, we please! 🐇✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately and concisely summarizes the main change: splitting E2E tests into AI and non-AI groups, which is the primary objective of the PR.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch ci/split-e2e-ai-tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@railway-app
Copy link
Copy Markdown

railway-app bot commented Apr 5, 2026

🚅 Deployed to the QueryWeaver-pr-522 environment in queryweaver

Service Status Web Updated (UTC)
QueryWeaver ✅ Success (View Logs) Web Apr 5, 2026 at 11:38 am

@railway-app railway-app bot temporarily deployed to queryweaver / QueryWeaver-pr-522 April 5, 2026 06:53 Destroyed
Copy link
Copy Markdown

@overcut-ai overcut-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Summary

Found 2 comments total: 1 MAJOR, 1 MINOR.

Key themes

  • Test classification accuracy: One UI-only test appears incorrectly tagged as AI-dependent, reducing baseline coverage when secrets are unavailable.
  • CI behavior documentation clarity: Contributor-facing docs should better explain when AI-tagged tests are skipped (including fork PR scenarios).

Next steps

  1. Reclassify the empty query submission is prevented test so non-AI baseline validation always runs.
  2. Update AGENTS.md to explicitly state AI-tagged tests depend on secret availability and are usually skipped for fork/Dependabot PRs.

@gkorland gkorland force-pushed the ci/split-e2e-ai-tests branch from d9524e5 to 806a53f Compare April 5, 2026 06:57
@railway-app railway-app bot temporarily deployed to queryweaver / QueryWeaver-pr-522 April 5, 2026 06:57 Destroyed
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/playwright.yml:
- Around line 29-40: The gate step with id "check-secrets" only checks
AZURE_API_KEY but should verify the full Azure config (AZURE_API_KEY,
AZURE_API_BASE, AZURE_API_VERSION) before setting has-ai-secrets; update the
check-secrets run logic to ensure all three environment variables are non-empty
and only then emit "has-ai-secrets=true" (otherwise emit false) so the AI shard
is skipped when any of AZURE_API_KEY, AZURE_API_BASE, or AZURE_API_VERSION is
missing.
- Around line 187-196: Replace the two sequential Playwright steps ("Run
Playwright tests (no AI)" and "Run Playwright tests (AI)") with a single job
using a strategy.matrix that includes two entries (e.g., {name: "no-ai", grep:
"--grep-invert `@requires-ai`"} and {name: "ai", grep: "--grep `@requires-ai`"}),
set strategy.fail-fast: false, and run npx playwright test using matrix.grep;
guard the AI shard by adding an if on the run step like: if: matrix.name != 'ai'
|| steps.check-secrets.outputs.has-ai-secrets == 'true' so both shards execute
independently and the job still fails if any matrix run fails.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6d1acb5f-a7d7-46b0-9269-f28eba088335

📥 Commits

Reviewing files that changed from the base of the PR and between e2f82c4 and d9524e5.

📒 Files selected for processing (4)
  • .github/workflows/playwright.yml
  • AGENTS.md
  • e2e/tests/chat.spec.ts
  • e2e/tests/database.spec.ts

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
e2e/tests/chat.spec.ts (1)

144-153: ⚠️ Potential issue | 🟠 Major

Keep empty-input validation in the non-AI test lane.

At Line 144, this test checks client-side disabled-state behavior for empty input. Tagging it @requires-ai drops baseline UI validation coverage whenever secrets are unavailable. Please remove the AI tag (and avoid AI-dependent setup in this test).

Suggested fix
-  test('empty query submission is prevented', { tag: '@requires-ai' }, async () => {
+  test('empty query submission is prevented', async () => {
     const homePage = await browser.createNewPage(HomePage, getBaseUrl(), 'e2e/.auth/user.json');
     await browser.setPageToFullScreen();
-
-    // Ensure database is connected (will skip if already connected)
-    await homePage.ensureDatabaseConnected(apiCall);

     // Verify send button is disabled with empty input
     const isSendButtonDisabled = await homePage.isSendQueryButtonDisabled();
     expect(isSendButtonDisabled).toBeTruthy();
   });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@e2e/tests/chat.spec.ts` around lines 144 - 153, The test "empty query
submission is prevented" should not be in the AI-dependent lane: remove the
`@requires-ai` tag from the test declaration and eliminate AI-dependent setup
usage — specifically, change the ensureDatabaseConnected(apiCall) call so it
does not depend on the AI fixture (e.g., call ensureDatabaseConnected() or a
non-AI variant) and avoid passing or referencing apiCall; keep the rest of the
test (HomePage creation, setPageToFullScreen, isSendQueryButtonDisabled
assertion) unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@e2e/tests/chat.spec.ts`:
- Around line 144-153: The test "empty query submission is prevented" should not
be in the AI-dependent lane: remove the `@requires-ai` tag from the test
declaration and eliminate AI-dependent setup usage — specifically, change the
ensureDatabaseConnected(apiCall) call so it does not depend on the AI fixture
(e.g., call ensureDatabaseConnected() or a non-AI variant) and avoid passing or
referencing apiCall; keep the rest of the test (HomePage creation,
setPageToFullScreen, isSendQueryButtonDisabled assertion) unchanged.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 796ae3ae-b88d-44f2-96c1-bd8b68653f44

📥 Commits

Reviewing files that changed from the base of the PR and between d9524e5 and 806a53f.

⛔ Files ignored due to path filters (1)
  • package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (4)
  • .github/workflows/playwright.yml
  • AGENTS.md
  • e2e/tests/chat.spec.ts
  • e2e/tests/database.spec.ts
🚧 Files skipped from review as they are similar to previous changes (3)
  • .github/workflows/playwright.yml
  • AGENTS.md
  • e2e/tests/database.spec.ts

gkorland and others added 2 commits April 5, 2026 14:30
Tag tests requiring LLM secrets (Azure OpenAI) with @requires-ai.
The workflow now runs in two phases:
- Non-AI tests (sidebar, userProfile, error paths) always run
- AI tests (database connection, chat queries) only run when
  secrets are available

This replaces the previous blanket skip for Dependabot PRs with a
granular approach that retains smoke test coverage even when secrets
are unavailable. Docker Compose test databases are also conditionally
started only when AI tests will run.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Remove @requires-ai tag from 'empty query submission is prevented' test
  since it only validates UI state (send button disabled) without needing
  a database connection or AI embeddings
- Check all three Azure secrets (API_KEY, API_BASE, API_VERSION) in the
  secrets detection step, not just the API key
  fail, preserving test signal for both shards
- Update AGENTS.md to mention fork PRs alongside Dependabot as contexts
  where secrets are unavailable

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@gkorland
Copy link
Copy Markdown
Contributor Author

gkorland commented Apr 5, 2026

Agent Review Summary

Comments Fixed (agreed & resolved)

  1. overcut-ai — empty query test tag → Fixed in 4ad8f7c — removed @requires-ai tag and ensureDatabaseConnected() call since the test only validates UI state (send button disabled for empty input)
  2. overcut-ai — AGENTS.md fork PR mention → Fixed in 4ad8f7c — updated docs to mention fork PRs alongside Dependabot
  3. coderabbit — check all Azure secrets → Fixed in 4ad8f7c — now checks all three (AZURE_API_KEY, AZURE_API_BASE, AZURE_API_VERSION)

CI Status

All 11 checks passing ✅

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant