Fix three flaky integration tests by FastLee · Pull Request #4803 · databrickslabs/ucx

FastLee · 2026-04-28T20:43:45Z

Summary

Three previously-failing integration tests, plus a CI fix for a coverage-config issue that was masking the runner's exit status.

Test fixes

test_build_notebook_dependency_graphs_installs_pytest_from_index_url (and its installed-twice sibling) — generate the notebook fixture at test time using whichever pip mirror is configured locally. The helper shells out to python -m pip config list (not pip config get, because get only inspects user/global/site scopes and ignores the env scope that PIP_CONFIG_FILE loads as) and parses the global.index-url='...' line. Falls back to public PyPI when no mirror is configured. The same test now exercises the JFrog mirror in CI and the dev proxy locally without any environment knowledge baked into the test code. Depends on Refactor pip/JFrog integration during CI/CD #4804 keeping the JFrog URL credential-free.
test_installation_when_dashboard_id_is_invalid — skipped; it exercises a deprecated dashboard API, matching the existing skip on line 185 of the same file.
test_create_account_level_groups_nested_groups — scoped the final assertion to the 4 groups the test actually creates, instead of asserting a global "no mismatches anywhere" log line that gets polluted by stale UCX groups left in the shared workspace by earlier runs.

CI: pin `COVERAGE_RCFILE`

The integration job was failing with failed: trigger: run: unknown: exit status 3 even when zero tests failed. Root cause: the databrickslabs/sandbox/acceptance wrapper invokes pytest in multiple per-directory sessions, and coverage.py walks up from CWD looking for config. At least one CWD wasn't surfacing the project's pyproject.toml, so coverage fell back to defaults (branch=false) and wrote line data while sibling sessions wrote arc data. The final session's cov.combine() then raised Can't combine arc data with line data. Setting COVERAGE_RCFILE: ${{ github.workspace }}/pyproject.toml forces every session to read the same [tool.coverage.run] config regardless of CWD.

Test plan

make fmt (black, ruff, mypy, pylint 10.00/10)
make test (2011 passed, coverage 89.83%)
labs test-one for each of the three originally-failing integration tests
CI integration job green (76 ✅ / 0 ❌ / 11 ⏭️, all 11 pytest sessions exit 0)

codecov · 2026-04-28T20:49:01Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 87.77%. Comparing base (45a9a37) to head (2144772).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files

@@           Coverage Diff           @@
##             main    #4803   +/-   ##
=======================================
  Coverage   87.77%   87.77%           
=======================================
  Files         123      123           
  Lines       17569    17569           
  Branches     3713     3713           
=======================================
  Hits        15422    15422           
  Misses       1458     1458           
  Partials      689      689

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

asnare

The updates for 2 of the 3 are good, but we need to adjust the approach for test_build_notebook_dependency_graphs_installs_pytest_from_index_url: we shouldn't be embedding environment details in the test like that, and the approach is brittle.

Per asnare's review on #4803, the prior helper emulated pip's config-file loading and assumed env vars (PIP_INDEX_URL, UV_INDEX_URL) that aren't set in either dev or CI. Shell out to `pip config get global.index-url` instead and fall back to public PyPI when no mirror is configured. After #4804 the JFrog URL no longer carries credentials, so embedding it in the notebook fixture is safe. Co-authored-by: Isaac

- Point pip_install_pytest_with_index_url fixture at the Databricks pypi proxy (https://pypi-proxy.dev.databricks.com/simple/) since public PyPI is blocked from CI. - Skip test_installation_when_dashboard_id_is_invalid; it exercises a deprecated dashboard API. - Scope test_create_account_level_groups_nested_groups assertion to the 4 groups it creates instead of asserting a global "no mismatches" log line, which is polluted by stale UCX groups in the shared workspace. Co-authored-by: Isaac

The integration tests for pip --index-url notebooks hardcoded the URL in a static fixture, which can't satisfy both environments: local dev uses the Databricks pypi proxy while CI's jfrog-auth action configures JFrog via PIP_INDEX_URL/UV_INDEX_URL. Generate the notebook at test time using whichever index is configured (with the dev proxy as a final fallback) so the test exercises a reachable index in either environment. Co-authored-by: Isaac

UV_INDEX_URL is set in CI without credentials (uv stores them in its keyring), so handing that URL to pip --index-url returns 401 and the test fails. The authenticated URL lives in PIP_CONFIG_FILE's [global] index-url, which the jfrog-auth action writes for pip. Read that file first; fall back to PIP_INDEX_URL and the dev proxy. Co-authored-by: Isaac

Per asnare's review on #4803, the prior helper emulated pip's config-file loading and assumed env vars (PIP_INDEX_URL, UV_INDEX_URL) that aren't set in either dev or CI. Shell out to `pip config get global.index-url` instead and fall back to public PyPI when no mirror is configured. After #4804 the JFrog URL no longer carries credentials, so embedding it in the notebook fixture is safe. Co-authored-by: Isaac

`pip config get global.index-url` only inspects the user/global/site scopes and does not surface values from PIP_CONFIG_FILE (which pip loads as the env scope). CI's jfrog-auth action configures pip via PIP_CONFIG_FILE, so `get` returned no key and we fell back to public PyPI — unreachable from CI. `pip config list` reports every scope as `key='value'` lines, so parse that instead. Co-authored-by: Isaac

The acceptance wrapper only emits structured ✅/❌/⏭️ test markers, so when pytest exits 3 with no failed tests we get `failed: trigger: run: unknown: exit status 3` and no traceback. Add `pytest_internalerror`, `pytest_keyboard_interrupt`, and `pytest_sessionfinish` hooks plus `faulthandler.enable()` so unhandled exceptions, interruptions, and signal crashes show up in stderr (and the runner's raw log) with grep-able `UCX_INTERNALERROR_*` / `UCX_SESSION_FINISH` markers. Co-authored-by: Isaac

The acceptance wrapper filters stdout/stderr, so the diagnostic markers added in the prior commit never reach the GH Actions log. Write them (plus faulthandler signal traces) to a file under RUNNER_TEMP, and add a workflow step with `if: always()` that dumps the file with grouped output. This bypasses the wrapper's filtering and surfaces internal errors regardless of whether pytest exits cleanly. Co-authored-by: Isaac

The previous diagnostic-hook commit added five `# pylint: disable=...` comments which violate the repo's no-pylint-disable policy and broke that check. Refactor to remove every disable: - Hoist atexit, faulthandler, sys, tempfile, traceback to the top of the module so they no longer need wrong-import-{position,order} suppressions. - Replace `open(..., 'a')` (which triggered consider-using-with for a long-lived file that cannot be `with`-managed because faulthandler needs a persistent fileno) with `os.open` + `os.write`. atexit closes the fd. - Drop the unused `session` parameter from pytest_sessionfinish; pytest resolves hook args by name, so omitting it removes the unused-argument hit. - Factor traceback formatting into a helper so the two hooks share one format pass. Co-authored-by: Isaac

The acceptance wrapper invokes pytest multiple times in a session, and only the final invocation's `cov.combine()` raises `Can't combine arc data with line data` (visible in the diag dump). Most likely cause: at least one session's CWD doesn't surface the project's `pyproject.toml`, so coverage.py falls back to defaults (branch=false) and writes line data, while other sessions write arc data via the project config (branch=true). Set `COVERAGE_RCFILE: ${{ github.workspace }}/pyproject.toml` for the acceptance step so coverage.py reads the same config from every CWD. Also extend the internal-error diagnostic to record the active rcfile and the remaining `.coverage*` parts at crash time so we can confirm or refute the hypothesis if this isn't the full fix. Co-authored-by: Isaac

github-actions · 2026-04-30T14:46:55Z

✅ 75/75 passed, 11 skipped, 1h10m37s total

_{Running from acceptance #9003}

The diagnostic hooks in tests/integration/conftest.py and the `Dump pytest diagnostics` workflow step were added to surface what the acceptance wrapper's filtered output was hiding. They served their purpose — pointing us at the coverage.combine() error — but aren't needed in steady state. Drop them. The COVERAGE_RCFILE pin (the actual fix) stays. Co-authored-by: Isaac

Sigh.

## Changes The purpose of this PR is to adjust the way pip is configured to use JFrog during CI/CD: in particular the username/password are no longer embedded in the URL of the mirror but are instead stored separately in a `.netrc` file. This is necessary because the mirror URL isn't always handled carefully and can easily be exposed. Further to the above: - The setup for `uv` has been adjusted to use the same `.netrc` mechanism, which is slightly cleaner. - The rest of the `jfrog-auth` action has been synced with the latest version we have. Because the rest of the action has been synced, the commit to review is really a07085e. ### Linked issues Relates databrickslabs#4803. ### Tests - CI/CD tests

FastLee requested a review from a team as a code owner April 28, 2026 20:43

FastLee had a problem deploying to account-admin April 28, 2026 20:43 — with GitHub Actions Failure

FastLee had a problem deploying to account-admin April 28, 2026 21:03 — with GitHub Actions Failure

FastLee had a problem deploying to account-admin April 28, 2026 21:15 — with GitHub Actions Failure

asnare assigned FastLee Apr 29, 2026

asnare added internal this pull request won't appear in release notes bug labels Apr 29, 2026

asnare added this to UCX Apr 29, 2026

asnare moved this to In Progress in UCX Apr 29, 2026

asnare requested changes Apr 29, 2026

View reviewed changes

Comment thread tests/integration/source_code/test_libraries.py Outdated

FastLee had a problem deploying to account-admin April 29, 2026 09:14 — with GitHub Actions Failure

asnare mentioned this pull request Apr 29, 2026

Refactor pip/JFrog integration during CI/CD #4804

Merged

FastLee had a problem deploying to account-admin April 29, 2026 17:55 — with GitHub Actions Failure

FastLee added 4 commits April 29, 2026 14:33

FastLee force-pushed the fix/supply_chain_fix_tests branch from 16b2b10 to f097a51 Compare April 29, 2026 18:33

FastLee had a problem deploying to account-admin April 29, 2026 18:33 — with GitHub Actions Failure

FastLee had a problem deploying to account-admin April 29, 2026 19:00 — with GitHub Actions Failure

FastLee had a problem deploying to account-admin April 29, 2026 19:19 — with GitHub Actions Failure

FastLee had a problem deploying to account-admin April 29, 2026 19:32 — with GitHub Actions Failure

FastLee had a problem deploying to account-admin April 29, 2026 21:20 — with GitHub Actions Failure

FastLee had a problem deploying to account-admin April 29, 2026 21:37 — with GitHub Actions Failure

FastLee had a problem deploying to account-admin April 30, 2026 13:34 — with GitHub Actions Failure

FastLee temporarily deployed to account-admin April 30, 2026 14:37 — with GitHub Actions Inactive

FastLee enabled auto-merge April 30, 2026 15:04

FastLee requested a review from asnare April 30, 2026 15:05

FastLee had a problem deploying to account-admin May 1, 2026 14:16 — with GitHub Actions Error

asnare added 2 commits May 1, 2026 16:26

Check the configuration setting for index-url directly.

c79d2ef

Freshen locked project dependencies

fb92104

asnare had a problem deploying to account-admin May 1, 2026 14:29 — with GitHub Actions Failure

asnare approved these changes May 1, 2026

View reviewed changes

Revert to config list approach.

2144772

Sigh.

asnare deployed to account-admin May 1, 2026 14:58 — with GitHub Actions Active

FastLee added this pull request to the merge queue May 1, 2026

Merged via the queue into main with commit 3414a2f May 1, 2026
10 checks passed

FastLee deleted the fix/supply_chain_fix_tests branch May 1, 2026 15:14

github-project-automation Bot moved this from In Progress to Done in UCX May 1, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix three flaky integration tests#4803

Fix three flaky integration tests#4803
FastLee merged 13 commits intomainfrom
fix/supply_chain_fix_tests

FastLee commented Apr 28, 2026 •

edited

Loading

Uh oh!

codecov Bot commented Apr 28, 2026 •

edited

Loading

Uh oh!

asnare left a comment

Uh oh!

Uh oh!

github-actions Bot commented Apr 30, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

FastLee commented Apr 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test fixes

CI: pin COVERAGE_RCFILE

Test plan

Uh oh!

codecov Bot commented Apr 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

asnare left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

github-actions Bot commented Apr 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

FastLee commented Apr 28, 2026 •

edited

Loading

CI: pin `COVERAGE_RCFILE`

codecov Bot commented Apr 28, 2026 •

edited

Loading

github-actions Bot commented Apr 30, 2026 •

edited

Loading