fix(test): stabilize bgprocess polling in backup/restore/IE/maintenance tests#9958
Conversation
…ce tests The shared polling helpers in: - web/pgadmin/tools/backup/tests/test_backup_utils.py - web/pgadmin/tools/import_export/tests/test_import_export_utils.py - web/pgadmin/tools/maintenance/tests/test_create_maintenance_job.py - web/pgadmin/tools/restore/tests/test_create_restore_job.py all share the same race that surfaced on macos-latest / pg16 in PR #9955's CI run: - Wait budget was 2.5s (5 iterations x 0.5s; maintenance used 5s). - The break condition was `execution_time' in the_process`, but `execution_time` is the elapsed time of a *running* bgprocess -- it is set before the wrapped pg_dump / pg_restore / psql / COPY actually finishes. The completion signal is `exit_code` becoming non-None. - So the helper could return control while the wrapped command was still running, and the next assertion -- e.g. `assert_equal(the_process['exit_code'] in [0, 1], True)` -- would fire on `None in [0, 1]`, i.e. `False != True`. Some scenarios masked the bug by listing `None` in their `expected_exit_code` set (a tell that someone noticed the polling was unreliable and worked around it by widening accepted exit codes). Scenarios that didn't include `None` were the ones that flaked. Fix all four helpers identically: - Poll for up to 60 iterations x 0.5s = 30s, generous enough for the slowest CI runner. - Break only when `the_process.get('exit_code') is not None`, the actual completion signal. - Narrow `except Exception` to `except StopIteration`, which is the only thing `next(...)` here can raise. No call-site changes needed; the helper contract (returns once the job is done; raises if the bgprocess never finished) is unchanged in spirit and strictly more reliable in practice. Verified: - pycodestyle on the four files: 0 violations. This fixes the failure observed in the macos-latest / pg16 leg of PR #9955's CI run (run 26154521710, job 76930277702), which was unrelated to that PR's lockfile-only changes.
WalkthroughThis PR standardizes polling logic across four background job test helpers (backup, import/export, maintenance, and restore). Each test's job completion detection is refactored from relying on ChangesBackground Job Polling Completion Detection
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
web/pgadmin/tools/import_export/tests/test_import_export_utils.py (1)
46-60:⚠️ Potential issue | 🟠 Major | ⚡ Quick winPolling timeout can still return before import/export completion.
The loop break condition is correct, but there is no mandatory post-loop check that completion actually happened. Add explicit asserts that
the_processexists andexit_codeis non-Nonebefore continuing.Proposed fix
for _ in range(60): ... if the_process and the_process.get('exit_code') is not None: break time.sleep(0.5) + assert_equal(the_process is not None, True) + assert_equal(the_process.get('exit_code') is not None, True) + assert_equal('execution_time' in the_process, True) assert_equal('stime' in the_process, True) assert_equal('exit_code' in the_process, True)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@web/pgadmin/tools/import_export/tests/test_import_export_utils.py` around lines 46 - 60, The test's polling loop may exit due to timeout without actually finding the job, so after the for-loop that queries via tester.get and sets the_process (variable job_id, the_process, and its 'exit_code'), add explicit assertions to guarantee completion: assert that the_process is not None and that the_process.get('exit_code') is not None (or assertTrue/ assert_is_not_none depending on the test helpers) so the test fails if the import/export never completed before continuing.web/pgadmin/tools/restore/tests/test_create_restore_job.py (1)
173-187:⚠️ Potential issue | 🟠 Major | ⚡ Quick winRestore polling still allows timeout without proving completion.
Please assert completion right after the loop. Otherwise, timed-out runs can continue with
exit_code is None, which undermines the new completion contract.Proposed fix
for _ in range(60): ... if the_process and the_process.get('exit_code') is not None: break time.sleep(0.5) + self.assertIsNotNone(the_process) + self.assertIsNotNone(the_process.get('exit_code')) + self.assertTrue('execution_time' in the_process) self.assertTrue('stime' in the_process) self.assertTrue('exit_code' in the_process)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@web/pgadmin/tools/restore/tests/test_create_restore_job.py` around lines 173 - 187, After the polling loop that checks the background process list (the loop that queries '/misc/bgprocess/' and searches for the_process by job_id), add assertions immediately after the loop to guarantee the job completed: assert that the_process is not None and that the_process.get('exit_code') is not None (e.g., using self.assertIsNotNone on both). This ensures timed-out runs fail the test instead of continuing with exit_code == None.web/pgadmin/tools/maintenance/tests/test_create_maintenance_job.py (1)
81-95:⚠️ Potential issue | 🟠 Major | ⚡ Quick winMaintenance helper still needs an explicit “completed” assertion after polling.
for _ in range(60)can finish without hitting thebreak, and the test then proceeds without guaranteeingexit_codeis set. Add explicit post-loop checks.Proposed fix
for _ in range(60): ... if the_process and the_process.get('exit_code') is not None: break time.sleep(0.5) + self.assertIsNotNone(the_process) + self.assertIsNotNone(the_process.get('exit_code')) + self.assertTrue('execution_time' in the_process) self.assertTrue('stime' in the_process) self.assertTrue('exit_code' in the_process)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@web/pgadmin/tools/maintenance/tests/test_create_maintenance_job.py` around lines 81 - 95, The polling loop may exit without finding a completed job, so after the for loop add explicit assertions to ensure the maintenance job was found and finished: assert that the_process is not None, that the_process.get('exit_code') is not None, and optionally that the exit_code indicates success (e.g., equals 0). Update the test in test_create_maintenance_job (the block that uses job_id and the_process) to perform these post-loop checks so the test fails if the job never completed.web/pgadmin/tools/backup/tests/test_backup_utils.py (1)
33-47:⚠️ Potential issue | 🟠 Major | ⚡ Quick winTimeout path still permits an unfinished backup job.
If the 60-iteration loop exhausts while
the_processexists butexit_codeis stillNone, the helper continues and may still pass becauseexpected_exit_codecan includeNone. Add an explicit completion assertion immediately after the loop.Proposed fix
for _ in range(60): ... if the_process and the_process.get('exit_code') is not None: break time.sleep(0.5) + assert_equal(the_process is not None, True) + assert_equal(the_process.get('exit_code') is not None, True) + assert_equal('execution_time' in the_process, True) assert_equal('stime' in the_process, True) assert_equal('exit_code' in the_process, True)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@web/pgadmin/tools/backup/tests/test_backup_utils.py` around lines 33 - 47, The loop that polls bgprocess may exit by timeout while the job is still running; after the for-loop that uses tester.get('/misc/bgprocess/?_={0}'), ensure you explicitly assert that the job completed by checking that the_process (found by job_id) is not None and that the_process.get('exit_code') is not None, then assert that the_process['exit_code'] is among expected_exit_code; reference the variables the_process, job_id, exit_code, expected_exit_code and the polling call to enforce completion before continuing.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@web/pgadmin/tools/backup/tests/test_backup_utils.py`:
- Around line 33-47: The loop that polls bgprocess may exit by timeout while the
job is still running; after the for-loop that uses
tester.get('/misc/bgprocess/?_={0}'), ensure you explicitly assert that the job
completed by checking that the_process (found by job_id) is not None and that
the_process.get('exit_code') is not None, then assert that
the_process['exit_code'] is among expected_exit_code; reference the variables
the_process, job_id, exit_code, expected_exit_code and the polling call to
enforce completion before continuing.
In `@web/pgadmin/tools/import_export/tests/test_import_export_utils.py`:
- Around line 46-60: The test's polling loop may exit due to timeout without
actually finding the job, so after the for-loop that queries via tester.get and
sets the_process (variable job_id, the_process, and its 'exit_code'), add
explicit assertions to guarantee completion: assert that the_process is not None
and that the_process.get('exit_code') is not None (or assertTrue/
assert_is_not_none depending on the test helpers) so the test fails if the
import/export never completed before continuing.
In `@web/pgadmin/tools/maintenance/tests/test_create_maintenance_job.py`:
- Around line 81-95: The polling loop may exit without finding a completed job,
so after the for loop add explicit assertions to ensure the maintenance job was
found and finished: assert that the_process is not None, that
the_process.get('exit_code') is not None, and optionally that the exit_code
indicates success (e.g., equals 0). Update the test in
test_create_maintenance_job (the block that uses job_id and the_process) to
perform these post-loop checks so the test fails if the job never completed.
In `@web/pgadmin/tools/restore/tests/test_create_restore_job.py`:
- Around line 173-187: After the polling loop that checks the background process
list (the loop that queries '/misc/bgprocess/' and searches for the_process by
job_id), add assertions immediately after the loop to guarantee the job
completed: assert that the_process is not None and that
the_process.get('exit_code') is not None (e.g., using self.assertIsNotNone on
both). This ensures timed-out runs fail the test instead of continuing with
exit_code == None.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 0cf58215-6041-4990-b5ec-18fc36912ec9
📒 Files selected for processing (4)
web/pgadmin/tools/backup/tests/test_backup_utils.pyweb/pgadmin/tools/import_export/tests/test_import_export_utils.pyweb/pgadmin/tools/maintenance/tests/test_create_maintenance_job.pyweb/pgadmin/tools/restore/tests/test_create_restore_job.py
Summary
The shared bgprocess polling helpers in four test modules share the same race that just surfaced on the macos-latest / pg16 leg of #9955's CI run (run 26148720940, job 76930277702):
'execution_time' in the_process— butexecution_timeis the elapsed time of a running bgprocess. It's set before the wrappedpg_dump/pg_restore/psql/COPYactually finishes. The completion signal isexit_codebecoming non-None.assert_equal(the_process['exit_code'] in [0, 1], True)— fired onNone in [0, 1], i.e.False != True.Some scenarios masked the bug by listing
Nonein theirexpected_exit_codeset (a tell that someone noticed the polling was unreliable and widened the accepted set instead of fixing the polling). Scenarios that didn't includeNonewere the ones that flaked.Files changed
All four helpers fixed identically:
web/pgadmin/tools/backup/tests/test_backup_utils.pyweb/pgadmin/tools/import_export/tests/test_import_export_utils.pyweb/pgadmin/tools/maintenance/tests/test_create_maintenance_job.pyweb/pgadmin/tools/restore/tests/test_create_restore_job.pyFix
the_process.get('exit_code') is not None— the actual completion signal.except Exception:toexcept StopIteration:— the only thingnext(...)here can raise.No call-site changes needed. The helper contract (returns once the job is done; raises if the bgprocess never finished) is unchanged in spirit and strictly more reliable in practice.
Test plan
pycodestyleon all four files → 0 violationsRelated
Summary by CodeRabbit