vmupdate: wait for full VM shutdown before next queued update#208
vmupdate: wait for full VM shutdown before next queued update#208Jayant-kernel wants to merge 3 commits intoQubesOS:mainfrom
Conversation
|
Hi @piotrbartman @marmarek , quick update from my side: I pushed the fix for #10617 and updated the PR. |
|
It only needs to wait shutdown if it is a qube with PCI devices assigned to it. Making it wait for every qube to shutdown, slows the process. Also, no need to tag maintainers, they get notified if they are tracking the repository. |
|
Yes, I updated the PR to wait only for VMs with assigned PCI devices, not for every VM shutdown. Please take another look when you can. |
5901acd to
868b1f8
Compare
|
I uploaded my signing key to the keyserver and force-pushed the signed commit. |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #208 +/- ##
==========================================
+ Coverage 69.33% 71.80% +2.47%
==========================================
Files 10 12 +2
Lines 1275 1330 +55
==========================================
+ Hits 884 955 +71
+ Misses 391 375 -16 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
Please remove this commit: c7eb4b2 |
| # This program is free software; you can redistribute it and/or | ||
| # modify it under the terms of the GNU General Public License | ||
| # as published by the Free Software Foundation; either version 2 | ||
| # of the License, or (at your option) any later version. |
There was a problem hiding this comment.
This verbatim license seems to be too short compared to other files. I have not seen such such option being used.
https://github.com/QubesOS/qubes-core-admin-linux/blob/main/vmupdate/tests/test_vmupdate.py
|
|
||
| @patch("vmupdate.qube_connection.wait_for_domain_shutdown") | ||
| @patch("vmupdate.qube_connection.asyncio.run") | ||
| def test_wait_for_shutdown_when_vm_started_by_update(_arun, wait_shutdown): |
There was a problem hiding this comment.
I don't think the _ should be used, it normally is for options that you won't use it later, but would like to know on definition what they are. You use this option later:
_arun.assert_called_once()
|
|
||
| @patch("vmupdate.qube_connection.wait_for_domain_shutdown") | ||
| @patch("vmupdate.qube_connection.asyncio.run") | ||
| def test_do_not_wait_for_shutdown_without_assigned_pci(_arun, wait_shutdown): |
There was a problem hiding this comment.
|
|
||
| @patch("vmupdate.qube_connection.wait_for_domain_shutdown") | ||
| @patch("vmupdate.qube_connection.asyncio.run") | ||
| def test_do_not_shutdown_if_vm_was_already_running(_arun, wait_shutdown): |
There was a problem hiding this comment.
| @@ -94,9 +96,19 @@ def __exit__(self, exc_type, exc_val, exc_tb): | |||
| if self.qube.is_running() and not self._initially_running: | |||
| self.logger.info('Shutdown %s', self.qube.name) | |||
There was a problem hiding this comment.
It would be nice if the log message could be improved, add some text related to waiting shutdown, to differentiate from the default case that doesn't wait.
| self.logger.info('Shutdown %s', self.qube.name) | ||
| self.qube.shutdown() | ||
| if self._has_assigned_pci_devices(self.qube): | ||
| asyncio.run(wait_for_domain_shutdown([self.qube])) |
There was a problem hiding this comment.
Use vmupdate.shutdown_domains() instead.
| """Return True when VM has assigned PCI devices.""" | ||
| try: | ||
| return any(vm.devices['pci'].get_assigned_devices()) | ||
| except Exception: # best effort check only |
There was a problem hiding this comment.
This exception is too broad. Use QubesDaemonAccessErrror. Other exception should be raised.
|
Hey @ben-grande Fixed the license header to match other test files Is this okay? |
| self.qube.shutdown() | ||
| if self._has_assigned_pci_devices(self.qube): | ||
| self.logger.info( | ||
| 'Shutdown %s and wait for full shutdown ' |
There was a problem hiding this comment.
Instead of saying "shutdown" twice, write: Waiting for full shutdown %s (PCI devices assigned)
There was a problem hiding this comment.
changed to 'Waiting for full shutdown %s (PCI devices assigned)
| self.logger.info( | ||
| 'Shutdown %s and wait for full shutdown ' | ||
| '(PCI devices assigned)', self.qube.name) | ||
| from vmupdate.vmupdate import shutdown_domains |
There was a problem hiding this comment.
Please don't import it here, import at the top of the module.
There was a problem hiding this comment.
moved shutdown_domains import to the top of the module.
54b3271 to
49ab0e1
Compare
|
@ben-grande |
|
CI is failing: Please wait for CI to complete before asking for review. |
7626b55 to
4670c56
Compare
8fbf31a to
1961d75
Compare
Move shutdown_domains, get_feature, get_boolean_feature, and is_stale out of vmupdate.py into a new vmupdate/utils.py module. These functions have no dependency on the CLI entry point and are useful from other modules (e.g. qube_connection.py) without causing circular imports. Update vmupdate.py and qube_connection.py to import from utils.
VMs with assigned PCI devices must be fully shut down before the next update starts, because the hypervisor holds resources until shutdown completes. Previously, qube_connection.py called vm.shutdown() and moved on immediately, leaving PCI resources held. Use shutdown_domains() from vmupdate.utils to issue a force shutdown and wait for the domain to finish. Regular VMs (no PCI devices) are unaffected and still use the fast non-blocking path. Add unit tests covering: - shutdown_domains is called and vm.shutdown is not for PCI VMs - vm.shutdown is called and shutdown_domains is not for non-PCI VMs - neither is called when the VM was already running before the update Fixes: QubesOS/qubes-issues#10617
1961d75 to
4e03608
Compare
|
@ben-grande @marmarek |
OpenQA test summaryComplete test suite and dependencies: https://openqa.qubes-os.org/tests/overview?distri=qubesos&version=4.3&build=2026042316-4.3&flavor=pull-requests Test run included the following:
New failures, excluding unstableCompared to: https://openqa.qubes-os.org/tests/overview?distri=qubesos&version=4.3&build=2026032404-devel&flavor=update
Failed tests43 failures
Fixed failuresCompared to: https://openqa.qubes-os.org/tests/170766#dependencies 32 fixed
Unstable testsDetails
Performance TestsPerformance degradation:11 performance degradations
Remaining performance tests:100 tests
|
|
Did not find errors on OpenQA related to this PR. |
d189954 to
060186e
Compare
Summary
Fix update sequencing for VMs started by
qubes-vm-update: wait for full shutdown inQubeConnection.__exit__()before returning.This prevents the next queued VM from being started while the previous one is still in the shutdown phase and still holding shared PCI devices.
Closes QubesOS/qubes-issues#10617.
What changed
vmupdate/qube_connection.pyself.qube.shutdown()for VMs started by updater, callwait_for_domain_shutdown([self.qube]).Why
With
--max-concurrency=1, the updater previously issued shutdown and immediately returned, allowing the next VM to start too early. For HVMs sharing PCI devices, this causes startup conflicts.Tests
vmupdate/tests/test_qube_connection.pytest_wait_for_shutdown_when_vm_started_by_updatetest_do_not_shutdown_if_vm_was_already_running