chore(vm): add system migration policy #2152
Closed
LopatinDmitr wants to merge 90 commits into
docs: add release notes for v1.6.0 --------- Signed-off-by: Isteb4k <dmitry.rakitin@flant.com> Signed-off-by: Vladislav Panfilov <vladislav.panfilov@flant.com> Co-authored-by: Vladislav Panfilov <vladislav.panfilov@flant.com>
Update due 1.6.0. --------- Signed-off-by: Vladislav Panfilov <vladislav.panfilov@flant.com> Signed-off-by: Pavel Tishkov <pavel.tishkov@flant.com> Co-authored-by: Pavel Tishkov <pavel.tishkov@flant.com>
Improved test/dvp-static-cluster/scripts/gen-kubeconfig.sh kubeconfig generation flow and error handling. Refactored retry logic to avoid redundant checks of the same kubeconfig and made retries explicit at generation level. Added robust failure handling and clearer exit behavior: strict bash mode (set -Eeuo pipefail), a centralized error-exit helper (exit_with_error), and signal/error traps with meaningful exit codes and diagnostics. --------- Signed-off-by: Nikita Korolev <nikita.korolev@flant.com>
Re-generate changelog v1.6.0 Signed-off-by: deckhouse-BOaTswain <89150800+deckhouse-boatswain@users.noreply.github.com> Co-authored-by: Isteb4k <Isteb4k@users.noreply.github.com>
Description Reduced the SSH command timeout to 5 seconds. UntilSSHReady now waits 60 seconds in the PowerState test. Why do we need it, and what problem does it solve? Previously there was effectively only one attempt to connect to the server: the SSH timeout was 30 seconds and the Eventually timeout was also 30 seconds, so the first attempt consumed the whole retry window. --------- Signed-off-by: Nikita Korolev <nikita.korolev@flant.com>
Description Add a describe of the nodes when a test fails. Also increase the timeout for UntilVMAgentReady in VirtualMachineConfiguration. The error output of UntilVMAgentReady is now clearer. --------- Signed-off-by: Nikita Korolev <nikita.korolev@flant.com> Co-authored-by: Roman Sysoev <36233932+hardcoretime@users.noreply.github.com>
Signed-off-by: Nikita Korolev <nikita.korolev@flant.com>
…2053) Description Change image from Alpine to Ubuntu due to problems with lsblk utility output ---------------- Signed-off-by: Dmitry Lopatin <dmitry.lopatin@flant.com>
Signed-off-by: Nikita Korolev <nikita.korolev@flant.com>
#2046) Signed-off-by: Valeriy Khorunzhin <valeriy.khorunzhin@flant.com>
- Ensure upload layer errors are not missed. - Wrap "DVCR is out of space" error. Signed-off-by: Roman Sysoev <roman.sysoev@flant.com>
- Use kube-api-rewriter machinery from the external repo - Only KubeVirt and CDI rules are needed here. Signed-off-by: Ivan Mikheykin <ivan.mikheykin@flant.com>
* chore(module): add SecurityPolicyException resources - Add exceptions for all Pods that require more permissions than provided by the PSS Restricted: - ds/virt-handler - ds/virtualization-dra - ds/vm-route-forge - Add a dev note about SecurityPolicyExceptions. Signed-off-by: Ivan Mikheykin <ivan.mikheykin@flant.com> --------- Signed-off-by: Ivan Mikheykin <ivan.mikheykin@flant.com>
Description Add an e2e test for USBDevices and NodeUSBDevice. The test attaches a USBDevice to a VM, writes data to it, migrates the virtual machine, and checks the written data. --------- Signed-off-by: Daniil Antoshin <daniil.antoshin@flant.com> Co-authored-by: Roman Sysoev <36233932+hardcoretime@users.noreply.github.com>
…34 version of k8s (#2059) Add k8s cluster version configuration for e2e tests on nested clusters; the Ceph cluster uses k8s version 1.34 --------- Signed-off-by: Nikita Korolev <nikita.korolev@flant.com>
Signed-off-by: Ivan Mikheykin <ivan.mikheykin@flant.com>
#2048) Signed-off-by: Valeriy Khorunzhin <valeriy.khorunzhin@flant.com>
Description This PR updates E2E test network and image configuration to make VM connectivity and migration scenarios more deterministic. Key changes: - Configure test ClusterNetwork to existing VLAN 4006 (cn-4006-for-e2e-test) instead of VLAN 1003 (cn-1003-for-e2e-test). - Normalize image usage in E2E tests (switch most cases from perf image to stable Alpine UEFI image where needed). - Fix object builders for Ubuntu resources (VI/CVI/VD) to use Ubuntu image URL instead of Alpine BIOS URL. - Add dedicated constructors for Alpine BIOS/UEFI images in object helpers. Update additional network interfaces test: - move additional IPs to per-test-case params, - pass explicit IPs into connectivity checks, - adjust Alpine cloud-init service startup commands. --------- Signed-off-by: Nikita Korolev <nikita.korolev@flant.com>
fix dvcr gc rbac rules Signed-off-by: Yaroslav Borbat <yaroslav.borbat@flant.com>
Signed-off-by: Valeriy Khorunzhin <valeriy.khorunzhin@flant.com>
* docs: add metadata.name length limit Signed-off-by: Vladislav Panfilov <vladislav.panfilov@flant.com> * docs: update metadata.name length limits Signed-off-by: Vladislav Panfilov <vladislav.panfilov@flant.com> --------- Signed-off-by: Vladislav Panfilov <vladislav.panfilov@flant.com>
Signed-off-by: Maksim Fedotov <maksim.fedotov@flant.com>
Description The virtualization-setup-dummy-hcd node group configuration used in end-to-end clusters now only supports Debian OS. A bug in the node group configuration has been fixed; the missing linux-modules-extra-$KERNEL_VERSION package has been added. Removed exit 1 from script in node group configuration to continue cluster bootstrap. --------- Signed-off-by: Nikita Korolev <nikita.korolev@flant.com>
Add release notes v1.5.2. --------- Signed-off-by: Isteb4k <dmitry.rakitin@flant.com> Signed-off-by: Vladislav Panfilov <97229646+prismagod@users.noreply.github.com> Co-authored-by: Vladislav Panfilov <97229646+prismagod@users.noreply.github.com>
Re-generate changelog v1.5.2 Signed-off-by: deckhouse-BOaTswain <89150800+deckhouse-boatswain@users.noreply.github.com> Co-authored-by: nevermarine <nevermarine@users.noreply.github.com>
Fix a lowercase RFC 1123 error for vd ImporterNetworkPolicy --------- Signed-off-by: Isteb4k <dmitry.rakitin@flant.com>
Signed-off-by: Dmitry Lopatin <dmitry.lopatin@flant.com>
…1950) Add webhook validation that rejects migration operations for VMs with local storage in CE edition --------- Signed-off-by: Daniil Loktev <lokt.daniil@gmail.com>
Fix VM status (Running condition) to properly display CSI driver and volume attachment errors
Fix misleading errors related to SDN ("waiting for SDN module") (NetworkReady condition)
When a VM fails to start due to CSI driver issues (e.g., missing CSI driver, volume attachment failures), the Console UI shows incorrect error messages: "Cannot determine the status of additional interfaces, waiting for a response from the SDN module", which come from NetworkReady condition. The actual CSI error was hidden in pod events and not surfaced in the VM status.
---------
Signed-off-by: Daniil Loktev <lokt.daniil@gmail.com>
Signed-off-by: Daniil Loktev <70405899+loktev-d@users.noreply.github.com>
…dation issue (#2063) Temporarily revert VM/VMS dashboard location due to validation issue Signed-off-by: Pavel Tishkov <pavel.tishkov@flant.com>
chore(module): update module requirements Signed-off-by: Yaroslav Borbat <yaroslav.borbat@flant.com>
Description Fixes semver parsing for module version requirements with two-component versions. Added NormalizeSemVerRange() function in tools/moduleversions/internal/version/normalize.go that automatically converts two-component versions to three-component format: ~1.74 → ~1.74.0; ^1.74 → ^1.74.0; >=1.74 <2.0 → >=1.74.0 <2.0.0. The normalization is applied before passing the range string to semver.ParseRange() in the requirements checker. Also fix requirements for deckhouse back to ">= 1.74.2" --------- Signed-off-by: Nikita Korolev <nikita.korolev@flant.com>
…iver. (#2087) fix(dra): enable NRI (Node Resource Interface) hook in the DRA USB driver Signed-off-by: Yaroslav Borbat <yaroslav.borbat@flant.com>
Add release notes for v1.6.2 --------- Signed-off-by: Isteb4k <dmitry.rakitin@flant.com> Signed-off-by: Vladislav Panfilov <97229646+prismagod@users.noreply.github.com> Co-authored-by: Vladislav Panfilov <97229646+prismagod@users.noreply.github.com>
…oval (#2124) Signed-off-by: Dmitry Lopatin <dmitry.lopatin@flant.com>
Description Removal of e2e tests with Ceph storage from the nightly pipeline. Removed: the e2e-ceph job from e2e-matrix.yml; the e2e-reusable-pipeline.yml file (was used for Ceph); all Ceph manifests and scripts in test/dvp-static-cluster/storage/ceph/. --------------- Signed-off-by: Nikita Korolev <nikita.korolev@flant.com>
test(e2e): skip vd snapshot wait when CSI snapshots lag behind Signed-off-by: Dmitry Lopatin <dmitry.lopatin@flant.com>
…pvc (#2115) fix(api): improve storage class validation messages for VD and VI on PVC - Enhance error messages during storage class validation for VD and VI on PVC - Add unit tests for VirtualDisk storage class validation, separating CE and EE test sets. Signed-off-by: Dmitry Lopatin <dmitry.lopatin@flant.com>
Signed-off-by: Dmitry Lopatin <dmitry.lopatin@flant.com>
…2116) - Prevent changing VirtualDisk storage class to an arbitrary value while migration is in progress by allowing only rollback to the source PVC storage class. Add validator tests for forbidden A->B->C changes and allowed rollback to A. Signed-off-by: Dmitry Lopatin <dmitry.lopatin@flant.com>
… in Go files. (#2140) Description stylecheck is deprecated in golangci-lint v2. The correct linter name for nolint directives is staticcheck. Using deprecated linter names causes warnings or errors. Change //nolint:stylecheck,nolintlint → //nolint:staticcheck,nolintlint ----------------- Signed-off-by: Nikita Korolev <nikita.korolev@flant.com>
Signed-off-by: Yaroslav Borbat <yaroslav.borbat@flant.com>
…#2144) * fix(vmop): prevent Maintenance mode from getting stuck during restore Return reconcile.Result instead of nil to properly complete the reconciliation loop when snapshot steps exit early (exit maintenance step, waiting disk ready step). Signed-off-by: Daniil Antoshin <daniil.antoshin@flant.com> * fix(vmop): set maintenance condition to false instead of early return Signed-off-by: Daniil Antoshin <daniil.antoshin@flant.com> --------- Signed-off-by: Daniil Antoshin <daniil.antoshin@flant.com>
d899db1 to 641625d
Signed-off-by: Dmitry Lopatin <dmitry.lopatin@flant.com>
641625d to b2e9698
Description
Add a system-level live migration policy override sourced from the ModuleConfig/virtualization annotation virtualization.deckhouse.io/system-migration-policy. The controller now reads this annotation at startup and, when valid, applies it globally in live migration policy calculation.
What is the expected result?
ModuleConfig/virtualization: virtualization.deckhouse.io/system-migration-policy: <valid policy>.
Checklist
Changelog entries