engine-orchestration,vmware: hypervisor migration during start vm migration #7444

Merged
DaanHoogland merged 5 commits into apache:4.18 from
shapeblue:fix-startvm-interclustermig-vmotion
Jun 20, 2023

@shwstppr
Contributor

Description

This PR enables the hypervisor strategy for VM migration when a VM is to be started in a different cluster for VMware.

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

  • Major
  • Minor

Bug Severity

  • BLOCKER
  • Critical
  • Major
  • Minor
  • Trivial

Screenshots (if appropriate):

How Has This Been Tested?

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
@codecov

codecov bot commented Apr 18, 2023

Codecov Report

Merging #7444 (d02c769) into 4.18 (dabefca) will decrease coverage by 0.01%.
The diff coverage is 0.00%.

❗ Current head d02c769 differs from pull request most recent head d7cfefb. Consider uploading reports for the commit d7cfefb to get more accurate results

@@             Coverage Diff              @@
##               4.18    #7444      +/-   ##
============================================
- Coverage     12.70%   12.70%   -0.01%     
  Complexity     8673     8673              
============================================
  Files          2717     2717              
  Lines        256186   256207      +21     
  Branches      39929    39932       +3     
============================================
  Hits          32541    32541              
- Misses       219507   219528      +21     
  Partials       4138     4138              
Impacted Files                                         Coverage Δ
.../main/java/com/cloud/vm/VirtualMachineManager.java  92.30% <ø> (ø)
...n/java/com/cloud/vm/VirtualMachineManagerImpl.java  6.19% <0.00%> (-0.04%) ⬇️
...ain/java/com/cloud/hypervisor/guru/VMwareGuru.java  1.06% <0.00%> (ø)
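
The figures in the coverage diff are consistent with 21 added lines that no test executes: hits stay at 32541 while lines and misses both grow by 21. A minimal sketch of that arithmetic, using only the numbers from the report (the helper name `coverage_pct` is illustrative, and treating coverage as hits divided by total lines is an assumption that happens to reproduce the reported 12.70%):

```python
# Numbers copied from the Codecov coverage diff in this thread.
HITS = 32541
LINES_BEFORE = 256186   # 4.18 base
LINES_AFTER = 256207    # with this PR applied


def coverage_pct(hits: int, lines: int) -> float:
    """Coverage as a percentage of total lines (assumed formula)."""
    return 100.0 * hits / lines


before = coverage_pct(HITS, LINES_BEFORE)
after = coverage_pct(HITS, LINES_AFTER)
added_lines = LINES_AFTER - LINES_BEFORE

print(f"added lines: {added_lines}")
print(f"before: {before:.4f}%  after: {after:.4f}%  drop: {before - after:.4f} points")
```

Both values round to 12.70%, which is why the two coverage columns look identical even though the report flags a decrease.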

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
@sonarqubecloud

SonarCloud Quality Gate failed.

0 Bugs (rating A)
0 Vulnerabilities (rating A)
0 Security Hotspots (rating A)
1 Code Smell (rating A)

32.1% Coverage
0.0% Duplication

@shwstppr shwstppr marked this pull request as ready for review May 8, 2023 08:33
@shwstppr
Contributor Author

shwstppr commented May 8, 2023

@blueorangutan package

@blueorangutan

@shwstppr a Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan

Packaging result: ✔️ el7 ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 6041

@weizhouapache weizhouapache added this to the 4.18.1.0 milestone May 8, 2023
@shwstppr
Contributor Author

shwstppr commented May 8, 2023

@blueorangutan test matrix

@blueorangutan

@shwstppr a Trillian-Jenkins matrix job (centos7 mgmt + xenserver71, rocky8 mgmt + vmware67u3, centos7 mgmt + kvmcentos7) has been kicked to run smoke tests

Contributor

@DaanHoogland DaanHoogland left a comment

clgtm. Can you advise on a test approach, @shwstppr? I mean in addition to your added unit tests.

@shwstppr
Contributor Author

shwstppr commented May 8, 2023

@DaanHoogland A multi-cluster environment would be needed. Then the following could be tested:

  • Deploy a test VM in cluster-1 and stop it.
  • Fill cluster-1 to its capacity. Make sure the capacity reservation for the test VM has expired.
  • When no more new VMs can be deployed in cluster-1, try to start the test VM. This should trigger inter-cluster migration for the VM.
  • Observe on the hypervisor side and in the logs that the migration is carried out by the hypervisor (vMotion in the case of VMware) and not via the SSVM and export-snapshot technique.
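
The outcome expected from the last step can be modelled as a small decision function. This is only an illustrative sketch, not CloudStack code: the `Vm` class, `choose_migration_strategy`, and the strategy labels are hypothetical names, and the branch for non-VMware hypervisors is an assumption, since the PR description only covers VMware.

```python
from dataclasses import dataclass

# Hypothetical labels for the outcomes discussed in the test steps.
NO_MIGRATION = "none"              # VM starts in the cluster it already lives in
HYPERVISOR_MIGRATION = "hypervisor"  # e.g. vMotion on VMware
UNCHANGED_PATH = "unchanged"       # whatever path was used before this PR


@dataclass
class Vm:
    hypervisor: str    # e.g. "VMware", "KVM"
    last_cluster: str  # cluster the stopped VM was last deployed in


def choose_migration_strategy(vm: Vm, destination_cluster: str) -> str:
    """Pick the migration path for starting a stopped VM (illustrative only)."""
    if vm.last_cluster == destination_cluster:
        return NO_MIGRATION
    if vm.hypervisor.lower() == "vmware":
        # Behaviour this PR is meant to enable: let the hypervisor migrate.
        return HYPERVISOR_MIGRATION
    # Assumption: other hypervisors are not affected by this PR.
    return UNCHANGED_PATH


test_vm = Vm(hypervisor="VMware", last_cluster="cluster-1")
print(choose_migration_strategy(test_vm, "cluster-2"))
```

In the manual test, cluster-1 is full, so the planner has to pick another cluster; the check in the logs is that the VMware case takes the hypervisor path rather than the SSVM export path.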

Member

@yadvr yadvr left a comment

LGTM

@blueorangutan

Trillian test result (tid-6512)
Environment: xenserver-71 (x2), Advanced Networking with Mgmt server 7
Total time taken: 40175 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr7444-t6512-xenserver-71.zip
Smoke tests completed. 108 look OK, 0 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test Result Time (s) Test File

@blueorangutan

Trillian test result (tid-6514)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 42310 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr7444-t6514-kvm-centos7.zip
Smoke tests completed. 108 look OK, 0 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test Result Time (s) Test File

@blueorangutan

Trillian test result (tid-6513)
Environment: vmware-67u3 (x2), Advanced Networking with Mgmt server r8
Total time taken: 57520 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr7444-t6513-vmware-67u3.zip
Smoke tests completed. 106 look OK, 2 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test                                                Result   Time (s)  Test File
ContextSuite context=TestSnapshotRootDisk>:setup    Error    0.00      test_snapshots.py
test_01_deploy_vm_on_specific_host                  Error    21.81     test_vm_deployment_planner.py
test_02_deploy_vm_on_specific_cluster               Error    3606.59   test_vm_deployment_planner.py
test_03_deploy_vm_on_specific_pod                   Error    15.72     test_vm_deployment_planner.py
test_04_deploy_vm_on_host_override_pod_and_cluster  Error    9.57      test_vm_deployment_planner.py
test_05_deploy_vm_on_cluster_override_pod           Error    31.12     test_vm_deployment_planner.py

@shwstppr
Contributor Author

shwstppr commented May 9, 2023

Need to investigate the VMware failures. Though they seem unrelated as we are seeing them on other PRs as well, eg: #7404

@DaanHoogland
Contributor

> Need to investigate the VMware failures. Though they seem unrelated as we are seeing them on other PRs as well, eg: #7404

Indeed, but the test_snapshots failure is extra! I think the test_vm_deployment_planner failures should get their own ticket (if they persist).

@shwstppr
Contributor Author

Tested with the latest nightly, and it seems the errors with test_vm_deployment_planner.py were intermittent or environment-related:

[root@ref-trl-4867-v-M7-abhishek-kumar-marvin marvin]# nosetests --with-xunit --xunit-file=results.xml --with-marvin --marvin-config=./ref-trl-4867-v-M7-abhishek-kumar-advanced-cfg -s -a tags=advanced --hypervisor=Vmware tests/smoke/test_vm_deployment_planner.py 
/usr/local/lib/python3.6/site-packages/paramiko/transport.py:32: CryptographyDeprecationWarning: Python 3.6 is no longer supported by the Python core team. Therefore, support for it is deprecated in cryptography. The next release of cryptography will remove support for Python 3.6.
  from cryptography.hazmat.backends import default_backend

==== Marvin Init Started ====

=== Marvin Parse Config Successful ===

=== Marvin Setting TestData Successful===

==== Log Folder Path: /marvin/MarvinLogs/May_10_2023_11_53_53_U8PK9N All logs will be available here ====

=== Marvin Init Logging Successful===

==== Marvin Init Successful ====
=== TestName: test_01_deploy_vm_on_specific_host | Status : SUCCESS ===

=== TestName: test_02_deploy_vm_on_specific_cluster | Status : SUCCESS ===

=== TestName: test_03_deploy_vm_on_specific_pod | Status : SUCCESS ===

=== TestName: test_04_deploy_vm_on_host_override_pod_and_cluster | Status : SUCCESS ===

=== TestName: test_05_deploy_vm_on_cluster_override_pod | Status : SUCCESS ===

=== Final results are now copied to: /marvin//MarvinLogs/test_vm_deployment_planner_0VX1NC ===

test_snapshots.py failed while getting the template in the VMware run above:

CRITICAL: EXCEPTION: None: ['Traceback (most recent call last):\n', ' File "/usr/local/lib/python3.6/site-packages/nose/suite.py", line 210, in run\n self.setUp()\n', ' File "/usr/local/lib/python3.6/site-packages/nose/suite.py", line 293, in setUp\n self.setupContext(ancestor)\n', ' File "/usr/local/lib/python3.6/site-packages/nose/suite.py", line 316, in setupContext\n try_run(context, names)\n', ' File "/usr/local/lib/python3.6/site-packages/nose/util.py", line 471, in try_run\n return func()\n', ' File "/marvin/tests/smoke/test_snapshots.py", line 67, in setUpClass\n assert False, "get_test_template() failed to return template"\n', 'AssertionError: get_test_template() failed to return template\n']

Will run VMware tests again

@blueorangutan package

@shwstppr
Contributor Author

@blueorangutan package

@blueorangutan

@shwstppr a Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan

Packaging result: ✔️ el7 ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 6054

@shwstppr
Contributor Author

@blueorangutan test rocky8 vmware-67u3

@shwstppr
Contributor Author

@blueorangutan package

@blueorangutan

@shwstppr a [LL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan

Packaging result [LL]: ✔️ el7 ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 6036

@yadvr
Member

yadvr commented May 22, 2023

@blueorangutan test rocky8 vmware-67u3

@blueorangutan

@rohityadavcloud a [LL] Trillian-Jenkins test job (rocky8 mgmt + vmware-67u3) has been kicked to run smoke tests

@blueorangutan

Packaging result: ✔️ el7 ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 6096

@blueorangutan

Trillian test result (tid-6538)
Environment: vmware-70u3 (x2), Advanced Networking with Mgmt server r8
Total time taken: 58577 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr7444-t6538-vmware-70u3.zip
Smoke tests completed. 107 look OK, 1 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test                                                Result   Time (s)  Test File
test_01_deploy_vm_on_specific_host                  Error    3604.18   test_vm_deployment_planner.py
test_02_deploy_vm_on_specific_cluster               Error    1.27      test_vm_deployment_planner.py
test_03_deploy_vm_on_specific_pod                   Error    15.68     test_vm_deployment_planner.py
test_04_deploy_vm_on_host_override_pod_and_cluster  Error    3604.61   test_vm_deployment_planner.py
test_05_deploy_vm_on_cluster_override_pod           Error    15.64     test_vm_deployment_planner.py

@blueorangutan

Trillian test result (tid-6531)
Environment: vmware-67u3 (x2), Advanced Networking with Mgmt server r8
Total time taken: 98749 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr7444-t6531-vmware-67u3.zip
Smoke tests completed. 106 look OK, 2 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test                                                        Result   Time (s)  Test File
test_01_redundant_vpc_site2site_vpn                         Failure  3609.70   test_vpc_vpn.py
test_01_redundant_vpc_site2site_vpn                         Error    3609.78   test_vpc_vpn.py
test_01_vpc_site2site_vpn_multiple_options                  Error    14778.33  test_vpc_vpn.py
test_01_vpc_site2site_vpn_multiple_options                  Error    14778.68  test_vpc_vpn.py
test_01_vpc_remote_access_vpn                               Failure  3610.74   test_vpc_vpn.py
test_01_vpc_site2site_vpn                                   Failure  3609.52   test_vpc_vpn.py
test_01_vpc_site2site_vpn                                   Error    3609.60   test_vpc_vpn.py
test_01_cancel_host_maintenace_with_no_migration_jobs       Error    1806.97   test_host_maintenance.py
test_02_cancel_host_maintenace_with_migration_jobs          Error    2003.78   test_host_maintenance.py
test_03_cancel_host_maintenace_with_migration_jobs_failure  Error    1906.18   test_host_maintenance.py

@shwstppr
Contributor Author

shwstppr commented Jun 1, 2023

@blueorangutan package

@blueorangutan

@shwstppr a [LL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan

Packaging result [LL]: ✔️ el7 ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 6069

@shwstppr
Contributor Author

shwstppr commented Jun 5, 2023

@blueorangutan package

@blueorangutan

@shwstppr a [LL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan

Packaging result [LL]: ✔️ el7 ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 6076

@shwstppr
Contributor Author

shwstppr commented Jun 6, 2023

@blueorangutan test matrix

@blueorangutan

@shwstppr a [SF] Trillian-Jenkins matrix job (centos7 mgmt + xenserver71, rocky8 mgmt + vmware67u3, centos7 mgmt + kvmcentos7) has been kicked to run smoke tests

@blueorangutan

[SF] Trillian Build Failed (tid-6685)

@blueorangutan

[SF] Trillian test result (tid-6686)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 38961 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr7444-t6686-kvm-centos7.zip
Smoke tests completed. 108 look OK, 0 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test Result Time (s) Test File

@blueorangutan

[SF] Trillian test result (tid-6684)
Environment: xenserver-71 (x2), Advanced Networking with Mgmt server 7
Total time taken: 40782 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr7444-t6684-xenserver-71.zip
Smoke tests completed. 108 look OK, 0 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test Result Time (s) Test File

@shwstppr
Contributor Author

shwstppr commented Jun 8, 2023

@blueorangutan test rocky8 vmware-67u3

@blueorangutan

@shwstppr a [SF] Trillian-Jenkins test job (rocky8 mgmt + vmware-67u3) has been kicked to run smoke tests

@blueorangutan

[SF] Trillian test result (tid-6706)
Environment: vmware-67u3 (x2), Advanced Networking with Mgmt server r8
Total time taken: 56501 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr7444-t6706-vmware-67u3.zip
Smoke tests completed. 106 look OK, 2 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test                                                Result   Time (s)  Test File
test_01_deploy_vm_on_specific_host                  Error    15.61     test_vm_deployment_planner.py
test_02_deploy_vm_on_specific_cluster               Error    3603.37   test_vm_deployment_planner.py
test_03_deploy_vm_on_specific_pod                   Error    4.37      test_vm_deployment_planner.py
test_04_deploy_vm_on_host_override_pod_and_cluster  Error    4.37      test_vm_deployment_planner.py
test_05_deploy_vm_on_cluster_override_pod           Error    4.36      test_vm_deployment_planner.py
test_09_expunge_vm                                  Failure  427.75    test_vm_life_cycle.py

@DaanHoogland
Contributor

@blueorangutan test rocky8 vmware-67u3

@blueorangutan

@DaanHoogland a [SF] Trillian-Jenkins test job (rocky8 mgmt + vmware-67u3) has been kicked to run smoke tests

@blueorangutan

[SF] Trillian test result (tid-6712)
Environment: vmware-67u3 (x2), Advanced Networking with Mgmt server r8
Total time taken: 53221 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr7444-t6712-vmware-67u3.zip
Smoke tests completed. 106 look OK, 2 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test                                  Result   Time (s)  Test File
test_04_autoscale_kubernetes_cluster  Failure  156.64    test_kubernetes_clusters.py
test_CreateTemplateWithDuplicateName  Error    1801.16   test_templates.py
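
One way to judge whether the VMware failures in this thread are intermittent is to tally which test files fail in each Trillian run. The sketch below does that with the failing files copied from the VMware result tables in this conversation; the helper name `tally_failures` and the dictionary layout are illustrative, not part of Trillian or Marvin.

```python
from collections import Counter

# Failing test files per VMware Trillian run, copied from the result tables
# in this thread (tid-6538 ran on vmware-70u3, the others on vmware-67u3).
failing_files_by_run = {
    "tid-6513": {"test_snapshots.py", "test_vm_deployment_planner.py"},
    "tid-6538": {"test_vm_deployment_planner.py"},
    "tid-6531": {"test_vpc_vpn.py", "test_host_maintenance.py"},
    "tid-6706": {"test_vm_deployment_planner.py", "test_vm_life_cycle.py"},
    "tid-6712": {"test_kubernetes_clusters.py", "test_templates.py"},
}


def tally_failures(runs: dict) -> Counter:
    """Count in how many runs each test file had at least one failure."""
    counts = Counter()
    for files in runs.values():
        counts.update(files)
    return counts


counts = tally_failures(failing_files_by_run)
recurring = sorted(name for name, n in counts.items() if n > 1)

for name, n in counts.most_common():
    print(f"{name}: failed in {n} of {len(failing_files_by_run)} runs")
print("recurring:", recurring)
```

With the data above, only test_vm_deployment_planner.py fails in more than one run (3 of 5), and no test file fails in every run.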

Contributor

@vladimirpetrov vladimirpetrov left a comment

LGTM based on manual testing (tested with VMware 7.0.3).

@DaanHoogland DaanHoogland merged commit 3748f32 into apache:4.18 Jun 20, 2023
@yadvr yadvr deleted the fix-startvm-interclustermig-vmotion branch March 5, 2024 11:41