
🐛 fix(diskpromo): retry PromoteDisks on transient vSphere errors #1612

Open
zhangheliu wants to merge 1 commit into vmware-tanzu:main from zhangheliu:topic/zhangheliu/vmsvc-3646-fix-diskpromo-transient-error

Conversation

@zhangheliu
Contributor

@zhangheliu zhangheliu commented May 18, 2026

What does this PR do, and why is it needed?

When PromoteDisks_Task fails with a transient vSphere fault (e.g. ConcurrentAccess), the competing operation is self-resolving — no operator action is required. Previously the reconciler returned nil without requeueing, leaving the VM stuck until the failed task expired from vSphere's RecentTask list (~10 minutes).

This PR detects transient errors via fault.IsTransientError and continues the RecentTask loop instead of returning, allowing the next reconcile cycle to issue a fresh PromoteDisks_Task immediately. The existing loop invariants ensure no duplicate tasks are created (see inline code comments for details).

A dedicated condition reason DiskPromotionTaskTransientError is introduced to surface the transient-retry state to users.
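The control flow described above can be illustrated with a minimal, self-contained sketch. It is not the code in this PR: the `recentTask` and `outcome` types, the `evaluate` function, and the `reasonRunning` and `reasonFailed` strings are hypothetical stand-ins, and the `Transient` field stands in for whatever `fault.IsTransientError` would report for the task's error. Only the reason string `DiskPromotionTaskTransientError` and the task name `PromoteDisks_Task` come from the description. The sketch also assumes that a new task may be issued once the loop finishes without finding a blocking task.

```go
package main

import "fmt"

// recentTask is a simplified, hypothetical stand-in for one entry in a
// VM's RecentTask list.
type recentTask struct {
	Name      string
	State     string // "queued", "running", "success", or "error"
	Transient bool   // stand-in for the result of fault.IsTransientError
}

// outcome is a hypothetical summary of what the reconciler would do.
type outcome struct {
	IssueNewTask bool
	Reason       string
}

const (
	reasonTransient = "DiskPromotionTaskTransientError" // named in the PR
	reasonRunning   = "TaskRunning"                     // hypothetical
	reasonFailed    = "TaskFailed"                      // hypothetical
)

// evaluate scans the recent tasks. A transient failure does not end the
// scan, so a later in-flight task is still found and no duplicate task
// is issued.
func evaluate(tasks []recentTask) outcome {
	var reason string
	for _, t := range tasks {
		if t.Name != "PromoteDisks_Task" {
			continue
		}
		switch t.State {
		case "queued", "running":
			// A promotion is already in flight: wait for it.
			return outcome{Reason: reasonRunning}
		case "error":
			if t.Transient {
				// Record the state and keep scanning instead of returning.
				reason = reasonTransient
				continue
			}
			// Non-transient failure: stop, as before.
			return outcome{Reason: reasonFailed}
		}
	}
	return outcome{IssueNewTask: true, Reason: reason}
}

func main() {
	tasks := []recentTask{
		{Name: "PromoteDisks_Task", State: "error", Transient: true},
	}
	fmt.Printf("%+v\n", evaluate(tasks))
}
```

In this sketch, continuing rather than breaking out of the loop is what keeps the retry safe: if the list holds both a transiently failed task and a running one, the running task still wins and no new task is issued.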

Bumps govmomi to v0.55.0-alpha.0.0.20260518191903-48ab34adb211 to include vmware/govmomi#4016, which classifies ConcurrentAccess as a transient error in fault.IsTransientError.

Which issue(s) is/are addressed by this PR? (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):

Fixes #

Are there any special notes for your reviewer:

The govmomi bump uses a pseudo-version referencing the commit that merged PR #4016 to govmomi main. Once an official govmomi tag is cut that includes this fix, this can be updated to the tagged release.
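For reviewers who want to check which commit the pseudo-version points at, the following sketch splits it into its timestamp and commit-hash prefix. It relies only on the standard Go pseudo-version layout (a 14-digit UTC timestamp followed by a 12-character revision prefix); `parsePseudoVersion` is a hypothetical helper written for this note, not part of govmomi or vm-operator.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// parsePseudoVersion extracts the commit timestamp (UTC) and the
// 12-character commit hash prefix from a Go module pseudo-version.
func parsePseudoVersion(v string) (time.Time, string, error) {
	i := strings.LastIndex(v, "-")
	if i < 0 || len(v)-i-1 != 12 {
		return time.Time{}, "", fmt.Errorf("not a pseudo-version: %q", v)
	}
	rev, rest := v[i+1:], v[:i]
	if len(rest) < 14 {
		return time.Time{}, "", fmt.Errorf("missing timestamp: %q", v)
	}
	// The timestamp is the last 14 characters before the final dash.
	ts, err := time.Parse("20060102150405", rest[len(rest)-14:])
	if err != nil {
		return time.Time{}, "", fmt.Errorf("bad timestamp in %q: %w", v, err)
	}
	return ts, rev, nil
}

func main() {
	ts, rev, err := parsePseudoVersion("v0.55.0-alpha.0.0.20260518191903-48ab34adb211")
	if err != nil {
		panic(err)
	}
	fmt.Println(ts.Format(time.RFC3339), rev)
}
```

For the version in this PR, the revision prefix is `48ab34adb211` and the timestamp is 2026-05-18 19:19:03 UTC.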

Please add a release note if necessary:


@zhangheliu zhangheliu requested review from a team and faisalabujabal as code owners May 18, 2026 23:49
@github-actions github-actions bot added the size/L label (Denotes a PR that changes 100-499 lines.) May 18, 2026
Transient faults such as ConcurrentAccess indicate a competing
vSphere operation was in flight; they are self-resolving and do
not require a permanent failure response.

Previously, a failed PromoteDisks_Task caused the reconciler to
return nil without requeueing, leaving the VM stuck until the
task expired from vSphere's RecentTask list (~10 minutes).

This change detects transient errors via fault.IsTransientError
and continues the RecentTask loop instead of returning, allowing
the next reconcile cycle to issue a fresh PromoteDisks_Task
immediately.

A dedicated condition reason (DiskPromotionTaskTransientError)
is introduced to surface the transient-retry state to users.

Bumps govmomi to v0.55.0-alpha.0.0.20260518191903-48ab34adb211
to include vmware/govmomi#4016, which classifies ConcurrentAccess
as a transient error in fault.IsTransientError.
@zhangheliu zhangheliu force-pushed the topic/zhangheliu/vmsvc-3646-fix-diskpromo-transient-error branch from 282afe0 to fe8e5a9 Compare May 18, 2026 23:52
@github-actions

Code Coverage

Package Line Rate Health
github.com/vmware-tanzu/vm-operator/controllers/contentlibrary/clustercontentlibraryitem 67%
github.com/vmware-tanzu/vm-operator/controllers/contentlibrary/contentlibraryitem 67%
github.com/vmware-tanzu/vm-operator/controllers/contentlibrary/utils 85%
github.com/vmware-tanzu/vm-operator/controllers/infra/capability/configmap 92%
github.com/vmware-tanzu/vm-operator/controllers/infra/capability/crd 100%
github.com/vmware-tanzu/vm-operator/controllers/infra/configmap 75%
github.com/vmware-tanzu/vm-operator/controllers/infra/node 77%
github.com/vmware-tanzu/vm-operator/controllers/infra/secret 76%
github.com/vmware-tanzu/vm-operator/controllers/infra/validatingwebhookconfiguration 87%
github.com/vmware-tanzu/vm-operator/controllers/infra/zone 73%
github.com/vmware-tanzu/vm-operator/controllers/storage/storageclass 93%
github.com/vmware-tanzu/vm-operator/controllers/storage/storagepolicy 96%
github.com/vmware-tanzu/vm-operator/controllers/storage/storagepolicyquota 91%
github.com/vmware-tanzu/vm-operator/controllers/storage/volumeattributesclass 93%
github.com/vmware-tanzu/vm-operator/controllers/util/encoding 73%
github.com/vmware-tanzu/vm-operator/controllers/virtualmachine/storagepolicyusage 96%
github.com/vmware-tanzu/vm-operator/controllers/virtualmachine/virtualmachine 65%
github.com/vmware-tanzu/vm-operator/controllers/virtualmachine/volume 85%
github.com/vmware-tanzu/vm-operator/controllers/virtualmachine/volumebatch 89%
github.com/vmware-tanzu/vm-operator/controllers/virtualmachineclass 73%
github.com/vmware-tanzu/vm-operator/controllers/virtualmachinegroup 89%
github.com/vmware-tanzu/vm-operator/controllers/virtualmachinegrouppublishrequest 88%
github.com/vmware-tanzu/vm-operator/controllers/virtualmachineimagecache 89%
github.com/vmware-tanzu/vm-operator/controllers/virtualmachinepublishrequest 84%
github.com/vmware-tanzu/vm-operator/controllers/virtualmachinereplicaset 67%
github.com/vmware-tanzu/vm-operator/controllers/virtualmachineservice 89%
github.com/vmware-tanzu/vm-operator/controllers/virtualmachineservice/providers 92%
github.com/vmware-tanzu/vm-operator/controllers/virtualmachinesetresourcepolicy 81%
github.com/vmware-tanzu/vm-operator/controllers/virtualmachinesnapshot 92%
github.com/vmware-tanzu/vm-operator/controllers/virtualmachinewebconsolerequest 72%
github.com/vmware-tanzu/vm-operator/controllers/virtualmachinewebconsolerequest/v1alpha1 72%
github.com/vmware-tanzu/vm-operator/controllers/virtualmachinewebconsolerequest/v1alpha1/conditions 88%
github.com/vmware-tanzu/vm-operator/controllers/virtualmachinewebconsolerequest/v1alpha1/patch 78%
github.com/vmware-tanzu/vm-operator/controllers/vspherepolicy/policyevaluation 85%
github.com/vmware-tanzu/vm-operator/pkg/bitmask 100%
github.com/vmware-tanzu/vm-operator/pkg/builder 89%
github.com/vmware-tanzu/vm-operator/pkg/conditions 90%
github.com/vmware-tanzu/vm-operator/pkg/config 100%
github.com/vmware-tanzu/vm-operator/pkg/config/capabilities 97%
github.com/vmware-tanzu/vm-operator/pkg/config/env 100%
github.com/vmware-tanzu/vm-operator/pkg/context 37%
github.com/vmware-tanzu/vm-operator/pkg/context/generic 100%
github.com/vmware-tanzu/vm-operator/pkg/context/operation 100%
github.com/vmware-tanzu/vm-operator/pkg/crd 76%
github.com/vmware-tanzu/vm-operator/pkg/errors 76%
github.com/vmware-tanzu/vm-operator/pkg/exit 100%
github.com/vmware-tanzu/vm-operator/pkg/log 100%
github.com/vmware-tanzu/vm-operator/pkg/mem 100%
github.com/vmware-tanzu/vm-operator/pkg/patch 78%
github.com/vmware-tanzu/vm-operator/pkg/prober 89%
github.com/vmware-tanzu/vm-operator/pkg/prober/probe 90%
github.com/vmware-tanzu/vm-operator/pkg/prober/worker 77%
github.com/vmware-tanzu/vm-operator/pkg/providers/vsphere 75%
github.com/vmware-tanzu/vm-operator/pkg/providers/vsphere/clustermodules 73%
github.com/vmware-tanzu/vm-operator/pkg/providers/vsphere/config 88%
github.com/vmware-tanzu/vm-operator/pkg/providers/vsphere/contentlibrary 75%
github.com/vmware-tanzu/vm-operator/pkg/providers/vsphere/credentials 100%
github.com/vmware-tanzu/vm-operator/pkg/providers/vsphere/network 83%
github.com/vmware-tanzu/vm-operator/pkg/providers/vsphere/placement 70%
github.com/vmware-tanzu/vm-operator/pkg/providers/vsphere/session 52%
github.com/vmware-tanzu/vm-operator/pkg/providers/vsphere/storage 44%
github.com/vmware-tanzu/vm-operator/pkg/providers/vsphere/upgrade/virtualmachine 95%
github.com/vmware-tanzu/vm-operator/pkg/providers/vsphere/upgrade/virtualmachine/backfill 92%
github.com/vmware-tanzu/vm-operator/pkg/providers/vsphere/vcenter 85%
github.com/vmware-tanzu/vm-operator/pkg/providers/vsphere/virtualmachine 84%
github.com/vmware-tanzu/vm-operator/pkg/providers/vsphere/vmlifecycle 74%
github.com/vmware-tanzu/vm-operator/pkg/record 84%
github.com/vmware-tanzu/vm-operator/pkg/topology 91%
github.com/vmware-tanzu/vm-operator/pkg/util 78%
github.com/vmware-tanzu/vm-operator/pkg/util/cloudinit 89%
github.com/vmware-tanzu/vm-operator/pkg/util/cloudinit/validate 91%
github.com/vmware-tanzu/vm-operator/pkg/util/image 100%
github.com/vmware-tanzu/vm-operator/pkg/util/kube 92%
github.com/vmware-tanzu/vm-operator/pkg/util/kube/cource 100%
github.com/vmware-tanzu/vm-operator/pkg/util/kube/internal 100%
github.com/vmware-tanzu/vm-operator/pkg/util/kube/proxyaddr 73%
github.com/vmware-tanzu/vm-operator/pkg/util/kube/spq 99%
github.com/vmware-tanzu/vm-operator/pkg/util/linuxprep 97%
github.com/vmware-tanzu/vm-operator/pkg/util/netplan 100%
github.com/vmware-tanzu/vm-operator/pkg/util/nil 100%
github.com/vmware-tanzu/vm-operator/pkg/util/ovfcache 75%
github.com/vmware-tanzu/vm-operator/pkg/util/ovfcache/internal 100%
github.com/vmware-tanzu/vm-operator/pkg/util/paused 100%
github.com/vmware-tanzu/vm-operator/pkg/util/ptr 100%
github.com/vmware-tanzu/vm-operator/pkg/util/resize 98%
github.com/vmware-tanzu/vm-operator/pkg/util/sysprep 98%
github.com/vmware-tanzu/vm-operator/pkg/util/vmopv1 88%
github.com/vmware-tanzu/vm-operator/pkg/util/volumes 100%
github.com/vmware-tanzu/vm-operator/pkg/util/vsphere/client 66%
github.com/vmware-tanzu/vm-operator/pkg/util/vsphere/datastore 100%
github.com/vmware-tanzu/vm-operator/pkg/util/vsphere/fault 100%
github.com/vmware-tanzu/vm-operator/pkg/util/vsphere/library 95%
github.com/vmware-tanzu/vm-operator/pkg/util/vsphere/storage 82%
github.com/vmware-tanzu/vm-operator/pkg/util/vsphere/task 100%
github.com/vmware-tanzu/vm-operator/pkg/util/vsphere/vm 78%
github.com/vmware-tanzu/vm-operator/pkg/util/vsphere/watcher 85%
github.com/vmware-tanzu/vm-operator/pkg/vmconfig 95%
github.com/vmware-tanzu/vm-operator/pkg/vmconfig/anno2extraconfig 100%
github.com/vmware-tanzu/vm-operator/pkg/vmconfig/bootoptions 88%
github.com/vmware-tanzu/vm-operator/pkg/vmconfig/cdrom 88%
github.com/vmware-tanzu/vm-operator/pkg/vmconfig/crypto 92%
github.com/vmware-tanzu/vm-operator/pkg/vmconfig/diskpromo 100%
github.com/vmware-tanzu/vm-operator/pkg/vmconfig/policy 97%
github.com/vmware-tanzu/vm-operator/pkg/vmconfig/virtualcontroller 93%
github.com/vmware-tanzu/vm-operator/pkg/vmconfig/volumes/unmanaged/backfill 98%
github.com/vmware-tanzu/vm-operator/pkg/vmconfig/volumes/unmanaged/register 93%
github.com/vmware-tanzu/vm-operator/pkg/webconsolevalidation 100%
github.com/vmware-tanzu/vm-operator/services/vm-watcher 85%
github.com/vmware-tanzu/vm-operator/webhooks/common 98%
github.com/vmware-tanzu/vm-operator/webhooks/persistentvolumeclaim/validation 95%
github.com/vmware-tanzu/vm-operator/webhooks/unifiedstoragequota/validation 89%
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachine/mutation 87%
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachine/validation 96%
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachineclass/mutation 62%
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachineclass/validation 89%
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachinegroup/mutation 87%
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachinegroup/validation 92%
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachinegrouppublishrequest/mutation 86%
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachinegrouppublishrequest/validation 88%
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachinepublishrequest/validation 90%
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachinereplicaset/validation 90%
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachineservice/mutation 67%
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachineservice/validation 92%
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachinesetresourcepolicy/validation 89%
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachinesnapshot/mutation 83%
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachinesnapshot/validation 91%
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachinewebconsolerequest/v1alpha1/validation 92%
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachinewebconsolerequest/validation 92%
Summary 84% (19483 / 23258)

Minimum allowed line rate is 79%

