Skip to content

Reconcile VMLocation when it's moved to different RP and set appropriate conditions#1558

Open
anwithaun-1 wants to merge 21 commits into
vmware-tanzu:mainfrom
anwithaun-1:topic/anwitha/VMSvc_3539_3541
Open

Reconcile VMLocation when it's moved to different RP and set appropriate conditions#1558
anwithaun-1 wants to merge 21 commits into
vmware-tanzu:mainfrom
anwithaun-1:topic/anwitha/VMSvc_3539_3541

Conversation

@anwithaun-1
Copy link
Copy Markdown
Contributor

@anwithaun-1 anwithaun-1 commented Apr 16, 2026

What does this PR do, and why is it needed?

Problem:
Currently, the VM Operator does not set appropriate conditions or halt reconciliation when a VM is moved out of its designated resource pool. This leads to two major issues:
Permission Creep: DevOps admin privileges are inadvertently extended to reconfigure VMs that have drifted into different inventory locations.
Lack of Visibility: There is no "stop-gap" or clear status indicator telling administrators that the VM is in an orphaned state, leading to unpredictable reconciliation behavior.

Solution:

  • The controller now searches the broader inventory to see if the VM exists. A VM is flagged as "Orphaned" if it is located outside its Namespace RP.
  • Child ResourcePool is considered as valid locations.
  • Once an orphan is detected, the controller finalizes the status update and immediately exits the reconciliation loop with NoRequeErr.
  • This prevents the system from using elevated privileges to reconfigure a VM that is no longer in a managed/expected location.
  • Applies a clear condition to the VM object notifying the User/Administrator that the VM is in an unexpected location.
  • Property watcher added to watch VM's resourcePool or Parent property change. This triggers the reconileLocation to validate the VM's location.
  • Once the VM is moved back to the expected location, mismatch condition set on VM will be removed.

State Conflict: Two instances of the same workload existing simultaneously in different locations.

Which issue(s) is/are addressed by this PR? (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):

Fixes #
VMSVC-3539
VMSVC-3541

Are there any special notes for your reviewer:

Output:
Last Transition Time:  2026-05-11T05:43:40Z
    Message:               VM is in an unauthorized Resource Pool. Move the VM back to the Resource Pool hierarchy for namespace 'test-ns' to resume reconciliation.
    Reason:                LocationMismatch
    Status:                False
    Type:                  VirtualMachineConditionInAuthorizedLocation

Please add a release note if necessary:


@github-actions github-actions Bot added the size/L Denotes a PR that changes 100-499 lines. label Apr 16, 2026
@github-actions github-actions Bot added size/M Denotes a PR that changes 30-99 lines. and removed size/L Denotes a PR that changes 100-499 lines. labels Apr 21, 2026
@anwithaun-1 anwithaun-1 changed the title Fix for VMSVC-3539-3541 Reconcile VMLocation to different RP or folder and set appropriate conditions Apr 21, 2026
@anwithaun-1 anwithaun-1 changed the title Reconcile VMLocation to different RP or folder and set appropriate conditions Reconcile VMLocation when it's moved to different RP or folder and set appropriate conditions Apr 21, 2026
@anwithaun-1 anwithaun-1 marked this pull request as ready for review April 21, 2026 09:46
@anwithaun-1 anwithaun-1 requested review from a team and faisalabujabal as code owners April 21, 2026 09:46
Comment thread pkg/providers/vsphere/vmprovider_vm.go Outdated
Comment thread controllers/virtualmachine/virtualmachine/virtualmachine_controller.go Outdated
@aakashchan aakashchan requested a review from hpannem April 22, 2026 06:20
Comment thread controllers/virtualmachine/virtualmachine/virtualmachine_controller.go Outdated
Comment thread pkg/providers/vm_provider_interface.go Outdated
Comment thread pkg/providers/vsphere/vmprovider_vm.go Outdated
Comment thread pkg/providers/vsphere/vmprovider_vm.go Outdated
Comment thread pkg/providers/vsphere/vmprovider_vm.go Outdated
Comment thread pkg/providers/vsphere/vmprovider_vm.go Outdated
Comment thread pkg/providers/vsphere/vmprovider_vm.go Outdated
Comment thread pkg/providers/vsphere/vmprovider_vm.go Outdated
Comment thread pkg/providers/vsphere/vmprovider_vm.go
Comment thread pkg/providers/vsphere/vmprovider_vm.go Outdated
Copy link
Copy Markdown
Contributor Author

@anwithaun-1 anwithaun-1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please take a look at the addressed comments.

Copy link
Copy Markdown
Contributor Author

@anwithaun-1 anwithaun-1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolved all the comments. pls review again.

Comment thread services/vm-watcher/vm_watcher_service.go Outdated
Comment thread pkg/providers/vsphere/vmprovider_vm.go Outdated
Comment thread pkg/providers/vsphere/vmprovider_vm.go Outdated
Comment thread pkg/providers/vsphere/vmprovider_vm.go Outdated
@github-actions github-actions Bot added size/L Denotes a PR that changes 100-499 lines. and removed size/M Denotes a PR that changes 30-99 lines. labels May 12, 2026
@anwithaun-1
Copy link
Copy Markdown
Contributor Author

@aruneshpa @bryanv @hpannem , pls review

Comment thread api/v1alpha6/virtualmachine_types.go Outdated
Comment thread pkg/providers/vsphere/vmprovider_vm.go Outdated
Comment thread pkg/providers/vsphere/vmprovider_vm.go Outdated
}

// 3. Validate Ancestry
// Check if the VM is in the Root RP or a Child RP (common for VKS)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what happens if the vm is in correct namespace however wrong child RP? Is this not under the PR's scope?

Copy link
Copy Markdown
Contributor Author

@anwithaun-1 anwithaun-1 May 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is considered as wrong Child RP?
If the VM is in correct namespace and moved to any of the child RP of that namespace, then it's considered as a valid location as of now

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok. However we need to confirm this with Bryan or Andrew.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. I am with Hemanth on this. The only valid use-case for child RP is VKS. But if the VM is not in the VKS child RP, then we need to report it as an error / invalid location.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bryanv any thoughts on this?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aruneshpa How to ensure that if the child RP is VKS or not?

@anwithaun-1 anwithaun-1 requested a review from aruneshpa May 18, 2026 04:57
Comment thread pkg/providers/vsphere/vmprovider_vm.go Outdated
return err
}

pkgcnd.Delete(vmCtx.VM, vmopv1.VirtualMachineInAuthorizedLocation)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, absence of this condition has two scenarios.

  1. the reconcile has not yet happened.
  2. The reconcile happened and its in correct location.

Would it be better to distinguish these two scenarios?
True: In correct location.
False: not in correct location.
Absent: Not yet run or unable to determine.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. The way the condition is named right now, it must be set to True during every reconcile if the VM is in a valid location.

If it is only supposed to be set when the VM is in invalid location, then we can call it "VMInInvalidLocation" or something but that violates the principal that the condition types must have positive polarity.

Copy link
Copy Markdown
Contributor Author

@anwithaun-1 anwithaun-1 May 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hpannem @aruneshpa
ReconcileLocation is called when VM is created or when resourcePool is changed. So in my opinion (Absent: Not yet run or unable to determine ) is not required. We can distinguish between the VM in correct location and in incorrect location. Also "unable to determine" can happen for many reasons. if we are not able to reconcile VM's location and just remove the condition in this case, then it leads to the impression that the VM is normal.

Are these states Ok ?
True: In correct location (when vm is created or or after creation when it's moved to valid location )
False: not in correct location (when vm is moved to invalid location)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think of user perspective, who is monitoring the CR.

  1. VM reconcile is not yet run. There could be many reasons for this. Operator down... or CR is queued and reconciler didn't get a chance to this. Condition absence == reconciler not run. User waits until condition is inserted and checks the status.
  2. Once reconciled, condition is put. Usually, I believe conditions should not get deleted once they are inserted. Just change true or false or unknown. There may be exceptions in some cases like upgrade etc. However, here, it would be good to keep it.

One correction: "Unable to determine" will go as an unknown on the condition. Condition will be set as "unknown" with a reason and message. Standard k8s pattern. aka. metav1.ConditionUnknown.

True: In correct location.
False: not in correct location.
Unknown: unable to determine.
Absent: Not yet run

Comment thread pkg/providers/vsphere/vmprovider_vm.go
Comment thread pkg/providers/vsphere/vmprovider_vm.go Outdated
Comment thread pkg/providers/vsphere/vmprovider_vm.go Outdated
)
if err != nil {
logger.Error(err, "failed to get expected namespace resource pool")
return nil
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. Should this error be ignored? If we are can't be certain that we are in the RP that we should be in, it is okay to error and requeue IMO.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reasons why I am not returning the Err here:

  1. In case if we fail to get namespace RP, then it'll block all the next VM reconcilations. That is the reason , I went ahead with ignoring this error. Still I have handled err in case if I am not able to validate the location or if the VM is in the RP that it shouldn't be in.
  2. Unit tests failure due to the empty zonename property which is expected by the GetNamespaceFolderAndRPMoID api.

@aruneshpa could you pls provide suggestions on this.

Comment thread pkg/providers/vsphere/vmprovider_vm.go Outdated
Comment thread pkg/providers/vsphere/vmprovider_vm.go Outdated
Comment thread pkg/providers/vsphere/vmprovider_vm.go Outdated
return err
}

pkgcnd.Delete(vmCtx.VM, vmopv1.VirtualMachineInAuthorizedLocation)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. The way the condition is named right now, it must be set to True during every reconcile if the VM is in a valid location.

If it is only supposed to be set when the VM is in invalid location, then we can call it "VMInInvalidLocation" or something but that violates the principal that the condition types must have positive polarity.

"summary.runtime.connectionState",
"summary.runtime.host",
"summary.runtime.powerState",
"resourcePool",
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also watch the folder ("parent") here?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aruneshpa Initially I had added "parent" and removed it as @bryanv mentioned that "the VM managed object doesn't have a parent field" in of the review thread above. I am not sure about this. Could you please provide more clarity.

Comment thread pkg/providers/vsphere/vmprovider_vm.go
Comment thread pkg/providers/vsphere/vmprovider_vm.go Outdated
@anwithaun-1 anwithaun-1 changed the title Reconcile VMLocation when it's moved to different RP or folder and set appropriate conditions Reconcile VMLocation when it's moved to different RP and set appropriate conditions May 19, 2026
@github-actions
Copy link
Copy Markdown

Code Coverage

Package Line Rate Health
github.com/vmware-tanzu/vm-operator/controllers/contentlibrary/clustercontentlibraryitem 67%
github.com/vmware-tanzu/vm-operator/controllers/contentlibrary/contentlibraryitem 67%
github.com/vmware-tanzu/vm-operator/controllers/contentlibrary/utils 85%
github.com/vmware-tanzu/vm-operator/controllers/infra/capability/configmap 92%
github.com/vmware-tanzu/vm-operator/controllers/infra/capability/crd 100%
github.com/vmware-tanzu/vm-operator/controllers/infra/configmap 75%
github.com/vmware-tanzu/vm-operator/controllers/infra/node 77%
github.com/vmware-tanzu/vm-operator/controllers/infra/secret 76%
github.com/vmware-tanzu/vm-operator/controllers/infra/validatingwebhookconfiguration 87%
github.com/vmware-tanzu/vm-operator/controllers/infra/zone 73%
github.com/vmware-tanzu/vm-operator/controllers/storage/storageclass 93%
github.com/vmware-tanzu/vm-operator/controllers/storage/storagepolicy 96%
github.com/vmware-tanzu/vm-operator/controllers/storage/storagepolicyquota 91%
github.com/vmware-tanzu/vm-operator/controllers/storage/volumeattributesclass 93%
github.com/vmware-tanzu/vm-operator/controllers/util/encoding 73%
github.com/vmware-tanzu/vm-operator/controllers/virtualmachine/storagepolicyusage 96%
github.com/vmware-tanzu/vm-operator/controllers/virtualmachine/virtualmachine 65%
github.com/vmware-tanzu/vm-operator/controllers/virtualmachine/volume 85%
github.com/vmware-tanzu/vm-operator/controllers/virtualmachine/volumebatch 89%
github.com/vmware-tanzu/vm-operator/controllers/virtualmachineclass 73%
github.com/vmware-tanzu/vm-operator/controllers/virtualmachinegroup 89%
github.com/vmware-tanzu/vm-operator/controllers/virtualmachinegrouppublishrequest 88%
github.com/vmware-tanzu/vm-operator/controllers/virtualmachineimagecache 89%
github.com/vmware-tanzu/vm-operator/controllers/virtualmachinepublishrequest 84%
github.com/vmware-tanzu/vm-operator/controllers/virtualmachinereplicaset 68%
github.com/vmware-tanzu/vm-operator/controllers/virtualmachineservice 89%
github.com/vmware-tanzu/vm-operator/controllers/virtualmachineservice/providers 92%
github.com/vmware-tanzu/vm-operator/controllers/virtualmachinesetresourcepolicy 81%
github.com/vmware-tanzu/vm-operator/controllers/virtualmachinesnapshot 92%
github.com/vmware-tanzu/vm-operator/controllers/virtualmachinewebconsolerequest 72%
github.com/vmware-tanzu/vm-operator/controllers/virtualmachinewebconsolerequest/v1alpha1 72%
github.com/vmware-tanzu/vm-operator/controllers/virtualmachinewebconsolerequest/v1alpha1/conditions 88%
github.com/vmware-tanzu/vm-operator/controllers/virtualmachinewebconsolerequest/v1alpha1/patch 78%
github.com/vmware-tanzu/vm-operator/controllers/vspherepolicy/policyevaluation 85%
github.com/vmware-tanzu/vm-operator/pkg/bitmask 100%
github.com/vmware-tanzu/vm-operator/pkg/builder 89%
github.com/vmware-tanzu/vm-operator/pkg/conditions 90%
github.com/vmware-tanzu/vm-operator/pkg/config 100%
github.com/vmware-tanzu/vm-operator/pkg/config/capabilities 97%
github.com/vmware-tanzu/vm-operator/pkg/config/env 100%
github.com/vmware-tanzu/vm-operator/pkg/context 37%
github.com/vmware-tanzu/vm-operator/pkg/context/generic 100%
github.com/vmware-tanzu/vm-operator/pkg/context/operation 100%
github.com/vmware-tanzu/vm-operator/pkg/crd 76%
github.com/vmware-tanzu/vm-operator/pkg/errors 76%
github.com/vmware-tanzu/vm-operator/pkg/exit 100%
github.com/vmware-tanzu/vm-operator/pkg/log 100%
github.com/vmware-tanzu/vm-operator/pkg/mem 100%
github.com/vmware-tanzu/vm-operator/pkg/patch 78%
github.com/vmware-tanzu/vm-operator/pkg/prober 89%
github.com/vmware-tanzu/vm-operator/pkg/prober/probe 90%
github.com/vmware-tanzu/vm-operator/pkg/prober/worker 77%
github.com/vmware-tanzu/vm-operator/pkg/providers/vsphere 75%
github.com/vmware-tanzu/vm-operator/pkg/providers/vsphere/clustermodules 73%
github.com/vmware-tanzu/vm-operator/pkg/providers/vsphere/config 88%
github.com/vmware-tanzu/vm-operator/pkg/providers/vsphere/contentlibrary 75%
github.com/vmware-tanzu/vm-operator/pkg/providers/vsphere/credentials 100%
github.com/vmware-tanzu/vm-operator/pkg/providers/vsphere/network 83%
github.com/vmware-tanzu/vm-operator/pkg/providers/vsphere/placement 70%
github.com/vmware-tanzu/vm-operator/pkg/providers/vsphere/session 52%
github.com/vmware-tanzu/vm-operator/pkg/providers/vsphere/storage 44%
github.com/vmware-tanzu/vm-operator/pkg/providers/vsphere/upgrade/virtualmachine 95%
github.com/vmware-tanzu/vm-operator/pkg/providers/vsphere/upgrade/virtualmachine/backfill 92%
github.com/vmware-tanzu/vm-operator/pkg/providers/vsphere/vcenter 80%
github.com/vmware-tanzu/vm-operator/pkg/providers/vsphere/virtualmachine 84%
github.com/vmware-tanzu/vm-operator/pkg/providers/vsphere/vmlifecycle 74%
github.com/vmware-tanzu/vm-operator/pkg/record 84%
github.com/vmware-tanzu/vm-operator/pkg/topology 91%
github.com/vmware-tanzu/vm-operator/pkg/util 78%
github.com/vmware-tanzu/vm-operator/pkg/util/cloudinit 89%
github.com/vmware-tanzu/vm-operator/pkg/util/cloudinit/validate 91%
github.com/vmware-tanzu/vm-operator/pkg/util/image 100%
github.com/vmware-tanzu/vm-operator/pkg/util/kube 92%
github.com/vmware-tanzu/vm-operator/pkg/util/kube/cource 100%
github.com/vmware-tanzu/vm-operator/pkg/util/kube/internal 100%
github.com/vmware-tanzu/vm-operator/pkg/util/kube/proxyaddr 73%
github.com/vmware-tanzu/vm-operator/pkg/util/kube/spq 99%
github.com/vmware-tanzu/vm-operator/pkg/util/linuxprep 97%
github.com/vmware-tanzu/vm-operator/pkg/util/netplan 100%
github.com/vmware-tanzu/vm-operator/pkg/util/nil 100%
github.com/vmware-tanzu/vm-operator/pkg/util/ovfcache 75%
github.com/vmware-tanzu/vm-operator/pkg/util/ovfcache/internal 100%
github.com/vmware-tanzu/vm-operator/pkg/util/paused 100%
github.com/vmware-tanzu/vm-operator/pkg/util/ptr 100%
github.com/vmware-tanzu/vm-operator/pkg/util/resize 98%
github.com/vmware-tanzu/vm-operator/pkg/util/sysprep 98%
github.com/vmware-tanzu/vm-operator/pkg/util/vmopv1 88%
github.com/vmware-tanzu/vm-operator/pkg/util/volumes 100%
github.com/vmware-tanzu/vm-operator/pkg/util/vsphere/client 66%
github.com/vmware-tanzu/vm-operator/pkg/util/vsphere/datastore 100%
github.com/vmware-tanzu/vm-operator/pkg/util/vsphere/fault 100%
github.com/vmware-tanzu/vm-operator/pkg/util/vsphere/library 95%
github.com/vmware-tanzu/vm-operator/pkg/util/vsphere/storage 82%
github.com/vmware-tanzu/vm-operator/pkg/util/vsphere/task 100%
github.com/vmware-tanzu/vm-operator/pkg/util/vsphere/vm 78%
github.com/vmware-tanzu/vm-operator/pkg/util/vsphere/watcher 85%
github.com/vmware-tanzu/vm-operator/pkg/vmconfig 95%
github.com/vmware-tanzu/vm-operator/pkg/vmconfig/anno2extraconfig 100%
github.com/vmware-tanzu/vm-operator/pkg/vmconfig/bootoptions 88%
github.com/vmware-tanzu/vm-operator/pkg/vmconfig/cdrom 88%
github.com/vmware-tanzu/vm-operator/pkg/vmconfig/crypto 92%
github.com/vmware-tanzu/vm-operator/pkg/vmconfig/diskpromo 100%
github.com/vmware-tanzu/vm-operator/pkg/vmconfig/policy 97%
github.com/vmware-tanzu/vm-operator/pkg/vmconfig/virtualcontroller 93%
github.com/vmware-tanzu/vm-operator/pkg/vmconfig/volumes/unmanaged/backfill 98%
github.com/vmware-tanzu/vm-operator/pkg/vmconfig/volumes/unmanaged/register 93%
github.com/vmware-tanzu/vm-operator/pkg/webconsolevalidation 100%
github.com/vmware-tanzu/vm-operator/services/vm-watcher 85%
github.com/vmware-tanzu/vm-operator/webhooks/common 98%
github.com/vmware-tanzu/vm-operator/webhooks/persistentvolumeclaim/validation 95%
github.com/vmware-tanzu/vm-operator/webhooks/unifiedstoragequota/validation 89%
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachine/mutation 87%
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachine/validation 96%
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachineclass/mutation 62%
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachineclass/validation 89%
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachinegroup/mutation 87%
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachinegroup/validation 92%
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachinegrouppublishrequest/mutation 86%
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachinegrouppublishrequest/validation 88%
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachinepublishrequest/validation 90%
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachinereplicaset/validation 90%
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachineservice/mutation 67%
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachineservice/validation 92%
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachinesetresourcepolicy/validation 89%
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachinesnapshot/mutation 83%
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachinesnapshot/validation 91%
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachinewebconsolerequest/v1alpha1/validation 92%
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachinewebconsolerequest/validation 92%
Summary 84% (19493 / 23285)

Minimum allowed line rate is 79%

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size/L Denotes a PR that changes 100-499 lines.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants