Problem
The current getServerByName() implementation in pkg/openstack/instances.go discovers OpenStack instances by matching the Kubernetes node name against Nova server names using an exact regex match:
opts := servers.ListOpts{
	Name: fmt.Sprintf("^%s$", regexp.QuoteMeta(name)),
}
This silently fails when the Kubernetes node hostname does not exactly match the OpenStack instance name — which is common in:
- RKE2/K3s: hostnames set via --node-name or the OS hostname may differ from Terraform resource names
- Autoscaling: instance names include random suffixes that don't match kubelet hostnames
- Enterprise naming: organizations may have different naming conventions for infra vs K8s
When discovery fails, OCCM sets no providerID, no zone/region labels, and logs no error — making this extremely hard to diagnose.
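To make the failure mode concrete, the sketch below rebuilds the same anchored pattern and checks it locally against a few hypothetical server names. Nova evaluates the name filter on the server side, so using Go's regexp package here is only an approximation. exactNamePattern and matchesExactly are illustrative helpers, not functions from the codebase.

```go
package main

import (
	"fmt"
	"regexp"
)

// exactNamePattern mirrors the filter built in getServerByName: the node
// name is escaped and anchored so only an identical server name matches.
func exactNamePattern(nodeName string) string {
	return fmt.Sprintf("^%s$", regexp.QuoteMeta(nodeName))
}

// matchesExactly evaluates the pattern locally. Nova applies the filter
// server-side; Go's regexp is used here purely for illustration.
func matchesExactly(nodeName, serverName string) bool {
	return regexp.MustCompile(exactNamePattern(nodeName)).MatchString(serverName)
}

func main() {
	node := "cpe-central-worker-1"
	// Hypothetical Nova server names for illustration.
	for _, server := range []string{
		"cpe-central-worker-1",
		"cpe-central-worker-1-x7k2p",
		"tf-cpe-central-worker-1",
	} {
		fmt.Printf("%-28s match=%v\n", server, matchesExactly(node, server))
	}
}
```

Only the first name matches. A suffix or prefix added by the provisioning tool is enough to make the lookup return nothing.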
Proposed Solution
Kubernetes nodes already expose the SMBIOS system UUID via node.status.nodeInfo.systemUUID. OpenStack sets the Nova instance UUID as the guest's SMBIOS product UUID, making this a reliable 1:1 mapping.
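Before the system UUID is used as a Nova server ID, it likely needs normalizing, since the reported casing can vary by platform. A minimal sketch, assuming the value is a canonical 8-4-4-4-12 hex UUID. normalizeSystemUUID is a hypothetical helper, not part of OCCM.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// canonicalUUID matches the lowercase 8-4-4-4-12 hex form used by Nova IDs.
var canonicalUUID = regexp.MustCompile(
	`^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$`)

// normalizeSystemUUID (hypothetical) trims and lowercases the value reported
// in node.status.nodeInfo.systemUUID and rejects anything that is not a UUID,
// so that empty or placeholder values never reach the Nova API.
func normalizeSystemUUID(raw string) (string, bool) {
	id := strings.ToLower(strings.TrimSpace(raw))
	if !canonicalUUID.MatchString(id) {
		return "", false
	}
	return id, true
}

func main() {
	for _, raw := range []string{
		"0E6D1A84-B37C-40D2-BB69-76F9B22FD5BB",
		"",
		"not-a-uuid",
	} {
		id, ok := normalizeSystemUUID(raw)
		fmt.Printf("%q -> %q ok=%v\n", raw, id, ok)
	}
}
```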
The getInstance() function should try multiple discovery strategies in order:
func (i *InstancesV2) getInstance(ctx context.Context, node *v1.Node) (*servers.Server, error) {
	// 1. If providerID is already set, use it (existing behavior)
	if node.Spec.ProviderID != "" {
		return getServerByProviderID(...)
	}

	// 2. Try the system UUID via Nova servers.Get(uuid); most reliable
	if uuid := node.Status.NodeInfo.SystemUUID; uuid != "" {
		srv, err := servers.Get(ctx, i.compute, strings.ToLower(uuid)).Extract()
		if err == nil {
			return srv, nil
		}
		klog.V(4).Infof("Failed to find instance by system UUID %s: %v, falling back to name match", uuid, err)
	}

	// 3. Fall back to name match (existing behavior)
	return getServerByName(ctx, i.compute, node.Name)
}
This is:
- Zero-config: No new cloud.conf options needed
- Backward-compatible: Falls back to name match if UUID lookup fails
- Reliable: on virtualized guests Nova exposes the instance UUID as the SMBIOS UUID; where a platform does not (for example bare metal), the name match still applies
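The ordering of the three strategies can be exercised without an OpenStack endpoint. The sketch below is a self-contained approximation: Server, Node and fakeCompute are hypothetical stand-ins for servers.Server, v1.Node and the compute client, and discover also reports which strategy produced the result.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Server and Node are trimmed-down, hypothetical stand-ins.
type Server struct{ ID, Name string }
type Node struct{ Name, ProviderID, SystemUUID string }

var errNotFound = errors.New("server not found")

// fakeCompute is an in-memory substitute for the Nova API.
type fakeCompute struct {
	byID   map[string]Server
	byName map[string]Server
}

func (c fakeCompute) get(id string) (*Server, error) {
	if s, ok := c.byID[id]; ok {
		return &s, nil
	}
	return nil, errNotFound
}

func (c fakeCompute) getByName(name string) (*Server, error) {
	if s, ok := c.byName[name]; ok {
		return &s, nil
	}
	return nil, errNotFound
}

// discover tries providerID, then system UUID, then the node name.
func discover(c fakeCompute, n Node) (*Server, string, error) {
	if n.ProviderID != "" {
		// An existing providerID is authoritative; no fallback.
		srv, err := c.get(strings.TrimPrefix(n.ProviderID, "openstack:///"))
		return srv, "providerID", err
	}
	if n.SystemUUID != "" {
		if srv, err := c.get(strings.ToLower(n.SystemUUID)); err == nil {
			return srv, "systemUUID", nil
		}
		// Any UUID lookup error falls through to the name match.
	}
	srv, err := c.getByName(n.Name)
	return srv, "name", err
}

func main() {
	// "tf-worker-1" is a hypothetical Nova server name.
	srv := Server{ID: "a3634fe4-e3c5-486d-918e-97df55ad476c", Name: "tf-worker-1"}
	cloud := fakeCompute{
		byID:   map[string]Server{srv.ID: srv},
		byName: map[string]Server{srv.Name: srv},
	}
	node := Node{Name: "cpe-central-worker-1", SystemUUID: strings.ToUpper(srv.ID)}
	found, via, err := discover(cloud, node)
	if err != nil {
		fmt.Println("discovery failed:", err)
		return
	}
	fmt.Printf("found %s via %s\n", found.ID, via)
}
```

In this setup the node name does not match any server name, so the node is only found because the UUID lookup runs before the name match.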
Evidence
On an RKE2 cluster where node hostnames don't match Nova server names, systemUUID correctly maps to the OpenStack instance:
$ kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.nodeInfo.systemUUID}{"\n"}{end}'
cpe-central-master-1 0e6d1a84-b37c-40d2-bb69-76f9b22fd5bb
cpe-central-master-2 989ec61a-37e2-4f7a-9927-4833f55be53e
cpe-central-master-3 24038e8a-fa44-45dc-9bcf-8e7ba04593eb
cpe-central-worker-1 a3634fe4-e3c5-486d-918e-97df55ad476c
cpe-central-worker-2 c9410041-4b6e-4e17-8b8f-b7ffbf9525bc
cpe-central-worker-3 4c0fbb6a-ff80-4c5e-b783-46abccb04677
All UUIDs match their corresponding Nova instance IDs.
Current Workarounds
- Ensure K8s node hostnames exactly match OpenStack instance names (fragile, not always possible)
- Set --provider-id=openstack:///UUID on kubelet at boot time (requires infra-level changes per node)
Both workarounds require manual coordination between infra provisioning and K8s node registration, which the UUID-based discovery would eliminate entirely.
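For the second workaround, the value passed to kubelet follows the openstack:///UUID form shown above. The sketch below builds and parses that form only; formatProviderID and parseProviderID are hypothetical helpers and do not reproduce OCCM's own parser, which may accept other variants.

```go
package main

import (
	"fmt"
	"strings"
)

// providerPrefix is the scheme plus empty authority, as in openstack:///UUID.
const providerPrefix = "openstack:///"

// formatProviderID (hypothetical) builds the flag value from an instance UUID.
func formatProviderID(instanceID string) string {
	return providerPrefix + strings.ToLower(instanceID)
}

// parseProviderID (hypothetical) extracts the instance UUID and rejects
// values that do not follow the openstack:///UUID form.
func parseProviderID(providerID string) (string, error) {
	if !strings.HasPrefix(providerID, providerPrefix) {
		return "", fmt.Errorf("providerID %q does not start with %q", providerID, providerPrefix)
	}
	id := strings.TrimPrefix(providerID, providerPrefix)
	if id == "" || strings.Contains(id, "/") {
		return "", fmt.Errorf("providerID %q has no valid instance ID", providerID)
	}
	return id, nil
}

func main() {
	pid := formatProviderID("C9410041-4B6E-4E17-8B8F-B7FFBF9525BC")
	fmt.Println("--provider-id=" + pid)

	id, err := parseProviderID(pid)
	fmt.Println(id, err)
}
```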
Additional Context
The silent failure when name matching fails (no error logged, providerID simply never set) makes debugging this issue very difficult. Even with --v=2, there is no indication that instance discovery failed. At minimum, a warning log when getServerByName returns no results would help operators diagnose mismatches.
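A minimal sketch of the suggested warning, assuming a wrapper around the name lookup. findByName is a hypothetical stand-in for getServerByName backed by an in-memory map, and the standard library logger is used in place of klog to keep the example self-contained.

```go
package main

import (
	"errors"
	"log"
)

var errNoServerByName = errors.New("no server matches node name")

// findByName is a hypothetical stand-in for getServerByName; known maps
// server names to instance IDs.
func findByName(known map[string]string, nodeName string) (string, error) {
	id, ok := known[nodeName]
	if !ok {
		return "", errNoServerByName
	}
	return id, nil
}

// findByNameLogged wraps the lookup and emits a warning on a miss, so the
// mismatch is visible at default verbosity instead of failing silently.
func findByNameLogged(warnf func(string, ...any), known map[string]string, nodeName string) (string, error) {
	id, err := findByName(known, nodeName)
	if errors.Is(err, errNoServerByName) {
		warnf("WARNING: no OpenStack server named %q; providerID and zone/region labels will not be set", nodeName)
	}
	return id, err
}

func main() {
	// Hypothetical server name and ID.
	known := map[string]string{"tf-worker-1": "4c0fbb6a-ff80-4c5e-b783-46abccb04677"}
	if _, err := findByNameLogged(log.Printf, known, "cpe-central-worker-3"); err != nil {
		log.Printf("discovery failed: %v", err)
	}
}
```

Passing the log function as a parameter keeps the wrapper easy to test and leaves the choice of logger (klog in OCCM) to the caller.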