Currently (v1.1.x), when ServerCreateFailedIrrecoverableErrorReason is set, the hcloud remediation will just stop reconciling:
    // Skip remediation for machines that failed to create with irrecoverable errors (e.g. invalid_input, resource_unavailable).
    // These errors cannot be fixed by rebooting or replacing the machine.
    // We return without error so the MHC does not keep retrying remediation.
    if conditions.IsFalse(hcloudMachine, infrav1.ServerCreateSucceededCondition) &&
        conditions.GetReason(hcloudMachine, infrav1.ServerCreateSucceededCondition) == infrav1.ServerCreateFailedIrrecoverableErrorReason {
        log.Info("Skipping remediation for machine with irrecoverable creation failure",
            "reason", conditions.GetMessage(hcloudMachine, infrav1.ServerCreateSucceededCondition),
        )
        // signal remediation done.
        return reconcile.Result{}, nil
    }
This means the HCloudRemediation resource never gets a status at all (no Conditions):
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: HCloudRemediation
    metadata:
      annotations:
        cluster.x-k8s.io/cloned-from-groupkind: HCloudRemediationTemplate.infrastructure.cluster.x-k8s.io
        cluster.x-k8s.io/cloned-from-name: hetzner-apalla-1-35-v0-sha.ow82ztk-remediation-request
      creationTimestamp: "2026-04-15T15:17:37Z"
      generation: 1
      labels:
        cluster.x-k8s.io/cluster-name: tcs-guettli-tm9-1-35-v0-sha-ow82ztk
      name: tcs-guettli-tm9-1-35-v0-sha-ow82ztk-md-arm-r6f99-zccjj-6vn8l
      namespace: org-testing
      ownerReferences:
      - apiVersion: cluster.x-k8s.io/v1beta1
        kind: Machine
        name: tcs-guettli-tm9-1-35-v0-sha-ow82ztk-md-arm-r6f99-zccjj-6vn8l
        uid: a5f9ea94-1ca6-4eb3-ab48-2a84df0d217f
      resourceVersion: "2803968"
      uid: 0671de0c-2cc3-44e7-ab25-e2ca8857595e
    spec:
      strategy:
        retryLimit: 1
        timeout: 3m0s
        type: Reboot
This is intentional: we don't want an endless loop if an hcloud machine uses an invalid server-type/location combination.
In the current case, cax31 servers were unavailable for some time, but they should be available again now.
We need to communicate that better.
Desired solution:
Set a Condition on the HCloudMachine with an appropriate error message, and set a Condition on the HCloudRemediation as well.
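A rough sketch of the remediation half of this idea, written against stand-in types so that it runs on its own. `Condition`, `Object`, `markFalse`, `findCondition`, and `skipRemediation` are hypothetical helpers, not the Cluster API `conditions` package, and the string values (including the `RemediationSucceeded` condition type) are assumptions for illustration. It assumes the machine's ServerCreateSucceeded condition already carries a useful message and copies that message onto the remediation before returning early.

```go
package main

import (
	"fmt"
	"time"
)

// Assumed string values; the real constants live in infrav1 and may differ.
const (
	serverCreateSucceeded = "ServerCreateSucceeded"
	irrecoverableReason   = "ServerCreateFailedIrrecoverableError"
	// Hypothetical condition type for the remediation object.
	remediationSucceeded = "RemediationSucceeded"
)

// Condition is a simplified stand-in for a Cluster API condition.
type Condition struct {
	Type               string
	Status             string // "True" or "False"
	Reason             string
	Message            string
	LastTransitionTime time.Time
}

// Object is a stand-in for the status of an HCloudMachine or HCloudRemediation.
type Object struct {
	Kind       string
	Name       string
	Conditions []Condition
}

// findCondition returns a pointer to the condition of the given type, or nil.
func findCondition(obj *Object, condType string) *Condition {
	for i := range obj.Conditions {
		if obj.Conditions[i].Type == condType {
			return &obj.Conditions[i]
		}
	}
	return nil
}

// markFalse sets the condition to False, creating it if it does not exist.
// The transition time only changes when the status actually flips.
func markFalse(obj *Object, condType, reason, message string) {
	now := time.Now()
	if c := findCondition(obj, condType); c != nil {
		if c.Status != "False" {
			c.LastTransitionTime = now
		}
		c.Status, c.Reason, c.Message = "False", reason, message
		return
	}
	obj.Conditions = append(obj.Conditions, Condition{
		Type: condType, Status: "False", Reason: reason,
		Message: message, LastTransitionTime: now,
	})
}

// skipRemediation mirrors the early-return branch above, but records on the
// remediation why nothing was done. It reports whether remediation was skipped.
func skipRemediation(machine, remediation *Object) bool {
	c := findCondition(machine, serverCreateSucceeded)
	if c == nil || c.Status != "False" || c.Reason != irrecoverableReason {
		return false
	}
	markFalse(remediation, remediationSucceeded, irrecoverableReason,
		fmt.Sprintf("remediation skipped, server creation failed irrecoverably: %s", c.Message))
	return true
}

func main() {
	machine := &Object{Kind: "HCloudMachine", Name: "md-arm-example"}
	markFalse(machine, serverCreateSucceeded, irrecoverableReason,
		"server type cax31 is not available in the requested location")

	remediation := &Object{Kind: "HCloudRemediation", Name: "md-arm-example"}
	if skipRemediation(machine, remediation) {
		for _, c := range remediation.Conditions {
			fmt.Printf("%s=%s reason=%s message=%q\n", c.Type, c.Status, c.Reason, c.Message)
		}
	}
}
```

In the real controller the same information would presumably be written through the existing `conditions` helpers and persisted with a status patch before returning `reconcile.Result{}, nil`, so that `kubectl get hcloudremediation -o yaml` shows why nothing happened.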