Describe the bug
When upgrading the gha-runner-scale-set-controller, two controller-managed objects are not updated to reflect the new version:
- AutoscalingListener CRs: retain the old controller image in spec.image
- EphemeralRunnerSet objects: retain old version labels (app.kubernetes.io/version, helm.sh/chart)
The controller gates reconciliation on a spec hash. A controller-only upgrade does not change any AutoscalingRunnerSet spec, so the hash is identical and the controller skips reconciliation of both objects entirely. The updateStrategy flag has no effect here — it only governs spec-change rollout behaviour, not controller version upgrades.
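To make the gating concrete, here is a minimal Python sketch of hash-gated reconciliation. It is not ARC's implementation (the controller is written in Go); the names `spec_hash`, `needs_reconcile` and `versioned_hash`, the choice of SHA-256, and the sample spec fields are all assumptions made for illustration.

```python
import hashlib
import json


def spec_hash(spec: dict) -> str:
    """Hash a spec deterministically (hypothetical stand-in for the controller's spec hash)."""
    canonical = json.dumps(spec, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def needs_reconcile(desired_spec: dict, recorded_hash: str) -> bool:
    """Gate reconciliation purely on the spec hash, as the issue describes."""
    return spec_hash(desired_spec) != recorded_hash


def versioned_hash(spec: dict, controller_version: str) -> str:
    """Hypothetical variant that mixes the controller version into the hashed input."""
    return spec_hash({"spec": spec, "controllerVersion": controller_version})


# Illustrative AutoscalingRunnerSet spec; the field values are made up.
runner_set_spec = {
    "minRunners": 1,
    "template": {"spec": {"containers": [{"name": "runner"}]}},
}

# Hash recorded on the child objects while controller version N was running.
recorded = spec_hash(runner_set_spec)

# Controller upgraded to N+1: the spec is untouched, so the gate stays closed.
after_upgrade = needs_reconcile(runner_set_spec, recorded)

# With the version in the hashed input, the same upgrade would change the hash.
versioned_changed = (
    versioned_hash(runner_set_spec, "0.14.1") != versioned_hash(runner_set_spec, "0.14.2")
)

print(after_upgrade, versioned_changed)
```

Under these assumptions `after_upgrade` is false, which matches the observed behaviour: nothing in the hashed input changes during a controller-only upgrade, so neither child object is reconciled.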
Object hierarchy affected
AutoscalingRunnerSet (Helm-managed)
├── AutoscalingListener   ← stale image after controller upgrade
└── EphemeralRunnerSet    ← stale labels/spec after controller upgrade
    └── EphemeralRunner
Why this matters
For minor bumps (e.g. 0.14.1 → 0.14.2) the EphemeralRunnerSet staleness may appear cosmetic. For major upgrades where the EphemeralRunnerSet or EphemeralRunner spec has breaking changes (new required fields, removed fields, changed defaults), stale objects running under a new controller are a real correctness risk.
Steps to reproduce
- Deploy ARC controller + scale sets at version N
- Upgrade the controller chart to version N+1 (scale set chart also upgraded)
- Check listeners:
kubectl get autoscalinglisteners -A -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.image}{"\n"}{end}'
- Check runner sets:
kubectl get ephemeralrunnersets -A -o custom-columns='NAME:.metadata.name,VERSION:.metadata.labels.app\.kubernetes\.io/version'
- Both still show version N
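The two checks above can also be scripted. The following is a sketch only, assuming that the listener image tag and the app.kubernetes.io/version label both equal the chart version; the helper names and the sample object names and images are hypothetical. It works on the JSON that `kubectl get ... -A -o json` prints, and digest-pinned images are not handled.

```python
def image_tag(image: str) -> str:
    """Return the tag of an image reference, or '' if it has none."""
    # Look only at the last path segment so a registry port is not mistaken for a tag.
    last_segment = image.rsplit("/", 1)[-1]
    return last_segment.rsplit(":", 1)[1] if ":" in last_segment else ""


def stale_listeners(listener_list: dict, expected_version: str) -> list:
    """List (name, image) for AutoscalingListeners whose image tag differs from expected_version."""
    stale = []
    for item in listener_list.get("items", []):
        image = item.get("spec", {}).get("image", "")
        if image_tag(image) != expected_version:
            stale.append((item["metadata"]["name"], image))
    return stale


def stale_runner_sets(runner_set_list: dict, expected_version: str) -> list:
    """List (name, version label) for EphemeralRunnerSets whose version label differs."""
    stale = []
    for item in runner_set_list.get("items", []):
        labels = item["metadata"].get("labels") or {}
        version = labels.get("app.kubernetes.io/version", "")
        if version != expected_version:
            stale.append((item["metadata"]["name"], version))
    return stale


# Made-up sample data in the shape of a Kubernetes List object.
sample_listeners = {"items": [
    {"metadata": {"name": "scale-set-a-listener"},
     "spec": {"image": "registry.example:5000/controller:0.14.1"}},
    {"metadata": {"name": "scale-set-b-listener"},
     "spec": {"image": "registry.example:5000/controller:0.14.2"}},
]}
sample_runner_sets = {"items": [
    {"metadata": {"name": "scale-set-a-x1",
                  "labels": {"app.kubernetes.io/version": "0.14.1"}}},
    {"metadata": {"name": "scale-set-b-x2",
                  "labels": {"app.kubernetes.io/version": "0.14.2"}}},
]}

print(stale_listeners(sample_listeners, "0.14.2"))
print(stale_runner_sets(sample_runner_sets, "0.14.2"))
```

Against a real cluster, the input would come from parsing the output of `kubectl get autoscalinglisteners -A -o json` and `kubectl get ephemeralrunnersets -A -o json` with `json.load`. A non-empty result after an upgrade reproduces the bug.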
Expected behaviour
After a controller upgrade, AutoscalingListener CRs and EphemeralRunnerSet objects should be reconciled to reflect the new version without manual intervention.
Actual behaviour
Both objects retain the old version. The controller sees the spec hash as unchanged and takes no action.
Workaround
For AutoscalingListener: delete all CRs — the controller recreates them immediately with the new image. Runner pods are unaffected.
kubectl delete autoscalinglisteners -A --all
For EphemeralRunnerSet: a dummy change to minRunners is not sufficient, because it only affects the AutoscalingListener hash, not the EphemeralRunnerSet hash. A change to the runner pod template (spec.template) is required, e.g. adding a dummy annotation:
spec:
  template:
    metadata:
      annotations:
        upgrade-trigger: "0.14.2"
This triggers a graceful transition — the old EphemeralRunnerSet drains (in-progress jobs complete) while the new one starts accepting jobs immediately. The annotation can be removed in a follow-up commit.
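The reason a minRunners change does not help can be sketched as two hashes computed over different parts of the AutoscalingRunnerSet spec. This is an illustrative model only: which fields each hash covers inside ARC is an assumption here, inferred from the behaviour described above, and `listener_hash` and `runner_set_hash` are hypothetical names.

```python
import copy
import hashlib
import json


def _digest(value) -> str:
    """Deterministic SHA-256 over a JSON-serialisable value."""
    return hashlib.sha256(json.dumps(value, sort_keys=True).encode("utf-8")).hexdigest()


def listener_hash(spec: dict) -> str:
    # Assumption: the listener hash covers scaling fields such as minRunners.
    return _digest({"minRunners": spec.get("minRunners"), "maxRunners": spec.get("maxRunners")})


def runner_set_hash(spec: dict) -> str:
    # Assumption: the EphemeralRunnerSet hash covers only the runner pod template.
    return _digest(spec.get("template", {}))


base = {
    "minRunners": 1,
    "maxRunners": 5,
    "template": {"metadata": {}, "spec": {"containers": [{"name": "runner"}]}},
}

# Dummy change to minRunners: only the listener hash moves.
bumped = copy.deepcopy(base)
bumped["minRunners"] = 2

# Dummy annotation on the pod template: the runner set hash moves.
annotated = copy.deepcopy(base)
annotated["template"]["metadata"]["annotations"] = {"upgrade-trigger": "0.14.2"}

print(listener_hash(bumped) != listener_hash(base))        # listener would be recreated
print(runner_set_hash(bumped) == runner_set_hash(base))    # runner set left untouched
print(runner_set_hash(annotated) != runner_set_hash(base)) # runner set would roll over
```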
Full upgrade procedure (per scale set)
- Upgrade controller chart
- Delete all AutoscalingListener CRs
- Add a dummy annotation to spec.template in each AutoscalingRunnerSet HelmRelease, then remove it
None of these steps should be manual.
Version: controller 0.14.2, scale-set 0.14.2, Kubernetes: RKE2