Commit 9be7e26

docs: recreate without leader election (cherry-pick argoproj#16034 for 4.0) (argoproj#16046)
Signed-off-by: Alan Clucas <alan@clucas.org>
Co-authored-by: Alan Clucas <alan@clucas.org>
1 parent 0ab1452 commit 9be7e26

3 files changed

Lines changed: 22 additions & 1 deletion

.spelling

Lines changed: 2 additions & 0 deletions
@@ -231,6 +231,8 @@ rc2
 repo
 retryStrategy
 roadmap
+rollout
+rollouts
 runtime
 runtimes
 s3

docs/environment-variables.md

Lines changed: 1 addition & 1 deletion
@@ -36,7 +36,7 @@ This document outlines environment variables that can be used to customize behav
 | `HEALTHZ_AGE` | `time.Duration` | `5m` | How old a un-reconciled workflow is to report unhealthy. |
 | `INDEX_WORKFLOW_SEMAPHORE_KEYS` | `bool` | `true` | Whether or not to index semaphores. |
 | `LEADER_ELECTION_IDENTITY` | `string` | Controller's `metadata.name` | The ID used for workflow controllers to elect a leader. |
-| `LEADER_ELECTION_DISABLE` | `bool` | `false` | Whether leader election should be disabled. |
+| `LEADER_ELECTION_DISABLE` | `bool` | `false` | Whether leader election should be disabled. When set to `true`, also set the Deployment's rollout strategy to `Recreate` to prevent two controllers running concurrently during rollouts — see [High Availability](high-availability.md#deployment-rollout-strategy). |
 | `LEADER_ELECTION_LEASE_DURATION` | `time.Duration` | `15s` | The duration that non-leader candidates will wait to force acquire leadership. |
 | `LEADER_ELECTION_RENEW_DEADLINE` | `time.Duration` | `10s` | The duration that the acting master will retry refreshing leadership before giving up. |
 | `LEADER_ELECTION_RETRY_PERIOD` | `time.Duration` | `5s` | The duration that the leader election clients should wait between tries of actions. |

docs/high-availability.md

Lines changed: 19 additions & 0 deletions
@@ -19,6 +19,23 @@ By disabling the leader election process, you can avoid unnecessary communicatio
 
 By using the `PriorityClass`, you can ensure that the Workflow Controller Pod is scheduled before other Pods in the cluster.
 
+### Deployment rollout strategy
+
+When leader election is disabled, the Deployment's rollout strategy must not surge a second Pod.
+The default `RollingUpdate` strategy with `maxSurge: 25%` rounds up to `maxSurge: 1` for a single-replica Deployment, so on every rollout (image bump, ConfigMap change, resource edit) the new Pod becomes Ready before the old Pod is terminated.
+Without a leader lease, both Pods reconcile the same Workflows during that window, which can duplicate Pod creations, clobber Workflow status updates, and cause their informer caches to diverge.
+
+Set the Deployment's `spec.strategy` to `Recreate` so the old Pod is terminated before the new Pod starts:
+
+```yaml
+spec:
+  strategy:
+    type: Recreate
+```
+
+This produces a few seconds of controller downtime during each rollout.
+Running Workflows keep executing; reconciliation resumes when the new Pod is Ready.
+
 ### Multiple Workflow Controller Replicas
 
 It is possible to run multiple replicas of the Workflow Controller to provide high-availability.
@@ -28,6 +45,8 @@ Only one replica of the Workflow Controller will actively manage Workflows at an
 The other replicas will be on standby, ready to take over if the active replica fails.
 This means that you are guaranteeing resource allocations for replicas that are not actively contributing to the running of Workflows.
 
+With leader election enabled, the default `RollingUpdate` Deployment strategy is safe: only the replica holding the lease reconciles Workflows, so a surging replica simply waits to acquire the lease when the previous leader steps down.
+
 The leader election process requires frequent communication with the Kubernetes API.
 When running Workflows at scale, the Kubernetes API may become unresponsive, causing the leader election to take longer than 10 seconds (`LEADER_ELECTION_RENEW_DEADLINE`) to respond, which will disrupt the controller.

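For reference, the two settings this commit ties together (`LEADER_ELECTION_DISABLE` and the `Recreate` strategy) can be combined in one Deployment patch fragment. The following is a minimal sketch and not part of the commit: the container name `workflow-controller` is an assumption based on a typical install and may differ in yours, and `replicas: 1` reflects the single-controller setup the docs describe.

```yaml
# Hypothetical patch fragment for the Workflow Controller Deployment.
# Names marked "assumed" do not come from this commit.
spec:
  replicas: 1                 # single controller, no standby replicas
  strategy:
    type: Recreate            # old Pod is terminated before the new Pod starts
    rollingUpdate: null       # clear any existing RollingUpdate settings
  template:
    spec:
      containers:
        - name: workflow-controller   # assumed container name
          env:
            - name: LEADER_ELECTION_DISABLE
              value: "true"   # env values must be strings, so quote the boolean
```

The `Recreate` strategy does not accept `rollingUpdate` settings, so a Deployment that already has `spec.strategy.rollingUpdate` populated may reject the change unless that field is cleared in the same update. That is why the sketch sets it to `null`.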