Commit 3aee62a

Run OAP init job in the main phase to fix helm --wait deadlock (#190)

The OAP init job was a `post-install,post-upgrade,post-rollback` hook. Under `helm upgrade --install --wait`, Helm waits for all release resources to become Ready before running post-* hooks, but the OAP Deployment runs in `-Dmode=no-init` and never becomes Ready until the init job creates the storage schema. The hook therefore never runs, and the install deadlocks until it times out. New users hit this on any fresh install or fresh storage.

Hooks cannot fix this with embedded storage subcharts: a pre-* hook init job cannot reach main-phase storage, and a post-* hook deadlocks under `--wait`. So the init job now runs as a normal main-phase resource alongside storage and the OAP Deployment, which blocks in no-init mode until the schema appears.

To avoid `spec.template is immutable` failures on upgrade (a Job's pod template cannot be patched), the Job name carries an 8-char hash of the chart values, so a changed spec yields a new Job and Helm prunes the previous one.

A new optional `oapInit.ttlSecondsAfterFinished` can auto-clean finished Jobs. It is off by default and should stay off with GitOps tools, which would otherwise recreate the Job.

The OAP Deployment startupProbe default failureThreshold is raised 9 -> 30 (90s -> 300s) so the pod waits for the init job during a cold start instead of being restarted.

Docs (values.yaml, chart README, root README) updated accordingly.
1 parent 32f1811 commit 3aee62a

5 files changed

Lines changed: 47 additions & 21 deletions

README.md

Lines changed: 15 additions & 12 deletions
@@ -316,22 +316,25 @@ You can set those environment variables by `--set oap.env.<ENV_NAME>=<ENV_VALUE>
 
 > The environment variables take priority over the overrode configuration files.
 
-## Rerun OAP init job
+## OAP init job
 
-Kubernetes Job cannot be rerun by default, if you want to rerun the OAP init
-job, you need to delete the Job and recreate it.
+The OAP storage schema (Elasticsearch indices / SQL tables / BanyanDB groups) is created by a
+one-shot `*-oap-init-*` Job that runs OAP in `-Dmode=init`. The main OAP Deployment runs in
+`-Dmode=no-init` and blocks (its `12800` port stays closed, so it is not Ready) until that schema
+exists. The init Job is a **normal release resource** that runs in the main install/upgrade phase,
+so `helm upgrade --install --wait` works: the Job creates the schema while OAP waits for it. To get
+Helm to surface init-Job failures directly (instead of only seeing OAP fail to become Ready), add
+`--wait-for-jobs` alongside `--wait`.
+
+The Job name carries a hash of the chart values, so any `helm upgrade` that changes a value
+re-creates the Job and re-runs init automatically (Helm prunes the previous one).
+
+To **force a rerun** without changing any value — delete the Job and re-run `helm upgrade`; Helm
+recreates the (now missing) Job and init runs again:
 
 ```shell
-# Make sure to export the Job manifest to a file before deleting it.
-kubectl get job -n "${SKYWALKING_RELEASE_NAMESPACE}" -l release=$SKYWALKING_RELEASE_NAME -o yaml > oap-init.job.yaml
-# Trim the Job manifest to keep only the Job part, you can either download yq from https://github.com/mikefarah/yq or
-# manually remove the fields that are not needed.
-yq 'del(.items[0].metadata.creationTimestamp,.items[0].metadata.resourceVersion,.items[0].metadata.uid,.items[0].status,.items[0].spec.template.metadata.labels."batch.kubernetes.io/controller-uid",.items[0].spec.template.metadata.labels."controller-uid",.items[0].spec.selector.matchLabels."batch.kubernetes.io/controller-uid")' oap-init.job.yaml > oap-init.job.trimmed.yaml
-# Check the file oap-init.job.trimmed.yaml to make sure it has correct content
-# Delete the Job
 kubectl delete job -n "${SKYWALKING_RELEASE_NAMESPACE}" -l release=$SKYWALKING_RELEASE_NAME
-# Create the Job
-kubectl -n "${SKYWALKING_RELEASE_NAMESPACE}" apply -f oap-init.job.trimmed.yaml
+helm upgrade "$SKYWALKING_RELEASE_NAME" <chart> -n "${SKYWALKING_RELEASE_NAMESPACE}" --reuse-values
 ```
 
 # Contact Us
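
The `--wait` / `--wait-for-jobs` advice in the new README text could be wrapped in a pair of shell helpers along these lines. This is only a sketch: the function names `install_skywalking` and `wait_for_oap_init`, the default release and namespace values, the `./chart/skywalking` chart path (taken from this repository's layout, with chart dependencies assumed to be already available), and the timeouts are all assumptions, not part of the commit.

```shell
# Assumed defaults; override them in the environment before sourcing this file.
SKYWALKING_RELEASE_NAME="${SKYWALKING_RELEASE_NAME:-skywalking}"
SKYWALKING_RELEASE_NAMESPACE="${SKYWALKING_RELEASE_NAMESPACE:-skywalking}"
SKYWALKING_CHART="${SKYWALKING_CHART:-./chart/skywalking}"

# Install or upgrade, blocking until release resources are Ready and Jobs have
# completed, so an init-Job failure is reported by Helm itself.
install_skywalking() {
  helm upgrade --install "$SKYWALKING_RELEASE_NAME" "$SKYWALKING_CHART" \
    -n "$SKYWALKING_RELEASE_NAMESPACE" --create-namespace \
    --wait --wait-for-jobs --timeout 10m
}

# Optionally block on the init Job alone, using the same `release=` label
# selector as the README's `kubectl delete job` command.
wait_for_oap_init() {
  kubectl wait job -n "$SKYWALKING_RELEASE_NAMESPACE" \
    -l "release=$SKYWALKING_RELEASE_NAME" \
    --for=condition=complete --timeout=300s
}
```

Nothing runs until one of the functions is called. The `release=` selector matches every Job labelled with the release name, so `wait_for_oap_init` may also wait on other Jobs if the release contains any.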

chart/skywalking/README.md

Lines changed: 2 additions & 1 deletion
@@ -68,7 +68,7 @@ The following table lists the configurable parameters of the Skywalking chart an
 | `oap.nodeSelector` | OAP labels for master pod assignment | `{}` |
 | `oap.tolerations` | OAP tolerations | `[]` |
 | `oap.resources` | OAP node resources requests & limits | `{} - cpu limit must be an integer` |
-| `oap.startupProbe` | Configuration fields for the [startupProbe](https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/) | `tcpSocket.port: 12800` <br> `failureThreshold: 9` <br> `periodSeconds: 10`
+| `oap.startupProbe` | Configuration fields for the [startupProbe](https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/). The default budget (`failureThreshold` * `periodSeconds` = 300s) is large enough for OAP to wait in no-init mode while the OAP init Job creates the storage schema. | `tcpSocket.port: 12800` <br> `failureThreshold: 30` <br> `periodSeconds: 10`
 | `oap.livenessProbe` | Configuration fields for the [livenessProbe](https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/) | `tcpSocket.port: 12800` <br> `initialDelaySeconds: 5` <br> `periodSeconds: 10`
 | `oap.readinessProbe` | Configuration fields for the [readinessProbe](https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/) | `tcpSocket.port: 12800` <br> `initialDelaySeconds: 5` <br> `periodSeconds: 10`
 | `oap.env` | OAP environment variables | `[]` |
@@ -109,6 +109,7 @@ The following table lists the configurable parameters of the Skywalking chart an
 | `oapInit.nodeSelector` | OAP init job labels for master pod assignment | `{}` |
 | `oapInit.tolerations` | OAP init job tolerations | `[]` |
 | `oapInit.extraPodLabels` | OAP init job metadata labels | `[]` |
+| `oapInit.ttlSecondsAfterFinished` | Seconds after which the finished OAP init Job (and its Pod) is auto-deleted by the Kubernetes TTL-after-finished controller. Empty keeps the Job. Leave empty with GitOps tools (Argo CD/Flux), which would recreate it after deletion. | `""` |
 | `satellite.name` | Satellite deployment name | `satellite` |
 | `satellite.replicas` | Satellite k8s deployment replicas | `1` |
 | `satellite.enabled` | Is enable Satellite | `false` |

chart/skywalking/templates/oap-deployment.yaml

Lines changed: 4 additions & 1 deletion
@@ -105,7 +105,10 @@ spec:
 {{ else }}
 tcpSocket:
 port: 12800
-failureThreshold: 9
+# In no-init mode OAP blocks (port 12800 stays closed) until the init Job has created
+# the storage schema. Give it a generous budget (30 * 10s = 300s) so the pod waits for
+# the init Job instead of being restarted during a cold start.
+failureThreshold: 30
 periodSeconds: 10
 {{- end }}
 readinessProbe:
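
The budget arithmetic in the comment above can be checked with a few lines of shell. The helper name `probe_budget` is made up for this illustration. The inputs are the old and new defaults from the diff.

```shell
# Startup budget in seconds = failureThreshold * periodSeconds.
# $1 = failureThreshold, $2 = periodSeconds
probe_budget() {
  echo $(( $1 * $2 ))
}

old_budget=$(probe_budget 9 10)    # default before this commit
new_budget=$(probe_budget 30 10)   # default after this commit
echo "startup budget: ${old_budget}s -> ${new_budget}s"
# prints "startup budget: 90s -> 300s"
```

If storage startup plus schema creation regularly takes longer than 300 seconds in a given environment, the same arithmetic gives the `failureThreshold` needed for a larger budget.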

chart/skywalking/templates/oap-init.job.yaml

Lines changed: 15 additions & 4 deletions
@@ -18,17 +18,28 @@
 apiVersion: batch/v1
 kind: Job
 metadata:
-name: "{{ template "skywalking.oap.fullname" . }}-init"
+# NOTE: This Job is intentionally a normal release resource, NOT a Helm hook.
+# Running it as a post-install/post-upgrade hook deadlocks `helm upgrade --install --wait`:
+# Helm waits for every release resource to become Ready before it runs post-* hooks, but the
+# OAP Deployment runs in `-Dmode=no-init` and never becomes Ready until this Job has created
+# the storage schema -- so the hook (and therefore the schema) would never run. As a main-phase
+# resource the Job runs alongside the OAP Deployment, which blocks in no-init mode until the
+# schema appears, so `--wait` resolves instead of deadlocking.
+#
+# The name carries a hash of the chart values: a Job's `spec.template` is immutable, so a stable
+# name would make `helm upgrade` fail with "field is immutable" whenever the pod template changes.
+# Hashing yields a fresh Job whenever a relevant value changes; Helm prunes the previous one.
+name: "{{ printf "%s-init-%s" (include "skywalking.oap.fullname" . | trunc 40 | trimSuffix "-") (.Values | toYaml | sha256sum | trunc 8) }}"
 labels:
 app: {{ template "skywalking.name" . }}
 chart: {{ .Chart.Name }}-{{ .Chart.Version }}
 component: "{{ template "skywalking.fullname" . }}-job"
 heritage: {{ .Release.Service }}
 release: {{ .Release.Name }}
-annotations:
-"helm.sh/hook": post-install,post-upgrade,post-rollback
-"helm.sh/hook-weight": "1"
 spec:
+{{- if .Values.oapInit.ttlSecondsAfterFinished }}
+ttlSecondsAfterFinished: {{ .Values.oapInit.ttlSecondsAfterFinished }}
+{{- end }}
 template:
 metadata:
 name: "{{ .Release.Name }}-oap-init"
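
The naming scheme in the `name:` template above (fullname truncated to 40 characters, one trailing `-` trimmed, then `-init-` and the first 8 hex characters of a SHA-256 digest) can be imitated in shell to see how a changed value yields a new Job name. This is a rough sketch, not Helm's actual rendering. The helper `oap_init_job_name` and the sample inputs are hypothetical, `sha256sum` is assumed to be the GNU coreutils tool, and because Helm hashes its own `toYaml` serialization of `.Values`, the suffix printed here will not match a real release.

```shell
# $1 = OAP fullname, $2 = values document as text (stand-in for `.Values | toYaml`)
oap_init_job_name() {
  base=$(printf '%s' "$1" | cut -c1-40)               # trunc 40
  base=${base%-}                                      # trimSuffix "-"
  hash=$(printf '%s' "$2" | sha256sum | cut -c1-8)    # sha256sum | trunc 8
  printf '%s-init-%s\n' "$base" "$hash"
}

# Two values documents that differ in a single setting
name_a=$(oap_init_job_name "demo-skywalking-helm-oap" "oap: {replicas: 2}")
name_b=$(oap_init_job_name "demo-skywalking-helm-oap" "oap: {replicas: 3}")
echo "$name_a"
echo "$name_b"
```

The two printed names share the `demo-skywalking-helm-oap-init-` prefix and differ in the suffix, which is why an upgrade with changed values creates a new Job instead of patching the immutable `spec.template` of the old one. With this scheme the name is at most 40 + 6 + 8 = 54 characters long.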

chart/skywalking/values.yaml

Lines changed: 11 additions & 3 deletions
@@ -75,11 +75,13 @@ oap:
 # initialDelaySeconds: 5
 # periodSeconds: 20
 startupProbe: {}
-# Time to boot the application is set to:
-# 9 (failureThreshold) * 10 (periodSeconds) = 90 seconds in this case.
+# Boot budget defaults to 30 (failureThreshold) * 10 (periodSeconds) = 300 seconds.
+# In no-init mode OAP keeps port 12800 closed until the OAP init Job has created the storage
+# schema, so the budget must be large enough to cover storage startup + schema creation;
+# otherwise the pod is restarted while it is legitimately waiting for the init Job.
 # tcpSocket:
 # port: 12800
-# failureThreshold: 9
+# failureThreshold: 30
 # periodSeconds: 10
 readinessProbe: {}
 # tcpSocket:
@@ -301,6 +303,12 @@ oapInit:
 tolerations: []
 extraPodLabels: {}
 # sidecar.istio.io/inject: false
+# Auto-delete the completed init Job (and its Pod) this many seconds after it finishes, via the
+# Kubernetes TTL-after-finished controller. Leave empty to keep the completed Job around.
+# NOTE: leave this empty when using GitOps tools (e.g. Argo CD, Flux) -- they would recreate the
+# Job after the TTL controller deletes it, re-running init on every reconcile. The Job name is
+# value-hashed, so upgrades already work without TTL; this is only for tidying finished Jobs.
+ttlSecondsAfterFinished: ""
 
 # Elasticsearch managed by ECK (eck-elasticsearch chart)
 # When enabled, the ECK operator is also installed as a dependency.
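
A values override that uses the two settings touched by this commit might be written as below. Treat it as a sketch under assumptions: the file name `oap-init-overrides.yaml` and the numbers 60 and 600 are illustrative. The full probe definition is repeated because the `{{ else }}` branch in the Deployment template suggests the defaults apply only when `oap.startupProbe` is left empty.

```shell
# Write a hypothetical override file next to the current directory.
cat > oap-init-overrides.yaml <<'EOF'
oap:
  startupProbe:
    tcpSocket:
      port: 12800
    failureThreshold: 60   # 60 * 10s = 600s budget for slow storage
    periodSeconds: 10
oapInit:
  # Delete the finished init Job 10 minutes after completion.
  # Leave this unset when a GitOps tool (Argo CD, Flux) manages the release.
  ttlSecondsAfterFinished: 600
EOF

echo "wrote oap-init-overrides.yaml"
```

The file would then be passed to Helm with `-f oap-init-overrides.yaml` on the `helm upgrade` command line.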
