Commit 62066fe

docs: lineage-controller-webhook configuration guide (#513)
## Summary

- Adds `content/en/docs/next/operations/configuration/lineage-controller-webhook.md` (weight 40, sibling of the existing Components reference page).
- Documents the five user-facing knobs (`deployment.enabled`, `deployment.replicas`, `nodeAffinity`, `tolerations`, `localK8sAPIEndpoint.enabled`) and three worked Package CR overrides covering the common non-default topologies: Deployment on a large control plane, managed Kubernetes / Cozy-in-Cozy tenants, and dedicated webhook nodes.
- Explains the chart's validation guard (`localK8sAPIEndpoint.enabled: true` + `nodeAffinity: []` is rejected at render time) and adds a `warning`-style callout for the case the guard cannot catch: non-apiserver-bearing custom `nodeAffinity` with the local endpoint left enabled, which silently produces crash-looping pods and tenant CREATE/UPDATE outage with `failurePolicy: Fail`.

## Companion PR

Companion to cozystack/cozystack#2481, which exposes the underlying knobs in the chart. The website page is targeted at `next/` since v1.3 docs aren't registered yet and these knobs ship in the next release.

## Test plan

- [ ] `make serve`: confirm the page renders, the cross-link to `Components` resolves, and the alert callout displays correctly.
- [ ] Visually inspect the four code blocks; confirm Package CR overrides are valid YAML.
- [ ] Verify the page appears in the `Configuration` sidebar between `Components` (weight 30) and `White Labeling` (weight 50).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->

## Summary by CodeRabbit

* **Documentation**
  * Added a new Lineage Controller Webhook guide describing its behavior and impact on resource labeling.
  * Documents admission requirement (failurePolicy: Fail) and operational implications.
  * Provides deployment topology guidance (replicas, affinity, anti-affinity, PDB, service distribution) and a Helm override example.
  * Warns about a deprecated flag that can cause crashes/admission outages.
  * Includes verification commands and a dry-run validation flow.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2 parents f466d34 + 29b381c commit 62066fe

1 file changed

Lines changed: 110 additions & 0 deletions

@@ -0,0 +1,110 @@
---
title: "The Lineage Controller Webhook"
linkTitle: "Lineage Controller Webhook"
description: "What the lineage-controller-webhook does, how it's deployed, and the one knob worth knowing about."
weight: 40
---

The **lineage controller webhook** is a mutating admission webhook shipped as
part of the `cozystack.cozystack-engine` Package. On every CREATE and UPDATE
of a tenant `Pod`, `Secret`, `Service`, `PersistentVolumeClaim`,
`Ingress`, or `WorkloadMonitor` it walks up the ownership graph and stamps the
owning Cozystack `Application`'s identity onto the resource as labels
(`apps.cozystack.io/application.{group,kind,name}`). The Cozystack dashboard,
the aggregated API server, and the SchedulingClass mechanism all rely on those
labels.

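Because the lineage is stored as plain labels, it can be queried with an ordinary label selector. The sketch below is illustrative only: the three label keys come from the paragraph above, while the function names, the argument order, and the set of resource kinds queried are assumptions made for this example.

```bash
# Build a label selector from the three lineage label keys named above.
lineage_selector() {
  local group="$1" kind="$2" name="$3"
  printf '%s=%s,%s=%s,%s=%s\n' \
    "apps.cozystack.io/application.group" "$group" \
    "apps.cozystack.io/application.kind" "$kind" \
    "apps.cozystack.io/application.name" "$name"
}

# Hypothetical helper: list resources in one namespace that carry the
# lineage labels of a given Application.
list_app_resources() {
  local namespace="$1"; shift
  kubectl -n "$namespace" get pods,secrets,services,persistentvolumeclaims,ingresses \
    -l "$(lineage_selector "$@")" --show-labels
}
```

A call such as `list_app_resources tenant-demo apps.cozystack.io Postgres db` uses placeholder values; substitute the namespace, group, kind, and name of a real Application in your cluster.
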
The webhook is registered with `failurePolicy: Fail`, so the kube-apiserver
must be able to reach a healthy webhook pod for tenant CREATE/UPDATE traffic
to succeed.

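To confirm the registered failure policy on a live cluster, one option is to scan every `MutatingWebhookConfiguration` for the webhook by name. This is a sketch rather than an official procedure: the webhook name `lineage.cozystack.io` is taken from the error message quoted later on this page, and the function name is hypothetical.

```bash
# Print the failurePolicy of a mutating webhook, looked up by webhook name
# across all MutatingWebhookConfigurations (the configuration object's own
# name is not assumed here).
webhook_failure_policy() {
  local webhook_name="${1:-lineage.cozystack.io}"
  kubectl get mutatingwebhookconfigurations \
    -o jsonpath='{range .items[*]}{range .webhooks[*]}{.name}{"\t"}{.failurePolicy}{"\n"}{end}{end}' |
    awk -F'\t' -v name="$webhook_name" '$1 == name { print $2 }'
}
```

Running `webhook_failure_policy` with no arguments should print the policy for the lineage webhook, if it is registered under that name.
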
## Default deployment shape

The chart deploys a single `Deployment` modelled on the `cozystack-api` shape:

- **2 replicas** (override via `replicas`).
- **Soft `nodeAffinity`** preferring `node-role.kubernetes.io/control-plane`
  (`Exists` matches both Talos's empty value and k3s/kubeadm's `"true"`). The
  preference is *soft*: pods land on a control-plane node when one is
  reachable, and on any worker otherwise. No override is needed for managed
  Kubernetes (EKS / AKS / GKE), Cozy-in-Cozy tenant clusters, or any other
  cluster where control-plane nodes aren't visible: the webhook simply
  schedules elsewhere.
- **Permissive `tolerations`** (`{operator: Exists}`) so a control-plane node
  with `NoSchedule` taints accepts the pod when the soft affinity is
  satisfiable.
- **Soft `podAntiAffinity`** on `kubernetes.io/hostname` so replicas
  best-effort spread across nodes.
- **`PodDisruptionBudget`** with `maxUnavailable: 1`. At `replicas: 2+` it
  caps disruption to one pod; at `replicas: 1` it's effectively a no-op (the
  single pod can still be evicted, so node drains aren't blocked).
- **Service `spec.trafficDistribution: PreferClose`** so the apiserver
  prefers a webhook endpoint on its own node when one exists, and
  transparently fails over to a remote endpoint otherwise. Requires
  Kubernetes ≥ 1.31; older clusters silently fall back to default
  cluster-wide distribution (still safe, just no locality preference).

This shape works as-is on every Kubernetes distribution Cozystack supports.
You shouldn't need to override anything in normal operation.

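Since the control-plane preference is soft, it can be useful to see where the replicas actually landed. The following sketch compares each webhook pod's node against the nodes carrying the `node-role.kubernetes.io/control-plane` label. The `cozy-system` namespace and the `app=lineage-controller-webhook` selector are the ones used in the verification commands on this page; the function name and output wording are hypothetical.

```bash
# Report, for each webhook pod, whether its node carries the
# control-plane role label (a key-only selector matches any label value).
webhook_pod_placement() {
  local ns="${1:-cozy-system}" cp_nodes tab
  tab="$(printf '\t')"
  cp_nodes="$(kubectl get nodes -l node-role.kubernetes.io/control-plane \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}')"
  kubectl -n "$ns" get pods -l app=lineage-controller-webhook \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.nodeName}{"\n"}{end}' |
  while IFS="$tab" read -r pod node; do
    if [ -n "$node" ] && printf '%s\n' "$cp_nodes" | grep -qxF -- "$node"; then
      echo "$pod on $node (control-plane)"
    else
      echo "$pod on ${node:-<unscheduled>} (not control-plane)"
    fi
  done
}
```

On managed Kubernetes or Cozy-in-Cozy tenant clusters every pod would be expected to report as not control-plane, which is consistent with the default shape described above.
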
## Increasing replicas

If you want more than two replicas (for instance, to keep one webhook pod
co-located with each apiserver on a five-node control plane), override the
`replicas` value via the `cozystack.cozystack-engine` Package, the same way
you'd override any other component (see
[Components]({{% ref "/docs/next/operations/configuration/components" %}})):

```yaml
apiVersion: cozystack.io/v1alpha1
kind: Package
metadata:
  name: cozystack.cozystack-engine
  namespace: cozy-system
spec:
  variant: default
  components:
    lineage-controller-webhook:
      values:
        lineageControllerWebhook:
          replicas: 5
```

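After applying the override, the new replica count may take a moment to appear while the Package is reconciled. A minimal polling sketch, assuming the Deployment keeps the `lineage-controller-webhook` name in `cozy-system` used elsewhere on this page; the function name, the retry count, and the `POLL_INTERVAL` variable are hypothetical.

```bash
# Poll the Deployment until the number of ready replicas matches the
# expected count, or give up after a fixed number of attempts.
wait_for_ready_replicas() {
  local want="$1" ns="${2:-cozy-system}" tries="${3:-30}" ready
  while [ "$tries" -gt 0 ]; do
    ready="$(kubectl -n "$ns" get deploy lineage-controller-webhook \
      -o jsonpath='{.status.readyReplicas}')"
    if [ "${ready:-0}" -eq "$want" ]; then
      echo "ready: $ready/$want"
      return 0
    fi
    tries=$((tries - 1))
    sleep "${POLL_INTERVAL:-5}"
  done
  echo "timed out waiting for $want ready replicas (last seen: ${ready:-0})" >&2
  return 1
}
```

For the example above, `wait_for_ready_replicas 5` returns success once all five pods report ready.
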
## `localK8sAPIEndpoint.enabled` is deprecated

The chart still exposes `localK8sAPIEndpoint.enabled`, which when set to
`true` injects `KUBERNETES_SERVICE_HOST=status.hostIP` and
`KUBERNETES_SERVICE_PORT=6443` so the webhook talks to the apiserver on its
own node. It was originally added to avoid latency on the
webhook-to-apiserver path. It's now defaulted to `false` and slated for
removal once the latency motivation is addressed in the webhook itself.

{{% alert title="Important" color="warning" %}}
Do not enable `localK8sAPIEndpoint.enabled` with the default chart values.
The injected `status.hostIP` is only valid when the pod runs on a node that
hosts a kube-apiserver, and the chart's soft control-plane affinity does not
guarantee that. With the flag enabled and the pod scheduled off a
control-plane node, the controller crash-loops dialing a non-apiserver IP,
and combined with `failurePolicy: Fail` that means a tenant CREATE/UPDATE
outage.
{{% /alert %}}

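One way to check whether an existing installation still has the flag turned on is to look for the injected variable in the Deployment's pod template. This is a sketch under the assumption that the chart sets `KUBERNETES_SERVICE_HOST` directly in a container's `env` list, as the description above suggests; the function name is hypothetical.

```bash
# Succeeds (exit 0) if any container in the webhook Deployment has an
# explicit KUBERNETES_SERVICE_HOST entry in its env list.
local_endpoint_enabled() {
  local ns="${1:-cozy-system}"
  kubectl -n "$ns" get deploy lineage-controller-webhook \
    -o jsonpath='{.spec.template.spec.containers[*].env[*].name}' |
    tr ' ' '\n' |
    grep -qx 'KUBERNETES_SERVICE_HOST'
}
```

It can be used in a guard such as `if local_endpoint_enabled; then echo "localK8sAPIEndpoint appears to be enabled"; fi`.
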
## Verifying the deployment

```bash
kubectl -n cozy-system get deploy lineage-controller-webhook
kubectl -n cozy-system get pods -l app=lineage-controller-webhook
kubectl -n cozy-system get svc lineage-controller-webhook -o yaml | grep trafficDistribution
```

A quick end-to-end check, exercising the webhook through the apiserver:

```bash
kubectl create ns lineage-webhook-test
kubectl -n lineage-webhook-test create service clusterip probe \
  --clusterip=None --dry-run=server
kubectl delete ns lineage-webhook-test
```

The dry-run CREATE goes through the mutating admission webhook; if the
webhook isn't reachable, it fails with
`failed calling webhook "lineage.cozystack.io"`.
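
The same check can be wrapped so that the test namespace is always cleaned up and the result is classified from the error text. This is only a sketch: the commands and the error string are the ones shown above, while the function name, return codes, and output messages are hypothetical.

```bash
# Run the dry-run probe, clean up, and classify the outcome.
# Returns 0 if the dry-run succeeded, 1 if the lineage webhook could not
# be called, 2 for any other failure.
check_lineage_webhook() {
  local ns="${1:-lineage-webhook-test}" out rc
  kubectl create ns "$ns" >/dev/null || return 2
  if out="$(kubectl -n "$ns" create service clusterip probe \
      --clusterip=None --dry-run=server 2>&1)"; then
    rc=0
  else
    rc=$?
  fi
  # Clean up regardless of the probe result; do not block on deletion.
  kubectl delete ns "$ns" --wait=false >/dev/null
  if [ "$rc" -eq 0 ]; then
    echo "webhook reachable"
  elif printf '%s\n' "$out" | grep -qF 'failed calling webhook "lineage.cozystack.io"'; then
    echo "webhook unreachable"
    return 1
  else
    echo "dry-run failed for another reason: $out"
    return 2
  fi
}
```

A non-zero exit code makes the function usable in a post-upgrade smoke test or a CI step.
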
