Commit 08734dc

docs: add ztvp-certificates scenario documentation (#130)
* docs: add ztvp-certificates scenario documentation

  Covers architecture, extraction phases, platform-specific handling (BareMetal/VSphere proxy CA, custom enterprise CAs, image pull trust), ACM Policy distribution, and automatic rollout strategies.

  Signed-off-by: Min Zhang <minzhang@redhat.com>

* docs: combine BareMetal and vSphere certificate scenarios

  Merge Scenario 2 (BareMetal) and Scenario 3 (vSphere) into a single scenario since both platforms have identical self-signed ingress behavior and redundant proxyCA overrides. Renumber remaining scenarios accordingly.

  Signed-off-by: Min Zhang <minzhang@redhat.com>

* docs: address review feedback on ztvp-certificates doc

  - Add link to the chart directory
  - Fix "ArgoCD" to "Argo CD"
  - Remove hardcoded sync-wave numbers to avoid staleness
  - Renumber phases 8.5/8.6 to 8.1/8.2
  - Clarify service CA is read from within the Job Pod
  - Add "ConfigMap" qualifier to ztvp-trusted-ca references
  - Link to ACM fromConfigMap documentation
  - Replace wave numbers with relative ordering in sync table

  Signed-off-by: Min Zhang <minzhang@redhat.com>

* docs: clarify Image Pull Trust patches additionalTrustedCA attribute

  Signed-off-by: Min Zhang <minzhang@redhat.com>

---------

Signed-off-by: Min Zhang <minzhang@redhat.com>
1 parent a5599b9 commit 08734dc

1 file changed

Lines changed: 316 additions & 0 deletions

docs/ztvp-certificates.md

@@ -0,0 +1,316 @@
# ZTVP Certificates

The [`ztvp-certificates`](../charts/ztvp-certificates/) chart manages CA certificate extraction, validation,
bundling, and distribution across the Zero Trust Validated Pattern. It runs as
an application managed by Argo CD in the `openshift-config` namespace, ensuring
certificates are available before any workload that needs TLS verification.

## Architecture

```text
IngressControllers         Service CA          Cluster trusted-ca-bundle
(openshift-ingress)    (openshift-config)     (openshift-config-managed)
        |                       |                          |
        +-----------+-----------+--------------------------+
                    |
         extract-certificates.sh  <-- runs as Job (initial) + CronJob (daily)
                    |
          validates & combines
                    |
                    v
       ConfigMap: ztvp-trusted-ca
           (openshift-config)
                    |
        +-----------+-----------+-----------------------+
        |                       |                       |
ACM Policy distributes   proxyCA patches         imagePullTrust
to target namespaces      proxy/cluster       patches image.config
  (e.g. qtodo)           (all platforms)         (when enabled)
```

### Kubernetes Resources

| Resource | Purpose |
|---|---|
| **ServiceAccount / RBAC** | Grants the extraction Job read access to secrets, configmaps, ingresscontrollers, and proxy across namespaces |
| **ConfigMap (script)** | Holds the templated `extract-certificates.sh` script |
| **Job (initial)** | Runs once at first sync to populate the CA bundle |
| **CronJob** | Runs on schedule (default daily at 02:00) for automatic rotation |
| **ACM Policy + Placement** | Distributes the `ztvp-trusted-ca` ConfigMap into target namespaces via ACM governance |
| **ManagedClusterSetBinding** | Binds the `default` ManagedClusterSet in `openshift-config` so the Placement can target `local-cluster` |

## Extraction Phases

The extraction script runs through a deterministic sequence of phases. Each
phase is independently gated by values, so the script adapts to the active
configuration.

| Phase | Gate | What It Does |
|---|---|---|
| 1 -- Custom CA | `customCA.secretRef.enabled` | Reads a user-supplied secret and writes `custom-ca.crt` |
| 2 -- Ingress CA | `autoDetect` | Loops over every `IngressController`, extracts `tls.crt` from the referenced or default router secret |
| 3 -- Service CA | `autoDetect` | Reads `openshift-service-ca.crt` ConfigMap |
| 4 -- Cluster CA Bundle | `autoDetect` | Reads `trusted-ca-bundle` from `openshift-config-managed` (present when a corporate proxy injects CAs) |
| 5 -- Additional Certs | `customCA.additionalCertificates[]` | Reads each additional secret and writes a `.crt` file |
| 6 -- Validation | `validation.enabled` | Checks minimum size and `openssl x509` parse for every `.crt` |
| 7 -- Combine | always | Concatenates all `.crt` files into `tls-ca-bundle.pem`; fails if bundle < 100 bytes |
| 8 -- ConfigMap | always | `oc apply` the `ztvp-trusted-ca` ConfigMap with annotations recording extraction metadata |
| 8.1 -- Proxy CA | `proxyCA.enabled` | Creates a separate ConfigMap with ingress + service CAs only |
| 8.2 -- Proxy Patch | `proxyCA.enabled` | Patches `proxy/cluster` to set `trustedCA` (only if not already set to another value) |
| 9 -- Image Pull Trust | `imagePullTrust.enabled` | Creates a ConfigMap keyed by registry hostname and patches `image.config.openshift.io/cluster` to set the `additionalTrustedCA` attribute |
| 10 -- Rollout | `rollout.enabled` | Restarts Deployments/StatefulSets that consume the certificate bundle |
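
Phases 6 and 7 can be pictured with a small shell sketch. This is not the chart's `extract-certificates.sh`; the function names and the per-certificate minimum size (`MIN_CERT_BYTES`) are assumptions made for illustration. Only the 100-byte bundle threshold and the `openssl x509` parse come from the table above.

```bash
#!/bin/sh
# Hypothetical helpers mirroring Phases 6 and 7. Names and MIN_CERT_BYTES
# are illustrative assumptions, not values taken from the chart.
MIN_CERT_BYTES=${MIN_CERT_BYTES:-100}
MIN_BUNDLE_BYTES=100   # documented Phase 7 threshold

# Phase 6 (sketch): size check, then an openssl parse when openssl is present.
validate_cert() {
  size=$(wc -c < "$1" | tr -d ' ')
  [ "$size" -ge "$MIN_CERT_BYTES" ] || return 1
  if command -v openssl >/dev/null 2>&1; then
    openssl x509 -in "$1" -noout >/dev/null 2>&1 || return 1
  fi
  return 0
}

# Phase 7 (sketch): concatenate every .crt file in a directory into one
# bundle and fail when the result is smaller than the minimum bundle size.
combine_certs() {
  dir=$1
  out=$2
  : > "$out"
  for crt in "$dir"/*.crt; do
    [ -f "$crt" ] || continue
    cat "$crt" >> "$out"
  done
  size=$(wc -c < "$out" | tr -d ' ')
  [ "$size" -ge "$MIN_BUNDLE_BYTES" ]
}
```

A caller would typically run `validate_cert` on each file first and then `combine_certs "$workdir" "$workdir/tls-ca-bundle.pem"`, treating a non-zero exit status as a failed extraction.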
62+
63+
## Scenario Handling
64+
65+
### Scenario 1: Cloud Cluster with Public Certificates (Default)
66+
67+
Applies to AWS, Azure, GCP, and any cluster whose ingress uses certificates
68+
signed by a public CA.
69+
70+
**Active settings:**
71+
72+
* `autoDetect: true`
73+
* `proxyCA.enabled: true` (default -- ensures ACS Central and other workloads
74+
that verify TLS on routes can trust the ingress CA without per-pod volume mounts)
75+
* `imagePullTrust.enabled: false`
76+
77+
**What happens:**
78+
79+
1. The Job auto-detects the ingress CA from each `IngressController`'s router
80+
secret in `openshift-ingress`.
81+
2. The service CA is read from `openshift-service-ca.crt` from within the Job Pod.
82+
3. If a cluster-wide proxy bundle exists, it is included.
83+
4. All certificates are combined into `ztvp-trusted-ca` ConfigMap and distributed via
84+
ACM Policy to target namespaces.
85+
5. A proxy CA ConfigMap (`ztvp-proxy-ca`) is created with ingress + service
86+
CAs and the `proxy/cluster` is patched so the Cluster Network Operator injects
87+
these CAs into all workloads automatically.
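
Step 1 depends on choosing the right router secret for each `IngressController`. The helper below is a sketch of that choice, not the chart's code. It assumes that a controller with `spec.defaultCertificate.name` set uses that secret, and that otherwise the generated secret is named `router-certs-<ingresscontroller-name>`. The function name is hypothetical.

```bash
#!/bin/sh
# Sketch of the Phase 2 secret lookup (hypothetical helper).
# $1: IngressController name
# $2: value of spec.defaultCertificate.name, or empty when unset
router_secret_name() {
  ic_name=$1
  referenced=$2
  if [ -n "$referenced" ]; then
    printf '%s\n' "$referenced"
  else
    # Assumed default naming for the router's generated certificate secret.
    printf 'router-certs-%s\n' "$ic_name"
  fi
}

# On a live cluster the certificate could then be read with something like:
#   oc get secret "$(router_secret_name default "")" -n openshift-ingress \
#     -o jsonpath='{.data.tls\.crt}' | base64 -d
```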

No platform override file is needed. The chart's default `values.yaml` handles
this scenario out of the box.

### Scenario 2: Bare Metal / vSphere with Self-Signed Ingress

Bare metal and vSphere clusters typically use self-signed certificates for the
default ingress. Since `proxyCA` is enabled by default (see Scenario 1), the
ingress CA is automatically injected cluster-wide. Workloads that verify TLS
on routes (e.g., ACS Central connecting to Keycloak) work without extra
configuration.

**Platform overrides:**

* `overrides/values-ztvp-certificates-BareMetal.yaml`
* `overrides/values-ztvp-certificates-VSphere.yaml`

Both contain:

```yaml
proxyCA:
  enabled: true
```

> **Note:** These overrides are now redundant because the chart default is
> `proxyCA.enabled: true`. They are retained for clarity and backward
> compatibility with older chart versions.

**Behavior is identical to Scenario 1** -- Phases 8.1 and 8.2 run by default:

1. Phase 8.1 builds a proxy-specific bundle containing only the ingress and
   service CAs (the Cluster Network Operator merges these with system CAs).
2. Phase 8.2 patches `proxy/cluster` to set `spec.trustedCA.name` to
   `ztvp-proxy-ca`.
3. The CNO propagates the merged bundle to every node, making the ingress CA
   trusted system-wide for all pods without explicit volume mounts.
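
The Phase 8.2 rule of patching only when `trustedCA` is not already set to another value can be sketched as a small guard. The function name is hypothetical, and treating an already-matching value as safe to patch is an assumption about how the script behaves.

```bash
#!/bin/sh
# Sketch of the Phase 8.2 guard (hypothetical helper): succeed when
# trustedCA is unset or already points at the desired ConfigMap.
# $1: current spec.trustedCA.name (may be empty)
# $2: desired ConfigMap name
should_patch_proxy() {
  current=$1
  desired=$2
  [ -z "$current" ] || [ "$current" = "$desired" ]
}

# On a live cluster the guard could be used like this:
#   current=$(oc get proxy/cluster -o jsonpath='{.spec.trustedCA.name}')
#   if should_patch_proxy "$current" ztvp-proxy-ca; then
#     oc patch proxy/cluster --type=merge \
#       -p '{"spec":{"trustedCA":{"name":"ztvp-proxy-ca"}}}'
#   fi
```

The guard leaves an administrator's existing `trustedCA` setting untouched, which matches the table's "only if not already set to another value" wording.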

### Scenario 3: Enterprise Custom CA

When the organization uses a private PKI (e.g., a corporate root CA that
signed the cluster's ingress certificate), the administrator creates a
Kubernetes secret with the CA and enables `customCA.secretRef`.

**Setup:**

```bash
oc create secret generic custom-ca-bundle \
  --from-file=ca.crt=/path/to/corporate-root-ca.crt \
  -n openshift-config
```

**values-hub.yaml overrides:**

```yaml
- name: customCA.secretRef.enabled
  value: "true"
- name: customCA.secretRef.name
  value: custom-ca-bundle
- name: customCA.secretRef.namespace
  value: openshift-config
```

**What happens:**

1. Phase 1 extracts the custom CA from the referenced secret.
2. Auto-detect (phases 2-4) still runs, so ingress and service CAs are
   included alongside the custom CA.
3. The combined bundle contains both the custom CA and the auto-detected
   certificates.

### Scenario 4: Multiple Additional CAs

When several external CAs are needed (e.g., a corporate root CA, a partner CA,
and an intermediate CA), use `additionalCertificates` via the
`extraValueFiles` mechanism.

**Configuration** (`overrides/values-ztvp-certificates.yaml`):

```yaml
customCA:
  additionalCertificates:
    - name: corporate-root-ca
      secretRef:
        name: corporate-root-ca
        namespace: openshift-config
        key: ca.crt
    - name: partner-ca
      secretRef:
        name: partner-ca
        namespace: openshift-config
        key: ca.crt
```

**What happens:**

1. Phase 5 iterates over each entry and extracts the certificate from its
   secret. Missing secrets produce a warning but do not fail the job.
2. All additional certificates are combined with auto-detected and custom CAs
   in Phase 7.

### Scenario 5: Image Pull Trust for Built-In Registry

When an image registry (e.g., Quay or the embedded OpenShift registry) is
exposed behind the cluster ingress with a self-signed or internal CA, kubelet
image pulls fail with `x509: certificate signed by unknown authority`. The
`imagePullTrust` feature solves this at the node level.

**values-hub.yaml overrides:**

```yaml
- name: imagePullTrust.enabled
  value: "true"
- name: imagePullTrust.registries[0]
  value: quay-registry-quay-quay-enterprise.apps.example.com
```

**What happens:**

1. Phase 9 combines all extracted ingress CAs into a single PEM.
2. A ConfigMap (`ztvp-registry-cas`) is created in `openshift-config` with
   each registry hostname as a key and the ingress CA PEM as the value.
3. `image.config.openshift.io/cluster` is patched to set
   `additionalTrustedCA.name` to that ConfigMap.
4. The Machine Config Operator rolls the trust configuration out to all nodes.
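
One detail to keep in mind when keying the ConfigMap by registry hostname: ConfigMap keys cannot contain `:`, so OpenShift expects a registry with a port (`host:port`) to be written as `host..port` in the `additionalTrustedCA` ConfigMap. The helper below sketches that conversion. The function name is hypothetical, and the source does not say whether the chart handles ports, so this shows the convention rather than the chart's behavior.

```bash
#!/bin/sh
# Sketch: derive the ConfigMap key for a registry entry (hypothetical helper).
# A host without a port is returned unchanged; host:port becomes host..port.
registry_key() {
  printf '%s\n' "$1" | sed 's/:/../'
}

# Example:
#   registry_key quay-registry-quay-quay-enterprise.apps.example.com
#   registry_key registry.example.com:5000
```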

### Scenario 6: Custom Source Locations

In non-standard environments where the ingress CA or service CA is stored in
a different location, `customSource` overrides the default auto-detection
targets.

```yaml
customSource:
  ingressCA:
    secretName: my-ingress-ca
    secretNamespace: my-namespace
    secretKey: tls.crt
  serviceCA:
    configMapName: my-service-ca
    configMapNamespace: my-namespace
    configMapKey: service-ca.crt
```

Auto-detection will read from the specified locations instead of the standard
OpenShift defaults.

## Distribution

Certificate distribution uses **ACM Governance Policies** to replicate the
`ztvp-trusted-ca` ConfigMap from `openshift-config` into each target
namespace.

```text
openshift-config/ztvp-trusted-ca  ---ACM Policy--->  qtodo/ztvp-trusted-ca
                                                     rhtpa/ztvp-trusted-ca
                                                     ...
```

The policy uses [`fromConfigMap`](https://docs.redhat.com/en/documentation/red_hat_advanced_cluster_management_for_kubernetes/2.12/html-single/governance/index#fromConfigMap-function) hub templates so that the ConfigMap data is
always sourced from the hub cluster's copy. Target namespaces are configured
via `distribution.targetNamespaces`.

**Requirements:**

* ACM (Advanced Cluster Management) must be installed
* A `ManagedClusterSetBinding` for the `default` cluster set is created
  automatically by the chart
* The `Placement` targets clusters with `local-cluster: "true"`
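
To confirm that the policy has copied the ConfigMap into every target namespace, a check along these lines may help. It is a sketch and not part of the chart: the function name and the overridable `OC` variable are assumptions, and the namespaces in the usage comment are the ones from the diagram above.

```bash
#!/bin/sh
# Hypothetical check: print each target namespace that does not (yet)
# contain the distributed ConfigMap. OC can be overridden, for example
# with a stub function, to dry-run the logic without a cluster.
OC=${OC:-oc}

# $1: ConfigMap name; remaining arguments: target namespaces
missing_namespaces() {
  cm_name=$1
  shift
  for ns in "$@"; do
    "$OC" get configmap "$cm_name" -n "$ns" >/dev/null 2>&1 \
      || printf '%s\n' "$ns"
  done
}

# Usage on a live cluster:
#   missing_namespaces ztvp-trusted-ca qtodo rhtpa
```

Empty output means every listed namespace already holds the ConfigMap. Namespaces may legitimately appear for a short time after the first sync, while ACM is still evaluating the policy.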

## Automatic Rollout

When certificates are updated, consuming workloads need to pick up the new
bundle. The chart supports three rollout strategies:

| Strategy | Behavior |
|---|---|
| `labeled` (default) | Restarts Deployments/StatefulSets matching `ztvp.io/uses-certificates: "true"` in distribution target namespaces |
| `all` | Restarts all Deployments/StatefulSets in target namespaces |
| `specific` | Restarts only the named resources listed in `rollout.targets` |

To opt a workload into automatic restart, add the label:

```yaml
metadata:
  labels:
    ztvp.io/uses-certificates: "true"
```
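
The three strategies map naturally onto `oc rollout restart` invocations. The sketch below only prints the commands it would run, so the selection logic can be inspected safely. The function name and the `kind/name` form used for `specific` targets are assumptions; the actual schema of `rollout.targets` is not shown here.

```bash
#!/bin/sh
# Sketch: print (not execute) restart commands for a rollout strategy.
# $1: strategy (labeled | all | specific)
# $2: namespace
# remaining arguments: kind/name targets, used by "specific" only
rollout_commands() {
  strategy=$1
  ns=$2
  shift 2
  case "$strategy" in
    labeled)
      for kind in deployment statefulset; do
        printf 'oc rollout restart %s -n %s -l ztvp.io/uses-certificates=true\n' "$kind" "$ns"
      done
      ;;
    all)
      for kind in deployment statefulset; do
        printf 'oc rollout restart %s -n %s\n' "$kind" "$ns"
      done
      ;;
    specific)
      for target in "$@"; do
        printf 'oc rollout restart %s -n %s\n' "$target" "$ns"
      done
      ;;
    *)
      printf 'unknown strategy: %s\n' "$strategy" >&2
      return 1
      ;;
  esac
}

# Example: rollout_commands labeled qtodo
```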

## Sync Wave Ordering

The chart's resources are ordered within the Argo CD sync:

| Order | Resources |
|---|---|
| 1st | ServiceAccount, RBAC (Role, RoleBinding, ClusterRole, ClusterRoleBinding) |
| 2nd | Initial Job, CronJob, ConfigMap (script) |
| 3rd | ManagedClusterSetBinding |
| 4th | ACM Policy, PlacementBinding, Placement |

The application itself is deployed early in the overall sync order (via
`values-hub.yaml`), ensuring it runs before operators and workloads that depend
on the CA bundle.

## Configuration Reference

### Top-Level Values

| Value | Default | Description |
|---|---|---|
| `enabled` | `true` | Master toggle for all chart resources |
| `autoDetect` | `true` | Auto-detect ingress, service, and cluster CAs from OpenShift |
| `configMapName` | `ztvp-trusted-ca` | Name of the output ConfigMap |
| `proxyCA.enabled` | `true` | Create a proxy CA ConfigMap and patch `proxy/cluster` |
| `imagePullTrust.enabled` | `false` | Configure node-level registry trust via `image.config` |
| `rollout.enabled` | `true` | Restart consuming workloads after certificate updates |
| `rollout.strategy` | `labeled` | One of: `labeled`, `all`, `specific` |
| `distribution.enabled` | `true` | Distribute CA bundle via ACM Policy |
| `distribution.method` | `acm-policy` | Distribution mechanism |
| `cronJob.schedule` | `0 2 * * *` | Cron schedule for automatic re-extraction |
| `validation.enabled` | `true` | Validate certificate size and format |
| `debug.verbose` | `false` | Enable `set -x` in the extraction script |

### Platform Override Files

| File | When Applied | Effect |
|---|---|---|
| `overrides/values-ztvp-certificates.yaml` | Always | Additional CAs, rollout config |
| `overrides/values-ztvp-certificates-BareMetal.yaml` | `clusterPlatform == BareMetal` | Confirms `proxyCA` (redundant; default is already `true`) |
| `overrides/values-ztvp-certificates-VSphere.yaml` | `clusterPlatform == VSphere` | Confirms `proxyCA` (redundant; default is already `true`) |
