Commit 2901f08

docs: operations section (#3282)
Signed-off-by: Attila Mészáros <a_meszaros@apple.com>
1 parent d38b50e commit 2901f08

13 files changed: +258 -68 lines changed

docs/content/en/docs/documentation/_index.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -19,7 +19,7 @@ This section contains detailed documentation for all Java Operator SDK features
1919

2020
- **[Eventing](eventing/)** - Understanding the event-driven model
2121
- **[Accessing Resources in Caches](working-with-es-caches/)** - How to access resources in caches
22-
- **[Observability](observability/)** - Monitoring and debugging your operators
22+
- **[Operations](operations/)** - Helm chart, metrics, logging, configurations, leader election
2323
- **[Other Features](features/)** - Additional capabilities and integrations
2424

2525
Each guide includes practical examples and best practices to help you build robust, production-ready operators.

docs/content/en/docs/documentation/architecture.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
---
22
title: Architecture and Internals
3-
weight: 85
3+
weight: 90
44
---
55

66
This document provides an overview of the Java Operator SDK's internal structure and components to help developers understand and contribute to the project. While not a comprehensive reference, it introduces core concepts that should make other components easier to understand.

docs/content/en/docs/documentation/dependent-resource-and-workflows/_index.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
---
22
title: Dependent resources and workflows
3-
weight: 70
3+
weight: 80
44
---
55

66
Dependent resources and workflows are features sometimes referenced as higher

docs/content/en/docs/documentation/features.md

Lines changed: 0 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -23,14 +23,6 @@ public class DeploymentReconciler
2323
}
2424
```
2525

26-
## Leader Election
27-
28-
Operators are typically deployed with a single active instance. However, you can deploy multiple instances where only one (the "leader") processes events. This is achieved through "leader election."
29-
30-
While all instances run and start their event sources to populate caches, only the leader processes events. If the leader crashes, other instances are already warmed up and ready to take over when a new leader is elected.
31-
32-
See sample configuration in the [E2E test](https://github.com/java-operator-sdk/java-operator-sdk/blob/8865302ac0346ee31f2d7b348997ec2913d5922b/sample-operators/leader-election/src/main/java/io/javaoperatorsdk/operator/sample/LeaderElectionTestOperator.java#L21-L23).
33-
3426
## Automatic CRD Generation
3527

3628
**Note:** This feature is provided by the [Fabric8 Kubernetes Client](https://github.com/fabric8io/kubernetes-client), not JOSDK itself.
Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,6 @@
---
title: Operations
weight: 70
---

This section covers operations-related features for running and managing operators in production.

docs/content/en/docs/documentation/configuration.md renamed to docs/content/en/docs/documentation/operations/configuration.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
---
22
title: Configurations
3-
weight: 55
3+
weight: 71
44
---
55

66
The Java Operator SDK (JOSDK) provides abstractions that work great out of the box. However, we recognize that default behavior isn't always suitable for every use case. Numerous configuration options help you tailor the framework to your specific needs.
Lines changed: 116 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,116 @@
---
title: Generic Helm Chart
weight: 76
---

A generic, reusable Helm chart for deploying Java operators built with JOSDK is available at
[`helm/generic-helm-chart`](https://github.com/java-operator-sdk/java-operator-sdk/tree/main/helm/generic-helm-chart).

It is intended as a **template for operator developers**: a starting point that covers common deployment
patterns so you don't have to write a chart from scratch. The chart is maintained on a **best-effort basis**.
Contributions are more than welcome.

The chart is used in the
[`metrics-processing` sample operator E2E test](https://github.com/java-operator-sdk/java-operator-sdk/blob/main/sample-operators/metrics-processing/src/test/java/io/javaoperatorsdk/operator/sample/metrics/MetricsHandlingE2E.java)
to deploy the operator to a cluster via Helm.

## What the Chart Provides

- **Deployment** with security defaults (non-root user, read-only filesystem, no privilege escalation)
- **Dynamic RBAC** (ClusterRole, ClusterRoleBinding, ServiceAccount): permissions are generated automatically
  from the primary and secondary resources you declare in `values.yaml`
- **ConfigMap** for operator configuration (`config.yaml`) and logging (`log4j2.xml`), mounted at `/config`
- **Leader election** support (opt-in)
- **Extensibility** via extra containers, init containers, volumes, and environment variables

## Key Configuration

The most important values to set when adapting the chart for your operator:

```yaml
image:
  repository: my-operator-image # required
  tag: "latest"

# Custom resources your operator reconciles
primaryResources:
  - apiGroup: "sample.javaoperatorsdk"
    resources:
      - myresources

# Kubernetes resources your operator manages
secondaryResources:
  - apiGroup: ""
    resources:
      - configmaps
      - services
```

Primary resources get read/watch/patch permissions and status sub-resource access.
Secondary resources get full CRUD permissions. Default verbs can be overridden per resource entry.
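
To make the permission split more concrete, here is a minimal, self-contained Java sketch of how RBAC rules could be derived from the declared resources. It is not the chart's template logic: the class `RbacSketch`, its methods, and the exact verb lists (including applying the primary verbs to the status sub-resource) are assumptions made for illustration, so check the chart templates and `values.yaml` for the real defaults.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Illustrative model only; the chart generates its rules in Helm templates. */
final class RbacSketch {

  // Assumed verb sets for illustration; the chart's real defaults may differ.
  static final List<String> PRIMARY_VERBS = List.of("get", "list", "watch", "patch");
  static final List<String> SECONDARY_VERBS =
      List.of("get", "list", "watch", "create", "update", "patch", "delete");

  /** A single RBAC policy rule: API group, resource names, allowed verbs. */
  static final class Rule {
    final String apiGroup;
    final List<String> resources;
    final List<String> verbs;

    Rule(String apiGroup, List<String> resources, List<String> verbs) {
      this.apiGroup = apiGroup;
      this.resources = List.copyOf(resources);
      this.verbs = List.copyOf(verbs);
    }
  }

  private RbacSketch() {}

  /** Primary resources: one rule for the resources, one for their status sub-resources. */
  static List<Rule> primaryRules(String apiGroup, List<String> resources) {
    List<String> statuses =
        resources.stream().map(r -> r + "/status").collect(Collectors.toList());
    return List.of(
        new Rule(apiGroup, resources, PRIMARY_VERBS),
        new Rule(apiGroup, statuses, PRIMARY_VERBS));
  }

  /** Secondary resources: a single rule with the full CRUD verb set. */
  static Rule secondaryRule(String apiGroup, List<String> resources) {
    return new Rule(apiGroup, resources, SECONDARY_VERBS);
  }
}
```

With the values from the example above, `RbacSketch.primaryRules("sample.javaoperatorsdk", List.of("myresources"))` yields two rules, the second one targeting `myresources/status`.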

### Operator Environment

The chart injects `OPERATOR_NAMESPACE` automatically. You can optionally set `WATCH_NAMESPACE` to
restrict the operator to a single namespace, and add arbitrary environment variables:

```yaml
operator:
  watchNamespace: "" # empty = all namespaces
  env:
    - name: MY_CUSTOM_VAR
      value: "some-value"
```

### Resource Defaults

```yaml
resources:
  limits:
    cpu: 500m
    memory: 512Mi
  requests:
    cpu: 100m
    memory: 128Mi
```

See the full
[`values.yaml`](https://github.com/java-operator-sdk/java-operator-sdk/blob/main/helm/generic-helm-chart/values.yaml)
for all available options.

## Usage Example

A working example of how to use the chart can be found in the metrics-processing sample operator's
[`helm-values.yaml`](https://github.com/java-operator-sdk/java-operator-sdk/blob/main/sample-operators/metrics-processing/src/test/resources/helm-values.yaml):

```yaml
image:
  repository: metrics-processing-operator
  pullPolicy: Never
  tag: "latest"

nameOverride: "metrics-processing-operator"

resources: {}

primaryResources:
  - apiGroup: "sample.javaoperatorsdk"
    resources:
      - metricshandlingcustomresource1s
      - metricshandlingcustomresource2s
```

Install with:

```shell
helm install my-operator ./helm/generic-helm-chart -f my-values.yaml --namespace my-ns
```

## Testing the Chart

The chart includes unit tests using the [helm-unittest](https://github.com/helm-unittest/helm-unittest) plugin.
Run them with:

```shell
./helm/run-tests.sh
```
Lines changed: 72 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,72 @@
---
title: Leader Election
weight: 74
---

When running multiple replicas of an operator for high availability, leader election ensures that
only one instance actively reconciles resources at a time. JOSDK uses Kubernetes
[Lease](https://kubernetes.io/docs/concepts/architecture/leases/) objects for leader election.

## Enabling Leader Election

### Programmatic Configuration

```java
var operator = new Operator(o -> o.withLeaderElectionConfiguration(
    new LeaderElectionConfiguration("my-operator-lease", "operator-namespace")));
```

Or using the builder for full control:

```java
import static io.javaoperatorsdk.operator.api.config.LeaderElectionConfigurationBuilder.aLeaderElectionConfiguration;

import java.time.Duration;

var config = aLeaderElectionConfiguration("my-operator-lease")
    .withLeaseNamespace("operator-namespace")
    .withIdentity(System.getenv("POD_NAME"))
    .withLeaseDuration(Duration.ofSeconds(15))
    .withRenewDeadline(Duration.ofSeconds(10))
    .withRetryPeriod(Duration.ofSeconds(2))
    .build();

var operator = new Operator(o -> o.withLeaderElectionConfiguration(config));
```

### External Configuration

Leader election can also be configured via properties (e.g. environment variables or a config file).

See the [configurations](configuration.md) page for details.

## How It Works

1. When leader election is enabled, the operator starts but **does not process events** until it acquires
   the lease.
2. Once leadership is acquired, event processing begins normally.
3. If leadership is lost (e.g. the leader pod becomes unresponsive), another instance acquires the lease
   and takes over reconciliation. The instance that lost leadership is terminated (`System.exit()`).
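
The lease-based handover in the steps above can be illustrated with a toy in-memory model. This is only a sketch: real leader election goes through a Kubernetes Lease object and the API server, and it also involves a renew deadline and retry period that are not modelled here. The class `LeaseSketch` and its method are hypothetical names, not part of JOSDK.

```java
import java.time.Duration;
import java.time.Instant;

/** Toy in-memory lease; real leader election uses a Kubernetes Lease object. */
final class LeaseSketch {

  private final Duration leaseDuration;
  private String holder;      // identity of the current leader, null if nobody holds the lease
  private Instant renewedAt;  // last time the holder acquired or renewed the lease

  LeaseSketch(Duration leaseDuration) {
    this.leaseDuration = leaseDuration;
  }

  /**
   * Tries to acquire the lease, or to renew it if the caller already holds it.
   *
   * @return true if {@code identity} is the leader after the call
   */
  synchronized boolean tryAcquireOrRenew(String identity, Instant now) {
    // The lease is free if nobody holds it or the holder failed to renew in time.
    boolean expired = holder == null || !now.isBefore(renewedAt.plus(leaseDuration));
    if (expired || holder.equals(identity)) {
      holder = identity;
      renewedAt = now;
      return true;
    }
    return false;
  }
}
```

In this model a standby instance keeps calling `tryAcquireOrRenew` and only starts processing events once it returns `true`. A former leader that gets `false` back has lost the lease, which is the point where, as described above, the instance terminates.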

### Identity and Namespace Inference

If not explicitly set:
- **Identity** is resolved from the `HOSTNAME` environment variable, then the pod name, falling back to a
  random UUID.
- **Lease namespace** defaults to the namespace the operator pod is running in.
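
The identity fallback order described above can be sketched as follows. This is an illustration under stated assumptions, not JOSDK's actual code: the class `LeaderIdentityResolver` is hypothetical, and the pod name is passed in as a parameter because the text does not say how it is obtained.

```java
import java.util.Map;
import java.util.UUID;

/** Illustrative sketch only; not JOSDK's actual implementation. */
final class LeaderIdentityResolver {

  private LeaderIdentityResolver() {}

  /**
   * Picks the first usable candidate in order: HOSTNAME, then the pod name,
   * then a random UUID.
   *
   * @param env environment variables (pass System.getenv() in real use)
   * @param podName pod name if known, otherwise null
   */
  static String resolve(Map<String, String> env, String podName) {
    String hostname = env.get("HOSTNAME");
    if (hostname != null && !hostname.isBlank()) {
      return hostname;
    }
    if (podName != null && !podName.isBlank()) {
      return podName;
    }
    // Last resort: a random identity that is unique per process start.
    return UUID.randomUUID().toString();
  }
}
```

Taking the environment as a `Map` keeps the sketch easy to exercise without changing real environment variables.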

## RBAC Requirements

The operator's service account needs permissions to manage Lease objects:

```yaml
- apiGroups: ["coordination.k8s.io"]
  resources: ["leases"]
  verbs: ["create", "update", "get"]
```

JOSDK checks for these permissions at startup and throws a clear error if they are missing.

## Sample E2E Test

A complete working example is available in the
[`leader-election` sample operator](https://github.com/java-operator-sdk/java-operator-sdk/tree/main/sample-operators/leader-election),
including multi-replica deployment manifests and an E2E test that verifies failover behavior.
Lines changed: 51 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,51 @@
---
title: Logging
weight: 72
---

## Contextual Info for Logging with MDC

Logging is enhanced with additional contextual information using
[MDC](http://www.slf4j.org/manual.html#mdc). The following attributes are available in most
parts of reconciliation logic and during the execution of the controller:

| MDC Key                    | Value added from primary resource |
|:---------------------------|:----------------------------------|
| `resource.apiVersion`      | `.apiVersion`                     |
| `resource.kind`            | `.kind`                           |
| `resource.name`            | `.metadata.name`                  |
| `resource.namespace`       | `.metadata.namespace`             |
| `resource.resourceVersion` | `.metadata.resourceVersion`       |
| `resource.generation`      | `.metadata.generation`            |
| `resource.uid`             | `.metadata.uid`                   |

For more information about MDC, see [this guide](https://www.baeldung.com/mdc-in-log4j-2-logback).

### MDC entries during event handling

Although most users will not need them in their day-to-day workflow, additional MDC entries are
managed during event handling. They are typically useful in the logs related to your
`SecondaryToPrimaryMapper`.
For `InformerEventSource` and `ControllerEventSource` the following information is present:

| MDC Key                                        | Value from Resource from the Event               |
|:-----------------------------------------------|:-------------------------------------------------|
| `eventsource.event.resource.name`              | `.metadata.name`                                 |
| `eventsource.event.resource.uid`               | `.metadata.uid`                                  |
| `eventsource.event.resource.namespace`         | `.metadata.namespace`                            |
| `eventsource.event.resource.kind`              | resource kind                                    |
| `eventsource.event.resource.resourceVersion`   | `.metadata.resourceVersion`                      |
| `eventsource.event.action`                     | action name (e.g. `ADDED`, `UPDATED`, `DELETED`) |
| `eventsource.name`                             | name of the event source                         |

### Note on null values

If a resource doesn't provide a value for one of the specified keys, the key is omitted from the MDC
context. The one notable exception is the resource's namespace: instead of omitting the key, we emit
the `MDCUtils.NO_NAMESPACE` value. This makes it easier to search the logs for resources without a namespace
(notably, cluster-scoped resources).
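
The null-handling rule can be sketched with a plain `Map` standing in for the MDC context. This is an illustration, not JOSDK's implementation: the class `MdcSketch` is hypothetical, only a subset of the keys is shown, and the placeholder string is an assumption because the actual value of `MDCUtils.NO_NAMESPACE` is not stated here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative sketch only; a Map stands in for the real MDC context. */
final class MdcSketch {

  // Hypothetical placeholder; the real value is defined by MDCUtils.NO_NAMESPACE.
  static final String NO_NAMESPACE_PLACEHOLDER = "no namespace";

  private MdcSketch() {}

  /** Builds context entries for a primary resource, skipping null values. */
  static Map<String, String> resourceContext(
      String kind, String name, String namespace, String uid) {
    Map<String, String> context = new LinkedHashMap<>();
    putIfPresent(context, "resource.kind", kind);
    putIfPresent(context, "resource.name", name);
    putIfPresent(context, "resource.uid", uid);
    // Exception to the rule: the namespace key is always written,
    // so resources without a namespace remain searchable in the logs.
    context.put(
        "resource.namespace", namespace != null ? namespace : NO_NAMESPACE_PLACEHOLDER);
    return context;
  }

  private static void putIfPresent(Map<String, String> context, String key, String value) {
    if (value != null) {
      context.put(key, value);
    }
  }
}
```

For a resource without a namespace and without a UID, the resulting context contains `resource.namespace` with the placeholder value and no `resource.uid` key at all.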

### Disabling MDC support

MDC support is enabled by default. If you want to disable it, you can set the `JAVA_OPERATOR_SDK_USE_MDC` environment
variable to `false` when you start your operator.

docs/content/en/docs/documentation/observability.md renamed to docs/content/en/docs/documentation/operations/metrics.md

Lines changed: 6 additions & 53 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
---
2-
title: Observability
3-
weight: 55
2+
title: Metrics
3+
weight: 73
44
---
55

66
## Runtime Info
@@ -15,53 +15,6 @@ setting, where this flag usually needs to be set to false, in order to control t
1515
See also an example implementation in the
1616
[WebPage sample](https://github.com/java-operator-sdk/java-operator-sdk/blob/3e2e7c4c834ef1c409d636156b988125744ca911/sample-operators/webpage/src/main/java/io/javaoperatorsdk/operator/sample/WebPageOperator.java#L38-L43)
1717

18-
## Contextual Info for Logging with MDC
19-
20-
Logging is enhanced with additional contextual information using
21-
[MDC](http://www.slf4j.org/manual.html#mdc). The following attributes are available in most
22-
parts of reconciliation logic and during the execution of the controller:
23-
24-
| MDC Key | Value added from primary resource |
25-
|:---------------------------|:----------------------------------|
26-
| `resource.apiVersion` | `.apiVersion` |
27-
| `resource.kind` | `.kind` |
28-
| `resource.name` | `.metadata.name` |
29-
| `resource.namespace` | `.metadata.namespace` |
30-
| `resource.resourceVersion` | `.metadata.resourceVersion` |
31-
| `resource.generation` | `.metadata.generation` |
32-
| `resource.uid` | `.metadata.uid` |
33-
34-
For more information about MDC see this [link](https://www.baeldung.com/mdc-in-log4j-2-logback).
35-
36-
### MDC entries during event handling
37-
38-
Although, usually users might not require it in their day-to-day workflow, it is worth mentioning that
39-
there are additional MDC entries managed for event handling. Typically, you might be interested in it
40-
in your `SecondaryToPrimaryMapper` related logs.
41-
For `InformerEventSource` and `ControllerEventSource` the following information is present:
42-
43-
| MDC Key | Value from Resource from the Event |
44-
|:-----------------------------------------------|:-------------------------------------------------|
45-
| `eventsource.event.resource.name` | `.metadata.name` |
46-
| `eventsource.event.resource.uid` | `.metadata.uid` |
47-
| `eventsource.event.resource.namespace` | `.metadata.namespace` |
48-
| `eventsource.event.resource.kind` | resource kind |
49-
| `eventsource.event.resource.resourceVersion` | `.metadata.resourceVersion` |
50-
| `eventsource.event.action` | action name (e.g. `ADDED`, `UPDATED`, `DELETED`) |
51-
| `eventsource.name` | name of the event source |
52-
53-
### Note on null values
54-
55-
If a resource doesn't provide values for one of the specified keys, the key will be omitted and not added to the MDC
56-
context. There is, however, one notable exception: the resource's namespace, where, instead of omitting the key, we emit
57-
the `MDCUtils.NO_NAMESPACE` value instead. This allows searching for resources without namespace (notably, clustered
58-
resources) in the logs more easily.
59-
60-
### Disabling MDC support
61-
62-
MDC support is enabled by default. If you want to disable it, you can set the `JAVA_OPERATOR_SDK_USE_MDC` environment
63-
variable to `false` when you start your operator.
64-
6518
## Metrics
6619

6720
JOSDK provides built-in support for metrics reporting on what is happening with your reconcilers in the form of
@@ -77,9 +30,9 @@ Metrics metrics; // initialize your metrics implementation
7730
Operator operator = new Operator(client, o -> o.withMetrics(metrics));
7831
```
7932

80-
### MicrometerMetricsV2
33+
### MicrometerMetricsV2
8134

82-
[`MicrometerMetricsV2`](https://github.com/java-operator-sdk/java-operator-sdk/blob/main/micrometer-support/src/main/java/io/javaoperatorsdk/operator/monitoring/micrometer/MicrometerMetricsV2.java)
35+
[`MicrometerMetricsV2`](https://github.com/java-operator-sdk/java-operator-sdk/blob/main/micrometer-support/src/main/java/io/javaoperatorsdk/operator/monitoring/micrometer/MicrometerMetricsV2.java)
8336
is the recommended micrometer-based implementation. It is designed with low cardinality in mind:
8437
all meters are scoped to the controller, not to individual resources. This avoids unbounded cardinality growth as
8538
resources come and go.
@@ -230,8 +183,8 @@ Metrics loggingMetrics = new LoggingMetrics();
230183

231184
// combine them into a single aggregated instance
232185
Metrics aggregatedMetrics = new AggregatedMetrics(List.of(
233-
micrometerMetrics,
234-
customMetrics,
186+
micrometerMetrics,
187+
customMetrics,
235188
loggingMetrics
236189
));
237190
