
Splunk Operator: CR label/annotation changes trigger unexpected rolling restarts #1652

Description

@ductrung-nguyen

Please select the type of request

Enhancement

Tell us more

Describe the request

When you add or modify labels/annotations on a Splunk CR (like IndexerCluster), it causes a full rolling restart of all pods. This is pretty disruptive - especially for indexer clusters where each pod has to go through the whole decommissioning cycle (reassigning primaries, graceful shutdown, etc.).

For example, just adding a simple label like team: platform to organize resources shouldn't bring down your entire cluster one pod at a time.

In enterprise environments, we need labels/annotations on K8s resources for many purposes: management, FinOps, and so on.
So this issue is quite annoying for us.

Expected behavior

Adding metadata (labels/annotations) to a CR shouldn't automatically restart all the pods. Labels are commonly used for organizational purposes (cost tracking, team ownership, filtering in dashboards), and most operators treat them as non-disruptive changes.

Ideally:

  • CR labels/annotations should propagate to the StatefulSet level only (which doesn't cause restarts)
  • If users actually need labels on the pods themselves, there should be an explicit way to do that (like spec.podLabels) with clear docs that it'll trigger a rolling update
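To make the two points above concrete, here is a minimal sketch of the proposed split, written in Python purely for illustration (it is not operator code). The function name `build_statefulset`, the dict shapes, and the `podLabels` field are all hypothetical; the only assumption is that CR metadata lands on the StatefulSet object while the pod template reads from an explicit opt-in field.

```python
def build_statefulset(cr: dict) -> dict:
    """Build a StatefulSet-like dict from a CR-like dict (hypothetical shapes)."""
    meta = cr.get("metadata", {})
    spec = cr.get("spec", {})
    name = meta["name"]
    # Selector labels are fixed by the controller, never derived from user metadata.
    selector = {"app.kubernetes.io/instance": name}
    # Pod labels come only from the explicit opt-in field (hypothetical spec.podLabels).
    pod_labels = {**spec.get("podLabels", {}), **selector}
    return {
        "metadata": {
            "name": name,
            # CR labels/annotations stop here, at the StatefulSet level.
            "labels": {**meta.get("labels", {}), **selector},
            "annotations": dict(meta.get("annotations", {})),
        },
        "spec": {
            "replicas": spec.get("replicas", 1),
            "selector": {"matchLabels": selector},
            "template": {"metadata": {"labels": pod_labels}},
        },
    }


cr = {"metadata": {"name": "my-cluster", "labels": {}}, "spec": {"replicas": 3}}
before = build_statefulset(cr)

# Equivalent of: kubectl label indexercluster my-cluster team=platform
cr["metadata"]["labels"]["team"] = "platform"
after = build_statefulset(cr)

print(before["spec"]["template"] == after["spec"]["template"])  # True: pod template unchanged
print(after["metadata"]["labels"]["team"])  # platform
```

With this shape, only a change to the hypothetical `spec.podLabels` would alter the pod template, so the restart becomes something the user explicitly asks for.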

Splunk setup on K8S

Any Splunk deployment managed by the operator - Standalone, IndexerCluster, SearchHeadCluster, etc.

Reproduction steps

  1. Deploy an IndexerCluster with 3 replicas
  2. Wait for all pods to be ready
  3. Add a label to the CR: kubectl label indexercluster my-cluster team=platform
  4. Watch all 3 indexer pods get recycled one by one (with full decommissioning for each)

K8s environment

Happens on any K8s version - it's an operator behavior, not K8s specific.

Additional context

The root cause is that CR metadata gets propagated to the Pod Template (.spec.template.metadata), which changes the controller-revision-hash and triggers the operator's rolling update logic.
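The mechanism can be illustrated with a small Python simulation. This is a sketch under assumptions, not Kubernetes or operator code: `template_hash` is a hypothetical stand-in that uses SHA-256 over the pod template, whereas the real controller-revision-hash is computed differently. The point it shows is only that a revision derived from `.spec.template` changes when a label lands in the template and stays the same when the label stays on the StatefulSet's own metadata.

```python
import hashlib
import json


def template_hash(statefulset: dict) -> str:
    """Stand-in for a revision hash: a digest of the pod template only."""
    template = statefulset["spec"]["template"]
    payload = json.dumps(template, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()[:10]


sts = {
    "metadata": {"name": "my-cluster", "labels": {}},
    "spec": {"template": {"metadata": {"labels": {"app": "indexer"}}}},
}
base = template_hash(sts)

# Label added to the StatefulSet object itself: the template is untouched.
sts["metadata"]["labels"]["team"] = "platform"
sts_level_changed = template_hash(sts) != base

# Same label propagated into the pod template: the digest changes.
sts["spec"]["template"]["metadata"]["labels"]["team"] = "platform"
template_level_changed = template_hash(sts) != base

print(sts_level_changed, template_level_changed)  # False True
```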

Most other operators either:

  • Don't propagate CR metadata to pods at all
  • Provide separate fields for "safe" metadata vs "pod-level" metadata
  • Have an opt-in mechanism for changes that require restarts
