bug: yaml based deployment seems to be broken in v3.9.0 #1995

Description

@Cajga

Describe the issue

We updated our YAML-based deployment to v3.9.0 and the operator failed to start. After downgrading to v3.8.0, it worked as usual.

For example, if you compare the fluent-operator Deployment in setup.yaml with the one generated by helm template, they are completely different (the containers are named differently and run different commands):
From setup.yaml:

---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app.kubernetes.io/component: operator
    app.kubernetes.io/name: fluent-operator
  name: fluent-operator
  namespace: fluent
spec:
  selector:
    matchLabels:
      app.kubernetes.io/component: operator
      app.kubernetes.io/name: fluent-operator
  template:
    metadata:
      labels:
        app.kubernetes.io/component: operator
        app.kubernetes.io/name: fluent-operator
    spec:
      containers:
        - args:
            - --leader-elect
            - --health-probe-bind-address=:8081
            - --metrics-bind-address=:8443
          command:
            - /manager
          env:
            - name: NAMESPACE
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.namespace
          image: ghcr.io/fluent/fluent-operator/fluent-operator:latest
          livenessProbe:
            httpGet:
              path: /healthz
              port: 8081
            initialDelaySeconds: 15
            periodSeconds: 20
          name: manager
          ports:
            - containerPort: 8081
              name: health
              protocol: TCP
          readinessProbe:
            httpGet:
              path: /readyz
              port: 8081
            initialDelaySeconds: 5
            periodSeconds: 10
          resources:
            limits:
              cpu: 200m
              memory: 128Mi
            requests:
              cpu: 10m
              memory: 64Mi
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop:
                - ALL
            readOnlyRootFilesystem: true
          volumeMounts:
            - mountPath: /fluent-operator
              name: env
      securityContext:
        runAsNonRoot: true
      serviceAccountName: fluent-operator
      terminationGracePeriodSeconds: 10
      volumes:
        - configMap:
            name: fluent-operator-env
          name: env

From the Helm chart, rendered with helm template:

# Source: fluent-operator/templates/fluent-operator-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: fluent-operator
  namespace: "default"
  labels:
    app.kubernetes.io/component: operator
    app.kubernetes.io/name: fluent-operator
  annotations:
    {}
spec:
  selector:
    matchLabels:
      app.kubernetes.io/component: operator
      app.kubernetes.io/name: fluent-operator
  template:
    metadata:
      labels:
        app.kubernetes.io/component: operator
        app.kubernetes.io/name: fluent-operator
    spec:
      volumes:
      - name: env
        configMap:
          name: fluent-operator-env
      containers:
      - name: fluent-operator
        image: ghcr.io/fluent/fluent-operator/fluent-operator:3.9.0
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
          readOnlyRootFilesystem: true
          runAsGroup: 65532
          runAsNonRoot: true
          runAsUser: 65532
          seccompProfile:
            type: RuntimeDefault
        resources:
          limits:
            cpu: 100m
            memory: 60Mi
          requests:
            cpu: 100m
            memory: 20Mi
        env:
          - name: NAMESPACE
            valueFrom:
              fieldRef:
                apiVersion: v1
                fieldPath: metadata.namespace
          - name: CONTAINER_LOG_PATH
            value: "/var/log/containers"
        args:
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /healthz
            port: 8081
          initialDelaySeconds: 15
          periodSeconds: 20
          timeoutSeconds: 5
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /readyz
            port: 8081
          initialDelaySeconds: 5
          periodSeconds: 10
          timeoutSeconds: 5
        volumeMounts:
        - name: env
          mountPath: /fluent-operator
        ports:
          - containerPort: 8080
            name: metrics
      serviceAccountName: fluent-operator
      securityContext:
        fsGroup: 65532
        runAsGroup: 65532
        runAsNonRoot: true
        runAsUser: 65532
        seccompProfile:
          type: RuntimeDefault

To Reproduce

Look into the setup.yaml of v3.9.0. The Deployment seems to be totally different than in previous versions or in the Helm chart.
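To make the mismatch easier to see, here is a minimal Python sketch that diffs the container fields of the two manifests quoted above. The values are copied by hand from this issue rather than parsed from the real files, and the helper name diff_fields is only illustrative. A field that is absent or empty in a manifest (command and args in the Helm output) is recorded as None.

```python
# Container fields copied by hand from the two manifests quoted in this issue.
# None marks a field that is absent or empty in that manifest.
setup_yaml_container = {
    "name": "manager",
    "image": "ghcr.io/fluent/fluent-operator/fluent-operator:latest",
    "command": ["/manager"],
    "args": [
        "--leader-elect",
        "--health-probe-bind-address=:8081",
        "--metrics-bind-address=:8443",
    ],
    "ports": [{"containerPort": 8081, "name": "health", "protocol": "TCP"}],
}

helm_container = {
    "name": "fluent-operator",
    "image": "ghcr.io/fluent/fluent-operator/fluent-operator:3.9.0",
    "command": None,  # no command key in the Helm output
    "args": None,     # "args:" is rendered with an empty value
    "ports": [{"containerPort": 8080, "name": "metrics"}],
}


def diff_fields(left, right):
    """Return {field: (left_value, right_value)} for every field that differs."""
    keys = sorted(set(left) | set(right))
    return {k: (left.get(k), right.get(k)) for k in keys if left.get(k) != right.get(k)}


differences = diff_fields(setup_yaml_container, helm_container)
for field, (from_setup, from_helm) in differences.items():
    print(f"{field}:")
    print(f"  setup.yaml: {from_setup}")
    print(f"  helm:       {from_helm}")
```

With these inputs every compared field differs: container name, image tag, command, args, and the exposed port.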

Expected behavior

Regardless of the deployment method (Helm vs. YAML), we should end up with the same Deployment.

Your Environment

- Fluent Operator version: v3.9.0
- Container Runtime: containerd
- Operating system: Ubuntu 24

How did you install fluent operator?

No response

Additional context

No response
