Fast watcher, slow processing issue #3501

@coderdjw

Seeking help from k8s experts.

I used client-go / controller-runtime to implement a controller for my CRD. I have noticed that the controller's performance does not improve, whether I add more shards to the controller or increase max-requests-inflight / max-mutating-requests-inflight on the API server.

Below is an overview of how my controller reconciles each custom resource (CR).

  1. Add a finalizer.
  2. Set the CR status to pending.
  3. Create another CR and wait for its status to become ready.
  4. Set the CR status to running.
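
To make the flow above concrete, here is a minimal in-memory sketch of it as a state machine. It uses only the Go standard library and does not call client-go or controller-runtime; the names `Parent`, `Step`, `NextStep`, `Apply` and `WritesToRunning` are hypothetical and exist only for illustration. It assumes each reconcile pass issues at most one write and that the child's readiness is reported by some other controller.

```go
package main

import "fmt"

// Phase mirrors the status values described in the steps above.
type Phase string

const (
	PhaseNone    Phase = ""
	PhasePending Phase = "Pending"
	PhaseRunning Phase = "Running"
)

// Parent is a hypothetical in-memory stand-in for the custom resource.
type Parent struct {
	HasFinalizer bool
	Phase        Phase
	ChildCreated bool
	ChildReady   bool
}

// Step names the single action a reconcile pass would take.
type Step string

const (
	StepAddFinalizer Step = "add-finalizer"
	StepMarkPending  Step = "mark-pending"
	StepCreateChild  Step = "create-child"
	StepWaitChild    Step = "wait-child"
	StepMarkRunning  Step = "mark-running"
	StepDone         Step = "done"
)

// NextStep picks the next action from the observed state only,
// so repeating a reconcile pass is harmless.
func NextStep(p Parent) Step {
	switch {
	case !p.HasFinalizer:
		return StepAddFinalizer
	case p.Phase == PhaseNone:
		return StepMarkPending
	case !p.ChildCreated:
		return StepCreateChild
	case !p.ChildReady:
		return StepWaitChild
	case p.Phase != PhaseRunning:
		return StepMarkRunning
	default:
		return StepDone
	}
}

// Apply simulates a successful step and reports whether it would
// have been a write against the API server.
func Apply(p Parent, s Step) (Parent, bool) {
	switch s {
	case StepAddFinalizer:
		p.HasFinalizer = true
		return p, true
	case StepMarkPending:
		p.Phase = PhasePending
		return p, true
	case StepCreateChild:
		p.ChildCreated = true
		return p, true
	case StepWaitChild:
		// Assumption: readiness is set by another controller, not by us.
		p.ChildReady = true
		return p, false
	case StepMarkRunning:
		p.Phase = PhaseRunning
		return p, true
	}
	return p, false
}

// WritesToRunning counts the writes needed to take one new object to Running.
func WritesToRunning() int {
	p := Parent{}
	writes := 0
	for s := NextStep(p); s != StepDone; s = NextStep(p) {
		var wrote bool
		p, wrote = Apply(p, s)
		if wrote {
			writes++
		}
	}
	return writes
}

func main() {
	fmt.Println("writes per object:", WritesToRunning())
}
```

Under these assumptions each object costs four writes from my controller, and every one of those writes also produces a watch event that has to be delivered back to the controller.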

The average latency of the 4 steps above is around 1s to 5s.

I simulated the creation of 10000 CRs and found that it takes around 20s end to end for all of them to reach running.
I observed that the first reconcile (step #1) sometimes starts 8s after the CR is created on the API server side.
When I checked the API server logs, I found the following messages, which are logged from:
https://github.com/kubernetes/kubernetes/blob/release-1.28/staging/src/k8s.io/apiserver/pkg/storage/etcd3/watcher.go#L139

  • around 80k occurrences of "Fast watcher, slow processing. Probably caused by slow decoding, user not receiving fast, or other processing logic" incomingEvents=100 objectType="*unstructured.Unstructured" ...
  • around 500 occurrences of "Fast watcher, slow processing. Probably caused by slow dispatching events to watchers" outgoingEvents=100 objectType="*unstructured.Unstructured" ...
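
As far as I can tell from the linked source, both messages are logged when a fixed-size buffered channel is found full at the moment a new event is about to be queued, and the 100 in incomingEvents / outgoingEvents is that buffer size. The toy model below is my own sketch of that condition, not the API server's code; `CountFullBuffer` and its parameters are hypothetical, and the consumer is simulated deterministically instead of running in a separate goroutine.

```go
package main

import "fmt"

// bufSize matches the 100 shown in the incomingEvents/outgoingEvents log fields.
const bufSize = 100

// CountFullBuffer queues `events` items into a buffered channel and counts
// how often the buffer is already full when the producer wants to send.
// The simulated consumer removes one item after every `consumeEvery` sends
// (0 means it never consumes on its own).
func CountFullBuffer(events, consumeEvery int) int {
	ch := make(chan int, bufSize)
	full := 0
	for i := 1; i <= events; i++ {
		if len(ch) == cap(ch) {
			// This is the point where a "fast watcher, slow processing"
			// style warning would be logged.
			full++
			// A real producer would block here until the consumer
			// makes room; the model frees one slot to continue.
			<-ch
		}
		ch <- i
		if consumeEvery > 0 && i%consumeEvery == 0 {
			<-ch // the consumer handles one event
		}
	}
	return full
}

func main() {
	fmt.Println("consumer keeps up:     ", CountFullBuffer(1000, 1))
	fmt.Println("consumer at half speed:", CountFullBuffer(1000, 2))
}
```

In this model the count stays at zero while the consumer keeps pace with the producer and only grows once events arrive faster than they are drained, so the count of such messages says that some stage downstream of etcd is slower than the event rate, but not by itself which one.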

I cannot tell whether the bottleneck is on the controller side or the API server side. I tried increasing the number of controller shards, but it did not help. I also checked the CPU and memory usage of the API servers: it is around 50%, which is not very high.

Any suggestions on how to troubleshoot further and improve the controller's performance?

The parameters I used:

  1. Controller: 3 shards, with max_concurrent_reconciles set to 2000 on each shard (the load is balanced across all shards).
  2. API server side: 3 API servers, with max-requests-inflight = 2000 and max-mutating-requests-inflight = 2000 on each API server.
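
For a rough sense of scale, the numbers above can be put side by side. This is only back-of-the-envelope arithmetic: the `Setup` type and the helper functions are hypothetical, and the figure of 4 writes per CR is my assumption based on the four reconcile steps (it ignores retries, conflicts, and writes made by other controllers).

```go
package main

import "fmt"

// Setup holds the parameters listed above.
type Setup struct {
	Shards             int
	ReconcilesPerShard int
	APIServers         int
	ReadInflight       int // max-requests-inflight (non-mutating requests)
	MutatingInflight   int // max-mutating-requests-inflight
}

// TotalReconcilers is the number of reconciles that may run at once across all shards.
func (s Setup) TotalReconcilers() int {
	return s.Shards * s.ReconcilesPerShard
}

// TotalMutatingInflight is the summed mutating in-flight limit across all API servers.
func (s Setup) TotalMutatingInflight() int {
	return s.APIServers * s.MutatingInflight
}

// RequiredWritesPerSecond is the sustained write rate needed to take
// `objects` objects through `writesPerObject` writes within `seconds`.
func RequiredWritesPerSecond(objects, writesPerObject int, seconds float64) float64 {
	return float64(objects*writesPerObject) / seconds
}

func main() {
	s := Setup{
		Shards:             3,
		ReconcilesPerShard: 2000,
		APIServers:         3,
		ReadInflight:       2000,
		MutatingInflight:   2000,
	}
	fmt.Println("concurrent reconciles:", s.TotalReconcilers())
	fmt.Println("mutating in-flight limit:", s.TotalMutatingInflight())
	// Assumption: 4 writes per CR, 10000 CRs, 20s end to end.
	fmt.Println("writes/s needed:", RequiredWritesPerSecond(10000, 4, 20))
}
```

If these assumptions hold, the configured concurrency on both sides is well above what the observed rate needs, which fits my observation that raising those limits or adding shards made no difference.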
