Unable to auto-scale Kubernetes cluster #52

@saffronjam

Hi!

I am unable to auto-scale Kubernetes clusters. As I understand it, a "cluster-autoscaler" deployment is created that decides whether to scale or not. However, it does not seem to work: the pod logs multiple errors and warnings, even though this is a completely clean cluster.

Normal scaling seems to work just fine.

Setup

A "default" CloudStack 4.18 setup running KVM hosts.

Settings (relevant)

  • Cloud kubernetes service enabled true
  • Cloud kubernetes cluster experimental features enabled true
  • Cloud kubernetes cluster max size 50

The nodes use the following service offering:

  • 2 CPU x 2.05 GHz
  • 2048 MB memory
  • 8 GB root disk

Replicate

  1. Create a new cluster using the Kubernetes 1.24 ISO found here:
    http://download.cloudstack.org/cks/

  2. Enable forced auto-scaling
    Since the cluster starts with only one worker node, setting auto-scaling to 3-5 nodes should trigger a scale-up (I assume).
    (Screenshot from 2023-08-07 16-55-00)

  3. Check the logs for cluster-autoscaler in the Kubernetes cluster
    Some notable entries:

E0807 14:41:30.317148       1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.CSIDriver: failed to list *v1.CSIDriver: csidrivers.storage.k8s.io is forbidden: User "system:serviceaccount:kube-system:cluster-autoscaler" cannot list resource "csidrivers" in API group "storage.k8s.io" at the cluster scope

E0807 14:41:32.388828       1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1beta1.CSIStorageCapacity: failed to list *v1beta1.CSIStorageCapacity: csistoragecapacities.storage.k8s.io is forbidden: User "system:serviceaccount:kube-system:cluster-autoscaler" cannot list resource "csistoragecapacities" in API group "storage.k8s.io" at the cluster scope
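
To see at a glance which permissions the service account is missing, the "forbidden" entries can be summarized with a short script. This is only a rough sketch: the regular expression is modelled on the two entries above and will not match other message shapes, and `denied_resources` is a helper name made up for illustration.

```python
import re

# Pattern modelled on the two "forbidden" entries quoted above (an assumption:
# other Kubernetes authorization messages may be worded differently).
FORBIDDEN = re.compile(
    r'User "(?P<user>[^"]+)" cannot (?P<verb>\w+) resource "(?P<resource>[^"]+)" '
    r'in API group "(?P<group>[^"]*)"'
)


def denied_resources(log_text):
    """Return sorted, de-duplicated (verb, resource, api_group) tuples found in the log."""
    found = set()
    for match in FORBIDDEN.finditer(log_text):
        found.add((match["verb"], match["resource"], match["group"]))
    return sorted(found)


# The two log entries from this report, pasted verbatim.
sample = '''
E0807 14:41:30.317148       1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.CSIDriver: failed to list *v1.CSIDriver: csidrivers.storage.k8s.io is forbidden: User "system:serviceaccount:kube-system:cluster-autoscaler" cannot list resource "csidrivers" in API group "storage.k8s.io" at the cluster scope
E0807 14:41:32.388828       1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1beta1.CSIStorageCapacity: failed to list *v1beta1.CSIStorageCapacity: csistoragecapacities.storage.k8s.io is forbidden: User "system:serviceaccount:kube-system:cluster-autoscaler" cannot list resource "csistoragecapacities" in API group "storage.k8s.io" at the cluster scope
'''

for verb, resource, group in denied_resources(sample):
    print(f"missing: {verb} on {resource} (API group {group})")
```

Running it against the full attached log instead of `sample` should list every resource the autoscaler's service account is denied, which may help decide which rules are missing from its ClusterRole.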

Even though I have not edited anything myself (just a clean CKS cluster), I get these weird logs:

W0807 14:41:43.251280       1 clusterstate.go:590] Failed to get nodegroup for 6a4c91a3-9694-4596-9ddd-dc86e60136ff: Unable to find node 6a4c91a3-9694-4596-9ddd-dc86e60136ff in cluster

W0807 14:41:43.251361       1 clusterstate.go:590] Failed to get nodegroup for bd0b855f-6dc6-4678-9bea-b52329333024: Unable to find node bd0b855f-6dc6-4678-9bea-b52329333024 in cluster

I0807 14:57:06.667061       1 static_autoscaler.go:341] 2 unregistered nodes present

The IDs are correct in CloudStack
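
For anyone who wants to repeat that comparison, the node IDs can be pulled out of the warnings with a few lines of Python. This is a sketch under assumptions: the pattern is based only on the two `clusterstate.go` warnings above, and `unregistered_node_ids` is a hypothetical helper, not part of any tool.

```python
import re

# Matches the UUID in "Failed to get nodegroup for <id>:" warnings as quoted above.
NODEGROUP_WARNING = re.compile(
    r"Failed to get nodegroup for (?P<node_id>[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}):"
)


def unregistered_node_ids(log_text):
    """Return node IDs from nodegroup warnings, in first-seen order, without repeats."""
    seen = []
    for match in NODEGROUP_WARNING.finditer(log_text):
        node_id = match["node_id"]
        if node_id not in seen:
            seen.append(node_id)
    return seen


# The two warnings from this report, pasted verbatim.
warnings = '''
W0807 14:41:43.251280       1 clusterstate.go:590] Failed to get nodegroup for 6a4c91a3-9694-4596-9ddd-dc86e60136ff: Unable to find node 6a4c91a3-9694-4596-9ddd-dc86e60136ff in cluster
W0807 14:41:43.251361       1 clusterstate.go:590] Failed to get nodegroup for bd0b855f-6dc6-4678-9bea-b52329333024: Unable to find node bd0b855f-6dc6-4678-9bea-b52329333024 in cluster
'''

for node_id in unregistered_node_ids(warnings):
    print(node_id)
```

The printed IDs can then be checked by hand against the instance IDs that CloudStack reports for the cluster.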

The entire log:
logs-from-cluster-autoscaler-in-cluster-autoscaler-5bf887ddd8-hxg2g.log

Please tell me if you need more logs to look at, or if I should try some other configuration.

Thanks!
