When Pods use more memory or CPU than the target defined in the HPA (Horizontal Pod Autoscaler), autoscaling triggers and adds replicas of the same Deployment to balance the load.
For the HPA to work, it needs the CPU and memory usage of the Pods. This requires the metrics server to be deployed.
File: cluster.yaml
Add the following to the cluster spec.
These lines tell the kube-controller-manager not to use REST-based endpoints to gather metrics.
kubeControllerManager:
  horizontalPodAutoscalerUseRestClients: false

File: site/dep.yaml
- Add resource requests and limits to the container. The metrics server tracks usage against them, and HPA targets are expressed relative to the requested resources (e.g. 50% of the request).
containers:
- name: php-fpm
  image: superuser/php:v1.0.8
  resources:
    limits:
      cpu: "0.2"
      memory: 100Mi
    requests:
      cpu: 10m
      memory: 64Mi
- Add an HPA block like the following example:
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: test-aws-k8s-master
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: test-aws-k8s-master  # name of the Deployment
  minReplicas: 1  # minimum number of replicas: >=1
  maxReplicas: 3  # maximum number of replicas the HPA can scale to
  metrics:
  - type: Resource
    resource:
      name: memory  # target resource
      targetAverageUtilization: 50  # scale out when average utilization exceeds 50% of the request
  - type: Resource
    resource:
      name: cpu
      targetAverageUtilization: 50
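To make the trigger condition concrete, here is a minimal Python sketch of how the HPA arrives at a replica count from settings like these. It assumes the replica formula from the Kubernetes documentation (desired = ceil(current replicas * current utilization / target utilization)), the default 10% tolerance, and bounds of 1 and 3 replicas. The function names and the usage figures (48Mi, 4m) are made up for illustration; the requests (64Mi, 10m) and the 50% targets are the ones used in this section.

```python
import math


def utilization_percent(usage: float, request: float) -> float:
    """Usage as a percentage of the container request (same unit for both)."""
    return 100.0 * usage / request


def desired_replicas(current_replicas: int, current_util: float, target_util: float,
                     min_replicas: int, max_replicas: int, tolerance: float = 0.1) -> int:
    """Replica count for one metric, clamped to the HPA bounds."""
    ratio = current_util / target_util
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: no scaling action.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


# Illustrative (assumed) average usage per Pod, measured against the requests.
memory_util = utilization_percent(usage=48, request=64)  # Mi used / Mi requested
cpu_util = utilization_percent(usage=4, request=10)      # millicores used / requested

# One proposal per metric; the HPA acts on the largest one.
per_metric = [
    desired_replicas(1, memory_util, 50, min_replicas=1, max_replicas=3),
    desired_replicas(1, cpu_util, 50, min_replicas=1, max_replicas=3),
]
replicas = max(per_metric)
print(replicas)  # 2
```

Memory sits at 75% of its request, above the 50% target, so a second replica is added even though CPU is below target. When several metrics are listed, the HPA uses the largest replica count that any of them proposes.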
If Pod or Node affinities are configured, HPA behaviour depends on them:
- If pod affinity is configured so that each node runs only one replica and 2 nodes are available, autoscaling can create at most 2 replicas.
- If node affinity is configured and only 2 nodes are available in the node pool, the number of replicas depends on the resources available in that node pool.
- If Pod and Node affinities are configured at the same time, both of the limits above apply.
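The effect of these limits can be estimated with a short Python sketch. It is a simplification under stated assumptions: the scheduler places Pods by their resource requests, each node is described only by its free CPU and memory, and taints, other workloads and topology rules are ignored. The function name, the `one_pod_per_node` flag and the node figures are hypothetical; the requests (10m CPU, 64Mi memory) are the ones used in this section.

```python
def max_schedulable_replicas(max_replicas, nodes, cpu_request_m, mem_request_mi,
                             one_pod_per_node=False):
    """Upper bound on replicas that fit on the nodes allowed by node affinity.

    nodes: list of (free_cpu_millicores, free_memory_mi) tuples, one per node.
    one_pod_per_node: models a pod affinity rule allowing one replica per node.
    """
    total = 0
    for free_cpu_m, free_mem_mi in nodes:
        # A Pod fits only if both its CPU and memory requests fit.
        fits = min(free_cpu_m // cpu_request_m, free_mem_mi // mem_request_mi)
        if one_pod_per_node:
            fits = min(fits, 1)
        total += fits
    # The HPA never goes above maxReplicas, whatever the nodes could hold.
    return min(max_replicas, total)


# Two nodes in the pool, with assumed free capacity (millicores, Mi).
NODES = [(500, 128), (300, 64)]

# Node affinity only: memory is the bottleneck (2 Pods + 1 Pod).
print(max_schedulable_replicas(10, NODES, 10, 64))  # 3

# Pod and node affinity together: one replica per node on two nodes.
print(max_schedulable_replicas(10, NODES, 10, 64, one_pod_per_node=True))  # 2
```

Replicas requested by the HPA beyond this bound would stay in the Pending state until more capacity or more nodes become available.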