summary: Learn how to configure Horizontal Pod Autoscaling (HPA) to dynamically scale a Milvus cluster.
5
+
title: Configure Horizontal Pod Autoscaling (HPA) for Milvus
6
+
---
7
+
# Configure Horizontal Pod Autoscaling (HPA) for Milvus
8
+
9
+
## Overview
10
+
11
+
Horizontal Pod Autoscaling (HPA) is a Kubernetes feature that automatically adjusts the number of Pods in a deployment based on resource utilization, such as CPU or memory. In Milvus, HPA can be applied to stateless components like `proxy`, `queryNode`, `dataNode`, and `indexNode` to dynamically scale the cluster in response to workload changes.
12
+
13
+
This guide explains how to configure HPA for Milvus components using the Milvus Operator.
14
+
15
+
## Prerequisites
16
+
17
+
- A running Milvus cluster deployed with Milvus Operator.
18
+
- Access to `kubectl` for managing Kubernetes resources.
19
+
- Familiarity with Milvus architecture and Kubernetes HPA.
20
+
21
+
## Configure HPA with Milvus Operator
22
+
23
+
To enable HPA in a Milvus cluster managed by the Milvus Operator, follow these steps:
24
+
25
+
1. **Set Replicas to -1**:
26
+
27
+
In the Milvus custom resource (CR), set the `replicas` field to `-1` for the component you want to scale with HPA. This delegates scaling control to HPA instead of the operator. You can edit the CR directly or use the following `kubectl patch` command to quickly switch to HPA control:
Replace `<your-release-name>` with the name of your Milvus cluster.
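The patch only needs to change a single field. As a minimal sketch, assuming a JSON merge patch is used, the patch body could be kept in a file and applied with `kubectl patch milvus <your-release-name> --type merge --patch-file proxy-hpa-patch.yaml`. The file name `proxy-hpa-patch.yaml` is hypothetical and chosen only for illustration:

```yaml
# proxy-hpa-patch.yaml (hypothetical file name)
# Merge patch that sets the proxy replica count to -1,
# handing scaling control for this component over to HPA.
spec:
  components:
    proxy:
      replicas: -1
```

To switch another component, replace `proxy` with that component's key in the CR, for example `queryNode`.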
34
+
35
+
To verify that the change has been applied, run:
36
+
37
+
```bash
kubectl get milvus <your-release-name> -o jsonpath='{.spec.components.proxy.replicas}'
```
40
+
41
+
The output should be `-1`, confirming that the `proxy` component is now under HPA control.
42
+
43
+
Alternatively, you can define it in the CR YAML:
44
+
45
+
```yaml
apiVersion: milvus.io/v1beta1
kind: Milvus
metadata:
  name: <your-release-name>
spec:
  mode: cluster
  components:
    proxy:
      replicas: -1
```
56
+
2. **Define an HPA Resource**:
57
+
58
+
Create an HPA resource to target the deployment of the desired component. Below is an example for the `proxy` component:
59
+
60
+
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-release-milvus-proxy-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-release-milvus-proxy
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 60
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 60
  behavior:
    scaleUp:
      policies:
        - type: Pods
          value: 1
          periodSeconds: 30
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Pods
          value: 1
          periodSeconds: 60
```
98
+
99
+
Replace `my-release` in `metadata.name` and `spec.scaleTargetRef.name` with your actual Milvus cluster name (e.g., `<your-release-name>-milvus-proxy-hpa` and `<your-release-name>-milvus-proxy`).
100
+
3. **Apply the HPA Configuration**:
101
+
102
+
Deploy the HPA resource using the following command:
103
+
104
+
```bash
kubectl apply -f hpa.yaml
```
107
+
108
+
To verify that the HPA has been successfully created, run:
109
+
110
+
```bash
kubectl get hpa
```
113
+
114
+
You should see output similar to:
115
+
116
+
```
NAME   REFERENCE   TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
```
The `NAME` and `REFERENCE` fields will reflect your cluster name (e.g., `<your-release-name>-milvus-proxy-hpa` and `Deployment/<your-release-name>-milvus-proxy`).
122
+
123
+
- `scaleTargetRef`: Specifies the deployment to scale (e.g., `my-release-milvus-proxy`).
124
+
- `minReplicas` and `maxReplicas`: Set the scaling range (2 to 10 Pods in this example).
125
+
- `metrics`: Configures scaling based on CPU and memory utilization, targeting 60% average usage.
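
The same fields can be reused for the other stateless components. The following is a sketch only, for the `queryNode` component with a CPU metric. The Deployment name `my-release-milvus-querynode` and the HPA name are assumptions that follow the proxy naming pattern shown earlier, so confirm the actual name with `kubectl get deployments` before applying it. It also assumes that `replicas` is set to `-1` for `queryNode` in the Milvus CR and that the Pods declare CPU requests, because utilization targets are computed relative to requests.

```yaml
# Sketch only: resource names below are assumed, not taken from the guide.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-release-milvus-querynode-hpa   # assumed name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-release-milvus-querynode     # assumed Deployment name
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 60          # same 60% target as the proxy example
```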
126
+
127
+
## Conclusion
128
+
129
+
HPA allows Milvus to efficiently adapt to varying workloads. By using the `kubectl patch` command, you can quickly switch a component to HPA control without manually editing the full CR. For more details, refer to the [Kubernetes HPA documentation](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/).
summary: Learn how to manually or automatically scale out and scale in a Milvus cluster.
5
5
title: Scale a Milvus Cluster
6
6
---
7
-
8
7
# Scale a Milvus Cluster
9
8
10
-
Milvus supports horizontal scaling of its components. This means you can either increase or decrease the number of worker nodes of each type according to your own need.
9
+
Milvus supports horizontal scaling of its components. This means you can increase or decrease the number of worker nodes of each type according to your needs.
11
10
12
-
This topic describes how to scale out and scale in a Milvus cluster. We assume that you have already [installed a Milvus cluster](install_cluster-helm.md) before scaling. Also, we recommend familiarizing yourself with the [Milvus architecture](architecture_overview.md) before you begin.
11
+
This topic describes how to scale out and scale in a Milvus cluster. We assume that you have already [installed a Milvus cluster](install_cluster-helm.md) before scaling. Also, we recommend familiarizing yourself with the [Milvus architecture](architecture_overview.md) before you begin.
13
12
14
13
This tutorial uses scaling out to three query nodes as an example. To scale out other types of nodes, replace `queryNode` with the corresponding node type in the command line.
15
14
@@ -23,8 +22,9 @@ For information on how to scale a cluster with Milvus Operator, refer to [Scale
23
22
24
23
Horizontal scaling includes scaling out and scaling in.
25
24
26
-
### Scaling out
27
-
Scaling out refers to increasing the number of nodes in a cluster. Unlike scaling up, scaling out does not require you to allocate more resources to one node in the cluster. Instead, scaling out expands the cluster horizontally by adding more nodes.
25
+
### Scaling out
26
+
27
+
Scaling out refers to increasing the number of nodes in a cluster. Unlike scaling up, scaling out does not require you to allocate more resources to one node in the cluster. Instead, scaling out expands the cluster horizontally by adding more nodes.
@@ -33,15 +33,17 @@ Scaling out refers to increasing the number of nodes in a cluster. Unlike scalin
33
33
According to the [Milvus architecture](architecture_overview.md), stateless worker nodes include query node, data node, index node, and proxy. Therefore, you can scale out these types of nodes to suit your business needs and application scenarios. You can either scale out the Milvus cluster manually or automatically.
34
34
35
35
Generally, you will need to scale out the Milvus cluster you created if it is over-utilized. Below are some typical situations where you may need to scale out the Milvus cluster:
36
+
36
37
- The CPU and memory utilization is high for a period of time.
37
38
- The query throughput becomes higher.
38
39
- Higher speed for indexing is required.
39
40
- Massive volumes of large datasets need to be processed.
40
41
- High availability of the Milvus service needs to be ensured.
41
42
42
-
43
43
### Scaling in
44
+
44
45
Scaling in refers to decreasing the number of nodes in a cluster. Generally, you will need to scale in the Milvus cluster you created if it is under-utilized. Below are some typical situations where you need to scale in the Milvus cluster:
46
+
45
47
- The CPU and memory utilization is low for a period of time.
Milvus only supports adding the worker nodes and does not support adding the coordinator components.
75
77
</div>
76
78
77
-
## Scale a Milvus cluster
79
+
## Scale a Milvus cluster
78
80
79
-
You can scale in your Milvus cluster either manually or automatically. If autoscaling is enabled, the Milvus cluster will shrink or expand automatically when CPU and memory resources consumption reaches the value you have set.
81
+
You can scale your Milvus cluster either manually or automatically. For automatic scaling with Horizontal Pod Autoscaling (HPA), see [Configure HPA for Milvus](hpa.md). If autoscaling is enabled, the Milvus cluster shrinks or expands automatically when CPU and memory consumption reaches the thresholds you have set.
80
82
81
83
Currently, Milvus 2.1.0 only supports scaling in and out manually.
82
84
83
-
84
85
#### Scaling out
85
86
86
87
Run `helm upgrade my-release milvus/milvus --set queryNode.replicas=3 --reuse-values` to manually scale out the query node.
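
The same setting can also be kept in a values file instead of being passed with `--set`. As a minimal sketch, assuming a file named `values-scale.yaml` (a hypothetical name), the fragment below is equivalent to `--set queryNode.replicas=3` and could be applied with `helm upgrade my-release milvus/milvus -f values-scale.yaml --reuse-values`:

```yaml
# values-scale.yaml (hypothetical file name)
# Equivalent to: --set queryNode.replicas=3
queryNode:
  replicas: 3
```

Keeping the replica count in a file makes the change easier to review and repeat than a one-off command-line flag.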