Commit dfe2fbb

Milvus-doc-bot authored and committed
Release new docs to master
1 parent 00cde98 commit dfe2fbb

2 files changed

Lines changed: 142 additions & 13 deletions

v2.5.x/site/en/adminGuide/hpa.md

Lines changed: 129 additions & 0 deletions
@@ -0,0 +1,129 @@
1+
---
2+
id: hpa.md
3+
related_key: scale Milvus cluster with HPA
4+
summary: Learn how to configure Horizontal Pod Autoscaling (HPA) to dynamically scale a Milvus cluster.
5+
title: Configure Horizontal Pod Autoscaling (HPA) for Milvus
6+
---
7+
# Configure Horizontal Pod Autoscaling (HPA) for Milvus
8+
9+
## Overview
10+
11+
Horizontal Pod Autoscaling (HPA) is a Kubernetes feature that automatically adjusts the number of Pods in a deployment based on resource utilization, such as CPU or memory. In Milvus, HPA can be applied to stateless components like `proxy`, `queryNode`, `dataNode`, and `indexNode` to dynamically scale the cluster in response to workload changes.
12+
13+
This guide explains how to configure HPA for Milvus components using the Milvus Operator.
14+
15+
## Prerequisites
16+
17+
- A running Milvus cluster deployed with Milvus Operator.
18+
- Access to `kubectl` for managing Kubernetes resources.
19+
- Familiarity with Milvus architecture and Kubernetes HPA.
20+
21+
## Configure HPA with Milvus Operator
22+
23+
To enable HPA in a Milvus cluster managed by the Milvus Operator, follow these steps:
24+
25+
1. **Set Replicas to -1**:
26+
27+
In the Milvus custom resource (CR), set the `replicas` field to `-1` for the component you want to scale with HPA. This delegates scaling control to HPA instead of the operator. You can edit the CR directly or use the following `kubectl patch` command to quickly switch to HPA control:
28+
29+
```bash
30+
kubectl patch milvus <your-release-name> --type='json' -p='[{"op": "replace", "path": "/spec/components/proxy/replicas", "value": -1}]'
31+
```
32+
33+
Replace `<your-release-name>` with the name of your Milvus cluster.
34+
35+
To verify that the change has been applied, run:
36+
37+
```bash
38+
kubectl get milvus <your-release-name> -o jsonpath='{.spec.components.proxy.replicas}'
39+
```
40+
41+
The expected output should be `-1`, confirming that the `proxy` component is now under HPA control.
42+
43+
Alternatively, you can define it in the CR YAML:
44+
45+
```yaml
apiVersion: milvus.io/v1beta1
kind: Milvus
metadata:
  name: <your-release-name>
spec:
  mode: cluster
  components:
    proxy:
      replicas: -1
55+
```
56+
2. **Define an HPA Resource**:
57+
58+
Create an HPA resource to target the deployment of the desired component. Below is an example for the `proxy` component:
59+
60+
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-release-milvus-proxy-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-release-milvus-proxy
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 60
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 60
  behavior:
    scaleUp:
      policies:
        - type: Pods
          value: 1
          periodSeconds: 30
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Pods
          value: 1
          periodSeconds: 60
97+
```
98+
99+
Replace `my-release` in `metadata.name` and `spec.scaleTargetRef.name` with your actual Milvus cluster name (e.g., `<your-release-name>-milvus-proxy-hpa` and `<your-release-name>-milvus-proxy`).
100+
3. **Apply the HPA Configuration**:
101+
102+
Deploy the HPA resource using the following command:
103+
104+
```bash
105+
kubectl apply -f hpa.yaml
106+
```
107+
108+
To verify that the HPA has been successfully created, run:
109+
110+
```bash
111+
kubectl get hpa
112+
```
113+
114+
You should see output similar to:
115+
116+
```
117+
NAME                          REFERENCE                            TARGETS      MINPODS   MAXPODS   REPLICAS   AGE
my-release-milvus-proxy-hpa   Deployment/my-release-milvus-proxy   <some>/60%   2         10        2          <time>
119+
```
120+
121+
The `NAME` and `REFERENCE` fields will reflect your cluster name (e.g., `<your-release-name>-milvus-proxy-hpa` and `Deployment/<your-release-name>-milvus-proxy`).
122+
123+
The key fields in the HPA manifest from step 2 are:

- `scaleTargetRef`: Specifies the Deployment to scale (e.g., `my-release-milvus-proxy`).
124+
- `minReplicas` and `maxReplicas`: Set the scaling range (2 to 10 Pods in this example).
125+
- `metrics`: Configures scaling based on CPU and memory utilization, targeting 60% average usage.
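
To see how the 60% target above translates into Pod counts, the following sketch applies the replica formula described in the Kubernetes HPA documentation, `desiredReplicas = ceil(currentReplicas * currentUtilization / targetUtilization)`, and clamps the result to `minReplicas` and `maxReplicas`. The function name `desired_replicas` and its argument order are illustrative assumptions, not part of Milvus or Kubernetes. The sketch deliberately ignores the controller's tolerance band and the `behavior` rate limits, and it handles one metric at a time (with several metrics, HPA uses the largest proposed count).

```bash
#!/usr/bin/env bash
# Hypothetical helper: estimate the replica count HPA would propose for one metric.
# Usage: desired_replicas CURRENT_REPLICAS CURRENT_UTIL_PCT TARGET_UTIL_PCT MIN MAX
desired_replicas() {
  local current=$1 util=$2 target=$3 min=$4 max=$5
  # Integer ceiling of (current * util / target).
  local desired=$(( (current * util + target - 1) / target ))
  # Clamp to the configured scaling range.
  if (( desired < min )); then desired=$min; fi
  if (( desired > max )); then desired=$max; fi
  echo "$desired"
}

# 2 proxies averaging 90% CPU against a 60% target, range 2 to 10.
desired_replicas 2 90 60 2 10   # prints 3
# 4 proxies averaging 20% CPU: the raw result is 2, which is also the minimum.
desired_replicas 4 20 60 2 10   # prints 2
```

With the `behavior` section from the example manifest, the controller would approach these counts one Pod at a time rather than jumping to them directly.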
126+
127+
## Conclusion
128+
129+
HPA allows Milvus to efficiently adapt to varying workloads. By using the `kubectl patch` command, you can quickly switch a component to HPA control without manually editing the full CR. For more details, refer to the [Kubernetes HPA documentation](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/).
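
If you switch several components to HPA control, the patch from step 1 can be generated instead of typed by hand. The sketch below only builds and prints the command for review; it does not contact the cluster. The function names `replicas_patch` and `hpa_patch_command` are hypothetical, and the sketch assumes the component argument matches the key used under `spec.components` in your Milvus CR (for example `proxy`).

```bash
#!/usr/bin/env bash
# Hypothetical helper: build the JSON patch that sets replicas to -1 for one component.
replicas_patch() {
  local component=$1
  printf '[{"op": "replace", "path": "/spec/components/%s/replicas", "value": -1}]\n' "$component"
}

# Hypothetical helper: print the full kubectl command so it can be reviewed before running.
hpa_patch_command() {
  local release=$1 component=$2
  printf "kubectl patch milvus %s --type='json' -p='%s'\n" \
    "$release" "$(replicas_patch "$component")"
}

# Assumed release name "my-release"; prints the same command shown in step 1.
hpa_patch_command my-release proxy
```

Printing the command rather than executing it keeps the sketch safe to run anywhere; pipe the output to `bash` only after checking the release and component names.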

v2.5.x/site/en/adminGuide/scaleout.md

Lines changed: 13 additions & 13 deletions
@@ -4,12 +4,11 @@ related_key: scale Milvus cluster
44
summary: Learn how to manually or automatically scale out and scale in a Milvus cluster.
55
title: Scale a Milvus Cluster
66
---
7-
87
# Scale a Milvus Cluster
98

10-
Milvus supports horizontal scaling of its components. This means you can either increase or decrease the number of worker nodes of each type according to your own need.
9+
Milvus supports horizontal scaling of its components. This means you can either increase or decrease the number of worker nodes of each type according to your own need.
1110

12-
This topic describes how to scale out and scale in a Milvus cluster. We assume that you have already [installed a Milvus cluster](install_cluster-helm.md) before scaling. Also, we recommend familiarizing yourself with the [Milvus architecture](architecture_overview.md) before you begin.
11+
This topic describes how to scale out and scale in a Milvus cluster. We assume that you have already [installed a Milvus cluster](install_cluster-helm.md) before scaling. Also, we recommend familiarizing yourself with the [Milvus architecture](architecture_overview.md) before you begin.
1312

1413
This tutorial takes scaling out three query nodes as an example. To scale out other types of nodes, replace `queryNode` with the corresponding node type in the command line.
1514

@@ -23,8 +22,9 @@ For information on how to scale a cluster with Milvus Operator, refer to [Scale
2322

2423
Horizontal scaling includes scaling out and scaling in.
2524

26-
### Scaling out
27-
Scaling out refers to increasing the number of nodes in a cluster. Unlike scaling up, scaling out does not require you to allocate more resources to one node in the cluster. Instead, scaling out expands the cluster horizontally by adding more nodes.
25+
### Scaling out
26+
27+
Scaling out refers to increasing the number of nodes in a cluster. Unlike scaling up, scaling out does not require you to allocate more resources to one node in the cluster. Instead, scaling out expands the cluster horizontally by adding more nodes.
2828

2929
![Scaleout](../../../assets/scale_out.jpg "Scaleout illustration.")
3030

@@ -33,15 +33,17 @@ Scaling out refers to increasing the number of nodes in a cluster. Unlike scalin
3333
According to the [Milvus architecture](architecture_overview.md), stateless worker nodes include query node, data node, index node, and proxy. Therefore, you can scale out these types of nodes to suit your business needs and application scenarios. You can either scale out the Milvus cluster manually or automatically.
3434

3535
Generally, you will need to scale out the Milvus cluster you created if it is over-utilized. Below are some typical situations where you may need to scale out the Milvus cluster:
36+
3637
- The CPU and memory utilization is high for a period of time.
3738
- The query throughput becomes higher.
3839
- Higher speed for indexing is required.
3940
- Massive volumes of large datasets need to be processed.
4041
- High availability of the Milvus service needs to be ensured.
4142

42-
4343
### Scaling in
44+
4445
Scaling in refers to decreasing the number of nodes in a cluster. Generally, you will need to scale in the Milvus cluster you created if it is under-utilized. Below are some typical situations where you need to scale in the Milvus cluster:
46+
4547
- The CPU and memory utilization is low for a period of time.
4648
- The query throughput becomes lower.
4749
- Higher speed for indexing is not required.
@@ -74,13 +76,12 @@ my-release-minio-5564fbbddc-9sbgv 1/1 Running 0
7476
Milvus only supports adding the worker nodes and does not support adding the coordinator components.
7577
</div>
7678

77-
## Scale a Milvus cluster
79+
## Scale a Milvus cluster
7880

79-
You can scale in your Milvus cluster either manually or automatically. If autoscaling is enabled, the Milvus cluster will shrink or expand automatically when CPU and memory resources consumption reaches the value you have set.
81+
You can scale your Milvus cluster either manually or automatically. For automatic scaling with Horizontal Pod Autoscaling (HPA), see [Configure HPA for Milvus](hpa.md). If autoscaling is enabled, the Milvus cluster shrinks or expands automatically when CPU and memory consumption reaches the thresholds you have set.
8082

8183
Currently, Milvus 2.1.0 only supports scaling in and out manually.
8284

83-
8485
#### Scaling out
8586

8687
Run `helm upgrade my-release milvus/milvus --set queryNode.replicas=3 --reuse-values` to manually scale out the query node.
@@ -125,17 +126,16 @@ my-release-milvus-rootcoord-75585dc57b-cjh87 1/1 Running 0 2m
125126
my-release-minio-5564fbbddc-9sbgv 1/1 Running 0 2m
126127
```
127128

128-
129129
## What's next
130130

131131
- If you want to learn how to monitor the Milvus services and create alerts:
132-
- Learn [Monitor Milvus with Prometheus Operator on Kubernetes](monitor.md)
133132

133+
- Learn [Monitor Milvus with Prometheus Operator on Kubernetes](monitor.md)
134134
- If you are ready to deploy your cluster on clouds:
135+
135136
- Learn how to [Deploy Milvus on Amazon EKS with Terraform](eks.md)
136137
- Learn how to [Deploy Milvus Cluster on GCP with Kubernetes](gcp.md)
137138
- Learn how to [Deploy Milvus on Microsoft Azure With Kubernetes](azure.md)
138-
139139
- If you are looking for instructions on how to allocate resources:
140-
- [Allocate Resources on Kubernetes](allocate.md#standalone)
141140

141+
- [Allocate Resources on Kubernetes](allocate.md#standalone)
