Release new docs to master

Milvus-doc-bot · Milvus-doc-bot · commit dfe2fbb47cdd · 2025-03-26T02:39:04.000Z
diff --git a/v2.5.x/site/en/adminGuide/hpa.md b/v2.5.x/site/en/adminGuide/hpa.md
@@ -0,0 +1,129 @@
+---
+id: hpa.md
+related_key: scale Milvus cluster with HPA
+summary: Learn how to configure Horizontal Pod Autoscaling (HPA) to dynamically scale a Milvus cluster.
+title: Configure Horizontal Pod Autoscaling (HPA) for Milvus
+---
+# Configure Horizontal Pod Autoscaling (HPA) for Milvus
+
+## Overview
+
+Horizontal Pod Autoscaling (HPA) is a Kubernetes feature that automatically adjusts the number of Pods in a deployment based on resource utilization, such as CPU or memory. In Milvus, HPA can be applied to stateless components like `proxy`, `queryNode`, `dataNode`, and `indexNode` to dynamically scale the cluster in response to workload changes.
+
+This guide explains how to configure HPA for Milvus components using the Milvus Operator.
+
+## Prerequisites
+
+- A running Milvus cluster deployed with Milvus Operator.
+- Access to `kubectl` for managing Kubernetes resources.
+- Familiarity with Milvus architecture and Kubernetes HPA.
+
+## Configure HPA with Milvus Operator
+
+To enable HPA in a Milvus cluster managed by the Milvus Operator, follow these steps:
+
+1. **Set Replicas to -1**:
+
+   In the Milvus custom resource (CR), set the `replicas` field to `-1` for the component you want to scale with HPA. This delegates scaling control to HPA instead of the operator. You can edit the CR directly or use the following `kubectl patch` command to quickly switch to HPA control:
+
+   ```bash
+   kubectl patch milvus <your-release-name> --type='json' -p='[{"op": "replace", "path": "/spec/components/proxy/replicas", "value": -1}]'
+   ```
+
+   Replace `<your-release-name>` with the name of your Milvus cluster.
+
+   To verify that the change has been applied, run:
+
+   ```bash
+   kubectl get milvus <your-release-name> -o jsonpath='{.spec.components.proxy.replicas}'
+   ```
+
+   The expected output is `-1`, confirming that the `proxy` component is now under HPA control.
+
+   Alternatively, you can define it in the CR YAML:
+
+   ```yaml
+   apiVersion: milvus.io/v1beta1
+   kind: Milvus
+   metadata:
+     name: <your-release-name>
+   spec:
+     mode: cluster
+     components:
+       proxy:
+         replicas: -1
+   ```
+2. **Define an HPA Resource**:
+
+   Create an HPA resource to target the deployment of the desired component. Below is an example for the `proxy` component:
+
+   ```yaml
+   apiVersion: autoscaling/v2
+   kind: HorizontalPodAutoscaler
+   metadata:
+     name: my-release-milvus-proxy-hpa
+   spec:
+     scaleTargetRef:
+       apiVersion: apps/v1
+       kind: Deployment
+       name: my-release-milvus-proxy
+     minReplicas: 2
+     maxReplicas: 10
+     metrics:
+       - type: Resource
+         resource:
+           name: cpu
+           target:
+             type: Utilization
+             averageUtilization: 60
+       - type: Resource
+         resource:
+           name: memory
+           target:
+             type: Utilization
+             averageUtilization: 60
+     behavior:
+       scaleUp:
+         policies:
+           - type: Pods
+             value: 1
+             periodSeconds: 30
+       scaleDown:
+         stabilizationWindowSeconds: 300
+         policies:
+           - type: Pods
+             value: 1
+             periodSeconds: 60
+   ```
+
+   Replace `my-release` in `metadata.name` and `spec.scaleTargetRef.name` with your actual Milvus cluster name, so that the values become `<your-release-name>-milvus-proxy-hpa` and `<your-release-name>-milvus-proxy`. Save the manifest as `hpa.yaml`.
+3. **Apply the HPA Configuration**:
+
+   Deploy the HPA resource using the following command:
+
+   ```bash
+   kubectl apply -f hpa.yaml
+   ```
+
+   To verify that the HPA has been successfully created, run:
+
+   ```bash
+   kubectl get hpa
+   ```
+
+   You should see output similar to:
+
+   ```
+   NAME                          REFERENCE                            TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
+   my-release-milvus-proxy-hpa   Deployment/my-release-milvus-proxy   <some>/60%      2         10        2          <time>
+   ```
+
+   The `NAME` and `REFERENCE` fields will reflect your cluster name (e.g., `<your-release-name>-milvus-proxy-hpa` and `Deployment/<your-release-name>-milvus-proxy`).
+
+- `scaleTargetRef`: Specifies the deployment to scale (e.g., `my-release-milvus-proxy`).
+- `minReplicas` and `maxReplicas`: Set the scaling range (2 to 10 Pods in this example).
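+- `metrics`: Configures scaling based on CPU and memory utilization, targeting 60% average usage. Utilization is calculated relative to the Pods' resource requests, so the target component must define CPU and memory requests.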
+
+## Conclusion
+
+HPA allows Milvus to efficiently adapt to varying workloads. By using the `kubectl patch` command, you can quickly switch a component to HPA control without manually editing the full CR. For more details, refer to the [Kubernetes HPA documentation](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/).
diff --git a/v2.5.x/site/en/adminGuide/scaleout.md b/v2.5.x/site/en/adminGuide/scaleout.md
@@ -4,12 +4,11 @@ related_key: scale Milvus cluster
 summary: Learn how to manually or automatically scale out and scale in a Milvus cluster.
 title: Scale a Milvus Cluster
 ---
-
 # Scale a Milvus Cluster
 
-Milvus supports horizontal scaling of its components. This means you can either increase or decrease  the number of worker nodes of each type according to your own need. 
+Milvus supports horizontal scaling of its components. This means you can either increase or decrease the number of worker nodes of each type according to your needs.
 
-This topic describes how to scale out and scale in a Milvus cluster. We assume that you have already [installed a Milvus cluster](install_cluster-helm.md) before scaling. Also, we recommend familiarizing yourself with the [Milvus architecture](architecture_overview.md) before you begin.  
+This topic describes how to scale out and scale in a Milvus cluster. We assume that you have already [installed a Milvus cluster](install_cluster-helm.md) before scaling. Also, we recommend familiarizing yourself with the [Milvus architecture](architecture_overview.md) before you begin.
 
 This tutorial takes scaling out three query nodes as an example. To scale out other types of nodes, replace `queryNode` with the corresponding node type in the command line.
 
@@ -23,8 +22,9 @@ For information on how to scale a cluster with Milvus Operator, refer to [Scale
 
 Horizontal scaling includes scaling out and scaling in.
 
-### Scaling out 
-Scaling out refers to increasing the number of nodes in a cluster. Unlike scaling up, scaling out does not require you to allocate more resources to one node in the cluster. Instead, scaling out expands the cluster horizontally by adding more nodes. 
+### Scaling out
+
+Scaling out refers to increasing the number of nodes in a cluster. Unlike scaling up, scaling out does not require you to allocate more resources to one node in the cluster. Instead, scaling out expands the cluster horizontally by adding more nodes.
 
 ![Scaleout](../../../assets/scale_out.jpg "Scaleout illustration.")
 
@@ -33,15 +33,17 @@ Scaling out refers to increasing the number of nodes in a cluster. Unlike scalin
 According to the [Milvus architecture](architecture_overview.md), stateless worker nodes include query node, data node, index node, and proxy. Therefore, you can scale out these type of nodes to suit your business needs and application scenarios. You can either scale out the Milvus cluster manually or automatically.
 
 Generally, you will need to scale out the Milvus cluster you created if it is over-utilized. Below are some typical situations where you may need to scale out the Milvus cluster:
+
 - The CPU and memory utilization is high for a period of time.
 - The query throughput becomes higher.
 - Higher speed for indexing is required.
 - Massive volumes of large datasets need to be processed.
 - High availability of the Milvus service needs to be ensured.
 
-
 ### Scaling in
+
 Scaling in refers to decreasing the number of nodes in a cluster. Generally, you will need to scale in the Milvus cluster you created if it is under-utilized. Below are some typical situations where you need to scale in the Milvus cluster:
+
 - The CPU and memory utilization is low for a period of time.
 - The query throughput becomes lower.
 - Higher speed for indexing is not required.
@@ -74,13 +76,12 @@ my-release-minio-5564fbbddc-9sbgv               1/1     Running      0
 Milvus only supports adding the worker nodes and does not support adding the coordinator components.
 </div>
 
-## Scale a Milvus cluster 
+## Scale a Milvus cluster
 
-You can scale in your Milvus cluster either manually or automatically. If autoscaling is enabled, the Milvus cluster will shrink or expand automatically when CPU and memory resources consumption reaches the value you have set. 
+You can scale your Milvus cluster either manually or automatically. For automatic scaling with Horizontal Pod Autoscaling (HPA), see [Configure HPA for Milvus](hpa.md). If autoscaling is enabled, the Milvus cluster shrinks or expands automatically when CPU and memory resource consumption reaches the threshold you have set.
 
 Currently, Milvus 2.1.0 only supports scaling in and out manually.
 
-
 #### Scaling out
 
 Run `helm upgrade my-release milvus/milvus --set queryNode.replicas=3 --reuse-values` to manually scale out the query node.
@@ -125,17 +126,16 @@ my-release-milvus-rootcoord-75585dc57b-cjh87    1/1     Running   0          2m
 my-release-minio-5564fbbddc-9sbgv               1/1     Running   0          2m
 ```
 
-
 ## What's next
 
 - If you want to learn how to monitor the Milvus services and create alerts:
-  - Learn [Monitor Milvus with Prometheus Operator on Kubernetes](monitor.md)
 
+  - Learn [Monitor Milvus with Prometheus Operator on Kubernetes](monitor.md)
 - If you are ready to deploy your cluster on clouds:
+
   - Learn how to [Deploy Milvus on Amazon EKS with Terraform](eks.md)
   - Learn how to [Deploy Milvus Cluster on GCP with Kubernetes](gcp.md)
   - Learn how to [Deploy Milvus on Microsoft Azure With Kubernetes](azure.md)
-
 - If you are looking for instructions on how to allocate resources:
-  - [Allocate Resources on Kubernetes](allocate.md#standalone)
 
+  - [Allocate Resources on Kubernetes](allocate.md#standalone)