Autoscaling policy.
| Field | Type | Required | Description |
|---|---|---|---|
cooldown_period |
Optional[int] | ➖ | Determines how long the endpoint waits before scaling down after the last request. |
max_replica |
Optional[int] | ➖ | The maximum replicas that the endpoint can scale up to. The maximum value is 10. |
min_replica |
Optional[int] | ➖ | Setting minReplica to 0 allows the endpoint to sleep when idle, reducing costs. The minimum value is 0. |