Auto scaling of a group of nodes

For your information

Autoscaling is not available for GPU node groups that do not have drivers installed.

In a Managed Kubernetes cluster on a cloud server, Cluster Autoscaler handles node group autoscaling. It helps use cluster resources efficiently: depending on the load on the cluster, the number of nodes in a group automatically decreases or increases.

You do not need to install Cluster Autoscaler in the cluster.

You can enable node group autoscaling in the control panel, via the Managed Kubernetes API, or through Terraform.

Managed Kubernetes uses Metrics Server to autoscale pods.

Principle of operation

You can set the minimum and maximum number of nodes in a group when you enable autoscaling. Cluster Autoscaler changes the number of nodes only within these limits.
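As a rough illustration of the limits described above, the sketch below clamps a desired node count to the configured range. The function name `clamp_node_count` and its parameters are hypothetical and only model the rule; this is not Cluster Autoscaler code.

```python
def clamp_node_count(desired: int, min_nodes: int, max_nodes: int) -> int:
    """Limit a desired node count to the [min_nodes, max_nodes] range."""
    if min_nodes > max_nodes:
        raise ValueError("min_nodes must not exceed max_nodes")
    return max(min_nodes, min(desired, max_nodes))


# The load would justify 7 nodes, but the group is limited to 2..5 nodes.
print(clamp_node_count(7, 2, 5))  # 5
# A scale-down to 1 node is stopped at the lower limit.
print(clamp_node_count(1, 2, 5))  # 2
```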

If the node group is in the ACTIVE status, Cluster Autoscaler checks every 10 seconds whether there are pods in the PENDING status and analyzes the load: the pods' requests for vCPU, RAM, and GPU. Depending on the results, nodes are added or removed. During this time the node group goes into the PENDING_SCALE_UP or PENDING_SCALE_DOWN status. The cluster status during autoscaling is ACTIVE.

Read more about cluster statuses in the View cluster status instructions.

Adding a node

If there are pods in the PENDING status and the cluster does not have enough free resources to accommodate them, the required number of nodes is added to the cluster. In a cluster with Kubernetes version 1.28 or higher, Cluster Autoscaler works with several groups at once and distributes nodes evenly.

Note

For example, you have two node groups with autoscaling enabled. The load on the cluster has increased and four more nodes are required. Two new nodes will be created in each node group at the same time.

In a cluster with Kubernetes version 1.27 or lower, nodes are added one at a time per check cycle.
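The even distribution across groups on Kubernetes 1.28 and higher can be sketched roughly as follows. The function `distribute_new_nodes` and the group names are hypothetical. How a remainder is handled when the split is uneven is an assumption made here for illustration, and the sketch ignores each group's maximum size.

```python
def distribute_new_nodes(needed: int, groups: list[str]) -> dict[str, int]:
    """Split `needed` new nodes across node groups as evenly as possible."""
    if not groups:
        raise ValueError("at least one node group is required")
    base, extra = divmod(needed, len(groups))
    # Assumption: when the split is uneven, the first `extra` groups get one more node.
    return {name: base + (1 if i < extra else 0) for i, name in enumerate(groups)}


# Two groups with autoscaling enabled, four new nodes required.
print(distribute_new_nodes(4, ["group-a", "group-b"]))
# {'group-a': 2, 'group-b': 2}
```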

Deleting a node

If there are no pods in the PENDING status, Cluster Autoscaler checks the amount of resources that the pods request.

If the total amount of resources requested by the pods on a node is less than 50% of the node's resources, Cluster Autoscaler marks the node as unneeded. If the resource requests on the node do not increase within 10 minutes, Cluster Autoscaler checks whether the pods can be moved to other nodes.
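The 50% check can be sketched as below. The `Resources` type and the `is_underutilized` function are hypothetical. The text does not say how vCPU and RAM are combined, so the sketch assumes each resource is compared separately and both must be below the threshold.

```python
from dataclasses import dataclass


@dataclass
class Resources:
    vcpu: float     # CPU cores
    ram_gib: float  # memory in GiB


def is_underutilized(requests: list[Resources], capacity: Resources,
                     threshold: float = 0.5) -> bool:
    """Return True if the pods' total requests are below `threshold` of node capacity."""
    total_vcpu = sum(r.vcpu for r in requests)
    total_ram = sum(r.ram_gib for r in requests)
    # Assumption: every resource must be below the threshold.
    return (total_vcpu / capacity.vcpu < threshold
            and total_ram / capacity.ram_gib < threshold)


node = Resources(vcpu=4, ram_gib=16)
pods = [Resources(1, 2), Resources(0.5, 4)]  # 1.5 vCPU and 6 GiB requested in total
print(is_underutilized(pods, node))                       # True
print(is_underutilized(pods + [Resources(1, 1)], node))   # False
```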

Cluster Autoscaler will not move the pods, and therefore will not delete the node, if any of the following conditions is met:

  • the pods use a PodDisruptionBudget;
  • pods in the kube-system namespace have no PodDisruptionBudget;
  • the pods were created without a controller such as a Deployment, ReplicaSet, or StatefulSet;
  • the pods use local storage;
  • the other nodes do not have enough resources for the pods' requests;
  • nodeSelector, affinity/anti-affinity rules, or other parameters do not match.

You can allow such pods to be moved by adding the annotation:

cluster-autoscaler.kubernetes.io/safe-to-evict: "true"

If there are no restrictions, the pods are moved and the underloaded nodes are removed. Nodes are removed one at a time per check cycle.
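To show where the safe-to-evict annotation goes, here is a minimal sketch that builds a Pod manifest as a Python dictionary and prints it as JSON, which kubectl accepts alongside YAML. The pod name, container name, and image are hypothetical placeholders. For pods managed by a controller, the annotation belongs in the pod template metadata instead.

```python
import json

pod_manifest = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {
        "name": "example-pod",  # hypothetical name
        "annotations": {
            # The value must be the string "true", not a boolean.
            "cluster-autoscaler.kubernetes.io/safe-to-evict": "true",
        },
    },
    "spec": {
        "containers": [
            {"name": "app", "image": "nginx:stable"},  # hypothetical container
        ],
    },
}

# Save the output to a file and apply it with `kubectl apply -f <file>`.
print(json.dumps(pod_manifest, indent=2))
```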

Recommendations

For optimal performance of Cluster Autoscaler, we recommend that you:

  • make sure that the project has enough vCPU, RAM, GPU, and disk capacity quotas to create the maximum number of nodes in a group;
  • specify resource requests in the pod manifests;
  • check that the nodes in a group have the same configuration and labels;
  • set a PodDisruptionBudget for pods that must not be stopped. This helps avoid downtime when pods are moved between nodes;
  • do not use any other Cluster Autoscaler;
  • do not manually modify node resources through the control panel. Cluster Autoscaler ignores these changes, and all new nodes are created with the original configuration.
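The quota recommendation above comes down to simple arithmetic: multiply the per-node resources by the maximum number of nodes and compare the result with the project quota. The sketch below assumes the quota values are the amounts still free in the project. The function `quota_shortfall`, the resource keys, and the numbers are hypothetical.

```python
def quota_shortfall(max_nodes: int, per_node: dict[str, float],
                    quota: dict[str, float]) -> dict[str, float]:
    """Return, per resource, how much quota is missing for a group at maximum size."""
    shortfall = {}
    for resource, amount in per_node.items():
        required = amount * max_nodes
        available = quota.get(resource, 0)
        if required > available:
            shortfall[resource] = required - available
    return shortfall


per_node = {"vcpu": 4, "ram_gib": 16, "disk_gib": 50}  # one node's configuration
quota = {"vcpu": 16, "ram_gib": 64, "disk_gib": 500}   # free project quota (assumed)

# A maximum of 5 nodes needs 20 vCPU and 80 GiB RAM, more than the quota allows.
print(quota_shortfall(5, per_node, quota))
# {'vcpu': 4, 'ram_gib': 16}
```

An empty result means the quota covers the maximum group size.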

Enable autoscaling

For your information

If you set the minimum number of nodes in a group higher than the current number, the group will not scale up to the lower limit immediately. The node group is scaled only after pods in the PENDING status appear. The same applies to the upper limit: if the current number of nodes is greater than the upper limit, deletion starts only after the pods are checked.

  1. In the control panel, click Products in the top menu and select Managed Kubernetes.
  2. Open the cluster page and go to the Cluster composition tab.
  3. In the node group menu, select Change the number of nodes.
  4. In the Number of nodes field, open the With autoscaling tab.
  5. Set the minimum and maximum number of nodes in the group. The number of nodes will change only within this range. For fault-tolerant operation of system components, we recommend at least two worker nodes in the cluster; the nodes can be in different groups.
  6. Click Save.