Autoscaling in Kubernetes
Editor’s note: this post is part of a
Customers using Kubernetes respond to end user requests quickly and ship software faster than ever before. But what happens when you build a service that is even more popular than you planned for, and run out of compute? In
To better understand where autoscaling provides the most value, let’s start with an example. Imagine you have a 24/7 production service whose load varies over time: it is very busy during the day in the US and relatively quiet at night. Ideally, we would want the number of nodes in the cluster and the number of pods in the deployment to adjust dynamically to the load to meet end user demand. The new Cluster Autoscaling feature together with Horizontal Pod Autoscaler can handle this for you automatically. The following instructions apply to GCE. For GKE please check the autoscaling section in cluster operations manual available
Before we begin, we need to have an active GCE project with Google Cloud Monitoring, Google Cloud Logging and Stackdriver enabled. For more information on project creation, please read our
First, we set up a cluster with Cluster Autoscaler turned on. The number of nodes in the cluster will start at 2 and autoscale up to a maximum of 5. To implement this, we export the environment variables NUM_NODES, KUBE_AUTOSCALER_MIN_NODES, KUBE_AUTOSCALER_MAX_NODES and KUBE_ENABLE_CLUSTER_AUTOSCALER, then start the cluster by running ./cluster/kube-up.sh.
The kube-up.sh script creates a cluster together with the Cluster Autoscaler add-on. The autoscaler will try to add new nodes to the cluster if there are pending pods that could be scheduled on a new node. Listing the nodes with kubectl get nodes should show that our cluster has two nodes.
To demonstrate autoscaling we will use a custom docker image based on the php-apache server. The image can be found
First, we’ll start a deployment running the image and expose it as a service with kubectl run. Then we wait some time and verify that both the deployment and the service were correctly created and are running. We can now check that the php-apache server works correctly by calling wget with the service's address from inside the cluster.
Now that the deployment is running, we will create a Horizontal Pod Autoscaler for it using the kubectl autoscale command. This defines a Horizontal Pod Autoscaler that maintains between 1 and 10 replicas of the Pods controlled by the php-apache deployment we created in the first step of these instructions. Roughly speaking, the horizontal autoscaler will increase and decrease the number of replicas (via the deployment) so as to maintain an average CPU utilization across all Pods of 50% (since each pod requests 500 milli-cores by
We can check the current status of the autoscaler with kubectl get hpa. Note that the current CPU consumption is 0% because we are not sending any requests to the server (the CURRENT column shows the average across all the pods controlled by the corresponding deployment).
Now we will see how our autoscalers (Cluster Autoscaler and Horizontal Pod Autoscaler) react to increased load on the server. We will start two infinite loops of queries to our server (please run them in different terminals). We need to wait a moment (about one minute) for the stats to propagate; afterwards, we examine the status of the Horizontal Pod Autoscaler again.
Horizontal Pod Autoscaler has increased the number of pods in our deployment to 7. Let’s now check whether all the pods are running. As we can see, some pods are pending. Let’s describe one of the pending pods to find the reason for its pending state. The pod is pending because there was no CPU available in the system for it. We see there’s a TriggeredScaleUp event connected with the pod. It means that the pod triggered a reaction from Cluster Autoscaler and a new node will be added to the cluster. Now we’ll wait for the reaction (about 3 minutes) and list all nodes. As we see, a new node, kubernetes-minion-group-6z5i, was added by Cluster Autoscaler. Let’s verify that all pods are now running. After the node addition, all php-apache pods are running!
We will finish our example by stopping the user load. We’ll terminate both infinite while loops sending requests to the server and verify the resulting state. As we see, in the presented case CPU utilization dropped to 0, and the number of replicas dropped to 1.
After the pods are deleted, most of the cluster resources are unused. Scaling the cluster down may take more time than scaling up because Cluster Autoscaler makes sure that the node is really not needed, so that short periods of inactivity (due to pod upgrade etc.) won’t trigger node deletion (see
The number of nodes in our cluster is now two again, as node kubernetes-minion-group-6z5i was removed by Cluster Autoscaler.
As we have shown, it is very easy to dynamically adjust the number of pods to the load using a combination of Horizontal Pod Autoscaler and Cluster Autoscaler. However, Cluster Autoscaler alone can also be quite helpful whenever there are irregularities in the cluster load. For example, clusters used for development or continuous integration tests may be needed less on weekends or at night. Batch processing clusters may have periods when all jobs are finished and new ones will only start in a couple of hours. Having machines that do nothing is a waste of money. In all of these cases Cluster Autoscaler can reduce the number of unused nodes and yield significant savings, because you only pay for the nodes that you actually need to run your pods. It also makes sure that you always have enough compute power to run your tasks.
-- Jerzy Szczepkowski and Marcin Wielgus, Software Engineers, Google
Benefits of Autoscaling
Setting Up Autoscaling on GCE
export NUM_NODES=2
export KUBE_AUTOSCALER_MIN_NODES=2
export KUBE_AUTOSCALER_MAX_NODES=5
export KUBE_ENABLE_CLUSTER_AUTOSCALER=true
./cluster/kube-up.sh
$ kubectl get nodes
NAME STATUS AGE
kubernetes-master Ready,SchedulingDisabled 2m
kubernetes-minion-group-de5q Ready 2m
kubernetes-minion-group-yhdx Ready 1m
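Before running kube-up.sh, it may help to sanity-check that the initial node count lies within the autoscaler bounds. The sketch below is a hypothetical pre-flight helper (check_autoscaler_bounds is not part of the Kubernetes scripts); it only assumes that the minimum should not exceed the maximum and that the starting size should fall between them, using the values from this walkthrough.

```shell
# Hypothetical pre-flight check (not part of kube-up.sh): verifies that
# MIN <= NUM_NODES <= MAX before the cluster is created.
check_autoscaler_bounds() {
  local num="$1" min="$2" max="$3"
  if [ "$min" -gt "$max" ]; then
    echo "error: min nodes ($min) exceeds max nodes ($max)" >&2
    return 1
  fi
  if [ "$num" -lt "$min" ] || [ "$num" -gt "$max" ]; then
    echo "error: initial node count ($num) is outside [$min, $max]" >&2
    return 1
  fi
  echo "ok: start at $num nodes, autoscale between $min and $max"
}

# Values from the walkthrough: start at 2 nodes, scale between 2 and 5.
check_autoscaler_bounds 2 2 5
```

With the exported variables in place, the same check could be called as check_autoscaler_bounds "$NUM_NODES" "$KUBE_AUTOSCALER_MIN_NODES" "$KUBE_AUTOSCALER_MAX_NODES".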
Run & Expose PHP-Apache Server
$ kubectl run php-apache \
--image=gcr.io/google_containers/hpa-example \
--requests=cpu=500m,memory=500M --expose --port=80
service "php-apache" created
deployment "php-apache" created
$ kubectl get deployment
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
php-apache 1 1 1 1 49s
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
php-apache-2046965998-z65jn 1/1 Running 0 30s
$ kubectl run -i --tty service-test --image=busybox /bin/sh
Hit enter for command prompt
$ wget -q -O- http://php-apache.default.svc.cluster.local
OK!
Starting Horizontal Pod Autoscaler
$ kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10
$ kubectl get hpa
NAME REFERENCE TARGET CURRENT MINPODS MAXPODS AGE
php-apache Deployment/php-apache/scale 50% 0% 1 10 14s
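The same autoscaler can also be described declaratively. The sketch below writes a manifest intended to mirror the kubectl autoscale command used in this section; the file name php-apache-hpa.yaml is arbitrary, and the scaleTargetRef apiVersion (apps/v1) is an assumption that fits current Kubernetes releases and may need adjusting for older clusters.

```shell
# Write a manifest intended to mirror:
#   kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10
# Assumption: Deployments are served from apps/v1 on the target cluster;
# older clusters exposed them under a different API group.
cat > php-apache-hpa.yaml <<'EOF'
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 50
EOF

# To create the autoscaler from the file (requires cluster access):
#   kubectl create -f php-apache-hpa.yaml
```

Keeping the autoscaler in a file makes the target utilization and replica bounds reviewable alongside the rest of the deployment configuration.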
Raising the Load
$ kubectl run -i --tty load-generator --image=busybox /bin/sh
Hit enter for command prompt
$ while true; do wget -q -O- http://php-apache.default.svc.cluster.local; done
$ kubectl get hpa
NAME REFERENCE TARGET CURRENT MINPODS MAXPODS AGE
php-apache Deployment/php-apache/scale 50% 310% 1 10 2m
$ kubectl get deployment php-apache
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
php-apache 7 7 7 3 4m
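The jump to seven replicas is consistent with the proportional rule described in the Kubernetes Horizontal Pod Autoscaler documentation: desired replicas = ceil(current replicas * current utilization / target utilization), clamped to the configured minimum and maximum. With one replica at 310% against a 50% target, this gives ceil(6.2) = 7. Since each pod requests 500 milli-cores, the 50% target corresponds to an average usage of 250 milli-cores per pod. The function below is a simplified sketch of that arithmetic only; desired_replicas is a hypothetical name, and the real controller also applies tolerances and delays that are ignored here.

```shell
# desired_replicas CURRENT_REPLICAS CURRENT_UTIL_PCT TARGET_UTIL_PCT MIN MAX
# Prints ceil(current * currentUtil / targetUtil), clamped to [MIN, MAX].
# Simplified sketch: ignores the tolerance and delays of the real controller.
desired_replicas() {
  local current="$1" util="$2" target="$3" min="$4" max="$5"
  # Integer ceiling division: (a + b - 1) / b
  local desired=$(( (current * util + target - 1) / target ))
  if [ "$desired" -lt "$min" ]; then desired="$min"; fi
  if [ "$desired" -gt "$max" ]; then desired="$max"; fi
  echo "$desired"
}

desired_replicas 1 310 50 1 10   # load raised: prints 7
desired_replicas 7 0 50 1 10     # load stopped: prints 1
```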
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
php-apache-2046965998-3ewo6 0/1 Pending 0 1m
php-apache-2046965998-8m03k 1/1 Running 0 1m
php-apache-2046965998-ddpgp 1/1 Running 0 5m
php-apache-2046965998-lrik6 1/1 Running 0 1m
php-apache-2046965998-nj465 0/1 Pending 0 1m
php-apache-2046965998-tmwg1 1/1 Running 0 1m
php-apache-2046965998-xkbw1 0/1 Pending 0 1m
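With more pods than fit on one screen, a per-status summary can be easier to read than the raw listing. The sketch below is a hypothetical helper (count_pod_status is not a kubectl feature) that assumes the default column layout shown above, where STATUS is the third column; the sample input is copied from the listing.

```shell
# count_pod_status: reads `kubectl get pods` style output on stdin and
# prints "<STATUS> <count>" for each status, sorted by status name.
# Assumes STATUS is the third whitespace-separated column.
count_pod_status() {
  awk '$1 != "NAME" && NF >= 3 { n[$3]++ } END { for (s in n) print s, n[s] }' | sort
}

# Sample input copied from the listing above.
count_pod_status <<'EOF'
NAME READY STATUS RESTARTS AGE
php-apache-2046965998-3ewo6 0/1 Pending 0 1m
php-apache-2046965998-8m03k 1/1 Running 0 1m
php-apache-2046965998-ddpgp 1/1 Running 0 5m
php-apache-2046965998-lrik6 1/1 Running 0 1m
php-apache-2046965998-nj465 0/1 Pending 0 1m
php-apache-2046965998-tmwg1 1/1 Running 0 1m
php-apache-2046965998-xkbw1 0/1 Pending 0 1m
EOF
# prints:
#   Pending 3
#   Running 4
```

Against a live cluster the helper would be fed directly, for example kubectl get pods | count_pod_status.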
$ kubectl describe pod php-apache-2046965998-3ewo6
Name: php-apache-2046965998-3ewo6
Namespace: default
...
Events:
FirstSeen From SubobjectPath Type Reason Message
1m {default-scheduler } Warning FailedScheduling pod (php-apache-2046965998-3ewo6) failed to fit in any node
fit failure on node (kubernetes-minion-group-yhdx): Insufficient CPU
fit failure on node (kubernetes-minion-group-de5q): Insufficient CPU
1m {cluster-autoscaler } Normal TriggeredScaleUp pod triggered scale-up, mig: kubernetes-minion-group, sizes (current/new): 2/3
$ kubectl get nodes
NAME STATUS AGE
kubernetes-master Ready,SchedulingDisabled 9m
kubernetes-minion-group-6z5i Ready 43s
kubernetes-minion-group-de5q Ready 9m
kubernetes-minion-group-yhdx Ready 9m
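Rather than waiting a fixed three minutes, the wait for the new node could be scripted as a polling loop. The sketch below is one possible shape for it: wait_until and three_workers_ready are hypothetical helpers, and the readiness check assumes the kubectl get nodes column layout shown above, where worker nodes report a STATUS of exactly Ready.

```shell
# wait_until TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND until it succeeds or roughly TIMEOUT_SECONDS have elapsed.
wait_until() {
  local timeout="$1" interval="$2"
  shift 2
  local waited=0
  until "$@"; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "timed out after ${waited}s" >&2
      return 1
    fi
    sleep "$interval"
    waited=$(( waited + interval ))
  done
}

# Hypothetical check: succeeds once at least three worker nodes are Ready.
# The master reports "Ready,SchedulingDisabled" and is therefore not counted.
three_workers_ready() {
  local ready
  ready="$(kubectl get nodes | awk '$2 == "Ready" { n++ } END { print n + 0 }')"
  [ "$ready" -ge 3 ]
}

# Example (requires cluster access): poll every 10 seconds for up to 5 minutes.
#   wait_until 300 10 three_workers_ready && kubectl get nodes
```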
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
php-apache-2046965998-3ewo6 1/1 Running 0 3m
php-apache-2046965998-8m03k 1/1 Running 0 3m
php-apache-2046965998-ddpgp 1/1 Running 0 7m
php-apache-2046965998-lrik6 1/1 Running 0 3m
php-apache-2046965998-nj465 1/1 Running 0 3m
php-apache-2046965998-tmwg1 1/1 Running 0 3m
php-apache-2046965998-xkbw1 1/1 Running 0 3m
Stop Load
$ kubectl get hpa
NAME REFERENCE TARGET CURRENT MINPODS MAXPODS AGE
php-apache Deployment/php-apache/scale 50% 0% 1 10 16m
$ kubectl get deployment php-apache
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
php-apache 1 1 1 1 14m
$ kubectl get nodes
NAME STATUS AGE
kubernetes-master Ready,SchedulingDisabled 37m
kubernetes-minion-group-de5q Ready 36m
kubernetes-minion-group-yhdx Ready 36m
Other use cases