Vertical Pod Autoscaler
With applications running on Kubernetes, it’s important to set the CPU and memory resources properly. But it can be hard to find and choose the right values for cpu and memory.
A resource request is a contract between your workload and the Kubernetes scheduler.
Monitoring tools on your Kubernetes clusters can give you insights into the cpu and memory usage of your application. Complementary to this, running kubectl top pods can help too. In the meantime, you can run load tests that simulate heavier usage of your applications: the closer to what you would see in production under high traffic, the better.
I recently discovered that the Vertical Pod Autoscaler (VPA) is better at finding these values, since that’s its job.
Setting resource requests and limits is hard, and VPA is here to help: it observes usage, recommends resources, and updates resources (if in Auto mode).
Let’s now see in action how easy it is to leverage VPA on a GKE cluster.
First enable VPA:
# For a new cluster (CLUSTER_NAME is a placeholder for your cluster's name)
gcloud container clusters create CLUSTER_NAME --enable-vertical-pod-autoscaling
# For an existing cluster
gcloud container clusters update CLUSTER_NAME --enable-vertical-pod-autoscaling
Then deploy a VPA resource for your specific application:
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: myblog
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: myblog
  updatePolicy:
    updateMode: "Off"
And finally, after some time and load tests with your applications, you can see the recommendations by running kubectl describe vpa myblog:
...
  Recommendation:
    Container Recommendations:
      Container Name:  myblog
      Lower Bound:
        Cpu:     3m
        Memory:  4194304
      Target:
        Cpu:     6m
        Memory:  5242880
      Uncapped Target:
        Cpu:     6m
        Memory:  5242880
      Upper Bound:
        Cpu:     6m
        Memory:  5242880
...
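The memory figures in that output are plain byte counts, which are easier to read once converted to the binary suffixes (Ki, Mi, Gi) commonly used in manifests. Here is a minimal Python sketch of that conversion. The helper name bytes_to_quantity is my own, for illustration only; it is not part of VPA or kubectl.

```python
# Hypothetical helper (not part of any Kubernetes tooling): turn a byte
# count into a Kubernetes-style quantity with a binary suffix.
def bytes_to_quantity(num_bytes: int) -> str:
    # Try the largest suffix first and only use one that divides evenly,
    # so the result is exactly equal to the input.
    for suffix, factor in (("Gi", 1024**3), ("Mi", 1024**2), ("Ki", 1024)):
        if num_bytes >= factor and num_bytes % factor == 0:
            return f"{num_bytes // factor}{suffix}"
    # Fall back to the raw byte count, which Kubernetes also accepts.
    return str(num_bytes)


print(bytes_to_quantity(4194304))  # Lower Bound memory from the output above
print(bytes_to_quantity(5242880))  # Target memory from the output above
```

With the numbers from my recommendation, this prints 4Mi and 5Mi.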
In this illustration, I’m using updateMode: "Off", so I need to manually apply those numbers to my Deployment manifest. Lower Bound could be used to set the requests numbers (but you may want to use Target to be more conservative). Upper Bound could be used to set the limits numbers. Memory is reported in bytes: 4194304 bytes is 4Mi and 5242880 bytes is 5Mi. So with the numbers above, here is what the Deployment manifest will look like:
...
resources:
  requests:
    cpu: 3m
    memory: 4Mi
  limits:
    cpu: 6m
    memory: 5Mi
...
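If you have more than a couple of workloads, the mapping from Lower Bound to requests and from Upper Bound to limits can be scripted. The following is only a sketch under a few assumptions: the recommendation was fetched with kubectl get vpa myblog -o json, its status follows the recommendation.containerRecommendations layout of the autoscaling.k8s.io/v1 API, and the function name resources_for plus the hard-coded sample are mine, for illustration.

```python
import json

# Sample shaped like the "status" field of `kubectl get vpa myblog -o json`
# (assumed layout; values copied from the recommendation shown earlier).
SAMPLE_STATUS = json.loads("""
{
  "recommendation": {
    "containerRecommendations": [
      {
        "containerName": "myblog",
        "lowerBound": {"cpu": "3m", "memory": "4194304"},
        "target": {"cpu": "6m", "memory": "5242880"},
        "uncappedTarget": {"cpu": "6m", "memory": "5242880"},
        "upperBound": {"cpu": "6m", "memory": "5242880"}
      }
    ]
  }
}
""")


def resources_for(status: dict, container: str,
                  requests_from: str = "lowerBound") -> dict:
    """Build a `resources` block: requests from one bound, limits from upperBound."""
    for rec in status["recommendation"]["containerRecommendations"]:
        if rec["containerName"] == container:
            return {
                "requests": dict(rec[requests_from]),
                "limits": dict(rec["upperBound"]),
            }
    raise KeyError(f"no recommendation for container {container!r}")


print(resources_for(SAMPLE_STATUS, "myblog"))
```

Passing requests_from="target" gives the more conservative requests. The memory values stay in bytes here, which Kubernetes accepts as-is.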
Note: those numbers can be applied automatically and continuously if you use updateMode: "Auto" instead. Furthermore, there are some known limitations with VPA to be aware of.
Complementary and further resources:
- VPA on GKE
- Autoscaling with GKE: Overview and pods
- Best practices for running cost-optimized Kubernetes applications on GKE
- Kubernetes Autoscaling 101: Cluster Autoscaler, Horizontal Autoscaler, and Vertical Pod Autoscaler
Hope you enjoyed this blog article and that you are now better equipped to properly set Kubernetes resource requests and limits for your own applications.
Cheers! ;)