Vertical Pod Autoscaler

For applications running on Kubernetes, it’s important to set CPU and memory resources properly. But finding the right values can be hard.

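A resource request is a contract between your workload and the Kubernetes scheduler: the scheduler only places a pod on a node with enough unreserved CPU and memory to cover its requests.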

Monitoring tools can give you insight into the CPU and memory usage of your applications, and kubectl top pods complements them. Meanwhile, you can run load tests to simulate heavier usage; the closer they are to high-traffic Production conditions, the better.
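
As a minimal sketch of the kubectl top pods step, the helper below prints per-container usage for one namespace. The function name show_usage and the "default" fallback namespace are assumptions made for illustration, and the command only returns data if a metrics source such as metrics-server is running in the cluster.

```shell
# Hypothetical helper: print current CPU and memory usage per container
# for every pod in a namespace (falls back to "default" if none is given).
show_usage() {
  kubectl top pods --containers --namespace "${1:-default}"
}
```

Calling show_usage with a namespace name while a load test is running gives a rough idea of peak usage.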

I recently discovered that Vertical Pod Autoscaler (VPA) does this better, since that’s its job.

Setting resource requests and limits is hard, and VPA is here to help: it observes usage, recommends resources, and updates resources (in Auto mode).

Let’s now see in action how easy it is to leverage VPA on a GKE cluster.

First enable VPA:

# For a new cluster (replace CLUSTER_NAME with your cluster's name)
gcloud container clusters create CLUSTER_NAME --enable-vertical-pod-autoscaling

# For an existing cluster
gcloud container clusters update CLUSTER_NAME --enable-vertical-pod-autoscaling

Then deploy a VPA resource for your specific application:

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: myblog
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: myblog
  updatePolicy:
    updateMode: "Off"
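
One possible way to deploy this manifest is sketched here. It assumes the manifest has been saved to a file named vpa.yaml (a filename chosen for this example), and the helper name apply_vpa is hypothetical too.

```shell
# Hypothetical helper: apply the VPA manifest, then list the VPA object
# to confirm it was created. "vpa.yaml" is an assumed filename.
apply_vpa() {
  kubectl apply -f "${1:-vpa.yaml}" &&
    kubectl get vpa myblog
}
```

Running apply_vpa from the directory holding the file creates the VPA object; recommendations usually take a while to show up.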

Finally, after some time and some load tests against your application, you can see the recommendations by running kubectl describe vpa myblog:

...
Recommendation:
    Container Recommendations:
      Container Name:  myblog
      Lower Bound:
        Cpu:     3m
        Memory:  4194304
      Target:
        Cpu:     6m
        Memory:  5242880
      Uncapped Target:
        Cpu:     6m
        Memory:  5242880
      Upper Bound:
        Cpu:     6m
        Memory:  5242880
...
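
If you only need the Target values, for example in a script, a JSONPath query against the VPA status is one option. This is a sketch: the helper name vpa_target is hypothetical, and it assumes the first container in the recommendation list is the one you care about.

```shell
# Hypothetical helper: print only the Target recommendation for the
# first container of the VPA object whose name is passed as $1.
vpa_target() {
  kubectl get vpa "$1" \
    -o jsonpath='{.status.recommendation.containerRecommendations[0].target}'
}
```

Calling vpa_target myblog prints the CPU and memory targets; the exact output format depends on your kubectl version.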

In this illustration, I’m using updateMode: "Off", so I need to apply those numbers to my Deployment manifest manually. Lower Bound can be used for the requests (but you may want to use Target, the value VPA itself recommends, to be more conservative). Upper Bound can be used for the limits. With the numbers above, and the memory values converted from bytes (4194304 bytes = 4Mi, 5242880 bytes = 5Mi), here is what the Deployment manifest looks like:

...
          resources:
            requests:
              cpu: 3m
              memory: 4Mi
            limits:
              cpu: 6m
              memory: 5Mi
...
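
VPA reports memory in bytes, while the manifest uses Mi. The conversion can be sketched with shell arithmetic; the helper name to_mi is an assumption for this example, and it rounds up so the result never falls below the recommendation.

```shell
# Hypothetical helper: convert a byte count to mebibytes (Mi), rounding up.
# 1Mi = 1048576 bytes.
to_mi() {
  echo "$(( ($1 + 1048575) / 1048576 ))Mi"
}

to_mi 4194304   # Lower Bound memory, prints 4Mi
to_mi 5242880   # Upper Bound memory, prints 5Mi
```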

Note: those numbers can be applied automatically and continuously if you use updateMode: "Auto" instead. There are also some known limitations with VPA to be aware of.
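
Switching an existing VPA object between modes does not require editing the manifest; a merge patch is one way to do it. This is a sketch, and the helper name set_vpa_mode is hypothetical. The --type merge flag is used because VerticalPodAutoscaler is a custom resource, for which kubectl's default strategic merge patch is not supported.

```shell
# Hypothetical helper: set updateMode on a VPA object.
# $1 is the VPA name, $2 is the mode (for example "Auto" or "Off").
set_vpa_mode() {
  kubectl patch vpa "$1" --type merge \
    -p "{\"spec\":{\"updatePolicy\":{\"updateMode\":\"$2\"}}}"
}
```

Calling set_vpa_mode myblog Auto turns automatic updates on, and set_vpa_mode myblog Off returns to recommendation-only mode.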

Complementary and further resources:

Hope you enjoyed this blog article and that you are now better equipped to properly set Kubernetes resource requests and limits for your own applications.

Cheers! ;)