Working with HPAs in Kubernetes
How to make your Kubernetes workloads scale with a few simple steps
What is an HPA?
HPAs (horizontal pod autoscalers) are one of the two ways to scale your services elastically within Kubernetes. When your pods are under enough load, the HPA scales up the number of pods in use. When your pods are underutilized, it scales back down, freeing up resources within your cluster.
The other way to scale, a VPA (vertical pod autoscaler), instead gives each pod more CPU and memory. We’ll ignore that for this deep dive.
Remember:
- Horizontal -> More replicas of the same thing
- Vertical -> A larger amount of resources for one thing
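To make the horizontal case concrete, here is a minimal Python sketch of the replica calculation described in the Kubernetes HPA documentation: desired replicas = ceil(current replicas × current metric / target metric), with no change when the ratio is within a tolerance (0.1 by default). The function name and the example numbers are made up for illustration, and the sketch leaves out the min/max replica bounds and stabilization behavior that a real HPA also applies.

```python
import math


def desired_replicas(current_replicas: int,
                     current_metric: float,
                     target_metric: float,
                     tolerance: float = 0.1) -> int:
    """Illustrative version of the documented HPA scaling rule."""
    ratio = current_metric / target_metric
    # If usage is close enough to the target, leave the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    # Otherwise scale proportionally and round up to a whole pod.
    return math.ceil(current_replicas * current_metric / target_metric)


# 4 pods averaging 90% CPU against a 60% target -> scale up
print(desired_replicas(4, 90, 60))  # 6
# 4 pods averaging 20% CPU against a 60% target -> scale down
print(desired_replicas(4, 20, 60))  # 2
# 4 pods averaging 62% CPU against a 60% target -> within tolerance
print(desired_replicas(4, 62, 60))  # 4
```

The rounding up means the HPA errs on the side of one pod too many rather than one too few.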
So how do we tell the HPA to scale up? Well, first we’ll need metrics about the pods themselves, which means we need something to collect them.
The Metrics Server
The most basic way to get resource metrics exported in a Kubernetes cluster is to run the Metrics Server. It collects metrics no more often than every 15 seconds, so there’s a bit of a delay, but it’s good enough for our HPA needs.
The installation is very simple if you have cluster-admin privileges.