Working with HPAs in Kubernetes

Matt Kornfield
5 min read · Feb 8, 2023

How to make your Kubernetes workloads scale with a few simple steps


What is an HPA?

HPAs (horizontal pod autoscalers) are one of the two ways to scale your services elastically within Kubernetes. When your pods are under enough load, the HPA increases the number of pod replicas. When your pods are underutilized, it scales the replica count back down, freeing up resources within your cluster.
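To make that scale-up and scale-down decision concrete, here is a minimal Python sketch of the replica calculation described in the Kubernetes HPA documentation: the desired replica count is the current count multiplied by the ratio of the observed metric to the target, rounded up. The function name, parameter names, and example numbers are assumptions for illustration. The 10% tolerance reflects the documented default, and the real controller also handles cases (unready pods, missing metrics, stabilization windows) that are left out here.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 1,
    max_replicas: int = 10,
    tolerance: float = 0.1,
) -> int:
    """Approximate the HPA replica calculation (illustrative sketch only)."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")

    # Ratio of observed usage to the target, e.g. 90% CPU vs. a 50% target.
    ratio = current_metric / target_metric

    # Within the tolerance band, the replica count is left unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas

    # Scale proportionally and round up, then clamp to the configured bounds.
    wanted = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, wanted))


if __name__ == "__main__":
    print(desired_replicas(4, current_metric=90, target_metric=50))  # scale up
    print(desired_replicas(4, current_metric=20, target_metric=50))  # scale down
    print(desired_replicas(4, current_metric=52, target_metric=50))  # no change
```

With four replicas averaging 90% CPU against a 50% target, the sketch asks for eight replicas. At 20% it drops to two, and at 52% it stays at four because the ratio falls inside the tolerance band.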

The other way to scale, a VPA (vertical pod autoscaler), allocates more resources, such as CPU and memory, to your existing pods. We’ll ignore that for this deep dive.

Remember:

  • Horizontal -> More replicas of the same thing
  • Vertical -> A larger amount of resources for one thing

So how do we tell the HPA to scale up? Well, first we’ll need to have metrics about the pods themselves. That means we need:

The Metrics Server

The most basic way to get resource metrics (CPU and memory usage) exported in a Kubernetes cluster is to run the metrics server. It collects metrics at intervals of 15 seconds at the shortest, so there’s a bit of a delay, but it’s good enough for our HPA needs.

The installation is very simple if you have cluster-admin privileges.
