-- Living Mobile --: Kubernetes - HPA autoscaler and replica set

Monday, October 6, 2025

In the diagram below, how does the HPA manage the Deployment?

1. What HPA Does

The HPA watches the Deployment (or sometimes a StatefulSet, ReplicaSet, etc.) and:

Monitors metrics like CPU utilization, memory usage, or custom metrics.

Adjusts the .spec.replicas field in the Deployment automatically to keep those metrics within target thresholds.

2. How the Connection Works

Here’s the sequence:

You create a Deployment (e.g., dev-app) with an initial replica count (say 2).

You create an HPA resource that targets the Deployment by name:

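apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: dev-app-hpa   # placeholder name; the original does not give one
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: dev-app     # the Deployment created in the previous step
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu       # the walkthrough scales on CPU
      target:
        type: Utilization
        averageUtilization: 60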

The HPA controller, which runs inside the control plane's kube-controller-manager, periodically (every 15 seconds by default) checks:

The current CPU usage of pods managed by dev-app.

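If average CPU utilization across the pods exceeds 60% of their CPU requests, the HPA increases .spec.replicas in the Deployment (e.g., from 2 → 4). This requires the pods to declare CPU requests.

If usage drops, it scales down again (e.g., 4 → 2), but never below minReplicas, and by default only after a 5-minute stabilization window.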

The Deployment controller then updates its ReplicaSet, which creates or deletes pods accordingly.
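The scaling decision in the steps above can be sketched in a few lines of Python. This is a simplified model of the formula given in the Kubernetes documentation (desired = ceil(current × currentMetric / targetMetric)), not the controller's actual code. The function name and parameters are made up for illustration, and the 10% tolerance is the documented default; the sketch ignores readiness checks, missing metrics, and stabilization windows.

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas, max_replicas, tolerance=0.1):
    """Approximate the replica count an HPA would ask for (illustrative only)."""
    ratio = current_utilization / target_utilization
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: leave the replica count alone.
        desired = current_replicas
    else:
        # Scale proportionally to how far the metric is from the target.
        desired = math.ceil(current_replicas * ratio)
    # Never go outside the minReplicas/maxReplicas bounds from the HPA spec.
    return max(min_replicas, min(max_replicas, desired))


print(desired_replicas(2, 120, 60, 2, 10))  # 4  (the 2 → 4 scale-up)
print(desired_replicas(4, 30, 60, 2, 10))   # 2  (the 4 → 2 scale-down)
print(desired_replicas(4, 63, 60, 2, 10))   # 4  (within tolerance, no change)
print(desired_replicas(8, 150, 60, 2, 10))  # 10 (capped at maxReplicas)
```

With the values from the manifest (target 60, min 2, max 10), two pods averaging 120% utilization produce the 2 → 4 scale-up mentioned above.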

HPA does NOT deploy → Deployment

HPA monitors → Deployment’s metrics

HPA modifies → Deployment’s replica count

Deployment manages → ReplicaSet

ReplicaSet manages → Pods
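The chain of responsibility summarized above can be mimicked with a toy model. The classes below are hypothetical stand-ins, not Kubernetes client objects, and the pod names are invented. The point is only that the HPA writes a number, and the Deployment and ReplicaSet do the rest.

```python
from dataclasses import dataclass, field


@dataclass
class ReplicaSet:
    pods: list = field(default_factory=list)
    next_id: int = 0

    def reconcile(self, replicas):
        # Create pods until the desired count is reached...
        while len(self.pods) < replicas:
            self.pods.append(f"pod-{self.next_id}")
            self.next_id += 1
        # ...or delete the surplus (a real ReplicaSet picks victims differently).
        del self.pods[replicas:]


@dataclass
class Deployment:
    replicas: int
    replica_set: ReplicaSet = field(default_factory=ReplicaSet)

    def reconcile(self):
        # The Deployment passes its desired count down to its ReplicaSet.
        self.replica_set.reconcile(self.replicas)


@dataclass
class HPA:
    target: Deployment

    def scale(self, replicas):
        # The HPA only changes the replica count; it never touches pods.
        self.target.replicas = replicas


deploy = Deployment(replicas=2)
deploy.reconcile()
hpa = HPA(target=deploy)

hpa.scale(4)
deploy.reconcile()
print(deploy.replica_set.pods)  # ['pod-0', 'pod-1', 'pod-2', 'pod-3']

hpa.scale(2)
deploy.reconcile()
print(deploy.replica_set.pods)  # ['pod-0', 'pod-1']
```

In a real cluster each reconcile step is a separate controller loop reacting to API changes, so the layers are even more decoupled than this sketch suggests.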
