HorizontalPodAutoscaler
A HorizontalPodAutoscaler (HPA for short) automatically updates a workload resource (such as a Deployment or StatefulSet), with the aim of automatically scaling the workload to match demand.
The Platon PaaS cluster is running the Metrics Server which allows for CPU/Memory based horizontal autoscaling.
Example
In this example (based on the Kubernetes HPA docs) the HPA controller will increase and decrease the number of replicas (by updating the Deployment) to maintain an average CPU utilization across all Pods of 50%:
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: hpa-test-deployment
namespace: platon-hpa-test-deployment
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: hpa-test-deployment
minReplicas: 1
maxReplicas: 10
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 50
$ kubectl -n platon-hpa-test-deployment get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
hpa-test-deployment Deployment/hpa-test-deployment 0%/50% 1 10 0 6s
Limitations & considerations
- HPA’s does not work with DaemonSets
- Make sure you set suitable CPU and memory limits on your pods, or your pods may terminate frequently or you will waste cluster resources