Scaling microservices

There are two aspects to scaling a microservice with Kubernetes: the number of pods backing a particular microservice, and the total capacity of the cluster. You can scale a microservice explicitly by updating the number of replicas in its deployment, but that requires constant vigilance on your part. For services whose request volume varies widely over time (for example, business hours versus off hours, or weekdays versus weekends), keeping up manually takes a lot of effort. Kubernetes provides horizontal pod autoscaling, which, based on CPU, memory, or custom metrics, can scale your service up and down automatically.
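For instance, explicit scaling of a deployment named nginx is a one-liner with kubectl (the commands below assume a running cluster and are shown for illustration):

```shell
# Set the replica count of the nginx deployment directly.
kubectl scale deployment nginx --replicas=5

# Verify the new replica count.
kubectl get deployment nginx
```

This works, but someone (or something) has to keep issuing it as load changes, which is exactly the burden the autoscaler removes.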

Here is how to scale our nginx deployment, currently fixed at three replicas, between 2 and 5 replicas, depending on the average CPU usage across all instances:

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: nginx
  namespace: default
spec:
  maxReplicas: 5
  minReplicas: 2
  targetCPUUtilizationPercentage: 90
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx

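Assuming the manifest above is saved as nginx-hpa.yaml (an illustrative filename), it can be applied and inspected with kubectl; the kubectl autoscale command creates an equivalent autoscaler imperatively:

```shell
# Create the HPA from the manifest (requires a running cluster).
kubectl apply -f nginx-hpa.yaml

# Equivalent imperative form, no manifest needed:
# kubectl autoscale deployment nginx --min=2 --max=5 --cpu-percent=90

# Inspect the autoscaler's targets and current replica count.
kubectl get hpa nginx
```

Note that the HPA needs a metrics source (typically the metrics-server add-on) to read pod CPU usage; without it, the target column shows `<unknown>`.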
The outcome is that Kubernetes watches the CPU utilization of the pods that belong to the nginx deployment. The HPA controller samples metrics periodically (every 15 seconds, by default). When the average CPU utilization exceeds the 90% target, it adds replicas, up to the maximum of 5, until utilization drops back below the target. The HPA can scale down too (scale-down decisions are stabilized over a five-minute window by default, to avoid thrashing), but it always maintains a minimum of two replicas, even if CPU utilization is zero.
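The scaling decision follows the documented HPA rule: desiredReplicas = ceil(currentReplicas × currentMetricValue / targetMetricValue), clamped to the min/max bounds. A minimal sketch of that arithmetic (the function name and structure are illustrative, not part of Kubernetes):

```python
import math

def desired_replicas(current_replicas: int,
                     current_cpu_percent: float,
                     target_cpu_percent: float = 90.0,
                     min_replicas: int = 2,
                     max_replicas: int = 5) -> int:
    """Sketch of the HPA rule: desired = ceil(current * usage / target),
    clamped to [min_replicas, max_replicas]."""
    desired = math.ceil(current_replicas * current_cpu_percent / target_cpu_percent)
    return max(min_replicas, min(max_replicas, desired))

# Three replicas averaging 120% CPU against a 90% target: scale up to 4.
print(desired_replicas(3, 120.0))  # 4

# Idle pods never drop below the configured minimum of 2.
print(desired_replicas(3, 0.0))  # 2
```

Working through the numbers this way makes it clear why maxReplicas matters: a sustained spike can only ever push the deployment to 5 replicas, never beyond the bound you set.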