---
slug: sixth-blog-post
title: Sixth Blog Post
authors: [slorber, yangshun]
tags: [hola, docusaurus]
---

<!-- truncate -->

import CodeBlock from '@site/src/components/CodeBlock';
<br/>

<span className="main-head">Automatically Adjusting App Size in Kubernetes with the Horizontal Pod Autoscaler</span>
<div className="head">Introduction</div>

<div className="text">In the real world, when you run your applications, you want them to be strong and able to handle lots of users without crashing. But you also want to save money by not using more resources than you need. So, you need a tool that can make your applications get bigger or smaller depending on how many people are using them.</div><br/>

<div className="text">Usually, when you set up your apps in Kubernetes, you decide on a fixed number of copies to run all the time. But in real-life situations, like when many people visit your website or use your app, this fixed setup might not work well. It's like trying to fit everyone into the same-sized room, even when the crowd keeps changing. For example, if your website gets more visitors in the afternoon or during holidays, it could slow down or crash. This happens because there aren't enough copies of your app to handle all the requests. Manually changing the number of copies each time this happens can be a hassle. Plus, it's not efficient because you'd end up using more resources than you need, costing you more money. So, it's better to have a way for your apps to automatically adjust their size based on how busy they are. That way, they can handle more visitors when needed and use fewer resources when things calm down.</div><br/>

<div className="text">This is where Horizontal Pod Autoscaling comes in! It's a clever system that automatically adjusts your app's size to handle more or fewer visitors, saving you money. Think of it as a tool that watches how busy your app is and adds more capacity when needed, or removes it when things calm down. So, you don't have to worry about manually changing things all the time.</div><br/>

<div className="text">Horizontal pod autoscaling adds or removes copies (Pods) of your app to share the load, while vertical pod autoscaling adjusts the CPU and memory given to each existing Pod so it has enough capacity to run smoothly.</div>

<div className="head">How Horizontal Pod Autoscaling Keeps Your Apps Running Smoothly</div>

<div className="text"><ol><li>Watching Your App: HPA keeps an eye on how much your app is using resources like CPU and memory.</li><li>Comparing and Deciding: It compares how many resources your app is using with a target value you set. If your app is using more resources than it should, HPA knows it needs to do something.</li><li>Scaling Up or Down: If your app is using a lot of resources, HPA can make your app bigger by adding more parts to it. This helps your app handle more users without slowing down. But if your app isn't using many resources, HPA can make your app smaller by removing some parts. This saves resources and money.</li></ol></div>
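<div className="text">The "comparing and deciding" step boils down to one formula from the Kubernetes documentation: desiredReplicas = ceil(currentReplicas × currentMetricValue / targetMetricValue), with a small tolerance band so tiny fluctuations don't trigger scaling. Here's a rough Python sketch of that logic (the function names are ours for illustration, not part of Kubernetes):</div>

```python
import math

def desired_replicas(current_replicas, current_value, target_value, tolerance=0.1):
    """Sketch of the HPA formula: ceil(current * currentMetric / targetMetric).

    If the usage-to-target ratio is within the tolerance band
    (10% by default), the HPA leaves the replica count alone.
    """
    ratio = current_value / target_value
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)

def clamp(replicas, min_replicas, max_replicas):
    """The result is always kept between minReplicas and maxReplicas."""
    return max(min_replicas, min(max_replicas, replicas))

# One replica running at 240% CPU against a 50% target "wants"
# ceil(1 * 240 / 50) = 5 replicas, which gets clamped to maxReplicas=3.
print(clamp(desired_replicas(1, 240, 50), 1, 3))
```

This is why a deployment whose TARGETS column reads 240%/50% quickly scales to its configured maximum of 3 replicas.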
<div className="text">If you want to know more about how HPA decides when to make your app bigger or smaller, you can check out the details in the official documentation.</div><br/>

<div className="text">Behind the scenes, a HorizontalPodAutoscaler is just another CRD (Custom Resource Definition) that powers a special feature in Kubernetes. It works like a helpful tool that tells Kubernetes when to make your apps bigger or smaller based on how busy they are. To use it, you create a HorizontalPodAutoscaler file that tells Kubernetes which app you want to scale. Then, you use a command called kubectl to apply that file. Just remember, you can't use a HorizontalPodAutoscaler with every type of workload. For example, you can't use it with a DaemonSet.</div><br/>

<div className="text">To work properly, HPA needs a metrics server in your cluster. This server keeps track of important details like how much CPU and memory your apps are using. One popular option for this server is the Kubernetes Metrics Server.

The metrics server is like an extra tool that adds more capabilities to the Kubernetes system. Basically, it works by gathering information about how much of your computer's resources, like CPU and memory, your apps are using. Then, it makes this information available to other parts of Kubernetes, like the Horizontal Pod Autoscaler. You can also use the metrics server with a command called kubectl top to check on how your apps are doing. This can be helpful for fixing any issues with autoscaling.</div><br/>
<div className="text">Be sure to take a look at the main documentation to understand why the metrics server is important.</div> <br/>

<div className="text">If you want to use metrics other than CPU or memory, like counting how many times your app is used, you can use something called Prometheus with a special adapter called prometheus-adapter. This lets you change the size of your apps based on these different metrics, not just how much CPU and memory they're using.</div> <br/>

<div className="text" style={{ fontSize:'28px' }}>This guide will help you:</div>

<div className="text"><ol><li>Set up the Kubernetes Metrics Server in your DOKS cluster.</li><li>Learn the key ideas and how to make HPAs for your apps.</li><li>Test each HPA setup with two situations: when your app load stays the same, and when it changes.</li><li>Set up and use the Prometheus Adapter to change app sizes based on different metrics.</li></ol></div>

<div className="head">Before starting this tutorial, make sure you have:</div>

<div className="text"><ol><li>A Git client installed on your computer to download the Starter Kit repository.</li><li>Helm installed, which helps with managing Kubernetes Metrics Server installations and updates.</li><li>Kubectl installed for interacting with Kubernetes. Make sure it's set up and ready to use by following the instructions here to connect to your cluster.</li></ol></div>

<div className="text">Next, we'll show you how to deploy the Kubernetes Metrics Server using the Helm package manager.</div>
<div className="head">Setting Up the Kubernetes Metrics Server</div>

<div className="text"><ol><li>Using a single kubectl command, you can set up the Kubernetes Metrics Server in high availability mode.<br/><CodeBlock code={`kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/high-availability.yaml`} /><br/> </li><li>Alternatively, you can use Helm to deploy the metrics-server chart to your cluster.</li></ol></div>

<div className="text">In this tutorial, we're using the Helm installation method because it's more flexible. You can easily adjust settings later if needed, like making sure everything stays up and running smoothly.</div><br/>

<div className="text">Keep in mind, setting up the metrics server needs some special permissions. For all the details on what you need, take a look at the official requirements page.</div><br/>

<div className="text">Here's how to deploy the metrics server using Helm:</div>
<div className="text"><ol><li>Start by downloading the Starter Kit repository and open it on your computer. <br/><CodeBlock code={`git clone https://github.com/digitalocean/Kubernetes-Starter-Kit-Developers.git
cd Kubernetes-Starter-Kit-Developers
`} /><br/> </li><li>Next, add the metrics-server Helm repository to your Helm configuration. Then, check the available charts. <br/> <CodeBlock code={`helm repo add metrics-server https://kubernetes-sigs.github.io/metrics-server
helm repo update metrics-server
helm search repo metrics-server
`} /><br/></li><li>The result you see will look something like this: <br/><CodeBlock code={`NAME                           CHART VERSION  APP VERSION  DESCRIPTION
metrics-server/metrics-server  3.8.2          0.6.1        Metrics Server is a scalable, efficient source …
`} /><br/></li><br/> <div className="note"><strong>Note</strong>: The chart we're interested in is called metrics-server/metrics-server. This will install the Kubernetes metrics server on your cluster. For more information, you can visit the metrics-server chart page. Next, open and take a look at the metrics-server Helm values file provided in the Starter Kit repository. You can use any text editor you like, but it's best if it supports YAML linting. For example, you can use VS Code.</div><br/><CodeBlock code={`code 07-scaling-application-workloads/assets/manifests/metrics-server-values-v3.8.2.yaml
`} /><br/><li>Lastly, you'll install the Kubernetes Metrics Server using Helm. This will also create a special namespace just for the metrics-server.</li></ol></div>
<CodeBlock code={`HELM_CHART_VERSION="3.8.2"

helm install metrics-server metrics-server/metrics-server --version "$HELM_CHART_VERSION" \\
  --namespace metrics-server \\
  --create-namespace \\
  -f "07-scaling-application-workloads/assets/manifests/metrics-server-values-v\${HELM_CHART_VERSION}.yaml"
`} /><br/>
<div className="note"><strong>Note</strong>: We're using a specific version of the metrics-server Helm chart. In this case, we chose version 3.8.2, which corresponds to the 0.6.1 release of the metrics-server application (you can find this information in the output from Step 2). It's generally a good idea to specify a particular version like this. It helps ensure that your setup remains consistent and makes it easier to manage changes over time using tools like Git.</div><br/>

<div className="text" style={{ fontSize:'28px' }}>Deployment Status</div>

<div className="text">You can check the status of the metrics-server deployment by running:</div>
<CodeBlock code={`helm ls -n metrics-server`} /><br/>

<div className="text">The result you see will look something like this (make sure to check that the STATUS column says "deployed"):</div>

<CodeBlock code={`NAME            NAMESPACE       REVISION  UPDATED                               STATUS    CHART                 APP VERSION
metrics-server  metrics-server  1         2022-02-24 14:58:23.785875 +0200 EET  deployed  metrics-server-3.8.2  0.6.1
`} /><br/>
<div className="text">Next, take a look at the status of all the resources in the metrics-server namespace in Kubernetes:</div>

<CodeBlock code={`kubectl get all -n metrics-server`} /><br/>

<div className="text">The result will look something like this (make sure the deployment and replicaset resources are healthy, and there are two of each):</div>

<CodeBlock code={`NAME                                  READY   STATUS    RESTARTS   AGE
pod/metrics-server-694d47d564-9sp5h   1/1     Running   0          8m54s
pod/metrics-server-694d47d564-cc4m2   1/1     Running   0          8m54s

NAME                     TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)   AGE
service/metrics-server   ClusterIP   10.245.92.63   <none>        443/TCP   8m54s

NAME                             READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/metrics-server   2/2     2            2           8m55s

NAME                                        DESIRED   CURRENT   READY   AGE
replicaset.apps/metrics-server-694d47d564   2         2         2       8m55s
`} /><br/>
<div className="text">Lastly, let's see if the kubectl top command works. It's similar to the Linux top command, which shows the current usage of resources like CPU and memory. The command below will display the current resource usage for all Pods in the kube-system namespace:</div>

<CodeBlock code={`kubectl top pods -n kube-system`} /><br/>

<div className="text">The result you'll see will look something like this (CPU usage is shown in millicores, and memory usage is shown in Mebibytes):</div>
<CodeBlock code={`NAME                               CPU(cores)   MEMORY(bytes)
cilium-operator-5db58f5b89-7kptj   2m           35Mi
cilium-r2n9t                       4m           150Mi
cilium-xlqkp                       9m           180Mi
coredns-85d9ccbb46-7xnkg           1m           21Mi
coredns-85d9ccbb46-gxg6d           1m           20Mi
csi-do-node-p6lq4                  1m           19Mi
csi-do-node-rxd6m                  1m           21Mi
do-node-agent-2z2bc                0m           15Mi
do-node-agent-ppds8                0m           21Mi
kube-proxy-l9ddv                   1m           25Mi
kube-proxy-t6c29                   1m           30Mi
`} /><br/>
<div className="text">If you see a similar output as shown above, then you've set up the metrics server correctly. In the next step, we'll show you how to set up and test your first HorizontalPodAutoscaler resource.</div>
<div className="head">Introduction to Horizontal Pod Autoscalers (HPAs)</div>

<div className="text">Up until now, whenever you made a new setup for your application in Kubernetes, you set a fixed number of copies for it to run. This might be okay for basic situations or when you're just testing things out. But now, let's talk about something called a Horizontal Pod Autoscaler (HPA) which can make things more flexible.</div><br/>

<div className="text">With an HPA, Kubernetes can automatically adjust the number of copies of your application based on how much it's being used. Think of it like this: if your app suddenly gets really popular and needs more copies to handle all the users, the HPA can make that happen without you having to manually change anything.</div>
<CodeBlock code={`apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app-deployment
  minReplicas: 1
  maxReplicas: 3
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50
`} /><br/>
<div className="text" style={{ fontSize:'28px' }}>Breaking Down the Configuration:</div>

<div className="text"><ol><li>spec.scaleTargetRef: This tells Kubernetes what to keep an eye on and adjust. In our case, it's our deployment, my-app-deployment.</li><br/><li>spec.minReplicas: This is the smallest number of copies of our app that Kubernetes will keep running. We've set it to 1, so there's always at least one copy running, even if the app isn't busy.</li><br/><li>spec.maxReplicas: This is the largest number of copies Kubernetes will make. We've set it to 3, so even if lots of people start using our app, Kubernetes won't make too many copies and overload things.</li><br/><li>spec.metrics.type: This tells Kubernetes what information to use when deciding if it needs to make more copies. In our example, we're using the "Resource" type, which means Kubernetes looks at things like how much CPU our app is using. If it goes over 50%, Kubernetes will make more copies to handle the extra load.</li></ol></div><br/>
<div className="text">Next, there are two ways you can set up an HPA for your application deployment:</div>

<div className="text"><ol><li>Using kubectl command: If you already have a deployment set up, you can use a command called kubectl autoscale to add an HPA to it.</li><li>Creating a YAML file: Alternatively, you can create a simple text file (called a YAML file) that describes your HPA settings. Then, you use another command, kubectl apply, to make those changes happen in your Kubernetes cluster.</li></ol></div>

<div className="text">If you want to quickly test something without dealing with complex files, you can use the first option. Let's try it with an example from the Starter Kit.</div>
<div className="text"><ol><li>First, if you haven't already, you'll need to make a copy of the Starter Kit repository on your computer. Then, go to the folder where you saved it.<br/> <CodeBlock code={`git clone https://github.com/digitalocean/Kubernetes-Starter-Kit-Developers.git
cd Kubernetes-Starter-Kit-Developers
`} /><br/></li><li>Next, let's create a deployment called "myapp-test". This deployment is meant to do something that uses up CPU, like printing a message over and over again without stopping.<br/><CodeBlock code={`kubectl apply -f https://raw.githubusercontent.com/digitalocean/Kubernetes-Starter-Kit-Developers/main/07-scaling-application-workloads/assets/manifests/hpa/metrics-server/myapp-test.yaml`} /><br/></li><li>Lastly, we'll make a Horizontal Pod Autoscaler (HPA) that focuses on our "myapp-test" deployment. This HPA helps adjust the number of copies of our app based on how much CPU it's using.<br/><CodeBlock code={`kubectl autoscale deployment myapp-test --cpu-percent=50 --min=1 --max=3`} /><br/></li></ol></div>
<div className="text">Here's what we're doing: We're asking Kubernetes to create something called an HPA for us. This HPA will keep an eye on our "myapp-test" deployment. When the average CPU usage of our app hits 50%, the HPA will automatically adjust the number of copies of our app. It won't make more than 3 copies, but it'll always keep at least 1 copy running. You can check if the HPA was created by running:</div>

<CodeBlock code={`kubectl get hpa`} /><br/>

<div className="text">Here's what you might see when you check if the HPA was created successfully: In the output, you'll notice a column labeled "TARGETS" which shows a value of 50%. This means that the HPA is set to keep the average CPU usage of our app at 50%. However, you might also see a higher number, like 240%, which represents the current CPU usage. This just means that our app is currently using more CPU than the target set by the HPA.</div>
<CodeBlock code={`NAME         REFERENCE               TARGETS    MINPODS   MAXPODS   REPLICAS   AGE
myapp-test   Deployment/myapp-test   240%/50%   1         3         3          52s
`} /><br/>

<div className="text">Here's what you can expect: After creating the HPA, you might notice that for a short period, usually about 15 seconds, the "TARGETS" column will show "&lt;unknown&gt;/50%". This is completely normal. It just means that the HPA is gathering information about how much CPU our app is using and calculating the average over time. By default, the HPA checks these metrics every 15 seconds.</div>
<div className="text">If you want to see what's happening behind the scenes, you can check the events that the HPA generates by using:</div>

<CodeBlock code={`kubectl describe hpa myapp-test`} /><br/>

<div className="text">Here's what you might see in the output: Check out the list of events, and you'll notice something interesting. You'll see how the HPA is doing its job by automatically increasing the number of copies of our app.</div>
<CodeBlock code={`Name:               myapp-test
Namespace:          default
Labels:             <none>
Annotations:        <none>
CreationTimestamp:  Mon, 28 Feb 2022 10:10:50 +0200
Reference:          Deployment/myapp-test
Metrics:            ( current / target )
  resource cpu on pods (as a percentage of request):  240% (48m) / 50%
Min replicas:       1
Max replicas:       3
Deployment pods:    3 current / 3 desired
...
Events:
  Type    Reason             Age   From                       Message
  ----    ------             ----  ----                       -------
  Normal  SuccessfulRescale  37s   horizontal-pod-autoscaler  New size: 2; reason: cpu resource utilization (percentage of request) above target
  Normal  SuccessfulRescale  17s   horizontal-pod-autoscaler  New size: 3; reason: cpu resource utilization (percentage of request) above target
`} /><br/>
<div className="text">In a real-world situation, it's better to use a specific YAML file to create each HPA. This makes it easier to keep track of changes by saving the file in a Git repository. It also allows you to revisit the file later if you need to make any adjustments.</div>

<div className="text">Before we move on to the next step, let's remove the "myapp-test" deployment and the HPA that we created earlier. This will clean up our environment and get it ready for the next stage.</div>

<CodeBlock code={`kubectl delete hpa myapp-test
kubectl delete deployment myapp-test
`} /><br/>
<div className="head">Letting Apps Grow Automatically with Metrics Server</div>

<div className="text">During this step, we'll see how HPAs work in two scenarios:</div>

<div className="text"><ol><li>Active Workload: We'll have an app that's consistently performing tasks that require a lot of computer power.</li><li>High Volume Usage: We'll simulate a scenario where many users are accessing our web app by sending it a high volume of rapid requests using a script.</li></ol></div><br/>

<div className="text" style={{ fontSize:'28px' }}>Scenario 1 - Keeping Busy with CPU-Intensive Tasks</div><br/>

<div className="text">In this scenario, we'll create a basic program using Python. This program will stay busy by doing tasks that require a lot of computer power. Below is the Python code:</div>
<CodeBlock code={`import math

while True:
    x = 0.0001
    for i in range(1000000):
        x = x + math.sqrt(x)
    print(x)
    print("OK!")
`} /><br/>
<div className="text">You can deploy the code using a file called "constant-load-deployment-test" from the Starter Kit repository. This file sets up everything needed for your program to run.</div>

<div className="text">To get started, first, you need to copy the Starter Kit repository to your computer. Then, go to the folder where you copied it.</div>

<CodeBlock code={`git clone https://github.com/digitalocean/Kubernetes-Starter-Kit-Developers.git
cd Kubernetes-Starter-Kit-Developers
`} /><br/>
<div className="text">Next, let's create the deployment for our program using a command called "kubectl". We're also creating a separate area, called a "namespace", to make it easier to see what's happening.</div>

<CodeBlock code={`kubectl create ns hpa-constant-load
kubectl apply -f https://raw.githubusercontent.com/digitalocean/Kubernetes-Starter-Kit-Developers/06eec522f859bba957297d3068341df089468e97/07-scaling-application-workloads/assets/manifests/hpa/metrics-server/constant-load-deployment-test.yaml -n hpa-constant-load
`} /><br/>

<div className="note"><strong>Note</strong>: The deployment file included in this repository sets limits for the resources (like CPU and memory) that the sample application Pods can use. This is important because the HPA needs these limits to work properly. It's a good idea to set resource limits for all your application Pods to make sure they don't use up too much of your cluster's resources. </div>
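<div className="text">The percentages the HPA reports are always measured against the Pod's CPU request, which is why those requests must be set. A quick back-of-the-envelope check in Python (the 48m usage figure appeared in the kubectl describe output earlier in this guide; the 20m request is our assumption for that sample app, not a value taken from the manifest):</div>

```python
def cpu_utilization_percent(usage_millicores, request_millicores):
    """HPA-style utilization: actual usage as a percentage of the Pod's CPU request."""
    return round(usage_millicores / request_millicores * 100)

# 48m of CPU used against a hypothetical 20m request shows up as 240%,
# matching the "240% (48m) / 50%" line in kubectl describe hpa.
print(cpu_utilization_percent(48, 20))
```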
<div className="text">Check to make sure that the deployment was created without any issues and that it's now running as expected.</div>

<CodeBlock code={`kubectl get deployments -n hpa-constant-load`} /><br/>

<div className="text">Here's what you might see in the output: You'll notice that there's only one copy of the application running at the moment.</div>

<CodeBlock code={`NAME                            READY   UP-TO-DATE   AVAILABLE   AGE
constant-load-deployment-test   1/1     1            1           8s
`} /><br/>
<div className="text">After that, let's set up the "constant-load-hpa-test" resource in your cluster using the "kubectl" command.</div>

<CodeBlock code={`kubectl apply -f https://raw.githubusercontent.com/digitalocean/Kubernetes-Starter-Kit-Developers/main/07-scaling-application-workloads/assets/manifests/hpa/metrics-server/constant-load-hpa-test.yaml -n hpa-constant-load`} /><br/>

<div className="text">This command will make a HPA resource, which will keep an eye on the sample deployment we made earlier. You can check how the "constant-load-test" HPA is doing by using:</div>

<CodeBlock code={`kubectl get hpa constant-load-test -n hpa-constant-load`} /><br/>
<div className="text">You'll see some details on the screen. Look for the part that says "REFERENCE". It shows that the HPA is keeping an eye on our "constant-load-deployment-test" deployment. Also, check out the "TARGETS" section. It tells us how much CPU our app is using.</div>

<CodeBlock code={`NAME                 REFERENCE                                  TARGETS    MINPODS   MAXPODS   REPLICAS   AGE
constant-load-test   Deployment/constant-load-deployment-test   255%/50%   1         3         3          49s
`} /><br/>

<div className="text">You may also notice in the above information that the number of copies of our sample app, shown in the "REPLICAS" column, went up from 1 to 3. This matches what we set in the HPA configuration. Since the app in our example quickly creates load, the whole process of scaling happened pretty fast. If you want to see more details about what the HPA did, you can check its events using the command: "kubectl describe hpa -n hpa-constant-load".</div><br/>
<div className="text" style={{ fontSize:'28px' }}>Scenario 2 - Testing External Traffic</div>

<div className="text">In this scenario, we'll create a more realistic test where we simulate external users accessing our application. To do this, we'll use a different area, called a namespace, along with a set of files to observe how our application behaves separately from the previous test.</div>

<div className="text">You're going to test a sample server called "quote of the moment". When you send it an HTTP request, it sends back a different quote each time. To put pressure on the server, you'll send a lot of requests really quickly, about one every millisecond.</div>

<div className="text">To get started, you need to set up the "quote" deployment and service using a command called "kubectl". Before you do that, make sure you're in the right folder on your computer, called "Kubernetes-Starter-Kit-Developers".</div>
<CodeBlock code={`kubectl create ns hpa-external-load
kubectl apply -f 07-scaling-application-workloads/assets/manifests/hpa/metrics-server/quote_deployment.yaml -n hpa-external-load`} /><br/>

<div className="text">Now, let's make sure that the quote application deployment and services are working correctly.</div>

<CodeBlock code={`kubectl get all -n hpa-external-load`} /><br/>

<div className="text">Here's how the output might look:</div>
<CodeBlock code={`NAME                        READY   STATUS    RESTARTS   AGE
pod/quote-dffd65947-s56c9   1/1     Running   0          3m5s

NAME            TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)   AGE
service/quote   ClusterIP   10.245.170.194   <none>        80/TCP    3m5s

NAME                    READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/quote   1/1     1            1           3m5s

NAME                              DESIRED   CURRENT   READY   AGE
replicaset.apps/quote-dffd65947   1         1         1       3m5s
`} /><br/>
<div className="text">After that, let's set up the HPA for the quote deployment using the "kubectl" command:</div>

<CodeBlock code={`kubectl apply -f 07-scaling-application-workloads/assets/manifests/hpa/metrics-server/quote-deployment-hpa-test.yaml -n hpa-external-load`} /><br/>

<div className="text">Now, let's see if the HPA resource is set up and working correctly:</div>

<CodeBlock code={`kubectl get hpa external-load-test -n hpa-external-load`} /><br/>

<div className="text">Here's how the output might look:</div>

<CodeBlock code={`NAME                 REFERENCE          TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
external-load-test   Deployment/quote   1%/20%    1         3         1          108s
`} /><br/>

<div className="text">In this case, it's important to note that we've set a different value for the CPU usage threshold, and we're also using a different approach to scale down. Here's how the configuration for the "external-load-test" HPA looks:</div>
<CodeBlock code={`apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: external-load-test
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: quote
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 60
  minReplicas: 1
  maxReplicas: 3
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 20
`} /><br/>
<div className="text">In this setup, we've changed how quickly the autoscaler reacts when scaling down, setting it to 60 seconds instead of the default 300 seconds. This isn't typically necessary, but it helps speed up the process for this specific test. Usually, the autoscaler waits for 5 minutes after scaling before making more changes. This helps prevent rapid changes and keeps things stable.</div>
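<div className="text">To see why this window matters, here's a rough Python sketch of the idea (our own naming, not the controller's actual code): during the stabilization window, the autoscaler remembers the replica counts it recently computed and only scales down to the highest of them, which smooths out short dips in load:</div>

```python
def stabilized_scale_down(recommendations, window):
    """Pick the scale-down target from recent replica recommendations.

    `recommendations` is ordered oldest to newest; `window` is how many
    recent recommendations fall inside stabilizationWindowSeconds.
    The HPA scales down only to the maximum recommendation in the window,
    so a brief dip in load does not immediately remove Pods.
    """
    recent = recommendations[-window:] if window > 0 else recommendations[-1:]
    return max(recent)

# Load dropped during the last two checks, but one recommendation inside
# the window still says 3 replicas, so the HPA keeps 3 for now.
print(stabilized_scale_down([3, 1, 1], window=3))
```

A shorter window forgets the old high recommendation sooner, which is exactly why lowering it to 60 seconds makes this demo scale down faster.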
<div className="text">In the last step, you'll use a script from this repository to put pressure on the target application, which is the quote server. This script quickly makes a bunch of HTTP requests, pretending to be users accessing the server. This helps simulate a lot of external activity, which is useful for demonstration purposes.</div>

<div className="text">Make sure to open two separate windows on your computer screen so you can see the results more clearly. You can use a tool like "tmux" for this. In the first window, run the script called "quote service load test". You can stop the script at any time by pressing Ctrl+C.</div>
<CodeBlock code={`./quote_service_load_test.sh`} /><br/>

<div className="text">In another window, run the "kubectl get" command with the "-w" (watch) flag to keep an eye on the Horizontal Pod Autoscaler (HPA) resource. This will let you see any changes to the HPA in real-time.</div>

<CodeBlock code={`kubectl get hpa -n hpa-external-load -w`} /><br/>
<div className="text">Check out the animation below to see the results of the experiment:</div>

<!-- <images should add here > -->
<div className="text">Next, you'll discover how to adjust the size of your applications based on specific metrics from Prometheus. For instance, you can make your deployments bigger or smaller depending on how many times your application gets visited with HTTP requests, instead of just looking at how much CPU or memory it's using.</div>
<div className="head">Automatically Scaling Applications with Prometheus: Beyond CPU Metrics</div>

<div className="text">In the previous steps, you learned how to make your applications bigger or smaller based on how much computer power they use (like CPU). But, you can also do this with other things, not just computer power. For example, you can use a tool called Prometheus to keep track of how many times people visit your website (like with HTTP requests). Then, you can tell your system to automatically adjust your website's size based on how many visitors it's getting. This way, if a lot of people are coming to your site, it can automatically become bigger to handle the traffic!</div><br/>

<div className="text">To do this, you'll need to set up something called the "prometheus-adapter." It's like a special tool that helps Prometheus talk to Kubernetes, which is the system managing your applications. The prometheus-adapter helps them understand each other better, like a translator between them.</div><br/>

<div className="text">If you want something quick and simple that uses less computer power, go for metrics-server. It gives you basic info like how much your computer's brain (CPU) and memory are working. But, if you need more detailed control and want to adjust your apps based on other things besides just CPU and memory, then choose Prometheus with the prometheus-adapter. It's like having a more advanced system that can handle a lot more information.</div><br/>

<div className="text">Before you start, make sure you have something called "Prometheus Operator" set up in your system. You also need to know a bit about "ServiceMonitors." If you don't have them set up yet, you can follow the instructions in the "Prometheus Monitoring Stack" chapter from the Starter Kit repository. Once that's ready, you need to create something called a "ServiceMonitor" to keep an eye on how your application is doing. This will send the information to the Kubernetes system through the prometheus-adapter. After that, the system can adjust your applications based on this information using something called the "horizontal pod autoscaler".</div><br/>
|
|
|
|
<div className="text" style={{ fontSize:'28px' }}>Simple Steps for Scaling Applications with Custom Metrics Using Prometheus</div>
<div className="text"><ol><li>Install the prometheus-adapter: First, you need to set up something called "prometheus-adapter" in your system.</li><br/><li>Tell Prometheus about your metrics: Then, you let Prometheus know what information to keep an eye on from your applications. We do this by creating something called "ServiceMonitors."</li><br/><li>Show your metrics to the system: After that, you tell the prometheus-adapter to share your application's custom metrics with the Kubernetes system. This is done by defining "discovery rules."</li><br/><li>Tell the system how to adjust: Lastly, you create something called an "HPA" (horizontal pod autoscaler) that targets your application. You configure it to change the size of your application based on the custom metrics you've set up.</li></ol></div>
<div className="text" style={{ fontSize:'28px' }}>Easy Install: Prometheus Adapter with Helm</div>
<div className="text">You can install the Prometheus adapter using Helm, which is a tool that helps with managing software on your system. Here's how:</div>
<div className="text"><ol><li>First, copy the Starter Kit to your computer by using a command called "clone." Then, go to the folder where you put the Starter Kit on your computer.<br/> <CodeBlock code={`git clone https://github.com/digitalocean/Kubernetes-Starter-Kit-Developers.git

cd Kubernetes-Starter-Kit-Developers
`} /><br/></li><li>Now, include the prometheus-community Helm repo on your system, and see what kinds of charts (software packages) are available.<br/><CodeBlock code={`helm repo add prometheus-community https://prometheus-community.github.io/helm-charts

helm repo update prometheus-community

helm search repo prometheus-community
`} /><br/>The result you see will be something like this: <br/><CodeBlock code={`NAME                                         CHART VERSION  APP VERSION  DESCRIPTION
prometheus-community/alertmanager            0.18.0         v0.23.0      The Alertmanager handles alerts sent by client ...
prometheus-community/kube-prometheus-stack   35.5.1         0.56.3       kube-prometheus-stack collects Kubernetes manif...
prometheus-community/kube-state-metrics      4.9.2          2.5.1        Install kube-state-metrics to generate and expo...
prometheus-community/prometheus              15.10.1        2.34.0       Prometheus is a monitoring system and time seri...
prometheus-community/prometheus-adapter      3.3.1          v0.9.1       A Helm chart for k8s prometheus adapter
…
`} /><br/></li><li>The chart we're interested in is called "prometheus-adapter." This is the one that will set up prometheus-adapter on your system. You can find more information about it on the prometheus-adapter chart page. Next, open the prometheus-adapter Helm values file from the Starter Kit repository using a text editor. It's best to use one that supports YAML linting. For example, you can use VS Code.<br/><CodeBlock code={`code 07-scaling-application-workloads/assets/manifests/prometheus-adapter-values-v3.3.1.yaml`} /><br/></li><li>Be sure to change the Prometheus endpoint setting according to your setup; the instructions are in the Helm values file.</li><li>Once you've made the necessary changes, save the file. Now, install the prometheus-adapter using Helm. This will also create a special space for the prometheus-adapter called a "namespace."</li></ol></div>
<CodeBlock code={`HELM_CHART_VERSION="3.3.1"

helm install prometheus-adapter prometheus-community/prometheus-adapter \
  --version "$HELM_CHART_VERSION" \
  --namespace prometheus-adapter \
  --create-namespace \
  -f "07-scaling-application-workloads/assets/manifests/prometheus-adapter-values-v${HELM_CHART_VERSION}.yaml"
`} /><br/>
<div className="text">We're using a particular version of the prometheus-adapter Helm chart, specifically version 3.3.1. This version corresponds to the 0.9.1 release of the prometheus-adapter application (you can find this information in the output from Step 2). Pinning a specific version is a good idea because it gives us control and predictability. It helps ensure that we get the expected results, and it's easier to manage different versions using tools like Git.</div>
<div className="text" style={{ fontSize:'28px' }}>Evaluating Setup: What to Look For</div>
<div className="text">Check whether the prometheus-adapter was deployed successfully by running:</div>
<CodeBlock code={`helm ls -n prometheus-adapter`} /><br/>
<div className="text">The result you want to see will look something like this (pay attention to the word "deployed" in the STATUS column):</div>
<CodeBlock code={`NAME                 NAMESPACE            REVISION  UPDATED              STATUS    CHART                     APP VERSION
prometheus-adapter   prometheus-adapter   1         2022-03-01 12:06:22  deployed  prometheus-adapter-3.3.1  v0.9.1
`} /><br/>
<div className="text">Now, take a look at the status of the resources in the special space we created for the prometheus-adapter, called the "prometheus-adapter namespace."</div>
<CodeBlock code={`kubectl get all -n prometheus-adapter`} /><br/>
<div className="text">The result you're aiming for will resemble this (pay attention to the deployment and replicaset resources, they should be in good health and the count should be 2):</div>
<CodeBlock code={`NAME                                      READY   STATUS    RESTARTS   AGE
pod/prometheus-adapter-7f475b958b-fd9sm   1/1     Running   0          2m54s
pod/prometheus-adapter-7f475b958b-plzbw   1/1     Running   0          2m54s

NAME                         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
service/prometheus-adapter   ClusterIP   10.245.150.99   <none>        443/TCP   2m55s

NAME                                 READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/prometheus-adapter   2/2     2            2           2m55s

NAME                                            DESIRED   CURRENT   READY   AGE
replicaset.apps/prometheus-adapter-7f475b958b   2         2         2       2m55s
`} /><br/>
<div className="text">Wait for a little bit, and then query the system using something called the "custom.metrics.k8s.io API." Save the answer to a file:</div>
<CodeBlock code={`kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1 > custom_metrics_k8s_io.json`} /> <br/>
<div className="text">Open the file named "custom_metrics_k8s_io.json," and you'll see information about special metrics that the Kubernetes system is sharing.</div>
<CodeBlock code={`{
  "kind": "APIResourceList",
  "apiVersion": "v1",
  "groupVersion": "custom.metrics.k8s.io/v1beta1",
  "resources": [
    {
      "name": "jobs.batch/prometheus_sd_kubernetes_events",
      "singularName": "",
      "namespaced": true,
      "kind": "MetricValueList",
      "verbs": [
        "get"
      ]
    },
    {
      "name": "services/kube_statefulset_labels",
      "singularName": "",
      "namespaced": true,
      "kind": "MetricValueList",
      "verbs": [
        "get"
      ]
    },
    ...
`} /><br/>
<div className="text">If what you see matches the example above, it means you've set up the prometheus-adapter correctly. Now, let's move on to the next step, where you'll discover how to create a simple application. This application will have its own special metrics, and we'll make sure Prometheus knows how to keep an eye on them by setting up something called a "ServiceMonitor."</div>
<div className="head">Setting Up a Practice Application with Prometheus Example Metrics</div>
<div className="text">Let's set up a practice application to see if our system is working correctly. We're going to use an example application called "prometheus-example," and it gives us information about how our website is doing. Here are the things it will tell us:</div>
<div className="text">Total incoming requests:</div>
<div className="text"> a. Metric: http_requests_total</div>
<div className="text"> b. What it tells us: The overall number of people coming to our website.</div> <br/>

<div className="text">Duration of all requests:</div>
<div className="text"> a. Metric: http_request_duration_seconds</div>
<div className="text"> b. What it tells us: How long, on average, people spend on our website.</div><br/>

<div className="text">Total count of all requests:</div>
<div className="text"> a. Metric: http_request_duration_seconds_count</div>
<div className="text"> b. What it tells us: The total number of all visits to our website.</div><br/>

<div className="text">Total duration of all requests:</div>
<div className="text"> a. Metric: http_request_duration_seconds_sum</div>
<div className="text"> b. What it tells us: The combined time, in seconds, everyone spends on our website.</div><br/>

<div className="text">Histogram of request durations:</div>
<div className="text"> a. Metric: http_request_duration_seconds_bucket</div>
<div className="text"> b. What it tells us: A way of showing how the time people spend on our website is spread across different ranges.</div><br/>
<div className="text">This will help us test if our system is set up correctly.</div>
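As a rough illustration of what these look like on the wire, the application's /metrics endpoint serves plain text in the Prometheus exposition format. The labels and numbers below are invented, not taken from the real prometheus-example app:

```text
# Hypothetical sample of GET /metrics output (values are made up):
http_requests_total{code="200",method="get"} 42
http_request_duration_seconds_count{code="200",method="get"} 42
http_request_duration_seconds_sum{code="200",method="get"} 3.87
http_request_duration_seconds_bucket{code="200",method="get",le="0.1"} 40
http_request_duration_seconds_bucket{code="200",method="get",le="+Inf"} 42
```

Prometheus scrapes this text periodically, which is what the ServiceMonitor we create later tells it to do.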
<div className="text">Once you've deployed the prometheus-example application, you'll want to test out how the automatic scaling works with custom metrics. To do this, you'll need to send a bunch of requests to the prometheus-example service and see how it responds by adjusting its size based on the number of requests.</div>
<div className="text">Before you do that, make sure you're in the right folder on your computer where you copied the Starter Kit.</div>
<CodeBlock code={`cd Kubernetes-Starter-Kit-Developers`} /><br/>
<div className="text">Next, let's put the prometheus-example application into action. You can do this by using a special set of instructions written in a file called a "YAML manifest" from the Starter Kit. This file will tell your system to create the prometheus-example application along with some other necessary things, like a service. It also sets up a special area, known as a "namespace," called prometheus-custom-metrics-test, just to make sure everything is working smoothly.</div>
<CodeBlock code={`kubectl create ns prometheus-custom-metrics-test

kubectl apply -f 07-scaling-application-workloads/assets/manifests/hpa/prometheus-adapter/prometheus-example-deployment.yaml -n prometheus-custom-metrics-test
`} /><br/>
<div className="text">After you've set up the prometheus-example application, double-check to make sure everything was created correctly in the special area we made, called the "prometheus-custom-metrics-test namespace."</div>
<CodeBlock code={`kubectl get all -n prometheus-custom-metrics-test`} /><br/>
<div className="text">You'll see something like this in the results (make sure the "prometheus-example-app" is listed and marked as "up and running," along with the associated service).</div>
<CodeBlock code={`NAME                                          READY   STATUS    RESTARTS   AGE
pod/prometheus-example-app-7455d5c48f-wbshc   1/1     Running   0          9s

NAME                             TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)   AGE
service/prometheus-example-app   ClusterIP   10.245.103.235   <none>        80/TCP    10s

NAME                                     READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/prometheus-example-app   1/1     1            1           10s

NAME                                                DESIRED   CURRENT   READY   AGE
replicaset.apps/prometheus-example-app-7455d5c48f   1         1         1       10s
`} /><br/>
<div className="text" style={{ fontSize:'28px' }}>Preparing Prometheus for Application Monitoring</div>
<div className="text">Before we set up something called a "ServiceMonitor" for our application, let's make sure our Prometheus system is ready for it. Follow these steps:</div>
<div className="text"><ol><li>Start by finding out which Prometheus instances are in your system. The Starter Kit uses a place called the "monitoring namespace," but you might need to adjust this based on how your system is set up. <br/> <CodeBlock code={`kubectl get prometheus -n monitoring`} /><br/> The result you'll see will look something like this: <br/> <CodeBlock code={`NAME                                    VERSION   REPLICAS   AGE
kube-prom-stack-kube-prome-prometheus   v2.35.0   1          7h4m
`} /><br/></li><li>Now, choose the Prometheus instance you found in the last step (if there's more than one, select the one that matches your setup). Look for something called "serviceMonitorSelector.matchLabels" and note down its value.<br/><CodeBlock code={`kubectl get prometheus kube-prom-stack-kube-prome-prometheus -n monitoring -o jsonpath='{.spec.serviceMonitorSelector.matchLabels}'`} /><br/></li></ol></div>
<div className="text">You'll see something like this in the results (keep an eye out for a label called "release").</div>
<CodeBlock code={`{"release":"kube-prom-stack"}`} /><br/>
<div className="text">By default, each Prometheus instance is set up to find only service monitors that have a certain label. To make sure our prometheus-example-app ServiceMonitor gets noticed by Prometheus, we need to give it a label called "release" with the value "kube-prom-stack."</div>
<div className="text">Before you do anything else, make sure you're in the right folder on your computer where you copied the Starter Kit.</div>
<CodeBlock code={`cd Kubernetes-Starter-Kit-Developers`} /><br/>
<div className="text">Next, take a look at a file called "prometheus-example-service-monitor" from the Starter Kit on your computer. You can use a program like VS Code, which is a good choice because it helps with checking if everything is written correctly in this special file called YAML.</div>
<CodeBlock code={`code 07-scaling-application-workloads/assets/manifests/hpa/prometheus-adapter/prometheus-example-service-monitor.yaml`} /><br/>
<div className="text">In the part of the file that talks about "metadata.labels," make sure to include the label we found earlier (it's called "release" and should have the value "kube-prom-stack"). The ServiceMonitor file will look something like this:</div>
<CodeBlock code={`kind: ServiceMonitor
apiVersion: monitoring.coreos.com/v1
metadata:
  name: prometheus-example-app
  labels:
    app: prometheus-example-app
    release: kube-prom-stack
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: prometheus-example-app
  namespaceSelector:
    matchNames:
      - prometheus-custom-metrics-test
  endpoints:
    - port: web
`} /><br/>
<div className="text">Last step! Let's make the ServiceMonitor we need. This ServiceMonitor will tell Prometheus to keep an eye on the /metrics information from our prometheus-example-app.</div>
<CodeBlock code={`kubectl apply -f 07-scaling-application-workloads/assets/manifests/hpa/prometheus-adapter/prometheus-example-service-monitor.yaml -n prometheus-custom-metrics-test`} /><br/>
<div className="text">Once you've finished the steps above, you'll notice a new item in the Targets panel on the Prometheus dashboard. To view this dashboard, you need to use a command to make it accessible. Here's an example command using the Starter Kit naming conventions. Please adjust it based on how your system is set up:</div>
<CodeBlock code={`kubectl port-forward svc/kube-prom-stack-kube-prome-prometheus 9090:9090 -n monitoring`} /><br/>
<div className="text">The result you'll see looks something like this (look for "prometheus-example-app" in the list of discovered targets):</div>
<!-- <need to add image here > -->
<div className="text">Open a web browser and go to "localhost:9090". Then, click on "Status" and choose "Targets". You'll see that the target we added has been included.</div>
<div className="head">Enabling Kubernetes Access to Custom Metrics with Prometheus Adapter</div>
<div className="text">Even though Prometheus can find and gather data about your application's special metrics, the prometheus-adapter needs some help to share these metrics with Kubernetes. To make this happen, you have to set up a few rules called "discovery rules." These rules will guide the prometheus-adapter in exposing your application's custom metrics.</div>
<div className="text" style={{ fontSize:'28px' }}>Quoting from the official guide:</div>
<div className="text">The adapter decides which metrics to show and how to show them using a set of "discovery" rules. Each rule works on its own (so they shouldn't overlap), and it outlines the steps for the adapter to expose a metric in the API.</div>
<div className="text">Each rule has roughly four parts:</div>
<div className="text"><ol><li>Discovery: <ul><li>How the adapter should find all Prometheus metrics for this rule.</li></ul></li><li>Association:<ul><li>How the adapter should figure out which Kubernetes resources a specific metric is related to.</li></ul></li><li>Naming: <ul><li>How the adapter should display the metric in the custom metrics API.</li></ul></li><li>Querying: <ul><li>How a request for a certain metric on one or more Kubernetes objects should be translated into a question to Prometheus.</li></ul></li></ol></div>
<div className="text">A regular definition for a discovery rule looks something like this:</div>
<CodeBlock code={`rules:
  custom:
    - seriesQuery: 'http_requests_total{pod!=""}'
      resources:
        template: "<<.Resource>>"
      name:
        matches: "^(.*)_total"
        as: "\${1}_per_second"
      metricsQuery: "sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)"
`} /><br/>
<div className="text">Let's break down the configuration into simpler parts:</div>
<div className="text"><ol><li>seriesQuery:<ul><li>This part represents the metric you want to track, like the total number of HTTP requests shown by the application's /metrics endpoint. It tells the prometheus-adapter to choose the http_requests_total metric for all your application Pods that are not null (pod!=""). This is like telling it what to look for.</li></ul></li><li>resources.template:<ul><li>Think of this as a template that Prometheus uses to understand where the metrics are coming from (like from a Pod). This part helps the prometheus-adapter figure out the resource that exposes the metrics (e.g., Pod). It's like connecting the metric to the thing it's related to.</li></ul></li><li>name: <ul><li>Here, you're giving a new name to the rule. In simple terms, you're telling the prometheus-adapter to rename http_requests_total to http_requests_per_second. Essentially, you're saying you're interested in the number of HTTP requests per second, not just a simple count. This is about making the metric name more meaningful.</li></ul></li><li>metricsQuery: <ul><li>This is a fancy term for a parameterized query in Prometheus language (PromQL). It's like a way of asking Prometheus for information. In this case, it calculates the rate of HTTP requests on average over a set period (like 2 minutes). It's the way of telling the prometheus-adapter how to interpret the metric.</li></ul></li></ol></div>
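To build intuition for the metricsQuery part, here's a toy, offline calculation of what rate(...[2m]) computes for a counter. The sample values are invented and are not taken from the Starter Kit:

```shell
# rate() over a 2-minute window is (roughly) the increase of the counter
# divided by the window length in seconds. Values below are made up.
counter_two_minutes_ago=1000
counter_now=1150
window_seconds=120

# (1150 - 1000) / 120 = 1.25 requests/second on average
rate=$(awk -v a="$counter_two_minutes_ago" -v b="$counter_now" -v w="$window_seconds" \
    'BEGIN { printf "%.2f", (b - a) / w }')
echo "rate=$rate requests/second"
```

This is why renaming the metric from `http_requests_total` to `http_requests_per_second` makes sense: the adapter is no longer exposing a raw count, but a rate derived from it.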
<div className="text">Now that you've learned how to create discovery rules for prometheus-adapter, let's apply this knowledge. Follow these simple steps to tell prometheus-adapter to use the rules you've just set up:</div>
<div className="text"><ol><li>Start by making sure you're in the right folder on your computer where you copied the Starter Kit. <br/> <CodeBlock code={`cd Kubernetes-Starter-Kit-Developers`} /><br/> </li><li>After that, open the prometheus-adapter Helm values file from the Starter Kit on your computer. You can use a program like VS Code, which is a good choice because it helps check if everything is written correctly in this special file called YAML. <br/> <CodeBlock code={`code 07-scaling-application-workloads/assets/manifests/prometheus-adapter-values-v3.3.1.yaml`} /><br/></li><li>Find the part in the file called "rules," and uncomment it by removing the "#" characters from the beginning of those lines. When you're done, it should look similar to the discovery rule example shown earlier.</li><li>Save the Helm values file with your changes, and then upgrade the prometheus-adapter release using Helm with these new settings:<br/><CodeBlock code={`HELM_CHART_VERSION="3.3.1"

helm upgrade prometheus-adapter prometheus-community/prometheus-adapter \
  --version "$HELM_CHART_VERSION" \
  --namespace prometheus-adapter \
  -f "07-scaling-application-workloads/assets/manifests/prometheus-adapter-values-v${HELM_CHART_VERSION}.yaml"
`} /><br/> </li></ol></div>
<div className="note"><strong>Note</strong>: When there aren't any requests coming in, we only see the version metric. To start seeing more metrics, like how many times our website is visited, we need to do something: generate some HTTP requests. This means we need to simulate people visiting our website by sending requests to it.</div>
<CodeBlock code={`kubectl port-forward svc/prometheus-example-app 8080:80 -n prometheus-custom-metrics-test`} /><br/>
<div className="text">Open a web browser and go to http://localhost:8080. Then, just refresh the homepage of the application a few times.</div>
<div className="text">If everything's working correctly, you'll be able to check a new metric by using the custom metrics API. To make it easier to read the results, you can install a tool called jq.</div>
<CodeBlock code={`kubectl get --raw="/apis/custom.metrics.k8s.io/v1beta1/namespaces/prometheus-custom-metrics-test/pods/*/http_requests_per_second" | jq .`} /><br/>
<div className="text">The result you see might look something like this:</div>
<CodeBlock code={`{
  "kind": "MetricValueList",
  "apiVersion": "custom.metrics.k8s.io/v1beta1",
  "metadata": {
    "selfLink": "/apis/custom.metrics.k8s.io/v1beta1/namespaces/prometheus-custom-metrics-test/pods/%2A/http_requests_per_second"
  },
  "items": [
    {
      "describedObject": {
        "kind": "Pod",
        "namespace": "prometheus-custom-metrics-test",
        "name": "prometheus-example-app-7455d5c48f-wbshc",
        "apiVersion": "/v1"
      },
      "metricName": "http_requests_per_second",
      "timestamp": "2022-03-02T13:33:32Z",
      "value": "0",
      "selector": null
    }
  ]
}
`} /><br/>
<div className="text">When you look at the output above, you'll see a metric called http_requests_per_second with a value of 0. This is expected because we haven't put any pressure on the application yet.</div>
<div className="text">Now, for the last step, we'll set up something called HPA for the deployment of our application. Then, we'll create some activity on the application using a script called custom_metrics_service_load_test, which you can find in the Starter Kit repository.</div>
<div className="head">Setting Up and Testing HPAs with Custom Metrics</div>
<div className="text">Creating and testing HPAs (Horizontal Pod Autoscalers) with custom metrics is pretty much like what we did with the metrics server examples. The only thing that changes is how we measure the application's performance, which in this case uses custom metrics like http_requests_per_second.</div>
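For comparison, assuming the metrics-server examples scaled on CPU, the only part of the HPA spec that changes is the entry under metrics. A CPU-based entry in the same API shape would look roughly like this (a sketch, not a manifest from the Starter Kit):

```yaml
# Hypothetical CPU-based equivalent: a built-in Resource metric instead of a
# custom Pods metric. Everything else in the HPA spec stays the same.
metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50   # scale out when average CPU exceeds 50%
```

With custom metrics, the entry has `type: Pods` and names the adapter-exposed metric instead, as shown next.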
<div className="text">A typical HPA definition based on custom metrics looks something like this (important fields are explained as we go along):</div>
<CodeBlock code={`kind: HorizontalPodAutoscaler
apiVersion: autoscaling/v2beta2
metadata:
  name: prometheus-example-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: prometheus-example-app
  minReplicas: 1
  maxReplicas: 5
  metrics:
    # use a "Pods" metric, which takes the average of the
    # given metric across all pods controlled by the autoscaling target
    - type: Pods
      pods:
        metric:
          name: http_requests_per_second
        # target 500 milli-requests per second,
        # which is 1 request every two seconds
        target:
          type: AverageValue
          averageValue: 500m
`} /><br/>
<div className="text">First, go to the folder where you saved the Starter Kit on your computer:</div>
<CodeBlock code={`cd Kubernetes-Starter-Kit-Developers`} /><br/>
<div className="text">Next, set up the prometheus-custom-metrics-hpa resource in your cluster by using kubectl:</div>
<CodeBlock code={`kubectl apply -f 07-scaling-application-workloads/assets/manifests/hpa/prometheus-adapter/prometheus-custom-metrics-hpa.yaml -n prometheus-custom-metrics-test`} /><br/>
<div className="text">The command above creates something called an HPA, which stands for Horizontal Pod Autoscaler. It's set up to watch over the sample deployment we made earlier. You can see how the HPA is doing by using:</div>
<CodeBlock code={`kubectl get hpa -n prometheus-custom-metrics-test`} /><br/>
<div className="text">The result you'll see looks something like this (pay attention to the REFERENCE column, which points to the prometheus-example-app deployment, and the TARGETS column, showing the current number of http_requests_per_second):</div>
<CodeBlock code={`NAME                        REFERENCE                           TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
prometheus-custom-metrics   Deployment/prometheus-example-app   0/500m    1         5         1          19s
`} /><br/>
<div className="text">In the last step, you'll use a script provided in this repository to put some pressure on the target (the prometheus-example-app). This script will make a bunch of quick HTTP requests, acting like multiple users accessing the app simultaneously (it's enough for showing how things work).</div>
<div className="text">To make it easier to see what's happening, it's best to split your screen into two separate windows. You can do this with a tool like tmux. Then, in the first window, you'll run a script called custom_metrics_service_load_test. You can stop it anytime by pressing Ctrl+C.</div>
<CodeBlock code={`./custom_metrics_service_load_test.sh`} /><br/>
<div className="text">Next, in the second window, you'll want to keep an eye on what's happening with the HPA resource. To do this, you can use the kubectl get command with the -w (watch) flag. This command will continuously show you updates in real time.</div>
<CodeBlock code={`kubectl get hpa -n prometheus-custom-metrics-test -w`} /><br/>
<!-- <image needs to add here > -->
<div className="text">You can watch as the autoscaler starts working when the load increases (while the load generator script is running). It'll increase the number of replicas for the prometheus-example-app deployment. Once you stop the load generator script, there's a waiting period, and after about 5 minutes, the number of replicas goes back to the original value of 1.</div>
<div className="head">Highlighted Phases:</div>
<div className="text">Phase 1: This is when things are ramping up. You'll see the HPA gradually increasing the number of replicas from 1 to 8 as the initial load increases from around 2140 milli-requests per second to a more manageable 620 milli-requests per second with more Pods added.</div>
<div className="text">Phase 2: Things start to stabilize here. The current load has small ups and downs, staying between 520-540 milli-requests per second.</div>
<div className="text">Phase 3: In this phase, there's a sudden increase in load, going over 10% of the threshold value to 562 milli-requests per second. Since we're out of the hysteresis window, the HPA adds more replicas (9) to stabilize the system. This quickly brings the load back down to around 480 milli-requests per second.</div>
<div className="text">Phase 4: Here, we stop the load generator script. You'll see the application's load decrease rapidly. After about 5 minutes (the default cooldown time), the number of replicas goes back to the minimum value of 1.</div>
<div className="text">Let's say we want to keep the number of HTTP requests close to a certain limit, like our threshold. The HPA won't keep increasing the number of replicas if the average number of requests is already close to the threshold (let's say within ± 10%). Even if we haven't hit the upper limit yet, this helps prevent constant changes in the number of replicas. This idea is called hysteresis. It's like a stabilizing factor that helps avoid bouncing back and forth between different replica counts. Hysteresis is important because it keeps systems more stable, preventing constant ups and downs.</div><br/>
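As a back-of-the-envelope sketch of the scaling decision, the HPA controller computes desiredReplicas as ceil(currentReplicas × currentMetric / targetMetric), and skips scaling while the ratio stays inside the roughly ±10% tolerance band. The numbers below are invented for illustration:

```shell
# Sketch of the HPA scaling decision with made-up values.
current_replicas=4
current_value=620   # observed milli-requests/sec per pod (invented)
target_value=500    # the HPA's targetAverageValue of 500m

# ratio = currentMetric / targetMetric
ratio=$(awk -v c="$current_value" -v t="$target_value" 'BEGIN { printf "%.3f", c / t }')

# Inside the ~10% tolerance band, the HPA leaves the replica count alone.
within_tolerance=$(awk -v r="$ratio" 'BEGIN { print ((r > 0.9 && r < 1.1) ? "yes" : "no") }')

# desiredReplicas = ceil(currentReplicas * ratio)
desired=$(awk -v n="$current_replicas" -v r="$ratio" \
    'BEGIN { d = n * r; print ((d == int(d)) ? d : int(d) + 1) }')

echo "ratio=$ratio within_tolerance=$within_tolerance desired_replicas=$desired"
```

Here the ratio (1.24) is outside the band, so the HPA would scale 4 replicas up to 5; at 520-540 milli-requests per second the ratio stays within tolerance and the replica count holds steady, which is the stabilizing effect described above.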
<div className="con">Conclusion</div>
<div className="text">In this guide, you discovered how to make your application adjust its size based on its needs. We used some tools to measure this, like metrics-server and Prometheus. You also got to see how this works in real life. The HPAs automatically change your application's size to handle more visitors or traffic, keeping everything running smoothly.</div>
<div className="text"><strong>Excited to try it out? Give it a go with your own applications and see how it works!</strong></div> |