Chaos Engineering – Simulating CPU Spike

This article in our chaos engineering series shows how to simulate a CPU spike of up to 100% on a host (or container). A thread stuck in an infinite loop keeps one CPU core fully busy, so running enough such threads saturates the whole machine. Here is a sample program from the open-source BuggyApp application that causes the CPU to spike.

public class CPUSpikeDemo {

  public static void start() {
    new CPUSpikerThread().start();
    new CPUSpikerThread().start();
    new CPUSpikerThread().start();
    new CPUSpikerThread().start();
    new CPUSpikerThread().start();
    new CPUSpikerThread().start();
    System.out.println("6 threads launched!");
  }
}

public class CPUSpikerThread extends Thread {

  @Override
  public void run() {

    while (true) {

      // Just looping infinitely
    }
  }
}

The ‘CPUSpikeDemo’ class above launches six threads of type ‘CPUSpikerThread’. The ‘CPUSpikerThread’ class contains a ‘while (true)’ loop with an empty body, so any thread that runs it loops forever. Since six threads execute this code, all six spin indefinitely, and CPU consumption on the machine skyrockets when the program runs.
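
For a chaos experiment, you may prefer a spike that ends on its own instead of running until the process is killed. The following is a minimal sketch and is not part of BuggyApp: the class name `BoundedCpuSpike`, the `spin` method, and the 30-second duration are all illustrative assumptions. It starts busy-looping threads just like the program above, but each loop exits once a deadline has passed.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of BuggyApp: a CPU spike with a time limit.
class BoundedCpuSpike {

    /**
     * Starts threadCount busy-looping threads and blocks until all of them
     * have stopped, which happens once the given duration has elapsed.
     * Returns the number of threads that were started.
     */
    static int spin(int threadCount, Duration duration) {
        final long deadline = System.nanoTime() + duration.toNanos();
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < threadCount; i++) {
            Thread t = new Thread(() -> {
                // Busy loop: burns CPU until the deadline passes.
                // The subtraction form is safe against nanoTime overflow.
                while (System.nanoTime() - deadline < 0) {
                    // intentionally empty
                }
            }, "bounded-cpu-spiker-" + i);
            t.setDaemon(true); // never keep the JVM alive just for the spike
            t.start();
            threads.add(t);
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop waiting
                break;
            }
        }
        return threads.size();
    }

    public static void main(String[] args) {
        // One spinning thread per available processor, for an assumed 30 seconds.
        int cores = Runtime.getRuntime().availableProcessors();
        int started = spin(cores, Duration.ofSeconds(30));
        System.out.println(started + " threads finished spinning");
    }
}
```

Sizing the thread count from `availableProcessors()` is a design choice for this sketch: it aims for full saturation regardless of instance size, whereas the original program uses a fixed count of six.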

We launched the above BuggyApp program on a ‘t3a.medium’ EC2 instance, which has 2 vCPUs. Below is the output from the UNIX performance monitoring tool ‘top’. Notice the overall CPU usage reaching 100%.

img

Fig: ‘top’ tool showing CPU consumption spiking up to 100%

How to diagnose a CPU spike?

As highlighted in this article, you can use a manual approach to do root cause analysis:

  1. Capture a thread dump from the application
  2. Capture the ‘top -H -p {PID}’ output
  3. Marry the outputs of steps 1 and 2 to identify the root cause of the CPU spike problem
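
The marrying step above can be sketched in code. ‘top -H’ reports thread IDs in decimal, while HotSpot thread dumps report the native thread ID in an `nid=` field, which is hexadecimal (`nid=0x...`) in many JDK versions and decimal in some newer ones. The sketch below accepts both forms. The class `HotThreadMatcher`, its methods, and the sample data are hypothetical, and the parser assumes the default procps ‘top’ column layout, where %CPU is the ninth column.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: joins 'top -H' rows with thread dump entries by native thread ID.
class HotThreadMatcher {

    // Matches a thread header line such as: "Thread-0" #12 ... nid=0x1a2f runnable
    private static final Pattern DUMP_HEADER =
            Pattern.compile("^\"([^\"]+)\".*\\bnid=(0x[0-9a-fA-F]+|\\d+)\\b");

    // Made-up sample data for illustration only (6703 decimal == 0x1a2f).
    static final String SAMPLE_TOP =
            "  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND\n"
          + " 6703 ec2-user  20   0 3215904  98764  21040 R  33.3   2.5   1:02.11 java\n"
          + " 6704 ec2-user  20   0 3215904  98764  21040 R  33.0   2.5   1:01.97 java\n"
          + " 6690 ec2-user  20   0 3215904  98764  21040 S   0.0   2.5   0:00.05 java\n";

    static final String SAMPLE_DUMP =
            "\"Thread-0\" #12 prio=5 os_prio=0 tid=0x00007f2a1c0f1000 nid=0x1a2f runnable\n"
          + "   java.lang.Thread.State: RUNNABLE\n"
          + "\tat com.buggyapp.cpuspike.CPUSpikerThread.run(CPUSpikerThread.java:12)\n"
          + "\"Thread-1\" #13 prio=5 os_prio=0 tid=0x00007f2a1c0f3000 nid=0x1a30 runnable\n"
          + "\"main\" #1 prio=5 os_prio=0 tid=0x00007f2a1c00a000 nid=0x1a22 waiting on condition\n";

    /** Maps native thread ID to thread name, read from thread dump header lines. */
    static Map<Long, String> threadNamesByNativeId(String threadDump) {
        Map<Long, String> names = new LinkedHashMap<>();
        for (String line : threadDump.split("\\R")) {
            Matcher m = DUMP_HEADER.matcher(line);
            if (m.find()) {
                // Long.decode handles both "0x1a2f" and plain decimal
                names.put(Long.decode(m.group(2)), m.group(1));
            }
        }
        return names;
    }

    /** Returns thread name to %CPU for threads at or above minCpuPercent, in 'top' order. */
    static Map<String, Double> hotThreads(String topOutput, String threadDump, double minCpuPercent) {
        Map<Long, String> names = threadNamesByNativeId(threadDump);
        Map<String, Double> hot = new LinkedHashMap<>();
        for (String line : topOutput.split("\\R")) {
            String[] cols = line.trim().split("\\s+");
            // Data rows start with a numeric thread ID; header and summary lines do not.
            if (cols.length < 12 || !cols[0].matches("\\d+")) {
                continue;
            }
            double cpu;
            try {
                cpu = Double.parseDouble(cols[8]); // %CPU column in the assumed layout
            } catch (NumberFormatException e) {
                continue;
            }
            String name = names.get(Long.parseLong(cols[0]));
            if (name != null && cpu >= minCpuPercent) {
                hot.put(name, cpu);
            }
        }
        return hot;
    }

    public static void main(String[] args) {
        hotThreads(SAMPLE_TOP, SAMPLE_DUMP, 30.0)
                .forEach((name, cpu) -> System.out.println(name + " -> " + cpu + "% CPU"));
    }
}
```

Once the hot threads are identified by name, their stack traces in the same thread dump show which code they are executing.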

Alternatively, you can use an automated root cause analysis tool like yCrash, which automatically captures application-level data (thread dump, heap dump, Garbage Collection log) and system-level data (netstat, vmstat, iostat, top, top -H, dmesg, …), then marries these two datasets to generate a root cause analysis report instantly. Below is the report generated by the yCrash tool when the above sample program is executed:

img

img

img

Fig: yCrash tool pointing out the lines of code causing the CPU spike

From the report, you can see that yCrash identifies 6 threads as the cause of the CPU spike. The ‘CPU | Memory’ section of the report shows the CPU consumption of each thread (which is > 30%), roughly what six spinning threads sharing the instance's 2 CPUs would each get. The tool also points out the exact line of code, i.e., com.buggyapp.cpuspike.CPUSpikerThread.run(CPUSpikerThread.java:12), that is causing the infinite loop. Equipped with this information, one can easily go ahead and fix the problematic code.
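
If you only need a quick per-thread CPU reading from inside the JVM, the standard `java.lang.management.ThreadMXBean` API can provide one. The sketch below is an illustration of that API only and makes no claim about how yCrash works internally. The class `InProcessHotThreads`, the method `busiestThread`, and the one-second sampling window are assumptions.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper: finds the thread that used the most CPU during a short sample.
class InProcessHotThreads {

    /**
     * Reads per-thread CPU time twice, sampleMillis apart, and describes the thread
     * that consumed the most CPU in between. Returns null if the JVM does not support
     * thread CPU time measurement or no thread could be measured.
     */
    static String busiestThread(long sampleMillis) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        if (!mx.isThreadCpuTimeSupported() || !mx.isThreadCpuTimeEnabled()) {
            return null;
        }
        long[] ids = mx.getAllThreadIds();
        long[] before = new long[ids.length];
        for (int i = 0; i < ids.length; i++) {
            before[i] = mx.getThreadCpuTime(ids[i]); // -1 if the thread is not alive
        }
        long start = System.nanoTime();
        try {
            Thread.sleep(sampleMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
        long elapsed = System.nanoTime() - start;

        long bestId = -1;
        long bestDelta = -1;
        for (int i = 0; i < ids.length; i++) {
            long after = mx.getThreadCpuTime(ids[i]);
            if (before[i] < 0 || after < 0) {
                continue; // thread ended during the sample
            }
            long delta = after - before[i];
            if (delta > bestDelta) {
                bestDelta = delta;
                bestId = ids[i];
            }
        }
        if (bestId < 0) {
            return null;
        }
        ThreadInfo info = mx.getThreadInfo(bestId, 1); // depth 1: top stack frame only
        if (info == null) {
            return null;
        }
        StackTraceElement[] stack = info.getStackTrace();
        String topFrame = stack.length > 0 ? stack[0].toString() : "(no Java frames)";
        long percent = elapsed > 0 ? Math.round(100.0 * bestDelta / elapsed) : 0;
        return info.getThreadName() + " used about " + percent
                + "% of one core, top frame: " + topFrame;
    }

    public static void main(String[] args) {
        // Demo: one spinning daemon thread, then sample for an assumed one second.
        Thread spinner = new Thread(() -> { while (true) { } }, "demo-spinner");
        spinner.setDaemon(true);
        spinner.start();
        System.out.println(busiestThread(1000));
    }
}
```

This approach only sees threads of the JVM it runs in, so it complements the thread dump and ‘top -H’ data rather than replacing it.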
