March 11, 2013

How to limit CPU usage of suspect runaway processes

Have you ever had a Linux guest under z/VM using "too much" CPU, and on closer inspection traced it to one specific process in that guest? You can't reach the application team, so simply recycling the process isn't an option. And it may still be doing something reasonable, so a
kill -SIGSTOP [pid]
isn't an option either.

The first option for reducing the impact is to lower the share of this specific guest from the z/VM side, i.e. the usual SET SHARE ... But this affects all processes in the guest, so if the guest runs more than one application this isn't really an option.

The second option requires a tool called cpulimit. I've tried it on SLES 11 SP2+ and here is what's needed to get it going. Download it from GitHub and then build it on System z by calling make. Copy the binary to a directory in the default search path, e.g. /usr/local/bin. Next find out the PID of the offending process, e.g. by using top:

top - 14:58:30 up 11 min,  2 users,  load average: 12.38, 3.10, 1.19
Tasks: 109 total,   2 running, 107 sleeping,   0 stopped,   0 zombie
Cpu(s): 91.0%us,  0.1%sy,  0.0%ni,  8.9%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:     16124M total,     8645M used,     7479M free,        6M buffers
Swap:        0M total,        0M used,        0M free,      135M cached

  PID USER      PR  NI  VIRT  RES  SHR S   %CPU %MEM    TIME+  COMMAND
 3487 root      20   0 4724m  90m  11m S    909  0.6   4:41.35 java
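
If you'd rather script this step than read it off top, something like the following should work. It is only a sketch: it assumes the procps version of ps (needed for --sort), and top_cpu is a helper name made up for this example. Keep in mind that ps reports %CPU averaged over the lifetime of the process, so the numbers will not match top's interval view exactly.

```shell
# List the N processes with the highest CPU usage (default 5), highest first.
# top_cpu is a hypothetical helper name; --sort requires procps ps.
top_cpu() {
    # +1 so the header line is kept in addition to the N processes
    ps -eo pid,user,pcpu,comm --sort=-pcpu | head -n "$(( ${1:-5} + 1 ))"
}

top_cpu 5
```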



Now limit this process to, e.g., one CPU (this system has 10 CPUs) with

cpulimit --limit=100 --lazy --pid=3487

This tells the utility to limit the process to one CPU and to exit when the process ends. I had vmstat running at a five second interval while I entered this, and the reduction shows up within one interval: user CPU drops from 89% to about 11%, which is roughly one of the ten CPUs:

procs -----------memory---------- ---swap-- -----io---- -system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
12  0      0 7636040   6836 144752    0    0    11     1   88  118 26  0 74  0  0
33  0      0 7635816   6868 146000    0    0     0     2 2161 2634 89  0 11  0  0
34  0      0 7635816   6868 146120    0    0     0     0 2131 2576 89  0 11  0  0
36  0      0 7635816   6868 146236    0    0     0     0 2165 2640 89  0 11  0  0
 0  0      0 7635744   6868 146280    0    0     0     6  697 1421 15  0 84  0  0
 0  0      0 7635768   6868 146292    0    0     0     0  653 1454 11  0 89  0  0
 0  0      0 7635768   6868 146312    0    0     0     0  688 1371 11  0 89  0  0
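
The value passed to --limit is a percentage where 100 stands for one full CPU, so on a multi-CPU guest two CPUs would be 200 and so on. A small sketch of that arithmetic follows; limit_for_cpus is a helper name invented here, and the script only prints the cpulimit command instead of running it.

```shell
# Convert a number of CPUs into a cpulimit --limit value (100 = one full CPU).
# limit_for_cpus is a hypothetical helper, not part of cpulimit.
limit_for_cpus() {
    echo $(( $1 * 100 ))
}

pid=3487                        # PID taken from the top output above
limit=$(limit_for_cpus 1)       # one CPU
# Print the command for review rather than executing it
echo "cpulimit --limit=$limit --lazy --pid=$pid"
```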


Note that if you look at this in top, you still see quite some variance. Also don't expect it to be exact to the last CPU cycle. But for the purposes here - reducing the CPU consumption from about nine CPUs to about one without entirely stopping the application - it does the job.
There is one class of applications for which this doesn't work: anything that needs an open terminal. As Rob correctly stated in his blog, this approach disconnects the terminal and does not reconnect it.

In newer distributions there is a third option called cgroups. Configured correctly, you should be able to move the offending PID into a group with a CPU limit.
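
I haven't run this on the system above, so take the following as a rough sketch only. It assumes the cgroup v1 cpu controller is mounted at /sys/fs/cgroup/cpu and that the kernel supports CFS bandwidth control (the cpu.cfs_* files); the group name "limited" and the function name are made up. On systems using cgroup v2 the file names differ.

```shell
# Sketch: cap a process at N CPUs using the cgroup v1 cpu controller.
# CGROOT and the group name "limited" are assumptions; adjust to your mount point.
CGROOT=${CGROOT:-/sys/fs/cgroup/cpu}

limit_pid_to_cpus() {
    pid=$1; cpus=$2; group=${3:-limited}
    period=100000                        # scheduling period in microseconds
    mkdir -p "$CGROOT/$group" || return 1
    echo "$period" > "$CGROOT/$group/cpu.cfs_period_us" || return 1
    # quota = CPU time allowed per period; N CPUs = N * period
    echo $(( cpus * period )) > "$CGROOT/$group/cpu.cfs_quota_us" || return 1
    # writing to cgroup.procs moves the whole process, i.e. all of its threads
    echo "$pid" > "$CGROOT/$group/cgroup.procs"
}

# Example (as root): limit_pid_to_cpus 3487 1
```

Writing the PID to cgroup.procs rather than tasks matters for something like the java process above, since tasks would move only a single thread.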

Test everything on a test system first before trying it on production images!