
September 12, 2016

Java performance improvements

IBM continues to improve the performance of Java on the mainframe. To show this I've taken a snapshot of the performance improvements across the latest Java releases. The operating system was SLES 12 SP1, running on a z13 LPAR with 6 cores and SMT enabled.

[Chart: Java performance improvements on Linux on z]


As you can see, there is a solid 33% improvement going from the first Java 7 version to the latest Java 8 SR3 FP10 version.

So the first recommendation when you have Java performance problems with Linux on System z is to try a more recent Java version.
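
To check which version is currently installed, a quick look at the version string helps (the exact output format varies between IBM Java releases):

java -version
# the build string contains the service release (SR) and fix pack (FP) level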




December 31, 2015

SLES 12 toolchain module available for Linux on z



SUSE has released the toolchain module for Linux on System z. This is the first officially supported gcc compiler that supports the z13.
To install it, add the product and update repositories with "yast2 repositories"; then you can install it with

# zypper install sle-module-toolchain-release
Loading repository data...
Reading installed packages...
Resolving package dependencies...

The following 10 NEW packages are going to be installed:
  cpp5 gcc5 gcc5-c++ gcc5-fortran gcc5-locale libgfortran3 libstdc++6-devel-gcc5 patterns-toolchain-gcc5
  sle-module-toolchain-release sle-module-toolchain-release-POOL

The following NEW pattern is going to be installed:
  gcc5

The following NEW product is going to be installed:
  "Toolchain Module"

The following 6 recommended packages were automatically selected:
  cpp5 gcc5-c++ gcc5-fortran gcc5-locale libstdc++6-devel-gcc5 patterns-toolchain-gcc5

10 new packages to install.
Overall download size: 26.2 MiB. Already cached: 0 B  After the operation, additional 136.1 MiB will be
used.
Continue? [y/n/? shows all options] (y): y

.....


As you can see, you get C, C++, and Fortran compilers. To enable z13 instructions, use the -march=z13 option.
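
As a minimal sketch, compiling for z13 with the new compiler could look like this (the compiler binary is assumed to be named gcc-5; verify with "gcc-5 --version"):

gcc-5 -O3 -march=z13 -o myprog myprog.c   # myprog.c is a placeholder source file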

August 19, 2015

Minecraft on the Mainframe

LinuxCon video - Joran Siu

Joran Siu from the IBM Java team installed Minecraft Server on the mainframe. In this entertaining talk he shows the results and the optimizations used, some of them only available on zEC12 and z13.

June 22, 2015

New Whitepaper "z/VM 6.3 HiperDispatch - Polarization Modes and Middleware Performance"

This white paper looks at the effect of using the HiperDispatch feature (introduced with z/VM 6.3) with a mixed Linux workload. It also provides results for applying the scalability APAR VM65586.
The improvements in the mixed workload, which includes WebSphere and an Oracle database, show that this APAR is valuable not only for z13 but for everyone with a larger number of IFLs. And if you are still running an old pre-6.3 z/VM release, now is the time to upgrade!

April 10, 2014

Linux performance guides for distributions

I'm often asked for the performance guides from Red Hat and SUSE. So here are the links for that:
(updated 7/7/2014)

February 13, 2014

IBM XL C/C++ for Linux on System z beta program

IBM has launched a beta program for a new compiler for Linux on System z. If you've been working with IBM platforms you may already know the xlc compiler. This compiler is now available as a beta for Linux on System z as well.
Interested customers and software vendors should register for it at this registration link (free IBM ID required).

The key difference from the gcc included in the distributions is that this compiler is independent of the distribution. So as a software developer you can use this compiler, and more importantly the generated binaries, on all supported Linux distributions. That way you can get improvements like zEC12 exploitation earlier, instead of waiting for the next gcc in a distribution.

And my personal hope is that the performance will be good as well. We will know when the compiler is generally available.

December 20, 2013

Oprofile on zLinux - how to setup and use

(updated 1/8/2014)

Oprofile is a system-wide profiler that is available in all major distributions for Linux on System z. Here are the setup and usage for RHEL 6 and SLES 11.

The following example is for RHEL 6.5 (kernel 2.6.32-431.1.2.el6.s390x). For other kernel levels you need to adapt the package versions, but basically it works the same. The Red Hat description is here.

First step: install the required packages:
  • oprofile-0.9.7-1.el6.s390x
  • oprofile-jit-0.9.7-1.s390x (only needed for profiling Java code)
  • oprofile-gui-0.9.7-1.el6.s390x (only needed if you want the GUI)
  • kernel-debuginfo-2.6.32-431.1.2.el6.s390x.rpm
  • kernel-debuginfo-common-s390x-2.6.32-431.1.2.el6.s390x.rpm
Note that the kernel-debuginfo packages are only available on RHN. See this howto on getting them. You need to log into your Red Hat Customer Portal for the full information. Also install from RHN the debuginfo package of any other distribution package you want to analyze.

Second step: configure oprofile
opcontrol --setup --vmlinux=/usr/lib/debug/lib/modules/`uname -r`/vmlinux

Third step: measure workload
opcontrol --start
run your workload
opcontrol --stop
opcontrol --dump 

Last step: call opreport or opannotate with the options you want. For the available options, see the respective man pages. One commonly used invocation is:
opreport --symbols
Don't be surprised by an entry with the symbol name vtime_stop_cpu: that's CPU idle in RHEL 6.

For SLES 11 SP3 the setup is similar. SUSE has a good description of how to use it in their Systems Analysis and Tuning Guide.

So in the first step you need to install oprofile-0.9.8-0.13.31.s390x.rpm from the SDK. Optionally install the kernel debuginfo package, e.g. kernel-default-debuginfo-3.0.76-0.11.1.s390x.rpm, as well as the debuginfo versions of any distribution packages you want to profile.

The vmlinux file for SLES is gzipped in /boot. If you have enough space there you can just gunzip it in place; otherwise put it in /tmp as the SUSE guide suggests. Then in the second step you set up oprofile by
opcontrol --setup --vmlinux=/boot/vmlinux-`uname -r` 
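
If /boot is short on space, a minimal sketch of the /tmp variant (file names assumed, check your /boot):

gunzip -c /boot/vmlinux-`uname -r`.gz > /tmp/vmlinux   # keeps the compressed original
opcontrol --setup --vmlinux=/tmp/vmlinux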

Step 3 & 4 for SLES 11 are the same as above. 

If you want to analyze data on another system use oparchive. It will generate a directory with all required data that you can compress and take off the system. So e.g.
oparchive -p <path to Linux modules> -o /tmp/myoutputdir

You can also include Java and JITed code into the profiling by adding
-agentlib:jvmti_oprofile 
to your Java options. For SLES 11 you need to add /usr/lib64/oprofile to your LD_LIBRARY_PATH. This is especially valuable if you don't yet know where to search for a problem. If you have identified Java code as the problem, a specialized Java profiler is probably the better choice.
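
Putting it together for a Java workload, a minimal sketch could look like this (the class name is a placeholder; adjust the library path to your installation):

export LD_LIBRARY_PATH=/usr/lib64/oprofile:$LD_LIBRARY_PATH   # needed on SLES 11
opcontrol --start
java -agentlib:jvmti_oprofile MyWorkload   # MyWorkload is a hypothetical class
opcontrol --stop
opcontrol --dump
opreport --symbols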


September 12, 2013

New Whitepaper covering IBM Filenet P8 on zLinux

IBM FileNet P8 has been available for zLinux for quite some time. It's a really scalable document management, content lifecycle, and workflow platform, which is then used by Enterprise Content Management or Business Process Management software. With that solution you can keep all the data secured on System z.
The new whitepaper "Linux on System z and IBM FileNet P8 5.1 Setup, Performance, and Scalability" (pdf version) covers the performance and scalability on zLinux. It turns out that this solution is well suited for the System z platform.

September 3, 2013

Updated Whitepaper: WebSphere Application Server - Idle Server Tuning

When running in a virtualized environment like z/VM, it's beneficial if the hypervisor knows whether a server is idle. Usually this is implemented by waiting a certain time before considering a server truly idle. The problem is that many applications and middleware products do housekeeping tasks far too often for this to be really effective, so any effort to lower this "noise" is good. The updated whitepaper "WebSphere Application Server - Idle Server Tuning" provides tuning suggestions for a WebSphere environment to reduce this noise.
On top of that, it also provides tuning recommendations to reduce the startup time. The team updated the paper to cover WAS v8 and v8.5.5, including the Liberty profile.

July 24, 2013

Large Systems Performance Reference (LSPR) for zLinux

The Large Systems Performance Reference (LSPR) published by IBM also includes a Linux workload. There you can get a first impression of the relative performance of new System z processors.
More details can then be found in the z Processor Capacity Reference (zPCR).

July 9, 2013

Cheat sheet for lock debugging in the Linux kernel

From time to time I get a performance problem that requires identifying the lock in the Linux kernel that causes too much lock contention. The newer kernels are well equipped to help you find that lock. Whether you can do something about it is another question.
The base documentation for this is in the kernel source under Documentation/locking/lockstat.txt. However, due to the performance impact of all that tracing, it is usually disabled in the distributions. For RHEL 6.4 there is a separate debug kernel that you need to install. Ensure that it's the default IPL/boot kernel, or select it in the IPL/boot menu.
SLES 11 is more difficult, as this requires a kernel rebuild with CONFIG_LOCK_STAT enabled. You need to contact SUSE service to get a kernel for your system.
If you have the system up with this enabled you should do the following:
  • echo 1 >/proc/sys/kernel/lock_stat
  • run your workload 
  • cat /proc/lock_stat > /tmp/lockreport.txt
  • echo 0 >/proc/sys/kernel/lock_stat
Usually you only need to look at the top few locks to find out what's going wrong.
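
As a consolidated sketch of the whole cycle (run as root; per lockstat.txt, writing 0 to /proc/lock_stat clears the counters):

echo 0 > /proc/lock_stat                    # clear any old statistics
echo 1 > /proc/sys/kernel/lock_stat         # enable collection
./my_workload                               # placeholder for your workload
cat /proc/lock_stat > /tmp/lockreport.txt   # save the report
echo 0 > /proc/sys/kernel/lock_stat         # disable collection again
head -n 40 /tmp/lockreport.txt              # the hottest locks are listed first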

June 21, 2013

zlib performance improvements

In my blog entry on RHEL 6.4 I mentioned that there have been performance enhancements for zlib compression. However, I never got around to actually measuring this until today.

I've taken the zlib test program called minigzip.c from the Red Hat zlib 1.2.3 and linked it dynamically against libz. Then I created a 2 GB data file by tarring up /usr/share in the Red Hat file system three times, so it has quite some compressible text in it. Finally I ran five rounds of compression of this file on each of RHEL 6.3 and RHEL 6.4, with default compression and "-9" maximum compression.
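
A rough sketch of this test setup (file names are placeholders, sizes will differ on your system):

tar cf /tmp/testdata /usr/share /usr/share /usr/share   # roughly 2 GB of compressible data
time ./minigzip < /tmp/testdata > /dev/null             # default compression
time ./minigzip -9 < /tmp/testdata > /dev/null          # maximum compression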

The result for the Red Hat update is a +13% throughput increase for maximum compression and still a +8% throughput increase for normal compression. Your mileage may vary, of course.

The same test comparing SLES 11.2 with the new SLES 11.3 (which has an upgrade to zlib 1.2.7) shows a +25% throughput increase for maximum compression and still a +13% throughput increase for normal compression.
This is a relative comparison: the numbers are higher on SLES because SLES 11.2 is significantly slower in this test than RHEL 6.3. The latest releases (6.4 and 11.3) again show about the same throughput.

Everyone using applications that dynamically link against zlib gets the improvement automatically. For applications that either ship their own version of zlib or link statically against it, the vendor needs to pick up the patch and put it into the next version.

June 3, 2013

New white paper: HyperPAV setup with z/VM and Red Hat Linux on zSeries

Parallel Access Volumes (PAV) allow you to have more than one outstanding I/O per volume on System z. However, PAV is not so easy to set up and maintain, which is why there is HyperPAV, which is quite easy to install and maintain. And it's supported by all in-service Linux distributions now.

The white paper has been removed from the IBM site, so the link is no longer working (updated 05/30/2015).
The white paper / howto "HyperPAV setup with z/VM and Red Hat Linux on zSeries" described the step-by-step setup of HyperPAV for z/VM and zLinux. So if you are using ECKD disks and have any I/O performance problems, make sure you've implemented this.

Since the white paper is gone, here are a few pointers to get you started:

The presentation "z/VM PAV and HyperPAV Support" and the z/VM HyperPAV web page give a good overview from the z/VM side, and the presentation "HyperPAV and Large Volume Support for Linux on System z" shows the Linux part (which basically works out of the box). And there is of course the Virtualization Workbook, which covers HyperPAV as well.
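
On the Linux side, a quick way to verify that HyperPAV alias devices are in use is lsdasd (output details vary by distribution):

lsdasd
# alias devices show up with a status of "alias" once they are set online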



March 11, 2013

How to limit CPU usage of suspect runaway processes

Have you ever had the problem that a Linux guest under z/VM was using "too much" CPU, and when you looked closer you identified a specific process in this guest? But you couldn't reach the application team, so just recycling this process isn't an option? And maybe it's still doing something reasonable, so a
kill -SIGSTOP [pid]
isn't an option.

The first option to reduce the impact is to lower the share of this specific guest from the z/VM side, i.e. the usual SET SHARE ... But this will impact all processes in the guest, so if you have more than one application in the guest this isn't really an option either.

The second option requires a tool called cpulimit. I've tried it on SLES 11 SP2+, and here is what's needed to get it going: download it from GitHub, build it on System z by calling make, and copy the binary to a default search location, e.g. /usr/local/bin.
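A rough sketch of those steps (the repository URL is an assumption, check the project page):

git clone https://github.com/opsengine/cpulimit.git
cd cpulimit && make                 # builds with the system gcc
cp src/cpulimit /usr/local/bin/     # the binary's location inside the tree may differ

Next find out the PID of the offending process, e.g. by using top: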

top - 14:58:30 up 11 min,  2 users,  load average: 12.38, 3.10, 1.19
Tasks: 109 total,   2 running, 107 sleeping,   0 stopped,   0 zombie
Cpu(s): 91.0%us,  0.1%sy,  0.0%ni,  8.9%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:     16124M total,     8645M used,     7479M free,        6M buffers
Swap:        0M total,        0M used,        0M free,      135M cached

  PID USER      PR  NI  VIRT  RES  SHR S   %CPU %MEM    TIME+  COMMAND
 3487 root      20   0 4724m  90m  11m S    909  0.6   4:41.35 java



Now limit this process to e.g. one CPU (I'm using 10 CPUs on this system) by

cpulimit --limit=100 --lazy --pid=3487

This tells the utility to limit the process to one CPU and to exit when the process ends. I had vmstat running while I entered this, and the reduction is quite OK at the five-second interval I've been using:

procs -----------memory---------- ---swap-- -----io---- -system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
12  0      0 7636040   6836 144752    0    0    11     1   88  118 26  0 74  0  0
33  0      0 7635816   6868 146000    0    0     0     2 2161 2634 89  0 11  0  0
34  0      0 7635816   6868 146120    0    0     0     0 2131 2576 89  0 11  0  0
36  0      0 7635816   6868 146236    0    0     0     0 2165 2640 89  0 11  0  0
 0  0      0 7635744   6868 146280    0    0     0     6  697 1421 15  0 84  0  0
 0  0      0 7635768   6868 146292    0    0     0     0  653 1454 11  0 89  0  0
 0  0      0 7635768   6868 146312    0    0     0     0  688 1371 11  0 89  0  0


Note that if you look at this in top, you still see quite some variance. Also don't expect it to be accurate to the last CPU cycle. But for the purpose here - reducing the CPU consumption from 9 CPUs to 1 without entirely stopping the application - it does the job.
There is one class of applications for which this doesn't work: everything that needs an open terminal. As Rob correctly stated in his blog, this approach will disconnect the terminal and not reconnect it again.

In newer distributions there is a third option called cgroups. Configured correctly, they let you move the offending PID into a limited group.
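
A minimal sketch with the cgroup v1 cpu controller (mount point and values assumed; distributions differ in how cgroups are set up):

mkdir /sys/fs/cgroup/cpu/limited
echo 100000 > /sys/fs/cgroup/cpu/limited/cpu.cfs_period_us   # 100 ms accounting period
echo 100000 > /sys/fs/cgroup/cpu/limited/cpu.cfs_quota_us    # quota equal to one full CPU
echo 3487 > /sys/fs/cgroup/cpu/limited/tasks                 # move the offending PID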

Everything should be tested on a test system first before being tried on production images!

January 10, 2013

Red Hat Enterprise Linux 5.9 released

Red Hat has announced the availability of RHEL 5.9. You can find more information on it here:
With this release, RHEL 5 moves into the "Production 2" phase of its life cycle. End of production is still a while out, but this means that from now on there will be only limited hardware enablement, no new software features, and no new installation images. For the exact details see the life cycle page from Red Hat.

From a mainframe perspective there are two new features included in this release: VDSO speeds up certain system calls (e.g. gettimeofday), and HyperPAV helps FICON-based I/O quite a bit. So please enable it, especially for Oracle databases!
An interesting enhancement in the subscription manager now allows a system to be "locked" to a certain release: packages won't be automatically updated to the next RHEL 5.10 if the system is locked to e.g. 5.9.
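
A minimal sketch of that release lock (command syntax assumed from the subscription-manager documentation):

subscription-manager release --set=5.9   # stay on 5.9 content
subscription-manager release --show      # verify the current setting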

July 31, 2012

Getting mutrace to work on zLinux

Recently I got the question of how to trace mutex contention in libpthread on zLinux. There are several solutions for this:
  1. Use SystemTap with the futexes.stp sample script
  2. Use Valgrind with the drd tool (see also this article) 
  3. Use mutrace as a lightweight tool
The last tool is one I discovered while searching for solutions. However, it wasn't clear whether it runs on zLinux or not. Usually it's just a ./configure and make to get a tool running, but this one turned out to be a little more difficult.

I started off on a standard SLES 11 SP2 with some of the development tools installed. So I downloaded the source from the mutrace git and unpacked it into a directory. Then I called the ./bootstrap.sh script. Sure enough, it failed:
...
+ aclocal -I m4
configure.ac:21: error: Autoconf version 2.68 or higher is required
configure.ac:21: the top level
autom4te: /usr/bin/m4 failed with exit status: 63
aclocal: autom4te failed with exit status: 63

SLES 11 SP2 has autoconf 2.63, which isn't that ancient, and SUSE has been patching and fixing it through the second service pack. So I gave it a try and modified configure.ac to accept a minimum level of 2.63. Next run:
...
checking for library containing bfd_init... no
configure: error: *** libbfd not found

This means that the system is missing the binutils devel package. After installing the binutils-devel package with
zypper install binutils-devel
the bootstrap script finished successfully. At the end I noticed that it used -O0 in the gcc options, which from a performance perspective is really bad on zLinux. So I changed that in the Makefile to -O2.

Now only the compile had to work, and sure enough it ended with an error:
mutrace.c: In function setup:
mutrace.c:441: error: #pragma GCC diagnostic not allowed inside functions
mutrace.c:442: error: #pragma GCC diagnostic not allowed inside functions
mutrace.c:444: error: #pragma GCC diagnostic not allowed inside functions
make[1]: *** [libmutrace_la-mutrace.lo] Error 1

So this tool was using an advanced gcc feature that SUSE's gcc 4.3 didn't have. Fortunately SUSE includes an updated gcc 4.6 that can be installed alongside the standard system compiler. The package name is gcc46, and instead of gcc you call gcc-4.6. After changing the Makefile once more, the compile went smoothly.
Finally, I tried it on a small test program and it seems to work fine.
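For reference, mutrace is invoked as a wrapper around the target program; a minimal sketch:

mutrace ./a.out   # a.out is the small test program used here

which then prints a report like the following: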
mutrace: Showing statistics for process a.out (PID: 12050).
mutrace: 1 mutexes used.

Mutex #0 (0x0x80003088) first referenced by:
        /root/mutrace-e23dc42/.libs/libmutrace.so(pthread_mutex_lock+0x9e) [0x3fffd07c28e]
        ./a.out(functionCount1+0x20) [0x80000e34]
        /lib64/libpthread.so.0(+0x836e) [0x3fffd05436e]
        /lib64/libc.so.6(+0xef17e) [0x3fffcfb417e]

mutrace: Showing 1 mutexes in order of (write) contention count:

 Mutex #   Locked  Changed    Cont. cont.Time[ms] tot.Time[ms] avg.Time[ms] Flags
       0       83       13        7         0.155        0.089        0.001 M-.--.
     ...      ...      ...      ...           ...          ...          ... ||||||
                                                                            /|||||
          Object:                                      M = Mutex, W = RWLock /||||
           State:                                  x = dead, ! = inconsistent /|||
             Use:                                  R = used in realtime thread /||
      Mutex Type:                   r = RECURSIVE, e = ERRORCHECK, a = ADAPTIVE /|
  Mutex Protocol:                                       i = INHERIT, p = PROTECT /
     RWLock Kind:  r = PREFER_READER, w = PREFER_WRITER, W = PREFER_WRITER_NONREC

mutrace: Note that rwlocks are shown as two lines: write locks then read locks.

mutrace: Note that the flags column R is only valid in --track-rt mode!

mutrace: 1 condition variables used.

Condvar #0 (0x0x800030b0) first referenced by:
        /root/mutrace-e23dc42/.libs/libmutrace.so(pthread_cond_wait+0x7a) [0x3fffd07caea]
        ./a.out(functionCount1+0x32) [0x80000e46]
        /lib64/libpthread.so.0(+0x836e) [0x3fffd05436e]
        /lib64/libc.so.6(+0xef17e) [0x3fffcfb417e]

mutrace: Showing 1 condition variables in order of wait contention count:

  Cond #    Waits  Signals    Cont. tot.Time[ms] cont.Time[ms] avg.Time[ms] Flags
       0        6       67        0        0.106         0.000        0.000     -.
     ...      ...      ...      ...          ...           ...          ...     ||
                                                                                /|
           State:                                     x = dead, ! = inconsistent /
             Use:                                     R = used in realtime thread

mutrace: Note that the flags column R is only valid in --track-rt mode!

mutrace: Total runtime is 0.319 ms.

mutrace: Results for SMP with 16 processors.


July 27, 2012

DB2 Connect high CPU utilization

DB2 Connect servers are usually a good target for consolidation. Recently we observed relatively high CPU utilization even though only a small workload was running. The oprofile and strace output showed that the system was busy doing semget() calls that failed, so a resource was missing. In the end it turned out to be a known problem in DB2 that is fixed starting with DB2 9.7 FP5 and can be circumvented in older versions by issuing a "db2trc alloc" during startup.
This is a typical consolidation problem: on individual dedicated servers it usually isn't even noticed, but after consolidation it becomes visible. An additional positive effect of fixing the problem is improved throughput at higher transaction rates.

July 4, 2012

New Whitepaper "Using the Linux cpuplugd Daemon to manage CPU and memory resources from z/VM Linux guests"

CPU and memory resources are normally shared in a virtualized environment, so multiple guests are fighting for the same resources. Before the release of SLES 11 SP2 and RHEL 6.2, the automatic management of the number of virtual CPUs and of the memory used by a guest was quite limited. Each system required special attention and tuning of cpuplugd, the daemon that does this autonomic management in Linux on System z. Even then it had to be disabled for many systems.
The newer releases have a vastly improved daemon that now allows more detailed rules for adding and removing CPUs and memory for a guest. The drawback of more tuning knobs is more tuning knobs.
So this whitepaper tries to develop a recommended set of parameters to get the most benefit with the least effort. Furthermore, results of measurements and experiments are shown, together with the parameters used, for advanced tuning.
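
For illustration, a fragment of /etc/sysconfig/cpuplugd could look like this (the rules shown are just a sketch close to the shipped defaults; see the white paper for its recommended settings):

CPU_MIN="1"                   # never go below one CPU
CPU_MAX="0"                   # 0 means no upper limit
UPDATE="1"                    # evaluate the rules every second
HOTPLUG="(loadavg > onumcpus + 0.75) & (idle < 10.0)"
HOTUNPLUG="(loadavg < onumcpus - 0.25) | (idle > 50)"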
Be aware that if you want cpuplugd to control memory and you run with more than one virtual CPU, you really want the fix for APAR VM65060 installed.
Also, a bug has been discovered in the base of SLES 11 and RHEL 6 that's been fixed with the maintweb kernel 3.0.42-0.7.3 for SLES 11 and with the base RHEL 6.4 for Red Hat.

March 26, 2012

New Whitepaper: "Java Design and Coding for Virtualized Environments"

Ever wondered why your Java application is not running as smoothly as it should in a virtualized environment? There are multiple reasons for this, and Steve Wehr gives some explanations in this new whitepaper. It isn't a complete answer and not everything is black and white, but it should get you started. The topics covered in the paper are:
  • Why is Virtualization a Problem for Applications?
  • How to Waste CPU in your Java Application
  • How to Waste Memory in your Java Application
  • Best Practices for Java Applications in Virtualized Environments
  • Employ Strategies that Encourage Efficient Application Design for Virtualization
Of course there are other "performance bugs" as well, but from what I see in my daily work, running applications in a virtualized environment with shared resources will make those bugs visible.

March 8, 2012

Whitepaper update: "Explore Decimal Floating Point support in Linux on System z"

For a few years now, System z has offered hardware support for decimal floating point operations as defined by IEEE 754-2008. This allows exact and very fast calculations in the decimal (rather than the normal binary) system in the processor.

You can check whether your system supports this with
cat /proc/cpuinfo | grep features
features        : esan3 zarch stfle msa ldisp eimm dfp edat etf3eh highgprs


If you see "dfp" included in the feature list, your system is enabled for decimal floating point.

The new instructions can be used from Java and C/C++. For Java, use the java.math.BigDecimal class in a recent JVM. Other languages need compiler and library support plus language extensions; ISO/IEC TR 24732 defines this for C and ISO/IEC TR 24733 for C++.

This updated whitepaper describes the details of what's needed to exploit this feature. It includes some performance results as well, plus all the necessary references.