December 15, 2011

Gathering performance data with sysstat

The sysstat package is included in all distributions but not always installed by default. It's a collection of performance monitoring tools; check the manual pages for the options your installed version supports. It's platform independent, so everything described here works on Linux in general.

After installation of the package there are two different ways to gather data.

Using the command line
Ad hoc data can be gathered by calling the data collector from the command line. For newer versions of sysstat the command

/usr/lib64/sa/sadc -S XALL -F 5 outfile.sa

and for older versions of sysstat the command

/usr/lib64/sa/sadc -d -F 5 outfile.sa

will collect data in 5 second intervals until it's interrupted by ^C. The -F forces a file compatible with the current sysstat version. The -S XALL enables all optional statistics, which includes the block device statistics of -S DISK; on older versions -d enables the collection of statistics for the block devices.
You can then convert the resulting binary file with

sar -A -f outfile.sa > outfile.txt

to get a readable report. sar has many options to select what's shown in the report; please check the man page for them. The -A option shows all the data.
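If you don't need everything that -A prints, the conversion step can be narrowed down to single areas. The following is only a sketch: the function name sar_reports and the output file names are my own choice, and while -u (CPU), -r (memory) and -d (block devices) are standard sar options, check the man page of your version before relying on them.

```shell
#!/bin/sh
# Sketch: write one text report per area from a binary sadc file.
# sar_reports and the *.txt names are illustrative, not part of sysstat.
sar_reports() {
    sa_file="$1"
    # Refuse to run if the binary data file is missing or unreadable.
    if [ ! -r "$sa_file" ]; then
        echo "cannot read $sa_file" >&2
        return 1
    fi
    sar -u -f "$sa_file" > "$sa_file.cpu.txt"    # CPU utilization
    sar -r -f "$sa_file" > "$sa_file.mem.txt"    # memory utilization
    sar -d -f "$sa_file" > "$sa_file.disk.txt"   # block device activity
}
```

Called as sar_reports outfile.sa, this leaves three smaller text files next to the binary file instead of one large report.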

Using cron
For regular monitoring the distributions configure sysstat to be started as a cron job or as a service. The history files are then put under /var/log/sa and converted once a day to a text file. The archive settings for this history can be configured in the configuration file /etc/sysstat/sysstat (SUSE) or /etc/sysconfig/sysstat (Red Hat).

For SLES you can install the cron settings with
  • SLES10: /etc/init.d/sysstat start
  • SLES11: /etc/init.d/boot.sysstat start
  • SLES12: systemctl start sysstat 
For RHEL5, RHEL6, RHEL7 the installation is done automatically.

For Ubuntu 16.04.1 cron jobs are disabled by default (after installation). You have to edit the file /etc/default/sysstat and change the variable ENABLED from "false" to "true". After that you have to restart the service: /etc/init.d/sysstat restart
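The edit on Ubuntu can also be scripted. This is a minimal sketch that assumes the line in the file reads exactly ENABLED="false" and that GNU sed (with -i) is available; the function name enable_sysstat is made up for this example.

```shell
#!/bin/sh
# Sketch: switch ENABLED from "false" to "true" in the sysstat defaults file.
# enable_sysstat is an illustrative name; the path defaults to the Ubuntu location.
enable_sysstat() {
    conf_file="${1:-/etc/default/sysstat}"
    # Only a line that starts with ENABLED="false" is rewritten.
    sed -i 's/^ENABLED="false"/ENABLED="true"/' "$conf_file"
}
```

Run enable_sysstat as root and restart the service afterwards as described above.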

In the end there is a file or link in /etc/cron.d/ that you can adapt, e.g. to change the 10 minute collection interval or to adjust the reports (e.g. by adding -S XALL). As usual, the documentation for the sa1 and sa2 commands used there is in the respective man pages.

(updated 10/12/2016)

December 13, 2011

New whitepaper: "Sharing A WebSphere Application Server V8 Installation Among Many Linux for IBM System z Servers"

In the new white paper "Sharing A WebSphere Application Server V8 Installation Among Many Linux for IBM System z Servers" you find a hands-on description of a more complex WAS server setup. This is the update of the white paper covering WebSphere Application Server V7.
So if you want to install once and use the installation in many Linux guests take a look at this.

New Redpaper: "Installing Oracle 11gR2 RAC on Linux on System z"

This Redpaper describes the installation of the latest Oracle DB on RHEL 5 and SLES 11 for Linux on System z. It covers both ECKD as well as SCSI installations and on the Oracle side the Grid Infrastructure and RAC.
It's good documentation that you should read before you start the installation!

December 5, 2011

z/VM 6.2 - beyond SSI and LGR

The main features in z/VM 6.2 are certainly Single System Image (SSI) and Live Guest Relocation (LGR). With that it's now possible to have up to four z/VMs in one cluster and move the Linux guests from one to the other. So finally a z/VM update will no longer require a Linux outage. IBM generated lots of technical and marketing material about this.

However, z/VM 6.2 also contains performance enhancements that help when running Linux guests and should not be neglected. See the links below for more details:
  • Many memory management improvements, all of them transparent to the Linux guests. So install them, and if you've been running Linux in memory overcommitted scenarios you'll get the performance benefit.
  • Additional performance enhancements, some of which have also been available as APARs on older z/VM releases.
So it's definitely worthwhile to update older versions to z/VM 6.2 just because of those enhancements. On top there are also several really useful technology exploitations available.

December 1, 2011

How to find software solutions for Linux on System z

There are multiple sources for finding middleware and solutions for Linux. As the web sites get updated this may change; let me know and I will adapt the description.
Please note that none of those catalogs is complete. They are all snapshots.

IBM Global Solution Directory (GSD)
Here you can create complex searches. To select Linux on System z select "IBM System z (Mainframe)" under platform configurations and then all the operating systems you are interested in. Finally click on "search" without entering anything else.
 
IBM Software (SWG) product compatibility
The most important reports are the operating system reports. When generating searches please be aware that there are two different entries for RHEL. The base one has RHEL2 - RHEL5 whereas the one titled RHEL server has RHEL6.
SUSE Enterprise Linux server is all under SLES.

IBM System z page (unfortunately discontinued by IBM end of 2013)
This page lists the ISV solutions updated in approximately the last year.

Red Hat Product Catalog
All the products registered with Red Hat. After you've searched for a specific product you can check the platform and see where it's running.
Note that Red Hat is still using the search term "zSeries" for referring to "System z". So you may want to try this search term as well.

SUSE Linux Enterprise Software Catalog
SUSE has a catalog that can be searched quite easily. Click on "advanced search" and select "IBM System z" as the platform and you get all the vendors offering products for the mainframe that registered with SUSE.

November 28, 2011

Why doesn't the Linux iowait show up as busy time in the virtualization layer?

When a process in Linux has issued an IO and is waiting for a response there are basically two different possibilities. The good case is that there is other work pending and the CPU can continue to work. If there is no work left to do, the CPU changes its state to idle, and the time until either new work comes in or the IO is delivered is accounted as %iowait in tools like vmstat, iostat or sar. So you can view "iowait" as a shade of "idle".

From the perspective of the hypervisor such an idle CPU can be put to good use, e.g. by dispatching it to another guest that's in need of CPU. This is why from a virtualization layer perspective guest CPUs in iowait are usually accounted as idle.
For problem determination it would help to add an iowait measure to the hypervisor as well, as it would help to detect problems that are created by cloning inefficient guests, e.g. 100 servers all doing sync IO where async IO would have been possible.
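To see inside the guest how idle and iowait relate, you can compare two samples of the aggregate "cpu" line in /proc/stat. This is only a sketch: the function name cpu_shares is made up, and it assumes the usual field order after the "cpu" label (user, nice, system, idle, iowait, irq, softirq, steal) and ignores any later fields.

```shell
#!/bin/sh
# Sketch: share of idle and iowait time between two "cpu" lines of /proc/stat.
# cpu_shares is an illustrative name. $1 = earlier sample, $2 = later sample.
cpu_shares() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        # Remember the counters of the earlier sample (fields 2..9).
        NR == 1 { for (i = 2; i <= 9; i++) first[i] = $i }
        # Build the deltas from the later sample and print both shares.
        NR == 2 {
            for (i = 2; i <= 9; i++) { delta[i] = $i - first[i]; total += delta[i] }
            if (total == 0) { print "no ticks elapsed"; exit }
            # Field 5 is idle, field 6 is iowait; %d truncates to whole percent.
            printf "idle=%d%% iowait=%d%%\n", delta[5] * 100 / total, delta[6] * 100 / total
        }'
}
```

A typical call would be cpu_shares "$(grep '^cpu ' /proc/stat)" "$(sleep 5; grep '^cpu ' /proc/stat)". Both values are time in which the guest CPU was not executing anything, which is why the virtualization layer reports them together as idle.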



November 11, 2011

VDSO - what is this?

(updated 3/12/2013)

Virtual Dynamically-linked Shared Object (VDSO) is a shared library provided by the kernel. This allows normal programs to do certain system calls without the usual overhead of system calls like switching address spaces.

For Linux on System z three functions are currently accelerated in this way: gettimeofday, clock_getres, and clock_gettime. The most important one is probably gettimeofday. On a z196 system, using VDSO allows more than six times as many calls of this function as without it.

The newer distributions (RHEL 5.9+, RHEL 6, SLES 11) have this feature and it's enabled by default. In the rare case of an application that can't deal with such fast responses it can be turned off with the kernel parameter "vdso=0". If you are experiencing performance problems on older systems due to slow time operations, it's definitely worth checking whether this feature resolves them.

If you want to check whether your system already has it enabled, run
grep vdso /proc/self/maps
If this command finds an entry, it's enabled.
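For use in scripts the same check can be wrapped into a small function. This is a sketch under the assumption that the mapping is labelled [vdso] in the maps file; the function name vdso_status and its output strings are my own.

```shell
#!/bin/sh
# Sketch: report whether a [vdso] mapping is present.
# vdso_status is an illustrative name; the maps file defaults to /proc/self/maps.
vdso_status() {
    maps_file="${1:-/proc/self/maps}"
    # grep -q only sets the exit status, it prints nothing itself.
    if grep -q '\[vdso\]' "$maps_file"; then
        echo "vdso enabled"
    else
        echo "vdso not found"
    fi
}
```

Without an argument the function looks at the memory map of the grep process itself, just like the one-line command above.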

November 8, 2011

OCFS2 support for Linux on SLES

Recently there has been quite some discussion around the support for OCFS2. SUSE has published their support statements in two blog entries:
So the good news is that there will be continued support for a shared file system integrated in the SUSE distribution. SUSE support will require the installation of the High Availability Extensions (HAE) that are included for Linux on System z and POWER.

So it's best to view OCFS2 as a cluster file system that has been created by Oracle. Don't confuse it with the Oracle DB itself. If Oracle RAC is a requirement, you are not dependent on OCFS2. Oracle ASM is in my opinion better integrated with the Oracle DB and should be used.

October 28, 2011

Using cio_ignore under z/VM

More and more installation tools use the cio_ignore kernel parameter. This parameter has been developed to allow hiding of devices from Linux, especially when using production Linux systems in LPAR.
The approach used in installers / automation systems is to first exclude everything and then explicitly define a white list of devices that Linux should see and use. So something like

cio_ignore=all,!0.0.5000-0.0.5002,!0.0.4711

will significantly speed up the IPL boot process as only those four devices need to be scanned and initialized.

The drawback is that with a line like this under z/VM the console is not available. So you log into your green screen and it seems that Linux doesn't IPL as there are no messages. If you wait long enough you can log in using the normal network. The z/VM console has the number 0.0.0009, so the line above needs to be modified to

cio_ignore=all,!0.0.0009,!0.0.5000-0.0.5002,!0.0.4711

So in general cio_ignore is not really needed for a z/VM guest, as the access to devices can be controlled through the z/VM user directory. For guests that should come up fast in both LPAR and z/VM, add the console to the white list.
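If you generate the kernel parameter in an automation script, it's easy to make sure the console is never forgotten. The following is a sketch only: the function name build_cio_ignore is made up, and it assumes the z/VM console sits at 0.0.0009 as described above.

```shell
#!/bin/sh
# Sketch: build a cio_ignore parameter that always white-lists the z/VM console.
# build_cio_ignore is an illustrative name; arguments are devices or device ranges.
build_cio_ignore() {
    # Start by ignoring everything except the console at 0.0.0009.
    param='cio_ignore=all,!0.0.0009'
    for dev in "$@"; do
        # The "!" stays in single quotes so an interactive shell leaves it alone.
        param="${param}"',!'"${dev}"
    done
    printf '%s\n' "$param"
}
```

Calling build_cio_ignore 0.0.5000-0.0.5002 0.0.4711 prints the parameter for the four devices of the example plus the console.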

October 10, 2011

Methods to pause a z/VM guest: Optimize the resource utilization of idling servers

The startup time for servers has increased over the last years. So even for development and test servers it's now common to either let them run or pause them. On z/VM with Linux on System z there are two different methods to accomplish that:
  1. the basic Linux suspend / resume feature 
  2. CP STOP and BEGIN mechanism
There is now a new publication looking at this in more detail, especially focused on overall performance. The paper is available on Infocenter in HTML and as PDF.

October 7, 2011

Bug fix for Linux on System z extensive swapping

Recently there has been a small bug fix for the swap behavior of Linux on System z. The following problem has been observed by several users:

The first time Linux started swapping it was doing so really extensively even though only a small amount of additional memory was needed. Depending on the disk speed and size of the swap space this can take a while.

The fix is included in
  • RHEL 5.7.z update 2.6.18-274.11.1
  • RHEL 6.2 base 2.6.32-220
  • SLES 10 SP4 kernel update 2.6.16.60-0.91.1
  • SLES 11 SP1 kernel update 2.6.32.45-0.3.1

and will be available for the other distros in the future as well.

As the system becomes totally unresponsive during that period, I really recommend installing this fix if your system has more than a minimal swap space attached and you experience heavy first time swapping now and then. If the fix is not yet available for your distribution, contact your distribution partner or service provider to get a temp fix.


October 2, 2011

Oracle 11g R2 - RMAN compression eating up CPU resources

There have been several complaints with respect to the RMAN CPU usage on Linux on System z. At the bottom of this was always the use of the 'BASIC' compression method in RMAN.
So if your CPU usage is also too high during RMAN backups, consider two options:
  1. Use hardware compression in your tape unit. In this case don't use RMAN compression, or, if corporate policy requires it, set 'NONE' as the algorithm with the following RMAN command:
    CONFIGURE COMPRESSION ALGORITHM 'NONE';
  2. Use one of the advanced compression options. My recommendation for System z is to use the 'MEDIUM' compression algorithm. This corresponds to a basic zlib compression and offers a good trade-off between CPU consumption and compression ratio on System z. The required command is:
    CONFIGURE COMPRESSION ALGORITHM 'MEDIUM';
    Note that for this option you will need the "Advanced Compression Option (ACO)".
Further reading for the Oracle DBA:

September 22, 2011

Which Distribution is supported on which hardware?

A frequently asked question for Linux on System z is which distributions are supported with what hardware. There are two different sources for this:
  1. IBM is publishing tested platforms
  2. The distributors publish their hardware certifications
    • On the Red Hat page select "s390x" as the platform and the distribution you are interested in
    • On the SUSE page in the lower box select the distribution for zSeries that you are interested in and press search.