November 28, 2011

Why doesn't the Linux iowait show up as busy time in the virtualization layer?

When a process in Linux has issued an IO and is waiting for a response, there are basically two possibilities. The good case is that there is other work pending and the CPU can continue to work. If there is no work left to do, the CPU changes its state to idle, and the time until either new work comes in or the IO is delivered is accounted as %iowait in tools like vmstat, iostat or sar. So you can view "iowait" as a shade of "idle".
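
To make the "shade of idle" idea concrete, here is a minimal Python sketch that computes the same kind of percentages from two samples of the aggregate "cpu" line in /proc/stat. The function names (parse_cpu_line, cpu_shares, sample_shares) are made up for this illustration, and the field order is assumed to follow proc(5); older kernels report fewer fields, which the sketch tolerates.

```python
import time

# Field order of the aggregate "cpu" line in /proc/stat, assumed per proc(5).
FIELDS = ("user", "nice", "system", "idle", "iowait",
          "irq", "softirq", "steal", "guest", "guest_nice")


def parse_cpu_line(line):
    """Turn a 'cpu ...' line from /proc/stat into a dict of tick counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    # zip() stops at the shorter sequence, so older kernels with fewer
    # fields simply yield a smaller dict.
    return dict(zip(FIELDS, (int(v) for v in parts[1:])))


def cpu_shares(before, after):
    """Percentage of elapsed ticks spent in each state between two samples."""
    first, second = parse_cpu_line(before), parse_cpu_line(after)
    delta = {k: second[k] - first[k] for k in second if k in first}
    # guest and guest_nice are already counted in user and nice,
    # so they are left out of the total to avoid double counting.
    total = sum(v for k, v in delta.items()
                if k not in ("guest", "guest_nice"))
    if total <= 0:
        raise ValueError("no ticks elapsed between the two samples")
    return {k: 100.0 * v / total for k, v in delta.items()}


def sample_shares(interval=1.0, path="/proc/stat"):
    """Read the first line of /proc/stat twice, interval seconds apart."""
    with open(path) as f:
        before = f.readline()
    time.sleep(interval)
    with open(path) as f:
        after = f.readline()
    return cpu_shares(before, after)
```

On a Linux guest, calling sample_shares() returns a dict in which "idle" and "iowait" together give the share of time the CPU was not running anything. That sum is roughly what the virtualization layer sees as idle.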

From the perspective of the hypervisor, such an idle CPU can be put to good use, e.g. by dispatching it to another guest that is in need of CPU. This is why, from a virtualization layer perspective, guest CPUs in iowait are usually accounted as idle.
For problem determination it would help to add an iowait measure to the hypervisor as well, since it would help detect problems that are created by cloning inefficient guests, e.g. 100 servers all doing sync IO where async IO would have been possible.



November 11, 2011

VDSO - what is this?

(updated 3/12/2013)

Virtual Dynamically-linked Shared Object (VDSO) is a shared library provided by the kernel. It allows normal programs to make certain system calls without the usual system call overhead, such as switching address spaces.

For Linux on System z there are three functions at the moment that are accelerated in this way: gettimeofday, clock_getres, and clock_gettime. The most important one is probably gettimeofday. On a z196 system, using VDSO allows more than six times as many calls of this function as without it.

The newer distributions (RHEL 5.9+, RHEL 6, SLES 11) have this feature and it's enabled by default. In the rare case of an application that can't deal with such fast responses, it can be turned off with the kernel parameter "vdso=0". If you are experiencing performance problems on older systems due to slow time operations, it's definitely worth checking whether this feature resolves them.

To check whether your system already has it enabled, run
grep vdso /proc/self/maps
If this command finds an entry, it's enabled.
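
The same check can be done from inside a program. The following is a minimal Python sketch, not an official interface; the function names has_vdso and current_process_has_vdso are made up for this example. It is slightly stricter than the grep above, because it looks for a mapping whose pathname column is exactly "[vdso]" instead of any line containing the string.

```python
def has_vdso(maps_text):
    """Return True if a /proc/<pid>/maps listing contains a [vdso] mapping."""
    for line in maps_text.splitlines():
        fields = line.split()
        # Columns: address range, permissions, offset, device, inode, pathname.
        # Anonymous mappings have no pathname column at all.
        if len(fields) >= 6 and fields[5] == "[vdso]":
            return True
    return False


def current_process_has_vdso(path="/proc/self/maps"):
    """Check the memory map of the calling process (Linux only)."""
    with open(path) as f:
        return has_vdso(f.read())
```

Calling current_process_has_vdso() on a system booted with "vdso=0" should return False, assuming the kernel then maps no [vdso] segment into the process.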

November 8, 2011

OCFS2 support for Linux on SLES

Recently there has been quite some discussion around the support for OCFS2. SUSE has published their support statements in two blog entries:
So the good news is that there will be continued support for a shared file system integrated in the SUSE distribution. SUSE support will require the installation of the High Availability Extensions (HAE) that are included for Linux on System z and POWER.

So it's best to view OCFS2 as a cluster file system that has been created by Oracle. Don't confuse it with the Oracle DB itself. If Oracle RAC is a requirement, you are not dependent on OCFS2 - Oracle ASM is in my opinion better integrated with the Oracle DB and should be used.