Cooperative Memory Management for Linux Guests on z/VM
by Dave Jones
April 8, 2008
** Read this article online at http://www.mainframezone.com/operating-systems/cooperative-memory-management-for-linux-guests-on-z-vm
Running Linux as a guest of z/VM presents a different set of performance problems from running Linux on a discrete server. One such problem area is in how Linux wants to manage its memory in a guest virtual machine environment. In such environments, multiple guest operating systems are hosted on top of a host operating system or hypervisor—in this case, z/VM’s Control Program (CP). The problem of overcommitting physical memory is solved either by dynamically adjusting the memory sizes of the guests or through transparent host paging. Both approaches can introduce significant overhead in heavily overcommitted memory scenarios due to frequent resize requests or high paging activity. This article discusses the design and implementation of a novel approach to this problem called Cooperative Memory Management (CMM) on Linux for System z and the z/VM hypervisor.
The problem of memory pressure, or the lack of free, allocatable memory when it’s needed, stems from the fact that guest operating systems such as Linux use all available memory given to them, usually using any “extra” virtual memory for file cache. As a result, static “partitioning” of the system would be significantly limited by real system memory. Static memory partitioning is also contrary to the nature of many systems today, which often show bursts of high utilization. z/VM virtualization technologies can effectively exploit this variability. Memory overcommitment is an attribute of the application mix that runs on a system and can’t be eliminated. Memory overcommitment occurs when a process is started and it allocates more memory than it really needs at start-up. This allows the process to begin actual work sooner, but it also causes available memory to run out sooner. The memory pressure resulting from memory overcommitment must be dealt with by either pushing it back into the guest operating system or resolving it in the hypervisor. So there’s potential for high paging rates in the hypervisor, guest, or both.
High paging rates have a non-linear impact on application and system response times and limit the number of guests you can effectively deploy. This non-linear performance impact makes memory overcommitment harder to deal with than overcommitment of other resources. Nevertheless, proper global memory management can significantly reduce the symptoms of memory overcommitment.
How Does It Work?
The two main approaches to real memory management among multiple virtual guests running on a hypervisor are:
1. Dynamic partitioning, in which individual guests are forced to dynamically change their memory size to accommodate a global memory strategy
2. Memory virtualization, in which the hypervisor pages guest memory in a way similar to how any virtual memory operating system overcommits real memory to applications.
Both approaches have strengths and weaknesses; both can support overcommitment of available real memory.
The z/VM hypervisor takes the second approach, mapping guest virtual memory into the real memory storage of the System z machine. If there aren’t enough real memory frames to contain all the active guests’ virtual memory pages, some pages are moved to expanded storage (XSTOR). Once expanded storage becomes full, the guests’ pages are moved from expanded storage to DASD paging space.
Figure 1 provides a simplified view of the z/VM memory management mechanism, showing some inactive virtual storage pages in each Linux guest. These inactive virtual memory pages must be recovered for use by other guests, whether Linux-based or not.

CMM assists in managing memory constraints in the system as they arise. Based on several performance variables obtained from the system and storage domain CP monitor data records, a resource manager, such as IBM’s Virtual Machine Resource Manager (VMRM), detects such constraints and notifies specific Linux virtual guests when this occurs. The guests can then take the appropriate action to adjust their memory utilization to relieve this system constraint. Figure 2 shows this process.

Once a system memory constraint is detected, the resource manager calculates how much memory each eligible Linux guest should release to relieve the constraint. It then sends each guest a SHRINK request as a CP SMSG command, indicating the amount of storage to release.
A Linux device driver named CMM receives these SHRINK messages. This device driver implements dynamic memory sizing in the Linux guest by a technique known as “ballooning.” The CMM driver allocates storage in Linux and then tells CP that it no longer needs to manage these pages. This has two beneficial effects:
1. It reduces the guest’s working set and effective memory footprint.
2. It forces the guest to reclaim pages in use for read and write cache.
When system memory is no longer constrained, another SHRINK command with a smaller absolute value is issued. These smaller SHRINK requests effectively instruct the Linux guests to reclaim some of the storage previously released.
By growing and shrinking these memory “balloons,” CP real storage usage can be managed to accommodate the changing memory requirements of individual Linux guests and of the z/VM system overall. The CMM device driver, which can be either built directly into the Linux kernel or loaded dynamically as a module, releases the pages to CP by issuing DIAGNOSE code X’10’ (Release Pages).
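The ballooning behavior described above can be illustrated with a small model. This is only a sketch, not the CMM driver: the class name GuestBalloon and its fields are hypothetical, and it assumes that each SHRINK request carries an absolute balloon size in 4KB pages, so that a smaller value hands pages back to the guest.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096  # bytes; the article refers to 4KB pages


@dataclass
class GuestBalloon:
    """Toy model of one Linux guest's balloon (hypothetical, not the real driver)."""

    total_pages: int        # pages defined for the guest virtual machine
    balloon_pages: int = 0  # pages currently held by the balloon

    def set_target(self, target_pages: int) -> int:
        """Apply a SHRINK-style request, assumed to carry an absolute balloon size.

        Returns the change in balloon size: positive means pages were
        released to the hypervisor, negative means pages were returned
        to the guest.
        """
        if not 0 <= target_pages <= self.total_pages:
            raise ValueError("target must be between 0 and total_pages")
        delta = target_pages - self.balloon_pages
        self.balloon_pages = target_pages
        return delta

    @property
    def usable_pages(self) -> int:
        """Pages the guest can still use for applications and file cache."""
        return self.total_pages - self.balloon_pages


guest = GuestBalloon(total_pages=262144)   # a 1GB guest
print(guest.set_target(65536))             # constraint detected: prints 65536
print(guest.usable_pages)                  # prints 196608
print(guest.set_target(16384))             # constraint relieved: prints -49152
print(guest.usable_pages)                  # prints 245760
```

In the real system, the pages held by the balloon are the ones the driver allocates inside Linux and then releases to CP, which is why inflating the balloon also pushes Linux to give up file cache pages.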
CMM vs. CMMA
To make matters more confusing, IBM has recently introduced a follow-on approach to memory management and storage overcommitment called Collaborative Memory Management Assist (CMMA). While CMM, now sometimes referred to as CMM-1, is a software-only approach to managing Linux memory usage under z/VM (requiring only that a resource manager such as VMRM and the Linux CMM driver be installed), CMMA adds new hardware functions to the IBM System z9 Enterprise Class (z9 EC) and System z9 Business Class (z9 BC) processors. This new hardware support, Host Page-Management Assist (HPMA), announced July 27, 2005, lets both Linux guests and CP modify and track the usage state of each 4KB page being used by Linux guests. This exchange of page state information lets z/VM CP and its guests optimize their use and management of memory.
With CMMA, CP can determine when a Linux application releases storage and can select those pages for removal at a higher priority, or reclaim the page frames without the overhead of paging-out their data content to expanded storage or disk. CP also now recognizes “clean” disk cache pages (pages whose contents Linux can reconstruct), allowing it to bypass paging-out that data when reclaiming the backing frames for these pages.
Prerequisites to use CMMA are:
• z/VM 5.2 with APAR VM63856 applied or z/VM
• z9 EC or z9 BC processors or newer.
CMM-1 and CMMA can be deployed simultaneously if all CMMA prerequisites are met.
Configuring VMRM for CMM
To support the CMM function, VMRM supports a “NOTIFY” statement with a MEMORY keyword in its configuration file, listing the Linux guest image names to be notified when the resource manager detects memory constraint:
NOTIFY MEMORY user1 [user2…..userx]
where:
• NOTIFY is the type of statement
• MEMORY is the system object being managed
• user1….userx is the list of blank-delimited userids to be managed; userids may contain a wildcard “*” character at the end, such as “linux*”.
Examples:
NOTIFY MEMORY LNX00080 LNX00081 LNX00082 LNX00083 LNX00084
NOTIFY MEMORY LNX0008*
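To make the trailing-wildcard rule concrete, here is a minimal sketch of how such a statement could be matched against guest userids. It is not VMRM code: the function names parse_notify_memory and is_notified are hypothetical, and treating keywords and userids as case-insensitive is an assumption made for the illustration.

```python
def parse_notify_memory(statement: str) -> list:
    """Return the userid patterns from a NOTIFY MEMORY statement.

    Assumption: keywords and userids are treated as case-insensitive.
    """
    words = statement.split()
    if len(words) < 3 or words[0].upper() != "NOTIFY" or words[1].upper() != "MEMORY":
        raise ValueError("expected: NOTIFY MEMORY user1 [user2 ...]")
    return [word.upper() for word in words[2:]]


def is_notified(userid: str, patterns: list) -> bool:
    """Check a userid against the patterns; '*' is honored only at the end."""
    userid = userid.upper()
    for pattern in patterns:
        if pattern.endswith("*"):
            if userid.startswith(pattern[:-1]):
                return True
        elif userid == pattern:
            return True
    return False


patterns = parse_notify_memory("NOTIFY MEMORY LNX0008*")
print(is_notified("LNX00083", patterns))  # prints True
print(is_notified("LNX00090", patterns))  # prints False
```

Under these assumptions, the single wildcard statement covers the same five guests as the explicit list, plus any other userid that begins with LNX0008.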
Configuring Linux for CMM
CMM support is available as a module, cmm, or as a built-in kernel component. If the CMM component has been compiled into the kernel, it can be configured by adding a parameter to the kernel parameter line:
cmm.sender=VMRMSVM
or
cmm.sender=<guest name>
where <guest name> is the name of the z/VM guest that’s allowed to send messages to the module through the SMSG interface. The default guest name is VMRMSVM, which is the default name of the VMRM Service Virtual Machine (SVM).
If CMM support is compiled as a loadable kernel module, use the insmod or modprobe command to load the module. For example, to load the CMM module and let the guest TESTID send messages, issue this Linux command:
modprobe cmm sender=TESTID
See the IBM publication “Device Drivers, Features, and Commands, November 2007” (SC33-8289-04) for more details on installing and configuring CMM support on Linux.
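Once the driver is active, it can be useful to check how large the balloon currently is. The sketch below assumes that the cmm driver reports the number of pages it holds in /proc/sys/vm/cmm_pages and that pages are 4KB; neither detail is stated in this article, so confirm both against the Device Drivers publication for your kernel level. The function name balloon_size_mb is hypothetical.

```python
from pathlib import Path


def balloon_size_mb(proc_file: str = "/proc/sys/vm/cmm_pages",
                    page_size: int = 4096) -> float:
    """Return the current balloon size in megabytes.

    Assumptions: proc_file contains a single page count written by the
    cmm driver, and each page is page_size bytes.
    """
    pages = int(Path(proc_file).read_text().split()[0])
    return pages * page_size / (1024 * 1024)
```

Calling balloon_size_mb() on a guest before and after a memory constraint gives a rough view of how much storage the guest has released. The path is a parameter so the function can be tried against an ordinary file on a system without the driver.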
Performance Implications and Conclusions
Since the support for CMM in z/VM and Linux is fairly recent, few performance studies are available. Those that have been conducted indicate that VMRM CMM-1 can significantly improve overall system performance when the z/VM system is constrained for real storage and much of that storage is held by one or more Linux guests. However, in some cases, CMM can reduce the performance of one or more of the participating Linux guests. A good z/VM performance monitor can help identify these situations. Monitor your Linux guest performance before and after enabling CMM so you can determine whether any performance-critical guests are being adversely affected. If so, remove them from CMM use.
As of this writing, there has been very little real-world experience with the new CMMA support found in the newer Linux kernels and distributions, so no useful performance data is available. However, as CMMA is deployed, either by itself or in conjunction with CMM and VMRM, its performance impact on typical production Linux workloads running on the mainframe will become better understood.