4 large page sizing – IBM pSeries User Manual

Page 13


pshpstuningguidewp040105.doc


statistics in 5-second intervals, with the first set of statistics being the statistics since the node or
LPAR was last booted.

vmstat 5

The pi and po columns of the page group are the numbers of 4KB pages read from and written to
the paging device between consecutive samplings. If po is high, it could indicate that thrashing
is taking place. In that case, it is a good idea to run the svmon command to see the
system-wide virtual segment allocation.
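
As a rough illustration of watching po, the following sketch filters vmstat output and reports samples whose page-out count exceeds a threshold. It assumes that po is the seventh whitespace-separated field (the AIX layout "r b avm fre re pi po ..."). The function name flag_high_po and the threshold of 100 are hypothetical, and the column position should be checked against the vmstat header on your system.

```shell
#!/bin/sh
# Report vmstat samples whose page-out count (po) exceeds a threshold.
# Assumption: po is field 7, as in the AIX vmstat column layout
# "r b avm fre re pi po fr sr cy ...". Check your vmstat header first.
flag_high_po() {
    threshold=$1
    awk -v t="$threshold" '
        # Header lines do not start with a number, so they are skipped.
        $1 ~ /^[0-9]+$/ {
            n++
            if ($7 + 0 > t + 0) printf "sample %d: po=%d\n", n, $7
        }'
}

# Usage (hypothetical threshold of 100 pages per interval):
#   vmstat 5 | flag_high_po 100
```

Because the first sample reports statistics since boot, a flag on sample 1 says little about current behavior.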

3.4 Large page sizing

Some HPC applications that use Technical Large Pages (TLPs) can benefit from a 5 - 20%
increase in performance. There are two reasons why TLPs boost performance:

Because the hardware prefetch streams cross fewer page boundaries, they are more
efficient.

Because missing the translation lookaside buffer is less likely, there is a better chance of
using a fast path for address translation.


TLPs must be configured by the root user and require a system reboot as described below. The
operating system limits the maximum number of TLPs to about 80% of the total physical storage
on the system. The application can choose to use small pages only, large pages only, or both.
Using both small and large pages is known as advisory mode, which is recommended for high
performance computing applications.
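
The 80% limit can be turned into a quick sizing estimate. The sketch below assumes 16MB large pages, which this section does not state, and treats "about 80%" as exactly 80%. The function name max_large_pages is hypothetical, and the result is only an upper bound on how many large pages could be configured.

```shell
#!/bin/sh
# Estimate the maximum number of large pages for a node.
# Assumptions: 80% cap taken literally; 16 MB page size unless overridden.
max_large_pages() {
    phys_mb=$1              # total physical memory in MB
    page_mb=${2:-16}        # large page size in MB (assumed default)
    echo $(( phys_mb * 80 / 100 / page_mb ))
}

# Example: a node with 32 GB (32768 MB) of physical memory.
max_large_pages 32768       # prints 1638
```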

You can enable the application for TLPs by using the loader flag, by means of the ldedit
command, or by using the environment variable at run time. The ldedit command enables the
application for TLPs in the advisory mode:

ldedit -b lpdata <executable path name>

You can use -b nolpdata to turn TLPs off. The -b lpdata loader flag on the ld command does
the same thing.

Setting the LDR_CNTRL environment variable enables TLPs in the advisory mode for all
processes spawned from a shell process and their children. Here is an example:

export LDR_CNTRL=LARGE_PAGE_DATA=Y
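
If exporting the variable for the whole shell is too broad, one possible alternative is to set it for a single command. The sketch below uses the standard env utility. The wrapper name run_with_tlp and the program ./my_app are hypothetical. The setting still applies to any children of the launched command.

```shell
#!/bin/sh
# Run one command with TLPs enabled in advisory mode, without exporting
# LDR_CNTRL into the calling shell. run_with_tlp is a hypothetical helper.
run_with_tlp() {
    env LDR_CNTRL=LARGE_PAGE_DATA=Y "$@"
}

# Usage (./my_app is a placeholder for a serial executable):
#   run_with_tlp ./my_app
```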

Setting the environment variable has a side effect for MPI jobs spawned by the MPI daemons
from the shell process, because it also enables the daemons for TLPs. This takes away about
512MB of physical memory from an application. TLPs by their nature are pinned in memory
(they cannot be paged out). In addition, TLPs are mapped into the process address space with
segment granularity (256MB) even if the process uses only a few bytes in that segment. As a
result, each of the two MPI daemons gets 256MB of pinned memory. For that reason, you should
avoid using the LDR_CNTRL environment variable with MPI jobs.
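
The 512MB figure follows from the segment granularity: two daemons, each pinning one 256MB segment. A minimal sketch of that arithmetic follows. The helper name pinned_tlp_mb is hypothetical, and the default of one segment per process is an assumption.

```shell
#!/bin/sh
# Pinned memory consumed when TLP-enabled processes each map whole
# 256 MB segments, regardless of how little of a segment they use.
pinned_tlp_mb() {
    nprocs=$1               # number of TLP-enabled processes
    segments=${2:-1}        # segments per process (assumed default: 1)
    echo $(( nprocs * segments * 256 ))
}

# Two MPI daemons, one segment each:
pinned_tlp_mb 2             # prints 512
```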

Using TLPs boosts the performance of the MPI protocol stack. Some of the TLPs are reserved by
the HPS adapter code at boot time and are not available to an application as long as the HPS
