IBM 990 User Manual

IBM zSeries 990 Technical Guide

Dynamic Memory sparing

The z990 does not contain spare memory DIMMs. Instead, it has redundant memory
distributed throughout its operational memory, which is used to bypass failing memory.
Replacing memory cards requires the removal of a book, which is disruptive. The
extensive use of redundant elements in the operational memory greatly reduces the
likelihood of a failure that requires memory card replacement.

Partial Memory Restart

In the rare event of a memory card failure, Partial Memory Restart enables the system to
be restarted with only part of the original memory. In a one-book system, the failing card
will be deactivated, after which the system can be restarted with the memory on the
remaining memory card.

In a system with more than one book, all physical memory in the book containing the
failing memory card is taken offline, allowing you to bring up the system with the remaining
physical memory in the other books. In this way, processing can be resumed until a
replacement memory card is installed.
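
The two cases above can be illustrated with a minimal sketch. It is not IBM code; the function name, the list-of-lists representation of books and cards, and the index-based addressing of the failing card are assumptions made for illustration only.

```python
def memory_after_partial_restart(books, failing_book, failing_card):
    """Return the physical memory (GB) still usable after a card failure.

    books: list of books, each a list of card capacities in GB
           (hypothetical representation, not from the guide).
    failing_book, failing_card: zero-based indexes of the failed card.
    """
    if len(books) == 1:
        # One-book system: only the failing card is deactivated.
        return sum(cap for i, cap in enumerate(books[0]) if i != failing_card)
    # Multi-book system: all memory in the affected book is taken offline.
    return sum(sum(cards) for i, cards in enumerate(books) if i != failing_book)


# One book with two 16 GB cards: 16 GB remains.
print(memory_after_partial_restart([[16, 16]], 0, 1))
# Two books (2 x 8 GB and 2 x 16 GB), failure in the first book: 32 GB remains.
print(memory_after_partial_restart([[8, 8], [16, 16]], 0, 0))
```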

Memory error checking and correction detects and corrects single-bit errors, as well as
2-bit errors resulting from a chipkill failure, using Error Correction Code (ECC). In addition,
because of the memory structure design, errors due to a single memory chip failure are corrected.
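
The principle of single-bit correction can be shown with a textbook Hamming(7,4) code. This is only an illustration of how an ECC locates and flips a bad bit; it is not the code used in z990 memory, which is considerably stronger and is not described in this section. All names below are hypothetical.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits into a 7-bit Hamming codeword (positions 1..7)."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_correct(word):
    """Return (corrected_word, syndrome); syndrome 0 means no error detected."""
    word = list(word)
    syndrome = 0
    for position, bit in enumerate(word, start=1):
        if bit:
            syndrome ^= position  # XOR of the positions of all set bits
    if syndrome:
        # A non-zero syndrome is the 1-based position of the flipped bit.
        word[syndrome - 1] ^= 1
    return word, syndrome


codeword = hamming74_encode([1, 0, 1, 1])
damaged = list(codeword)
damaged[4] ^= 1  # simulate a single-bit memory error at position 5
repaired, syndrome = hamming74_correct(damaged)
print(repaired == codeword, syndrome)
```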

Memory background scrubbing provides continuous monitoring of storage for the correction
of detected faults before the storage is used.

The memory cards use the latest fast 256 Mb and 512 Mb synchronous DRAMs. Memory
access is interleaved between the memory cards to equalize memory activity across the
cards.
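
As a rough sketch of what interleaving means, consecutive blocks of addresses can be mapped to alternating cards so that sequential access spreads evenly. The granule size, the function name, and the simple modulo mapping are assumptions for illustration; the guide does not state the actual z990 interleave scheme.

```python
INTERLEAVE_BYTES = 256  # hypothetical granule size, not taken from the guide


def card_for_address(address, num_cards=2, granule=INTERLEAVE_BYTES):
    """Map a physical address to a memory card index by simple interleaving."""
    return (address // granule) % num_cards


# Sequential granules alternate between the two cards.
cards = [card_for_address(a) for a in range(0, 4096, INTERLEAVE_BYTES)]
print(cards)
```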

Memory cards have 8 GB, 16 GB, or 32 GB of capacity. All memory cards installed in one
book must have the same capacity. Books may have different memory sizes, but the card size
of the two cards per book must always be the same.
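
The card rules above can be expressed as a small configuration check. This is a sketch only; the function name and the data layout (one list of card capacities per book) are assumptions, not part of any IBM tool.

```python
VALID_CARD_SIZES_GB = {8, 16, 32}  # card capacities listed in the text


def validate_books(books):
    """Return a list of rule violations; an empty list means the mix is valid.

    books: list of books, each a list of card capacities in GB
           (hypothetical representation).
    """
    problems = []
    for n, cards in enumerate(books):
        for cap in cards:
            if cap not in VALID_CARD_SIZES_GB:
                problems.append(f"book {n}: unsupported card size {cap} GB")
        if len(set(cards)) > 1:
            # Cards within one book must share the same capacity.
            problems.append(f"book {n}: cards differ in capacity {cards}")
    return problems


# Books may differ from each other, as long as each book is uniform.
print(validate_books([[8, 8], [32, 32]]))
print(validate_books([[8, 16]]))
```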

The total capacity installed may have more usable memory than required for a configuration,
and Licensed Internal Code Configuration Control (LIC-CC) will determine how much
memory is used from each card. The sum of the LIC-CC provided memory from each card is
the amount available for use in the system.
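
The relationship between installed and usable memory can be sketched as a simple sum. The pair representation (physical capacity, LIC-CC enabled amount) and the example figures are assumptions for illustration; actual LIC-CC settings are determined by IBM for each configuration.

```python
def usable_memory_gb(cards):
    """Sum the LIC-CC enabled memory over all cards.

    cards: list of (physical_gb, licc_enabled_gb) pairs
           (hypothetical representation).
    """
    total = 0
    for physical, enabled in cards:
        if enabled > physical:
            raise ValueError("enabled memory cannot exceed physical capacity")
        total += enabled
    return total


# Two 16 GB cards with 12 GB enabled on each: 32 GB installed, 24 GB usable.
print(usable_memory_gb([(16, 12), (16, 12)]))
```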

Memory allocation

Memory assignment or allocation is done at Power-on Reset (POR) when the system is
initialized. PR/SM is responsible for the memory assignments, because it is PR/SM that
controls the resource allocation of the CPC. Table 2-1 on page 28 shows the distribution of
physical memory across books when a system is initially installed with the amounts of
memory shown in the first column. However, the table gives no indication of where the initial
memory is allocated. Memory allocation is done as evenly as possible across all installed
books.
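
One way to picture "as evenly as possible" is the following sketch, which spreads a purchased amount over the books while respecting each book's physical capacity. It is not the PR/SM algorithm, which is not described here; the function name, the whole-GB granularity, and the redistribution rule are assumptions.

```python
def allocate_evenly(purchased_gb, physical_per_book_gb):
    """Spread purchased memory over books as evenly as capacity allows.

    purchased_gb: whole number of GB to allocate.
    physical_per_book_gb: physical memory installed in each book, in GB.
    """
    if purchased_gb > sum(physical_per_book_gb):
        raise ValueError("purchased memory exceeds installed physical memory")
    allocation = [0] * len(physical_per_book_gb)
    remaining = purchased_gb
    while remaining > 0:
        # Books that still have unallocated physical memory.
        open_books = [i for i, cap in enumerate(physical_per_book_gb)
                      if allocation[i] < cap]
        share = max(remaining // len(open_books), 1)
        for i in open_books:
            give = min(share, physical_per_book_gb[i] - allocation[i], remaining)
            allocation[i] += give
            remaining -= give
            if remaining == 0:
                break
    return allocation


# 48 GB purchased on two books with 32 GB each: 24 GB from each book.
print(allocate_evenly(48, [32, 32]))
# A smaller book fills up first; the rest comes from the larger book.
print(allocate_evenly(40, [16, 64]))
```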

PR/SM has knowledge of the amount of purchased memory and how it relates to the
available physical memory in each of the installed books. PR/SM has control over all physical
memory and therefore is able to make physical memory available to the configuration when a
book is non-disruptively added. PR/SM also controls the reassignment of the content of a
specific physical memory array in one book to a memory array in another book. This is known
as the Memory Copy/Reassign function.

Due to the memory allocation algorithm, systems that undergo a number of MES upgrades for
memory can have a variety of memory card mixes in all books of the system. If, however
