Dimm replacement policy, How dimm errors are handled by the system, Uncorrectable dimm errors – Sun Microsystems Sun Fire X4240 User Manual

Sun Fire X4140, X4240, and X4440 Servers Diagnostics Guide • August 2008

DIMM Replacement Policy

Replace a DIMM when one of the following events takes place:

The DIMM fails memory testing under BIOS due to Uncorrectable Memory Errors
(UCEs).

UCEs occur and investigation shows that the errors originated from memory.

In addition, replace a DIMM whenever more than 24 Correctable Errors (CEs)
originate from that single DIMM within 24 hours and no other DIMM is showing
further CEs.

If more than one DIMM has experienced multiple CEs, other possible causes of
CEs have to be ruled out by a qualified Sun Support specialist before replacing
any DIMMs.
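
The rules above can be read as a small decision procedure. The following Python sketch is one possible reading, not a Sun tool: the `DimmStatus` record, its field names, and the `recommend` function are hypothetical, and the per-DIMM counts are assumed to have been gathered already from the system logs. It also assumes that "multiple CEs" means more than one CE in the 24-hour window, and that any CE on another DIMM is enough to refer the case to Sun Support rather than replace the DIMM outright.

```python
from dataclasses import dataclass

CE_THRESHOLD = 24  # policy text: "more than 24" CEs in 24 hours


@dataclass
class DimmStatus:
    """Hypothetical per-DIMM summary for one 24-hour window."""
    slot: str
    failed_bios_memtest_uce: bool = False  # failed BIOS memory testing due to UCEs
    uce_traced_to_memory: bool = False     # UCEs occurred and were traced to memory
    ce_count_24h: int = 0                  # CEs attributed to this DIMM in 24 hours


def recommend(dimms):
    """Map each slot to 'replace', 'escalate' (Sun Support), or 'keep'."""
    result = {}
    for d in dimms:
        others = [o for o in dimms if o.slot != d.slot]
        if d.failed_bios_memtest_uce or d.uce_traced_to_memory:
            # Either UCE condition alone calls for replacement.
            result[d.slot] = "replace"
        elif d.ce_count_24h > CE_THRESHOLD and all(o.ce_count_24h == 0 for o in others):
            # Over the CE threshold and no other DIMM shows CEs.
            result[d.slot] = "replace"
        elif d.ce_count_24h > 1 and any(o.ce_count_24h > 1 for o in others):
            # More than one DIMM has multiple CEs: rule out other causes first.
            result[d.slot] = "escalate"
        elif d.ce_count_24h > CE_THRESHOLD:
            # Over the threshold, but another DIMM also shows a CE (assumed: escalate).
            result[d.slot] = "escalate"
        else:
            result[d.slot] = "keep"
    return result


if __name__ == "__main__":
    # Slot names here are made up for illustration.
    window = [DimmStatus("DIMM0", ce_count_24h=25), DimmStatus("DIMM1")]
    print(recommend(window))
```

Keeping the threshold in a single constant makes the "more than 24" boundary explicit: a DIMM with exactly 24 CEs in the window is not flagged by this sketch.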

Before calling Sun, retain copies of the logs that show the memory errors
described by the rules above, so that you can send them to Sun for verification.

How DIMM Errors Are Handled by the System

This section describes system behavior for the two types of DIMM errors, UCEs and
CEs. It also describes BIOS DIMM error messages.

Uncorrectable DIMM Errors

The behavior for UCEs is the same on all operating systems:

1. When a UCE occurs, the memory controller causes an immediate reboot of the
system.

2. During reboot, the BIOS checks the Machine Check registers and determines that
the previous reboot was due to a UCE, then reports this in POST after the
memtest stage:

A Hypertransport Sync Flood occurred on last boot
