DIMM Replacement Policy, How DIMM Errors Are Handled by the System, Uncorrectable DIMM Errors – Sun Microsystems Sun Fire X4240 User Manual
Sun Fire X4140, X4240, and X4440 Servers Diagnostics Guide • August 2008
DIMM Replacement Policy
Replace a DIMM when one of the following events takes place:
■ The DIMM fails memory testing under BIOS due to Uncorrectable Memory Errors
(UCEs).
■ UCEs occur and investigation shows that the errors originated from memory.
In addition, a DIMM should be replaced whenever more than 24 Correctable
Errors (CEs) originate in 24 hours from a single DIMM and no other DIMM is
showing further CEs.
■ If more than one DIMM has experienced multiple CEs, other possible causes of
CEs have to be ruled out by a qualified Sun Support specialist before replacing
any DIMMs.
Before calling Sun, retain copies of the logs that show the memory errors
described by the above rules so that you can send them to Sun for verification.
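The replacement rules above can be sketched as a small decision function. This is only an illustration, not a Sun tool: the `DimmHistory` record, the `recommend` function, and the action names `replace`, `escalate`, and `keep` are hypothetical. The sketch also makes two assumptions that the policy does not state explicitly: "multiple CEs" is read as two or more CEs in the 24-hour window, and "no other DIMM is showing further CEs" is read as zero CEs on every other DIMM in that window.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

CE_THRESHOLD = 24                 # replace only when MORE than 24 CEs are seen
CE_WINDOW = timedelta(hours=24)   # ...within a 24-hour window


@dataclass
class DimmHistory:
    """Hypothetical per-DIMM error record gathered from the logs."""
    name: str
    failed_bios_memtest: bool = False    # failed BIOS memory test due to UCEs
    uce_traced_to_memory: bool = False   # UCE investigated and traced to memory
    ce_times: list = field(default_factory=list)  # datetimes of logged CEs


def ces_in_window(dimm, now):
    """Count the CEs logged for this DIMM in the 24 hours ending at `now`."""
    start = now - CE_WINDOW
    return sum(1 for t in dimm.ce_times if start <= t <= now)


def recommend(dimms, now):
    """Return {DIMM name: 'replace' | 'escalate' | 'keep'} (names assumed unique)."""
    counts = {d.name: ces_in_window(d, now) for d in dimms}
    # Assumption: "multiple CEs" means two or more CEs in the window.
    several_with_multiple = sum(1 for c in counts.values() if c > 1) > 1
    actions = {}
    for d in dimms:
        count = counts[d.name]
        # Assumption: "no other DIMM is showing further CEs" means zero CEs elsewhere.
        others_clean = all(c == 0 for n, c in counts.items() if n != d.name)
        if d.failed_bios_memtest or d.uce_traced_to_memory:
            actions[d.name] = "replace"
        elif count > 1 and several_with_multiple:
            # More than one DIMM has multiple CEs: Sun Support rules out other causes first.
            actions[d.name] = "escalate"
        elif count > CE_THRESHOLD:
            actions[d.name] = "replace" if others_clean else "escalate"
        else:
            actions[d.name] = "keep"
    return actions
```

For example, a DIMM with 25 CEs in the last 24 hours is marked `replace` only while every other DIMM is free of CEs. If a second DIMM also shows several CEs, both are marked `escalate`, reflecting the rule that a Sun Support specialist must rule out other causes before any DIMM is replaced.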
How DIMM Errors Are Handled by the System
This section describes system behavior for the two types of DIMM errors, UCEs
and CEs, and also describes BIOS DIMM error messages.
Uncorrectable DIMM Errors
For all operating systems (OSs), the behavior is the same for UCEs:
1. When a UCE occurs, the memory controller causes an immediate reboot of the
system.
2. During reboot, the BIOS checks the Machine Check registers and determines that
the previous reboot was due to a UCE, then reports this in POST after the
memtest stage:
A Hypertransport Sync Flood occurred on last boot