Critical conditions, Noncritical conditions, Invalid storage array – Dell PowerVault MD3820f User Manual



Critical Conditions

The storage array generates a critical event if the RAID controller module detects a critical condition that
could cause immediate failure of the array and/or loss of data. The storage array is in a critical condition if
one of the following occurs:

• More than one fan has failed
• Any midplane temperature sensor is in the critical range
• Midplane or power supply module failure
• Two or more temperature sensors are unreadable
• Failure to detect, or inability to communicate with, the peer port

NOTE: If both RAID controller modules fail simultaneously, the enclosure cannot issue critical or
noncritical event alarms for any enclosure component.

Noncritical Conditions

A noncritical condition is an event or status that does not cause immediate failure, but must be corrected
to ensure continued reliability of the storage array. Examples of noncritical events include the following:

• One power supply module has failed
• One cooling fan module has failed
• One RAID controller module in a redundant configuration has failed
• A battery has failed or has been removed
• A physical disk in a redundant virtual disk has failed
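
The two lists above can be read as a simple severity classification. The following is a minimal sketch of that reading, not the controller firmware's actual logic. All names (Severity, EnclosureStatus, classify) are hypothetical. It assumes that "fan" and "cooling fan module" refer to the same count, and that the critical "Midplane/power supply module failure" case means a midplane failure or more than one failed power supply module, since a single failed power supply module is listed as noncritical.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    OK = "ok"
    NONCRITICAL = "noncritical"
    CRITICAL = "critical"


@dataclass
class EnclosureStatus:
    # Hypothetical fields that mirror the bullet lists in the manual text.
    failed_fans: int = 0
    midplane_temp_critical: bool = False
    midplane_failed: bool = False
    failed_power_supplies: int = 0
    unreadable_temp_sensors: int = 0
    peer_port_unreachable: bool = False
    one_controller_failed: bool = False
    battery_failed_or_removed: bool = False
    disk_failed_in_redundant_vdisk: bool = False


def classify(s: EnclosureStatus) -> Severity:
    """Return the worst severity implied by the reported conditions."""
    critical = (
        s.failed_fans > 1
        or s.midplane_temp_critical
        or s.midplane_failed
        or s.failed_power_supplies > 1  # assumption, see lead-in
        or s.unreadable_temp_sensors >= 2
        or s.peer_port_unreachable
    )
    if critical:
        return Severity.CRITICAL

    noncritical = (
        s.failed_fans == 1
        or s.failed_power_supplies == 1
        or s.one_controller_failed
        or s.battery_failed_or_removed
        or s.disk_failed_in_redundant_vdisk
    )
    return Severity.NONCRITICAL if noncritical else Severity.OK
```

Critical conditions are checked first so that a mix of critical and noncritical conditions is reported at the higher severity.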

Invalid Storage Array

The RAID controller module is supported only in a Dell-supported storage array. After installation in the
storage array, the controller performs a set of validation checks. The array status LED is lit steady
amber while the RAID controller module completes these initial tests and the controllers boot
successfully. If the RAID controller module detects a non-Dell supported storage array, the
controller does not start up. The RAID controller module does not generate any events to alert you to
an invalid array, but the array status LED flashes amber to indicate a fault state.

ECC Errors

RAID controller firmware can detect ECC errors and can recover from a single-bit ECC error whether the
RAID controller module is in a redundant or non-redundant configuration. A storage array with redundant
controllers can recover from multi-bit ECC errors as well because the peer RAID controller module can
take over, if necessary.
The RAID controller module fails over if it experiences up to 10 single-bit errors, or up to three multi-bit
errors.
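
The recovery and failover rules above can be sketched as two small functions. This is an illustrative reading only, not the firmware's implementation. The function names are hypothetical, and the sketch assumes that "up to 10" and "up to three" mean failover is triggered once the error count reaches that number.

```python
SINGLE_BIT_LIMIT = 10  # from the text; assumed to be the count that triggers failover
MULTI_BIT_LIMIT = 3    # from the text; same assumption


def can_recover(multi_bit: bool, redundant: bool) -> bool:
    """Single-bit errors are recoverable in either configuration;
    multi-bit errors need a peer controller that can take over."""
    return (not multi_bit) or redundant


def should_fail_over(single_bit_errors: int, multi_bit_errors: int) -> bool:
    """True once either error count has reached its assumed limit."""
    return (
        single_bit_errors >= SINGLE_BIT_LIMIT
        or multi_bit_errors >= MULTI_BIT_LIMIT
    )
```

For example, can_recover(multi_bit=True, redundant=False) is False, which matches the statement that multi-bit recovery depends on a peer RAID controller module.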

PCI Errors

The storage array firmware can detect PCI errors, but it can recover from them only when the RAID
controller modules are configured for redundancy. If a virtual disk uses cache mirroring, it fails over to
its peer RAID controller module, which initiates a flush of the dirty cache.
