Failure indicators – HP Insight Management Agents User Manual



— On a storage system, check the SCSI ID cable on the drive tray. If the cable is damaged
or incorrectly installed, SCSI Timeouts can occur. See the documentation accompanying
the Hot Plug Drive Tray Service Spare Kit.

— Ensure that the system temperature is within specified limits. Ensure that the fans are
operating and are not blocked.

In some instances, drive failure can cause Timeouts. If you continue to receive many of
these errors, replace the drive.

SCSI Bus Faults—Displays the number of times that SCSI bus parity, overrun, or underrun
errors have been detected on the SCSI bus. Since the controller retries the operation, SCSI
bus faults can cause a drop in performance, or, in some cases, data corruption.

If the count is not zero and the drive has failed, the failure might be correctable without
replacing the drive. Verify the status of the drive by checking the following:
— Ensure that all system and storage system cables are intact and seated properly. You
may need to replace the cables.

— Check the physical proximity of the system to other electrical devices. Since electrical
noise may cause a Bus Fault error, check the AC circuit for other electrical devices.

— Ensure that the system temperature is within specified limits. Ensure that fans are
operating and are not blocked.

SCSI Bus Faults can be caused when two or more drives are set to the same SCSI ID.
Ensure that storage system and system SCSI IDs do not conflict.
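
The ID conflict check described above can be sketched in a few lines of Python. This is only an illustration, not part of the Insight Management Agents: the function name, the device names, and the (name, SCSI ID) list are hypothetical and would have to be filled in from your own inventory of the bus.

```python
from collections import defaultdict


def find_scsi_id_conflicts(devices):
    """Return {scsi_id: [device names]} for every ID used by more than one device.

    `devices` is a hypothetical list of (name, scsi_id) pairs for a single bus.
    """
    by_id = defaultdict(list)
    for name, scsi_id in devices:
        by_id[scsi_id].append(name)
    # Keep only the IDs that are shared by two or more devices.
    return {scsi_id: names for scsi_id, names in by_id.items() if len(names) > 1}


# Illustrative inventory: the names and IDs below are made up.
bus = [("controller", 7), ("drive-a", 0), ("drive-b", 1), ("drive-c", 1)]
print(find_scsi_id_conflicts(bus))  # {1: ['drive-b', 'drive-c']}
```

An empty result means no two devices in the list share an ID. Any ID that does appear in the result should be reassigned on one of the listed devices.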

In some instances, drive failure can cause SCSI Bus Faults. If you continue to receive
many of these errors, replace the drive.

IRQ Deglitch—Displays the number of times that a glitch has been detected on the drive
interface cable. Since the controller retries the operation, glitches can cause a drop in
performance or, in some cases, data corruption. Glitches indicate electrical noise on the drive
cable or an intermittent failure of the drive electronics.

This item is considered a Problem Indicator that may be correctable without replacing the
drive. Verify the status of the drive by checking the following:
— Ensure that all system and storage system cables are intact and seated properly. You
may need to replace cables.

— Check the physical proximity of the system to other electrical devices. Since electrical
noise may cause a glitch error, check the AC circuit for other electrical devices.

— If you continue to receive many of these errors, replace the drive.

Failure Indicators

Use the Failure Indicators to determine the cause of a drive failure. Typically, the number of
failures is zero when the drive is operating normally. If a counter is not zero and the drive has
not failed, there could be an intermittent problem that may require the drive to be replaced.

The Failure Indicators are:

Spinup Errors—When the physical drive fails due to the failure of a spin-up command, a
Spinup Error occurs. If the failure count is not zero and the drive has failed, replace the
drive.

If the counter is not zero and the drive is OK (has not failed), there may be an intermittent
problem that requires drive replacement. If you observe that the count is increasing over
time, replace the drive.
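
The Spinup Errors guidance above amounts to a small decision rule. The following Python sketch is one possible way to express it, assuming you record the counter yourself at intervals. The function name, the return strings, and the sample history are hypothetical and are not provided by the agents.

```python
def spinup_action(samples, drive_failed):
    """Suggest an action from a history of Spinup Error counts.

    `samples` is a hypothetical list of counter readings, oldest first.
    `drive_failed` is True when the drive is reported as failed.
    """
    if not samples:
        raise ValueError("at least one counter reading is required")
    latest = samples[-1]
    if latest == 0:
        # The counter gives no evidence of a spin-up problem.
        return "no_spinup_errors"
    if drive_failed:
        # Non-zero count and a failed drive: replace the drive.
        return "replace"
    # The drive is OK but the count is non-zero: check whether it is growing.
    increasing = any(later > earlier for earlier, later in zip(samples, samples[1:]))
    return "replace" if increasing else "monitor"


print(spinup_action([1, 1, 1], drive_failed=False))  # monitor
print(spinup_action([1, 2, 3], drive_failed=False))  # replace
```

A "monitor" result corresponds to the possible intermittent problem described above: the drive has not failed and the count is stable, so keep watching the counter.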

Aborted Commands—The Aborted Commands counter records the number of times that a
physical SCSI drive returned an Aborted Command status when a SCSI command was
attempted. This error count indicates unsuccessful termination of the SCSI command. When
the physical drive is failed due to aborted commands that could not be retried successfully,
