HP StorageWorks Scalable File Share User Manual: Section 9.34, Handling Disk Errors on SFS20 Storage



When the resynchronization is complete, the status information will change, as shown in the following example:

# mdadm --detail /dev/md0
/dev/md0:
.
.
.
State : clean
.
.
.
Number Major Minor RaidDevice State
0 105 96 0 active sync /dev/cciss/c1d6
1 105 32 1 active sync /dev/cciss/c1d2

You can check the progress of the resynchronization process by examining the event log as follows:

sfs> show log facility=storage && age < "5m"
.
.
.
2004/11/02 10:28:56 storage n south2: mds8: /proc/mdstat:
md0 : active raid1 cciss/c1d2[2] cciss/c1d6[0]
10485504 blocks [2/1] [U_]
[=>...................] recovery = 6.8% (721344/10485504)
finish=2.9min speed=55488K/sec
----
.
.
.

When the resynchronization is complete, the /proc/mdstat output recorded in the event log indicates this, as shown in the following example:

sfs> show log facility=storage && age < "5m"
.
.
.
2004/11/02 10:56:41 storage n south2: mds8: /proc/mdstat:
md0 : active raid1 cciss/c1d2[1] cciss/c1d6[0]
10485504 blocks [2/2] [UU]
----
.
.
.

9.34 Handling Disk Errors on SFS20 storage

The sfsmgr show array array_number command displays one of the following states for each bay (disk) on an SFS20 array:

ok

removed/failed

predict fail

logging errors

See Section 4.5 for more information on these states.

The system log records disk issues, as shown in the following example:

sfs> show log data contains "disk bay" && facility=storage && severity>notice
2006/01/05 13:40:44 storage !! south_test5: P92CB0AMQRA684: array 4: disk bay 1: disk Y69BMY3E has been removed or failed (was online)
2006/01/06 09:32:04 storage !! south_test5: P92CB0AMQRA684: array 4: disk bay 1: disk Y69BMY3E is logging errors (was removed or failed)
2006/01/10 10:43:25 storage !! south_test2: P92CB0AMQR2618: array 1: disk bay 12: disk Y69CHCDE has been removed or failed (was online)
2006/01/26 07:11:35 storage !! south_test5: P92CB0AMQRA683: array 3: disk bay 7: disk Y69BLLYE is logging errors (was online)
sfs>
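If the log output is saved to a file, the disk events can be pulled into structured records for reporting. The following is a minimal sketch that assumes the message layout seen in the example above. The names DiskEvent and parse_disk_events are hypothetical, and the field called array_id is simply the identifier that precedes "array N" in each message; the manual excerpt does not say what that identifier represents.

```python
import re
from collections import namedtuple

# Hypothetical record type for illustration; field names are assumptions.
DiskEvent = namedtuple(
    "DiskEvent",
    "timestamp server array_id array bay disk state previous")

# Layout assumed from the sample log lines. \s+ after "disk bay" allows
# the message to be wrapped onto a second line at that point.
_EVENT = re.compile(
    r"(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) storage \S+ "
    r"(?P<server>[^:]+): (?P<array_id>[^:]+): array (?P<array>\d+): "
    r"disk bay\s+(?P<bay>\d+): disk (?P<disk>\S+) "
    r"(?:has been|is) (?P<state>.+?) \(was (?P<prev>[^)]+)\)")

def parse_disk_events(log_text):
    """Return a DiskEvent for every disk bay message found in log_text."""
    return [
        DiskEvent(m.group("ts"), m.group("server"), m.group("array_id"),
                  int(m.group("array")), int(m.group("bay")),
                  m.group("disk"), m.group("state"), m.group("prev"))
        for m in _EVENT.finditer(log_text)
    ]
```

For the four messages in the example, the sketch yields four records. The first has array 4, bay 1, disk Y69BMY3E, state "removed or failed", and previous state "online".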

In addition, if email alerts are configured on the system, disk errors trigger the default disk_errors alert to send email to the configured recipients. The filter for the default disk_errors alert is as follows:

facility=storage && severity>notice && data contains "disk bay"
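The filter combines three conditions, all of which must hold for an event to trigger the alert. The following sketch restates that logic as a predicate; it is an illustration only and not how the SFS software evaluates filters. It assumes that severities are ordered as in syslog (debug, info, notice, warning, err, crit, alert, emerg), which this section does not confirm, and the dictionary keys and the function name matches_disk_errors are hypothetical.

```python
# Assumed syslog-style ordering, lowest to highest; not confirmed by the
# manual excerpt.
SEVERITIES = ["debug", "info", "notice", "warning",
              "err", "crit", "alert", "emerg"]

def matches_disk_errors(event):
    """Return True if event satisfies all three filter conditions.

    event is a hypothetical dict with "facility", "severity", and
    "data" keys.
    """
    rank = SEVERITIES.index
    return (event["facility"] == "storage"                 # facility=storage
            and rank(event["severity"]) > rank("notice")   # severity>notice
            and "disk bay" in event["data"])               # data contains "disk bay"
```

Under these assumptions, a storage event of severity warning whose text includes "disk bay" matches, while the same event at severity notice does not.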
