HP StorageWorks Scalable File Share User Manual

Page 172

Verifying, diagnosing, and maintaining the system

6–42

Table 6-4 Default email alerts

Columns: Email Alert Name; Purpose; Email Alert Filter; Action Required

array_failure
  Purpose: Alerts you when a server cannot access an SFS20 storage array.
  Email Alert Filter: data contains "array" && severity>err
  Action Required: Power cycle the array and then reboot the servers attached to the array.

bond_link_down
  Purpose: Alerts you when one of the links in a bonded Ethernet interface has been disconnected.
  Email Alert Filter: facility=kern && data contains "status definitely down for interface"
  Action Required: Run the syscheck command to identify the link that has been disconnected. Reattach the cable to the interface.

check_condition
  Purpose: Alerts you when a LUN has a CHECK CONDITION error.
  Email Alert Filter: facility=kern && severity>notice && data contains "CHECK CONDITION"
  Action Required: Contact your HP Customer Support representative.

critical_iml
  Purpose: Alerts you to the presence of critical events in the ProLiant Integrated Management Log (IML).
  Email Alert Filter: severity=crit && data contains "Integrated Management Log"
  Action Required: These events usually indicate a serious problem with a server. Read the IML and correct the problem as soon as possible. (See Section 6.6 for more information on the IML.)

disk_errors
  Purpose: Alerts you when a disk drive in an SFS20 array fails or is predicted to fail.
  Email Alert Filter: facility=storage && severity>notice && data contains "disk bay"
  Action Required: Replace the disk that has the problem. See Section 8.1.10 for information on replacing a disk in an SFS20 array.

groups_server
  Purpose: Alerts you when the system cannot access the group server(s).
  Email Alert Filter: facility=lustre && data contains "all group servers have failed"
  Action Required: Check that the group servers are booted. If they are, check that the HP SFS system can connect to the group servers using the ssh utility without using a password. Refer to Chapter 9 in the HP StorageWorks Scalable File Share System Installation and Upgrade Guide for more information.

invalid_uid
  Purpose: Alerts you when the upcall mechanism for supplementary groups cannot resolve a user's UID. This happens when the UIDs of a client system do not match the UIDs of the group server.
  Email Alert Filter: facility=lustre && data contains "Invalid uid"
  Action Required: Check whether the group server and the client system are using a common set of user UIDs. Refer to Chapter 9 in the HP StorageWorks Scalable File Share System Installation and Upgrade Guide for more information.

link_failure
  Purpose: Alerts you when an interconnect link (Quadrics, Myrinet, Voltaire, Gigabit Ethernet) fails.
  Email Alert Filter: facility=lustre && severity>notice && data contains "link down"
  Action Required: Reseat the interconnect cables. If the problem persists, run the appropriate interconnect diagnostics (see Section 6.1.8 for more information).

lun_failure
  Purpose: Alerts you when an MDS or OST service fails to start because of a LUN failure.
  Email Alert Filter: data contains "/usr/sbin/lctl (22): error:"
  Action Required: The most common cause of this failure is that the array where the LUN is located is hung. Power cycle the array where the LUN is located and then reboot the servers attached to the array.
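
The filters in Table 6-4 combine facility, severity, and message-text tests with &&. The following Python sketch illustrates how such an expression could be read and applied to a single log event. It is not the HP SFS implementation. It assumes that severities follow the usual syslog ordering (so that severity>err means "more severe than err"), that "data contains" is a case-sensitive substring test, and that an event is a dictionary with facility, severity, and data keys. The function names and the sample log message are hypothetical.

```python
import re

# Usual syslog severity names, least to most severe. Reading ">" as
# "more severe than" is an assumption about the filter semantics.
SEVERITIES = ["debug", "info", "notice", "warning",
              "err", "crit", "alert", "emerg"]

# One clause: facility=NAME, severity=NAME, severity>NAME,
# or data contains "TEXT".
CLAUSE = re.compile(
    r'^\s*(?:(facility|severity)\s*(=|>)\s*(\w+)'
    r'|data\s+contains\s+"([^"]*)")\s*$'
)


def parse_filter(expr):
    """Split a filter on && and return a list of (field, op, value)."""
    clauses = []
    for part in expr.split("&&"):
        m = CLAUSE.match(part)
        if m is None:
            raise ValueError(f"unrecognised clause: {part!r}")
        if m.group(4) is not None:
            clauses.append(("data", "contains", m.group(4)))
        else:
            clauses.append((m.group(1), m.group(2), m.group(3)))
    return clauses


def matches(expr, event):
    """Return True if every clause of the filter holds for the event."""
    for field, op, value in parse_filter(expr):
        if op == "contains":
            ok = value in event["data"]
        elif op == "=":
            ok = event[field] == value
        elif field == "severity":
            # op is ">": compare positions in the severity ordering.
            ok = SEVERITIES.index(event[field]) > SEVERITIES.index(value)
        else:
            raise ValueError("'>' is only supported for severity")
        if not ok:
            return False
    return True


# Illustrative event; the message text is a made-up example.
event = {
    "facility": "kern",
    "severity": "warning",
    "data": "bond0: link status definitely down for interface eth1",
}
bond_filter = ('facility=kern && '
               'data contains "status definitely down for interface"')
print(matches(bond_filter, event))
```

Under these assumptions, the sample event satisfies the bond_link_down filter, while an event with severity err would not satisfy the array_failure filter, because severity>err excludes err itself.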
