HP StorageWorks Scalable File Share User Manual

Verifying, diagnosing, and maintaining the system
Table 6-4 Default email alerts
Columns: Email Alert Name; Purpose; Email Alert Filter; Action Required
array_failure
    Purpose: Alerts you when a server cannot access an SFS20 storage array.
    Email Alert Filter: data contains "array" && severity>err
    Action Required: Power cycle the array and then reboot the servers attached to the array.
bond_link_down
    Purpose: Alerts you when one of the links in a bonded Ethernet interface has been disconnected.
    Email Alert Filter: facility=kern && data contains "status definitely down for interface"
    Action Required: Run the syscheck command to identify the link that has been disconnected. Reattach the cable to the interface.
check_condition
    Purpose: Alerts you when a LUN has a CHECK CONDITION error.
    Email Alert Filter: facility=kern && severity>notice && data contains "CHECK CONDITION"
    Action Required: Contact your HP Customer Support representative.
critical_iml
    Purpose: Alerts you to the presence of critical events in the ProLiant Integrated Management Log (IML).
    Email Alert Filter: severity=crit && data contains "Integrated Management Log"
    Action Required: These events usually indicate a serious problem with a server. Read the IML and correct the problem as soon as possible. (See Section 6.6 for more information on the IML.)
disk_errors
    Purpose: Alerts you when a disk drive in an SFS20 array fails or is predicted to fail.
    Email Alert Filter: facility=storage && severity>notice && data contains "disk bay"
    Action Required: Replace the disk that has the problem. See information on replacing a disk in an SFS20 array.
groups_server
    Purpose: Alerts you when the system cannot access the group server(s).
    Email Alert Filter: facility=lustre && data contains "all group servers have failed"
    Action Required: Check that the group servers are booted. If they are, check that the HP SFS system can connect to the group servers using the ssh utility without using a password. Refer to Chapter 9 in the HP StorageWorks Scalable File Share System Installation and Upgrade Guide for more information.
invalid_uid
    Purpose: Alerts you when the upcall mechanism for supplementary groups cannot resolve a user's UID. This happens when the UIDs of a client system do not match the UIDs of the group server.
    Email Alert Filter: facility=lustre && data contains "Invalid uid"
    Action Required: Check whether the group server and the client system are using a common set of user UIDs. Refer to Chapter 9 in the HP StorageWorks Scalable File Share System Installation and Upgrade Guide for more information.
link_failure
    Purpose: Alerts you when an interconnect link (Quadrics, Myrinet, Voltaire, Gigabit Ethernet) fails.
    Email Alert Filter: facility=lustre && severity>notice && data contains "link down"
    Action Required: Reseat the interconnect cables. If the problem persists, run the appropriate interconnect diagnostics (see Section 6.1.8 for more information).
lun_failure
    Purpose: Alerts you when an MDS or OST service fails to start because of a LUN failure.
    Email Alert Filter: data contains "/usr/sbin/lctl (22): error:"
    Action Required: The most common cause of this failure is that the array where the LUN is located is hung. Power cycle the array where the LUN is located and then reboot the servers attached to the array.
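
The filters in Table 6-4 join clauses with &&, so an alert fires only when every clause holds. The following Python sketch illustrates that reading. It is not the HP SFS implementation, and it rests on several assumptions that this table does not confirm: severity names are ranked as in syslog, severity>err means strictly more severe than err, and data contains is a plain substring test. The matches function, the SEVERITIES list, and the sample event are hypothetical names and data introduced only for illustration.

```python
import re

# ASSUMPTION: severities are ranked least to most severe, using syslog names.
SEVERITIES = ["debug", "info", "notice", "warning", "err", "crit", "alert", "emerg"]

# One clause: facility=NAME, severity=NAME, severity>NAME, or data contains "TEXT".
CLAUSE = re.compile(
    r'^\s*(?:'
    r'(?P<field>facility|severity)\s*(?P<op>=|>)\s*(?P<value>\w+)'
    r'|data\s+contains\s+"(?P<text>[^"]*)"'
    r')\s*$'
)


def matches(filter_expr, event):
    """Return True if every &&-joined clause of filter_expr holds for event.

    Hypothetical helper: event is a dict with 'facility', 'severity', 'data'.
    Splitting on && is safe only while no quoted text contains &&, which is
    true of every filter in Table 6-4.
    """
    for clause in filter_expr.split("&&"):
        m = CLAUSE.match(clause)
        if m is None:
            raise ValueError(f"unsupported clause: {clause.strip()!r}")
        if m.group("text") is not None:
            # ASSUMPTION: 'contains' is a case-sensitive substring test.
            if m.group("text") not in event["data"]:
                return False
        elif m.group("op") == "=":
            if event[m.group("field")] != m.group("value"):
                return False
        else:
            # ASSUMPTION: '>' means strictly more severe, and only for severity.
            if m.group("field") != "severity":
                raise ValueError("'>' is only supported for severity")
            actual = SEVERITIES.index(event["severity"])
            threshold = SEVERITIES.index(m.group("value"))
            if actual <= threshold:
                return False
    return True


# Hypothetical kernel log event, invented for this example.
event = {"facility": "kern", "severity": "err",
         "data": "LUN 3 reported CHECK CONDITION"}
check_condition = 'facility=kern && severity>notice && data contains "CHECK CONDITION"'
array_failure = 'data contains "array" && severity>err'

print(matches(check_condition, event))  # all three clauses hold
print(matches(array_failure, event))    # data does not contain "array"
```

Under these assumptions the sample event satisfies the check_condition filter but not the array_failure filter, because its data field does not contain "array" and its severity is not above err.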