
1. Stop the Heartbeat service on all the OSS nodes:

# pdsh -w oss[1-n] service heartbeat stop

2. Stop the Heartbeat service on the MDS and MGS nodes:

# pdsh -w mgs,mds service heartbeat stop

3. To prevent the file system components and the Heartbeat service from automatically starting on boot, enter the following command:

# pdsh -a chkconfig --level 345 heartbeat off

This means that you must manually start the Heartbeat service and the file system after a file system server node is rebooted, as in the example below.
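For example, to start the Heartbeat service manually on a rebooted OSS node (the node name oss1 is used here only for illustration), you might enter:

# pdsh -w oss1 service heartbeat start

If you later want the Heartbeat service to start automatically on boot again, the chkconfig setting can be reversed:

# pdsh -a chkconfig --level 345 heartbeat on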

5.5 Monitoring Failover Pairs

Use the crm_mon command to monitor resources in a failover pair.

In the following sample crm_mon output, there are two Lustre OSS nodes and eight OSTs, four on each node.

============
Last updated: Thu Sep 18 16:00:40 2008
Current DC: n4 (0236b688-3bb7-458a-839b-c19a69d75afa)
2 Nodes configured.
10 Resources configured.
============

Node: n4 (0236b688-3bb7-458a-839b-c19a69d75afa): online
Node: n3 (48610537-c58e-48c5-ae4c-ae44d56527c6): online

Filesystem_1 (heartbeat::ocf:Filesystem): Started n3
Filesystem_2 (heartbeat::ocf:Filesystem): Started n3
Filesystem_3 (heartbeat::ocf:Filesystem): Started n3
Filesystem_4 (heartbeat::ocf:Filesystem): Started n3
Filesystem_5 (heartbeat::ocf:Filesystem): Started n4
Filesystem_6 (heartbeat::ocf:Filesystem): Started n4
Filesystem_7 (heartbeat::ocf:Filesystem): Started n4
Filesystem_8 (heartbeat::ocf:Filesystem): Started n4
Clone Set: clone_9
stonith_9:0 (stonith:external/riloe): Started n4
stonith_9:1 (stonith:external/riloe): Started n3
Clone Set: clone_10
stonith_10:0 (stonith:external/riloe): Started n4
stonith_10:1 (stonith:external/riloe): Started n3

The display updates periodically until you interrupt it and terminate the program.
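If you want a single snapshot rather than a continuously updating display, most versions of crm_mon also accept a one-shot option that prints the status once and exits, and an interval option that controls the refresh period; for example:

# crm_mon -1
# crm_mon -i 30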

5.6 Moving and Starting Lustre Servers Using Heartbeat

Lustre servers can be moved between nodes in a failover pair, stopped, or started using the
Heartbeat command crm_resource. The local file systems corresponding to the Lustre servers
appear as file system resources with names of the form Filesystem_n, where n is an integer.
The mapping from file system resource names to Lustre server mount-points is found in cib.xml.
For example, to move Filesystem_7 from its current location to node 11:

# crm_resource -H node11 -M -r Filesystem_7
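To check which Lustre mount point a given resource corresponds to, one informal approach is to search cib.xml for the resource definition; assuming the default Heartbeat CIB location, the directory attribute of the Filesystem resource shows the mount point:

# grep -A 10 'id="Filesystem_7"' /var/lib/heartbeat/crm/cib.xml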

The destination host name (the -H option) is optional. However, if it is not specified, crm_resource forces the resource to move by creating a rule for the current location with the value -INFINITY. This prevents the resource from running on that node again until the constraint is removed with crm_resource -U.
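For example, after Filesystem_7 has moved, you might remove the constraint so that the resource can again run on either node of the failover pair:

# crm_resource -U -r Filesystem_7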
