HP StorageWorks Scalable File Share User Manual

4. Start the Heartbeat service on the remaining OSS nodes:
# pdsh -w oss[1-n] service heartbeat start
5. After the file system has started, HP recommends that you set the Heartbeat service to
automatically start on boot:
# pdsh -a chkconfig --level 345 heartbeat on
This automatically starts the file system component defined to run on the node when it is
rebooted.
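One way to confirm the boot setting is to inspect the output of chkconfig --list heartbeat. The following is a minimal sketch, not part of the HP procedure. The helper name heartbeat_on_at_boot is hypothetical, and the sketch assumes the usual one-line chkconfig listing format of service name followed by level:state pairs.

```shell
# heartbeat_on_at_boot (hypothetical helper): reads ONE line of
# "chkconfig --list heartbeat" output on stdin and succeeds only if
# runlevels 3, 4, and 5 are all marked "on".
# Assumed input format: heartbeat  0:off 1:off 2:off 3:on 4:on 5:on 6:off
heartbeat_on_at_boot() {
    # Normalize tabs to spaces so each level:state pair is space-delimited.
    line=$(tr '\t' ' ')
    for level in 3 4 5; do
        case " $line " in
            *" $level:on "*) ;;      # this runlevel is enabled
            *) return 1 ;;           # missing or "off"
        esac
    done
    return 0
}
```

On a server node this could be used as: chkconfig --list heartbeat | heartbeat_on_at_boot && echo "heartbeat starts on boot"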
5.4 Stopping the File System
Before you stop the file system, unmount it on all client nodes. For example, run the following
command on each client node:
# umount /testfs
1. Stop the Heartbeat service on all the OSS nodes:
# pdsh -w oss[1-n] service heartbeat stop
2. Stop the Heartbeat service on the MDS and MGS nodes:
# pdsh -w mgs,mds service heartbeat stop
3. To prevent the file system components and the Heartbeat service from automatically starting
on boot, enter the following command:
# pdsh -a chkconfig --level 345 heartbeat off
This forces you to manually start the Heartbeat service and the file system after a file system
server node is rebooted.
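The ordering of the stop procedure (clients first, then OSS nodes, then MGS and MDS nodes) can be captured in a small wrapper. This is a sketch under stated assumptions, not an HP-supplied script. The host lists client[1-4] and oss[1-4] are placeholders, the function names run and stop_file_system are hypothetical, and the sketch assumes that pdsh can reach the client nodes as well as the servers.

```shell
# Placeholder host lists (assumptions); replace with the node names at your site.
CLIENTS=${CLIENTS:-"client[1-4]"}
OSS_NODES=${OSS_NODES:-"oss[1-4]"}
MGS_MDS_NODES=${MGS_MDS_NODES:-"mgs,mds"}
MOUNT_POINT=${MOUNT_POINT:-/testfs}

# run: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# stop_file_system: unmount the clients, then stop Heartbeat on the OSS
# nodes, then on the MGS and MDS nodes. The && chain ends the sequence
# if a pdsh invocation returns a non-zero status.
stop_file_system() {
    run pdsh -w "$CLIENTS" umount "$MOUNT_POINT" &&
    run pdsh -w "$OSS_NODES" service heartbeat stop &&
    run pdsh -w "$MGS_MDS_NODES" service heartbeat stop
}
```

Setting DRY_RUN=1 before calling stop_file_system prints the three pdsh commands without running them, which is a convenient way to review the host lists first.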
5.5 Testing Your Configuration
The best way to test your Lustre file system is to perform normal file system operations, for
example with standard Linux shell commands such as df, cd, and ls. To measure the performance
of your installation, you can use your own application or the standard file system performance
benchmarks described in Chapter 17, Benchmarking, of the Lustre 1.6 Operations Manual.
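The basic operations mentioned above can be combined into a quick smoke test on a client. This is a minimal sketch: the function name smoke_test and the temporary file name are hypothetical, and /testfs is assumed as the default mount point only because it is the example mount point used in this chapter.

```shell
# smoke_test [DIR] (hypothetical helper): run df, cd, and ls against DIR
# (default /testfs), then write a small file, read it back, and remove it.
# Prints "ok" on success; prints a short message and returns 1 on failure.
smoke_test() {
    dir=${1:-/testfs}
    file="$dir/smoke_test.$$"

    df "$dir" >/dev/null || { echo "df failed on $dir"; return 1; }
    (cd "$dir" && ls >/dev/null) || { echo "cd or ls failed in $dir"; return 1; }

    # Write a small file, verify its contents, and clean up.
    echo "hello lustre" > "$file" || { echo "write failed"; return 1; }
    if [ "$(cat "$file")" != "hello lustre" ]; then
        echo "read-back mismatch"
        rm -f "$file"
        return 1
    fi
    rm -f "$file" || { echo "cleanup failed"; return 1; }
    echo "ok"
}
```

Running smoke_test with no argument checks /testfs. A passing result only shows that basic operations work; it says nothing about performance.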
5.5.1 Examining and Troubleshooting
If your file system is not operating properly, refer to Part III, Lustre Tuning, Monitoring and
Troubleshooting, of the Lustre 1.6 Operations Manual. Many important commands for file system
operation and analysis are described in Part V, Reference, including lctl, lfs, tunefs.lustre,
and debugfs. Some of the most useful diagnostic and troubleshooting commands are briefly
described below.
5.5.1.1 On the Server
Use the following command to check the health of the system:
# cat /proc/fs/lustre/health_check
healthy
This returns healthy if there are no catastrophic problems. However, other less severe problems
that prevent proper operation might still exist.
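For use in monitoring scripts, the health check can be wrapped in a function that returns an exit status. This is a sketch only: the function name lustre_healthy is hypothetical, and the optional file argument exists so that the logic can be tried against an ordinary file.

```shell
# lustre_healthy [FILE] (hypothetical helper): succeed only if FILE
# (default /proc/fs/lustre/health_check) is readable and contains
# the single word "healthy".
# Returns 0 if healthy, 1 if the file reports anything else,
# and 2 if the file cannot be read (for example, Lustre is not loaded).
lustre_healthy() {
    health_file=${1:-/proc/fs/lustre/health_check}
    [ -r "$health_file" ] || return 2
    [ "$(cat "$health_file")" = "healthy" ]
}
```

As with the manual check, a zero exit status only rules out catastrophic problems.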
Use the following command to list the LNET network identifiers (NIDs) active on the node:
# lctl list_nids
172.31.97.1@o2ib
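Each NID has the form address@network, so the output above describes address 172.31.97.1 on the o2ib (InfiniBand) LNET network. The two parts can be separated with shell parameter expansion, as in this sketch. The function names nid_address and nid_network are hypothetical.

```shell
# nid_address NID (hypothetical helper): print the part before the last "@".
nid_address() {
    printf '%s\n' "${1%@*}"
}

# nid_network NID (hypothetical helper): print the part after the last "@".
nid_network() {
    printf '%s\n' "${1##*@}"
}

# Example use on a server node (left as a comment because it needs lctl):
#   lctl list_nids | while read -r nid; do
#       echo "address=$(nid_address "$nid") network=$(nid_network "$nid")"
#   done
```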
Using HP SFS Software