Copying files in parallel, Using parastation accounting, 26 5.18. using parastation accounting – PAR Technologies PARASTATION5 V5 User Manual


ParaStation5 Administrator's Guide

# UseMCast

statement.

If Multicast is enabled, the ParaStation daemons exchange status information using multicast messages.
Thus, a Linux kernel supporting multicast is required on all nodes of the cluster. This is usually not a problem,
since the standard kernels of all common distributions are compiled with multicast support. If a customized
kernel is used, multicast support must be enabled in the kernel configuration. To learn more
about multicast, take a look at the Multicast over TCP/IP HOWTO.

In addition, the hardware also has to support multicast packets. Since all modern Ethernet switches support
multicast and the nodes of a cluster typically live in a private subnet, this should not be a problem. If the
cluster nodes are connected by a gateway, it has to be configured so that multicast packets sent by any node
reach all other nodes of the cluster.

Using a gateway in order to link parts of a cluster is not a recommended configuration.

On nodes with more than one Ethernet interface, typically frontend or head nodes, or on systems where the
default route does not point to the private cluster subnet, a proper route for the multicast traffic must be
set up. This is done with the command

route add -net 224.0.0.0 netmask 240.0.0.0 dev ethX

where ethX should be replaced by the actual name of the interface connecting to all other nodes. In order
to enable this route at system startup, a corresponding entry has to be added to /etc/route.conf or
/etc/sysconfig/networks/routes, depending on the type of Linux distribution in use.
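The network named in the route command, 224.0.0.0 with netmask 240.0.0.0, is the complete IPv4 multicast range (224.0.0.0/4). The following Python sketch uses only the standard ipaddress module to confirm this. The helper name and the sample addresses are made up for illustration and are not taken from any ParaStation configuration.

```python
import ipaddress

# Network from the route command: 224.0.0.0 with netmask 240.0.0.0.
mcast_net = ipaddress.ip_network("224.0.0.0/240.0.0.0")


def covered_by_route(addr: str) -> bool:
    """Return True if addr lies inside the network the route covers."""
    return ipaddress.ip_address(addr) in mcast_net


if __name__ == "__main__":
    # The same network in prefix notation and its address range.
    print(mcast_net, mcast_net[0], mcast_net[-1])
    # Hypothetical sample addresses: one multicast group, one cluster node.
    for sample in ("239.1.2.3", "192.168.1.10"):
        print(sample, covered_by_route(sample))
```

Any multicast group address falls inside this network, so a single route entry is sufficient regardless of the group in use.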

5.17. Copying files in parallel

To copy large files to many or all nodes in a cluster at once, pscp is very handy. It overlaps storing data
to disk with transferring data over the network and therefore scales very well with respect to the number of
nodes. Files of arbitrary size may be copied; even archives containing large lists of files may be created
and unpacked on-the-fly.
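The idea of overlapping data transfer with disk writes can be illustrated with a small, generic Python sketch. It is not how pscp is implemented: the function name, chunk size and queue depth are assumptions chosen for the example, and an in-memory stream stands in for the network.

```python
import io
import queue
import threading

CHUNK = 64 * 1024  # hypothetical chunk size in bytes


def pipelined_copy(src, dst, depth=4):
    """Copy src to dst so that reading and writing overlap in time."""
    chunks = queue.Queue(maxsize=depth)  # bounded buffer between the two sides

    def reader():
        # Stands in for the side that receives data from the network.
        while True:
            chunk = src.read(CHUNK)
            chunks.put(chunk)  # an empty bytes object marks end of stream
            if not chunk:
                break

    worker = threading.Thread(target=reader, daemon=True)
    worker.start()

    total = 0
    while True:
        chunk = chunks.get()
        if not chunk:
            break
        dst.write(chunk)  # stands in for storing the data to disk
        total += len(chunk)
    worker.join()
    return total


if __name__ == "__main__":
    source = io.BytesIO(b"x" * 200_000)
    target = io.BytesIO()
    print(pipelined_copy(source, target), "bytes copied")
```

Because the reader keeps filling the queue while the writer is busy, neither side has to wait for the other to finish a whole file before starting its own work.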

Pscp uses the ParaStation pscom library for data transfers, which automatically uses the most
effective communication channel available. If required, the communication layer may be controlled using
environment variables; refer to ps_environment(7) for details. The client process on each node is spawned
using the ParaStation process management.

As pscp uses administrative ParaStation tasks to spawn the client processes, the user must be a member
of the adminuser list or the user's group must be a member of the admingroup list. By default, only root
is a member of the adminuser list and therefore allowed to use pscp. Refer to ParaStation5 User's Guide
and psiadmin(8) for details.
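The access rule described above can be written down as a short Python sketch. It only models the rule as stated in this section; the function name and the sample user and group names are hypothetical, and the real check is performed by ParaStation itself.

```python
def may_use_pscp(user, groups, adminuser, admingroup):
    """Model of the rule: the user is in adminuser, or one of the
    user's groups is in admingroup."""
    return user in adminuser or any(g in admingroup for g in groups)


if __name__ == "__main__":
    # Default described in the text: only root is in the adminuser list.
    print(may_use_pscp("root", ["root"], {"root"}, set()))
    # Hypothetical user whose group has been added to admingroup.
    print(may_use_pscp("alice", ["hpcadmin"], {"root"}, {"hpcadmin"}))
    # Hypothetical user with no administrative membership.
    print(may_use_pscp("bob", ["users"], {"root"}, {"hpcadmin"}))
```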

For more details refer to ParaStation5 User's Guide and pscp(8).

5.18. Using ParaStation accounting

ParaStation may write accounting information about each finished job run on the cluster to
/var/account/yyyymmdd, where yyyymmdd denotes the current accounting file in the form year, month and
day.
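As a small illustration of the naming scheme, the following Python sketch builds the accounting file name for a given date. It assumes that yyyymmdd means a four-digit year followed by zero-padded month and day; the helper name is hypothetical.

```python
from datetime import date

ACCOUNT_DIR = "/var/account"  # directory named in the text


def accounting_file(day: date) -> str:
    """Return the accounting file path for the given day.

    Assumes a zero-padded year-month-day name, as the yyyymmdd
    pattern suggests.
    """
    return f"{ACCOUNT_DIR}/{day.strftime('%Y%m%d')}"


if __name__ == "__main__":
    # Path of the accounting file for the current day.
    print(accounting_file(date.today()))
```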

To enable accounting, the special hardware accounter must be set within the ParaStation configuration
file for at least one node. On each configured node, an accounting daemon collecting all information for all
jobs within the cluster will store the job information in the accounting file.
