5 srun control options, 1 node management options, 2 working features options – HP XC System 2.x Software User Manual


6.4.5 srun Control Options

srun control options determine how a SLURM job manages its nodes and other resources,
what its working features (such as job name) are, and how it gives you help. Separate
"constraint" options and I/O options are available and are described in other sections of this
chapter. The following types of control options are available:

Node management

Working features

Resource control

Help options

6.4.5.1 Node Management Options

-k (--no-kill)

The -k option prevents automatic termination of the job if any node allocated to it fails.
The job assumes responsibility for handling such node failures internally. (SLURM’s default is
to terminate a job if any of its allocated nodes fail.)

-m dist (--distribution=dist)

The -m option tells SLURM how to distribute tasks among nodes for this job. The choices for
dist are either block or cyclic.

block

Assigns tasks in order to each CPU on one node before assigning any to the
next node. This is the default if the number of tasks exceeds the number of
nodes requested.

cyclic

Assigns tasks "round robin" across all allocated nodes (task 1 goes to the first
node, task 2 goes to the second node, and so on). This is the default if the number
of nodes requested equals or exceeds the number of tasks.
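
The two distribution policies and the default rule described above can be illustrated with a
minimal Python sketch. It is not SLURM code: the function name distribute, its parameters, and
the restriction to at most one task per CPU are hypothetical choices made for illustration
only. Tasks and nodes are numbered from 0 here, whereas the text above counts tasks from 1.

```python
def distribute(n_tasks, n_nodes, cpus_per_node, dist=None):
    """Return a list whose entry i is the node index assigned to task i.

    Illustrative model only; names and signature are assumptions,
    not part of SLURM.
    """
    if dist is None:
        # Default rule from the text: block when tasks outnumber nodes,
        # cyclic when nodes equal or exceed tasks.
        dist = "block" if n_tasks > n_nodes else "cyclic"

    if dist == "block":
        # Assumption of this sketch: never more than one task per CPU.
        if n_tasks > n_nodes * cpus_per_node:
            raise ValueError("sketch assumes at most one task per CPU")
        # Fill every CPU on one node before moving to the next node.
        return [task // cpus_per_node for task in range(n_tasks)]

    if dist == "cyclic":
        # Round robin: consecutive tasks go to consecutive nodes.
        return [task % n_nodes for task in range(n_tasks)]

    raise ValueError(f"unknown distribution: {dist}")
```

For four tasks on two nodes with two CPUs each, distribute(4, 2, 2, "block") gives
[0, 0, 1, 1], while distribute(4, 2, 2, "cyclic") gives [0, 1, 0, 1].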

-r n (--relative=n)

The -r option offsets the first job step to node n of this job’s allocated node set (where the
first node is 0). Option -r is incompatible with "constraint" options -w and -x, and it is
ignored when you run a job without a prior node allocation (the default for n is 0).
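
The zero-based offset and the incompatibility with -w and -x can be pictured with a small
Python sketch. This is a hedged model, not SLURM's implementation: the function
first_step_node, its parameters, and the node names are hypothetical, and the case of a job
run without a prior allocation (where -r is ignored) is not modeled.

```python
def first_step_node(allocated_nodes, n=0, nodelist=None, exclude=None):
    """Return the node on which the first job step would start.

    allocated_nodes: the job's allocated node set, in order (hypothetical names).
    n: the -r offset; the first node is 0, which is also the default.
    nodelist / exclude: stand-ins for the -w and -x "constraint" options.
    """
    # The text states that -r is incompatible with -w and -x.
    if nodelist is not None or exclude is not None:
        raise ValueError("-r cannot be combined with -w or -x")
    # Assumption of this sketch: an out-of-range offset is an error.
    if not 0 <= n < len(allocated_nodes):
        raise IndexError("offset is outside the allocated node set")
    return allocated_nodes[n]
```

With an allocation of ["n0", "n1", "n2"], an offset of 2 selects "n2", and omitting the offset
selects "n0".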

-s (--share)

The -s option allows this job to share nodes with other running jobs. Sharing nodes often starts
the job faster and boosts system utilization, but it can also lower application performance.

6.4.5.2 Working Features Options

-D path (--chdir=path)

The -D option causes each remote process to change its default directory to path (by using
CHDIR) before it begins execution. Without -D, the current working directory of srun becomes
the default directory for each process.

-d level (--slurmd-debug=level)

The -d option specifies level as the level at which daemon SLURMD reports debug information
and deposits it in this job’s stderr location. Here, level can be any integer between 0 (quiet,
reports only errors, the default) and 5 (extremely verbose messages).
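
As a hedged illustration of how the two working features options above might be combined, the
Python sketch below assembles an srun command line without running it. The helper name
srun_argv, the directory /tmp, and the hostname command are assumptions chosen for the
example; only the --chdir and --slurmd-debug option names and the 0 to 5 range come from the
text.

```python
import shlex


def srun_argv(command, chdir=None, slurmd_debug=None):
    """Build an argument list for srun (hypothetical helper, does not run it)."""
    argv = ["srun"]
    if chdir is not None:
        # Long form of -D: each remote process changes to this directory.
        argv.append(f"--chdir={chdir}")
    if slurmd_debug is not None:
        # Long form of -d: the text allows integers from 0 (quiet) to 5.
        if not 0 <= slurmd_debug <= 5:
            raise ValueError("level must be an integer from 0 to 5")
        argv.append(f"--slurmd-debug={slurmd_debug}")
    return argv + list(command)


# Show the command as it would be typed in a shell.
print(shlex.join(srun_argv(["hostname"], chdir="/tmp", slurmd_debug=2)))
```

The printed command is: srun --chdir=/tmp --slurmd-debug=2 hostname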
