8 srun environment variables, 9 using srun with hp-mpi, 10 using srun with lsf – HP XC System 2.x Software User Manual



6.4.8 srun Environment Variables

Many srun options have corresponding environment variables. An srun option, if invoked, always overrides (resets) the corresponding environment variable (which contains each job feature’s default value, if there is a default).
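The precedence described above can be sketched in a few lines of Python. This is only an illustration of the rule, not how srun itself is implemented: the helper name effective_value is hypothetical, the pairing of SLURM_PARTITION with a partition option is an assumption taken from general SLURM documentation rather than from this section, and the partition names are made up.

```python
import os

def effective_value(option_value, env_name, env=os.environ, default=None):
    """Resolve one job feature: invoked option first, then the
    environment variable, then a fallback default (hypothetical helper)."""
    if option_value is not None:
        # An option that was actually invoked always wins.
        return option_value
    # Otherwise the environment variable supplies the default, if set.
    return env.get(env_name, default)

# Assumed variable name and made-up partition names, for illustration only.
env = {"SLURM_PARTITION": "debug"}
from_env = effective_value(None, "SLURM_PARTITION", env)
from_option = effective_value("batch", "SLURM_PARTITION", env)
print(from_env, from_option)
```

Passing the environment as a parameter keeps the sketch testable without changing the real process environment.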

In addition, srun sets the following environment variables for each executing task on the remote compute nodes:

SLURM_JOBID

Specifies the job ID of the executing job.

SLURM_NODEID

Specifies the relative node ID of the current node.

SLURM_NODELIST

Specifies the list of nodes on which the job is actually running.

SLURM_NPROCS

Specifies the total number of processes in the job.

SLURM_PROCID

Specifies the MPI rank (or relative process ID) for the current process.
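As a minimal sketch, a task launched by srun could read the variables listed above as follows. The helper names task_context and describe are hypothetical, and the values in the usage example are invented so that the code also runs outside a SLURM job.

```python
import os

# Variables that, per the list above, srun sets for each executing task.
TASK_VARS = ("SLURM_JOBID", "SLURM_NODEID", "SLURM_NODELIST",
             "SLURM_NPROCS", "SLURM_PROCID")

def task_context(env=os.environ):
    """Collect the task variables; a variable that is not set maps to None."""
    ctx = {name: env.get(name) for name in TASK_VARS}
    # Convert the numeric IDs and counts to int when they are present.
    for name in ("SLURM_NODEID", "SLURM_NPROCS", "SLURM_PROCID"):
        if ctx[name] is not None:
            ctx[name] = int(ctx[name])
    return ctx

def describe(ctx):
    """Format a one-line summary of the current task."""
    return ("task {SLURM_PROCID} of {SLURM_NPROCS} "
            "on node {SLURM_NODEID} (job {SLURM_JOBID})").format(**ctx)

# Invented values standing in for a real srun environment.
example = task_context({
    "SLURM_JOBID": "12345",
    "SLURM_NODEID": "0",
    "SLURM_NODELIST": "n[9-12]",
    "SLURM_NPROCS": "8",
    "SLURM_PROCID": "3",
})
print(describe(example))
```

Inside a real job, calling task_context() with no argument reads the process environment instead of the sample dictionary.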

Other environment variables important for srun-managed jobs include:

MAX_TASKS_PER_NODE

Provides an upper bound on the number of tasks that srun assigns to each job node, even if you allow more than one process per CPU by invoking the srun -O option.

SLURM_NNODES

Is the actual number of nodes assigned to run your job (which may exceed the number of nodes that you explicitly requested with the srun -N option).
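The two variables above lend themselves to simple arithmetic checks. The following is a rough sketch under stated assumptions: both function names are hypothetical, the first only works out the consequence of a per-node task limit (it does not claim to reproduce srun's allocation logic), and the second compares SLURM_NNODES with a node count that the caller supplies.

```python
def min_nodes_for(ntasks, max_tasks_per_node):
    """Smallest node count that can hold ntasks when no node may run
    more than max_tasks_per_node tasks (ceiling division)."""
    if ntasks < 0 or max_tasks_per_node < 1:
        raise ValueError("ntasks must be >= 0 and the per-node limit >= 1")
    return (ntasks + max_tasks_per_node - 1) // max_tasks_per_node

def extra_nodes(requested_nodes, env):
    """How many more nodes were assigned than were explicitly requested,
    based on the SLURM_NNODES value found in env."""
    assigned = int(env["SLURM_NNODES"])
    return assigned - requested_nodes

# Invented numbers for illustration.
print(min_nodes_for(10, 4))
print(extra_nodes(2, {"SLURM_NNODES": "4"}))
```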

6.4.9 Using srun with HP-MPI

The srun command can be used as an option in an HP-MPI launch command. Refer to Section 8.3.3 for information about using srun with HP-MPI.

6.4.10 Using srun with LSF

The srun command can be used in an LSF launch command. Refer to Chapter 7 for information about using srun with LSF.

6.5 Monitoring Jobs with the squeue Command

The squeue command displays the queue of running and waiting jobs (or "job steps"), including the JobID (used for scancel) and the nodes assigned to each running job. It has a wide variety of filtering, sorting, and formatting options. By default, it reports the running jobs in priority order and then the pending jobs in priority order.

Example 6-2 reports on job 12345 and job 12346:

Example 6-2: Displaying Queued Jobs by Their JobIDs

$ squeue --jobs 12345,12346
JOBID PARTITION NAME USER ST TIME_USED NODES NODELIST
12345     debug job1 jody  R      0:21     4 n[9-12]
12346     debug job2 jody PD      0:00     8
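For scripts that need to act on this report, the output of Example 6-2 can be split into one record per job. The sketch below assumes whitespace-separated columns exactly as shown in the example and no spaces inside job names; parse_squeue and query_jobs are hypothetical helper names, and query_jobs only works on a system where squeue is on the PATH.

```python
import subprocess

def parse_squeue(text):
    """Turn whitespace-separated squeue output into a list of dicts
    keyed by the header names."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    jobs = []
    for ln in lines[1:]:
        fields = ln.split()
        # A pending job has no NODELIST yet, so pad short rows.
        fields += [""] * (len(header) - len(fields))
        jobs.append(dict(zip(header, fields)))
    return jobs

def query_jobs(job_ids):
    """Run squeue --jobs for the given JobIDs and parse the result."""
    arg = ",".join(str(j) for j in job_ids)
    result = subprocess.run(["squeue", "--jobs", arg],
                            capture_output=True, text=True, check=True)
    return parse_squeue(result.stdout)

# The report from Example 6-2, used here so the sketch runs anywhere.
SAMPLE = """\
JOBID PARTITION NAME USER ST TIME_USED NODES NODELIST
12345 debug job1 jody R 0:21 4 n[9-12]
12346 debug job2 jody PD 0:00 8
"""

jobs = parse_squeue(SAMPLE)
running = [j["JOBID"] for j in jobs if j["ST"] == "R"]
pending = [j["JOBID"] for j in jobs if j["ST"] == "PD"]
print(running, pending)
```

Padding the short row keeps every record on the same set of keys, so a pending job simply reports an empty NODELIST.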

