4. LSF-HPC prepares the user environment for the job on the LSF-HPC execution host node and dispatches the job with the job_starter.sh script. This user environment includes standard LSF environment variables and two SLURM-specific environment variables: SLURM_JOBID and SLURM_NPROCS.

SLURM_JOBID is the SLURM job ID of the job. Note that this is not the same as the LSF jobID.

SLURM_NPROCS is the number of processors allocated.

These environment variables are intended for use by the user's job, whether explicitly (user scripts may use these variables as necessary) or implicitly (any srun command in the user's job can use these variables to determine its allocation of resources); a sketch of such a script appears after these steps.

In this example, the value of SLURM_NPROCS is 4 and the value of SLURM_JOBID is 53.

5. The user job myscript begins execution on compute node n1. The first line in myscript is the hostname command. It executes locally and returns the name of the node, n1.

6. The second line in the myscript script is the srun hostname command. The srun command in myscript inherits SLURM_JOBID and SLURM_NPROCS from the environment and executes the hostname command on each compute node in the allocation.

7. The output of the hostname tasks (n1, n2, n3, and n4) is aggregated back to the srun launch command (shown as dashed lines in Figure 7-1), and is ultimately returned to the srun command in the job starter script, where it is collected by LSF-HPC.

The last line in myscript is the mpirun -srun ./hellompi command. The srun command inside the mpirun command in myscript inherits the SLURM_JOBID and SLURM_NPROCS environment variables from the environment and executes hellompi on each compute node in the allocation; that is, the command executes on the allocated compute nodes n1, n2, n3, and n4. The output of the hellompi tasks is aggregated back to the srun launch command, where it is collected by LSF-HPC.

When the job finishes, LSF-HPC cancels the SLURM allocation, which frees the compute
nodes for use by another job.
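
The following is a minimal sketch of a script along the lines of myscript as described in the preceding steps. The echo line is an illustrative addition (it is not part of the original example) showing the SLURM-specific variables that LSF-HPC sets:

    #!/bin/sh
    # Runs locally on the first allocated node and prints its name (n1)
    hostname
    # Illustrative addition: the variables set by LSF-HPC for this job
    echo "SLURM_JOBID=$SLURM_JOBID SLURM_NPROCS=$SLURM_NPROCS"
    # srun inherits SLURM_JOBID and SLURM_NPROCS from the environment
    # and runs hostname on every compute node in the allocation
    srun hostname
    # Launches the MPI program hellompi on all allocated compute nodes
    mpirun -srun ./hellompi

With the four-node allocation in this example, the script prints n1 for the local hostname command, then n1 through n4 from srun hostname, followed by the output of the hellompi tasks.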

7.1.5 Differences Between LSF on HP XC and Standard LSF

LSF for the HP XC environment supports all the standard features and functions that standard
LSF supports, except for those items described in this section, in Section 7.1.6, and in the HP
XC release notes for LSF.

The external scheduler option for HP XC provides additional capabilities at the job level and
queue level by allowing the inclusion of several SLURM options in the LSF command line.
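
As an illustration only (the exact option string and SLURM keywords shown here are assumptions based on LSF's external scheduler syntax, not confirmed by this section), such a submission might look like:

    $ bsub -n 4 -ext "SLURM[nodes=4]" -o myjob.out ./myscript

Here -n 4 requests four slots from LSF, while the quoted SLURM[] string passes a node-count constraint through to SLURM.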

LSF does not collect the maxswap, ndisks, r15s, r1m, r15m, ut, pg, io, tmp, swp, and mem load indices from each application node. The lshosts and lsload commands display "-" for all of these items.

LSF-enforced job-level run-time limits are not supported.

Except for run time (wall clock) and the total number of CPUs, LSF cannot report any other job accounting information.

LSF does not support parallel or SLURM-based interactive jobs in PTY mode (bsub -Is and bsub -Ip).

LSF does not support user-account mapping and system-account mapping.
