Running Jobs

To run an application (job), computational resources must be allocated. ARIS uses the SLURM Workload Manager (Simple Linux Utility for Resource Management) to distribute workloads across the supercomputer. For more information, check the SLURM quick start guide: https://computing.llnl.gov/linux/slurm/quickstart.html

SLURM has three key functions:

First, it allocates exclusive and/or non-exclusive access to resources (compute nodes) to users for some duration of time so they can perform work.

Second, it provides a framework for starting, executing, and monitoring work (normally a parallel job) on the set of allocated nodes.

Finally, it arbitrates contention for resources by managing a queue of pending work.

Common terms:

  • nodes: the compute resource in SLURM
  • partitions (queues): node groups
  • jobs: allocations of resources
  • job steps: sets of tasks within a job.

Resource Queues

To use all the nodes efficiently and in a fair-share fashion, resources are organized into queues (partitions). Queues group nodes into logical sets, each of which has its own constraints, such as job size limit, job time limit, and permitted users.

To determine what partitions exist on the system, what nodes they include, and the general system state, use the sinfo command:

sinfo -s

Please check the sinfo man page for more information.

man sinfo
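
The output columns can also be tailored with the -o format option. A sketch (the partition name and format string are only illustrative) that produces a per-partition view similar to the table below:

sinfo -p gpu -o "%P %a %l %D %N"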

ARIS queue (partition) overview:

Queue Table

PARTITION  DESCRIPTION                    AVAIL  TIMELIMIT   NODES  NODELIST
compute    Compute nodes w/o accelerator  up     2-00:00:00  426    node[001-426]
fat        Fat compute nodes              up     2-00:00:00  24     fat[01-24]
taskp      Serial jobs queue              up     2-00:00:00  20     fat[25-44]
gpu        GPU accelerated nodes          up     2-00:00:00  44     gpu[01-44]
phi        MIC accelerated nodes          up     2-00:00:00  18     phi[01-18]

  • compute queue: intended for parallel jobs on the THIN compute nodes.
  • fat queue: dedicated to parallel jobs on the FAT compute nodes.
  • taskp queue: intended for multiple serial jobs on the FAT compute nodes.
  • gpu queue: provides access to the GPU accelerated nodes.
  • phi queue: provides access to the PHI (MIC) accelerated nodes.

The scontrol command can be used to report more detailed information about partitions and their configuration:

$ scontrol show partition

PartitionName=compute
   AllowGroups=ALL AllowAccounts=ALL AllowQos=ALL
   AllocNodes=ALL Default=YES
   DefaultTime=NONE DisableRootJobs=NO GraceTime=0 Hidden=NO
   MaxNodes=UNLIMITED MaxTime=UNLIMITED MinNodes=1 LLN=NO MaxCPUsPerNode=UNLIMITED
   Nodes=node[001-426]
   Priority=1 RootOnly=NO ReqResv=NO Shared=NO PreemptMode=OFF
   State=UP TotalCPUs=8520 TotalNodes=426 SelectTypeParameters=N/A
   DefMemPerNode=UNLIMITED MaxMemPerNode=57344

PartitionName=fat
   AllowGroups=ALL AllowAccounts=ALL AllowQos=ALL
   AllocNodes=ALL Default=NO
   DefaultTime=NONE DisableRootJobs=NO GraceTime=0 Hidden=NO
   MaxNodes=UNLIMITED MaxTime=UNLIMITED MinNodes=1 LLN=NO MaxCPUsPerNode=UNLIMITED
   Nodes=fat[01-24]
   Priority=1 RootOnly=NO ReqResv=NO Shared=NO PreemptMode=OFF
   State=UP TotalCPUs=960 TotalNodes=24 SelectTypeParameters=N/A
   DefMemPerNode=UNLIMITED MaxMemPerNode=507904

PartitionName=taskp
   AllowGroups=ALL AllowAccounts=ALL AllowQos=ALL
   AllocNodes=ALL Default=NO
   DefaultTime=NONE DisableRootJobs=NO GraceTime=0 Hidden=NO
   MaxNodes=UNLIMITED MaxTime=UNLIMITED MinNodes=1 LLN=NO MaxCPUsPerNode=UNLIMITED
   Nodes=fat[25-44]
   Priority=1 RootOnly=NO ReqResv=NO Shared=NO PreemptMode=OFF
   State=UP TotalCPUs=1600 TotalNodes=20 SelectTypeParameters=N/A
   DefMemPerNode=UNLIMITED MaxMemPerNode=507904

PartitionName=gpu
   AllowGroups=ALL AllowAccounts=ALL AllowQos=ALL
   AllocNodes=ALL Default=NO
   DefaultTime=NONE DisableRootJobs=NO GraceTime=0 Hidden=NO
   MaxNodes=UNLIMITED MaxTime=UNLIMITED MinNodes=1 LLN=NO MaxCPUsPerNode=UNLIMITED
   Nodes=gpu[01-44]
   Priority=1 RootOnly=NO ReqResv=NO Shared=NO PreemptMode=OFF
   State=UP TotalCPUs=880 TotalNodes=44 SelectTypeParameters=N/A
   DefMemPerNode=UNLIMITED MaxMemPerNode=57344

PartitionName=phi
   AllowGroups=ALL AllowAccounts=ALL AllowQos=ALL
   AllocNodes=ALL Default=NO
   DefaultTime=NONE DisableRootJobs=NO GraceTime=0 Hidden=NO
   MaxNodes=UNLIMITED MaxTime=UNLIMITED MinNodes=1 LLN=NO MaxCPUsPerNode=UNLIMITED
   Nodes=phi[01-18]
   Priority=1 RootOnly=NO ReqResv=NO Shared=NO PreemptMode=OFF
   State=UP TotalCPUs=360 TotalNodes=18 SelectTypeParameters=N/A
   DefMemPerNode=UNLIMITED MaxMemPerNode=57344

$ scontrol show node node426

NodeName=node426 Arch=x86_64 CoresPerSocket=10
   CPUAlloc=0 CPUErr=0 CPUTot=20 CPULoad=0.15 Features=(null)
   Gres=(null)
   NodeAddr=node426 NodeHostName=node426 Version=14.11
   OS=Linux RealMemory=57344 AllocMem=0 Sockets=2 Boards=1
   State=IDLE ThreadsPerCore=1 TmpDisk=0 Weight=1
   BootTime=2015-07-27T19:34:56 SlurmdStartTime=2015-07-27T22:15:06
   CurrentWatts=0 LowestJoules=0 ConsumedJoules=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s

Job Submission

To create a resource allocation and launch tasks, you can submit a batch script.

A batch script submitted to the scheduling system must specify the job requirements:

  1. resource queue (the default is compute)
  2. number of nodes required
  3. number of cores per node required
  4. maximum wall time for the job (note that jobs exceeding the wall time will be killed)

To submit a job, use the sbatch command:

sbatch my_script

Please check the sbatch man page for more information.

man sbatch
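
The same specifications can also be given on the sbatch command line, where they override the corresponding #SBATCH directives in the script. A sketch (queue, sizes, time limit, and project are illustrative values):

sbatch --partition=compute --nodes=2 --ntasks-per-node=20 --time=01:00:00 --account=testproj my_script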

Define batch script

Batch scripts contain:

  1. scheduler directives: lines beginning with #SBATCH
  2. shell commands: UNIX shell (bash) commands
  3. job steps: created with the srun command

#!/bin/bash
#SBATCH --job-name=my_script    # Job name
#SBATCH --ntasks=2              # Number of tasks
#SBATCH --time=01:30:00         # Run time (hh:mm:ss) - 1.5 hours

module load gnu intelmpi        #load any needed modules

echo "Start at `date`"
cd $HOME/workdir
./a.out
echo "End at `date`"

To submit this batch script

sbatch my_script

Job Specifications

Option               Argument          Specification
--job-name, -J       job_name          Job name is job_name
--partition, -p      queue_name        Submits to queue queue_name
--account, -A        project_name      Project to charge compute hours
--ntasks, -n         number_of_tasks   Total number of tasks
--nodes, -N          number_of_nodes   Number of nodes
--ntasks-per-node    ntasks_per_node   Tasks per node
--cpus-per-task, -c  cpus_per_task     Threads per task
--time, -t           HH:MM:SS          Time limit (hh:mm:ss)
--mem                memory_mb         Total memory requirements (MB)
--output, -o         stdout_filename   Direct job standard output to stdout_filename (%j expands to jobID)
--error, -e          stderr_filename   Direct job standard error to stderr_filename (%j expands to jobID)
--depend, -d         afterok:jobid     Job dependency

SLURM Environment Variables

SLURM provides environment variables for most of the values used in the #SBATCH directives.

Environment Variable       Description
$SLURM_JOBID               Job ID
$SLURM_JOB_NAME            Job name
$SLURM_SUBMIT_DIR          Submit directory
$SLURM_SUBMIT_HOST         Submit host
$SLURM_JOB_NODELIST        Node list
$SLURM_JOB_NUM_NODES       Number of nodes
$SLURM_CPUS_ON_NODE        Number of cores per node
$SLURM_CPUS_PER_TASK       Threads per task
$SLURM_NTASKS_PER_NODE     Number of tasks per node

#!/bin/bash
#SBATCH --job-name=slurm_env
#SBATCH --nodes=2                # 2 nodes
#SBATCH --ntasks-per-node=12     # Number of tasks to be invoked on each node
#SBATCH --mem-per-cpu=1024       # Minimum memory required per CPU (in megabytes)
#SBATCH --time=00:01:00          # Run time in hh:mm:ss
#SBATCH --error=job.%J.out
#SBATCH --output=job.%J.out

echo "Start at `date`"
echo "Running on hosts: $SLURM_NODELIST"
echo "Running on $SLURM_NNODES nodes."
echo "Running $SLURM_NTASKS_PER_NODE tasks per node"
echo "Job id is $SLURM_JOBID"
echo "End at `date`"

Job Scripts

Here are some sample job submission scripts for different runtime models.

  • Serial job: run serial programs or scripts on a single core.
  • MPI job: run multi-process programs with MPI.
  • Hybrid job: run parallel programs with MPI and OpenMP threads.
  • GPU job: utilize the GPU accelerators.
  • PHI job: utilize the PHI accelerators (offload mode only).

Serial batch script

#!/bin/bash

#-----------------------------------------------------------------
# Serial job, requesting 1 core and 2800 MB of memory per job
#-----------------------------------------------------------------

#SBATCH --job-name=serialjob # Job name
#SBATCH --output=serialjob.%j.out # Stdout (%j expands to jobId)
#SBATCH --error=serialjob.%j.err # Stderr (%j expands to jobId)
#SBATCH --ntasks=1 # Total number of tasks
#SBATCH --nodes=1 # Total number of nodes requested
#SBATCH --ntasks-per-node=1 # Tasks per node
#SBATCH --cpus-per-task=1 # Threads per task
#SBATCH --mem=2800 # Memory per job in MB
#SBATCH -t 01:30:00 # Run time (hh:mm:ss) - (max 48h)
#SBATCH --partition=taskp # Submit queue
#SBATCH -A testproj # Accounting project

# Load any necessary modules
module load gnu
module load intel

# Launch the executable a.out
./a.out ARGS

Pure MPI batch script

Launch MPI jobs with srun command

DO NOT USE mpirun OR mpiexec

#!/bin/bash

#-----------------------------------------------------------------
# Pure MPI job, using 80 MPI tasks on 4 nodes,
# with 20 tasks per node and 1 thread per MPI task

#-----------------------------------------------------------------

#SBATCH --job-name=mpijob # Job name
#SBATCH --output=mpijob.%j.out # Stdout (%j expands to jobId)
#SBATCH --error=mpijob.%j.err # Stderr (%j expands to jobId)
#SBATCH --ntasks=80 # Total number of tasks
#SBATCH --nodes=4 # Total number of nodes requested
#SBATCH --ntasks-per-node=20 # Tasks per node
#SBATCH --cpus-per-task=1 # Threads per task(=1) for pure MPI
#SBATCH --mem=56000 # Memory per job in MB
#SBATCH -t 01:30:00 # Run time (hh:mm:ss) - (max 48h)
#SBATCH --partition=compute # Submit queue
#SBATCH -A testproj # Accounting project

# Load any necessary modules

module load gnu
module load intel
module load intelmpi

export I_MPI_FABRICS=shm:dapl

# Launch the executable

srun EXE ARGS

Hybrid MPI/OpenMP batch script

Launch MPI jobs with srun command

DO NOT USE mpirun OR mpiexec

#!/bin/bash

#-----------------------------------------------------------------
# Hybrid MPI/OpenMP job, using 8 MPI tasks on 4 nodes,
# with 2 tasks per node and 10 threads per MPI task (80 cores in total).
#-----------------------------------------------------------------

#SBATCH --job-name=hybridjob # Job name
#SBATCH --output=hybridjob.%j.out # Stdout (%j expands to jobId)
#SBATCH --error=hybridjob.%j.err # Stderr (%j expands to jobId)
#SBATCH --ntasks=8 # Total number of tasks
#SBATCH --nodes=4 # Total number of nodes requested
#SBATCH --ntasks-per-node=2 # Tasks per node
#SBATCH --cpus-per-task=10 # Threads per task
#SBATCH --mem=56000 # Memory per job in MB
#SBATCH -t 01:30:00 # Run time (hh:mm:ss) - (max 48h)
#SBATCH --partition=compute # Submit queue
#SBATCH -A testproj # Accounting project

# Load any necessary modules

module load gnu
module load intel
module load intelmpi

export I_MPI_FABRICS=shm:dapl

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK

# Launch the executable
srun EXE ARGS

GPU batch script

Launch GPU accelerated jobs.

#!/bin/bash

#-----------------------------------------------------------------
# GPU job, using 4 MPI tasks on 4 nodes,
# with 2 GPUs per node, 1 task per node, and 20 threads per MPI task (80 cores in total).
#-----------------------------------------------------------------

#SBATCH --job-name=gpujob # Job name
#SBATCH --output=gpujob.%j.out # Stdout (%j expands to jobId)
#SBATCH --error=gpujob.%j.err # Stderr (%j expands to jobId)
#SBATCH --ntasks=4 # Total number of tasks
#SBATCH --gres=gpu:2 # GPUs per node
#SBATCH --nodes=4 # Total number of nodes requested
#SBATCH --ntasks-per-node=1 # Tasks per node
#SBATCH --cpus-per-task=20 # Threads per task
#SBATCH --mem=56000 # Memory per job in MB
#SBATCH -t 01:30:00 # Run time (hh:mm:ss) - (max 48h)
#SBATCH --partition=gpu # Run on the GPU nodes queue
#SBATCH -A testproj # Accounting project

# Load any necessary modules

module load gnu
module load intel
module load intelmpi
module load cuda

export I_MPI_FABRICS=shm:dapl

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK

# Launch the executable
srun EXE ARGS

PHI batch script

Launch PHI accelerated jobs.

#!/bin/bash

#-----------------------------------------------------------------
# PHI job, using 4 MPI tasks on 4 nodes,
# with 2 PHI coprocessors per node, 1 task per node, and 20 threads per MPI task.
#-----------------------------------------------------------------

#SBATCH --job-name=phijob # Job name
#SBATCH --output=phijob.%j.out # Stdout (%j expands to jobId)
#SBATCH --error=phijob.%j.err # Stderr (%j expands to jobId)
#SBATCH --ntasks=4 # Total number of tasks
#SBATCH --nodes=4 # Total number of nodes requested
#SBATCH --ntasks-per-node=1 # Tasks per node
#SBATCH --cpus-per-task=20 # Threads per task
#SBATCH --gres=mic:2 # Accelerators per node
#SBATCH --mem=56000 # Memory per job in MB
#SBATCH -t 01:30:00 # Run time (hh:mm:ss) - (max 48h)
#SBATCH --partition=phi # Run on the PHI nodes queue
#SBATCH -A testproj # Accounting project

# Load any necessary modules

module load gnu
module load intel
module load intelmpi

export I_MPI_FABRICS=shm:dapl

## (HOST) OPENMP NUMBER OF THREADS
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK

## (MIC) OPENMP NUMBER OF THREADS
export MIC_ENV_PREFIX=MIC
## (MIC) 60 physical cores 4 hardware threads
export MIC_OMP_NUM_THREADS=240

# Launch the executable
srun EXE ARGS

Interactive jobs

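A minimal generic SLURM sketch for obtaining an interactive shell on a compute node (the partition, time limit, and accounting project shown are placeholder values, not ARIS defaults):

# Allocate 1 task on 1 node for 30 minutes and open an interactive shell on it
srun --partition=compute --nodes=1 --ntasks=1 --time=00:30:00 --account=testproj --pty /bin/bash

# Alternatively, create the allocation first and launch job steps into it with srun
salloc --nodes=1 --ntasks=1 --time=00:30:00 --account=testproj

Exiting the shell (or the salloc session) releases the allocation.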

Job Monitoring

Show the job queue

To determine what jobs exist on the system, use

$ squeue --all
JOBID PARTITION  NAME  USER ST  TIME NODES NODELIST(REASON)
  • JOBID: job id
  • PARTITION: partition (use sinfo to list all available partitions)
  • NAME: job name
  • USER: username
  • ST: STate column:
    • R: Running
    • PD: PenDing
    • TO: TimedOut
    • S: Suspended
    • CD: Completed
    • CA: CAncelled
    • F: Failed
    • NF: Node Failure

To list jobs only for your user, use

squeue -u username

Check a job's estimated start time

squeue --start
squeue -o "%.8i %.9P %.10j %.10u %.8T %.5C %.4D %.6m %.10l %.10M %.10L %.16R"

Please check the squeue man page for more information.

man squeue

Job information

To view detailed job information, use

$ scontrol show job 11841

JobId=11841 JobName=rungmx.sh
   UserId=ntell(1000) GroupId=ntell(1000)
   Priority=4294900666 Nice=0 Account=(null) QOS=normal
   JobState=RUNNING Reason=None Dependency=(null)
   Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
   RunTime=14:01:51 TimeLimit=2-00:00:00 TimeMin=N/A
   SubmitTime=2015-07-28T00:51:21 EligibleTime=2015-07-28T00:51:21
   StartTime=2015-07-28T00:51:22 EndTime=2015-07-30T00:51:22
   PreemptTime=None SuspendTime=None SecsPreSuspend=0
   Partition=all AllocNode:Sid=login01:5379
   ReqNodeList=(null) ExcNodeList=(null)
   NodeList=node[001-072]
   BatchHost=node001
   NumNodes=72 NumCPUs=1440 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
   Socks/Node=* NtasksPerN:B:S:C=20:0:*:* CoreSpec=*
   MinCPUsNode=20 MinMemoryNode=0 MinTmpDiskNode=0
   Features=(null) Gres=(null) Reservation=(null)
   Shared=0 Contiguous=0 Licenses=(null) Network=(null)
   Command=/users/staff/ntell/Runs/Oxides/1273/rungmx.sh
   WorkDir=/users/staff/ntell/Runs/Oxides/1273
   StdErr=/users/staff/ntell/Runs/Oxides/1273/slurm-11841.out
   StdIn=/dev/null
   StdOut=/users/staff/ntell/Runs/Oxides/1273/slurm-11841.out

Pending Jobs

Common reasons why a job may be pending:

REASON DESCRIPTION
Dependency This job is waiting for a dependent job to complete.
NodeDown A node required by the job is down.
PartitionDown The partition (queue) required by this job is in a DOWN state and temporarily accepting no jobs, for instance because of maintenance. Note that this message may be displayed for a time even after the system is back up.
Priority One or more higher priority jobs exist for this partition or advanced reservation; other jobs in the queue have higher priority than yours.
ReqNodeNotAvail No nodes can be found satisfying your limits, for instance because maintenance is scheduled and the job cannot finish before it starts.
Reservation The job is waiting for its advanced reservation to become available.
Resources The job is waiting for resources (nodes) to become available and will run when SLURM finds enough free nodes.
SystemFailure Failure of the SLURM system, a file system, the network, etc.
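
The reason reported for your own pending jobs can be listed with squeue's %r format field. A sketch (the format string is illustrative):

squeue -u username -t PD -o "%.10i %.9P %.10j %.8T %.20r"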

Job Accounting

To get accounting information for completed jobs (of the current day, by default), use the sacct command:

$ sacct

       JobID    JobName  Partition    Account  AllocCPUS      State ExitCode
------------ ---------- ---------- ---------- ---------- ---------- --------
17692              Job0    compute   testproj         64 CANCELLED+      0:0
17693              Job1    compute   testproj          1  COMPLETED      0:0

You can tailor the output with the --format option, which specifies the fields to be shown.

sacct --format=jobid,jobname,partition,user%12,account%12,alloccpus,nnodes,elapsed,cputime,state,exitcode
-e, --helpformat
Print a list of fields that can be specified with the --format option.

-S, --starttime
Select jobs that were eligible after the specified time. The default is midnight of the current day. If states are given with the -s option, jobs in those states at the specified time are returned; in that case the default time is 'now'.
Valid time formats are...

HH:MM[:SS] [AM|PM]
MMDD[YY] or MM/DD[/YY] or MM.DD[.YY]
MM/DD[/YY]-HH:MM[:SS]
YYYY-MM-DD[THH:MM[:SS]]
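
As a sketch of combining a start time with a custom format (the dates are placeholders; -E/--endtime bounds the window at the other end):

sacct -S 2015-07-01 -E 2015-07-31 --format=jobid,jobname,partition,account,alloccpus,elapsed,state,exitcode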

For accounting information on running jobs, use the sstat command.
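
A sketch of querying a running job with sstat (job_id is a placeholder; for batch jobs the batch step, e.g. job_id.batch, is usually the step to query):

sstat -j job_id.batch --format=JobID,AveCPU,AveRSS,MaxRSS,MaxVMSize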

Job Cancelling

To cancel a job, use scancel:

scancel job_id

Examples

Send SIGTERM to steps 1 and 3 of job 1234:

scancel --signal=TERM 1234.1 1234.3

Cancel job 1234 along with all of its steps:

scancel 1234

Send SIGKILL to all steps of job 1235, but do not cancel the job itself:

scancel --signal=KILL 1235

Cancel all your jobs

scancel -u username

Cancel all your pending jobs

scancel -t PD

Man page

man scancel

Job Dependency

Use the sbatch --dependency=<dependency_list> option if you need your jobs to run in a specific order.

DEPENDENCY DESCRIPTION
after This job can begin execution after the specified jobs have begun execution.
afterany This job can begin execution after the specified jobs have terminated.
afternotok This job can begin execution after the specified jobs have terminated in some failed state (non-zero exit code, node failure, timed out, etc).
afterok This job can begin execution after the specified jobs have successfully executed (ran to completion with an exit code of zero).
expand Resources allocated to this job should be used to expand the specified job. The job to expand must share the same QOS (Quality of Service) and partition. Gang scheduling of resources in the partition is also not supported.
singleton This job can begin execution after any previously launched jobs sharing the same job name and user have terminated.

Example: “diamond workflow”

Four jobs: A, B, C, D. Job A runs first. Jobs B and C depend on the completion of job A. Job D depends on jobs B and C.

Image of diamond workflow

Submit job A first:

$ sbatch A
Submitted batch job 17703

Jobs B and C depend on successful completion of job A.

$ sbatch -d afterok:17703 B
Submitted batch job 17704

$ sbatch -d afterok:17703 C
Submitted batch job 17705

Job D depends on successful completion of both jobs B and C

$ sbatch -d afterok:17704:17705 D
Submitted batch job 17706

Check your running queue

$ squeue -u <username>

JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
17706   compute        D nikolout PD       0:00      1 (Dependency)
17704   compute        B nikolout  R       1:27      1 node260
17705   compute        C nikolout  R       1:27      1 node260

Job A has already completed, so jobs B and C are running. Job D is in the pending state, waiting for B and C to finish.

Chain dependencies

Automating dependency steps:

jobA=`sbatch A | awk '{print $4}'`
jobB=`sbatch -d afterok:$jobA B | awk '{print $4}'`
jobC=`sbatch -d afterok:$jobA C | awk '{print $4}'`
jobD=`sbatch -d afterok:$jobB:$jobC D | awk '{print $4}'`

Or, you can submit your dependent jobs from within the job that they depend on.

#!/bin/bash
#SBATCH -n 1
#SBATCH -t 00:01:00
#SBATCH -J A

sbatch --dependency=afterok:$SLURM_JOB_ID B
srun EXE ARGS

#!/bin/bash
#SBATCH -n 1
#SBATCH -t 00:01:00
#SBATCH -J B
srun EXE ARGS

NOTE: Job B will only be considered for scheduling when its dependency has completed.

Usage report

mybudget: Check allocated core hours

$ mybudget
=======================================================================
Core Hours Allocation Information for account : testproj
=======================================================================
Allocated Core Hours   :        2400000.00
Consumed  Core Hours   :             15.00
Percentage of Consumed :              0.00
=======================================================================

myreport: Check consumed core hours and energy.

$ myreport
--------------------------------------------------------------------------------
Cluster/Account/User Utilization 2015-04-07T00:00:00 - 2015-10-07T23:59:59 (15897600 secs)
Time reported in CPU Hours
--------------------------------------------------------------------------------
  Cluster         Account     Login     Proper Name       Used     Energy
--------- --------------- --------- --------------- ---------- ----------
     aris        testproj username      User Name         15        110

SLURM commands

Man pages exist for all SLURM daemons, commands, and API functions. The command option --help also provides a brief summary of options. Note that the command options are all case sensitive.

  • sacct is used to report job or job step accounting information about active or completed jobs.

  • sbatch is used to submit a job script for later execution. The script will typically contain one or more srun commands to launch parallel tasks.

  • scancel is used to cancel a pending or running job or job step. It can also be used to send an arbitrary signal to all processes associated with a running job or job step.

  • scontrol is the administrative tool used to view and/or modify SLURM state. Note that many scontrol commands can only be executed as user root.

  • sinfo reports the state of partitions and nodes managed by SLURM. It has a wide variety of filtering, sorting, and formatting options.

  • squeue reports the state of jobs or job steps. It has a wide variety of filtering, sorting, and formatting options. By default, it reports the running jobs in priority order and then the pending jobs in priority order.

  • srun is used to submit a job for execution or initiate job steps in real time. srun has a wide variety of options to specify resource requirements, including: minimum and maximum node count, processor count, specific nodes to use or not use, and specific node characteristics (so much memory, disk space, certain required features, etc.). A job can contain multiple job steps executing sequentially or in parallel on independent or shared nodes within the job’s node allocation.
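
As a sketch of the multiple-job-steps idea mentioned above (executable names and task counts are placeholders), a batch script can launch several srun steps inside one allocation and run them in parallel:

#!/bin/bash
#SBATCH --job-name=steps
#SBATCH --ntasks=40
#SBATCH --nodes=2
#SBATCH --time=00:30:00
#SBATCH --partition=compute
#SBATCH -A testproj

# Two job steps sharing the allocation; --exclusive keeps their CPUs separate
srun --exclusive -n 20 ./step_a.out &
srun --exclusive -n 20 ./step_b.out &

# Wait for both steps to finish before the job ends
wait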