Job Submission¶
To create a resource allocation and launch tasks, you can submit a batch script.
A batch script submitted to the scheduling system must specify the job requirements (a minimal header sketch follows this list):
- resource queue (partition); the default is compute
- number of nodes required
- number of cores per node required
- maximum wall time for the job (jobs exceeding their wall time are killed)
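For reference, a minimal batch script header covering these requirements could look like the following sketch (the queue name compute is the documented default; the node, core, and time values are placeholders):
#!/bin/bash -l
#SBATCH --partition=compute      # Resource queue (default: compute)
#SBATCH --nodes=2                # Number of nodes required
#SBATCH --ntasks-per-node=20     # Cores (tasks) per node required
#SBATCH --time=01:00:00          # Maximum wall time (hh:mm:ss)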
To submit a job, use the sbatch command:
sbatch my_script
For more information, consult the sbatch man page:
man sbatch
Defining a batch script¶
Batch scripts contain:
- scheduler directives: lines beginning with #SBATCH
- shell commands: UNIX shell (bash) commands
- job steps: created with the srun command
#!/bin/bash -l
#SBATCH --job-name=my_script # Job name
#SBATCH --ntasks=2 # Number of tasks
#SBATCH --time=01:30:00 # Run time (hh:mm:ss) - 1.5 hours
module load gnu intelmpi         # Load any needed modules
echo "Start at `date`"
cd $HOME/workdir
./a.out
echo "End at `date`"
To submit this batch script:
sbatch my_script
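On a typical SLURM installation, sbatch acknowledges the submission by printing the assigned job id, for example (the id shown is a placeholder):
Submitted batch job 123456
This id is the value that %j expands to in the --output and --error file names, and that $SLURM_JOBID holds inside the running script.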
Job Specifications¶
Option | Argument | Specification |
---|---|---|
--job-name, -J | job_name | Job name is job_name |
--partition, -p | queue_name | Submits to queue queue_name |
--account, -A | project_name | Project to charge compute hours |
--ntasks, -n | number_of_tasks | Total number of tasks |
--nodes, -N | number_of_nodes | Number of nodes |
--ntasks-per-node | ntasks_per_node | Tasks per node |
--cpus-per-task, -c | cpus_per_task | Threads per task |
--time, -t | HH:MM:SS | Time limit (hh:mm:ss) |
--mem | memory_mb | Total memory requirements (MB) |
--output, -o | stdout_filename | Direct job standard output to stdout_filename (%j expands to jobID) |
--error, -e | stderr_filename | Direct job standard error to stderr_filename (%j expands to jobID) |
--dependency, -d | afterok:jobid | Job dependency |
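These options can also be passed directly on the sbatch command line, where they take precedence over the corresponding #SBATCH directives in the script. A short sketch (123456 is a placeholder job id):
# Override queue, task count and time limit at submission time
sbatch --partition=compute --ntasks=4 --time=00:30:00 my_script
# Start a job only after job 123456 has completed successfully
sbatch --dependency=afterok:123456 my_script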
SLURM Environment Variables¶
SLURM provides environment variables for most of the values used in the #SBATCH directives.
Environment Variable | Description |
---|---|
$SLURM_JOBID | Job id |
$SLURM_JOB_NAME | Job name |
$SLURM_SUBMIT_DIR | Submit directory |
$SLURM_SUBMIT_HOST | Submit host |
$SLURM_JOB_NODELIST | Node list |
$SLURM_JOB_NUM_NODES | Number of nodes |
$SLURM_CPUS_ON_NODE | Number of cores/node |
$SLURM_CPUS_PER_TASK | Threads per task |
$SLURM_NTASKS_PER_NODE | Number of tasks per node |
#!/bin/bash -l
#SBATCH --job-name=slurm_env
#SBATCH --nodes=2 # 2 nodes
#SBATCH --ntasks-per-node=12 # Number of tasks to be invoked on each node
#SBATCH --mem-per-cpu=1024 # Minimum memory required per CPU (in megabytes)
#SBATCH --time=00:01:00 # Run time in hh:mm:ss
#SBATCH --error=job.%J.out
#SBATCH --output=job.%J.out
echo "Start at `date`"
echo "Running on hosts: $SLURM_NODELIST"
echo "Running on $SLURM_NNODES nodes."
echo "Running $SLURM_NTASKS_PER_NODE tasks per node"
echo "Job id is $SLURM_JOBID"
echo "End at `date`"
Job Scripts¶
Here are some sample job submission scripts for different runtime models.
- Serial job: Run serial programs or scripts on a single core.
- MPI job: Run multi-process programs with MPI.
- Hybrid job: Parallel programs with MPI and OpenMP threads.
- GPU job: Utilize GPU accelerators.
- PHI job: Utilize PHI accelerators (offload mode only).
- Multiple Serial job: Run multiple serial programs simultaneously in a single batch script.
Serial batch script¶
#!/bin/bash -l
#-----------------------------------------------------------------
# Serial job , requesting 1 core , 2800 MB of memory per job
#-----------------------------------------------------------------
#SBATCH --job-name=serialjob # Job name
#SBATCH --output=serialjob.%j.out # Stdout (%j expands to jobId)
#SBATCH --error=serialjob.%j.err # Stderr (%j expands to jobId)
#SBATCH --ntasks=1 # Total number of tasks
#SBATCH --nodes=1 # Total number of nodes requested
#SBATCH --ntasks-per-node=1 # Tasks per node
#SBATCH --cpus-per-task=1 # Threads per task
#SBATCH --mem=2800 # Memory per job in MB
#SBATCH -t 01:30:00 # Run time (hh:mm:ss) - (max 48h)
#SBATCH --partition=taskp # Submit queue
#SBATCH -A testproj # Accounting project
# Load any necessary modules
module load gnu
module load intel
# Launch the executable a.out
./a.out ARGS
Pure MPI batch script¶
Launch MPI jobs with the srun command.
DO NOT use mpirun or mpiexec.
#!/bin/bash -l
#-----------------------------------------------------------------
# Pure MPI job, using 80 MPI tasks on 4 nodes,
# with 20 tasks per node and 1 thread per MPI task
#-----------------------------------------------------------------
#SBATCH --job-name=mpijob # Job name
#SBATCH --output=mpijob.%j.out # Stdout (%j expands to jobId)
#SBATCH --error=mpijob.%j.err # Stderr (%j expands to jobId)
#SBATCH --ntasks=80 # Total number of tasks
#SBATCH --nodes=4 # Total number of nodes requested
#SBATCH --ntasks-per-node=20 # Tasks per node
#SBATCH --cpus-per-task=1 # Threads per task(=1) for pure MPI
#SBATCH --mem=56000 # Memory per job in MB
#SBATCH -t 01:30:00 # Run time (hh:mm:ss) - (max 48h)
#SBATCH --partition=compute # Submit queue
#SBATCH -A testproj # Accounting project
# Load any necessary modules
module load gnu
module load intel
module load intelmpi
# Launch the executable
srun EXE ARGS
Hybrid MPI/OpenMP batch script¶
Launch MPI jobs with the srun command.
DO NOT use mpirun or mpiexec.
#!/bin/bash -l
#-----------------------------------------------------------------
# Hybrid MPI/OpenMP job, using 80 cores on 4 nodes,
# with 2 MPI tasks per node and 10 threads per MPI task.
#-----------------------------------------------------------------
#SBATCH --job-name=hybridjob # Job name
#SBATCH --output=hybridjob.%j.out # Stdout (%j expands to jobId)
#SBATCH --error=hybridjob.%j.err # Stderr (%j expands to jobId)
#SBATCH --ntasks=8 # Total number of tasks
#SBATCH --nodes=4 # Total number of nodes requested
#SBATCH --ntasks-per-node=2 # Tasks per node
#SBATCH --cpus-per-task=10 # Threads per task
#SBATCH --mem=56000 # Memory per job in MB
#SBATCH -t 01:30:00 # Run time (hh:mm:ss) - (max 48h)
#SBATCH --partition=compute # Submit queue
#SBATCH -A testproj # Accounting project
# Load any necessary modules
module load gnu
module load intel
module load intelmpi
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
# Launch the executable
srun EXE ARGS
GPU batch script¶
Launch GPU accelerated jobs.
#!/bin/bash -l
#-----------------------------------------------------------------
# GPU job, using 80 cores on 4 nodes,
# with 2 GPUs per node, 1 MPI task per node and 20 threads per MPI task.
#-----------------------------------------------------------------
#SBATCH --job-name=gpujob # Job name
#SBATCH --output=gpujob.%j.out # Stdout (%j expands to jobId)
#SBATCH --error=gpujob.%j.err # Stderr (%j expands to jobId)
#SBATCH --ntasks=4 # Total number of tasks
#SBATCH --gres=gpu:2 # GPUs per node
#SBATCH --nodes=4 # Total number of nodes requested
#SBATCH --ntasks-per-node=1 # Tasks per node
#SBATCH --cpus-per-task=20 # Threads per task
#SBATCH --mem=56000 # Memory per job in MB
#SBATCH -t 01:30:00 # Run time (hh:mm:ss) - (max 48h)
#SBATCH --partition=gpu # Run on the GPU nodes queue
#SBATCH -A testproj # Accounting project
# Load any necessary modules
module load gnu
module load intel
module load intelmpi
module load cuda
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
# Launch the executable
srun EXE ARGS
PHI batch script¶
Launch PHI accelerated jobs.
#!/bin/bash -l
#-----------------------------------------------------------------
# PHI job, using 80 cores on 4 nodes,
# with 2 PHI accelerators per node, 1 MPI task per node and 20 threads per MPI task.
#-----------------------------------------------------------------
#SBATCH --job-name=phijob # Job name
#SBATCH --output=phijob.%j.out # Stdout (%j expands to jobId)
#SBATCH --error=phijob.%j.err # Stderr (%j expands to jobId)
#SBATCH --ntasks=4 # Total number of tasks
#SBATCH --nodes=4 # Total number of nodes requested
#SBATCH --ntasks-per-node=1 # Tasks per node
#SBATCH --cpus-per-task=20 # Threads per task
#SBATCH --gres=mic:2 # Accelerators per node
#SBATCH --mem=56000 # Memory per job in MB
#SBATCH -t 01:30:00 # Run time (hh:mm:ss) - (max 48h)
#SBATCH --partition=phi # Run on the PHI nodes queue
#SBATCH -A testproj # Accounting project
# Load any necessary modules
module load gnu
module load intel
module load intelmpi
## (HOST) OPENMP NUMBER OF THREADS
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
## (MIC) OPENMP NUMBER OF THREADS
export MIC_ENV_PREFIX=MIC
## (MIC) 60 physical cores x 4 hardware threads = 240 threads
export MIC_OMP_NUM_THREADS=240
# Launch the executable
srun EXE ARGS
Multiple Serial batch script¶
Multiple srun commands executed simultaneously from a single batch script.
Please note the wait at the end of the script; it ensures the batch script does not exit before all background tasks have completed. A loop-based variant is sketched after the script.
#!/bin/bash -l
#-----------------------------------------------------------------
# Multiple Serial job , 5 tasks , requesting 1 node, 2800 MB of memory per task
#-----------------------------------------------------------------
#SBATCH --job-name=multiple-serialjob # Job name
#SBATCH --output=multiple-serialjob.%j.out # Stdout (%j expands to jobId)
#SBATCH --error=multiple-serialjob.%j.err # Stderr (%j expands to jobId)
#SBATCH --nodes=1 # Total number of nodes requested
#SBATCH --ntasks=5 # Total number of tasks
#SBATCH --ntasks-per-node=5 # Tasks per node
#SBATCH --cpus-per-task=1 # Threads per task
#SBATCH --mem-per-cpu=2800 # Memory per task in MB
#SBATCH -t 01:30:00 # Run time (hh:mm:ss) - (max 48h)
#SBATCH --partition=taskp # Submit queue
#SBATCH -A testproj # Accounting project
# Load any necessary modules
module load gnu
module load intel
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
# Launch the executable a.out
srun -n 1 -c 1 ./a.out input0 &
srun -n 1 -c 1 ./a.out input1 &
srun -n 1 -c 1 ./a.out input2 &
srun -n 1 -c 1 ./a.out input3 &
srun -n 1 -c 1 ./a.out input4 &
wait
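When the input files follow a simple numbering scheme, the same pattern can be written more compactly with a shell loop; this is only a sketch, with a.out and the input names taken from the example above:
# Launch one single-core job step per input file, all in the background
for i in 0 1 2 3 4; do
    srun -n 1 -c 1 ./a.out input$i &
done
# Wait for all background job steps to finish before the script exits
wait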