Job arrays¶
Job arrays are a SLURM feature for managing many similar jobs together. They make it easy to specify, submit and monitor a group of jobs that all use the same sbatch options. Some of these options can be changed after the job has begun execution with the scontrol command, specifying either the job ID of the whole array or the ID of an individual array task (e.g. 406846_4).
- Submit/run a series of independent jobs via a single SLURM script
- Each job in the array gets a unique identifier (SLURM_ARRAY_TASK_ID), which can be used to assign a different workload to each job
- Example (job_array.sh), 10 jobs, SLURM_ARRAY_TASK_ID=1,2,3…10
#!/bin/sh
#SBATCH -J array
#SBATCH -N 1
#SBATCH --array=1-10
echo "Hi, this is array job number" $SLURM_ARRAY_TASK_ID
sleep $SLURM_ARRAY_TASK_ID
independent jobs: 1, 2, 3 … 10
squeue -u $user
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
406846_[7-10] mem_0096 array sh PD 0:00 1 (Resources)
406846_4 mem_0096 array sh R INVALID 1 n403-062
406846_5 mem_0096 array sh R INVALID 1 n403-072
406846_6 mem_0096 array sh R INVALID 1 n404-031
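A common way to organize workloads by task ID is to let each array job pick one line from a list of inputs. The following is only a minimal sketch: the file name inputs.txt and the helper nth_line are assumptions made for illustration and are not part of SLURM.

```shell
#!/bin/bash
#SBATCH -J array
#SBATCH -N 1
#SBATCH --array=1-10

# Hypothetical helper: print the N-th (1-based) line of a file.
nth_line() {
    sed -n "${1}p" "$2"
}

# Fall back to task 1 when the script is run outside SLURM (local test).
task_id="${SLURM_ARRAY_TASK_ID:-1}"

# inputs.txt is an assumed file with one input name per line.
if [ -f inputs.txt ]; then
    input=$(nth_line "$task_id" inputs.txt)
    echo "Task $task_id processes $input"
else
    echo "Task $task_id: inputs.txt not found, nothing to do"
fi
```

With ten lines in inputs.txt, array job 4 would handle the fourth line, array job 5 the fifth, and so on.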
Once submitted, the job array is assigned an id that can be used to manipulate all jobs at once. For example:
scancel 406846
The following selects every fifth index from 1 to 20 (tasks 1, 6, 11 and 16) and limits the number of simultaneously running jobs to 2 (e.g. for licences):
#SBATCH --array=1-20:5%2
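To check which task IDs a range such as the one above selects, the indices can be expanded by hand. The helper below is a hypothetical illustration, not a SLURM tool; it handles only the "first-last:step%limit" form and ignores comma-separated lists.

```shell
#!/bin/bash
# Print the task IDs selected by a "first-last[:step][%limit]" range.
# The "%limit" part is dropped: it only caps how many tasks run at once.
expand_array_range() {
    local spec range step first last i out
    spec="${1%%[%]*}"        # strip "%limit" if present
    range="${spec%%:*}"      # "first-last"
    step=1
    case "$spec" in
        *:*) step="${spec##*:}" ;;
    esac
    first="${range%%-*}"
    last="${range##*-}"
    out=""
    i=$first
    while [ "$i" -le "$last" ]; do
        out="${out:+$out }$i"
        i=$((i + step))
    done
    echo "$out"
}

expand_array_range "1-20:5%2"   # prints: 1 6 11 16
```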
Warning¶
If the jobs in the array are short (less than ~10 min each), note that submitting a large number of short jobs is bad for performance, since there is an overhead associated with starting and stopping each job.
Avoid submitting thousands of short jobs!
Package short jobs into larger jobs: for example, instead of running thirty 1 min jobs, run one 30 min job.
Don't:
#!/bin/bash -l
#SBATCH -N 1
#SBATCH -t 00:01:00
srun -n 1 myprog "$1"
Do:
#!/bin/bash -l
#SBATCH -N 1
#SBATCH -t 00:30:00
for arg in "$@"; do
    srun -n 1 myprog "$arg"
done
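Packaging and job arrays can be combined: each array job then works through a chunk of short runs instead of a single one. The sketch below assumes 300 inputs of about 1 min each, numbered 1 to 300 and split into 10 array jobs of 30 runs; these numbers and the helper chunk_bounds are illustrative assumptions.

```shell
#!/bin/bash -l
#SBATCH -N 1
#SBATCH -t 00:30:00
#SBATCH --array=1-10

# Print "first last" input numbers for one array task.
# Arguments: task ID (1-based), inputs per task, total number of inputs.
chunk_bounds() {
    local task_id per_task total first last
    task_id=$1; per_task=$2; total=$3
    first=$(( (task_id - 1) * per_task + 1 ))
    last=$(( task_id * per_task ))
    if [ "$last" -gt "$total" ]; then last=$total; fi
    echo "$first $last"
}

# Assumed workload: 300 inputs, 30 per array task.
bounds=$(chunk_bounds "${SLURM_ARRAY_TASK_ID:-1}" 30 300)
first=${bounds% *}
last=${bounds#* }

i=$first
while [ "$i" -le "$last" ]; do
    # myprog is the placeholder program from the example above.
    echo srun -n 1 myprog "$i"   # remove "echo" to actually launch the step
    i=$((i + 1))
done
```

This way the scheduler starts 10 jobs of roughly 30 minutes rather than 300 jobs of 1 minute.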
Environment variables¶
Job arrays will have additional environment variables set:
SLURM_ARRAY_JOB_ID will be set to the first job ID of the array.
SLURM_ARRAY_TASK_ID will be set to the job array index value.
SLURM_ARRAY_TASK_COUNT will be set to the number of tasks in the job array.
SLURM_ARRAY_TASK_MAX will be set to the highest job array index value.
SLURM_ARRAY_TASK_MIN will be set to the lowest job array index value.
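As an illustration of how these variables might be used inside a job script, the sketch below reports the position of a task within the array and builds a per-task log file name. The naming scheme and the helper log_name are assumptions, and the default values only exist so that the script can be tried outside SLURM.

```shell
#!/bin/bash
#SBATCH --array=1-10

# Defaults are assumptions for running the script without SLURM.
job_id="${SLURM_ARRAY_JOB_ID:-0}"
task_id="${SLURM_ARRAY_TASK_ID:-1}"
task_count="${SLURM_ARRAY_TASK_COUNT:-1}"

# Hypothetical naming scheme: one log file per array task.
# Arguments: array job ID, array task ID.
log_name() {
    echo "array_${1}_${2}.log"
}

echo "Task $task_id of $task_count writes to $(log_name "$job_id" "$task_id")"
```

For the standard output file itself, sbatch offers the filename patterns %A (array job ID) and %a (array task ID), e.g. `#SBATCH -o array_%A_%a.out`.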