The Portable Batch System (PBS) and the Simple Linux Utility for Resource Management (Slurm) are two of the most popular job schedulers used for requesting resource allocations on a multi-user cluster. Marquette University's previous HPC resource Père used the PBS scheduler, while the current HPC resource Raj uses Slurm. If you are porting your work over from Père to Raj, this page will guide you through converting your PBS submission scripts to Slurm submission scripts. If you are completely new to submitting jobs through a job scheduler and need help writing your first submission script, see the section on writing Slurm submission scripts in the Raj User's Guide.
There are two ways to convert your PBS script to Slurm. The first way is to run a PBS to Slurm Python script which will do the conversion for you. The benefit of this is that it is quick and easy. The downside is that the automated conversion needs to make several assumptions about how to request resources. These assumptions may not fit your workflow or allocate resources in the most efficient manner for your specific job. For a more detailed description of how the script converts PBS submission scripts to Slurm submission scripts, read the comments at the top of the script. To access the script, log on to Raj and copy it from the path /cm/shared/Public/scripts/pbs2slurm.py. The second way to convert your PBS script to Slurm is to finish reading this guide and make the necessary changes yourself.
Both PBS and Slurm submission scripts start off with a set of directives which provide the scheduler with information about the job and request resources. In PBS, these directives start with #PBS; in Slurm, they start with #SBATCH. Common directives include job name, queue, processor count, and output file. Translations of PBS directives to Slurm directives can be seen below.
| Option | PBS | Slurm |
| --- | --- | --- |
| Job Name | -N [name] | --job-name=[name] |
| Queue | -q [queue] | -p [queue] |
| Nodes | -l nodes=[nodes] | -N [min[-max]] |
| CPU Count | -l ppn=[count] | -n [count] |
| Memory Requirement | -l mem=[MB] | --mem=[mem][M\|G\|T] OR --mem-per-cpu=[mem][M\|G\|T] |
| Wall Clock Limit | -l walltime=[hh:mm:ss] | -t [min] OR -t [days-hh:mm:ss] |
| Standard Output File | -o [file_name] | -o [file_name] |
| Standard Error File | -e [file_name] | -e [file_name] |
| Join stdout/stderr | -j oe OR -j eo | use -o without -e |
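To illustrate how the one-to-one mappings in the table can be automated, here is a toy shell sketch in the spirit of the conversion script. This is not the actual pbs2slurm.py from Raj; it handles only the simple substitutions and ignores resource lines such as -l nodes, which require the restructuring discussed in the next paragraph.

```shell
#!/bin/sh
# Toy PBS-to-Slurm directive translator (a sketch, not the real pbs2slurm.py).
# Covers only one-to-one directive mappings; resource requests like
# "-l nodes=...:ppn=..." must be restructured by hand.
pbs2slurm_directives() {
  sed -e 's/^#PBS -N /#SBATCH --job-name=/' \
      -e 's/^#PBS -q /#SBATCH -p /' \
      -e 's/^#PBS -l walltime=/#SBATCH -t /' \
      -e 's/^#PBS -o /#SBATCH -o /' \
      -e 's/^#PBS -e /#SBATCH -e /'
}

printf '#PBS -N hello_world\n#PBS -q batch\n' | pbs2slurm_directives
# prints:
# #SBATCH --job-name=hello_world
# #SBATCH -p batch
```

For anything beyond trivial scripts, use the full pbs2slurm.py on Raj or convert by hand as described in this guide.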
Please note that PBS and Slurm request compute resources differently. In PBS, you ask for N nodes and n processors per node, so the total number of cores requested is N*n. In Slurm, you request n cores in total, and those cores are distributed evenly across N nodes. For example, if you had a job requiring 32 cores and wrote a PBS script designed for Père (which has eight cores per node), it would include the directive #PBS -l nodes=4:ppn=8. On Raj, the directives #SBATCH -N 4 and #SBATCH -n 8 would get you only eight cores spread across four nodes. The correct way to specify this job is with the directives #SBATCH -N 1 and #SBATCH -n 32, assuming you want all your processes running on a single node. If you want your processes spread across four nodes, use #SBATCH -N 4 instead. If you do not care how many nodes your processes land on, omit the node count and just use #SBATCH -n 32; Slurm will then choose the number of nodes that allows the job to run as quickly as possible.
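Concretely, the 32-core example above could be requested in any of these three ways (a sketch; use one per script):

```
# All 32 cores on a single node:
#SBATCH -N 1
#SBATCH -n 32

# 32 cores spread evenly across four nodes (eight per node):
#SBATCH -N 4
#SBATCH -n 32

# Let Slurm choose the node count:
#SBATCH -n 32
```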
To reference specific aspects of your job within your submission script, both PBS and Slurm provide environment variables. Translations of common environment variables can be seen below.
| Variable | PBS | Slurm |
| --- | --- | --- |
| Job ID | $PBS_JOBID | $SLURM_JOB_ID |
| Job Name | $PBS_JOBNAME | $SLURM_JOB_NAME |
| Submit Directory | $PBS_O_WORKDIR | $SLURM_SUBMIT_DIR |
| Job Array Index | $PBS_ARRAYID | $SLURM_ARRAY_TASK_ID |
If you wish to include some of these environment variables in naming your output files, you will need to use Slurm's file patterns shown below.
| Variable Name | File Pattern |
| --- | --- |
| Job id | %j |
| Job name | %x |
| Job array id | %a |
| Hostname (this will create a separate I/O file per node) | %N |
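For example, to give each task in a job array its own log file, use the %a pattern in the output directive (the file name here is just an illustration):

```
#SBATCH --output=myjob-%a.log
```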
Here is a quick example of converting a simple PBS submission script which runs an MPICH rendition of "Hello World!" to a Slurm submission script.
PBS script:

```bash
#!/bin/bash
#PBS -N hello_world
#PBS -q batch
#PBS -l nodes=2:ppn=8
#PBS -l walltime=500:00:00
#PBS -j oe
#PBS -o $PBS_JOBNAME-$PBS_JOBID.log

cd $PBS_O_WORKDIR
module load mpich
mpiexec -n 16 hello_world
```
Equivalent Slurm script:

```bash
#!/bin/bash
#SBATCH --job-name="hello_world"
#SBATCH -p batch
#SBATCH -t 20-20:00:00
#SBATCH -N 1
#SBATCH -n 16
#SBATCH --output=%x-%j.log

cd $SLURM_SUBMIT_DIR
module load mpich
mpiexec -n 16 hello_world
```