Slurm Cheatsheet

A compact reference for Slurm commands and useful options, with examples.

Job submission

salloc - Obtain a job allocation for interactive use
sbatch - Submit a batch script for later execution
srun - Obtain a job allocation and run an application

| Option | Description |
| --- | --- |
| -A, --account=<account> | Account to be charged for resources used |
| -a, --array=<index> | Job array specification (sbatch only) |
| -b, --begin=<time> | Initiate job after specified time |
| -C, --constraint=<features> | Required node features |
| --cpu-bind=<type> | Bind tasks to specific CPUs (srun only) |
| -c, --cpus-per-task=<count> | Number of CPUs required per task |
| -d, --dependency=<state:jobid> | Defer job until specified jobs reach specified state |
| -m, --distribution=<method[:method]> | Specify distribution methods for remote processes |
| -e, --error=<filename> | File in which to store job error messages (sbatch and srun only) |
| -x, --exclude=<name> | Specify host names to exclude from job allocation |
| --exclusive | Reserve all CPUs and GPUs on allocated nodes |
| --export=<name=value> | Export specified environment variables (e.g., all, none) |
| --gpus-per-task=<list> | Number of GPUs required per task |
| -J, --job-name=<name> | Job name |
| -l, --label | Prepend task ID to output (srun only) |
| --mail-type=<type> | E-mail notification type (e.g., begin, end, fail, requeue, all) |
| --mail-user=<address> | E-mail address |
| --mem=<size>[units] | Memory required per allocated node (e.g., 16GB) |
| --mem-per-cpu=<size>[units] | Memory required per allocated CPU (e.g., 2GB) |
| -w, --nodelist=<hostnames> | Specify host names to include in job allocation |
| -N, --nodes=<count> | Number of nodes required for the job |
| -n, --ntasks=<count> | Number of tasks to be launched |
| --ntasks-per-node=<count> | Number of tasks to be launched per node |
| -o, --output=<filename> | File in which to store job output (sbatch and srun only) |
| -p, --partition=<names> | Partition in which to run the job |
| --signal=[B:]<num>[@time] | Signal job when approaching time limit |
| -t, --time=<time> | Limit for job run time |

Examples:

# Request an interactive job on the debug partition with 4 CPUs
salloc -p debug -c 4

# Request an interactive job with one V100 GPU
salloc -p gpu --ntasks=1 --gpus-per-task=v100:1

# Submit batch job
sbatch batch.job
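
For reference, a batch.job script might look like the following sketch; the account, partition, resource amounts, and program name are placeholders to adapt, and the module load line assumes an environment modules setup:

#!/bin/bash
#SBATCH --account=<account>
#SBATCH --partition=main
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8
#SBATCH --mem=16GB
#SBATCH --time=01:00:00
#SBATCH --job-name=myjob
#SBATCH --output=myjob.out

module load gcc

./my_program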

Job management

squeue - View information about jobs in scheduling queue

| Option | Description |
| --- | --- |
| -A, --account=<account_list> | Filter by accounts (comma-separated list) |
| -o, --format=<options> | Output format to display |
| -j, --jobs=<job_id_list> | Filter by job IDs (comma-separated list) |
| -l, --long | Show more available information |
| --me | Filter by your own jobs |
| -n, --name=<job_name_list> | Filter by job names (comma-separated list) |
| -p, --partition=<partition_list> | Filter by partitions (comma-separated list) |
| -P, --priority | Sort jobs by priority |
| --start | Show the expected start time and resources to be allocated for pending jobs |
| -t, --states=<state_list> | Filter by states (comma-separated list) |
| -u, --user=<user_list> | Filter by users (comma-separated list) |

Examples:

# View your own job queue
squeue --me

# View your own job queue with estimated start times for pending jobs
squeue --me --start

# View job queue on specified partition in long format
squeue -lp epyc-64
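
The -o/--format option takes printf-style field codes; the fields and widths below are only an illustration:

# View your own jobs with job ID, partition, name, state, elapsed time, node count, and reason/nodelist
squeue --me -o "%.10i %.12P %.20j %.8T %.10M %.5D %R"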

scancel - Signal or cancel jobs, job arrays, or job steps

| Option | Description |
| --- | --- |
| -A, --account=<account> | Restrict to the specified account |
| -n, --name=<job_name> | Restrict to jobs with specified name |
| -w, --nodelist=<hostnames> | Restrict to jobs using the specified host names (comma-separated list) |
| -p, --partition=<partition> | Restrict to the specified partition |
| -t, --state=<state> | Restrict to jobs in the specified state (pending, running, or suspended) |
| -u, --user=<username> | Restrict to the specified user |

Examples:

# Cancel specific job
scancel 111111

# Cancel all your own jobs
scancel -u $USER

# Cancel your own jobs on specified partition
scancel -u $USER -p oneweek

# Cancel your own jobs in specified state
scancel -u $USER -t pending
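
scancel can also send a signal to a job instead of cancelling it; the signal name and job ID below are placeholders:

# Send SIGUSR1 to a running job (e.g., to trigger application checkpointing)
scancel --signal=USR1 111111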

sprio - View job scheduling priorities

| Option | Description |
| --- | --- |
| -o, --format=<options> | Output format to display |
| -j, --jobs=<job_id_list> | Filter by job IDs (comma-separated list) |
| -l, --long | Show more available information |
| -n, --norm | Show the normalized priority factors |
| -p, --partition=<partition_list> | Filter by partitions (comma-separated list) |
| -u, --user=<user_list> | Filter by users (comma-separated list) |

Examples:

# View normalized job priorities for your own jobs
sprio -nu $USER

# View normalized job priorities for specified partition
sprio -nlp gpu

Job accounting

sacct - View job accounting data

| Option | Description |
| --- | --- |
| -A, --account=<account_list> | Filter by accounts (comma-separated list) |
| -X, --allocations | Show job allocations, but not job steps |
| -a, --allusers | Show jobs for all users |
| -E, --endtime=<time> | End of reporting period |
| -o, --format=<options> | Output format to display |
| -j, --jobs=<job_id_list> | Filter by job IDs (comma-separated list) |
| --name=<job_name_list> | Filter by job names (comma-separated list) |
| -N, --nodelist=<hostnames> | Filter by host names (comma-separated list) |
| -r, --partition=<partition_list> | Filter by partitions (comma-separated list) |
| -S, --starttime=<time> | Start of reporting period |
| -s, --state=<state_list> | Filter by states (comma-separated list) |
| -u, --user=<user_list> | Filter by users (comma-separated list) |

Examples:

# View accounting data for specific job with custom format
sacct -j 111111 --format=jobid,jobname,submit,exitcode,elapsed,reqnodes,reqcpus,reqmem

# View compact accounting data for your own jobs for specified time range
sacct -X -S 2022-07-01 -E 2022-07-14
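
The state and time filters can be combined to find unsuccessful jobs; the date below is only an example:

# View compact accounting data for your own failed jobs since the specified start date
sacct -X -S 2022-07-01 -s failed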

sacctmgr - View or modify account information

sacctmgr show associations
sacctmgr show user <username>

| Option | Description |
| --- | --- |
| cluster=<clusters> | Filter by clusters (e.g., condo, discovery) |
| format=<options> | Output format to display |
| user=<user_list> | Filter by users (comma-separated list) |

Examples:

# View your own associations with custom format
sacctmgr show associations user=$USER format=cluster,account,user,qos

sreport - Generate reports from accounting data

sreport cluster accountutilizationbyuser
sreport cluster userutilizationbyaccount
sreport job sizesbyaccount
sreport user topusage

| Option | Description |
| --- | --- |
| -T, --tres=<resource_list> | Resources to report (e.g., cpu, gpu, mem, billing, all) |
| clusters=<clusters> | Filter by clusters (e.g., condo, discovery) |
| end=<time> | End of reporting period |
| format=<options> | Output format to display |
| start=<time> | Start of reporting period |
| accounts=<account_list> | Filter by accounts (comma-separated list) |
| users=<user_list> | Filter by users (comma-separated list) |
| nodes=<hostnames> | Filter by host names (comma-separated list) (job reports only) |
| partitions=<partition_list> | Filter by partitions (comma-separated list) (job reports only) |
| printjobcount | Print number of jobs run instead of time used (job reports only) |

Examples:

# Report account utilization for specified user and time range
sreport cluster accountutilizationbyuser start=2022-07-01 end=2022-07-14 users=$USER

# Report account utilization by user for specified account and time range
sreport cluster userutilizationbyaccount start=2022-07-01 end=2022-07-14 accounts=ttrojan_123

# Report job sizes for specified partition
sreport job sizesbyaccount partitions=epyc-64 printjobcount

# Report top users for specified account and time range
sreport user topusage start=2022-07-01 end=2022-07-14 accounts=ttrojan_123
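
The -T option selects which trackable resources to report; gpu is used below as listed in the table above, but the exact TRES name (e.g., gres/gpu) depends on cluster configuration:

# Report GPU usage for specified user and time range
sreport -T gpu cluster accountutilizationbyuser start=2022-07-01 end=2022-07-14 users=$USER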

Partition and node information

sinfo - View information about nodes and partitions

| Option | Description |
| --- | --- |
| -o, --format=<options> | Output format to display |
| -l, --long | Show more available information |
| -N, --Node | Show information in a node-oriented format |
| -n, --nodes=<hostnames> | Filter by host names (comma-separated list) |
| -p, --partition=<partition_list> | Filter by partitions (comma-separated list) |
| -t, --states=<state_list> | Filter by node states (comma-separated list) |
| -s, --summarize | Show summary information |

Examples:

# View all partitions and nodes by state
sinfo

# Summarize node states by partition
sinfo -s

# View nodes in idle state
sinfo --states=idle

# View nodes for specified partition in long, node-oriented format
sinfo -lNp epyc-64
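
The -o/--format option takes field codes similar to squeue; the fields and widths below are only an illustration:

# View partitions with availability, time limit, node count, and CPU counts (allocated/idle/other/total)
sinfo -o "%20P %10a %15l %10D %20C"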

scontrol - View or modify configuration and state

scontrol show partition <partition>
scontrol show node <hostname>
scontrol show job <job_id>

| Option | Description |
| --- | --- |
| -d, --details | Show more details |
| -o, --oneliner | Show information on one line |

scontrol hold <job_list>
scontrol release <job_list>
scontrol show hostnames

Examples:

# View information for specified partition
scontrol show partition epyc-64

# View information for specified node
scontrol show node b22-01

# View detailed information for a running job
scontrol -d show job 111111

# View hostnames for the current job, one per line (uses SLURM_JOB_NODELIST)
scontrol show hostnames
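
The hold and release commands take one or more job IDs; 111111 below is a placeholder:

# Hold a pending job
scontrol hold 111111

# Release a previously held job
scontrol release 111111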

Output environment variables

| Variable | Description |
| --- | --- |
| SLURM_ARRAY_TASK_COUNT | Number of tasks in job array |
| SLURM_ARRAY_TASK_ID | Job array task ID |
| SLURM_CPUS_PER_TASK | Number of CPUs requested per task |
| SLURM_JOB_ACCOUNT | Account used for job |
| SLURM_JOB_ID | Job ID |
| SLURM_JOB_NAME | Job name |
| SLURM_JOB_NODELIST | List of nodes allocated to job |
| SLURM_JOB_NUM_NODES | Number of nodes allocated to job |
| SLURM_JOB_PARTITION | Partition used for job |
| SLURM_NTASKS | Number of job tasks |
| SLURM_PROCID | MPI rank of current process |
| SLURM_SUBMIT_DIR | Directory from which job was submitted |
| SLURM_TASKS_PER_NODE | Number of job tasks per node |

Examples:

# Specify OpenMP threads
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK

# Specify MPI tasks
srun -n $SLURM_NTASKS ./mpi_program
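
In a job array, SLURM_ARRAY_TASK_ID is commonly used to select per-task input; the program and file naming below are hypothetical:

# Process a different input file in each array task
./my_program input_${SLURM_ARRAY_TASK_ID}.txt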

Custom CARC Slurm commands

myaccount - View own account information
acctusage - View account usage information
nodeinfo - View partition and node states
gpuinfo - View GPU states
cqueue - View jobs in scheduling queue
myqueue - View own jobs in scheduling queue
jobhist - View compact history of own jobs
jobinfo - View detailed information about jobs
