Slurm
SCRP uses the Slurm Workload Manager to allocate compute node resources.
Partitions
Compute nodes are grouped into partitions, which you can think of as queues:
| Partition | Nodes | Resources per Node | Limits |
|---|---|---|---|
| scrp* | scrp-node-[1-4,6,8,9,12,14-18] | CPU: 16-64 cores; RAM: 60-497GiB; GPU: 0-4 | 1 node/job; 32 CPU/job |
| large | scrp-node-[1-4,6-8,12,14-18] | CPU: 16-192 cores; RAM: 60GiB-1.5TiB | 4 GPU/user |
| a100 | scrp-node-10,13,21 | CPU: 64-128 cores; RAM: 1-2TiB; GPU: A100 80GB x 4 | GPU jobs only; 32 CPU/job; 512GB RAM/job; 2 GPU/user |
| jrc | scrp-node-11 | CPU: 128 cores; RAM: 747GiB; GPU: A100 80GB x 2 | JRC members only |
| g2007 | scrp-node-5 | CPU: 128 cores; RAM: 747GiB | g2007 members only |
*Default partition.
The scrp partition is accessible to all users, while the large and a100 partitions
are only accessible to faculty members and research postgraduate students.
The jrc partition is only accessible to JRC members.
Resource Limits
To check what resources you can use, type in a terminal:
qos
The QoS (for “Quality of Service”) field tells you the resource limits applicable to you:
| QoS | Logical CPUs | GPUs | Max. Job Duration |
|---|---|---|---|
| c4g1 | 4 | 1 | 1 Day |
| c16g1 | 16 | 1 | 5 Days |
| c32g4 | 128 | 4 | 5 Days |
| c32g8 | 128 | 8 | 5 Days |
| c16-long | 16 | 1 | 30 Days |
Default job duration is 1 day regardless of the QoS you use.
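To run for longer than the one-day default, request the duration explicitly with the -t option of the compute or srun commands described below; the requested duration must fall within your QoS limit. A minimal sketch (the values are illustrative):

# Request 16 CPUs for 5 days (allowed under, e.g., the c16g1 QoS)
compute -t 5-0 -c 16 --mem=32G stata-mp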
Compute Node Status
To check the status of the compute nodes, type:
scrp-info
Use Slurm’s sinfo if you want to see a different set of information.
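For example, sinfo can show the state of a single partition; the partition name below is just illustrative:

# Show node states in the a100 partition only
sinfo -p a100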
Run a Job Immediately
Predefined Shortcuts
SCRP has several sets of predefined shortcuts to provide quick access:
- compute-[1-4,8,16,32,64,128] requests the specified number of logical CPUs from a compute node. Each CPU comes with 2GB of RAM.
- gpu requests one available GPU of any type.
- rtx3090 requests an NVIDIA RTX 3090 GPU.
- a100 requests an NVIDIA A100 GPU.
- hopper requests an NVIDIA Hopper GPU.
For example, to launch Stata with 16 logical CPUs and 32GB RAM, you can simply type:
compute-16 stata-mp
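The GPU shortcuts work the same way, assuming they are followed by the command you want to run just like compute-16 above (the script name here is hypothetical):

rtx3090 python my_script.py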
compute
For more control, you can use the compute command:
compute [options] [command]
Note:
- Pseudo terminal mode is enabled. X11 forwarding is also enabled when a valid display is present.
- Jobs requesting more than 32 cores are routed to the large partition, while jobs requesting A100 GPUs are routed to the a100 partition.
- Automatically scales CPU core count and memory with GPU type unless the user specifies them:
- RTX 3060: four CPU cores and 24GB RAM per GPU
- RTX 3090: eight CPU cores and 48GB RAM per GPU
- A100: eight CPU cores and 160GB RAM per GPU
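Under these defaults, requesting a GPU without specifying -c or --mem should give you the matching CPU and memory allocation. A minimal sketch (the script name is hypothetical):

# One RTX 3090 plus the default eight CPU cores and 48GB RAM that come with it
compute --gpus-per-task=rtx3090 python my_script.py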
Options:
- -c requests a specific number of logical CPUs. Defaults to four CPU cores.
- --mem requests a specific amount of memory.
- --gpus-per-task requests GPUs in one of the following formats: number, model, model:number
- -t sets the maximum running time of the job in one of the following formats:
  - minutes
  - minutes:seconds
  - hours:minutes:seconds
  - days-hours
  - days-hours:minutes:seconds
- -p requests a specific partition. Defaults to 'scrp'.
- -q requests a specific quality of service.
- -w requests specific nodes.
- --account requests resources under the specified account.
- --constraint adds additional restrictions. Valid values: a100 or h100.
- -z prints the generated srun command.
- command is the command to run. Defaults to starting a bash shell.
Examples:
- To request 16 CPUs and 40G of memory:

  compute -c 16 --mem=40G stata-mp

- To request 8 CPUs, 40G of memory and one RTX 3090 GPU:

  compute -c 8 --mem=40G --gpus-per-task=rtx3090 python

- If you need 1TB of memory, you will need to request the large-memory node. Since that node is in the separate large partition, you need to specify the -p option:

  compute -c 16 --ntasks=1 --mem 1000G -p large stata-mp

- To access NVIDIA A100 GPUs:

  compute --gpus-per-task=a100 -c 16 --mem=160G python my_script.py

- For accounting reasons, Hopper GPUs are also named a100 on the cluster. To specifically request Hopper GPUs, use the --constraint option:

  compute --gpus-per-task=a100 --constraint=hopper -c 16 --mem=160G python my_script.py

- Faculty members can run jobs longer than five days by requesting the QoS c16-long. The maximum job duration is 30 days.

  compute -t 30-0 -c 16 --mem=40G stata-mp
srun
Finally, for maximum flexibility, you can use Slurm’s srun command.
All the shortcuts above use srun under the hood.
srun [options] command
Common Options:
- --pty enables pseudo terminal mode. Required for any interactive software.
- --x11 enables X11 forwarding. Required for software with a GUI.
- -c requests a specific number of logical CPUs.
- --ntasks specifies the number of parallel tasks to run. Specifying this option is generally not necessary unless you are running an MPI job or setting --hint=nomultithread. For the latter, setting --ntasks=1 prevents Slurm from erroneously starting multiple identical tasks (see the sketch after this list).
- --mem requests a specific amount of memory.
- --gpus-per-task requests GPUs in one of the following formats: number, model, model:number
- -t sets the maximum running time of the job in one of the following formats:
  - minutes
  - minutes:seconds
  - hours:minutes:seconds
  - days-hours
  - days-hours:minutes:seconds
- -p requests a specific partition.
- -q requests a specific quality of service.
- --account requests resources under the specified account.
- --constraint adds additional restrictions. Valid values: a100 or h100.
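As an illustration of the --hint=nomultithread note above, a minimal sketch that runs a single task on physical cores only (the script name is hypothetical):

# Disable hyperthreading for this job; --ntasks=1 avoids duplicate task launches
srun --pty --ntasks=1 --hint=nomultithread -c 8 --mem=16G python my_script.py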
See Slurm’s srun documentation for additional options.
Examples:
- To request two logical CPUs, 10GB of RAM and one NVIDIA RTX 3090 GPU for 10 hours of interactive use, type:

  srun --pty -t 600 -c 2 --mem 10G --gpus-per-task=rtx3090:1 bash

- If you need 1TB of memory, you will need to request the large-memory node. Since that node is in the separate large partition, you need to specify the -p option:

  srun --pty -c 16 --ntasks=1 --mem 1000G -p large stata-mp

- To access NVIDIA A100 GPUs, you need to specify the a100 partition, plus your desired number of CPU cores and memory:

  srun -p a100 --gpus-per-task=1 -c 16 --mem=160G python my_script.py

- For accounting reasons, H100 GPUs are also named a100 on the cluster. To specifically request H100 GPUs, use the --constraint option:

  srun -p a100 --gpus-per-task=1 --constraint=h100 -c 16 --mem=160G python my_script.py

- Faculty members and research postgraduate students can run jobs longer than five days by requesting the QoS c16-long. The maximum job duration is 30 days.

  srun --pty -t 30-0 -c 16 -q c16-long Rscript my_code.r
If Slurm is unable to allocate the resources you request, your job will be placed in a queue until resources become available or you terminate the command.
Run a Job in Batch Mode
sbatch allows you to submit a job without the need to wait for it to execute.
Run a Single Task
For example, to run Stata in batch mode, we will need the following batch script:
#!/bin/bash
#SBATCH --job-name=my_job
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=2
stata-mp -b do file_path
Then submit your job:
sbatch [file_path_to_batch_script]
Output of your job will be saved in a file named slurm-[jobid].out
in the directory from which you called sbatch.
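If you prefer a different output file, sbatch's --output option can also be set in the script. A minimal sketch; the file name pattern is just an example (%j expands to the job ID):

#SBATCH --output=my_job_%j.out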
Run Multiple Tasks in Parallel within the Same Job
The Suboptimal Way
You can run multiple tasks in parallel by appending the & symbol to the command
for each task, followed by the wait command at the end of the script:
#!/bin/bash
#SBATCH --job-name=my_job
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=2
#SBATCH --mem=4G
stata-mp -b do file_path_1 &
stata-mp -b do file_path_2 &
wait
The batch script above will request 2 logical CPUs and 4GB of RAM, to be shared by all the tasks. Note that each task has full access to all allocated CPU cores and memory. This could result in unexpected slowdowns if multiple tasks try to take advantage of all the available resources. In this example, both Stata instances will have access to the same two logical CPUs and 4GB of memory.
The Optimal Way
To allocate exclusive resources for each task, you need to run each
task with a srun command inside the batch file.
You will need to specify:
- -n, -c and --mem as special #SBATCH comments.
- -n, --mem and --exclusive for each srun command.
Example:
#!/bin/bash
#SBATCH --job-name=my_job
#SBATCH -n 2
#SBATCH -c 4
#SBATCH --mem=8G
srun -n 1 --mem=4G --exclusive stata-mp -b do file_path_1 &
srun -n 1 --mem=4G --exclusive stata-mp -b do file_path_2 &
wait
The batch script above requests 8 logical CPUs and 8GB of RAM. Each task will get 4 logical CPUs and 4GB of RAM for its exclusive use.
Explanation of the #SBATCH comments:
- #SBATCH -n 2 specifies that we want to run 2 tasks in parallel.
- #SBATCH -c 4 means 4 logical CPUs per task.
- #SBATCH --mem=8G means 8GB of RAM across all tasks.
Explanation of the srun options:
- -n 1 means run a single task. If you leave this option out, the task will run as many times as specified in #SBATCH -n.
- --mem=4G means 4GB of RAM for this task. The value you specify in #SBATCH --mem must be greater than or equal to the sum of all the values you specify in srun --mem. If that is not true, or if you leave this option out, tasks will not necessarily run in parallel because there is not enough memory to share.
- --exclusive specifies that each task should get exclusive access to the logical CPUs it has been allocated. If you leave this option out, all tasks will have access to the same set of logical CPUs.
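The same pattern scales to more tasks. A minimal sketch, assuming four hypothetical do-files named file_path_1 through file_path_4:

#!/bin/bash
#SBATCH --job-name=my_job
#SBATCH -n 4
#SBATCH -c 2
#SBATCH --mem=8G

# Launch each task with its own exclusive share of CPUs and memory (2 CPUs, 2GB each)
for i in 1 2 3 4; do
  srun -n 1 --mem=2G --exclusive stata-mp -b do file_path_$i &
done
wait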
Job Status
Use scrp-queue to check the status of your job:
# Your running jobs
scrp-queue
# Completed jobs
scrp-queue -t COMPLETED
If you want to see a different set of information, you can use Slurm’s
squeue instead.
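For example, squeue can be limited to your own jobs; a minimal sketch:

# List only your own pending and running jobs
squeue -u $USER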
Cancel a Job
You can use scancel to cancel a job before it completes. First find the ID
of your running job with squeue, then:
scancel job_id
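scancel can also cancel several jobs at once; for example, to cancel all of your own jobs:

scancel -u $USER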
Further Information
See Slurm’s sbatch documentation for additional options.