Resource Limit Update
SCRP has seen tremendous growth since it came online in summer 2020—we started with 84 CPU cores, 576GB RAM and 12 consumer-grade GPUs, while as of January 2023 we have close to 600 CPU cores, 4.3TB RAM and a growing number of datacenter GPUs. To keep up with the increased computing power and address user needs, we will be implementing new resource limits over the next month.
Already Implemented
- The previous 32 CPU/user limit on the default scrp partition has been increased to 128. In its place is a limit of 1 node/job and 32 CPU/job. This change allows you to send jobs to the default partition even if you have a large job running on another partition.
- The large partition now includes one RTX 3090 node. This change allows you to request a full RTX 3090 node, with all 64 CPU cores and four RTX 3090 GPUs.
- A maximum of two GPU/user and 32 CPU/user on the a100 partition.
- When there are multiple jobs in a queue, the following priority rules are observed:
- The time a job has been on the queue is the main determinant.
- Faculty, staff and research postgraduate students have a two-day priority over other user types.
- A small priority bonus that scales inversely with usage.
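As a concrete example, a job script requesting a full RTX 3090 node on the large partition might look like the sketch below. The GRES name (`gpu`) and the program being run are placeholders—check the cluster's own documentation for the exact names used on SCRP:

```shell
#!/bin/bash
#SBATCH --partition=large     # partition that now includes one RTX 3090 node
#SBATCH --nodes=1             # request the whole node
#SBATCH --cpus-per-task=64    # all 64 CPU cores on that node
#SBATCH --gres=gpu:4          # all four RTX 3090 GPUs (GRES name is an assumption)

srun python train.py          # placeholder for your own program
```

Requests that fit within the per-user GPU cap (four on large) will queue normally; anything above it will be rejected or held.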
To be Implemented
- Later in the month, we will restrict the default partition to jobs no longer than 24 hours. Once this new restriction comes into effect, you should send your long-running jobs to either the large partition or the a100 partition.
- We will add two A100 GPUs to the default partition to allow more users to access them and facilitate faster turnaround.
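Once the 24-hour cap is in place, a default-partition job script would need an explicit time limit under that cap, as in this sketch (the program name is a placeholder):

```shell
#!/bin/bash
#SBATCH --partition=scrp      # default partition: jobs capped at 24 hours
#SBATCH --time=23:59:00       # must stay under the 24-hour limit
#SBATCH --cpus-per-task=32    # at most 32 CPU/job on this partition

srun python analysis.py       # placeholder for your own program
```

A job that needs more than 24 hours should instead set `--partition=large` (or `--partition=a100` for GPU work subject to the two-GPU/user cap).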
Below are the updated resource limits for each user category:
User Type | Logical CPUs | GPUs | Job Duration |
---|---|---|---|
Faculty and staff | 128 | 8 | 5/30 days^a |
Research postgraduate student | 128 | 4 | 5/30 days^a |
Taught postgraduate student and senior undergraduate student | 16 | 1 | 5 days |
Undergraduate student | 4 | 1 | 1 day |
New partition settings:
Partition | Nodes | Resources per Node | Limits |
---|---|---|---|
scrp* | scrp-node-[1-6,8,9] | CPU: 16-64 cores; RAM: 64-512GB; GPU: 0-4 | Max. 1 node/job; Max. 32 CPU/job |
large | scrp-node-[1-7] | CPU: 16-64 cores; RAM: 64GB-1TB | Max. 4 GPU/user |
a100 | scrp-node-10 | CPU: 64 cores; RAM: 512GB; GPU: A100 x 3 | Max. 32 CPU/job; Max. 2 GPU/user |