What is SCRP?

SCRP is a high-performance computing cluster managed by the Department of Economics of The Chinese University of Hong Kong, designed to serve both research and teaching roles. Although minuscule in size compared to most HPC clusters, it utilizes many of the same technologies, such as fast InfiniBand interconnect, parallel storage and multi-node workload management.

As of 2024-03-05, the total resources available are:

Resource                     Quantity
CPU cores                    AMD Zen 4: 464
                             AMD Zen 3: 336
                             AMD Zen 2: 512
RAM                          12TB
GPU                          NVIDIA A100/A800: 14
                             NVIDIA RTX 3090: 8
                             NVIDIA RTX 3060: 12
Distributed Flash Storage    44TB
                             24GB/s
                             >500K IOPS
Archival Storage             91TB

Design

SCRP’s design is similar to that of most HPC clusters and consists of four types of servers:

  • Login nodes handle user login and light computation.
  • Compute nodes handle heavy computation.
  • Storage nodes handle file storage.
  • Management node handles background tasks.

[Figure: SCRP structure]

As a user, you will spend most of your time interacting with the login nodes: tasks such as file transfer and coding should all be done on them. Because all nodes share a common file system, files you upload to a login node are available everywhere. See Account and Access for details.
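For example, files can be copied to the cluster with standard tools such as scp or rsync. A minimal sketch follows; the hostname scrp-login.econ.cuhk.edu.hk and the username s1234567 are illustrative placeholders, so substitute the actual login address and your own account.

    # Copy one file to your SCRP home directory (hostname and username are placeholders)
    scp results.csv s1234567@scrp-login.econ.cuhk.edu.hk:~/

    # Mirror a project folder, transferring only files that changed
    rsync -av project/ s1234567@scrp-login.econ.cuhk.edu.hk:~/project/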

You can run light computations on the login nodes, but since they are shared by all users, they are not suitable for heavy use. For heavy computation, run your code on a compute node instead. SCRP uses the Slurm Workload Manager to allocate computational resources to users. The guides to available software have more details on how to use Slurm.
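As a minimal sketch of the workflow, the batch script below requests resources and runs a program on a compute node. The resource values and the program name my_script.py are illustrative assumptions, not SCRP defaults; consult the software guides for the cluster's actual partitions and limits.

    #!/bin/bash
    #SBATCH --job-name=demo          # name shown in the job queue
    #SBATCH --cpus-per-task=4        # CPU cores to allocate
    #SBATCH --mem=16G                # memory to allocate
    #SBATCH --time=01:00:00          # wall-clock time limit (HH:MM:SS)

    # Everything below runs on the compute node Slurm assigns
    python my_script.py              # placeholder program

Save this as job.sh, submit it with sbatch job.sh, and monitor it with squeue.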

Hardware Specifications

SCRP consists of 23 nodes in total, listed below; a sketch after the list shows how to query each node’s registered hardware through Slurm.

  • scrp-login-1
    • AMD EPYC 7542 (32 cores, 64 threads)
    • 256GB DDR4-3200 ECC RAM
    • 500GB SATA SSD x 2 + 10TB SATA HDD x 12
    • Mellanox Connect-IB Single 56Gb/s NIC
  • scrp-login-2
    • AMD EPYC 7302 (16 cores, 32 threads)
    • 128GB DDR4-3200 ECC RAM
    • 500GB SATA SSD x 2
    • Mellanox ConnectX-3 Single 56Gb/s NIC
  • scrp-node-[1,2,6]
    • AMD EPYC 7742 (64 cores, 64 threads)
    • 512GB DDR4-3200 ECC RAM
    • NVIDIA GeForce RTX 3060 x 4
    • 500GB SATA SSD + 2TB NVMe SSD
    • Mellanox ConnectX-3 Single 56Gb/s NIC
  • scrp-node-[3,9]
    • AMD EPYC 7742 (64 cores, 64 threads)
    • 512GB DDR4-3200 ECC RAM
    • NVIDIA GeForce RTX 3090 x 4 with NVLink
    • 500GB SATA SSD + 2TB NVMe SSD
    • Network Interface Cards:
      • Mellanox ConnectX-6 Single 200Gb/s NIC (direct connection between node-3 and node-9)
      • Mellanox Connect-IB Dual 56Gb/s NIC
  • scrp-node-4
    • AMD EPYC 7542 (32 cores, 32 threads)
    • 512GB DDR4-3200 ECC RAM
    • 500GB SATA SSD + 2TB NVMe SSD
    • Mellanox Connect-IB Single 56Gb/s NIC
  • scrp-node-5
    • AMD EPYC 9754 (128 cores, 128 threads)
    • 768GB DDR5-4800 ECC RAM
    • 500GB SATA SSD + 2TB NVMe SSD x 2
    • Mellanox Connect-IB Single 56Gb/s NIC
  • scrp-node-7
    • AMD EPYC 9654 x 2 (192 cores, 192 threads)
    • 1.5TB DDR5-4800 ECC RAM
    • 512GB SATA SSD + 2TB NVMe SSD x 2
    • Mellanox Connect-IB Single 56Gb/s NIC
  • scrp-node-8
    • AMD Ryzen 5950X (16 cores, 16 threads)
    • 64GB DDR4-3200 ECC RAM
    • 1TB NVMe SSD
    • Mellanox Connect-IB Single 56Gb/s NIC
  • scrp-node-10
    • AMD EPYC 7763 (64 cores, 64 threads)
    • 1TB DDR4-3200 ECC RAM
    • NVIDIA A100 80GB PCIe x 4 with NVLink
    • 500GB SATA SSD + 2TB NVMe SSD
    • Mellanox Connect-IB Single 56Gb/s NIC
  • scrp-node-11
    • AMD EPYC 7763 x 2 (128 cores, 128 threads)
    • 1TB DDR4-3200 ECC RAM
    • NVIDIA A100 80GB PCIe x 2
    • 500GB SATA SSD + 2TB NVMe SSD
    • Mellanox Connect-IB Single 56Gb/s NIC
  • scrp-node-12
    • AMD Ryzen Threadripper PRO 3995WX (64 cores, 64 threads)
    • 512GB DDR4-3200 ECC RAM
    • 500GB SATA SSD + 2TB NVMe SSD
    • Mellanox Connect-IB Single 56Gb/s NIC
  • scrp-node-13
    • AMD EPYC 7763 x 2 (128 cores, 128 threads)
    • 2TB DDR4-3200 ECC RAM
    • HGX A800: NVIDIA A800 80GB SXM4 x 8
    • Mellanox Connect-IB Single 56Gb/s NIC
  • scrp-node-14
    • AMD EPYC 9654 (96 cores, 96 threads)
    • 768GB DDR5-4800 ECC RAM
    • 500GB SATA SSD + 2TB NVMe SSD
    • Mellanox Connect-IB Single 56Gb/s NIC
  • scrp-node-[15-17]
    • AMD Ryzen 7950X (16 cores, 16 threads)
    • 128GB DDR5-3600 ECC RAM
    • 1TB NVMe SSD
    • Mellanox Connect-IB Single 56Gb/s NIC
  • scrp-control is the cluster’s management node.
    • AMD EPYC 7302 (16 cores, 32 threads)
    • 128GB DDR4-2666 ECC RAM
    • Storage Disks:
      • 4TB Intel P4510 NVMe SSD x 2
      • 8TB Intel P4510 NVMe SSD x 2
    • Network Interface Cards:
      • Mellanox Connect-IB Dual 56Gb/s NIC x 2
      • Mellanox ConnectX-3 Single 56Gb/s NIC x 1
  • scrp-data-[1-2] are the cluster’s new distributed storage nodes.
    • AMD EPYC 7542 (32 cores, 64 threads)
    • 256GB DDR4-3200 ECC RAM
    • Storage Disks:
      • 1.6TB Intel P4610 NVMe SSD x 2
      • 8TB Intel P4510 NVMe SSD x 6
    • Network Interface Cards:
      • Mellanox Connect-IB Dual 56Gb/s NIC x 2
      • Mellanox ConnectX-3 Single 56Gb/s NIC x 1
  • scrp-data is the cluster’s old distributed storage node.
    • AMD EPYC 7302 (16 cores, 32 threads)
    • 256GB DDR4-3200 ECC RAM
    • Storage Disks:
      • 1.6TB Intel P4600 NVMe SSD x 2
      • 8TB Intel P4510 NVMe SSD x 12
    • Network Interface Cards:
      • Mellanox Connect-IB Dual 56Gb/s NIC x 2
      • Mellanox ConnectX-3 Single 56Gb/s NIC x 2
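If you want to verify what Slurm has registered for any of the nodes above, the standard Slurm query commands report it; the exact output fields vary with the Slurm version.

    # Summarize every node, one line each: CPUs, memory, state, and partition
    sinfo -N -l

    # Show full details for a single node, including its GPUs (gres)
    scontrol show node scrp-node-10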

Staff

Director of Economic Computing Services: Dr. Vinci Chow (vincichow@cuhk.edu.hk)

Assistant Computer Officer: Gary Yeung (garyyeung@cuhk.edu.hk)