What is SCRP?

SCRP is a high-performance computing (HPC) cluster managed by the Department of Economics of The Chinese University of Hong Kong, designed to serve both research and teaching roles. Although minuscule in size compared to most HPC clusters, it uses many of the same technologies, such as fast InfiniBand interconnects, parallel storage and multi-node workload management.

As of 2024-09-25, the total resources available are:

Resource                    Quantity
CPU cores                   AMD Zen 4: 880
                            AMD Zen 3: 512
                            AMD Zen 2: 576
RAM                         17TB
GPU                         NVIDIA H100 NVL: 7
                            NVIDIA A100/A800: 14
                            NVIDIA RTX 3090: 8
                            NVIDIA RTX 3060: 20
Distributed Flash Storage   175TB
                            24GB/s
                            >500K IOPS
Archival Storage            91TB

Design

SCRP’s design is similar to that of most HPC clusters and consists of four types of servers:

  • Login nodes handle user login and light computation.
  • Compute nodes handle heavy computation.
  • Storage nodes handle file storage.
  • Management node handles background tasks.

[Figure: SCRP structure]

As a user, you will spend most of your time interacting with the login nodes: tasks such as file transfer and coding should all be done on them. Because all nodes share a common file system, your files are available everywhere even though you upload them only to a login node. See Account and Access for details.

You can do light computations on the login nodes, but since they are shared by all users, they are not suitable for heavy use. For heavy computation, run your code on a compute node. SCRP uses the Slurm Workload Manager to allocate computational resources to users. The guides to available software have more details on how to use Slurm.
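To make the idea of requesting resources from Slurm more concrete, here is a minimal sketch in Python that assembles a command line for Slurm's standard `sbatch` tool. The function name `build_sbatch_command`, the script name `run_model.sh`, and all default values are hypothetical and chosen only for illustration. SCRP may expect additional options (for example, a partition or a specific GPU type) or provide its own wrappers, so treat the software guides as the authority.

```python
import shlex


def build_sbatch_command(script, cpus=4, mem_gb=16, gpus=0,
                         time_limit="01:00:00", job_name="example"):
    """Assemble an sbatch command line as a list of arguments.

    All defaults are illustrative assumptions, not SCRP policy.
    """
    cmd = [
        "sbatch",
        f"--job-name={job_name}",
        f"--cpus-per-task={cpus}",   # CPU cores for the job's task
        f"--mem={mem_gb}G",          # memory per node
        f"--time={time_limit}",      # wall-clock limit, HH:MM:SS
    ]
    if gpus > 0:
        # Generic GPU request; a cluster may require a typed request instead.
        cmd.append(f"--gres=gpu:{gpus}")
    cmd.append(script)
    return cmd


if __name__ == "__main__":
    # Hypothetical job: 8 cores, 32GB of RAM and one GPU.
    cmd = build_sbatch_command("run_model.sh", cpus=8, mem_gb=32, gpus=1)
    print(shlex.join(cmd))
```

The sketch only builds and prints the command. To submit the job, you would run the printed command on a login node, or pass the list to `subprocess.run`.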

Hardware Specifications

SCRP consists of 26 nodes in total.

  • scrp-login-1
    • AMD EPYC 7542 (32 cores, 64 threads)
    • 256GB DDR4-3200 ECC RAM
    • 500GB SATA SSD x 2 + 10TB SATA HDD x 12
    • Mellanox ConnectX-6 200Gb/s NIC
  • scrp-login-2
    • AMD EPYC 7302 (16 cores, 32 threads)
    • 128GB DDR4-3200 ECC RAM
    • 500GB SATA SSD x 2
    • Mellanox ConnectX-6 200Gb/s NIC
  • scrp-node-[1,2,3]
    • AMD EPYC 7742 (64 cores, 64 threads)
    • 512GB DDR4-3200 ECC RAM
    • NVIDIA GeForce RTX 3060 x 4
    • 500GB SATA SSD + 2TB NVMe SSD
    • Mellanox ConnectX-3 Single 56Gb/s NIC
  • scrp-node-[8,9]
    • AMD EPYC 7742 (64 cores, 64 threads)
    • 512GB DDR4-3200 ECC RAM
    • NVIDIA GeForce RTX 3090 x 4 with NVLink
    • 500GB SATA SSD + 2TB NVMe SSD
    • Mellanox ConnectX-6 200Gb/s NIC
  • scrp-node-4
    • AMD EPYC 7542 (32 cores, 32 threads)
    • 512GB DDR4-3200 ECC RAM
    • 500GB SATA SSD + 2TB NVMe SSD
    • Mellanox ConnectX-6 200Gb/s NIC
  • scrp-node-5
    • AMD EPYC 9754 x 2 (256 cores, 256 threads)
    • 1.5TB DDR5-4800 ECC RAM
    • 500GB SATA SSD + 2TB NVMe SSD x 2
    • Mellanox ConnectX-6 200Gb/s NIC
  • scrp-node-[6,7]
    • AMD EPYC 9654 x 2 (192 cores, 192 threads)
    • 1.5TB DDR5-4800 ECC RAM
    • 512GB SATA SSD + 2TB NVMe SSD x 2
    • Mellanox ConnectX-6 200Gb/s NIC
  • scrp-node-8
    • AMD EPYC 7742 (64 cores, 64 threads)
    • 512GB DDR4-3200 ECC RAM
    • 500GB SATA SSD + 2TB NVMe SSD
    • Mellanox ConnectX-6 200Gb/s NIC
  • scrp-node-10
    • AMD EPYC 7763 (64 cores, 64 threads)
    • 1TB DDR4-3200 ECC RAM
    • NVIDIA A100 80GB PCIe x 4 with NVLink
    • 500GB SATA SSD + 2TB NVMe SSD
    • Mellanox ConnectX-6 200Gb/s NIC
  • scrp-node-11
    • AMD EPYC 7763 x 2 (128 cores, 128 threads)
    • 1TB DDR4-3200 ECC RAM
    • NVIDIA A100 80GB PCIe x 2
    • 500GB SATA SSD + 2TB NVMe SSD
    • Mellanox ConnectX-6 200Gb/s NIC
  • scrp-node-12
    • AMD Ryzen Threadripper PRO 3995WX (64 cores, 64 threads)
    • 512GB DDR4-3200 ECC RAM
    • 500GB SATA SSD + 2TB NVMe SSD
    • Mellanox ConnectX-6 200Gb/s NIC
  • scrp-node-13
    • AMD EPYC 7763 x 2 (128 cores, 128 threads)
    • 2TB DDR4-3200 ECC RAM
    • HGX A800: NVIDIA A100 80GB SXM4 x 8
    • Mellanox ConnectX-6 200Gb/s NIC
  • scrp-node-[14,18]
    • AMD EPYC 9654 (96 cores, 96 threads)
    • 768GB DDR5-4800 ECC RAM
    • 500GB SATA SSD + 2TB NVMe SSD
    • Mellanox ConnectX-6 200Gb/s NIC
  • scrp-node-[15-17]
    • AMD Ryzen 9 7950X (16 cores, 16 threads)
    • 128GB DDR5-3600 ECC RAM
    • 1TB NVMe SSD
    • Mellanox ConnectX-6 200Gb/s NIC
  • scrp-node-19
    • AMD EPYC 7763 (64 cores, 64 threads)
    • 512GB DDR4-3200 ECC RAM
    • NVIDIA RTX 3060 x 8
    • 500GB SATA SSD + 2TB NVMe SSD
    • Mellanox ConnectX-6 200Gb/s NIC
  • scrp-node-20
    • AMD EPYC 7742 (64 cores, 64 threads)
    • 512GB DDR4-3200 ECC RAM
    • 500GB SATA SSD + 2TB NVMe SSD
    • Mellanox ConnectX-6 200Gb/s NIC
  • scrp-node-21
    • AMD EPYC 7763 x 2 (128 cores, 128 threads)
    • 1TB DDR4-3200 ECC RAM
    • NVIDIA H100 NVL 94GB PCIe x 7 with NVLink
    • 2TB NVMe SSD x 4
    • Mellanox ConnectX-6 200Gb/s NIC
  • scrp-control is the cluster’s management node.
    • AMD EPYC 7302 (16 cores, 32 threads)
    • 128GB DDR4-2666 ECC RAM
    • Storage Disks:
      • 4TB Intel P4510 NVMe SSD x 2
      • 8TB Intel P4510 NVMe SSD x 2
    • Mellanox ConnectX-6 Single 200Gb/s NIC
  • scrp-data-[1-2] are the cluster’s distributed storage nodes.
    • AMD EPYC 7542 (32 cores, 64 threads)
    • 256GB DDR4-3200 ECC RAM
    • Storage Disks:
      • 1.6TB Intel P4610 NVMe SSD x 2
      • 8TB Intel P4510 NVMe SSD x 6
    • Mellanox ConnectX-6 200Gb/s NIC x 2
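
As a cross-check, the GPU totals in the resource summary can be reproduced from the per-node entries above. The short Python sketch below copies the GPU-equipped node groups from this list and sums them by model. The grouping of the A100 PCIe and HGX A800 entries under a single "A100/A800" label follows the summary table. The names `GPU_NODES` and `gpu_totals` are illustrative.

```python
from collections import Counter

# (node group, number of nodes, GPU model, GPUs per node),
# transcribed from the hardware list above.
GPU_NODES = [
    ("scrp-node-[1,2,3]", 3, "RTX 3060", 4),
    ("scrp-node-[8,9]", 2, "RTX 3090", 4),
    ("scrp-node-10", 1, "A100/A800", 4),
    ("scrp-node-11", 1, "A100/A800", 2),
    ("scrp-node-13", 1, "A100/A800", 8),
    ("scrp-node-19", 1, "RTX 3060", 8),
    ("scrp-node-21", 1, "H100 NVL", 7),
]


def gpu_totals(nodes):
    """Sum GPUs per model across all node groups."""
    totals = Counter()
    for _name, n_nodes, model, per_node in nodes:
        totals[model] += n_nodes * per_node
    return dict(totals)


if __name__ == "__main__":
    for model, count in sorted(gpu_totals(GPU_NODES).items()):
        print(f"{model}: {count}")
```

The result is 7 H100 NVL, 14 A100/A800, 8 RTX 3090 and 20 RTX 3060, which matches the summary.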

Staff

Director of Economic Computing Services

Dr. Vinci Chow (vincichow@cuhk.edu.hk)

Assistant Computer Officers

Gary Yeung (garyyeung@cuhk.edu.hk)
Alex Fong (alex-fong@cuhk.edu.hk)