SCRP has different types of storage tailored to different needs. Placing your files in the correct locations is crucial to getting the best performance out of the cluster.

Basics

We recommend the following rules for placing your files:

File Type                            Location        Quota
Anything you cannot afford to lose   ~/              2/20/50GB (a)
Large datasets                       ~/large-data    20/500/1000GB (b)
Dataset backups                      ~/archive       20/1000/5000GB (b)
Temporary files                      /tmp            0.3-3.7TB (shared)

(a) Quotas for junior undergraduates / senior undergraduates and postgraduates / faculty and staff, respectively.
(b) Quotas for senior undergraduates and taught postgraduates / research postgraduates / faculty and staff, respectively.

Different locations have different properties:

Location        Shared across Nodes   Backup   Performance
~/              Yes                   Daily    Moderate
~/large-data    Yes                   No       High
~/archive       Yes                   No       Moderate
/tmp            No                    No       Very high

Important: do not put any file you cannot afford to lose in locations that have no backup! We advise users to keep anything that is generated and too important to lose in ~/. For files too large to fit into ~/, keep your primary copy in ~/large-data and a backup copy in ~/archive.
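
The two-copy arrangement described above could be kept up to date with a small helper like the one below. This is a minimal sketch, assuming the $LARGE_DATA and $ARCHIVE aliases listed under Configuration Details are set in your shell. The function name backup_to_archive and the project name in the example are hypothetical, and cp -a is used only because it is widely available, not because it is a site-recommended tool.

```shell
# Copy one project directory from the primary location to the archive.
# Usage: backup_to_archive PROJECT [SRC_ROOT] [DEST_ROOT]
# SRC_ROOT defaults to $LARGE_DATA and DEST_ROOT to $ARCHIVE.
backup_to_archive() {
    local project="${1:-}"
    local src_root="${2:-$LARGE_DATA}"
    local dest_root="${3:-$ARCHIVE}"

    if [ -z "$project" ] || [ ! -d "$src_root/$project" ]; then
        echo "backup_to_archive: no such directory: $src_root/$project" >&2
        return 1
    fi

    mkdir -p "$dest_root/$project" || return 1
    # -a preserves timestamps and permissions; the trailing /. copies the
    # directory contents (including dotfiles) into the existing target.
    cp -a "$src_root/$project/." "$dest_root/$project/"
}

# Example (hypothetical project name):
#   backup_to_archive my-project
```

Because neither location is backed up, the archive copy is only as current as the last time the helper was run.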

Group Storage

Faculty members can request the creation of group storage directories, which allow files to be shared among multiple users. Group directories can be created on any of the storage locations, with each group directory having its own disk quota. Details here.

Quota

To check your disk quota, type:

scrp-quota
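
If you are close to a quota, the standard du and sort utilities can show which directories take up the most space. The following is a minimal sketch; the function name usage_by_entry is hypothetical and is not part of the SCRP tooling.

```shell
# Print the size in KB of each top-level entry in a directory, largest last.
# Usage: usage_by_entry [DIRECTORY]   (defaults to $HOME)
usage_by_entry() {
    local dir="${1:-$HOME}"
    # -s gives one total per entry and -k reports sizes in kilobytes.
    # The two globs cover regular and hidden entries; errors from
    # unmatched globs or unreadable files are discarded.
    du -sk "$dir"/* "$dir"/.[!.]* 2>/dev/null | sort -n
}

# Example:
#   usage_by_entry ~/large-data
```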

What if I Need More Space?

Additional quota can be granted on a case-by-case basis. Please contact support.

Backup

Backups are mounted under /backup on each node:

  • Hourly backups: /backup/hourly.[0-22]/users/[username]
  • Daily backups: /backup/daily.[0-5]/users/[username]
  • Weekly backups: /backup/weekly.[0-6]/users/[username]
  • Monthly backups: /backup/monthly.[0-3]/users/[username]
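
To illustrate how these paths can be used, here is a minimal sketch that lists the snapshots containing a given file and copies one back. The function names are hypothetical. The sketch also assumes that the contents of users/[username] mirror your home directory, which is not stated above, so check the snapshot contents before relying on it. The restored copy is written next to the current file with a .restored suffix so that nothing is overwritten.

```shell
# Print the path of every snapshot copy of a file or directory.
# Usage: find_in_backups RELATIVE_PATH [BACKUP_ROOT] [USERNAME]
find_in_backups() {
    local rel="$1" root="${2:-/backup}" user="${3:-$USER}" snap
    for snap in "$root"/hourly.* "$root"/daily.* "$root"/weekly.* "$root"/monthly.*; do
        if [ -e "$snap/users/$user/$rel" ]; then
            printf '%s\n' "$snap/users/$user/$rel"
        fi
    done
}

# Copy a file or directory from one snapshot back into the home directory
# under a new name, leaving the current version untouched.
# Usage: restore_from_backup SNAPSHOT RELATIVE_PATH [BACKUP_ROOT] [USERNAME] [DEST_ROOT]
restore_from_backup() {
    local snap="$1" rel="$2" root="${3:-/backup}" user="${4:-$USER}" dest="${5:-$HOME}"
    local src="$root/$snap/users/$user/$rel"
    local target="$dest/$rel.restored"

    if [ ! -e "$src" ]; then
        echo "restore_from_backup: not found: $src" >&2
        return 1
    fi
    if [ -e "$target" ]; then
        echo "restore_from_backup: refusing to overwrite: $target" >&2
        return 1
    fi
    mkdir -p "$(dirname "$target")" || return 1
    cp -a "$src" "$target"
}

# Example (hypothetical file name):
#   find_in_backups thesis/results.csv
#   restore_from_backup daily.0 thesis/results.csv
```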

Configuration Details

~/ is your home directory, which you can also access by specifying the full path /home/users/username or the environment variable $HOME. It is exported from scrp-control-2 using Network File System (NFS) and is backed by two SSDs in a RAID 1 configuration with a total usable capacity of 2.5TB.

  • Full Path: /home/users/username
  • Aliases: ~/, $HOME
  • Speed: 1100MB/s read and 850MB/s write.
  • Quota: 2/20/50GB for undergraduates/postgraduates/faculty respectively.
  • Purging Policy: Files are deleted when the user is no longer affiliated with CUHK.

~/large-data is a storage space backed by a flash-based parallel file system (BeeGFS on 12 Intel P4510 8TB SSDs, paired with six InfiniBand FDR connections). Each file in the directory is split and distributed across multiple servers, allowing for very high read and write speeds. Even though it appears under your home directory, it is only a soft link to the parallel storage's real location /data/users/username, which means the folder is not included in the daily backup. The total usable capacity is 44TB.

  • Full Path: /data/users/username
  • Aliases: ~/large-data, $LARGE_DATA
  • Speed: Up to 6GB/s read and 3GB/s write per node. Maximum aggregate throughput is 24GB/s.
  • Quota: 20/500/1000GB for taught postgraduates/research postgraduates/faculty respectively.
  • Purging Policy: Files owned by either UG or TPG students are automatically deleted at the end of each term.

~/archive is a large storage space suitable for storing a second copy of your data. Even though it appears under your home directory, it is only a soft link to the archive's real location /archive/users/username, which means the folder is not included in the daily backup. It is exported from scrp-login using NFS and is backed by a 12-disk RAID 6 HDD array with a total usable capacity of 91TB.

  • Full Path: /archive/users/username
  • Aliases: ~/archive, $ARCHIVE
  • Speed: 500-700MB/s read and 200-300MB/s write.
  • Quota: 20/1000/5000GB for taught postgraduates/research postgraduates/faculty respectively.
  • Purging Policy: Files are deleted when the user is no longer affiliated with CUHK.

/tmp is a storage space intended for temporary files generated during a job. It is shared among the users of a node but not across nodes, meaning that files you generate on one node are not accessible from another node.

  • Full Path: /tmp
  • Aliases: $TMP
  • Speed: 2300MB/s read and 1200MB/s write on compute nodes. 500MB/s read and 200MB/s write on login nodes.
  • Quota:
    • scrp-login-[1-2]: 300GB shared.
    • scrp-node-[1-3]: 1.9TB shared.
    • scrp-node-[4-5]: 800GB shared.
    • scrp-node-[7]: 3.7TB shared.
  • Purging Policy: Bi-monthly.
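
Because /tmp is local to each node and shared with other users, a job would typically work in its own scratch directory and copy results to shared storage before it ends. The following is a minimal sketch of that pattern, assuming the $TMP alias listed above is set (it falls back to /tmp otherwise). The function name run_in_scratch and the result directory in the example are hypothetical.

```shell
# Run a command inside a private scratch directory under node-local /tmp,
# then copy the results to shared storage and remove the scratch directory.
# Usage: run_in_scratch RESULT_DIR COMMAND [ARGS...]
run_in_scratch() {
    local result_dir="$1"
    shift
    local scratch status
    scratch=$(mktemp -d "${TMP:-/tmp}/scratch.XXXXXX") || return 1

    # Run the command with the scratch directory as its working directory.
    ( cd "$scratch" && "$@" )
    status=$?

    # Copy whatever the command left behind to storage that is visible
    # from every node. The scratch directory is removed only if the copy
    # succeeded, so results are never lost to a failed copy.
    if mkdir -p "$result_dir" && cp -a "$scratch/." "$result_dir/"; then
        rm -rf "$scratch"
    else
        echo "run_in_scratch: copy failed, results left in $scratch" >&2
        return 1
    fi
    return "$status"
}

# Example (hypothetical result directory):
#   run_in_scratch "$LARGE_DATA/run-01" sh -c 'echo done > result.txt'
```

The function returns the exit status of the command itself, so a failed job is still reported as failed after its partial output has been copied.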