Storage
SCRP has different types of storage tailored to different needs. Placing your files in the correct locations is crucial to getting the best performance out of the cluster.
Basics
We recommend the following rules for placing your files:
File Type | Location | Quota |
---|---|---|
Anything you cannot afford to lose | ~/ | 2/20/50GB (a) |
Large datasets | ~/large-data | 20/500/1000GB (b) |
Dataset backups | ~/archive | 20/1000/5000GB (b) |
Temporary files | /tmp | 0.3-3.7TB (shared) |
(a) Quotas for junior undergraduates / senior undergraduates and postgraduates / faculty and staff, respectively.
(b) Quotas for senior undergraduates and taught postgraduates / research postgraduates / faculty and staff, respectively.
Different locations have different properties:
Location | Shared across Nodes | Backup | Performance |
---|---|---|---|
~/ | Yes | Daily | Moderate |
~/large-data | Yes | No | High |
~/archive | Yes | No | Moderate |
/tmp | No | No | Very high |
Important: do not put any file you cannot afford to lose in locations that have no backup!
We advise users to put anything they generate that is too important to lose in ~/.
For files too large to fit into ~/, you should keep your primary copy in ~/large-data and a backup copy in ~/archive.
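As a rough illustration of the primary-plus-backup layout described above, the sketch below copies one dataset directory from the primary location into the backup location. The function name backup_dataset and the dataset name my-dataset are hypothetical; only the standard cp and mkdir commands are used, and it assumes both locations exist for your account.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy one dataset directory into a backup location.
# Usage: backup_dataset <source-dir> <backup-root>
backup_dataset() {
    local src="$1" dest_root="$2"
    if [ ! -d "$src" ]; then
        echo "source directory not found: $src" >&2
        return 1
    fi
    mkdir -p "$dest_root"
    # -a preserves timestamps and permissions; files already present
    # under the same name in the backup are overwritten.
    cp -a "$src" "$dest_root/"
}

# Example (assumed dataset name):
# backup_dataset ~/large-data/my-dataset ~/archive
```

Since neither location is backed up by the system, the second copy only protects you if you refresh it whenever the primary copy changes.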
Group Storage
Faculty members can request the creation of group storage directories, which allow files to be shared among multiple users. Group directories can be created on any of the storage locations, with each group directory having its own disk quota. Details here.
Quota
To check your disk quota, type:
scrp-quota
What if I Need More Space?
Additional quota can be granted on a case-by-case basis. Please contact support.
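Before requesting more space, it can help to see where your current usage is going. The following is a minimal sketch using the standard du, sort and head commands; the function name largest_entries is hypothetical and is not part of the SCRP tooling.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list the largest entries directly under a directory.
# Usage: largest_entries [dir] [count]
largest_entries() {
    local dir="${1:-$HOME}" count="${2:-10}"
    # du -sk prints each entry's size in KB; sort puts the largest first.
    # Hidden entries (dotfiles) are not matched by the * glob.
    du -sk -- "$dir"/* 2>/dev/null | sort -rn | head -n "$count"
}

# Example: show the five largest items in your home directory.
# largest_entries ~ 5
```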
Backup
Backups are mounted under /backup on each node:
- Hourly backups: /backup/hourly.[0-22]/users/[username]
- Daily backups: /backup/daily.[0-5]/users/[username]
- Weekly backups: /backup/weekly.[0-6]/users/[username]
- Monthly backups: /backup/monthly.[0-3]/users/[username]
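To recover a file, you can copy it out of one of the snapshot directories listed above. The sketch below is illustrative only: the function name restore_from_snapshot, the file name project/results.csv and the destination ~/restored are hypothetical, and it assumes that paths inside a snapshot mirror paths relative to your home directory. This page does not state which snapshot index is the most recent, so check the timestamps (for example with ls -ld) before picking one.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy a file or directory out of a backup snapshot.
# Assumption: paths inside a snapshot mirror paths relative to your home directory.
# Usage: restore_from_snapshot <snapshot-user-dir> <relative-path> <destination-dir>
restore_from_snapshot() {
    local snap="$1" rel="$2" dest="$3"
    if [ ! -e "$snap/$rel" ]; then
        echo "not found in snapshot: $snap/$rel" >&2
        return 1
    fi
    mkdir -p "$dest"
    # Restore into a separate directory instead of over the live data,
    # so the current version is not clobbered.
    cp -a "$snap/$rel" "$dest/"
}

# Example (assumed snapshot index and file name):
# restore_from_snapshot "/backup/daily.0/users/$USER" project/results.csv ~/restored
```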
Configuration Details
~/ is your home directory, which you can also access through the full path /home/users/username or the environment variable $HOME. It is exported from scrp-control-2 using Network File System (NFS), supported by two SSDs in a RAID 1 configuration with a total usable capacity of 2.5TB.
- Full Path: /home/users/username
- Aliases: ~/, $HOME
- Speed: 1100MB/s read and 850MB/s write.
- Quota: 2/20/50GB for undergraduates/postgraduates/faculty respectively.
- Purging Policy: Files are deleted when the user is no longer affiliated with CUHK.
~/large-data is a storage space backed by a flash-based parallel file system (BeeGFS on 12 Intel P4510 8TB SSDs, paired with six InfiniBand FDR connections). Each file in the directory is split and distributed across multiple servers, allowing for very high read and write speeds. Even though it appears under your home directory, it is only a soft link to the parallel storage’s real location, /data/users/username, which means the folder is not included in the daily backup. The total usable capacity is 44TB.
- Full Path: /data/users/username
- Aliases: ~/large-data, $LARGE_DATA
- Speed: Up to 6GB/s read and 3GB/s write per node. Maximum aggregate throughput is 24GB/s.
- Quota: 20/500/1000GB for taught postgraduates/research postgraduates/faculty respectively.
- Purging Policy: Files owned by either UG or TPG students are automatically deleted at the end of each term.
~/archive is a large storage space suitable for storing a second copy of your data. Even though it appears under your home directory, it is only a soft link to the archive’s real location, /archive/users/username, which means the folder is not included in the daily backup. It is exported from scrp-login using NFS, supported by a 12-disk RAID 6 HDD array with a total usable capacity of 91TB.
- Full Path: /archive/users/username
- Aliases: ~/archive, $ARCHIVE
- Speed: 500-700MB/s read and 200-300MB/s write.
- Quota: 20/1000/5000GB for taught postgraduates/research postgraduates/faculty respectively.
- Purging Policy: Files are deleted when the user is no longer affiliated with CUHK.
/tmp is a storage space intended for temporary files generated during a job. It is local to each node and shared among all users of that node. The directory is not shared across nodes, meaning that files you generate on one node are not accessible from another node.
- Full Path: /tmp
- Aliases: $TMP
- Speed: 2300MB/s read and 1200MB/s write on compute nodes. 500MB/s read and 200MB/s write on login nodes.
- Quota:
  - scrp-login-[1-2]: 300GB shared.
  - scrp-node-[1-3]: 1.9TB shared.
  - scrp-node-[4-5]: 800GB shared.
  - scrp-node-[7]: 3.7TB shared.
- Purging Policy: Bi-monthly.
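Because /tmp is node-local, shared with other users and purged periodically, a job that uses it should work in its own subdirectory and copy anything worth keeping elsewhere before it finishes. The sketch below shows one way to do that with standard tools; the function name run_in_scratch, the job.XXXXXX directory prefix and the example command are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command inside a private scratch directory under /tmp,
# copy its outputs to a destination directory, then remove the scratch directory.
# Usage: run_in_scratch <output-dir> <command> [args...]
run_in_scratch() {
    local out="$1"; shift
    local scratch status=0
    # mktemp -d creates a unique directory; $TMP is used if set, otherwise /tmp.
    scratch="$(mktemp -d "${TMP:-/tmp}/job.XXXXXX")" || return 1
    # Run in a subshell so the caller's working directory is unchanged.
    ( cd "$scratch" && "$@" ) || status=$?
    mkdir -p "$out"
    # The trailing /. copies the directory's contents, including hidden files.
    cp -a "$scratch/." "$out/"
    rm -rf "$scratch"
    return "$status"
}

# Example (assumed command and output location):
# run_in_scratch ~/large-data/run-01 ./simulate --seed 1
```

The helper returns the exit status of the command it ran, so a failed job is still reported as failed after its partial outputs have been copied out.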