January 2024 Maintenance Completed
The January 2024 maintenance has been completed, and all services have resumed.
Software changes
- Python: all preconfigured environments have been rebuilt from scratch and are now based on Python 3.11. The preconfigured PyTorch environment now supports vLLM for fast large language model inference.
- MATLAB: the most up-to-date version of MATLAB 2023a has been installed.
- R: R (CRAN) has been updated to 4.3.2, while R (Conda) has been updated to 4.3.1. All preinstalled packages have been updated.
- Julia: Julia 1.10 has been installed. We now support the use of the Stata Julia plugin.
- OS: we were not able to upgrade to Ubuntu 22.04 LTS due to a compatibility issue with our network cards. We will attempt the upgrade again after we upgrade the cluster’s network to 200G InfiniBand next summer.
Hardware changes
- AMD Zen 4 nodes: SCRP is the first academic cluster in Hong Kong to offer the latest AMD Zen 4 CPUs, in the form of EPYC 9654 96-core CPUs and Ryzen 7950X 16-core CPUs, both of which are among the most powerful CPUs currently in production.
- Node-7 offers 192 cores and 1.5TB DDR5 RAM.
- Node-14 offers 96 cores and 750GB DDR5 RAM.
- Node-[15-17] each offer 16 cores and 124GB DDR5 RAM.
- Distributed storage: the migration to our third-generation distributed storage system has been completed. The new system provides better redundancy and a potential capacity of up to 170TB. The old distributed storage is mounted as /data-old and will be available for one month, after which it will be taken offline.
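If you want to confirm that nothing of yours exists only on the old storage before /data-old goes offline, a short script along the following lines may help. It is only a sketch: the function name files_missing_from_new is made up for illustration, the script compares file paths rather than file contents, and you must supply both directories yourself, since this announcement does not state where your files live on the new system.

```python
import sys
from pathlib import Path


def files_missing_from_new(old_root: Path, new_root: Path) -> list[Path]:
    """Return paths (relative to old_root) of files with no counterpart under new_root.

    Only existence is checked; file sizes and contents are not compared.
    """
    missing = []
    for old_file in sorted(old_root.rglob("*")):
        if not old_file.is_file():
            continue  # skip directories and other non-regular entries
        relative = old_file.relative_to(old_root)
        if not (new_root / relative).is_file():
            missing.append(relative)
    return missing


# Usage (both arguments are directories you choose, e.g. one under /data-old
# and its counterpart on the new storage):
#   python check_migration.py OLD_DIR NEW_DIR
if __name__ == "__main__" and len(sys.argv) == 3:
    for path in files_missing_from_new(Path(sys.argv[1]), Path(sys.argv[2])):
        print(path)
```

An empty output means every file under the old directory has a same-named file under the new one. It does not prove the copies are identical.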
Known issues
- Node-6’s GPUs are not working.
- Node-14’s internet access is not working, and as a consequence, MATLAB cannot run.
These issues are expected to be resolved on Monday.