SCRP nodes have two R installations:
- R 4.2 compiled with OpenBLAS 0.3.21 (list of installed packages)
- Microsoft R Open (MRO) 4.0 (list of installed packages)
Option 1: JupyterHub
RStudio via JupyterHub is the recommended access method.
- Navigate to one of the following URLs in a browser:
- Choose one of the following under Notebook:
RStudio (...) launches RStudio running the R version specified in parentheses.
R (...) creates a new R Jupyter notebook.
Option 2: Remote Desktop
- Connect to a login node through remote desktop.
- Launch RStudio from the pull-down menu in the top-right corner: Applications > Statistics > RStudio.
Option 3: SSH
All the instructions below assume you have connected to a login node through SSH. See Account and Access for details.
Launch RStudio on a login node:
rstudio
Launch R in console mode on a login node:
R
Run R in batch mode:
Rscript file_path
Launch MRO in console mode:
Run MRO in batch mode:
Installing Additional Packages
You can install additional packages with the following command in R:
Downloaded packages are placed under your home directory and will be immediately available on all nodes.
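For scripted setups, the same installation can be driven from the shell. The following is a minimal sketch, assuming Rscript is on your PATH; the package name data.table and the CRAN mirror URL are examples chosen for illustration, not SCRP defaults. It prints the R expression and the current library search path, and leaves the actual install commented out.

```shell
#!/bin/bash
pkg="data.table"                      # example package name (assumption)
repo="https://cloud.r-project.org"    # CRAN mirror (assumption)

# The R expression you would otherwise type at the R prompt.
install_expr="install.packages(\"${pkg}\", repos = \"${repo}\")"
echo "R expression: ${install_expr}"

if command -v Rscript >/dev/null 2>&1; then
    # Show the library search path R currently uses.
    Rscript -e 'cat(.libPaths(), sep = "\n")'
    # Uncomment to install non-interactively:
    # Rscript -e "$install_expr"
else
    echo "Rscript not found on this machine"
fi
```

Inside an interactive R session, the printed expression can be pasted as is.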
Running on a Compute Node - Short Duration
You should run your job on a compute node if you need more processing power. R installations on SCRP are linked to either OpenBLAS or Intel Math Kernel Library (MKL), both of which can use multiple CPU cores. You can speed up your analysis further by writing your code with parallelization in mind. Many guides on how to do so are available online.
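As an illustration of explicit parallelization, the sketch below writes a small R script and runs it if Rscript is available. It assumes a Linux node (mclapply relies on forking) and that Slurm exports SLURM_CPUS_PER_TASK when CPUs are requested; the file name parallel_demo.R and the fallback of two cores are hypothetical.

```shell
#!/bin/bash
# Write a tiny R script that spreads work over the allocated cores.
cat > parallel_demo.R <<'EOF'
library(parallel)
# Use the cores Slurm allocated; fall back to 2 outside a Slurm job.
n_cores <- as.integer(Sys.getenv("SLURM_CPUS_PER_TASK", unset = "2"))
# Square each input in forked worker processes.
res <- mclapply(1:8, function(i) i^2, mc.cores = n_cores)
print(unlist(res))
EOF

if command -v Rscript >/dev/null 2>&1; then
    Rscript parallel_demo.R
else
    echo "Rscript not found; parallel_demo.R written but not run"
fi
```

Reading the core count from the environment keeps the script in step with whatever resources the job was actually given.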
To run RStudio or notebooks through Jupyter on a compute node, follow the instructions here.
To run RStudio on a compute node in remote desktop, launch Applications > Slurm (x cores) > RStudio, where x is the desired number of cores.
To run R on a compute node in a terminal, simply prepend compute to your command:
# RStudio
compute rstudio
# Interactive console mode
compute R
# Batch mode
compute Rscript file_path
The above commands will launch R on a compute node with four logical CPUs and 8GB of RAM, for a duration of 24 hours.
You can request more logical CPUs with the -c option, more memory with the --mem option, and more time with the -t option.
For example, to request 16 CPUs and 40G of memory for three days:
compute -c 16 --mem=40G -t 3-0 R
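If you request the same resources often, the option mapping can be wrapped in a small helper. This is only a sketch: compute_cmd is a hypothetical function, not part of SCRP, and it assumes the -c, --mem and -t options behave as in the example above. It prints the command rather than running it, so you can inspect it first.

```shell
#!/bin/bash
# compute_cmd CPUS MEM TIME COMMAND...
# Hypothetical helper: prints a compute command line for the given resources.
compute_cmd() {
    local cpus="$1"    # logical CPUs, passed to -c
    local mem="$2"     # memory, passed to --mem (e.g. 40G)
    local time="$3"    # duration as days-hours, passed to -t (e.g. 3-0)
    shift 3
    echo "compute -c ${cpus} --mem=${mem} -t ${time} $*"
}

# Interactive R with 16 CPUs and 40G of memory for three days.
compute_cmd 16 40G 3-0 R

# Batch mode with the same resources.
compute_cmd 16 40G 3-0 Rscript file_path
```

To execute the printed command directly, you could pass it to eval, for example eval "$(compute_cmd 16 40G 3-0 R)".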
See compute for a full list of options, or use sbatch for maximum flexibility.
Running on a Compute Node - Long Duration
All of the above options will terminate R when you close the terminal. There are two options if you do not want this to happen:
Use sbatch. First create a script, hypothetically named my_job.sh:
#!/bin/bash
#SBATCH --job-name=my_sid
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=2
Rscript file_path
The #SBATCH comments specify various options. In this example, we are requesting two logical CPUs for a single task.
Now submit your job:
sbatch my_job.sh
Subject to available resources, your code will run even if you disconnect from the cluster. The maximum job duration is 5 days.
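A batch script can also state memory, run time and a log file explicitly. The sketch below writes such a script and submits it only where sbatch is found. The --mem, --time and --output values are illustrative assumptions rather than SCRP defaults, and exporting the BLAS thread variables is a common precaution that may not be needed on SCRP.

```shell
#!/bin/bash
# Write a Slurm batch script with explicit memory, time and log settings.
cat > my_job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=my_sid
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=2
#SBATCH --mem=8G
#SBATCH --time=1-0
#SBATCH --output=my_job_%j.log

# Keep BLAS threads within the allocated CPUs (assumption: may be unnecessary).
export OPENBLAS_NUM_THREADS="${SLURM_CPUS_PER_TASK:-1}"
export MKL_NUM_THREADS="${SLURM_CPUS_PER_TASK:-1}"

Rscript file_path
EOF

# Submit only where Slurm is available.
if command -v sbatch >/dev/null 2>&1; then
    sbatch my_job.sh
else
    echo "sbatch not found; my_job.sh written but not submitted"
fi
```

In the log file name, %j is replaced by the Slurm job ID, so repeated submissions do not overwrite each other's output.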
Use screen. Note that we reserve the right to terminate processes that have been running for more than 24 hours on the login nodes.
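One way to use screen is to start the job in a detached session and reattach later. This is a sketch only: the session name r_session is hypothetical, and it assumes the compute wrapper works inside a screen session on the login node.

```shell
#!/bin/bash
session="r_session"   # hypothetical session name

if command -v screen >/dev/null 2>&1 && command -v compute >/dev/null 2>&1; then
    # -d -m starts the session detached; -S gives it a name.
    screen -dmS "$session" compute Rscript file_path
    echo "Started detached session: ${session}"
else
    echo "screen or compute not available on this machine"
fi

# Reattach later with:  screen -r r_session
# Detach again with:    Ctrl-A followed by D
```

The session ends on its own when the R script finishes; screen -ls lists any sessions that are still running.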