Python
Basics
JupyterHub is the recommended access method. Navigate to one of the following URLs in a browser and log in with your credentials:
Environments
There are five pre-configured environments managed by our staff:
Environment | Description | Installed Packages
---|---|---
base | SCRP data analytics stack | list of installed packages
anaconda | Stable Anaconda release | list of installed packages
tensorflow | Stable TensorFlow release (2.17) | list of installed packages
pytorch | Stable PyTorch release (2.4) | list of installed packages
rapids | Stable RAPIDS release (23.12) | list of installed packages
Read on if you need to install additional packages or use a different environment.
Advanced Usage
To use Python in console mode, type the following in a terminal:
python
To run a Python script, type:
python file_path
Some of your courses might have a teacher-managed environment. You can activate it by:
conda activate environment_name
You can check what environments are available by:
conda info --env
To see what Python packages are available in current environment:
conda list
Running on a Compute Node - Short Duration
You should run your job on a compute node if you need more processing power. Note that many Python libraries are single-threaded, so there is little benefit in requesting more than one CPU core from a compute node unless your code is written with parallelization in mind. For mathematical operations, the default Conda environment is linked to Intel Math Kernel Library (MKL), so libraries such as numpy are capable of utilizing multiple CPU cores.
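To illustrate "code written with parallelization in mind", here is a minimal standard-library sketch (function names are hypothetical) that splits a CPU-bound task across worker processes, so that requesting multiple cores from a compute node actually pays off:

```python
# Hypothetical sketch: spreading a CPU-bound task over several cores
# using only the standard library.
from concurrent.futures import ProcessPoolExecutor


def partial_sum(bounds):
    """Sum of squares over [start, stop) -- a stand-in for real work."""
    start, stop = bounds
    return sum(i * i for i in range(start, stop))


def parallel_sum_of_squares(n, workers=4):
    """Split [0, n) into `workers` chunks and sum them in parallel."""
    step = n // workers
    chunks = [(i * step, (i + 1) * step if i < workers - 1 else n)
              for i in range(workers)]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum, chunks))


if __name__ == "__main__":
    print(parallel_sum_of_squares(1_000_000))
```

Pure-Python loops like this one hold the GIL in a single process, so process-based parallelism is used here; numpy-heavy code gets its parallelism from MKL instead and needs no such scaffolding.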
Jupyter Notebooks
To run Jupyter notebooks on a compute node, follow the instructions here.
Python Console and Python Scripts
To launch the Python console on a compute node, simply prepend compute:
# Default environment
compute python
# Custom environment
compute
conda activate environment_name
python
Both commands launch Python on a compute node with four logical CPUs and 8GB of RAM, for a duration of 24 hours.
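To confirm from inside Python how many CPUs your job can actually use, you can inspect the process's CPU affinity; `os.sched_getaffinity` is a standard-library call available on Linux, which under Slurm reflects the cores allocated to the job rather than the whole node:

```python
# Count the logical CPUs this process may run on (Linux-specific).
# Under Slurm this is the job's allocation, which can be fewer than
# the node total reported by os.cpu_count().
import os

allocated = len(os.sched_getaffinity(0))  # CPUs available to this process
total = os.cpu_count()                    # all logical CPUs on the node
print(f"{allocated} of {total} logical CPUs available")
```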
To run a Python script, provide the file path of the script after python:
compute python my_file.py
If you need a GPU, prepend gpu instead:
# Default environment
gpu python
# Custom environment
gpu
conda activate environment_name
python
Use one of the following to check if a GPU is truly available:
# PyTorch
import torch
torch.cuda.is_available()
# TensorFlow
import tensorflow as tf
tf.config.list_physical_devices('GPU')
You can request more logical CPUs with the -c option, more memory with the --mem option, more time with the -t option, and specify the GPU model with the --gpus-per-task option.
For example, to request 16 CPUs, 40G of memory and one RTX 3090 GPU for three days:
compute -c 16 --mem=40G --gpus-per-task=rtx3090 -t 3-0 python
See compute for a full list of options, or srun and sbatch for maximum flexibility.
Running on a Compute Node - Long Duration
The commands above terminate Python when you close the terminal. There are two options if you do not want this to happen:
- Use sbatch. First create a script, hypothetically named my_job.sh:
#!/bin/bash
#SBATCH --job-name=my_sid
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=2
python file_path
The #SBATCH comments specify various options. In this example, we are requesting two logical CPUs for a single task. Now submit your job:
sbatch my_job.sh
Subject to available resources, your code will run even if you disconnect from the cluster. The maximum job duration is 5 days.
- Use Linux screen. Do note that we reserve the right to terminate processes that have been running for more than 24 hours on the login nodes.
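A script submitted through sbatch can also size itself to the resources it was granted by reading Slurm's environment variables. A hedged sketch (the function name is hypothetical; SLURM_CPUS_PER_TASK is only set when --cpus-per-task is specified):

```python
# Read the CPU allocation Slurm exports to batch jobs, falling back to 1
# when the variable is absent (e.g. when testing outside of Slurm).
import os


def allocated_cpus(default=1):
    """CPUs granted via --cpus-per-task, or `default` outside Slurm."""
    return int(os.environ.get("SLURM_CPUS_PER_TASK", default))


n_workers = allocated_cpus()
print(f"Sizing worker pool to {n_workers} CPU(s)")
```

This keeps the script correct whether it runs under the two-CPU example above or under a larger allocation, without hard-coding a worker count.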
Custom Environment
You can create your own environment in a terminal. Start a terminal and type:
conda create -n env_name [-c channel_1 -c channel_2 ... packages]
For example, to create a new pytorch environment, type:
conda create -n my-pytorch -c pytorch -c nvidia pytorch torchvision torchaudio pytorch-cuda=11.7
After the environment has been created, you can activate it by:
conda activate env_name
If you want to activate the environment every time you log in, add the following line to your ~/.bash_profile:
conda activate env_name
Custom Installation
You can install Anaconda into your home directory. You will only need to do this once on a login node and the installation will be usable across the whole cluster.
Since Anaconda takes up a substantial amount of disk space, we will use its lightweight cousin Miniconda as an example instead:
# Remember to check if there is a new version available
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
# Install. Default location is under your home folder
bash Miniconda3-latest-Linux-x86_64.sh
You will need to log out and log in again for the changes to take effect.