Rouge

From SciNet Users Documentation
Installed: March 2021
Operating System: Linux (CentOS 7.6)
Number of Nodes: 20
Interconnect: Infiniband (2xEDR)
RAM/Node: 512 GB
Cores/Node: 48
GPUs/Node: 8 MI50-32GB
Login/Devel Node: rouge-login01
Vendor Compilers: rocm/gcc
Queue Submission: slurm

Specifications

The Rouge cluster was donated to the University of Toronto by AMD as part of their COVID-19 HPC Fund support program. The cluster consists of 20 x86_64 nodes, each with a single 48-core AMD EPYC 7642 CPU running at 2.3 GHz, 512 GB of RAM, and 8 Radeon Instinct MI50 GPUs.

The nodes are interconnected with 2xHDR100 Infiniband for internode communications and disk I/O to the SciNet Niagara filesystems. In total this cluster contains 960 CPU cores and 160 GPUs.

Access and support requests should be sent to support@scinet.utoronto.ca.

Getting started on Rouge

The Rouge login node, rouge-login01, can be accessed via the Niagara cluster.

ssh -Y MYCCUSERNAME@niagara.scinet.utoronto.ca
ssh -Y rouge-login01

Storage

The filesystem for Rouge is currently shared with the Niagara cluster. See Niagara Storage for more details.

Loading software modules

You have two options for running code on Rouge: use existing software, or compile your own. This section focuses on the former.

Other than essentials, all installed software is made available using module commands. These modules set environment variables (PATH, etc.), allowing multiple, conflicting versions of a given package to be available. A detailed explanation of the module system can be found on the modules page.

Common module subcommands are:

  • module load <module-name>: load the default version of a software package.
  • module load <module-name>/<module-version>: load a specific version of a software package.
  • module purge: unload all currently loaded modules.
  • module spider (or module spider <module-name>): list available software packages.
  • module avail: list loadable software packages.
  • module list: list loaded modules.

Along with modifying common environment variables, such as PATH and LD_LIBRARY_PATH, these modules also create a SCINET_MODULENAME_ROOT environment variable, which can be used to access commonly needed software directories, such as /include and /lib.
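As a minimal sketch of how the root variable can be used, the helper below looks up the variable for a given module. The function name scinet_root is hypothetical, and the sketch assumes that MODULENAME is simply the upper-cased module name (for example SCINET_GCC_ROOT for gcc); module names containing other characters are not handled.

```shell
# Print the install prefix exported by a loaded module.
# Assumption: the variable is SCINET_<NAME>_ROOT, with <NAME> the upper-cased module name.
scinet_root() {
    scinet_name=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
    # printenv returns non-zero when the variable is not set in the environment
    printenv "SCINET_${scinet_name}_ROOT"
}

# Example: report where the gcc module lives, if it has been loaded.
if root=$(scinet_root gcc); then
    echo "gcc headers: $root/include, libraries: $root/lib"
else
    echo "SCINET_GCC_ROOT is not set; load the gcc module first"
fi
```

The same lookup can feed -I and -L flags in a build script instead of hard-coding installation paths.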

There are handy abbreviations for the module commands. ml is the same as module list, and ml <module-name> is the same as module load <module-name>.

Available compilers and interpreters

  • The ROCm module has to be loaded first for GPU software.
  • To compile MPI code, you must additionally load an openmpi module.

ROCm

The currently installed ROCm Toolkit version is 4.1.0.

module load rocm/<version>
  • A compiler (GCC or rocm-clang) module must be loaded in order to use ROCm to build any code.

The current AMD driver version is 5.9.15. Use rocm-smi -a for full details.
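The sketch below shows one way to check that the toolchain works once a compiler module and the rocm module are loaded: it writes a tiny HIP program that counts the visible GPUs and builds it with hipcc. The file name hip_devcount.cpp and its location are hypothetical, and the script only prints a hint when hipcc is not on the PATH.

```shell
# Write a minimal HIP test program (hypothetical file name and location).
src="${TMPDIR:-/tmp}/hip_devcount.cpp"
cat > "$src" <<'EOF'
#include <hip/hip_runtime.h>
#include <cstdio>

int main() {
    int n = 0;
    // hipGetDeviceCount reports how many GPUs the HIP runtime can see
    if (hipGetDeviceCount(&n) != hipSuccess) {
        std::puts("HIP runtime found no usable device");
        return 1;
    }
    std::printf("HIP devices visible: %d\n", n);
    return 0;
}
EOF

# Build and run only when hipcc is available, i.e. after the modules are loaded.
if command -v hipcc >/dev/null 2>&1; then
    hipcc "$src" -o "${src%.cpp}" && "${src%.cpp}"
else
    echo "hipcc not found: load a compiler module and the rocm module first"
fi
```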

Other Compilers and Tools

Available compiler modules are:

  • gcc/10.3.0: GNU Compiler Collection
  • rocm-clang/4.1.0: Clang
  • hipify-clang/12.0.0: Tool for translating CUDA sources into HIP sources
  • aocc/3.0.0: AMD Optimizing C/C++ Compiler (Clang-based)

OpenMPI

An openmpi/<version> module is available for each of the different compilers.

Software

Singularity Containers

/scinet/rouge/amd/containers/gromacs.rocm401.ubuntu18.sif
/scinet/rouge/amd/containers/lammps.rocm401.ubuntu18.sif
/scinet/rouge/amd/containers/namd.rocm401.ubuntu18.sif
/scinet/rouge/amd/containers/openmm.rocm401.ubuntu18.sif

GROMACS

The HIP version of GROMACS 2020.3, which performs better than the OpenCL version, is provided by AMD in a container. For now, using a single GPU is suggested for all simulations. Job example:

#!/bin/bash
#SBATCH --time=1:00:00
#SBATCH --nodes=1
#SBATCH --gpus-per-node=1

export SINGULARITY_HOME=$SLURM_SUBMIT_DIR

singularity exec -B /home -B /scratch --env OMP_PLACES=cores /scinet/rouge/amd/containers/gromacs.rocm401.ubuntu18.sif gmx mdrun -pin off -ntmpi 1 -ntomp 6 ......

# Setting '-ntomp 4' might give better performance; run your own benchmark. Values larger than 6 are not recommended for a single-GPU job.
# If the warning 'GPU update with domain decomposition lacks substantial testing and should be used with caution.' appears and concerns you, add '-update cpu' to override.

NAMD

The HIP version of NAMD (3.0a) is provided by AMD in a container. For now, using a single GPU is suggested for all simulations. Job example:

#!/bin/bash
#SBATCH --time=1:00:00
#SBATCH --nodes=1
#SBATCH --gpus-per-node=1

export SINGULARITY_HOME=$SLURM_SUBMIT_DIR

singularity exec -B /home -B /scratch --env LD_LIBRARY_PATH=/opt/rocm/lib:/.singularity.d/libs /scinet/rouge/amd/containers/namd.rocm401.ubuntu18.sif namd2 +idlepoll +p 12 stmv.namd
# Do not set the +p flag higher than 12: a single-GPU job has only 6 cores (12 threads).

PyTorch

Install PyTorch into a Python virtual environment:

module load python gcc
mkdir -p ~/.virtualenvs
virtualenv --system-site-packages ~/.virtualenvs/pytorch-rocm
source ~/.virtualenvs/pytorch-rocm/bin/activate
pip3 install torch -f https://download.pytorch.org/whl/rocm4.0.1/torch_stable.html
pip3 install ninja && pip3 install 'git+https://github.com/pytorch/vision.git@v0.9.1'

Run a PyTorch job on a single GPU:

#!/bin/bash
#SBATCH --time=1:00:00
#SBATCH --nodes=1
#SBATCH --gpus-per-node=1

module load python gcc
source ~/.virtualenvs/pytorch-rocm/bin/activate
python code.py
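Before running real work, it can help to confirm that the installed PyTorch build sees a GPU. The sketch below is one way to do that; the function name torch_gpu_report is hypothetical. ROCm builds of PyTorch expose the GPUs through the torch.cuda interface, so the usual availability calls apply.

```shell
# Report the PyTorch version and GPU visibility, or explain why it cannot.
torch_gpu_report() {
    if python3 -c 'import torch' 2>/dev/null; then
        python3 -c 'import torch
print("torch version:", torch.__version__)
print("GPU available:", torch.cuda.is_available())
print("GPU count:", torch.cuda.device_count())'
    else
        echo "torch is not importable: activate ~/.virtualenvs/pytorch-rocm first"
    fi
}

torch_gpu_report
```

Inside a single-GPU job the count should be 1; if the GPU is reported as unavailable, check that the virtual environment was activated in the job script.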

Testing and debugging

Test your code before you submit it to the cluster, both to confirm that it is correct and to find out what resources it needs.

  • Small test jobs can be run on the login node. Rule of thumb: tests should run for no more than a couple of minutes, take at most about 1-2 GB of memory, and use no more than one GPU and a few cores.
  • For short tests that do not fit on a login node, or for which you need a dedicated node, request an interactive debug job with the debugjob command:
rouge-login01:~$ debugjob --clean -g G=1

where G is the number of GPUs. G=1 gives an interactive session for 2 hours, G=4 gives a node with 4 GPUs for 30 minutes, and G=8 (the maximum) gives a full node with 8 GPUs for 30 minutes. The --clean argument is optional but recommended: it starts the session without any modules loaded, which more closely mimics what happens when you submit a job script.
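For tests on the login node, a small wrapper can help stay within the rule of thumb above. The sketch below is only an illustration: the function name login_test and the LOGIN_TEST_SECONDS variable are hypothetical. It restricts the program to the first GPU through HIP_VISIBLE_DEVICES and stops it after two minutes with the coreutils timeout command. It does not limit memory or CPU cores.

```shell
# Run a command on one GPU with a time limit (default 120 seconds).
login_test() {
    # HIP_VISIBLE_DEVICES=0 exposes only the first GPU to HIP programs;
    # timeout exits with status 124 when the limit is reached.
    HIP_VISIBLE_DEVICES=0 timeout "${LOGIN_TEST_SECONDS:-120}" "$@"
}

# Example with a stand-in command; replace it with your own test program.
login_test echo "short test finished"
```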

Submitting jobs

Once you have compiled and tested your code or workflow on the Rouge login node, and confirmed that it behaves correctly, you are ready to submit jobs to the cluster. Your jobs will run on one of Rouge's 20 compute nodes. When and where your job runs is determined by the scheduler.

Rouge uses SLURM as its job scheduler.

You submit jobs from a login node by passing a script to the sbatch command:

rouge-login01:scratch$ sbatch jobscript.sh

This puts the job in the queue. It will run on the compute nodes in due course. In most cases, you should not submit from your $HOME directory, but rather from your $SCRATCH directory, so that the output of your compute job can be written out ($HOME is read-only on the compute nodes).
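A guard like the one sketched below can catch accidental submissions from outside $SCRATCH. The function name in_scratch is hypothetical, and jobscript.sh is the placeholder script name used above. The sketch only checks the submission directory, not where the job itself writes its output.

```shell
# Succeed only when the current directory is inside $SCRATCH.
in_scratch() {
    [ -n "${SCRATCH:-}" ] || return 1
    scratch_dir=${SCRATCH%/}           # tolerate a trailing slash
    case "$PWD/" in
        "$scratch_dir"/*) return 0 ;;
        *) return 1 ;;
    esac
}

if in_scratch && command -v sbatch >/dev/null 2>&1; then
    sbatch jobscript.sh
else
    echo "Not submitting: change to a directory under \$SCRATCH on the login node first"
fi
```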

Example job scripts can be found below. Keep in mind:

  • Scheduling is by GPU, and each GPU comes with 6 CPU cores.
  • Your job's maximum walltime is 24 hours.
  • Jobs must write their output to your scratch or project directory (home is read-only on compute nodes).
  • Compute nodes have no internet access.
  • Your job script will not remember the modules you have loaded, so it needs to contain "module load" commands of all the required modules (see examples below).

Single-GPU job script

Each single-GPU job gets 1/8 of a node: 1 GPU, 6 CPU cores (12 threads), and about 64 GB of CPU memory. Never request CPUs or memory explicitly. For an MPI program, you can set --ntasks to the number of MPI ranks. Do NOT set --ntasks for non-MPI programs.

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --gpus-per-node=1
#SBATCH --time=1:00:00

module load <modules you need>
# Run your program here
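The one-eighth share quoted for single-GPU jobs follows directly from the node totals in the specifications (48 cores, 512 GB of RAM, and 8 GPUs per node, with 2 hardware threads per core). The short sketch below works through that arithmetic; the function name gpu_share is hypothetical.

```shell
# Node totals taken from the specifications at the top of this page.
CORES_PER_NODE=48
RAM_GB_PER_NODE=512
GPUS_PER_NODE=8

# Print the CPU cores, hardware threads and approximate memory that come with N GPUs.
gpu_share() {
    gpus=$1
    cores=$(( gpus * CORES_PER_NODE / GPUS_PER_NODE ))
    threads=$(( cores * 2 ))           # 2 hardware threads per core
    mem=$(( gpus * RAM_GB_PER_NODE / GPUS_PER_NODE ))
    echo "$gpus GPU(s): $cores cores, $threads threads, ~${mem}GB RAM"
}

gpu_share 1    # prints: 1 GPU(s): 6 cores, 12 threads, ~64GB RAM
gpu_share 8    # prints: 8 GPU(s): 48 cores, 96 threads, ~512GB RAM
```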

Full-node job script

If you are not sure whether the program can be executed on multiple GPUs, please follow the single-GPU job instructions above or contact SciNet support.

A multi-GPU job should request at least one full node (8 GPUs). You need to specify the "compute_full_node" partition in order to get all the resources on a node.

  • An example for a 1-node job:
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --gpus-per-node=8
#SBATCH --ntasks=8 # this only affects MPI jobs
#SBATCH --time=1:00:00
#SBATCH -p compute_full_node

module load <modules you need>
# Run your program here
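For jobs spanning more than one node, the same header scales with the node count. The sketch below prints such a header for a given number of nodes. It assumes one MPI rank per GPU (8 per node), matching the 1-node example above; the function name full_node_header is hypothetical, and other rank layouts may suit your program better.

```shell
# Print the #SBATCH header for a job using N full nodes.
full_node_header() {
    nodes=$1
    ntasks=$(( nodes * 8 ))    # assumption: one MPI rank per GPU, 8 GPUs per node
    cat <<EOF
#!/bin/bash
#SBATCH --nodes=$nodes
#SBATCH --gpus-per-node=8
#SBATCH --ntasks=$ntasks
#SBATCH --time=1:00:00
#SBATCH -p compute_full_node
EOF
}

# Example: header for a 2-node job (16 GPUs, 16 MPI ranks).
full_node_header 2
```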