Balam

Installed: October 2023
Operating System: Linux (Rocky 9.2)
Number of Nodes: 10
Interconnect: Infiniband
RAM/Node: 1 TB
Cores/Node: 64
GPUs/Node: 4 x A100-40GB
Login/Devel Node: balam-login01
Vendor Compilers: cuda/intel/gcc
Queue Submission: slurm

Specifications

The Balam cluster is owned by the Acceleration Consortium at the University of Toronto and hosted at SciNet. The cluster consists of 10 x86_64 nodes, each with two Intel Xeon(R) Platinum 8358 32-core CPUs running at 2.6 GHz, 1 TB of RAM, and four NVIDIA A100 GPUs.

The nodes are interconnected with Infiniband for internode communications and disk I/O to the SciNet Niagara file systems. In total this cluster contains 640 CPU cores and 40 GPUs.

Access is available only to those affiliated with the Acceleration Consortium. Support requests should be sent to balam-support@scinet.utoronto.ca.

Getting started on Balam

Balam can be accessed directly.

ssh -Y MYCCUSERNAME@balam.scinet.utoronto.ca

Or, the Balam login node balam-login01 can be accessed via the Niagara cluster.

ssh -Y MYCCUSERNAME@niagara.scinet.utoronto.ca
ssh -Y balam-login01

Storage

The filesystem for Balam is currently shared with the Niagara cluster. See Niagara Storage for more details.

Loading software modules

You have two options for running code on Balam: use existing software, or compile your own. This section focuses on the former.

Other than essentials, all installed software is made available using module commands. These modules set environment variables (PATH, etc.), allowing multiple, conflicting versions of a given package to be available. A detailed explanation of the module system can be found on the modules page.

Common module subcommands are:

  • module load <module-name>: load the default version of a particular software.
  • module load <module-name>/<module-version>: load a specific version of a particular software.
  • module purge: unload all currently loaded modules.
  • module spider (or module spider <module-name>): list available software packages.
  • module avail: list loadable software packages.
  • module list: list loaded modules.

Along with modifying common environment variables, such as PATH and LD_LIBRARY_PATH, these modules also create a MODULE_MODULENAME_PREFIX environment variable, which can be used to access commonly needed software directories, such as the package's include and lib directories.

There are handy abbreviations for the module commands. ml is the same as module list, and ml <module-name> is the same as module load <module-name>.
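
As a quick, minimal sketch of these commands in action (using the cuda module as an example; the exact variable name MODULE_CUDA_PREFIX is assumed here from the MODULE_MODULENAME_PREFIX pattern described above):

balam-login01:~$ module load cuda            # load the default cuda version
balam-login01:~$ ml                          # same as "module list"
balam-login01:~$ echo $MODULE_CUDA_PREFIX    # assumed prefix variable; points to the package's installation directory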

Software stacks: BalamEnv and CCEnv

On Balam, there are two available software stacks:

BalamEnv

A software stack with modules specific to Balam, tuned and compiled for this machine. This stack is loaded by default, but if it is not, it can be reloaded with

module load BalamEnv

This loads the default set of modules, which is currently the 2023a epoch.

No modules are loaded by default on Balam except BalamEnv.

CCEnv

The same software stack that is available on the Alliance (formerly Compute Canada) General Purpose clusters can be loaded with:

module load CCEnv

Or, if you want the same default modules loaded as on Béluga and Narval, then do

module load CCEnv StdEnv

or, if you want the same default modules loaded as on Cedar and Graham, do

module load CCEnv arch/avx2 StdEnv/2020

Available compilers and interpreters

  • In the BalamEnv, the cuda module has to be loaded first for GPU software.
  • To compile MPI code, you must additionally load an openmpi module.

CUDA

The currently installed CUDA versions are 11.8.0 and 12.3.1:

module load cuda/<version>

The current NVIDIA driver version is 535.104.12. Use nvidia-smi -a for full details.
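
As a minimal sketch of compiling a GPU program (the source file and program names are hypothetical; gcc/12.3.0 is chosen here because it is the CUDA-compatible gcc listed below):

balam-login01:~$ module load cuda/12.3.1 gcc/12.3.0
balam-login01:~$ nvcc -o saxpy saxpy.cu      # compile a CUDA source file with the nvcc compiler from the cuda module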

Other Compilers and Tools

Other available compiler modules are:

  • gcc/12.3.0: GNU Compiler Collection, compatible with CUDA/12.3
  • gcc/13.2.0: GNU Compiler Collection, incompatible with CUDA/12.3
  • intel/2023u1: Intel compiler suite

OpenMPI

The openmpi/5.0.0 module becomes available once gcc/13.2.0 is loaded.
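
As a minimal sketch of compiling an MPI program (the source file and program names are hypothetical):

balam-login01:~$ module load gcc/13.2.0 openmpi/5.0.0
balam-login01:~$ mpicc -o hello_mpi hello_mpi.c    # compile a C MPI program

The resulting program should be run inside a job or an interactive debug session (see Testing and debugging below); only very small tests belong on the login node.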

Testing and debugging

You should test your code before you submit it to the cluster, both to check that it is correct and to determine what kind of resources you need.

  • Small test jobs can be run on the login node. Rule of thumb: tests should run no more than a couple of minutes, taking at most about 1-2GB of memory and a few cores. Keep in mind that the login node only has one GPU.
  • For short tests that do not fit on a login node, or for which you need a dedicated node, request an interactive debug job with the debugjob command:
balam-login01:~$ debugjob --clean -g G

where G is the number of GPUs. If G=1, this gives an interactive session for 2 hours, whereas G=4 gets you a node with 4 GPUs for 60 minutes. The --clean argument is optional but recommended, as it starts the session without any modules loaded, thus mimicking more closely what happens when you submit a job script.
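
For example, a quick single-GPU check inside such a session could look like this (the test program name is hypothetical, and cuda/12.3.1 is just an example module):

balam-login01:~$ debugjob --clean -g 1
# once the interactive session starts on the compute node:
module load cuda/12.3.1        # reload any modules you need; --clean starts with none loaded
nvidia-smi                     # confirm the GPU is visible
./my_test_program              # run your short test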

Submitting jobs

Once you have compiled and tested your code or workflow on the Balam login nodes, and confirmed that it behaves correctly, you are ready to submit jobs to the cluster. Your jobs will run on one of Balam's 10 compute nodes. When and where your job runs is determined by the scheduler.

Balam uses SLURM as its job scheduler.

You submit jobs from a login node by passing a script to the sbatch command:

balam-login01:scratch$ sbatch jobscript.sh

This puts the job in the queue. It will run on the compute nodes in due course. In most cases, you should not submit from your $HOME directory, but rather, from your $SCRATCH directory, so that the output of your compute job can be written out, since $HOME is read-only on the compute nodes.
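
Once submitted, you can keep track of your jobs with the standard SLURM commands, for example:

balam-login01:scratch$ squeue -u $USER      # list your queued and running jobs
balam-login01:scratch$ scancel <jobid>      # cancel a job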

Example job scripts can be found below. Keep in mind:

  • Scheduling is by GPU, and each GPU comes with 16 CPU cores.
  • Your job's maximum walltime is 24 hours.
  • Jobs must write their output to your scratch or project directory (home is read-only on compute nodes).
  • Compute nodes have no internet access.
  • Your job script will not remember the modules you have loaded, so it needs to contain "module load" commands for all the required modules (see examples below).

Single-GPU job script

A single-GPU job gets 1/4 of a node: one A100 GPU with 16 CPU cores (32 threads) and approximately 256 GB of CPU memory. Users should never ask for CPUs or memory explicitly. If running an MPI program, you can set --ntasks to the number of MPI ranks. Do NOT set --ntasks for non-MPI programs.

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --gpus-per-node=1
#SBATCH --time=1:00:0

module load <modules you need>
<command to run your program>
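
As a concrete sketch (the module and program name are only examples, not requirements), a single-GPU job script might look like this:

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --gpus-per-node=1
#SBATCH --time=1:00:00

module load cuda/12.3.1        # example: load whatever modules your program needs
./my_gpu_program               # hypothetical program name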

Full-node job script

If you are not sure whether your program can run on multiple GPUs, please follow the single-GPU job instructions above or contact balam-support.

Multi-GPU jobs should ask for a minimum of one full node (4 GPUs). You need to specify the "compute_full_node" partition in order to get all the resources on a node.

  • An example for a 1-node job:
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --gpus-per-node=4
#SBATCH --ntasks=64 #this only affects MPI jobs
#SBATCH --time=1:00:00
#SBATCH -p compute_full_node

module load <modules you need>
<command to run your program>
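
As a concrete sketch (the modules and program name are only examples; adjust them to your own code), a full-node MPI job might look like this:

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --gpus-per-node=4
#SBATCH --ntasks=64
#SBATCH --time=1:00:00
#SBATCH -p compute_full_node

module load gcc/13.2.0 openmpi/5.0.0    # example modules for an MPI program
mpirun ./my_mpi_program                 # hypothetical program; mpirun typically picks up the SLURM allocation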