Main Page


Niagara

Niagara consists of 1,500 Lenovo SD530 nodes, each with 2x Intel Skylake 6148 CPUs (20 cores, 2.4 GHz) and 192 GB of RAM, for 60,000 cores in total. The network is EDR InfiniBand with a dragonfly topology.

Login

Log in with your Compute Canada (CC) credentials, not your SciNet password, by connecting to USERID@niagara.scinet.utoronto.ca.
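
For example, from a terminal (replace USERID with your CC username):

ssh USERID@niagara.scinet.utoronto.ca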

QuickStart

FileSystems

  • $HOME
  • $PROJECT
  • $SCRATCH
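
A quick way to see where these point for your account (a minimal sketch; these environment variables are set automatically when you log in):

echo $HOME $PROJECT $SCRATCH   # show the three file system paths
cd $SCRATCH                    # scratch is commonly used as the working area for jobs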

Software

Software is provided through environment modules; use the module command (e.g. module load) to set up your environment.
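
A minimal sketch of a typical module workflow, using the intel/2018.2 module that appears in the example scripts below (run module avail to see what is actually installed):

module avail                 # list available software
module load intel/2018.2     # load the Intel compilers
module list                  # show currently loaded modules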

Scheduling

Niagara uses the SLURM scheduler, with a maximum walltime of 24 hours for all jobs. Scheduling is by node, i.e. in multiples of 40 cores. There is a debug queue on 5 nodes, with a maximum walltime of 1 hour.
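
A minimal sketch of requesting an interactive debug session, assuming the debug queue is exposed as a SLURM partition named debug (check sinfo for the actual partition names):

salloc --partition=debug --nodes=1 --time=1:00:00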


Submit a job

sbatch my.script

List jobs

squeue 

Look at available nodes

sinfo

Cancel a job

scancel $JOBID
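
For example, to find and cancel one of your own jobs (the job ID is shown in the first column of squeue; 123456 below is just a placeholder):

squeue -u $USER    # list only your jobs and their job IDs
scancel 123456     # cancel the job with that ID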

Partitions are dragonfly1, dragonfly2, dragonfly3, dragonfly4, and compute. Specify one of dragonfly[1-4] if you want to ensure 1:1 (non-blocking) network connectivity (up to 432 nodes); otherwise specify nothing, or compute, to use up to 1500 nodes. You can also specify an explicit node range if you want, e.g. #SBATCH --nodelist=nia[0801-1000].
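
For example, the relevant lines of a job script header might look like this (a sketch; switch the partition to compute to draw from the full machine, and include the nodelist line only if you need specific nodes):

#SBATCH --partition=dragonfly1       # stay within one dragonfly group for 1:1 connectivity
#SBATCH --nodelist=nia[0801-1000]    # optional: request an explicit node range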

For Intel MPI, start with this:

#!/bin/bash
#SBATCH -N 200                   # number of nodes
#SBATCH --ntasks-per-node=40     # one MPI rank per core
#SBATCH -t 00:30:00              # walltime (hh:mm:ss)
#SBATCH -J test                  # job name

module load NiaEnv/2018b intel/2018.1 intelmpi/2018.1

mpiexec.hydra my.code

For HPCX (OpenMPI)
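
A minimal sketch of an OpenMPI (HPCX) job script; the openmpi module and the executable name below are assumptions, so check module avail openmpi for what is actually installed:

#!/bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=40
#SBATCH --time=00:30:00
#SBATCH --job-name mpi_test

module load intel/2018.2 openmpi   # exact openmpi module name/version is an assumption

mpirun ./mpi_code                  # ./mpi_code is a placeholder for your MPI executable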

OpenMP example

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --cpus-per-task=40        # use all 40 cores of the node for OpenMP threads
#SBATCH --time=1:00:00
#SBATCH --job-name openmp_job
#SBATCH --output=openmp_output_%j.txt

cd $SLURM_SUBMIT_DIR

module load intel/2018.2

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK   # one thread per allocated core

srun ./openmp_example