Burst Buffer

From SciNet Users Documentation

The Niagara burst buffer is a fast, high-performance shared file system made of solid-state drives (SSDs). While the overall bandwidth of the burst buffer is somewhat higher than that of the scratch file system, its true strength lies in handling a high rate of I/O operations per second (IOPS). The ideal use cases are therefore jobs that generate more IOPS than the /scratch file system can handle, such as certain bioinformatics workflows and quantum chemistry calculations, and codes that have large restart checkpoint files to be saved between jobs.

The setup of the Burst Buffer of the Niagara cluster is evolving as we come to better understand how best to use this resource. The current setup is described below.

Short-term burst buffer space ($BBUFFER)

To use the burst buffer, you must first request space on it. To do so, send an email to support@scinet.utoronto.ca explaining why you need access. A quota of 10 TB will be set for each burst buffer user.

Users with short-term burst buffer access will have a directory created on that resource. Its location is available in the $BBUFFER environment variable. Like $SCRATCH, the $BBUFFER directory is accessible from all Niagara login, compute and datamover nodes.

Unlike ramdisk or job-specific burst buffer space (explained below), files in your burst buffer space persist between jobs. This makes the burst buffer ideal for codes that have large restart checkpoint files to be saved between jobs.
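As an illustration of the checkpoint use case, the following is a minimal Python sketch, not an official SciNet recipe. It assumes the checkpoint state can be pickled; the subdirectory name my_app_checkpoints, the file name restart.pkl and the three function names are all hypothetical. The checkpoint is written to a temporary file and then renamed into place within the same directory, so that a job killed mid-write should not leave a truncated restart file behind.

```python
import os
import pickle
import tempfile
from pathlib import Path


def checkpoint_dir(env=os.environ):
    """Return a per-application checkpoint directory under $BBUFFER.

    'my_app_checkpoints' is a hypothetical subdirectory name.
    """
    base = env.get("BBUFFER")
    if not base:
        raise RuntimeError("BBUFFER is not set; is burst buffer access enabled?")
    path = Path(base) / "my_app_checkpoints"
    path.mkdir(parents=True, exist_ok=True)
    return path


def save_checkpoint(state, directory, name="restart.pkl"):
    """Write the state to a temporary file, then rename it into place."""
    directory = Path(directory)
    fd, tmp_name = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "wb") as handle:
        pickle.dump(state, handle)
    # Rename within the same directory replaces any older checkpoint.
    os.replace(tmp_name, directory / name)
    return directory / name


def load_checkpoint(directory, name="restart.pkl"):
    """Return the saved state, or None when no checkpoint exists yet."""
    path = Path(directory) / name
    if not path.exists():
        return None
    with open(path, "rb") as handle:
        return pickle.load(handle)
```

A job would typically call load_checkpoint(checkpoint_dir()) at start-up, begin from scratch if it returns None, and call save_checkpoint periodically while running.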

Users should clean up after each job by staging out final files to $SCRATCH and removing temporary files. A very short purging policy for the burst buffer (around 48 hours) will be implemented in the future.
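The clean-up step could look roughly like the Python sketch below. It assumes that each run works in its own subdirectory of $BBUFFER and that final results can be recognized by file suffix; both the function name stage_out and the suffix list are hypothetical and should be adapted to your own workflow.

```python
import shutil
from pathlib import Path


def stage_out(src_dir, dest_dir, keep_suffixes=(".out", ".h5")):
    """Copy final files from src_dir to dest_dir, then delete src_dir.

    src_dir is meant to be a per-run working directory, not $BBUFFER
    itself. keep_suffixes is a hypothetical rule for telling final
    results apart from temporary files.
    """
    src_dir, dest_dir = Path(src_dir), Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(src_dir.rglob("*")):
        if path.is_file() and path.suffix in keep_suffixes:
            # Preserve the relative layout of the run directory.
            target = dest_dir / path.relative_to(src_dir)
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, target)
            copied.append(target)
    # Reached only if every copy succeeded; removes temporary files too.
    shutil.rmtree(src_dir)
    return copied
```

At the end of a job this might be called with a run directory under $BBUFFER as the source and a results directory under $SCRATCH as the destination, both read from os.environ.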

Users should test a burst buffer workflow using a short test job before using the burst buffer in production.
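One possible ingredient of such a test job is a crude small-file probe like the Python sketch below, which times the creation and read-back of many small files in a given directory. The function name small_file_probe and the default file count and size are arbitrary assumptions, and the result is only a rough indication (reads may be served from cache), not a proper IOPS benchmark.

```python
import os
import tempfile
import time


def small_file_probe(directory, count=1000, size=4096):
    """Write and read back `count` small files; return elapsed seconds.

    The files live in a temporary subdirectory of `directory` that is
    removed again when the probe finishes.
    """
    payload = os.urandom(size)
    start = time.perf_counter()
    with tempfile.TemporaryDirectory(dir=directory) as work:
        for i in range(count):
            with open(os.path.join(work, f"f{i:06d}.bin"), "wb") as handle:
                handle.write(payload)
        for i in range(count):
            with open(os.path.join(work, f"f{i:06d}.bin"), "rb") as handle:
                handle.read()
    return time.perf_counter() - start
```

Running the probe once on the burst buffer directory and once on a directory under $SCRATCH, inside the same short job, gives a first impression of whether a workflow is likely to benefit.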

Per-job temporary burst buffer space ($BB_JOB_DIR)

For every job on Niagara, the scheduler creates a temporary directory on the burst buffer called $BB_JOB_DIR. The $BB_JOB_DIR directory is empty when your job starts, and its contents are deleted after the job has finished. This directory is accessible from all nodes of a job.

$BB_JOB_DIR is intended for applications that generate many small temporary files, or files that are accessed very frequently (i.e., high-IOPS applications), when those files do not fit in ramdisk.

It should be emphasized that if the temporary files do fit in ramdisk, then that is generally a better location for them, as both the bandwidth and the IOPS of ramdisk far exceed those of the burst buffer. To use ramdisk, you can either access /dev/shm directly or use the environment variable $SLURM_TMPDIR.
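The choice between ramdisk and $BB_JOB_DIR can be sketched in Python as follows. This is only an illustration: the function name pick_temp_dir is hypothetical, you must supply your own estimate of the temporary data size, and the 50% headroom factor is an arbitrary assumption reflecting that files in ramdisk occupy memory the application itself also needs.

```python
import os
import shutil


def pick_temp_dir(needed_bytes, env=os.environ, free_bytes=None):
    """Return the ramdisk path if the data fits, else $BB_JOB_DIR.

    free_bytes may be supplied by the caller; by default the free space
    of the ramdisk is measured with shutil.disk_usage. The 0.5 headroom
    factor is an assumption, not a SciNet recommendation.
    """
    ramdisk = env.get("SLURM_TMPDIR", "/dev/shm")
    if free_bytes is None:
        free_bytes = shutil.disk_usage(ramdisk).free
    if needed_bytes <= 0.5 * free_bytes:
        return ramdisk
    # Too large for ramdisk: use the per-job burst buffer directory.
    return env["BB_JOB_DIR"]
```

For example, pick_temp_dir(20 * 1024**3) asks for a location that can hold about 20 GiB of temporary files.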

Note that Niagara compute nodes have no local disks, so $SLURM_TMPDIR lives in memory (ramdisk), in contrast to the general-purpose Alliance (formerly Compute Canada) systems Cedar, Graham, Beluga and Narval, where this variable points to a directory on a node-local SSD.