Main Page

From SciNet Users Documentation
Revision as of 20:17, 21 August 2024 by Rzon (talk | contribs)

System Status

Niagara, Mist, Teach, Rouge
Jupyter Hub, Scheduler, File system, Burst Buffer
HPSS, Login Nodes, External Network, Globus
Balam, CCEnv

Wed Aug 21 7:00:00 EDT 2024: Maintenance started.

Sun Aug 18 19:15:00 EDT 2024: Issues have been resolved.

Sun Aug 18 14:30:00 EDT 2024: Power issues seem to have brought compute nodes down, compounding the file system issues we had earlier.

Sun Aug 18 10:31:53 EDT 2024: GPFS is back online and seems to be holding.

Sun Aug 18 08:44:40 EDT 2024: Sorry, the problems with the GPFS file systems are recurring.

Sun Aug 18 07:59:02 EDT 2024: GPFS file systems are back to normal. Many jobs have died and will need to be resubmitted.

Sun Aug 18 06:39:12 EDT 2024: Support staff detected the problem and started working on a fix.

Sun Aug 18 00:53:52 EDT 2024: GPFS file systems (home, scratch, project) started to show early signs of problems.

August 21, 2024: The annual cooling tower maintenance for the SciNet data centre will take place on August 21, 2024 from 7 a.m. EDT until the end of the day. This maintenance requires a shutdown of the compute nodes of all SciNet systems (Niagara, Mist, Rouge, Teach, as well as hosted equipment). The login nodes, file systems and the HPSS system will remain available.

The scheduler will hold jobs that cannot finish before the start of the shutdown. Users are encouraged to submit small and short jobs that can take advantage of this, as the scheduler may be able to fit these jobs in before the maintenance on otherwise idle nodes.
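
As a rough illustration of the rule above, the following sketch checks whether a job's requested walltime would end before the maintenance begins. It is only an approximation under stated assumptions: the function names are hypothetical, the job is assumed to start the moment it is submitted, and the real scheduler also accounts for queue position, priorities and node availability.

```python
from datetime import datetime, timedelta

# Maintenance start from the announcement above (7 a.m. EDT, August 21, 2024).
# Times are treated as naive local times for simplicity.
MAINTENANCE_START = datetime(2024, 8, 21, 7, 0, 0)


def fits_before_maintenance(start_time, walltime,
                            maintenance_start=MAINTENANCE_START):
    """Return True if a job starting at start_time with the requested
    walltime would end no later than the maintenance start.

    Hypothetical helper: assumes the job starts immediately at start_time.
    """
    return start_time + walltime <= maintenance_start


def max_walltime(now, maintenance_start=MAINTENANCE_START):
    """Return the longest walltime that could still finish before the
    shutdown, or zero if the maintenance has already begun."""
    return max(maintenance_start - now, timedelta(0))
```

For example, at 22:00 on August 20 a job requesting 8 hours would end at 06:00 and could fit in, while a job requesting 10 hours would run past 07:00 and would be held until after the maintenance.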

Previous messages

QuickStart Guides

Tutorials, Manuals, etc.