Main Page
System Status
Thu Aug 22 13:30:00 EDT 2024: A chiller issue caused about 25% of Niagara compute nodes to go down; users should resubmit any affected jobs.
Wed Aug 21 16:35:00 EDT 2024: Maintenance finished; compute nodes are now available for user jobs.
Wed Aug 21 7:00:00 EDT 2024: Maintenance started.
Sun Aug 18 19:15:00 EDT 2024: Issues have been resolved.
Sun Aug 18 14:30:00 EDT 2024: Power issues seem to have brought compute nodes down, compounding the file system issues we had earlier.
Sun Aug 18 10:31:53 EDT 2024: GPFS is back online and seems to be holding.
Sun Aug 18 08:44:40 EDT 2024: Sorry, problems with the GPFS file systems are recurring.
Sun Aug 18 07:59:02 EDT 2024: GPFS file systems are back to normal. Many jobs have died and will need to be resubmitted.
Sun Aug 18 06:39:12 EDT 2024: Support staff detected the problem and started working on a fix.
Sun Aug 18 00:53:52 EDT 2024: GPFS file systems (home, scratch, project) started to show early signs of problems.
August 21, 2024: The annual cooling tower maintenance for the SciNet data centre will take place on August 21, 2024 from 7 a.m. EDT until the end of day. This maintenance requires a shutdown of the compute nodes of all SciNet systems (Niagara, Mist, Rouge, Teach, as well as hosted equipment). The login nodes, file systems and the HPSS system will remain available. The scheduler will hold jobs that cannot finish before the start of the shutdown. Users are encouraged to submit small, short jobs, as the scheduler may be able to fit these in on otherwise idle nodes before the maintenance begins.