Main Page

From SciNet Users Documentation
Revision as of 16:13, 17 April 2024 by Norbertk (talk | contribs)

System Status

Niagara, Mist, Teach, Rouge
Jupyter Hub, Scheduler, File system, Burst Buffer
HPSS, Login Nodes, External Network, Globus
Balam, CCEnv


Wednesday April 17, 2024: 11:00 The restart of the Niagara login nodes has been completed successfully.

Wednesday April 17, 2024: 09:40 Niagara login nodes will be rebooted.

Tuesday April 16, 2024: 12:45 mist-login01 has recovered.

Tuesday April 16, 2024: 11:45 mist-login01 will be unavailable due to maintenance from 12:15 to 12:45. Following the completion of maintenance, login access should be restored.

Monday April 15, 2024: 13:02 Balam-login01 will be unavailable due to maintenance from 13:00 to 13:30. Following the completion of maintenance, login access should be restored.

Monday March 18, 2024: 14:45 File system issue resolved. Users are advised to check if their running jobs were affected, and if so, to resubmit.

Monday March 18, 2024: 13:02 File system issues. This affects the ability to log in. We are investigating.

Monday March 11, 2024: 14:05 All systems have recovered.

Monday March 11, 2024: There will be a shutdown of the file system at SciNet for an emergency repair. As a consequence, the login nodes and compute nodes of all SciNet clusters using the file system (Niagara, Mist, Balam, Rouge, and Teach) will be down from 11 am EST until later in the afternoon.

February 28, 2024, 4:30 PM EDT: All systems have recovered.

February 28, 2024, 1:00 PM EDT: A loop pump fault caused many compute nodes to overheat. If your jobs failed around this time, please resubmit them. Once the root cause has been addressed, the cluster will be brought up completely. Please report issues to support@scinet.utoronto.ca.

February 22, 2024, 5:45 PM EDT: Maintenance finished and system restored. Please report issues to support@scinet.utoronto.ca.

February 21, 2024, 7:00 AM EDT: Maintenance starting. Niagara login nodes and the file system will be kept up as much as possible, but will be rebooted at some point.

February 20, 2024, 3:45 PM EDT: The cooling tower has been restored; all systems are in production.

February 20, 2024, 1:30 AM EDT: Cooling tower malfunction. All compute nodes are shut down. The root cause will be addressed in the morning at the earliest.

February 21 and 22, 2024: SciNet Data Centre Maintenance:
This annual winter maintenance involves a full data centre shutdown starting at 7:00 am EST on Wednesday, February 21st. None of the SciNet systems (Niagara, Mist, Rouge, Teach, the file systems, as well as hosted equipment) will be accessible. All systems should be fully available again in the late afternoon of the 22nd.

The scheduler will hold jobs that cannot finish before the start of the shutdown. Users are encouraged to submit small and short jobs that can take advantage of this, as the scheduler may be able to fit these jobs in before the maintenance on otherwise idle nodes.
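
To see whether a job might fit in before the shutdown, the rule above can be sketched as a simple time comparison. This is only an illustration, not the scheduler's actual logic: the function names and the example start time are hypothetical, and a real scheduler also has to account for queue wait and node availability.

```python
from datetime import datetime, timedelta


def fits_before_shutdown(start, walltime, shutdown_start):
    """Return True if a job starting at `start` with the requested
    `walltime` would finish no later than `shutdown_start`."""
    return start + walltime <= shutdown_start


def max_walltime(start, shutdown_start):
    """Longest walltime that still finishes before the shutdown
    (zero if the shutdown has already begun)."""
    return max(shutdown_start - start, timedelta(0))


# Shutdown start taken from the notice: 7:00 am on Wednesday, February 21st.
shutdown_start = datetime(2024, 2, 21, 7, 0)

# Hypothetical moment at which the job would start running (an assumption).
start = datetime(2024, 2, 20, 22, 0)

print(fits_before_shutdown(start, timedelta(hours=8), shutdown_start))   # True
print(fits_before_shutdown(start, timedelta(hours=12), shutdown_start))  # False
print(max_walltime(start, shutdown_start))                               # 9:00:00
```

In this sketch, a job requesting 8 hours would end at 6:00 am and could run, while a job requesting 12 hours would be held until after the maintenance. Requesting a walltime close to what the job really needs improves the chance that it fits.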

Previous messages

QuickStart Guides

Tutorials, Manuals, etc.