Difference between revisions of "Main Page"

From SciNet Users Documentation
m (System Status)
(System Status)
(29 intermediate revisions by 5 users not shown)
Line 9: Line 9:
 
|{{Up|Niagara|Niagara_Quickstart}}
 
|{{Up|Niagara|Niagara_Quickstart}}
 
|{{Up|HPSS|HPSS}}
 
|{{Up|HPSS|HPSS}}
|{{Up|SOSCIP GPU|SOSCIP_GPU}}
 
 
|{{Up|Mist|Mist}}
 
|{{Up|Mist|Mist}}
 +
|{{Up|Teach|Teach}}
 
|-
 
|-
|{{Up|Teach|Teach}}
 
 
|{{Up|Jupyter Hub|Jupyter_Hub}}
 
|{{Up|Jupyter Hub|Jupyter_Hub}}
 
|{{Up|Scheduler|Niagara_Quickstart#Submitting_jobs}}
 
|{{Up|Scheduler|Niagara_Quickstart#Submitting_jobs}}
 
|{{Up|File system|Niagara_Quickstart#Storage_and_quotas}}
 
|{{Up|File system|Niagara_Quickstart#Storage_and_quotas}}
 +
|{{Up|Burst Buffer|Burst_Buffer}}
 
|-
 
|-
 
|{{Up|Login Nodes|Niagara_Quickstart#Logging_in}}  
 
|{{Up|Login Nodes|Niagara_Quickstart#Logging_in}}  
 
|{{Up|External Network|Niagara_Quickstart#Logging_in}}  
 
|{{Up|External Network|Niagara_Quickstart#Logging_in}}  
 
|{{Up|Globus|Globus}}
 
|{{Up|Globus|Globus}}
|{{Up|Burst Buffer|Burst_Buffer}}
 
 
|}
 
|}
 
<!-- Current Messages: -->
 
<!-- Current Messages: -->
  
<b>May 3, 2020, 8:20 AM:</b> A power glitch this morning caused all compute nodes to be rebooted: jobs running at the time may have failed; users are asked to resubmit these jobs.
+
<b> June 29, 6:21:00  PM:</b> Systems are available again.   
 
 
<b>April 28, 2020, 7:20 AM:</b> A power glitch this morning caused all compute nodes to be rebooted: jobs running at the time have failed; users are asked to resubmit these jobs.
 
   
 
<b>April 20, 2020: Security Incident at Cedar; implications for Niagara users</b>
 
 
 
Last week, it became evident that the Cedar GP cluster had been
 
compromised for several weeks.  The passwords of at least two
 
Compute Canada users were known to the attackers. One of these was
 
used to escalate privileges on Cedar, as explained on
 
https://status.computecanada.ca/view_incident?incident=423.
 
  
These accounts were used to log in to Niagara as well, but Niagara
+
<b>June 29, 12:30:00 PM:</b> A power outage caused a thermal shutdown.
did not have the same security loophole as Cedar (which has been
 
fixed), and no further escalation was observed on Niagara.
 
  
Reassuring as that may sound, it is not known how the passwords of
+
<b>June 20, 2020, 10:24 PM:</b> File systems are back up. Unfortunately, all running jobs would have died and users are asked to resubmit them.
the two user accounts were obtained. Given this uncertainty, the
 
SciNet team *strongly* recommends that you change your password on
 
https://ccdb.computecanada.ca/security/change_password, and remove
 
any SSH keys and regenerate new ones (see
 
https://docs.scinet.utoronto.ca/index.php/SSH_keys).
 
  
<b> SciNet/Niagara Downtime Announcement, May 6-7, 2020</b>
+
<b>June 20, 2020, 9:48 PM:</b> An issue with the file systems is causing trouble.  We are investigating the cause.
  
All resources at SciNet will undergo a two-day maintenance shutdown on May 6th and 7th 2020, starting at 7 am EDT on Wednesday May 6th.  There will be no access to any of the SciNet systems (Niagara, Mist, HPSS, Teach cluster, or the file systems) or systems hosted at the SciNet data centre.  We expect to be able to bring the systems back online the evening of May 7th.
+
<b>June 15, 2020, 10:30 PM:</b> A <b>power glitch</b> caused some compute nodes to be rebooted: jobs running at the time may have failed; users are asked to resubmit these jobs.
  
 
<!--  When removing system status entries, please archive them to: https://docs.scinet.utoronto.ca/index.php/Previous_messages -->
 
<!--  When removing system status entries, please archive them to: https://docs.scinet.utoronto.ca/index.php/Previous_messages -->
Line 72: Line 54:
 
* [[Burst Buffer]]
 
* [[Burst Buffer]]
 
* [[SSH Tunneling]]
 
* [[SSH Tunneling]]
 +
* [[SSH#Two-Factor_authentication|Two-Factor Authentication]]
 
* [[Visualization]]
 
* [[Visualization]]
 
* [[Running Serial Jobs on Niagara]]
 
* [[Running Serial Jobs on Niagara]]
 
* [[Jupyter Hub]]
 
* [[Jupyter Hub]]
 
|}
 
|}

Revision as of 12:43, 30 June 2020

System Status

Niagara | HPSS | Mist | Teach
Jupyter Hub | Scheduler | File system | Burst Buffer
Login Nodes | External Network | Globus

June 29, 6:21:00 PM: Systems are available again.

June 29, 12:30:00 PM: A power outage caused a thermal shutdown.

June 20, 2020, 10:24 PM: File systems are back up. Unfortunately, all running jobs would have died and users are asked to resubmit them.

June 20, 2020, 9:48 PM: An issue with the file systems is causing trouble. We are investigating the cause.

June 15, 2020, 10:30 PM: A power glitch caused some compute nodes to be rebooted: jobs running at the time may have failed; users are asked to resubmit these jobs.

QuickStart Guides

Tutorials, Manuals, etc.