Difference between revisions of "Main Page"

From SciNet Users Documentation
m
(System Status)
 
(135 intermediate revisions by 9 users not shown)
Line 9: Line 9:
 
|{{Up|Niagara|Niagara_Quickstart}}

|{{Up|HPSS|HPSS}}
−
|{{Up|SOSCIP GPU|SOSCIP_GPU}}

|{{Up|Mist|Mist}}
+
|{{Up|Teach|Teach}}

|-
−
|{{Up|Teach|Teach}}

|{{Up|Jupyter Hub|Jupyter_Hub}}

|{{Up|Scheduler|Niagara_Quickstart#Submitting_jobs}}

|{{Up|File system|Niagara_Quickstart#Storage_and_quotas}}
+
|{{Up|Burst Buffer|Burst_Buffer}}

|-

|{{Up|Login Nodes|Niagara_Quickstart#Logging_in}}
Line 21: Line 21:
 
|{{Up|Globus|Globus}}

|}
+


<!-- Current Messages: -->
−
<b> March 2, 2020, 1:30 PM:</b> For the extension of Niagara, the operating system on all Niagara nodes has been upgraded from CentOS 7.4 to 7.6. This required all nodes to be rebooted. Running compute jobs are allowed to finish before the compute node gets rebooted. Login nodes have all been rebooted, as have the datamover nodes and the jupyterhub service.
−
<b> Feb 24, 2020, 1:30PM: </b> The [[Mist]] login node got rebooted.  It is back, but we are still monitoring the situation.
−
<b> Feb 12, 2020, 11:00AM: </b> The [[Mist]] GPU cluster now available to users.
−
<b> Feb 11, 2020, 2:00PM: </b> The Niagara compute nodes were accidentally rebooted, killing all running jobs.
−
<b> Feb 10, 2020, 19:00PM: </b> HPSS is back to normal.
−
<b> Jan 30, 2020, 12:01PM: </b> We are having an issue with HPSS, in which the disk-cache is full. We put a reservation on the whole system (Globus, plus archive and vfs queues), until it has had a chance to clear some space on the cache.
+
<b> October 9, 2020, 12:57 PM: </b> A short power glitch caused many of the Niagara compute nodes to lose power; jobs running on them would have failed. Please check your jobs and resubmit.
+
<b> October 8, 2020, 9:50 PM: </b> Jupyterhub service is back up.
+
<b> October 8, 2020, 5:40 PM: </b> Jupyterhub service is down. We are investigating.
+
<b> September 28, 2020, 11:00 AM EST: </b> A short power glitch caused many of the Niagara compute nodes to lose power; jobs running on them would have failed. Please check your jobs and resubmit.
+
<b> September 1, 2020, 2:15 PM EST: </b> A short power glitch caused about half of the Niagara compute nodes to lose power; jobs running on them would have failed. Please check your jobs and resubmit.
+
<b> September 1, 2020, 9:27 AM EST: </b> The Niagara cluster has moved to a new default software stack, NiaEnv/2019b.  If your job scripts used the previous default software stack before (NiaEnv/2018a), please put the command "module load NiaEnv/2018a" before other module commands in those scripts, to ensure they will continue to work, or try the new stack (recommended).

<!--  When removing system status entries, please archive them to: https://docs.scinet.utoronto.ca/index.php/Previous_messages -->

{|style="border-spacing: 10px;width: 100%"
Line 59: Line 55:
 
* [[Burst Buffer]]

* [[SSH Tunneling]]
+
* [[SSH#Two-Factor_authentication|Two-Factor Authentication]]

* [[Visualization]]

* [[Running Serial Jobs on Niagara]]

* [[Jupyter Hub]]

|}

Latest revision as of 17:09, 9 October 2020

System Status

Niagara | HPSS | Mist | Teach
Jupyter Hub | Scheduler | File system | Burst Buffer
Login Nodes | External Network | Globus

October 9, 2020, 12:57 PM: A short power glitch caused many of the Niagara compute nodes to lose power; jobs running on them would have failed. Please check your jobs and resubmit.

October 8, 2020, 9:50 PM: Jupyterhub service is back up.

October 8, 2020, 5:40 PM: Jupyterhub service is down. We are investigating.

September 28, 2020, 11:00 AM EST: A short power glitch caused many of the Niagara compute nodes to lose power; jobs running on them would have failed. Please check your jobs and resubmit.

September 1, 2020, 2:15 PM EST: A short power glitch caused about half of the Niagara compute nodes to lose power; jobs running on them would have failed. Please check your jobs and resubmit.

September 1, 2020, 9:27 AM EST: The Niagara cluster has moved to a new default software stack, NiaEnv/2019b. If your job scripts used the previous default software stack before (NiaEnv/2018a), please put the command "module load NiaEnv/2018a" before other module commands in those scripts, to ensure they will continue to work, or try the new stack (recommended).
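
To act on the software stack notice above, it can help to first find the job scripts that still rely on the default stack. The following is a minimal sketch, not an official SciNet tool: the function name `find_unpinned_scripts` is hypothetical, and it assumes your job scripts are files ending in `.sh` in a single directory. It lists the scripts that contain no explicit `module load NiaEnv/...` line.

```shell
#!/bin/bash
# Hypothetical helper (not part of the SciNet tooling): list job scripts
# that never load a NiaEnv stack explicitly and therefore pick up whatever
# the current default stack is.
# Assumption: job scripts are the *.sh files directly inside the directory.
find_unpinned_scripts() {
    local dir="${1:-.}"
    # grep -L prints only the names of files that contain no matching line.
    grep -L -E '^[[:space:]]*module[[:space:]]+load[[:space:]]+NiaEnv/' "$dir"/*.sh
}

# Example call (the path is an assumption; use your own job directory):
#   find_unpinned_scripts "$HOME/jobs"
```

For each script it lists, either add `module load NiaEnv/2018a` before the first other module command, as the notice describes, or test the script against the new NiaEnv/2019b stack.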

QuickStart Guides

Tutorials, Manuals, etc.