Difference between revisions of "Main Page"

From SciNet Users Documentation
(System Status)
 
(70 intermediate revisions by 5 users not shown)
Line 7: Line 7:
 
<!-- Use "Up" or "Down"; these are templates. -->
 
<!-- Use "Up" or "Down"; these are templates. -->
 
{|style="width:100%"  
 
{|style="width:100%"  
|{{Down|Niagara|Niagara_Quickstart}}
+
|{{Up|Niagara|Niagara_Quickstart}}
 
|{{Up|HPSS|HPSS}}
 
|{{Up|HPSS|HPSS}}
|{{Down|Mist|Mist}}
+
|{{Up|Mist|Mist}}
|{{Down|Teach|Teach}}
+
|{{Up|Teach|Teach}}
 
|-
 
|-
|{{Down|Jupyter Hub|Jupyter_Hub}}
+
|{{Up|Jupyter Hub|Jupyter_Hub}}
|{{Down|Scheduler|Niagara_Quickstart#Submitting_jobs}}
+
|{{Up|Scheduler|Niagara_Quickstart#Submitting_jobs}}
 
|{{Up|File system|Niagara_Quickstart#Storage_and_quotas}}
 
|{{Up|File system|Niagara_Quickstart#Storage_and_quotas}}
 
|{{Up|Burst Buffer|Burst_Buffer}}
 
|{{Up|Burst Buffer|Burst_Buffer}}
 
|-
 
|-
|{{Down|Login Nodes|Niagara_Quickstart#Logging_in}}  
+
|{{Up|Login Nodes|Niagara_Quickstart#Logging_in}}  
|{{Down|External Network|Niagara_Quickstart#Logging_in}}  
+
|{{Up|External Network|Niagara_Quickstart#Logging_in}}  
 
|{{Up|Globus|Globus}}
 
|{{Up|Globus|Globus}}
 
|}
 
|}
  
 
<!-- Current Messages: -->
 
<!-- Current Messages: -->
<b>August 19, 2020, 4:40 PM EST:</b> Update: The current estimate is to have the cooling restored on Friday and we hope to have the systems available for users on Saturday August 22, 2020.
 
  
<b>August 17, 2020, 4:00 PM EST:</b> Unfortunately after taking the pump apart it was determined there was a more serious failure of the main drive shaft, not just the seal. As a new one will need to be sourced or fabricated we're estimating that it will take at least a few more days to get the part and repairs done to restore cooling. Sorry for the inconvenience. 
+
From Tue Mar 30 at 12 noon EST to Thu Apr 1 at 12 noon EST, there will be a two-day reservation for the "Niagara at Scale" pilot event. During these 48 hours, only "Niagara at Scale" projects will run on the compute nodes (as well as SOSCIP projects, on a subset of nodes). All other users can still log in, access their data, and submit jobs throughout this event, but those jobs will not run until after the event. The debugjob queue will also remain available to everyone.
  
<b>August 15, 2020, 1:00 PM EST:</b> Because of limited parts availability for repairing the failed pump and cooling system, it is unlikely that the systems can be restored before Monday afternoon.
+
The scheduler will not start batch jobs that cannot finish before the start of this event. Users who submit small, short jobs can take advantage of this, as the scheduler may be able to fit these jobs onto otherwise idle nodes before the event starts.
  
<b>August 15, 2020, 00:04 AM EST:</b> A primary pump seal in the cooling infrastructure has blown, and parts availability cannot be determined until tomorrow. All systems are shut down as there is no cooling. If parts are available, systems may be back late tomorrow at the earliest. Check here for updates.
+
Tue 23 Mar 2021 12:19:07 PM EDT - Planned external network maintenance 12pm-1pm Tuesday, March 23rd.  
  
<b>August 14, 2020, 21:04 EST:</b> Tomorrow's /scratch purge has been postponed.
 
 
<b>August 14, 2020, 21:00 EST:</b> Staff at the datacenter. Looks like one of the pumps has a seal that is leaking badly.
 
 
<b>August 14, 2020, 20:37 EST:</b> We seem to be undergoing a thermal shutdown at the datacenter.
 
 
<b>August 14, 2020, 20:20 EST:</b> Network problems to niagara/mist. We are investigating.
 
 
 
<!--  When removing system status entries, please archive them to: https://docs.scinet.utoronto.ca/index.php/Previous_messages -->
 
<!--  When removing system status entries, please archive them to: https://docs.scinet.utoronto.ca/index.php/Previous_messages -->
 
{|style="border-spacing: 10px;width: 100%"
 
{|style="border-spacing: 10px;width: 100%"
Line 46: Line 37:
 
* [[Niagara Quickstart]]
 
* [[Niagara Quickstart]]
 
* [[HPSS | HPSS archival storage]]
 
* [[HPSS | HPSS archival storage]]
* [[SOSCIP_GPU | SOSCIP GPU cluster]]
 
 
* [[Mist| Mist Power 9 GPU cluster]]
 
* [[Mist| Mist Power 9 GPU cluster]]
 
* [[Teach|Teach cluster]]
 
* [[Teach|Teach cluster]]

Latest revision as of 17:32, 1 April 2021

System Status

Niagara HPSS Mist Teach
Jupyter Hub Scheduler File system Burst Buffer
Login Nodes External Network Globus


From Tue Mar 30 at 12 noon EST to Thu Apr 1 at 12 noon EST, there will be a two-day reservation for the "Niagara at Scale" pilot event. During these 48 hours, only "Niagara at Scale" projects will run on the compute nodes (as well as SOSCIP projects, on a subset of nodes). All other users can still log in, access their data, and submit jobs throughout this event, but those jobs will not run until after the event. The debugjob queue will also remain available to everyone.

The scheduler will not start batch jobs that cannot finish before the start of this event. Users who submit small, short jobs can take advantage of this, as the scheduler may be able to fit these jobs onto otherwise idle nodes before the event starts.
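
To make the rule above concrete, here is a minimal sketch of the "can this job still fit before the event?" check. It is an illustration only, not the scheduler's actual logic: the function name `can_start_before_event` is hypothetical, the year 2021 is inferred from the revision date, times are treated as naive local times, and a job ending exactly at the reservation start is assumed to fit. The real scheduler also needs idle nodes to be available and works from the walltime requested at submission.

```python
from datetime import datetime, timedelta

# Start of the "Niagara at Scale" reservation from the announcement
# (naive local time; the year 2021 is an assumption based on the revision date).
RESERVATION_START = datetime(2021, 3, 30, 12, 0)


def can_start_before_event(now, walltime, reservation_start=RESERVATION_START):
    """Return True if a job started at `now` with the requested `walltime`
    would finish no later than the start of the reservation.

    Hypothetical helper: it ignores node availability and queue priority.
    """
    return now + walltime <= reservation_start


# A 2-hour job considered at 9:00 on the morning of the event would end
# at 11:00, before the reservation begins at 12:00.
print(can_start_before_event(datetime(2021, 3, 30, 9, 0), timedelta(hours=2)))

# A 4-hour job considered at the same time would end at 13:00, after the
# reservation has begun, so it would have to wait until after the event.
print(can_start_before_event(datetime(2021, 3, 30, 9, 0), timedelta(hours=4)))
```

In practice this means that requesting a realistic, short walltime gives a job a better chance of being placed on idle nodes in the hours leading up to the event.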

Tue 23 Mar 2021 12:19:07 PM EDT - Planned external network maintenance 12pm-1pm Tuesday, March 23rd.

QuickStart Guides

Tutorials, Manuals, etc.