Difference between revisions of "Main Page"

From SciNet Users Documentation
 
(154 intermediate revisions by 9 users not shown)
Line 7: Line 7:
 
<!-- Use "Up" or "Down"; these are templates. -->
 
<!-- Use "Up" or "Down"; these are templates. -->
 
{|style="width:100%"  
 
{|style="width:100%"  
|{{Up|Niagara|Niagara_Quickstart}}
+
|{{Up |Niagara|Niagara_Quickstart}}
|{{Up|HPSS|HPSS}}
+
|{{Up |Mist|Mist}}
|{{Up|Mist|Mist}}
+
|{{Up |Teach|Teach}}
|{{Up|Teach|Teach}}
+
|{{Up |Rouge|Rouge}}
 
|-
 
|-
|{{Up|Jupyter Hub|Jupyter_Hub}}
+
|{{Up |Jupyter Hub|Jupyter_Hub}}
|{{Up|Scheduler|Niagara_Quickstart#Submitting_jobs}}
+
|{{Up |Scheduler|Niagara_Quickstart#Submitting_jobs}}
|{{Up|File system|Niagara_Quickstart#Storage_and_quotas}}
+
|{{Up |File system|Niagara_Quickstart#Storage_and_quotas}}
|{{Up|Burst Buffer|Burst_Buffer}}
+
|{{Up |Burst Buffer|Burst_Buffer}}
 
|-
 
|-
|{{Up|Login Nodes|Niagara_Quickstart#Logging_in}}  
+
|{{Down |HPSS|HPSS}}
|{{Meh|External Network|Niagara_Quickstart#Logging_in}}  
+
|{{Up |Login Nodes|Niagara_Quickstart#Logging_in}}  
|{{Up|Globus|Globus}}
+
|{{Up |External Network|Niagara_Quickstart#Logging_in}}  
 +
|{{Up |Globus |Globus}} (-hpss)
 
|}
 
|}
  
 
<!-- Current Messages: -->
 
<!-- Current Messages: -->
<b> August 24, 2020, 6:35 PM EST: </b> We have partial connectivity back, but are still investigating.
+
<b>Thu Sep 23 17:23 EDT 2021 </b> Systems are being brought back online. HPSS may be down for some more days.
  
<b> August 24, 2020, 3:15 PM EST: </b> There are issues connecting to the data centre. We're investigating.
+
<b>Thu Sep 23 12:30 EDT 2021 </b> Cooling restored. Systems should be available later this afternoon.
  
<b> August 21, 2020, 6:00 PM EST: </b> The pump has been repaired, cooling is restored, systems are up.  <br/>Scratch purging is postponed until the evening of Friday Aug 28th, 2020.
+
<b>Thu Sep 23 9:30 EDT 2021 </b> Technicians are on site working on the cooling system.
  
<b>August 19, 2020, 4:40 PM EST:</b> Update: The current estimate is to have the cooling restored on Friday, and we hope to have the systems available for users on Saturday, August 22, 2020.
+
<b>Thu Sep 23 3:30 EDT 2021 </b> Cooling system issues still unresolved.
  
<b>August 17, 2020, 4:00 PM EST:</b> Unfortunately, after taking the pump apart, it was determined that there was a more serious failure of the main drive shaft, not just the seal. As a new one will need to be sourced or fabricated, we estimate that it will take at least a few more days to get the part and complete the repairs to restore cooling. Sorry for the inconvenience.
+
<b>Wed Sep 22 23:27:48 EDT 2021 </b> Shutdown of the datacenter due to a problem with the cooling system.
  
<b>August 15, 2020, 1:00 PM EST:</b> Because of limited availability of the parts needed to repair the failed pump and cooling system, it is unlikely that systems can be restored before Monday afternoon at the earliest.
+
<b>Wed Sep 22 09:30 EDT 2021 </b>: File system issues, resolved.
  
<b>August 15, 2020, 00:04 EST:</b> A primary pump seal in the cooling infrastructure has blown, and parts availability cannot be determined until tomorrow. All systems are shut down as there is no cooling. If parts are available, systems may be back late tomorrow at the earliest. Check here for updates.
+
<b>Wed Sep 22 07:30 EDT 2021 </b>: File system issues, investigating.
  
<b>August 14, 2020, 21:04 EST:</b> Tomorrow's /scratch purge has been postponed.
+
<b>Sun Sep 19 10:00 EDT 2021</b>: Power glitch interrupted all compute jobs; please resubmit any jobs you had running.
  
<b>August 14, 2020, 21:00 EST:</b> Staff are at the datacenter. It looks like one of the pumps has a seal that is leaking badly.
+
<b>Wed Sep 15 17:35 EDT 2021</b>: filesystem issues resolved
  
<b>August 14, 2020, 20:37 EST:</b> We seem to be undergoing a thermal shutdown at the datacenter.
+
<b>Wed Sep 15 16:39 EDT 2021</b>: filesystem issues
 +
 
 +
<b>Mon Sep 13 13:15:07 EDT 2021</b> HPSS is back online.
 +
 
 +
<b>Fri Sep 10 17:57:23 EDT 2021</b> HPSS is offline due to unscheduled maintenance.
 +
 
 +
<b>Wed Aug 18 16:13:42 EDT 2021</b> The HPSS upgrade is complete.
 +
 
 +
<b>HPSS Downtime August 17th and 18th, 2021 (Tuesday and Wednesday):</b> We'll be upgrading the HPSS software to version 8.3, along with all the clients (htar/hsi, vfs and Globus/dsi).
  
<b>August 14, 2020, 20:20 EST:</b> Network problems to niagara/mist. We are investigating.
 
 
 
<!--  When removing system status entries, please archive them to: https://docs.scinet.utoronto.ca/index.php/Previous_messages -->
 
<!--  When removing system status entries, please archive them to: https://docs.scinet.utoronto.ca/index.php/Previous_messages -->
 
{|style="border-spacing: 10px;width: 100%"
 
{|style="border-spacing: 10px;width: 100%"
Line 52: Line 59:
 
* [[Niagara Quickstart]]
 
* [[Niagara Quickstart]]
 
* [[HPSS | HPSS archival storage]]
 
* [[HPSS | HPSS archival storage]]
* [[SOSCIP_GPU | SOSCIP GPU cluster]]
 
 
* [[Mist| Mist Power 9 GPU cluster]]
 
* [[Mist| Mist Power 9 GPU cluster]]
 
* [[Teach|Teach cluster]]
 
* [[Teach|Teach cluster]]
Line 60: Line 66:
  
 
== Tutorials, Manuals, etc. ==
 
== Tutorials, Manuals, etc. ==
* [https://support.scinet.utoronto.ca/education/browse.php SciNet education material]
+
* [https://education.scinet.utoronto.ca SciNet education material]
 
* [https://www.youtube.com/c/SciNetHPCattheUniversityofToronto SciNet's YouTube channel]
 
* [https://www.youtube.com/c/SciNetHPCattheUniversityofToronto SciNet's YouTube channel]
 
* [[Modules specific to Niagara|Software Modules specific to Niagara]]  
 
* [[Modules specific to Niagara|Software Modules specific to Niagara]]  
 +
* [[Modules for Mist]]
 
* [[Commercial software]]
 
* [[Commercial software]]
 
* [[Burst Buffer]]
 
* [[Burst Buffer]]

Latest revision as of 00:59, 24 September 2021

System Status

Niagara: Up | Mist: Up | Teach: Up | Rouge: Up
Jupyter Hub: Up | Scheduler: Up | File system: Up | Burst Buffer: Up
HPSS: Down | Login Nodes: Up | External Network: Up | Globus (-hpss): Up

Thu Sep 23 17:23 EDT 2021 Systems are being brought back online. HPSS may be down for some more days.

Thu Sep 23 12:30 EDT 2021 Cooling restored. Systems should be available later this afternoon.

Thu Sep 23 9:30 EDT 2021 Technicians are on site working on the cooling system.

Thu Sep 23 3:30 EDT 2021 Cooling system issues still unresolved.

Wed Sep 22 23:27:48 EDT 2021 Shutdown of the datacenter due to a problem with the cooling system.

Wed Sep 22 09:30 EDT 2021 : File system issues, resolved.

Wed Sep 22 07:30 EDT 2021 : File system issues, investigating.

Sun Sep 19 10:00 EDT 2021: Power glitch interrupted all compute jobs; please resubmit any jobs you had running.

Wed Sep 15 17:35 EDT 2021: filesystem issues resolved

Wed Sep 15 16:39 EDT 2021: filesystem issues

Mon Sep 13 13:15:07 EDT 2021 HPSS is back online.

Fri Sep 10 17:57:23 EDT 2021 HPSS is offline due to unscheduled maintenance.

Wed Aug 18 16:13:42 EDT 2021 The HPSS upgrade is complete.

HPSS Downtime August 17th and 18th, 2021 (Tuesday and Wednesday): We'll be upgrading the HPSS software to version 8.3, along with all the clients (htar/hsi, vfs and Globus/dsi).

QuickStart Guides

Tutorials, Manuals, etc.