<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://docs.scinet.utoronto.ca/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Dgruner</id>
	<title>SciNet Users Documentation - User contributions [en]</title>
	<link rel="self" type="application/atom+xml" href="https://docs.scinet.utoronto.ca/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Dgruner"/>
	<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php/Special:Contributions/Dgruner"/>
	<updated>2026-04-15T06:34:43Z</updated>
	<subtitle>User contributions</subtitle>
	<generator>MediaWiki 1.35.12</generator>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=7460</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=7460"/>
		<updated>2025-12-31T17:40:05Z</updated>

		<summary type="html">&lt;p&gt;Dgruner: /* System Status */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
{| style=&amp;quot;border-spacing:10px; width: 95%&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:1em; padding-top:.1em; border:2px solid #0645ad; background-color:#f6f6f6; border-radius:7px&amp;quot;|&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Use &amp;quot;Up&amp;quot;, &amp;quot;Partial&amp;quot; or &amp;quot;Down&amp;quot;; these are templates. --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;width:100%&amp;quot; &lt;br /&gt;
|{{Up3 | Trillium|https://docs.alliancecan.ca/wiki/Trillium_Quickstart}}&lt;br /&gt;
|{{Up | OnDemand|Open_OnDemand_Quickstart}}&lt;br /&gt;
|{{Up | Globus |Globus}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up | HPSS|HPSS}}&lt;br /&gt;
|{{Up | Balam|Balam}}&lt;br /&gt;
|{{Up | S4H | S4H}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up | Teach|Teach}}&lt;br /&gt;
|{{Up3 | File system|https://docs.alliancecan.ca/wiki/Trillium_Quickstart#Storage}}&lt;br /&gt;
|{{Up3 | External Network|https://docs.alliancecan.ca/wiki/Trillium_Quickstart#Logging_in}} &lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
'''Wed Dec 31, 2025, 12:40 pm:''' We believe the problem has now been resolved.  Please let us know if you still experience login problems or aborted jobs.&lt;br /&gt;
&lt;br /&gt;
'''Tue Dec 30, 2025, 2:10 pm:''' We are experiencing problems with authentication, resulting in failed logins, OOD errors, and aborted jobs (with &amp;quot;prolog error&amp;quot;).  Please bear with us, as we are very short-staffed during the holiday break.  We will post updates here.&lt;br /&gt;
&lt;br /&gt;
'''Tue Dec 3, 2025, 11:30 am:''' Open OnDemand is fully operational again.&lt;br /&gt;
&lt;br /&gt;
'''Sat Nov 29, 2025, 12:40 am:''' There has been a problem with the water chiller. Some systems are offline.&lt;br /&gt;
&lt;br /&gt;
'''Wed Nov 5, 2025, 12:55 pm:''' Balam is back online.&lt;br /&gt;
&lt;br /&gt;
'''Wed Nov 5, 2025, 10:00 am:''' Open OnDemand is back online.&lt;br /&gt;
&lt;br /&gt;
'''Tue Nov 4, 2025, 11:00 pm:''' Most of the work is done, data movers, Globus, and HPSS are back online. Remaining services will be worked on tomorrow.&lt;br /&gt;
&lt;br /&gt;
'''Tue Nov 4, 2025, 8:30 am:''' Scheduled network maintenance. Trillium cluster is *not* affected.&lt;br /&gt;
&lt;br /&gt;
'''Tue Oct 21, 2025, 5:30 pm:''' Balam maintenance finished.&lt;br /&gt;
&lt;br /&gt;
'''Tue Oct 21, 2025, 7:00 am:''' Balam maintenance day.&lt;br /&gt;
&lt;br /&gt;
'''Thu Oct 15, 2025, 3:55 pm:''' Trillium inbound connections through trillium.alliancecan.ca or trillium.scinet.utoronto.ca are working again.&lt;br /&gt;
&lt;br /&gt;
'''Thu Oct 15, 2025, 3:05 pm:''' Trillium is experiencing external network issues affecting incoming connections. Please try: ssh USERNAME@tri-login01.scinet.utoronto.ca in the meantime.&lt;br /&gt;
 &lt;br /&gt;
'''Thu Oct 06, 2025, 8:00 pm:''' HPSS is fully functional. You may submit archive jobs from the Trillium login nodes, datamovers, and robots.&lt;br /&gt;
&lt;br /&gt;
'''Thu Oct 03, 2025, 6:30 pm:''' HPSS is back online, and already accessible via the alliancecan#hpss Globus endpoint. The directory tree now follows that of the other Alliance clusters. We are still working on job submission via Slurm.&lt;br /&gt;
&lt;br /&gt;
'''Thu Oct 01, 2025, 12:00 am:''' Niagara compute nodes are now unavailable for regular users. The login nodes will remain available for a while to allow a few last data transfers, although transfers from the Niagara file systems to Trillium are best done on nia-dm1.scinet.utoronto.ca.&lt;br /&gt;
&lt;br /&gt;
'''Thu Oct 01, 2025, 9:30 am:''' HPSS is down for scheduled maintenance, including the alliancecan#hpss Globus endpoint.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  When removing system status entries, please archive them to: --&amp;gt;&lt;br /&gt;
[[Previous messages]]&lt;br /&gt;
&lt;br /&gt;
{|style=&amp;quot;border-spacing: 10px;width: 100%&amp;quot;&lt;br /&gt;
|valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== QuickStart Guides ==&lt;br /&gt;
* [https://docs.alliancecan.ca/wiki/Trillium_Quickstart Trillium Quickstart]&lt;br /&gt;
* [[Niagara Quickstart]]&lt;br /&gt;
* [[HPSS | HPSS archival storage]]&lt;br /&gt;
* [[Teach|Teach cluster]]&lt;br /&gt;
* [[FAQ | FAQ (frequently asked questions)]]&lt;br /&gt;
* [[Acknowledging SciNet]]&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== Tutorials, Manuals, etc. ==&lt;br /&gt;
* [https://education.scinet.utoronto.ca SciNet education material]&lt;br /&gt;
* [https://www.youtube.com/c/SciNetHPCattheUniversityofToronto SciNet's YouTube channel]&lt;br /&gt;
* [[Modules specific to Niagara|Software Modules specific to Niagara]] &lt;br /&gt;
* [[Modules for Mist]] &lt;br /&gt;
* [[Commercial software]]&lt;br /&gt;
* [[Burst Buffer]]&lt;br /&gt;
* [[SSH#SSH Keys|SSH keys]]&lt;br /&gt;
* [[SSH Tunneling]]&lt;br /&gt;
* [[Visualization]]&lt;br /&gt;
* [[Running Serial Jobs on Niagara]]&lt;br /&gt;
* [[Jupyter Hub]]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Dgruner</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=7457</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=7457"/>
		<updated>2025-12-30T19:11:36Z</updated>

		<summary type="html">&lt;p&gt;Dgruner: /* System Status */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
{| style=&amp;quot;border-spacing:10px; width: 95%&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:1em; padding-top:.1em; border:2px solid #0645ad; background-color:#f6f6f6; border-radius:7px&amp;quot;|&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Use &amp;quot;Up&amp;quot;, &amp;quot;Partial&amp;quot; or &amp;quot;Down&amp;quot;; these are templates. --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;width:100%&amp;quot; &lt;br /&gt;
|{{Up3 | Trillium|https://docs.alliancecan.ca/wiki/Trillium_Quickstart}}&lt;br /&gt;
|{{Up | OnDemand|Open_OnDemand_Quickstart}}&lt;br /&gt;
|{{Up | Globus |Globus}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up | HPSS|HPSS}}&lt;br /&gt;
|{{Up | Balam|Balam}}&lt;br /&gt;
|{{Up | S4H | S4H}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up | Teach|Teach}}&lt;br /&gt;
|{{Up3 | File system|https://docs.alliancecan.ca/wiki/Trillium_Quickstart#Storage}}&lt;br /&gt;
|{{Up3 | External Network|https://docs.alliancecan.ca/wiki/Trillium_Quickstart#Logging_in}} &lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
'''Tue Dec 30, 2025, 2:10 pm:''' We are experiencing problems with authentication, resulting in failed logins, OOD errors, and aborted jobs (with &amp;quot;prolog error&amp;quot;).  Please bear with us, as we are very short-staffed during the holiday break.  We will post updates here.&lt;br /&gt;
&lt;br /&gt;
'''Tue Dec 3, 2025, 11:30 am:''' Open OnDemand is fully operational again.&lt;br /&gt;
&lt;br /&gt;
'''Sat Nov 29, 2025, 12:40 am:''' There has been a problem with the water chiller. Some systems are offline.&lt;br /&gt;
&lt;br /&gt;
'''Wed Nov 5, 2025, 12:55 pm:''' Balam is back online.&lt;br /&gt;
&lt;br /&gt;
'''Wed Nov 5, 2025, 10:00 am:''' Open OnDemand is back online.&lt;br /&gt;
&lt;br /&gt;
'''Tue Nov 4, 2025, 11:00 pm:''' Most of the work is done, data movers, Globus, and HPSS are back online. Remaining services will be worked on tomorrow.&lt;br /&gt;
&lt;br /&gt;
'''Tue Nov 4, 2025, 8:30 am:''' Scheduled network maintenance. Trillium cluster is *not* affected.&lt;br /&gt;
&lt;br /&gt;
'''Tue Oct 21, 2025, 5:30 pm:''' Balam maintenance finished.&lt;br /&gt;
&lt;br /&gt;
'''Tue Oct 21, 2025, 7:00 am:''' Balam maintenance day.&lt;br /&gt;
&lt;br /&gt;
'''Thu Oct 15, 2025, 3:55 pm:''' Trillium inbound connections through trillium.alliancecan.ca or trillium.scinet.utoronto.ca are working again.&lt;br /&gt;
&lt;br /&gt;
'''Thu Oct 15, 2025, 3:05 pm:''' Trillium is experiencing external network issues affecting incoming connections. Please try: ssh USERNAME@tri-login01.scinet.utoronto.ca in the meantime.&lt;br /&gt;
 &lt;br /&gt;
'''Thu Oct 06, 2025, 8:00 pm:''' HPSS is fully functional. You may submit archive jobs from the Trillium login nodes, datamovers, and robots.&lt;br /&gt;
&lt;br /&gt;
'''Thu Oct 03, 2025, 6:30 pm:''' HPSS is back online, and already accessible via the alliancecan#hpss Globus endpoint. The directory tree now follows that of the other Alliance clusters. We are still working on job submission via Slurm.&lt;br /&gt;
&lt;br /&gt;
'''Thu Oct 01, 2025, 12:00 am:''' Niagara compute nodes are now unavailable for regular users. The login nodes will remain available for a while to allow a few last data transfers, although transfers from the Niagara file systems to Trillium are best done on nia-dm1.scinet.utoronto.ca.&lt;br /&gt;
&lt;br /&gt;
'''Thu Oct 01, 2025, 9:30 am:''' HPSS is down for scheduled maintenance, including the alliancecan#hpss Globus endpoint.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  When removing system status entries, please archive them to: --&amp;gt;&lt;br /&gt;
[[Previous messages]]&lt;br /&gt;
&lt;br /&gt;
{|style=&amp;quot;border-spacing: 10px;width: 100%&amp;quot;&lt;br /&gt;
|valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== QuickStart Guides ==&lt;br /&gt;
* [https://docs.alliancecan.ca/wiki/Trillium_Quickstart Trillium Quickstart]&lt;br /&gt;
* [[Niagara Quickstart]]&lt;br /&gt;
* [[HPSS | HPSS archival storage]]&lt;br /&gt;
* [[Teach|Teach cluster]]&lt;br /&gt;
* [[FAQ | FAQ (frequently asked questions)]]&lt;br /&gt;
* [[Acknowledging SciNet]]&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== Tutorials, Manuals, etc. ==&lt;br /&gt;
* [https://education.scinet.utoronto.ca SciNet education material]&lt;br /&gt;
* [https://www.youtube.com/c/SciNetHPCattheUniversityofToronto SciNet's YouTube channel]&lt;br /&gt;
* [[Modules specific to Niagara|Software Modules specific to Niagara]] &lt;br /&gt;
* [[Modules for Mist]] &lt;br /&gt;
* [[Commercial software]]&lt;br /&gt;
* [[Burst Buffer]]&lt;br /&gt;
* [[SSH#SSH Keys|SSH keys]]&lt;br /&gt;
* [[SSH Tunneling]]&lt;br /&gt;
* [[Visualization]]&lt;br /&gt;
* [[Running Serial Jobs on Niagara]]&lt;br /&gt;
* [[Jupyter Hub]]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Dgruner</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=S4H&amp;diff=7346</id>
		<title>S4H</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=S4H&amp;diff=7346"/>
		<updated>2025-12-02T21:43:03Z</updated>

		<summary type="html">&lt;p&gt;Dgruner: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Introduction =&lt;br /&gt;
S4H (formerly SciNet4Health) is our secure computing environment pilot, providing users with the ability to run [https://docs.alliancecan.ca/wiki/Trillium_Quickstart Trillium] jobs on confidential data. This subsystem consists of a dedicated login node and a storage appliance, but it is highly integrated with Trillium. Security concerns are addressed by&lt;br /&gt;
&lt;br /&gt;
* Hardened access&lt;br /&gt;
* Encryption at rest&lt;br /&gt;
* Group isolation&lt;br /&gt;
* Data egress control (optional)&lt;br /&gt;
&lt;br /&gt;
Usage of S4H is by request only. Access must be requested by a principal investigator (PI) on behalf of their group members (i.e. sponsored users on CCDB).&lt;br /&gt;
&lt;br /&gt;
= Policies =&lt;br /&gt;
Each user is assigned one of three policies:&lt;br /&gt;
&lt;br /&gt;
* '''Permissive:''' the user may connect to the login node using SSH from pre-approved source IP addresses, and has unrestricted internet access from the login node&lt;br /&gt;
* '''Restrictive:''' the user may connect to the login node using SSH from pre-approved source IP addresses, but internet access from the login node is restricted&lt;br /&gt;
* '''Prohibitive:''' the user may only connect to the login node using a remote desktop client program from pre-approved source IP addresses, and internet access from the login node is restricted&lt;br /&gt;
&lt;br /&gt;
If you don't know which policy applies to you, ask your PI.&lt;br /&gt;
&lt;br /&gt;
= Login =&lt;br /&gt;
== Direct ==&lt;br /&gt;
Users with permission to connect directly to the login node (permissive and restrictive policies) should first make sure that they can log in to Trillium (i.e. they have uploaded an SSH public key to CCDB and set up second-factor authentication there). If access to Trillium is successful, use the same username and SSH key to log in to the following address:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
s4h.scinet.utoronto.ca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You should be prompted for the second factor, as on Trillium.&lt;br /&gt;
&lt;br /&gt;
The connection must be made from one of the '''IP addresses pre-approved by the PI''' for that user (e.g. a workstation or a jump host in your lab).&lt;br /&gt;
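&lt;br /&gt;
For example, a minimal connection from a pre-approved machine could look like the following sketch (the username &amp;quot;alice&amp;quot; and the key file &amp;lt;code&amp;gt;~/.ssh/ccdb_ed25519&amp;lt;/code&amp;gt; are placeholders, matching the example in the next section):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# use the same username and SSH key that are registered on CCDB for Trillium&lt;br /&gt;
ssh -i ~/.ssh/ccdb_ed25519 alice@s4h.scinet.utoronto.ca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;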
&lt;br /&gt;
== Through the graphical gateway ==&lt;br /&gt;
Users with permission to connect through the graphical gateway should use an RDP-enabled remote desktop client and log in to the following address using their CCDB username and password:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
s4h-ggw.scinet.utoronto.ca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
It is recommended to set the display resolution to 1600×900 if the resolution is not picked up automatically by the program.&lt;br /&gt;
&lt;br /&gt;
Additionally, a &amp;quot;pre-login&amp;quot; step has to be performed. In this step, an SSH ''agent'' must be forwarded to the above address. The graphical gateway asks the user's SSH agent program to perform the authentication (i.e. the workstation the user is connecting from holds the private key, and the key has been added to the agent). In a sense this is three-factor authentication: one needs the password, the SSH private key, and either a YubiKey or the Duo mobile app registered with CCDB. Here is an example of this process:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
eval $(ssh-agent)&lt;br /&gt;
ssh-add /home/alice/.ssh/ccdb_ed25519&lt;br /&gt;
ssh -T -A alice@s4h-ggw.scinet.utoronto.ca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note that no shell access is expected after the &amp;lt;code&amp;gt;ssh&amp;lt;/code&amp;gt; command, but the window in the remote desktop program should now prompt for the YubiKey passcode or a Duo mobile app push. Once that is done as well, the SSH client will print ''Login successful'' and quit.&lt;br /&gt;
&lt;br /&gt;
The connection must be made from one of the '''IP addresses pre-approved by the PI''' for that user (e.g. a workstation or a jump host in your lab).&lt;br /&gt;
&lt;br /&gt;
= Storage =&lt;br /&gt;
== Directories ==&lt;br /&gt;
Trillium file systems are accessible via their usual paths but are read-only on S4H (to prevent accidentally saving sensitive data there). Instead, home, scratch, and project spaces are provided on alternative paths under &amp;lt;code&amp;gt;/s4h&amp;lt;/code&amp;gt; (indicating the encrypted storage appliance). If the user &amp;quot;alice&amp;quot; belongs to the group &amp;quot;def-bob&amp;quot; on S4H, their home directory (which can be expanded from &amp;lt;code&amp;gt;~&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;$HOME&amp;lt;/code&amp;gt;) will be located in &amp;lt;code&amp;gt;/s4h/def-bob/home/alice&amp;lt;/code&amp;gt;, and similarly their scratch directory (which can be expanded from &amp;lt;code&amp;gt;$SCRATCH&amp;lt;/code&amp;gt;) will be in &amp;lt;code&amp;gt;/s4h/def-bob/scratch/alice&amp;lt;/code&amp;gt;. The project directory is &amp;lt;code&amp;gt;/s4h/def-bob/project&amp;lt;/code&amp;gt;; users may create their own directories there as needed.&lt;br /&gt;
&lt;br /&gt;
The environment variables &amp;lt;code&amp;gt;$TRIHOME&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;$TRISCRATCH&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;$TRIPROJECT&amp;lt;/code&amp;gt; expand to the corresponding file system paths for Trillium (as noted above, these are read-only on S4H).&lt;br /&gt;
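&lt;br /&gt;
For example, for the user &amp;quot;alice&amp;quot; in group &amp;quot;def-bob&amp;quot; above, one would expect something like the following on the S4H login node (a sketch, not literal output):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
echo $HOME        # /s4h/def-bob/home/alice&lt;br /&gt;
echo $SCRATCH     # /s4h/def-bob/scratch/alice&lt;br /&gt;
echo $TRISCRATCH  # the user's (read-only) scratch directory on Trillium&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;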
&lt;br /&gt;
== Data transfer ==&lt;br /&gt;
Users under the permissive and restrictive policies should use an SSH-based program (such as &amp;lt;code&amp;gt;scp&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;rsync&amp;lt;/code&amp;gt;) to transfer data directly in and out of the S4H login node. There is no dedicated datamover for S4H.&lt;br /&gt;
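&lt;br /&gt;
For example, a transfer from a pre-approved workstation into the user's S4H scratch directory could look like the following sketch (the local directory &amp;lt;code&amp;gt;mydata/&amp;lt;/code&amp;gt; and the username &amp;quot;alice&amp;quot; are placeholders):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# copy a local directory into the encrypted S4H scratch space&lt;br /&gt;
rsync -av mydata/ alice@s4h.scinet.utoronto.ca:/s4h/def-bob/scratch/alice/mydata/&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;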
&lt;br /&gt;
Under the prohibitive policy, users may not transfer sensitive data in or out of S4H. They may only upload non-sensitive data to Trillium (whose storage is not encrypted at rest), where it can then be accessed from S4H. For egress purposes, the PI should designate at least one user (who could be themselves) that is not under the prohibitive policy. Other users in the group can then share files for egress with the designated user (e.g. by putting them in the group's project directory).&lt;br /&gt;
&lt;br /&gt;
== Data policies ==&lt;br /&gt;
It is important to understand that:&lt;br /&gt;
* Within a group, file access is managed by traditional POSIX permissions and access-control lists, as on Trillium. If users in a group are working on separate sub-projects that should not have mutual access, it is their responsibility to make sure that permissions are set up correctly.&lt;br /&gt;
* There is no way to facilitate cross-group file sharing of sensitive files on S4H; each group has a different encryption key and the system is set up so that a compute node can only use one key at a time.&lt;br /&gt;
* &amp;lt;span style=&amp;quot;color: red&amp;quot;&amp;gt;No backup is provided for encrypted storage;&amp;lt;/span&amp;gt; deletion is irreversible. This ensures that data are securely disposed of, in compliance with a provision found in many data sharing agreements.&lt;br /&gt;
&lt;br /&gt;
= Software =&lt;br /&gt;
Same as Trillium.&lt;br /&gt;
&lt;br /&gt;
Note that, on S4H, you may use software (including, for example, Python virtual environments) that you installed in your Trillium file systems (but not vice versa). This can be useful for users under the restrictive or prohibitive policy, who may otherwise have difficulty installing software in their encrypted storage spaces.&lt;br /&gt;
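&lt;br /&gt;
For example, a Python virtual environment previously created in the Trillium home directory could be used from S4H as in the following sketch (the environment path &amp;lt;code&amp;gt;$TRIHOME/myenv&amp;lt;/code&amp;gt; and the script &amp;lt;code&amp;gt;myscript.py&amp;lt;/code&amp;gt; are placeholders):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# the Trillium file systems are readable (but not writable) on S4H&lt;br /&gt;
source $TRIHOME/myenv/bin/activate&lt;br /&gt;
python myscript.py&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;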
&lt;br /&gt;
= Submitting jobs =&lt;br /&gt;
This is largely the same as on Trillium. &amp;lt;span style=&amp;quot;color: red&amp;quot;&amp;gt;Note, however, that job metadata are not kept confidential!&amp;lt;/span&amp;gt; In particular, the submitting user, work directory, job name, comment, and command should be ''considered public information'', and users must not include any sensitive information in them.&lt;/div&gt;</summary>
		<author><name>Dgruner</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=6689</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=6689"/>
		<updated>2025-07-05T02:06:46Z</updated>

		<summary type="html">&lt;p&gt;Dgruner: /* System Status */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
{| style=&amp;quot;border-spacing:10px; width: 95%&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:1em; padding-top:.1em; border:2px solid #0645ad; background-color:#f6f6f6; border-radius:7px&amp;quot;|&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Use &amp;quot;Up&amp;quot;, &amp;quot;Partial&amp;quot; or &amp;quot;Down&amp;quot;; these are templates. --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;width:100%&amp;quot; &lt;br /&gt;
|{{Partial   | Niagara|Niagara_Quickstart}}&lt;br /&gt;
|{{Partial   | Mist|Mist}}&lt;br /&gt;
|{{Up  | Teach|Teach}}&lt;br /&gt;
|{{Up   | Rouge|Rouge}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up   | OnDemand|Open_OnDemand_Quickstart}}&lt;br /&gt;
|{{Up   | Scheduler|Niagara_Quickstart#Submitting_jobs}}&lt;br /&gt;
|{{Up   | File system|Niagara_Quickstart#Storage_and_quotas}}&lt;br /&gt;
|{{Up   | Burst Buffer|Burst_Buffer}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up   | HPSS|HPSS}}&lt;br /&gt;
|{{Up   | Login Nodes|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up   | External Network|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up   | Globus |Globus}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up   | Balam|Balam}}&lt;br /&gt;
|{{Up   | Cvmfs|Using_modules}}&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
'''July 9, 2025:''' The [[Teach]] cluster will be unavailable for the day for network maintenance.&lt;br /&gt;
&lt;br /&gt;
'''July 4, 2025:''' Open OnDemand is back up.&lt;br /&gt;
&lt;br /&gt;
'''July 4, 2025:''' Open OnDemand is down. We are investigating.&lt;br /&gt;
&lt;br /&gt;
'''June 25, 2025, 7:15 PM EDT:''' The [[Teach]] cluster's scheduler is up again.&lt;br /&gt;
&lt;br /&gt;
'''June 25, 2025, 4:30 PM EDT:''' The [[Teach]] cluster's scheduler is down. We are investigating.&lt;br /&gt;
&lt;br /&gt;
'''April 30, 2025, 9:30 AM EDT:''' The [[Teach]] cluster is available again.&lt;br /&gt;
&lt;br /&gt;
'''April 30, 2025:''' The [[Teach]] cluster will be unavailable from 8:00 am to about 12:00 noon for file system maintenance.&lt;br /&gt;
&lt;br /&gt;
'''April 1, 2025:''' The Jupyter Hub has been replaced by SciNet's [[Open OnDemand Quickstart|Open OnDemand service]].&lt;br /&gt;
&lt;br /&gt;
'''March 1, 2025:''' As of March 1st, scratch purging is suspended until after Trillium comes online.&lt;br /&gt;
&lt;br /&gt;
'''January 6, 2025:''' As part of the installation of the new computing cluster Trillium, there is now a permanent reduction in computing capacity of Niagara to 50% and of Mist to 35%.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  When removing system status entries, please archive them to: --&amp;gt;&lt;br /&gt;
[[Previous messages]]&lt;br /&gt;
&lt;br /&gt;
{|style=&amp;quot;border-spacing: 10px;width: 100%&amp;quot;&lt;br /&gt;
|valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== QuickStart Guides ==&lt;br /&gt;
* [[Niagara Quickstart]]&lt;br /&gt;
* [[HPSS | HPSS archival storage]]&lt;br /&gt;
* [[Mist| Mist Power 9 GPU cluster]]&lt;br /&gt;
* [[Teach|Teach cluster]]&lt;br /&gt;
* [[FAQ | FAQ (frequently asked questions)]]&lt;br /&gt;
* [[Acknowledging SciNet]]&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== Tutorials, Manuals, etc. ==&lt;br /&gt;
* [https://education.scinet.utoronto.ca SciNet education material]&lt;br /&gt;
* [https://www.youtube.com/c/SciNetHPCattheUniversityofToronto SciNet's YouTube channel]&lt;br /&gt;
* [[Modules specific to Niagara|Software Modules specific to Niagara]] &lt;br /&gt;
* [[Modules for Mist]] &lt;br /&gt;
* [[Commercial software]]&lt;br /&gt;
* [[Burst Buffer]]&lt;br /&gt;
* [[SSH#SSH Keys|SSH keys]]&lt;br /&gt;
* [[SSH Tunneling]]&lt;br /&gt;
* [[Visualization]]&lt;br /&gt;
* [[Running Serial Jobs on Niagara]]&lt;br /&gt;
* [[Jupyter Hub]]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Dgruner</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=6686</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=6686"/>
		<updated>2025-07-05T02:05:58Z</updated>

		<summary type="html">&lt;p&gt;Dgruner: /* System Status */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
{| style=&amp;quot;border-spacing:10px; width: 95%&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:1em; padding-top:.1em; border:2px solid #0645ad; background-color:#f6f6f6; border-radius:7px&amp;quot;|&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Use &amp;quot;Up&amp;quot;, &amp;quot;Partial&amp;quot; or &amp;quot;Down&amp;quot;; these are templates. --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;width:100%&amp;quot; &lt;br /&gt;
|{{Partial   | Niagara|Niagara_Quickstart}}&lt;br /&gt;
|{{Partial   | Mist|Mist}}&lt;br /&gt;
|{{Up  | Teach|Teach}}&lt;br /&gt;
|{{Up   | Rouge|Rouge}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up   | OnDemand|Open_OnDemand_Quickstart}}&lt;br /&gt;
|{{Up   | Scheduler|Niagara_Quickstart#Submitting_jobs}}&lt;br /&gt;
|{{Up   | File system|Niagara_Quickstart#Storage_and_quotas}}&lt;br /&gt;
|{{Up   | Burst Buffer|Burst_Buffer}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up   | HPSS|HPSS}}&lt;br /&gt;
|{{Up   | Login Nodes|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up   | External Network|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up   | Globus |Globus}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up   | Balam|Balam}}&lt;br /&gt;
|{{Up   | Cvmfs|Using_modules}}&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
'''July 9, 2025:''' The [[Teach]] cluster will be unavailable for the day for network maintenance.&lt;br /&gt;
&lt;br /&gt;
'''July 4, 2025:''' Open OnDemand is down. We are investigating.&lt;br /&gt;
&lt;br /&gt;
'''June 25, 2025, 7:15 PM EDT:''' The [[Teach]] cluster's scheduler is up again.&lt;br /&gt;
&lt;br /&gt;
'''June 25, 2025, 4:30 PM EDT:''' The [[Teach]] cluster's scheduler is down. We are investigating.&lt;br /&gt;
&lt;br /&gt;
'''April 30, 2025, 9:30 AM EDT:''' The [[Teach]] cluster is available again.&lt;br /&gt;
&lt;br /&gt;
'''April 30, 2025:''' The [[Teach]] cluster will be unavailable from 8:00 am to about 12:00 noon for file system maintenance.&lt;br /&gt;
&lt;br /&gt;
'''April 1, 2025:''' The Jupyter Hub has been replaced by SciNet's [[Open OnDemand Quickstart|Open OnDemand service]].&lt;br /&gt;
&lt;br /&gt;
'''March 1, 2025:''' As of March 1st, scratch purging is suspended until after Trillium comes online.&lt;br /&gt;
&lt;br /&gt;
'''January 6, 2025:''' As part of the installation of the new computing cluster Trillium, there is now a permanent reduction in computing capacity of Niagara to 50% and of Mist to 35%.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  When removing system status entries, please archive them to: --&amp;gt;&lt;br /&gt;
[[Previous messages]]&lt;br /&gt;
&lt;br /&gt;
{|style=&amp;quot;border-spacing: 10px;width: 100%&amp;quot;&lt;br /&gt;
|valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== QuickStart Guides ==&lt;br /&gt;
* [[Niagara Quickstart]]&lt;br /&gt;
* [[HPSS | HPSS archival storage]]&lt;br /&gt;
* [[Mist| Mist Power 9 GPU cluster]]&lt;br /&gt;
* [[Teach|Teach cluster]]&lt;br /&gt;
* [[FAQ | FAQ (frequently asked questions)]]&lt;br /&gt;
* [[Acknowledging SciNet]]&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== Tutorials, Manuals, etc. ==&lt;br /&gt;
* [https://education.scinet.utoronto.ca SciNet education material]&lt;br /&gt;
* [https://www.youtube.com/c/SciNetHPCattheUniversityofToronto SciNet's YouTube channel]&lt;br /&gt;
* [[Modules specific to Niagara|Software Modules specific to Niagara]] &lt;br /&gt;
* [[Modules for Mist]] &lt;br /&gt;
* [[Commercial software]]&lt;br /&gt;
* [[Burst Buffer]]&lt;br /&gt;
* [[SSH#SSH Keys|SSH keys]]&lt;br /&gt;
* [[SSH Tunneling]]&lt;br /&gt;
* [[Visualization]]&lt;br /&gt;
* [[Running Serial Jobs on Niagara]]&lt;br /&gt;
* [[Jupyter Hub]]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Dgruner</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=5909</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=5909"/>
		<updated>2024-10-21T21:11:45Z</updated>

		<summary type="html">&lt;p&gt;Dgruner: /* System Status */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
{| style=&amp;quot;border-spacing:10px; width: 95%&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:1em; padding-top:.1em; border:2px solid #0645ad; background-color:#f6f6f6; border-radius:7px&amp;quot;|&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Use &amp;quot;Up&amp;quot;, &amp;quot;Partial&amp;quot; or &amp;quot;Down&amp;quot;; these are templates. --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;width:100%&amp;quot; &lt;br /&gt;
|{{Down |Niagara|Niagara_Quickstart}}&lt;br /&gt;
|{{Down |Mist|Mist}}&lt;br /&gt;
|{{Down |Teach|Teach}}&lt;br /&gt;
|{{Down |Rouge|Rouge}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up |Jupyter Hub|Jupyter_Hub}}&lt;br /&gt;
|{{Up |Scheduler|Niagara_Quickstart#Submitting_jobs}}&lt;br /&gt;
|{{Up |File system|Niagara_Quickstart#Storage_and_quotas}}&lt;br /&gt;
|{{Up |Burst Buffer|Burst_Buffer}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up |HPSS|HPSS}}&lt;br /&gt;
|{{Up |Login Nodes|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up |External Network|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up |Globus |Globus}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Down |Balam|Balam}}&lt;br /&gt;
|{{Up |CCEnv|Using_modules}}&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
'''Mon Oct 21 17:15 EDT 2024''': Compute nodes will remain down until we can replace the main cooling pump.  This may take several days.  Please see this page for updates.&lt;br /&gt;
&lt;br /&gt;
'''Mon Oct 21 12:15 EDT 2024''': Compute nodes have been shut down due to a cooling system failure.&lt;br /&gt;
&lt;br /&gt;
'''Fri Oct 18 21:40 EDT 2024''': Systems are back to normal.&lt;br /&gt;
&lt;br /&gt;
'''Fri Oct 18 21:15 EDT 2024''': We are experiencing technical difficulties, apparently caused by a glitch in the file systems.&lt;br /&gt;
&lt;br /&gt;
'''Tue Oct 1 10:45 EDT 2024''': The Jupyter Hub service will be rebooted today at around 11:00 am EDT for system upgrades. &lt;br /&gt;
&lt;br /&gt;
'''Tue Sep 3 07:00 EDT 2024''': Intermittent file system issues may cause problems logging in.  We are in the process of resolving the issue.&lt;br /&gt;
&lt;br /&gt;
'''Sun Sep 1 00:01 - 04:00 EDT 2024''': Network maintenance may cause connection issues to the datacentre.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  When removing system status entries, please archive them to: --&amp;gt;&lt;br /&gt;
[[Previous messages]]&lt;br /&gt;
&lt;br /&gt;
{|style=&amp;quot;border-spacing: 10px;width: 100%&amp;quot;&lt;br /&gt;
|valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== QuickStart Guides ==&lt;br /&gt;
* [[Niagara Quickstart]]&lt;br /&gt;
* [[HPSS | HPSS archival storage]]&lt;br /&gt;
* [[Mist| Mist Power 9 GPU cluster]]&lt;br /&gt;
* [[Teach|Teach cluster]]&lt;br /&gt;
* [[FAQ | FAQ (frequently asked questions)]]&lt;br /&gt;
* [[Acknowledging SciNet]]&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== Tutorials, Manuals, etc. ==&lt;br /&gt;
* [https://education.scinet.utoronto.ca SciNet education material]&lt;br /&gt;
* [https://www.youtube.com/c/SciNetHPCattheUniversityofToronto SciNet's YouTube channel]&lt;br /&gt;
* [[Modules specific to Niagara|Software Modules specific to Niagara]] &lt;br /&gt;
* [[Modules for Mist]] &lt;br /&gt;
* [[Commercial software]]&lt;br /&gt;
* [[Burst Buffer]]&lt;br /&gt;
* [[SSH#SSH Keys|SSH keys]]&lt;br /&gt;
* [[SSH Tunneling]]&lt;br /&gt;
* [[Visualization]]&lt;br /&gt;
* [[Running Serial Jobs on Niagara]]&lt;br /&gt;
* [[Jupyter Hub]]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Dgruner</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=5777</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=5777"/>
		<updated>2024-08-02T02:01:20Z</updated>

		<summary type="html">&lt;p&gt;Dgruner: /* System Status */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
{| style=&amp;quot;border-spacing:10px; width: 95%&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:1em; padding-top:.1em; border:2px solid #0645ad; background-color:#f6f6f6; border-radius:7px&amp;quot;|&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Use &amp;quot;Up&amp;quot;, &amp;quot;Partial&amp;quot; or &amp;quot;Down&amp;quot;; these are templates. --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;width:100%&amp;quot; &lt;br /&gt;
|{{Up   |Niagara|Niagara_Quickstart}}&lt;br /&gt;
|{{Up|Mist|Mist}}&lt;br /&gt;
|{{Up|Teach|Teach}}&lt;br /&gt;
|{{Up   |Rouge|Rouge}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up |Jupyter Hub|Jupyter_Hub}}&lt;br /&gt;
|{{Up  |Scheduler|Niagara_Quickstart#Submitting_jobs}}&lt;br /&gt;
|{{Up |File system|Niagara_Quickstart#Storage_and_quotas}}&lt;br /&gt;
|{{Up   |Burst Buffer|Burst_Buffer}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up   |HPSS|HPSS}}&lt;br /&gt;
|{{Up  |Login Nodes|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up   |External Network|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up   |Globus |Globus}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up   |Balam|Balam}}&lt;br /&gt;
|{{Up |CCEnv|Using_modules}}&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
'''Thursday, August 1, 10:00 PM EDT''' Filesystem problems resolved.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
'''Thursday, August 1, 9:30 PM EDT''' Filesystem problems preventing logins to the systems.  Working on it.&lt;br /&gt;
&lt;br /&gt;
'''Monday, July 22, 11:50 AM EDT''' Systems are back to normal.&lt;br /&gt;
&lt;br /&gt;
'''Monday, July 22, 10:50 AM EDT''' The cooling problem has been fixed. Systems are coming up.&lt;br /&gt;
&lt;br /&gt;
'''Monday, July 22, 10:20 AM EDT''' Compute nodes have been shut down due to a cooling tower failure.&lt;br /&gt;
&lt;br /&gt;
'''Friday, July 19, 9:30 AM EDT''' CCEnv modules available on all login nodes again.&lt;br /&gt;
&lt;br /&gt;
'''Friday, July 19, 5:00 AM EDT''' Some login nodes do not have the CCEnv modules available.  We are working on a fix.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  When removing system status entries, please archive them to: --&amp;gt;&lt;br /&gt;
[[Previous messages]]&lt;br /&gt;
&lt;br /&gt;
{|style=&amp;quot;border-spacing: 10px;width: 100%&amp;quot;&lt;br /&gt;
|valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== QuickStart Guides ==&lt;br /&gt;
* [[Niagara Quickstart]]&lt;br /&gt;
* [[HPSS | HPSS archival storage]]&lt;br /&gt;
* [[Mist| Mist Power 9 GPU cluster]]&lt;br /&gt;
* [[Teach|Teach cluster]]&lt;br /&gt;
* [[FAQ | FAQ (frequently asked questions)]]&lt;br /&gt;
* [[Acknowledging SciNet]]&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== Tutorials, Manuals, etc. ==&lt;br /&gt;
* [https://education.scinet.utoronto.ca SciNet education material]&lt;br /&gt;
* [https://www.youtube.com/c/SciNetHPCattheUniversityofToronto SciNet's YouTube channel]&lt;br /&gt;
* [[Modules specific to Niagara|Software Modules specific to Niagara]] &lt;br /&gt;
* [[Modules for Mist]] &lt;br /&gt;
* [[Commercial software]]&lt;br /&gt;
* [[Burst Buffer]]&lt;br /&gt;
* [[SSH#SSH Keys|SSH keys]]&lt;br /&gt;
* [[SSH Tunneling]]&lt;br /&gt;
* [[Visualization]]&lt;br /&gt;
* [[Running Serial Jobs on Niagara]]&lt;br /&gt;
* [[Jupyter Hub]]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Dgruner</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=5774</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=5774"/>
		<updated>2024-08-02T01:42:17Z</updated>

		<summary type="html">&lt;p&gt;Dgruner: /* System Status */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
{| style=&amp;quot;border-spacing:10px; width: 95%&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:1em; padding-top:.1em; border:2px solid #0645ad; background-color:#f6f6f6; border-radius:7px&amp;quot;|&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Use &amp;quot;Up&amp;quot;, &amp;quot;Partial&amp;quot; or &amp;quot;Down&amp;quot;; these are templates. --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;width:100%&amp;quot; &lt;br /&gt;
|{{Up   |Niagara|Niagara_Quickstart}}&lt;br /&gt;
|{{Up|Mist|Mist}}&lt;br /&gt;
|{{Up|Teach|Teach}}&lt;br /&gt;
|{{Up   |Rouge|Rouge}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up |Jupyter Hub|Jupyter_Hub}}&lt;br /&gt;
|{{Up  |Scheduler|Niagara_Quickstart#Submitting_jobs}}&lt;br /&gt;
|{{Partial |File system|Niagara_Quickstart#Storage_and_quotas}}&lt;br /&gt;
|{{Up   |Burst Buffer|Burst_Buffer}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up   |HPSS|HPSS}}&lt;br /&gt;
|{{Up  |Login Nodes|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up   |External Network|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up   |Globus |Globus}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up   |Balam|Balam}}&lt;br /&gt;
|{{Up |CCEnv|Using_modules}}&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
'''Thursday, August 1, 9:30 PM EDT''' Filesystem problems preventing logins to the systems.  Working on it.&lt;br /&gt;
&lt;br /&gt;
'''Monday, July 22, 11:50 AM EDT''' Systems are back to normal.&lt;br /&gt;
&lt;br /&gt;
'''Monday, July 22, 10:50 AM EDT''' The cooling problem has been fixed. Systems are coming up.&lt;br /&gt;
&lt;br /&gt;
'''Monday, July 22, 10:20 AM EDT''' Compute nodes have been shut down due to a cooling tower failure.&lt;br /&gt;
&lt;br /&gt;
'''Friday, July 19, 9:30 AM EDT''' CCEnv modules available on all login nodes again.&lt;br /&gt;
&lt;br /&gt;
'''Friday, July 19, 5:00 AM EDT''' Some login nodes do not have the CCEnv modules available.  We are working on a fix.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  When removing system status entries, please archive them to: --&amp;gt;&lt;br /&gt;
[[Previous messages]]&lt;br /&gt;
&lt;br /&gt;
{|style=&amp;quot;border-spacing: 10px;width: 100%&amp;quot;&lt;br /&gt;
|valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== QuickStart Guides ==&lt;br /&gt;
* [[Niagara Quickstart]]&lt;br /&gt;
* [[HPSS | HPSS archival storage]]&lt;br /&gt;
* [[Mist| Mist Power 9 GPU cluster]]&lt;br /&gt;
* [[Teach|Teach cluster]]&lt;br /&gt;
* [[FAQ | FAQ (frequently asked questions)]]&lt;br /&gt;
* [[Acknowledging SciNet]]&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== Tutorials, Manuals, etc. ==&lt;br /&gt;
* [https://education.scinet.utoronto.ca SciNet education material]&lt;br /&gt;
* [https://www.youtube.com/c/SciNetHPCattheUniversityofToronto SciNet's YouTube channel]&lt;br /&gt;
* [[Modules specific to Niagara|Software Modules specific to Niagara]] &lt;br /&gt;
* [[Modules for Mist]] &lt;br /&gt;
* [[Commercial software]]&lt;br /&gt;
* [[Burst Buffer]]&lt;br /&gt;
* [[SSH#SSH Keys|SSH keys]]&lt;br /&gt;
* [[SSH Tunneling]]&lt;br /&gt;
* [[Visualization]]&lt;br /&gt;
* [[Running Serial Jobs on Niagara]]&lt;br /&gt;
* [[Jupyter Hub]]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Dgruner</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Gurobi&amp;diff=5702</id>
		<title>Gurobi</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Gurobi&amp;diff=5702"/>
		<updated>2024-06-14T19:58:02Z</updated>

		<summary type="html">&lt;p&gt;Dgruner: /* Using Gurobi */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The [http://www.gurobi.com/ Gurobi] linear programming solver is installed on the Niagara software stack.&lt;br /&gt;
&lt;br /&gt;
&amp;quot;Given a set of linear inequality/equality constraints, Ax&amp;gt;=b, where A is a matrix and x &amp;amp; b are vectors, what is the set of variables x (within a given range) that maximizes/minimizes a target objective function f(x)?&amp;quot;&lt;br /&gt;
&lt;br /&gt;
Such a model is very common in scientific computation, engineering, and business. If the variables x are (partially) limited to integers, this becomes Mixed Integer Programming (MIP), which is a much harder problem. Gurobi, along with other solvers (such as &amp;quot;linprog&amp;quot; and &amp;quot;intlinprog&amp;quot; in [[MATLAB]]), can solve such LP/MIP problems efficiently. Gurobi also implements efficient multi-threading (and, depending on the license, distributed computation) for parallelism and easy scaling to large models.&lt;br /&gt;
&lt;br /&gt;
=Getting a license=&lt;br /&gt;
The University of Toronto has a free academic license to use Gurobi.  Access to the license is granted by loading the Gurobi module.&lt;br /&gt;
&lt;br /&gt;
=Running using the Niagara installation=&lt;br /&gt;
&lt;br /&gt;
==Gurobi 11.0.1==&lt;br /&gt;
To access commercial modules on Niagara, one must first invoke the 'module use' command:&lt;br /&gt;
&lt;br /&gt;
 module load NiaEnv/2022a&lt;br /&gt;
 module use /scinet/niagara/software/commercial/modules&lt;br /&gt;
 module load gurobi/11.0.1&lt;br /&gt;
&lt;br /&gt;
==Using Gurobi==&lt;br /&gt;
&lt;br /&gt;
To use Gurobi, one only needs to include &amp;quot;gurobi_c++.h&amp;quot; in the source file, and use the compilation/linking flags:&lt;br /&gt;
&lt;br /&gt;
 CXXLIB=-L ${SCINET_GUROBI_LIB} -lgurobi_g++8.5 -lgurobi110 -fopenmp&lt;br /&gt;
 CXXINC=-I ${SCINET_GUROBI_INC} -fopenmp&lt;br /&gt;
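&lt;br /&gt;
For instance, with the flags above, a single-source program could be compiled and linked as in the following sketch (&amp;lt;code&amp;gt;mycode.cpp&amp;lt;/code&amp;gt; is a placeholder for your own source file; the resulting &amp;lt;code&amp;gt;mycode&amp;lt;/code&amp;gt; executable is the one used in the job script below):&lt;br /&gt;
&lt;br /&gt;
 g++ -I${SCINET_GUROBI_INC} -fopenmp -o mycode mycode.cpp -L${SCINET_GUROBI_LIB} -lgurobi_g++8.5 -lgurobi110 -fopenmp&lt;br /&gt;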
&lt;br /&gt;
The actual documentation for using Gurobi's API can be found [http://www.gurobi.com/documentation/11.0 here].&lt;br /&gt;
&lt;br /&gt;
==Running Gurobi==&lt;br /&gt;
Example submission script for a job running on 1 node, with max walltime of 11 hours:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --ntasks-per-node=1&lt;br /&gt;
#SBATCH --cpus-per-task=40&lt;br /&gt;
#SBATCH --time=11:00:00&lt;br /&gt;
#SBATCH --job-name test&lt;br /&gt;
&lt;br /&gt;
module load NiaEnv/2022a&lt;br /&gt;
module use /scinet/niagara/software/commercial/modules&lt;br /&gt;
module load gurobi/11.0.1&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $SLURM_SUBMIT_DIR is directory job was submitted from&lt;br /&gt;
cd $SLURM_SUBMIT_DIR&lt;br /&gt;
&lt;br /&gt;
# If you are using OpenMP&lt;br /&gt;
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK&lt;br /&gt;
&lt;br /&gt;
./mycode&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;/div&gt;</summary>
		<author><name>Dgruner</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Gurobi&amp;diff=5699</id>
		<title>Gurobi</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Gurobi&amp;diff=5699"/>
		<updated>2024-06-14T19:55:41Z</updated>

		<summary type="html">&lt;p&gt;Dgruner: /* Running Gurobi */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The [http://www.gurobi.com/ Gurobi] linear programming solver is installed on the Niagara software stack.&lt;br /&gt;
&lt;br /&gt;
&amp;quot;Given a set of linear inequality/equality constraints, Ax&amp;gt;=b, where A is a matrix and x &amp;amp; b are vectors, what is the set of variables x (within a given range) that maximizes/minimizes a target objective function f(x)?&amp;quot;&lt;br /&gt;
&lt;br /&gt;
Such a model is very common in scientific computation, engineering, and business. If the variables x are (partially) limited to integers, this becomes Mixed Integer Programming (MIP), which is a much harder problem. Gurobi, along with other solvers (such as &amp;quot;linprog&amp;quot; and &amp;quot;intlinprog&amp;quot; in [[MATLAB]]), can solve such LP/MIP problems efficiently. Gurobi also implements efficient multi-threading (and, depending on the license, distributed computation) for parallelism and easy scaling to large models.&lt;br /&gt;
&lt;br /&gt;
=Getting a license=&lt;br /&gt;
The University of Toronto has a free academic license to use Gurobi.  Access to the license is granted by loading the Gurobi module.&lt;br /&gt;
&lt;br /&gt;
=Running using the Niagara installation=&lt;br /&gt;
&lt;br /&gt;
==Gurobi 11.0.1==&lt;br /&gt;
To access commercial modules on Niagara, one must first invoke the 'module use' command:&lt;br /&gt;
&lt;br /&gt;
 module load NiaEnv/2022a&lt;br /&gt;
 module use /scinet/niagara/software/commercial/modules&lt;br /&gt;
 module load gurobi/11.0.1&lt;br /&gt;
&lt;br /&gt;
==Using Gurobi==&lt;br /&gt;
&lt;br /&gt;
To use Gurobi, one only needs to include &amp;quot;gurobi_c++.h&amp;quot; in the source file, and use the compilation/linking flags:&lt;br /&gt;
&lt;br /&gt;
 CXXLIB=-L ${SCINET_GUROBI_LIB} -lgurobi_g++5.2 -lgurobi75 -fopenmp&lt;br /&gt;
 CXXINC=-I ${SCINET_GUROBI_INC} -fopenmp&lt;br /&gt;
&lt;br /&gt;
The actual documentation for using Gurobi's API can be found [http://www.gurobi.com/documentation/7.5 here].&lt;br /&gt;
&lt;br /&gt;
==Running Gurobi==&lt;br /&gt;
Example submission script for a job running on 1 node, with max walltime of 11 hours:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --ntasks-per-node=1&lt;br /&gt;
#SBATCH --cpus-per-task=40&lt;br /&gt;
#SBATCH --time=11:00:00&lt;br /&gt;
#SBATCH --job-name test&lt;br /&gt;
&lt;br /&gt;
module load NiaEnv/2022a&lt;br /&gt;
module use /scinet/niagara/software/commercial/modules&lt;br /&gt;
module load gurobi/11.0.1&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $SLURM_SUBMIT_DIR is directory job was submitted from&lt;br /&gt;
cd $SLURM_SUBMIT_DIR&lt;br /&gt;
&lt;br /&gt;
# If you are using OpenMP&lt;br /&gt;
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK&lt;br /&gt;
&lt;br /&gt;
./mycode&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;/div&gt;</summary>
		<author><name>Dgruner</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Gurobi&amp;diff=5696</id>
		<title>Gurobi</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Gurobi&amp;diff=5696"/>
		<updated>2024-06-14T19:54:43Z</updated>

		<summary type="html">&lt;p&gt;Dgruner: /* Gurobi 9.5.1 */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The [http://www.gurobi.com/ Gurobi] linear programming solver is installed on the Niagara software stack.&lt;br /&gt;
&lt;br /&gt;
&amp;quot;Given a set of linear inequality/equality constraints, Ax&amp;gt;=b, where A is a matrix and x &amp;amp; b are vectors, what is the set of variables x (within a given range) that maximizes/minimizes a target objective function f(x)?&amp;quot;&lt;br /&gt;
&lt;br /&gt;
Such a model is very common in scientific computation, engineering, and business. If the variables x are (partially) limited to integers, this becomes Mixed Integer Programming (MIP), which is a much harder problem. Gurobi, along with other solvers (such as &amp;quot;linprog&amp;quot; and &amp;quot;intlinprog&amp;quot; in [[MATLAB]]), can solve such LP/MIP problems efficiently. Gurobi also implements efficient multi-threading (and, depending on the license, distributed computation) for parallelism and easy scaling to large models.&lt;br /&gt;
&lt;br /&gt;
=Getting a license=&lt;br /&gt;
The University of Toronto has a free academic license to use Gurobi.  Access to the license is granted by loading the Gurobi module.&lt;br /&gt;
&lt;br /&gt;
=Running using the Niagara installation=&lt;br /&gt;
&lt;br /&gt;
==Gurobi 11.0.1==&lt;br /&gt;
To access commercial modules on Niagara, one must first invoke the 'module use' command:&lt;br /&gt;
&lt;br /&gt;
 module load NiaEnv/2022a&lt;br /&gt;
 module use /scinet/niagara/software/commercial/modules&lt;br /&gt;
 module load gurobi/11.0.1&lt;br /&gt;
&lt;br /&gt;
==Using Gurobi==&lt;br /&gt;
&lt;br /&gt;
To use Gurobi, one only needs to include &amp;quot;gurobi_c++.h&amp;quot; in the source file, and use the compilation/linking flags:&lt;br /&gt;
&lt;br /&gt;
 CXXLIB=-L ${SCINET_GUROBI_LIB} -lgurobi_g++5.2 -lgurobi75 -fopenmp&lt;br /&gt;
 CXXINC=-I ${SCINET_GUROBI_INC} -fopenmp&lt;br /&gt;
&lt;br /&gt;
The actual documentation for using Gurobi's API can be found [http://www.gurobi.com/documentation/7.5 here].&lt;br /&gt;
&lt;br /&gt;
==Running Gurobi==&lt;br /&gt;
Example submission script for a job running on 1 node, with max walltime of 11 hours:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --ntasks-per-node=1&lt;br /&gt;
#SBATCH --cpus-per-task=40&lt;br /&gt;
#SBATCH --time=11:00:00&lt;br /&gt;
#SBATCH --job-name test&lt;br /&gt;
&lt;br /&gt;
module use /scinet/niagara/software/commercial/modules&lt;br /&gt;
module load gurobi/7.5.2&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $SLURM_SUBMIT_DIR is directory job was submitted from&lt;br /&gt;
cd $SLURM_SUBMIT_DIR&lt;br /&gt;
&lt;br /&gt;
# If you are using OpenMP&lt;br /&gt;
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK&lt;br /&gt;
&lt;br /&gt;
./mycode&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;/div&gt;</summary>
		<author><name>Dgruner</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=5633</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=5633"/>
		<updated>2024-05-29T17:55:35Z</updated>

		<summary type="html">&lt;p&gt;Dgruner: /* System Status */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
{| style=&amp;quot;border-spacing:10px; width: 95%&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:1em; padding-top:.1em; border:2px solid #0645ad; background-color:#f6f6f6; border-radius:7px&amp;quot;|&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Use &amp;quot;Up&amp;quot;, &amp;quot;Partial&amp;quot; or &amp;quot;Down&amp;quot;; these are templates. --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;width:100%&amp;quot; &lt;br /&gt;
|{{Down   |Niagara|Niagara_Quickstart}}&lt;br /&gt;
|{{Down  |Mist|Mist}}&lt;br /&gt;
|{{Down   |Teach|Teach}}&lt;br /&gt;
|{{Down   |Rouge|Rouge}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Down   |Jupyter Hub|Jupyter_Hub}}&lt;br /&gt;
|{{Down   |Scheduler|Niagara_Quickstart#Submitting_jobs}}&lt;br /&gt;
|{{Down  |File system|Niagara_Quickstart#Storage_and_quotas}}&lt;br /&gt;
|{{Down   |Burst Buffer|Burst_Buffer}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Down   |HPSS|HPSS}}&lt;br /&gt;
|{{Down   |Login Nodes|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Down   |External Network|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Down   |Globus |Globus}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Down  |Balam|Balam}}&lt;br /&gt;
|{{Down   |CCEnv|Using_modules}}&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
'''Wednesday May 29, 2 PM EDT''' Electricians are checking and testing all junction boxes and connectors under the raised floor for safety.  Some systems are expected to be back up later today (storage, login nodes), and compute systems will be powered up as soon as it is deemed safe.&lt;br /&gt;
&lt;br /&gt;
'''Tuesday May 28, 3 PM EDT''' Cleaning crews are at the datacentre to pump out the water and install dryers.  Once the floors are dry, we need to inspect all electrical boxes to ensure safety.  We do not expect to have a fully functional datacentre before Thursday, although we hope to be able to turn on the storage and login nodes sometime tomorrow, if circumstances permit.  Apologies, and thank you for your patience.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
'''Tuesday May 28, 7 AM EDT''' A water main break outside our datacentre has caused extensive flooding, and all systems have been shut down preventatively. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
'''Friday May 17, 10 PM EDT - Saturday May 18, 2 AM EDT:''' The external network will be unavailable for maintenance. Running and queued jobs on the systems will not be affected.&lt;br /&gt;
&lt;br /&gt;
'''Tuesday May 14, 6:45 PM EDT:''' All systems are recovered now.&lt;br /&gt;
&lt;br /&gt;
'''Tuesday May 14, 5 PM EDT:''' Power loss at the datacentre resulted in loss of cooling.  Systems are being restored.&lt;br /&gt;
&lt;br /&gt;
'''Friday May 3, 10 PM EDT - Saturday May 4, 2 AM EDT:''' The external network will be unavailable for maintenance. Running and queued jobs on the systems will not be affected.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  When removing system status entries, please archive them to: --&amp;gt;&lt;br /&gt;
[[Previous messages]]&lt;br /&gt;
&lt;br /&gt;
{|style=&amp;quot;border-spacing: 10px;width: 100%&amp;quot;&lt;br /&gt;
|valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== QuickStart Guides ==&lt;br /&gt;
* [[Niagara Quickstart]]&lt;br /&gt;
* [[HPSS | HPSS archival storage]]&lt;br /&gt;
* [[Mist| Mist Power 9 GPU cluster]]&lt;br /&gt;
* [[Teach|Teach cluster]]&lt;br /&gt;
* [[FAQ | FAQ (frequently asked questions)]]&lt;br /&gt;
* [[Acknowledging SciNet]]&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== Tutorials, Manuals, etc. ==&lt;br /&gt;
* [https://education.scinet.utoronto.ca SciNet education material]&lt;br /&gt;
* [https://www.youtube.com/c/SciNetHPCattheUniversityofToronto SciNet's YouTube channel]&lt;br /&gt;
* [[Modules specific to Niagara|Software Modules specific to Niagara]] &lt;br /&gt;
* [[Modules for Mist]] &lt;br /&gt;
* [[Commercial software]]&lt;br /&gt;
* [[Burst Buffer]]&lt;br /&gt;
* [[SSH#SSH Keys|SSH keys]]&lt;br /&gt;
* [[SSH Tunneling]]&lt;br /&gt;
* [[Visualization]]&lt;br /&gt;
* [[Running Serial Jobs on Niagara]]&lt;br /&gt;
* [[Jupyter Hub]]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Dgruner</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=5624</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=5624"/>
		<updated>2024-05-28T19:17:20Z</updated>

		<summary type="html">&lt;p&gt;Dgruner: /* System Status */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
{| style=&amp;quot;border-spacing:10px; width: 95%&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:1em; padding-top:.1em; border:2px solid #0645ad; background-color:#f6f6f6; border-radius:7px&amp;quot;|&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Use &amp;quot;Up&amp;quot;, &amp;quot;Partial&amp;quot; or &amp;quot;Down&amp;quot;; these are templates. --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;width:100%&amp;quot; &lt;br /&gt;
|{{Down   |Niagara|Niagara_Quickstart}}&lt;br /&gt;
|{{Down  |Mist|Mist}}&lt;br /&gt;
|{{Down   |Teach|Teach}}&lt;br /&gt;
|{{Down   |Rouge|Rouge}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Down   |Jupyter Hub|Jupyter_Hub}}&lt;br /&gt;
|{{Down   |Scheduler|Niagara_Quickstart#Submitting_jobs}}&lt;br /&gt;
|{{Down  |File system|Niagara_Quickstart#Storage_and_quotas}}&lt;br /&gt;
|{{Down   |Burst Buffer|Burst_Buffer}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Down   |HPSS|HPSS}}&lt;br /&gt;
|{{Down   |Login Nodes|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Down   |External Network|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Down   |Globus |Globus}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Down  |Balam|Balam}}&lt;br /&gt;
|{{Down   |CCEnv|Using_modules}}&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
'''Tuesday May 28, 3 PM EDT''' Cleaning crews are at the datacentre to pump out the water and install dryers.  Once the floors are dry, we need to inspect all electrical boxes to ensure safety.  We do not expect to have a fully functional datacentre before Thursday, although we hope to be able to turn on the storage and login nodes sometime tomorrow, if circumstances permit.  Apologies, and thank you for your patience.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
'''Tuesday May 28, 7 AM EDT''' A water main break outside our datacentre has caused extensive flooding, and all systems have been shut down preventatively. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
'''Friday May 17, 10 PM EDT - Saturday May 18, 2 AM EDT:''' The external network will be unavailable for maintenance. Running and queued jobs on the systems will not be affected.&lt;br /&gt;
&lt;br /&gt;
'''Tuesday May 14, 6:45 PM EDT:''' All systems are recovered now.&lt;br /&gt;
&lt;br /&gt;
'''Tuesday May 14, 5 PM EDT:''' Power loss at the datacentre resulted in loss of cooling.  Systems are being restored.&lt;br /&gt;
&lt;br /&gt;
'''Friday May 3, 10 PM EDT - Saturday May 4, 2 AM EDT:''' The external network will be unavailable for maintenance. Running and queued jobs on the systems will not be affected.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  When removing system status entries, please archive them to: --&amp;gt;&lt;br /&gt;
[[Previous messages]]&lt;br /&gt;
&lt;br /&gt;
{|style=&amp;quot;border-spacing: 10px;width: 100%&amp;quot;&lt;br /&gt;
|valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== QuickStart Guides ==&lt;br /&gt;
* [[Niagara Quickstart]]&lt;br /&gt;
* [[HPSS | HPSS archival storage]]&lt;br /&gt;
* [[Mist| Mist Power 9 GPU cluster]]&lt;br /&gt;
* [[Teach|Teach cluster]]&lt;br /&gt;
* [[FAQ | FAQ (frequently asked questions)]]&lt;br /&gt;
* [[Acknowledging SciNet]]&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== Tutorials, Manuals, etc. ==&lt;br /&gt;
* [https://education.scinet.utoronto.ca SciNet education material]&lt;br /&gt;
* [https://www.youtube.com/c/SciNetHPCattheUniversityofToronto SciNet's YouTube channel]&lt;br /&gt;
* [[Modules specific to Niagara|Software Modules specific to Niagara]] &lt;br /&gt;
* [[Modules for Mist]] &lt;br /&gt;
* [[Commercial software]]&lt;br /&gt;
* [[Burst Buffer]]&lt;br /&gt;
* [[SSH#SSH Keys|SSH keys]]&lt;br /&gt;
* [[SSH Tunneling]]&lt;br /&gt;
* [[Visualization]]&lt;br /&gt;
* [[Running Serial Jobs on Niagara]]&lt;br /&gt;
* [[Jupyter Hub]]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Dgruner</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=5621</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=5621"/>
		<updated>2024-05-28T10:58:14Z</updated>

		<summary type="html">&lt;p&gt;Dgruner: /* System Status */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
{| style=&amp;quot;border-spacing:10px; width: 95%&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:1em; padding-top:.1em; border:2px solid #0645ad; background-color:#f6f6f6; border-radius:7px&amp;quot;|&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Use &amp;quot;Up&amp;quot;, &amp;quot;Partial&amp;quot; or &amp;quot;Down&amp;quot;; these are templates. --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;width:100%&amp;quot; &lt;br /&gt;
|{{Down   |Niagara|Niagara_Quickstart}}&lt;br /&gt;
|{{Down  |Mist|Mist}}&lt;br /&gt;
|{{Down   |Teach|Teach}}&lt;br /&gt;
|{{Down   |Rouge|Rouge}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Down   |Jupyter Hub|Jupyter_Hub}}&lt;br /&gt;
|{{Down   |Scheduler|Niagara_Quickstart#Submitting_jobs}}&lt;br /&gt;
|{{Down  |File system|Niagara_Quickstart#Storage_and_quotas}}&lt;br /&gt;
|{{Down   |Burst Buffer|Burst_Buffer}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Down   |HPSS|HPSS}}&lt;br /&gt;
|{{Down   |Login Nodes|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Down   |External Network|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Down   |Globus |Globus}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Down  |Balam|Balam}}&lt;br /&gt;
|{{Down   |CCEnv|Using_modules}}&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
'''Tuesday May 28, 7 AM EDT''' A water main break outside our datacentre has caused extensive flooding, and all systems have been shut down preventatively. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
'''Friday May 17, 10 PM EDT - Saturday May 18, 2 AM EDT:''' The external network will be unavailable for maintenance. Running and queued jobs on the systems will not be affected.&lt;br /&gt;
&lt;br /&gt;
'''Tuesday May 14, 6:45 PM EDT:''' All systems are recovered now.&lt;br /&gt;
&lt;br /&gt;
'''Tuesday May 14, 5 PM EDT:''' Power loss at the datacentre resulted in loss of cooling.  Systems are being restored.&lt;br /&gt;
&lt;br /&gt;
'''Friday May 3, 10 PM EDT - Saturday May 4, 2 AM EDT:''' The external network will be unavailable for maintenance. Running and queued jobs on the systems will not be affected.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  When removing system status entries, please archive them to: --&amp;gt;&lt;br /&gt;
[[Previous messages]]&lt;br /&gt;
&lt;br /&gt;
{|style=&amp;quot;border-spacing: 10px;width: 100%&amp;quot;&lt;br /&gt;
|valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== QuickStart Guides ==&lt;br /&gt;
* [[Niagara Quickstart]]&lt;br /&gt;
* [[HPSS | HPSS archival storage]]&lt;br /&gt;
* [[Mist| Mist Power 9 GPU cluster]]&lt;br /&gt;
* [[Teach|Teach cluster]]&lt;br /&gt;
* [[FAQ | FAQ (frequently asked questions)]]&lt;br /&gt;
* [[Acknowledging SciNet]]&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== Tutorials, Manuals, etc. ==&lt;br /&gt;
* [https://education.scinet.utoronto.ca SciNet education material]&lt;br /&gt;
* [https://www.youtube.com/c/SciNetHPCattheUniversityofToronto SciNet's YouTube channel]&lt;br /&gt;
* [[Modules specific to Niagara|Software Modules specific to Niagara]] &lt;br /&gt;
* [[Modules for Mist]] &lt;br /&gt;
* [[Commercial software]]&lt;br /&gt;
* [[Burst Buffer]]&lt;br /&gt;
* [[SSH#SSH Keys|SSH keys]]&lt;br /&gt;
* [[SSH Tunneling]]&lt;br /&gt;
* [[Visualization]]&lt;br /&gt;
* [[Running Serial Jobs on Niagara]]&lt;br /&gt;
* [[Jupyter Hub]]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Dgruner</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=5618</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=5618"/>
		<updated>2024-05-28T10:56:11Z</updated>

		<summary type="html">&lt;p&gt;Dgruner: /* System Status */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
{| style=&amp;quot;border-spacing:10px; width: 95%&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:1em; padding-top:.1em; border:2px solid #0645ad; background-color:#f6f6f6; border-radius:7px&amp;quot;|&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Use &amp;quot;Up&amp;quot;, &amp;quot;Partial&amp;quot; or &amp;quot;Down&amp;quot;; these are templates. --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;width:100%&amp;quot; &lt;br /&gt;
|{{Up   |Niagara|Niagara_Quickstart}}&lt;br /&gt;
|{{Up  |Mist|Mist}}&lt;br /&gt;
|{{Up   |Teach|Teach}}&lt;br /&gt;
|{{Up   |Rouge|Rouge}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up   |Jupyter Hub|Jupyter_Hub}}&lt;br /&gt;
|{{Up   |Scheduler|Niagara_Quickstart#Submitting_jobs}}&lt;br /&gt;
|{{Up  |File system|Niagara_Quickstart#Storage_and_quotas}}&lt;br /&gt;
|{{Up   |Burst Buffer|Burst_Buffer}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up   |HPSS|HPSS}}&lt;br /&gt;
|{{Up   |Login Nodes|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up   |External Network|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up   |Globus |Globus}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up  |Balam|Balam}}&lt;br /&gt;
|{{Up   |CCEnv|Using_modules}}&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
'''Tuesday May 28, 7 AM EDT''' A water main break outside our datacentre has caused extensive flooding, and all systems have been shut down preventatively. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
'''Friday May 17, 10 PM EDT - Saturday May 18, 2 AM EDT:''' The external network will be unavailable for maintenance. Running and queued jobs on the systems will not be affected.&lt;br /&gt;
&lt;br /&gt;
'''Tuesday May 14, 6:45 PM EDT:''' All systems are recovered now.&lt;br /&gt;
&lt;br /&gt;
'''Tuesday May 14, 5 PM EDT:''' Power loss at the datacentre resulted in loss of cooling.  Systems are being restored.&lt;br /&gt;
&lt;br /&gt;
'''Friday May 3, 10 PM EDT - Saturday May 4, 2 AM EDT:''' The external network will be unavailable for maintenance. Running and queued jobs on the systems will not be affected.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  When removing system status entries, please archive them to: --&amp;gt;&lt;br /&gt;
[[Previous messages]]&lt;br /&gt;
&lt;br /&gt;
{|style=&amp;quot;border-spacing: 10px;width: 100%&amp;quot;&lt;br /&gt;
|valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== QuickStart Guides ==&lt;br /&gt;
* [[Niagara Quickstart]]&lt;br /&gt;
* [[HPSS | HPSS archival storage]]&lt;br /&gt;
* [[Mist| Mist Power 9 GPU cluster]]&lt;br /&gt;
* [[Teach|Teach cluster]]&lt;br /&gt;
* [[FAQ | FAQ (frequently asked questions)]]&lt;br /&gt;
* [[Acknowledging SciNet]]&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== Tutorials, Manuals, etc. ==&lt;br /&gt;
* [https://education.scinet.utoronto.ca SciNet education material]&lt;br /&gt;
* [https://www.youtube.com/c/SciNetHPCattheUniversityofToronto SciNet's YouTube channel]&lt;br /&gt;
* [[Modules specific to Niagara|Software Modules specific to Niagara]] &lt;br /&gt;
* [[Modules for Mist]] &lt;br /&gt;
* [[Commercial software]]&lt;br /&gt;
* [[Burst Buffer]]&lt;br /&gt;
* [[SSH#SSH Keys|SSH keys]]&lt;br /&gt;
* [[SSH Tunneling]]&lt;br /&gt;
* [[Visualization]]&lt;br /&gt;
* [[Running Serial Jobs on Niagara]]&lt;br /&gt;
* [[Jupyter Hub]]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Dgruner</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=5606</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=5606"/>
		<updated>2024-05-14T21:30:54Z</updated>

		<summary type="html">&lt;p&gt;Dgruner: /* System Status */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
{| style=&amp;quot;border-spacing:10px; width: 95%&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:1em; padding-top:.1em; border:2px solid #0645ad; background-color:#f6f6f6; border-radius:7px&amp;quot;|&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Use &amp;quot;Up&amp;quot;, &amp;quot;Partial&amp;quot; or &amp;quot;Down&amp;quot;; these are templates. --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;width:100%&amp;quot; &lt;br /&gt;
|{{Up   |Niagara|Niagara_Quickstart}}&lt;br /&gt;
|{{Up  |Mist|Mist}}&lt;br /&gt;
|{{Up   |Teach|Teach}}&lt;br /&gt;
|{{Up   |Rouge|Rouge}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up   |Jupyter Hub|Jupyter_Hub}}&lt;br /&gt;
|{{Up   |Scheduler|Niagara_Quickstart#Submitting_jobs}}&lt;br /&gt;
|{{Up  |File system|Niagara_Quickstart#Storage_and_quotas}}&lt;br /&gt;
|{{Up   |Burst Buffer|Burst_Buffer}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up   |HPSS|HPSS}}&lt;br /&gt;
|{{Up   |Login Nodes|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up   |External Network|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up   |Globus |Globus}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up  |Balam|Balam}}&lt;br /&gt;
|{{Up   |CCEnv|Using_modules}}&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
'''Tuesday May 14, 5 PM EDT''' Power loss at the datacentre resulted in loss of cooling.  Systems are being restored.&lt;br /&gt;
&lt;br /&gt;
'''Friday May 3, 10 PM EDT - Saturday May 4, 1 AM EDT:''' The external network will be unavailable for maintenance. Running and queued jobs on the systems will not be affected.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  When removing system status entries, please archive them to: --&amp;gt;&lt;br /&gt;
[[Previous messages]]&lt;br /&gt;
&lt;br /&gt;
{|style=&amp;quot;border-spacing: 10px;width: 100%&amp;quot;&lt;br /&gt;
|valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== QuickStart Guides ==&lt;br /&gt;
* [[Niagara Quickstart]]&lt;br /&gt;
* [[HPSS | HPSS archival storage]]&lt;br /&gt;
* [[Mist| Mist Power 9 GPU cluster]]&lt;br /&gt;
* [[Teach|Teach cluster]]&lt;br /&gt;
* [[FAQ | FAQ (frequently asked questions)]]&lt;br /&gt;
* [[Acknowledging SciNet]]&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== Tutorials, Manuals, etc. ==&lt;br /&gt;
* [https://education.scinet.utoronto.ca SciNet education material]&lt;br /&gt;
* [https://www.youtube.com/c/SciNetHPCattheUniversityofToronto SciNet's YouTube channel]&lt;br /&gt;
* [[Modules specific to Niagara|Software Modules specific to Niagara]] &lt;br /&gt;
* [[Modules for Mist]] &lt;br /&gt;
* [[Commercial software]]&lt;br /&gt;
* [[Burst Buffer]]&lt;br /&gt;
* [[SSH#SSH Keys|SSH keys]]&lt;br /&gt;
* [[SSH Tunneling]]&lt;br /&gt;
* [[Visualization]]&lt;br /&gt;
* [[Running Serial Jobs on Niagara]]&lt;br /&gt;
* [[Jupyter Hub]]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Dgruner</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Balam&amp;diff=5313</id>
		<title>Balam</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Balam&amp;diff=5313"/>
		<updated>2023-12-19T22:24:49Z</updated>

		<summary type="html">&lt;p&gt;Dgruner: /* Specifications */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox Computer&lt;br /&gt;
|image=[[File:Balam.jpeg|center|300px|thumb]] &lt;br /&gt;
|name=Balam&lt;br /&gt;
|installed=October 2023&lt;br /&gt;
|operatingsystem= Linux (Rocky 9.2)&lt;br /&gt;
|loginnode= balam-login01&lt;br /&gt;
|nnodes=10 &lt;br /&gt;
|gpuspernode=4 A100-40GB&lt;br /&gt;
|rampernode=1 TB&lt;br /&gt;
|corespernode=64 &lt;br /&gt;
|interconnect=Infiniband&lt;br /&gt;
|vendorcompilers=cuda/intel/gcc&lt;br /&gt;
|queuetype=slurm&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
= Specifications=&lt;br /&gt;
&lt;br /&gt;
The Balam cluster is owned by the [https://acceleration.utoronto.ca Acceleration Consortium at the University of Toronto], and hosted at SciNet.  The cluster consists of 10 x86_64 nodes, each with two Intel Xeon(R) Platinum 8358 32-core CPUs running at 2.6GHz, 1 TB of RAM, and four NVIDIA A100 GPUs.&lt;br /&gt;
 &lt;br /&gt;
The nodes are interconnected with Infiniband for internode communications and disk I/O to the SciNet Niagara file systems.  In total this cluster contains 640 CPU cores and 40 GPUs. &lt;br /&gt;
&lt;br /&gt;
Access is available only to those affiliated with the Acceleration Consortium.  Support requests should be sent to '''balam-support@scinet.utoronto.ca'''.&lt;br /&gt;
&lt;br /&gt;
= Getting started on Balam =&lt;br /&gt;
&lt;br /&gt;
Balam can be accessed directly.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -Y MYCCUSERNAME@balam.scinet.utoronto.ca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Or, the Balam login node '''balam-login01''' can be accessed via the Niagara cluster.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -Y MYCCUSERNAME@niagara.scinet.utoronto.ca&lt;br /&gt;
ssh -Y balam-login01&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
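&lt;br /&gt;
Alternatively, recent OpenSSH clients can hop through Niagara in a single command using a jump host (a sketch, assuming your client supports the &amp;lt;tt&amp;gt;-J&amp;lt;/tt&amp;gt; option):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -Y -J MYCCUSERNAME@niagara.scinet.utoronto.ca MYCCUSERNAME@balam-login01&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;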
&lt;br /&gt;
== Storage ==&lt;br /&gt;
&lt;br /&gt;
The filesystem for Balam is currently shared with the Niagara cluster. See [https://docs.scinet.utoronto.ca/index.php/Niagara_Quickstart#Your_various_directories Niagara Storage] for more details.&lt;br /&gt;
&lt;br /&gt;
= Loading software modules =&lt;br /&gt;
&lt;br /&gt;
You have two options for running code on Balam: use existing software, or compile your own.  This section focuses on the former.&lt;br /&gt;
&lt;br /&gt;
Other than essentials, all installed software is made available [[Using_modules | using module commands]]. These modules set environment variables (PATH, etc.), allowing multiple, conflicting versions of a given package to be available.  A detailed explanation of the module system can be [[Using_modules | found on the modules page]].&lt;br /&gt;
&lt;br /&gt;
Common module subcommands are:&lt;br /&gt;
&lt;br /&gt;
* &amp;lt;code&amp;gt;module load &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt;: load the default version of a particular software.&lt;br /&gt;
* &amp;lt;code&amp;gt;module load &amp;lt;module-name&amp;gt;/&amp;lt;module-version&amp;gt;&amp;lt;/code&amp;gt;: load a specific version of a particular software.&lt;br /&gt;
* &amp;lt;code&amp;gt;module purge&amp;lt;/code&amp;gt;: unload all currently loaded modules.&lt;br /&gt;
* &amp;lt;code&amp;gt;module spider&amp;lt;/code&amp;gt; (or &amp;lt;code&amp;gt;module spider &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt;): list available software packages.&lt;br /&gt;
* &amp;lt;code&amp;gt;module avail&amp;lt;/code&amp;gt;: list loadable software packages.&lt;br /&gt;
* &amp;lt;code&amp;gt;module list&amp;lt;/code&amp;gt;: list loaded modules.&lt;br /&gt;
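&lt;br /&gt;
For example, a typical session might look as follows (module names and versions here are purely illustrative):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
balam-login01:~$ module spider cuda      # find available cuda versions&lt;br /&gt;
balam-login01:~$ module load cuda/12.3.1 # load a specific version&lt;br /&gt;
balam-login01:~$ module list             # confirm what is loaded&lt;br /&gt;
balam-login01:~$ module purge            # return to a clean environment&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;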
&lt;br /&gt;
Along with modifying common environment variables, such as PATH and LD_LIBRARY_PATH, these modules also create a MODULE_MODULENAME_PREFIX environment variable, which can be used to access commonly needed subdirectories of a software package, such as /include and /lib.&lt;br /&gt;
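&lt;br /&gt;
For instance, a code using the CUDA runtime library could be compiled against it as follows (a sketch: the variable name follows the pattern above, and the file name is illustrative):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load cuda&lt;br /&gt;
g++ mycode.cpp -I${MODULE_CUDA_PREFIX}/include -L${MODULE_CUDA_PREFIX}/lib -lcudart -o mycode&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;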
&lt;br /&gt;
There are handy abbreviations for the module commands. &amp;lt;code&amp;gt;ml&amp;lt;/code&amp;gt; is the same as &amp;lt;code&amp;gt;module list&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;ml &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt; is the same as &amp;lt;code&amp;gt;module load &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== Software stacks: BalamEnv and CCEnv ==&lt;br /&gt;
&lt;br /&gt;
On Balam, there are two available software stacks:&lt;br /&gt;
&lt;br /&gt;
=== BalamEnv ===&lt;br /&gt;
&lt;br /&gt;
A software stack with [[Modules specific to Balam]] tuned and compiled for this machine. This stack is available by default, but if not, can be reloaded with&lt;br /&gt;
&amp;lt;pre&amp;gt;module load BalamEnv&amp;lt;/pre&amp;gt;&lt;br /&gt;
This loads the default set of modules, which is currently the 2023a epoch.&lt;br /&gt;
&lt;br /&gt;
No modules are loaded by default on Balam except BalamEnv.&lt;br /&gt;
&lt;br /&gt;
=== CCEnv ===&lt;br /&gt;
&lt;br /&gt;
The same [https://docs.alliancecan.ca/wiki/Modules software stack that is available on {{Alliance}}'s General Purpose clusters] can be loaded with:&lt;br /&gt;
&amp;lt;pre&amp;gt;module load CCEnv&amp;lt;/pre&amp;gt;&lt;br /&gt;
Or, if you want the same default modules loaded as on Béluga and Narval, then do&lt;br /&gt;
&amp;lt;pre&amp;gt;module load CCEnv StdEnv&amp;lt;/pre&amp;gt;&lt;br /&gt;
or, if you want the same default modules loaded as on Cedar and Graham, do&lt;br /&gt;
&amp;lt;pre&amp;gt;module load CCEnv arch/avx2 StdEnv/2020&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Available compilers and interpreters =&lt;br /&gt;
&lt;br /&gt;
* In the BalamEnv, the &amp;lt;tt&amp;gt;cuda&amp;lt;/tt&amp;gt; module has to be loaded first for GPU software.&lt;br /&gt;
* To compile mpi code, you must additionally load an &amp;lt;tt&amp;gt;openmpi&amp;lt;/tt&amp;gt; module.&lt;br /&gt;
&lt;br /&gt;
=== CUDA ===&lt;br /&gt;
&lt;br /&gt;
The currently installed CUDA versions are 11.8.0 and 12.3.1.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load cuda/&amp;lt;version&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The current NVIDIA driver version is 535.104.12.  Use '''nvidia-smi -a''' for full details.&lt;br /&gt;
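&lt;br /&gt;
For example, to build a CUDA code with a compatible host compiler (a sketch; the file name is illustrative):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load cuda/12.3.1 gcc/12.3.0&lt;br /&gt;
nvcc -O2 mycode.cu -o mycode&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;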
&lt;br /&gt;
===Other Compilers and Tools ===&lt;br /&gt;
&lt;br /&gt;
Other available compiler modules are:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;gcc/12.3.0&amp;lt;/code&amp;gt; GNU Compiler Collection, compatible with CUDA/12.3&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;gcc/13.2.0&amp;lt;/code&amp;gt; GNU Compiler Collection, incompatible with CUDA/12.3&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;intel/2023u1&amp;lt;/code&amp;gt; Intel compiler suite&lt;br /&gt;
&lt;br /&gt;
=== OpenMPI ===&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;tt&amp;gt;openmpi/5.0.0&amp;lt;/tt&amp;gt; module becomes available once &amp;lt;tt&amp;gt;gcc/13.2.0&amp;lt;/tt&amp;gt; is loaded.&lt;br /&gt;
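&lt;br /&gt;
For example, to compile an MPI code and give it a quick small-scale test (a sketch; the file name is illustrative, and production runs should go through the scheduler):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load gcc/13.2.0 openmpi/5.0.0&lt;br /&gt;
mpicxx -O2 mycode.cpp -o mycode&lt;br /&gt;
mpirun -np 4 ./mycode&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;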
&lt;br /&gt;
= Testing and debugging =&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--&lt;br /&gt;
You really should test your code before you submit it to the cluster to know if your code is correct and what kind of resources you need.&lt;br /&gt;
* Small test jobs can be run on the login node.  Rule of thumb: tests should run no more than a couple of minutes, taking at most about 1-2GB of memory, and use no more than one gpu and a few cores.&lt;br /&gt;
&lt;br /&gt;
* Short tests that do not fit on a login node, or for which you need a dedicated node, request an interactive debug job with the debug command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
balam-login01:~$ debugjob --clean -g G&lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
&lt;br /&gt;
where G is the number of GPUs.  If G=1, this gives an interactive session for 2 hours, whereas G=4 gets you a node with 4 GPUs for 60 minutes.  The &amp;lt;tt&amp;gt;--clean&amp;lt;/tt&amp;gt; argument is optional but recommended, as it starts the session without any modules loaded, thus mimicking more closely what happens when you submit a job script.&lt;br /&gt;
&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Submitting jobs =&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--&lt;br /&gt;
&lt;br /&gt;
Once you have compiled and tested your code or workflow on the Balam login nodes, and confirmed that it behaves correctly, you are ready to submit jobs to the cluster.  Your jobs will run on one of Balam's 10 compute nodes.  When and where your job runs is determined by the scheduler.&lt;br /&gt;
&lt;br /&gt;
Balam uses SLURM as its job scheduler. &lt;br /&gt;
&lt;br /&gt;
You submit jobs from a login node by passing a script to the sbatch command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
balam-login01:scratch$ sbatch jobscript.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This puts the job in the queue. It will run on the compute nodes in due course. In most cases, you should not submit from your $HOME directory, but rather from your $SCRATCH directory, so that the output of your compute job can be written out ($HOME is read-only on the compute nodes, as noted below).&lt;br /&gt;
&lt;br /&gt;
Example job scripts can be found below.&lt;br /&gt;
Keep in mind:&lt;br /&gt;
* Scheduling is per GPU; each GPU comes with 16 CPU cores.&lt;br /&gt;
* Your job's maximum walltime is 24 hours. &lt;br /&gt;
* Jobs must write their output to your scratch or project directory (home is read-only on compute nodes).&lt;br /&gt;
* Compute nodes have no internet access.&lt;br /&gt;
* Your job script will not remember the modules you have loaded, so it needs to contain &amp;quot;module load&amp;quot; commands of all the required modules (see examples below).&lt;br /&gt;
&lt;br /&gt;
== Single-GPU job script ==&lt;br /&gt;
For a single-GPU job, each job gets 1/4 of a node, i.e., 1 GPU + 16 CPU cores + about a quarter of the node's 1 TB of RAM. '''Users should never request CPUs or memory explicitly.''' If running an MPI program, you can set --ntasks to the number of MPI ranks. '''Do NOT set --ntasks for non-MPI programs.''' &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
#SBATCH --time=1:00:0&lt;br /&gt;
&lt;br /&gt;
module load &amp;lt;modules you need&amp;gt;&lt;br /&gt;
./your_program   # replace with the actual command(s) that run your program&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Full-node job script ==&lt;br /&gt;
'''If you are not sure whether your program can run on multiple GPUs, please follow the single-GPU job instructions above or contact SciNet support.'''&lt;br /&gt;
&lt;br /&gt;
A multi-GPU job should request a minimum of one full node (4 GPUs). Users need to specify the &amp;quot;compute_full_node&amp;quot; partition in order to get all resources on a node. &lt;br /&gt;
*An example for a 1-node job:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=8  # this only affects MPI jobs&lt;br /&gt;
#SBATCH --time=1:00:00&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
&lt;br /&gt;
module load &amp;lt;modules you need&amp;gt;&lt;br /&gt;
./your_program   # replace with the actual command(s) that run your program&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
--&amp;gt;&lt;/div&gt;</summary>
		<author><name>Dgruner</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=4647</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=4647"/>
		<updated>2023-03-17T13:54:47Z</updated>

		<summary type="html">&lt;p&gt;Dgruner: /* System Status */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
{| style=&amp;quot;border-spacing:10px; width: 95%&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:1em; padding-top:.1em; border:2px solid #0645ad; background-color:#f6f6f6; border-radius:7px&amp;quot;|&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Use &amp;quot;Up&amp;quot;, &amp;quot;Partial&amp;quot; or &amp;quot;Down&amp;quot;; these are templates. --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;width:100%&amp;quot; &lt;br /&gt;
|{{Down |Niagara|Niagara_Quickstart}}&lt;br /&gt;
|{{Down |Mist|Mist}}&lt;br /&gt;
|{{Down |Teach|Teach}}&lt;br /&gt;
|{{Down |Rouge|Rouge}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Down |Jupyter Hub|Jupyter_Hub}}&lt;br /&gt;
|{{Down |Scheduler|Niagara_Quickstart#Submitting_jobs}}&lt;br /&gt;
|{{Down|File system|Niagara_Quickstart#Storage_and_quotas}}&lt;br /&gt;
|{{Down |Burst Buffer|Burst_Buffer}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Down |HPSS|HPSS}}&lt;br /&gt;
|{{Down |Login Nodes|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Down |External Network|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Down |Globus |Globus}}&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
'''Fri Mar 17, 2023, 9:15 AM EDT:''' Staff are on site and a ticket has been opened with the cooling contractor; the cause of the failure is still unclear.&lt;br /&gt;
&lt;br /&gt;
'''Fri Mar 17, 2023, 1:47 AM EDT:''' Cooling system malfunction; the datacentre is shut down.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  When removing system status entries, please archive them to: --&amp;gt;&lt;br /&gt;
[[Previous messages]]&lt;br /&gt;
&lt;br /&gt;
{|style=&amp;quot;border-spacing: 10px;width: 100%&amp;quot;&lt;br /&gt;
|valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== QuickStart Guides ==&lt;br /&gt;
* [[Niagara Quickstart]]&lt;br /&gt;
* [[HPSS | HPSS archival storage]]&lt;br /&gt;
* [[Mist| Mist Power 9 GPU cluster]]&lt;br /&gt;
* [[Teach|Teach cluster]]&lt;br /&gt;
* [[FAQ | FAQ (frequently asked questions)]]&lt;br /&gt;
* [[Acknowledging SciNet]]&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== Tutorials, Manuals, etc. ==&lt;br /&gt;
* [https://education.scinet.utoronto.ca SciNet education material]&lt;br /&gt;
* [https://www.youtube.com/c/SciNetHPCattheUniversityofToronto SciNet's YouTube channel]&lt;br /&gt;
* [[Modules specific to Niagara|Software Modules specific to Niagara]] &lt;br /&gt;
* [[Modules for Mist]] &lt;br /&gt;
* [[Commercial software]]&lt;br /&gt;
* [[Burst Buffer]]&lt;br /&gt;
* [[SSH#SSH Keys|SSH keys]]&lt;br /&gt;
* [[SSH Tunneling]]&lt;br /&gt;
* [[SSH#Two-Factor_authentication|Two-Factor Authentication]]&lt;br /&gt;
* [[Visualization]]&lt;br /&gt;
* [[Running Serial Jobs on Niagara]]&lt;br /&gt;
* [[Jupyter Hub]]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Dgruner</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=4539</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=4539"/>
		<updated>2023-02-10T00:25:42Z</updated>

		<summary type="html">&lt;p&gt;Dgruner: /* System Status */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
{| style=&amp;quot;border-spacing:10px; width: 95%&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:1em; padding-top:.1em; border:2px solid #0645ad; background-color:#f6f6f6; border-radius:7px&amp;quot;|&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Use &amp;quot;Up&amp;quot;, &amp;quot;Partial&amp;quot; or &amp;quot;Down&amp;quot;; these are templates. --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;width:100%&amp;quot; &lt;br /&gt;
|{{Down |Niagara|Niagara_Quickstart}}&lt;br /&gt;
|{{Down |Mist|Mist}}&lt;br /&gt;
|{{Down |Teach|Teach}}&lt;br /&gt;
|{{Down |Rouge|Rouge}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up |Jupyter Hub|Jupyter_Hub}}&lt;br /&gt;
|{{Down|Scheduler|Niagara_Quickstart#Submitting_jobs}}&lt;br /&gt;
|{{Up |File system|Niagara_Quickstart#Storage_and_quotas}}&lt;br /&gt;
|{{Up |Burst Buffer|Burst_Buffer}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up |HPSS|HPSS}}&lt;br /&gt;
|{{Up |Login Nodes|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up |External Network|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up |Globus |Globus}}&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;br&amp;gt;&amp;lt;b&amp;gt;Wed Feb 9, 2023, 07:24 PM EST&amp;lt;/b&amp;gt; The cooling problem will not be resolved tonight.  Many fuses were blown during the brownout; this will likely be fixed tomorrow.&lt;br /&gt;
&amp;lt;br&amp;gt;&amp;lt;b&amp;gt;Wed Feb 9, 2023, 06:11 PM EST&amp;lt;/b&amp;gt; Login nodes and storage accessible (cooling issue not yet completely resolved).&lt;br /&gt;
&amp;lt;br&amp;gt;&amp;lt;b&amp;gt;Wed Feb 9, 2023, 05:46 PM EST&amp;lt;/b&amp;gt; Johnson Control support technician on site, replacing the control board.&lt;br /&gt;
&amp;lt;br&amp;gt;&amp;lt;b&amp;gt;Wed Feb 9, 2023, 03:40 PM EST&amp;lt;/b&amp;gt; Johnson Control support technician dispatched.&lt;br /&gt;
&amp;lt;br&amp;gt;&amp;lt;b&amp;gt;Wed Feb 9, 2023, 03:12 PM EST&amp;lt;/b&amp;gt; SciNet staff are on site and have determined that the issue is the chiller, likely a burnt component triggered by local brownouts.&lt;br /&gt;
&amp;lt;br&amp;gt;&amp;lt;b&amp;gt;Wed Feb 9, 2023, 02:15 PM EST&amp;lt;/b&amp;gt; Cooling system issue in the SciNet data centre. Shutting all systems down.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  When removing system status entries, please archive them to: --&amp;gt;&lt;br /&gt;
[[Previous messages]]&lt;br /&gt;
&lt;br /&gt;
{|style=&amp;quot;border-spacing: 10px;width: 100%&amp;quot;&lt;br /&gt;
|valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== QuickStart Guides ==&lt;br /&gt;
* [[Niagara Quickstart]]&lt;br /&gt;
* [[HPSS | HPSS archival storage]]&lt;br /&gt;
* [[Mist| Mist Power 9 GPU cluster]]&lt;br /&gt;
* [[Teach|Teach cluster]]&lt;br /&gt;
* [[FAQ | FAQ (frequently asked questions)]]&lt;br /&gt;
* [[Acknowledging SciNet]]&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== Tutorials, Manuals, etc. ==&lt;br /&gt;
* [https://education.scinet.utoronto.ca SciNet education material]&lt;br /&gt;
* [https://www.youtube.com/c/SciNetHPCattheUniversityofToronto SciNet's YouTube channel]&lt;br /&gt;
* [[Modules specific to Niagara|Software Modules specific to Niagara]] &lt;br /&gt;
* [[Modules for Mist]] &lt;br /&gt;
* [[Commercial software]]&lt;br /&gt;
* [[Burst Buffer]]&lt;br /&gt;
* [[SSH#SSH Keys|SSH keys]]&lt;br /&gt;
* [[SSH Tunneling]]&lt;br /&gt;
* [[SSH#Two-Factor_authentication|Two-Factor Authentication]]&lt;br /&gt;
* [[Visualization]]&lt;br /&gt;
* [[Running Serial Jobs on Niagara]]&lt;br /&gt;
* [[Jupyter Hub]]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Dgruner</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=4410</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=4410"/>
		<updated>2022-12-21T17:06:20Z</updated>

		<summary type="html">&lt;p&gt;Dgruner: /* System Status */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
{| style=&amp;quot;border-spacing:10px; width: 95%&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:1em; padding-top:.1em; border:2px solid #0645ad; background-color:#f6f6f6; border-radius:7px&amp;quot;|&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Use &amp;quot;Up&amp;quot;, &amp;quot;Partial&amp;quot; or &amp;quot;Down&amp;quot;; these are templates. --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;width:100%&amp;quot; &lt;br /&gt;
|{{Up |Niagara|Niagara_Quickstart}}&lt;br /&gt;
|{{Up |Mist|Mist}}&lt;br /&gt;
|{{Up |Teach|Teach}}&lt;br /&gt;
|{{Up |Rouge|Rouge}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up |Jupyter Hub|Jupyter_Hub}}&lt;br /&gt;
|{{Up |Scheduler|Niagara_Quickstart#Submitting_jobs}}&lt;br /&gt;
|{{Up |File system|Niagara_Quickstart#Storage_and_quotas}}&lt;br /&gt;
|{{Up |Burst Buffer|Burst_Buffer}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up|HPSS|HPSS}}&lt;br /&gt;
|{{Up |Login Nodes|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up |External Network|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up |Globus |Globus}}&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
'''Wed Dec 21, 2022, 12:00 PM: ''' Please note that SciNet is on vacation, together with the University of Toronto. Full service will resume on Jan 2, 2023. We will endeavour to keep systems running, and answer tickets, on a best-effort basis.  Happy Holidays!!!&lt;br /&gt;
&lt;br /&gt;
'''Fri Dec 16, 2022, 2:19 PM: ''' City power glitch caused all compute nodes to reboot. Please resubmit your jobs.&lt;br /&gt;
&lt;br /&gt;
'''Mon Dec 12, 2022, 9:30 - 11:30 AM:''' File system issues caused login problems and may have affected running jobs.  The system is back to normal now, but users may want to check any jobs they had running. &lt;br /&gt;
&lt;br /&gt;
'''Wed Dec 7, 2022, 11:40 AM EST:''' Systems are being brought back online.&lt;br /&gt;
&lt;br /&gt;
'''Wed Dec 7, 2022, 09:00 AM EST:''' Maintenance is underway.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  When removing system status entries, please archive them to: --&amp;gt;&lt;br /&gt;
[[Previous messages]]&lt;br /&gt;
&lt;br /&gt;
{|style=&amp;quot;border-spacing: 10px;width: 100%&amp;quot;&lt;br /&gt;
|valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== QuickStart Guides ==&lt;br /&gt;
* [[Niagara Quickstart]]&lt;br /&gt;
* [[HPSS | HPSS archival storage]]&lt;br /&gt;
* [[Mist| Mist Power 9 GPU cluster]]&lt;br /&gt;
* [[Teach|Teach cluster]]&lt;br /&gt;
* [[FAQ | FAQ (frequently asked questions)]]&lt;br /&gt;
* [[Acknowledging SciNet]]&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== Tutorials, Manuals, etc. ==&lt;br /&gt;
* [https://education.scinet.utoronto.ca SciNet education material]&lt;br /&gt;
* [https://www.youtube.com/c/SciNetHPCattheUniversityofToronto SciNet's YouTube channel]&lt;br /&gt;
* [[Modules specific to Niagara|Software Modules specific to Niagara]] &lt;br /&gt;
* [[Modules for Mist]] &lt;br /&gt;
* [[Commercial software]]&lt;br /&gt;
* [[Burst Buffer]]&lt;br /&gt;
* [[SSH#SSH Keys|SSH keys]]&lt;br /&gt;
* [[SSH Tunneling]]&lt;br /&gt;
* [[SSH#Two-Factor_authentication|Two-Factor Authentication]]&lt;br /&gt;
* [[Visualization]]&lt;br /&gt;
* [[Running Serial Jobs on Niagara]]&lt;br /&gt;
* [[Jupyter Hub]]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Dgruner</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Gurobi&amp;diff=4272</id>
		<title>Gurobi</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Gurobi&amp;diff=4272"/>
		<updated>2022-10-19T15:06:25Z</updated>

		<summary type="html">&lt;p&gt;Dgruner: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The [http://www.gurobi.com/ Gurobi] linear programming solver is installed on the Niagara software stack.&lt;br /&gt;
&lt;br /&gt;
&amp;quot;Given a set of linear inequality/equality constraints, Ax&amp;gt;=b, where A is a matrix and x &amp;amp; b are vectors, what is the set of variables x (within a given range) that maximizes/minimizes a target objective function f(x)?&amp;quot;&lt;br /&gt;
&lt;br /&gt;
Such a model is very common in scientific computing, engineering, and business. If the variables x are (partially) restricted to integers, the problem becomes Mixed Integer Programming (MIP), which is much harder to solve. Gurobi, along with other solvers (such as &amp;quot;linprog&amp;quot; and &amp;quot;intlinprog&amp;quot; in [[MATLAB]]), can solve such LP/MIP problems efficiently. Gurobi also implements efficient multi-threading (and even distributed computation, depending on the license) for parallelism and easy scaling to large models.&lt;br /&gt;
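&lt;br /&gt;
As a small concrete illustration, the LP &amp;quot;maximize 3x + 2y subject to x + y &amp;lt;= 4, x + 3y &amp;lt;= 6, x, y &amp;gt;= 0&amp;quot; can be written in Gurobi's LP file format and solved with the command-line tool (a sketch; file names are illustrative):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ cat model.lp&lt;br /&gt;
Maximize&lt;br /&gt;
  obj: 3 x + 2 y&lt;br /&gt;
Subject To&lt;br /&gt;
  c0: x + y &amp;lt;= 4&lt;br /&gt;
  c1: x + 3 y &amp;lt;= 6&lt;br /&gt;
End&lt;br /&gt;
$ gurobi_cl ResultFile=model.sol model.lp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
(In the LP file format, variables are non-negative by default, so no explicit Bounds section is needed here.)&lt;br /&gt;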
&lt;br /&gt;
=Getting a license=&lt;br /&gt;
The University of Toronto has a free academic license to use Gurobi.  Access to the license is granted by loading the Gurobi module.&lt;br /&gt;
&lt;br /&gt;
=Running using the Niagara installation=&lt;br /&gt;
&lt;br /&gt;
==Gurobi 9.5.1==&lt;br /&gt;
To access commercial modules on Niagara one must invoke the 'module use' command.&lt;br /&gt;
&lt;br /&gt;
 module load NiaEnv/2019b&lt;br /&gt;
 module use /scinet/niagara/software/commercial/modules&lt;br /&gt;
 module load gurobi/9.5.1&lt;br /&gt;
&lt;br /&gt;
==Using Gurobi==&lt;br /&gt;
&lt;br /&gt;
To use Gurobi, one only needs to include &amp;quot;gurobi_c++.h&amp;quot; in the source file, and use the compilation/linking flags:&lt;br /&gt;
&lt;br /&gt;
 CXXLIB=-L ${SCINET_GUROBI_LIB} -lgurobi_g++5.2 -lgurobi95 -fopenmp&lt;br /&gt;
 CXXINC=-I ${SCINET_GUROBI_INC} -fopenmp&lt;br /&gt;
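&lt;br /&gt;
For example, with the flags above, a compile-and-link line could look like this (the source file name is illustrative):&lt;br /&gt;
&lt;br /&gt;
 g++ -o mycode mycode.cpp -I ${SCINET_GUROBI_INC} -L ${SCINET_GUROBI_LIB} -lgurobi_g++5.2 -lgurobi95 -fopenmp&lt;br /&gt;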
&lt;br /&gt;
The actual documentation for using Gurobi's API can be found [http://www.gurobi.com/documentation/9.5 here].&lt;br /&gt;
&lt;br /&gt;
==Running Gurobi==&lt;br /&gt;
Example submission script for a job running on 1 node, with max walltime of 11 hours:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --ntasks-per-node=1&lt;br /&gt;
#SBATCH --cpus-per-task=40&lt;br /&gt;
#SBATCH --time=11:00:00&lt;br /&gt;
#SBATCH --job-name test&lt;br /&gt;
&lt;br /&gt;
module load NiaEnv/2019b&lt;br /&gt;
module use /scinet/niagara/software/commercial/modules&lt;br /&gt;
module load gurobi/9.5.1&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $SLURM_SUBMIT_DIR is directory job was submitted from&lt;br /&gt;
cd $SLURM_SUBMIT_DIR&lt;br /&gt;
&lt;br /&gt;
# If you are using OpenMP&lt;br /&gt;
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK&lt;br /&gt;
&lt;br /&gt;
./mycode&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;/div&gt;</summary>
		<author><name>Dgruner</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Singularity&amp;diff=4074</id>
		<title>Singularity</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Singularity&amp;diff=4074"/>
		<updated>2022-06-23T18:32:50Z</updated>

		<summary type="html">&lt;p&gt;Dgruner: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Please see our Docker page.&lt;/div&gt;</summary>
		<author><name>Dgruner</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=3875</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=3875"/>
		<updated>2022-05-23T20:46:46Z</updated>

		<summary type="html">&lt;p&gt;Dgruner: /* System Status */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
{| style=&amp;quot;border-spacing:10px; width: 95%&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:1em; padding-top:.1em; border:2px solid #0645ad; background-color:#f6f6f6; border-radius:7px&amp;quot;|&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Use &amp;quot;Up&amp;quot; or &amp;quot;Down&amp;quot;; these are templates. --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;width:100%&amp;quot; &lt;br /&gt;
|{{Down |Niagara|Niagara_Quickstart}}&lt;br /&gt;
|{{Down |Mist|Mist}}&lt;br /&gt;
|{{Down |Teach|Teach}}&lt;br /&gt;
|{{Down |Rouge|Rouge}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Down |Jupyter Hub|Jupyter_Hub}}&lt;br /&gt;
|{{Down |Scheduler|Niagara_Quickstart#Submitting_jobs}}&lt;br /&gt;
|{{Down |File system|Niagara_Quickstart#Storage_and_quotas}}&lt;br /&gt;
|{{Down |Burst Buffer|Burst_Buffer}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Down |HPSS|HPSS}}&lt;br /&gt;
|{{Down |Login Nodes|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Down |External Network|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Down |Globus |Globus}}&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
'''Mon May 23rd, 2022, 16:44:30 EDT:''' Systems still down. Filesystems are working, but there are quite a number of drive failures - no data loss - so out of an abundance of caution we are keeping the systems down at least until tomorrow.  The long weekend has also been disruptive for service response, and we prefer to err on the safe side.&lt;br /&gt;
&lt;br /&gt;
'''Mon May 23rd, 2022, 08:12:14 EDT:''' Systems still down. Filesystems being checked to ensure no heat damage.&lt;br /&gt;
&lt;br /&gt;
'''Sun May 22nd, 2022, 10.16 am EDT:''' Electrician dispatched to replace blown fuses.&lt;br /&gt;
&lt;br /&gt;
'''Sun May 22nd, 2022, 2:54 am EDT:''' Automatic shutdown due to power/cooling issues.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  When removing system status entries, please archive them to: --&amp;gt;&lt;br /&gt;
[[Previous messages]]&lt;br /&gt;
&lt;br /&gt;
{|style=&amp;quot;border-spacing: 10px;width: 100%&amp;quot;&lt;br /&gt;
|valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== QuickStart Guides ==&lt;br /&gt;
* [[Niagara Quickstart]]&lt;br /&gt;
* [[HPSS | HPSS archival storage]]&lt;br /&gt;
* [[Mist| Mist Power 9 GPU cluster]]&lt;br /&gt;
* [[Teach|Teach cluster]]&lt;br /&gt;
* [[FAQ | FAQ (frequently asked questions)]]&lt;br /&gt;
* [[Acknowledging SciNet]]&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== Tutorials, Manuals, etc. ==&lt;br /&gt;
* [https://education.scinet.utoronto.ca SciNet education material]&lt;br /&gt;
* [https://www.youtube.com/c/SciNetHPCattheUniversityofToronto SciNet's YouTube channel]&lt;br /&gt;
* [[Modules specific to Niagara|Software Modules specific to Niagara]] &lt;br /&gt;
* [[Modules for Mist]] &lt;br /&gt;
* [[Commercial software]]&lt;br /&gt;
* [[Burst Buffer]]&lt;br /&gt;
* [[SSH#SSH Keys|SSH keys]]&lt;br /&gt;
* [[SSH Tunneling]]&lt;br /&gt;
* [[SSH#Two-Factor_authentication|Two-Factor Authentication]]&lt;br /&gt;
* [[Visualization]]&lt;br /&gt;
* [[Running Serial Jobs on Niagara]]&lt;br /&gt;
* [[Jupyter Hub]]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Dgruner</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=3872</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=3872"/>
		<updated>2022-05-23T12:15:46Z</updated>

		<summary type="html">&lt;p&gt;Dgruner: /* System Status */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
{| style=&amp;quot;border-spacing:10px; width: 95%&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:1em; padding-top:.1em; border:2px solid #0645ad; background-color:#f6f6f6; border-radius:7px&amp;quot;|&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Use &amp;quot;Up&amp;quot; or &amp;quot;Down&amp;quot;; these are templates. --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;width:100%&amp;quot; &lt;br /&gt;
|{{Down |Niagara|Niagara_Quickstart}}&lt;br /&gt;
|{{Down |Mist|Mist}}&lt;br /&gt;
|{{Down |Teach|Teach}}&lt;br /&gt;
|{{Down |Rouge|Rouge}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Down |Jupyter Hub|Jupyter_Hub}}&lt;br /&gt;
|{{Down |Scheduler|Niagara_Quickstart#Submitting_jobs}}&lt;br /&gt;
|{{Down |File system|Niagara_Quickstart#Storage_and_quotas}}&lt;br /&gt;
|{{Down |Burst Buffer|Burst_Buffer}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Down |HPSS|HPSS}}&lt;br /&gt;
|{{Down |Login Nodes|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Down |External Network|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Down |Globus |Globus}}&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
'''Mon May 23rd, 2022, 08:12:14 EDT:''' Systems still down. Filesystems being checked to ensure no heat damage.&lt;br /&gt;
&lt;br /&gt;
'''Sun May 22nd, 2022, 10.16 am EDT:''' Electrician dispatched to replace blown fuses.&lt;br /&gt;
&lt;br /&gt;
'''Sun May 22nd, 2022, 2:54 am EDT:''' Automatic shutdown due to power/cooling issues.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  When removing system status entries, please archive them to: --&amp;gt;&lt;br /&gt;
[[Previous messages]]&lt;br /&gt;
&lt;br /&gt;
{|style=&amp;quot;border-spacing: 10px;width: 100%&amp;quot;&lt;br /&gt;
|valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== QuickStart Guides ==&lt;br /&gt;
* [[Niagara Quickstart]]&lt;br /&gt;
* [[HPSS | HPSS archival storage]]&lt;br /&gt;
* [[Mist| Mist Power 9 GPU cluster]]&lt;br /&gt;
* [[Teach|Teach cluster]]&lt;br /&gt;
* [[FAQ | FAQ (frequently asked questions)]]&lt;br /&gt;
* [[Acknowledging SciNet]]&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== Tutorials, Manuals, etc. ==&lt;br /&gt;
* [https://education.scinet.utoronto.ca SciNet education material]&lt;br /&gt;
* [https://www.youtube.com/c/SciNetHPCattheUniversityofToronto SciNet's YouTube channel]&lt;br /&gt;
* [[Modules specific to Niagara|Software Modules specific to Niagara]] &lt;br /&gt;
* [[Modules for Mist]] &lt;br /&gt;
* [[Commercial software]]&lt;br /&gt;
* [[Burst Buffer]]&lt;br /&gt;
* [[SSH#SSH Keys|SSH keys]]&lt;br /&gt;
* [[SSH Tunneling]]&lt;br /&gt;
* [[SSH#Two-Factor_authentication|Two-Factor Authentication]]&lt;br /&gt;
* [[Visualization]]&lt;br /&gt;
* [[Running Serial Jobs on Niagara]]&lt;br /&gt;
* [[Jupyter Hub]]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Dgruner</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=3815</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=3815"/>
		<updated>2022-05-09T11:27:49Z</updated>

		<summary type="html">&lt;p&gt;Dgruner: /* System Status */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
{| style=&amp;quot;border-spacing:10px; width: 95%&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:1em; padding-top:.1em; border:2px solid #0645ad; background-color:#f6f6f6; border-radius:7px&amp;quot;|&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Use &amp;quot;Up&amp;quot; or &amp;quot;Down&amp;quot;; these are templates. --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;width:100%&amp;quot; &lt;br /&gt;
|{{Up |Niagara|Niagara_Quickstart}}&lt;br /&gt;
|{{Up |Mist|Mist}}&lt;br /&gt;
|{{Up |Teach|Teach}}&lt;br /&gt;
|{{Up |Rouge|Rouge}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up |Jupyter Hub|Jupyter_Hub}}&lt;br /&gt;
|{{Up |Scheduler|Niagara_Quickstart#Submitting_jobs}}&lt;br /&gt;
|{{Up |File system|Niagara_Quickstart#Storage_and_quotas}}&lt;br /&gt;
|{{Up |Burst Buffer|Burst_Buffer}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up |HPSS|HPSS}}&lt;br /&gt;
|{{Up |Login Nodes|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up |External Network|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up |Globus |Globus}}&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
'''Fri May 6th, 2022, 11:35 am:''' HPSS scheduler upgrade also finished.&lt;br /&gt;
&lt;br /&gt;
'''Thu May 5th, 2022, 7:45 pm:''' Upgrade of the scheduler has finished, with the exception of HPSS.&lt;br /&gt;
&lt;br /&gt;
'''Thu May 5th, 2022, 7:00 am - 3:00 pm EDT (approx):''' Starting from 7:00 am EDT, an upgrade of the scheduler of the Niagara, Mist, and Rouge clusters will be applied.  This requires the scheduler to be down for about 5-6 hours, and all compute and login nodes to be rebooted.&lt;br /&gt;
Jobs cannot be submitted during this maintenance, but jobs submitted beforehand will remain in the queue.  For most of the time, the login nodes of the clusters will be available so that users may access their files on the home, scratch, and project file systems.&lt;br /&gt;
&lt;br /&gt;
'''Monday May 2nd, 2022, 9:30 - 11:00 am EDT:''' the Niagara login nodes, the jupyter hub, and nia-datamover2 will get rebooted for updates.  In the process, any login sessions will get disconnected, and servers on the jupyterhub will stop. Jobs in the Niagara queue will not be affected.&lt;br /&gt;
&lt;br /&gt;
'''Tue Apr 26, 11:20 AM EDT:''' A rolling update of the Mist cluster is taking a bit longer than expected, affecting logins to Mist. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;!--  When removing system status entries, please archive them to: --&amp;gt;&lt;br /&gt;
[[Previous messages]]&lt;br /&gt;
&lt;br /&gt;
{|style=&amp;quot;border-spacing: 10px;width: 100%&amp;quot;&lt;br /&gt;
|valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== QuickStart Guides ==&lt;br /&gt;
* [[Niagara Quickstart]]&lt;br /&gt;
* [[HPSS | HPSS archival storage]]&lt;br /&gt;
* [[Mist| Mist Power 9 GPU cluster]]&lt;br /&gt;
* [[Teach|Teach cluster]]&lt;br /&gt;
* [[FAQ | FAQ (frequently asked questions)]]&lt;br /&gt;
* [[Acknowledging SciNet]]&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== Tutorials, Manuals, etc. ==&lt;br /&gt;
* [https://education.scinet.utoronto.ca SciNet education material]&lt;br /&gt;
* [https://www.youtube.com/c/SciNetHPCattheUniversityofToronto SciNet's YouTube channel]&lt;br /&gt;
* [[Modules specific to Niagara|Software Modules specific to Niagara]] &lt;br /&gt;
* [[Modules for Mist]] &lt;br /&gt;
* [[Commercial software]]&lt;br /&gt;
* [[Burst Buffer]]&lt;br /&gt;
* [[SSH#SSH Keys|SSH keys]]&lt;br /&gt;
* [[SSH Tunneling]]&lt;br /&gt;
* [[SSH#Two-Factor_authentication|Two-Factor Authentication]]&lt;br /&gt;
* [[Visualization]]&lt;br /&gt;
* [[Running Serial Jobs on Niagara]]&lt;br /&gt;
* [[Jupyter Hub]]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Dgruner</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=3500</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=3500"/>
		<updated>2022-01-29T16:23:34Z</updated>

		<summary type="html">&lt;p&gt;Dgruner: /* System Status */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
{| style=&amp;quot;border-spacing:10px; width: 95%&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:1em; padding-top:.1em; border:2px solid #0645ad; background-color:#f6f6f6; border-radius:7px&amp;quot;|&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Use &amp;quot;Up&amp;quot; or &amp;quot;Down&amp;quot;; these are templates. --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;width:100%&amp;quot; &lt;br /&gt;
|{{Up |Niagara|Niagara_Quickstart}}&lt;br /&gt;
|{{Up |Mist|Mist}}&lt;br /&gt;
|{{Up |Teach|Teach}}&lt;br /&gt;
|{{Up |Rouge|Rouge}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up |Jupyter Hub|Jupyter_Hub}}&lt;br /&gt;
|{{Up |Scheduler|Niagara_Quickstart#Submitting_jobs}}&lt;br /&gt;
|{{Up |File system|Niagara_Quickstart#Storage_and_quotas}}&lt;br /&gt;
|{{Up |Burst Buffer|Burst_Buffer}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up|HPSS|HPSS}}&lt;br /&gt;
|{{Up |Login Nodes|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Down |External Network|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Down |Globus |Globus}}&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Current Messages: --&amp;gt;&lt;br /&gt;
&amp;lt;b&amp;gt;Sat 29 Jan 2022 11:22:27 EST&amp;lt;/b&amp;gt; Fibre repair is underway.  We expect to have connectivity restored later today.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Fri 28 Jan 2022 07:35:01 EST&amp;lt;/b&amp;gt; The fibre optic cable that connects the SciNet datacentre was severed by uncoordinated digging at York University.  We expect repairs to happen as soon as possible.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Thu Jan 27 12:46 EST PM 2022&amp;lt;/b&amp;gt; Network issues to and from the datacentre. We are investigating.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Sun Jan 23 11:05 EST AM 2022&amp;lt;/b&amp;gt; Filesystem issues appear to have resolved.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Sun Jan 23 10:30 EST AM 2022&amp;lt;/b&amp;gt; Filesystem issues -- investigating.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Sat Jan 8 11:42 EST AM 2022&amp;lt;/b&amp;gt; The emergency maintenance is complete. Systems are up and available.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Fri Jan 7 14:34 EST PM 2022&amp;lt;/b&amp;gt; The SciNet shutdown is in progress. Systems are expected back on Saturday, Jan 8.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  When removing system status entries, please archive them to: https://docs.scinet.utoronto.ca/index.php/Previous_messages --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;border-spacing: 10px;width: 100%&amp;quot;&lt;br /&gt;
|valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== QuickStart Guides ==&lt;br /&gt;
* [[Niagara Quickstart]]&lt;br /&gt;
* [[HPSS | HPSS archival storage]]&lt;br /&gt;
* [[Mist| Mist Power 9 GPU cluster]]&lt;br /&gt;
* [[Teach|Teach cluster]]&lt;br /&gt;
* [[FAQ | FAQ (frequently asked questions)]]&lt;br /&gt;
* [[Acknowledging SciNet]]&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== Tutorials, Manuals, etc. ==&lt;br /&gt;
* [https://education.scinet.utoronto.ca SciNet education material]&lt;br /&gt;
* [https://www.youtube.com/c/SciNetHPCattheUniversityofToronto SciNet's YouTube channel]&lt;br /&gt;
* [[Modules specific to Niagara|Software Modules specific to Niagara]] &lt;br /&gt;
* [[Modules for Mist]] &lt;br /&gt;
* [[Commercial software]]&lt;br /&gt;
* [[Burst Buffer]]&lt;br /&gt;
* [[SSH#SSH Keys|SSH keys]]&lt;br /&gt;
* [[SSH Tunneling]]&lt;br /&gt;
* [[SSH#Two-Factor_authentication|Two-Factor Authentication]]&lt;br /&gt;
* [[Visualization]]&lt;br /&gt;
* [[Running Serial Jobs on Niagara]]&lt;br /&gt;
* [[Jupyter Hub]]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Dgruner</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=3494</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=3494"/>
		<updated>2022-01-28T12:38:28Z</updated>

		<summary type="html">&lt;p&gt;Dgruner: /* System Status */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
{| style=&amp;quot;border-spacing:10px; width: 95%&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:1em; padding-top:.1em; border:2px solid #0645ad; background-color:#f6f6f6; border-radius:7px&amp;quot;|&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Use &amp;quot;Up&amp;quot; or &amp;quot;Down&amp;quot;; these are templates. --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;width:100%&amp;quot; &lt;br /&gt;
|{{Up |Niagara|Niagara_Quickstart}}&lt;br /&gt;
|{{Up |Mist|Mist}}&lt;br /&gt;
|{{Up |Teach|Teach}}&lt;br /&gt;
|{{Up |Rouge|Rouge}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up |Jupyter Hub|Jupyter_Hub}}&lt;br /&gt;
|{{Up |Scheduler|Niagara_Quickstart#Submitting_jobs}}&lt;br /&gt;
|{{Up |File system|Niagara_Quickstart#Storage_and_quotas}}&lt;br /&gt;
|{{Up |Burst Buffer|Burst_Buffer}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up|HPSS|HPSS}}&lt;br /&gt;
|{{Up |Login Nodes|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Down |External Network|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up |Globus |Globus}}&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Current Messages: --&amp;gt;&lt;br /&gt;
&amp;lt;b&amp;gt;Fri 28 Jan 2022 07:35:01 EST&amp;lt;/b&amp;gt; The optical fibre cable that connects the SciNet datacentre was severed by uncoordinated digging at York University.  We expect repairs to happen today.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Thu Jan 27 12:46 EST PM 2022&amp;lt;/b&amp;gt; Network issues to and from the datacentre. We are investigating.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Sun Jan 23 11:05 EST AM 2022&amp;lt;/b&amp;gt; Filesystem issues appear to have resolved.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Sun Jan 23 10:30 EST AM 2022&amp;lt;/b&amp;gt; Filesystem issues -- investigating.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Sat Jan 8 11:42 EST AM 2022&amp;lt;/b&amp;gt; The emergency maintenance is complete. Systems are up and available.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Fri Jan 7 14:34 EST PM 2022&amp;lt;/b&amp;gt; The SciNet shutdown is in progress. Systems are expected back on Saturday, Jan 8.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  When removing system status entries, please archive them to: https://docs.scinet.utoronto.ca/index.php/Previous_messages --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;border-spacing: 10px;width: 100%&amp;quot;&lt;br /&gt;
|valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== QuickStart Guides ==&lt;br /&gt;
* [[Niagara Quickstart]]&lt;br /&gt;
* [[HPSS | HPSS archival storage]]&lt;br /&gt;
* [[Mist| Mist Power 9 GPU cluster]]&lt;br /&gt;
* [[Teach|Teach cluster]]&lt;br /&gt;
* [[FAQ | FAQ (frequently asked questions)]]&lt;br /&gt;
* [[Acknowledging SciNet]]&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== Tutorials, Manuals, etc. ==&lt;br /&gt;
* [https://education.scinet.utoronto.ca SciNet education material]&lt;br /&gt;
* [https://www.youtube.com/c/SciNetHPCattheUniversityofToronto SciNet's YouTube channel]&lt;br /&gt;
* [[Modules specific to Niagara|Software Modules specific to Niagara]] &lt;br /&gt;
* [[Modules for Mist]] &lt;br /&gt;
* [[Commercial software]]&lt;br /&gt;
* [[Burst Buffer]]&lt;br /&gt;
* [[SSH#SSH Keys|SSH keys]]&lt;br /&gt;
* [[SSH Tunneling]]&lt;br /&gt;
* [[SSH#Two-Factor_authentication|Two-Factor Authentication]]&lt;br /&gt;
* [[Visualization]]&lt;br /&gt;
* [[Running Serial Jobs on Niagara]]&lt;br /&gt;
* [[Jupyter Hub]]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Dgruner</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=3410</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=3410"/>
		<updated>2022-01-08T16:44:41Z</updated>

		<summary type="html">&lt;p&gt;Dgruner: /* System Status */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
{| style=&amp;quot;border-spacing:10px; width: 95%&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:1em; padding-top:.1em; border:2px solid #0645ad; background-color:#f6f6f6; border-radius:7px&amp;quot;|&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Use &amp;quot;Up&amp;quot; or &amp;quot;Down&amp;quot;; these are templates. --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;width:100%&amp;quot; &lt;br /&gt;
|{{Up |Niagara|Niagara_Quickstart}}&lt;br /&gt;
|{{Up |Mist|Mist}}&lt;br /&gt;
|{{Up |Teach|Teach}}&lt;br /&gt;
|{{Up |Rouge|Rouge}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Down |Jupyter Hub|Jupyter_Hub}}&lt;br /&gt;
|{{Up |Scheduler|Niagara_Quickstart#Submitting_jobs}}&lt;br /&gt;
|{{Up |File system|Niagara_Quickstart#Storage_and_quotas}}&lt;br /&gt;
|{{Up |Burst Buffer|Burst_Buffer}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up|HPSS|HPSS}}&lt;br /&gt;
|{{Up |Login Nodes|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up |External Network|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up |Globus |Globus}}&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Current Messages: --&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Sat Jan 8 11:42 EST AM 2022&amp;lt;/b&amp;gt; The emergency maintenance is complete. Systems are up and available.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Fri Jan 7 14:34 EST PM 2022&amp;lt;/b&amp;gt; The SciNet shutdown is in progress. Systems are expected back on Saturday, Jan 8.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;&amp;lt;span style=&amp;quot;color:red&amp;quot;&amp;gt;Emergency shutdown Friday January 7, 2022&amp;lt;/span&amp;gt;&amp;lt;/b&amp;gt;: An emergency shutdown of all SciNet systems to replace a crucial file system component is planned for Friday January 7, 2022, starting at 8am EST, and will require at least 12 hours of downtime.  Updates will be posted during the day.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Thu Jan 6 08:20 EST AM 2022&amp;lt;/b&amp;gt; The SciNet filesystem is having issues.  We are investigating.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Fri Dec 24 13:31 EST PM 2021&amp;lt;/b&amp;gt; Please note the following scheduled network maintenance, which will result in loss of connectivity to the SciNet datacentre.  Start time: Dec 29, 00:30 EST.  Estimated duration: 4 hours and 30 minutes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Mon Dec 20 4:29 EST PM 2021&amp;lt;/b&amp;gt; Filesystem is back to normal. &lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Mon Dec 20 2:53 EST PM 2021&amp;lt;/b&amp;gt; Filesystem problem - We are investigating. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  When removing system status entries, please archive them to: https://docs.scinet.utoronto.ca/index.php/Previous_messages --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;border-spacing: 10px;width: 100%&amp;quot;&lt;br /&gt;
|valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== QuickStart Guides ==&lt;br /&gt;
* [[Niagara Quickstart]]&lt;br /&gt;
* [[HPSS | HPSS archival storage]]&lt;br /&gt;
* [[Mist| Mist Power 9 GPU cluster]]&lt;br /&gt;
* [[Teach|Teach cluster]]&lt;br /&gt;
* [[FAQ | FAQ (frequently asked questions)]]&lt;br /&gt;
* [[Acknowledging SciNet]]&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== Tutorials, Manuals, etc. ==&lt;br /&gt;
* [https://education.scinet.utoronto.ca SciNet education material]&lt;br /&gt;
* [https://www.youtube.com/c/SciNetHPCattheUniversityofToronto SciNet's YouTube channel]&lt;br /&gt;
* [[Modules specific to Niagara|Software Modules specific to Niagara]] &lt;br /&gt;
* [[Modules for Mist]] &lt;br /&gt;
* [[Commercial software]]&lt;br /&gt;
* [[Burst Buffer]]&lt;br /&gt;
* [[SSH keys]]&lt;br /&gt;
* [[SSH Tunneling]]&lt;br /&gt;
* [[SSH#Two-Factor_authentication|Two-Factor Authentication]]&lt;br /&gt;
* [[Visualization]]&lt;br /&gt;
* [[Running Serial Jobs on Niagara]]&lt;br /&gt;
* [[Jupyter Hub]]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Dgruner</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=3407</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=3407"/>
		<updated>2022-01-07T19:36:48Z</updated>

		<summary type="html">&lt;p&gt;Dgruner: /* System Status */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
{| style=&amp;quot;border-spacing:10px; width: 95%&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:1em; padding-top:.1em; border:2px solid #0645ad; background-color:#f6f6f6; border-radius:7px&amp;quot;|&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Use &amp;quot;Up&amp;quot; or &amp;quot;Down&amp;quot;; these are templates. --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;width:100%&amp;quot; &lt;br /&gt;
|{{Down |Niagara|Niagara_Quickstart}}&lt;br /&gt;
|{{Down |Mist|Mist}}&lt;br /&gt;
|{{Down |Teach|Teach}}&lt;br /&gt;
|{{Down |Rouge|Rouge}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Down |Jupyter Hub|Jupyter_Hub}}&lt;br /&gt;
|{{Down |Scheduler|Niagara_Quickstart#Submitting_jobs}}&lt;br /&gt;
|{{Down |File system|Niagara_Quickstart#Storage_and_quotas}}&lt;br /&gt;
|{{Down |Burst Buffer|Burst_Buffer}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Down|HPSS|HPSS}}&lt;br /&gt;
|{{Down |Login Nodes|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Down |External Network|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Down |Globus |Globus}}&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Current Messages: --&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Fri Jan 7 14:34 EST PM 2022&amp;lt;/b&amp;gt; The SciNet shutdown is in progress. Systems are expected back on Saturday, Jan 8.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;&amp;lt;span style=&amp;quot;color:red&amp;quot;&amp;gt;Emergency shutdown Friday January 7, 2022&amp;lt;/span&amp;gt;&amp;lt;/b&amp;gt;: An emergency shutdown of all SciNet systems to replace a crucial file system component is planned for Friday January 7, 2022, starting at 8am EST, and will require at least 12 hours of downtime.  Updates will be posted during the day.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Thu Jan 6 08:20 EST AM 2022&amp;lt;/b&amp;gt; The SciNet filesystem is having issues.  We are investigating.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Fri Dec 24 13:31 EST PM 2021&amp;lt;/b&amp;gt; Please note the following scheduled network maintenance, which will result in loss of connectivity to the SciNet datacentre.  Start time: Dec 29, 00:30 EST.  Estimated duration: 4 hours and 30 minutes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Mon Dec 20 4:29 EST PM 2021&amp;lt;/b&amp;gt; Filesystem is back to normal. &lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Mon Dec 20 2:53 EST PM 2021&amp;lt;/b&amp;gt; Filesystem problem - We are investigating. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  When removing system status entries, please archive them to: https://docs.scinet.utoronto.ca/index.php/Previous_messages --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;border-spacing: 10px;width: 100%&amp;quot;&lt;br /&gt;
|valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== QuickStart Guides ==&lt;br /&gt;
* [[Niagara Quickstart]]&lt;br /&gt;
* [[HPSS | HPSS archival storage]]&lt;br /&gt;
* [[Mist| Mist Power 9 GPU cluster]]&lt;br /&gt;
* [[Teach|Teach cluster]]&lt;br /&gt;
* [[FAQ | FAQ (frequently asked questions)]]&lt;br /&gt;
* [[Acknowledging SciNet]]&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== Tutorials, Manuals, etc. ==&lt;br /&gt;
* [https://education.scinet.utoronto.ca SciNet education material]&lt;br /&gt;
* [https://www.youtube.com/c/SciNetHPCattheUniversityofToronto SciNet's YouTube channel]&lt;br /&gt;
* [[Modules specific to Niagara|Software Modules specific to Niagara]] &lt;br /&gt;
* [[Modules for Mist]] &lt;br /&gt;
* [[Commercial software]]&lt;br /&gt;
* [[Burst Buffer]]&lt;br /&gt;
* [[SSH keys]]&lt;br /&gt;
* [[SSH Tunneling]]&lt;br /&gt;
* [[SSH#Two-Factor_authentication|Two-Factor Authentication]]&lt;br /&gt;
* [[Visualization]]&lt;br /&gt;
* [[Running Serial Jobs on Niagara]]&lt;br /&gt;
* [[Jupyter Hub]]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Dgruner</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=3404</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=3404"/>
		<updated>2022-01-07T13:09:44Z</updated>

		<summary type="html">&lt;p&gt;Dgruner: /* System Status */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
{| style=&amp;quot;border-spacing:10px; width: 95%&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:1em; padding-top:.1em; border:2px solid #0645ad; background-color:#f6f6f6; border-radius:7px&amp;quot;|&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Use &amp;quot;Up&amp;quot; or &amp;quot;Down&amp;quot;; these are templates. --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;width:100%&amp;quot; &lt;br /&gt;
|{{Down |Niagara|Niagara_Quickstart}}&lt;br /&gt;
|{{Down |Mist|Mist}}&lt;br /&gt;
|{{Down |Teach|Teach}}&lt;br /&gt;
|{{Down |Rouge|Rouge}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Down |Jupyter Hub|Jupyter_Hub}}&lt;br /&gt;
|{{Down |Scheduler|Niagara_Quickstart#Submitting_jobs}}&lt;br /&gt;
|{{Down |File system|Niagara_Quickstart#Storage_and_quotas}}&lt;br /&gt;
|{{Down |Burst Buffer|Burst_Buffer}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Down|HPSS|HPSS}}&lt;br /&gt;
|{{Down |Login Nodes|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Down |External Network|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Down |Globus |Globus}}&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Current Messages: --&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Thu Jan 6 08:20 EST AM 2022&amp;lt;/b&amp;gt; The SciNet filesystem is having issues.  We are investigating.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;&amp;lt;span style=&amp;quot;color:red&amp;quot;&amp;gt;Emergency shutdown Friday January 7, 2022&amp;lt;/span&amp;gt;&amp;lt;/b&amp;gt;: An emergency shutdown of all SciNet systems to replace a crucial file system component is planned for Friday January 7, 2022, starting at 8am EST, and will require at least 12 hours of downtime.  Updates will be posted during the day.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Fri Dec 24 13:31 EST PM 2021&amp;lt;/b&amp;gt; Please note the following scheduled network maintenance, which will result in loss of connectivity to the SciNet datacentre.  Start time: Dec 29, 00:30 EST.  Estimated duration: 4 hours and 30 minutes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Mon Dec 20 4:29 EST PM 2021&amp;lt;/b&amp;gt; Filesystem is back to normal. &lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Mon Dec 20 2:53 EST PM 2021&amp;lt;/b&amp;gt; Filesystem problem - We are investigating. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  When removing system status entries, please archive them to: https://docs.scinet.utoronto.ca/index.php/Previous_messages --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;border-spacing: 10px;width: 100%&amp;quot;&lt;br /&gt;
|valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== QuickStart Guides ==&lt;br /&gt;
* [[Niagara Quickstart]]&lt;br /&gt;
* [[HPSS | HPSS archival storage]]&lt;br /&gt;
* [[Mist| Mist Power 9 GPU cluster]]&lt;br /&gt;
* [[Teach|Teach cluster]]&lt;br /&gt;
* [[FAQ | FAQ (frequently asked questions)]]&lt;br /&gt;
* [[Acknowledging SciNet]]&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== Tutorials, Manuals, etc. ==&lt;br /&gt;
* [https://education.scinet.utoronto.ca SciNet education material]&lt;br /&gt;
* [https://www.youtube.com/c/SciNetHPCattheUniversityofToronto SciNet's YouTube channel]&lt;br /&gt;
* [[Modules specific to Niagara|Software Modules specific to Niagara]] &lt;br /&gt;
* [[Modules for Mist]] &lt;br /&gt;
* [[Commercial software]]&lt;br /&gt;
* [[Burst Buffer]]&lt;br /&gt;
* [[SSH keys]]&lt;br /&gt;
* [[SSH Tunneling]]&lt;br /&gt;
* [[SSH#Two-Factor_authentication|Two-Factor Authentication]]&lt;br /&gt;
* [[Visualization]]&lt;br /&gt;
* [[Running Serial Jobs on Niagara]]&lt;br /&gt;
* [[Jupyter Hub]]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Dgruner</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=3401</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=3401"/>
		<updated>2022-01-06T19:17:47Z</updated>

		<summary type="html">&lt;p&gt;Dgruner: /* System Status */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
{| style=&amp;quot;border-spacing:10px; width: 95%&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:1em; padding-top:.1em; border:2px solid #0645ad; background-color:#f6f6f6; border-radius:7px&amp;quot;|&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Use &amp;quot;Up&amp;quot; or &amp;quot;Down&amp;quot;; these are templates. --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;width:100%&amp;quot; &lt;br /&gt;
|{{Up |Niagara|Niagara_Quickstart}}&lt;br /&gt;
|{{Up |Mist|Mist}}&lt;br /&gt;
|{{Up |Teach|Teach}}&lt;br /&gt;
|{{Up |Rouge|Rouge}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up |Jupyter Hub|Jupyter_Hub}}&lt;br /&gt;
|{{Up |Scheduler|Niagara_Quickstart#Submitting_jobs}}&lt;br /&gt;
|{{Down |File system|Niagara_Quickstart#Storage_and_quotas}}&lt;br /&gt;
|{{Up |Burst Buffer|Burst_Buffer}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up|HPSS|HPSS}}&lt;br /&gt;
|{{Up |Login Nodes|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up |External Network|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up |Globus |Globus}}&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Current Messages: --&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Thu Jan 6 08:20 EST AM 2022&amp;lt;/b&amp;gt; The SciNet filesystem is having issues.  We are investigating.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;&amp;lt;span style=&amp;quot;color:red&amp;quot;&amp;gt;Emergency shutdown Friday January 7, 2022&amp;lt;/span&amp;gt;&amp;lt;/b&amp;gt;: An emergency shutdown of all SciNet systems to replace a crucial file system component is planned for Friday January 7, 2022, starting at 8am EST, and will require at least 12 hours of downtime.  Updates will be posted during the day.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Fri Dec 24 13:31 EST PM 2021&amp;lt;/b&amp;gt; Please note the following scheduled network maintenance, which will result in loss of connectivity to the SciNet datacentre.  Start time: Dec 29, 00:30 EST.  Estimated duration: 4 hours and 30 minutes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Mon Dec 20 4:29 EST PM 2021&amp;lt;/b&amp;gt; Filesystem is back to normal. &lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Mon Dec 20 2:53 EST PM 2021&amp;lt;/b&amp;gt; Filesystem problem - We are investigating. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  When removing system status entries, please archive them to: https://docs.scinet.utoronto.ca/index.php/Previous_messages --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;border-spacing: 10px;width: 100%&amp;quot;&lt;br /&gt;
|valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== QuickStart Guides ==&lt;br /&gt;
* [[Niagara Quickstart]]&lt;br /&gt;
* [[HPSS | HPSS archival storage]]&lt;br /&gt;
* [[Mist| Mist Power 9 GPU cluster]]&lt;br /&gt;
* [[Teach|Teach cluster]]&lt;br /&gt;
* [[FAQ | FAQ (frequently asked questions)]]&lt;br /&gt;
* [[Acknowledging SciNet]]&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== Tutorials, Manuals, etc. ==&lt;br /&gt;
* [https://education.scinet.utoronto.ca SciNet education material]&lt;br /&gt;
* [https://www.youtube.com/c/SciNetHPCattheUniversityofToronto SciNet's YouTube channel]&lt;br /&gt;
* [[Modules specific to Niagara|Software Modules specific to Niagara]] &lt;br /&gt;
* [[Modules for Mist]] &lt;br /&gt;
* [[Commercial software]]&lt;br /&gt;
* [[Burst Buffer]]&lt;br /&gt;
* [[SSH keys]]&lt;br /&gt;
* [[SSH Tunneling]]&lt;br /&gt;
* [[SSH#Two-Factor_authentication|Two-Factor Authentication]]&lt;br /&gt;
* [[Visualization]]&lt;br /&gt;
* [[Running Serial Jobs on Niagara]]&lt;br /&gt;
* [[Jupyter Hub]]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Dgruner</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=3398</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=3398"/>
		<updated>2022-01-06T18:47:47Z</updated>

		<summary type="html">&lt;p&gt;Dgruner: /* System Status */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
{| style=&amp;quot;border-spacing:10px; width: 95%&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:1em; padding-top:.1em; border:2px solid #0645ad; background-color:#f6f6f6; border-radius:7px&amp;quot;|&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Use &amp;quot;Up&amp;quot; or &amp;quot;Down&amp;quot;; these are templates. --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;width:100%&amp;quot; &lt;br /&gt;
|{{Up |Niagara|Niagara_Quickstart}}&lt;br /&gt;
|{{Up |Mist|Mist}}&lt;br /&gt;
|{{Up |Teach|Teach}}&lt;br /&gt;
|{{Up |Rouge|Rouge}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up |Jupyter Hub|Jupyter_Hub}}&lt;br /&gt;
|{{Up |Scheduler|Niagara_Quickstart#Submitting_jobs}}&lt;br /&gt;
|{{Down |File system|Niagara_Quickstart#Storage_and_quotas}}&lt;br /&gt;
|{{Up |Burst Buffer|Burst_Buffer}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up|HPSS|HPSS}}&lt;br /&gt;
|{{Up |Login Nodes|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up |External Network|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up |Globus |Globus}}&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Current Messages: --&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Thu Jan 6 08:20 EST AM 2022&amp;lt;/b&amp;gt; The SciNet filesystem is having issues.  We are investigating.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;&amp;lt;span style=&amp;quot;color:red&amp;quot;&amp;gt;Emergency shutdown Friday January 7, 2022&amp;lt;/span&amp;gt;&amp;lt;/b&amp;gt;: An emergency shutdown of all SciNet systems to replace a crucial file system component is planned for Friday January 7, 2022, starting at 8am EST, and will require at least 6 hours of downtime.  Updates will be posted during the day.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Fri Dec 24 13:31 EST PM 2021&amp;lt;/b&amp;gt; Please note the following scheduled network maintenance, which will result in loss of connectivity to the SciNet datacentre.  Start time: Dec 29, 00:30 EST.  Estimated duration: 4 hours and 30 minutes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Mon Dec 20 4:29 EST PM 2021&amp;lt;/b&amp;gt; Filesystem is back to normal. &lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Mon Dec 20 2:53 EST PM 2021&amp;lt;/b&amp;gt; Filesystem problem - We are investigating. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  When removing system status entries, please archive them to: https://docs.scinet.utoronto.ca/index.php/Previous_messages --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;border-spacing: 10px;width: 100%&amp;quot;&lt;br /&gt;
|valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== QuickStart Guides ==&lt;br /&gt;
* [[Niagara Quickstart]]&lt;br /&gt;
* [[HPSS | HPSS archival storage]]&lt;br /&gt;
* [[Mist| Mist Power 9 GPU cluster]]&lt;br /&gt;
* [[Teach|Teach cluster]]&lt;br /&gt;
* [[FAQ | FAQ (frequently asked questions)]]&lt;br /&gt;
* [[Acknowledging SciNet]]&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== Tutorials, Manuals, etc. ==&lt;br /&gt;
* [https://education.scinet.utoronto.ca SciNet education material]&lt;br /&gt;
* [https://www.youtube.com/c/SciNetHPCattheUniversityofToronto SciNet's YouTube channel]&lt;br /&gt;
* [[Modules specific to Niagara|Software Modules specific to Niagara]] &lt;br /&gt;
* [[Modules for Mist]] &lt;br /&gt;
* [[Commercial software]]&lt;br /&gt;
* [[Burst Buffer]]&lt;br /&gt;
* [[SSH keys]]&lt;br /&gt;
* [[SSH Tunneling]]&lt;br /&gt;
* [[SSH#Two-Factor_authentication|Two-Factor Authentication]]&lt;br /&gt;
* [[Visualization]]&lt;br /&gt;
* [[Running Serial Jobs on Niagara]]&lt;br /&gt;
* [[Jupyter Hub]]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Dgruner</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=3383</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=3383"/>
		<updated>2021-12-24T18:34:26Z</updated>

		<summary type="html">&lt;p&gt;Dgruner: /* System Status */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
{| style=&amp;quot;border-spacing:10px; width: 95%&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:1em; padding-top:.1em; border:2px solid #0645ad; background-color:#f6f6f6; border-radius:7px&amp;quot;|&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Use &amp;quot;Up&amp;quot; or &amp;quot;Down&amp;quot;; these are templates. --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;width:100%&amp;quot; &lt;br /&gt;
|{{Up |Niagara|Niagara_Quickstart}}&lt;br /&gt;
|{{Up |Mist|Mist}}&lt;br /&gt;
|{{Up |Teach|Teach}}&lt;br /&gt;
|{{Up |Rouge|Rouge}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up |Jupyter Hub|Jupyter_Hub}}&lt;br /&gt;
|{{Up |Scheduler|Niagara_Quickstart#Submitting_jobs}}&lt;br /&gt;
|{{Up |File system|Niagara_Quickstart#Storage_and_quotas}}&lt;br /&gt;
|{{Up |Burst Buffer|Burst_Buffer}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up|HPSS|HPSS}}&lt;br /&gt;
|{{Up |Login Nodes|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up |External Network|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up |Globus |Globus}}&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Current Messages: --&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Fri Dec 24 13:31 EST PM 2021&amp;lt;/b&amp;gt; Please note the following scheduled network maintenance, which will result in loss of connectivity to the SciNet datacentre.  Start time: Dec 29, 00:30 EST.  Estimated duration: 4 hours and 30 minutes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Mon Dec 20 4:29 EST PM 2021&amp;lt;/b&amp;gt; Filesystem is back to normal. &lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Mon Dec 20 2:53 EST PM 2021&amp;lt;/b&amp;gt; Filesystem problem - We are investigating. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  When removing system status entries, please archive them to: https://docs.scinet.utoronto.ca/index.php/Previous_messages --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;border-spacing: 10px;width: 100%&amp;quot;&lt;br /&gt;
|valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== QuickStart Guides ==&lt;br /&gt;
* [[Niagara Quickstart]]&lt;br /&gt;
* [[HPSS | HPSS archival storage]]&lt;br /&gt;
* [[Mist| Mist Power 9 GPU cluster]]&lt;br /&gt;
* [[Teach|Teach cluster]]&lt;br /&gt;
* [[FAQ | FAQ (frequently asked questions)]]&lt;br /&gt;
* [[Acknowledging SciNet]]&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== Tutorials, Manuals, etc. ==&lt;br /&gt;
* [https://education.scinet.utoronto.ca SciNet education material]&lt;br /&gt;
* [https://www.youtube.com/c/SciNetHPCattheUniversityofToronto SciNet's YouTube channel]&lt;br /&gt;
* [[Modules specific to Niagara|Software Modules specific to Niagara]] &lt;br /&gt;
* [[Modules for Mist]] &lt;br /&gt;
* [[Commercial software]]&lt;br /&gt;
* [[Burst Buffer]]&lt;br /&gt;
* [[SSH keys]]&lt;br /&gt;
* [[SSH Tunneling]]&lt;br /&gt;
* [[SSH#Two-Factor_authentication|Two-Factor Authentication]]&lt;br /&gt;
* [[Visualization]]&lt;br /&gt;
* [[Running Serial Jobs on Niagara]]&lt;br /&gt;
* [[Jupyter Hub]]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Dgruner</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=3313</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=3313"/>
		<updated>2021-11-06T00:22:08Z</updated>

		<summary type="html">&lt;p&gt;Dgruner: /* System Status */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
{| style=&amp;quot;border-spacing:10px; width: 95%&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:1em; padding-top:.1em; border:2px solid #0645ad; background-color:#f6f6f6; border-radius:7px&amp;quot;|&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Use &amp;quot;Up&amp;quot; or &amp;quot;Down&amp;quot;; these are templates. --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;width:100%&amp;quot; &lt;br /&gt;
|{{Up |Niagara|Niagara_Quickstart}}&lt;br /&gt;
|{{Up |Mist|Mist}}&lt;br /&gt;
|{{Up |Teach|Teach}}&lt;br /&gt;
|{{Up |Rouge|Rouge}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up |Jupyter Hub|Jupyter_Hub}}&lt;br /&gt;
|{{Up |Scheduler|Niagara_Quickstart#Submitting_jobs}}&lt;br /&gt;
|{{Up |File system|Niagara_Quickstart#Storage_and_quotas}}&lt;br /&gt;
|{{Up |Burst Buffer|Burst_Buffer}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up|HPSS|HPSS}}&lt;br /&gt;
|{{Up |Login Nodes|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up |External Network|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up |Globus |Globus}}&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Current Messages: --&amp;gt;&lt;br /&gt;
&amp;lt;b&amp;gt;Fri Nov 5 19:35 EDT 2021 &amp;lt;/b&amp;gt; The filesystem issue from earlier in the afternoon is resolved.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Fri Nov 5 16:58 EDT 2021 &amp;lt;/b&amp;gt; We are experiencing filesystem issues; logging in to the clusters may not be possible until they are resolved.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Tue Oct 19 noon EDT - Thu Oct 21 noon EDT:&amp;lt;/b&amp;gt; &amp;lt;b&amp;gt;&amp;lt;i&amp;gt;Niagara at Scale:&amp;lt;/i&amp;gt;&amp;lt;/b&amp;gt; Only users of selected projects will run at large scale during these 48 hours. Other users can still log in, access their files, and submit jobs for after the event.  SOSCIP and Mist users are not affected.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Tue Oct 12 14:30 EDT 2021 &amp;lt;/b&amp;gt; Mist login node is back up.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Tue Oct 12 12:30 EDT 2021 &amp;lt;/b&amp;gt; Mist login node is down for maintenance.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Mon Sep 27 16:11 EDT 2021 &amp;lt;/b&amp;gt; HPSS is back online.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Wed Sep 23 17:23 EDT 2021 &amp;lt;/b&amp;gt; Systems are being brought back online. HPSS may be down for a few more days.  &lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  When removing system status entries, please archive them to: https://docs.scinet.utoronto.ca/index.php/Previous_messages --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;border-spacing: 10px;width: 100%&amp;quot;&lt;br /&gt;
|valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== QuickStart Guides ==&lt;br /&gt;
* [[Niagara Quickstart]]&lt;br /&gt;
* [[HPSS | HPSS archival storage]]&lt;br /&gt;
* [[Mist| Mist Power 9 GPU cluster]]&lt;br /&gt;
* [[Teach|Teach cluster]]&lt;br /&gt;
* [[FAQ | FAQ (frequently asked questions)]]&lt;br /&gt;
* [[Acknowledging SciNet]]&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== Tutorials, Manuals, etc. ==&lt;br /&gt;
* [https://education.scinet.utoronto.ca SciNet education material]&lt;br /&gt;
* [https://www.youtube.com/c/SciNetHPCattheUniversityofToronto SciNet's YouTube channel]&lt;br /&gt;
* [[Modules specific to Niagara|Software Modules specific to Niagara]] &lt;br /&gt;
* [[Modules for Mist]] &lt;br /&gt;
* [[Commercial software]]&lt;br /&gt;
* [[Burst Buffer]]&lt;br /&gt;
* [[SSH keys]]&lt;br /&gt;
* [[SSH Tunneling]]&lt;br /&gt;
* [[SSH#Two-Factor_authentication|Two-Factor Authentication]]&lt;br /&gt;
* [[Visualization]]&lt;br /&gt;
* [[Running Serial Jobs on Niagara]]&lt;br /&gt;
* [[Jupyter Hub]]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Dgruner</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=NAMD&amp;diff=2821</id>
		<title>NAMD</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=NAMD&amp;diff=2821"/>
		<updated>2020-09-14T16:31:25Z</updated>

		<summary type="html">&lt;p&gt;Dgruner: /* NAMD v2.14 */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;====NAMD v2.14====&lt;br /&gt;
&lt;br /&gt;
This is the NAMD version 2.14 Scalable Molecular Dynamics package, installed from the NAMD_2.14_Linux-x86_64-ibverbs-smp.tar.gz binary tarball from TCBG.&lt;br /&gt;
&lt;br /&gt;
It was built directly on top of ibverbs, plus smp, so it will run on InfiniBand nodes.  Here is a sample run script:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#&lt;br /&gt;
#SBATCH --nodes=2&lt;br /&gt;
#SBATCH --ntasks-per-node=40&lt;br /&gt;
#SBATCH --time=00:15:00&lt;br /&gt;
#SBATCH --job-name namdtest&lt;br /&gt;
&lt;br /&gt;
module load namd/2.14&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $SLURM_SUBMIT_DIR is directory job was submitted from&lt;br /&gt;
cd $SLURM_SUBMIT_DIR&lt;br /&gt;
&lt;br /&gt;
# Generate NAMD nodelist&lt;br /&gt;
for n in `echo $SLURM_NODELIST | scontrol show hostnames`; do&lt;br /&gt;
  echo &amp;quot;host $n&amp;quot; &amp;gt;&amp;gt; nodelist.$SLURM_JOBID&lt;br /&gt;
done&lt;br /&gt;
&lt;br /&gt;
NODELIST=nodelist.$SLURM_JOBID&lt;br /&gt;
cat $NODELIST&lt;br /&gt;
&lt;br /&gt;
# Calculate total processes (P) and procs per node (PPN)&lt;br /&gt;
PPN=2&lt;br /&gt;
P=$(($SLURM_NTASKS * 2))&lt;br /&gt;
&lt;br /&gt;
charmrun ++verbose +p $P ++ppn $PPN ++nodelist $NODELIST $SCINET_NAMD_ROOT/bin/namd2 input.namd&lt;br /&gt;
&lt;br /&gt;
# Cleaning&lt;br /&gt;
rm $NODELIST&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
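&lt;br /&gt;
Assuming the script above is saved as, say, namd_job.sh (a hypothetical file name) in the same directory as input.namd, it would be submitted in the usual way:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Submit the job and check its place in the queue.&lt;br /&gt;
sbatch namd_job.sh&lt;br /&gt;
squeue -u $USER&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;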
&lt;br /&gt;
Full documentation for NAMD is available on their website:  http://www.ks.uiuc.edu/Research/namd/&lt;br /&gt;
&lt;br /&gt;
====NAMD v2.12====&lt;br /&gt;
&lt;br /&gt;
This is the NAMD version 2.12 Scalable Molecular Dynamics package.  &lt;br /&gt;
&lt;br /&gt;
It was built directly on top of ibverbs, so it will run on InfiniBand nodes.  Here is a sample run script:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#&lt;br /&gt;
#SBATCH --nodes=2&lt;br /&gt;
#SBATCH --ntasks-per-node=40&lt;br /&gt;
#SBATCH --time=00:15:00&lt;br /&gt;
#SBATCH --job-name namdtest&lt;br /&gt;
&lt;br /&gt;
# Note that the module will likely be taken out of experimental mode at some point.&lt;br /&gt;
module load namd/.experimental-2.12-ibverbs-smp&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $SLURM_SUBMIT_DIR is directory job was submitted from&lt;br /&gt;
cd $SLURM_SUBMIT_DIR&lt;br /&gt;
&lt;br /&gt;
# Generate NAMD nodelist&lt;br /&gt;
for n in `echo $SLURM_NODELIST | scontrol show hostnames`; do&lt;br /&gt;
  echo &amp;quot;host $n&amp;quot; &amp;gt;&amp;gt; nodelist.$SLURM_JOBID&lt;br /&gt;
done&lt;br /&gt;
&lt;br /&gt;
NODELIST=nodelist.$SLURM_JOBID&lt;br /&gt;
cat $NODELIST&lt;br /&gt;
&lt;br /&gt;
# Calculate total processes (P) and procs per node (PPN)&lt;br /&gt;
PPN=4&lt;br /&gt;
P=$(($SLURM_NTASKS * 2))&lt;br /&gt;
&lt;br /&gt;
charmrun ++verbose +p $P ++ppn $PPN ++nodelist $NODELIST $SCINET_NAMD_ROOT/bin/namd2 input.namd&lt;br /&gt;
&lt;br /&gt;
# Cleaning&lt;br /&gt;
rm $NODELIST&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Full documentation for NAMD is available on their website:  http://www.ks.uiuc.edu/Research/namd/&lt;/div&gt;</summary>
		<author><name>Dgruner</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=2764</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=2764"/>
		<updated>2020-08-18T14:09:05Z</updated>

		<summary type="html">&lt;p&gt;Dgruner: /* System Status */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
{| style=&amp;quot;border-spacing:10px; width: 95%&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:1em; padding-top:.1em; border:2px solid #0645ad; background-color:#f6f6f6; border-radius:7px&amp;quot;|&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Use &amp;quot;Up&amp;quot; or &amp;quot;Down&amp;quot;; these are templates. --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;width:100%&amp;quot; &lt;br /&gt;
|{{Down|Niagara|Niagara_Quickstart}}&lt;br /&gt;
|{{Down|HPSS|HPSS}}&lt;br /&gt;
|{{Down|Mist|Mist}}&lt;br /&gt;
|{{Down|Teach|Teach}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Down|Jupyter Hub|Jupyter_Hub}}&lt;br /&gt;
|{{Down|Scheduler|Niagara_Quickstart#Submitting_jobs}}&lt;br /&gt;
|{{Down|File system|Niagara_Quickstart#Storage_and_quotas}}&lt;br /&gt;
|{{Down|Burst Buffer|Burst_Buffer}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Down|Login Nodes|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Down|External Network|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Down|Globus|Globus}}&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Current Messages: --&amp;gt;&lt;br /&gt;
&amp;lt;b&amp;gt;August 17, 2020, 4:00 PM EST:&amp;lt;/b&amp;gt; Unfortunately, after taking the pump apart, it was determined that there was a more serious failure of the main drive shaft, not just the seal. As a new one will need to be sourced or fabricated, we estimate that it will take at least a few more days to get the part and complete the repairs to restore cooling. Sorry for the inconvenience. &lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;August 15, 2020, 1:00 PM EST:&amp;lt;/b&amp;gt; Due to parts availability for repairing the failed pump and cooling system, it is unlikely that systems can be restored before Monday afternoon at the earliest. &lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;August 15, 2020, 00:04 EST:&amp;lt;/b&amp;gt;  A primary pump seal in the cooling infrastructure has blown, and parts availability cannot be determined until tomorrow. All systems are shut down as there is no cooling.  If parts are available, systems may be back late tomorrow at the earliest. Check here for updates.  &lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;August 14, 2020, 21:04 EST:&amp;lt;/b&amp;gt; Tomorrow's /scratch purge has been postponed.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;August 14, 2020, 21:00 EST:&amp;lt;/b&amp;gt; Staff are at the datacentre. It looks like one of the pumps has a seal that is leaking badly.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;August 14, 2020, 20:37 EST:&amp;lt;/b&amp;gt; We seem to be undergoing a thermal shutdown at the datacentre.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;August 14, 2020, 20:20 EST:&amp;lt;/b&amp;gt; Network problems to niagara/mist. We are investigating.&lt;br /&gt;
 &lt;br /&gt;
&amp;lt;!--  When removing system status entries, please archive them to: https://docs.scinet.utoronto.ca/index.php/Previous_messages --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;border-spacing: 10px;width: 100%&amp;quot;&lt;br /&gt;
|valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== QuickStart Guides ==&lt;br /&gt;
* [[Niagara Quickstart]]&lt;br /&gt;
* [[HPSS | HPSS archival storage]]&lt;br /&gt;
* [[SOSCIP_GPU | SOSCIP GPU cluster]]&lt;br /&gt;
* [[Mist| Mist Power 9 GPU cluster]]&lt;br /&gt;
* [[Teach|Teach cluster]]&lt;br /&gt;
* [[FAQ | FAQ (frequently asked questions)]]&lt;br /&gt;
* [[Acknowledging SciNet]]&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== Tutorials, Manuals, etc. ==&lt;br /&gt;
* [https://support.scinet.utoronto.ca/education/browse.php SciNet education material]&lt;br /&gt;
* [https://www.youtube.com/c/SciNetHPCattheUniversityofToronto SciNet's YouTube channel]&lt;br /&gt;
* [[Modules specific to Niagara|Software Modules specific to Niagara]] &lt;br /&gt;
* [[Commercial software]]&lt;br /&gt;
* [[Burst Buffer]]&lt;br /&gt;
* [[SSH Tunneling]]&lt;br /&gt;
* [[SSH#Two-Factor_authentication|Two-Factor Authentication]]&lt;br /&gt;
* [[Visualization]]&lt;br /&gt;
* [[Running Serial Jobs on Niagara]]&lt;br /&gt;
* [[Jupyter Hub]]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Dgruner</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=2749</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=2749"/>
		<updated>2020-08-15T00:59:54Z</updated>

		<summary type="html">&lt;p&gt;Dgruner: /* System Status */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
{| style=&amp;quot;border-spacing:10px; width: 95%&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:1em; padding-top:.1em; border:2px solid #0645ad; background-color:#f6f6f6; border-radius:7px&amp;quot;|&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Use &amp;quot;Up&amp;quot; or &amp;quot;Down&amp;quot;; these are templates. --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;width:100%&amp;quot; &lt;br /&gt;
|{{Down|Niagara|Niagara_Quickstart}}&lt;br /&gt;
|{{Down|HPSS|HPSS}}&lt;br /&gt;
|{{Down|Mist|Mist}}&lt;br /&gt;
|{{Down|Teach|Teach}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Down|Jupyter Hub|Jupyter_Hub}}&lt;br /&gt;
|{{Down|Scheduler|Niagara_Quickstart#Submitting_jobs}}&lt;br /&gt;
|{{Down|File system|Niagara_Quickstart#Storage_and_quotas}}&lt;br /&gt;
|{{Down|Burst Buffer|Burst_Buffer}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Down|Login Nodes|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Down|External Network|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Down|Globus|Globus}}&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Current Messages: --&amp;gt;&lt;br /&gt;
&amp;lt;b&amp;gt;August 14, 2020, 20:37 EST:&amp;lt;/b&amp;gt; We seem to be undergoing a thermal shutdown at the datacentre.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;August 14, 2020, 20:20 EST:&amp;lt;/b&amp;gt; Network problems to niagara/mist. We are investigating.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;August 13, 2020, 10:40 AM EST:&amp;lt;/b&amp;gt; Network is fixed, scheduler and other services are back.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;August 13, 2020, 8:20 AM EST:&amp;lt;/b&amp;gt; We had an IB switch failure, which is affecting a subset of nodes, including the scheduler nodes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;August 10, 2020, 7:30 PM EST:&amp;lt;/b&amp;gt; Scheduler fully operational again.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;August 10, 2020, 3:00 PM EST:&amp;lt;/b&amp;gt; Scheduler partially functional: jobs can be submitted and are running.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;August 10, 2020, 2:00 PM EST:&amp;lt;/b&amp;gt; Scheduler is temporarily inoperative.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;August 7, 2020, 9:15 PM EST:&amp;lt;/b&amp;gt; Network is fixed, scheduler and other services are coming back.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;August 7, 2020, 8:20 PM EST:&amp;lt;/b&amp;gt; Disruption of part of the network in the data centre.  This is causing issues with the scheduler, the Mist login node, and possibly others. We are investigating.&lt;br /&gt;
 &lt;br /&gt;
&amp;lt;!--  When removing system status entries, please archive them to: https://docs.scinet.utoronto.ca/index.php/Previous_messages --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;border-spacing: 10px;width: 100%&amp;quot;&lt;br /&gt;
|valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== QuickStart Guides ==&lt;br /&gt;
* [[Niagara Quickstart]]&lt;br /&gt;
* [[HPSS | HPSS archival storage]]&lt;br /&gt;
* [[SOSCIP_GPU | SOSCIP GPU cluster]]&lt;br /&gt;
* [[Mist| Mist Power 9 GPU cluster]]&lt;br /&gt;
* [[Teach|Teach cluster]]&lt;br /&gt;
* [[FAQ | FAQ (frequently asked questions)]]&lt;br /&gt;
* [[Acknowledging SciNet]]&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== Tutorials, Manuals, etc. ==&lt;br /&gt;
* [https://support.scinet.utoronto.ca/education/browse.php SciNet education material]&lt;br /&gt;
* [https://www.youtube.com/c/SciNetHPCattheUniversityofToronto SciNet's YouTube channel]&lt;br /&gt;
* [[Modules specific to Niagara|Software Modules specific to Niagara]] &lt;br /&gt;
* [[Commercial software]]&lt;br /&gt;
* [[Burst Buffer]]&lt;br /&gt;
* [[SSH Tunneling]]&lt;br /&gt;
* [[SSH#Two-Factor_authentication|Two-Factor Authentication]]&lt;br /&gt;
* [[Visualization]]&lt;br /&gt;
* [[Running Serial Jobs on Niagara]]&lt;br /&gt;
* [[Jupyter Hub]]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Dgruner</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=2552</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=2552"/>
		<updated>2020-03-20T17:16:52Z</updated>

		<summary type="html">&lt;p&gt;Dgruner: /* System Status */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
{| style=&amp;quot;border-spacing:10px; width: 95%&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:1em; padding-top:.1em; border:2px solid #0645ad; background-color:#f6f6f6; border-radius:7px&amp;quot;|&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Use &amp;quot;Up&amp;quot; or &amp;quot;Down&amp;quot;; these are templates. --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;width:100%&amp;quot; &lt;br /&gt;
|{{Up|Niagara|Niagara_Quickstart}}&lt;br /&gt;
|{{Up|HPSS|HPSS}}&lt;br /&gt;
|{{Up|SOSCIP&amp;amp;nbsp;GPU|SOSCIP_GPU}}&lt;br /&gt;
|{{Up|Mist|Mist}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up|Teach|Teach}}&lt;br /&gt;
|{{Up|Jupyter Hub|Jupyter_Hub}}&lt;br /&gt;
|{{Up|Scheduler|Niagara_Quickstart#Submitting_jobs}}&lt;br /&gt;
|{{Up|File system|Niagara_Quickstart#Storage_and_quotas}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up|Login Nodes|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up|External Network|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up|Globus|Globus}}&lt;br /&gt;
|}&lt;br /&gt;
&amp;lt;!-- Current Messages: --&amp;gt;&lt;br /&gt;
&amp;lt;b&amp;gt; Fri Mar 20 13:15:33 EDT 2020 - There was a power glitch at the datacentre earlier this morning, which resulted in jobs getting killed.  Please resubmit failed jobs. &amp;lt;/b&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt; COVID-19 Impact on SciNet Operations, March 18, 2020&amp;lt;/b&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Although the University of Toronto is closing some of its&lt;br /&gt;
research operations on Friday March 20 at 5 pm EDT, this does not&lt;br /&gt;
affect the SciNet systems (such as Niagara, Mist, and HPSS), which&lt;br /&gt;
will remain operational.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt; SciNet/Niagara Downtime Announcement, March 25-26, 2020&amp;lt;/b&amp;gt;&lt;br /&gt;
&lt;br /&gt;
All resources at SciNet will undergo a two-day maintenance shutdown on March 25th and 26th 2020, starting at 7 am EDT on Wednesday March 25th.  There will be no access to any of the SciNet systems (Niagara, Mist, HPSS, Teach cluster, or the file systems) during this time.&lt;br /&gt;
&lt;br /&gt;
This shutdown is necessary to finish the expansion of the Niagara cluster and its storage system.&lt;br /&gt;
&lt;br /&gt;
We expect to be able to bring the systems back online the evening of March 26th.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  When removing system status entries, please archive them to: https://docs.scinet.utoronto.ca/index.php/Previous_messages --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;border-spacing: 10px;width: 100%&amp;quot;&lt;br /&gt;
|valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== QuickStart Guides ==&lt;br /&gt;
* [[Niagara Quickstart]]&lt;br /&gt;
* [[HPSS | HPSS archival storage]]&lt;br /&gt;
* [[SOSCIP_GPU | SOSCIP GPU cluster]]&lt;br /&gt;
* [[Mist| Mist Power 9 GPU cluster]]&lt;br /&gt;
* [[Teach|Teach cluster]]&lt;br /&gt;
* [[FAQ | FAQ (frequently asked questions)]]&lt;br /&gt;
* [[Acknowledging SciNet]]&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== Tutorials, Manuals, etc. ==&lt;br /&gt;
* [https://support.scinet.utoronto.ca/education/browse.php SciNet education material]&lt;br /&gt;
* [https://www.youtube.com/c/SciNetHPCattheUniversityofToronto SciNet's YouTube channel]&lt;br /&gt;
* [[Modules specific to Niagara|Software Modules specific to Niagara]] &lt;br /&gt;
* [[Commercial software]]&lt;br /&gt;
* [[Burst Buffer]]&lt;br /&gt;
* [[SSH Tunneling]]&lt;br /&gt;
* [[Visualization]]&lt;br /&gt;
* [[Running Serial Jobs on Niagara]]&lt;br /&gt;
* [[Jupyter Hub]]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Dgruner</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=MATLAB&amp;diff=2539</id>
		<title>MATLAB</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=MATLAB&amp;diff=2539"/>
		<updated>2020-03-11T17:49:43Z</updated>

		<summary type="html">&lt;p&gt;Dgruner: /* Using a MATLAB stand-alone executable on Niagara */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;We often get questions about running MATLAB on Niagara.  With a few exceptions for compilers and debuggers, SciNet does not purchase licenses for commercial software.  As such, SciNet does not have a license for MATLAB, nor will it in the future.  If users wish to run MATLAB they must supply their own license, or explore alternative options.  This page gives information about the options for getting your MATLAB code to run, in recommended order. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Not using MATLAB ==&lt;br /&gt;
&lt;br /&gt;
Users can attempt to run MATLAB code using the open-source program [[Octave]], accessible through the octave module.  Though there are some differences between the two programs, Octave has been designed to interpret MATLAB code and can often be used in place of MATLAB.  If your MATLAB code does not use some of the fancier MATLAB toolboxes, you may be able to get away with using Octave instead.  Be sure to test your implementation in Octave thoroughly before committing to this option.&lt;br /&gt;
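&lt;br /&gt;
For example, a minimal sketch (the exact module version available on Niagara may differ; check with &amp;quot;module spider octave&amp;quot; first, and &amp;quot;myscript.m&amp;quot; is just a placeholder name):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Load the octave module and run an existing MATLAB script with it.&lt;br /&gt;
module load octave&lt;br /&gt;
octave myscript.m&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;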
&lt;br /&gt;
It is worth observing that, while MATLAB is convenient for prototyping and running on a single workstation, there are reasons to avoid it for larger HPC/ARC projects.  These include the prohibitive license cost for large-scale work, poor performance at scale, and portability issues.  If you can switch to a license-free option, such as Python, it may be worth the effort.&lt;br /&gt;
&lt;br /&gt;
== Using stand-alone MATLAB executables ==&lt;br /&gt;
&lt;br /&gt;
=== Creating a MATLAB stand-alone executable ===&lt;br /&gt;
&lt;br /&gt;
If MATLAB must be used, you may be able to compile your MATLAB code into a stand-alone executable, and run this on a Niagara compute node.  You will need a MATLAB Compiler license for the version of MATLAB being used, and the compilation must be done on a Linux machine (not Niagara).&lt;br /&gt;
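&lt;br /&gt;
A minimal sketch of the compilation step, run on your own licensed Linux machine (the script name &amp;quot;myscript.m&amp;quot; is just a placeholder):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Compile myscript.m into a stand-alone executable with the MATLAB compiler (mcc).&lt;br /&gt;
# This produces the &amp;quot;myscript&amp;quot; binary and the &amp;quot;run_myscript.sh&amp;quot; wrapper used below.&lt;br /&gt;
mcc -m myscript.m&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;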
&lt;br /&gt;
=== Using a MATLAB stand-alone executable on Niagara ===&lt;br /&gt;
&lt;br /&gt;
Once the compilation is done, the executable can be copied to SciNet, and run using the MATLAB Compiler Runtime (MCR), which can be accessed using the MCR module.  The MCR used must be the same version of MATLAB as the compiler.  If the version of MCR that you need is not listed among the MCR module versions, contact us and we will install the version which you require.&lt;br /&gt;
&lt;br /&gt;
Here is an example script which uses the MCR.  Note that the &amp;quot;run_myscript.sh&amp;quot; script is produced by the MATLAB compiler, together with the &amp;quot;myscript&amp;quot; executable (assuming you were working on the myscript.m MATLAB code):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --cpus-per-task=40&lt;br /&gt;
#SBATCH --time=1:00:00&lt;br /&gt;
#SBATCH --job-name test_matlab&lt;br /&gt;
#SBATCH --output=matlab_output_%j.txt&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $SLURM_SUBMIT_DIR is the directory from which the job was submitted&lt;br /&gt;
cd $SLURM_SUBMIT_DIR&lt;br /&gt;
&lt;br /&gt;
# load module&lt;br /&gt;
module load mcr/R2018a&lt;br /&gt;
&lt;br /&gt;
# Directory for the MCR to use to write temporary files.  Use whatever directory you wish.&lt;br /&gt;
mkdir -p $SCRATCH/temp&lt;br /&gt;
export MCR_CACHE_ROOT=$SCRATCH/temp&lt;br /&gt;
&lt;br /&gt;
# EXECUTION COMMAND (note that the MATLAB script may require that LD_LIBRARY_PATH be added&lt;br /&gt;
# to the script arguments).  Note that, if the calculations are serial, you must bundle 40 such&lt;br /&gt;
# calculations together for production runs!&lt;br /&gt;
./run_myscript.sh $MATLAB:$LD_LIBRARY_PATH&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Available MATLAB runtime versions ===&lt;br /&gt;
&lt;br /&gt;
Below is a list of the available MATLAB runtime versions on Niagara, with the required module command:&lt;br /&gt;
&lt;br /&gt;
 module load NiaEnv/2018a mcr/R2018a&lt;br /&gt;
 module load NiaEnv/2019b mcr/R2019a&lt;br /&gt;
 module load CCEnv nixpkgs/16.09 mcr/R2013a&lt;br /&gt;
 module load CCEnv nixpkgs/16.09 mcr/R2013b&lt;br /&gt;
 module load CCEnv nixpkgs/16.09 mcr/R2014a&lt;br /&gt;
 module load CCEnv nixpkgs/16.09 mcr/R2014b&lt;br /&gt;
 module load CCEnv nixpkgs/16.09 mcr/R2015a&lt;br /&gt;
 module load CCEnv nixpkgs/16.09 mcr/R2015b&lt;br /&gt;
 module load CCEnv nixpkgs/16.09 mcr/R2016a&lt;br /&gt;
 module load CCEnv nixpkgs/16.09 mcr/R2016b&lt;br /&gt;
 module load CCEnv nixpkgs/16.09 mcr/R2017a&lt;br /&gt;
 module load CCEnv nixpkgs/16.09 mcr/R2017b&lt;br /&gt;
&lt;br /&gt;
== Tunneling to a license server ==&lt;br /&gt;
&lt;br /&gt;
If you have access to a non-SciNet MATLAB license server, and have installed MATLAB in your $HOME directory, you can [[SSH_Tunneling|set up your submission script]] to access the external license server.  The following lines should be added to the beginning of your submission script, after the #SBATCH commands:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
PORT=XXX                                         # port number of the license server&lt;br /&gt;
LICENSE_IP=AAA.BBB.CCC.DDD                       # IP address of the license server&lt;br /&gt;
ssh nia-gw -L${PORT}:${LICENSE_IP}:${PORT} -N &amp;amp;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This last line will tunnel the port from the compute node back to the license server, through nia-gw.  The port number and IP address of the license server must be supplied by the system administrator of the license server.&lt;br /&gt;
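&lt;br /&gt;
You then need to point MATLAB at the local end of the tunnel.  A minimal sketch, assuming your MATLAB version honours the standard MLM_LICENSE_FILE environment variable and is installed under $HOME/matlab (adjust both to your setup):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# the license server is now reachable on localhost through the tunnel&lt;br /&gt;
export MLM_LICENSE_FILE=${PORT}@localhost&lt;br /&gt;
$HOME/matlab/bin/matlab -nodisplay -r myscript    # myscript is a placeholder&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;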
&lt;br /&gt;
== Using a different Consortium ==&lt;br /&gt;
&lt;br /&gt;
Both [https://www.sharcnet.ca/my/software/show/54 Sharcnet] and [https://www.westgrid.ca/support/software/matlab Westgrid] have purchased different types of MATLAB licenses.  Users can contact those consortia if they wish to attempt to run on those systems.&lt;/div&gt;</summary>
		<author><name>Dgruner</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=SSH_Changes_in_May_2019&amp;diff=2162</id>
		<title>SSH Changes in May 2019</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=SSH_Changes_in_May_2019&amp;diff=2162"/>
		<updated>2019-05-30T22:23:02Z</updated>

		<summary type="html">&lt;p&gt;Dgruner: /* Qu'est-ce qui a changé? */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;=English=&lt;br /&gt;
== What Changed? ==&lt;br /&gt;
&lt;br /&gt;
During the 29-30 May 2019 shutdown, we made the following ssh security improvements on Niagara:&lt;br /&gt;
&lt;br /&gt;
# Disabled certain weak encryption algorithms.&lt;br /&gt;
# Disabled certain weak public key types.&lt;br /&gt;
# Regenerated Niagara's host keys.&lt;br /&gt;
&lt;br /&gt;
== Updating your client's known host list ==&lt;br /&gt;
&lt;br /&gt;
The first time you log in to Niagara after the shutdown, you will probably see the following warning message:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@&lt;br /&gt;
@    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @&lt;br /&gt;
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@&lt;br /&gt;
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!&lt;br /&gt;
Someone could be eavesdropping on you right now (man-in-the-middle attack)!&lt;br /&gt;
It is also possible that a host key has just been changed.&lt;br /&gt;
The fingerprint for the ED25519 key sent by the remote host is&lt;br /&gt;
SHA256:SauX2nL+Yso9KBo2Ca6GH/V9cSFLFXwxOECGWXZ5pxc.&lt;br /&gt;
Please contact your system administrator.&lt;br /&gt;
Add correct host key in /home/username/.ssh/known_hosts to get rid of this message.&lt;br /&gt;
Offending ECDSA key in /home/username/.ssh/known_hosts:109&lt;br /&gt;
ED25519 host key for niagara.scinet.utoronto.ca has changed and you have requested strict checking.&lt;br /&gt;
Host key verification failed.&lt;br /&gt;
Killed by signal 1.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This warning is displayed because the host keys on Niagara changed to increase the data centre's security, and ssh clients remember old host keys to prevent [https://en.wikipedia.org/wiki/Man-in-the-middle_attack &amp;quot;man-in-the-middle&amp;quot; attacks]. &lt;br /&gt;
&lt;br /&gt;
You may also get a warning regarding &amp;quot;DNS spoofing&amp;quot;, which is related to the same change.&lt;br /&gt;
&lt;br /&gt;
If you are using MobaXterm or PuTTY as your ssh client under Windows, the warning will appear in a pop-up window and will allow you to accept the new host key by clicking &amp;quot;Yes&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
If you are using the command line ssh command on macOS, Linux, GitBash or Cygwin, you should tell your system to &amp;quot;forget&amp;quot; the old host keys, by running the following commands:&lt;br /&gt;
&lt;br /&gt;
 $ ssh-keygen -R niagara.scinet.utoronto.ca&lt;br /&gt;
 $ ssh-keygen -R niagara.computecanada.ca&lt;br /&gt;
 $ ssh-keygen -R 142.150.188.70&lt;br /&gt;
&lt;br /&gt;
Afterwards, the next time you ssh to Niagara you'll be asked to confirm the new host keys, e.g.:&lt;br /&gt;
&lt;br /&gt;
 $ ssh niagara.scinet.utoronto.ca&lt;br /&gt;
 The authenticity of host 'niagara.scinet.utoronto.ca (142.150.188.70)' can't be established.&lt;br /&gt;
 ED25519 key fingerprint is SHA256:SauX2nL+Yso9KBo2Ca6GH/V9cSFLFXwxOECGWXZ5pxc.&lt;br /&gt;
 ED25519 key fingerprint is MD5:b4:ae:76:a5:2b:37:8d:57:06:0e:9a:de:62:00:26:be.&lt;br /&gt;
 Are you sure you want to continue connecting (yes/no)? &lt;br /&gt;
&lt;br /&gt;
Make sure the fingerprints are correct! You'll either see the above ED25519 fingerprints, or the following RSA fingerprints:&lt;br /&gt;
&lt;br /&gt;
 RSA key fingerprint is SHA256:k6YEhYsI73M+NJIpZ8yF+wqWeuXS9avNs2s5QS/0VhU.&lt;br /&gt;
 RSA key fingerprint is MD5:98:e7:7a:07:89:ef:3f:d8:68:3d:47:9c:6e:a6:71:5e.&lt;br /&gt;
&lt;br /&gt;
If the fingerprints don't match, someone may be trying to hijack your connection.&lt;br /&gt;
&lt;br /&gt;
If you are using WinSCP, it will likewise warn you that the host key has changed and offer to accept the new key.&lt;br /&gt;
&lt;br /&gt;
== Troubleshooting ==&lt;br /&gt;
&lt;br /&gt;
=== I can't connect! ===&lt;br /&gt;
&lt;br /&gt;
If you see one of the following error messages:&lt;br /&gt;
&lt;br /&gt;
 Unable to negotiate with 142.150.188.70 port 22: no matching cipher found.&lt;br /&gt;
 Unable to negotiate with 142.150.188.70 port 22: no matching key exchange method found.&lt;br /&gt;
 Unable to negotiate with 142.150.188.70 port 22: no matching mac found.&lt;br /&gt;
&lt;br /&gt;
you need to upgrade your ssh client.&lt;br /&gt;
&lt;br /&gt;
=== My SSH key no longer works ===&lt;br /&gt;
&lt;br /&gt;
If you're being asked for a password, but were using SSH keys,&lt;br /&gt;
it's because 1024-bit DSA &amp;amp; RSA keys have been disabled.&lt;br /&gt;
&lt;br /&gt;
You need to generate a new stronger key, see the [[SSH keys]] page for details.&lt;br /&gt;
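&lt;br /&gt;
For example, with a standard OpenSSH client you could generate a new 4096-bit RSA key (a sketch; pick the key type your client supports):&lt;br /&gt;
&lt;br /&gt;
 $ ssh-keygen -t rsa -b 4096&lt;br /&gt;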
&lt;br /&gt;
=SSH Changes in May 2019 (French version)=&lt;br /&gt;
&lt;br /&gt;
==What changed?==&lt;br /&gt;
&lt;br /&gt;
During the planned shutdown of 29 and 30 May 2019, we put the following&lt;br /&gt;
security measures in place on Niagara:&lt;br /&gt;
&lt;br /&gt;
# Certain weak encryption methods were disabled.&lt;br /&gt;
# Certain SSH public key types were disabled.&lt;br /&gt;
# Niagara's host keys were regenerated.&lt;br /&gt;
&lt;br /&gt;
==Updating your SSH client's host key list==&lt;br /&gt;
&lt;br /&gt;
When you reconnect to Niagara for the first time after the&lt;br /&gt;
maintenance shutdown, you will probably see the following message:&lt;br /&gt;
&lt;br /&gt;
 @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@&lt;br /&gt;
 @    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @&lt;br /&gt;
 @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@&lt;br /&gt;
 IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!&lt;br /&gt;
 Someone could be eavesdropping on you right now (man-in-the-middle attack)!&lt;br /&gt;
 It is also possible that a host key has just been changed.&lt;br /&gt;
 The fingerprint for the ED25519 key sent by the remote host is&lt;br /&gt;
 SHA256:SauX2nL+Yso9KBo2Ca6GH/V9cSFLFXwxOECGWXZ5pxc.&lt;br /&gt;
 Please contact your system administrator.&lt;br /&gt;
 Add correct host key in /home/username/.ssh/known_hosts to get rid of this message.&lt;br /&gt;
 Offending ECDSA key in /home/username/.ssh/known_hosts:109&lt;br /&gt;
 ED25519 host key for niagara.scinet.utoronto.ca has changed and you have requested strict checking.&lt;br /&gt;
 Host key verification failed.&lt;br /&gt;
 Killed by signal 1.&lt;br /&gt;
&lt;br /&gt;
This warning message is displayed because Niagara's host keys were regenerated to improve the security of the data centre, but your SSH client has stored the old host keys to prevent a [https://fr.wikipedia.org/wiki/Attaque_de_l%27homme_du_milieu man-in-the-middle attack]. &lt;br /&gt;
&lt;br /&gt;
You may also see a warning message about &amp;quot;DNS spoofing&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
If you use one of the MobaXterm or PuTTY connection tools, the warning appears in a pop-up window that offers the option of accepting the new host key by clicking the appropriate button.&lt;br /&gt;
&lt;br /&gt;
If you use the &amp;quot;ssh&amp;quot; command in a terminal window on macOS, Linux, GitBash or Cygwin, you must tell your system to forget the old host keys with the following commands:&lt;br /&gt;
&lt;br /&gt;
 $ ssh-keygen -R niagara.scinet.utoronto.ca&lt;br /&gt;
 $ ssh-keygen -R niagara.computecanada.ca&lt;br /&gt;
 $ ssh-keygen -R 142.150.188.70&lt;br /&gt;
&lt;br /&gt;
Then, on your next visit to Niagara, you will have to confirm the new host keys, for example:&lt;br /&gt;
&lt;br /&gt;
 $ ssh niagara.scinet.utoronto.ca&lt;br /&gt;
 The authenticity of host 'niagara.scinet.utoronto.ca (142.150.188.70)' can't be established.&lt;br /&gt;
 ED25519 key fingerprint is SHA256:SauX2nL+Yso9KBo2Ca6GH/V9cSFLFXwxOECGWXZ5pxc.&lt;br /&gt;
 ED25519 key fingerprint is MD5:b4:ae:76:a5:2b:37:8d:57:06:0e:9a:de:62:00:26:be.&lt;br /&gt;
 Are you sure you want to continue connecting (yes/no)? &lt;br /&gt;
&lt;br /&gt;
Make sure the fingerprints are correct! You will see either the ED25519 fingerprints above, or the following RSA fingerprints:&lt;br /&gt;
&lt;br /&gt;
 RSA key fingerprint is SHA256:k6YEhYsI73M+NJIpZ8yF+wqWeuXS9avNs2s5QS/0VhU.&lt;br /&gt;
 RSA key fingerprint is MD5:98:e7:7a:07:89:ef:3f:d8:68:3d:47:9c:6e:a6:71:5e.&lt;br /&gt;
&lt;br /&gt;
If the fingerprints do not match, it is possible that someone is trying to hijack your connection.&lt;br /&gt;
&lt;br /&gt;
==Troubleshooting==&lt;br /&gt;
&lt;br /&gt;
===I can't connect!===&lt;br /&gt;
&lt;br /&gt;
If you see one of the following error messages:&lt;br /&gt;
&lt;br /&gt;
 Unable to negotiate with 142.150.188.70 port 22: no matching cipher found.&lt;br /&gt;
 Unable to negotiate with 142.150.188.70 port 22: no matching key exchange method found.&lt;br /&gt;
 Unable to negotiate with 142.150.188.70 port 22: no matching mac found.&lt;br /&gt;
&lt;br /&gt;
you need to update your ssh client.&lt;br /&gt;
&lt;br /&gt;
===My SSH key no longer works===&lt;br /&gt;
&lt;br /&gt;
If you are asked for a password when you previously used SSH keys, it is probably because 1024-bit DSA and RSA keys have been disabled.&lt;br /&gt;
&lt;br /&gt;
You must generate a new, stronger key; see the [[SSH keys]] page for details.&lt;/div&gt;</summary>
		<author><name>Dgruner</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=2073</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=2073"/>
		<updated>2019-04-10T02:24:39Z</updated>

		<summary type="html">&lt;p&gt;Dgruner: /* System Status */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
{| style=&amp;quot;border-spacing:10px; width: 95%&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:1em; padding-top:.1em; border:2px solid #0645ad; background-color:#f6f6f6; border-radius:7px&amp;quot;|&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Use &amp;quot;Up&amp;quot; or &amp;quot;Down&amp;quot;; these are templates. --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;width:100%&amp;quot; &lt;br /&gt;
|{{Up|Niagara|Niagara_Quickstart}}&lt;br /&gt;
|{{Up|HPSS|HPSS}}&lt;br /&gt;
|{{Up|BGQ|BGQ}}&lt;br /&gt;
|{{Up|SOSCIP&amp;amp;nbsp;GPU|SOSCIP_GPU}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up|P7|P7}}&lt;br /&gt;
|{{Up|P8|P8}}&lt;br /&gt;
|{{Up|Teach|Teach}}&lt;br /&gt;
|{{Up|Jupyter Hub|Jupyter_Hub}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up|Scheduler|Niagara_Quickstart#Submitting_jobs}}&lt;br /&gt;
|{{Up|File system|Niagara_Quickstart#Storage_and_quotas}}&lt;br /&gt;
|{{Up|External Network|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up|Globus|Globus}}&lt;br /&gt;
|}&lt;br /&gt;
&amp;lt;!-- Current Messages: --&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Tue  9 Apr 2019 22:24:14 EDT:  Network connection restored.&lt;br /&gt;
&lt;br /&gt;
April 9, 2019, 15:20: Network connection down.  Investigating.&lt;br /&gt;
&lt;br /&gt;
April 5, 2019: Planned, short outage in connectivity to the SciNet datacentre from 7:30 am to 8:55 am EST for network maintenance.  This outage will not affect running or queued jobs. It may be necessary to reboot the login nodes at some point tomorrow, which could result in a short interruption of connectivity, but will have no effect on running or queued jobs.&lt;br /&gt;
&lt;br /&gt;
April 5, 2019: Software updates on Niagara: The default CCEnv software stack now uses avx512 on Niagara, and there is now a NiaEnv/2019b stack (&amp;quot;epoch&amp;quot;). &lt;br /&gt;
&lt;br /&gt;
April 4, 2019: The 2019 compute and storage allocations have taken effect on Niagara. &lt;br /&gt;
&amp;lt;!--  When removing system status entries, please archive them to: https://docs.scinet.utoronto.ca/index.php/Previous_messages --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;border-spacing: 10px;width: 100%&amp;quot;&lt;br /&gt;
|valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== QuickStart Guides ==&lt;br /&gt;
* [[Niagara_Quickstart|Niagara Quickstart]]&lt;br /&gt;
* [[HPSS | HPSS archival storage]]&lt;br /&gt;
* [[BGQ | SOSCIP BlueGene/Q cluster]]&lt;br /&gt;
* [[SOSCIP_GPU | SOSCIP GPU cluster]]&lt;br /&gt;
* [[P7|Experimental Power 7 cluster]]&lt;br /&gt;
* [[P8|Experimental Power 8 GPU cluster]]&lt;br /&gt;
* [[Teach|Teach cluster]]&lt;br /&gt;
* [[FAQ | FAQ (frequently asked questions)]]&lt;br /&gt;
* [[Acknowledging_SciNet | Acknowledging SciNet]]&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== Tutorials, Manuals, etc. ==&lt;br /&gt;
* [https://courses.scinet.utoronto.ca SciNet education material]&lt;br /&gt;
* [https://www.youtube.com/c/SciNetHPCattheUniversityofToronto SciNet's YouTube channel]&lt;br /&gt;
* [[Modules specific to Niagara|Software Modules specific to Niagara]] &lt;br /&gt;
* [[Commercial software]]&lt;br /&gt;
* [[Burst Buffer]]&lt;br /&gt;
* [[SSH Tunneling]]&lt;br /&gt;
* [[Visualization]]&lt;br /&gt;
* [[Running Serial Jobs on Niagara]]&lt;br /&gt;
* [[Jupyter Hub]]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Dgruner</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=NAMD&amp;diff=2051</id>
		<title>NAMD</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=NAMD&amp;diff=2051"/>
		<updated>2019-03-31T16:16:24Z</updated>

		<summary type="html">&lt;p&gt;Dgruner: /* NAMD v2.13 */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;====NAMD v2.13====&lt;br /&gt;
&lt;br /&gt;
This is the NAMD version 2.13 Scalable Molecular Dynamics package, installed from the&lt;br /&gt;
NAMD_2.13_Linux-x86_64-ibverbs-smp.tar.gz binary tarball from TCBG.  &lt;br /&gt;
&lt;br /&gt;
It was built directly on top of ibverbs, plus smp, so it will run on InfiniBand nodes.  Here is a sample run script:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#&lt;br /&gt;
#SBATCH --nodes=2&lt;br /&gt;
#SBATCH --ntasks-per-node=40&lt;br /&gt;
#SBATCH --time=00:15:00&lt;br /&gt;
#SBATCH --job-name namdtest&lt;br /&gt;
&lt;br /&gt;
module load namd/2.13&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $SLURM_SUBMIT_DIR is directory job was submitted from&lt;br /&gt;
cd $SLURM_SUBMIT_DIR&lt;br /&gt;
&lt;br /&gt;
# Generate NAMD nodelist&lt;br /&gt;
for n in `echo $SLURM_NODELIST | scontrol show hostnames`; do&lt;br /&gt;
  echo &amp;quot;host $n&amp;quot; &amp;gt;&amp;gt; nodelist.$SLURM_JOBID&lt;br /&gt;
done&lt;br /&gt;
&lt;br /&gt;
NODELIST=nodelist.$SLURM_JOBID&lt;br /&gt;
cat $NODELIST&lt;br /&gt;
&lt;br /&gt;
# Calculate total processes (P) and procs per node (PPN)&lt;br /&gt;
PPN=2&lt;br /&gt;
P=$(($SLURM_NTASKS * 2))&lt;br /&gt;
&lt;br /&gt;
charmrun ++verbose +p $P ++ppn $PPN ++nodelist $NODELIST $SCINET_NAMD_ROOT/bin/namd2 input.namd&lt;br /&gt;
&lt;br /&gt;
# Cleaning&lt;br /&gt;
rm $NODELIST&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Full documentation for NAMD is available on their website:  http://www.ks.uiuc.edu/Research/namd/&lt;br /&gt;
&lt;br /&gt;
====NAMD v2.12====&lt;br /&gt;
&lt;br /&gt;
This is the NAMD version 2.12 Scalable Molecular Dynamics package.  &lt;br /&gt;
&lt;br /&gt;
It was built directly on top of ibverbs, so it will run on InfiniBand nodes.  Here is a sample run script:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#&lt;br /&gt;
#SBATCH --nodes=2&lt;br /&gt;
#SBATCH --ntasks-per-node=40&lt;br /&gt;
#SBATCH --time=00:15:00&lt;br /&gt;
#SBATCH --job-name namdtest&lt;br /&gt;
&lt;br /&gt;
# Note that the module will likely be taken out of experimental mode at some point.&lt;br /&gt;
module load namd/.experimental-2.12-ibverbs-smp&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $SLURM_SUBMIT_DIR is directory job was submitted from&lt;br /&gt;
cd $SLURM_SUBMIT_DIR&lt;br /&gt;
&lt;br /&gt;
# Generate NAMD nodelist&lt;br /&gt;
for n in `echo $SLURM_NODELIST | scontrol show hostnames`; do&lt;br /&gt;
  echo &amp;quot;host $n&amp;quot; &amp;gt;&amp;gt; nodelist.$SLURM_JOBID&lt;br /&gt;
done&lt;br /&gt;
&lt;br /&gt;
NODELIST=nodelist.$SLURM_JOBID&lt;br /&gt;
cat $NODELIST&lt;br /&gt;
&lt;br /&gt;
# Calculate total processes (P) and procs per node (PPN)&lt;br /&gt;
PPN=4&lt;br /&gt;
P=$(($SLURM_NTASKS * 2))&lt;br /&gt;
&lt;br /&gt;
charmrun ++verbose +p $P ++ppn $PPN ++nodelist $NODELIST $SCINET_NAMD_ROOT/bin/namd2 input.namd&lt;br /&gt;
&lt;br /&gt;
# Cleaning&lt;br /&gt;
rm $NODELIST&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Full documentation for NAMD is available on their website:  http://www.ks.uiuc.edu/Research/namd/&lt;/div&gt;</summary>
		<author><name>Dgruner</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=NAMD&amp;diff=2050</id>
		<title>NAMD</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=NAMD&amp;diff=2050"/>
		<updated>2019-03-31T14:42:01Z</updated>

		<summary type="html">&lt;p&gt;Dgruner: /* NAMD v2.13 */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;====NAMD v2.13====&lt;br /&gt;
&lt;br /&gt;
This is the NAMD version 2.13 Scalable Molecular Dynamics package, installed from the&lt;br /&gt;
NAMD_2.13_Linux-x86_64-ibverbs-smp.tar.gz binary tarball from TCBG.  &lt;br /&gt;
&lt;br /&gt;
It was built directly on top of ibverbs, plus smp, so it will run on InfiniBand nodes.  Here is a sample run script:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#&lt;br /&gt;
#SBATCH --nodes=2&lt;br /&gt;
#SBATCH --ntasks-per-node=40&lt;br /&gt;
#SBATCH --time=00:15:00&lt;br /&gt;
#SBATCH --job-name namdtest&lt;br /&gt;
&lt;br /&gt;
# The intel and openmpi modules are needed in order to have mpiexec in the path, to use it to launch the processes&lt;br /&gt;
module load intel&lt;br /&gt;
module load openmpi&lt;br /&gt;
module load namd/2.13&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $SLURM_SUBMIT_DIR is directory job was submitted from&lt;br /&gt;
cd $SLURM_SUBMIT_DIR&lt;br /&gt;
&lt;br /&gt;
# Generate NAMD nodelist&lt;br /&gt;
for n in `echo $SLURM_NODELIST | scontrol show hostnames`; do&lt;br /&gt;
  echo &amp;quot;host $n&amp;quot; &amp;gt;&amp;gt; nodelist.$SLURM_JOBID&lt;br /&gt;
done&lt;br /&gt;
&lt;br /&gt;
NODELIST=nodelist.$SLURM_JOBID&lt;br /&gt;
cat $NODELIST&lt;br /&gt;
&lt;br /&gt;
# Calculate total processes (P) and procs per node (PPN)&lt;br /&gt;
PPN=2&lt;br /&gt;
P=$(($SLURM_NTASKS * 2))&lt;br /&gt;
&lt;br /&gt;
charmrun +p $P ++ppn $PPN ++mpiexec  $SCINET_NAMD_ROOT/bin/namd2 input.namd&lt;br /&gt;
&lt;br /&gt;
# Cleaning&lt;br /&gt;
rm $NODELIST&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Full documentation for NAMD is available on their website:  http://www.ks.uiuc.edu/Research/namd/&lt;br /&gt;
&lt;br /&gt;
====NAMD v2.12====&lt;br /&gt;
&lt;br /&gt;
This is the NAMD version 2.12 Scalable Molecular Dynamics package.  &lt;br /&gt;
&lt;br /&gt;
It was built directly on top of ibverbs, so it will run on InfiniBand nodes.  Here is a sample run script:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#&lt;br /&gt;
#SBATCH --nodes=2&lt;br /&gt;
#SBATCH --ntasks-per-node=40&lt;br /&gt;
#SBATCH --time=00:15:00&lt;br /&gt;
#SBATCH --job-name namdtest&lt;br /&gt;
&lt;br /&gt;
# Note that the module will likely be taken out of experimental mode at some point.&lt;br /&gt;
module load namd/.experimental-2.12-ibverbs-smp&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $SLURM_SUBMIT_DIR is directory job was submitted from&lt;br /&gt;
cd $SLURM_SUBMIT_DIR&lt;br /&gt;
&lt;br /&gt;
# Generate NAMD nodelist&lt;br /&gt;
for n in `echo $SLURM_NODELIST | scontrol show hostnames`; do&lt;br /&gt;
  echo &amp;quot;host $n&amp;quot; &amp;gt;&amp;gt; nodelist.$SLURM_JOBID&lt;br /&gt;
done&lt;br /&gt;
&lt;br /&gt;
NODELIST=nodelist.$SLURM_JOBID&lt;br /&gt;
cat $NODELIST&lt;br /&gt;
&lt;br /&gt;
# Calculate total processes (P) and procs per node (PPN)&lt;br /&gt;
PPN=4&lt;br /&gt;
P=$(($SLURM_NTASKS * 2))&lt;br /&gt;
&lt;br /&gt;
charmrun ++verbose +p $P ++ppn $PPN ++nodelist $NODELIST $SCINET_NAMD_ROOT/bin/namd2 input.namd&lt;br /&gt;
&lt;br /&gt;
# Cleaning&lt;br /&gt;
rm $NODELIST&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Full documentation for NAMD is available on their website:  http://www.ks.uiuc.edu/Research/namd/&lt;/div&gt;</summary>
		<author><name>Dgruner</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=NAMD&amp;diff=2049</id>
		<title>NAMD</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=NAMD&amp;diff=2049"/>
		<updated>2019-03-31T14:33:40Z</updated>

		<summary type="html">&lt;p&gt;Dgruner: /* NAMD v2.13 */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;====NAMD v2.13====&lt;br /&gt;
&lt;br /&gt;
This is the NAMD version 2.13 Scalable Molecular Dynamics package, installed from the&lt;br /&gt;
NAMD_2.13_Linux-x86_64-ibverbs-smp.tar.gz binary tarball from TCBG.  &lt;br /&gt;
&lt;br /&gt;
It was built directly on top of ibverbs, plus smp, so it will run on InfiniBand nodes.  Here is a sample run script:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#&lt;br /&gt;
#SBATCH --nodes=2&lt;br /&gt;
#SBATCH --ntasks-per-node=40&lt;br /&gt;
#SBATCH --time=00:15:00&lt;br /&gt;
#SBATCH --job-name namdtest&lt;br /&gt;
&lt;br /&gt;
# The intel and openmpi modules are needed in order to have mpiexec in the path, to use it to launch the processes&lt;br /&gt;
module load intel&lt;br /&gt;
module load openmpi&lt;br /&gt;
module load namd/2.13&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $SLURM_SUBMIT_DIR is directory job was submitted from&lt;br /&gt;
cd $SLURM_SUBMIT_DIR&lt;br /&gt;
&lt;br /&gt;
# Generate NAMD nodelist&lt;br /&gt;
for n in `echo $SLURM_NODELIST | scontrol show hostnames`; do&lt;br /&gt;
  echo &amp;quot;host $n&amp;quot; &amp;gt;&amp;gt; nodelist.$SLURM_JOBID&lt;br /&gt;
done&lt;br /&gt;
&lt;br /&gt;
NODELIST=nodelist.$SLURM_JOBID&lt;br /&gt;
cat $NODELIST&lt;br /&gt;
&lt;br /&gt;
# Calculate total processes (P) and procs per node (PPN)&lt;br /&gt;
PPN=4&lt;br /&gt;
P=$(($SLURM_NTASKS * 2))&lt;br /&gt;
&lt;br /&gt;
charmrun +p $P ++ppn $PPN ++mpiexec  $SCINET_NAMD_ROOT/bin/namd2 input.namd&lt;br /&gt;
&lt;br /&gt;
# Cleaning&lt;br /&gt;
rm $NODELIST&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Full documentation for NAMD is available on their website:  http://www.ks.uiuc.edu/Research/namd/&lt;br /&gt;
&lt;br /&gt;
====NAMD v2.12====&lt;br /&gt;
&lt;br /&gt;
This is the NAMD version 2.12 Scalable Molecular Dynamics package.  &lt;br /&gt;
&lt;br /&gt;
It was built directly on top of ibverbs, so it will run on InfiniBand nodes.  Here is a sample run script:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#&lt;br /&gt;
#SBATCH --nodes=2&lt;br /&gt;
#SBATCH --ntasks-per-node=40&lt;br /&gt;
#SBATCH --time=00:15:00&lt;br /&gt;
#SBATCH --job-name namdtest&lt;br /&gt;
&lt;br /&gt;
# Note that the module will likely be taken out of experimental mode at some point.&lt;br /&gt;
module load namd/.experimental-2.12-ibverbs-smp&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $SLURM_SUBMIT_DIR is directory job was submitted from&lt;br /&gt;
cd $SLURM_SUBMIT_DIR&lt;br /&gt;
&lt;br /&gt;
# Generate NAMD nodelist&lt;br /&gt;
for n in `echo $SLURM_NODELIST | scontrol show hostnames`; do&lt;br /&gt;
  echo &amp;quot;host $n&amp;quot; &amp;gt;&amp;gt; nodelist.$SLURM_JOBID&lt;br /&gt;
done&lt;br /&gt;
&lt;br /&gt;
NODELIST=nodelist.$SLURM_JOBID&lt;br /&gt;
cat $NODELIST&lt;br /&gt;
&lt;br /&gt;
# Calculate total processes (P) and procs per node (PPN)&lt;br /&gt;
PPN=4&lt;br /&gt;
P=$(($SLURM_NTASKS * 2))&lt;br /&gt;
&lt;br /&gt;
charmrun ++verbose +p $P ++ppn $PPN ++nodelist $NODELIST $SCINET_NAMD_ROOT/bin/namd2 input.namd&lt;br /&gt;
&lt;br /&gt;
# Cleaning&lt;br /&gt;
rm $NODELIST&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Full documentation for NAMD is available on their website:  http://www.ks.uiuc.edu/Research/namd/&lt;/div&gt;</summary>
		<author><name>Dgruner</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=NAMD&amp;diff=2048</id>
		<title>NAMD</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=NAMD&amp;diff=2048"/>
		<updated>2019-03-30T23:49:05Z</updated>

		<summary type="html">&lt;p&gt;Dgruner: /* NAMD v2.13 */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;====NAMD v2.13====&lt;br /&gt;
&lt;br /&gt;
This is the NAMD version 2.13 Scalable Molecular Dynamics package, installed from the&lt;br /&gt;
NAMD_2.13_Linux-x86_64-ibverbs-smp.tar.gz binary tarball from TCBG.  &lt;br /&gt;
&lt;br /&gt;
It was built directly on top of ibverbs, plus smp, so it will run on InfiniBand nodes.  Here is a sample run script:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#&lt;br /&gt;
#SBATCH --nodes=2&lt;br /&gt;
#SBATCH --ntasks-per-node=40&lt;br /&gt;
#SBATCH --time=00:15:00&lt;br /&gt;
#SBATCH --job-name namdtest&lt;br /&gt;
&lt;br /&gt;
# The intel and openmpi modules are needed in order to have mpiexec in the path, to use it to launch the processes&lt;br /&gt;
module load intel&lt;br /&gt;
module load openmpi&lt;br /&gt;
module load namd/2.13&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $SLURM_SUBMIT_DIR is directory job was submitted from&lt;br /&gt;
cd $SLURM_SUBMIT_DIR&lt;br /&gt;
&lt;br /&gt;
# Generate NAMD nodelist&lt;br /&gt;
for n in `echo $SLURM_NODELIST | scontrol show hostnames`; do&lt;br /&gt;
  echo &amp;quot;host $n&amp;quot; &amp;gt;&amp;gt; nodelist.$SLURM_JOBID&lt;br /&gt;
done&lt;br /&gt;
&lt;br /&gt;
NODELIST=nodelist.$SLURM_JOBID&lt;br /&gt;
cat $NODELIST&lt;br /&gt;
&lt;br /&gt;
# Calculate total processes (P) and procs per node (PPN)&lt;br /&gt;
PPN=4&lt;br /&gt;
P=$(($SLURM_NTASKS * 2))&lt;br /&gt;
&lt;br /&gt;
charmrun +p $P ++mpiexec  $SCINET_NAMD_ROOT/bin/namd2 input.namd&lt;br /&gt;
&lt;br /&gt;
# Cleaning&lt;br /&gt;
rm $NODELIST&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Full documentation for NAMD is available on their website:  http://www.ks.uiuc.edu/Research/namd/&lt;br /&gt;
&lt;br /&gt;
====NAMD v2.12====&lt;br /&gt;
&lt;br /&gt;
This is the NAMD version 2.12 Scalable Molecular Dynamics package.  &lt;br /&gt;
&lt;br /&gt;
It was built directly on top of ibverbs, so it will run on InfiniBand nodes.  Here is a sample run script:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#&lt;br /&gt;
#SBATCH --nodes=2&lt;br /&gt;
#SBATCH --ntasks-per-node=40&lt;br /&gt;
#SBATCH --time=00:15:00&lt;br /&gt;
#SBATCH --job-name namdtest&lt;br /&gt;
&lt;br /&gt;
# Note that the module will likely be taken out of experimental mode at some point.&lt;br /&gt;
module load namd/.experimental-2.12-ibverbs-smp&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $SLURM_SUBMIT_DIR is directory job was submitted from&lt;br /&gt;
cd $SLURM_SUBMIT_DIR&lt;br /&gt;
&lt;br /&gt;
# Generate NAMD nodelist&lt;br /&gt;
for n in `echo $SLURM_NODELIST | scontrol show hostnames`; do&lt;br /&gt;
  echo &amp;quot;host $n&amp;quot; &amp;gt;&amp;gt; nodelist.$SLURM_JOBID&lt;br /&gt;
done&lt;br /&gt;
&lt;br /&gt;
NODELIST=nodelist.$SLURM_JOBID&lt;br /&gt;
cat $NODELIST&lt;br /&gt;
&lt;br /&gt;
# Calculate total processes (P) and procs per node (PPN)&lt;br /&gt;
PPN=4&lt;br /&gt;
P=$(($SLURM_NTASKS * 2))&lt;br /&gt;
&lt;br /&gt;
charmrun ++verbose +p $P ++ppn $PPN ++nodelist $NODELIST $SCINET_NAMD_ROOT/bin/namd2 input.namd&lt;br /&gt;
&lt;br /&gt;
# Cleaning&lt;br /&gt;
rm $NODELIST&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Full documentation for NAMD is available on their website:  http://www.ks.uiuc.edu/Research/namd/&lt;/div&gt;</summary>
		<author><name>Dgruner</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=NAMD&amp;diff=2047</id>
		<title>NAMD</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=NAMD&amp;diff=2047"/>
		<updated>2019-03-30T23:43:12Z</updated>

		<summary type="html">&lt;p&gt;Dgruner: /* NAMD v2.12 */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;====NAMD v2.13====&lt;br /&gt;
&lt;br /&gt;
This is the NAMD version 2.13 Scalable Molecular Dynamics package.  &lt;br /&gt;
&lt;br /&gt;
It was built directly on top of ibverbs, so it will run on InfiniBand nodes.  Here is a sample run script:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#&lt;br /&gt;
#SBATCH --nodes=2&lt;br /&gt;
#SBATCH --ntasks-per-node=40&lt;br /&gt;
#SBATCH --time=00:15:00&lt;br /&gt;
#SBATCH --job-name namdtest&lt;br /&gt;
&lt;br /&gt;
# The intel and openmpi modules are needed in order to have mpiexec in the path, to use it to launch the processes&lt;br /&gt;
module load intel&lt;br /&gt;
module load openmpi&lt;br /&gt;
module load namd/2.13&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $SLURM_SUBMIT_DIR is directory job was submitted from&lt;br /&gt;
cd $SLURM_SUBMIT_DIR&lt;br /&gt;
&lt;br /&gt;
# Generate NAMD nodelist&lt;br /&gt;
for n in `echo $SLURM_NODELIST | scontrol show hostnames`; do&lt;br /&gt;
  echo &amp;quot;host $n&amp;quot; &amp;gt;&amp;gt; nodelist.$SLURM_JOBID&lt;br /&gt;
done&lt;br /&gt;
&lt;br /&gt;
NODELIST=nodelist.$SLURM_JOBID&lt;br /&gt;
cat $NODELIST&lt;br /&gt;
&lt;br /&gt;
# Calculate total processes (P) and procs per node (PPN)&lt;br /&gt;
PPN=4&lt;br /&gt;
P=$(($SLURM_NTASKS * 2))&lt;br /&gt;
&lt;br /&gt;
charmrun +p $P ++mpiexec  $SCINET_NAMD_ROOT/bin/namd2 input.namd&lt;br /&gt;
&lt;br /&gt;
# Cleaning&lt;br /&gt;
rm $NODELIST&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Full documentation for NAMD is available on their website:  http://www.ks.uiuc.edu/Research/namd/&lt;br /&gt;
&lt;br /&gt;
====NAMD v2.12====&lt;br /&gt;
&lt;br /&gt;
This is the NAMD version 2.12 Scalable Molecular Dynamics package.  &lt;br /&gt;
&lt;br /&gt;
It was built directly on top of ibverbs, so it will run on InfiniBand nodes.  Here is a sample run script:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#&lt;br /&gt;
#SBATCH --nodes=2&lt;br /&gt;
#SBATCH --ntasks-per-node=40&lt;br /&gt;
#SBATCH --time=00:15:00&lt;br /&gt;
#SBATCH --job-name namdtest&lt;br /&gt;
&lt;br /&gt;
# Note that the module will likely be taken out of experimental mode at some point.&lt;br /&gt;
module load namd/.experimental-2.12-ibverbs-smp&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $SLURM_SUBMIT_DIR is directory job was submitted from&lt;br /&gt;
cd $SLURM_SUBMIT_DIR&lt;br /&gt;
&lt;br /&gt;
# Generate NAMD nodelist&lt;br /&gt;
for n in `echo $SLURM_NODELIST | scontrol show hostnames`; do&lt;br /&gt;
  echo &amp;quot;host $n&amp;quot; &amp;gt;&amp;gt; nodelist.$SLURM_JOBID&lt;br /&gt;
done&lt;br /&gt;
&lt;br /&gt;
NODELIST=nodelist.$SLURM_JOBID&lt;br /&gt;
cat $NODELIST&lt;br /&gt;
&lt;br /&gt;
# Calculate total processes (P) and procs per node (PPN)&lt;br /&gt;
PPN=4&lt;br /&gt;
P=$(($SLURM_NTASKS * 2))&lt;br /&gt;
&lt;br /&gt;
charmrun ++verbose +p $P ++ppn $PPN ++nodelist $NODELIST $SCINET_NAMD_ROOT/bin/namd2 input.namd&lt;br /&gt;
&lt;br /&gt;
# Cleaning&lt;br /&gt;
rm $NODELIST&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Full documentation for NAMD is available on their website:  http://www.ks.uiuc.edu/Research/namd/&lt;/div&gt;</summary>
		<author><name>Dgruner</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=FAQ&amp;diff=818</id>
		<title>FAQ</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=FAQ&amp;diff=818"/>
		<updated>2018-07-30T14:58:46Z</updated>

		<summary type="html">&lt;p&gt;Dgruner: /* How do priorities work/why did that job jump ahead of mine in the queue? */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__TOC__&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==The Basics==&lt;br /&gt;
===Whom do I contact for support?===&lt;br /&gt;
&lt;br /&gt;
Whom do I contact if I have problems or questions about how to use the SciNet systems?&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
E-mail [mailto:support@scinet.utoronto.ca &amp;lt;support@scinet.utoronto.ca&amp;gt;]  &lt;br /&gt;
&lt;br /&gt;
In your email, please include the following information:&lt;br /&gt;
&lt;br /&gt;
* your username on SciNet&lt;br /&gt;
* the cluster that your question pertains to (NIA, BGQ, GPU, ...; SciNet is not a cluster!),&lt;br /&gt;
* any relevant error messages&lt;br /&gt;
* the commands you typed before the errors occurred&lt;br /&gt;
* the path to your code (if applicable)&lt;br /&gt;
* the location of the job scripts (if applicable)&lt;br /&gt;
* the directory from which it was submitted (if applicable)&lt;br /&gt;
* a description of what it is supposed to do (if applicable)&lt;br /&gt;
* if your problem is about connecting to SciNet, the type of computer you are connecting from.&lt;br /&gt;
&lt;br /&gt;
Note that your password should never, never, never be sent to us, even if your question is about your account.&lt;br /&gt;
&lt;br /&gt;
Avoid sending email only to specific individuals at SciNet. Your chances of a quick reply increase significantly if you email our team! (support@scinet.utoronto.ca)&lt;br /&gt;
&lt;br /&gt;
===What does ''code scaling'' mean?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Please see [[Introduction_To_Performance#Parallel_Speedup|A Performance Primer]]&lt;br /&gt;
&lt;br /&gt;
===What do you mean by ''throughput''?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Please see [[Introduction_To_Performance#Throughput|A Performance Primer]].&lt;br /&gt;
&lt;br /&gt;
Here is a simple example:&lt;br /&gt;
&lt;br /&gt;
Suppose you need to do 10 computations.  Say each of these runs for&lt;br /&gt;
1 day on 8 cores, but they take &amp;quot;only&amp;quot; 18 hours on 16 cores.  What is the&lt;br /&gt;
fastest way to get all 10 computations done - as 8-core jobs or as&lt;br /&gt;
16-core jobs?  Let us assume you have 2 nodes at your disposal.&lt;br /&gt;
The answer, after some simple arithmetic, is that running your 10&lt;br /&gt;
jobs as 8-core jobs will take 5 days, whereas if you ran them&lt;br /&gt;
as 16-core jobs it would take 7.5 days.  Draw your own conclusions...&lt;br /&gt;
&lt;br /&gt;
===I changed my .bashrc/.bash_profile and now nothing works===&lt;br /&gt;
&lt;br /&gt;
The default startup scripts provided by SciNet, and guidelines for them, can be found [[Important_.bashrc_guidelines|here]].  Certain things - like sourcing &amp;lt;tt&amp;gt;/etc/profile&amp;lt;/tt&amp;gt;&lt;br /&gt;
and &amp;lt;tt&amp;gt;/etc/bashrc&amp;lt;/tt&amp;gt; - are ''required'' for various SciNet routines to work!   &lt;br /&gt;
&lt;br /&gt;
If the situation is so bad that you cannot even log in, please send email [mailto:support@scinet.utoronto.ca support].&lt;br /&gt;
&lt;br /&gt;
===Could I have my login shell changed to (t)csh?===&lt;br /&gt;
&lt;br /&gt;
The login shell used on our systems is bash. While the tcsh is available on the GPC and the TCS, we do not support it as the default login shell at present.  So &amp;quot;chsh&amp;quot; will not work, but you can always run tcsh interactively. Also, csh scripts will be executed correctly provided that they have the correct &amp;quot;shebang&amp;quot; &amp;lt;tt&amp;gt;#!/bin/tcsh&amp;lt;/tt&amp;gt; at the top.&lt;br /&gt;
&lt;br /&gt;
===How can I run Matlab / IDL / Gaussian / my favourite commercial software at SciNet?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Because SciNet serves such a disparate group of user communities, there is just no way we can buy licenses for everyone's commercial package.   The only commercial software we have purchased is that which in principle can benefit everyone -- fast compilers and math libraries (Intel's on GPC, and IBM's on TCS).&lt;br /&gt;
&lt;br /&gt;
If your research group requires a commercial package that you already have or are willing to buy licenses for, contact us at [mailto:support@scinet.utoronto.ca support@scinet] and we can work together to find out if it is feasible to implement the package's licensing arrangement on the SciNet clusters, and if so, what is the best way to do it.&lt;br /&gt;
&lt;br /&gt;
Note that it is important that you contact us before installing commercially licensed software on SciNet machines, even if you have a way to do it in your own directory without requiring sysadmin intervention.   It puts us in a very awkward position if someone is found to be running unlicensed or invalidly licensed software on our systems, so we need to be aware of what is being installed where.&lt;br /&gt;
&lt;br /&gt;
===Do you have a recommended ssh program that will allow scinet access from Windows machines?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
The [[Ssh#SSH_for_Windows_Users | SSH for Windows users]] programs we recommend are:&lt;br /&gt;
&lt;br /&gt;
* [http://mobaxterm.mobatek.net/en/ MobaXterm] is a tabbed ssh client with some Cygwin tools, including ssh and X, all wrapped up into one executable.&lt;br /&gt;
* [http://www.chiark.greenend.org.uk/~sgtatham/putty/ PuTTY]  - this is a terminal for windows that connects via ssh.  It is a quick install and will get you up and running quickly.&amp;lt;br/&amp;gt; '''WARNING:''' Make sure you download putty from the official website, because there are &amp;quot;trojanized&amp;quot; versions of putty around that will send your login information to a site in Russia (as reported [http://blogs.cisco.com/security/trojanized-putty-software here]).&amp;lt;br&amp;gt;To set up your passphrase protected ssh key with putty, see [http://the.earth.li/~sgtatham/putty/0.61/htmldoc/Chapter8.html#pubkey here].&lt;br /&gt;
* [http://www.cygwin.com/ CygWin] - this is a whole Linux-like environment for Windows, which also includes an X window server so that you can display remote windows on your desktop.  Make sure you include openssh and the X window system in the installation for full functionality.  This is recommended if you will be doing a lot of work on Linux machines, as it makes a very similar environment available on your computer.&amp;lt;br&amp;gt;To set up your ssh keys, follow the Linux instructions on the [[Ssh keys]] page.&lt;br /&gt;
&lt;br /&gt;
===My ssh key does not work! WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! ===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
[[Ssh_keys#Testing_Your_Key | Testing Your Key]]&lt;br /&gt;
&lt;br /&gt;
* If this doesn't work, you should be able to log in using your password, and investigate the problem. For example, if during a login session you get a message similar to the one below, just follow the instructions and delete the offending key on line 3 (you can use vi to jump to that line with ESC plus : plus 3). That only means that you may have logged in from your home computer to SciNet in the past, and that key is obsolete.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ssh USERNAME@login.scinet.utoronto.ca&lt;br /&gt;
&lt;br /&gt;
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@&lt;br /&gt;
@    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @&lt;br /&gt;
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@&lt;br /&gt;
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!&lt;br /&gt;
Someone could be eavesdropping on you right now (man-in-the-middle&lt;br /&gt;
attack)!&lt;br /&gt;
It is also possible that the RSA host key has just been changed.&lt;br /&gt;
The fingerprint for the RSA key sent by the remote host is&lt;br /&gt;
53:f9:60:71:a8:0b:5d:74:83:52:fe:ea:1a:9e:cc:d3.&lt;br /&gt;
Please contact your system administrator.&lt;br /&gt;
Add correct host key in /home/&amp;lt;user&amp;gt;/.ssh/known_hosts to get rid of&lt;br /&gt;
this message.&lt;br /&gt;
Offending key in /home/&amp;lt;user&amp;gt;/.ssh/known_hosts:3&lt;br /&gt;
RSA host key for login.scinet.utoronto.ca has&lt;br /&gt;
changed and you have requested strict checking.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* If you get the message below, you may need to log out of your GNOME session and log back in, since the ssh-agent needs to be&lt;br /&gt;
restarted with the new passphrase-protected ssh key.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ssh USERNAME@login.scinet.utoronto.ca&lt;br /&gt;
&lt;br /&gt;
Agent admitted failure to sign using the key.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Can't get graphics: &amp;quot;Can't open display/DISPLAY is not set&amp;quot;===&lt;br /&gt;
&lt;br /&gt;
To use graphics on SciNet machines and have it displayed on your machine, you need to have an X server running on your computer (an X server is the standard way graphics is done on Linux). Once an X server is running, you can log in with the &amp;quot;-Y&amp;quot; option to ssh (&amp;quot;-X&amp;quot; sometimes also works).&lt;br /&gt;
&lt;br /&gt;
How to get an X server running on your computer depends on the operating system.  On Linux machines with a graphical interface, X will already be running.  On Windows, the easiest solution is to use MobaXterm, which comes with an X server (alternatives, such as Cygwin with the X11 server installed, or running PuTTY+Xming, can also work, but are a bit more work to set up).  For Macs, you will need to install XQuartz. &lt;br /&gt;
  &lt;br /&gt;
===Remote graphics stops working after a while: &amp;quot;Can't open display&amp;quot;===&lt;br /&gt;
&lt;br /&gt;
If you still cannot get graphics, or it works only for a while and then suddenly it &amp;quot;can't open display localhost:....&amp;quot;, your X11 graphics connection may have timed out (Macs seem to be particularly prone to this).  You'll have to tell your own computer not to time out the X11 graphics connection.&lt;br /&gt;
&lt;br /&gt;
The following should fix it. The ssh configuration settings are in a file called /etc/ssh/ssh_config (or /etc/ssh_config in older OS X versions, or $HOME/.ssh/config for specific users). In the config file, find (or create) the section &amp;quot;Host *&amp;quot; (meaning all hosts) and add the following lines:&lt;br /&gt;
&lt;br /&gt;
  Host *&lt;br /&gt;
   ServerAliveInterval 60&lt;br /&gt;
   ServerAliveCountMax 3&lt;br /&gt;
   ForwardX11 yes&lt;br /&gt;
   ForwardX11Trusted yes&lt;br /&gt;
   ForwardX11Timeout 596h&lt;br /&gt;
&lt;br /&gt;
(The &amp;lt;tt&amp;gt;Host *&amp;lt;/tt&amp;gt; is only needed if there was no Host section yet to append these settings to.)&lt;br /&gt;
&lt;br /&gt;
If this does not resolve it, try it again with &amp;quot;ssh -vvv -Y ....&amp;quot;.  The &amp;quot;-vvv&amp;quot; spews out a lot of diagnostic messages. Look for anything resembling a timeout, and let us know (support AT scinet DOT utoronto DOT ca).&lt;br /&gt;
&lt;br /&gt;
===Can't forward X:  &amp;quot;Warning: No xauth data; using fake authentication data&amp;quot;, or &amp;quot;X11 connection rejected because of wrong authentication.&amp;quot;===&lt;br /&gt;
&lt;br /&gt;
I used to be able to forward X11 windows from SciNet to my home machine, but now I'm getting these messages; what's wrong?&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
This very likely means that ssh/xauth can't update your ${HOME}/.Xauthority file. &lt;br /&gt;
&lt;br /&gt;
The simplest possible reason for this is that you've filled your 10GB /home quota and so can't write anything to your home directory.   Use&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module load extras&lt;br /&gt;
$ diskUsage&lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
&lt;br /&gt;
to check how close you are to your quota on ${HOME}.&lt;br /&gt;
&lt;br /&gt;
Alternately, this could mean your .Xauthority file has become broken/corrupted/confused somehow, in which case you can delete that file; when you next log in you'll get a similar warning message about creating .Xauthority, but things should work.&lt;br /&gt;
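&lt;br /&gt;
For example:&lt;br /&gt;
&lt;br /&gt;
 $ rm ~/.Xauthority&lt;br /&gt;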
&lt;br /&gt;
===I have a CCDB account, but I can't login to SciNet. How can I get a SciNet account?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
You must extend your CCDB application process to also get a SciNet account:&lt;br /&gt;
&lt;br /&gt;
https://wiki.scinet.utoronto.ca/wiki/index.php/Application_Process&lt;br /&gt;
&lt;br /&gt;
https://www.scinethpc.ca/getting-a-scinet-account/&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===How can I reset the password for my Compute Canada account?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
You can reset your password for your Compute Canada account here:&lt;br /&gt;
&lt;br /&gt;
https://ccdb.computecanada.ca/security/forgot&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===How can I change or reset the password for my SciNet account?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
To reset your password at SciNet please go to [https://portal.scinet.utoronto.ca/password_resets Password reset page].&lt;br /&gt;
&lt;br /&gt;
If you know your old password and want to change it, that can be done here after logging in on the portal:&lt;br /&gt;
&lt;br /&gt;
https://portal.scinet.utoronto.ca&lt;br /&gt;
&lt;br /&gt;
===Why am I getting the error &amp;quot;Permission denied (publickey,gssapi-with-mic,password)&amp;quot;?===&lt;br /&gt;
&lt;br /&gt;
This error can pop up in a variety of situations: when trying to log in, or after a job has finished, when the error and output files fail to be copied (there are other possible reasons for this failure as well -- see [[FAQ#My_GPC_job_died.2C_telling_me_.60Copy_Stageout_Files_Failed.27|My GPC job died, telling me:Copy Stageout Files Failed]]).&lt;br /&gt;
In most cases, the &amp;quot;Permission denied&amp;quot; error is caused by incorrect permissions on the (hidden) .ssh directory. Ssh is used for logging in as well as for the copying of the standard error and output files after a job. &lt;br /&gt;
&lt;br /&gt;
For security reasons, &lt;br /&gt;
the directory .ssh should only be writable and readable by you; if yours &lt;br /&gt;
has read permission for everybody, ssh refuses to use it.  You can change &lt;br /&gt;
this by&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
   chmod 700 ~/.ssh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
And to be sure, also do&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
   chmod 600 ~/.ssh/id_rsa ~/.ssh/authorized_keys&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===ERROR:102: Tcl command execution failed when loading modules===&lt;br /&gt;
Modules sometimes require other modules to be loaded first.&lt;br /&gt;
The module command will let you know if you didn't.&lt;br /&gt;
For example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module purge&lt;br /&gt;
$ module load python&lt;br /&gt;
python/2.6.2(11):ERROR:151: Module ’python/2.6.2’ depends on one of the module(s) ’gcc/4.4.0’&lt;br /&gt;
python/2.6.2(11):ERROR:102: Tcl command execution failed: prereq gcc/4.4.0&lt;br /&gt;
gpc-f103n084-$ module load gcc python&lt;br /&gt;
$&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== How do I compute the core-years usage of my code? ===&lt;br /&gt;
&lt;br /&gt;
The &amp;quot;core-years&amp;quot; quantity is a way to account for the time your code runs, by considering the total number of cores and time used, relative to the total number of hours in a year.&lt;br /&gt;
For instance, if your code runs for ''HH'' hours on ''NN'' nodes, where each node has ''CC'' cores, then the &amp;quot;core-years&amp;quot; usage can be computed as follows:&lt;br /&gt;
&lt;br /&gt;
''HH*(NN*CC)/(365*24)''&lt;br /&gt;
&lt;br /&gt;
If you have several independent instances (batches) running on different nodes, with ''BB'' batches each running for ''HH'' hours, then your core-years usage can be computed as&lt;br /&gt;
&lt;br /&gt;
''BB*HH*(NN*CC)/(365*24)''&lt;br /&gt;
&lt;br /&gt;
As a general rule, in our GPC system, each node has only 8 cores, so ''CC'' will always be 8.&lt;br /&gt;
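&lt;br /&gt;
For example, ''BB''=10 batches, each running ''HH''=24 hours on ''NN''=2 nodes with ''CC''=8 cores, amount to ''10*24*(2*8)/(365*24)'', i.e. about 0.44 core-years.&lt;br /&gt;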
&lt;br /&gt;
==Compiling your Code==&lt;br /&gt;
&lt;br /&gt;
===How can I get g77 to work?===&lt;br /&gt;
&lt;br /&gt;
The fortran 77 compilers on the GPC are ifort and gfortran. We have dropped support for g77.  This has been a conscious decision. g77 (and the associated library libg2c) were completely replaced six years ago (Apr 2005) by the gcc 4.x branch, and haven't undergone any updates at all, even bug fixes, for over five years.  &lt;br /&gt;
If we were to install g77 and libg2c, we would have to deal with the inevitable confusion caused when users accidentally link against the old, broken, wrong versions of the gcc libraries instead of the correct current versions.   &lt;br /&gt;
&lt;br /&gt;
If your code for some reason specifically requires five-plus-year-old libraries,  availability, compatibility, and unfixed-known-bug problems are only going to get worse for you over time, and this might be as good an opportunity as any to address those issues. &lt;br /&gt;
&lt;br /&gt;
''A note on porting to gfortran or ifort:''&lt;br /&gt;
&lt;br /&gt;
While gfortran and ifort are rather compatible with g77, one &lt;br /&gt;
important difference is that by default, gfortran does not preserve &lt;br /&gt;
local variables between function calls, while g77 does.   Preserved &lt;br /&gt;
local variables are for instance often used in implementations of quasi-random number &lt;br /&gt;
generators.  Proper Fortran requires such variables to be declared SAVE, &lt;br /&gt;
but not all old code does this.&lt;br /&gt;
Luckily, you can change gfortran's default behavior with the flag &lt;br /&gt;
&amp;lt;tt&amp;gt;-fno-automatic&amp;lt;/tt&amp;gt;.   For ifort, the corresponding flag is &amp;lt;tt&amp;gt;-noautomatic&amp;lt;/tt&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
===Where is libg2c.so?===&lt;br /&gt;
&lt;br /&gt;
libg2c.so is part of the g77 compiler, for which we dropped support. See [[#How can I get g77 to work?]] for our reasons.&lt;br /&gt;
&lt;br /&gt;
===Autoparallelization does not work!===&lt;br /&gt;
&lt;br /&gt;
I compiled my code with the &amp;lt;tt&amp;gt;-qsmp=omp,auto&amp;lt;/tt&amp;gt; option, and then I specified that it should be run with 64 threads - with &lt;br /&gt;
 export OMP_NUM_THREADS=64&lt;br /&gt;
&lt;br /&gt;
However, when I check the load using &amp;lt;tt&amp;gt;llq1 -n&amp;lt;/tt&amp;gt;, it shows a load on the node of 1.37.  Why?&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Using the autoparallelization will only get you so far.  In fact, it usually does not do too much.  What is helpful is to run the compiler with the &amp;lt;tt&amp;gt;-qreport&amp;lt;/tt&amp;gt; option, and then read the output listing carefully to see where the compiler thought it could parallelize, where it could not, and the reasons for this.  Then you can go back to your code and carefully try to address each of the issues brought up by the compiler.&lt;br /&gt;
We ''emphasize'' that this is just a rough first guide, and that the compilers are still not magical!   For more sophisticated approaches to parallelizing your code, email us at [mailto:support@scinet.utoronto.ca &amp;lt;support@scinet.utoronto.ca&amp;gt;]  to set up an appointment with one&lt;br /&gt;
of our technical analysts.&lt;br /&gt;
&lt;br /&gt;
===How do I link against the Intel Math Kernel Library?===&lt;br /&gt;
&lt;br /&gt;
If you need to link to the Intel Math Kernel Library (MKL) with the intel compilers, just add the &amp;lt;tt&amp;gt;-mkl&amp;lt;/tt&amp;gt; flag. There are in fact three flavours: &amp;lt;tt&amp;gt;-mkl=sequential&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;-mkl=parallel&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;-mkl=cluster&amp;lt;/tt&amp;gt;, for the serial version, the threaded version and the mpi version, respectively. (Note: The cluster version is available only when using the intelmpi module and mpi compilation wrappers.)&lt;br /&gt;
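&lt;br /&gt;
For example (a sketch; mycode.c and mycode.f90 are placeholder file names):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ icc   -O2 -o mycode mycode.c   -mkl=sequential   # link the serial MKL&lt;br /&gt;
$ ifort -O2 -o mycode mycode.f90 -mkl=parallel     # link the threaded MKL&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;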
&lt;br /&gt;
If you need to link in the Intel Math Kernel Library (MKL) libraries to gcc/gfortran/c++, you are well advised to use the Intel(R) Math Kernel Library Link Line Advisor: http://software.intel.com/en-us/articles/intel-mkl-link-line-advisor/ for help in devising the list of libraries to link with your code.&lt;br /&gt;
&lt;br /&gt;
'''''Note that this gives the link line for the command line. When using it in Makefiles, replace $MKLPATH by ${MKLPATH}.'''''&lt;br /&gt;
&lt;br /&gt;
'''''Note too that, unless the integer arguments you will be passing to the MKL routines really are 64-bit integers rather than the normal int or INTEGER types, you want to specify 32-bit integers (lp64).'''''&lt;br /&gt;
&lt;br /&gt;
===Can the compilers on the login nodes be disabled to prevent accidentally using them?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
You can accomplish this by modifying your .bashrc to not load the compiler modules. See [[Important .bashrc guidelines]].&lt;br /&gt;
&lt;br /&gt;
===&amp;quot;relocation truncated to fit: R_X86_64_PC32&amp;quot;: Huh?===&lt;br /&gt;
&lt;br /&gt;
What does this mean, and why can't I compile this code?&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Welcome to the joys of the x86 architecture!  You're probably having trouble building arrays larger than 2GB, individually or together.  Generally, you have to use the medium or large x86 `memory model'.  For the Intel compilers, this is specified with the compile options&lt;br /&gt;
&lt;br /&gt;
  -mcmodel=medium -shared-intel&lt;br /&gt;
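&lt;br /&gt;
For example, a minimal sketch of a full compile line (the file name is hypothetical):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 ifort -mcmodel=medium -shared-intel -O2 -o bigarrays bigarrays.f90&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;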
&lt;br /&gt;
===&amp;quot;feupdateenv is not implemented and will always fail&amp;quot;===&lt;br /&gt;
&lt;br /&gt;
How do I get rid of this and what does it mean?&lt;br /&gt;
 &lt;br /&gt;
'''Answer:'''&lt;br /&gt;
First note that, as ominous as it sounds, this is really just a warning, and has to do with the Intel math library. You can ignore it (unless you really are trying to manually change the exception handlers for floating point exceptions such as divide by zero), or take the safe road and get rid of it by linking with the Intel math functions library: &amp;lt;tt&amp;gt;-limf&amp;lt;/tt&amp;gt;. See also [[#How do I link against the Intel Math Kernel Library?]]&lt;br /&gt;
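&lt;br /&gt;
For example, a minimal sketch of such a link line (the file names are hypothetical):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 icc -O2 -o mycode mycode.c -limf&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;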
&lt;br /&gt;
===Cannot find rdmacm library when compiling on GPC===&lt;br /&gt;
&lt;br /&gt;
I get the following error building my code on GPC: &amp;quot;&amp;lt;tt&amp;gt;ld: cannot find -lrdmacm&amp;lt;/tt&amp;gt;&amp;quot;.  Where can I find this library?&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
This library is part of the MPI libraries; if your compiler is having problems picking it up, it probably means you are mistakenly trying to compile on the login nodes (scinet01..scinet04).  The login nodes aren't part of the GPC; they are for logging into the data centre only.  From there you must go to the GPC or TCS development nodes to do any real work.&lt;br /&gt;
&lt;br /&gt;
=== Why do I get this error when I try to compile: &amp;quot;icpc: error #10001: could not find directory in which /usr/bin/g++41 resides&amp;quot; ?===&lt;br /&gt;
&lt;br /&gt;
You are trying to compile on the login nodes.  As described in the wiki ( https://support.scinet.utoronto.ca/wiki/index.php/GPC_Quickstart#Login ), or in the user's guide you received with your account, SciNet supports two main clusters with very different architectures.  Compilation must be done on the development nodes of the appropriate cluster (in this case, gpc01-04).  Thus, log into gpc01, gpc02, gpc03, or gpc04, and compile from there.&lt;br /&gt;
&lt;br /&gt;
==Testing your Code==&lt;br /&gt;
&lt;br /&gt;
=== Can I run something for a short time on the development nodes? ===&lt;br /&gt;
&lt;br /&gt;
I am in the process of playing around with the mpi calls in my code to get it to work. I do a lot of tests and each of them takes a couple of seconds only.  Can I do this on the development nodes?&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Yes, as long as it's very brief (a few minutes).  Other people use the development nodes&lt;br /&gt;
for their work, and you don't want to bog the nodes down for them; testing a real&lt;br /&gt;
code can chew up a lot more resources than compiling.  The procedures differ&lt;br /&gt;
depending on which machine you're using.&lt;br /&gt;
&lt;br /&gt;
==== TCS ====&lt;br /&gt;
&lt;br /&gt;
On the TCS you can run small MPI jobs on the tcs02 node, which is meant for &lt;br /&gt;
development use.  But even for this test run on one node, you'll need a host file --&lt;br /&gt;
a list of hosts (in this case, all tcs-f11n06, which is the `real' name of tcs02)&lt;br /&gt;
that the job will run on.  Create a file called `hostfile' containing the following:&lt;br /&gt;
&lt;br /&gt;
 tcs-f11n06&lt;br /&gt;
 tcs-f11n06&lt;br /&gt;
 tcs-f11n06&lt;br /&gt;
 tcs-f11n06&lt;br /&gt;
&lt;br /&gt;
for a 4-task run.  When you invoke &amp;quot;poe&amp;quot; or &amp;quot;mpirun&amp;quot;, there are runtime&lt;br /&gt;
arguments you can specify that point to this file.  You can also specify it&lt;br /&gt;
in the environment variable MP_HOSTFILE; so, if your file is in your /scratch directory, say &lt;br /&gt;
${SCRATCH}/hostfile, then you would do&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 export MP_HOSTFILE=${SCRATCH}/hostfile&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
in your shell.  You will also need to create a &amp;lt;tt&amp;gt;.rhosts&amp;lt;/tt&amp;gt; file in your &lt;br /&gt;
home directory, again listing &amp;lt;tt&amp;gt;tcs-f11n06&amp;lt;/tt&amp;gt; so that &amp;lt;tt&amp;gt;poe&amp;lt;/tt&amp;gt;&lt;br /&gt;
can start jobs.   After that you can simply run your program.  You can use&lt;br /&gt;
mpiexec:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 mpiexec -n 4 my_test_program&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
adding &amp;lt;tt&amp;gt; -hostfile /path/to/my/hostfile&amp;lt;/tt&amp;gt; if you did not set the environment&lt;br /&gt;
variable above.  Alternatively, you can run it with the poe command (do a &amp;quot;man poe&amp;quot; for details), or even by&lt;br /&gt;
just directly running it.  In this case the number of MPI processes will by default&lt;br /&gt;
be the number of entries in your hostfile.&lt;br /&gt;
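&lt;br /&gt;
For instance, a minimal sketch of a direct poe invocation (assuming MP_HOSTFILE has been set as above, and that my_test_program is your own executable):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 poe ./my_test_program -procs 4&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;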
&lt;br /&gt;
&lt;br /&gt;
==== GPC ====&lt;br /&gt;
&lt;br /&gt;
On the GPC one can run short test jobs on the GPC [[GPC_Quickstart#Compile.2FDevel_Nodes | development nodes ]]&amp;lt;tt&amp;gt;gpc01&amp;lt;/tt&amp;gt;..&amp;lt;tt&amp;gt;gpc04&amp;lt;/tt&amp;gt;;&lt;br /&gt;
if they are single-node jobs (which they should be) they don't need a hostfile.  Even better, though, is to request an [[ Moab#Interactive | interactive ]] job and run the tests there, either in the regular batch queue or in the short, high-availability [[ Moab#debug | debug ]] queue that is reserved for this purpose.&lt;br /&gt;
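&lt;br /&gt;
For example, a minimal sketch of requesting an interactive session in the debug queue (adjust the walltime to your needs):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 qsub -l nodes=1:ppn=8,walltime=1:00:00 -q debug -I&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;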
&lt;br /&gt;
=== How do I run a longer (but still shorter than an hour) test job quickly ? ===&lt;br /&gt;
&lt;br /&gt;
'''Answer'''&lt;br /&gt;
&lt;br /&gt;
On the GPC there is a high turnover short queue called [[ Moab#debug | debug ]] that is designed for&lt;br /&gt;
this purpose.  You can use it by adding &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#PBS -q debug&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
to your submission script.&lt;br /&gt;
&lt;br /&gt;
==Submitting your jobs==&lt;br /&gt;
&lt;br /&gt;
===Error Submitting My Job: qsub: Bad UID for job execution MSG=ruserok failed ===&lt;br /&gt;
&lt;br /&gt;
I write up a submission script as in the examples, but when I attempt to submit the job, I get the above error.  What's wrong?&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
This error will occur if you try to submit a job from the login nodes.   The login nodes are the gateway to all of SciNet's systems (GPC, TCS, P7, ARC), which have different hardware and queueing systems.  To submit a job, you must log into a development node for the particular cluster you are submitting to and submit from there.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== How do I charge jobs to my RAC allocation? ===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Please see the [[Moab#Accounting|accounting section of Moab page]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===How can I automatically resubmit a job?===&lt;br /&gt;
&lt;br /&gt;
Commonly you may have a job that you know will take longer to run than what is &lt;br /&gt;
permissible in the queue.  As long as your program contains [[Checkpoints|checkpoint]] or &lt;br /&gt;
restart capability, you can have one job automatically submit the next. In&lt;br /&gt;
the following example it is assumed that the program finishes before &lt;br /&gt;
the 48 hour limit and then resubmits itself by logging into one&lt;br /&gt;
of the development nodes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# MOAB/Torque example submission script for auto resubmission&lt;br /&gt;
# SciNet GPC&lt;br /&gt;
#&lt;br /&gt;
#PBS -l nodes=1:ppn=8,walltime=48:00:00&lt;br /&gt;
#PBS -N my_job&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
&lt;br /&gt;
# YOUR CODE HERE&lt;br /&gt;
./run_my_code&lt;br /&gt;
&lt;br /&gt;
# RESUBMIT 10 TIMES HERE&lt;br /&gt;
num=$NUM&lt;br /&gt;
if [ &amp;quot;$num&amp;quot; -lt 10 ]; then&lt;br /&gt;
      num=$(($num+1))&lt;br /&gt;
      ssh gpc01 &amp;quot;cd $PBS_O_WORKDIR; qsub ./script_name.sh -v NUM=$num&amp;quot;;&lt;br /&gt;
fi&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
qsub script_name.sh -v NUM=0&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You can alternatively use [[ Moab#Job_Dependencies | Job dependencies ]] through the queuing system which will not start one job until another job has completed.&lt;br /&gt;
&lt;br /&gt;
If your job can't be made to automatically stop before the 48 hour queue window, but it does write out checkpoints, you can use the timeout command to stop the program while you still have time to resubmit; for instance&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
    timeout 2850m ./run_my_code argument1 argument2&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
will run the program for 47.5 hours (2850 minutes), and then send it a SIGTERM so that it exits while there is still time to resubmit.&lt;br /&gt;
&lt;br /&gt;
===How can I pass in arguments to my submission script?===&lt;br /&gt;
&lt;br /&gt;
If you wish to make your scripts more generic, you can use qsub's ability &lt;br /&gt;
to pass in environment variables as arguments to your script.&lt;br /&gt;
The following example shows a case where an input and an output &lt;br /&gt;
file are passed in on the qsub line. Multiple variables can be &lt;br /&gt;
passed in using the qsub &amp;quot;-v&amp;quot; option, comma delimited. &lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# MOAB/Torque example of passing in arguments&lt;br /&gt;
# SciNet GPC&lt;br /&gt;
# &lt;br /&gt;
#PBS -l nodes=1:ppn=8,walltime=48:00:00&lt;br /&gt;
#PBS -N my_job&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
&lt;br /&gt;
# YOUR CODE HERE&lt;br /&gt;
./run_my_code -f $INFILE -o $OUTFILE&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
qsub script_name.sh -v INFILE=input.txt,OUTFILE=outfile.txt&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===I submit my GPC job, and I get an email saying it was rejected===&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
This happens because the job you've submitted breaks one of the rules of the queues and is rejected. An email&lt;br /&gt;
is sent with the JOBID, JOBNAME, and the reason it was rejected.  The following is an example where a job&lt;br /&gt;
requests more than 48 hours and was rejected.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
PBS Job Id: 3462493.gpc-sched&lt;br /&gt;
Job Name:   STDIN&lt;br /&gt;
job deleted&lt;br /&gt;
Job deleted at request of root@gpc-sched&lt;br /&gt;
MOAB_INFO:  job was rejected - job violates class configuration 'wclimit too high for class 'batch_ib' (345600 &amp;gt; 172800)'&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Jobs on the TCS or GPC may only run for 48 hours at a time; this restriction greatly increases responsiveness of the queue and queue throughput for all our users.  If your computation requires longer than that, as many do, you will have to [[ Checkpoints | checkpoint ]] your job and restart it after each 48-hour queue window.   You can manually re-submit jobs, or if you can have your job cleanly exit before the 48 hour window, there are ways to [[ FAQ#How_can_I_automatically_resubmit_a_job.3F | automatically resubmit jobs ]].&lt;br /&gt;
&lt;br /&gt;
Other rejections return a more cryptic error saying &amp;quot;job violates class configuration&amp;quot; such as follows:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
PBS Job Id: 3462409.gpc-sched&lt;br /&gt;
Job Name:   STDIN&lt;br /&gt;
job deleted&lt;br /&gt;
Job deleted at request of root@gpc-sched&lt;br /&gt;
MOAB_INFO:  job was rejected - job violates class configuration 'user required by class 'batch''&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The most common problems that result in this error are:&lt;br /&gt;
&lt;br /&gt;
* '''Incorrect number of processors per node''': Jobs on the GPC are scheduled per-node not per-core and since each node has 8 processor cores (ppn=8) the smallest job allowed is one node with 8 cores (nodes=1:ppn=8).  For serial jobs users must bundle or batch them together in groups of 8. See [[ FAQ#How_do_I_run_serial_jobs_on_GPC.3F | How do I run serial jobs on GPC? ]]&lt;br /&gt;
* '''No number of nodes specified''': Jobs submitted to the main queue must request a specific number of nodes, either in the submission script (with a line like &amp;lt;tt&amp;gt;#PBS -l nodes=2:ppn=8&amp;lt;/tt&amp;gt;) or on the command line (eg, &amp;lt;tt&amp;gt;qsub -l nodes=2:ppn=8,walltime=5:00:00 script.pbs&amp;lt;/tt&amp;gt;).  Note that for the debug queue, you can get away without specifying a number of nodes and a default of one will be assigned; for both technical and policy reasons, we do not enforce such a default for the main (&amp;quot;batch&amp;quot;) queue.&lt;br /&gt;
* '''There is a 15 minute walltime minimum''' on all queues except debug; if you request a walltime shorter than this, the job will be rejected.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Submitting my job fails, saying: &amp;quot;script is written in DOS/Windows text format&amp;quot; ===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Very likely you wrote your script on a Windows machine, so to fix this you just need to change the format of your submission script from Windows/DOS to Unix.&lt;br /&gt;
Use the command below on all your script files:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dos2unix &amp;lt;pbs-script-file&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where &amp;lt;pbs-script-file&amp;gt; should be replaced with the name of your script file.&lt;br /&gt;
&lt;br /&gt;
==Running your jobs==&lt;br /&gt;
&lt;br /&gt;
===My job can't write to /home===&lt;br /&gt;
&lt;br /&gt;
My code works fine when I test on the development nodes, but when I submit a job, or even run interactively in the development queue on GPC, it fails.  What's wrong?&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
As [[Data_Management#Home_Disk_Space | discussed]] [https://support.scinet.utoronto.ca/wiki/images/5/54/SciNet_Tutorial.pdf elsewhere], &amp;lt;tt&amp;gt;/home&amp;lt;/tt&amp;gt; is mounted read-only on the compute nodes; you can only write to &amp;lt;tt&amp;gt;/home&amp;lt;/tt&amp;gt; from the login nodes and devel nodes.  (The [[GPC_Quickstart#128Glargemem | largemem nodes]] on GPC, in this respect, are more like devel nodes than compute nodes).   In general, to run jobs you can read from &amp;lt;tt&amp;gt;/home&amp;lt;/tt&amp;gt; but you'll have to write to &amp;lt;tt&amp;gt;/scratch&amp;lt;/tt&amp;gt; (or, if you were allocated space through the RAC process, on &amp;lt;tt&amp;gt;/project&amp;lt;/tt&amp;gt;).  More information on SciNet filesytems can be found on our [[Data_Management | Data Management]] page.&lt;br /&gt;
&lt;br /&gt;
===OpenMP on the TCS===&lt;br /&gt;
&lt;br /&gt;
How do I run an OpenMP job on the TCS?&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Please look at the [[TCS_Quickstart#Submission_Script_for_an_OpenMP_Job | TCS Quickstart ]] page.&lt;br /&gt;
&lt;br /&gt;
===Can I can use hybrid codes consisting of MPI and openMP on the GPC?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Yes. Please look at the [[GPC_Quickstart#Hybrid_MPI.2FOpenMP_jobs | GPC Quickstart ]] page.&lt;br /&gt;
&lt;br /&gt;
===How do I run serial jobs on GPC?===&lt;br /&gt;
&lt;br /&gt;
'''Answer''':&lt;br /&gt;
&lt;br /&gt;
It should be said first that SciNet is a parallel computing resource, &lt;br /&gt;
and our priority will always be parallel jobs.  Having said that, if &lt;br /&gt;
you can make efficient use of the resources with serial jobs and get &lt;br /&gt;
good science done, that's good too, and we're happy to help you.&lt;br /&gt;
&lt;br /&gt;
The GPC nodes each have 8 processing cores, and making efficient use of these &lt;br /&gt;
nodes means using all eight cores.  As a result, we'd like to have the &lt;br /&gt;
users take up whole nodes (eg, run multiples of 8 jobs) at a time.  &lt;br /&gt;
&lt;br /&gt;
The best strategy depends on the nature of your job. Several approaches are presented on the [[User_Serial|serial run wiki page]].&lt;br /&gt;
&lt;br /&gt;
===Why can't I request only a single cpu for my job on GPC?===&lt;br /&gt;
&lt;br /&gt;
'''Answer''':&lt;br /&gt;
&lt;br /&gt;
On the GPC, compute resources are allocated by the node - that is, in chunks of 8 processors.  If you want to run jobs that each require only one processor, you need to bundle them into groups of 8, so as not to waste the other 7 cores for 48 hours. See the [[User_Serial|serial run wiki page]].&lt;br /&gt;
&lt;br /&gt;
===How do I run serial jobs on TCS?===&lt;br /&gt;
&lt;br /&gt;
'''Answer''': You don't.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===But in the queue I found a user who is running jobs on GPC, each of which is using only one processor, so why can't I?===&lt;br /&gt;
&lt;br /&gt;
'''Answer''':&lt;br /&gt;
&lt;br /&gt;
The pradat* and atlaspt* jobs, amongst others, are jobs of the ATLAS high energy physics project. That they are reported as single cpu jobs is an artifact of the moab scheduler. They are in fact being automatically bundled into 8-job bundles but have to run individually to be compatible with their international grid-based systems.&lt;br /&gt;
&lt;br /&gt;
===How do I use the ramdisk on GPC?===&lt;br /&gt;
&lt;br /&gt;
To use the ramdisk, create, read from, and write to files in /dev/shm, just as one would in (eg) ${SCRATCH}. Only the amount of RAM needed to store the files will be taken up by the temporary file system; thus if you have 8 serial jobs each requiring 1 GB of RAM, and 1GB is taken up by various OS services, you would still have approximately 7GB available to use as ramdisk on a 16GB node. However, if you were to write 8 GB of data to the ramdisk, this would exceed the available memory and your job would likely crash.&lt;br /&gt;
&lt;br /&gt;
It is very important to delete your files from ram disk at the end of your job. If you do not do this, the next user to use that node will have less RAM available than they might expect, and this might kill their jobs.&lt;br /&gt;
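&lt;br /&gt;
A minimal sketch of how this might look inside a submission script (the file and program names are hypothetical):&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# stage input into the ramdisk and run there&lt;br /&gt;
mkdir -p /dev/shm/$USER&lt;br /&gt;
cp ${SCRATCH}/input.dat /dev/shm/$USER/&lt;br /&gt;
cd /dev/shm/$USER&lt;br /&gt;
${PBS_O_WORKDIR}/run_my_code input.dat output.dat&lt;br /&gt;
&lt;br /&gt;
# copy results back to scratch&lt;br /&gt;
cp /dev/shm/$USER/output.dat ${SCRATCH}/&lt;br /&gt;
&lt;br /&gt;
# IMPORTANT: clean up the ramdisk at the end of the job&lt;br /&gt;
rm -rf /dev/shm/$USER&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;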
&lt;br /&gt;
''More details on how to setup your script to use the ramdisk can be found on the [[User_Ramdisk|Ramdisk wiki page]].''&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== How can I run a job longer than 48 hours? ===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
The SciNet queues have a limit of 48 hours per job.  This is pretty typical for systems of this size in Canada and elsewhere, and larger systems commonly have shorter limits.  The limits are there to ensure that every user gets a fair share of the system (so that no one user ties up lots of nodes for a long time), and for safety (so that if one memory board in one node fails in the middle of a very long job, you haven't lost a month's worth of work).&lt;br /&gt;
&lt;br /&gt;
Since many of us have simulations that require more than that much time, most widely-used scientific applications have &amp;quot;checkpoint-restart&amp;quot; functionality, where every so often the complete state of the calculation is stored as a checkpoint file, and one can restart a simulation from one of these.   In fact, these restart files tend to be quite useful for a number of purposes.&lt;br /&gt;
&lt;br /&gt;
If your job will take longer, you will have to submit your job in multiple parts, restarting from a checkpoint each time.  In this way, one can run a simulation much longer than the queue limit.  In fact, one can even write job scripts which automatically re-submit themselves until a run is completed, using [[FAQ#How_can_I_automatically_resubmit_a_job.3F | automatic resubmission. ]]&lt;br /&gt;
&lt;br /&gt;
=== Why did showstart say it would take 3 hours for my job to start before, and now it says my job will start in 10 hours? ===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Please look at the [[FAQ#How_do_priorities_work.2Fwhy_did_that_job_jump_ahead_of_mine_in_the_queue.3F | How do priorities work/why did that job jump ahead of mine in the queue? ]] page.&lt;br /&gt;
&lt;br /&gt;
===How do priorities work/why did that job jump ahead of mine in the queue?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
The [[Moab | queueing system]] used on SciNet machines is a [http://en.wikipedia.org/wiki/Priority_queue Priority Queue].  Jobs enter the queue at the back of the queue, and slowly make their way to the front as those ahead of them are run; but a job that enters the queue with a higher priority can `cut in line'.&lt;br /&gt;
&lt;br /&gt;
The main factor which determines priority is whether or not the user (or their PI) has an [http://wiki.scinethpc.ca/wiki/index.php/Application_Process RAC allocation].  These are competitively allocated grants of computer time; there is a call for proposals towards the end of every calendar year.    Users with an allocation have high priorities in an attempt to make sure that they can use the amount of computer time the committees granted them.   Their priority decreases as they approach their allotted usage over the current window of time; by the time that they have exhausted that allotted usage, their priority is the same as users with no allocation (unallocated, or `default' users).    Unallocated users have a fixed, low, priority.&lt;br /&gt;
&lt;br /&gt;
This priority system is called `fairshare'; the scheduler attempts to make sure everyone has their fair share of the machines, where the share that's fair has been determined by the allocation committee.    The fairshare window is a rolling window of one week; that is, any time you have a job in the queue, the fairshare calculation of its priority is given by how much of your allocation of the machine has been used in the last 7 days.&lt;br /&gt;
&lt;br /&gt;
A particular allocation might have some fraction of GPC - say 4% of the machine (if the PI had been allocated 10 million CPU hours on GPC). The allocations have labels (called `Resource Allocation Proposal Identifiers', or RAPIs) that look something like&lt;br /&gt;
&lt;br /&gt;
  abc-123-ab&lt;br /&gt;
&lt;br /&gt;
where abc-123 is the PI's CCRI, and the suffix specifies which of the allocations granted to the PI is to be used.  These can be specified on a job-by-job basis.  On GPC, one adds the line&lt;br /&gt;
 #PBS -A RAPI&lt;br /&gt;
to your script; on TCS, one uses&lt;br /&gt;
 # @ account_no = RAPI&lt;br /&gt;
If the allocation to charge isn't specified, a default is used; each user has such a default, which can be changed at the same portal where one changes one's password:&lt;br /&gt;
&lt;br /&gt;
 https://portal.scinet.utoronto.ca/&lt;br /&gt;
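&lt;br /&gt;
For example, the header of a GPC submission script that charges a specific allocation might look as follows (using the example RAPI above; the resource request is just an illustration):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 #PBS -l nodes=2:ppn=8,walltime=10:00:00&lt;br /&gt;
 #PBS -A abc-123-ab&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;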
&lt;br /&gt;
A job's priority is determined primarily by the fairshare priority of the allocation it is being charged to; the previous 7 days' worth of use under that allocation is calculated and compared to the allocated fraction (here, 4%) of the machine over that window (here, 7 days).  The fairshare priority is a decreasing function of the allocation left; if there is no allocation left (eg, jobs running under that allocation have already used 379,038 CPU hours in the past 7 days), the priority is the same as that of a user with no granted allocation.  (This last part has been the topic of some debate; as the machine gets more utilized, it will probably be the case that we allow RAC users who have greatly overused their quota to have their priorities drop below that of unallocated users, to give the unallocated users some chance to run on our increasingly crowded system; this would have no undue effect on our allocated users, as they would still be able to use the amount of resources they had been allocated by the committees.)  Note that all jobs charging the same allocation get the same fairshare priority.&lt;br /&gt;
&lt;br /&gt;
There are other factors that go into calculating priority, but fairshare is the most significant.   Other factors include&lt;br /&gt;
* length of time waiting in queue (measured in units of the requested runtime). A waiting queue job gains priority as it sits in the queue to avoid job starvation. &lt;br /&gt;
* User adjustment of priorities ( See below ).&lt;br /&gt;
&lt;br /&gt;
The major effect of these subdominant terms is to shuffle the order of jobs running under the same allocation.&lt;br /&gt;
&lt;br /&gt;
===How do we manage job priorities within our research group?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Obviously, managing shared resources within a large group - whether it &lt;br /&gt;
is conference funding or CPU time - takes some doing.   &lt;br /&gt;
&lt;br /&gt;
It's important to note that the fairshare window is intentionally kept &lt;br /&gt;
quite short - just one week long. So, for example, let us say that your resource &lt;br /&gt;
allocation gives you about 10% of the machine.  Then for someone to use &lt;br /&gt;
up the whole week's amount of time in 2 days, they'd have to use 35% &lt;br /&gt;
of the machine in those two days - which is unlikely to happen by &lt;br /&gt;
accident.  If that does happen, &lt;br /&gt;
those using the same allocation as the person who used 35% of the &lt;br /&gt;
machine over those two days will suffer by having much lower priority for &lt;br /&gt;
their jobs, but only for the next 5 days - and even then, if there are &lt;br /&gt;
idle cpus they'll still be able to compute.&lt;br /&gt;
&lt;br /&gt;
There will be online tools for seeing how the allocation is being used, &lt;br /&gt;
and the people in charge of your group will be able to use &lt;br /&gt;
that information to manage its users, telling them to dial their usage down or &lt;br /&gt;
up.   We know that managing a large research group is hard, and we want &lt;br /&gt;
to make sure we provide you the information you need to do your job &lt;br /&gt;
effectively.&lt;br /&gt;
&lt;br /&gt;
One way for users within a group to manage their priorities within the group&lt;br /&gt;
is with [[Moab#Adjusting_Job_Priority | user-adjusted priorities]]; this is&lt;br /&gt;
described in more detail on the [[Moab | Scheduling System]] page.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--&lt;br /&gt;
==Errors in running jobs==&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== I couldn't find the  .o output file in the .pbs_spool directory as I used to ===&lt;br /&gt;
&lt;br /&gt;
On Feb 24 2011, the temporary location of standard input and output files was moved from the shared file system ${SCRATCH}/.pbs_spool to the&lt;br /&gt;
node-local directory /var/spool/torque/spool (which resides in ram). The final location after a job has finished is unchanged,&lt;br /&gt;
but to check the output/error of running jobs, users will now have to ssh into the (first) node assigned to the job and look in&lt;br /&gt;
/var/spool/torque/spool.&lt;br /&gt;
&lt;br /&gt;
This alleviates access contention to the temporary directory, especially for those users that are running a lot of jobs, and  reduces the burden on the file system in general.&lt;br /&gt;
&lt;br /&gt;
Note that it is good practice to redirect output to a file rather than to count on the scheduler to do this for you.&lt;br /&gt;
&lt;br /&gt;
=== My GPC job died, telling me `Copy Stageout Files Failed' ===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
When a job runs on GPC, the script's standard output and error are redirected to &lt;br /&gt;
&amp;lt;tt&amp;gt;$PBS_JOBID.gpc-sched.OU&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;$PBS_JOBID.gpc-sched.ER&amp;lt;/tt&amp;gt; in&lt;br /&gt;
/var/spool/torque/spool on the (first) node on which your job is running.  At the end of the job, those .OU and .ER files are copied to where the batch script tells them to be copied, by default &amp;lt;tt&amp;gt;$PBS_JOBNAME.o$PBS_JOBID&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;$PBS_JOBNAME.e$PBS_JOBID&amp;lt;/tt&amp;gt;.  (You can set those filenames to be something clearer with the -e and -o options in your PBS script.)&lt;br /&gt;
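&lt;br /&gt;
For example (a sketch; the file names are hypothetical):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 #PBS -o myjob.out&lt;br /&gt;
 #PBS -e myjob.err&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;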
&lt;br /&gt;
When you get errors like this:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
An error has occurred processing your job, see below.&lt;br /&gt;
request to copy stageout files failed on node&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
it means that the copying-back process has failed in some way.  There could be a few reasons for this. The first thing to check is to '''make sure that your .bashrc does not produce any output''', as the output stageout is performed by bash, and further output can cause it to fail.&lt;br /&gt;
It could also have been just a random filesystem error, or your job may have failed spectacularly enough to short-circuit the normal job-termination process (e.g. it ran out of memory very quickly), so those files never got copied.&lt;br /&gt;
&lt;br /&gt;
Write to [mailto:support@scinet.utoronto.ca &amp;lt;support@scinet.utoronto.ca&amp;gt;] if your input/output files got lost, as we will probably be able to retrieve them for you (please supply at least the jobid, and any other information that may be relevant). &lt;br /&gt;
&lt;br /&gt;
Mind you, it is good practice to redirect output to a file rather than to depend on the job scheduler to do this for you.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--&lt;br /&gt;
&lt;br /&gt;
===Another transport will be used instead===&lt;br /&gt;
&lt;br /&gt;
I get error messages like the following when running on the GPC at the start of the run, although the job seems to proceed OK.   Is this a problem?&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
--------------------------------------------------------------------------&lt;br /&gt;
[[45588,1],0]: A high-performance Open MPI point-to-point messaging module&lt;br /&gt;
was unable to find any relevant network interfaces:&lt;br /&gt;
&lt;br /&gt;
Module: OpenFabrics (openib)&lt;br /&gt;
  Host: gpc-f101n005&lt;br /&gt;
&lt;br /&gt;
Another transport will be used instead, although this may result in&lt;br /&gt;
lower performance.&lt;br /&gt;
--------------------------------------------------------------------------&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Everything's fine.   The two MPI libraries scinet provides work for both the InifiniBand and the Gigabit Ethernet interconnects, and will always try to use the fastest interconnect available.   In this case, you ran on normal gigabit GPC nodes with no infiniband; but the MPI libraries have no way of knowing this, and try the infiniband first anyway.  This is just a harmless `failover' message; it tried to use the infiniband, which doesn't exist on this node, then fell back on using Gigabit ethernet (`another transport').&lt;br /&gt;
&lt;br /&gt;
With OpenMPI, this can be avoided by not looking for infiniband; eg, by using the option&lt;br /&gt;
&lt;br /&gt;
--mca btl ^openib&lt;br /&gt;
&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===IB Memory Errors, eg &amp;lt;tt&amp;gt; reg_mr Cannot allocate memory &amp;lt;/tt&amp;gt;===&lt;br /&gt;
&lt;br /&gt;
Infiniband requires more memory than ethernet; it can use RDMA (remote direct memory access) transport for which it sets aside registered memory to transfer data.&lt;br /&gt;
&lt;br /&gt;
In our current network configuration, it requires a ''lot'' more memory, particularly as you go to larger process counts; unfortunately, that means you can't get around the &amp;quot;I need more memory&amp;quot; problem the usual way, by running on more nodes.  Machines with different memory or &lt;br /&gt;
network configurations may exhibit this problem at higher or lower MPI &lt;br /&gt;
task counts.&lt;br /&gt;
&lt;br /&gt;
Right now, the best workaround is to reduce the number and size of OpenIB queues using XRC: with OpenMPI, add the following options to your mpirun command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-mca btl_openib_receive_queues X,128,256,192,128:X,2048,256,128,32:X,12288,256,128,32 -mca btl_openib_max_send_size 12288&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
With Intel MPI, you should be able to do&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load intelmpi/4.0.3.008&lt;br /&gt;
mpirun -genv I_MPI_FABRICS=shm:ofa  -genv I_MPI_OFA_USE_XRC=1 -genv I_MPI_OFA_DYNAMIC_QPS=1 -genv I_MPI_DEBUG=5 -np XX ./mycode&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
to the same end.  &lt;br /&gt;
&lt;br /&gt;
For more information see [[GPC MPI Versions]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===My compute job fails, saying &amp;lt;tt&amp;gt;libpng12.so.0: cannot open shared object file&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;libjpeg.so.62: cannot open shared object file&amp;lt;/tt&amp;gt;===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
To maximize the amount of memory available for compute jobs, the compute nodes have a less complete system image than the development nodes.  In particular, since graphics packages like matplotlib and gnuplot are usually used interactively, the libraries they need are included in the devel nodes' image but not in the compute nodes' image.&lt;br /&gt;
&lt;br /&gt;
Many of these extra libraries are, however, available in the &amp;quot;extras&amp;quot; module.   So adding a &amp;quot;module load extras&amp;quot; to your job submission  script - or, for overkill, to your .bashrc - should enable these scripts to run on the compute nodes.&lt;br /&gt;
&lt;br /&gt;
==Monitoring jobs in the queue==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Why hasn't my job started?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Use the moab command &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
checkjob -v jobid&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
and the last couple of lines should explain why a job hasn't started.  &lt;br /&gt;
&lt;br /&gt;
Please see [[Moab| Job Scheduling System (Moab) ]] for more detailed information&lt;br /&gt;
&lt;br /&gt;
===How do I figure out when my job will run?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Please see [[Moab#Available_Resources| Job Scheduling System (Moab) ]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- ===My GPC job is Held, and checkjob says &amp;quot;Batch:PolicyViolation&amp;quot; ===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
When this happens, you'll see your job stuck in a BatchHold state.  &lt;br /&gt;
This happens because the job you've submitted breaks one of the rules of the queues, and is being held until you modify it or kill it and re-submit a conforming job.  The most common problems are:&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Running checkjob on my job gives me messages about JobFail and rejected===&lt;br /&gt;
&lt;br /&gt;
Running checkjob on my job gives me messages that suggest my job has failed, as below: what did I do wrong?&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
AName: test&lt;br /&gt;
State: Idle &lt;br /&gt;
Creds:  user:xxxxxx  group:xxxxxxxx  account:xxxxxxxx  class:batch_ib  qos:ibqos&lt;br /&gt;
WallTime:   00:00:00 of 8:00:00&lt;br /&gt;
BecameEligible: Wed Jul 23 10:39:27&lt;br /&gt;
SubmitTime: Wed Jul 23 10:38:22&lt;br /&gt;
  (Time Queued  Total: 00:01:47  Eligible: 00:01:05)&lt;br /&gt;
&lt;br /&gt;
Total Requested Tasks: 8&lt;br /&gt;
&lt;br /&gt;
Req[0]  TaskCount: 8  Partition: ALL  &lt;br /&gt;
Opsys: centos6computeA  Arch: ---  Features: ---&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Notification Events: JobFail&lt;br /&gt;
&lt;br /&gt;
IWD:            /scratch/x/xxxxxxxx/xxxxxxx/xxxxxxx&lt;br /&gt;
Partition List: torque,DDR&lt;br /&gt;
Flags:          RESTARTABLE&lt;br /&gt;
Attr:           checkpoint&lt;br /&gt;
StartPriority:  76&lt;br /&gt;
rejected for Opsys        - (null)&lt;br /&gt;
rejected for State        - (null)&lt;br /&gt;
rejected for Reserved     - (null)&lt;br /&gt;
NOTE:  job req cannot run in partition torque (available procs do not meet requirements : 0 of 8 procs found)&lt;br /&gt;
idle procs: 793  feasible procs:   0&lt;br /&gt;
&lt;br /&gt;
Node Rejection Summary: [Opsys: 117][State: 2895][Reserved: 19]&lt;br /&gt;
&lt;br /&gt;
NOTE:  job violates constraints for partition SANDY (partition SANDY not in job partition mask)&lt;br /&gt;
&lt;br /&gt;
NOTE:  job violates constraints for partition GRAVITY (partition GRAVITY not in job partition mask)&lt;br /&gt;
&lt;br /&gt;
rejected for State        - (null)&lt;br /&gt;
NOTE:  &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
The output from checkjob is a little cryptic in places, and if you are wondering why your job hasn't started yet, you might think that &amp;quot;rejected&amp;quot; and &amp;quot;JobFail&amp;quot; suggest that there's something wrong.  But the above message is actually normal; you can use the &amp;lt;tt&amp;gt;showstart&amp;lt;/tt&amp;gt; command on your job to get a (preliminary, subject to change) estimate as to when the job will start, and you'll find that it is in fact scheduled to start in the near future.&lt;br /&gt;
&lt;br /&gt;
In the above message:&lt;br /&gt;
&lt;br /&gt;
* `Notification Events: JobFail` just means that, if notifications are enabled, you'll get a message if the job fails;&lt;br /&gt;
* `job req cannot run in partition torque` just means that the job cannot run just yet (that's why it's queued);&lt;br /&gt;
* `job req cannot run in dynamic partition DDR now (insufficient procs available: 0 &amp;lt; 8)` says why: there aren't processors available; and&lt;br /&gt;
* `job violates constraints for partition SANDY/GRAVITY` just means that the job isn't eligible to run in those particular (small) sections of the cluster.&lt;br /&gt;
&lt;br /&gt;
That is, the above output is the normal and expected (if somewhat cryptic) explanation of why the job is waiting - nothing to worry about.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===How can I monitor my running jobs on TCS?===&lt;br /&gt;
&lt;br /&gt;
How can I monitor the load of TCS jobs?&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
You can get more information with the command &lt;br /&gt;
 /xcat/tools/tcs-scripts/LL/jobState.sh&lt;br /&gt;
which you can alias as:&lt;br /&gt;
 alias llq1='/xcat/tools/tcs-scripts/LL/jobState.sh'&lt;br /&gt;
If you run &amp;quot;llq1 -n&amp;quot; you will see a listing of jobs together with a lot of information, including the load.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===How can I check the memory usage from my jobs?===&lt;br /&gt;
&lt;br /&gt;
How can I check the memory usage from my jobs?&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
On many occasions it can be really useful to take a look at how much memory your job is using while it is running.&lt;br /&gt;
There are a couple of ways to do so:&lt;br /&gt;
&lt;br /&gt;
1) Use some of the [https://wiki.scinet.utoronto.ca/wiki/index.php/SciNet_Command_Line_Utilities command line utilities] we have developed, e.g. the '''jobperf''' or '''jobtop''' utilities, which allow you to check the job's performance and the head node's utilization, respectively.&lt;br /&gt;
&lt;br /&gt;
2) ''ssh'' into the nodes where your job is running and check memory usage and system stats right there, for instance with the 'top' or 'free' commands on those nodes.&lt;br /&gt;
&lt;br /&gt;
Also, it is always a good idea, and strongly encouraged, to inspect the standard output and error logs generated by your job submissions.&lt;br /&gt;
These files are named ''JobName.{o|e}jobIdNumber'', where ''JobName'' is the name you gave to the job (via the '-N' PBS flag) and ''JobIdNumber'' is the id number of the job.&lt;br /&gt;
These files are saved in the working directory after the job has finished, but they can also be accessed in real time using the '''jobError''' and '''jobOutput''' [https://wiki.scinet.utoronto.ca/wiki/index.php/SciNet_Command_Line_Utilities command line utilities] available by loading the ''extras'' module.&lt;br /&gt;
&lt;br /&gt;
Other related topics to memory usage: &amp;lt;br&amp;gt;&lt;br /&gt;
[https://wiki.scinet.utoronto.ca/wiki/index.php/GPC_Quickstart#Ram_Disk Using Ram Disk]&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
[https://wiki.scinet.utoronto.ca/wiki/index.php/GPC_Quickstart#Memory_Configuration Different Memory Configuration nodes]&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
[https://wiki.scinet.utoronto.ca/wiki/index.php/FAQ#Monitoring_jobs_in_the_queue Monitoring Jobs in the Queue]&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
[https://wiki.scinet.utoronto.ca/wiki/images/a/a0/TechTalkJobMonitoring.pdf Tech Talk on Monitoring Jobs]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Can I run cron jobs on devel nodes to monitor my jobs?===&lt;br /&gt;
&lt;br /&gt;
Can I run cron jobs on devel nodes to monitor my jobs?&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
No, we do not permit cron jobs to be run by users.  To monitor the status of your jobs using a cron job running on your own machine, use the command&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh myusername@login.scinet.utoronto.ca &amp;quot;qstat -u myusername&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
or some variation of this command.  Of course, you will need to have SSH keys set up on the machine running the cron job, so that password entry won't be necessary.&lt;br /&gt;
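&lt;br /&gt;
For instance, a minimal sketch of a crontab entry on your own machine (the log file name is hypothetical):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 # check once an hour and append the result to a local log file&lt;br /&gt;
 0 * * * * ssh myusername@login.scinet.utoronto.ca &amp;quot;qstat -u myusername&amp;quot; &amp;gt;&amp;gt; $HOME/scinet-jobs.log 2&amp;gt;&amp;amp;1&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;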
&lt;br /&gt;
&lt;br /&gt;
=== How does one check the amount of used CPU-hours in a project, and how does one get statistics for each user in the project? ===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
This information is available on the SciNet portal, https://portal.scinet.utoronto.ca. See also [[SciNet Usage Reports]].&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--&lt;br /&gt;
==Errors in running jobs==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--&lt;br /&gt;
===On GPC, `Job cannot be executed'===&lt;br /&gt;
&lt;br /&gt;
I get error messages like this trying to run on GPC:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
PBS Job Id: 30414.gpc-sched&lt;br /&gt;
Job Name:   namd&lt;br /&gt;
Exec host:  gpc-f120n011/7+gpc-f120n011/6+gpc-f120n011/5+gpc-f120n011/4+gpc-f120n011/3+gpc-f120n011/2+gpc-f120n011/1+gpc-f120n011/0&lt;br /&gt;
Aborted by PBS Server &lt;br /&gt;
Job cannot be executed&lt;br /&gt;
See Administrator for help&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
PBS Job Id: 30414.gpc-sched&lt;br /&gt;
Job Name:   namd&lt;br /&gt;
Exec host:  gpc-f120n011/7+gpc-f120n011/6+gpc-f120n011/5+gpc-f120n011/4+gpc-f120n011/3+gpc-f120n011/2+gpc-f120n011/1+gpc-f120n011/0&lt;br /&gt;
An error has occurred processing your job, see below.&lt;br /&gt;
request to copy stageout files failed on node 'gpc-f120n011/7+gpc-f120n011/6+gpc-f120n011/5+gpc-f120n011/4+gpc-f120n011/3+gpc-f120n011/2+gpc-f120n011/1+gpc-f120n011/0' for job 30414.gpc-sched&lt;br /&gt;
&lt;br /&gt;
Unable to copy file 30414.gpc-sched.OU to USER@gpc-f101n084.scinet.local:/scratch/G/GROUP/USER/projects/sim-performance-test/runtime/l/namd/8/namd.o30414&lt;br /&gt;
*** error from copy&lt;br /&gt;
30414.gpc-sched.OU: No such file or directory&lt;br /&gt;
*** end error output&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Try doing the following:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
mkdir ${SCRATCH}/.pbs_spool&lt;br /&gt;
ln -s ${SCRATCH}/.pbs_spool ~/.pbs_spool&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This is how all new accounts are setup on SciNet.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;tt&amp;gt;/home&amp;lt;/tt&amp;gt; on GPC for compute jobs is mounted as a read-only file system.   &lt;br /&gt;
PBS by default tries to spool its output  files to &amp;lt;tt&amp;gt;${HOME}/.pbs_spool&amp;lt;/tt&amp;gt;&lt;br /&gt;
which fails as it tries to write to a read-only file  &lt;br /&gt;
system.    New accounts at SciNet  get around this by having ${HOME}/.pbs_spool  &lt;br /&gt;
point to somewhere appropriate on &amp;lt;tt&amp;gt;/scratch&amp;lt;/tt&amp;gt;, but if you've deleted that link&lt;br /&gt;
or directory, or had an old account, you will see errors like the above.&lt;br /&gt;
&lt;br /&gt;
'''On Feb 24, the input/output mechanism has been reconfigured to use a local ramdisk as the temporary location, which means that .pbs_spool is no longer needed and this error should not occur anymore.'''&lt;br /&gt;
&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Data on SciNet disks==&lt;br /&gt;
&lt;br /&gt;
===How do I find out my disk usage?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
The standard unix/linux utilities for finding the amount of disk space used by a directory are very slow, and notoriously inefficient on the GPFS filesystems that we run on the SciNet systems.  There are utilities that very quickly report your disk usage:&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;tt&amp;gt;'''diskUsage'''&amp;lt;/tt&amp;gt; command, available with the 'extras' module on the login nodes, datamovers and the GPC devel nodes, provides information in a number of ways on the home, scratch, and project file systems. For instance, it can report how much disk space is being used by yourself and your group (with the -a option), how much your usage has changed over a certain period (&amp;quot;delta information&amp;quot;), and it can generate plots of your usage over time.&lt;br /&gt;
Note that this information is only updated hourly!&lt;br /&gt;
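&lt;br /&gt;
For example, a quick sketch of checking your usage from a GPC devel node:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 module load extras&lt;br /&gt;
 diskUsage       # your own usage on the home, scratch and project file systems&lt;br /&gt;
 diskUsage -a    # also include your group's usage&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;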
&lt;br /&gt;
More information about these filesystems is available on the [[Data_Management | Data Management]] page.&lt;br /&gt;
&lt;br /&gt;
===How do I transfer data to/from SciNet?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
All incoming connections to SciNet go through relatively low-speed connections to the &amp;lt;tt&amp;gt;login.scinet&amp;lt;/tt&amp;gt; gateways, so using scp to copy files the same way you ssh in is not an effective way to move lots of data.  Better tools are described in our page on [[Data_Management#Data_Transfer | Data Transfer]].&lt;br /&gt;
&lt;br /&gt;
===My group works with data files of size 1-2 GB.  Is this too large to  transfer by scp to login.scinet.utoronto.ca ?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Generally, occasional transfers of data smaller than 10GB are perfectly acceptable to do through the login nodes. See [[Data_Management#Data_Transfer | Data Transfer]].&lt;br /&gt;
&lt;br /&gt;
===How can I check if I have files in /scratch that are scheduled for automatic deletion?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Please see [[Storage_Quickstart#Scratch_Disk_Purging_Policy | Storage At SciNet]]&lt;br /&gt;
&lt;br /&gt;
===How do I allow my supervisor to manage files for me using ACL-based commands?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Please see [[Data_Management#File.2FOwnership_Management_.28ACL.29 | File/Ownership Management]]&lt;br /&gt;
&lt;br /&gt;
===Can we buy extra storage space on SciNet?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
Yes, please see [[Data_Management#Buying_storage_space_on_GPFS_or_HPSS | Buying storage space on GPFS or HPSS ]] for more details.&lt;br /&gt;
&lt;br /&gt;
===Can I transfer files between BGQ and HPSS?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
Yes, please see [https://support.scinet.utoronto.ca/wiki/index.php/BGQ#Bridge_to_HPSS Bridge to HPSS ]  for more details.&lt;br /&gt;
&lt;br /&gt;
==Keep 'em Coming!==&lt;br /&gt;
&lt;br /&gt;
===Next question, please===&lt;br /&gt;
&lt;br /&gt;
Send your question to [mailto:support@scinet.utoronto.ca &amp;lt;support@scinet.utoronto.ca&amp;gt;];  we'll answer it asap!&lt;/div&gt;</summary>
		<author><name>Dgruner</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=395</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=395"/>
		<updated>2018-05-25T01:20:55Z</updated>

		<summary type="html">&lt;p&gt;Dgruner: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
{| style=&amp;quot;border-spacing:10px; width: 95%&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:1em; padding-top:.1em; border:2px solid #0645ad; background-color:#f6f6f6; border-radius:7px&amp;quot;|&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Use &amp;quot;Up&amp;quot; or &amp;quot;Down&amp;quot;; these are templates. &amp;quot;Up2&amp;quot; and &amp;quot;Down2&amp;quot; allow for external references. --&amp;gt;&lt;br /&gt;
&lt;br /&gt;
{|style=&amp;quot;width:65%&amp;quot; &lt;br /&gt;
|style=&amp;quot;width:10%&amp;quot;|{{Up|Niagara|Niagara_Quickstart}}&lt;br /&gt;
|style=&amp;quot;width:10%&amp;quot;|{{Up|HPSS|HPSS}}&lt;br /&gt;
|style=&amp;quot;width:10%&amp;quot;|{{Up|BGQ|https://wiki.scinet.utoronto.ca/wiki/index.php/BGQ}}&lt;br /&gt;
|-&lt;br /&gt;
|style=&amp;quot;width:10%&amp;quot;|{{Up|P7|P7}}&lt;br /&gt;
|style=&amp;quot;width:10%&amp;quot;|{{Down|P8|P8}}&lt;br /&gt;
|style=&amp;quot;width:10%&amp;quot;|{{Up|SGC|SOSCIP_GPU}}&lt;br /&gt;
|-&lt;br /&gt;
|style=&amp;quot;width:10%&amp;quot;|{{Up|Scheduler|Niagara_Quickstart#Submitting_jobs}}&lt;br /&gt;
|style=&amp;quot;width:10%&amp;quot;|{{Up|External Network|External Network}}&lt;br /&gt;
|style=&amp;quot;width:10%&amp;quot;|{{Up|File system|Niagara_Quickstart#Storage_and_quotas}} &lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;div style=&amp;quot;font-size:10pt;&amp;quot;&amp;gt;&lt;br /&gt;
Current Messages:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  When removing system status entries, please archive them to: https://docs.scinet.utoronto.ca/index.php/Previous_messages --&amp;gt;&lt;br /&gt;
* May 24, 2018: The data centre is under annual maintenance. All systems are offline. Systems are expected to be back late afternoon today; check for updates on this page.&lt;br /&gt;
* May 18, 2018: Announcement: Annual scheduled maintenance downtime: Thursday May 24, starting 7:00 AM&lt;br /&gt;
* May 16, 2018: Cooling  restored, systems online&lt;br /&gt;
* May 16, 2018: Cooling issue at datacentre again, all systems down&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
{|style=&amp;quot;border-spacing: 10px;width: 95%&amp;quot;&lt;br /&gt;
|valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== QuickStart Guides ==&lt;br /&gt;
* [[Niagara_Quickstart|Niagara cluster for large parallel jobs]]&lt;br /&gt;
* [[HPSS | HPSS archival storage]]&lt;br /&gt;
* [https://wiki.scinet.utoronto.ca/wiki/index.php/BGQ SOSCIP BlueGene/Q cluster]&lt;br /&gt;
* [[SOSCIP_GPU | SOSCIP GPU cluster]]&lt;br /&gt;
* [[P7|Experimental Power 7 cluster]]&lt;br /&gt;
* [[P8|Experimental Power 8 GPU cluster]]&lt;br /&gt;
* [[FAQ | FAQ (frequently asked questions)]]&lt;br /&gt;
* [[Acknowledging_SciNet | Acknowledging SciNet]]&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== Tutorials, Manuals, etc. ==&lt;br /&gt;
* [https://courses.scinet.utoronto.ca SciNet education material]&lt;br /&gt;
* [https://www.youtube.com/channel/UC42CaO-AAQhwqa8RGzE3daQ SciNet's YouTube channel]&lt;br /&gt;
* [[Modules specific to Niagara]] &lt;br /&gt;
* [[Burst Buffer]]&lt;br /&gt;
* [[SSH Tunneling]]&lt;br /&gt;
* [[Visualization]]&lt;br /&gt;
* [[Running Serial Jobs on Niagara]]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Dgruner</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=42</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=42"/>
		<updated>2018-04-11T18:28:34Z</updated>

		<summary type="html">&lt;p&gt;Dgruner: /* For Non-GPC Users */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Niagara =&lt;br /&gt;
&lt;br /&gt;
==System architecture==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Total of 60,000 Intel x86-64 cores.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;1,500 Lenovo SD530 nodes&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;2x Intel Skylake 6148 CPUs (40 cores @2.4GHz per node) (with hyperthreading to 80 threads &amp;amp;amp; AVX512).&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;3.02 PFlops delivered / 4.6 PFlops theoretical (would've been #42 on the TOP500 in Nov'18).&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;188 GiB / 202 GB RAM per node (at least 4 GiB/core for user jobs).&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Operating system: Linux (CentOS 7).&lt;br /&gt;
&amp;lt;li&amp;gt;Interconnect: EDR InfiniBand, Dragonfly+ topology with Adaptive Routing&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;1:1 up to 432 nodes, effectively 2:1 beyond that.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;No GPUs, no local disk.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Replaces the General Purpose Cluster (GPC) and Tightly Coupled System (TCS).&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Migration to Niagara ==&lt;br /&gt;
&lt;br /&gt;
=== Migration for Existing Users of the GPC ===&lt;br /&gt;
&lt;br /&gt;
* Accounts, $HOME &amp;amp;amp; $PROJECT of active GPC users transferred to Niagara (except dot-files in ~).&lt;br /&gt;
* Data stored in $SCRATCH will not be transfered automatically.&lt;br /&gt;
* Users are to clean up $SCRATCH on the GPC as much as possible (remember it's temporary data!). Then they can transfer what they need using datamover nodes. Let us know if you need help.&lt;br /&gt;
* To enable this transfer, there will be a short period during which you can have access to Niagara as well as to the GPC storage resources. This period will end no later than May 9, 2018.&lt;br /&gt;
&lt;br /&gt;
=== For Non-GPC Users ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Those of you new to SciNet, but with 2018 RAC allocations on Niagara, will have your accounts created and ready for you to login.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;New, non-RAC users: we are still working out the procedure to get access. If you can't wait, for now, you can follow the old route of requesting a SciNet Consortium Account on the [https://ccdb.computecanada.ca/me/facilities CCDB site].&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Using Niagara: Logging in ==&lt;br /&gt;
&lt;br /&gt;
As with all SciNet and CC (Compute Canada) compute systems, access to Niagara is via ssh (secure shell) only.&lt;br /&gt;
&lt;br /&gt;
To access SciNet systems, first open a terminal window (e.g. MobaXTerm on Windows).&lt;br /&gt;
&lt;br /&gt;
Then ssh into the Niagara login nodes with your CC credentials:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
$ ssh -Y MYCCUSERNAME@niagara.scinet.utoronto.ca&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
or&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;$ ssh -Y MYCCUSERNAME@niagara.computecanada.ca&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* The Niagara login nodes are where you develop, edit, compile, prepare and submit jobs.&lt;br /&gt;
* These login nodes are not part of the Niagara compute cluster, but have the same architecture, operating system, and software stack.&lt;br /&gt;
* The optional &amp;lt;code&amp;gt;-Y&amp;lt;/code&amp;gt; is needed to open windows from the Niagara command-line onto your local X server.&lt;br /&gt;
* To run on Niagara's compute nodes, you must submit a batch job.&lt;br /&gt;
&lt;br /&gt;
== Storage Systems and Locations ==&lt;br /&gt;
&lt;br /&gt;
=== Home and scratch ===&lt;br /&gt;
&lt;br /&gt;
You have a home and scratch directory on the system, whose locations will be given by&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;$HOME=/home/g/groupname/myccusername&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;$SCRATCH=/scratch/g/groupname/myccusername&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;nia-login07:~$ pwd&lt;br /&gt;
/home/s/scinet/rzon&lt;br /&gt;
&lt;br /&gt;
nia-login07:~$ cd $SCRATCH&lt;br /&gt;
&lt;br /&gt;
nia-login07:rzon$ pwd&lt;br /&gt;
/scratch/s/scinet/rzon&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Project location ===&lt;br /&gt;
&lt;br /&gt;
Users from groups with a RAC allocation will also have a project directory.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;$PROJECT=/project/g/groupname/myccusername&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''''IMPORTANT: Future-proof your scripts'''''&lt;br /&gt;
&lt;br /&gt;
Use the environment variables (HOME, SCRATCH, PROJECT) instead of the actual paths!  The paths may change in the future.&lt;br /&gt;
&lt;br /&gt;
=== Storage Limits on Niagara ===&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
! location&lt;br /&gt;
! quota&lt;br /&gt;
!align=&amp;quot;right&amp;quot;| block size&lt;br /&gt;
! expiration time&lt;br /&gt;
! backed up&lt;br /&gt;
! on login&lt;br /&gt;
! on compute&lt;br /&gt;
|-&lt;br /&gt;
| $HOME&lt;br /&gt;
| 100 GB&lt;br /&gt;
|align=&amp;quot;right&amp;quot;| 1 MB&lt;br /&gt;
| &lt;br /&gt;
| yes&lt;br /&gt;
| yes&lt;br /&gt;
| read-only&lt;br /&gt;
|-&lt;br /&gt;
| $SCRATCH&lt;br /&gt;
| 25 TB&lt;br /&gt;
|align=&amp;quot;right&amp;quot;| 16 MB&lt;br /&gt;
| 2 months&lt;br /&gt;
| no&lt;br /&gt;
| yes&lt;br /&gt;
| yes&lt;br /&gt;
|-&lt;br /&gt;
| $PROJECT&lt;br /&gt;
| by group allocation&lt;br /&gt;
|align=&amp;quot;right&amp;quot;| 16 MB&lt;br /&gt;
| &lt;br /&gt;
| yes&lt;br /&gt;
| yes&lt;br /&gt;
| yes&lt;br /&gt;
|-&lt;br /&gt;
| $ARCHIVE&lt;br /&gt;
| by group allocation&lt;br /&gt;
|align=&amp;quot;right&amp;quot;| &lt;br /&gt;
|&lt;br /&gt;
| dual-copy&lt;br /&gt;
| no&lt;br /&gt;
| no&lt;br /&gt;
|-&lt;br /&gt;
| $BBUFFER&lt;br /&gt;
| ?&lt;br /&gt;
|align=&amp;quot;right&amp;quot;| 1 MB&lt;br /&gt;
| very short&lt;br /&gt;
| no&lt;br /&gt;
| ?&lt;br /&gt;
| ?&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Compute nodes do not have local storage.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Archive space is on [https://wiki.scinet.utoronto.ca/wiki/index.php/HPSS HPSS].&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Backup means a recent snapshot, not an archive of all data that ever was.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;&amp;lt;code&amp;gt;$BBUFFER&amp;lt;/code&amp;gt; stands for the Burst Buffer, a functionality that is still being set up.  This will be a faster parallel storage tier for temporary data.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Moving data ===&lt;br /&gt;
&lt;br /&gt;
'''''Move amounts less than 10GB through the login nodes.'''''&lt;br /&gt;
&lt;br /&gt;
* Only Niagara login nodes visible from outside SciNet.&lt;br /&gt;
* Use scp or rsync to niagara.scinet.utoronto.ca or niagara.computecanada.ca (no difference).&lt;br /&gt;
* This will time out for amounts larger than about 10GB.&lt;br /&gt;
&lt;br /&gt;
'''''Move amounts larger than 10GB through the datamover nodes.'''''&lt;br /&gt;
&lt;br /&gt;
* From a Niagara login node, ssh to &amp;lt;code&amp;gt;nia-datamover1&amp;lt;/code&amp;gt; or  &amp;lt;code&amp;gt;nia-datamover2&amp;lt;/code&amp;gt;.&lt;br /&gt;
* Transfers must originate from this datamover.&lt;br /&gt;
* The other side (e.g. your machine) must be reachable from the outside.&lt;br /&gt;
* If you do this often, consider using Globus, a web-based tool for data transfer.&lt;br /&gt;
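&lt;br /&gt;
For example, combining the two cases above (a sketch; file names, the remote machine, and paths are placeholders):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;# small transfer (under 10GB), run from your own machine, through a login node:&lt;br /&gt;
$ scp mydata.tar.gz MYCCUSERNAME@niagara.scinet.utoronto.ca:/scratch/g/groupname/myccusername/&lt;br /&gt;
&lt;br /&gt;
# larger transfer: log in to Niagara, hop to a datamover, and push/pull from there:&lt;br /&gt;
nia-login07:~$ ssh nia-datamover1&lt;br /&gt;
nia-datamover1:~$ rsync -av $SCRATCH/results/ MYUSERNAME@my.remote.machine:/path/to/destination/&amp;lt;/source&amp;gt;&lt;br /&gt;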
&lt;br /&gt;
'''''Moving data to HPSS/Archive/Nearline using the scheduler.'''''&lt;br /&gt;
&lt;br /&gt;
* [https://wiki.scinet.utoronto.ca/wiki/index.php/HPSS HPSS] is a tape-based storage solution, and is SciNet's nearline a.k.a. archive facility.&lt;br /&gt;
* Storage space on HPSS is allocated through the annual [https://www.computecanada.ca/research-portal/accessing-resources/resource-allocation-competitions Compute Canada RAC allocation].&lt;br /&gt;
&lt;br /&gt;
== Software and Libraries ==&lt;br /&gt;
&lt;br /&gt;
=== Modules ===&lt;br /&gt;
&lt;br /&gt;
Once you are on one of the login nodes, what software is already installed?&lt;br /&gt;
&lt;br /&gt;
* Other than essentials, all installed software is made available using module commands.&lt;br /&gt;
* These set environment variables (&amp;lt;code&amp;gt;PATH&amp;lt;/code&amp;gt;, etc.)&lt;br /&gt;
* Allows multiple, conflicting versions of a given package to be available.&lt;br /&gt;
* module spider shows the available software.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;nia-login07:~$ module spider&lt;br /&gt;
---------------------------------------------------&lt;br /&gt;
The following is a list of the modules currently av&lt;br /&gt;
---------------------------------------------------&lt;br /&gt;
  CCEnv: CCEnv&lt;br /&gt;
&lt;br /&gt;
  NiaEnv: NiaEnv/2018a&lt;br /&gt;
&lt;br /&gt;
  anaconda2: anaconda2/5.1.0&lt;br /&gt;
&lt;br /&gt;
  anaconda3: anaconda3/5.1.0&lt;br /&gt;
&lt;br /&gt;
  autotools: autotools/2017&lt;br /&gt;
    autoconf, automake, and libtool &lt;br /&gt;
&lt;br /&gt;
  boost: boost/1.66.0&lt;br /&gt;
&lt;br /&gt;
  cfitsio: cfitsio/3.430&lt;br /&gt;
&lt;br /&gt;
  cmake: cmake/3.10.2 cmake/3.10.3&lt;br /&gt;
&lt;br /&gt;
  ...&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;&amp;lt;code&amp;gt;module load &amp;amp;lt;module-name&amp;amp;gt;&amp;lt;/code&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;use particular software&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;&amp;lt;code&amp;gt;module purge&amp;lt;/code&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;remove currently loaded modules&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;&amp;lt;code&amp;gt;module spider&amp;lt;/code&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;(or &amp;lt;code&amp;gt;module spider &amp;amp;lt;module-name&amp;amp;gt;&amp;lt;/code&amp;gt;)&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;list available software packages&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;&amp;lt;code&amp;gt;module avail&amp;lt;/code&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;list loadable software packages&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;&amp;lt;code&amp;gt;module list&amp;lt;/code&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;list loaded modules&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
On Niagara, there are really two software stacks:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;A Niagara software stack tuned and compiled for this machine. This stack is available by default, but if not, can be reloaded with&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;module load NiaEnv&amp;lt;/source&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;The same software stack available on Compute Canada's General Purpose clusters [https://docs.computecanada.ca/wiki/Graham Graham] and [https://docs.computecanada.ca/wiki/Cedar Cedar], compiled (for now) for a previous generation of CPUs:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;module load CCEnv&amp;lt;/source&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;If you want the same default modules loaded as on Cedar and Graham, then afterwards also &amp;lt;code&amp;gt;module load StdEnv&amp;lt;/code&amp;gt;.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: the &amp;lt;code&amp;gt;*Env&amp;lt;/code&amp;gt; modules are '''''sticky'''''; remove them by &amp;lt;code&amp;gt;--force&amp;lt;/code&amp;gt;.&lt;br /&gt;
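&lt;br /&gt;
For example, to clear everything including the sticky modules:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;nia-login07:~$ module --force purge   # also unloads sticky modules such as NiaEnv&amp;lt;/source&amp;gt;&lt;br /&gt;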
&lt;br /&gt;
=== Tips for loading software ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;We advise '''''against''''' loading modules in your .bashrc.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;This could lead to very confusing behaviour under certain circumstances.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Instead, load modules by hand when needed, or by sourcing a separate script.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Load run-specific modules inside your job submission script.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Short names give default versions; e.g. &amp;lt;code&amp;gt;intel&amp;lt;/code&amp;gt; → &amp;lt;code&amp;gt;intel/2018.2&amp;lt;/code&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;It is usually better to be explicit about the versions, for future reproducibility.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Handy abbreviations:&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre class=&amp;quot;sh&amp;quot;&amp;gt;        ml → module list&lt;br /&gt;
        ml NAME → module load NAME  # if NAME is an existing module&lt;br /&gt;
        ml X → module X&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Modules sometimes require other modules to be loaded first.&amp;lt;br /&amp;gt;&lt;br /&gt;
Solve these dependencies by using &amp;lt;code&amp;gt;module spider&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Module spider ===&lt;br /&gt;
&lt;br /&gt;
Oddly named, the module subcommand spider is the search-and-advice facility for modules.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;nia-login07:~$ module load openmpi&lt;br /&gt;
Lmod has detected the error:  These module(s) exist but cannot be loaded as requested: &amp;quot;openmpi&amp;quot;&lt;br /&gt;
   Try: &amp;quot;module spider openmpi&amp;quot; to see how to load the module(s).&amp;lt;/source&amp;gt;&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;nia-login07:~$ module spider openmpi&lt;br /&gt;
------------------------------------------------------------------------------------------------------&lt;br /&gt;
  openmpi:&lt;br /&gt;
------------------------------------------------------------------------------------------------------&lt;br /&gt;
     Versions:&lt;br /&gt;
        openmpi/2.1.3&lt;br /&gt;
        openmpi/3.0.1&lt;br /&gt;
        openmpi/3.1.0rc3&lt;br /&gt;
&lt;br /&gt;
------------------------------------------------------------------------------------------------------&lt;br /&gt;
  For detailed information about a specific &amp;quot;openmpi&amp;quot; module (including how to load the modules) use&lt;br /&gt;
  the module's full name.&lt;br /&gt;
  For example:&lt;br /&gt;
&lt;br /&gt;
     $ module spider openmpi/3.1.0rc3&lt;br /&gt;
------------------------------------------------------------------------------------------------------&amp;lt;/source&amp;gt;&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;nia-login07:~$ module spider openmpi/3.1.0rc3&lt;br /&gt;
------------------------------------------------------------------------------------------------------&lt;br /&gt;
  openmpi: openmpi/3.1.0rc3&lt;br /&gt;
------------------------------------------------------------------------------------------------------&lt;br /&gt;
    You will need to load all module(s) on any one of the lines below before the &amp;quot;openmpi/3.1.0rc3&amp;quot;&lt;br /&gt;
    module is available to load.&lt;br /&gt;
&lt;br /&gt;
      NiaEnv/2018a  gcc/7.3.0&lt;br /&gt;
      NiaEnv/2018a  intel/2018.2&lt;br /&gt;
 &amp;lt;/source&amp;gt;&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;nia-login07:~$ module load NiaEnv/2018a  intel/2018.2   # note: NiaEnv is usually already loaded&lt;br /&gt;
nia-login07:~$ module load openmpi/3.1.0rc3&amp;lt;/source&amp;gt;&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;nia-login07:~$ module list&lt;br /&gt;
Currently Loaded Modules:&lt;br /&gt;
  1) NiaEnv/2018a (S)   2) intel/2018.2   3) openmpi/3.1.0rc3&lt;br /&gt;
&lt;br /&gt;
  Where:&lt;br /&gt;
   S:  Module is Sticky, requires --force to unload or purge&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Can I Run Commercial Software? ==&lt;br /&gt;
&lt;br /&gt;
* Possibly, but you have to bring your own license for it.&lt;br /&gt;
* SciNet and Compute Canada have an extremely large and broad user base of thousands of users, so we cannot provide licenses for everyone's favorite software.&lt;br /&gt;
* Thus, the only commercial software installed on Niagara is software that can benefit everyone: Compilers, math libraries and debuggers.&lt;br /&gt;
* That means no Matlab, Gaussian, IDL, &lt;br /&gt;
* Open source alternatives like Octave, Python, R are available.&lt;br /&gt;
* We are happy to help you to install commercial software for which you have a license.&lt;br /&gt;
* In some cases, if you have a license, you can use software in the Compute Canada stack.&lt;br /&gt;
&lt;br /&gt;
== Compiling on Niagara: Example ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;nia-login07:~$ module list&lt;br /&gt;
Currently Loaded Modules:&lt;br /&gt;
  1) NiaEnv/2018a (S)&lt;br /&gt;
  Where:&lt;br /&gt;
   S:  Module is Sticky, requires --force to unload or purge&lt;br /&gt;
&lt;br /&gt;
nia-login07:~$ module load intel/2018.2 gsl/2.4&lt;br /&gt;
&lt;br /&gt;
nia-login07:~$ ls&lt;br /&gt;
main.c module.c&lt;br /&gt;
&lt;br /&gt;
nia-login07:~$ icc -c -O3 -xHost -o main.o main.c&lt;br /&gt;
nia-login07:~$ icc -c -O3 -xHost -o module.o module.c&lt;br /&gt;
nia-login07:~$ icc  -o main module.o main.o -lgsl -mkl&lt;br /&gt;
&lt;br /&gt;
nia-login07:~$ ./main&amp;lt;/source&amp;gt;&lt;br /&gt;
== Testing ==&lt;br /&gt;
&lt;br /&gt;
You really should test your code before submitting it to the cluster, both to check that it is correct and to find out what resources you need.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Small test jobs can be run on the login nodes.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;Rule of thumb: a couple of minutes, using at most about 1-2 GB of memory and a couple of cores.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;You can run the ddt debugger on the login nodes after &amp;lt;code&amp;gt;module load ddt&amp;lt;/code&amp;gt;.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;For short tests that do not fit on a login node, or for which you need a dedicated node, request an&amp;lt;br /&amp;gt;&lt;br /&gt;
interactive debug job with the salloc command:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;nia-login07:~$ salloc -pdebug --nodes N --time=1:00:00&amp;lt;/source&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;where N  is the number of nodes. The duration of your interactive debug session can be at most one hour, can use at most 4 nodes, and each user can only have one such session at a time.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Submitting jobs ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Niagara uses SLURM as its job scheduler.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;You submit jobs from a login node by passing a script to the sbatch command:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;nia-login07:~$ sbatch jobscript.sh&amp;lt;/source&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;This puts the job in the queue. It will run on the compute nodes in due course.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Jobs will run under their group's RRG allocation, or, if the group has none, under a RAS allocation (previously called `default' allocation).&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Keep in mind:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Scheduling is by node, so in multiples of 40 cores.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Maximum walltime is 24 hours.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Jobs must write to your scratch or project directory (home is read-only on compute nodes).&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Compute nodes have no internet access.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;Download data you need beforehand on a login node.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Scheduling by Node ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;All job resource requests on Niagara are scheduled as a multiple of '''nodes'''.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;The nodes that your jobs run on are exclusively yours.&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;No other users are running anything on them.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;You can ssh into them to see how things are going.&amp;lt;/li&amp;gt;&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Whatever you request from the scheduler will always be translated into a multiple of nodes.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Memory requests to the scheduler are of no use.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;Your job gets N x 202GB of RAM if N is the number of nodes.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;You should '''use all 40 cores on each of the nodes''' that your job uses.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;You will be contacted if you don't, and we will help you get more science done.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Hyperthreading: Logical CPUs vs. cores ==&lt;br /&gt;
&lt;br /&gt;
* Hyperthreading, a technology that leverages more of the physical hardware by pretending there are twice as many logical cores as real ones, is enabled on Niagara.&lt;br /&gt;
* So the OS and scheduler see 80 logical cores.&lt;br /&gt;
* 80 logical cores vs. 40 real cores typically gives about a 5-10% speedup (YMMV).&lt;br /&gt;
&lt;br /&gt;
'''Because Niagara is scheduled by node, hyperthreading is actually fairly easy to use:'''&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Ask for a certain number of nodes N for your jobs.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;You know that you get 40xN cores, so you will use (at least) a total of 40xN mpi processes or threads.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;(mpirun, srun, and the OS will automatically spread these over the real cores)&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;But you should also test if running 80xN mpi processes or threads gives you any speedup.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Regardless, your usage will be counted as 40xNx(walltime in years).&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ul&amp;gt;&lt;br /&gt;
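&lt;br /&gt;
For example, to test whether hyperthreading helps an MPI code on a single node (a sketch; &amp;lt;code&amp;gt;mpi_example&amp;lt;/code&amp;gt; stands in for your own executable):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;# inside a job that requested --nodes=1&lt;br /&gt;
mpirun -np 40 ./mpi_example   # one process per real core&lt;br /&gt;
mpirun -np 80 ./mpi_example   # one process per logical core; compare the timings&amp;lt;/source&amp;gt;&lt;br /&gt;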
&lt;br /&gt;
&lt;br /&gt;
=== Example submission script (OpenMP) ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --cpus-per-task=40&lt;br /&gt;
#SBATCH --time=1:00:00&lt;br /&gt;
#SBATCH --job-name openmp_job&lt;br /&gt;
#SBATCH --output=openmp_output_%j.txt&lt;br /&gt;
&lt;br /&gt;
cd $SLURM_SUBMIT_DIR&lt;br /&gt;
&lt;br /&gt;
module load intel/2018.2&lt;br /&gt;
&lt;br /&gt;
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK&lt;br /&gt;
&lt;br /&gt;
./openmp_example&lt;br /&gt;
# or &amp;quot;srun ./openmp_example&amp;quot;.&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Submit this script with the command:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;nia-login07:~$ sbatch openmp_job.sh&amp;lt;/source&amp;gt;&lt;br /&gt;
* First line indicates that this is a bash script.&lt;br /&gt;
* Lines starting with &amp;lt;code&amp;gt;#SBATCH&amp;lt;/code&amp;gt; go to SLURM.&lt;br /&gt;
* sbatch reads these lines as a job request (which it gives the name &amp;lt;code&amp;gt;openmp_job&amp;lt;/code&amp;gt;) .&lt;br /&gt;
* In this case, SLURM looks for one node with 40 cores to be run inside one task, for 1 hour.&lt;br /&gt;
* Once it finds such a node, it runs the script:&lt;br /&gt;
** Changes to the submission directory;&lt;br /&gt;
** Loads modules;&lt;br /&gt;
** Sets an environment variable;&lt;br /&gt;
** Runs the &amp;lt;code&amp;gt;openmp_example&amp;lt;/code&amp;gt; application.&lt;br /&gt;
&lt;br /&gt;
=== Example submission script (MPI) ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;#!/bin/bash &lt;br /&gt;
#SBATCH --nodes=8&lt;br /&gt;
#SBATCH --ntasks=320&lt;br /&gt;
#SBATCH --time=1:00:00&lt;br /&gt;
#SBATCH --job-name mpi_job&lt;br /&gt;
#SBATCH --output=mpi_output_%j.txt&lt;br /&gt;
&lt;br /&gt;
cd $SLURM_SUBMIT_DIR&lt;br /&gt;
&lt;br /&gt;
module load intel/2018.2&lt;br /&gt;
module load openmpi/3.1.0rc3&lt;br /&gt;
&lt;br /&gt;
mpirun ./mpi_example&lt;br /&gt;
# or &amp;quot;srun ./mpi_example&amp;quot;&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Submit this script with the command:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;nia-login07:~$ sbatch mpi_job.sh&amp;lt;/source&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;First line indicates that this is a bash script.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Lines starting with &amp;lt;code&amp;gt;#SBATCH&amp;lt;/code&amp;gt; go to SLURM.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;sbatch reads these lines as a job request (which it gives the name &amp;lt;code&amp;gt;mpi_job&amp;lt;/code&amp;gt;)&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;In this case, SLURM looks for 8 nodes with 40 cores on which to run 320 tasks, for 1 hour.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Once it finds such nodes, it runs the script:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Changes to the submission directory;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Loads modules;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Runs the &amp;lt;code&amp;gt;mpi_example&amp;lt;/code&amp;gt; application.&amp;lt;/li&amp;gt;&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Monitoring queued jobs ==&lt;br /&gt;
&lt;br /&gt;
Once the job is incorporated into the queue, there are some commands you can use to monitor its progress.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;&amp;lt;code&amp;gt;squeue&amp;lt;/code&amp;gt; to show the job queue (&amp;lt;code&amp;gt;squeue -u $USER&amp;lt;/code&amp;gt; for just your jobs);&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;&amp;lt;code&amp;gt;squeue -j JOBID&amp;lt;/code&amp;gt; to get information on a specific job&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;(alternatively, &amp;lt;code&amp;gt;scontrol show job JOBID&amp;lt;/code&amp;gt;, which is more verbose).&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;&amp;lt;code&amp;gt;squeue -j JOBID -o &amp;amp;quot;%.9i %.9P %.8j %.8u %.2t %.10M %.6D %S&amp;amp;quot;&amp;lt;/code&amp;gt; to get an estimate for when a job will run.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;&amp;lt;code&amp;gt;scancel -i JOBID&amp;lt;/code&amp;gt; to cancel the job.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;&amp;lt;code&amp;gt;sinfo -pcompute&amp;lt;/code&amp;gt; to look at available nodes.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;More utilities like those that were available on the GPC are under development.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ul&amp;gt;&lt;br /&gt;
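&lt;br /&gt;
For example (the job ID 123456 is hypothetical):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;nia-login07:~$ squeue -u $USER             # list only your jobs&lt;br /&gt;
nia-login07:~$ scontrol show job 123456    # detailed information on one job&lt;br /&gt;
nia-login07:~$ scancel -i 123456           # cancel it, with confirmation&amp;lt;/source&amp;gt;&lt;br /&gt;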
&lt;br /&gt;
== Data Management and I/O Tips ==&lt;br /&gt;
&lt;br /&gt;
* $HOME, $SCRATCH, and $PROJECT all use the parallel file system called GPFS.&lt;br /&gt;
* Your files can be seen on all Niagara login and compute nodes.&lt;br /&gt;
* GPFS is a high-performance file system which provides rapid reads and writes to large data sets in parallel from many nodes.&lt;br /&gt;
* But accessing data sets which consist of many small files leads to poor performance.&lt;br /&gt;
* Avoid reading and writing lots of small amounts of data to disk.&amp;lt;br /&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Many small files on the system would waste space and would be slower to access, read and write.&lt;br /&gt;
* Write data out in binary. Faster and takes less space.&lt;br /&gt;
* Burst buffer (to come) is better for i/o heavy jobs and to speed up checkpoints.&lt;br /&gt;
&lt;br /&gt;
== Further information ==&lt;br /&gt;
&lt;br /&gt;
=== Useful sites ===&lt;br /&gt;
&lt;br /&gt;
* SciNet: https://www.scinet.utoronto.ca&lt;br /&gt;
* Niagara: https://docs.computecanada.ca/wiki/niagara&lt;br /&gt;
* System Status: https://wiki.scinet.utoronto.ca/wiki/index.php/System_Alerts&lt;br /&gt;
* Training: https://support.scinet.utoronto.ca/education&lt;br /&gt;
&lt;br /&gt;
=== Support ===&lt;br /&gt;
&lt;br /&gt;
* support@scinet.utoronto.ca&lt;br /&gt;
* niagara@computecanada.ca&lt;/div&gt;</summary>
		<author><name>Dgruner</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=15</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=15"/>
		<updated>2018-04-09T19:25:48Z</updated>

		<summary type="html">&lt;p&gt;Dgruner: /* Testing */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Niagara =&lt;br /&gt;
&lt;br /&gt;
==System architecture==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Total of 60,000 Intel x86-64 cores.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;1,500 Lenovo SD530 nodes&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;2x Intel Skylake 6148 CPUs (40 cores @2.4GHz per node).&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;(with hyperthreading to 80 threads &amp;amp;amp; AVX512)&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;3.02 PFlops delivered / 4.6 PFlops theoretical.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;(would've been #42 on the TOP500 in Nov'18)&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;188 GiB / 202 GB RAM per node.&amp;lt;/p&amp;gt;&lt;br /&gt;
(at least 4 GiB/core for user jobs)&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Operating system: Linux (CentOS 7).&lt;br /&gt;
&amp;lt;li&amp;gt;Interconnect: EDR InfiniBand, Dragonfly+ topology with Adaptive Routing&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;1:1 up to 432 nodes, effectively 2:1 beyond that.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;No GPUs, no local disk.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Replaces the General Purpose Cluster (GPC) and Tightly Coupled System (TCS).&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Migration to Niagara ==&lt;br /&gt;
&lt;br /&gt;
=== Migration for Existing Users of the GPC ===&lt;br /&gt;
&lt;br /&gt;
* Accounts, $HOME &amp;amp;amp; $PROJECT of active GPC users transferred to Niagara (except dot-files in ~).&lt;br /&gt;
* Data stored in $SCRATCH will not be transfered automatically.&lt;br /&gt;
* Users are to clean up $SCRATCH on the GPC as much as possible (remember it's temporary data!). Then they can transfer what they need using datamover nodes. Let us know if you need help.&lt;br /&gt;
* To enable this transfer, there will be a short period during which you can have access to Niagara as well as to the GPC storage resources. This period will end no later than May 9, 2018.&lt;br /&gt;
&lt;br /&gt;
=== For Non-GPC Users ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Those of you new to SciNet, but with 2018 RAC allocations on Niagara, will have your accounts created and ready for you to login.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;New, non-RAC users: we are still working out the procedure to get access.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;If you can't wait, for now, you can follow the old route of requesting a SciNet&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;Consortium Account on the [https://ccdb.computecanada.ca CCDB site].&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Using Niagara: Logging in ==&lt;br /&gt;
&lt;br /&gt;
As with all SciNet and CC compute systems, access to Niagara is via ssh (secure shell) only.&lt;br /&gt;
&lt;br /&gt;
To access SciNet systems, first open a terminal window (e.g. MobaXTerm on Windows).&lt;br /&gt;
&lt;br /&gt;
Then ssh into the Niagara login nodes with your CC credentials:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
$ ssh -Y MYCCUSERNAME@niagara.scinet.utoronto.ca&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
or&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;$ ssh -Y MYCCUSERNAME@niagara.computecanada.ca&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* The Niagara login nodes are where you develop, edit, compile, prepare and submit jobs.&lt;br /&gt;
* These login nodes are not part of the Niagara compute cluster, but have the same architecture, operating system, and software stack.&lt;br /&gt;
* The optional &amp;lt;code&amp;gt;-Y&amp;lt;/code&amp;gt; is needed to open windows from the Niagara command-line onto your local X server.&lt;br /&gt;
* To run on Niagara's compute nodes, you must submit a batch job.&lt;br /&gt;
&lt;br /&gt;
== Storage Systems and Locations ==&lt;br /&gt;
&lt;br /&gt;
=== Home and scratch ===&lt;br /&gt;
&lt;br /&gt;
You have a home and scratch directory on the system, whose locations will be given by&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;$HOME=/home/g/groupname/myccusername&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;$SCRATCH=/scratch/g/groupname/myccusername&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;nia-login07:~$ pwd&lt;br /&gt;
/home/s/scinet/rzon&lt;br /&gt;
&lt;br /&gt;
nia-login07:~$ cd $SCRATCH&lt;br /&gt;
&lt;br /&gt;
nia-login07:rzon$ pwd&lt;br /&gt;
/scratch/s/scinet/rzon&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Project location ===&lt;br /&gt;
&lt;br /&gt;
Users from groups with a RAC allocation will also have a project directory.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;$PROJECT=/project/g/groupname/myccusername&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''''IMPORTANT: Future-proof your scripts'''''&lt;br /&gt;
&lt;br /&gt;
Use the environment variables instead of the actual paths!&lt;br /&gt;
&lt;br /&gt;
=== Storage Limits on Niagara ===&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
! location&lt;br /&gt;
! quota&lt;br /&gt;
!align=&amp;quot;right&amp;quot;| block size&lt;br /&gt;
! expiration time&lt;br /&gt;
! backed up&lt;br /&gt;
! on login&lt;br /&gt;
! on compute&lt;br /&gt;
|-&lt;br /&gt;
| $HOME&lt;br /&gt;
| 100 GB&lt;br /&gt;
|align=&amp;quot;right&amp;quot;| 1 MB&lt;br /&gt;
| &lt;br /&gt;
| yes&lt;br /&gt;
| yes&lt;br /&gt;
| read-only&lt;br /&gt;
|-&lt;br /&gt;
| $SCRATCH&lt;br /&gt;
| 25 TB&lt;br /&gt;
|align=&amp;quot;right&amp;quot;| 16 MB&lt;br /&gt;
| 2 months&lt;br /&gt;
| no&lt;br /&gt;
| yes&lt;br /&gt;
| yes&lt;br /&gt;
|-&lt;br /&gt;
| $PROJECT&lt;br /&gt;
| by group allocation&lt;br /&gt;
|align=&amp;quot;right&amp;quot;| 16 MB&lt;br /&gt;
| &lt;br /&gt;
| yes&lt;br /&gt;
| yes&lt;br /&gt;
| yes&lt;br /&gt;
|-&lt;br /&gt;
| $ARCHIVE&lt;br /&gt;
| by group allocation&lt;br /&gt;
|align=&amp;quot;right&amp;quot;| &lt;br /&gt;
|&lt;br /&gt;
| dual-copy&lt;br /&gt;
| no&lt;br /&gt;
| no&lt;br /&gt;
|-&lt;br /&gt;
| $BBUFFER&lt;br /&gt;
| ?&lt;br /&gt;
|align=&amp;quot;right&amp;quot;| 1 MB&lt;br /&gt;
| very short&lt;br /&gt;
| no&lt;br /&gt;
| ?&lt;br /&gt;
| ?&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Compute nodes do not have local storage.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Archive space is on HPSS.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Backup means a recent snapshot, not an archive of all data that ever was.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;&amp;lt;code&amp;gt;$BBUFFER&amp;lt;/code&amp;gt; stands for the Burst Buffer, a functionality that is still being setup,&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;but this will be a faster parallel storage tier for temporary data.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Moving data ===&lt;br /&gt;
&lt;br /&gt;
'''''Move amounts less than 10GB through the login nodes.'''''&lt;br /&gt;
&lt;br /&gt;
* Only Niagara login nodes visible from outside SciNet.&lt;br /&gt;
* Use scp or rsync to niagara.scinet.utoronto.ca or niagara.computecanada.ca (no difference).&lt;br /&gt;
* This will time out for amounts larger than about 10GB.&lt;br /&gt;
&lt;br /&gt;
'''''Move amounts larger than 10GB through the datamover node.'''''&lt;br /&gt;
&lt;br /&gt;
* From a Niagara login node, ssh to &amp;lt;code&amp;gt;nia-datamover1&amp;lt;/code&amp;gt;.&lt;br /&gt;
* Transfers must originate from this datamover.&lt;br /&gt;
* The other side (e.g. your machine) must be reachable from the outside.&lt;br /&gt;
* If you do this often, consider using Globus, a web-based tool for data transfer.&lt;br /&gt;
&lt;br /&gt;
'''''Moving data to HPSS/Archive/Nearline using the scheduler.'''''&lt;br /&gt;
&lt;br /&gt;
* HPSS is a tape-based storage solution, and is SciNet's nearline a.k.a. archive facility.&lt;br /&gt;
* Storage space on HPSS is controlled through the annual RAC allocation.&lt;br /&gt;
&lt;br /&gt;
== Software and Libraries ==&lt;br /&gt;
&lt;br /&gt;
=== Modules ===&lt;br /&gt;
&lt;br /&gt;
Once you are on one of the login nodes, what software is already installed?&lt;br /&gt;
&lt;br /&gt;
* Other than essentials, all software installed using module commands.&lt;br /&gt;
* sets environment variables (&amp;lt;code&amp;gt;PATH&amp;lt;/code&amp;gt;, etc.)&lt;br /&gt;
* Allows multiple, conflicting versions of package to be available.&lt;br /&gt;
* module spider shows available software.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;nia-login07:~$ module spider&lt;br /&gt;
---------------------------------------------------&lt;br /&gt;
The following is a list of the modules currently av&lt;br /&gt;
---------------------------------------------------&lt;br /&gt;
  CCEnv: CCEnv&lt;br /&gt;
&lt;br /&gt;
  NiaEnv: NiaEnv/2018a&lt;br /&gt;
&lt;br /&gt;
  anaconda2: anaconda2/5.1.0&lt;br /&gt;
&lt;br /&gt;
  anaconda3: anaconda3/5.1.0&lt;br /&gt;
&lt;br /&gt;
  autotools: autotools/2017&lt;br /&gt;
    autoconf, automake, and libtool &lt;br /&gt;
&lt;br /&gt;
  boost: boost/1.66.0&lt;br /&gt;
&lt;br /&gt;
  cfitsio: cfitsio/3.430&lt;br /&gt;
&lt;br /&gt;
  cmake: cmake/3.10.2 cmake/3.10.3&lt;br /&gt;
&lt;br /&gt;
  ...&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;&amp;lt;code&amp;gt;module load &amp;amp;lt;module-name&amp;amp;gt;&amp;lt;/code&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;use particular software&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;&amp;lt;code&amp;gt;module purge&amp;lt;/code&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;remove currently loaded modules&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;&amp;lt;code&amp;gt;module spider&amp;lt;/code&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;(or &amp;lt;code&amp;gt;module spider &amp;amp;lt;module-name&amp;amp;gt;&amp;lt;/code&amp;gt;)&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;list available software packages&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;&amp;lt;code&amp;gt;module avail&amp;lt;/code&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;list loadable software packages&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;&amp;lt;code&amp;gt;module list&amp;lt;/code&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;list loaded modules&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
On Niagara, there are really two software stacks:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;A Niagara software stack tuned and compiled for this machine. This stack is available by default, but if not, can be reloaded with&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;module load NiaEnv&amp;lt;/source&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;The same software stack available on Compute Canada's General Purpose clusters Graham and Cedar, compiled (for now) for a previous generation of the CPUs:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;module load CCEnv&amp;lt;/source&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;If you want the same default modules loaded as on Cedar and Graham, then afterwards also &amp;lt;code&amp;gt;module load StdEnv&amp;lt;/code&amp;gt;.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: the &amp;lt;code&amp;gt;*Env&amp;lt;/code&amp;gt; modules are '''''sticky'''''; remove them by &amp;lt;code&amp;gt;--force&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Tips for loading software ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;We advise '''''against''''' loading modules in your .bashrc.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;This could lead to very confusing behaviour under certain circumstances.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Instead, load modules by hand when needed, or by sourcing a separate script.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Load run-specific modules inside your job submission script.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Short names give default versions; e.g. &amp;lt;code&amp;gt;intel&amp;lt;/code&amp;gt; &amp;lt;math&amp;gt;\rightarrow&amp;lt;/math&amp;gt; &amp;lt;code&amp;gt;intel/2018.2&amp;lt;/code&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;It is usually better to be explicit about the versions, for future reproducibility.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Handy abbreviations:&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre class=&amp;quot;sh&amp;quot;&amp;gt;        ml → module list&lt;br /&gt;
        ml NAME → module load NAME&lt;br /&gt;
        ml X → module X&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Modules sometimes require other modules to be loaded first.&amp;lt;br /&amp;gt;&lt;br /&gt;
Solve these dependencies by using &amp;lt;code&amp;gt;module spider&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Module spider ===&lt;br /&gt;
&lt;br /&gt;
Oddly named, the module subcommand spider is the search-and-advice facility for modules.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;nia-login07:~$ module load openmpi&lt;br /&gt;
Lmod has detected the error:  These module(s) exist but cannot be loaded as requested: &amp;quot;openmpi&amp;quot;&lt;br /&gt;
   Try: &amp;quot;module spider openmpi&amp;quot; to see how to load the module(s).&amp;lt;/source&amp;gt;&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;nia-login07:~$ module spider openmpi&lt;br /&gt;
------------------------------------------------------------------------------------------------------&lt;br /&gt;
  openmpi:&lt;br /&gt;
------------------------------------------------------------------------------------------------------&lt;br /&gt;
     Versions:&lt;br /&gt;
        openmpi/2.1.3&lt;br /&gt;
        openmpi/3.0.1&lt;br /&gt;
        openmpi/3.1.0rc3&lt;br /&gt;
&lt;br /&gt;
------------------------------------------------------------------------------------------------------&lt;br /&gt;
  For detailed information about a specific &amp;quot;openmpi&amp;quot; module (including how to load the modules) use&lt;br /&gt;
  the module s full name.&lt;br /&gt;
  For example:&lt;br /&gt;
&lt;br /&gt;
     $ module spider openmpi/3.1.0rc3&lt;br /&gt;
------------------------------------------------------------------------------------------------------&amp;lt;/source&amp;gt;&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;nia-login07:~$ module spider openmpi/3.1.0rc3&lt;br /&gt;
------------------------------------------------------------------------------------------------------&lt;br /&gt;
  openmpi: openmpi/3.1.0rc3&lt;br /&gt;
------------------------------------------------------------------------------------------------------&lt;br /&gt;
    You will need to load all module(s) on any one of the lines below before the &amp;quot;openmpi/3.1.0rc3&amp;quot;&lt;br /&gt;
    module is available to load.&lt;br /&gt;
&lt;br /&gt;
      NiaEnv/2018a  gcc/7.3.0&lt;br /&gt;
      NiaEnv/2018a  intel/2018.2&lt;br /&gt;
 &amp;lt;/source&amp;gt;&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;nia-login07:~$ module load NiaEnv/2018a  intel/2018.2   # note: NiaEnv is usually already loaded&lt;br /&gt;
nia-login07:~$ module load openmpi/3.1.0rc3&amp;lt;/source&amp;gt;&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;nia-login07:~$ module list&lt;br /&gt;
Currently Loaded Modules:&lt;br /&gt;
  1) NiaEnv/2018a (S)   2) intel/2018.2   3) openmpi/3.1.0.rc3&lt;br /&gt;
&lt;br /&gt;
  Where:&lt;br /&gt;
   S:  Module is Sticky, requires --force to unload or purge&amp;lt;/source&amp;gt;&lt;br /&gt;
== Can I Run Commercial Software? ==&lt;br /&gt;
&lt;br /&gt;
* Possibly, but you have to bring your own license for it.&lt;br /&gt;
* SciNet and Compute Canada have an extremely large and broad user base of thousands of users, so we cannot provide licenses for everyone's favorite software.&lt;br /&gt;
* Thus, the only commercial software installed and accessible is software that can benefit everyone: Compilers, math libraries and debuggers.&lt;br /&gt;
* That means no Matlab, Gaussian, IDL, &lt;br /&gt;
* Open source alternatives like Octave, Python, R are available.&lt;br /&gt;
* We are happy to help you to install commercial software for which you have a license.&lt;br /&gt;
* In some cases, if you have a license, you can use software in the Compute Canada stack.&lt;br /&gt;
&lt;br /&gt;
== Compiling on Niagara: Example ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;nia-login07:~$ module list&lt;br /&gt;
Currently Loaded Modules:&lt;br /&gt;
  1) NiaEnv/2018a (S)&lt;br /&gt;
  Where:&lt;br /&gt;
   S:  Module is Sticky, requires --force to unload or purge&lt;br /&gt;
&lt;br /&gt;
nia-login07:~$ module load intel/2018.2 gsl/2.4&lt;br /&gt;
&lt;br /&gt;
nia-login07:~$ ls&lt;br /&gt;
main.c module.c&lt;br /&gt;
&lt;br /&gt;
nia-login07:~$ icc -c -O3 -xHost -o main.o main.c&lt;br /&gt;
nia-login07:~$ icc -c -O3 -xHost -o module.o module.c&lt;br /&gt;
nia-login07:~$ icc  -o main module.o main.o -lgsl -mkl&lt;br /&gt;
&lt;br /&gt;
nia-login07:~$ ./main&amp;lt;/source&amp;gt;&lt;br /&gt;
== Testing ==&lt;br /&gt;
&lt;br /&gt;
You really should test your code before you submit it to the cluster to know if your code is correct and what kind of resources you need.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Small test jobs can be run on the login nodes.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;Rule of thumb: couple of minutes, taking at most about 1-2GB of memory, couple of cores.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;You can run the ddt debugger on the login nodes after &amp;lt;code&amp;gt;module load ddt&amp;lt;/code&amp;gt;.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Short tests that do not fit on a login node, or for which you need a dedicated node, request an&amp;lt;br /&amp;gt;&lt;br /&gt;
interactive debug job with the salloc command&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;nia-login07:~$ salloc -pdebug --nodes N --time=1:00:00&amp;lt;/source&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;where N  is the number of nodes. The duration of your interactive debug session can be at most one hour, can use at most  N nodes, and each user can only have one such session at a time.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Submitting jobs ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Niagara uses SLURM as its job scheduler.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;You submit jobs from a login node by passing a script to the sbatch command:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;nia-login07:~$ sbatch jobscript.sh&amp;lt;/source&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;This puts the job in the queue. It will run on the compute nodes in due course.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Jobs will run under their group's RRG allocation, or, if the group has none, under a RAS allocation (previously called `default' allocation).&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Keep in mind:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Scheduling is by node, so in multiples of 40-cores.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Maximum walltime is 24 hours.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Jobs must write to your scratch or project directory (home is read-only on compute nodes).&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Compute nodes have no internet access.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;Download data you need beforehand on a login node.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Example submission script (OpenMP) ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --cpus-per-task=40&lt;br /&gt;
#SBATCH --time=1:00:00&lt;br /&gt;
#SBATCH --job-name openmp_job&lt;br /&gt;
#SBATCH --output=openmp_output_%j.txt&lt;br /&gt;
&lt;br /&gt;
cd $SLURM_SUBMIT_DIR&lt;br /&gt;
&lt;br /&gt;
module load intel/2018.2&lt;br /&gt;
&lt;br /&gt;
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK&lt;br /&gt;
&lt;br /&gt;
srun ./openmp_example&amp;lt;/source&amp;gt;&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;nia-login07:~$ sbatch openmp_job.sh&amp;lt;/source&amp;gt;&lt;br /&gt;
* The first line indicates that this is a bash script.&lt;br /&gt;
* Lines starting with &amp;lt;code&amp;gt;#SBATCH&amp;lt;/code&amp;gt; go to SLURM.&lt;br /&gt;
* sbatch reads these lines as a job request (which it gives the name &amp;lt;code&amp;gt;openmp_job&amp;lt;/code&amp;gt;).&lt;br /&gt;
* In this case, SLURM looks for one node with 40 cores to run a single task, for 1 hour.&lt;br /&gt;
* Once it finds such a node, it runs the script, which:&lt;br /&gt;
** Changes to the submission directory;&lt;br /&gt;
** Loads modules;&lt;br /&gt;
** Sets an environment variable;&lt;br /&gt;
** Runs the &amp;lt;code&amp;gt;openmp_example&amp;lt;/code&amp;gt; application.&lt;br /&gt;
&lt;br /&gt;
=== Example submission script (MPI) ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;#!/bin/bash &lt;br /&gt;
#SBATCH --nodes=8&lt;br /&gt;
#SBATCH --ntasks=320&lt;br /&gt;
#SBATCH --time=1:00:00&lt;br /&gt;
#SBATCH --job-name mpi_job&lt;br /&gt;
#SBATCH --output=mpi_output_%j.txt&lt;br /&gt;
&lt;br /&gt;
cd $SLURM_SUBMIT_DIR&lt;br /&gt;
&lt;br /&gt;
module load intel/2018.2&lt;br /&gt;
module load openmpi/3.1.0rc3&lt;br /&gt;
&lt;br /&gt;
srun ./mpi_example&amp;lt;/source&amp;gt;&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;nia-login07:~$ sbatch mpi_job.sh&amp;lt;/source&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;The first line indicates that this is a bash script.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Lines starting with &amp;lt;code&amp;gt;#SBATCH&amp;lt;/code&amp;gt; go to SLURM.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;sbatch reads these lines as a job request (which it gives the name &amp;lt;code&amp;gt;mpi_job&amp;lt;/code&amp;gt;).&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;In this case, SLURM looks for 8 nodes (with 40 cores each) on which to run 320 tasks, for 1 hour.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Once it finds such nodes, it runs the script, which:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Changes to the submission directory;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Loads modules;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Runs the &amp;lt;code&amp;gt;mpi_example&amp;lt;/code&amp;gt; application.&amp;lt;/li&amp;gt;&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/li&amp;gt;&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Monitoring queued jobs ==&lt;br /&gt;
&lt;br /&gt;
Once the job is incorporated into the queue, there are some commands you can use to monitor its progress, listed below along with a short example of their use.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;&amp;lt;code&amp;gt;squeue&amp;lt;/code&amp;gt; to show the job queue (&amp;lt;code&amp;gt;squeue -u $USER&amp;lt;/code&amp;gt; for just your jobs);&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;&amp;lt;code&amp;gt;squeue -j JOBID&amp;lt;/code&amp;gt; to get information on a specific job&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;(alternatively, &amp;lt;code&amp;gt;scontrol show job JOBID&amp;lt;/code&amp;gt;, which is more verbose).&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;&amp;lt;code&amp;gt;squeue -j JOBID -o &amp;amp;quot;%.9i %.9P %.8j %.8u %.2t %.10M %.6D %S&amp;amp;quot;&amp;lt;/code&amp;gt; to get an estimate for when a job will run.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;&amp;lt;code&amp;gt;scancel -i JOBID&amp;lt;/code&amp;gt; to cancel the job.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;&amp;lt;code&amp;gt;sinfo -pcompute&amp;lt;/code&amp;gt; to look at available nodes.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;More utilities like those that were available on the GPC are under development.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ul&amp;gt;&lt;br /&gt;
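&lt;br /&gt;
Putting a few of these together, a typical check on a queued job might look like this (12345 is a placeholder job ID):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;# list only your own jobs&lt;br /&gt;
nia-login07:~$ squeue -u $USER&lt;br /&gt;
# detailed information about one job (12345 is a placeholder job ID)&lt;br /&gt;
nia-login07:~$ scontrol show job 12345&lt;br /&gt;
# cancel it interactively if it is no longer needed&lt;br /&gt;
nia-login07:~$ scancel -i 12345&amp;lt;/source&amp;gt;&lt;br /&gt;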
&lt;br /&gt;
== Data Management and I/O Tips ==&lt;br /&gt;
&lt;br /&gt;
* $HOME, $SCRATCH, and $PROJECT all use the parallel file system called GPFS.&lt;br /&gt;
* Your files can be seen on all Niagara login and compute nodes.&lt;br /&gt;
* GPFS is a high-performance file system which provides rapid reads and writes to large data sets in parallel from many nodes.&lt;br /&gt;
* But accessing data sets that consist of many small files leads to poor performance.&lt;br /&gt;
* Avoid reading and writing many small amounts of data to disk.&amp;lt;br /&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Many small files on the system waste space and are slower to access, read, and write (see the sketch below).&lt;br /&gt;
* Write data out in binary; it is faster and takes less space.&lt;br /&gt;
* The burst buffer (to come) will be better for I/O-heavy jobs and for speeding up checkpoints.&lt;br /&gt;
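&lt;br /&gt;
One way to mitigate the many-small-files problem, shown here as a sketch rather than a required workflow (directory and file names are placeholders), is to bundle the files into a single archive:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;# bundle a directory of many small files into one archive (names are placeholders)&lt;br /&gt;
tar czf results_bundle.tar.gz many_small_files/&lt;br /&gt;
# later, extract only the file you need&lt;br /&gt;
tar xzf results_bundle.tar.gz many_small_files/summary.dat&amp;lt;/source&amp;gt;&lt;br /&gt;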
&lt;br /&gt;
== Further information ==&lt;br /&gt;
&lt;br /&gt;
=== Useful sites ===&lt;br /&gt;
&lt;br /&gt;
* SciNet: https://www.scinet.utoronto.ca&lt;br /&gt;
* Niagara: https://docs.computecanada.ca/wiki/niagara&lt;br /&gt;
* System Status: https://wiki.scinet.utoronto.ca/wiki/index.php/System_Alerts&lt;br /&gt;
* Training: https://support.scinet.utoronto.ca/education&lt;br /&gt;
&lt;br /&gt;
=== Support ===&lt;br /&gt;
&lt;br /&gt;
* support@scinet.utoronto.ca&lt;br /&gt;
* niagara@computecanada.ca&lt;/div&gt;</summary>
		<author><name>Dgruner</name></author>
	</entry>
</feed>