<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://docs.scinet.utoronto.ca/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Ymeiron</id>
	<title>SciNet Users Documentation - User contributions [en]</title>
	<link rel="self" type="application/atom+xml" href="https://docs.scinet.utoronto.ca/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Ymeiron"/>
	<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php/Special:Contributions/Ymeiron"/>
	<updated>2026-07-05T00:40:04Z</updated>
	<subtitle>User contributions</subtitle>
	<generator>MediaWiki 1.35.12</generator>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=7755</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=7755"/>
		<updated>2026-06-10T18:32:27Z</updated>

		<summary type="html">&lt;p&gt;Ymeiron: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
{| style=&amp;quot;border-spacing:10px; width: 95%&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:1em; padding-top:.1em; border:2px solid #0645ad; background-color:#f6f6f6; border-radius:7px&amp;quot;|&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Use &amp;quot;Up&amp;quot;, &amp;quot;Partial&amp;quot; or &amp;quot;Down&amp;quot;; these are templates. --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;width:100%&amp;quot; &lt;br /&gt;
|{{Up3 | Trillium|https://docs.alliancecan.ca/wiki/Trillium_Quickstart}}&lt;br /&gt;
|{{Up3 | OnDemand|https://docs.alliancecan.ca/wiki/Trillium_Open_OnDemand_Quickstart}}&lt;br /&gt;
|{{Up | Globus |Globus}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up | HPSS|HPSS}}&lt;br /&gt;
|{{Up | Balam|Balam}}&lt;br /&gt;
|{{Up | S4H | S4H}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up | Teach|Teach}}&lt;br /&gt;
|{{Up3 | File system|https://docs.alliancecan.ca/wiki/Trillium_Quickstart#Storage}}&lt;br /&gt;
|{{Up3 | External Network|https://docs.alliancecan.ca/wiki/Trillium_Quickstart#Logging_in}} &lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
'''Announcement: Trillium AI Expansion Installation Maintenance Shutdowns'''&lt;br /&gt;
&lt;br /&gt;
# '''May 26/27, 2026:''' Shutdown of the Trillium compute nodes and HPSS, starting at 4 AM EDT on May 26th. The Trillium login nodes, as well as OnDemand, the Teach cluster, and Balam, will remain available during this maintenance. HPSS will be back in service later on the same day (May 26th), while the Trillium compute nodes are expected to be back in service sometime on May 27th.&lt;br /&gt;
# '''June 9/10, 2026:''' A multi-day chiller maintenance; this will involve a shutdown of all systems. &lt;br /&gt;
# '''June 22-25, 2026:''' A four-day full datacentre shutdown.&lt;br /&gt;
# '''Mid-July:''' A two-week shutdown of the Trillium compute nodes will be required. During this time, the Trillium login nodes, storage, and Open OnDemand will remain up.&lt;br /&gt;
&lt;br /&gt;
More information and precise dates for these last three maintenance shutdowns will be announced later.&lt;br /&gt;
&lt;br /&gt;
'''Thu Apr 30, 2026, 3:00 pm:''' Systems have been updated to mitigate known security risks and are back in service. Note that no actual security breaches were found.&lt;br /&gt;
&lt;br /&gt;
'''Wed Apr 29, 2026, 5:25 pm:''' For security reasons, login access to all systems has been disabled, as have OnDemand Apps.  Compute jobs are still allowed to run.&lt;br /&gt;
 &lt;br /&gt;
'''Thu Apr 23, 2026, 10:00 am:''' The Trillium file system has issues and may be slow on certain nodes. We are still investigating.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  When removing system status entries, please archive them to: --&amp;gt;&lt;br /&gt;
[[Previous messages]]&lt;br /&gt;
&lt;br /&gt;
{|style=&amp;quot;border-spacing: 10px;width: 100%&amp;quot;&lt;br /&gt;
|valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== QuickStart Guides ==&lt;br /&gt;
* [https://docs.alliancecan.ca/wiki/Trillium_Quickstart Trillium Quickstart]&lt;br /&gt;
* [[HPSS | HPSS archival storage]]&lt;br /&gt;
* [[Teach|Teach cluster]]&lt;br /&gt;
* [[FAQ | FAQ (frequently asked questions)]]&lt;br /&gt;
* [[Acknowledging SciNet]]&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== Tutorials, Manuals, etc. ==&lt;br /&gt;
* [https://education.scinet.utoronto.ca SciNet education material]&lt;br /&gt;
* [https://www.youtube.com/c/SciNetHPCattheUniversityofToronto SciNet's YouTube channel]&lt;br /&gt;
* [[Modules for Mist]] &lt;br /&gt;
* [[Commercial software]]&lt;br /&gt;
* [[Burst Buffer]]&lt;br /&gt;
* [[SSH#SSH Keys|SSH keys]]&lt;br /&gt;
* [[SSH Tunneling]]&lt;br /&gt;
* [[Visualization]]&lt;br /&gt;
* [[Running Serial Jobs on Niagara]]&lt;br /&gt;
* [[Jupyter Hub]]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Ymeiron</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=7697</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=7697"/>
		<updated>2026-04-30T19:24:17Z</updated>

		<summary type="html">&lt;p&gt;Ymeiron: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
{| style=&amp;quot;border-spacing:10px; width: 95%&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:1em; padding-top:.1em; border:2px solid #0645ad; background-color:#f6f6f6; border-radius:7px&amp;quot;|&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Use &amp;quot;Up&amp;quot;, &amp;quot;Partial&amp;quot; or &amp;quot;Down&amp;quot;; these are templates. --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;width:100%&amp;quot; &lt;br /&gt;
|{{Up3 | Trillium|https://docs.alliancecan.ca/wiki/Trillium_Quickstart}}&lt;br /&gt;
|{{Down3 | OnDemand|https://docs.alliancecan.ca/wiki/Trillium_Open_OnDemand_Quickstart}}&lt;br /&gt;
|{{Down | Globus |Globus}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Down | HPSS|HPSS}}&lt;br /&gt;
|{{Up | Balam|Balam}}&lt;br /&gt;
|{{Up | S4H | S4H}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up | Teach|Teach}}&lt;br /&gt;
|{{Up3 | File system|https://docs.alliancecan.ca/wiki/Trillium_Quickstart#Storage}}&lt;br /&gt;
|{{Up3 | External Network|https://docs.alliancecan.ca/wiki/Trillium_Quickstart#Logging_in}} &lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
'''Thu Apr 30, 2026, 3:00 pm:''' Most systems are back.&lt;br /&gt;
&lt;br /&gt;
'''Wed Apr 29, 2026, 5:25 pm:''' For security reasons, login access to all systems has been disabled, as have OnDemand Apps.  Compute jobs are still allowed to run.&lt;br /&gt;
 &lt;br /&gt;
'''Thu Apr 23, 2026, 10:00 am:''' The Trillium file system has issues and may be slow on certain nodes. We are still investigating.&lt;br /&gt;
&lt;br /&gt;
'''Tue Apr 14, 2026, 10:30 am:''' The Trillium globus endpoint is working again.&lt;br /&gt;
&lt;br /&gt;
'''Sat Apr 11, 2026, 10:00 pm:''' The Trillium globus endpoint is not operational (it times out). We are investigating.&lt;br /&gt;
&lt;br /&gt;
'''Thu Apr 09, 2026, 10:30 am:''' tri-dm4.scinet.utoronto.ca and robot4.scinet.utoronto.ca are under maintenance. Use 1, 2, or 3 instead.&lt;br /&gt;
&lt;br /&gt;
'''Wed Apr 08, 2026, 6:30 pm:''' Software from the CVMFS 'restricted' area is back.&lt;br /&gt;
&lt;br /&gt;
'''Wed Apr 08, 2026, 1:40 pm:''' Software from the CVMFS 'restricted' area is not available on many of the Trillium and Open OnDemand nodes. We are investigating.&lt;br /&gt;
&lt;br /&gt;
'''Mon Apr 06, 2026, 10:00 pm:''' We will have to reschedule the HPSS update. This attempt didn't work as expected.&lt;br /&gt;
&lt;br /&gt;
'''Mon Apr 06, 2026, 8:00 pm:''' HPSS scheduled maintenance: update of HPSS to v11.3_u4 and hsi-htar to v11.3_u1 (bug fixes).&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  When removing system status entries, please archive them to: --&amp;gt;&lt;br /&gt;
[[Previous messages]]&lt;br /&gt;
&lt;br /&gt;
{|style=&amp;quot;border-spacing: 10px;width: 100%&amp;quot;&lt;br /&gt;
|valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== QuickStart Guides ==&lt;br /&gt;
* [https://docs.alliancecan.ca/wiki/Trillium_Quickstart Trillium Quickstart]&lt;br /&gt;
* [[HPSS | HPSS archival storage]]&lt;br /&gt;
* [[Teach|Teach cluster]]&lt;br /&gt;
* [[FAQ | FAQ (frequently asked questions)]]&lt;br /&gt;
* [[Acknowledging SciNet]]&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== Tutorials, Manuals, etc. ==&lt;br /&gt;
* [https://education.scinet.utoronto.ca SciNet education material]&lt;br /&gt;
* [https://www.youtube.com/c/SciNetHPCattheUniversityofToronto SciNet's YouTube channel]&lt;br /&gt;
* [[Modules for Mist]] &lt;br /&gt;
* [[Commercial software]]&lt;br /&gt;
* [[Burst Buffer]]&lt;br /&gt;
* [[SSH#SSH Keys|SSH keys]]&lt;br /&gt;
* [[SSH Tunneling]]&lt;br /&gt;
* [[Visualization]]&lt;br /&gt;
* [[Running Serial Jobs on Niagara]]&lt;br /&gt;
* [[Jupyter Hub]]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Ymeiron</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=7616</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=7616"/>
		<updated>2026-03-25T20:51:34Z</updated>

		<summary type="html">&lt;p&gt;Ymeiron: /* System Status */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
{| style=&amp;quot;border-spacing:10px; width: 95%&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:1em; padding-top:.1em; border:2px solid #0645ad; background-color:#f6f6f6; border-radius:7px&amp;quot;|&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Use &amp;quot;Up&amp;quot;, &amp;quot;Partial&amp;quot; or &amp;quot;Down&amp;quot;; these are templates. --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;width:100%&amp;quot; &lt;br /&gt;
|{{Partial3 | Trillium|https://docs.alliancecan.ca/wiki/Trillium_Quickstart}}&lt;br /&gt;
|{{Up3 | OnDemand|https://docs.alliancecan.ca/wiki/Trillium_Open_OnDemand_Quickstart}}&lt;br /&gt;
|{{Up | Globus |Globus}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Down | HPSS|HPSS}}&lt;br /&gt;
|{{Partial | Balam|Balam}}&lt;br /&gt;
|{{Up | S4H | S4H}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up | Teach|Teach}}&lt;br /&gt;
|{{Up3 | File system|https://docs.alliancecan.ca/wiki/Trillium_Quickstart#Storage}}&lt;br /&gt;
|{{Up3 | External Network|https://docs.alliancecan.ca/wiki/Trillium_Quickstart#Logging_in}} &lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
'''Wed Mar 25, 2026, 9:00 pm:''' Teach is operational again.&lt;br /&gt;
&lt;br /&gt;
'''Tue Mar 24, 2026, 8:45 pm:''' Open OnDemand is operational again.&lt;br /&gt;
&lt;br /&gt;
'''Tue Mar 24, 2026, 1:00 pm:''' External connectivity is back. &lt;br /&gt;
&lt;br /&gt;
'''Tue Mar 24, 2026, 12:05 pm:''' External connectivity to the data centre was lost. &lt;br /&gt;
&lt;br /&gt;
'''Tue Mar 24, 2026, 7:00 am:''' Maintenance has started.&lt;br /&gt;
&lt;br /&gt;
'''Mon Mar 16, 2026, 1:30 pm:''' Recovering.  Almost all systems are up again. Please resubmit any jobs that crashed. &lt;br /&gt;
&lt;br /&gt;
'''Mon Mar 16, 2026, 12:00 pm:''' A power glitch at the data centre caused compute nodes to go down.&lt;br /&gt;
&lt;br /&gt;
'''Thu Mar 12, 2026, 4:15 pm:''' Connections to Trillium are operational again.&lt;br /&gt;
&lt;br /&gt;
'''Thu Mar 12, 2026, 1:00 pm:''' We've had some login issues, particularly for Trillium-GPU. We're investigating.&lt;br /&gt;
&lt;br /&gt;
'''Downtime Announcement:'''  The winter cooling tower maintenance for the SciNet data centre will take place on March 24 and 25, 2026, starting at 7:00 a.m. on the 24th.  All SciNet systems (Trillium, OnDemand, Balam, S4H, Teach, as well as hosted equipment) will have their compute nodes shut down. Login nodes, file systems, and the HPSS system will remain available, and jobs will be held in the queue until maintenance is complete.  Starting at 7:00 a.m. on March 23, users are encouraged to submit small and short jobs that may be scheduled before the maintenance begins.&lt;br /&gt;
&lt;br /&gt;
'''Fri Feb 20, 2026, 11:35 pm:''' Power glitch, ~480 compute nodes rebooted. Regional power quality has been quite poor lately ([https://www.yorkregion.com/news/road-salt-blamed-for-power-outages/article_1a36d25d-5f97-56ee-a0c7-c49c7b732d38.html 1],&lt;br /&gt;
[https://www.yorkregion.com/news/power-company-executive-responds-to-york-region-outages/article_c4d072e7-2892-5c9c-8deb-ac5e1936779c.html 2]).&lt;br /&gt;
&lt;br /&gt;
'''Thu Feb 19, 2026, 3:00 pm:''' Systems restored. Please report issues to support@scinet.utoronto.ca.&lt;br /&gt;
&lt;br /&gt;
'''Tue Feb 17, 2026, 8:40 am:''' Power outage at the data centre.  Cooling issues have developed as a result.  Major systems (Trillium, S4H) are expected to be down until sometime Thursday. Login nodes and file systems will remain accessible.&lt;br /&gt;
&lt;br /&gt;
'''Mon Feb 16, 2026, 8:40 pm:''' Electricity is unstable in the data centre area due to severe snowfall.&lt;br /&gt;
&lt;br /&gt;
'''Thu Jan 29, 2026, 1:40 pm:''' All services are operational again.&lt;br /&gt;
&lt;br /&gt;
'''Thu Jan 29, 2026, 12:00 pm:''' The Trillium and Open OnDemand compute nodes are operational again. We are still working on bringing Balam, Neptune and S4H nodes up again.&lt;br /&gt;
&lt;br /&gt;
'''Thu Jan 29, 2026, 10:00 am:''' There was a power glitch at the data centre overnight. The login nodes are accessible but the compute nodes are down.  &lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  When removing system status entries, please archive them to: --&amp;gt;&lt;br /&gt;
[[Previous messages]]&lt;br /&gt;
&lt;br /&gt;
{|style=&amp;quot;border-spacing: 10px;width: 100%&amp;quot;&lt;br /&gt;
|valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== QuickStart Guides ==&lt;br /&gt;
* [https://docs.alliancecan.ca/wiki/Trillium_Quickstart Trillium Quickstart]&lt;br /&gt;
* [[HPSS | HPSS archival storage]]&lt;br /&gt;
* [[Teach|Teach cluster]]&lt;br /&gt;
* [[FAQ | FAQ (frequently asked questions)]]&lt;br /&gt;
* [[Acknowledging SciNet]]&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== Tutorials, Manuals, etc. ==&lt;br /&gt;
* [https://education.scinet.utoronto.ca SciNet education material]&lt;br /&gt;
* [https://www.youtube.com/c/SciNetHPCattheUniversityofToronto SciNet's YouTube channel]&lt;br /&gt;
* [[Modules for Mist]] &lt;br /&gt;
* [[Commercial software]]&lt;br /&gt;
* [[Burst Buffer]]&lt;br /&gt;
* [[SSH#SSH Keys|SSH keys]]&lt;br /&gt;
* [[SSH Tunneling]]&lt;br /&gt;
* [[Visualization]]&lt;br /&gt;
* [[Running Serial Jobs on Niagara]]&lt;br /&gt;
* [[Jupyter Hub]]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Ymeiron</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=7598</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=7598"/>
		<updated>2026-03-24T14:00:43Z</updated>

		<summary type="html">&lt;p&gt;Ymeiron: /* System Status */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
{| style=&amp;quot;border-spacing:10px; width: 95%&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:1em; padding-top:.1em; border:2px solid #0645ad; background-color:#f6f6f6; border-radius:7px&amp;quot;|&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Use &amp;quot;Up&amp;quot;, &amp;quot;Partial&amp;quot; or &amp;quot;Down&amp;quot;; these are templates. --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;width:100%&amp;quot; &lt;br /&gt;
|{{Partial3 | Trillium|https://docs.alliancecan.ca/wiki/Trillium_Quickstart}}&lt;br /&gt;
|{{Partial3 | OnDemand|https://docs.alliancecan.ca/wiki/Trillium_Open_OnDemand_Quickstart}}&lt;br /&gt;
|{{Up | Globus |Globus}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Down | HPSS|HPSS}}&lt;br /&gt;
|{{Down | Balam|Balam}}&lt;br /&gt;
|{{Down | S4H | S4H}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Partial | Teach|Teach}}&lt;br /&gt;
|{{Up3 | File system|https://docs.alliancecan.ca/wiki/Trillium_Quickstart#Storage}}&lt;br /&gt;
|{{Up3 | External Network|https://docs.alliancecan.ca/wiki/Trillium_Quickstart#Logging_in}} &lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
'''Tue Mar 24, 2026, 7:00 am:''' Maintenance has started.&lt;br /&gt;
&lt;br /&gt;
'''Mon Mar 16, 2026, 1:30 pm:''' Recovering.  Almost all systems are up again. Please resubmit any jobs that crashed. &lt;br /&gt;
&lt;br /&gt;
'''Mon Mar 16, 2026, 12:00 pm:''' A power glitch at the data centre caused compute nodes to go down.&lt;br /&gt;
&lt;br /&gt;
'''Thu Mar 12, 2026, 4:15 pm:''' Connections to Trillium are operational again.&lt;br /&gt;
&lt;br /&gt;
'''Thu Mar 12, 2026, 1:00 pm:''' We've had some login issues, particularly for Trillium-GPU. We're investigating.&lt;br /&gt;
&lt;br /&gt;
'''Downtime Announcement:'''  The winter cooling tower maintenance for the SciNet data centre will take place on March 24 and 25, 2026, starting at 7:00 a.m. on the 24th.  All SciNet systems (Trillium, OnDemand, Balam, S4H, Teach, as well as hosted equipment) will have their compute nodes shut down. Login nodes, file systems, and the HPSS system will remain available, and jobs will be held in the queue until maintenance is complete.  Starting at 7:00 a.m. on March 23, users are encouraged to submit small and short jobs that may be scheduled before the maintenance begins.&lt;br /&gt;
&lt;br /&gt;
'''Fri Feb 20, 2026, 11:35 pm:''' Power glitch, ~480 compute nodes rebooted. Regional power quality has been quite poor lately ([https://www.yorkregion.com/news/road-salt-blamed-for-power-outages/article_1a36d25d-5f97-56ee-a0c7-c49c7b732d38.html 1],&lt;br /&gt;
[https://www.yorkregion.com/news/power-company-executive-responds-to-york-region-outages/article_c4d072e7-2892-5c9c-8deb-ac5e1936779c.html 2]).&lt;br /&gt;
&lt;br /&gt;
'''Thu Feb 19, 2026, 3:00 pm:''' Systems restored. Please report issues to support@scinet.utoronto.ca.&lt;br /&gt;
&lt;br /&gt;
'''Tue Feb 17, 2026, 8:40 am:''' Power outage at the data centre.  Cooling issues have developed as a result.  Major systems (Trillium, S4H) are expected to be down until sometime Thursday. Login nodes and file systems will remain accessible.&lt;br /&gt;
&lt;br /&gt;
'''Mon Feb 16, 2026, 8:40 pm:''' Electricity is unstable in the data centre area due to severe snowfall.&lt;br /&gt;
&lt;br /&gt;
'''Thu Jan 29, 2026, 1:40 pm:''' All services are operational again.&lt;br /&gt;
&lt;br /&gt;
'''Thu Jan 29, 2026, 12:00 pm:''' The Trillium and Open OnDemand compute nodes are operational again. We are still working on bringing Balam, Neptune and S4H nodes up again.&lt;br /&gt;
&lt;br /&gt;
'''Thu Jan 29, 2026, 10:00 am:''' There was a power glitch at the data centre overnight. The login nodes are accessible but the compute nodes are down.  &lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  When removing system status entries, please archive them to: --&amp;gt;&lt;br /&gt;
[[Previous messages]]&lt;br /&gt;
&lt;br /&gt;
{|style=&amp;quot;border-spacing: 10px;width: 100%&amp;quot;&lt;br /&gt;
|valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== QuickStart Guides ==&lt;br /&gt;
* [https://docs.alliancecan.ca/wiki/Trillium_Quickstart Trillium Quickstart]&lt;br /&gt;
* [[HPSS | HPSS archival storage]]&lt;br /&gt;
* [[Teach|Teach cluster]]&lt;br /&gt;
* [[FAQ | FAQ (frequently asked questions)]]&lt;br /&gt;
* [[Acknowledging SciNet]]&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== Tutorials, Manuals, etc. ==&lt;br /&gt;
* [https://education.scinet.utoronto.ca SciNet education material]&lt;br /&gt;
* [https://www.youtube.com/c/SciNetHPCattheUniversityofToronto SciNet's YouTube channel]&lt;br /&gt;
* [[Modules for Mist]] &lt;br /&gt;
* [[Commercial software]]&lt;br /&gt;
* [[Burst Buffer]]&lt;br /&gt;
* [[SSH#SSH Keys|SSH keys]]&lt;br /&gt;
* [[SSH Tunneling]]&lt;br /&gt;
* [[Visualization]]&lt;br /&gt;
* [[Running Serial Jobs on Niagara]]&lt;br /&gt;
* [[Jupyter Hub]]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Ymeiron</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=7547</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=7547"/>
		<updated>2026-02-17T22:44:23Z</updated>

		<summary type="html">&lt;p&gt;Ymeiron: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
{| style=&amp;quot;border-spacing:10px; width: 95%&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:1em; padding-top:.1em; border:2px solid #0645ad; background-color:#f6f6f6; border-radius:7px&amp;quot;|&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Use &amp;quot;Up&amp;quot;, &amp;quot;Partial&amp;quot; or &amp;quot;Down&amp;quot;; these are templates. --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;width:100%&amp;quot; &lt;br /&gt;
|{{Down3 | Trillium|https://docs.alliancecan.ca/wiki/Trillium_Quickstart}}&lt;br /&gt;
|{{Up3 | OnDemand|https://docs.alliancecan.ca/wiki/Trillium_Open_OnDemand_Quickstart}}&lt;br /&gt;
|{{Up | Globus |Globus}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up | HPSS|HPSS}}&lt;br /&gt;
|{{Up | Balam|Balam}}&lt;br /&gt;
|{{Up | S4H | S4H}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up | Teach|Teach}}&lt;br /&gt;
|{{Up3 | File system|https://docs.alliancecan.ca/wiki/Trillium_Quickstart#Storage}}&lt;br /&gt;
|{{Up3 | External Network|https://docs.alliancecan.ca/wiki/Trillium_Quickstart#Logging_in}} &lt;br /&gt;
|}&lt;br /&gt;
'''Tue Feb 17, 2026, 8:40 am:''' Power outage at the data centre.  Cooling issues have developed as a result.  Major systems (Trillium, S4H) are expected to be down until sometime Thursday. Login nodes and file systems will remain accessible.&lt;br /&gt;
&lt;br /&gt;
'''Mon Feb 16, 2026, 8:40 pm:''' Electricity is unstable in the data centre area due to severe snowfall.&lt;br /&gt;
&lt;br /&gt;
'''Thu Jan 29, 2026, 1:40 pm:''' All services are operational again.&lt;br /&gt;
&lt;br /&gt;
'''Thu Jan 29, 2026, 12:00 pm:''' The Trillium and Open OnDemand compute nodes are operational again. We are still working on bringing Balam, Neptune and S4H nodes up again.&lt;br /&gt;
&lt;br /&gt;
'''Thu Jan 29, 2026, 10:00 am:''' There was a power glitch at the data centre overnight. The login nodes are accessible but the compute nodes are down.  &lt;br /&gt;
&lt;br /&gt;
'''Fri Jan 16, 2026, 11:00 pm:''' HPSS is back online and accessible via the alliancecan#hpss Globus endpoint. &lt;br /&gt;
&lt;br /&gt;
'''Thu Jan 15, 2026, 10:00 pm:''' HPSS will undergo maintenance on Friday morning, Jan 16, 2026, including the alliancecan#hpss Globus endpoint. &lt;br /&gt;
&lt;br /&gt;
'''Tue Jan 6, 2026, 10:15 am:''' OnDemand has been fixed and is working again.&lt;br /&gt;
&lt;br /&gt;
'''Mon Jan 5, 2026, 9:00 pm:''' The authentication mechanism of OnDemand is not working.&lt;br /&gt;
&lt;br /&gt;
'''Wed Dec 31, 2025, 12:40 pm:''' We believe the problem has now been resolved.  Please let us know if you still experience login problems or aborted jobs.&lt;br /&gt;
&lt;br /&gt;
'''Tue Dec 30, 2025, 2:10 pm:''' We are experiencing problems with authentication, resulting in failed logins, OOD errors, and aborted jobs (with &amp;quot;prolog error&amp;quot;).  Please bear with us, as we are very short-staffed during the holiday break.  We will post updates here.&lt;br /&gt;
&lt;br /&gt;
'''Tue Dec 3, 2025, 11:30 am:''' Open OnDemand is fully operational again.&lt;br /&gt;
&lt;br /&gt;
'''Sat Nov 29, 2025, 12:40 am:''' There has been a problem with the water chiller. Some systems are offline.&lt;br /&gt;
&lt;br /&gt;
'''Wed Nov 5, 2025, 12:55 pm:''' Balam is back online.&lt;br /&gt;
&lt;br /&gt;
'''Wed Nov 5, 2025, 10:00 am:''' Open OnDemand is back online.&lt;br /&gt;
&lt;br /&gt;
'''Tue Nov 4, 2025, 11:00 pm:''' Most of the work is done, data movers, Globus, and HPSS are back online. Remaining services will be worked on tomorrow.&lt;br /&gt;
&lt;br /&gt;
'''Tue Nov 4, 2025, 8:30 am:''' Scheduled network maintenance. Trillium cluster is *not* affected.&lt;br /&gt;
&lt;br /&gt;
'''Tue Oct 21, 2025, 5:30 pm:''' Balam maintenance finished.&lt;br /&gt;
&lt;br /&gt;
'''Tue Oct 21, 2025, 7:00 am:''' Balam maintenance day.&lt;br /&gt;
&lt;br /&gt;
'''Thu Oct 15, 2025, 3:55 pm:''' Trillium inbound connections through trillium.alliancecan.ca or trillium.scinet.utoronto.ca are working again.&lt;br /&gt;
&lt;br /&gt;
'''Thu Oct 15, 2025, 3:05 pm:''' Trillium is experiencing external network issues for incoming traffic. Please try: ssh USERNAME@tri-login01.scinet.utoronto.ca in the meantime.&lt;br /&gt;
 &lt;br /&gt;
'''Thu Oct 06, 2025, 8:00 pm:''' HPSS is fully functional. You may submit archive jobs from Trillium login nodes, datamovers and robots.&lt;br /&gt;
&lt;br /&gt;
'''Thu Oct 03, 2025, 6:30 pm:''' HPSS is back online and already accessible via the alliancecan#hpss Globus endpoint. The directory tree now follows the other Alliance clusters. We're still working on job submission via Slurm.&lt;br /&gt;
&lt;br /&gt;
'''Thu Oct 01, 2025, 12:00 am:''' Niagara compute nodes are now unavailable for regular users. The login nodes will remain available for a while to allow a few last data transfers, although transfers from the Niagara file systems to Trillium are best done on nia-dm1.scinet.utoronto.ca.&lt;br /&gt;
&lt;br /&gt;
'''Thu Oct 01, 2025, 9:30 am:''' HPSS is down for scheduled maintenance, including the alliancecan#hpss Globus endpoint.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  When removing system status entries, please archive them to: --&amp;gt;&lt;br /&gt;
[[Previous messages]]&lt;br /&gt;
&lt;br /&gt;
{|style=&amp;quot;border-spacing: 10px;width: 100%&amp;quot;&lt;br /&gt;
|valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== QuickStart Guides ==&lt;br /&gt;
* [https://docs.alliancecan.ca/wiki/Trillium_Quickstart Trillium Quickstart]&lt;br /&gt;
* [[Niagara Quickstart]]&lt;br /&gt;
* [[HPSS | HPSS archival storage]]&lt;br /&gt;
* [[Teach|Teach cluster]]&lt;br /&gt;
* [[FAQ | FAQ (frequently asked questions)]]&lt;br /&gt;
* [[Acknowledging SciNet]]&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== Tutorials, Manuals, etc. ==&lt;br /&gt;
* [https://education.scinet.utoronto.ca SciNet education material]&lt;br /&gt;
* [https://www.youtube.com/c/SciNetHPCattheUniversityofToronto SciNet's YouTube channel]&lt;br /&gt;
* [[Modules specific to Niagara|Software Modules specific to Niagara]] &lt;br /&gt;
* [[Modules for Mist]] &lt;br /&gt;
* [[Commercial software]]&lt;br /&gt;
* [[Burst Buffer]]&lt;br /&gt;
* [[SSH#SSH Keys|SSH keys]]&lt;br /&gt;
* [[SSH Tunneling]]&lt;br /&gt;
* [[Visualization]]&lt;br /&gt;
* [[Running Serial Jobs on Niagara]]&lt;br /&gt;
* [[Jupyter Hub]]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Ymeiron</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=7364</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=7364"/>
		<updated>2025-12-05T22:36:49Z</updated>

		<summary type="html">&lt;p&gt;Ymeiron: /* System Status */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
{| style=&amp;quot;border-spacing:10px; width: 95%&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:1em; padding-top:.1em; border:2px solid #0645ad; background-color:#f6f6f6; border-radius:7px&amp;quot;|&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Use &amp;quot;Up&amp;quot;, &amp;quot;Partial&amp;quot; or &amp;quot;Down&amp;quot;; these are templates. --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;width:100%&amp;quot; &lt;br /&gt;
|{{Up3 | Trillium|https://docs.alliancecan.ca/wiki/Trillium_Quickstart}}&lt;br /&gt;
|{{Up | OnDemand|Open_OnDemand_Quickstart}}&lt;br /&gt;
|{{Up | Globus |Globus}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up | HPSS|HPSS}}&lt;br /&gt;
|{{Up | Balam|Balam}}&lt;br /&gt;
|{{Up | S4H | S4H}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Down | Teach|Teach}}&lt;br /&gt;
|{{Up3 | File system|https://docs.alliancecan.ca/wiki/Trillium_Quickstart#Storage}}&lt;br /&gt;
|{{Up3 | External Network|https://docs.alliancecan.ca/wiki/Trillium_Quickstart#Logging_in}} &lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
'''Tue Dec 3, 2025, 11:30 am:''' Open OnDemand is fully operational again.&lt;br /&gt;
&lt;br /&gt;
'''Sat Nov 29, 2025, 12:40 am:''' There has been a problem with the water chiller. Some systems are offline.&lt;br /&gt;
&lt;br /&gt;
'''Wed Nov 5, 2025, 12:55 pm:''' Balam is back online.&lt;br /&gt;
&lt;br /&gt;
'''Wed Nov 5, 2025, 10:00 am:''' Open OnDemand is back online.&lt;br /&gt;
&lt;br /&gt;
'''Tue Nov 4, 2025, 11:00 pm:''' Most of the work is done, data movers, Globus, and HPSS are back online. Remaining services will be worked on tomorrow.&lt;br /&gt;
&lt;br /&gt;
'''Tue Nov 4, 2025, 8:30 am:''' Scheduled network maintenance. Trillium cluster is *not* affected.&lt;br /&gt;
&lt;br /&gt;
'''Tue Oct 21, 2025, 5:30 pm:''' Balam maintenance finished.&lt;br /&gt;
&lt;br /&gt;
'''Tue Oct 21, 2025, 7:00 am:''' Balam maintenance day.&lt;br /&gt;
&lt;br /&gt;
'''Thu Oct 15, 2025, 3:55 pm:''' Trillium inbound connections through trillium.alliancecan.ca or trillium.scinet.utoronto.ca are working again.&lt;br /&gt;
&lt;br /&gt;
'''Thu Oct 15, 2025, 3:05 pm:''' Trillium is experiencing external network issues for incoming traffic. Please try: ssh USERNAME@tri-login01.scinet.utoronto.ca in the meantime.&lt;br /&gt;
 &lt;br /&gt;
'''Thu Oct 06, 2025, 8:00 pm:''' HPSS is fully functional. You may submit archive jobs from Trillium login nodes, datamovers and robots.&lt;br /&gt;
&lt;br /&gt;
'''Thu Oct 03, 2025, 6:30 pm:''' HPSS is back online and already accessible via the alliancecan#hpss Globus endpoint. The directory tree now follows the other Alliance clusters. We're still working on job submission via Slurm.&lt;br /&gt;
&lt;br /&gt;
'''Thu Oct 01, 2025, 12:00 am:''' Niagara compute nodes are now unavailable for regular users. The login nodes will remain available for a while to allow a few last data transfers, although transfers from the Niagara file systems to Trillium are best done on nia-dm1.scinet.utoronto.ca.&lt;br /&gt;
&lt;br /&gt;
'''Thu Oct 01, 2025, 9:30 am:''' HPSS is down for scheduled maintenance, including the alliancecan#hpss Globus endpoint.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  When removing system status entries, please archive them to: --&amp;gt;&lt;br /&gt;
[[Previous messages]]&lt;br /&gt;
&lt;br /&gt;
{|style=&amp;quot;border-spacing: 10px;width: 100%&amp;quot;&lt;br /&gt;
|valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== QuickStart Guides ==&lt;br /&gt;
* [https://docs.alliancecan.ca/wiki/Trillium_Quickstart Trillium Quickstart]&lt;br /&gt;
* [[Niagara Quickstart]]&lt;br /&gt;
* [[HPSS | HPSS archival storage]]&lt;br /&gt;
* [[Teach|Teach cluster]]&lt;br /&gt;
* [[FAQ | FAQ (frequently asked questions)]]&lt;br /&gt;
* [[Acknowledging SciNet]]&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== Tutorials, Manuals, etc. ==&lt;br /&gt;
* [https://education.scinet.utoronto.ca SciNet education material]&lt;br /&gt;
* [https://www.youtube.com/c/SciNetHPCattheUniversityofToronto SciNet's YouTube channel]&lt;br /&gt;
* [[Modules specific to Niagara|Software Modules specific to Niagara]] &lt;br /&gt;
* [[Modules for Mist]] &lt;br /&gt;
* [[Commercial software]]&lt;br /&gt;
* [[Burst Buffer]]&lt;br /&gt;
* [[SSH#SSH Keys|SSH keys]]&lt;br /&gt;
* [[SSH Tunneling]]&lt;br /&gt;
* [[Visualization]]&lt;br /&gt;
* [[Running Serial Jobs on Niagara]]&lt;br /&gt;
* [[Jupyter Hub]]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Ymeiron</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=7361</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=7361"/>
		<updated>2025-12-05T15:19:30Z</updated>

		<summary type="html">&lt;p&gt;Ymeiron: /* System Status */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
{| style=&amp;quot;border-spacing:10px; width: 95%&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:1em; padding-top:.1em; border:2px solid #0645ad; background-color:#f6f6f6; border-radius:7px&amp;quot;|&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Use &amp;quot;Up&amp;quot;, &amp;quot;Partial&amp;quot; or &amp;quot;Down&amp;quot;; these are templates. --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;width:100%&amp;quot; &lt;br /&gt;
|{{Up3 | Trillium|https://docs.alliancecan.ca/wiki/Trillium_Quickstart}}&lt;br /&gt;
|{{Up | OnDemand|Open_OnDemand_Quickstart}}&lt;br /&gt;
|{{Up | Globus |Globus}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up | HPSS|HPSS}}&lt;br /&gt;
|{{Up | Balam|Balam}}&lt;br /&gt;
|{{Down | S4H | S4H}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Down | Teach|Teach}}&lt;br /&gt;
|{{Up3 | File system|https://docs.alliancecan.ca/wiki/Trillium_Quickstart#Storage}}&lt;br /&gt;
|{{Up3 | External Network|https://docs.alliancecan.ca/wiki/Trillium_Quickstart#Logging_in}} &lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
'''Tue Dec 3, 2025, 11:30 am:''' Open OnDemand is fully operational again.&lt;br /&gt;
&lt;br /&gt;
'''Sat Nov 29, 2025, 12:40 am:''' There has been a problem with the water chiller. Some systems are offline.&lt;br /&gt;
&lt;br /&gt;
'''Wed Nov 5, 2025, 12:55 pm:''' Balam is back online.&lt;br /&gt;
&lt;br /&gt;
'''Wed Nov 5, 2025, 10:00 am:''' Open OnDemand is back online.&lt;br /&gt;
&lt;br /&gt;
'''Tue Nov 4, 2025, 11:00 pm:''' Most of the work is done, data movers, Globus, and HPSS are back online. Remaining services will be worked on tomorrow.&lt;br /&gt;
&lt;br /&gt;
'''Tue Nov 4, 2025, 8:30 am:''' Scheduled network maintenance. Trillium cluster is *not* affected.&lt;br /&gt;
&lt;br /&gt;
'''Tue Oct 21, 2025, 5:30 pm:''' Balam maintenance finished.&lt;br /&gt;
&lt;br /&gt;
'''Tue Oct 21, 2025, 7:00 am:''' Balam maintenance day.&lt;br /&gt;
&lt;br /&gt;
'''Thu Oct 15, 2025, 3:55 pm:''' Trillium inbound connections through trillium.alliancecan.ca or trillium.scinet.utoronto.ca are working again.&lt;br /&gt;
&lt;br /&gt;
'''Thu Oct 15, 2025, 3:05 pm:''' Trillium is experiencing external network issues for incoming traffic. Please try: ssh USERNAME@tri-login01.scinet.utoronto.ca in the meantime.&lt;br /&gt;
 &lt;br /&gt;
'''Thu Oct 06, 2025, 8:00 pm:''' HPSS is fully functional. You may submit archive jobs from trillium login nodes, datamovers and robots.&lt;br /&gt;
&lt;br /&gt;
'''Thu Oct 03, 2025, 6:30 pm:''' HPSS is back online, and already accessible via alliancecan#hpss Globus endpoint. Directory tree now follows the other Alliance clusters. We're still working on job submission via Slurm&lt;br /&gt;
&lt;br /&gt;
'''Thu Oct 01, 2025, 0:00 am:''' Niagara compute nodes are now unavailable for regular users. The login nodes will remain available for a while to allow a few last data transfers, although transfers from the Niagara file systems to Trillium are best done on nia-dm1.scinet.utoronto.ca.&lt;br /&gt;
&lt;br /&gt;
'''Thu Oct 01, 2025, 9:30 am:''' HPSS is down for scheduled maintenance, including alliancecan#hpss Globus endpoint&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  When removing system status entries, please archive them to: --&amp;gt;&lt;br /&gt;
[[Previous messages]]&lt;br /&gt;
&lt;br /&gt;
{|style=&amp;quot;border-spacing: 10px;width: 100%&amp;quot;&lt;br /&gt;
|valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== QuickStart Guides ==&lt;br /&gt;
* [https://docs.alliancecan.ca/wiki/Trillium_Quickstart Trillium Quickstart]&lt;br /&gt;
* [[Niagara Quickstart]]&lt;br /&gt;
* [[HPSS | HPSS archival storage]]&lt;br /&gt;
* [[Teach|Teach cluster]]&lt;br /&gt;
* [[FAQ | FAQ (frequently asked questions)]]&lt;br /&gt;
* [[Acknowledging SciNet]]&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== Tutorials, Manuals, etc. ==&lt;br /&gt;
* [https://education.scinet.utoronto.ca SciNet education material]&lt;br /&gt;
* [https://www.youtube.com/c/SciNetHPCattheUniversityofToronto SciNet's YouTube channel]&lt;br /&gt;
* [[Modules specific to Niagara|Software Modules specific to Niagara]] &lt;br /&gt;
* [[Modules for Mist]] &lt;br /&gt;
* [[Commercial software]]&lt;br /&gt;
* [[Burst Buffer]]&lt;br /&gt;
* [[SSH#SSH Keys|SSH keys]]&lt;br /&gt;
* [[SSH Tunneling]]&lt;br /&gt;
* [[Visualization]]&lt;br /&gt;
* [[Running Serial Jobs on Niagara]]&lt;br /&gt;
* [[Jupyter Hub]]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Ymeiron</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=7343</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=7343"/>
		<updated>2025-12-02T21:38:44Z</updated>

		<summary type="html">&lt;p&gt;Ymeiron: Added S4H&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
{| style=&amp;quot;border-spacing:10px; width: 95%&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:1em; padding-top:.1em; border:2px solid #0645ad; background-color:#f6f6f6; border-radius:7px&amp;quot;|&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Use &amp;quot;Up&amp;quot;, &amp;quot;Partial&amp;quot; or &amp;quot;Down&amp;quot;; these are templates. --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;width:100%&amp;quot; &lt;br /&gt;
|{{Up3 | Trillium|https://docs.alliancecan.ca/wiki/Trillium_Quickstart}}&lt;br /&gt;
|{{Down | Niagara|Niagara_Quickstart}}&lt;br /&gt;
|{{Down | Teach|Teach}}&lt;br /&gt;
|{{Down | Rouge|Rouge}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Partial | OnDemand|Open_OnDemand_Quickstart}}&lt;br /&gt;
|{{Up3 | Scheduler|https://docs.alliancecan.ca/wiki/Trillium_Quickstart#Submitting_jobs_to_the_scheduler}}&lt;br /&gt;
|{{Up3 | File system|https://docs.alliancecan.ca/wiki/Trillium_Quickstart#Storage}}&lt;br /&gt;
|{{Up | Burst Buffer|Burst_Buffer}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up | HPSS|HPSS}}&lt;br /&gt;
|{{Up3 | Login Nodes|https://docs.alliancecan.ca/wiki/Trillium_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up3 | External Network|https://docs.alliancecan.ca/wiki/Trillium_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up  | Globus |Globus}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up | Balam|Balam}}&lt;br /&gt;
|{{Up3 | Cvmfs|https://docs.alliancecan.ca/wiki/Standard_software_environments}}&lt;br /&gt;
|{{Up | S4H | S4H}}&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
'''Sat Nov 29, 2025, 12:40 am:''' There has been a problem with the water chiller. Some systems are offline.&lt;br /&gt;
&lt;br /&gt;
'''Wed Nov 5, 2025, 12:55 pm:''' Balam is back online.&lt;br /&gt;
&lt;br /&gt;
'''Wed Nov 5, 2025, 10:00 am:''' Open OnDemand is back online.&lt;br /&gt;
&lt;br /&gt;
'''Tue Nov 4, 2025, 11:00 pm:''' Most of the work is done, data movers, Globus, and HPSS are back online. Remaining services will be worked on tomorrow.&lt;br /&gt;
&lt;br /&gt;
'''Tue Nov 4, 2025, 8:30 am:''' Scheduled network maintenance. Trillium cluster is *not* affected.&lt;br /&gt;
&lt;br /&gt;
'''Tue Oct 21, 2025, 5:30 pm:''' Balam maintenance finished.&lt;br /&gt;
&lt;br /&gt;
'''Tue Oct 21, 2025, 7:00 am:''' Balam maintenance day.&lt;br /&gt;
&lt;br /&gt;
'''Thu Oct 15, 2025, 3:55 pm:''' Trillium inbound connections through trillium.alliancecan.ca or trillium.scinet.utoronto.ca are working again.&lt;br /&gt;
&lt;br /&gt;
'''Thu Oct 15, 2025, 3:05 pm:''' Trillium is experiencing external network issues for incoming traffic. Please try: ssh USERNAME@tri-login01.scinet.utoronto.ca in the meantime.&lt;br /&gt;
 &lt;br /&gt;
'''Thu Oct 06, 2025, 8:00 pm:''' HPSS is fully functional. You may submit archive jobs from trillium login nodes, datamovers and robots.&lt;br /&gt;
&lt;br /&gt;
'''Thu Oct 03, 2025, 6:30 pm:''' HPSS is back online, and already accessible via alliancecan#hpss Globus endpoint. Directory tree now follows the other Alliance clusters. We're still working on job submission via Slurm&lt;br /&gt;
&lt;br /&gt;
'''Thu Oct 01, 2025, 0:00 am:''' Niagara compute nodes are now unavailable for regular users. The login nodes will remain available for a while to allow a few last data transfers, although transfers from the Niagara file systems to Trillium are best done on nia-dm1.scinet.utoronto.ca.&lt;br /&gt;
&lt;br /&gt;
'''Thu Oct 01, 2025, 9:30 am:''' HPSS is down for scheduled maintenance, including alliancecan#hpss Globus endpoint&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  When removing system status entries, please archive them to: --&amp;gt;&lt;br /&gt;
[[Previous messages]]&lt;br /&gt;
&lt;br /&gt;
{|style=&amp;quot;border-spacing: 10px;width: 100%&amp;quot;&lt;br /&gt;
|valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== QuickStart Guides ==&lt;br /&gt;
* [https://docs.alliancecan.ca/wiki/Trillium_Quickstart Trillium Quickstart]&lt;br /&gt;
* [[Niagara Quickstart]]&lt;br /&gt;
* [[HPSS | HPSS archival storage]]&lt;br /&gt;
* [[Teach|Teach cluster]]&lt;br /&gt;
* [[FAQ | FAQ (frequently asked questions)]]&lt;br /&gt;
* [[Acknowledging SciNet]]&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== Tutorials, Manuals, etc. ==&lt;br /&gt;
* [https://education.scinet.utoronto.ca SciNet education material]&lt;br /&gt;
* [https://www.youtube.com/c/SciNetHPCattheUniversityofToronto SciNet's YouTube channel]&lt;br /&gt;
* [[Modules specific to Niagara|Software Modules specific to Niagara]] &lt;br /&gt;
* [[Modules for Mist]] &lt;br /&gt;
* [[Commercial software]]&lt;br /&gt;
* [[Burst Buffer]]&lt;br /&gt;
* [[SSH#SSH Keys|SSH keys]]&lt;br /&gt;
* [[SSH Tunneling]]&lt;br /&gt;
* [[Visualization]]&lt;br /&gt;
* [[Running Serial Jobs on Niagara]]&lt;br /&gt;
* [[Jupyter Hub]]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Ymeiron</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=S4H&amp;diff=7340</id>
		<title>S4H</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=S4H&amp;diff=7340"/>
		<updated>2025-12-02T21:36:40Z</updated>

		<summary type="html">&lt;p&gt;Ymeiron: Page created&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Introduction =&lt;br /&gt;
S4H (formerly SciNet4Health) is our secure computing environment pilot, providing users with the ability to run [https://docs.alliancecan.ca/wiki/Trillium_Quickstart Trillium] jobs on confidential data. This subsystem consists of a dedicated login node and a storage appliance, but it is tightly integrated with Trillium. Security concerns are addressed by:&lt;br /&gt;
&lt;br /&gt;
* Hardened access&lt;br /&gt;
* Encryption at rest&lt;br /&gt;
* Group isolation&lt;br /&gt;
* Data egress control (optional)&lt;br /&gt;
&lt;br /&gt;
Usage of S4H is by request only. Access must be requested by a principal investigator (PI) on behalf of their group members (i.e. sponsored users on CCDB).&lt;br /&gt;
&lt;br /&gt;
= Policies =&lt;br /&gt;
Each user is assigned one of three policies:&lt;br /&gt;
&lt;br /&gt;
* '''Permissive:''' the user may connect to the login node using SSH from pre-approved source IP addresses, and has unrestricted internet access from the login node&lt;br /&gt;
* '''Restrictive:''' the user may connect to the login node using SSH from pre-approved source IP addresses, but internet access from the login node is restricted&lt;br /&gt;
* '''Prohibitive:''' the user may only connect to the login node using a remote desktop client program from pre-approved source IP addresses, and internet access from the login node is restricted&lt;br /&gt;
&lt;br /&gt;
If you don't know what policy you belong to, you should ask your PI.&lt;br /&gt;
&lt;br /&gt;
= Login =&lt;br /&gt;
== Direct ==&lt;br /&gt;
Users with permission to connect directly to the login node (permissive and restrictive policies) should first make sure that they are able to log in to Trillium (i.e. they have uploaded an SSH public key to CCDB and set up second-factor authentication there). If access to Trillium is successful, use the same username and SSH key to log in to the following address:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
s4h.scinet.utoronto.ca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You should be prompted for the second factor, like in Trillium.&lt;br /&gt;
&lt;br /&gt;
The connection must be made from one of the '''IP addresses pre-approved by the PI''' for that user (e.g. a workstation or a jump host in your lab).&lt;br /&gt;
&lt;br /&gt;
== Through the graphical gateway ==&lt;br /&gt;
Users with permission to connect through the graphical gateway should use an RDP-enabled remote desktop client and log in to the following address using their CCDB username and password:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
s4h-ggw.scinet.utoronto.ca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
It is recommended to set the display resolution to 1600×900 if the resolution is not picked up automatically by the program.&lt;br /&gt;
&lt;br /&gt;
Additionally, a &amp;quot;pre-login&amp;quot; step has to be performed. In this step, an SSH ''agent'' must be forwarded to the above address. The graphical gateway asks the user's SSH agent program to perform the authentication (i.e. the workstation from which the user is connecting holds the private key, and the key has been added to the agent). In a sense this is three-factor authentication: one needs the password, the SSH private key, and either a YubiKey or the Duo mobile app registered with CCDB. Here is an example of this process:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
eval $(ssh-agent)&lt;br /&gt;
ssh-add /home/alice/.ssh/ccdb_ed25519&lt;br /&gt;
ssh -T -A alice@s4h-ggw.scinet.utoronto.ca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note that no shell access is expected after the &amp;lt;code&amp;gt;ssh&amp;lt;/code&amp;gt; command; instead, the window of the remote desktop program should now prompt for the YubiKey passcode or Duo mobile app push. Once that is done as well, the SSH client will print ''Login successful'' and quit.&lt;br /&gt;
&lt;br /&gt;
The connection must be made from one of the '''IP addresses pre-approved by the PI''' for that user (e.g. a workstation or a jump host in your lab).&lt;br /&gt;
&lt;br /&gt;
= Storage =&lt;br /&gt;
== Directories ==&lt;br /&gt;
Trillium file systems are accessible via their usual paths but are read-only on S4H (to prevent accidentally saving sensitive data there). Instead, home, scratch, and project spaces are provided on alternative paths under &amp;lt;code&amp;gt;/s4h&amp;lt;/code&amp;gt; (indicating the encrypted storage appliance). If the user &amp;quot;alice&amp;quot; belongs to the group &amp;quot;def-bob&amp;quot; on S4H, their home directory (which can be expanded from &amp;lt;code&amp;gt;~&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;$HOME&amp;lt;/code&amp;gt;) will be located in &amp;lt;code&amp;gt;/s4h/def-bob/home/alice&amp;lt;/code&amp;gt;, and similarly their scratch directory (which can be expanded from &amp;lt;code&amp;gt;$SCRATCH&amp;lt;/code&amp;gt;) will be in &amp;lt;code&amp;gt;/s4h/def-bob/scratch/alice&amp;lt;/code&amp;gt;. The project directory is &amp;lt;code&amp;gt;/s4h/def-bob/project&amp;lt;/code&amp;gt;; users may create their own directories there as needed.&lt;br /&gt;
&lt;br /&gt;
The environment variables &amp;lt;code&amp;gt;$TRIHOME&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;$TRISCRATCH&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;$TRIPROJECT&amp;lt;/code&amp;gt; expand to the corresponding file system paths for Trillium (as noted above, they are read-only on S4H).&lt;br /&gt;
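&lt;br /&gt;
As a rough illustration of the layout described above, the following shell sketch prints the S4H paths for a given user and group. The function name s4h_paths is made up for this example, and alice and def-bob are the placeholder names used above; on the system itself, $HOME and $SCRATCH already point to the right locations.&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/sh
# Print the S4H storage paths for a user and group, following the
# /s4h/GROUP/... layout described in this section.
# s4h_paths is a hypothetical helper, not a command provided on S4H.
s4h_paths() {
    user="$1"
    group="$2"
    echo "home=/s4h/${group}/home/${user}"
    echo "scratch=/s4h/${group}/scratch/${user}"
    echo "project=/s4h/${group}/project"
}

s4h_paths alice def-bob
```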
&lt;br /&gt;
== Data transfer ==&lt;br /&gt;
For users in the permissive and restrictive policies, please use an SSH-based program (such as &amp;lt;code&amp;gt;scp&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;rsync&amp;lt;/code&amp;gt;) to transfer data directly in and out of the S4H login node. There is no dedicated datamover for S4H.&lt;br /&gt;
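&lt;br /&gt;
A minimal sketch of such a transfer is given below. It only assembles and prints an rsync command for review and does not transfer anything. The helper name s4h_upload_cmd, the local path ./dataset, and the user and group names are assumptions made for illustration; the command would have to be run from one of the pre-approved IP addresses.&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/sh
# Assemble an rsync-over-SSH upload command and print it for review.
# Nothing is transferred; copy the printed command to run it for real.
# s4h_upload_cmd is a hypothetical helper name.
s4h_upload_cmd() {
    src="$1"     # local file or directory
    user="$2"    # CCDB username
    dest="$3"    # destination directory on S4H
    echo "rsync -av --partial ${src} ${user}@s4h.scinet.utoronto.ca:${dest}"
}

s4h_upload_cmd ./dataset alice /s4h/def-bob/project/
```

Here -a preserves permissions and timestamps, -v lists the files as they are copied, and --partial keeps partially transferred files so that an interrupted copy can be resumed.&lt;br /&gt;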
&lt;br /&gt;
Under the prohibitive policy, users may not transfer sensitive data in and out of S4H. They may only upload non-sensitive data to Trillium (storage not encrypted at rest), where it can be accessed from S4H. For egress purposes, the PI should designate at least one user (could be themselves) who is not under the prohibitive policy. Other users in the group could share files for egress with the designated user (e.g. by putting them in the group's project directory).&lt;br /&gt;
&lt;br /&gt;
== Data policies ==&lt;br /&gt;
It is important to understand that:&lt;br /&gt;
* Within a group, file access is managed by traditional POSIX permissions and access-control lists, like on Trillium. In case users in a group are working on separate sub-projects where there should not be mutual access, it is their responsibility to make sure that permissions are set up correctly.&lt;br /&gt;
* There is no way to facilitate cross-group file sharing of sensitive files on S4H; each group has a different encryption key and the system is set up so that a compute node can only use one key at a time.&lt;br /&gt;
* &amp;lt;span style=&amp;quot;color: red&amp;quot;&amp;gt;No backup is provided for encrypted storage;&amp;lt;/span&amp;gt; deletion is irreversible. This ensures that data are securely disposed of in compliance with a provision found in many data sharing agreements.&lt;br /&gt;
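&lt;br /&gt;
The first point can be illustrated with a short sketch that makes a sub-project directory accessible to its owner only. It runs in a throwaway temporary directory so that it is safe to try anywhere; the directory name subproject-a is hypothetical, and on S4H the target would be a directory under the group's project space.&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/sh
# Work in a throwaway directory; on S4H the target would be a
# sub-project directory under the group's project space.
demo_root="$(mktemp -d)"
subproject="${demo_root}/subproject-a"   # hypothetical directory name
mkdir "${subproject}"

# Owner-only access: remove all group and other permissions.
chmod 700 "${subproject}"

# Show the resulting permission string for the directory.
ls -ld "${subproject}" | cut -c1-10
```

Sharing with a single colleague is typically done with an access-control list instead, for example with setfacl -m u:carol:rx on the directory, where carol is a hypothetical username.&lt;br /&gt;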
&lt;br /&gt;
= Software =&lt;br /&gt;
Same as Trillium.&lt;br /&gt;
&lt;br /&gt;
Note that you may use software (including, for example, Python virtual environments) that you installed in your Trillium file systems on S4H (but not vice versa). This could be useful for users under the restrictive or prohibitive policy, who may otherwise have difficulty installing software in their encrypted storage spaces.&lt;br /&gt;
&lt;br /&gt;
= Submitting jobs =&lt;br /&gt;
This is largely the same as Trillium. &amp;lt;span style=&amp;quot;color: red&amp;quot;&amp;gt;Note however that job metadata are not kept confidential!&amp;lt;/span&amp;gt; In particular, the submitting user, work directory, job name, comment, and command should be ''considered public information'' and the users must not include any sensitive information in these.&lt;/div&gt;</summary>
		<author><name>Ymeiron</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Trillium_Quickstart&amp;diff=6839</id>
		<title>Trillium Quickstart</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Trillium_Quickstart&amp;diff=6839"/>
		<updated>2025-08-07T21:28:41Z</updated>

		<summary type="html">&lt;p&gt;Ymeiron: Added another note about the address of the GPU system&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox Computer&lt;br /&gt;
|image=[[Image:Trillium.jpg|center|300px|thumb]]&lt;br /&gt;
|name=Trillium&lt;br /&gt;
|installed=Aug 2025&lt;br /&gt;
|operatingsystem= Rocky Linux 9.6&lt;br /&gt;
|loginnode= trillium.scinet.utoronto.ca, trillium-gpu.scinet.utoronto.ca&lt;br /&gt;
|nnodes=  1284 nodes (240768 cores)&lt;br /&gt;
|rampernode= 768 GB&lt;br /&gt;
|corespernode= 192 (CPU nodes) and 96 (GPU nodes)&lt;br /&gt;
|interconnect=Mellanox Dragonfly+&lt;br /&gt;
|queuetype=Slurm&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
__TOC__&lt;br /&gt;
&lt;br /&gt;
= System Overview =&lt;br /&gt;
&lt;br /&gt;
The Trillium system is a state-of-the-art high performance computing platform, consisting of three main components:&lt;br /&gt;
&lt;br /&gt;
1. CPU Subcluster&lt;br /&gt;
* ~240,000 cores across homogeneous CPU nodes  &lt;br /&gt;
* Non-blocking 400 Gb/s NDR InfiniBand interconnect  &lt;br /&gt;
* Ideal for large-scale parallel workloads  &lt;br /&gt;
&lt;br /&gt;
2. GPU Subcluster&lt;br /&gt;
* 61 GPU nodes, each with 4 x NVIDIA H100 (SXM) GPUs  &lt;br /&gt;
* 800 Gb/s bandwidth per node (200 Gb/s per GPU) over InfiniBand  &lt;br /&gt;
* Optimized for AI/ML and accelerated science workloads  &lt;br /&gt;
* Note: This subcluster is in high demand and not ideal for training extremely large models (multi-100B parameters)&lt;br /&gt;
* To access, SSH into &amp;lt;code&amp;gt;trillium-gpu.scinet.utoronto.ca&amp;lt;/code&amp;gt; from outside, or to &amp;lt;code&amp;gt;trig-login01&amp;lt;/code&amp;gt; from other Trillium nodes.&lt;br /&gt;
&lt;br /&gt;
3. Storage System&lt;br /&gt;
* Unified 29 PB VAST NVMe storage for all workloads  &lt;br /&gt;
* No tiering — all flash-based for consistent performance  &lt;br /&gt;
* Accessible via POSIX or S3 under a unified namespace  &lt;br /&gt;
&lt;br /&gt;
== Specifications ==&lt;br /&gt;
&lt;br /&gt;
The Trillium cluster is a large cluster composed of two types of nodes:&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot;&lt;br /&gt;
! nodes !! cores !! available memory !! CPU !! GPU&lt;br /&gt;
|-&lt;br /&gt;
| 1224 || 192 || 768GB DDR5 ||2 x AMD EPYC 9655 (Zen 5) @ 2.6 GHz, 384MB cache L3 ||&lt;br /&gt;
|-&lt;br /&gt;
|  60 || 96 || 768GB DDR5 || 1 x AMD EPYC 9654 (Zen 4) @ 2.4 GHz, 384MB cache L3 || 4 x NVIDIA H100 SXM (80 GB memory)&lt;br /&gt;
|}&lt;br /&gt;
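&lt;br /&gt;
The node and core counts in the table can be cross-checked against the totals quoted in the infobox (1284 nodes, 240768 cores). The short calculation below uses only the numbers from the table; the variable names are chosen for this example.&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/sh
# Cross-check the totals using the per-node-type numbers from the table.
cpu_cores=$((1224 * 192))   # CPU nodes times cores per CPU node
gpu_cores=$((60 * 96))      # GPU nodes times cores per GPU node
total_nodes=$((1224 + 60))
total_cores=$((cpu_cores + gpu_cores))

echo "CPU node cores: ${cpu_cores}"
echo "GPU node cores: ${gpu_cores}"
echo "total nodes: ${total_nodes}"
echo "total cores: ${total_cores}"
```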
&lt;br /&gt;
Each node of the cluster has 768 GB of RAM.  Being designed for large parallel workloads, the cluster has a fast interconnect consisting of NDR InfiniBand in a Dragonfly+ topology with Adaptive Routing. The compute nodes are accessed through a queueing system that allows jobs with a run time of at least 15 minutes and at most 24 hours.&lt;br /&gt;
&lt;br /&gt;
== Storage System ==&lt;br /&gt;
&lt;br /&gt;
Trillium features a unified high-performance storage system based on the VAST platform, with no tiering. It serves the following directories:&lt;br /&gt;
&lt;br /&gt;
* &amp;lt;code&amp;gt;/home&amp;lt;/code&amp;gt; – For personal files and configurations.&lt;br /&gt;
* &amp;lt;code&amp;gt;/scratch&amp;lt;/code&amp;gt; – High-speed, temporary storage for job data.&lt;br /&gt;
* &amp;lt;code&amp;gt;/project&amp;lt;/code&amp;gt; – Shared storage for project teams and collaborations.&lt;br /&gt;
&lt;br /&gt;
The storage is accessible via the NDR InfiniBand fabric for maximum performance across all workloads.&lt;br /&gt;
&lt;br /&gt;
= Getting started on Trillium =&lt;br /&gt;
&lt;br /&gt;
Access to Trillium is not enabled automatically for everyone with an account with the {{DigitalResearchAllianceOfCanada}}, but anyone with an active Alliance account can get their access enabled.&lt;br /&gt;
 &lt;br /&gt;
If you are new to SciNet or your Supervisor/PI does not hold a current {{Alliance}} [https://alliancecan.ca/en/services/advanced-research-computing/research-portal/accessing-resources/resource-allocation-competitions RAC] allocation, you will need to request access on the [https://ccdb.alliancecan.ca/me/access_systems Access Systems] page on the CCDB site. After clicking the &amp;quot;I request access&amp;quot; button, it usually takes only one or two business days for access to be granted.&lt;br /&gt;
&lt;br /&gt;
You can check if you already have Trillium access by attempting to log in. If you receive a &amp;quot;Permission denied&amp;quot; error (and your SSH key is correctly set up), you may need to opt in.&lt;br /&gt;
&lt;br /&gt;
Please read this document carefully.  The [https://docs.scinet.utoronto.ca/index.php/FAQ FAQ] is also a useful resource.  If at any time you require assistance, or if something is unclear, please do not hesitate to [mailto:support@scinet.utoronto.ca contact us].&lt;br /&gt;
&lt;br /&gt;
== Logging in ==&lt;br /&gt;
&lt;br /&gt;
Trillium runs Rocky Linux 9.6, which is a type of Linux.  You will need to be familiar with Linux systems to work on Trillium.  If you are not, it will be worth your time to review our [https://support.scinet.utoronto.ca/education/browse.php?category=-1&amp;amp;search=scmp101&amp;amp;include=all&amp;amp;filter=Filter Introduction to Linux Shell] class.&lt;br /&gt;
&lt;br /&gt;
As with all SciNet and {{Alliance}} compute systems, access to Trillium is done via [[SSH]] (secure shell) only and authentication is only allowed via SSH keys. [https://docs.alliancecan.ca/wiki/SSH_Keys Please refer to this page] to generate your SSH key pair and make sure you use them securely.&lt;br /&gt;
 &lt;br /&gt;
Open a terminal window (on Windows, for example, you can connect with [https://docs.alliancecan.ca/wiki/Connecting_with_PuTTY PuTTY] or [https://docs.alliancecan.ca/wiki/Connecting_with_MobaXTerm MobaXTerm]), then SSH into the Trillium login nodes with your {{Alliance}} credentials:&lt;br /&gt;
&lt;br /&gt;
 $ ssh -i /path/to/ssh_private_key -Y MYALLIANCEUSERNAME@trillium.scinet.utoronto.ca&lt;br /&gt;
&lt;br /&gt;
* The Trillium login nodes are where you develop, edit, compile, prepare and submit jobs.&lt;br /&gt;
* These login nodes are not part of the Trillium compute cluster, but have the same architecture, operating system, and software stack.&lt;br /&gt;
* The optional &amp;lt;code&amp;gt;-Y&amp;lt;/code&amp;gt; enables X11 forwarding, allowing graphical programs to open windows on your local computer.&lt;br /&gt;
* To run on Trillium compute nodes, you must [[#Submitting_jobs | submit a batch job]].&lt;br /&gt;
&lt;br /&gt;
If you cannot log in, be sure to first check the [https://docs.scinet.utoronto.ca System Status] on this site's front page.&lt;br /&gt;
&lt;br /&gt;
Note: We plan to add browser access to Trillium via Open OnDemand in the future. In the meantime you can still access our existing Open OnDemand deployment by following the instructions in our [https://docs.scinet.utoronto.ca/index.php/Open_OnDemand_Quickstart quickstart guide].&lt;br /&gt;
&lt;br /&gt;
== Software Environment ==&lt;br /&gt;
&lt;br /&gt;
Trillium uses the '''environment modules''' system to manage compilers, libraries, and other software packages. Modules dynamically modify your environment (e.g., &amp;lt;code&amp;gt;PATH&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;LD_LIBRARY_PATH&amp;lt;/code&amp;gt;) so you can access different versions of software without conflicts.&lt;br /&gt;
&lt;br /&gt;
A detailed explanation can be [[Using_modules | found on the modules page]].&lt;br /&gt;
&lt;br /&gt;
Commonly used module commands:&lt;br /&gt;
&lt;br /&gt;
* &amp;lt;code&amp;gt;module load &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt; – Load the default version of a software package.&lt;br /&gt;
* &amp;lt;code&amp;gt;module load &amp;lt;module-name&amp;gt;/&amp;lt;module-version&amp;gt;&amp;lt;/code&amp;gt; – Load a specific version.&lt;br /&gt;
* &amp;lt;code&amp;gt;module purge&amp;lt;/code&amp;gt; – Unload all currently loaded modules.&lt;br /&gt;
* &amp;lt;code&amp;gt;module avail&amp;lt;/code&amp;gt; – List available modules that can be loaded.&lt;br /&gt;
* &amp;lt;code&amp;gt;module list&amp;lt;/code&amp;gt; – Show currently loaded modules.&lt;br /&gt;
* &amp;lt;code&amp;gt;module spider&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;module spider &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt; – Search for available modules and their versions.&lt;br /&gt;
&lt;br /&gt;
Handy abbreviations are available:&lt;br /&gt;
&lt;br /&gt;
* &amp;lt;code&amp;gt;ml&amp;lt;/code&amp;gt; – Equivalent to &amp;lt;code&amp;gt;module list&amp;lt;/code&amp;gt;.&lt;br /&gt;
* &amp;lt;code&amp;gt;ml &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt; – Equivalent to &amp;lt;code&amp;gt;module load &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== Tips for Loading Software ==&lt;br /&gt;
&lt;br /&gt;
Properly managing your software environment is key to avoiding conflicts and ensuring reproducibility. Here are some best practices:&lt;br /&gt;
&lt;br /&gt;
* Avoid loading modules in your &amp;lt;code&amp;gt;.bashrc&amp;lt;/code&amp;gt; file. Doing so can cause unexpected behavior, particularly in non-interactive environments like batch jobs or remote shells. For more information, see our [[bashrc guidelines|.bashrc guidelines]].&lt;br /&gt;
&lt;br /&gt;
* Instead, load modules manually or from a separate script. This approach gives you more control and helps keep environments clean.&lt;br /&gt;
&lt;br /&gt;
* Load required modules inside your job submission script. This ensures that your job runs with the expected software environment, regardless of your interactive shell settings.&lt;br /&gt;
&lt;br /&gt;
* Be explicit about module versions. Short names like &amp;lt;code&amp;gt;gcc&amp;lt;/code&amp;gt; will load the system default (e.g., &amp;lt;code&amp;gt;gcc/12.3&amp;lt;/code&amp;gt;), which may change in the future. Specify full versions (e.g., &amp;lt;code&amp;gt;gcc/13.3&amp;lt;/code&amp;gt;) for long-term reproducibility.&lt;br /&gt;
&lt;br /&gt;
* Resolve dependencies with &amp;lt;code&amp;gt;module spider&amp;lt;/code&amp;gt;. Some modules depend on others. Use &amp;lt;code&amp;gt;module spider &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt; to discover which modules are required and how to load them in the correct order. For more, see [[Using_modules#Module_spider | Using &amp;lt;code&amp;gt;module spider&amp;lt;/code&amp;gt;]].&lt;br /&gt;
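&lt;br /&gt;
The tips above can be combined into a small job script, sketched below under a few assumptions: the job name, node count, and time limit are placeholders, gcc/13.3 is simply the version used as an example in the list above and may not be the one you need, and load_job_modules is a helper name chosen for this example. The check for the module command only serves to make the script fail gracefully when it is tried outside the cluster.&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/bash
#SBATCH --job-name=module-demo
#SBATCH --nodes=1
#SBATCH --time=00:15:00

# Load the software environment inside the job script itself, so the job
# does not depend on whatever is loaded in the interactive shell.
load_job_modules() {
    if command -v module >/dev/null; then
        module purge            # start from a clean set of modules
        module load gcc/13.3    # explicit version (example taken from the tips above)
        module list             # record the loaded modules in the job output
    else
        echo "module command not found; run this on a Trillium node"
    fi
}

load_job_modules
```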
&lt;br /&gt;
== Using Commercial Software ==&lt;br /&gt;
&lt;br /&gt;
You may be able to use commercial software on Trillium, but there are a few important considerations:&lt;br /&gt;
&lt;br /&gt;
* Bring your own license. You can use commercial software on Trillium if you have a valid license. If the software requires a license server, you can connect to it securely using [[SSH_Tunneling | SSH tunneling]].&lt;br /&gt;
&lt;br /&gt;
* SciNet and the {{Alliance}} do not provide user-specific licenses. Due to the large and diverse user base, we cannot provide licenses for individual or specialized commercial packages.&lt;br /&gt;
&lt;br /&gt;
* Freely available commercial tools. Some widely useful commercial tools are available system-wide, such as compilers, math libraries, and debuggers.&lt;br /&gt;
&lt;br /&gt;
* Software not available (unless you bring your own license): tools like [[MATLAB]], Gaussian, and IDL are not provided centrally. If you have your own license, you are welcome to install and use them.&lt;br /&gt;
&lt;br /&gt;
* Open-source alternatives are available. Consider using freely available tools such as [[Python]], [[R]], and Octave, which are well-supported and widely used on the system.&lt;br /&gt;
&lt;br /&gt;
* We're here to help. If you have a valid license and need help installing commercial software, feel free to contact us; we'll assist where possible.&lt;br /&gt;
&lt;br /&gt;
A list of commercial software currently installed on Trillium (for which you must supply a license to use) is available on the [[Commercial_software | Commercial Software page]].&lt;br /&gt;
&lt;br /&gt;
= Technical Details =&lt;br /&gt;
&lt;br /&gt;
== Cooling and Energy Efficiency ==&lt;br /&gt;
&lt;br /&gt;
Trillium is fully direct liquid cooled using warm water (35–40 °C input), resulting in:&lt;br /&gt;
&lt;br /&gt;
* PUE below 1.03 (high energy efficiency)&lt;br /&gt;
* Use of closed-loop dry fluid coolers, avoiding evaporative towers and new water usage&lt;br /&gt;
* Heat reuse: Trillium supplies excess heat to nearby facilities to minimize climate impact&lt;br /&gt;
&lt;br /&gt;
== Storage System ==&lt;br /&gt;
&lt;br /&gt;
The VAST high-performance file system consists of a unified 29 PB NVMe-backed storage pool, with:&lt;br /&gt;
&lt;br /&gt;
* 29 PB effective capacity (deduplicated via VAST)&lt;br /&gt;
* 16.7 PB raw flash capacity&lt;br /&gt;
* 714 GB/s read bandwidth, 275 GB/s write bandwidth&lt;br /&gt;
* 10 million read IOPS, 2 million write IOPS&lt;br /&gt;
* POSIX and S3 access protocols under a unified namespace&lt;br /&gt;
* 48 C-Boxes and 14 D-Boxes for data services&lt;br /&gt;
&lt;br /&gt;
== Backup and Archive Storage ==&lt;br /&gt;
&lt;br /&gt;
An additional 114 PB HPSS tape-based archive is available for nearline storage:&lt;br /&gt;
&lt;br /&gt;
* Dual-copy archive across geographically separate libraries&lt;br /&gt;
* Used for both backup and archival purposes&lt;br /&gt;
* Backups are managed using Atempo backup software&lt;br /&gt;
&lt;br /&gt;
= Testing and Debugging =&lt;br /&gt;
&lt;br /&gt;
Before submitting your job to the cluster, it's important to test your code to ensure correctness and determine the resources it requires.&lt;br /&gt;
&lt;br /&gt;
* '''Lightweight tests''' can be run directly on the login nodes. As a rule of thumb, these should:&lt;br /&gt;
** Run in under a few minutes  &lt;br /&gt;
** Use no more than 1–2 GB of memory  &lt;br /&gt;
** Use only 1–2 CPU cores&lt;br /&gt;
&lt;br /&gt;
* You can also run the [[Parallel Debugging with DDT|DDT]] debugger on the login nodes after loading it with: &amp;lt;code&amp;gt;module load ddt-cpu&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* For short tests that exceed login node limits or require dedicated resources, request an interactive debug job using the &amp;lt;code&amp;gt;debugjob&amp;lt;/code&amp;gt; command:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
tri-login01:~$ debugjob --clean N&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Replace &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt; with the number of nodes (1 to 4). If &amp;lt;code&amp;gt;N=1&amp;lt;/code&amp;gt;, you will get 1 hour of interactive time; with &amp;lt;code&amp;gt;N=4&amp;lt;/code&amp;gt; (the maximum), you will get 22 minutes.  &lt;br /&gt;
The &amp;lt;code&amp;gt;--clean&amp;lt;/code&amp;gt; flag is optional but recommended, as it starts the session with no modules loaded, better mimicking the clean environment of batch jobs.&lt;br /&gt;
&lt;br /&gt;
* If your test job requires more time than allowed by &amp;lt;code&amp;gt;debugjob&amp;lt;/code&amp;gt;, you can request an interactive session from the regular queue using &amp;lt;code&amp;gt;salloc&amp;lt;/code&amp;gt;:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
tri-login01:~$ salloc --nodes=N --time=M:00:00 --x11&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
* &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt; is the number of nodes  &lt;br /&gt;
* &amp;lt;code&amp;gt;M&amp;lt;/code&amp;gt; is the number of hours the job should run  &lt;br /&gt;
* &amp;lt;code&amp;gt;--x11&amp;lt;/code&amp;gt; is required for graphical applications (e.g., when using [[Parallel Debugging with DDT|DDT]] or DDD)&lt;br /&gt;
&lt;br /&gt;
'''Note:''' Jobs submitted with &amp;lt;code&amp;gt;salloc&amp;lt;/code&amp;gt; may take longer to start, as they are scheduled like any other batch job. See the [[Testing_With_Graphics|Testing with graphics]] page for more information on graphical testing options.&lt;br /&gt;
&lt;br /&gt;
= Submitting Jobs on the CPU Subcluster =&lt;br /&gt;
&lt;br /&gt;
Once you have compiled and tested your code or workflow on the Trillium login nodes and confirmed that it behaves correctly, you are ready to submit jobs to the cluster. These jobs will run on Trillium's compute nodes, and their execution is managed by the SLURM scheduler.&lt;br /&gt;
&lt;br /&gt;
More advanced details of how to interact with the scheduler can be found on the [[Slurm | Slurm page]].&lt;br /&gt;
&lt;br /&gt;
To submit a job, use the &amp;lt;code&amp;gt;sbatch&amp;lt;/code&amp;gt; command on a login node:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
tri-login01:scratch$ sbatch jobscript.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This places your job into the queue. It will begin execution on available compute nodes when scheduled. Note: jobs must be submitted from a login node; submitting from datamover nodes is not allowed.&lt;br /&gt;
&lt;br /&gt;
In most cases, you should submit jobs from your &amp;lt;code&amp;gt;$SCRATCH&amp;lt;/code&amp;gt; directory, not &amp;lt;code&amp;gt;$HOME&amp;lt;/code&amp;gt;, since the home directory is read-only on compute nodes. Output from your jobs must be written to &amp;lt;code&amp;gt;$SCRATCH&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Jobs will run under your group's RRG allocation, or, if one is not available, under a RAS allocation (previously called the “default” allocation).&lt;br /&gt;
&lt;br /&gt;
Some example job scripts are shown below.&lt;br /&gt;
&lt;br /&gt;
=== Key Points to Remember ===&lt;br /&gt;
&lt;br /&gt;
* Scheduling is by node, not by core or CPU.&lt;br /&gt;
* Each node has 192 cores and 768 GB of memory.&lt;br /&gt;
* Jobs are limited to a maximum of 24 hours walltime.&lt;br /&gt;
* Output must be written to &amp;lt;code&amp;gt;$SCRATCH&amp;lt;/code&amp;gt;. &amp;lt;code&amp;gt;$HOME&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;$PROJECT&amp;lt;/code&amp;gt; are read-only on compute nodes.&lt;br /&gt;
* Compute nodes have no internet access.&lt;br /&gt;
* Your job script must load all necessary modules explicitly using &amp;lt;code&amp;gt;module load&amp;lt;/code&amp;gt;.&lt;br /&gt;
* Ensure [[Data_Management#Moving_data|your input data is on Trillium]] before submitting jobs.&lt;br /&gt;
&lt;br /&gt;
== Scheduling by Node ==&lt;br /&gt;
&lt;br /&gt;
On many systems that use SLURM, the scheduler deduces which resources to allocate from the requested number of tasks and the number of CPUs per node. On Trillium, things are a bit different.&lt;br /&gt;
&lt;br /&gt;
* All job resource requests on Trillium are scheduled as a multiple of '''nodes'''.&lt;br /&gt;
* The nodes that your jobs run on are exclusively yours, for as long as the job is running on them:&lt;br /&gt;
** no other user can run jobs on them;&lt;br /&gt;
** you can [[SSH]] into your nodes during execution to monitor progress.&lt;br /&gt;
* Even if your job does not use all 192 cores, you still get the '''full''' node. Trillium does not share nodes between users.&lt;br /&gt;
* Memory requests are ignored. Your job receives &amp;lt;code&amp;gt;N × 768GB &amp;lt;/code&amp;gt; of RAM, where &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt; is the number of nodes and 768GB is the amount of memory on each node.&lt;br /&gt;
* If you run serial or low-core-count jobs, you must still use all 192 cores on the node by bundling multiple independent tasks in one job script. See [[Running_Serial_Jobs_on_Niagara|this page]] for examples.&lt;br /&gt;
* If your job underutilizes the cores, our support team may reach out to assist you in optimizing your workflow, or you can [mailto:support@scinet.utoronto.ca contact us] to get assistance.&lt;br /&gt;
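&lt;br /&gt;
One common way to bundle serial work is to start one background process per core and then wait for all of them. The script below is a minimal sketch of that pattern, not a tuned solution: the program &amp;lt;code&amp;gt;./serial_example&amp;lt;/code&amp;gt; and the input and output file names are hypothetical, and it assumes that all 192 runs take roughly the same time and together fit in the node's memory.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --ntasks-per-node=192&lt;br /&gt;
#SBATCH --time=01:00:00&lt;br /&gt;
#SBATCH --job-name=serial_bundle&lt;br /&gt;
#SBATCH --output=serial_bundle_%j.txt&lt;br /&gt;
&lt;br /&gt;
cd $SLURM_SUBMIT_DIR&lt;br /&gt;
&lt;br /&gt;
module load StdEnv/2023&lt;br /&gt;
module load gcc/12.3&lt;br /&gt;
&lt;br /&gt;
# Start 192 independent serial runs in the background, one per core.&lt;br /&gt;
# ./serial_example and the file names are hypothetical placeholders.&lt;br /&gt;
for i in $(seq 1 192); do&lt;br /&gt;
    ./serial_example input_$i.dat &amp;gt; output_$i.txt &amp;amp;&lt;br /&gt;
done&lt;br /&gt;
&lt;br /&gt;
# Keep the job alive until every background run has finished.&lt;br /&gt;
wait&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Without the final &amp;lt;code&amp;gt;wait&amp;lt;/code&amp;gt;, the script would reach its end immediately and the job would finish before the background runs complete. If the runs differ a lot in duration, a tool that hands out tasks dynamically is likely a better fit than this fixed loop.&lt;br /&gt;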
&lt;br /&gt;
== Limits ==&lt;br /&gt;
&lt;br /&gt;
There are limits to the size and duration of your jobs, the number of jobs you can run, and the number of jobs you can have queued. It matters whether a user is part of a group with a [https://www.alliancecan.ca/research-portal/accessing-resources/resource-allocation-competitions/ Resources for Research Group allocation] or not. It also matters in which &amp;quot;partition&amp;quot; the job runs. &amp;quot;Partitions&amp;quot; are SLURM-speak for use cases. You specify the partition with the &amp;lt;code&amp;gt;-p&amp;lt;/code&amp;gt; parameter to &amp;lt;code&amp;gt;sbatch&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;salloc&amp;lt;/code&amp;gt;, but if you do not specify one, your job will run in the &amp;lt;code&amp;gt;compute&amp;lt;/code&amp;gt; partition, which is the most common case.&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!Usage&lt;br /&gt;
!Partition&lt;br /&gt;
!Limit on Running jobs&lt;br /&gt;
!Limit on Submitted jobs (incl. running)&lt;br /&gt;
!Min. size of jobs&lt;br /&gt;
!Max. size of jobs&lt;br /&gt;
!Min. walltime&lt;br /&gt;
!Max. walltime &lt;br /&gt;
|-&lt;br /&gt;
|Compute jobs ||compute || 50 || 1000 || 1 node (192&amp;amp;nbsp;cores) || default:&amp;amp;nbsp;20&amp;amp;nbsp;nodes&amp;amp;nbsp;(3840&amp;amp;nbsp;cores) &amp;lt;br&amp;gt; with&amp;amp;nbsp;allocation:&amp;amp;nbsp;1000&amp;amp;nbsp;nodes&amp;amp;nbsp;(192000&amp;amp;nbsp;cores)|| 15 minutes || 24 hours&lt;br /&gt;
|-&lt;br /&gt;
|Testing or troubleshooting || debug || 1 || 1 || 1 node (192&amp;amp;nbsp;cores) || 4 nodes (768 cores)|| N/A || 1 hour&lt;br /&gt;
|-&lt;br /&gt;
|Archiving or retrieving data in [[HPSS]]|| archivelong || 2 per user (5 in total) || 10 per user || N/A || N/A|| 15 minutes || 72 hours&lt;br /&gt;
|-&lt;br /&gt;
|Inspecting archived data, small archival actions in [[HPSS]] || archiveshort vfsshort || 2 per user|| 10 per user || N/A || N/A || 15 minutes || 1 hour&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Even if you respect these limits, your jobs will still have to wait in the queue. The waiting time depends on many factors such as your group's allocation amount, how much allocation has been used in the recent past, the number of requested nodes and walltime, and how many other jobs are waiting in the queue.&lt;br /&gt;
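&lt;br /&gt;
As an illustration of choosing a partition within these limits, the commands below submit the same script once to the &amp;lt;code&amp;gt;debug&amp;lt;/code&amp;gt; partition and once to the default &amp;lt;code&amp;gt;compute&amp;lt;/code&amp;gt; partition. This is a sketch: &amp;lt;code&amp;gt;jobscript.sh&amp;lt;/code&amp;gt; stands for your own script, and the node counts and walltimes are arbitrary values chosen to fit the table above.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Short test in the debug partition (at most 4 nodes and 1 hour).&lt;br /&gt;
sbatch -p debug --nodes=1 --time=00:30:00 jobscript.sh&lt;br /&gt;
&lt;br /&gt;
# Production run in the default compute partition (no -p needed).&lt;br /&gt;
sbatch --nodes=2 --time=12:00:00 jobscript.sh&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Options given on the &amp;lt;code&amp;gt;sbatch&amp;lt;/code&amp;gt; command line take precedence over the matching &amp;lt;code&amp;gt;#SBATCH&amp;lt;/code&amp;gt; lines in the script, so the script itself does not need to be edited between the two submissions.&lt;br /&gt;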
&lt;br /&gt;
== Example submission script (MPI) ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=2&lt;br /&gt;
#SBATCH --ntasks-per-node=192&lt;br /&gt;
#SBATCH --time=01:00:00&lt;br /&gt;
#SBATCH --job-name=mpi_job&lt;br /&gt;
#SBATCH --output=mpi_output_%j.txt&lt;br /&gt;
#SBATCH --mail-type=FAIL&lt;br /&gt;
&lt;br /&gt;
cd $SLURM_SUBMIT_DIR&lt;br /&gt;
&lt;br /&gt;
module load StdEnv/2023&lt;br /&gt;
module load gcc/12.3&lt;br /&gt;
module load openmpi/4.1.5&lt;br /&gt;
&lt;br /&gt;
mpirun ./mpi_example&lt;br /&gt;
# or &amp;quot;srun ./mpi_example&amp;quot;&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Submit this script from your &amp;lt;code&amp;gt;$SCRATCH&amp;lt;/code&amp;gt; directory with the command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
tri-login01:scratch$ sbatch mpi_job.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;First line indicates that this is a Bash script.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Lines starting with &amp;lt;code&amp;gt;#SBATCH&amp;lt;/code&amp;gt; are directives for SLURM.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;code&amp;gt;sbatch&amp;lt;/code&amp;gt; reads these lines as a job request (which it gives the name &amp;lt;code&amp;gt;mpi_job&amp;lt;/code&amp;gt;).&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;In this case, SLURM looks for 2 nodes each running 192 tasks, for 1 hour.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Note that the &amp;lt;code&amp;gt;mpirun&amp;lt;/code&amp;gt; flag &amp;lt;code&amp;gt;--ppn&amp;lt;/code&amp;gt; (processors per node) is ignored.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Once it finds such nodes, it runs the script:&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Changes to the submission directory;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Loads the required modules;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Runs the &amp;lt;code&amp;gt;mpi_example&amp;lt;/code&amp;gt; application (SLURM will inform &amp;lt;code&amp;gt;mpirun&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;srun&amp;lt;/code&amp;gt; how many processes to run).&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Example submission script (OpenMP) ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --ntasks=1&lt;br /&gt;
#SBATCH --cpus-per-task=192&lt;br /&gt;
#SBATCH --time=01:00:00&lt;br /&gt;
#SBATCH --job-name=openmp_job&lt;br /&gt;
#SBATCH --output=openmp_output_%j.txt&lt;br /&gt;
#SBATCH --mail-type=FAIL&lt;br /&gt;
&lt;br /&gt;
cd $SLURM_SUBMIT_DIR&lt;br /&gt;
&lt;br /&gt;
module load StdEnv/2023&lt;br /&gt;
module load gcc/12.3&lt;br /&gt;
&lt;br /&gt;
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK&lt;br /&gt;
&lt;br /&gt;
./openmp_example&lt;br /&gt;
# or &amp;quot;srun ./openmp_example&amp;quot;&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Submit this script from your &amp;lt;code&amp;gt;$SCRATCH&amp;lt;/code&amp;gt; directory with the command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
tri-login01:scratch$ sbatch openmp_job.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;First line indicates that this is a Bash script.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Lines starting with &amp;lt;code&amp;gt;#SBATCH&amp;lt;/code&amp;gt; are directives for SLURM.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;code&amp;gt;sbatch&amp;lt;/code&amp;gt; reads these lines as a job request (which it gives the name &amp;lt;code&amp;gt;openmp_job&amp;lt;/code&amp;gt;).&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;In this case, SLURM looks for one node with 192 CPUs for a single task running up to 192 OpenMP threads, for 1 hour.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Once such a node is allocated, it runs the script:&lt;br /&gt;
  &amp;lt;ul&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;Changes to the submission directory;&amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;Loads the required modules;&amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;Sets &amp;lt;code&amp;gt;OMP_NUM_THREADS&amp;lt;/code&amp;gt; based on SLURM’s CPU allocation;&amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;Runs the &amp;lt;code&amp;gt;openmp_example&amp;lt;/code&amp;gt; application.&amp;lt;/li&amp;gt;&lt;br /&gt;
  &amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Monitoring Queued and Running Jobs ==&lt;br /&gt;
&lt;br /&gt;
Once your job is submitted to the queue, you can monitor its status and performance using the following SLURM commands:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;&amp;lt;code&amp;gt;squeue&amp;lt;/code&amp;gt; shows all jobs in the queue. Use &amp;lt;code&amp;gt;squeue -u $USER&amp;lt;/code&amp;gt; to view only your jobs.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;!-- &amp;lt;li&amp;gt;&amp;lt;p&amp;gt;&amp;lt;code&amp;gt;sqc&amp;lt;/code&amp;gt; is a SciNet-specific, faster version of &amp;lt;code&amp;gt;squeue&amp;lt;/code&amp;gt; that shows a cached snapshot of the queue.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt; --&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;&amp;lt;code&amp;gt;squeue -j JOBID&amp;lt;/code&amp;gt; shows the current status of a specific job. Alternatively, use &amp;lt;code&amp;gt;scontrol show job JOBID&amp;lt;/code&amp;gt; for detailed information, including allocated nodes, resources, and job flags.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;&amp;lt;code&amp;gt;squeue --start -j JOBID&amp;lt;/code&amp;gt; gives a rough estimate of when a pending job is expected to start. Note that this estimate is often inaccurate and can change depending on system load and priorities.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;&amp;lt;code&amp;gt;scancel JOBID&amp;lt;/code&amp;gt; cancels a job you submitted.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;&amp;lt;code&amp;gt;jobperf JOBID&amp;lt;/code&amp;gt; gives a live snapshot of the CPU and memory usage of your job while it is running.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;&amp;lt;code&amp;gt;sacct&amp;lt;/code&amp;gt; shows information about your past jobs, including start time, run time, node usage, and exit status.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
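&lt;br /&gt;
A typical monitoring session built from the commands above might look like the sketch below. The job ID &amp;lt;code&amp;gt;1234567&amp;lt;/code&amp;gt; is a made-up placeholder, and the list of &amp;lt;code&amp;gt;sacct&amp;lt;/code&amp;gt; columns is just one possible selection.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Placeholder job ID; use the number reported by sbatch.&lt;br /&gt;
JOBID=1234567&lt;br /&gt;
&lt;br /&gt;
# While the job is pending: check its state and estimated start time.&lt;br /&gt;
squeue -u $USER&lt;br /&gt;
squeue --start -j $JOBID&lt;br /&gt;
&lt;br /&gt;
# While the job is running: see allocated nodes and current usage.&lt;br /&gt;
scontrol show job $JOBID&lt;br /&gt;
jobperf $JOBID&lt;br /&gt;
&lt;br /&gt;
# After the job has ended: summarize run time and exit status.&lt;br /&gt;
sacct -j $JOBID --format=JobID,JobName,State,Elapsed,NNodes,ExitCode&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;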
&lt;br /&gt;
More details on monitoring jobs can be found on the [[Slurm#Monitoring_jobs | Slurm page]].&lt;br /&gt;
&lt;br /&gt;
You can also view and manage your current and past jobs, resource usage, and allocation history through the [https://my.scinet.utoronto.ca my.SciNet] portal.&lt;/div&gt;</summary>
		<author><name>Ymeiron</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Trillium_Quickstart&amp;diff=6836</id>
		<title>Trillium Quickstart</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Trillium_Quickstart&amp;diff=6836"/>
		<updated>2025-08-07T21:24:36Z</updated>

		<summary type="html">&lt;p&gt;Ymeiron: Added GPU login node address&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox Computer&lt;br /&gt;
|image=[[Image:Trillium.jpg|center|300px|thumb]]&lt;br /&gt;
|name=Trillium&lt;br /&gt;
|installed=Aug 2025&lt;br /&gt;
|operatingsystem= Rocky Linux 9.6&lt;br /&gt;
|loginnode= trillium.scinet.utoronto.ca, trillium-gpu.scinet.utoronto.ca&lt;br /&gt;
|nnodes=  1284 nodes (240768 cores)&lt;br /&gt;
|rampernode= 768 GB&lt;br /&gt;
|corespernode= 192 (CPU nodes) and 96 (GPU nodes)&lt;br /&gt;
|interconnect=Mellanox Dragonfly+&lt;br /&gt;
|queuetype=Slurm&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
__TOC__&lt;br /&gt;
&lt;br /&gt;
= System Overview =&lt;br /&gt;
&lt;br /&gt;
The Trillium system is a state-of-the-art high performance computing platform, consisting of three main components:&lt;br /&gt;
&lt;br /&gt;
1. CPU Subcluster&lt;br /&gt;
* ~240,000 cores across homogeneous CPU nodes  &lt;br /&gt;
* Non-blocking 400 Gb/s NDR InfiniBand interconnect  &lt;br /&gt;
* Ideal for large-scale parallel workloads  &lt;br /&gt;
&lt;br /&gt;
2. GPU Subcluster&lt;br /&gt;
* 61 GPU nodes, each with 4 x NVIDIA H100 (SXM) GPUs  &lt;br /&gt;
* 800 Gb/s bandwidth per node (200 Gb/s per GPU) over InfiniBand  &lt;br /&gt;
* Optimized for AI/ML and accelerated science workloads  &lt;br /&gt;
* Note: This subcluster is in high demand and not ideal for training extremely large models (multi-100B parameters)&lt;br /&gt;
&lt;br /&gt;
3. Storage System&lt;br /&gt;
* Unified 29 PB VAST NVMe storage for all workloads  &lt;br /&gt;
* No tiering — all flash-based for consistent performance  &lt;br /&gt;
* Accessible via POSIX or S3 under a unified namespace  &lt;br /&gt;
&lt;br /&gt;
== Specifications ==&lt;br /&gt;
&lt;br /&gt;
The Trillium cluster is composed of two types of nodes:&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot;&lt;br /&gt;
! nodes !! cores !! available memory !! CPU !! GPU&lt;br /&gt;
|-&lt;br /&gt;
| 1224 || 192 || 768GB DDR5 ||2 x AMD EPYC 9655 (Zen 5) @ 2.6 GHz, 384MB cache L3 ||&lt;br /&gt;
|-&lt;br /&gt;
|  60 || 96 || 768GB DDR5 || 1 x AMD EPYC 9654 (Zen 4) @ 2.4 GHz, 384MB cache L3 || 4 x NVIDIA H100 SXM (80 GB memory)&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Each node of the cluster has 768 GB of RAM. Designed for large parallel workloads, the cluster has a fast interconnect consisting of NDR InfiniBand in a Dragonfly+ topology with Adaptive Routing. The compute nodes are accessed through a queueing system that allows jobs with a minimum walltime of 15 minutes and a maximum of 24 hours.&lt;br /&gt;
&lt;br /&gt;
== Storage System ==&lt;br /&gt;
&lt;br /&gt;
Trillium features a unified high-performance storage system based on the VAST platform, with no tiering. It serves the following directories:&lt;br /&gt;
&lt;br /&gt;
* &amp;lt;code&amp;gt;/home&amp;lt;/code&amp;gt; – For personal files and configurations.&lt;br /&gt;
* &amp;lt;code&amp;gt;/scratch&amp;lt;/code&amp;gt; – High-speed, temporary storage for job data.&lt;br /&gt;
* &amp;lt;code&amp;gt;/project&amp;lt;/code&amp;gt; – Shared storage for project teams and collaborations.&lt;br /&gt;
&lt;br /&gt;
The storage is accessible via the NDR InfiniBand fabric for maximum performance across all workloads.&lt;br /&gt;
&lt;br /&gt;
= Getting started on Trillium =&lt;br /&gt;
&lt;br /&gt;
Access to Trillium is not enabled automatically for everyone with an account with the {{DigitalResearchAllianceOfCanada}}, but anyone with an active Alliance account can get their access enabled.&lt;br /&gt;
 &lt;br /&gt;
If you are new to SciNet or your supervisor/PI does not hold a current {{Alliance}} [https://alliancecan.ca/en/services/advanced-research-computing/research-portal/accessing-resources/resource-allocation-competitions RAC] allocation, you will need to request access on the [https://ccdb.alliancecan.ca/me/access_systems Access Systems] page on the CCDB site. After clicking the &amp;quot;I request access&amp;quot; button, it usually takes only one or two business days for access to be granted.&lt;br /&gt;
&lt;br /&gt;
You can check if you already have Trillium access by attempting to log in. If you receive a &amp;quot;Permission denied&amp;quot; error (and your SSH key is correctly set up), you may need to opt in.&lt;br /&gt;
&lt;br /&gt;
Please read this document carefully.  The [https://docs.scinet.utoronto.ca/index.php/FAQ FAQ] is also a useful resource.  If at any time you require assistance, or if something is unclear, please do not hesitate to [mailto:support@scinet.utoronto.ca contact us].&lt;br /&gt;
&lt;br /&gt;
== Logging in ==&lt;br /&gt;
&lt;br /&gt;
Trillium runs Rocky Linux 9.6, a Linux distribution. You will need to be familiar with Linux systems to work on Trillium. If you are not, it will be worth your time to review our [https://support.scinet.utoronto.ca/education/browse.php?category=-1&amp;amp;search=scmp101&amp;amp;include=all&amp;amp;filter=Filter Introduction to Linux Shell] class.&lt;br /&gt;
&lt;br /&gt;
As with all SciNet and {{Alliance}} compute systems, access to Trillium is done via [[SSH]] (secure shell) only and authentication is only allowed via SSH keys. [https://docs.alliancecan.ca/wiki/SSH_Keys Please refer to this page] to generate your SSH key pair and make sure you use them securely.&lt;br /&gt;
 &lt;br /&gt;
Open a terminal window (on Windows, for example, with [https://docs.alliancecan.ca/wiki/Connecting_with_PuTTY PuTTY] or [https://docs.alliancecan.ca/wiki/Connecting_with_MobaXTerm MobaXTerm]), then SSH into the Trillium login nodes with your {{Alliance}} credentials:&lt;br /&gt;
&lt;br /&gt;
 $ ssh -i /path/to/ssh_private_key -Y MYALLIANCEUSERNAME@trillium.scinet.utoronto.ca&lt;br /&gt;
&lt;br /&gt;
* The Trillium login nodes are where you develop, edit, compile, prepare and submit jobs.&lt;br /&gt;
* These login nodes are not part of the Trillium compute cluster, but have the same architecture, operating system, and software stack.&lt;br /&gt;
* The optional &amp;lt;code&amp;gt;-Y&amp;lt;/code&amp;gt; enables X11 forwarding, allowing graphical programs to open windows on your local computer.&lt;br /&gt;
* To run on Trillium compute nodes, you must [[#Submitting_jobs | submit a batch job]].&lt;br /&gt;
&lt;br /&gt;
If you cannot log in, be sure to first check the [https://docs.scinet.utoronto.ca System Status] on this site's front page.&lt;br /&gt;
&lt;br /&gt;
Note: We plan to add browser access to Trillium via Open OnDemand in the future. In the meantime you can still access our existing Open OnDemand deployment by following the instructions in our [https://docs.scinet.utoronto.ca/index.php/Open_OnDemand_Quickstart quickstart guide].&lt;br /&gt;
&lt;br /&gt;
== Software Environment ==&lt;br /&gt;
&lt;br /&gt;
Trillium uses the '''environment modules''' system to manage compilers, libraries, and other software packages. Modules dynamically modify your environment (e.g., &amp;lt;code&amp;gt;PATH&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;LD_LIBRARY_PATH&amp;lt;/code&amp;gt;) so you can access different versions of software without conflicts.&lt;br /&gt;
&lt;br /&gt;
A detailed explanation can be [[Using_modules | found on the modules page]].&lt;br /&gt;
&lt;br /&gt;
Commonly used module commands:&lt;br /&gt;
&lt;br /&gt;
* &amp;lt;code&amp;gt;module load &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt; – Load the default version of a software package.&lt;br /&gt;
* &amp;lt;code&amp;gt;module load &amp;lt;module-name&amp;gt;/&amp;lt;module-version&amp;gt;&amp;lt;/code&amp;gt; – Load a specific version.&lt;br /&gt;
* &amp;lt;code&amp;gt;module purge&amp;lt;/code&amp;gt; – Unload all currently loaded modules.&lt;br /&gt;
* &amp;lt;code&amp;gt;module avail&amp;lt;/code&amp;gt; – List available modules that can be loaded.&lt;br /&gt;
* &amp;lt;code&amp;gt;module list&amp;lt;/code&amp;gt; – Show currently loaded modules.&lt;br /&gt;
* &amp;lt;code&amp;gt;module spider&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;module spider &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt; – Search for available modules and their versions.&lt;br /&gt;
&lt;br /&gt;
Handy abbreviations are available:&lt;br /&gt;
&lt;br /&gt;
* &amp;lt;code&amp;gt;ml&amp;lt;/code&amp;gt; – Equivalent to &amp;lt;code&amp;gt;module list&amp;lt;/code&amp;gt;.&lt;br /&gt;
* &amp;lt;code&amp;gt;ml &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt; – Equivalent to &amp;lt;code&amp;gt;module load &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt;.&lt;br /&gt;
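&lt;br /&gt;
Put together, finding and loading a package with these commands could look like the following sketch. The package name and versions are illustrative assumptions; use &amp;lt;code&amp;gt;module spider&amp;lt;/code&amp;gt; to see what is actually installed.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# List the modules that can be loaded right now.&lt;br /&gt;
module avail&lt;br /&gt;
&lt;br /&gt;
# List every installed version of a package (openmpi is an example).&lt;br /&gt;
module spider openmpi&lt;br /&gt;
&lt;br /&gt;
# Ask which other modules a specific version needs first.&lt;br /&gt;
module spider openmpi/4.1.5&lt;br /&gt;
&lt;br /&gt;
# Load the prerequisites and the package, in that order.&lt;br /&gt;
module load StdEnv/2023 gcc/12.3 openmpi/4.1.5&lt;br /&gt;
&lt;br /&gt;
# Confirm what is loaded (same as &amp;quot;module list&amp;quot;).&lt;br /&gt;
ml&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;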
&lt;br /&gt;
== Tips for Loading Software ==&lt;br /&gt;
&lt;br /&gt;
Properly managing your software environment is key to avoiding conflicts and ensuring reproducibility. Here are some best practices:&lt;br /&gt;
&lt;br /&gt;
* Avoid loading modules in your &amp;lt;code&amp;gt;.bashrc&amp;lt;/code&amp;gt; file. Doing so can cause unexpected behavior, particularly in non-interactive environments like batch jobs or remote shells. For more information, see our [[bashrc guidelines|.bashrc guidelines]].&lt;br /&gt;
&lt;br /&gt;
* Instead, load modules manually or from a separate script. This approach gives you more control and helps keep environments clean.&lt;br /&gt;
&lt;br /&gt;
* Load required modules inside your job submission script. This ensures that your job runs with the expected software environment, regardless of your interactive shell settings.&lt;br /&gt;
&lt;br /&gt;
* Be explicit about module versions. Short names like &amp;lt;code&amp;gt;gcc&amp;lt;/code&amp;gt; will load the system default (e.g., &amp;lt;code&amp;gt;gcc/12.3&amp;lt;/code&amp;gt;), which may change in the future. Specify full versions (e.g., &amp;lt;code&amp;gt;gcc/13.3&amp;lt;/code&amp;gt;) for long-term reproducibility.&lt;br /&gt;
&lt;br /&gt;
* Resolve dependencies with &amp;lt;code&amp;gt;module spider&amp;lt;/code&amp;gt;. Some modules depend on others. Use &amp;lt;code&amp;gt;module spider &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt; to discover which modules are required and how to load them in the correct order. For more, see [[Using_modules#Module_spider | Using &amp;lt;code&amp;gt;module spider&amp;lt;/code&amp;gt;]].&lt;br /&gt;
&lt;br /&gt;
== Using Commercial Software ==&lt;br /&gt;
&lt;br /&gt;
You may be able to use commercial software on Trillium, but there are a few important considerations:&lt;br /&gt;
&lt;br /&gt;
* Bring your own license. You can use commercial software on Trillium if you have a valid license. If the software requires a license server, you can connect to it securely using [[SSH_Tunneling | SSH tunneling]].&lt;br /&gt;
&lt;br /&gt;
* SciNet and the {{Alliance}} do not provide user-specific licenses. Due to the large and diverse user base, we cannot provide licenses for individual or specialized commercial packages.&lt;br /&gt;
&lt;br /&gt;
* Freely available commercial tools. Some widely useful commercial tools are available system-wide, such as compilers, math libraries, and debuggers.&lt;br /&gt;
&lt;br /&gt;
* Software not available (unless you bring your own license): tools like [[MATLAB]], Gaussian, and IDL are not provided centrally. If you have your own license, you are welcome to install and use them.&lt;br /&gt;
&lt;br /&gt;
* Open-source alternatives are available. Consider using freely available tools such as [[Python]], [[R]], and Octave, which are well-supported and widely used on the system.&lt;br /&gt;
&lt;br /&gt;
* We're here to help. If you have a valid license and need help installing commercial software, feel free to contact us; we'll assist where possible.&lt;br /&gt;
&lt;br /&gt;
A list of commercial software currently installed on Trillium (for which you must supply a license to use) is available on the [[Commercial_software | Commercial Software page]].&lt;br /&gt;
&lt;br /&gt;
= Technical Details =&lt;br /&gt;
&lt;br /&gt;
== Cooling and Energy Efficiency ==&lt;br /&gt;
&lt;br /&gt;
Trillium is fully direct liquid cooled using warm water (35–40 °C input), resulting in:&lt;br /&gt;
&lt;br /&gt;
* PUE below 1.03 (high energy efficiency)&lt;br /&gt;
* Use of closed-loop dry fluid coolers, avoiding evaporative towers and new water usage&lt;br /&gt;
* Heat reuse: Trillium supplies excess heat to nearby facilities to minimize climate impact&lt;br /&gt;
&lt;br /&gt;
== Storage System ==&lt;br /&gt;
&lt;br /&gt;
The VAST high-performance file system consists of a unified 29 PB NVMe-backed storage pool, with:&lt;br /&gt;
&lt;br /&gt;
* 29 PB effective capacity (deduplicated via VAST)&lt;br /&gt;
* 16.7 PB raw flash capacity&lt;br /&gt;
* 714 GB/s read bandwidth, 275 GB/s write bandwidth&lt;br /&gt;
* 10 million read IOPS, 2 million write IOPS&lt;br /&gt;
* POSIX and S3 access protocols under a unified namespace&lt;br /&gt;
* 48 C-Boxes and 14 D-Boxes for data services&lt;br /&gt;
&lt;br /&gt;
== Backup and Archive Storage ==&lt;br /&gt;
&lt;br /&gt;
An additional 114 PB HPSS tape-based archive is available for nearline storage:&lt;br /&gt;
&lt;br /&gt;
* Dual-copy archive across geographically separate libraries&lt;br /&gt;
* Used for both backup and archival purposes&lt;br /&gt;
* Backups are managed using Atempo backup software&lt;br /&gt;
&lt;br /&gt;
= Testing and Debugging =&lt;br /&gt;
&lt;br /&gt;
Before submitting your job to the cluster, it's important to test your code to ensure correctness and determine the resources it requires.&lt;br /&gt;
&lt;br /&gt;
* '''Lightweight tests''' can be run directly on the login nodes. As a rule of thumb, these should:&lt;br /&gt;
** Run in under a few minutes  &lt;br /&gt;
** Use no more than 1–2 GB of memory  &lt;br /&gt;
** Use only 1–2 CPU cores&lt;br /&gt;
&lt;br /&gt;
* You can also run the [[Parallel Debugging with DDT|DDT]] debugger on the login nodes after loading it with: &amp;lt;code&amp;gt;module load ddt-cpu&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* For short tests that exceed login node limits or require dedicated resources, request an interactive debug job using the &amp;lt;code&amp;gt;debugjob&amp;lt;/code&amp;gt; command:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
tri-login01:~$ debugjob --clean N&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Replace &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt; with the number of nodes (1 to 4). If &amp;lt;code&amp;gt;N=1&amp;lt;/code&amp;gt;, you will get 1 hour of interactive time; with &amp;lt;code&amp;gt;N=4&amp;lt;/code&amp;gt; (the maximum), you will get 22 minutes.  &lt;br /&gt;
The &amp;lt;code&amp;gt;--clean&amp;lt;/code&amp;gt; flag is optional but recommended, as it starts the session with no modules loaded, better mimicking the clean environment of batch jobs.&lt;br /&gt;
&lt;br /&gt;
* If your test job requires more time than allowed by &amp;lt;code&amp;gt;debugjob&amp;lt;/code&amp;gt;, you can request an interactive session from the regular queue using &amp;lt;code&amp;gt;salloc&amp;lt;/code&amp;gt;:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
tri-login01:~$ salloc --nodes=N --time=M:00:00 --x11&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
* &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt; is the number of nodes  &lt;br /&gt;
* &amp;lt;code&amp;gt;M&amp;lt;/code&amp;gt; is the number of hours the job should run  &lt;br /&gt;
* &amp;lt;code&amp;gt;--x11&amp;lt;/code&amp;gt; is required for graphical applications (e.g., when using [[Parallel Debugging with DDT|DDT]] or DDD)&lt;br /&gt;
&lt;br /&gt;
'''Note:''' Jobs submitted with &amp;lt;code&amp;gt;salloc&amp;lt;/code&amp;gt; may take longer to start, as they are scheduled like any other batch job. See the [[Testing_With_Graphics|Testing with graphics]] page for more information on graphical testing options.&lt;br /&gt;
&lt;br /&gt;
= Submitting Jobs on the CPU Subcluster =&lt;br /&gt;
&lt;br /&gt;
Once you have compiled and tested your code or workflow on the Trillium login nodes and confirmed that it behaves correctly, you are ready to submit jobs to the cluster. These jobs will run on Trillium's compute nodes, and their execution is managed by the SLURM scheduler.&lt;br /&gt;
&lt;br /&gt;
More advanced details of how to interact with the scheduler can be found on the [[Slurm | Slurm page]].&lt;br /&gt;
&lt;br /&gt;
To submit a job, use the &amp;lt;code&amp;gt;sbatch&amp;lt;/code&amp;gt; command on a login node:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
tri-login01:scratch$ sbatch jobscript.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This places your job into the queue. It will begin execution on available compute nodes when scheduled. Note: jobs must be submitted from a login node; submitting from datamover nodes is not allowed.&lt;br /&gt;
&lt;br /&gt;
In most cases, you should submit jobs from your &amp;lt;code&amp;gt;$SCRATCH&amp;lt;/code&amp;gt; directory, not &amp;lt;code&amp;gt;$HOME&amp;lt;/code&amp;gt;, since the home directory is read-only on compute nodes. Output from your jobs must be written to &amp;lt;code&amp;gt;$SCRATCH&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Jobs will run under your group's RRG allocation, or, if one is not available, under a RAS allocation (previously called the “default” allocation).&lt;br /&gt;
&lt;br /&gt;
Some example job scripts are shown below.&lt;br /&gt;
&lt;br /&gt;
=== Key Points to Remember ===&lt;br /&gt;
&lt;br /&gt;
* Scheduling is by node, not by core or CPU.&lt;br /&gt;
* Each node has 192 cores and 768 GB of memory.&lt;br /&gt;
* Jobs are limited to a maximum of 24 hours walltime.&lt;br /&gt;
* Output must be written to &amp;lt;code&amp;gt;$SCRATCH&amp;lt;/code&amp;gt;. &amp;lt;code&amp;gt;$HOME&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;$PROJECT&amp;lt;/code&amp;gt; are read-only on compute nodes.&lt;br /&gt;
* Compute nodes have no internet access.&lt;br /&gt;
* Your job script must load all necessary modules explicitly using &amp;lt;code&amp;gt;module load&amp;lt;/code&amp;gt;.&lt;br /&gt;
* Ensure [[Data_Management#Moving_data|your input data is on Trillium]] before submitting jobs.&lt;br /&gt;
&lt;br /&gt;
== Scheduling by Node ==&lt;br /&gt;
&lt;br /&gt;
On many systems that use SLURM, the scheduler deduces which resources to allocate from the requested number of tasks and CPUs per task. On Trillium, things are a bit different.&lt;br /&gt;
&lt;br /&gt;
* All job resource requests on Trillium are scheduled as a multiple of '''nodes'''.&lt;br /&gt;
* The nodes that your jobs run on are exclusively yours, for as long as the job is running on them:&lt;br /&gt;
** no other user can run jobs on them;&lt;br /&gt;
** you can [[SSH]] into your nodes during execution to monitor progress.&lt;br /&gt;
* Even if your job does not use all 192 cores, you still get the '''full''' node. Trillium does not share nodes between users.&lt;br /&gt;
* Memory requests are ignored. Your job receives &amp;lt;code&amp;gt;N × 768GB &amp;lt;/code&amp;gt; of RAM, where &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt; is the number of nodes and 768GB is the amount of memory on each node.&lt;br /&gt;
* If you run serial or low-core jobs, you must still use all 192 cores on the node by bundling multiple independent tasks in one job script. See [[Running_Serial_Jobs_on_Niagara|this page]] for examples.&lt;br /&gt;
* If your job underutilizes the cores, our support team may reach out to assist you in optimizing your workflow, or you can [mailto:support@scinet.utoronto.ca contact us] to get assistance.&lt;br /&gt;
&lt;br /&gt;
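The bundling of independent serial tasks mentioned above can be sketched as follows. This is a minimal illustration that uses &amp;lt;code&amp;gt;xargs -P&amp;lt;/code&amp;gt;, which is a choice made for this example and not necessarily the method described on the linked page. The names &amp;lt;code&amp;gt;task&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;run_bundle&amp;lt;/code&amp;gt; are hypothetical, and the &amp;lt;code&amp;gt;echo&amp;lt;/code&amp;gt; command stands in for your own serial program.&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/bash
# Hypothetical stand-in for a serial program; "$1" is the case number.
task='echo "case $1 done"'

# Run cases 1..N with up to N of them running at the same time (one per core).
run_bundle() {
    seq 1 "$1" | xargs -P "$1" -n 1 sh -c "$task" _
}

# One node has 192 cores, so bundle 192 independent cases.
run_bundle 192
```
&lt;br /&gt;
In a real job script, the body of &amp;lt;code&amp;gt;task&amp;lt;/code&amp;gt; would call your own executable, and the cases finish in no particular order.&lt;br /&gt;
&lt;br /&gt;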
== Limits ==&lt;br /&gt;
&lt;br /&gt;
Limits apply to the size and duration of your jobs, the number of jobs you can run, and the number of jobs you can have queued. These limits depend on whether you are part of a group with a [https://www.alliancecan.ca/research-portal/accessing-resources/resource-allocation-competitions/ Resources for Research Groups (RRG) allocation] and on the &amp;quot;partition&amp;quot; in which the job runs. &amp;quot;Partitions&amp;quot; are SLURM-speak for use cases. You specify the partition with the &amp;lt;code&amp;gt;-p&amp;lt;/code&amp;gt; parameter to &amp;lt;code&amp;gt;sbatch&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;salloc&amp;lt;/code&amp;gt;. If you do not specify one, your job will run in the &amp;lt;code&amp;gt;compute&amp;lt;/code&amp;gt; partition, which is the most common case.&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!Usage&lt;br /&gt;
!Partition&lt;br /&gt;
!Limit on Running jobs&lt;br /&gt;
!Limit on Submitted jobs (incl. running)&lt;br /&gt;
!Min. size of jobs&lt;br /&gt;
!Max. size of jobs&lt;br /&gt;
!Min. walltime&lt;br /&gt;
!Max. walltime &lt;br /&gt;
|-&lt;br /&gt;
|Compute jobs ||compute || 50 || 1000 || 1 node (192&amp;amp;nbsp;cores) || default:&amp;amp;nbsp;20&amp;amp;nbsp;nodes&amp;amp;nbsp;(3840&amp;amp;nbsp;cores) &amp;lt;br&amp;gt; with&amp;amp;nbsp;allocation:&amp;amp;nbsp;1000&amp;amp;nbsp;nodes&amp;amp;nbsp;(192000&amp;amp;nbsp;cores)|| 15 minutes || 24 hours&lt;br /&gt;
|-&lt;br /&gt;
|Testing or troubleshooting || debug || 1 || 1 || 1 node (192&amp;amp;nbsp;cores) || 4 nodes (768 cores)|| N/A || 1 hour&lt;br /&gt;
|-&lt;br /&gt;
|Archiving or retrieving data in [[HPSS]]|| archivelong || 2 per user (5 in total) || 10 per user || N/A || N/A|| 15 minutes || 72 hours&lt;br /&gt;
|-&lt;br /&gt;
|Inspecting archived data, small archival actions in [[HPSS]] || archiveshort vfsshort || 2 per user|| 10 per user || N/A || N/A || 15 minutes || 1 hour&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Even if you respect these limits, your jobs will still have to wait in the queue. The waiting time depends on many factors such as your group's allocation amount, how much allocation has been used in the recent past, the number of requested nodes and walltime, and how many other jobs are waiting in the queue.&lt;br /&gt;
&lt;br /&gt;
== Example submission script (MPI) ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=2&lt;br /&gt;
#SBATCH --ntasks-per-node=192&lt;br /&gt;
#SBATCH --time=01:00:00&lt;br /&gt;
#SBATCH --job-name=mpi_job&lt;br /&gt;
#SBATCH --output=mpi_output_%j.txt&lt;br /&gt;
#SBATCH --mail-type=FAIL&lt;br /&gt;
&lt;br /&gt;
cd $SLURM_SUBMIT_DIR&lt;br /&gt;
&lt;br /&gt;
module load StdEnv/2023&lt;br /&gt;
module load gcc/12.3&lt;br /&gt;
module load openmpi/4.1.5&lt;br /&gt;
&lt;br /&gt;
mpirun ./mpi_example&lt;br /&gt;
# or &amp;quot;srun ./mpi_example&amp;quot;&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Submit this script from your &amp;lt;code&amp;gt;$SCRATCH&amp;lt;/code&amp;gt; directory with the command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
tri-login01:scratch$ sbatch mpi_job.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;The first line indicates that this is a Bash script.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Lines starting with &amp;lt;code&amp;gt;#SBATCH&amp;lt;/code&amp;gt; are directives for SLURM.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;code&amp;gt;sbatch&amp;lt;/code&amp;gt; reads these lines as a job request (which it gives the name &amp;lt;code&amp;gt;mpi_job&amp;lt;/code&amp;gt;).&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;In this case, SLURM looks for 2 nodes, each running 192 tasks, for 1 hour.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Note that the &amp;lt;code&amp;gt;mpirun&amp;lt;/code&amp;gt; flag &amp;lt;code&amp;gt;--ppn&amp;lt;/code&amp;gt; (processors per node) is ignored.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Once these nodes are allocated, it runs the script:&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Changes to the submission directory;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Loads the required modules;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Runs the &amp;lt;code&amp;gt;mpi_example&amp;lt;/code&amp;gt; application (SLURM will inform &amp;lt;code&amp;gt;mpirun&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;srun&amp;lt;/code&amp;gt; how many processes to run).&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
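As a small illustration of how the request above translates into a process count, the following sketch multiplies the number of nodes by the number of tasks per node. It assumes the standard SLURM environment variables &amp;lt;code&amp;gt;SLURM_JOB_NUM_NODES&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;SLURM_NTASKS_PER_NODE&amp;lt;/code&amp;gt; are set inside the job; the fallback values 2 and 192 are taken from the example script so that the snippet also runs outside a job.&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/bash
# Inside a job, SLURM sets these variables; the fallbacks mirror the script above.
nodes=${SLURM_JOB_NUM_NODES:-2}
tasks_per_node=${SLURM_NTASKS_PER_NODE:-192}

# Total number of MPI processes that mpirun or srun will start.
total_tasks=$((nodes * tasks_per_node))
echo "expecting $total_tasks MPI processes on $nodes nodes"
```
&lt;br /&gt;
Lines like these could be placed in a job script before the &amp;lt;code&amp;gt;mpirun&amp;lt;/code&amp;gt; call to record the job geometry in the output file.&lt;br /&gt;
&lt;br /&gt;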
== Example submission script (OpenMP) ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --ntasks=1&lt;br /&gt;
#SBATCH --cpus-per-task=192&lt;br /&gt;
#SBATCH --time=01:00:00&lt;br /&gt;
#SBATCH --job-name=openmp_job&lt;br /&gt;
#SBATCH --output=openmp_output_%j.txt&lt;br /&gt;
#SBATCH --mail-type=FAIL&lt;br /&gt;
&lt;br /&gt;
cd $SLURM_SUBMIT_DIR&lt;br /&gt;
&lt;br /&gt;
module load StdEnv/2023&lt;br /&gt;
module load gcc/12.3&lt;br /&gt;
&lt;br /&gt;
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK&lt;br /&gt;
&lt;br /&gt;
./openmp_example&lt;br /&gt;
# or &amp;quot;srun ./openmp_example&amp;quot;&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Submit this script from your &amp;lt;code&amp;gt;$SCRATCH&amp;lt;/code&amp;gt; directory with the command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
tri-login01:scratch$ sbatch openmp_job.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;First line indicates that this is a Bash script.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Lines starting with &amp;lt;code&amp;gt;#SBATCH&amp;lt;/code&amp;gt; are directives for SLURM.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;code&amp;gt;sbatch&amp;lt;/code&amp;gt; reads these lines as a job request (which it gives the name &amp;lt;code&amp;gt;openmp_job&amp;lt;/code&amp;gt;).&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;In this case, SLURM looks for one node with 192 CPUs for a single task running up to 192 OpenMP threads, for 1 hour.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Once such a node is allocated, it runs the script:&lt;br /&gt;
  &amp;lt;ul&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;Changes to the submission directory;&amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;Loads the required modules;&amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;Sets &amp;lt;code&amp;gt;OMP_NUM_THREADS&amp;lt;/code&amp;gt; based on SLURM’s CPU allocation;&amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;Runs the &amp;lt;code&amp;gt;openmp_example&amp;lt;/code&amp;gt; application.&amp;lt;/li&amp;gt;&lt;br /&gt;
  &amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Monitoring Queued and Running Jobs ==&lt;br /&gt;
&lt;br /&gt;
Once your job is submitted to the queue, you can monitor its status and performance using the following SLURM commands:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;&amp;lt;code&amp;gt;squeue&amp;lt;/code&amp;gt; shows all jobs in the queue. Use &amp;lt;code&amp;gt;squeue -u $USER&amp;lt;/code&amp;gt; to view only your jobs.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;!-- &amp;lt;li&amp;gt;&amp;lt;p&amp;gt;&amp;lt;code&amp;gt;sqc&amp;lt;/code&amp;gt; is a SciNet-specific, faster version of &amp;lt;code&amp;gt;squeue&amp;lt;/code&amp;gt; that shows a cached snapshot of the queue.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt; --&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;&amp;lt;code&amp;gt;squeue -j JOBID&amp;lt;/code&amp;gt; shows the current status of a specific job. Alternatively, use &amp;lt;code&amp;gt;scontrol show job JOBID&amp;lt;/code&amp;gt; for detailed information, including allocated nodes, resources, and job flags.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;&amp;lt;code&amp;gt;squeue --start -j JOBID&amp;lt;/code&amp;gt; gives a rough estimate of when a pending job is expected to start. Note that this estimate is often inaccurate and can change depending on system load and priorities.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;&amp;lt;code&amp;gt;scancel JOBID&amp;lt;/code&amp;gt; cancels a job you submitted.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;&amp;lt;code&amp;gt;jobperf JOBID&amp;lt;/code&amp;gt; gives a live snapshot of the CPU and memory usage of your job while it is running.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;&amp;lt;code&amp;gt;sacct&amp;lt;/code&amp;gt; shows information about your past jobs, including start time, run time, node usage, and exit status.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
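The commands above can be combined in small helper functions. The sketch below polls &amp;lt;code&amp;gt;squeue&amp;lt;/code&amp;gt; until a job is no longer listed, using the standard options &amp;lt;code&amp;gt;-h&amp;lt;/code&amp;gt; (no header) and &amp;lt;code&amp;gt;-o %T&amp;lt;/code&amp;gt; (print the job state). The function names &amp;lt;code&amp;gt;job_state&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;wait_for_job&amp;lt;/code&amp;gt; and the variable &amp;lt;code&amp;gt;POLL_SECONDS&amp;lt;/code&amp;gt; are hypothetical and exist only for this example.&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/bash
# Print the state of a job (for example PENDING or RUNNING).
# The output is empty once the job is no longer listed by squeue.
job_state() {
    squeue -h -j "$1" -o %T
}

# Hypothetical helper: block until the job has left the queue.
# POLL_SECONDS is an assumed knob; keep it large to avoid loading the scheduler.
wait_for_job() {
    while [ -n "$(job_state "$1")" ]; do
        sleep "${POLL_SECONDS:-60}"
    done
    echo "job $1 has left the queue"
}
```
&lt;br /&gt;
After defining these functions in your shell, &amp;lt;code&amp;gt;wait_for_job JOBID&amp;lt;/code&amp;gt; returns once the job has finished or was cancelled; &amp;lt;code&amp;gt;sacct&amp;lt;/code&amp;gt; can then be used to inspect its exit status.&lt;br /&gt;
&lt;br /&gt;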
More details on monitoring jobs can be found on the [[Slurm#Monitoring_jobs | Slurm page]].&lt;br /&gt;
&lt;br /&gt;
You can also view and manage your current and past jobs, resource usage, and allocation history through the [https://my.scinet.utoronto.ca my.SciNet] portal.&lt;/div&gt;</summary>
		<author><name>Ymeiron</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=6830</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=6830"/>
		<updated>2025-08-07T21:16:18Z</updated>

		<summary type="html">&lt;p&gt;Ymeiron: Trillium is up!!!&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
{| style=&amp;quot;border-spacing:10px; width: 95%&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:1em; padding-top:.1em; border:2px solid #0645ad; background-color:#f6f6f6; border-radius:7px&amp;quot;|&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Use &amp;quot;Up&amp;quot;, &amp;quot;Partial&amp;quot; or &amp;quot;Down&amp;quot;; these are templates. --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;width:100%&amp;quot; &lt;br /&gt;
|{{Up   | Trillium|Trillium_Quickstart}}&lt;br /&gt;
|{{Partial | Niagara|Niagara_Quickstart}}&lt;br /&gt;
|{{Up   | Teach|Teach}}&lt;br /&gt;
|{{Up | Rouge|Rouge}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up | OnDemand|Open_OnDemand_Quickstart}}&lt;br /&gt;
|{{Up   | Scheduler|Niagara_Quickstart#Submitting_jobs}}&lt;br /&gt;
|{{Up   | File system|Niagara_Quickstart#Storage_and_quotas}}&lt;br /&gt;
|{{Up   | Burst Buffer|Burst_Buffer}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up   | HPSS|HPSS}}&lt;br /&gt;
|{{Up   | Login Nodes|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up   | External Network|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up   | Globus |Globus}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up   | Balam|Balam}}&lt;br /&gt;
|{{Up   | Cvmfs|Using_modules}}&lt;br /&gt;
|{{Partial | Mist|Mist}}&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
'''August 7, 2025:''' CVMFS issues are resolved.&lt;br /&gt;
&lt;br /&gt;
'''August 6, 2025:''' We are seeing intermittent issues with the software on CVMFS on Niagara. We're investigating.&lt;br /&gt;
&lt;br /&gt;
'''July 31, 2025, 4:00 PM EDT - 5:00 PM EDT:''' As announced, all systems connected to the Niagara file system (Mist, Niagara, HPSS, Balam, and Rouge) will be paused and inaccessible for one hour to start the transfer of files from the Niagara file system to the Trillium file system. &lt;br /&gt;
&lt;br /&gt;
'''January 6, 2025:''' As part of the installation of the new computing cluster Trillium, there is now a permanent reduction in computing capacity of Niagara to 50% and of Mist to 35%.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  When removing system status entries, please archive them to: --&amp;gt;&lt;br /&gt;
[[Previous messages]]&lt;br /&gt;
&lt;br /&gt;
{|style=&amp;quot;border-spacing: 10px;width: 100%&amp;quot;&lt;br /&gt;
|valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== QuickStart Guides ==&lt;br /&gt;
* [[Niagara Quickstart]]&lt;br /&gt;
* [[HPSS | HPSS archival storage]]&lt;br /&gt;
* [[Mist| Mist Power 9 GPU cluster]]&lt;br /&gt;
* [[Teach|Teach cluster]]&lt;br /&gt;
* [[FAQ | FAQ (frequently asked questions)]]&lt;br /&gt;
* [[Acknowledging SciNet]]&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== Tutorials, Manuals, etc. ==&lt;br /&gt;
* [https://education.scinet.utoronto.ca SciNet education material]&lt;br /&gt;
* [https://www.youtube.com/c/SciNetHPCattheUniversityofToronto SciNet's YouTube channel]&lt;br /&gt;
* [[Modules specific to Niagara|Software Modules specific to Niagara]] &lt;br /&gt;
* [[Modules for Mist]] &lt;br /&gt;
* [[Commercial software]]&lt;br /&gt;
* [[Burst Buffer]]&lt;br /&gt;
* [[SSH#SSH Keys|SSH keys]]&lt;br /&gt;
* [[SSH Tunneling]]&lt;br /&gt;
* [[Visualization]]&lt;br /&gt;
* [[Running Serial Jobs on Niagara]]&lt;br /&gt;
* [[Jupyter Hub]]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Ymeiron</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Trillium_Quickstart&amp;diff=6764</id>
		<title>Trillium Quickstart</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Trillium_Quickstart&amp;diff=6764"/>
		<updated>2025-08-05T20:14:19Z</updated>

		<summary type="html">&lt;p&gt;Ymeiron: Changed the picture&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox Computer&lt;br /&gt;
|image=[[Image:Trillium.jpg|center|300px|thumb]]&lt;br /&gt;
|name=Trillium&lt;br /&gt;
|installed=Aug 2025&lt;br /&gt;
|operatingsystem= Rocky Linux 9.6&lt;br /&gt;
|loginnode= trillium.scinet.utoronto.ca&lt;br /&gt;
|nnodes=  1284 nodes (240768 cores)&lt;br /&gt;
|rampernode= 768 GB&lt;br /&gt;
|corespernode= 192 (CPU nodes) and 96 (GPU nodes)&lt;br /&gt;
|interconnect=Mellanox Dragonfly+&lt;br /&gt;
|queuetype=Slurm&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
__TOC__&lt;br /&gt;
&lt;br /&gt;
= System Overview =&lt;br /&gt;
&lt;br /&gt;
The Trillium system is a state-of-the-art high performance computing platform, consisting of three main components:&lt;br /&gt;
&lt;br /&gt;
1. CPU Partition&lt;br /&gt;
* ~240,000 cores across homogeneous CPU nodes  &lt;br /&gt;
* Non-blocking 400 Gb/s NDR InfiniBand interconnect  &lt;br /&gt;
* Ideal for large-scale parallel workloads  &lt;br /&gt;
&lt;br /&gt;
2. GPU Partition&lt;br /&gt;
* 61 GPU nodes, each with 4 x NVIDIA H100 (SXM) GPUs  &lt;br /&gt;
* 800 Gb/s bandwidth per node (200 Gb/s per GPU) over InfiniBand  &lt;br /&gt;
* Optimized for AI/ML and accelerated science workloads  &lt;br /&gt;
* Note: This partition is in high demand and not ideal for training extremely large models (multi-100B parameters)&lt;br /&gt;
&lt;br /&gt;
3. Storage System&lt;br /&gt;
* Unified 29 PB VAST NVMe storage for all workloads  &lt;br /&gt;
* No tiering — all flash-based for consistent performance  &lt;br /&gt;
* Accessible via POSIX or S3 under a unified namespace  &lt;br /&gt;
&lt;br /&gt;
== Cooling and Energy Efficiency ==&lt;br /&gt;
&lt;br /&gt;
Trillium is fully direct liquid cooled using warm water (35–40 °C input), resulting in:&lt;br /&gt;
&lt;br /&gt;
* PUE below 1.03 (high energy efficiency)&lt;br /&gt;
* Use of closed-loop dry fluid coolers, avoiding evaporative towers and new water usage&lt;br /&gt;
* Heat reuse: Trillium supplies excess heat to nearby facilities to minimize climate impact&lt;br /&gt;
&lt;br /&gt;
== Specifications ==&lt;br /&gt;
&lt;br /&gt;
The Trillium cluster is a large cluster composed of two types of nodes:&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot;&lt;br /&gt;
! nodes !! cores !! available memory !! CPU !! GPU&lt;br /&gt;
|-&lt;br /&gt;
| 1224 || 192 || 768GB DDR5 ||2 x AMD EPYC 9655 (Zen 5) @ 2.6 GHz, 384MB cache L3 ||&lt;br /&gt;
|-&lt;br /&gt;
|  60 || 96 || 768GB DDR5 || 1 x AMD EPYC 9654 (Zen 4) @ 2.4 GHz, 384MB cache L3 || 4 x NVIDIA H100 SXM (80 GB memory)&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Each node of the cluster has 768 GB of RAM.  Designed for large parallel workloads, the cluster has a fast interconnect consisting of NDR InfiniBand in a Dragonfly+ topology with Adaptive Routing. The compute nodes are accessed through a queueing system that allows jobs with a minimum walltime of 15 minutes and a maximum of 24 hours.&lt;br /&gt;
&lt;br /&gt;
== Storage System ==&lt;br /&gt;
&lt;br /&gt;
Trillium features a unified high-performance storage system based on the VAST platform, with no tiering. It serves the following directories:&lt;br /&gt;
&lt;br /&gt;
* &amp;lt;code&amp;gt;/home&amp;lt;/code&amp;gt; – For personal files and configurations.&lt;br /&gt;
* &amp;lt;code&amp;gt;/scratch&amp;lt;/code&amp;gt; – High-speed, temporary storage for job data.&lt;br /&gt;
* &amp;lt;code&amp;gt;/project&amp;lt;/code&amp;gt; – Shared storage for project teams and collaborations.&lt;br /&gt;
&lt;br /&gt;
All three share a unified 29 PB NVMe-backed storage pool, with:&lt;br /&gt;
&lt;br /&gt;
* 29 PB effective capacity (deduplicated via VAST)&lt;br /&gt;
* 16.7 PB raw flash capacity&lt;br /&gt;
* 714 GB/s read bandwidth, 275 GB/s write bandwidth&lt;br /&gt;
* 10 million read IOPS, 2 million write IOPS&lt;br /&gt;
* POSIX and S3 access protocols under a unified namespace&lt;br /&gt;
* 48 C-Boxes and 14 D-Boxes for data services&lt;br /&gt;
&lt;br /&gt;
The storage is accessible via the NDR InfiniBand fabric for maximum performance across all workloads.&lt;br /&gt;
&lt;br /&gt;
== Backup and Archive Storage ==&lt;br /&gt;
&lt;br /&gt;
An additional 114 PB HPSS tape-based archive is available for nearline storage:&lt;br /&gt;
&lt;br /&gt;
* Dual-copy archive across geographically separate libraries&lt;br /&gt;
* Used for both backup and archival purposes&lt;br /&gt;
* Backups are managed using Atempo backup software&lt;br /&gt;
&lt;br /&gt;
= Getting started on Trillium =&lt;br /&gt;
&lt;br /&gt;
Access to Trillium is not enabled automatically for everyone with an account with the {{DigitalResearchAllianceOfCanada}}, but anyone with an active Alliance account can get their access enabled.&lt;br /&gt;
 &lt;br /&gt;
If you are new to SciNet or your Supervisor/PI does not hold a current {{Alliance}} [https://alliancecan.ca/en/services/advanced-research-computing/research-portal/accessing-resources/resource-allocation-competitions RAC] allocation, you will need to request access on the [https://ccdb.alliancecan.ca/me/access_systems Access Systems] page on the CCDB site. After clicking the &amp;quot;I request access&amp;quot; button, it usually takes only one or two business days for access to be granted.&lt;br /&gt;
&lt;br /&gt;
You can check if you already have Trillium access by attempting to log in. If you receive a &amp;quot;Permission denied&amp;quot; error (and your SSH key is correctly set up), you may need to opt in.&lt;br /&gt;
&lt;br /&gt;
Please read this document carefully.  The [https://docs.scinet.utoronto.ca/index.php/FAQ FAQ] is also a useful resource.  If at any time you require assistance, or if something is unclear, please do not hesitate to [mailto:support@scinet.utoronto.ca contact us].&lt;br /&gt;
&lt;br /&gt;
== Logging in ==&lt;br /&gt;
&lt;br /&gt;
Trillium runs Rocky Linux 9.6, a Linux distribution.  You will need to be familiar with Linux systems to work on Trillium.  If you are not, it will be worth your time to review our [https://support.scinet.utoronto.ca/education/browse.php?category=-1&amp;amp;search=scmp101&amp;amp;include=all&amp;amp;filter=Filter Introduction to Linux Shell] class.&lt;br /&gt;
&lt;br /&gt;
As with all SciNet and {{Alliance}} compute systems, access to Trillium is done via [[SSH]] (secure shell) only and authentication is only allowed via SSH keys. [https://docs.alliancecan.ca/wiki/SSH_Keys Please refer to this page] to generate your SSH key pair and make sure you use them securely.&lt;br /&gt;
 &lt;br /&gt;
Open a terminal window (on Windows, for example with [https://docs.alliancecan.ca/wiki/Connecting_with_PuTTY PuTTY] or [https://docs.alliancecan.ca/wiki/Connecting_with_MobaXTerm MobaXTerm]), then SSH into the Trillium login nodes with your {{Alliance}} credentials:&lt;br /&gt;
&lt;br /&gt;
 $ ssh -i /path/to/ssh_private_key -Y MYALLIANCEUSERNAME@trillium.scinet.utoronto.ca&lt;br /&gt;
&lt;br /&gt;
* The Trillium login nodes are where you develop, edit, compile, prepare and submit jobs.&lt;br /&gt;
* These login nodes are not part of the Trillium compute cluster, but have the same architecture, operating system, and software stack.&lt;br /&gt;
* The optional &amp;lt;code&amp;gt;-Y&amp;lt;/code&amp;gt; enables X11 forwarding, allowing graphical programs to open windows on your local computer.&lt;br /&gt;
* To run on Trillium compute nodes, you must [[#Submitting_jobs | submit a batch job]].&lt;br /&gt;
&lt;br /&gt;
If you cannot log in, be sure to first check the [https://docs.scinet.utoronto.ca System Status] on this site's front page.&lt;br /&gt;
&lt;br /&gt;
== Software Environment ==&lt;br /&gt;
&lt;br /&gt;
Trillium uses the '''environment modules''' system to manage compilers, libraries, and other software packages. Modules dynamically modify your environment (e.g., &amp;lt;code&amp;gt;PATH&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;LD_LIBRARY_PATH&amp;lt;/code&amp;gt;) so you can access different versions of software without conflicts.&lt;br /&gt;
&lt;br /&gt;
A detailed explanation can be [[Using_modules | found on the modules page]].&lt;br /&gt;
&lt;br /&gt;
Commonly used module commands:&lt;br /&gt;
&lt;br /&gt;
* &amp;lt;code&amp;gt;module load &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt; – Load the default version of a software package.&lt;br /&gt;
* &amp;lt;code&amp;gt;module load &amp;lt;module-name&amp;gt;/&amp;lt;module-version&amp;gt;&amp;lt;/code&amp;gt; – Load a specific version.&lt;br /&gt;
* &amp;lt;code&amp;gt;module purge&amp;lt;/code&amp;gt; – Unload all currently loaded modules.&lt;br /&gt;
* &amp;lt;code&amp;gt;module avail&amp;lt;/code&amp;gt; – List available modules that can be loaded.&lt;br /&gt;
* &amp;lt;code&amp;gt;module list&amp;lt;/code&amp;gt; – Show currently loaded modules.&lt;br /&gt;
* &amp;lt;code&amp;gt;module spider&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;module spider &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt; – Search for available modules and their versions.&lt;br /&gt;
&lt;br /&gt;
Handy abbreviations are available:&lt;br /&gt;
&lt;br /&gt;
* &amp;lt;code&amp;gt;ml&amp;lt;/code&amp;gt; – Equivalent to &amp;lt;code&amp;gt;module list&amp;lt;/code&amp;gt;.&lt;br /&gt;
* &amp;lt;code&amp;gt;ml &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt; – Equivalent to &amp;lt;code&amp;gt;module load &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== Tips for Loading Software ==&lt;br /&gt;
&lt;br /&gt;
Properly managing your software environment is key to avoiding conflicts and ensuring reproducibility. Here are some best practices:&lt;br /&gt;
&lt;br /&gt;
* Avoid loading modules in your &amp;lt;code&amp;gt;.bashrc&amp;lt;/code&amp;gt; file. Doing so can cause unexpected behavior, particularly in non-interactive environments like batch jobs or remote shells. For more information, see our [[bashrc guidelines|.bashrc guidelines]].&lt;br /&gt;
&lt;br /&gt;
* Instead, load modules manually or from a separate script. This approach gives you more control and helps keep environments clean.&lt;br /&gt;
&lt;br /&gt;
* Load required modules inside your job submission script. This ensures that your job runs with the expected software environment, regardless of your interactive shell settings.&lt;br /&gt;
&lt;br /&gt;
* Be explicit about module versions. Short names like &amp;lt;code&amp;gt;gcc&amp;lt;/code&amp;gt; will load the system default (e.g., &amp;lt;code&amp;gt;gcc/12.3&amp;lt;/code&amp;gt;), which may change in the future. Specify full versions (e.g., &amp;lt;code&amp;gt;gcc/13.3&amp;lt;/code&amp;gt;) for long-term reproducibility.&lt;br /&gt;
&lt;br /&gt;
* Resolve dependencies with &amp;lt;code&amp;gt;module spider&amp;lt;/code&amp;gt;. Some modules depend on others. Use &amp;lt;code&amp;gt;module spider &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt; to discover which modules are required and how to load them in the correct order. For more, see [[Using_modules#Module_spider | Using &amp;lt;code&amp;gt;module spider&amp;lt;/code&amp;gt;]].&lt;br /&gt;
&lt;br /&gt;
== Using Commercial Software ==&lt;br /&gt;
&lt;br /&gt;
You may be able to use commercial software on Trillium, but there are a few important considerations:&lt;br /&gt;
&lt;br /&gt;
* Bring your own license. You can use commercial software on Trillium if you have a valid license. If the software requires a license server, you can connect to it securely using [[SSH_Tunneling | SSH tunneling]].&lt;br /&gt;
&lt;br /&gt;
* SciNet and the {{Alliance}} do not provide user-specific licenses. Due to the large and diverse user base, we cannot provide licenses for individual or specialized commercial packages.&lt;br /&gt;
&lt;br /&gt;
* Freely available commercial tools. Some widely useful commercial tools are available system-wide, such as compilers, math libraries, and debuggers.&lt;br /&gt;
&lt;br /&gt;
* Software not available (unless you bring your own license): tools like [[MATLAB]], Gaussian, and IDL are not provided centrally. If you have your own license, you are welcome to install and use them.&lt;br /&gt;
&lt;br /&gt;
* Open-source alternatives are available. Consider using freely available tools such as [[Python]], [[R]], and Octave, which are well-supported and widely used on the system.&lt;br /&gt;
&lt;br /&gt;
* We're here to help. If you have a valid license and need help installing commercial software, feel free to contact us; we'll assist where possible.&lt;br /&gt;
&lt;br /&gt;
A list of commercial software currently installed on Trillium (for which you must supply a license to use) is available on the [[Commercial_software | Commercial Software page]].&lt;/div&gt;</summary>
		<author><name>Ymeiron</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=File:Trillium.jpg&amp;diff=6761</id>
		<title>File:Trillium.jpg</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=File:Trillium.jpg&amp;diff=6761"/>
		<updated>2025-08-05T20:13:56Z</updated>

		<summary type="html">&lt;p&gt;Ymeiron: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;/div&gt;</summary>
		<author><name>Ymeiron</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Mist&amp;diff=5693</id>
		<title>Mist</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Mist&amp;diff=5693"/>
		<updated>2024-06-10T18:11:22Z</updated>

		<summary type="html">&lt;p&gt;Ymeiron: Updated Ray&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox Computer&lt;br /&gt;
|image=[[Image:Mist.jpg|center|300px|thumb]]&lt;br /&gt;
|name=Mist&lt;br /&gt;
|installed=Dec 2019&lt;br /&gt;
|operatingsystem= Red Hat Enterprise Linux 8.2&lt;br /&gt;
|loginnode= mist.scinet.utoronto.ca&lt;br /&gt;
|nnodes=  54 IBM AC922&lt;br /&gt;
|rampernode= 256 GB  &lt;br /&gt;
|gpuspernode=4 V100-SXM2-32GB&lt;br /&gt;
|interconnect=Mellanox EDR&lt;br /&gt;
|vendorcompilers= NVCC, IBM XL&lt;br /&gt;
|queuetype=Slurm&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
=Specifications=&lt;br /&gt;
Mist is a SciNet-[[#SOSCIP Users |SOSCIP]] joint GPU cluster consisting of 54 IBM AC922 servers. Each node has 32 IBM Power9 cores, 256 GB of RAM, and 4 NVIDIA V100-SXM2-32GB GPUs connected to each other by NVLink. The cluster uses an InfiniBand EDR interconnect, which provides GPUDirect RDMA capability.&lt;br /&gt;
&lt;br /&gt;
'''&amp;lt;span style=&amp;quot;background:#fc8383&amp;quot;&amp;gt;Important note:&amp;lt;/span&amp;gt;''' the majority of computer systems as of 2021 (laptops, desktops, and HPC) use the 64-bit x86 instruction set architecture (ISA) in their microprocessors, which are produced by Intel and AMD. This ISA is incompatible with Mist, whose hardware uses the 64-bit PPC ISA (set to little-endian mode). In practice, this means that x86-compiled binaries (executables and libraries) cannot be installed on Mist. For this reason, the Niagara and {{Alliance}} software stacks (modules) cannot be made available on Mist, and closed-source software can only be used when the vendor provides a compatible version of the application. '''Python applications''' almost always rely on bindings to libraries originally written in C or C++, and some of these are not available on PyPI or the various Conda channels as precompiled binaries compatible with Mist. The recommended way to use Python on Mist is to create a [[#Anaconda (Python)|Conda]] environment and install packages from the anaconda (default) channel, where most popular packages have a linux-ppc64le (Mist-compatible) version available. Some popular machine learning packages should be installed from the internal [[#Open-CE|Open-CE]] channel. Where a compatible Conda package cannot be found, installing from PyPI (&amp;lt;code&amp;gt;pip install&amp;lt;/code&amp;gt;) can be attempted. Pip will attempt to compile the package’s source code if no compatible precompiled wheel is available, so a compiler module (such as &amp;lt;code&amp;gt;gcc/.core&amp;lt;/code&amp;gt;) should be loaded in advance. Some packages require tweaking of the source code or build procedure to compile successfully on Mist; please contact [[#Support|support]] if you need assistance.&lt;br /&gt;
&lt;br /&gt;
= Getting started on Mist =&lt;br /&gt;
As of January 22, 2022, authentication is only allowed via SSH keys. [https://docs.computecanada.ca/wiki/SSH_Keys Please refer to this page] to generate your SSH key pair and make sure you use it securely.&lt;br /&gt;
&lt;br /&gt;
Mist can be accessed directly:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -i /path/to/ssh_private_key -Y MYCCUSERNAME@mist.scinet.utoronto.ca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The Mist login node '''mist-login01''' can also be accessed via the Niagara cluster:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -i /path/to/ssh_private_key -Y MYCCUSERNAME@niagara.scinet.utoronto.ca&lt;br /&gt;
ssh -Y mist-login01&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
== Storage ==&lt;br /&gt;
The filesystem for Mist is shared with the Niagara cluster. See [https://docs.scinet.utoronto.ca/index.php/Niagara_Quickstart#Your_various_directories Niagara Storage] for more details.&lt;br /&gt;
&lt;br /&gt;
= Loading software modules =&lt;br /&gt;
&lt;br /&gt;
You have two options for running code on Mist: use existing software, or compile your own.  This section focuses on the former.&lt;br /&gt;
&lt;br /&gt;
Other than essentials, all installed software is made available [[Using_modules | using module commands]]. These modules set environment variables (PATH, etc.), allowing multiple, conflicting versions of a given package to be available.  A detailed explanation of the module system can be [[Using_modules | found on the modules page]] and a list of [[Modules for Mist]] is also available.&lt;br /&gt;
&lt;br /&gt;
Common module subcommands are:&lt;br /&gt;
&lt;br /&gt;
* &amp;lt;code&amp;gt;module load &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt;: load the default version of a software package.&lt;br /&gt;
* &amp;lt;code&amp;gt;module load &amp;lt;module-name&amp;gt;/&amp;lt;module-version&amp;gt;&amp;lt;/code&amp;gt;: load a specific version of a software package.&lt;br /&gt;
* &amp;lt;code&amp;gt;module purge&amp;lt;/code&amp;gt;: unload all currently loaded modules.&lt;br /&gt;
* &amp;lt;code&amp;gt;module spider&amp;lt;/code&amp;gt; (or &amp;lt;code&amp;gt;module spider &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt;): list available software packages.&lt;br /&gt;
* &amp;lt;code&amp;gt;module avail&amp;lt;/code&amp;gt;: list loadable software packages.&lt;br /&gt;
* &amp;lt;code&amp;gt;module list&amp;lt;/code&amp;gt;: list loaded modules.&lt;br /&gt;
&lt;br /&gt;
Along with modifying common environment variables, such as PATH and LD_LIBRARY_PATH, these modules also create a SCINET_MODULENAME_ROOT environment variable, which can be used to access commonly needed software directories, such as /include and /lib.&lt;br /&gt;
&lt;br /&gt;
There are handy abbreviations for the module commands. &amp;lt;code&amp;gt;ml&amp;lt;/code&amp;gt; is the same as &amp;lt;code&amp;gt;module list&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;ml &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt; is the same as &amp;lt;code&amp;gt;module load &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt;.&lt;br /&gt;
== Tips for loading software ==&lt;br /&gt;
&lt;br /&gt;
* We advise '''''against''''' loading modules in your .bashrc.  This can lead to very confusing behaviour under certain circumstances.  Our guidelines for .bashrc files can be found [[bashrc guidelines|here]].&lt;br /&gt;
* Instead, load modules by hand when needed, or by sourcing a separate script.&lt;br /&gt;
* Load run-specific modules inside your job submission script.&lt;br /&gt;
* Short names give default versions; e.g. &amp;lt;code&amp;gt;cuda&amp;lt;/code&amp;gt; → &amp;lt;code&amp;gt;cuda/11.0.3&amp;lt;/code&amp;gt;. It is usually better to be explicit about the versions, for future reproducibility.&lt;br /&gt;
* Modules often require other modules to be loaded first.  Solve these dependencies by using [[Using_modules#Module_spider | &amp;lt;code&amp;gt;module spider&amp;lt;/code&amp;gt;]].&lt;br /&gt;
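As a minimal sketch of the tips above, the job script below loads explicitly versioned modules inside the script itself. The module versions are taken from examples elsewhere on this page; &amp;lt;code&amp;gt;my_program&amp;lt;/code&amp;gt; is a hypothetical executable, and the time and GPU requests are placeholders to adapt to your own job.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
#SBATCH --time=1:00:0&lt;br /&gt;
&lt;br /&gt;
# Explicit versions, so that a change of the defaults does not change the job&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0&lt;br /&gt;
&lt;br /&gt;
# Hypothetical executable; replace with your own program&lt;br /&gt;
./my_program&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;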
&lt;br /&gt;
= Available compilers and interpreters =&lt;br /&gt;
* The &amp;lt;tt&amp;gt;cuda&amp;lt;/tt&amp;gt; module must be loaded first for GPU software.&lt;br /&gt;
* For most compiled software, one should use the GNU compilers (&amp;lt;tt&amp;gt;gcc&amp;lt;/tt&amp;gt; for C, &amp;lt;tt&amp;gt;g++&amp;lt;/tt&amp;gt; for C++, and &amp;lt;tt&amp;gt;gfortran&amp;lt;/tt&amp;gt; for Fortran). Loading a &amp;lt;tt&amp;gt;gcc&amp;lt;/tt&amp;gt; module makes these available. &lt;br /&gt;
* The IBM XL compiler suite (&amp;lt;tt&amp;gt;xlc_r, xlc++_r, xlf_r&amp;lt;/tt&amp;gt;) is also available if you load one of the &amp;lt;tt&amp;gt;xl&amp;lt;/tt&amp;gt; modules.&lt;br /&gt;
* To compile MPI code, you must additionally load an &amp;lt;tt&amp;gt;openmpi&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;spectrum-mpi&amp;lt;/tt&amp;gt; module.&lt;br /&gt;
&lt;br /&gt;
=== CUDA ===&lt;br /&gt;
&lt;br /&gt;
The currently installed CUDA Toolkits are '''11.0.3''' (default), '''10.2.2 (10.2.89)''', '''11.2.2''', '''11.4.4''', '''11.6.2''', '''11.7.1''' and '''11.8.0''', e.g.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load cuda/11.0.3&lt;br /&gt;
module load cuda/10.2.2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*A compiler (GCC, XL or NVHPC/PGI) module must be loaded in order to use CUDA to build any code.&lt;br /&gt;
The current NVIDIA driver version is 450.119.04.&lt;br /&gt;
&lt;br /&gt;
=== GNU Compilers ===&lt;br /&gt;
&lt;br /&gt;
Available GCC modules are:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
gcc/11.4.0 (must load cuda 11.7.1)&lt;br /&gt;
gcc/11.3.0 (must load cuda 11.4.4) &lt;br /&gt;
gcc/9.4.0 (must load cuda 11.0.3)&lt;br /&gt;
gcc/8.5.0 (must load cuda 10.2.2, 11.0.3 or 11.2.2)&lt;br /&gt;
gcc/10.3.0 (w/o cuda)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== IBM XL Compilers ===&lt;br /&gt;
&lt;br /&gt;
To load the native IBM xlc/xlc++ and xlf (Fortran) compilers, run&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load xl/16.1.1.10  (must load cuda 11.0.3)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
IBM XL Compilers are enabled for use with NVIDIA GPUs, including support for OpenMP GPU offloading and integration with NVIDIA's nvcc command to compile host-side code for the POWER9 CPU. Information about the IBM XL Compilers can be found at the following links: [https://www.ibm.com/support/knowledgecenter/SSXVZZ_16.1.1/com.ibm.compilers.linux.doc/welcome.html IBM XL C/C++], &lt;br /&gt;
[https://www.ibm.com/support/knowledgecenter/SSAT4T_16.1.1/com.ibm.compilers.linux.doc/welcome.html IBM XL Fortran]&lt;br /&gt;
&lt;br /&gt;
=== OpenMPI ===&lt;br /&gt;
An &amp;lt;tt&amp;gt;openmpi/&amp;lt;version&amp;gt;&amp;lt;/tt&amp;gt; module is available for different compilers, including GCC and XL. The &amp;lt;tt&amp;gt;spectrum-mpi/&amp;lt;version&amp;gt;&amp;lt;/tt&amp;gt; module provides IBM Spectrum MPI.&lt;br /&gt;
&lt;br /&gt;
=== NVHPC/PGI ===&lt;br /&gt;
The PGI compiler is provided as part of NVHPC (NVIDIA HPC SDK).&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load nvhpc/21.3&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Software =&lt;br /&gt;
== Amber20 ==&lt;br /&gt;
&lt;br /&gt;
Users who hold an Amber20 license can build Amber20 from its source code and run it on Mist. '''SOSCIP/SciNet does not provide an Amber license or source code.'''&lt;br /&gt;
&lt;br /&gt;
=== Building Amber20 ===&lt;br /&gt;
Modules that are needed for building Amber20:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load MistEnv/2021a cuda/10.2.2 gcc/8.5.0 anaconda3/2021.05 cmake/3.19.8&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
CMake configuration:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
cmake .. -DCMAKE_INSTALL_PREFIX=$HOME/where-amber-install -DCOMPILER=GNU -DMPI=FALSE -DCUDA=TRUE -DINSTALL_TESTS=TRUE -DDOWNLOAD_MINICONDA=FALSE -DOPENMP=TRUE -DNCCL=FALSE -DAPPLY_UPDATES=TRUE&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Running Amber20 ===&lt;br /&gt;
'''Amber20 does not scale beyond a single GPU on NVIDIA Pascal (P100) and later GPUs such as the V100'''. It is strongly recommended to run Amber20 as a single-GPU job.&lt;br /&gt;
An example job script:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
#SBATCH --time=1:00:0&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP-project-ID&amp;gt;&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/10.2.2 gcc/8.5.0 anaconda3/2021.05&lt;br /&gt;
export PATH=$HOME/where-amber-install/bin:$PATH&lt;br /&gt;
export LD_LIBRARY_PATH=$HOME/where-amber-install/lib:$LD_LIBRARY_PATH&lt;br /&gt;
pmemd.cuda .... &amp;lt;parameters&amp;gt; ...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Anaconda (Python) ==&lt;br /&gt;
Anaconda is a popular distribution of the Python programming language. It contains several common Python libraries such as SciPy and NumPy as pre-built packages, which eases installation. Anaconda is provided as the '''anaconda3''' module.&lt;br /&gt;
&lt;br /&gt;
To set up your own Python environment, load the module and create a conda environment:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n myPythonEnv python=3.8&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*Note: By default, conda environments are located in '''$HOME/.conda/envs'''. The cache (downloaded tarballs and packages) is under '''$HOME/.conda/pkgs'''. You may run into disk quota problems if you create too many environments. To clean the conda cache, '''please run: &amp;quot;conda clean -y --all&amp;quot; and &amp;quot;rm -rf $HOME/.conda/pkgs/*&amp;quot; after installing packages'''.&lt;br /&gt;
&lt;br /&gt;
To activate the conda environment (do this before running python):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
source activate myPythonEnv&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Note that you SHOULD NOT use '''conda activate myPythonEnv''' to activate the environment.  This leads to all sorts of problems.  Once the environment is activated, you can update or install packages via '''conda''' or '''pip''':&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda install  &amp;lt;package_name&amp;gt; (preferred way to install packages)&lt;br /&gt;
pip install &amp;lt;package_name&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*Once the installation finishes, please clean the cache:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf $HOME/.conda/pkgs/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
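To see how much of your quota the environments and the cache occupy, a quick check along these lines may help. This is only a sketch that assumes the default locations mentioned above:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# Summarize the disk usage of the conda environments and the package cache&lt;br /&gt;
du -sh $HOME/.conda/envs $HOME/.conda/pkgs&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;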
To deactivate:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
source deactivate&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To remove a conda environment:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda remove --name myPythonEnv --all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To verify that the environment was removed, run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda info --envs&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Submitting Python Job ===&lt;br /&gt;
A single-GPU job example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
#SBATCH --time=1:00:0&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt; #For SOSCIP projects only&lt;br /&gt;
&lt;br /&gt;
module load anaconda3&lt;br /&gt;
source activate myPythonEnv&lt;br /&gt;
python code.py ...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== CuPy ==&lt;br /&gt;
[https://cupy.chainer.org CuPy] is an open-source matrix library accelerated with NVIDIA CUDA. It also uses CUDA-related libraries including cuBLAS, cuDNN, cuRAND, cuSOLVER, cuSPARSE, cuFFT and NCCL to make full use of the GPU architecture. CuPy implements a NumPy-compatible multi-dimensional array on CUDA. It consists of the core multi-dimensional array class, cupy.ndarray, and many functions that operate on it, and it supports a subset of the numpy.ndarray interface.&lt;br /&gt;
&lt;br /&gt;
CuPy can be installed into any conda environment. The Python packages numpy, six and fastrlock are required; cuDNN and NCCL are optional.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load MistEnv/2021a cuda/11.2.2 gcc/8.5.0 cudnn nccl anaconda3/2021.05&lt;br /&gt;
conda create -n cupy-env python=3.8 numpy six fastrlock&lt;br /&gt;
source activate cupy-env&lt;br /&gt;
CFLAGS=&amp;quot;-I$MODULE_CUDNN_PREFIX/include -I$MODULE_NCCL_PREFIX/include -I$MODULE_CUDA_PREFIX/include&amp;quot; LDFLAGS=&amp;quot;-L$MODULE_CUDNN_PREFIX/lib64 -L$MODULE_NCCL_PREFIX/lib&amp;quot; CUDA_PATH=$MODULE_CUDA_PREFIX pip install cupy&lt;br /&gt;
#building/installing CuPy will take a few minutes&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Gromacs ==&lt;br /&gt;
[http://www.gromacs.org/ GROMACS] is a versatile package to perform molecular dynamics, i.e. simulate the Newtonian equations of motion for systems with hundreds to millions of particles. It is primarily designed for biochemical molecules like proteins, lipids and nucleic acids that have a lot of complicated bonded interactions, but since GROMACS is extremely fast at calculating the nonbonded interactions (that usually dominate simulations) many groups are also using it for research on non-biological systems, e.g. polymers.&lt;br /&gt;
*'''GROMACS 2019'''&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load MistEnv/2021a cuda/10.2.2 gcc/8.5.0 gromacs/2019.6&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*'''GROMACS 2020 and later''' The thread-MPI version supports full GPU enablement of all key computational sections. The GPU is used throughout the timestep and repeated CPU-GPU transfers are eliminated. Users are advised to verify their results carefully.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load MistEnv/2021a cuda/10.2.2 gcc/8.5.0 gromacs/2020.4&lt;br /&gt;
module load MistEnv/2021a cuda/10.2.2 gcc/8.5.0 gromacs/2020.6&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 gromacs/2021.2&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 openmpi/4.1.1+ucx-1.10.0 gromacs/2021.2 (testing purpose only)&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 gromacs/2021.4&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 gromacs/2021.5&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 gromacs/2022&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Small/Medium Simulation ===&lt;br /&gt;
Because PME domain decomposition is not supported on GPUs, Gromacs uses the CPU to calculate PME when using multiple GPUs. '''It is always recommended to use a single GPU for small and medium-sized simulations with Gromacs.''' By using only 1 tMPI thread (with multiple OpenMP threads) on a single GPU, both non-bonded PP and PME are automatically offloaded to the GPU when possible.&lt;br /&gt;
* Gromacs 2019 example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/10.2.2 gcc/8.5.0 gromacs/2019.6&lt;br /&gt;
export OMP_NUM_THREADS=8&lt;br /&gt;
export OMP_PLACES=cores&lt;br /&gt;
gmx mdrun -pin off -ntmpi 1 -ntomp 8  ... &amp;lt;other parameters&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Gromacs 2020 or later example: &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 gromacs/2021.5&lt;br /&gt;
export OMP_NUM_THREADS=8&lt;br /&gt;
export OMP_PLACES=cores&lt;br /&gt;
gmx mdrun -pin off -ntmpi 1 -ntomp 8  ... &amp;lt;other parameters&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Large Simulation ===&lt;br /&gt;
If the memory size (~58GB) available to a single-GPU job is not sufficient for the simulation, multiple GPUs can be used. It is suggested to start testing with one full node with 4 GPUs and to force PME onto the GPU. Multiple PME ranks are not supported with PME on GPU, so if a GPU is used for the PME calculation, -npme (number of PME ranks) must be set to 1. If PME has less work than PP, it is suggested to run multiple ranks per GPU, so that the GPU for the PME rank can also do some work on PP rank(s).&lt;br /&gt;
'''If your simulation can fit in a single-GPU job, please use a single GPU to get much higher efficiency. Do not waste 3 additional GPUs for only a small performance improvement.'''&lt;br /&gt;
*An example using 4 GPUs, 7 PP ranks/tmpi threads + 1 PME rank/tmpi thread: ('''-pin on -pme gpu -npme 1''' must be added to the mdrun command in order to force the GPU to do PME)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=1&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3  gcc/9.4.0 gromacs/2021.5&lt;br /&gt;
&lt;br /&gt;
export OMP_NUM_THREADS=4&lt;br /&gt;
gmx mdrun -ntmpi 8 -pin on -pme gpu -npme 1 ... &amp;lt;add your parameters&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*It is suggested to also test using '''-ntmpi 4''' and '''export OMP_NUM_THREADS=8''' if you receive a NOTE in Gromacs output saying &amp;quot;% performance was lost because the PME ranks had more work to do than the PP ranks&amp;quot;. In this case, NVIDIA MPS is not needed since there is only one MPI rank per GPU.&lt;br /&gt;
*'''Please note that the solving of PME on GPU is still only the initial version supporting this behaviour, and comes with a set of limitations outlined further below.'''&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
* Only a PME order of 4 is supported on GPUs.&lt;br /&gt;
* PME will run on a GPU only when exactly one rank has a PME task, ie. decompositions with multiple ranks doing PME are not supported.&lt;br /&gt;
* Only single precision is supported.&lt;br /&gt;
* Free energy calculations where charges are perturbed are not supported, because only single PME grids can be calculated.&lt;br /&gt;
* Only dynamical integrators are supported (ie. leap-frog, Velocity Verlet, stochastic dynamics)&lt;br /&gt;
* LJ PME is not supported on GPUs.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*An example using 4 GPUs, '''PME on CPU''': ('''-pin on''' must be added to mdrun command for proper CPU thread bindings)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=1&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3  gcc/9.4.0 gromacs/2021.5&lt;br /&gt;
&lt;br /&gt;
export OMP_NUM_THREADS=4&lt;br /&gt;
gmx mdrun -ntmpi 8 -pin on  ... &amp;lt;add your parameters&amp;gt;&lt;br /&gt;
&lt;br /&gt;
# &amp;quot;-ntmpi 16, OMP_NUM_THREADS=2&amp;quot; and &amp;quot;-ntmpi 4, OMP_NUM_THREADS=8&amp;quot; should also be tested.  &lt;br /&gt;
# num_thread_MPI_ranks(-ntmpi) * num_OpenMP_threads = 32&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
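The rank/thread combinations mentioned in the comments above all satisfy ntmpi × OMP_NUM_THREADS = 32. As a small sketch (plain bash arithmetic, not a Gromacs feature), the candidate settings to benchmark can be listed like this:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# CPU cores per Mist node&lt;br /&gt;
cores=32&lt;br /&gt;
for ntmpi in 4 8 16; do&lt;br /&gt;
    # OpenMP threads per rank, chosen so that ranks * threads equals the core count&lt;br /&gt;
    echo &amp;quot;ntmpi=$ntmpi OMP_NUM_THREADS=$((cores / ntmpi))&amp;quot;&lt;br /&gt;
done&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;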
*'''If your simulation can fit in a single-GPU job, please use a single GPU to get much higher efficiency. Do not waste 3 additional GPUs for only a small performance improvement.'''&lt;br /&gt;
*'''NOTE: The above examples will NOT work with multiple nodes. If your simulation is too large for a single GPU node, please contact SciNet/SOSCIP support.'''&lt;br /&gt;
&lt;br /&gt;
== NAMD ==&lt;br /&gt;
[http://www.ks.uiuc.edu/Research/namd/ NAMD] is a parallel, object-oriented molecular dynamics code designed for high-performance simulation of large biomolecular systems.&lt;br /&gt;
=== 2.14 ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 spectrum-mpi/10.4.0 namd/2.14&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
==== Running with single GPU ====&lt;br /&gt;
If you have many jobs to run, it is always suggested to run with a single GPU per job. This makes jobs easier to schedule and gives better overall performance.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 spectrum-mpi/10.4.0 namd/2.14&lt;br /&gt;
scontrol show hostnames &amp;gt; nodelist-$SLURM_JOB_ID&lt;br /&gt;
&lt;br /&gt;
`which charmrun` -npernode 1 -bind-to none -hostfile nodelist-$SLURM_JOB_ID `which namd2` +idlepoll +ppn 8 +p 8 stmv.namd&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Running with one process per node (4 GPUs)====&lt;br /&gt;
An example of the job script (using 1 node, '''one process per node''',  32 CPU threads per process + 4 GPUs per process):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=1&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 spectrum-mpi/10.4.0 namd/2.14&lt;br /&gt;
scontrol show hostnames &amp;gt; nodelist-$SLURM_JOB_ID&lt;br /&gt;
&lt;br /&gt;
`which charmrun` -npernode 1 -hostfile nodelist-$SLURM_JOB_ID `which namd2` +setcpuaffinity +pemap 0-127:4 +idlepoll +ppn 32 +p $((32*SLURM_NTASKS)) stmv.namd&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
==== Running with one process per GPU (4 GPUs)====&lt;br /&gt;
NAMD may scale better if using '''one process per GPU'''. Please do your own benchmark.&lt;br /&gt;
An example of the job script (using 1 node, '''one process per GPU''',  8 CPU threads per process):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=4&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 spectrum-mpi/10.4.0 namd/2.14&lt;br /&gt;
scontrol show hostnames &amp;gt; nodelist-$SLURM_JOB_ID&lt;br /&gt;
&lt;br /&gt;
`which charmrun` -npernode 4 -hostfile nodelist-$SLURM_JOB_ID `which namd2` +setcpuaffinity +pemap 0-127:4 +idlepoll +ppn 8 +p $((8*SLURM_NTASKS)) stmv.namd&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Open-CE ==&lt;br /&gt;
[https://github.com/open-ce/open-ce Open-CE] is an '''IBM''' repo for feedstock collection, environment data, and scripts for building TensorFlow, PyTorch, and other machine learning packages and dependencies. Open-CE is distributed as a '''conda channel''' on the Mist cluster.&lt;br /&gt;
'''Available packages and versions are listed here: [https://github.com/open-ce/open-ce/releases/tag/open-ce-v1.7.2 Open-CE Releases]'''. Currently only Python 3.8 and CUDA 11.4 are supported. If you need a different Python or CUDA version, please check old versions on the OSU server https://ftp.osuosl.org/pub/open-ce/ or contact SOSCIP/SciNet support.&lt;br /&gt;
&lt;br /&gt;
*Packages can be installed by specifying the Open-CE conda channel:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda install -c https://ftp.osuosl.org/pub/open-ce/1.7.2/ python=3.8 cudatoolkit=11.4 PACKAGE&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*Once the installation finishes, please clean the cache:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf $HOME/.conda/pkgs/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|+Available Packages:&lt;br /&gt;
|-&lt;br /&gt;
|Tensorflow&lt;br /&gt;
|TensorFlow Estimators&lt;br /&gt;
|TensorFlow Probability&lt;br /&gt;
|TensorBoard&lt;br /&gt;
|TensorBoard Data Server&lt;br /&gt;
|TensorFlow Text&lt;br /&gt;
|TensorFlow Model Optimizations&lt;br /&gt;
|TensorFlow Addons (tensorflow-addons)&lt;br /&gt;
|TensorFlow Datasets&lt;br /&gt;
|TensorFlow Hub&lt;br /&gt;
|-&lt;br /&gt;
|TensorFlow MetaData&lt;br /&gt;
|PyTorch&lt;br /&gt;
|TorchText&lt;br /&gt;
|TorchVision&lt;br /&gt;
|PyTorch Lightning&lt;br /&gt;
|PyTorch Lightning Bolts&lt;br /&gt;
|ONNX&lt;br /&gt;
|Onnx-runtime&lt;br /&gt;
|skl2onnx&lt;br /&gt;
|tf2onnx&lt;br /&gt;
|-&lt;br /&gt;
|onnxmltools&lt;br /&gt;
|onnxconverter-common&lt;br /&gt;
|XGBoost&lt;br /&gt;
|LightGBM&lt;br /&gt;
|Transformers&lt;br /&gt;
|Tokenizers&lt;br /&gt;
|SentencePiece&lt;br /&gt;
|Spacy&lt;br /&gt;
|DALI&lt;br /&gt;
|OpenCV&lt;br /&gt;
|-&lt;br /&gt;
|Horovod&lt;br /&gt;
|PyArrow&lt;br /&gt;
|grpc&lt;br /&gt;
|uwsgi&lt;br /&gt;
|ORC&lt;br /&gt;
|Mamba&lt;br /&gt;
|Ray (ray-tune)&lt;br /&gt;
|pytorch_geometric&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== PyTorch ==&lt;br /&gt;
=== Installing from Open-CE Conda Channel ===&lt;br /&gt;
The easiest way to install PyTorch on Mist is to use IBM's Open-CE Conda channel: prepare a conda environment, then install PyTorch from that channel.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n pytorch python=3.9&lt;br /&gt;
source activate pytorch&lt;br /&gt;
&lt;br /&gt;
# Force the Open-CE channel to avoid the CPU-only version of PyTorch from the default Anaconda channel&lt;br /&gt;
conda config --prepend channels /scinet/mist/ibm/open-ce-1.9.1&lt;br /&gt;
conda config --set channel_priority strict&lt;br /&gt;
&lt;br /&gt;
conda install -c /scinet/mist/ibm/open-ce-1.9.1 pytorch=2.0.1 cudatoolkit=11.8&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Once the installation finishes, please clean the cache:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf $HOME/.conda/pkgs/*&lt;br /&gt;
#remove .condarc to reset conda channel priority&lt;br /&gt;
rm -f $HOME/.condarc&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Add the command below to your job script, before the python command, to get deterministic results; see details here: [https://github.com/pytorch/pytorch/issues/39849]&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
export CUBLAS_WORKSPACE_CONFIG=:4096:2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
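Combined with the job script from [[#Submitting Python Job|Submitting Python Job]], a single-GPU PyTorch job might look like the following sketch. The environment name &amp;lt;code&amp;gt;pytorch&amp;lt;/code&amp;gt; matches the one created above; &amp;lt;code&amp;gt;train.py&amp;lt;/code&amp;gt; is a hypothetical script name, and the time limit is a placeholder.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
#SBATCH --time=1:00:0&lt;br /&gt;
&lt;br /&gt;
module load anaconda3&lt;br /&gt;
source activate pytorch&lt;br /&gt;
# Set before the python command, as described above&lt;br /&gt;
export CUBLAS_WORKSPACE_CONFIG=:4096:2&lt;br /&gt;
# Hypothetical training script; replace with your own&lt;br /&gt;
python train.py&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;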
&lt;br /&gt;
== RAPIDS ==&lt;br /&gt;
[https://rapids.ai RAPIDS] is a suite of open-source software libraries that lets you execute end-to-end data science and analytics pipelines entirely on GPUs. The RAPIDS data science framework includes a collection of libraries: cuDF (GPU DataFrames), cuML (GPU machine learning algorithms), cuStrings (GPU string manipulation), etc.&lt;br /&gt;
*'''The most recent supported version is 0.13.0. Newer versions are no longer provided by IBM's powerai channel.'''&lt;br /&gt;
&lt;br /&gt;
=== Installing from IBM Conda Channel ===&lt;br /&gt;
The easiest way to install RAPIDS on Mist is to use IBM's Conda channel: prepare a conda environment with Python 3.6 or 3.7, then install powerai-rapids from that channel. Python 3.8+ is not supported.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n rapids_env python=3.7&lt;br /&gt;
source activate rapids_env&lt;br /&gt;
conda install -c https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda-early-access/ powerai-rapids&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Once the installation finishes, please clean the cache:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf $HOME/.conda/pkgs/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== TensorFlow and Keras ==&lt;br /&gt;
=== Installing from IBM Conda Channel ===&lt;br /&gt;
The easiest way to install TensorFlow and Keras on Mist is using IBM's Open-CE Conda channel. Users need to prepare a conda environment and install TensorFlow from that channel.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n tf_env python=3.8&lt;br /&gt;
source activate tf_env&lt;br /&gt;
&lt;br /&gt;
conda install -c https://ftp.osuosl.org/pub/open-ce/1.7.2/ tensorflow=2.9.2 cudatoolkit=11.4&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Once the installation finishes, please clean the cache:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf $HOME/.conda/pkgs/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Ray ==&lt;br /&gt;
Ray is an API for building distributed applications. The easiest way to install Ray on Mist is using an Open-CE Conda channel. The &amp;lt;code&amp;gt;open-ce-1.9.1&amp;lt;/code&amp;gt; (local) channel has a version of Ray that is compatible with Python 3.9 and can be installed as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda install -c /scinet/mist/ibm/open-ce-1.9.1 ray-all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Once the installation finishes, please clean the cache:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf $HOME/.conda/pkgs/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Testing and debugging =&lt;br /&gt;
You really should test your code before you submit it to the cluster to know if your code is correct and what kind of resources you need.&lt;br /&gt;
* Small test jobs can be run on the login node.  Rule of thumb: tests should run for no more than a couple of minutes, take at most about 1-2 GB of memory, and use no more than one GPU and a few cores.&lt;br /&gt;
&amp;lt;!-- * You can run the [[Parallel Debugging with DDT|DDT]] debugger on the login nodes after &amp;lt;code&amp;gt;module load ddt&amp;lt;/code&amp;gt;. --&amp;gt;&lt;br /&gt;
* For short tests that do not fit on a login node, or for which you need a dedicated node, request an interactive debug job with the debugjob command:&lt;br /&gt;
 mist-login01:~$ debugjob --clean -g G&lt;br /&gt;
where G is the number of GPUs. With G=1, this gives an interactive session for 2 hours; G=4 gets you a single node with 4 GPUs for 30 minutes; and G=8 (the maximum) gets you 2 nodes, each with 4 GPUs, for 15 minutes.  The &amp;lt;tt&amp;gt;--clean&amp;lt;/tt&amp;gt; argument is optional but recommended, as it starts the session without any modules loaded, thus mimicking more closely what happens when you submit a job script. Users need to load modules and activate the conda environment after a debug job starts. If the &amp;lt;tt&amp;gt;--clean&amp;lt;/tt&amp;gt; flag is omitted, it is recommended to do a 'conda clean' before 'source activate ENV' in the debug job.&lt;br /&gt;
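&lt;br /&gt;
The session lengths quoted for the debug job follow a simple rule: 2 hours divided by the number of GPUs. The sketch below only illustrates that arithmetic for the three cases mentioned in the text; the function name is hypothetical and is not a command available on Mist.&lt;br /&gt;

```shell
# Hypothetical helper (not a Mist command): walltime of a debug session in
# minutes, following the "2 hours divided by the number of GPUs" rule.
# Only the GPU counts named in the text (1, 4 and 8) are handled.
debug_walltime_minutes() {
    gpus="$1"
    case "$gpus" in
        1|4|8) echo $(( 120 / gpus )) ;;
        *) echo "GPU count not covered by this sketch: $gpus"; return 1 ;;
    esac
}

debug_walltime_minutes 1   # prints 120
debug_walltime_minutes 4   # prints 30
debug_walltime_minutes 8   # prints 15
```

Other GPU counts are rejected here only because the text does not say whether the debug command accepts them.&lt;br /&gt;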
&lt;br /&gt;
= Submitting jobs =&lt;br /&gt;
Once you have compiled and tested your code or workflow on the Mist login nodes, and confirmed that it behaves correctly, you are ready to submit jobs to the cluster.  Your jobs will run on some of Mist's 53 compute nodes.  When and where your job runs is determined by the scheduler.&lt;br /&gt;
&lt;br /&gt;
Mist uses SLURM as its job scheduler. It is configured to allow only '''Single-GPU jobs''' and '''Full-node jobs (4 GPUs per node)'''.&lt;br /&gt;
&lt;br /&gt;
You submit jobs from a login node by passing a script to the sbatch command:&lt;br /&gt;
&lt;br /&gt;
mist-login01:scratch$ sbatch jobscript.sh&lt;br /&gt;
&lt;br /&gt;
This puts the job in the queue. It will run on the compute nodes in due course. In most cases, you should not submit from your $HOME directory, but rather, from your $SCRATCH directory, so that the output of your compute job can be written out (as mentioned above, $HOME is read-only on the compute nodes).&lt;br /&gt;
&lt;br /&gt;
Example job scripts can be found below.&lt;br /&gt;
Keep in mind:&lt;br /&gt;
* Scheduling is by single GPU or by full node, so you can request only 1 GPU or 4 GPUs per node.&lt;br /&gt;
* Your job's maximum walltime is 24 hours. &lt;br /&gt;
* Jobs must write their output to your scratch or project directory (home is read-only on compute nodes).&lt;br /&gt;
* Compute nodes have no internet access.&lt;br /&gt;
* Your job script will not remember the modules you have loaded, so it needs to contain &amp;quot;module load&amp;quot; commands of all the required modules (see examples below). &lt;br /&gt;
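&lt;br /&gt;
The first two points above can be checked before submitting. The following is a minimal sketch of such a check, assuming the walltime is given in whole hours; the function is hypothetical and is not part of SLURM or of the Mist tooling.&lt;br /&gt;

```shell
# Hypothetical pre-submission check (not part of SLURM or Mist tooling).
# Arguments: GPUs per node, walltime in whole hours.
check_request() {
    gpus_per_node="$1"
    hours="$2"
    # Scheduling is by single GPU or by full node (4 GPUs).
    case "$gpus_per_node" in
        1|4) ;;
        *) echo "invalid: request 1 or 4 GPUs per node"; return 1 ;;
    esac
    # The maximum walltime of a job is 24 hours.
    if [ "$hours" -gt 24 ]; then
        echo "invalid: maximum walltime is 24 hours"
        return 1
    fi
    echo "ok"
}

check_request 1 12   # prints ok
check_request 4 24   # prints ok
```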
== SOSCIP Users ==&lt;br /&gt;
*[https://www.soscip.org SOSCIP] is a consortium that brings together industrial partners and academic researchers and provides them with advanced computing technologies and expertise to solve social, technical and business challenges across sectors and drive economic growth.&lt;br /&gt;
&lt;br /&gt;
If you are working on a SOSCIP project, please contact [mailto:soscip-support@scinet.utoronto.ca soscip-support@scinet.utoronto.ca] to have your user account added to the SOSCIP project accounts. SOSCIP users need to submit jobs with an additional SLURM flag to get higher priority:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#SBATCH -A soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt;    #e.g. soscip-3-001&lt;br /&gt;
OR&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Single-GPU job script ==&lt;br /&gt;
A single-GPU job gets a quarter of a node: 1 GPU + 8/32 CPU cores/threads + ~58GB CPU memory. '''Users should never request CPUs or memory explicitly.''' When running an MPI program, set --ntasks to the number of MPI ranks. '''Do NOT set --ntasks for non-MPI programs.''' &lt;br /&gt;
*It is suggested to use NVIDIA Multi-Process Service (MPS) if running multiple MPI ranks on one GPU.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
#SBATCH --time=1:00:0&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt; #For SOSCIP projects only&lt;br /&gt;
&lt;br /&gt;
module load anaconda3&lt;br /&gt;
source activate conda_env&lt;br /&gt;
python code.py ...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Full-node job script ==&lt;br /&gt;
'''If you are not sure the program can be executed on multiple GPUs, please follow the single-gpu job instruction above or contact SciNet/SOSCIP support.'''&lt;br /&gt;
&lt;br /&gt;
A multi-GPU job should ask for a minimum of one full node (4 GPUs). Users need to specify the &amp;quot;compute_full_node&amp;quot; partition in order to get all resources on a node. &lt;br /&gt;
*An example for a 1-node job:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=4 #this only affects MPI job&lt;br /&gt;
#SBATCH --time=1:00:00&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt; #For SOSCIP projects only&lt;br /&gt;
&lt;br /&gt;
module load &amp;lt;modules you need&amp;gt;&lt;br /&gt;
# run your program here&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Limits ==&lt;br /&gt;
&lt;br /&gt;
There are limits to the size and duration of your jobs, the number of jobs you can run and the number of jobs you can have queued.&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!Usage&lt;br /&gt;
!Partition&lt;br /&gt;
!Running jobs&lt;br /&gt;
!Jobs in queue&lt;br /&gt;
!Min. size of jobs&lt;br /&gt;
!Max. size of jobs&lt;br /&gt;
!Min. walltime&lt;br /&gt;
!Max. walltime &lt;br /&gt;
|-&lt;br /&gt;
|Compute jobs ||compute || 100 GPUs || 1000 || 1 GPU (8&amp;amp;nbsp;cores) || default:&amp;amp;nbsp;4&amp;amp;nbsp;nodes&amp;amp;nbsp;(16&amp;amp;nbsp;GPUs) &amp;lt;br&amp;gt; with&amp;amp;nbsp;allocation:&amp;amp;nbsp;4&amp;amp;nbsp;nodes&amp;amp;nbsp;(16&amp;amp;nbsp;GPUs)|| 15 minutes || 24 hours&lt;br /&gt;
|-&lt;br /&gt;
|Testing or troubleshooting || debug || 1 || 1 || 1 GPU (8 cores) || 2 nodes (8 GPUs)|| N/A || 2/n&amp;lt;sub&amp;gt;gpu&amp;lt;/sub&amp;gt; hours&lt;br /&gt;
|-&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Even if you respect these limits, your jobs will still have to wait in the queue. The waiting time depends on many factors such as your group's allocation amount, how much allocation has been used in the recent past, the number of requested nodes and walltime, and how many other jobs are waiting in the queue.&lt;br /&gt;
&lt;br /&gt;
= Jupyter Notebooks =&lt;br /&gt;
SciNet’s [[Jupyter Hub]] is a Niagara-type node; it has a different CPU architecture and no GPUs. Conda environments prepared on Mist will not work there properly. Users who need to use Jupyter Notebook to develop and test some aspects of their workflow can create their own server on the Mist login node and use an SSH tunnel to connect to it from outside. Users who choose to do so have to keep in mind that the login node is a shared resource, and heavy calculations should be done only on compute nodes. Processes (including IPython kernels used by the notebooks) are limited to one hour of total CPU time: idle time will not be counted toward this one hour, and use of multiple cores will count proportionally to the number of cores (e.g. a kernel using all 128 virtual cores on the node will be killed after 28 seconds). Idle notebooks can still burden the node by hogging system and GPU memory; please be mindful of other users and terminate notebooks when work is done.&lt;br /&gt;
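&lt;br /&gt;
The 28-second figure above comes from dividing the one-hour CPU time budget (3600 seconds) by the number of cores kept busy. The sketch below reproduces that arithmetic; the function name is hypothetical, and the result is rounded down to whole seconds.&lt;br /&gt;

```shell
# Hypothetical helper: whole seconds of wall-clock time before a process
# that keeps the given number of cores busy uses up one hour (3600 s)
# of total CPU time.
seconds_until_killed() {
    cores="$1"
    echo $(( 3600 / cores ))
}

seconds_until_killed 1     # prints 3600
seconds_until_killed 128   # prints 28
```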
&lt;br /&gt;
As an example, let us create a new Conda environment and activate it:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n jupyter_env python=3.7&lt;br /&gt;
source activate jupyter_env&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Install the Jupyter Notebook server:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda install notebook&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Running the notebook server ==&lt;br /&gt;
When the Conda environment is active, enter:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
jupyter-notebook&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
By default, the Jupyter Notebook server uses port 8888 (can be overridden with the &amp;lt;code&amp;gt;--port&amp;lt;/code&amp;gt; option). If another user has already started their own server, the default port may be busy, in which case the server will be listening on a different port. Once launched, the server will output some information to the terminal that will include the actual port number used and a 48-character token. For example:&lt;br /&gt;
&amp;lt;pre&amp;gt;http://localhost:8890/?token=54c4090d……&amp;lt;/pre&amp;gt;&lt;br /&gt;
In this example, the server is listening on port 8890.&lt;br /&gt;
&lt;br /&gt;
== Creating a tunnel ==&lt;br /&gt;
In order to access this port remotely (i.e. from your office or home), an [https://en.wikipedia.org/wiki/Tunneling_protocol#Secure_Shell_tunneling SSH tunnel] has to be established. Please refer to your SSH client’s documentation for instructions on how to do that. For the OpenSSH client (standard in most Linux distributions and macOS), a tunnel can be opened in a separate terminal session to the one where the Jupyter Notebook server is running. In the new terminal, issue this command:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -L8888:localhost:8890 &amp;lt;username&amp;gt;@mist.scinet.utoronto.ca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Replace &amp;lt;code&amp;gt;&amp;lt;username&amp;gt;&amp;lt;/code&amp;gt; with your actual username. The tunnel stays open as long as this SSH connection is alive. In this example, we tunnel the Mist login node’s port 8890 (where our server is assumed to be running) to our home computer’s port 8888 (any other free port is fine). The notebook can be accessed in the browser at the &amp;lt;code&amp;gt;&amp;lt;nowiki&amp;gt;http://localhost:8888&amp;lt;/nowiki&amp;gt;&amp;lt;/code&amp;gt; address (followed by &amp;lt;code&amp;gt;/?token=54c4090d……&amp;lt;/code&amp;gt;, or the token can be input on the webpage).&lt;br /&gt;
&lt;br /&gt;
== Using Jupyter on compute nodes ==&lt;br /&gt;
&lt;br /&gt;
You can use the instructions here to set up a Jupyter Notebook server on a compute node (including a [[#Testing_and_debugging|debugjob]]). However, '''we strongly discourage''' you from running an interactive notebook on a compute node (other than in a debugjob): scheduled jobs start at arbitrary times and are not meant to be interactive. Jupyter notebooks can be run non-interactively or converted to Python scripts.&lt;br /&gt;
&lt;br /&gt;
To launch the Jupyter Notebook server, load the &amp;lt;code&amp;gt;anaconda3&amp;lt;/code&amp;gt; module and activate your environment as before (by adding the appropriate lines to the submission script, if you are not using the compute node with an interactive shell). Launching the server has to be done like so:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
HOME=/dev/shm/$USER jupyter-notebook&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
That is because Jupyter will fail unless it can write to the home folder, which is read-only from compute nodes. This modification of the &amp;lt;code&amp;gt;$HOME&amp;lt;/code&amp;gt; environment variable will carry over into the notebooks, which is usually not a problem, but in case the notebook relies on this environment variable (e.g. to read certain files), it can be reset manually in the notebook (&amp;lt;code&amp;gt;import os; os.environ['HOME']=……&amp;lt;/code&amp;gt;).&lt;br /&gt;
&lt;br /&gt;
Because compute nodes are not accessible from the Internet, tunneling has to be done twice, once from the remote location (office or home) to the Mist login node, and then from the login node to the compute node. Assuming the server is running on port 8890 of the mist006 node, open the first tunnel in a new terminal session in the remote computer:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -L8888:localhost:9999 &amp;lt;username&amp;gt;@mist.scinet.utoronto.ca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
where 9999 is any available port on the Mist login node (to test port availability enter &amp;lt;code&amp;gt;ss -Hln src :9999&amp;lt;/code&amp;gt; in the terminal when connected to the Mist login node; an empty output indicates that the port is free). In the same session in the login node that was created with the above command, open the second tunnel to the compute node:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -L9999:localhost:8890 mist006&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Be aware that the second tunnel will automatically disconnect once the job on the compute node times out or is relinquished. The Jupyter Notebook server running on the compute node can now be accessed from the browser as in the previous subsection.&lt;br /&gt;
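&lt;br /&gt;
The two tunnels must agree on the intermediate port on the login node, which is easy to get wrong when typing the commands by hand. As a sketch, the hypothetical helper below prints both commands from one set of values; the username is a placeholder, and the ports and node name are the ones used in the example above.&lt;br /&gt;

```shell
# Hypothetical helper: print the two ssh commands for the double tunnel.
# Arguments: username, local port, login-node port, compute node, notebook port.
print_tunnel_commands() {
    user="$1"; local_port="$2"; login_port="$3"; node="$4"; nb_port="$5"
    # First tunnel: run on the remote (office or home) computer.
    echo "ssh -L${local_port}:localhost:${login_port} ${user}@mist.scinet.utoronto.ca"
    # Second tunnel: run in the login-node session opened by the first command.
    echo "ssh -L${login_port}:localhost:${nb_port} ${node}"
}

print_tunnel_commands myuser 8888 9999 mist006 8890
```

The function only prints the commands; it does not open any connection.&lt;br /&gt;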
&lt;br /&gt;
&lt;br /&gt;
= Support =&lt;br /&gt;
&lt;br /&gt;
SciNet inquiries:&lt;br /&gt;
* [mailto:support@scinet.utoronto.ca support@scinet.utoronto.ca]&lt;br /&gt;
&lt;br /&gt;
SOSCIP inquiries:&lt;br /&gt;
*[mailto:soscip-support@scinet.utoronto.ca soscip-support@scinet.utoronto.ca]&lt;/div&gt;</summary>
		<author><name>Ymeiron</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Mist&amp;diff=5553</id>
		<title>Mist</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Mist&amp;diff=5553"/>
		<updated>2024-03-27T13:46:16Z</updated>

		<summary type="html">&lt;p&gt;Ymeiron: Updated PyTorch&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox Computer&lt;br /&gt;
|image=[[Image:Mist.jpg|center|300px|thumb]]&lt;br /&gt;
|name=Mist&lt;br /&gt;
|installed=Dec 2019&lt;br /&gt;
|operatingsystem= Red Hat Enterprise Linux 8.2&lt;br /&gt;
|loginnode= mist.scinet.utoronto.ca&lt;br /&gt;
|nnodes=  54 IBM AC922&lt;br /&gt;
|rampernode= 256 GB  &lt;br /&gt;
|gpuspernode=4 V100-SXM2-32GB&lt;br /&gt;
|interconnect=Mellanox EDR&lt;br /&gt;
|vendorcompilers= NVCC, IBM XL&lt;br /&gt;
|queuetype=Slurm&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
=Specifications=&lt;br /&gt;
Mist is a SciNet-[[#SOSCIP Users |SOSCIP]] joint GPU cluster consisting of 54 IBM AC922 servers. Each node of the cluster has 32 IBM Power9 cores, 256GB RAM and 4 NVIDIA V100-SXM2-32GB GPUs connected by NVLink. The cluster has an InfiniBand EDR interconnect providing GPUDirect RDMA capability.&lt;br /&gt;
&lt;br /&gt;
'''&amp;lt;span style=&amp;quot;background:#fc8383&amp;quot;&amp;gt;Important note:&amp;lt;/span&amp;gt;''' the majority of computer systems as of 2021 (laptops, desktops, and HPC) use the 64-bit x86 instruction set architecture (ISA) in their microprocessors produced by Intel and AMD. This ISA is incompatible with Mist, whose hardware uses the 64-bit PPC ISA (set to little endian mode). In practice, this means that x86-compiled binaries (executables and libraries) cannot be installed on Mist. For this reason, the Niagara and {{Alliance}} software stacks (modules) cannot be made available on Mist, and using closed-source software is only possible when the vendor provides a compatible version of their application. '''Python applications''' almost always rely on bindings to libraries originally written in C or C++, and some of these are not available on PyPI or various Conda channels as precompiled binaries compatible with Mist. The recommended way to use Python on Mist is to create a [[#Anaconda (Python)|Conda]] environment and install packages from the anaconda (default) channel, where most popular packages have a linux-ppc64le (Mist-compatible) version available. Some popular machine learning packages should be installed from the internal [[#Open-CE|Open-CE]] channel. Where a compatible Conda package cannot be found, installing from PyPI (&amp;lt;code&amp;gt;pip install&amp;lt;/code&amp;gt;) can be attempted. Pip will attempt to compile the package’s source code if no compatible precompiled wheel is available, so a compiler module (such as &amp;lt;code&amp;gt;gcc/.core&amp;lt;/code&amp;gt;) should be loaded in advance. Some packages require tweaking of the source code or build procedure to compile successfully on Mist; please contact [[#Support|support]] if you need assistance.&lt;br /&gt;
&lt;br /&gt;
= Getting started on Mist =&lt;br /&gt;
As of January 22 2022, authentication is only allowed via SSH keys. [https://docs.computecanada.ca/wiki/SSH_Keys Please refer to this page] to generate your SSH key pair and make sure you use them securely.&lt;br /&gt;
&lt;br /&gt;
Mist can be accessed directly:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -i /path/to/ssh_private_key -Y MYCCUSERNAME@mist.scinet.utoronto.ca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The Mist login node '''mist-login01''' can also be accessed via the Niagara cluster:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -i /path/to/ssh_private_key -Y MYCCUSERNAME@niagara.scinet.utoronto.ca&lt;br /&gt;
ssh -Y mist-login01&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
== Storage ==&lt;br /&gt;
The filesystem for Mist is shared with the Niagara cluster. See [https://docs.scinet.utoronto.ca/index.php/Niagara_Quickstart#Your_various_directories Niagara Storage] for more details.&lt;br /&gt;
&lt;br /&gt;
= Loading software modules =&lt;br /&gt;
&lt;br /&gt;
You have two options for running code on Mist: use existing software, or compile your own.  This section focuses on the former.&lt;br /&gt;
&lt;br /&gt;
Other than essentials, all installed software is made available [[Using_modules | using module commands]]. These modules set environment variables (PATH, etc.), allowing multiple, conflicting versions of a given package to be available.  A detailed explanation of the module system can be [[Using_modules | found on the modules page]] and a list of [[Modules for Mist]] is also available.&lt;br /&gt;
&lt;br /&gt;
Common module subcommands are:&lt;br /&gt;
&lt;br /&gt;
* &amp;lt;code&amp;gt;module load &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt;: load the default version of a particular software.&lt;br /&gt;
* &amp;lt;code&amp;gt;module load &amp;lt;module-name&amp;gt;/&amp;lt;module-version&amp;gt;&amp;lt;/code&amp;gt;: load a specific version of a particular software.&lt;br /&gt;
* &amp;lt;code&amp;gt;module purge&amp;lt;/code&amp;gt;: unload all currently loaded modules.&lt;br /&gt;
* &amp;lt;code&amp;gt;module spider&amp;lt;/code&amp;gt; (or &amp;lt;code&amp;gt;module spider &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt;): list available software packages.&lt;br /&gt;
* &amp;lt;code&amp;gt;module avail&amp;lt;/code&amp;gt;: list loadable software packages.&lt;br /&gt;
* &amp;lt;code&amp;gt;module list&amp;lt;/code&amp;gt;: list loaded modules.&lt;br /&gt;
&lt;br /&gt;
Along with modifying common environment variables, such as PATH, and LD_LIBRARY_PATH, these modules also create a SCINET_MODULENAME_ROOT environment variable, which can be used to access commonly needed software directories, such as /include and /lib.&lt;br /&gt;
&lt;br /&gt;
There are handy abbreviations for the module commands. &amp;lt;code&amp;gt;ml&amp;lt;/code&amp;gt; is the same as &amp;lt;code&amp;gt;module list&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;ml &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt; is the same as &amp;lt;code&amp;gt;module load &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt;.&lt;br /&gt;
== Tips for loading software ==&lt;br /&gt;
&lt;br /&gt;
* We advise '''''against''''' loading modules in your .bashrc.  This can lead to very confusing behaviour under certain circumstances.  Our guidelines for .bashrc files can be found [[bashrc guidelines|here]].&lt;br /&gt;
* Instead, load modules by hand when needed, or by sourcing a separate script.&lt;br /&gt;
* Load run-specific modules inside your job submission script.&lt;br /&gt;
* Short names give default versions; e.g. &amp;lt;code&amp;gt;cuda&amp;lt;/code&amp;gt; → &amp;lt;code&amp;gt;cuda/11.0.3&amp;lt;/code&amp;gt;. It is usually better to be explicit about the versions, for future reproducibility.&lt;br /&gt;
* Modules often require other modules to be loaded first.  Solve these dependencies by using [[Using_modules#Module_spider | &amp;lt;code&amp;gt;module spider&amp;lt;/code&amp;gt;]].&lt;br /&gt;
&lt;br /&gt;
= Available compilers and interpreters =&lt;br /&gt;
* The &amp;lt;tt&amp;gt;cuda&amp;lt;/tt&amp;gt; module has to be loaded first for GPU software.&lt;br /&gt;
* For most compiled software, one should use the GNU compilers (&amp;lt;tt&amp;gt;gcc&amp;lt;/tt&amp;gt; for C, &amp;lt;tt&amp;gt;g++&amp;lt;/tt&amp;gt; for C++, and &amp;lt;tt&amp;gt;gfortran&amp;lt;/tt&amp;gt; for Fortran). Loading the &amp;lt;tt&amp;gt;gcc&amp;lt;/tt&amp;gt; module makes these available. &lt;br /&gt;
* The IBM XL compiler suite (&amp;lt;tt&amp;gt;xlc_r, xlc++_r, xlf_r&amp;lt;/tt&amp;gt;) is also available if you load one of the &amp;lt;tt&amp;gt;xl&amp;lt;/tt&amp;gt; modules.&lt;br /&gt;
* To compile MPI code, you must additionally load an &amp;lt;tt&amp;gt;openmpi&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;spectrum-mpi&amp;lt;/tt&amp;gt; module.&lt;br /&gt;
&lt;br /&gt;
=== CUDA ===&lt;br /&gt;
&lt;br /&gt;
The currently installed CUDA Toolkits are '''11.0.3''' (default), '''10.2.2 (10.2.89)''', '''11.2.2''', '''11.4.4''', '''11.6.2''', '''11.7.1''' and '''11.8.0''', e.g.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load cuda/11.0.3&lt;br /&gt;
module load cuda/10.2.2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*A compiler (GCC, XL or NVHPC/PGI) module must be loaded in order to use CUDA to build any code.&lt;br /&gt;
The current NVIDIA driver version is 450.119.04.&lt;br /&gt;
&lt;br /&gt;
===GNU Compilers ===&lt;br /&gt;
&lt;br /&gt;
Available GCC modules are:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
gcc/11.4.0 (must load cuda 11.7.1)&lt;br /&gt;
gcc/11.3.0 (must load cuda 11.4.4) &lt;br /&gt;
gcc/9.4.0 (must load cuda 11.0.3)&lt;br /&gt;
gcc/8.5.0 (must load cuda 10.2.2, 11.0.3 or 11.2.2)&lt;br /&gt;
gcc/10.3.0 (w/o cuda)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== IBM XL Compilers ===&lt;br /&gt;
&lt;br /&gt;
To load the native IBM xlc/xlc++ and xlf (Fortran) compilers, run&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load xl/16.1.1.10  (must load cuda 11.0.3)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
IBM XL Compilers are enabled for use with NVIDIA GPUs, including support for OpenMP GPU offloading and integration with NVIDIA's nvcc command to compile host-side code for the POWER9 CPU. Information about the IBM XL Compilers can be found at the following links: [https://www.ibm.com/support/knowledgecenter/SSXVZZ_16.1.1/com.ibm.compilers.linux.doc/welcome.html IBM XL C/C++], &lt;br /&gt;
[https://www.ibm.com/support/knowledgecenter/SSAT4T_16.1.1/com.ibm.compilers.linux.doc/welcome.html IBM XL Fortran]&lt;br /&gt;
&lt;br /&gt;
=== OpenMPI ===&lt;br /&gt;
The &amp;lt;tt&amp;gt;openmpi/&amp;lt;version&amp;gt;&amp;lt;/tt&amp;gt; module is available with different compilers, including GCC and XL. The &amp;lt;tt&amp;gt;spectrum-mpi/&amp;lt;version&amp;gt;&amp;lt;/tt&amp;gt; module provides IBM Spectrum MPI.&lt;br /&gt;
&lt;br /&gt;
=== NVHPC/PGI ===&lt;br /&gt;
The PGI compiler is provided as part of NVHPC (NVIDIA HPC SDK).&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load nvhpc/21.3&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Software =&lt;br /&gt;
== Amber20 ==&lt;br /&gt;
&lt;br /&gt;
Users who hold an Amber20 license can build Amber20 from its source code and run it on Mist. '''SOSCIP/SciNet does not provide an Amber license or source code.'''&lt;br /&gt;
&lt;br /&gt;
=== Building Amber20 ===&lt;br /&gt;
Modules that are needed for building Amber20:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load MistEnv/2021a cuda/10.2.2 gcc/8.5.0 anaconda3/2021.05 cmake/3.19.8&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Cmake configuration:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
cmake .. -DCMAKE_INSTALL_PREFIX=$HOME/where-amber-install -DCOMPILER=GNU -DMPI=FALSE -DCUDA=TRUE -DINSTALL_TESTS=TRUE -DDOWNLOAD_MINICONDA=FALSE -DOPENMP=TRUE -DNCCL=FALSE -DAPPLY_UPDATES=TRUE&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Running Amber20 ===&lt;br /&gt;
'''On NVIDIA Pascal P100 and later GPUs such as the V100, Amber20 does not scale beyond a single GPU'''. It is highly recommended to run Amber20 as a single-GPU job.&lt;br /&gt;
An example job script:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
#SBATCH --time=1:00:0&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP-project-ID&amp;gt;&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/10.2.2 gcc/8.5.0 anaconda3/2021.05&lt;br /&gt;
export PATH=$HOME/where-amber-install/bin:$PATH&lt;br /&gt;
export LD_LIBRARY_PATH=$HOME/where-amber-install/lib:$LD_LIBRARY_PATH&lt;br /&gt;
pmemd.cuda .... &amp;lt;parameters&amp;gt; ...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Anaconda (Python) ==&lt;br /&gt;
Anaconda is a popular distribution of the Python programming language. It contains several common Python libraries such as SciPy and NumPy as pre-built packages, which eases installation. Anaconda is provided as the '''anaconda3''' module.&lt;br /&gt;
&lt;br /&gt;
To set up your own Python environment, load the module and create a conda environment:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n myPythonEnv python=3.8&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*Note: By default, conda environments are located in '''$HOME/.conda/envs'''. The cache (downloaded tarballs and packages) is under '''$HOME/.conda/pkgs'''. You may run into disk quota problems if too many environments are created. To clean the conda cache, '''please run: &amp;quot;conda clean -y --all&amp;quot; and &amp;quot;rm -rf $HOME/.conda/pkgs/*&amp;quot; after installing packages'''.&lt;br /&gt;
&lt;br /&gt;
To activate the conda environment: (should be activated before running python)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
source activate myPythonEnv&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Note that you SHOULD NOT use '''conda activate myPythonEnv''' to activate the environment.  This leads to all sorts of problems.  Once the environment is activated, you can update or install packages via '''conda''' or '''pip''':&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda install &amp;lt;package_name&amp;gt;   # preferred way to install packages&lt;br /&gt;
pip install &amp;lt;package_name&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*Once the installation finishes, please clean the cache:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf $HOME/.conda/pkgs/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To deactivate:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
source deactivate&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To remove a conda environment:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda remove --name myPythonEnv --all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To verify that the environment was removed, run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda info --envs&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Submitting Python Job ===&lt;br /&gt;
A single-gpu job example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
#SBATCH --time=1:00:0&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt; #For SOSCIP projects only&lt;br /&gt;
&lt;br /&gt;
module load anaconda3&lt;br /&gt;
source activate myPythonEnv&lt;br /&gt;
python code.py ...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== CuPy ==&lt;br /&gt;
[https://cupy.chainer.org CuPy] is an open-source matrix library accelerated with NVIDIA CUDA. It also uses CUDA-related libraries including cuBLAS, cuDNN, cuRand, cuSolver, cuSPARSE, cuFFT and NCCL to make full use of the GPU architecture. CuPy is an implementation of NumPy-compatible multi-dimensional array on CUDA. CuPy consists of the core multi-dimensional array class, cupy.ndarray, and many functions on it. It supports a subset of numpy.ndarray interface.&lt;br /&gt;
&lt;br /&gt;
CuPy can be installed into any conda environment. The Python packages numpy, six and fastrlock are required. cuDNN and NCCL are optional.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load MistEnv/2021a cuda/11.2.2 gcc/8.5.0 cudnn nccl anaconda3/2021.05&lt;br /&gt;
conda create -n cupy-env python=3.8 numpy six fastrlock&lt;br /&gt;
source activate cupy-env&lt;br /&gt;
CFLAGS=&amp;quot;-I$MODULE_CUDNN_PREFIX/include -I$MODULE_NCCL_PREFIX/include -I$MODULE_CUDA_PREFIX/include&amp;quot; LDFLAGS=&amp;quot;-L$MODULE_CUDNN_PREFIX/lib64 -L$MODULE_NCCL_PREFIX/lib&amp;quot; CUDA_PATH=$MODULE_CUDA_PREFIX pip install cupy&lt;br /&gt;
#building/installing CuPy will take a few minutes&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Gromacs ==&lt;br /&gt;
[http://www.gromacs.org/ GROMACS] is a versatile package to perform molecular dynamics, i.e. simulate the Newtonian equations of motion for systems with hundreds to millions of particles. It is primarily designed for biochemical molecules like proteins, lipids and nucleic acids that have a lot of complicated bonded interactions, but since GROMACS is extremely fast at calculating the nonbonded interactions (that usually dominate simulations) many groups are also using it for research on non-biological systems, e.g. polymers.&lt;br /&gt;
*'''GROMACS 2019'''&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load MistEnv/2021a cuda/10.2.2 gcc/8.5.0 gromacs/2019.6&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*'''GROMACS 2020 and later''' The Thread-MPI version supports full GPU enablement of all key computational sections. The GPU is used throughout the timestep and repeated CPU-GPU transfers are eliminated. Users are advised to carefully verify the results.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load MistEnv/2021a cuda/10.2.2 gcc/8.5.0 gromacs/2020.4&lt;br /&gt;
module load MistEnv/2021a cuda/10.2.2 gcc/8.5.0 gromacs/2020.6&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 gromacs/2021.2&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 openmpi/4.1.1+ucx-1.10.0 gromacs/2021.2  # for testing purposes only&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 gromacs/2021.4&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 gromacs/2021.5&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 gromacs/2022&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Small/Medium Simulation ===&lt;br /&gt;
Because GROMACS does not support PME domain decomposition on GPUs, it computes PME on the CPU when multiple GPUs are used. '''It is always recommended to use a single GPU for small and medium-sized simulations with GROMACS.''' With only 1 tMPI thread (and multiple OpenMP threads) on a single GPU, both non-bonded PP and PME are automatically offloaded to the GPU when possible.&lt;br /&gt;
* Gromacs 2019 example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/10.2.2 gcc/8.5.0 gromacs/2019.6&lt;br /&gt;
export OMP_NUM_THREADS=8&lt;br /&gt;
export OMP_PLACES=cores&lt;br /&gt;
gmx mdrun -pin off -ntmpi 1 -ntomp 8  ... &amp;lt;other parameters&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Gromacs 2020 or later example: &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 gromacs/2021.5&lt;br /&gt;
export OMP_NUM_THREADS=8&lt;br /&gt;
export OMP_PLACES=cores&lt;br /&gt;
gmx mdrun -pin off -ntmpi 1 -ntomp 8  ... &amp;lt;other parameters&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Large Simulation ===&lt;br /&gt;
If the memory available to a single-GPU job (~58GB) is not sufficient for the simulation, multiple GPUs can be used. It is suggested to start testing with one full node (4 GPUs) and to force PME onto the GPU. Multiple PME ranks are not supported with PME on GPU, so if a GPU is used for the PME calculation, -npme (the number of PME ranks) must be set to 1. If PME has less work than PP, it is suggested to run multiple ranks per GPU, so that the GPU hosting the PME rank can also do some work for the PP rank(s).&lt;br /&gt;
'''If your simulation fits in a single-GPU job, please use a single GPU to get much higher efficiency. Do not waste 3 additional GPUs for only a small performance improvement.'''&lt;br /&gt;
*An example using 4 GPUs, 7 PP ranks/tmpi threads + 1 PME rank/tmpi thread: ('''-pin on -pme gpu -npme 1''' must be added to mdrun command in order to force GPU to do PME)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=1&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3  gcc/9.4.0 gromacs/2021.5&lt;br /&gt;
&lt;br /&gt;
export OMP_NUM_THREADS=4&lt;br /&gt;
gmx mdrun -ntmpi 8 -pin on -pme gpu -npme 1 ... &amp;lt;add your parameters&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*It is suggested to also test using '''-ntmpi 4''' and '''export OMP_NUM_THREADS=8''' if you receive a NOTE in Gromacs output saying &amp;quot;% performance was lost because the PME ranks had more work to do than the PP ranks&amp;quot;. In this case, NVIDIA MPS is not needed since there is only one MPI rank per GPU.&lt;br /&gt;
*'''Please note that PME on GPU is still an initial implementation and comes with the limitations listed below.'''&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
* Only a PME order of 4 is supported on GPUs.&lt;br /&gt;
* PME will run on a GPU only when exactly one rank has a PME task, i.e. decompositions with multiple ranks doing PME are not supported.&lt;br /&gt;
* Only single precision is supported.&lt;br /&gt;
* Free energy calculations where charges are perturbed are not supported, because only single PME grids can be calculated.&lt;br /&gt;
* Only dynamical integrators are supported (i.e. leap-frog, Velocity Verlet, stochastic dynamics).&lt;br /&gt;
* LJ PME is not supported on GPUs.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*An example using 4 GPUs, '''PME on CPU''': ('''-pin on''' must be added to mdrun command for proper CPU thread bindings)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=1&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3  gcc/9.4.0 gromacs/2021.5&lt;br /&gt;
&lt;br /&gt;
export OMP_NUM_THREADS=4&lt;br /&gt;
gmx mdrun -ntmpi 8 -pin on  ... &amp;lt;add your parameters&amp;gt;&lt;br /&gt;
&lt;br /&gt;
# &amp;quot;-ntmpi 16, OMP_NUM_THREADS=2&amp;quot; and &amp;quot;-ntmpi 4, OMP_NUM_THREADS=8&amp;quot; should also be tested.  &lt;br /&gt;
# num_thread_MPI_ranks(-ntmpi) * num_OpenMP_threads = 32&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*'''NOTE: The above examples will NOT work with multiple nodes. If simulation is too large for a single GPU node, please contact SciNet/SOSCIP support.'''&lt;br /&gt;
&lt;br /&gt;
== NAMD ==&lt;br /&gt;
[http://www.ks.uiuc.edu/Research/namd/ NAMD] is a parallel, object-oriented molecular dynamics code designed for high-performance simulation of large biomolecular systems.&lt;br /&gt;
=== 2.14 ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 spectrum-mpi/10.4.0 namd/2.14&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
==== Running with single GPU ====&lt;br /&gt;
If you have many jobs to run, it is always suggested to use a single GPU per job. This makes jobs easier to schedule and gives better overall performance.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 spectrum-mpi/10.4.0 namd/2.14&lt;br /&gt;
scontrol show hostnames &amp;gt; nodelist-$SLURM_JOB_ID&lt;br /&gt;
&lt;br /&gt;
`which charmrun` -npernode 1 -bind-to none -hostfile nodelist-$SLURM_JOB_ID `which namd2` +idlepoll +ppn 8 +p 8 stmv.namd&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Running with one process per node (4 GPUs)====&lt;br /&gt;
An example of the job script (using 1 node, '''one process per node''',  32 CPU threads per process + 4 GPUs per process):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=1&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 spectrum-mpi/10.4.0 namd/2.14&lt;br /&gt;
scontrol show hostnames &amp;gt; nodelist-$SLURM_JOB_ID&lt;br /&gt;
&lt;br /&gt;
`which charmrun` -npernode 1 -hostfile nodelist-$SLURM_JOB_ID `which namd2` +setcpuaffinity +pemap 0-127:4 +idlepoll +ppn 32 +p $((32*SLURM_NTASKS)) stmv.namd&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
==== Running with one process per GPU (4 GPUs)====&lt;br /&gt;
NAMD may scale better if using '''one process per GPU'''. Please do your own benchmark.&lt;br /&gt;
An example of the job script (using 1 node, '''one process per GPU''',  8 CPU threads per process):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=4&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 spectrum-mpi/10.4.0 namd/2.14&lt;br /&gt;
scontrol show hostnames &amp;gt; nodelist-$SLURM_JOB_ID&lt;br /&gt;
&lt;br /&gt;
`which charmrun` -npernode 4 -hostfile nodelist-$SLURM_JOB_ID `which namd2` +setcpuaffinity +pemap 0-127:4 +idlepoll +ppn 8 +p $((8*SLURM_NTASKS)) stmv.namd&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Open-CE ==&lt;br /&gt;
[https://github.com/open-ce/open-ce Open-CE] is an '''IBM''' repository that collects feedstocks, environment data, and scripts for building TensorFlow, PyTorch, and other machine learning packages and their dependencies. Open-CE is distributed as a '''conda channel''' on the Mist cluster.&lt;br /&gt;
'''Available packages and versions are listed here: [https://github.com/open-ce/open-ce/releases/tag/open-ce-v1.7.2 Open-CE Releases]'''. Currently only Python 3.8 and CUDA 11.4 are supported. If you need a different Python or CUDA version, please check the older versions on the OSU server https://ftp.osuosl.org/pub/open-ce/ or contact SOSCIP/SciNet support.&lt;br /&gt;
&lt;br /&gt;
*Packages can be installed by specifying the Open-CE conda channel:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda install -c https://ftp.osuosl.org/pub/open-ce/1.7.2/ python=3.8 cudatoolkit=11.4 PACKAGE&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*Once the installation finishes, please clean the cache:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf $HOME/.conda/pkgs/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|+Available Packages:&lt;br /&gt;
|-&lt;br /&gt;
|TensorFlow&lt;br /&gt;
|TensorFlow Estimators&lt;br /&gt;
|TensorFlow Probability&lt;br /&gt;
|TensorBoard&lt;br /&gt;
|TensorBoard Data Server&lt;br /&gt;
|TensorFlow Text&lt;br /&gt;
|TensorFlow Model Optimizations&lt;br /&gt;
|TensorFlow Addons (tensorflow-addons)&lt;br /&gt;
|TensorFlow Datasets&lt;br /&gt;
|TensorFlow Hub&lt;br /&gt;
|-&lt;br /&gt;
|TensorFlow MetaData&lt;br /&gt;
|PyTorch&lt;br /&gt;
|TorchText&lt;br /&gt;
|TorchVision&lt;br /&gt;
|PyTorch Lightning&lt;br /&gt;
|PyTorch Lightning Bolts&lt;br /&gt;
|ONNX&lt;br /&gt;
|Onnx-runtime&lt;br /&gt;
|skl2onnx&lt;br /&gt;
|tf2onnx&lt;br /&gt;
|-&lt;br /&gt;
|onnxmltools&lt;br /&gt;
|onnxconverter-common&lt;br /&gt;
|XGBoost&lt;br /&gt;
|LightGBM&lt;br /&gt;
|Transformers&lt;br /&gt;
|Tokenizers&lt;br /&gt;
|SentencePiece&lt;br /&gt;
|Spacy&lt;br /&gt;
|DALI&lt;br /&gt;
|OpenCV&lt;br /&gt;
|-&lt;br /&gt;
|Horovod&lt;br /&gt;
|PyArrow&lt;br /&gt;
|grpc&lt;br /&gt;
|uwsgi&lt;br /&gt;
|ORC&lt;br /&gt;
|Mamba&lt;br /&gt;
|Ray (ray-tune)&lt;br /&gt;
|pytorch_geometric&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== PyTorch ==&lt;br /&gt;
=== Installing from Open-CE Conda Channel ===&lt;br /&gt;
The easiest way to install PyTorch on Mist is to use IBM's Open-CE Conda channel: prepare a conda environment and install PyTorch into it from that channel.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n pytorch python=3.9&lt;br /&gt;
source activate pytorch&lt;br /&gt;
&lt;br /&gt;
#force the Open-CE channel to avoid the CPU-only version of PyTorch from the default Anaconda channel&lt;br /&gt;
conda config --prepend channels /scinet/mist/ibm/open-ce-1.9.1&lt;br /&gt;
conda config --set channel_priority strict&lt;br /&gt;
&lt;br /&gt;
conda install -c /scinet/mist/ibm/open-ce-1.9.1 pytorch=2.0.1 cudatoolkit=11.8&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Once the installation finishes, please clean the cache:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf $HOME/.conda/pkgs/*&lt;br /&gt;
#remove .condarc to reset conda channel priority&lt;br /&gt;
rm -f $HOME/.condarc&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To get deterministic results, add the command below to your job script before the python command (see [https://github.com/pytorch/pytorch/issues/39849] for details):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
export CUBLAS_WORKSPACE_CONFIG=:4096:2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
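As a minimal sketch of where this line goes, the single-GPU job script below combines it with the environment created above. The script name &amp;lt;code&amp;gt;train.py&amp;lt;/code&amp;gt; and the requested walltime are placeholders for illustration, not part of the recipe above:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
#SBATCH --time=1:00:00&lt;br /&gt;
&lt;br /&gt;
module load anaconda3&lt;br /&gt;
source activate pytorch&lt;br /&gt;
# set the variable before python starts, so it is already in the environment when the program runs&lt;br /&gt;
export CUBLAS_WORKSPACE_CONFIG=:4096:2&lt;br /&gt;
python train.py   # train.py is a hypothetical script name&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;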
&lt;br /&gt;
== RAPIDS ==&lt;br /&gt;
[https://rapids.ai RAPIDS] is a suite of open-source software libraries for executing end-to-end data science and analytics pipelines entirely on GPUs. The RAPIDS data science framework includes a collection of libraries: cuDF (GPU DataFrames), cuML (GPU machine learning algorithms), cuStrings (GPU string manipulation), etc.&lt;br /&gt;
*'''The most recent version supported is 0.13.0. Newer versions are no longer provided by IBM's powerai channel.'''&lt;br /&gt;
&lt;br /&gt;
=== Installing from IBM Conda Channel ===&lt;br /&gt;
The easiest way to install RAPIDS on Mist is to use IBM's Conda channel: prepare a conda environment with Python 3.6 or 3.7 and install powerai-rapids from that channel. Python 3.8+ is not supported.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n rapids_env python=3.7&lt;br /&gt;
source activate rapids_env&lt;br /&gt;
conda install -c https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda-early-access/ powerai-rapids&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Once the installation finishes, please clean the cache:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf $HOME/.conda/pkgs/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== TensorFlow and Keras ==&lt;br /&gt;
=== Installing from IBM Conda Channel ===&lt;br /&gt;
The easiest way to install TensorFlow and Keras on Mist is to use IBM's Open-CE Conda channel: prepare a conda environment and install TensorFlow into it from that channel.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n tf_env python=3.8&lt;br /&gt;
source activate tf_env&lt;br /&gt;
&lt;br /&gt;
conda install -c https://ftp.osuosl.org/pub/open-ce/1.7.2/ tensorflow=2.9.2 cudatoolkit=11.4&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Once the installation finishes, please clean the cache:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf $HOME/.conda/pkgs/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Ray ==&lt;br /&gt;
Ray is an API for building distributed applications. A local wheel is available for Python 3.8, but some tinkering is required to install it successfully. Please activate your Conda environment and follow this installation recipe:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda install tabulate tensorboardX pandas dataclasses aiohttp aioredis click colorama colorful filelock gpustat grpcio jsonschema numpy protobuf py-spy pyyaml requests redis opencensus prometheus_client beautifulsoup4 soupsieve cython wheel&lt;br /&gt;
pip install msgpack google aiohttp_cors&lt;br /&gt;
# Manually add py-spy version info, since Conda forgot to do that&lt;br /&gt;
PYSPYVERSION=$(py-spy --version | cut -d ' ' -f 2)&lt;br /&gt;
mkdir $CONDA_PREFIX/lib/python3.8/site-packages/py_spy-$PYSPYVERSION.dist-info&lt;br /&gt;
echo -e &amp;quot;Metadata-Version: 2.1\nName: py-spy\nVersion: $PYSPYVERSION&amp;quot; &amp;gt; $CONDA_PREFIX/lib/python3.8/site-packages/py_spy-$PYSPYVERSION.dist-info/METADATA&lt;br /&gt;
pip install /scinet/mist/wheelhouse/experimental/2021a/ray-1.0.1.post1-cp38-cp38-linux_ppc64le.whl&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf ~/.conda/pkgs/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Testing and debugging =&lt;br /&gt;
Test your code before you submit it to the cluster, both to check that it is correct and to find out what resources it needs.&lt;br /&gt;
* Small test jobs can be run on the login node. Rule of thumb: tests should run for no more than a couple of minutes, take at most about 1-2GB of memory, and use no more than one GPU and a few cores.&lt;br /&gt;
&amp;lt;!-- * You can run the [[Parallel Debugging with DDT|DDT]] debugger on the login nodes after &amp;lt;code&amp;gt;module load ddt&amp;lt;/code&amp;gt;. --&amp;gt;&lt;br /&gt;
* For short tests that do not fit on a login node, or for which you need a dedicated node, request an interactive debug job with the debugjob command:&lt;br /&gt;
 mist-login01:~$ debugjob --clean -g G&lt;br /&gt;
where G is the number of GPUs. G=1 gives an interactive session for 2 hours, G=4 gets you a single node with 4 GPUs for 30 minutes, and G=8 (the maximum) gets you 2 nodes, each with 4 GPUs, for 15 minutes.  The &amp;lt;tt&amp;gt;--clean&amp;lt;/tt&amp;gt; argument is optional but recommended, as it starts the session without any modules loaded, thus mimicking more closely what happens when you submit a job script. Users need to load modules and activate the conda environment after a debug job starts. If the --clean flag was omitted, it is recommended to do a 'conda clean' before 'source activate ENV' in the debug job.&lt;br /&gt;
&lt;br /&gt;
= Submitting jobs =&lt;br /&gt;
Once you have compiled and tested your code or workflow on the Mist login nodes, and confirmed that it behaves correctly, you are ready to submit jobs to the cluster.  Your jobs will run on some of Mist's 53 compute nodes.  When and where your job runs is determined by the scheduler.&lt;br /&gt;
&lt;br /&gt;
Mist uses SLURM as its job scheduler. It is configured to allow only '''Single-GPU jobs''' and '''Full-node jobs (4 GPUs per node)'''.&lt;br /&gt;
&lt;br /&gt;
You submit jobs from a login node by passing a script to the sbatch command:&lt;br /&gt;
&lt;br /&gt;
 mist-login01:scratch$ sbatch jobscript.sh&lt;br /&gt;
&lt;br /&gt;
This puts the job in the queue. It will run on the compute nodes in due course. In most cases, you should not submit from your $HOME directory, but rather, from your $SCRATCH directory, so that the output of your compute job can be written out (as mentioned above, $HOME is read-only on the compute nodes).&lt;br /&gt;
&lt;br /&gt;
Example job scripts can be found below.&lt;br /&gt;
Keep in mind:&lt;br /&gt;
* Scheduling is by single GPU or by full node, so request either 1 GPU or 4 GPUs per node.&lt;br /&gt;
* Your job's maximum walltime is 24 hours. &lt;br /&gt;
* Jobs must write their output to your scratch or project directory (home is read-only on compute nodes).&lt;br /&gt;
* Compute nodes have no internet access.&lt;br /&gt;
* Your job script will not remember the modules you have loaded, so it needs to contain &amp;quot;module load&amp;quot; commands of all the required modules (see examples below). &lt;br /&gt;
== SOSCIP Users ==&lt;br /&gt;
*[https://www.soscip.org SOSCIP] is a consortium to bring together industrial partners and academic researchers and provide them with sophisticated advanced computing technologies and expertise to solve social, technical and business challenges across sectors and drive economic growth.&lt;br /&gt;
&lt;br /&gt;
If you are working on a SOSCIP project, please contact [mailto:soscip-support@scinet.utoronto.ca soscip-support@scinet.utoronto.ca] to have your user account added to the SOSCIP project accounts. SOSCIP users need to submit jobs with an additional SLURM flag to get higher priority:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#SBATCH -A soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt;    #e.g. soscip-3-001&lt;br /&gt;
OR&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Single-GPU job script ==&lt;br /&gt;
A single-GPU job gets a quarter of a node: 1 GPU + 8 CPU cores (32 hardware threads) + ~58GB CPU memory. '''Never request CPUs or memory explicitly.''' If you are running an MPI program, set --ntasks to the number of MPI ranks. '''Do NOT set --ntasks for non-MPI programs.''' &lt;br /&gt;
*It is suggested to use NVIDIA Multi-Process Service (MPS) if running multiple MPI ranks on one GPU.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
#SBATCH --time=1:00:00&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt; #For SOSCIP projects only&lt;br /&gt;
&lt;br /&gt;
module load anaconda3&lt;br /&gt;
source activate conda_env&lt;br /&gt;
python code.py ...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Full-node job script ==&lt;br /&gt;
'''If you are not sure the program can be executed on multiple GPUs, please follow the single-gpu job instruction above or contact SciNet/SOSCIP support.'''&lt;br /&gt;
&lt;br /&gt;
A multi-GPU job should request at least one full node (4 GPUs). Users need to specify the &amp;quot;compute_full_node&amp;quot; partition in order to get all the resources on a node. &lt;br /&gt;
*An example for a 1-node job:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=4 #this only affects MPI jobs&lt;br /&gt;
#SBATCH --time=1:00:00&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt; #For SOSCIP projects only&lt;br /&gt;
&lt;br /&gt;
module load &amp;lt;modules you need&amp;gt;&lt;br /&gt;
# run your program here&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Limits ==&lt;br /&gt;
&lt;br /&gt;
There are limits to the size and duration of your jobs, the number of jobs you can run and the number of jobs you can have queued.&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!Usage&lt;br /&gt;
!Partition&lt;br /&gt;
!Running jobs&lt;br /&gt;
!Jobs in queue&lt;br /&gt;
!Min. size of jobs&lt;br /&gt;
!Max. size of jobs&lt;br /&gt;
!Min. walltime&lt;br /&gt;
!Max. walltime &lt;br /&gt;
|-&lt;br /&gt;
|Compute jobs ||compute || 100 GPUs || 1000 || 1 GPU (8&amp;amp;nbsp;cores) || default:&amp;amp;nbsp;4&amp;amp;nbsp;nodes&amp;amp;nbsp;(16&amp;amp;nbsp;GPUs) &amp;lt;br&amp;gt; with&amp;amp;nbsp;allocation:&amp;amp;nbsp;4&amp;amp;nbsp;nodes&amp;amp;nbsp;(16&amp;amp;nbsp;GPUs)|| 15 minutes || 24 hours&lt;br /&gt;
|-&lt;br /&gt;
|Testing or troubleshooting || debug || 1 || 1 || 1 GPU (8 cores) || 2 nodes (8 GPUs)|| N/A || 2/n&amp;lt;sub&amp;gt;gpu&amp;lt;/sub&amp;gt; hours&lt;br /&gt;
|-&lt;br /&gt;
|}&lt;br /&gt;
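&lt;br /&gt;
The debug walltime in the table above, 2/n&amp;lt;sub&amp;gt;gpu&amp;lt;/sub&amp;gt; hours, can be checked against the debugjob session lengths quoted earlier with a little shell arithmetic. This is only an illustration of the formula (written as 120/n&amp;lt;sub&amp;gt;gpu&amp;lt;/sub&amp;gt; minutes to stay in integers), not a SciNet tool:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
for ngpu in 1 4 8; do&lt;br /&gt;
    echo &amp;quot;$ngpu GPU(s): $((120 / ngpu)) minutes&amp;quot;&lt;br /&gt;
done&lt;br /&gt;
# prints:&lt;br /&gt;
# 1 GPU(s): 120 minutes&lt;br /&gt;
# 4 GPU(s): 30 minutes&lt;br /&gt;
# 8 GPU(s): 15 minutes&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;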
&lt;br /&gt;
Even if you respect these limits, your jobs will still have to wait in the queue. The waiting time depends on many factors such as your group's allocation amount, how much allocation has been used in the recent past, the number of requested nodes and walltime, and how many other jobs are waiting in the queue.&lt;br /&gt;
&lt;br /&gt;
= Jupyter Notebooks =&lt;br /&gt;
SciNet’s [[Jupyter Hub]] is a Niagara-type node; it has a different CPU architecture and no GPUs, so Conda environments prepared on Mist will not work properly there. Users who need Jupyter Notebook to develop and test some aspects of their workflow can create their own server on the Mist login node and use an SSH tunnel to connect to it from outside. Keep in mind that the login node is a shared resource, and heavy calculations should be done only on compute nodes. Processes (including the IPython kernels used by the notebooks) are limited to one hour of total CPU time: idle time does not count toward this hour, and use of multiple cores counts proportionally to the number of cores (i.e. a kernel using all 128 virtual cores on the node will be killed after 28 seconds). Idle notebooks can still burden the node by hogging system and GPU memory, so please be mindful of other users and terminate notebooks when your work is done.&lt;br /&gt;
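&lt;br /&gt;
The 28-second figure follows from dividing the one-hour CPU-time budget by the number of busy cores. A quick sketch of that arithmetic in the shell (integer division, so results are rounded down; the core counts other than 128 are illustrative choices):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
budget=3600   # one hour of CPU time, in seconds&lt;br /&gt;
for cores in 1 8 128; do&lt;br /&gt;
    echo &amp;quot;$cores core(s): about $((budget / cores)) s of wall time&amp;quot;&lt;br /&gt;
done&lt;br /&gt;
# the last line printed is: 128 core(s): about 28 s of wall time&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;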
&lt;br /&gt;
As an example, let us create a new Conda environment and activate it:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n jupyter_env python=3.7&lt;br /&gt;
source activate jupyter_env&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Install the Jupyter Notebook server:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda install notebook&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Running the notebook server ==&lt;br /&gt;
When the Conda environment is active, enter:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
jupyter-notebook&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
By default, the Jupyter Notebook server uses port 8888 (can be overridden with the &amp;lt;code&amp;gt;--port&amp;lt;/code&amp;gt; option). If another user has already started their own server, the default port may be busy, in which case the server will be listening on a different port. Once launched, the server will output some information to the terminal that will include the actual port number used and a 48-character token. For example:&lt;br /&gt;
&amp;lt;pre&amp;gt;http://localhost:8890/?token=54c4090d……&amp;lt;/pre&amp;gt;&lt;br /&gt;
In this example, the server is listening on port 8890.&lt;br /&gt;
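Since the login node has no local browser for you to use, it can be convenient to start the server without one and to request a specific port. A minimal sketch, assuming 8890 is the port you would like (if it is busy, the server falls back to another port and reports it in its output, as described above):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# --no-browser: do not try to open a web browser on the login node&lt;br /&gt;
# --port: preferred port for the server&lt;br /&gt;
jupyter-notebook --no-browser --port=8890&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;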
&lt;br /&gt;
== Creating a tunnel ==&lt;br /&gt;
In order to access this port remotely (i.e. from your office or home), an [https://en.wikipedia.org/wiki/Tunneling_protocol#Secure_Shell_tunneling SSH tunnel] has to be established. Please refer to your SSH client’s documentation for instructions on how to do that. For the OpenSSH client (standard in most Linux distributions and macOS), a tunnel can be opened in a separate terminal session to the one where the Jupyter Notebook server is running. In the new terminal, issue this command:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -L8888:localhost:8890 &amp;lt;username&amp;gt;@mist.scinet.utoronto.ca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
(Replace &amp;lt;code&amp;gt;&amp;lt;username&amp;gt;&amp;lt;/code&amp;gt; with your actual username.) The tunnel is open as long as this SSH connection is alive. In this example, we tunnel the Mist login node’s port 8890 (where our server is assumed to be running) to our home computer’s port 8888 (any other free port is fine). The notebook can be accessed in the browser at the &amp;lt;code&amp;gt;&amp;lt;nowiki&amp;gt;http://localhost:8888&amp;lt;/nowiki&amp;gt;&amp;lt;/code&amp;gt; address (followed by &amp;lt;code&amp;gt;/?token=54c4090d……&amp;lt;/code&amp;gt;, or the token can be input on the webpage).&lt;br /&gt;
&lt;br /&gt;
== Using Jupyter on compute nodes ==&lt;br /&gt;
&lt;br /&gt;
You can use the instructions here to set up a Jupyter Notebook server on a compute node (including a [[#Testing_and_debugging|debugjob]]). '''We strongly discourage''' you from running an interactive notebook on a compute node (other than for a debugjob): scheduled jobs start at unpredictable times and are not meant to be interactive. Jupyter notebooks can be run non-interactively or converted to Python scripts.&lt;br /&gt;
&lt;br /&gt;
To launch the Jupyter Notebook server, load the &amp;lt;code&amp;gt;anaconda3&amp;lt;/code&amp;gt; module and activate your environment as before (by adding the appropriate lines to the submission script, if you are not using the compute node with an interactive shell). Launching the server has to be done like so:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
HOME=/dev/shm/$USER jupyter-notebook&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
That is because Jupyter will fail unless it can write to the home folder, which is read-only from compute nodes. This modification of the &amp;lt;code&amp;gt;$HOME&amp;lt;/code&amp;gt; environment variable will carry over into the notebooks, which is usually not a problem, but in case the notebook relies on this environment variable (e.g. to read certain files), it can be reset manually in the notebook (&amp;lt;code&amp;gt;import os; os.environ['HOME']=……&amp;lt;/code&amp;gt;).&lt;br /&gt;
&lt;br /&gt;
Because compute nodes are not accessible from the Internet, tunneling has to be done twice, once from the remote location (office or home) to the Mist login node, and then from the login node to the compute node. Assuming the server is running on port 8890 of the mist006 node, open the first tunnel in a new terminal session in the remote computer:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -L8888:localhost:9999 &amp;lt;username&amp;gt;@mist.scinet.utoronto.ca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
where 9999 is any available port on the Mist login node (to test port availability enter &amp;lt;code&amp;gt;ss -Hln src :9999&amp;lt;/code&amp;gt; in the terminal when connected to the Mist login node; an empty output indicates that the port is free). In the same session in the login node that was created with the above command, open the second tunnel to the compute node:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -L9999:localhost:8890 mist006&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Be aware that the second tunnel will automatically disconnect once the job on the compute node times out or is relinquished. The Jupyter Notebook server running on the compute node can now be accessed from the browser as in the previous subsection.&lt;br /&gt;
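With a reasonably recent OpenSSH client, the two hops can also be written as a single command using the &amp;lt;code&amp;gt;-J&amp;lt;/code&amp;gt; (jump host) option. This is only a sketch: it assumes that your keys let you authenticate to the compute node through the login node in this way, which the instructions above do not confirm, and it reuses the example node mist006 and port 8890:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# local port 8888 -&amp;gt; (via the Mist login node) -&amp;gt; port 8890 on mist006&lt;br /&gt;
ssh -J &amp;lt;username&amp;gt;@mist.scinet.utoronto.ca -L8888:localhost:8890 &amp;lt;username&amp;gt;@mist006&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
If this does not work for your account, the two-step procedure above remains the reference method.&lt;br /&gt;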
&lt;br /&gt;
&lt;br /&gt;
= Support =&lt;br /&gt;
&lt;br /&gt;
SciNet inquiries:&lt;br /&gt;
* [mailto:support@scinet.utoronto.ca support@scinet.utoronto.ca]&lt;br /&gt;
&lt;br /&gt;
SOSCIP inquiries:&lt;br /&gt;
*[mailto:soscip-support@scinet.utoronto.ca soscip-support@scinet.utoronto.ca]&lt;/div&gt;</summary>
		<author><name>Ymeiron</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=VS_Code&amp;diff=5451</id>
		<title>VS Code</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=VS_Code&amp;diff=5451"/>
		<updated>2024-02-13T19:29:52Z</updated>

		<summary type="html">&lt;p&gt;Ymeiron: Version 1.86.1 supports legacy server&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;As of version 1.86, VS Code does not support CentOS 7 on the server side (see [https://code.visualstudio.com/docs/remote/linux here]). The implication is that remote development is no longer compatible with Niagara. The easiest solution that is currently working is to continue using version 1.85, if that is possible. Otherwise, one could try to implement the workaround described below.&lt;br /&gt;
&lt;br /&gt;
'''Update:''' as of version 1.86.1 VS Code is working again on Niagara without going through the procedure below (but support may be dropped again in a future version, see [https://github.com/microsoft/vscode/issues/204135 link]).&lt;br /&gt;
&lt;br /&gt;
== Introduction ==&lt;br /&gt;
In this workaround you will make VS Code connect to a containerized environment that has newer versions of the operating system libraries. While this environment runs on Niagara, it is mostly isolated in terms of software. Thus, running software (modules, virtual environments, code that you have compiled, job submission) within this environment may not work or may cause unexpected issues. It should be enough, though, for VS Code to install the server components and provide at least basic functionality. It is recommended to use the built-in terminal in VS Code to SSH out of this container to a normal Niagara login node, where normal functionality will be available in the shell, as described below.&lt;br /&gt;
&lt;br /&gt;
This procedure is a &amp;quot;hack&amp;quot; and may or may not provide full functionality, or work at all. Unfortunately we cannot guarantee anything, and provide this recipe &amp;quot;as is&amp;quot;. Nevertheless, please contact [mailto:support@scinet.utoronto.ca SciNet support] to report any issues, as we may be able to make some improvements.&lt;br /&gt;
&lt;br /&gt;
== Adding special host configuration ==&lt;br /&gt;
In this step, you add a new host to your SSH configurations '''on the client side''' (i.e. your laptop or workstation). That is typically done by editing the file &amp;lt;code&amp;gt;~/.ssh/config&amp;lt;/code&amp;gt; for OpenSSH clients. Add the following text&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Host niagara-rocky9&lt;br /&gt;
    HostName niagara.scinet.utoronto.ca&lt;br /&gt;
    User ccdbusername&lt;br /&gt;
    SetEnv XMODIFIERS=niagara-rocky9&lt;br /&gt;
    ForwardAgent yes&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
and replace &amp;lt;code&amp;gt;ccdbusername&amp;lt;/code&amp;gt; with your own user name. The important parts here are setting the &amp;lt;code&amp;gt;XMODIFIERS&amp;lt;/code&amp;gt; environment variable on the remote side (i.e. on Niagara) and enabling SSH agent forwarding, which provides the authentication needed to &amp;quot;escape&amp;quot; the container after VS Code has started.&lt;br /&gt;
&lt;br /&gt;
== Modifying &amp;lt;code&amp;gt;.bashrc&amp;lt;/code&amp;gt; ==&lt;br /&gt;
In this step, we instruct the shell interpreter on Niagara to switch to the containerized environment when needed. Edit the file &amp;lt;code&amp;gt;~/.bashrc&amp;lt;/code&amp;gt; '''on Niagara''' and copy the text below into '''the top''' of the file. Be very careful in this step, since an incorrect definition in this file could prevent you from connecting to Niagara altogether (if that happens, only a system administrator would be able to revert the change).&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
if [[ -z &amp;quot;$NIAGARA_ROCKY9_ENABLED&amp;quot; ]] &amp;amp;&amp;amp; [[ &amp;quot;$XMODIFIERS&amp;quot; == &amp;quot;niagara-rocky9&amp;quot; ]]; then&lt;br /&gt;
    export NIAGARA_ROCKY9_ENABLED=1&lt;br /&gt;
    exec /scinet/niagara/containers/rocky9&lt;br /&gt;
fi&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
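&lt;br /&gt;
The two tests in the snippet above play different roles: &amp;lt;code&amp;gt;XMODIFIERS&amp;lt;/code&amp;gt; marks connections made through the &amp;lt;code&amp;gt;niagara-rocky9&amp;lt;/code&amp;gt; host entry, while &amp;lt;code&amp;gt;NIAGARA_ROCKY9_ENABLED&amp;lt;/code&amp;gt; is exported just before &amp;lt;code&amp;gt;exec&amp;lt;/code&amp;gt;, presumably so that a shell started inside the container does not try to enter it again. A minimal sketch of that decision as a standalone function follows; the name &amp;lt;code&amp;gt;should_enter_container&amp;lt;/code&amp;gt; is hypothetical and not part of the SciNet setup, and the function never runs &amp;lt;code&amp;gt;exec&amp;lt;/code&amp;gt;, so it can be tried in any shell without touching &amp;lt;code&amp;gt;~/.bashrc&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical helper mirroring the guard logic; it only reports the decision
# and never runs exec, so it is safe to try in any shell.
should_enter_container() {
    # Already inside the container (flag exported before exec): do nothing.
    if [ -n "${NIAGARA_ROCKY9_ENABLED:-}" ]; then
        return 1
    fi
    # Enter only for connections made through the niagara-rocky9 host entry.
    if [ "${XMODIFIERS:-}" = "niagara-rocky9" ]; then
        return 0
    fi
    return 1
}

if should_enter_container; then
    echo "would switch to the container"
else
    echo "would stay in the normal shell"
fi
```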
&lt;br /&gt;
== Starting SSH agent ==&lt;br /&gt;
This step is not mandatory, but necessary if you want to escape the containerized environment in the built-in terminal in VS Code. Prior to starting VS Code, start an SSH agent on your laptop or workstation and make sure your appropriate (CCDB) keys are added. See these links for help on the topic: [[SSH#.28Optional.29_Using_ssh-agent_to_Remember_Your_Key|SciNet Wiki]], [https://docs.alliancecan.ca/wiki/Using_SSH_keys_in_Linux#Using_ssh-agent Alliance Wiki], [https://code.visualstudio.com/docs/remote/troubleshooting#_making-local-ssh-agent-available-on-the-remote VS Code troubleshooting page].&lt;br /&gt;
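&lt;br /&gt;
As a rough pre-flight check on the client side, the sketch below tests whether an agent appears to be running before VS Code is started. It assumes the agent advertises itself through the &amp;lt;code&amp;gt;SSH_AUTH_SOCK&amp;lt;/code&amp;gt; environment variable, as OpenSSH's &amp;lt;code&amp;gt;ssh-agent&amp;lt;/code&amp;gt; does; the function name &amp;lt;code&amp;gt;agent_ready&amp;lt;/code&amp;gt; is made up for illustration.&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical pre-flight check, run on your laptop or workstation before
# starting VS Code; the function name is an assumption for illustration.
agent_ready() {
    # ssh-agent exports SSH_AUTH_SOCK; if it is empty there is no agent to forward.
    if [ -z "${SSH_AUTH_SOCK:-}" ]; then
        echo "no agent found: start one with ssh-agent and add your key with ssh-add"
        return 1
    fi
    echo "agent socket: $SSH_AUTH_SOCK"
    return 0
}

agent_ready || echo "agent forwarding will not work until an agent is running"
```
&lt;br /&gt;
If the check passes, &amp;lt;code&amp;gt;ssh-add -l&amp;lt;/code&amp;gt; lists the keys that the agent currently holds.&lt;br /&gt;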
&lt;br /&gt;
&lt;br /&gt;
== Using VS Code now ==&lt;br /&gt;
At this point you can start VS Code and find &amp;lt;code&amp;gt;niagara-rocky9&amp;lt;/code&amp;gt; in the list of remotes. You can connect to it normally and find your working directory on Niagara. If you started an SSH agent as described in the previous step, you should be able to open the terminal and type &amp;lt;code&amp;gt;ssh $HOSTNAME&amp;lt;/code&amp;gt; and get a normal bash shell on one of the Niagara login nodes. You may also SSH into Mist if that is needed.&lt;br /&gt;
&lt;br /&gt;
== Troubleshooting ==&lt;br /&gt;
Some issues can be solved by restarting the server process. To do that, close all VS Code windows, connect to Niagara ''normally'' via SSH and run the command &amp;lt;code&amp;gt;kill-vscode-server&amp;lt;/code&amp;gt;. Then start VS Code and connect again. Please report difficulties to [mailto:support@scinet.utoronto.ca SciNet support].&lt;/div&gt;</summary>
		<author><name>Ymeiron</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=VS_Code&amp;diff=5448</id>
		<title>VS Code</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=VS_Code&amp;diff=5448"/>
		<updated>2024-02-06T20:44:45Z</updated>

		<summary type="html">&lt;p&gt;Ymeiron: Added troubleshooting&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;As of version 1.86, VS Code does not support CentOS 7 on the server side (see [https://code.visualstudio.com/docs/remote/linux here]). The implication is that remote development is no longer compatible with Niagara. The easiest solution that is currently working is to continue using version 1.85, if that is possible. Otherwise, one could try to implement the workaround described below.&lt;br /&gt;
&lt;br /&gt;
== Introduction ==&lt;br /&gt;
In this workaround you will make VS Code connect into a containerized environment that has newer versions of the operating system libraries. While this environment runs on Niagara, it is mostly isolated in terms of software. Thus, running software (modules, virtual environments, code that you have compiled, job submission) within this environment may not work or cause unexpected issues. It should be enough though for VS Code to install the server components and provide at least basic functionality. It is recommended to use the built-in terminal in VS Code to SSH out of this container to a normal Niagara login node, where normal functionality will be available in the shell, as described below.&lt;br /&gt;
&lt;br /&gt;
This procedure is a &amp;quot;hack&amp;quot; and may or may not provide full functionality, or work at all. Unfortunately we cannot guarantee anything, and provide this recipe &amp;quot;as is&amp;quot;. Nevertheless, please contact [mailto:support@scinet.utoronto.ca SciNet support] to report any issues, as we may be able to make some improvements.&lt;br /&gt;
&lt;br /&gt;
== Adding special host configuration ==&lt;br /&gt;
In this step, you add a new host to your SSH configurations '''on the client side''' (i.e. your laptop or workstation). That is typically done by editing the file &amp;lt;code&amp;gt;~/.ssh/config&amp;lt;/code&amp;gt; for OpenSSH clients. Add the following text&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Host niagara-rocky9&lt;br /&gt;
    HostName niagara.scinet.utoronto.ca&lt;br /&gt;
    User ccdbusername&lt;br /&gt;
    SetEnv XMODIFIERS=niagara-rocky9&lt;br /&gt;
    ForwardAgent yes&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
and replace &amp;lt;code&amp;gt;ccdbusername&amp;lt;/code&amp;gt; with your own user name. The important parts here are setting the &amp;lt;code&amp;gt;XMODIFIERS&amp;lt;/code&amp;gt; environment variable on the remote side (i.e. on Niagara) and enabling SSH agent forwarding, which provides the authentication needed to &amp;quot;escape&amp;quot; the container after VS Code has started.&lt;br /&gt;
&lt;br /&gt;
== Modifying &amp;lt;code&amp;gt;.bashrc&amp;lt;/code&amp;gt; ==&lt;br /&gt;
In this step, we instruct the shell interpreter on Niagara to switch to the containerized environment when needed. Edit the file &amp;lt;code&amp;gt;~/.bashrc&amp;lt;/code&amp;gt; '''on Niagara''' and copy the text below into '''the top''' of the file. Be very careful in this step, since an incorrect definition in this file could prevent you from connecting to Niagara altogether (if that happens, only a system administrator would be able to revert the change).&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
if [[ -z &amp;quot;$NIAGARA_ROCKY9_ENABLED&amp;quot; ]] &amp;amp;&amp;amp; [[ &amp;quot;$XMODIFIERS&amp;quot; == &amp;quot;niagara-rocky9&amp;quot; ]]; then&lt;br /&gt;
    export NIAGARA_ROCKY9_ENABLED=1&lt;br /&gt;
    exec /scinet/niagara/containers/rocky9&lt;br /&gt;
fi&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Starting SSH agent ==&lt;br /&gt;
This step is not mandatory, but necessary if you want to escape the containerized environment in the built-in terminal in VS Code. Prior to starting VS Code, start an SSH agent on your laptop or workstation and make sure your appropriate (CCDB) keys are added. See these links for help on the topic: [[SSH#.28Optional.29_Using_ssh-agent_to_Remember_Your_Key|SciNet Wiki]], [https://docs.alliancecan.ca/wiki/Using_SSH_keys_in_Linux#Using_ssh-agent Alliance Wiki], [https://code.visualstudio.com/docs/remote/troubleshooting#_making-local-ssh-agent-available-on-the-remote VS Code troubleshooting page].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Using VS Code now ==&lt;br /&gt;
At this point you can start VS Code and find &amp;lt;code&amp;gt;niagara-rocky9&amp;lt;/code&amp;gt; in the list of remotes. You can connect to it normally and find your working directory on Niagara. If you started an SSH agent as described in the previous step, you should be able to open the terminal and type &amp;lt;code&amp;gt;ssh $HOSTNAME&amp;lt;/code&amp;gt; and get a normal bash shell on one of the Niagara login nodes. You may also SSH into Mist if that is needed.&lt;br /&gt;
&lt;br /&gt;
== Troubleshooting ==&lt;br /&gt;
Some issues can be solved by restarting the server process. To do that, close all VS Code windows, connect to Niagara ''normally'' via SSH and run the command &amp;lt;code&amp;gt;kill-vscode-server&amp;lt;/code&amp;gt;. Then start VS Code and connect again. Please report difficulties to [mailto:support@scinet.utoronto.ca SciNet support].&lt;/div&gt;</summary>
		<author><name>Ymeiron</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=VS_Code&amp;diff=5445</id>
		<title>VS Code</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=VS_Code&amp;diff=5445"/>
		<updated>2024-02-05T02:57:47Z</updated>

		<summary type="html">&lt;p&gt;Ymeiron: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;As of version 1.86, VS Code does not support CentOS 7 on the server side (see [https://code.visualstudio.com/docs/remote/linux here]). The implication is that remote development is no longer compatible with Niagara. The easiest solution that is currently working is to continue using version 1.85, if that is possible. Otherwise, one could try to implement the workaround described below.&lt;br /&gt;
&lt;br /&gt;
== Introduction ==&lt;br /&gt;
In this workaround you will make VS Code connect into a containerized environment that has newer versions of the operating system libraries. While this environment runs on Niagara, it is mostly isolated in terms of software. Thus, running software (modules, virtual environments, code that you have compiled, job submission) within this environment may not work or cause unexpected issues. It should be enough though for VS Code to install the server components and provide at least basic functionality. It is recommended to use the built-in terminal in VS Code to SSH out of this container to a normal Niagara login node, where normal functionality will be available in the shell, as described below.&lt;br /&gt;
&lt;br /&gt;
This procedure is a &amp;quot;hack&amp;quot; and may or may not provide full functionality, or work at all. Unfortunately we cannot guarantee anything, and provide this recipe &amp;quot;as is&amp;quot;. Nevertheless, please contact [mailto:support@scinet.utoronto.ca SciNet support] to report any issues, as we may be able to make some improvements.&lt;br /&gt;
&lt;br /&gt;
== Adding special host configuration ==&lt;br /&gt;
In this step, you add a new host to your SSH configurations '''on the client side''' (i.e. your laptop or workstation). That is typically done by editing the file &amp;lt;code&amp;gt;~/.ssh/config&amp;lt;/code&amp;gt; for OpenSSH clients. Add the following text&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Host niagara-rocky9&lt;br /&gt;
    HostName niagara.scinet.utoronto.ca&lt;br /&gt;
    User ccdbusername&lt;br /&gt;
    SetEnv XMODIFIERS=niagara-rocky9&lt;br /&gt;
    ForwardAgent yes&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
and replace &amp;lt;code&amp;gt;ccdbusername&amp;lt;/code&amp;gt; with your own user name. The important parts here are setting the &amp;lt;code&amp;gt;XMODIFIERS&amp;lt;/code&amp;gt; environment variable on the remote side (i.e. on Niagara) and enabling SSH agent forwarding, which provides the authentication needed to &amp;quot;escape&amp;quot; the container after VS Code has started.&lt;br /&gt;
&lt;br /&gt;
== Modifying &amp;lt;code&amp;gt;.bashrc&amp;lt;/code&amp;gt; ==&lt;br /&gt;
In this step, we instruct the shell interpreter on Niagara to switch to the containerized environment when needed. Edit the file &amp;lt;code&amp;gt;~/.bashrc&amp;lt;/code&amp;gt; '''on Niagara''' and copy the text below into '''the top''' of the file. Be very careful in this step, since an incorrect definition in this file could prevent you from connecting to Niagara altogether (if that happens, only a system administrator would be able to revert the change).&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
if [[ -z &amp;quot;$NIAGARA_ROCKY9_ENABLED&amp;quot; ]] &amp;amp;&amp;amp; [[ &amp;quot;$XMODIFIERS&amp;quot; == &amp;quot;niagara-rocky9&amp;quot; ]]; then&lt;br /&gt;
    export NIAGARA_ROCKY9_ENABLED=1&lt;br /&gt;
    exec /scinet/niagara/containers/rocky9&lt;br /&gt;
fi&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Starting SSH agent ==&lt;br /&gt;
This step is not mandatory, but necessary if you want to escape the containerized environment in the built-in terminal in VS Code. Prior to starting VS Code, start an SSH agent on your laptop or workstation and make sure your appropriate (CCDB) keys are added. See these links for help on the topic: [[SSH#.28Optional.29_Using_ssh-agent_to_Remember_Your_Key|SciNet Wiki]], [https://docs.alliancecan.ca/wiki/Using_SSH_keys_in_Linux#Using_ssh-agent Alliance Wiki], [https://code.visualstudio.com/docs/remote/troubleshooting#_making-local-ssh-agent-available-on-the-remote VS Code troubleshooting page].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Using VS Code now ==&lt;br /&gt;
At this point you can start VS Code and find &amp;lt;code&amp;gt;niagara-rocky9&amp;lt;/code&amp;gt; in the list of remotes. You can connect to it normally and find your working directory on Niagara. If you started an SSH agent as described in the previous step, you should be able to open the terminal and type &amp;lt;code&amp;gt;ssh $HOSTNAME&amp;lt;/code&amp;gt; and get a normal bash shell on one of the Niagara login nodes. You may also SSH into Mist if that is needed.&lt;/div&gt;</summary>
		<author><name>Ymeiron</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=VS_Code&amp;diff=5442</id>
		<title>VS Code</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=VS_Code&amp;diff=5442"/>
		<updated>2024-02-05T02:31:34Z</updated>

		<summary type="html">&lt;p&gt;Ymeiron: Created page&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;As of version 1.86, VS Code does not support CentOS 7 on the server side (see [https://code.visualstudio.com/docs/remote/linux here]). The implication is that remote development is no longer compatible with Niagara. The easiest solution that is currently working is to continue using version 1.85, if that is possible. Otherwise, one could try to implement the workaround described below.&lt;br /&gt;
&lt;br /&gt;
== Introduction ==&lt;br /&gt;
In this workaround you will make VS Code connect into a containerized environment that has newer versions of the operating system libraries. While this environment runs on Niagara, it is mostly isolated in terms of software. Thus, running software (modules, virtual environments, code that you have compiled, job submission) within this environment may not work or cause unexpected issues. It should be enough though for VS Code to install the server components and provide at least basic functionality. It is recommended to use the built-in terminal in VS Code to SSH out of this container to a normal Niagara login node, where normal functionality will be available in the shell, as described below.&lt;br /&gt;
&lt;br /&gt;
This procedure is a &amp;quot;hack&amp;quot; and may or may not provide full functionality, or work at all. Unfortunately we cannot guarantee anything, and provide this recipe &amp;quot;as is&amp;quot;. Nevertheless, please contact [mailto:support@scinet.utoronto.ca SciNet support] to report any issues, as we may be able to make some improvements.&lt;br /&gt;
&lt;br /&gt;
== Adding special host configuration ==&lt;br /&gt;
In this step, you add a new host to your SSH configurations '''on the client side''' (i.e. your laptop or workstation). That is typically done by editing the file &amp;lt;code&amp;gt;~/.ssh/config&amp;lt;/code&amp;gt; for OpenSSH clients. Add the following text&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Host niagara-rocky9&lt;br /&gt;
    HostName niagara.scinet.utoronto.ca&lt;br /&gt;
    User ccdbusername&lt;br /&gt;
    SetEnv XMODIFIERS=niagara-rocky9&lt;br /&gt;
    ForwardAgent yes&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
and replace &amp;lt;code&amp;gt;ccdbusername&amp;lt;/code&amp;gt; with your own user name. The important parts here are setting the &amp;lt;code&amp;gt;XMODIFIERS&amp;lt;/code&amp;gt; environment variable on the remote side (i.e. on Niagara) and enabling SSH agent forwarding, which provides the authentication needed to &amp;quot;escape&amp;quot; the container after VS Code has started.&lt;br /&gt;
&lt;br /&gt;
== Modifying &amp;lt;code&amp;gt;.bashrc&amp;lt;/code&amp;gt; ==&lt;br /&gt;
In this step, we instruct the shell interpreter on Niagara to switch to the containerized environment when needed. Edit the file &amp;lt;code&amp;gt;~/.bashrc&amp;lt;/code&amp;gt; '''on Niagara''' and copy the text below into '''the top''' of the file. Be very careful in this step, since an incorrect definition in this file could prevent you from connecting to Niagara altogether (if that happens, only a system administrator would be able to revert the change).&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
if [[ -z &amp;quot;$NIAGARA_ROCKY9_ENABLED&amp;quot; ]] &amp;amp;&amp;amp; [[ &amp;quot;$XMODIFIERS&amp;quot; == &amp;quot;niagara-rocky9&amp;quot; ]]; then&lt;br /&gt;
    export NIAGARA_ROCKY9_ENABLED=1&lt;br /&gt;
    exec /scinet/niagara/containers/rocky9&lt;br /&gt;
fi&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Starting SSH agent ==&lt;br /&gt;
This step is not mandatory, but necessary if you want to escape the containerized environment in the built-in terminal in VS Code. Prior to starting VS Code, start an SSH agent on your laptop or workstation and make sure your appropriate (CCDB) keys are added. See these links for help on the topic: [[SSH#.28Optional.29_Using_ssh-agent_to_Remember_Your_Key|SciNet Wiki]], [https://docs.alliancecan.ca/wiki/Using_SSH_keys_in_Linux#Using_ssh-agent Alliance Wiki], [https://code.visualstudio.com/docs/remote/troubleshooting#_making-local-ssh-agent-available-on-the-remote VS Code troubleshooting page].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Using VS Code now ==&lt;br /&gt;
At this point you can start VS Code and find &amp;lt;code&amp;gt;niagara-rocky9&amp;lt;/code&amp;gt; in the list of remotes. You can connect to it normally and find your working directory on Niagara. If you started an SSH agent as described in the previous step, you should be able to open the terminal and type &amp;lt;code&amp;gt;ssh $HOSTNAME&amp;lt;/code&amp;gt; and get a normal bash shell on one of the Niagara login nodes. You may also SSH into Mist if that is needed.&lt;/div&gt;</summary>
		<author><name>Ymeiron</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Mist&amp;diff=3734</id>
		<title>Mist</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Mist&amp;diff=3734"/>
		<updated>2022-04-21T14:25:22Z</updated>

		<summary type="html">&lt;p&gt;Ymeiron: Added Conda cleaning instructions for Ray&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox Computer&lt;br /&gt;
|image=[[Image:Mist.jpg|center|300px|thumb]]&lt;br /&gt;
|name=Mist&lt;br /&gt;
|installed=Dec 2019&lt;br /&gt;
|operatingsystem= Red Hat Enterprise Linux 8.2&lt;br /&gt;
|loginnode= mist.scinet.utoronto.ca&lt;br /&gt;
|nnodes=  54 IBM AC922&lt;br /&gt;
|rampernode= 256 GB  &lt;br /&gt;
|gpuspernode=4 V100-SXM2-32GB&lt;br /&gt;
|interconnect=Mellanox EDR&lt;br /&gt;
|vendorcompilers= NVCC, IBM XL&lt;br /&gt;
|queuetype=Slurm&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
=Specifications=&lt;br /&gt;
Mist is a SciNet-[[#SOSCIP Users |SOSCIP]] joint GPU cluster consisting of 54 IBM AC922 servers. Each node of the cluster has 32 IBM Power9 cores, 256 GB of RAM and 4 NVIDIA V100-SXM2-32GB GPUs connected by NVLink. The cluster has an InfiniBand EDR interconnect providing GPUDirect RDMA capability.&lt;br /&gt;
&lt;br /&gt;
'''&amp;lt;span style=&amp;quot;background:#fc8383&amp;quot;&amp;gt;Important note:&amp;lt;/span&amp;gt;''' the majority of computer systems as of 2021 (laptops, desktops, and HPC) use the 64-bit x86 instruction set architecture (ISA) in their microprocessors produced by Intel and AMD. This ISA is incompatible with Mist, whose hardware uses the 64-bit PPC ISA (set to little-endian mode). In practice, this means that x86-compiled binaries (executables and libraries) cannot be installed on Mist. For this reason, the Niagara and Compute Canada software stacks (modules) cannot be made available on Mist, and using closed-source software is only possible when the vendor provides a compatible version of their application. '''Python applications''' almost always rely on bindings to libraries originally written in C or C++, some of which are not available on PyPI or various Conda channels as precompiled binaries compatible with Mist. The recommended way to use Python on Mist is to create a [[#Anaconda (Python)|Conda]] environment and install packages from the anaconda (default) channel, where most popular packages have a linux-ppc64le (Mist-compatible) version available. Some popular machine learning packages should be installed from the internal [[#Open-CE|Open-CE]] channel. Where a compatible Conda package cannot be found, installing from PyPI (&amp;lt;code&amp;gt;pip install&amp;lt;/code&amp;gt;) can be attempted. Pip will attempt to compile the package’s source code if no compatible precompiled wheel is available, so a compiler module (such as &amp;lt;code&amp;gt;gcc/.core&amp;lt;/code&amp;gt;) should be loaded in advance. Some packages require tweaking of the source code or build procedure to compile successfully on Mist. Please contact [[#Support|support]] if you need assistance.&lt;br /&gt;
&lt;br /&gt;
= Getting started on Mist =&lt;br /&gt;
As of January 22, 2022, authentication is only allowed via SSH keys. [https://docs.computecanada.ca/wiki/SSH_Keys Please refer to this page] to generate your SSH key pair and make sure you use it securely.&lt;br /&gt;
&lt;br /&gt;
Mist can be accessed directly:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -i /path/to/ssh_private_key -Y MYCCUSERNAME@mist.scinet.utoronto.ca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The Mist login node '''mist-login01''' can also be accessed via the Niagara cluster:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -i /path/to/ssh_private_key -Y MYCCUSERNAME@niagara.scinet.utoronto.ca&lt;br /&gt;
ssh -Y mist-login01&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
== Storage ==&lt;br /&gt;
The filesystem for Mist is shared with the Niagara cluster. See [https://docs.scinet.utoronto.ca/index.php/Niagara_Quickstart#Your_various_directories Niagara Storage] for more details.&lt;br /&gt;
&lt;br /&gt;
= Loading software modules =&lt;br /&gt;
&lt;br /&gt;
You have two options for running code on Mist: use existing software, or compile your own.  This section focuses on the former.&lt;br /&gt;
&lt;br /&gt;
Other than essentials, all installed software is made available [[Using_modules | using module commands]]. These modules set environment variables (PATH, etc.), allowing multiple, conflicting versions of a given package to be available.  A detailed explanation of the module system can be [[Using_modules | found on the modules page]] and a list of [[Modules for Mist]] is also available.&lt;br /&gt;
&lt;br /&gt;
Common module subcommands are:&lt;br /&gt;
&lt;br /&gt;
* &amp;lt;code&amp;gt;module load &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt;: load the default version of a particular software.&lt;br /&gt;
* &amp;lt;code&amp;gt;module load &amp;lt;module-name&amp;gt;/&amp;lt;module-version&amp;gt;&amp;lt;/code&amp;gt;: load a specific version of a particular software.&lt;br /&gt;
* &amp;lt;code&amp;gt;module purge&amp;lt;/code&amp;gt;: unload all currently loaded modules.&lt;br /&gt;
* &amp;lt;code&amp;gt;module spider&amp;lt;/code&amp;gt; (or &amp;lt;code&amp;gt;module spider &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt;): list available software packages.&lt;br /&gt;
* &amp;lt;code&amp;gt;module avail&amp;lt;/code&amp;gt;: list loadable software packages.&lt;br /&gt;
* &amp;lt;code&amp;gt;module list&amp;lt;/code&amp;gt;: list loaded modules.&lt;br /&gt;
&lt;br /&gt;
Along with modifying common environment variables, such as PATH, and LD_LIBRARY_PATH, these modules also create a SCINET_MODULENAME_ROOT environment variable, which can be used to access commonly needed software directories, such as /include and /lib.&lt;br /&gt;
&lt;br /&gt;
There are handy abbreviations for the module commands. &amp;lt;code&amp;gt;ml&amp;lt;/code&amp;gt; is the same as &amp;lt;code&amp;gt;module list&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;ml &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt; is the same as &amp;lt;code&amp;gt;module load &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt;.&lt;br /&gt;
== Tips for loading software ==&lt;br /&gt;
&lt;br /&gt;
* We advise '''''against''''' loading modules in your .bashrc.  This can lead to very confusing behaviour under certain circumstances.  Our guidelines for .bashrc files can be found [[bashrc guidelines|here]].&lt;br /&gt;
* Instead, load modules by hand when needed, or by sourcing a separate script.&lt;br /&gt;
* Load run-specific modules inside your job submission script.&lt;br /&gt;
* Short names give default versions; e.g. &amp;lt;code&amp;gt;cuda&amp;lt;/code&amp;gt; → &amp;lt;code&amp;gt;cuda/11.0.3&amp;lt;/code&amp;gt;. It is usually better to be explicit about the versions, for future reproducibility.&lt;br /&gt;
* Modules often require other modules to be loaded first.  Solve these dependencies by using [[Using_modules#Module_spider | &amp;lt;code&amp;gt;module spider&amp;lt;/code&amp;gt;]].&lt;br /&gt;
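&lt;br /&gt;
The advice above to keep module loading out of &amp;lt;code&amp;gt;.bashrc&amp;lt;/code&amp;gt; and in a separate script could look roughly like the sketch below. The file name &amp;lt;code&amp;gt;mist_modules.sh&amp;lt;/code&amp;gt;, the function name and the variable name are hypothetical, and the module versions are only examples taken from this page; check with &amp;lt;code&amp;gt;module spider&amp;lt;/code&amp;gt; that the combination you need is actually installed.&lt;br /&gt;
&lt;br /&gt;
```shell
# mist_modules.sh: hypothetical file name; source it by hand or from a job script.
# The versions are examples taken from this page; adjust them to your needs.
MIST_MODULES="MistEnv/2021a cuda/11.0.3 gcc/9.3.0"

load_mist_modules() {
    # Start from a clean environment, then load explicit versions in order.
    module purge
    for m in $MIST_MODULES; do
        module load "$m"
    done
    # Show what ended up loaded.
    module list
}
```
&lt;br /&gt;
After &amp;lt;code&amp;gt;source mist_modules.sh&amp;lt;/code&amp;gt;, calling &amp;lt;code&amp;gt;load_mist_modules&amp;lt;/code&amp;gt; in an interactive shell or inside a job script gives the same explicit set of modules each time.&lt;br /&gt;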
&lt;br /&gt;
= Available compilers and interpreters =&lt;br /&gt;
* The &amp;lt;tt&amp;gt;cuda&amp;lt;/tt&amp;gt; module has to be loaded first for GPU software.&lt;br /&gt;
* For most compiled software, one should use the GNU compilers (&amp;lt;tt&amp;gt;gcc&amp;lt;/tt&amp;gt; for C, &amp;lt;tt&amp;gt;g++&amp;lt;/tt&amp;gt; for C++, and &amp;lt;tt&amp;gt;gfortran&amp;lt;/tt&amp;gt; for Fortran). Loading the &amp;lt;tt&amp;gt;gcc&amp;lt;/tt&amp;gt; module makes these available.&lt;br /&gt;
* The IBM XL compiler suite (&amp;lt;tt&amp;gt;xlc_r, xlc++_r, xlf_r&amp;lt;/tt&amp;gt;) is also available, if you load one of the &amp;lt;tt&amp;gt;xl&amp;lt;/tt&amp;gt; modules.&lt;br /&gt;
* To compile MPI code, you must additionally load an &amp;lt;tt&amp;gt;openmpi&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;spectrum-mpi&amp;lt;/tt&amp;gt; module.&lt;br /&gt;
&lt;br /&gt;
=== CUDA ===&lt;br /&gt;
&lt;br /&gt;
The currently installed CUDA Toolkits are '''11.0.3''' and '''10.2.2 (10.2.89)''':&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load cuda/11.0.3&lt;br /&gt;
module load cuda/10.2.2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*A compiler (GCC, XL or NVHPC/PGI) module must be loaded in order to use CUDA to build any code.&lt;br /&gt;
The current NVIDIA driver version is 450.119.04.&lt;br /&gt;
&lt;br /&gt;
===GNU Compilers ===&lt;br /&gt;
&lt;br /&gt;
Available GCC modules are:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
gcc/9.3.0 (must load CUDA 11)&lt;br /&gt;
gcc/8.5.0 (must load CUDA 10 or 11)&lt;br /&gt;
gcc/10.3.0 (w/o CUDA)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== IBM XL Compilers ===&lt;br /&gt;
&lt;br /&gt;
To load the native IBM xlc/xlc++ and xlf (Fortran) compilers, run&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load xl/16.1.1.10&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
IBM XL Compilers are enabled for use with NVIDIA GPUs, including support for OpenMP GPU offloading and integration with NVIDIA's nvcc command to compile host-side code for the POWER9 CPU. Information about the IBM XL Compilers can be found at the following links:[https://www.ibm.com/support/knowledgecenter/SSXVZZ_16.1.1/com.ibm.compilers.linux.doc/welcome.html IBM XL C/C++], &lt;br /&gt;
[https://www.ibm.com/support/knowledgecenter/SSAT4T_16.1.1/com.ibm.compilers.linux.doc/welcome.html IBM XL Fortran]&lt;br /&gt;
&lt;br /&gt;
=== OpenMPI ===&lt;br /&gt;
The &amp;lt;tt&amp;gt;openmpi/&amp;lt;version&amp;gt;&amp;lt;/tt&amp;gt; module is available with different compilers, including GCC and XL. The &amp;lt;tt&amp;gt;spectrum-mpi/&amp;lt;version&amp;gt;&amp;lt;/tt&amp;gt; module provides IBM Spectrum MPI.&lt;br /&gt;
&lt;br /&gt;
=== NVHPC/PGI ===&lt;br /&gt;
The PGI compiler is provided in NVHPC (NVIDIA HPC SDK):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load nvhpc/21.3&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Software =&lt;br /&gt;
== Amber20 ==&lt;br /&gt;
&lt;br /&gt;
Users who hold an Amber20 license can build Amber20 from its source code and run it on Mist. '''SOSCIP/SciNet doesn't provide an Amber license or source code.'''&lt;br /&gt;
&lt;br /&gt;
=== Building Amber20 ===&lt;br /&gt;
Modules that are needed for building Amber20:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load MistEnv/2021a cuda/10.2.2 gcc/8.5.0 anaconda3/2021.05 cmake/3.19.8&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
CMake configuration:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
cmake .. -DCMAKE_INSTALL_PREFIX=$HOME/where-amber-install -DCOMPILER=GNU -DMPI=FALSE -DCUDA=TRUE -DINSTALL_TESTS=TRUE -DDOWNLOAD_MINICONDA=FALSE -DOPENMP=TRUE -DNCCL=FALSE -DAPPLY_UPDATES=TRUE&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Running Amber20 ===&lt;br /&gt;
'''On NVIDIA Pascal P100 and later GPUs such as the V100, Amber20 does not scale beyond a single GPU'''. It is highly recommended to run Amber20 as a single-GPU job.&lt;br /&gt;
A job example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
#SBATCH --time=1:00:0&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP-project-ID&amp;gt;&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/10.2.2 gcc/8.5.0 anaconda3/2021.05&lt;br /&gt;
export PATH=$HOME/where-amber-install/bin:$PATH&lt;br /&gt;
export LD_LIBRARY_PATH=$HOME/where-amber-install/lib:$LD_LIBRARY_PATH&lt;br /&gt;
pmemd.cuda .... &amp;lt;parameters&amp;gt; ...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Anaconda (Python) ==&lt;br /&gt;
Anaconda is a popular distribution of the Python programming language. It contains several common Python libraries such as SciPy and NumPy as pre-built packages, which eases installation. Anaconda is provided as the '''anaconda3''' module.&lt;br /&gt;
&lt;br /&gt;
To install Anaconda locally, users need to load the module and create a conda environment:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n myPythonEnv python=3.8&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*Note: By default, conda environments are located in '''$HOME/.conda/envs'''. The cache (downloaded tarballs and packages) is under '''$HOME/.conda/pkgs'''. Users may run into disk quota problems if too many environments are created. To clean the conda cache, '''please run: &amp;quot;conda clean -y --all&amp;quot; and &amp;quot;rm -rf $HOME/.conda/pkgs/*&amp;quot; after installing packages'''.&lt;br /&gt;
&lt;br /&gt;
To activate the conda environment (it should be activated before running Python):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
source activate myPythonEnv&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Note that you SHOULD NOT use '''conda activate myPythonEnv''' to activate the environment.  This leads to all sorts of problems.  Once the environment is activated, you can update or install packages via '''conda''' or '''pip''':&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda install &amp;lt;package_name&amp;gt;   # preferred way to install packages&lt;br /&gt;
pip install &amp;lt;package_name&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*Once the installation finishes, please clean the cache:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf $HOME/.conda/pkgs/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To deactivate:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
source deactivate&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To remove a conda environment:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda remove --name myPythonEnv --all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To verify that the environment was removed, run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda info --envs&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
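To see how much of your quota conda is using, you can check the size of the two directories mentioned in the note above. This is a minimal sketch that assumes the default locations have not been changed:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# total disk usage of all conda environments and of the package cache&lt;br /&gt;
du -sh $HOME/.conda/envs $HOME/.conda/pkgs&lt;br /&gt;
# disk usage per environment, largest last&lt;br /&gt;
du -sh $HOME/.conda/envs/* | sort -h&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;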
&lt;br /&gt;
=== Submitting Python Job ===&lt;br /&gt;
A single-gpu job example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
#SBATCH --time=1:00:0&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt; #For SOSCIP projects only&lt;br /&gt;
&lt;br /&gt;
module load anaconda3&lt;br /&gt;
source activate myPythonEnv&lt;br /&gt;
python code.py ...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== CuPy ==&lt;br /&gt;
[https://cupy.chainer.org CuPy] is an open-source matrix library accelerated with NVIDIA CUDA. It also uses CUDA-related libraries, including cuBLAS, cuDNN, cuRAND, cuSOLVER, cuSPARSE, cuFFT and NCCL, to make full use of the GPU architecture. CuPy is an implementation of a NumPy-compatible multi-dimensional array on CUDA. It consists of the core multi-dimensional array class, cupy.ndarray, and many functions that operate on it, and it supports a subset of the numpy.ndarray interface.&lt;br /&gt;
&lt;br /&gt;
CuPy can be installed into any conda environment. The Python packages numpy, six and fastrlock are required; cuDNN and NCCL are optional.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load MistEnv/2021a cuda/11.2.2 gcc/8.5.0 cudnn nccl anaconda3/2021.05&lt;br /&gt;
conda create -n cupy-env python=3.8 numpy six fastrlock&lt;br /&gt;
source activate cupy-env&lt;br /&gt;
CFLAGS=&amp;quot;-I$MODULE_CUDNN_PREFIX/include -I$MODULE_NCCL_PREFIX/include -I$MODULE_CUDA_PREFIX/include&amp;quot; LDFLAGS=&amp;quot;-L$MODULE_CUDNN_PREFIX/lib64 -L$MODULE_NCCL_PREFIX/lib&amp;quot; CUDA_PATH=$MODULE_CUDA_PREFIX pip install cupy&lt;br /&gt;
#building/installing CuPy will take a few minutes&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
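A quick way to check the installation is to run a small array operation on the GPU. This is only a sketch: it assumes the cupy-env environment from above is active and that a GPU is visible in the current session.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# create an array on the GPU, sum it there, and print the result&lt;br /&gt;
python -c &amp;quot;import cupy as cp; x = cp.arange(10); print(x.sum())&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;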
&lt;br /&gt;
== Gromacs ==&lt;br /&gt;
[http://www.gromacs.org/ GROMACS] is a versatile package to perform molecular dynamics, i.e. simulate the Newtonian equations of motion for systems with hundreds to millions of particles. It is primarily designed for biochemical molecules like proteins, lipids and nucleic acids that have a lot of complicated bonded interactions, but since GROMACS is extremely fast at calculating the nonbonded interactions (that usually dominate simulations) many groups are also using it for research on non-biological systems, e.g. polymers.&lt;br /&gt;
*'''GROMACS 2019'''&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load MistEnv/2021a cuda/10.2.2 gcc/8.5.0 gromacs/2019.6&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*'''GROMACS 2020 and later''' The thread-MPI version supports full GPU enablement of all key computational sections. The GPU is used throughout the timestep and repeated CPU-GPU transfers are eliminated. Users are advised to verify their results carefully.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load MistEnv/2021a cuda/10.2.2 gcc/8.5.0 gromacs/2020.4&lt;br /&gt;
module load MistEnv/2021a cuda/10.2.2 gcc/8.5.0 gromacs/2020.6&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 gromacs/2021.2&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 openmpi/4.1.1+ucx-1.10.0 gromacs/2021.2   # for testing purposes only&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 gromacs/2021.4&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 gromacs/2021.5&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 gromacs/2022&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Small/Medium Simulation ===&lt;br /&gt;
Due to the lack of PME domain decomposition support on GPUs, Gromacs uses the CPU to calculate PME when using multiple GPUs. '''It is always recommended to use a single GPU for small and medium-sized simulations with Gromacs.''' By using only 1 tMPI thread (with multiple OpenMP threads) on a single GPU, both non-bonded PP and PME are automatically offloaded to the GPU when possible.&lt;br /&gt;
* Gromacs 2019 example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/10.2.2 gcc/8.5.0 gromacs/2019.6&lt;br /&gt;
export OMP_NUM_THREADS=8&lt;br /&gt;
export OMP_PLACES=cores&lt;br /&gt;
gmx mdrun -pin off -ntmpi 1 -ntomp 8  ... &amp;lt;other parameters&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Gromacs 2020 or later example: &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 gromacs/2021.5&lt;br /&gt;
export OMP_NUM_THREADS=8&lt;br /&gt;
export OMP_PLACES=cores&lt;br /&gt;
export GMX_FORCE_UPDATE_DEFAULT_GPU=true&lt;br /&gt;
gmx mdrun -pin off -ntmpi 1 -ntomp 8  ... &amp;lt;other parameters&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Large Simulation ===&lt;br /&gt;
If the memory available to a single-GPU job (~58GB) is not sufficient for the simulation, multiple GPUs can be used. It is suggested to start testing with one full node (4 GPUs) and to force PME onto the GPU. Multiple PME ranks are not supported with PME on GPU, so if a GPU is used for the PME calculation, -npme (the number of PME ranks) must be set to 1. If PME has less work than PP, it is suggested to run multiple ranks per GPU, so that the GPU running the PME rank can also do some work for the PP rank(s).&lt;br /&gt;
'''If your simulation can fit in a single-GPU job, please use a single GPU to get much higher efficiency. Do not waste 3 additional GPUs for only a small performance improvement.'''&lt;br /&gt;
*An example using 4 GPUs, 7 PP ranks/tmpi threads + 1 PME rank/tmpi thread: ('''-pin on -pme gpu -npme 1''' must be added to mdrun command in order to force GPU to do PME)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=1&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3  gcc/9.4.0 gromacs/2021.5&lt;br /&gt;
&lt;br /&gt;
export OMP_NUM_THREADS=4&lt;br /&gt;
gmx mdrun -ntmpi 8 -pin on -pme gpu -npme 1 ... &amp;lt;add your parameters&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*It is suggested to also test using '''-ntmpi 4''' and '''export OMP_NUM_THREADS=8''' if you receive a NOTE in Gromacs output saying &amp;quot;% performance was lost because the PME ranks had more work to do than the PP ranks&amp;quot;. In this case, NVIDIA MPS is not needed since there is only one MPI rank per GPU.&lt;br /&gt;
*'''Please note that the solving of PME on GPU is still only the initial version supporting this behaviour, and comes with a set of limitations outlined further below.'''&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
* Only a PME order of 4 is supported on GPUs.&lt;br /&gt;
* PME will run on a GPU only when exactly one rank has a PME task, i.e. decompositions with multiple ranks doing PME are not supported.&lt;br /&gt;
* Only single precision is supported.&lt;br /&gt;
* Free energy calculations where charges are perturbed are not supported, because only single PME grids can be calculated.&lt;br /&gt;
* Only dynamical integrators are supported (i.e. leap-frog, Velocity Verlet, stochastic dynamics).&lt;br /&gt;
* LJ PME is not supported on GPUs.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*An example using 4 GPUs, '''PME on CPU''': ('''-pin on''' must be added to mdrun command for proper CPU thread bindings)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=1&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3  gcc/9.4.0 gromacs/2021.5&lt;br /&gt;
&lt;br /&gt;
export OMP_NUM_THREADS=4&lt;br /&gt;
gmx mdrun -ntmpi 8 -pin on  ... &amp;lt;add your parameters&amp;gt;&lt;br /&gt;
&lt;br /&gt;
# &amp;quot;-ntmpi 16, OMP_NUM_THREADS=2&amp;quot; and &amp;quot;-ntmpi 4, OMP_NUM_THREADS=8&amp;quot; should also be tested.  &lt;br /&gt;
# num_thread_MPI_ranks(-ntmpi) * num_OpenMP_threads = 32&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*'''If your simulation can fit in a single-GPU job, please use a single GPU to get much higher efficiency. Do not waste 3 additional GPUs for only a small performance improvement.'''&lt;br /&gt;
*'''NOTE: The above examples will NOT work with multiple nodes. If your simulation is too large for a single GPU node, please contact SciNet/SOSCIP support.'''&lt;br /&gt;
&lt;br /&gt;
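The rank/thread combinations suggested in the PME-on-CPU example above all satisfy ranks * OpenMP threads = 32 on a full node. A minimal sketch of deriving one from the other is shown here; NTMPI is a hypothetical helper variable used only in this script, not something Gromacs reads:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
NTMPI=8                                  # number of thread-MPI ranks to test: 4, 8 or 16&lt;br /&gt;
export OMP_NUM_THREADS=$((32 / NTMPI))   # e.g. 8 ranks gives 4 OpenMP threads per rank&lt;br /&gt;
gmx mdrun -ntmpi $NTMPI -pin on ... &amp;lt;add your parameters&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;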
== NAMD ==&lt;br /&gt;
[http://www.ks.uiuc.edu/Research/namd/ NAMD] is a parallel, object-oriented molecular dynamics code designed for high-performance simulation of large biomolecular systems.&lt;br /&gt;
=== 2.14 ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 spectrum-mpi/10.4.0 namd/2.14&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
==== Running with single GPU ====&lt;br /&gt;
If you have many jobs to run, it is always suggested to use a single GPU per job. This makes jobs easier to schedule and gives better overall performance.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 spectrum-mpi/10.4.0 namd/2.14&lt;br /&gt;
scontrol show hostnames &amp;gt; nodelist-$SLURM_JOB_ID&lt;br /&gt;
&lt;br /&gt;
`which charmrun` -npernode 1 -bind-to none -hostfile nodelist-$SLURM_JOB_ID `which namd2` +idlepoll +ppn 8 +p 8 stmv.namd&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Running with one process per node (4 GPUs)====&lt;br /&gt;
An example of the job script (using 1 node, '''one process per node''',  32 CPU threads per process + 4 GPUs per process):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=1&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 spectrum-mpi/10.4.0 namd/2.14&lt;br /&gt;
scontrol show hostnames &amp;gt; nodelist-$SLURM_JOB_ID&lt;br /&gt;
&lt;br /&gt;
`which charmrun` -npernode 1 -hostfile nodelist-$SLURM_JOB_ID `which namd2` +setcpuaffinity +pemap 0-127:4 +idlepoll +ppn 32 +p $((32*SLURM_NTASKS)) stmv.namd&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
==== Running with one process per GPU (4 GPUs)====&lt;br /&gt;
NAMD may scale better if using '''one process per GPU'''. Please do your own benchmark.&lt;br /&gt;
An example of the job script (using 1 node, '''one process per GPU''',  8 CPU threads per process):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=4&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 spectrum-mpi/10.4.0 namd/2.14&lt;br /&gt;
scontrol show hostnames &amp;gt; nodelist-$SLURM_JOB_ID&lt;br /&gt;
&lt;br /&gt;
`which charmrun` -npernode 4 -hostfile nodelist-$SLURM_JOB_ID `which namd2` +setcpuaffinity +pemap 0-127:4 +idlepoll +ppn 8 +p $((8*SLURM_NTASKS)) stmv.namd&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Open-CE ==&lt;br /&gt;
[https://github.com/open-ce/open-ce Open-CE] is an '''IBM''' repository that collects feedstocks, environment data, and scripts for building TensorFlow, PyTorch, and other machine learning packages and their dependencies. Open-CE is distributed as a '''conda channel''' on the Mist cluster.&lt;br /&gt;
'''Available packages and versions are listed here: [https://github.com/open-ce/open-ce/releases/tag/open-ce-v1.5.2 Open-CE Releases]'''. Currently only Python 3.8 and CUDA 11.2 are supported. If you need a different Python or CUDA version, please contact SOSCIP or SciNet support.&lt;br /&gt;
&lt;br /&gt;
*Packages can be installed by specifying the Open-CE conda channel:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda install -c /scinet/mist/ibm/open-ce python=3.8 cudatoolkit=11.2 PACKAGE&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*Once the installation finishes, please clean the cache:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf $HOME/.conda/pkgs/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|+Available Packages:&lt;br /&gt;
|-&lt;br /&gt;
|TensorFlow&lt;br /&gt;
|TensorFlow Estimators&lt;br /&gt;
|TensorFlow Probability&lt;br /&gt;
|TensorBoard&lt;br /&gt;
|TensorBoard Data Server&lt;br /&gt;
|TensorFlow Text&lt;br /&gt;
|TensorFlow Model Optimizations&lt;br /&gt;
|TensorFlow Addons&lt;br /&gt;
|TensorFlow Datasets&lt;br /&gt;
|TensorFlow Hub&lt;br /&gt;
|-&lt;br /&gt;
|TensorFlow MetaData&lt;br /&gt;
|PyTorch&lt;br /&gt;
|TorchText&lt;br /&gt;
|TorchVision&lt;br /&gt;
|PyTorch Lightning&lt;br /&gt;
|PyTorch Lightning Bolts&lt;br /&gt;
|ONNX&lt;br /&gt;
|Onnx-runtime&lt;br /&gt;
|skl2onnx&lt;br /&gt;
|tf2onnx&lt;br /&gt;
|-&lt;br /&gt;
|onnxmltools&lt;br /&gt;
|onnxconverter-common&lt;br /&gt;
|XGBoost&lt;br /&gt;
|LightGBM&lt;br /&gt;
|Transformers&lt;br /&gt;
|Tokenizers&lt;br /&gt;
|SentencePiece&lt;br /&gt;
|Spacy&lt;br /&gt;
|DALI&lt;br /&gt;
|OpenCV&lt;br /&gt;
|-&lt;br /&gt;
|Horovod&lt;br /&gt;
|PyArrow&lt;br /&gt;
|grpc&lt;br /&gt;
|uwsgi&lt;br /&gt;
|ORC&lt;br /&gt;
|Mamba&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== PyTorch ==&lt;br /&gt;
=== Installing from IBM Open-CE Conda Channel ===&lt;br /&gt;
The easiest way to install PyTorch on Mist is to use IBM's Open-CE Conda channel. Prepare a conda environment and install PyTorch from that channel:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n pytorch_env python=3.8&lt;br /&gt;
source activate pytorch_env&lt;br /&gt;
&lt;br /&gt;
# give the Open-CE channel strict priority, to avoid the CPU-only version of PyTorch from the default Anaconda channel&lt;br /&gt;
conda config --prepend channels /scinet/mist/ibm/open-ce&lt;br /&gt;
conda config --set channel_priority strict&lt;br /&gt;
&lt;br /&gt;
conda install -c /scinet/mist/ibm/open-ce pytorch=1.10.2 cudatoolkit=11.2&lt;br /&gt;
# or, from the open-ce-1.2 channel (cudatoolkit=10.2 can be used instead of 11.0):&lt;br /&gt;
conda install -c /scinet/mist/ibm/open-ce-1.2 pytorch=1.7.1 cudatoolkit=11.0&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Once the installation finishes, please clean the cache:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf $HOME/.conda/pkgs/*&lt;br /&gt;
#remove .condarc to reset conda channel priority&lt;br /&gt;
rm $HOME/.condarc&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To get deterministic results, add the command below to your job script before the python command (see [https://github.com/pytorch/pytorch/issues/39849] for details):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
export CUBLAS_WORKSPACE_CONFIG=:4096:2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
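To confirm that the GPU build was installed rather than a CPU-only one, you can query PyTorch directly. This is a sketch that assumes pytorch_env is active and a GPU is visible in the current session:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# prints the PyTorch version, followed by True if CUDA is usable&lt;br /&gt;
python -c &amp;quot;import torch; print(torch.__version__, torch.cuda.is_available())&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;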
&lt;br /&gt;
== RAPIDS ==&lt;br /&gt;
[https://rapids.ai RAPIDS] is a suite of open source software libraries that gives you the freedom to execute end-to-end data science and analytics pipelines entirely on GPUs. The RAPIDS data science framework includes a collection of libraries: '''cuDF (GPU DataFrames)''', '''cuML (GPU Machine Learning Algorithms)''', '''cuStrings (GPU String Manipulation)''', etc.&lt;br /&gt;
&lt;br /&gt;
=== Installing from IBM Conda Channel ===&lt;br /&gt;
The easiest way to install RAPIDS on Mist is to use IBM's Conda channel. Prepare a conda environment with Python 3.6 or 3.7 and install powerai-rapids from that channel:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n rapids_env python=3.7&lt;br /&gt;
source activate rapids_env&lt;br /&gt;
conda install -c https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda-early-access/ powerai-rapids&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Once the installation finishes, please clean the cache:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf $HOME/.conda/pkgs/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== TensorFlow and Keras ==&lt;br /&gt;
=== Installing from IBM Conda Channel ===&lt;br /&gt;
The easiest way to install TensorFlow and Keras on Mist is to use IBM's Open-CE Conda channel. Prepare a conda environment and install TensorFlow from that channel:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n tf_env python=3.8&lt;br /&gt;
source activate tf_env&lt;br /&gt;
conda install -c /scinet/mist/ibm/open-ce tensorflow==2.7.1 cudatoolkit=11.2&lt;br /&gt;
# or, from the open-ce-1.2 channel (cudatoolkit=10.2 can be used instead of 11.0):&lt;br /&gt;
conda install -c /scinet/mist/ibm/open-ce-1.2 tensorflow==2.4.3 cudatoolkit=11.0&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Once the installation finishes, please clean the cache:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf $HOME/.conda/pkgs/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
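As a quick check that TensorFlow can see the GPU, you can list the physical GPU devices. This is a sketch that assumes tf_env is active and a GPU is visible in the current session; an empty list would mean that no GPU was detected:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# prints the list of GPUs that TensorFlow has detected&lt;br /&gt;
python -c &amp;quot;import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;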
&lt;br /&gt;
== Ray ==&lt;br /&gt;
Ray is an API for building distributed applications. A local wheel is available for Python 3.8, but some tinkering is required to successfully install it. Please activate your Conda environment and follow this installation recipe:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda install tabulate tensorboardX pandas dataclasses aiohttp aioredis click colorama colorful filelock gpustat grpcio jsonschema numpy protobuf py-spy pyyaml requests redis opencensus prometheus_client beautifulsoup4 soupsieve cython wheel&lt;br /&gt;
pip install msgpack google aiohttp_cors&lt;br /&gt;
# Manually add py-spy version info, since Conda forgot to do that&lt;br /&gt;
PYSPYVERSION=$(py-spy --version | cut -d ' ' -f 2)&lt;br /&gt;
mkdir $CONDA_PREFIX/lib/python3.8/site-packages/py_spy-$PYSPYVERSION.dist-info&lt;br /&gt;
echo -e &amp;quot;Metadata-Version: 2.1\nName: py-spy\nVersion: $PYSPYVERSION&amp;quot; &amp;gt; $CONDA_PREFIX/lib/python3.8/site-packages/py_spy-$PYSPYVERSION.dist-info/METADATA&lt;br /&gt;
pip install /scinet/mist/wheelhouse/experimental/2021a/ray-1.0.1.post1-cp38-cp38-linux_ppc64le.whl&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf ~/.conda/pkgs/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Testing and debugging =&lt;br /&gt;
You really should test your code before you submit it to the cluster to know if your code is correct and what kind of resources you need.&lt;br /&gt;
* Small test jobs can be run on the login node.  Rule of thumb: tests should run no more than a couple of minutes, taking at most about 1-2GB of memory, and use no more than one gpu and a few cores.&lt;br /&gt;
&amp;lt;!-- * You can run the [[Parallel Debugging with DDT|DDT]] debugger on the login nodes after &amp;lt;code&amp;gt;module load ddt&amp;lt;/code&amp;gt;. --&amp;gt;&lt;br /&gt;
* For short tests that do not fit on a login node, or for which you need a dedicated node, request an interactive debug job with the debugjob command:&lt;br /&gt;
 mist-login01:~$ debugjob --clean -g G&lt;br /&gt;
where G is the number of GPUs. G=1 gives an interactive session for 2 hours, G=4 gets you a single node with 4 GPUs for 30 minutes, and G=8 (the maximum) gets you 2 nodes, each with 4 GPUs, for 15 minutes. The &amp;lt;tt&amp;gt;--clean&amp;lt;/tt&amp;gt; argument is optional but recommended, as it will start the session without any modules loaded, thus mimicking more closely what happens when you submit a job script. You need to load modules and activate the conda environment after a debug job starts. If the --clean flag was omitted, it is recommended to run 'conda clean' before 'source activate ENV' in the debug job.&lt;br /&gt;
&lt;br /&gt;
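For example, a single-GPU debug session might look like the following sketch (myPythonEnv and code.py are the placeholder names used earlier on this page):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
mist-login01:~$ debugjob --clean -g 1&lt;br /&gt;
# once the interactive session has started on the compute node:&lt;br /&gt;
module load anaconda3&lt;br /&gt;
source activate myPythonEnv&lt;br /&gt;
python code.py ...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;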
= Submitting jobs =&lt;br /&gt;
Once you have compiled and tested your code or workflow on the Mist login nodes, and confirmed that it behaves correctly, you are ready to submit jobs to the cluster.  Your jobs will run on some of Mist's 53 compute nodes.  When and where your job runs is determined by the scheduler.&lt;br /&gt;
&lt;br /&gt;
Mist uses SLURM as its job scheduler. It is configured to allow only '''Single-GPU jobs''' and '''Full-node jobs (4 GPUs per node)'''.&lt;br /&gt;
&lt;br /&gt;
You submit jobs from a login node by passing a script to the sbatch command:&lt;br /&gt;
&lt;br /&gt;
 mist-login01:scratch$ sbatch jobscript.sh&lt;br /&gt;
&lt;br /&gt;
This puts the job in the queue. It will run on the compute nodes in due course. In most cases, you should not submit from your $HOME directory, but rather, from your $SCRATCH directory, so that the output of your compute job can be written out (as mentioned above, $HOME is read-only on the compute nodes).&lt;br /&gt;
&lt;br /&gt;
Example job scripts can be found below.&lt;br /&gt;
Keep in mind:&lt;br /&gt;
* Scheduling is by single GPU or by full node, so you can request only 1 GPU or 4 GPUs per node.&lt;br /&gt;
* Your job's maximum walltime is 24 hours. &lt;br /&gt;
* Jobs must write their output to your scratch or project directory (home is read-only on compute nodes).&lt;br /&gt;
* Compute nodes have no internet access.&lt;br /&gt;
* Your job script will not remember the modules you have loaded, so it needs to contain &amp;quot;module load&amp;quot; commands of all the required modules (see examples below). &lt;br /&gt;
== SOSCIP Users ==&lt;br /&gt;
*[https://www.soscip.org SOSCIP] is a consortium to bring together industrial partners and academic researchers and provide them with sophisticated advanced computing technologies and expertise to solve social, technical and business challenges across sectors and drive economic growth.&lt;br /&gt;
&lt;br /&gt;
If you are working on a SOSCIP project, please contact [mailto:soscip-support@scinet.utoronto.ca soscip-support@scinet.utoronto.ca] to have your user account added to the SOSCIP project accounts. SOSCIP users need to submit jobs with an additional SLURM flag to get higher priority:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#SBATCH -A soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt;    #e.g. soscip-3-001&lt;br /&gt;
# or, equivalently:&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Single-GPU job script ==&lt;br /&gt;
A single-GPU job gets a quarter of a node: 1 GPU + 8 CPU cores (32 hardware threads) + ~58GB CPU memory. '''Users should never request CPUs or memory explicitly.''' If you are running an MPI program, you can set --ntasks to the number of MPI ranks. '''Do NOT set --ntasks for non-MPI programs.''' &lt;br /&gt;
*It is suggested to use NVIDIA Multi-Process Service (MPS) when running multiple MPI ranks on one GPU.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
#SBATCH --time=1:00:0&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt; #For SOSCIP projects only&lt;br /&gt;
&lt;br /&gt;
module load anaconda3&lt;br /&gt;
source activate conda_env&lt;br /&gt;
python code.py ...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Full-node job script ==&lt;br /&gt;
'''If you are not sure whether the program can run on multiple GPUs, please follow the single-GPU job instructions above or contact SciNet/SOSCIP support.'''&lt;br /&gt;
&lt;br /&gt;
A multi-GPU job should ask for a minimum of one full node (4 GPUs). You need to specify the &amp;quot;compute_full_node&amp;quot; partition in order to get all resources on a node. &lt;br /&gt;
*An example for a 1-node job:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=4 # this only affects MPI jobs&lt;br /&gt;
#SBATCH --time=1:00:00&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt; #For SOSCIP projects only&lt;br /&gt;
&lt;br /&gt;
module load &amp;lt;modules you need&amp;gt;&lt;br /&gt;
# run your program here&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Limits ==&lt;br /&gt;
&lt;br /&gt;
There are limits to the size and duration of your jobs, the number of jobs you can run and the number of jobs you can have queued.&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!Usage&lt;br /&gt;
!Partition&lt;br /&gt;
!Running jobs&lt;br /&gt;
!Jobs in queue&lt;br /&gt;
!Min. size of jobs&lt;br /&gt;
!Max. size of jobs&lt;br /&gt;
!Min. walltime&lt;br /&gt;
!Max. walltime &lt;br /&gt;
|-&lt;br /&gt;
|Compute jobs ||compute || 100 GPUs || 1000 || 1 GPU (8&amp;amp;nbsp;cores) || default:&amp;amp;nbsp;4&amp;amp;nbsp;nodes&amp;amp;nbsp;(16&amp;amp;nbsp;GPUs) &amp;lt;br&amp;gt; with&amp;amp;nbsp;allocation:&amp;amp;nbsp;4&amp;amp;nbsp;nodes&amp;amp;nbsp;(16&amp;amp;nbsp;GPUs)|| 15 minutes || 24 hours&lt;br /&gt;
|-&lt;br /&gt;
|Testing or troubleshooting || debug || 1 || 1 || 1 GPU (8 cores) || 2 nodes (8 GPUs)|| N/A || 2/n&amp;lt;sub&amp;gt;gpu&amp;lt;/sub&amp;gt; hours&lt;br /&gt;
|-&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Even if you respect these limits, your jobs will still have to wait in the queue. The waiting time depends on many factors such as your group's allocation amount, how much allocation has been used in the recent past, the number of requested nodes and walltime, and how many other jobs are waiting in the queue.&lt;br /&gt;
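The debug walltime limit of 2/n&amp;lt;sub&amp;gt;gpu&amp;lt;/sub&amp;gt; hours in the table above works out as follows (plain shell arithmetic, shown only as an illustration):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# 2 hours = 120 minutes, divided by the number of GPUs requested&lt;br /&gt;
for g in 1 4 8; do&lt;br /&gt;
  echo &amp;quot;$g GPU(s): $((120 / g)) minutes&amp;quot;&lt;br /&gt;
done&lt;br /&gt;
# 1 GPU(s): 120 minutes&lt;br /&gt;
# 4 GPU(s): 30 minutes&lt;br /&gt;
# 8 GPU(s): 15 minutes&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;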
&lt;br /&gt;
= Jupyter Notebooks =&lt;br /&gt;
SciNet’s [[Jupyter Hub]] is a Niagara-type node; it has a different CPU architecture and no GPUs, so Conda environments prepared on Mist will not work properly there. Users who need Jupyter Notebook to develop and test some aspects of their workflow can create their own server on the Mist login node and use an SSH tunnel to connect to it from outside. Users who choose to do so must keep in mind that the login node is a shared resource, and heavy calculations should be done only on compute nodes. Processes (including IPython kernels used by the notebooks) are limited to one hour of total CPU time: idle time does not count toward this hour, and use of multiple cores counts proportionally to the number of cores (i.e. a kernel using all 128 virtual cores on the node will be killed after 28 seconds). Idle notebooks can still burden the node by hogging system and GPU memory, so please be mindful of other users and terminate notebooks when your work is done.&lt;br /&gt;
&lt;br /&gt;
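The 28-second figure follows from dividing the one-hour CPU budget by the number of busy cores. As a rough illustration in shell arithmetic (integer division, so the results are rounded down):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# wall-clock seconds before a process is killed, for a given number of fully busy cores&lt;br /&gt;
for cores in 1 8 32 128; do&lt;br /&gt;
  echo &amp;quot;$cores core(s): $((3600 / cores)) seconds&amp;quot;&lt;br /&gt;
done&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;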
As an example, let us create a new Conda environment and activate it:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n jupyter_env python=3.7&lt;br /&gt;
source activate jupyter_env&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Install the Jupyter Notebook server:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda install notebook&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Running the notebook server ==&lt;br /&gt;
When the Conda environment is active, enter:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
jupyter-notebook&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
By default, the Jupyter Notebook server uses port 8888 (can be overridden with the &amp;lt;code&amp;gt;--port&amp;lt;/code&amp;gt; option). If another user has already started their own server, the default port may be busy, in which case the server will be listening on a different port. Once launched, the server will output some information to the terminal that will include the actual port number used and a 48-character token. For example:&lt;br /&gt;
&amp;lt;pre&amp;gt;http://localhost:8890/?token=54c4090d……&amp;lt;/pre&amp;gt;&lt;br /&gt;
In this example, the server is listening on port 8890.&lt;br /&gt;
&lt;br /&gt;
== Creating a tunnel ==&lt;br /&gt;
In order to access this port remotely (i.e. from your office or home), an [https://en.wikipedia.org/wiki/Tunneling_protocol#Secure_Shell_tunneling SSH tunnel] has to be established. Please refer to your SSH client’s documentation for instructions on how to do that. For the OpenSSH client (standard in most Linux distributions and macOS), a tunnel can be opened in a separate terminal session to the one where the Jupyter Notebook server is running. In the new terminal, issue this command:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -L8888:localhost:8890 &amp;lt;username&amp;gt;@mist.scinet.utoronto.ca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Replace &amp;lt;code&amp;gt;&amp;lt;username&amp;gt;&amp;lt;/code&amp;gt; with your actual username. The tunnel stays open as long as this SSH connection is alive. In this example, we tunnel the Mist login node’s port 8890 (where our server is assumed to be running) to our home computer’s port 8888 (any other free port is fine). The notebook can then be accessed in the browser at &amp;lt;code&amp;gt;&amp;lt;nowiki&amp;gt;http://localhost:8888&amp;lt;/nowiki&amp;gt;&amp;lt;/code&amp;gt; (followed by &amp;lt;code&amp;gt;/?token=54c4090d……&amp;lt;/code&amp;gt;; alternatively, the token can be entered on the webpage).&lt;br /&gt;
&lt;br /&gt;
== Using Jupyter on compute nodes ==&lt;br /&gt;
&lt;br /&gt;
You can use the instructions here to set up a Jupyter Notebook server on a compute node (including a [[#Testing_and_debugging|debugjob]]). '''We strongly discourage''' you from running an interactive notebook on a compute node (other than for a debugjob): scheduled jobs start at arbitrary times and are not meant to be interactive. Jupyter notebooks can be run non-interactively or converted to Python scripts.&lt;br /&gt;
&lt;br /&gt;
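For instance, jupyter nbconvert can do either from the command line. In this sketch, analysis.ipynb is a hypothetical notebook name, and nbconvert is assumed to be available in the active environment (it is normally installed together with the notebook package):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# convert the notebook to a plain Python script (writes analysis.py)&lt;br /&gt;
jupyter nbconvert --to script analysis.ipynb&lt;br /&gt;
# or execute the notebook non-interactively and save the result as a new notebook&lt;br /&gt;
jupyter nbconvert --to notebook --execute --output analysis-out.ipynb analysis.ipynb&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;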
To launch the Jupyter Notebook server, load the &amp;lt;code&amp;gt;anaconda3&amp;lt;/code&amp;gt; module and activate your environment as before (by adding the appropriate lines to the submission script, if you are not using the compute node with an interactive shell). Launching the server has to be done like so:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
HOME=/dev/shm/$USER jupyter-notebook&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
That is because Jupyter will fail unless it can write to the home folder, which is read-only from compute nodes. This modification of the &amp;lt;code&amp;gt;$HOME&amp;lt;/code&amp;gt; environment variable will carry over into the notebooks, which is usually not a problem, but in case the notebook relies on this environment variable (e.g. to read certain files), it can be reset manually in the notebook (&amp;lt;code&amp;gt;import os; os.environ['HOME']=……&amp;lt;/code&amp;gt;).&lt;br /&gt;
&lt;br /&gt;
Because compute nodes are not accessible from the Internet, tunneling has to be done twice, once from the remote location (office or home) to the Mist login node, and then from the login node to the compute node. Assuming the server is running on port 8890 of the mist006 node, open the first tunnel in a new terminal session in the remote computer:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -L8888:localhost:9999 &amp;lt;username&amp;gt;@mist.scinet.utoronto.ca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
where 9999 is any available port on the Mist login node (to test port availability enter &amp;lt;code&amp;gt;ss -Hln src :9999&amp;lt;/code&amp;gt; in the terminal when connected to the Mist login node; an empty output indicates that the port is free). In the same session in the login node that was created with the above command, open the second tunnel to the compute node:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -L9999:localhost:8890 mist006&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Be aware that the second tunnel will automatically disconnect once the job on the compute node times out or is relinquished. The Jupyter Notebook server running on the compute node can now be accessed from the browser as in the previous subsection.&lt;br /&gt;
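&lt;br /&gt;
The port availability check described above can be scripted. The following sketch, meant to be run on the Mist login node, prints the first free port in a range and uses the same &amp;lt;code&amp;gt;ss&amp;lt;/code&amp;gt; test. The range 9990 to 9999 is an arbitrary assumption; any range of unprivileged ports works:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
for port in $(seq 9990 9999); do&lt;br /&gt;
    # Empty output from ss means nothing is listening on this port&lt;br /&gt;
    if [ -z &amp;quot;$(ss -Hln src :$port)&amp;quot; ]; then&lt;br /&gt;
        echo &amp;quot;$port&amp;quot;&lt;br /&gt;
        break&lt;br /&gt;
    fi&lt;br /&gt;
done&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The printed port can then be used in place of 9999 in both tunnel commands.&lt;br /&gt;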
&lt;br /&gt;
&lt;br /&gt;
= Support =&lt;br /&gt;
&lt;br /&gt;
SciNet inquiries:&lt;br /&gt;
* [mailto:support@scinet.utoronto.ca support@scinet.utoronto.ca]&lt;br /&gt;
&lt;br /&gt;
SOSCIP inquiries:&lt;br /&gt;
*[mailto:soscip-support@scinet.utoronto.ca soscip-support@scinet.utoronto.ca]&lt;/div&gt;</summary>
		<author><name>Ymeiron</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Mist&amp;diff=3731</id>
		<title>Mist</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Mist&amp;diff=3731"/>
		<updated>2022-04-21T14:24:01Z</updated>

		<summary type="html">&lt;p&gt;Ymeiron: Added an installation recipe for Ray&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox Computer&lt;br /&gt;
|image=[[Image:Mist.jpg|center|300px|thumb]]&lt;br /&gt;
|name=Mist&lt;br /&gt;
|installed=Dec 2019&lt;br /&gt;
|operatingsystem= Red Hat Enterprise Linux 8.2&lt;br /&gt;
|loginnode= mist.scinet.utoronto.ca&lt;br /&gt;
|nnodes=  54 IBM AC922&lt;br /&gt;
|rampernode= 256 GB  &lt;br /&gt;
|gpuspernode=4 V100-SXM2-32GB&lt;br /&gt;
|interconnect=Mellanox EDR&lt;br /&gt;
|vendorcompilers= NVCC, IBM XL&lt;br /&gt;
|queuetype=Slurm&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
=Specifications=&lt;br /&gt;
Mist is a SciNet-[[#SOSCIP Users |SOSCIP]] joint GPU cluster consisting of 54 IBM AC922 servers. Each node of the cluster has 32 IBM Power9 cores, 256GB RAM and 4 NVIDIA V100-SXM2-32GB GPUs connected to each other by NVLink. The cluster has an InfiniBand EDR interconnect providing GPU-Direct RDMA capability.&lt;br /&gt;
&lt;br /&gt;
'''&amp;lt;span style=&amp;quot;background:#fc8383&amp;quot;&amp;gt;Important note:&amp;lt;/span&amp;gt;''' the majority of computer systems as of 2021 (laptops, desktops, and HPC) use the 64-bit x86 instruction set architecture (ISA) in their microprocessors produced by Intel and AMD. This ISA is incompatible with Mist, whose hardware uses the 64-bit PPC ISA (set to little-endian mode). In practice, this means that x86-compiled binaries (executables and libraries) cannot be used on Mist. For this reason, the Niagara and Compute Canada software stacks (modules) cannot be made available on Mist, and using closed-source software is only possible when the vendor provides a compatible version of their application. '''Python applications''' almost always rely on bindings to libraries originally written in C or C++, and some of these are not available on PyPI or various Conda channels as precompiled binaries compatible with Mist. The recommended way to use Python on Mist is to create a [[#Anaconda (Python)|Conda]] environment and install packages from the anaconda (default) channel, where most popular packages have a linux-ppc64le (Mist-compatible) version available. Some popular machine learning packages should be installed from the internal [[#Open-CE|Open-CE]] channel. Where a compatible Conda package cannot be found, installing from PyPI (&amp;lt;code&amp;gt;pip install&amp;lt;/code&amp;gt;) can be attempted. Pip will attempt to compile the package’s source code if no compatible precompiled wheel is available, so a compiler module (such as &amp;lt;code&amp;gt;gcc/.core&amp;lt;/code&amp;gt;) should be loaded in advance. Some packages require tweaking of the source code or build procedure to compile successfully on Mist. Please contact [[#Support|support]] if you need assistance.&lt;br /&gt;
&lt;br /&gt;
= Getting started on Mist =&lt;br /&gt;
As of January 22, 2022, authentication is only allowed via SSH keys. [https://docs.computecanada.ca/wiki/SSH_Keys Please refer to this page] to generate your SSH key pair and make sure you use it securely.&lt;br /&gt;
&lt;br /&gt;
Mist can be accessed directly:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -i /path/to/ssh_private_key -Y MYCCUSERNAME@mist.scinet.utoronto.ca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The Mist login node '''mist-login01''' can also be accessed via the Niagara cluster:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -i /path/to/ssh_private_key -Y MYCCUSERNAME@niagara.scinet.utoronto.ca&lt;br /&gt;
ssh -Y mist-login01&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
== Storage ==&lt;br /&gt;
The filesystem for Mist is shared with the Niagara cluster. See [https://docs.scinet.utoronto.ca/index.php/Niagara_Quickstart#Your_various_directories Niagara Storage] for more details.&lt;br /&gt;
&lt;br /&gt;
= Loading software modules =&lt;br /&gt;
&lt;br /&gt;
You have two options for running code on Mist: use existing software, or compile your own.  This section focuses on the former.&lt;br /&gt;
&lt;br /&gt;
Other than essentials, all installed software is made available [[Using_modules | using module commands]]. These modules set environment variables (PATH, etc.), allowing multiple, conflicting versions of a given package to be available.  A detailed explanation of the module system can be [[Using_modules | found on the modules page]] and a list of [[Modules for Mist]] is also available.&lt;br /&gt;
&lt;br /&gt;
Common module subcommands are:&lt;br /&gt;
&lt;br /&gt;
* &amp;lt;code&amp;gt;module load &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt;: load the default version of a particular software.&lt;br /&gt;
* &amp;lt;code&amp;gt;module load &amp;lt;module-name&amp;gt;/&amp;lt;module-version&amp;gt;&amp;lt;/code&amp;gt;: load a specific version of a particular software.&lt;br /&gt;
* &amp;lt;code&amp;gt;module purge&amp;lt;/code&amp;gt;: unload all currently loaded modules.&lt;br /&gt;
* &amp;lt;code&amp;gt;module spider&amp;lt;/code&amp;gt; (or &amp;lt;code&amp;gt;module spider &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt;): list available software packages.&lt;br /&gt;
* &amp;lt;code&amp;gt;module avail&amp;lt;/code&amp;gt;: list loadable software packages.&lt;br /&gt;
* &amp;lt;code&amp;gt;module list&amp;lt;/code&amp;gt;: list loaded modules.&lt;br /&gt;
&lt;br /&gt;
Along with modifying common environment variables, such as PATH and LD_LIBRARY_PATH, these modules also create a SCINET_MODULENAME_ROOT environment variable, which can be used to access commonly needed software directories, such as /include and /lib.&lt;br /&gt;
&lt;br /&gt;
There are handy abbreviations for the module commands. &amp;lt;code&amp;gt;ml&amp;lt;/code&amp;gt; is the same as &amp;lt;code&amp;gt;module list&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;ml &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt; is the same as &amp;lt;code&amp;gt;module load &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt;.&lt;br /&gt;
== Tips for loading software ==&lt;br /&gt;
&lt;br /&gt;
* We advise '''''against''''' loading modules in your .bashrc.  This can lead to very confusing behaviour under certain circumstances.  Our guidelines for .bashrc files can be found [[bashrc guidelines|here]].&lt;br /&gt;
* Instead, load modules by hand when needed, or by sourcing a separate script.&lt;br /&gt;
* Load run-specific modules inside your job submission script.&lt;br /&gt;
* Short names give default versions; e.g. &amp;lt;code&amp;gt;cuda&amp;lt;/code&amp;gt; → &amp;lt;code&amp;gt;cuda/11.0.3&amp;lt;/code&amp;gt;. It is usually better to be explicit about the versions, for future reproducibility.&lt;br /&gt;
* Modules often require other modules to be loaded first.  Solve these dependencies by using [[Using_modules#Module_spider | &amp;lt;code&amp;gt;module spider&amp;lt;/code&amp;gt;]].&lt;br /&gt;
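&lt;br /&gt;
For instance, the modules for one workflow can be kept in a small file that is sourced when needed. This is only a sketch: the file name &amp;lt;code&amp;gt;gromacs-modules.sh&amp;lt;/code&amp;gt; is hypothetical, and the module versions are copied from the GROMACS examples further down this page:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# Write the module commands to a separate script (done once)&lt;br /&gt;
cat &amp;gt; $HOME/gromacs-modules.sh &amp;lt;&amp;lt;'EOF'&lt;br /&gt;
module purge&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 gromacs/2021.5&lt;br /&gt;
EOF&lt;br /&gt;
&lt;br /&gt;
# Source it by hand, or from a job script, whenever the modules are needed&lt;br /&gt;
source $HOME/gromacs-modules.sh&lt;br /&gt;
module list&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Pinning explicit versions in the file keeps the environment reproducible even if the default versions change.&lt;br /&gt;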
&lt;br /&gt;
= Available compilers and interpreters =&lt;br /&gt;
* The &amp;lt;tt&amp;gt;cuda&amp;lt;/tt&amp;gt; module has to be loaded first for GPU software.&lt;br /&gt;
* For most compiled software, one should use the GNU compilers (&amp;lt;tt&amp;gt;gcc&amp;lt;/tt&amp;gt; for C, &amp;lt;tt&amp;gt;g++&amp;lt;/tt&amp;gt; for C++, and &amp;lt;tt&amp;gt;gfortran&amp;lt;/tt&amp;gt; for Fortran). Loading the &amp;lt;tt&amp;gt;gcc&amp;lt;/tt&amp;gt; module makes these available. &lt;br /&gt;
* The IBM XL compiler suite (&amp;lt;tt&amp;gt;xlc_r, xlc++_r, xlf_r&amp;lt;/tt&amp;gt;) is also available if you load one of the &amp;lt;tt&amp;gt;xl&amp;lt;/tt&amp;gt; modules.&lt;br /&gt;
* To compile MPI code, you must additionally load an &amp;lt;tt&amp;gt;openmpi&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;spectrum-mpi&amp;lt;/tt&amp;gt; module.&lt;br /&gt;
&lt;br /&gt;
=== CUDA ===&lt;br /&gt;
&lt;br /&gt;
The currently installed CUDA Toolkits are '''11.0.3''' and '''10.2.2 (10.2.89)''':&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load cuda/11.0.3&lt;br /&gt;
module load cuda/10.2.2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*A compiler (GCC, XL or NVHPC/PGI) module must be loaded in order to use CUDA to build any code.&lt;br /&gt;
The current NVIDIA driver version is 450.119.04.&lt;br /&gt;
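&lt;br /&gt;
As an illustration of the rule above, a CUDA source file can be compiled once both a CUDA module and a compiler module are loaded. In this sketch, &amp;lt;code&amp;gt;saxpy.cu&amp;lt;/code&amp;gt; is a hypothetical source file, and &amp;lt;code&amp;gt;-arch=sm_70&amp;lt;/code&amp;gt; targets compute capability 7.0, which is that of the V100:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# Load a CUDA toolkit together with a host compiler it supports&lt;br /&gt;
module load cuda/11.0.3 gcc/9.3.0&lt;br /&gt;
&lt;br /&gt;
# Build for the V100; nvcc uses g++ from the gcc module for host-side code&lt;br /&gt;
nvcc -arch=sm_70 -O2 -o saxpy saxpy.cu&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;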
&lt;br /&gt;
===GNU Compilers ===&lt;br /&gt;
&lt;br /&gt;
Available GCC modules are:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
gcc/9.3.0 (must load CUDA 11)&lt;br /&gt;
gcc/8.5.0 (must load CUDA 10 or 11)&lt;br /&gt;
gcc/10.3.0 (w/o CUDA)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== IBM XL Compilers ===&lt;br /&gt;
&lt;br /&gt;
To load the native IBM xlc/xlc++ and xlf (Fortran) compilers, run&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load xl/16.1.1.10&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
IBM XL Compilers are enabled for use with NVIDIA GPUs, including support for OpenMP GPU offloading and integration with NVIDIA's nvcc command to compile host-side code for the POWER9 CPU. Information about the IBM XL Compilers can be found at the following links:[https://www.ibm.com/support/knowledgecenter/SSXVZZ_16.1.1/com.ibm.compilers.linux.doc/welcome.html IBM XL C/C++], &lt;br /&gt;
[https://www.ibm.com/support/knowledgecenter/SSAT4T_16.1.1/com.ibm.compilers.linux.doc/welcome.html IBM XL Fortran]&lt;br /&gt;
&lt;br /&gt;
=== OpenMPI ===&lt;br /&gt;
The &amp;lt;tt&amp;gt;openmpi/&amp;lt;version&amp;gt;&amp;lt;/tt&amp;gt; module is available with different compilers, including GCC and XL. The &amp;lt;tt&amp;gt;spectrum-mpi/&amp;lt;version&amp;gt;&amp;lt;/tt&amp;gt; module provides IBM Spectrum MPI.&lt;br /&gt;
&lt;br /&gt;
=== NVHPC/PGI ===&lt;br /&gt;
The PGI compiler is provided as part of NVHPC (NVIDIA HPC SDK).&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load nvhpc/21.3&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Software =&lt;br /&gt;
== Amber20 ==&lt;br /&gt;
&lt;br /&gt;
Users who hold an Amber20 license can build Amber20 from its source code and run it on Mist. '''SOSCIP/SciNet does not provide an Amber license or source code.'''&lt;br /&gt;
&lt;br /&gt;
=== Building Amber20 ===&lt;br /&gt;
Modules that are needed for building Amber20:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load MistEnv/2021a cuda/10.2.2 gcc/8.5.0 anaconda3/2021.05 cmake/3.19.8&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
CMake configuration:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
cmake .. -DCMAKE_INSTALL_PREFIX=$HOME/where-amber-install -DCOMPILER=GNU -DMPI=FALSE -DCUDA=TRUE -DINSTALL_TESTS=TRUE -DDOWNLOAD_MINICONDA=FALSE -DOPENMP=TRUE -DNCCL=FALSE -DAPPLY_UPDATES=TRUE&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Running Amber20 ===&lt;br /&gt;
'''On NVIDIA Pascal P100 and later GPUs such as the V100, Amber does not scale beyond a single GPU.''' It is highly recommended to run Amber20 as a single-GPU job.&lt;br /&gt;
An example job script:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
#SBATCH --time=1:00:0&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP-project-ID&amp;gt;&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/10.2.2 gcc/8.5.0 anaconda3/2021.05&lt;br /&gt;
export PATH=$HOME/where-amber-install/bin:$PATH&lt;br /&gt;
export LD_LIBRARY_PATH=$HOME/where-amber-install/lib:$LD_LIBRARY_PATH&lt;br /&gt;
pmemd.cuda .... &amp;lt;parameters&amp;gt; ...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Anaconda (Python) ==&lt;br /&gt;
Anaconda is a popular distribution of the Python programming language. It contains several common Python libraries such as SciPy and NumPy as pre-built packages, which eases installation. Anaconda is provided as the '''anaconda3''' module.&lt;br /&gt;
&lt;br /&gt;
To set up your own Python environment, load the module and create a conda environment:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n myPythonEnv python=3.8&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*Note: By default, conda environments are located in '''$HOME/.conda/envs'''. The cache (downloaded tarballs and packages) is under '''$HOME/.conda/pkgs'''. You may exceed your disk quota if too many environments are created. To clean the conda cache, '''please run &amp;quot;conda clean -y --all&amp;quot; and &amp;quot;rm -rf $HOME/.conda/pkgs/*&amp;quot; after installing packages'''.&lt;br /&gt;
&lt;br /&gt;
To activate the conda environment (this must be done before running Python):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
source activate myPythonEnv&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Note that you SHOULD NOT use '''conda activate myPythonEnv''' to activate the environment.  This leads to all sorts of problems.  Once the environment is activated, you can update or install packages via '''conda''' or '''pip''':&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda install &amp;lt;package_name&amp;gt;   # preferred way to install packages&lt;br /&gt;
pip install &amp;lt;package_name&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*Once the installation finishes, please clean the cache:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf $HOME/.conda/pkgs/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To deactivate:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
source deactivate&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To remove a conda environment:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda remove --name myPythonEnv --all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To verify that the environment was removed, run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda info --envs&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Submitting Python Job ===&lt;br /&gt;
A single-GPU job example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
#SBATCH --time=1:00:0&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt; #For SOSCIP projects only&lt;br /&gt;
&lt;br /&gt;
module load anaconda3&lt;br /&gt;
source activate myPythonEnv&lt;br /&gt;
python code.py ...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== CuPy ==&lt;br /&gt;
[https://cupy.chainer.org CuPy] is an open-source matrix library accelerated with NVIDIA CUDA. It also uses CUDA-related libraries including cuBLAS, cuDNN, cuRand, cuSolver, cuSPARSE, cuFFT and NCCL to make full use of the GPU architecture. CuPy is an implementation of NumPy-compatible multi-dimensional array on CUDA. CuPy consists of the core multi-dimensional array class, cupy.ndarray, and many functions on it. It supports a subset of numpy.ndarray interface.&lt;br /&gt;
&lt;br /&gt;
CuPy can be installed into any conda environment. The Python packages numpy, six and fastrlock are required. cuDNN and NCCL are optional.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load MistEnv/2021a cuda/11.2.2 gcc/8.5.0 cudnn nccl anaconda3/2021.05&lt;br /&gt;
conda create -n cupy-env python=3.8 numpy six fastrlock&lt;br /&gt;
source activate cupy-env&lt;br /&gt;
CFLAGS=&amp;quot;-I$MODULE_CUDNN_PREFIX/include -I$MODULE_NCCL_PREFIX/include -I$MODULE_CUDA_PREFIX/include&amp;quot; LDFLAGS=&amp;quot;-L$MODULE_CUDNN_PREFIX/lib64 -L$MODULE_NCCL_PREFIX/lib&amp;quot; CUDA_PATH=$MODULE_CUDA_PREFIX pip install cupy&lt;br /&gt;
#building/installing CuPy will take a few minutes&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
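A quick sanity check after the build, assuming it is run in the activated &amp;lt;code&amp;gt;cupy-env&amp;lt;/code&amp;gt; environment on a node with a visible GPU, is to import CuPy and ask how many devices it can see:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# Prints the CuPy version followed by the number of visible CUDA devices&lt;br /&gt;
python -c &amp;quot;import cupy; print(cupy.__version__); print(cupy.cuda.runtime.getDeviceCount())&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;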
&lt;br /&gt;
== Gromacs ==&lt;br /&gt;
[http://www.gromacs.org/ GROMACS] is a versatile package to perform molecular dynamics, i.e. simulate the Newtonian equations of motion for systems with hundreds to millions of particles. It is primarily designed for biochemical molecules like proteins, lipids and nucleic acids that have a lot of complicated bonded interactions, but since GROMACS is extremely fast at calculating the nonbonded interactions (that usually dominate simulations) many groups are also using it for research on non-biological systems, e.g. polymers.&lt;br /&gt;
*'''GROMACS 2019'''&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load MistEnv/2021a cuda/10.2.2 gcc/8.5.0 gromacs/2019.6&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*'''GROMACS 2020 and later''' The thread-MPI version supports full GPU enablement of all key computational sections. The GPU is used throughout the timestep and repeated CPU-GPU transfers are eliminated. Users are advised to carefully verify the results.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load MistEnv/2021a cuda/10.2.2 gcc/8.5.0 gromacs/2020.4&lt;br /&gt;
module load MistEnv/2021a cuda/10.2.2 gcc/8.5.0 gromacs/2020.6&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 gromacs/2021.2&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 openmpi/4.1.1+ucx-1.10.0 gromacs/2021.2   # for testing purposes only&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 gromacs/2021.4&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 gromacs/2021.5&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 gromacs/2022&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Small/Medium Simulation ===&lt;br /&gt;
Because PME domain decomposition is not supported on GPUs, Gromacs calculates PME on the CPU when using multiple GPUs. '''It is always recommended to use a single GPU for small and medium-sized simulations with Gromacs.''' By using only 1 tMPI thread (with multiple OpenMP threads) on a single GPU, both non-bonded PP and PME are automatically offloaded to the GPU when possible.&lt;br /&gt;
* Gromacs 2019 example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/10.2.2 gcc/8.5.0 gromacs/2019.6&lt;br /&gt;
export OMP_NUM_THREADS=8&lt;br /&gt;
export OMP_PLACES=cores&lt;br /&gt;
gmx mdrun -pin off -ntmpi 1 -ntomp 8  ... &amp;lt;other parameters&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Gromacs 2020 or later example: &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 gromacs/2021.5&lt;br /&gt;
export OMP_NUM_THREADS=8&lt;br /&gt;
export OMP_PLACES=cores&lt;br /&gt;
export GMX_FORCE_UPDATE_DEFAULT_GPU=true&lt;br /&gt;
gmx mdrun -pin off -ntmpi 1 -ntomp 8  ... &amp;lt;other parameters&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Large Simulation ===&lt;br /&gt;
If the memory available to a single-GPU job (~58GB) is not sufficient for the simulation, multiple GPUs can be used. It is suggested to start testing with one full node with 4 GPUs and to force PME onto the GPU. Multiple PME ranks are not supported with PME on GPU, so if a GPU is used for the PME calculation, -npme (the number of PME ranks) must be set to 1. If PME has less work than PP, it is suggested to run multiple ranks per GPU, so that the GPU running the PME rank can also do some work for the PP rank(s).&lt;br /&gt;
'''If your simulation fits in a single-GPU job, please use a single GPU to get much higher efficiency. Do not waste 3 additional GPUs for only a small performance improvement.'''&lt;br /&gt;
*An example using 4 GPUs, 7 PP ranks/tmpi threads + 1 PME rank/tmpi thread: ('''-pin on -pme gpu -npme 1''' must be added to mdrun command in order to force GPU to do PME)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=1&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3  gcc/9.4.0 gromacs/2021.5&lt;br /&gt;
&lt;br /&gt;
export OMP_NUM_THREADS=4&lt;br /&gt;
gmx mdrun -ntmpi 8 -pin on -pme gpu -npme 1 ... &amp;lt;add your parameters&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*It is suggested to also test using '''-ntmpi 4''' and '''export OMP_NUM_THREADS=8''' if you receive a NOTE in Gromacs output saying &amp;quot;% performance was lost because the PME ranks had more work to do than the PP ranks&amp;quot;. In this case, NVIDIA MPS is not needed since there is only one MPI rank per GPU.&lt;br /&gt;
*'''Please note that the solving of PME on GPU is still only the initial version supporting this behaviour, and comes with a set of limitations outlined further below.'''&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
* Only a PME order of 4 is supported on GPUs.&lt;br /&gt;
* PME will run on a GPU only when exactly one rank has a PME task, ie. decompositions with multiple ranks doing PME are not supported.&lt;br /&gt;
* Only single precision is supported.&lt;br /&gt;
* Free energy calculations where charges are perturbed are not supported, because only single PME grids can be calculated.&lt;br /&gt;
* Only dynamical integrators are supported (ie. leap-frog, Velocity Verlet, stochastic dynamics)&lt;br /&gt;
* LJ PME is not supported on GPUs.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*An example using 4 GPUs, '''PME on CPU''': ('''-pin on''' must be added to mdrun command for proper CPU thread bindings)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=1&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3  gcc/9.4.0 gromacs/2021.5&lt;br /&gt;
&lt;br /&gt;
export OMP_NUM_THREADS=4&lt;br /&gt;
gmx mdrun -ntmpi 8 -pin on  ... &amp;lt;add your parameters&amp;gt;&lt;br /&gt;
&lt;br /&gt;
# &amp;quot;-ntmpi 16, OMP_NUM_THREADS=2&amp;quot; and &amp;quot;-ntmpi 4, OMP_NUM_THREADS=8&amp;quot; should also be tested.  &lt;br /&gt;
# num_thread_MPI_ranks(-ntmpi) * num_OpenMP_threads = 32&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*'''If your simulation fits in a single-GPU job, please use a single GPU to get much higher efficiency. Do not waste 3 additional GPUs for only a small performance improvement.'''&lt;br /&gt;
*'''NOTE: The above examples will NOT work with multiple nodes. If simulation is too large for a single GPU node, please contact SciNet/SOSCIP support.'''&lt;br /&gt;
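&lt;br /&gt;
The rule stated in the script comment above (thread-MPI ranks multiplied by OpenMP threads equals the 32 cores of a node) can be used to list the combinations worth benchmarking. A minimal sketch, in which the candidate rank counts 4, 8 and 16 are simply the ones mentioned in the examples:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
cores=32&lt;br /&gt;
for ntmpi in 4 8 16; do&lt;br /&gt;
    # OpenMP threads per rank so that ranks * threads fills the node&lt;br /&gt;
    echo &amp;quot;-ntmpi $ntmpi with OMP_NUM_THREADS=$((cores / ntmpi))&amp;quot;&lt;br /&gt;
done&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Each printed pair can be tried in a short job and compared using the performance figure that Gromacs reports at the end of its log.&lt;br /&gt;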
&lt;br /&gt;
== NAMD ==&lt;br /&gt;
[http://www.ks.uiuc.edu/Research/namd/ NAMD] is a parallel, object-oriented molecular dynamics code designed for high-performance simulation of large biomolecular systems.&lt;br /&gt;
=== 2.14 ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 spectrum-mpi/10.4.0 namd/2.14&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
==== Running with single GPU ====&lt;br /&gt;
If you have many jobs to run, it is always suggested to run with a single GPU per job. This makes jobs easier to schedule and gives better overall performance.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 spectrum-mpi/10.4.0 namd/2.14&lt;br /&gt;
scontrol show hostnames &amp;gt; nodelist-$SLURM_JOB_ID&lt;br /&gt;
&lt;br /&gt;
`which charmrun` -npernode 1 -bind-to none -hostfile nodelist-$SLURM_JOB_ID `which namd2` +idlepoll +ppn 8 +p 8 stmv.namd&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Running with one process per node (4 GPUs)====&lt;br /&gt;
An example of the job script (using 1 node, '''one process per node''',  32 CPU threads per process + 4 GPUs per process):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=1&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 spectrum-mpi/10.4.0 namd/2.14&lt;br /&gt;
scontrol show hostnames &amp;gt; nodelist-$SLURM_JOB_ID&lt;br /&gt;
&lt;br /&gt;
`which charmrun` -npernode 1 -hostfile nodelist-$SLURM_JOB_ID `which namd2` +setcpuaffinity +pemap 0-127:4 +idlepoll +ppn 32 +p $((32*SLURM_NTASKS)) stmv.namd&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
==== Running with one process per GPU (4 GPUs)====&lt;br /&gt;
NAMD may scale better if using '''one process per GPU'''. Please do your own benchmark.&lt;br /&gt;
An example of the job script (using 1 node, '''one process per GPU''',  8 CPU threads per process):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=4&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 spectrum-mpi/10.4.0 namd/2.14&lt;br /&gt;
scontrol show hostnames &amp;gt; nodelist-$SLURM_JOB_ID&lt;br /&gt;
&lt;br /&gt;
`which charmrun` -npernode 4 -hostfile nodelist-$SLURM_JOB_ID `which namd2` +setcpuaffinity +pemap 0-127:4 +idlepoll +ppn 8 +p $((8*SLURM_NTASKS)) stmv.namd&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Open-CE ==&lt;br /&gt;
[https://github.com/open-ce/open-ce Open-CE] is an '''IBM''' repo for feedstock collection, environment data, and scripts for building Tensorflow, Pytorch, and other machine learning packages and dependencies. Open-CE is distributed as a '''conda channel''' on Mist cluster.&lt;br /&gt;
'''Available packages and versions are listed here [https://github.com/open-ce/open-ce/releases/tag/open-ce-v1.5.2 Open-CE Releases]'''. Currently only python 3.8 and CUDA 11.2 are supported. If you need a different python or cuda version, please contact SOSCIP or SciNet support.&lt;br /&gt;
&lt;br /&gt;
*Packages can be installed by setting Open-CE conda channel:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda install -c /scinet/mist/ibm/open-ce python=3.8 cudatoolkit=11.2 PACKAGE&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*Once the installation finishes, please clean the cache:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf $HOME/.conda/pkgs/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|+Available Packages:&lt;br /&gt;
|-&lt;br /&gt;
|Tensorflow&lt;br /&gt;
|TensorFlow Estimators&lt;br /&gt;
|TensorFlow Probability&lt;br /&gt;
|TensorBoard&lt;br /&gt;
|TensorBoard Data Server&lt;br /&gt;
|TensorFlow Text&lt;br /&gt;
|TensorFlow Model Optimizations&lt;br /&gt;
|TensorFlow Addons&lt;br /&gt;
|TensorFlow Datasets&lt;br /&gt;
|TensorFlow Hub&lt;br /&gt;
|-&lt;br /&gt;
|TensorFlow MetaData&lt;br /&gt;
|PyTorch&lt;br /&gt;
|TorchText&lt;br /&gt;
|TorchVision&lt;br /&gt;
|PyTorch Lightning&lt;br /&gt;
|PyTorch Lightning Bolts&lt;br /&gt;
|ONNX&lt;br /&gt;
|Onnx-runtime&lt;br /&gt;
|skl2onnx&lt;br /&gt;
|tf2onnx&lt;br /&gt;
|-&lt;br /&gt;
|onnxmltools&lt;br /&gt;
|onnxconverter-common&lt;br /&gt;
|XGBoost&lt;br /&gt;
|LightGBM&lt;br /&gt;
|Transformers&lt;br /&gt;
|Tokenizers&lt;br /&gt;
|SentencePiece&lt;br /&gt;
|Spacy&lt;br /&gt;
|DALI&lt;br /&gt;
|OpenCV&lt;br /&gt;
|-&lt;br /&gt;
|Horovod&lt;br /&gt;
|PyArrow&lt;br /&gt;
|grpc&lt;br /&gt;
|uwsgi&lt;br /&gt;
|ORC&lt;br /&gt;
|Mamba&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== PyTorch ==&lt;br /&gt;
=== Installing from IBM Open-CE Conda Channel ===&lt;br /&gt;
The easiest way to install PyTorch on Mist is to create a conda environment and install PyTorch from IBM's Open-CE Conda channel.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n pytorch_env python=3.8&lt;br /&gt;
source activate pytorch_env&lt;br /&gt;
&lt;br /&gt;
#must force to use Open-CE channel to avoid the cpu-only version of PyTorch from default Anaconda channel&lt;br /&gt;
conda config --prepend channels /scinet/mist/ibm/open-ce&lt;br /&gt;
conda config --set channel_priority strict&lt;br /&gt;
&lt;br /&gt;
conda install -c /scinet/mist/ibm/open-ce pytorch=1.10.2 cudatoolkit=11.2&lt;br /&gt;
# or:&lt;br /&gt;
conda install -c /scinet/mist/ibm/open-ce-1.2 pytorch=1.7.1 cudatoolkit=11.0   # or cudatoolkit=10.2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Once the installation finishes, please clean the cache:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf $HOME/.conda/pkgs/*&lt;br /&gt;
#remove .condarc to reset conda channel priority&lt;br /&gt;
rm $HOME/.condarc&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To get deterministic results, add the command below to your job script before the python command; see details here: [https://github.com/pytorch/pytorch/issues/39849]&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
export CUBLAS_WORKSPACE_CONFIG=:4096:2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
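To confirm that the GPU-enabled build (rather than a CPU-only one) was installed, a one-line check can be run in the activated &amp;lt;code&amp;gt;pytorch_env&amp;lt;/code&amp;gt; environment. This is a sketch that assumes a node with a visible GPU:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# Prints the PyTorch version and whether CUDA is usable (True is expected for the GPU build)&lt;br /&gt;
python -c &amp;quot;import torch; print(torch.__version__, torch.cuda.is_available())&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;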
&lt;br /&gt;
== RAPIDS ==&lt;br /&gt;
[https://rapids.ai RAPIDS] is a suite of open source software libraries that gives you the freedom to execute end-to-end data science and analytics pipelines entirely on GPUs. The RAPIDS data science framework includes a collection of libraries: '''cuDF (GPU DataFrames)''', '''cuML (GPU Machine Learning Algorithms)''', '''cuStrings (GPU String Manipulation)''', etc.&lt;br /&gt;
&lt;br /&gt;
=== Installing from IBM Conda Channel ===&lt;br /&gt;
The easiest way to install RAPIDS on Mist is to create a conda environment with Python 3.6 or 3.7 and install powerai-rapids from IBM's Conda channel.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n rapids_env python=3.7&lt;br /&gt;
source activate rapids_env&lt;br /&gt;
conda install -c https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda-early-access/ powerai-rapids&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Once the installation finishes, please clean the cache:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf $HOME/.conda/pkgs/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== TensorFlow and Keras ==&lt;br /&gt;
=== Installing from IBM Conda Channel ===&lt;br /&gt;
The easiest way to install TensorFlow and Keras on Mist is to create a conda environment and install TensorFlow from IBM's Open-CE Conda channel.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n tf_env python=3.8&lt;br /&gt;
source activate tf_env&lt;br /&gt;
conda install -c /scinet/mist/ibm/open-ce tensorflow==2.7.1 cudatoolkit=11.2&lt;br /&gt;
# or:&lt;br /&gt;
conda install -c /scinet/mist/ibm/open-ce-1.2 tensorflow==2.4.3 cudatoolkit=11.0   # or cudatoolkit=10.2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Once the installation finishes, please clean the cache:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf $HOME/.conda/pkgs/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Ray ==&lt;br /&gt;
Ray is an API for building distributed applications. A local wheel is available for Python 3.8, but some tinkering is required to successfully install it. Please activate your Conda environment and follow this installation recipe:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda install tabulate tensorboardX pandas dataclasses aiohttp aioredis click colorama colorful filelock gpustat grpcio jsonschema numpy protobuf py-spy pyyaml requests redis opencensus prometheus_client beautifulsoup4 soupsieve cython wheel&lt;br /&gt;
pip install msgpack google aiohttp_cors&lt;br /&gt;
# Manually add py-spy version info, since Conda forgot to do that&lt;br /&gt;
PYSPYVERSION=$(py-spy --version | cut -d ' ' -f 2)&lt;br /&gt;
mkdir $CONDA_PREFIX/lib/python3.8/site-packages/py_spy-$PYSPYVERSION.dist-info&lt;br /&gt;
echo -e &amp;quot;Metadata-Version: 2.1\nName: py-spy\nVersion: $PYSPYVERSION&amp;quot; &amp;gt; $CONDA_PREFIX/lib/python3.8/site-packages/py_spy-$PYSPYVERSION.dist-info/METADATA&lt;br /&gt;
pip install /scinet/mist/wheelhouse/experimental/2021a/ray-1.0.1.post1-cp38-cp38-linux_ppc64le.whl&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Testing and debugging =&lt;br /&gt;
Test your code before submitting it to the cluster, both to confirm that it is correct and to find out what resources it needs.&lt;br /&gt;
* Small test jobs can be run on the login node.  Rule of thumb: tests should run no more than a couple of minutes, take at most about 1-2GB of memory, and use no more than one GPU and a few cores.&lt;br /&gt;
&amp;lt;!-- * You can run the [[Parallel Debugging with DDT|DDT]] debugger on the login nodes after &amp;lt;code&amp;gt;module load ddt&amp;lt;/code&amp;gt;. --&amp;gt;&lt;br /&gt;
* For short tests that do not fit on a login node, or for which you need a dedicated node, request an interactive debug job with the &amp;lt;tt&amp;gt;debugjob&amp;lt;/tt&amp;gt; command:&lt;br /&gt;
 mist-login01:~$ debugjob --clean -g G&lt;br /&gt;
where G is the number of GPUs. G=1 gives an interactive session for 2 hours, G=4 gives a single node with 4 GPUs for 30 minutes, and G=8 (the maximum) gives 2 nodes, each with 4 GPUs, for 15 minutes.  The &amp;lt;tt&amp;gt;--clean&amp;lt;/tt&amp;gt; argument is optional but recommended, as it starts the session without any modules loaded, thus mimicking more closely what happens when you submit a job script. Users need to load modules and activate the conda environment after a debug job starts. If the &amp;lt;tt&amp;gt;--clean&amp;lt;/tt&amp;gt; flag was omitted, it is recommended to run 'conda clean' before 'source activate ENV' in the debug job.&lt;br /&gt;
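&lt;br /&gt;
The three cases above follow the 2/n_gpu hours rule listed for the debug partition in the Limits table. As a minimal sketch, the helper below (a hypothetical function, not a SciNet tool) computes the maximum session length in minutes, assuming that only 1, 4 or 8 GPUs can be requested:&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/bash
# Illustrative helper (not a SciNet tool): maximum debugjob walltime on Mist,
# in minutes, using the 2/n_gpu hours rule (120 minutes divided by G).
mist_debug_minutes() {
    local gpus="$1"
    case "$gpus" in
        # Assumption: only single-GPU, one-node and two-node sessions exist.
        1|4|8) echo $(( 120 / gpus )) ;;
        *) echo "unsupported GPU count: $gpus" ; return 1 ;;
    esac
}

for g in 1 4 8; do
    echo "G=$g: up to $(mist_debug_minutes "$g") minutes"
done
```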
&lt;br /&gt;
= Submitting jobs =&lt;br /&gt;
Once you have compiled and tested your code or workflow on the Mist login nodes, and confirmed that it behaves correctly, you are ready to submit jobs to the cluster.  Your jobs will run on some of Mist's 53 compute nodes.  When and where your job runs is determined by the scheduler.&lt;br /&gt;
&lt;br /&gt;
Mist uses SLURM as its job scheduler. It is configured to allow only '''Single-GPU jobs''' and '''Full-node jobs (4 GPUs per node)'''.&lt;br /&gt;
&lt;br /&gt;
You submit jobs from a login node by passing a script to the sbatch command:&lt;br /&gt;
&lt;br /&gt;
 mist-login01:scratch$ sbatch jobscript.sh&lt;br /&gt;
&lt;br /&gt;
This puts the job in the queue. It will run on the compute nodes in due course. In most cases, you should not submit from your $HOME directory, but rather, from your $SCRATCH directory, so that the output of your compute job can be written out (as mentioned above, $HOME is read-only on the compute nodes).&lt;br /&gt;
&lt;br /&gt;
Example job scripts can be found below.&lt;br /&gt;
Keep in mind:&lt;br /&gt;
* Scheduling is by single GPU or by full node, so request either 1 GPU or 4 GPUs per node.&lt;br /&gt;
* Your job's maximum walltime is 24 hours. &lt;br /&gt;
* Jobs must write their output to your scratch or project directory (home is read-only on compute nodes).&lt;br /&gt;
* Compute nodes have no internet access.&lt;br /&gt;
* Your job script will not remember the modules you have loaded, so it needs to contain &amp;quot;module load&amp;quot; commands of all the required modules (see examples below). &lt;br /&gt;
== SOSCIP Users ==&lt;br /&gt;
*[https://www.soscip.org SOSCIP] is a consortium to bring together industrial partners and academic researchers and provide them with sophisticated advanced computing technologies and expertise to solve social, technical and business challenges across sectors and drive economic growth.&lt;br /&gt;
&lt;br /&gt;
If you are working on a SOSCIP project, please contact [mailto:soscip-support@scinet.utoronto.ca soscip-support@scinet.utoronto.ca] to have your user account added to SOSCIP project accounts. SOSCIP users need to submit jobs with an additional SLURM flag to get higher priority:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#SBATCH -A soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt;    #e.g. soscip-3-001&lt;br /&gt;
# or, equivalently:&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Single-GPU job script ==&lt;br /&gt;
A single-GPU job gets a quarter of a node: 1 GPU, 8 CPU cores (32 hardware threads) and ~58GB of CPU memory. '''Users should never request CPUs or memory explicitly.''' When running an MPI program, set --ntasks to the number of MPI ranks. '''Do NOT set --ntasks for non-MPI programs.''' &lt;br /&gt;
*It is suggested to use NVIDIA Multi-Process Service (MPS) if running multiple MPI ranks on one GPU.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
#SBATCH --time=1:00:00&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt; #For SOSCIP projects only&lt;br /&gt;
&lt;br /&gt;
module load anaconda3&lt;br /&gt;
source activate conda_env&lt;br /&gt;
python code.py ...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Full-node job script ==&lt;br /&gt;
'''If you are not sure the program can be executed on multiple GPUs, please follow the single-gpu job instruction above or contact SciNet/SOSCIP support.'''&lt;br /&gt;
&lt;br /&gt;
A multi-GPU job should request at least one full node (4 GPUs). Users need to specify the &amp;quot;compute_full_node&amp;quot; partition in order to get all the resources on a node. &lt;br /&gt;
*An example for a 1-node job:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=4 #this only affects MPI job&lt;br /&gt;
#SBATCH --time=1:00:00&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt; #For SOSCIP projects only&lt;br /&gt;
&lt;br /&gt;
module load &amp;lt;modules you need&amp;gt;&lt;br /&gt;
# Run your program here&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Limits ==&lt;br /&gt;
&lt;br /&gt;
There are limits to the size and duration of your jobs, the number of jobs you can run and the number of jobs you can have queued.&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!Usage&lt;br /&gt;
!Partition&lt;br /&gt;
!Running jobs&lt;br /&gt;
!Jobs in queue&lt;br /&gt;
!Min. size of jobs&lt;br /&gt;
!Max. size of jobs&lt;br /&gt;
!Min. walltime&lt;br /&gt;
!Max. walltime &lt;br /&gt;
|-&lt;br /&gt;
|Compute jobs ||compute || 100 GPUs || 1000 || 1 GPU (8&amp;amp;nbsp;cores) || default:&amp;amp;nbsp;4&amp;amp;nbsp;nodes&amp;amp;nbsp;(16&amp;amp;nbsp;GPUs) &amp;lt;br&amp;gt; with&amp;amp;nbsp;allocation:&amp;amp;nbsp;4&amp;amp;nbsp;nodes&amp;amp;nbsp;(16&amp;amp;nbsp;GPUs)|| 15 minutes || 24 hours&lt;br /&gt;
|-&lt;br /&gt;
|Testing or troubleshooting || debug || 1 || 1 || 1 GPU (8 cores) || 2 nodes (8 GPUs)|| N/A || 2/n&amp;lt;sub&amp;gt;gpu&amp;lt;/sub&amp;gt; hours&lt;br /&gt;
|-&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Even if you respect these limits, your jobs will still have to wait in the queue. The waiting time depends on many factors such as your group's allocation amount, how much allocation has been used in the recent past, the number of requested nodes and walltime, and how many other jobs are waiting in the queue.&lt;br /&gt;
&lt;br /&gt;
= Jupyter Notebooks =&lt;br /&gt;
SciNet’s [[Jupyter Hub]] is a Niagara-type node; it has a different CPU architecture and no GPUs. Conda environments prepared on Mist will not work properly there. Users who need Jupyter Notebook to develop and test some aspects of their workflow can create their own server on the Mist login node and use an SSH tunnel to connect to it from outside. Users who choose to do so have to keep in mind that the login node is a shared resource, and heavy calculations should be done only on compute nodes. Processes (including IPython kernels used by the notebooks) are limited to one hour of total CPU time: idle time does not count toward this hour, and use of multiple cores counts proportionally to the number of cores (i.e. a kernel using all 128 virtual cores on the node will be killed after 28 seconds). Idle notebooks can still burden the node by hogging system and GPU memory; please be mindful of other users and terminate notebooks when your work is done.&lt;br /&gt;
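&lt;br /&gt;
The arithmetic behind the CPU-time limit can be sketched as follows. The function name is hypothetical, and the calculation assumes the process keeps all the cores it uses fully busy:&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/bash
# One hour of total CPU time, shared across all cores a process keeps busy.
CPU_LIMIT_SECONDS=3600

# Hypothetical helper: wall-clock seconds before a process that keeps
# N cores fully busy reaches the limit (integer division, rounds down).
seconds_until_killed() {
    local busy_cores="$1"
    echo $(( CPU_LIMIT_SECONDS / busy_cores ))
}

seconds_until_killed 1     # a single busy core: 3600
seconds_until_killed 128   # all 128 virtual cores: 28
```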
&lt;br /&gt;
As an example, let us create a new Conda environment and activate it:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n jupyter_env python=3.7&lt;br /&gt;
source activate jupyter_env&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Install the Jupyter Notebook server:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda install notebook&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Running the notebook server ==&lt;br /&gt;
When the Conda environment is active, enter:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
jupyter-notebook&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
By default, the Jupyter Notebook server uses port 8888 (can be overridden with the &amp;lt;code&amp;gt;--port&amp;lt;/code&amp;gt; option). If another user has already started their own server, the default port may be busy, in which case the server will be listening on a different port. Once launched, the server will output some information to the terminal that will include the actual port number used and a 48-character token. For example:&lt;br /&gt;
&amp;lt;pre&amp;gt;http://localhost:8890/?token=54c4090d……&amp;lt;/pre&amp;gt;&lt;br /&gt;
In this example, the server is listening on port 8890.&lt;br /&gt;
&lt;br /&gt;
== Creating a tunnel ==&lt;br /&gt;
In order to access this port remotely (i.e. from your office or home), an [https://en.wikipedia.org/wiki/Tunneling_protocol#Secure_Shell_tunneling SSH tunnel] has to be established. Please refer to your SSH client’s documentation for instructions on how to do that. For the OpenSSH client (standard in most Linux distributions and macOS), a tunnel can be opened in a separate terminal session to the one where the Jupyter Notebook server is running. In the new terminal, issue this command:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -L8888:localhost:8890 &amp;lt;username&amp;gt;@mist.scinet.utoronto.ca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Replace &amp;lt;code&amp;gt;&amp;lt;username&amp;gt;&amp;lt;/code&amp;gt; with your actual username. The tunnel is open as long as this SSH connection is alive. In this example, we tunnel the Mist login node’s port 8890 (where our server is assumed to be running) to our home computer’s port 8888 (any other free port is fine). The notebook can be accessed in the browser at the &amp;lt;code&amp;gt;&amp;lt;nowiki&amp;gt;http://localhost:8888&amp;lt;/nowiki&amp;gt;&amp;lt;/code&amp;gt; address (followed by &amp;lt;code&amp;gt;/?token=54c4090d……&amp;lt;/code&amp;gt;, or the token can be input on the webpage).&lt;br /&gt;
&lt;br /&gt;
== Using Jupyter on compute nodes ==&lt;br /&gt;
&lt;br /&gt;
You can use the instructions here to set up a Jupyter Notebook server on a compute node (including a [[#Testing_and_debugging|debugjob]]). '''We strongly discourage''' you from running an interactive notebook on a compute node (other than for a debugjob): scheduled jobs run at arbitrary times and are not meant to be interactive. Jupyter notebooks can be run non-interactively or converted to Python scripts.&lt;br /&gt;
&lt;br /&gt;
To launch the Jupyter Notebook server, load the &amp;lt;code&amp;gt;anaconda3&amp;lt;/code&amp;gt; module and activate your environment as before (by adding the appropriate lines to the submission script, if you are not using the compute node with an interactive shell). Launching the server has to be done like so:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
HOME=/dev/shm/$USER jupyter-notebook&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
That is because Jupyter will fail unless it can write to the home folder, which is read-only from compute nodes. This modification of the &amp;lt;code&amp;gt;$HOME&amp;lt;/code&amp;gt; environment variable will carry over into the notebooks, which is usually not a problem, but in case the notebook relies on this environment variable (e.g. to read certain files), it can be reset manually in the notebook (&amp;lt;code&amp;gt;import os; os.environ['HOME']=……&amp;lt;/code&amp;gt;).&lt;br /&gt;
&lt;br /&gt;
Because compute nodes are not accessible from the Internet, tunneling has to be done twice, once from the remote location (office or home) to the Mist login node, and then from the login node to the compute node. Assuming the server is running on port 8890 of the mist006 node, open the first tunnel in a new terminal session in the remote computer:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -L8888:localhost:9999 &amp;lt;username&amp;gt;@mist.scinet.utoronto.ca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
where 9999 is any available port on the Mist login node (to test port availability enter &amp;lt;code&amp;gt;ss -Hln src :9999&amp;lt;/code&amp;gt; in the terminal when connected to the Mist login node; an empty output indicates that the port is free). In the same session in the login node that was created with the above command, open the second tunnel to the compute node:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -L9999:localhost:8890 mist006&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Be aware that the second tunnel will automatically disconnect once the job on the compute node times out or is relinquished. The Jupyter Notebook server running on the compute node can now be accessed from the browser as in the previous subsection.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
= Support =&lt;br /&gt;
&lt;br /&gt;
SciNet inquiries:&lt;br /&gt;
* [mailto:support@scinet.utoronto.ca support@scinet.utoronto.ca]&lt;br /&gt;
&lt;br /&gt;
SOSCIP inquiries:&lt;br /&gt;
*[mailto:soscip-support@scinet.utoronto.ca soscip-support@scinet.utoronto.ca]&lt;/div&gt;</summary>
		<author><name>Ymeiron</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Slurm&amp;diff=3644</id>
		<title>Slurm</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Slurm&amp;diff=3644"/>
		<updated>2022-03-09T15:57:34Z</updated>

		<summary type="html">&lt;p&gt;Ymeiron: Updated debugjob max time&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The queueing system used at SciNet is based around the [https://slurm.schedmd.com Slurm Workload Manager].  This &amp;quot;scheduler&amp;quot;, Slurm, determines which jobs will be run on which compute nodes, and when.  This page outlines how to submit jobs, how to interact with the scheduler, and some of the most common Slurm commands.&lt;br /&gt;
&lt;br /&gt;
Some common questions about the queuing system can be found on the [[FAQ]] as well.&lt;br /&gt;
&lt;br /&gt;
= Submitting jobs =&lt;br /&gt;
&lt;br /&gt;
You submit jobs from a Niagara login node.  This is done by passing a script to the sbatch command:&lt;br /&gt;
&lt;br /&gt;
 nia-login07:~$ sbatch jobscript.sh&lt;br /&gt;
&lt;br /&gt;
This puts the job, described by the job script, into the queue.  The scheduler will run the job on the compute nodes in due course.  A typical submission script is as follows.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;#!/bin/bash &lt;br /&gt;
#SBATCH --nodes=2&lt;br /&gt;
#SBATCH --ntasks-per-node=40&lt;br /&gt;
#SBATCH --time=1:00:00&lt;br /&gt;
#SBATCH --job-name mpi_job&lt;br /&gt;
#SBATCH --output=mpi_output_%j.txt&lt;br /&gt;
#SBATCH --mail-type=FAIL&lt;br /&gt;
&lt;br /&gt;
cd $SLURM_SUBMIT_DIR&lt;br /&gt;
&lt;br /&gt;
module load intel/2018.2&lt;br /&gt;
module load openmpi/3.1.0&lt;br /&gt;
&lt;br /&gt;
mpirun ./mpi_example&lt;br /&gt;
# or &amp;quot;srun ./mpi_example&amp;quot;&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Some notes about this example:&lt;br /&gt;
* The first line indicates that this is a bash script.&lt;br /&gt;
* Lines starting with &amp;lt;code&amp;gt;#SBATCH&amp;lt;/code&amp;gt; go to SLURM.&lt;br /&gt;
* sbatch reads these lines as a job request (which it gives the name &amp;lt;code&amp;gt;mpi_job&amp;lt;/code&amp;gt;).&lt;br /&gt;
* In this case, SLURM looks for 2 nodes with 40 cores each on which to run 80 tasks, for 1 hour.&lt;br /&gt;
* Note that the mpirun flag &amp;quot;--ppn&amp;quot; (processors per node) is ignored.  Slurm takes care of this detail.&lt;br /&gt;
* Once the scheduler finds a spot to run the job, it runs the script:&lt;br /&gt;
** It changes to the submission directory;&lt;br /&gt;
** Loads modules;&lt;br /&gt;
** Runs the &amp;lt;code&amp;gt;mpi_example&amp;lt;/code&amp;gt; application.&lt;br /&gt;
* To use hyperthreading, just change --ntasks-per-node=40 to --ntasks-per-node=80, and add --bind-to none to the mpirun command (the latter is necessary for OpenMPI only, not when using IntelMPI).&lt;br /&gt;
&lt;br /&gt;
To create a job script appropriate for your work, you must modify the commands above to instruct Slurm to run the commands you need run.&lt;br /&gt;
&lt;br /&gt;
== Things to remember ==&lt;br /&gt;
&lt;br /&gt;
There are some things to always bear in mind when crafting your submission script:&lt;br /&gt;
* Scheduling is by node, so in multiples of 40 cores.  You are expected to use all 40 cores!  If you are running serial jobs, and need assistance bundling your work into multiples of 40, please see the [[Running_Serial_Jobs_on_Niagara | serial jobs]] page.&lt;br /&gt;
* Jobs must write to your scratch or project directory (home is read-only on compute nodes).&lt;br /&gt;
* Compute nodes have no internet access. Download data you need before submitting your job.&lt;br /&gt;
* Your job script will not remember the modules you have loaded, so it needs to contain &amp;quot;module load&amp;quot; commands of all the required modules (see examples below).&lt;br /&gt;
* Jobs will run under your group's RRG allocation.  If your group does not have an allocation, your job will run under your group's RAS allocation (previously called `default' allocation).  Note that groups with an allocation cannot run under a default allocation.&lt;br /&gt;
* The maximum [[Wallclock_time | walltime]] for all users is 24 hours. The minimum and default walltime is 15 minutes.&lt;br /&gt;
&lt;br /&gt;
= Scheduling details =&lt;br /&gt;
&lt;br /&gt;
We now present the details of how to write a job script, and some extra commands which you might find useful.&lt;br /&gt;
&lt;br /&gt;
== SLURM nomenclature: jobs, nodes, tasks, cpus, cores, threads  ==&lt;br /&gt;
&lt;br /&gt;
SLURM has a somewhat different way of referring to things like MPI processes and thread tasks, as compared to our previous scheduler, MOAB.  The SLURM nomenclature is reflected in the names of scheduler options (i.e., resource requests). SLURM strictly enforces those requests, so it is important to get this right.&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!term &lt;br /&gt;
!meaning &lt;br /&gt;
!SLURM term&lt;br /&gt;
!related scheduler options &lt;br /&gt;
|-&lt;br /&gt;
|job&lt;br /&gt;
|scheduled piece of work for which specific resources were requested.&lt;br /&gt;
|job&lt;br /&gt;
|&amp;lt;tt&amp;gt;sbatch, salloc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|node&lt;br /&gt;
|basic computing component with several cores (40 for Niagara) that share memory  &lt;br /&gt;
|node&lt;br /&gt;
|&amp;lt;tt&amp;gt;--nodes -N&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|mpi process&lt;br /&gt;
|one of a group of running programs using Message Passing Interface for parallel computing&lt;br /&gt;
|task&lt;br /&gt;
|&amp;lt;tt&amp;gt;--ntasks -n --ntasks-per-node&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|core ''or'' physical cpu&lt;br /&gt;
|A fully functional independent physical execution unit.&lt;br /&gt;
| -   &lt;br /&gt;
| -&lt;br /&gt;
|-&lt;br /&gt;
|logical cpu&lt;br /&gt;
|An execution unit that the operating system can assign work to. Operating systems can be configured to overload physical cores with multiple logical cpus using hyperthreading.&lt;br /&gt;
|cpu&lt;br /&gt;
|&amp;lt;tt&amp;gt;--cpus-per-task&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|thread&lt;br /&gt;
|one of possibly multiple simultaneous execution paths within a program, which can share memory.&lt;br /&gt;
| -&lt;br /&gt;
| &amp;lt;tt&amp;gt;--cpus-per-task&amp;lt;/tt&amp;gt; '''and''' &amp;lt;tt&amp;gt;OMP_NUM_THREADS&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|hyperthread&lt;br /&gt;
|a thread run in a collection of threads that is larger than the number of physical cores.&lt;br /&gt;
| -&lt;br /&gt;
| -&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Scheduling by Node ==&lt;br /&gt;
&lt;br /&gt;
* On many systems that use SLURM, the scheduler will deduce from the job script specifications (the number of tasks and the number of cpus-per-node) what resources should be allocated.  On Niagara, this is a bit different.&lt;br /&gt;
* All job resource requests on Niagara are scheduled as a multiple of '''nodes'''.&lt;br /&gt;
* The nodes that your jobs run on are exclusively yours.&lt;br /&gt;
** No other users are running anything on them.&lt;br /&gt;
** You can ssh into them, while your job is running, to see how things are going.&lt;br /&gt;
* Whatever you request of the scheduler, your request will always be translated into a multiple of nodes allocated to your job.&lt;br /&gt;
* Memory requests to the scheduler are of no use. Your job always gets N x 202GB of RAM, where N is the number of nodes.  Each node has about 202GB of RAM available.&lt;br /&gt;
* You should try to use all the cores on the nodes allocated to your job. Since there are 40 cores per node, your job should use N x 40 cores. If this is not the case, we will contact you to help you optimize your workflow.  Again, users who have serial jobs should consult the [[Running Serial Jobs on Niagara | serial jobs]] page.&lt;br /&gt;
&lt;br /&gt;
== Hyperthreading: Logical CPUs vs. cores ==&lt;br /&gt;
&lt;br /&gt;
Hyperthreading, a technology that leverages more of the physical hardware by pretending there are twice as many logical CPUs as real cores, is enabled on Niagara.&lt;br /&gt;
The operating system and scheduler see 80 logical CPUs.&lt;br /&gt;
&lt;br /&gt;
Using 80 logical CPUs versus 40 real cores typically gives about a 5-10% speedup, depending on your application (your mileage may vary).&lt;br /&gt;
&lt;br /&gt;
Because Niagara is scheduled by node, hyperthreading is actually fairly easy to use:&lt;br /&gt;
* Ask for a certain number of nodes, N, for your job.&lt;br /&gt;
* You know that you get 40 x N cores, so you will use (at least) a total of 40 x N MPI processes or threads (mpirun, srun, and the OS will automatically spread these over the real cores).&lt;br /&gt;
* But you should also test if running 80 x N MPI processes or threads gives you any speedup.&lt;br /&gt;
* Regardless, your usage will be counted as 40 x N x (walltime in years).&lt;br /&gt;
&lt;br /&gt;
Many applications which are communication-heavy can benefit from the use of hyperthreading.&lt;br /&gt;
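&lt;br /&gt;
The usage accounting described above (40 x N x walltime in years) can be sketched as a small calculation. The function name is hypothetical, and a 365-day year (8760 hours) is assumed since the text does not define the year length:&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/bash
# Hypothetical helper: core-years charged for a Niagara job, following
# "40 x N x (walltime in years)". Assumes a 365-day year (8760 hours).
niagara_core_years() {
    local nodes="$1" hours="$2"
    awk -v n="$nodes" -v h="$hours" 'BEGIN { printf "%.4f\n", 40 * n * h / 8760 }'
}

niagara_core_years 2 24   # 2 nodes for 24 hours
```
&lt;br /&gt;
The result is the same whether the job runs 40 or 80 processes per node, since usage is counted per physical core.&lt;br /&gt;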
&lt;br /&gt;
= Submission script details =&lt;br /&gt;
&lt;br /&gt;
This section outlines some details of how to interact with the scheduler, and how it implements Niagara's scheduling policies.&lt;br /&gt;
&lt;br /&gt;
== Queues ==&lt;br /&gt;
&lt;br /&gt;
There are 3 queues available on SciNet systems.  These queues have different limits; see the [[#Limits | Limits]] section for further details.&lt;br /&gt;
&lt;br /&gt;
=== Compute ===&lt;br /&gt;
&lt;br /&gt;
The compute queue is the default queue.  Most jobs will run in this queue.  If no flags are specified in the submission script this is the queue where your job will land.&lt;br /&gt;
&lt;br /&gt;
=== Debug ===&lt;br /&gt;
&lt;br /&gt;
The Debug queue is a high-priority queue, used for short-term testing of your code.  Do NOT use the debug queue for production work.  You can use the debug queue one of two ways.  To submit a standard job script to the debug queue, add the line&lt;br /&gt;
 #SBATCH -p debug&lt;br /&gt;
to your submission script.  This will put the job into the debug queue, and it should run in short order.&lt;br /&gt;
&lt;br /&gt;
To request an interactive debug session, where you retain control over the command line prompt, at a login node type the command&lt;br /&gt;
  nia-login07:~$ salloc -p debug --nodes 1 --time=1:00:00&lt;br /&gt;
This will request 1 node for 1 hour.  You can similarly request a debug session using the 'debugjob' command:&lt;br /&gt;
  nia-login07:~$ debugjob N&lt;br /&gt;
where N is the number of nodes. If N=1, this gives an interactive session of 1 hour; when N=4 (the maximum), it gives you 30 minutes.&lt;br /&gt;
&lt;br /&gt;
=== Archive ===&lt;br /&gt;
&lt;br /&gt;
The archivelong and archiveshort queues are only used by the [[HPSS]] system.  See that page for details on how to use these queues.&lt;br /&gt;
&lt;br /&gt;
== Limits ==&lt;br /&gt;
&lt;br /&gt;
There are limits to the size and duration of your jobs, the number of jobs you can run and the number of jobs you can have queued.  It matters whether a user is part of a group with a [https://www.computecanada.ca/research-portal/accessing-resources/resource-allocation-competitions/ Resources for Research Group allocation] or not. It also matters in which 'partition' the job runs. 'Partitions' are SLURM-speak for use cases.  You specify the partition with the &amp;lt;tt&amp;gt;-p&amp;lt;/tt&amp;gt; parameter to &amp;lt;tt&amp;gt;sbatch&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;salloc&amp;lt;/tt&amp;gt;, but if you do not specify one, your job will run in the &amp;lt;tt&amp;gt;compute&amp;lt;/tt&amp;gt; partition, which is the most common case. &lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!Usage&lt;br /&gt;
!Partition&lt;br /&gt;
!Running jobs&lt;br /&gt;
!Jobs in queue&lt;br /&gt;
!Min. size of jobs&lt;br /&gt;
!Max. size of jobs&lt;br /&gt;
!Min. walltime&lt;br /&gt;
!Max. walltime &lt;br /&gt;
|-&lt;br /&gt;
|Compute jobs ||compute || 50 || 1000 || 1 node (40&amp;amp;nbsp;cores) || default:&amp;amp;nbsp;20&amp;amp;nbsp;nodes&amp;amp;nbsp;(800&amp;amp;nbsp;cores) &amp;lt;br&amp;gt; with&amp;amp;nbsp;allocation:&amp;amp;nbsp;1000&amp;amp;nbsp;nodes&amp;amp;nbsp;(40000&amp;amp;nbsp;cores)|| 15 minutes || 24 hours&lt;br /&gt;
|-&lt;br /&gt;
|Testing or troubleshooting || debug || 1 || 1 || 1 node (40 cores) || 4 nodes (160 cores)|| N/A || min(1, 1.5/n&amp;lt;sub&amp;gt;node&amp;lt;/sub&amp;gt;) hours&lt;br /&gt;
|-&lt;br /&gt;
|Archiving or retrieving data in [[HPSS]]|| archivelong || 2 per user (max 5 total) || 10 per user || N/A || N/A|| 15 minutes || 72 hours&lt;br /&gt;
|-&lt;br /&gt;
|Inspecting archived data, small archival actions in [[HPSS]] || archiveshort || 2 per user|| 10 per user || N/A || N/A || 15 minutes || 1 hour&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Even if you respect these limits, your jobs will still have to wait in the queue. The waiting time depends on many factors such as your group's allocation amount, how much allocation has been used in the recent past, the number of requested nodes and walltime, and how many other jobs are waiting in the queue.&lt;br /&gt;
&lt;br /&gt;
== Slurm Accounts ==&lt;br /&gt;
&lt;br /&gt;
To be able to prioritise jobs based on groups and allocations, the Slurm scheduler uses the concept of ''accounts''.  Each group that has a Resource for Research Groups (RRG) or Research Platforms and Portals (RPP) allocation (awarded through an annual competition by Compute Canada) has an account that starts with &amp;lt;tt&amp;gt;rrg-&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;rpp-&amp;lt;/tt&amp;gt;.  Slurm assigns a 'fairshare' priority to these accounts based on the size of the award in core-years.  Groups without an RRG or RPP can use Niagara using a so-called Rapid Access Service (RAS), and have an account that starts with &amp;lt;tt&amp;gt;def-&amp;lt;/tt&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
On Niagara, most users will only ever use one account, and those users do not need to specify the account to Slurm.  However, users that are part of collaborations may be able to use multiple accounts, i.e., that of their sponsor and that of their collaborator, which means that they need to select the right account when running jobs. &lt;br /&gt;
&lt;br /&gt;
To select the account, just add &lt;br /&gt;
&lt;br /&gt;
    #SBATCH -A [account]&lt;br /&gt;
&lt;br /&gt;
to the job script, or pass the &amp;lt;tt&amp;gt;-A [account]&amp;lt;/tt&amp;gt; option to &amp;lt;tt&amp;gt;salloc&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;debugjob&amp;lt;/tt&amp;gt;. &lt;br /&gt;
&lt;br /&gt;
To see which accounts you have access to, or what their names are, use the command&lt;br /&gt;
&lt;br /&gt;
    sshare -U&lt;br /&gt;
&lt;br /&gt;
It has been noted that, in some cases, using the '-A' flag does not result in the appropriate account being used.  To get around this, specify the account when sbatch is invoked:&lt;br /&gt;
    sbatch -A account myjobscript.sh&lt;br /&gt;
&lt;br /&gt;
== Slurm environment variables ==&lt;br /&gt;
&lt;br /&gt;
There are many environment variables built into Slurm.  These are some which you may find useful:&lt;br /&gt;
* SLURM_SUBMIT_DIR: directory from which the job was submitted.&lt;br /&gt;
* SLURM_SUBMIT_HOST: host from which the job was submitted.&lt;br /&gt;
* SLURM_JOB_ID: the job's id.&lt;br /&gt;
* SLURM_JOB_NUM_NODES: number of nodes in the job.&lt;br /&gt;
* SLURM_JOB_NODELIST: list of nodes assigned to the job.&lt;br /&gt;
* SLURM_JOB_ACCOUNT: account associated with the job.&lt;br /&gt;
&lt;br /&gt;
Any of these environment variables can be accessed from within your job script.&lt;br /&gt;
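&lt;br /&gt;
As a minimal sketch, a job script could log the variables listed above at startup. The function name is hypothetical; outside a Slurm job the variables are unset, so a placeholder is printed instead:&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/bash
# Print the Slurm variables listed above; outside a job they are unset,
# so a placeholder is shown instead.
report_slurm_env() {
    local name value
    for name in SLURM_SUBMIT_DIR SLURM_SUBMIT_HOST SLURM_JOB_ID \
                SLURM_JOB_NUM_NODES SLURM_JOB_NODELIST SLURM_JOB_ACCOUNT; do
        # printenv looks the variable up by name in the environment;
        # "|| true" keeps the script going when the variable is unset.
        value=$(printenv "$name" || true)
        echo "$name=${value:-(not set)}"
    done
}

report_slurm_env
```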
&lt;br /&gt;
== Passing Variables to submission scripts ==&lt;br /&gt;
It is possible to pass values through environment variables into your SLURM submission scripts.&lt;br /&gt;
For doing so with already defined variables in your shell, just add the following directive in the submission script,&lt;br /&gt;
&lt;br /&gt;
 #SBATCH --export=ALL&lt;br /&gt;
&lt;br /&gt;
and you will have access to any predefined environment variable.&lt;br /&gt;
&lt;br /&gt;
A better way is to specify explicitly which variables you want to pass into the submission script,&lt;br /&gt;
&lt;br /&gt;
 sbatch --export=i=15,j='test' jobscript.sbatch&lt;br /&gt;
&lt;br /&gt;
You can even set the job name and output files using environment variables, e.g.&lt;br /&gt;
&lt;br /&gt;
 i=&amp;quot;simulation&amp;quot;&lt;br /&gt;
 j=14&lt;br /&gt;
 sbatch --job-name=$i.$j.run --output=$i.$j.out --export=i=$i,j=$j jobscript.sbatch&lt;br /&gt;
&lt;br /&gt;
(The latter only works on the command line; you cannot use environment variables in &amp;lt;tt&amp;gt;#SBATCH&amp;lt;/tt&amp;gt; lines in the job script.)&lt;br /&gt;
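&lt;br /&gt;
On the receiving side, the job script reads the exported variables like any other shell variable. The sketch below assumes the variable names i and j from the examples above; the function name and the fallback values are illustrative, so that the snippet also runs outside Slurm:&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/bash
# Sketch of the part of a job script that consumes variables passed with
#   sbatch --export=i=simulation,j=14 jobscript.sbatch
# The fallback values are illustrative assumptions for runs outside Slurm.
describe_run() {
    local name="${i:-simulation}"
    local index="${j:-0}"
    echo "$name.$index.out"
}

describe_run
```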
&lt;br /&gt;
== Command line arguments ==&lt;br /&gt;
&lt;br /&gt;
Command line arguments can be passed to a job script in the same way as to a shell script. All arguments given to sbatch after the job script name are passed to the job script. SLURM does not interpret any of these arguments, so you must place all sbatch options before the script name, e.g.:&lt;br /&gt;
&lt;br /&gt;
 sbatch  -p debug  jobscript.sbatch  FirstArgument SecondArgument ...&lt;br /&gt;
&lt;br /&gt;
In this example, &amp;lt;tt&amp;gt;-p debug&amp;lt;/tt&amp;gt; is interpreted by SLURM, while in your submission script you can access &amp;lt;tt&amp;gt;FirstArgument&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;SecondArgument&amp;lt;/tt&amp;gt;, etc., by referring to &amp;lt;code&amp;gt;$1, $2, ...&amp;lt;/code&amp;gt;.&lt;br /&gt;
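&lt;br /&gt;
Inside the job script, the arguments can then be handled as in any bash script. The following is a minimal sketch; the function name run_case, the variable names and the default values are hypothetical.&lt;br /&gt;

```shell
#!/bin/bash
# Sketch of a job script body that reads its positional arguments.
# The names "input" and "steps" and their defaults are hypothetical.
run_case() {
    local input=${1:-default.dat}
    local steps=${2:-100}
    echo "input=$input steps=$steps"
}

# "$@" forwards every argument that followed the script name on the sbatch line.
run_case "$@"
```

Giving each argument a default keeps the script usable when it is submitted without arguments.&lt;br /&gt;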
&lt;br /&gt;
== Job arrays ==&lt;br /&gt;
&lt;br /&gt;
Sometimes you need to run the same job script many times, changing just one value each time.  One way of accomplishing this is to use job arrays.  Job arrays are invoked using the &amp;quot;-a&amp;quot; flag with sbatch:&lt;br /&gt;
 sbatch -a 1-100 myjobscript.sh&lt;br /&gt;
This will submit 100 instances of myjobscript.sh.  Within the job script you can distinguish which of those instances is running using the environment variable SLURM_ARRAY_TASK_ID.&lt;br /&gt;
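&lt;br /&gt;
A common pattern is to derive the value that changes from the array index. The sketch below assumes a hypothetical parameter (a temperature) and a hypothetical output file name; only SLURM_ARRAY_TASK_ID comes from Slurm.&lt;br /&gt;

```shell
#!/bin/bash
# Sketch: derive a per-task parameter from the array index.
# The temperature formula and the output file name are hypothetical.
task_id=${SLURM_ARRAY_TASK_ID:-1}   # fallback so the sketch runs outside an array job

temperature_for() {
    # task 1 gives 300, task 2 gives 310, and so on
    echo $(( 290 + 10 * $1 ))
}

echo "task $task_id uses T=$(temperature_for "$task_id"), output in run_$task_id.out"
```

Each array task runs the same script, so anything that must differ between tasks (parameters, input files, output names) should be computed from the index in this way.&lt;br /&gt;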
&lt;br /&gt;
Note that Niagara [[#Limits | currently]] has a limit of 1000 submitted jobs for users within groups with allocations, and 200 submitted jobs without an allocation.&lt;br /&gt;
&lt;br /&gt;
== Job dependencies ==&lt;br /&gt;
&lt;br /&gt;
You can make one job dependent on the successful completion of another job using the following command:&lt;br /&gt;
  sbatch --dependency=afterok:JOBID myjobscript.sh&lt;br /&gt;
This will make the current job submission not start until the parent job, with jobid JOBID, successfully completes.  There are many job dependency options available.  Visit the [https://slurm.schedmd.com/sbatch.html#OPT_dependency Slurm sbatch page ] for the full list.  &lt;br /&gt;
&lt;br /&gt;
If the parent job fails (that is, ends with a non-zero exit code) the dependent job can never be scheduled and will be automatically cancelled.&lt;br /&gt;
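&lt;br /&gt;
To chain two submissions without copying the job id by hand, the id of the first job can be captured with the sbatch option --parsable. This is a sketch under assumptions: the script names first_step.sh and second_step.sh and the helper dependency_flag are hypothetical, and on multi-cluster setups --parsable may append a cluster name to the id.&lt;br /&gt;

```shell
#!/bin/bash
# Build the dependency option for a given parent job id.
dependency_flag() {
    echo "--dependency=afterok:$1"
}

# Only submit when sbatch is actually available (i.e. on the cluster).
if command -v sbatch > /dev/null; then
    # --parsable makes sbatch print just the job id.
    first=$(sbatch --parsable first_step.sh)
    sbatch "$(dependency_flag "$first")" second_step.sh
fi
```

The second job stays pending until the first one completes successfully.&lt;br /&gt;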
&lt;br /&gt;
== Email Notification ==&lt;br /&gt;
Email notification works, but you need to add your email address and the type of notification you want to receive to your submission script, e.g.&lt;br /&gt;
&lt;br /&gt;
    #SBATCH --mail-user=YOUR.email.ADDRESS&lt;br /&gt;
    #SBATCH --mail-type=ALL&lt;br /&gt;
&lt;br /&gt;
The sbatch man page (type &amp;lt;tt&amp;gt;man sbatch&amp;lt;/tt&amp;gt; on Niagara) explains all possible mail-types.&lt;br /&gt;
&lt;br /&gt;
== Job Location Constraints ==&lt;br /&gt;
&lt;br /&gt;
=== Node types ===&lt;br /&gt;
&lt;br /&gt;
With the expansion of Niagara there are now two node types, 1548 Intel 6148 &amp;quot;skylake&amp;quot; CPU based nodes, and 468 Intel 6248 &amp;quot;cascadelake&amp;quot; CPU based nodes.  By default a job will be placed on the first available nodes but will not span node types.  You can specify a node type using one of the following directives to your submission script.&lt;br /&gt;
&lt;br /&gt;
    #SBATCH --constraint=skylake &lt;br /&gt;
    #SBATCH --constraint=cascade&lt;br /&gt;
&lt;br /&gt;
=== EDR/HDR Infiniband Topology ===&lt;br /&gt;
&lt;br /&gt;
The Infiniband high speed network used for job communication and file I/O on Niagara consists of 5 1:1 subscribed &amp;quot;wings&amp;quot; that are connected together in a dragonfly topology with adaptive routing enabled. 4 wings (dragonfly[1-4]) consist of EDR based skylake nodes and dragonfly5 contains all of the HDR100 cascadelake nodes.  By default multi-node jobs will run on the first available nodes, which could be all within 1 wing or span multiple wings, but not across node types.  For most scalable parallel programs the performance difference should not be very significant; however, if you wish to keep your jobs from spanning wings you can use the following.&lt;br /&gt;
&lt;br /&gt;
    #SBATCH --constraint=[dragonfly1|dragonfly2|dragonfly3|dragonfly4|dragonfly5]&lt;br /&gt;
&lt;br /&gt;
= Monitoring jobs =&lt;br /&gt;
&lt;br /&gt;
There are many options available for monitoring your jobs.  The most basic is the squeue command:&lt;br /&gt;
&lt;br /&gt;
 nia-login07:~$ squeue -u USERNAME&lt;br /&gt;
     JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)&lt;br /&gt;
    292047   compute   myjob4 username PD       0:00      4 (Priority)&lt;br /&gt;
    292048   compute   myjob3 username PD       0:00      4 (Priority)&lt;br /&gt;
    266829   compute   myjob2 username  R   18:56:17      2 nia[1397-1398]&lt;br /&gt;
    266828   compute   myjob1 username  R   18:56:46      1 nia1298&lt;br /&gt;
&lt;br /&gt;
Here you can see that we have two running jobs ('R') and two pending jobs ('PD').  The nodes being used are listed.&lt;br /&gt;
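&lt;br /&gt;
When the list is long, the state column can be tallied with a small filter. The helper count_state below is a hypothetical sketch; it assumes the default squeue column layout shown above, in which the state code is the fifth column.&lt;br /&gt;

```shell
#!/bin/bash
# Count jobs in a given state (e.g. R or PD) from squeue output on stdin.
# Assumes the default layout, where the state code is the 5th column.
count_state() {
    awk -v st="$1" 'NR > 1 { if ($5 == st) n++ } END { print n + 0 }'
}

# Demonstration on a few lines taken from the sample output above.
printf '%s\n' \
    'JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)' \
    '292047 compute myjob4 username PD 0:00 4 (Priority)' \
    '266829 compute myjob2 username R 18:56:17 2 nia[1397-1398]' |
    count_state PD
```

On the cluster, the same filter can be applied to live data, for example by piping the output of squeue -u USERNAME into count_state PD.&lt;br /&gt;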
&lt;br /&gt;
== Job status ==&lt;br /&gt;
&lt;br /&gt;
To get an estimate of when a job will start, use the command&lt;br /&gt;
  squeue --start -j JOBID&lt;br /&gt;
Note that this is only an estimate, and tends not to be very accurate.&lt;br /&gt;
&lt;br /&gt;
Information about a specific job can be found using the command&lt;br /&gt;
  squeue -j JOBID&lt;br /&gt;
or alternatively&lt;br /&gt;
  scontrol show job JOBID&lt;br /&gt;
which is more verbose.&lt;br /&gt;
&lt;br /&gt;
== SSHing to a node ==&lt;br /&gt;
&lt;br /&gt;
Once your job has started, the node belongs to you.  As such you may, from a login node, SSH into the node to check the performance of your job.  The first step is to find out which nodes are being used (see above).  Once you have your list of nodes, you can SSH into them directly.  Once there, you can run the 'top' or 'free' commands to check both CPU and memory usage.&lt;br /&gt;
&lt;br /&gt;
== jobperf ==&lt;br /&gt;
&lt;br /&gt;
The jobperf script will give you feedback on the performance of your currently-running job:&lt;br /&gt;
 nia-login07:~$ jobperf 123456&lt;br /&gt;
 ----------------------------------------------------------------------------------------------------&lt;br /&gt;
                    RUNNING          IDLE      USER       MEMORY(MB)          PROCESS NAMES&lt;br /&gt;
    HOSTNAME     #  %CPU  %MEM    DISK SLEEP   NAME    RAMDISK  USED AVAIL    (excl:bash,sh,ssh,sshd)&lt;br /&gt;
 ----------------------------------------------------------------------------------------------------&lt;br /&gt;
 nia1013         71  6999%  0.5%      0   22    ejspence      0  15060  178017   14*gmx_mpi mpiexec slurm_script&lt;br /&gt;
 nia1014         79  7677%  0.1%      0   18    ejspence      0  14803  178274   13*gmx_mpi&lt;br /&gt;
 nia1295         79  7517%  0.4%      0   18    ejspence      0  15199  177878   13*gmx_mpi&lt;br /&gt;
 ----------------------------------------------------------------------------------------------------&lt;br /&gt;
&lt;br /&gt;
Here you can see both the CPU and memory usage of the job, for all nodes being used.&lt;br /&gt;
&lt;br /&gt;
== Other commands ==&lt;br /&gt;
&lt;br /&gt;
Some other commands that can be useful for dealing with your jobs:&lt;br /&gt;
* &amp;lt;code&amp;gt;scancel -i JOBID&amp;lt;/code&amp;gt; cancels a specific job, asking for confirmation first.&lt;br /&gt;
* &amp;lt;code&amp;gt;sacct&amp;lt;/code&amp;gt; gives information about your recent jobs.&lt;br /&gt;
* &amp;lt;code&amp;gt;sinfo -p compute&amp;lt;/code&amp;gt; gives a list of available nodes.&lt;br /&gt;
* &amp;lt;code&amp;gt;qsum&amp;lt;/code&amp;gt; gives a summary of the queue by user.&lt;br /&gt;
&lt;br /&gt;
= Example submission scripts =&lt;br /&gt;
&lt;br /&gt;
Here we present some examples of how to create submission scripts for running parallel jobs.  Serial job examples can be found on the [[Running_Serial_Jobs_on_Niagara | serial jobs page]].&lt;br /&gt;
&lt;br /&gt;
== Example submission script (MPI) ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;#!/bin/bash &lt;br /&gt;
#SBATCH --nodes=8&lt;br /&gt;
#SBATCH --ntasks-per-node=40&lt;br /&gt;
#SBATCH --time=1:00:00&lt;br /&gt;
#SBATCH --job-name mpi_job&lt;br /&gt;
#SBATCH --output=mpi_output_%j.txt&lt;br /&gt;
#SBATCH --mail-type=FAIL&lt;br /&gt;
&lt;br /&gt;
cd $SLURM_SUBMIT_DIR&lt;br /&gt;
&lt;br /&gt;
module load intel/2018.2&lt;br /&gt;
module load openmpi/3.1.0&lt;br /&gt;
&lt;br /&gt;
mpirun ./mpi_example&lt;br /&gt;
# or &amp;quot;srun ./mpi_example&amp;quot;&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Submit this script with the command:&lt;br /&gt;
&lt;br /&gt;
    nia-login07:~$ sbatch mpi_job.sh&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;First line indicates that this is a bash script.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Lines starting with &amp;lt;code&amp;gt;#SBATCH&amp;lt;/code&amp;gt; go to SLURM.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;sbatch reads these lines as a job request (which it gives the name &amp;lt;code&amp;gt;mpi_job&amp;lt;/code&amp;gt;)&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;In this case, SLURM looks for 8 nodes with 40 cores each on which to run 320 tasks, for 1 hour.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Note that the mpirun flag &amp;quot;--ppn&amp;quot; (processors per node) is ignored.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Once it finds such nodes, it runs the script, which:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Changes to the submission directory;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Loads modules;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Runs the &amp;lt;code&amp;gt;mpi_example&amp;lt;/code&amp;gt; application.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;To use hyperthreading, just change --ntasks-per-node=40 to --ntasks-per-node=80, and add --bind-to none to the mpirun command (the latter is necessary for OpenMPI only, not when using IntelMPI).&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Example submission script (OpenMP) ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --cpus-per-task=40&lt;br /&gt;
#SBATCH --time=1:00:00&lt;br /&gt;
#SBATCH --job-name openmp_job&lt;br /&gt;
#SBATCH --output=openmp_output_%j.txt&lt;br /&gt;
#SBATCH --mail-type=FAIL&lt;br /&gt;
&lt;br /&gt;
cd $SLURM_SUBMIT_DIR&lt;br /&gt;
&lt;br /&gt;
module load intel/2018.2&lt;br /&gt;
&lt;br /&gt;
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK&lt;br /&gt;
&lt;br /&gt;
./openmp_example&lt;br /&gt;
# or &amp;quot;srun ./openmp_example&amp;quot;.&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Submit this script with the command:&lt;br /&gt;
&lt;br /&gt;
    nia-login07:~$ sbatch openmp_job.sh&lt;br /&gt;
&lt;br /&gt;
* First line indicates that this is a bash script.&lt;br /&gt;
* Lines starting with &amp;lt;code&amp;gt;#SBATCH&amp;lt;/code&amp;gt; go to SLURM.&lt;br /&gt;
* sbatch reads these lines as a job request (which it gives the name &amp;lt;code&amp;gt;openmp_job&amp;lt;/code&amp;gt;).&lt;br /&gt;
* In this case, SLURM looks for one node with 40 cores, all used by a single task, for 1 hour.&lt;br /&gt;
* Once it finds such a node, it runs the script, which:&lt;br /&gt;
** Changes to the submission directory;&lt;br /&gt;
** Loads modules;&lt;br /&gt;
** Sets an environment variable;&lt;br /&gt;
** Runs the &amp;lt;code&amp;gt;openmp_example&amp;lt;/code&amp;gt; application.&lt;br /&gt;
* To use hyperthreading, just change &amp;lt;code&amp;gt;--cpus-per-task=40&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;--cpus-per-task=80&amp;lt;/code&amp;gt;.&lt;/div&gt;</summary>
		<author><name>Ymeiron</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Mist&amp;diff=3641</id>
		<title>Mist</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Mist&amp;diff=3641"/>
		<updated>2022-03-09T15:57:31Z</updated>

		<summary type="html">&lt;p&gt;Ymeiron: Updated job limits&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox Computer&lt;br /&gt;
|image=[[Image:Mist.jpg|center|300px|thumb]]&lt;br /&gt;
|name=Mist&lt;br /&gt;
|installed=Dec 2019&lt;br /&gt;
|operatingsystem= Red Hat Enterprise Linux 8.2&lt;br /&gt;
|loginnode= mist.scinet.utoronto.ca&lt;br /&gt;
|nnodes=  54 IBM AC922&lt;br /&gt;
|rampernode= 256 GB  &lt;br /&gt;
|gpuspernode=4 V100-SXM2-32GB&lt;br /&gt;
|interconnect=Mellanox EDR&lt;br /&gt;
|vendorcompilers= NVCC, IBM XL&lt;br /&gt;
|queuetype=Slurm&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
=Specifications=&lt;br /&gt;
Mist is a SciNet-[[#SOSCIP Users |SOSCIP]] joint GPU cluster consisting of 54 IBM AC922 servers. Each node of the cluster has 32 IBM Power9 cores, 256GB RAM and 4 NVIDIA V100-SXM2-32GB GPUs connected by NVLink. The cluster has an InfiniBand EDR interconnect providing GPU-Direct RDMA capability.&lt;br /&gt;
&lt;br /&gt;
'''&amp;lt;span style=&amp;quot;background:#fc8383&amp;quot;&amp;gt;Important note:&amp;lt;/span&amp;gt;''' the majority of computer systems as of 2021 (laptops, desktops, and HPC) use the 64 bit x86 instruction set architecture (ISA) in their microprocessors produced by Intel and AMD. This ISA is incompatible with Mist, whose hardware uses the 64 bit PPC ISA (set to little endian mode). In practice, x86-compiled binaries (executables and libraries) cannot be installed on Mist. For this reason, the Niagara and Compute Canada software stacks (modules) cannot be made available on Mist, and using closed-source software is only possible when the vendor provides a compatible version of their application. '''Python applications''' almost always rely on bindings to libraries originally written in C or C++, and some of these are not available on PyPI or the various Conda channels as precompiled binaries compatible with Mist. The recommended way to use Python on Mist is to create a [[#Anaconda (Python)|Conda]] environment and install packages from the anaconda (default) channel, where most popular packages have a linux-ppc64le (Mist-compatible) version available. Some popular machine learning packages should be installed from the internal [[#Open-CE|Open-CE]] channel. Where a compatible Conda package cannot be found, installing from PyPI (&amp;lt;code&amp;gt;pip install&amp;lt;/code&amp;gt;) can be attempted. Pip will attempt to compile the package’s source code if no compatible precompiled wheel is available, so a compiler module (such as &amp;lt;code&amp;gt;gcc/.core&amp;lt;/code&amp;gt;) should be loaded in advance. Some packages require tweaking of the source code or build procedure to compile successfully on Mist; please contact [[#Support|support]] if you need assistance.&lt;br /&gt;
&lt;br /&gt;
= Getting started on Mist =&lt;br /&gt;
As of January 22 2022, authentication is only allowed via SSH keys. [https://docs.computecanada.ca/wiki/SSH_Keys Please refer to this page] to generate your SSH key pair and make sure you use them securely.&lt;br /&gt;
&lt;br /&gt;
Mist can be accessed directly:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -i /path/to/ssh_private_key -Y MYCCUSERNAME@mist.scinet.utoronto.ca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The Mist login node '''mist-login01''' can also be accessed via the Niagara cluster:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -i /path/to/ssh_private_key -Y MYCCUSERNAME@niagara.scinet.utoronto.ca&lt;br /&gt;
ssh -Y mist-login01&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
== Storage ==&lt;br /&gt;
The filesystem for Mist is shared with the Niagara cluster. See [https://docs.scinet.utoronto.ca/index.php/Niagara_Quickstart#Your_various_directories Niagara Storage] for more details.&lt;br /&gt;
&lt;br /&gt;
= Loading software modules =&lt;br /&gt;
&lt;br /&gt;
You have two options for running code on Mist: use existing software, or compile your own.  This section focuses on the former.&lt;br /&gt;
&lt;br /&gt;
Other than essentials, all installed software is made available [[Using_modules | using module commands]]. These modules set environment variables (PATH, etc.), allowing multiple, conflicting versions of a given package to be available.  A detailed explanation of the module system can be [[Using_modules | found on the modules page]] and a list of [[Modules for Mist]] is also available.&lt;br /&gt;
&lt;br /&gt;
Common module subcommands are:&lt;br /&gt;
&lt;br /&gt;
* &amp;lt;code&amp;gt;module load &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt;: load the default version of a particular software.&lt;br /&gt;
* &amp;lt;code&amp;gt;module load &amp;lt;module-name&amp;gt;/&amp;lt;module-version&amp;gt;&amp;lt;/code&amp;gt;: load a specific version of a particular software.&lt;br /&gt;
* &amp;lt;code&amp;gt;module purge&amp;lt;/code&amp;gt;: unload all currently loaded modules.&lt;br /&gt;
* &amp;lt;code&amp;gt;module spider&amp;lt;/code&amp;gt; (or &amp;lt;code&amp;gt;module spider &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt;): list available software packages.&lt;br /&gt;
* &amp;lt;code&amp;gt;module avail&amp;lt;/code&amp;gt;: list loadable software packages.&lt;br /&gt;
* &amp;lt;code&amp;gt;module list&amp;lt;/code&amp;gt;: list loaded modules.&lt;br /&gt;
&lt;br /&gt;
Along with modifying common environment variables, such as PATH, and LD_LIBRARY_PATH, these modules also create a SCINET_MODULENAME_ROOT environment variable, which can be used to access commonly needed software directories, such as /include and /lib.&lt;br /&gt;
&lt;br /&gt;
There are handy abbreviations for the module commands. &amp;lt;code&amp;gt;ml&amp;lt;/code&amp;gt; is the same as &amp;lt;code&amp;gt;module list&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;ml &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt; is the same as &amp;lt;code&amp;gt;module load &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt;.&lt;br /&gt;
== Tips for loading software ==&lt;br /&gt;
&lt;br /&gt;
* We advise '''''against''''' loading modules in your .bashrc.  This can lead to very confusing behaviour under certain circumstances.  Our guidelines for .bashrc files can be found [[bashrc guidelines|here]].&lt;br /&gt;
* Instead, load modules by hand when needed, or by sourcing a separate script.&lt;br /&gt;
* Load run-specific modules inside your job submission script.&lt;br /&gt;
* Short names give default versions; e.g. &amp;lt;code&amp;gt;cuda&amp;lt;/code&amp;gt; → &amp;lt;code&amp;gt;cuda/11.0.3&amp;lt;/code&amp;gt;. It is usually better to be explicit about the versions, for future reproducibility.&lt;br /&gt;
* Modules often require other modules to be loaded first.  Solve these dependencies by using [[Using_modules#Module_spider | &amp;lt;code&amp;gt;module spider&amp;lt;/code&amp;gt;]].&lt;br /&gt;
&lt;br /&gt;
= Available compilers and interpreters =&lt;br /&gt;
* The &amp;lt;tt&amp;gt;cuda&amp;lt;/tt&amp;gt; module must be loaded first for GPU software.&lt;br /&gt;
* For most compiled software, one should use the GNU compilers (&amp;lt;tt&amp;gt;gcc&amp;lt;/tt&amp;gt; for C, &amp;lt;tt&amp;gt;g++&amp;lt;/tt&amp;gt; for C++, and &amp;lt;tt&amp;gt;gfortran&amp;lt;/tt&amp;gt; for Fortran). Loading a &amp;lt;tt&amp;gt;gcc&amp;lt;/tt&amp;gt; module makes these available.&lt;br /&gt;
* The IBM XL compiler suite (&amp;lt;tt&amp;gt;xlc_r, xlc++_r, xlf_r&amp;lt;/tt&amp;gt;) is also available, if you load one of the &amp;lt;tt&amp;gt;xl&amp;lt;/tt&amp;gt; modules.&lt;br /&gt;
* To compile MPI code, you must additionally load an &amp;lt;tt&amp;gt;openmpi&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;spectrum-mpi&amp;lt;/tt&amp;gt; module.&lt;br /&gt;
&lt;br /&gt;
=== CUDA ===&lt;br /&gt;
&lt;br /&gt;
The currently installed CUDA Toolkits are '''11.0.3''' and '''10.2.2 (10.2.89)''':&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load cuda/11.0.3&lt;br /&gt;
module load cuda/10.2.2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*A compiler (GCC, XL or NVHPC/PGI) module must be loaded in order to use CUDA to build any code.&lt;br /&gt;
The current NVIDIA driver version is 450.119.04.&lt;br /&gt;
&lt;br /&gt;
===GNU Compilers ===&lt;br /&gt;
&lt;br /&gt;
Available GCC modules are:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
gcc/9.3.0 (must load CUDA 11)&lt;br /&gt;
gcc/8.5.0 (must load CUDA 10 or 11)&lt;br /&gt;
gcc/10.3.0 (w/o CUDA)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== IBM XL Compilers ===&lt;br /&gt;
&lt;br /&gt;
To load the native IBM xlc/xlc++ and xlf (Fortran) compilers, run&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load xl/16.1.1.10&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
IBM XL Compilers are enabled for use with NVIDIA GPUs, including support for OpenMP GPU offloading and integration with NVIDIA's nvcc command to compile host-side code for the POWER9 CPU. Information about the IBM XL Compilers can be found at the following links:[https://www.ibm.com/support/knowledgecenter/SSXVZZ_16.1.1/com.ibm.compilers.linux.doc/welcome.html IBM XL C/C++], &lt;br /&gt;
[https://www.ibm.com/support/knowledgecenter/SSAT4T_16.1.1/com.ibm.compilers.linux.doc/welcome.html IBM XL Fortran]&lt;br /&gt;
&lt;br /&gt;
=== OpenMPI ===&lt;br /&gt;
&amp;lt;tt&amp;gt;openmpi/&amp;lt;version&amp;gt;&amp;lt;/tt&amp;gt; modules are available for different compilers, including GCC and XL. The &amp;lt;tt&amp;gt;spectrum-mpi/&amp;lt;version&amp;gt;&amp;lt;/tt&amp;gt; module provides IBM Spectrum MPI.&lt;br /&gt;
&lt;br /&gt;
=== NVHPC/PGI ===&lt;br /&gt;
The PGI compiler is provided as part of NVHPC (NVIDIA HPC SDK).&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load nvhpc/21.3&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Software =&lt;br /&gt;
== Amber20 ==&lt;br /&gt;
&lt;br /&gt;
Users who hold an Amber20 license can build Amber20 from its source code and run it on Mist. '''SOSCIP/SciNet does not provide an Amber license or source code.'''&lt;br /&gt;
&lt;br /&gt;
=== Building Amber20 ===&lt;br /&gt;
Modules that are needed for building Amber20:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load MistEnv/2021a cuda/10.2.2 gcc/8.5.0 anaconda3/2021.05 cmake/3.19.8&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Cmake configuration:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
cmake .. -DCMAKE_INSTALL_PREFIX=$HOME/where-amber-install -DCOMPILER=GNU -DMPI=FALSE -DCUDA=TRUE -DINSTALL_TESTS=TRUE -DDOWNLOAD_MINICONDA=FALSE -DOPENMP=TRUE -DNCCL=FALSE -DAPPLY_UPDATES=TRUE&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Running Amber20 ===&lt;br /&gt;
'''On NVIDIA Pascal P100 and later GPUs such as the V100, Amber20 does not scale beyond a single GPU'''. It is strongly recommended to run Amber20 as a single-GPU job.&lt;br /&gt;
A job example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
#SBATCH --time=1:00:0&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP-project-ID&amp;gt;&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/10.2.2 gcc/8.5.0 anaconda3/2021.05&lt;br /&gt;
export PATH=$HOME/where-amber-install/bin:$PATH&lt;br /&gt;
export LD_LIBRARY_PATH=$HOME/where-amber-install/lib:$LD_LIBRARY_PATH&lt;br /&gt;
pmemd.cuda .... &amp;lt;parameters&amp;gt; ...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Anaconda (Python) ==&lt;br /&gt;
Anaconda is a popular distribution of the Python programming language. It contains several common Python libraries such as SciPy and NumPy as pre-built packages, which eases installation. Anaconda is provided as a module: '''anaconda3'''&lt;br /&gt;
&lt;br /&gt;
To set up your own Python environment, load the module and create a conda environment:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n myPythonEnv python=3.8&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*Note: By default, conda environments are located in '''$HOME/.conda/envs'''. The cache (downloaded tarballs and packages) is under '''$HOME/.conda/pkgs'''. You may run into problems with your disk quota if too many environments are created. To clean the conda cache, '''please run: &amp;quot;conda clean -y --all&amp;quot; and &amp;quot;rm -rf $HOME/.conda/pkgs/*&amp;quot; after installing packages'''.&lt;br /&gt;
&lt;br /&gt;
To activate the conda environment: (should be activated before running python)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
source activate myPythonEnv&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Note that you SHOULD NOT use '''conda activate myPythonEnv''' to activate the environment.  This leads to all sorts of problems.  Once the environment is activated, you can update or install packages via '''conda''' or '''pip''':&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda install  &amp;lt;package_name&amp;gt; (preferred way to install packages)&lt;br /&gt;
pip install &amp;lt;package_name&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*Once the installation finishes, please clean the cache:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf $HOME/.conda/pkgs/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To deactivate:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
source deactivate&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To remove a conda environment:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda remove --name myPythonEnv --all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To verify that the environment was removed, run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda info --envs&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Submitting Python Job ===&lt;br /&gt;
A single-gpu job example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
#SBATCH --time=1:00:0&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt; #For SOSCIP projects only&lt;br /&gt;
&lt;br /&gt;
module load anaconda3&lt;br /&gt;
source activate myPythonEnv&lt;br /&gt;
python code.py ...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== CuPy ==&lt;br /&gt;
[https://cupy.chainer.org CuPy] is an open-source matrix library accelerated with NVIDIA CUDA. It also uses CUDA-related libraries including cuBLAS, cuDNN, cuRand, cuSolver, cuSPARSE, cuFFT and NCCL to make full use of the GPU architecture. CuPy is an implementation of NumPy-compatible multi-dimensional array on CUDA. CuPy consists of the core multi-dimensional array class, cupy.ndarray, and many functions on it. It supports a subset of numpy.ndarray interface.&lt;br /&gt;
&lt;br /&gt;
CuPy can be installed into any conda environment. The Python packages numpy, six and fastrlock are required. cuDNN and NCCL are optional.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0  nccl/2.9.9 anaconda3/2021.05&lt;br /&gt;
conda create -n cupy-env python=3.8 numpy six fastrlock&lt;br /&gt;
source activate cupy-env&lt;br /&gt;
CFLAGS=&amp;quot;-I$MODULE_CUDNN_PREFIX/include -I$MODULE_NCCL_PREFIX/include -I$MODULE_CUDA_PREFIX/include&amp;quot; LDFLAGS=&amp;quot;-L$MODULE_CUDNN_PREFIX/lib64 -L$MODULE_NCCL_PREFIX/lib&amp;quot; CUDA_PATH=$MODULE_CUDA_PREFIX pip install cupy&lt;br /&gt;
#building/installing CuPy will take a few minutes&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Gromacs ==&lt;br /&gt;
[http://www.gromacs.org/ GROMACS] is a versatile package to perform molecular dynamics, i.e. simulate the Newtonian equations of motion for systems with hundreds to millions of particles. It is primarily designed for biochemical molecules like proteins, lipids and nucleic acids that have a lot of complicated bonded interactions, but since GROMACS is extremely fast at calculating the nonbonded interactions (that usually dominate simulations) many groups are also using it for research on non-biological systems, e.g. polymers.&lt;br /&gt;
*'''GROMACS 2019'''&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load MistEnv/2021a cuda/10.2.2 gcc/8.5.0 gromacs/2019.6&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*'''GROMACS 2020 and 2021''' The Thread-MPI version supports full GPU enablement of all key computational sections. The GPU is used throughout the timestep and repeated CPU-GPU transfers are eliminated. Users are advised to verify the results carefully.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load MistEnv/2021a cuda/10.2.2 gcc/8.5.0 gromacs/2020.4&lt;br /&gt;
module load MistEnv/2021a cuda/10.2.2 gcc/8.5.0 gromacs/2020.6&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 gromacs/2021.2&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 openmpi/4.1.1+ucx-1.10.0 gromacs/2021.2 (testing purpose only)&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 gromacs/2021.4&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 gromacs/2021.5&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Small/Medium Simulation ===&lt;br /&gt;
Due to the lack of PME domain decomposition support on GPU, Gromacs uses the CPU to calculate PME when using multiple GPUs. '''It is always recommended to use a single GPU for small and medium sized simulations with Gromacs.''' By using only 1 tMPI thread (with multiple OpenMP threads) on a single GPU, both non-bonded PP and PME are automatically offloaded to the GPU when possible.&lt;br /&gt;
* Gromacs 2019 example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/10.2.2 gcc/8.5.0 gromacs/2019.6&lt;br /&gt;
export OMP_NUM_THREADS=8&lt;br /&gt;
export OMP_PLACES=cores&lt;br /&gt;
gmx mdrun -pin off -ntmpi 1 -ntomp 8  ... &amp;lt;other parameters&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Gromacs 2020 or 2021 example: &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 gromacs/2021.5&lt;br /&gt;
export OMP_NUM_THREADS=8&lt;br /&gt;
export OMP_PLACES=cores&lt;br /&gt;
export GMX_FORCE_UPDATE_DEFAULT_GPU=true&lt;br /&gt;
gmx mdrun -pin off -ntmpi 1 -ntomp 8  ... &amp;lt;other parameters&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Large Simulation ===&lt;br /&gt;
If the memory available to a single-GPU job (~58GB) is not sufficient for the simulation, multiple GPUs can be used. It is suggested to start testing with one full node with 4 GPUs and to force PME onto the GPU. Multiple PME ranks are not supported with PME on GPU, so if a GPU is used for the PME calculation, -npme (the number of PME ranks) must be set to 1. If PME has less work than PP, it is suggested to run multiple ranks per GPU, so that the GPU running the PME rank can also do some work for the PP rank(s).&lt;br /&gt;
'''If your simulation fits in a single-GPU job, please use a single GPU for much higher efficiency. Do not spend 3 additional GPUs for only a small performance improvement.'''&lt;br /&gt;
*An example using 4 GPUs, 7 PP ranks/tmpi threads + 1 PME rank/tmpi thread: ('''-pin on -pme gpu -npme 1''' must be added to mdrun command in order to force GPU to do PME)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=1&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3  gcc/9.4.0 gromacs/2021.5&lt;br /&gt;
&lt;br /&gt;
export OMP_NUM_THREADS=4&lt;br /&gt;
gmx mdrun -ntmpi 8 -pin on -pme gpu -npme 1 ... &amp;lt;add your parameters&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*Also test with '''-ntmpi 4''' and '''export OMP_NUM_THREADS=8''' if Gromacs prints a NOTE saying &amp;quot;% performance was lost because the PME ranks had more work to do than the PP ranks&amp;quot;. In that case, NVIDIA MPS is not needed since there is only one MPI rank per GPU.&lt;br /&gt;
*'''Please note that PME on GPU is still an initial implementation and comes with the limitations listed below.'''&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
* Only a PME order of 4 is supported on GPUs.&lt;br /&gt;
* PME will run on a GPU only when exactly one rank has a PME task, i.e. decompositions with multiple ranks doing PME are not supported.&lt;br /&gt;
* Only single precision is supported.&lt;br /&gt;
* Free energy calculations where charges are perturbed are not supported, because only single PME grids can be calculated.&lt;br /&gt;
* Only dynamical integrators are supported (i.e. leap-frog, Velocity Verlet, stochastic dynamics).&lt;br /&gt;
* LJ PME is not supported on GPUs.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*An example using 4 GPUs, '''PME on CPU''' ('''-pin on''' must be added to the mdrun command for proper CPU thread binding):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=1&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3  gcc/9.4.0 gromacs/2021.5&lt;br /&gt;
&lt;br /&gt;
export OMP_NUM_THREADS=4&lt;br /&gt;
gmx mdrun -ntmpi 8 -pin on  ... &amp;lt;add your parameters&amp;gt;&lt;br /&gt;
&lt;br /&gt;
# &amp;quot;-ntmpi 16, OMP_NUM_THREADS=2&amp;quot; and &amp;quot;-ntmpi 4, OMP_NUM_THREADS=8&amp;quot; should also be tested.  &lt;br /&gt;
# num_thread_MPI_ranks(-ntmpi) * num_OpenMP_threads = 32&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*'''If your simulation fits in a single-GPU job, please use a single GPU; it is much more efficient. Do not tie up 3 additional GPUs for only a small performance improvement.'''&lt;br /&gt;
*'''NOTE: The above examples will NOT work across multiple nodes. If your simulation is too large for a single GPU node, please contact SciNet/SOSCIP support.'''&lt;br /&gt;
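The PME-on-CPU example above notes that the number of thread-MPI ranks (-ntmpi) times OMP_NUM_THREADS should equal 32. A minimal sketch of checking that constraint before editing a job script; the helper name omp_threads_for is hypothetical (it is not part of Gromacs or of Mist's tooling), and the value 32 is taken from the comment in that example:&lt;br /&gt;

```shell
#!/bin/bash
# Hypothetical helper: given a number of thread-MPI ranks (-ntmpi),
# print the OMP_NUM_THREADS value that makes ranks * threads = 32.
omp_threads_for() {
    local ranks=$1
    local total=32   # from the example above: ranks * OpenMP threads = 32
    if [ "$ranks" -le 0 ] || [ $((total % ranks)) -ne 0 ]; then
        echo "error: $ranks ranks do not divide $total evenly"
        return 1
    fi
    echo $((total / ranks))
}

# The three layouts suggested above: 16x2, 8x4 and 4x8.
for ranks in 16 8 4; do
    echo "-ntmpi $ranks with OMP_NUM_THREADS=$(omp_threads_for "$ranks")"
done
```

Which of the valid layouts is fastest depends on the simulation, so each should still be benchmarked.&lt;br /&gt;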
&lt;br /&gt;
== NAMD ==&lt;br /&gt;
[http://www.ks.uiuc.edu/Research/namd/ NAMD] is a parallel, object-oriented molecular dynamics code designed for high-performance simulation of large biomolecular systems.&lt;br /&gt;
=== 2.14 ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 spectrum-mpi/10.4.0 namd/2.14&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
==== Running with single GPU ====&lt;br /&gt;
If you have many jobs to run, use a single GPU per job. This makes the jobs easier to schedule and gives better overall performance.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 spectrum-mpi/10.4.0 namd/2.14&lt;br /&gt;
scontrol show hostnames &amp;gt; nodelist-$SLURM_JOB_ID&lt;br /&gt;
&lt;br /&gt;
`which charmrun` -npernode 1 -bind-to none -hostfile nodelist-$SLURM_JOB_ID `which namd2` +idlepoll +ppn 8 +p 8 stmv.namd&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Running with one process per node (4 GPUs)====&lt;br /&gt;
An example of the job script (using 1 node, '''one process per node''',  32 CPU threads per process + 4 GPUs per process):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=1&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 spectrum-mpi/10.4.0 namd/2.14&lt;br /&gt;
scontrol show hostnames &amp;gt; nodelist-$SLURM_JOB_ID&lt;br /&gt;
&lt;br /&gt;
`which charmrun` -npernode 1 -hostfile nodelist-$SLURM_JOB_ID `which namd2` +setcpuaffinity +pemap 0-127:4 +idlepoll +ppn 32 +p $((32*SLURM_NTASKS)) stmv.namd&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
==== Running with one process per GPU (4 GPUs)====&lt;br /&gt;
NAMD may scale better if using '''one process per GPU'''. Please do your own benchmark.&lt;br /&gt;
An example of the job script (using 1 node, '''one process per GPU''',  8 CPU threads per process):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=4&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 spectrum-mpi/10.4.0 namd/2.14&lt;br /&gt;
scontrol show hostnames &amp;gt; nodelist-$SLURM_JOB_ID&lt;br /&gt;
&lt;br /&gt;
`which charmrun` -npernode 4 -hostfile nodelist-$SLURM_JOB_ID `which namd2` +setcpuaffinity +pemap 0-127:4 +idlepoll +ppn 8 +p $((8*SLURM_NTASKS)) stmv.namd&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Open-CE ==&lt;br /&gt;
[https://github.com/open-ce/open-ce Open-CE] is an '''IBM''' repository that collects feedstocks, environment data, and scripts for building TensorFlow, PyTorch, and other machine learning packages and their dependencies. Open-CE is distributed as a '''conda channel''' on the Mist cluster.&lt;br /&gt;
'''Available packages and versions are listed in [https://github.com/open-ce/open-ce/releases/tag/open-ce-v1.5.2 Open-CE Releases]'''. Currently only Python 3.8 and CUDA 11.2 are supported. If you need a different Python or CUDA version, please contact SOSCIP or SciNet support.&lt;br /&gt;
&lt;br /&gt;
*Packages can be installed by specifying the Open-CE conda channel:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda install -c /scinet/mist/ibm/open-ce python=3.8 cudatoolkit=11.2 PACKAGE&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*Once the installation finishes, please clean the cache:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf $HOME/.conda/pkgs/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|+Available Packages:&lt;br /&gt;
|-&lt;br /&gt;
|TensorFlow&lt;br /&gt;
|TensorFlow Estimators&lt;br /&gt;
|TensorFlow Probability&lt;br /&gt;
|TensorBoard&lt;br /&gt;
|TensorBoard Data Server&lt;br /&gt;
|TensorFlow Text&lt;br /&gt;
|TensorFlow Model Optimizations&lt;br /&gt;
|TensorFlow Addons&lt;br /&gt;
|TensorFlow Datasets&lt;br /&gt;
|TensorFlow Hub&lt;br /&gt;
|-&lt;br /&gt;
|TensorFlow MetaData&lt;br /&gt;
|PyTorch&lt;br /&gt;
|TorchText&lt;br /&gt;
|TorchVision&lt;br /&gt;
|PyTorch Lightning&lt;br /&gt;
|PyTorch Lightning Bolts&lt;br /&gt;
|ONNX&lt;br /&gt;
|Onnx-runtime&lt;br /&gt;
|skl2onnx&lt;br /&gt;
|tf2onnx&lt;br /&gt;
|-&lt;br /&gt;
|onnxmltools&lt;br /&gt;
|onnxconverter-common&lt;br /&gt;
|XGBoost&lt;br /&gt;
|LightGBM&lt;br /&gt;
|Transformers&lt;br /&gt;
|Tokenizers&lt;br /&gt;
|SentencePiece&lt;br /&gt;
|Spacy&lt;br /&gt;
|DALI&lt;br /&gt;
|OpenCV&lt;br /&gt;
|-&lt;br /&gt;
|Horovod&lt;br /&gt;
|PyArrow&lt;br /&gt;
|grpc&lt;br /&gt;
|uwsgi&lt;br /&gt;
|ORC&lt;br /&gt;
|Mamba&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== PyTorch ==&lt;br /&gt;
=== Installing from IBM Open-CE Conda Channel ===&lt;br /&gt;
The easiest way to install PyTorch on Mist is to use IBM's Open-CE Conda channel. Prepare a conda environment and install PyTorch from that channel:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n pytorch_env python=3.8&lt;br /&gt;
source activate pytorch_env&lt;br /&gt;
conda install -c /scinet/mist/ibm/open-ce pytorch=1.10.2 cudatoolkit=11.2&lt;br /&gt;
# OR, alternatively (run only one of the two install commands; cudatoolkit=10.2 can be used instead of 11.0):&lt;br /&gt;
conda install -c /scinet/mist/ibm/open-ce-1.2 pytorch=1.7.1 cudatoolkit=11.0&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Once the installation finishes, please clean the cache:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf $HOME/.conda/pkgs/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To get deterministic results, add the command below to your job script before the python command; see [https://github.com/pytorch/pytorch/issues/39849] for details:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
export CUBLAS_WORKSPACE_CONFIG=:4096:2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== RAPIDS ==&lt;br /&gt;
[https://rapids.ai RAPIDS] is a suite of open source software libraries for running end-to-end data science and analytics pipelines entirely on GPUs. The RAPIDS data science framework includes a collection of libraries: '''cuDF (GPU DataFrames)''', '''cuML (GPU Machine Learning Algorithms)''', '''cuStrings (GPU String Manipulation)''', etc.&lt;br /&gt;
&lt;br /&gt;
=== Installing from IBM Conda Channel ===&lt;br /&gt;
The easiest way to install RAPIDS on Mist is to use IBM's Conda channel. Prepare a conda environment with Python 3.6 or 3.7 and install powerai-rapids from that channel:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n rapids_env python=3.7&lt;br /&gt;
source activate rapids_env&lt;br /&gt;
conda install -c https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda-early-access/ powerai-rapids&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Once the installation finishes, please clean the cache:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf $HOME/.conda/pkgs/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== TensorFlow and Keras ==&lt;br /&gt;
=== Installing from IBM Conda Channel ===&lt;br /&gt;
The easiest way to install TensorFlow and Keras on Mist is to use IBM's Open-CE Conda channel. Prepare a conda environment and install TensorFlow from that channel:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n tf_env python=3.8&lt;br /&gt;
source activate tf_env&lt;br /&gt;
conda install -c /scinet/mist/ibm/open-ce tensorflow==2.7.1 cudatoolkit=11.2&lt;br /&gt;
# OR, alternatively (run only one of the two install commands; cudatoolkit=10.2 can be used instead of 11.0):&lt;br /&gt;
conda install -c /scinet/mist/ibm/open-ce-1.2 tensorflow==2.4.3 cudatoolkit=11.0&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Once the installation finishes, please clean the cache:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf $HOME/.conda/pkgs/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Testing and debugging =&lt;br /&gt;
Test your code before you submit it to the cluster, both to check that it is correct and to find out what resources it needs.&lt;br /&gt;
* Small test jobs can be run on the login node.  Rule of thumb: tests should run no more than a couple of minutes, take at most about 1-2GB of memory, and use no more than one GPU and a few cores.&lt;br /&gt;
&amp;lt;!-- * You can run the [[Parallel Debugging with DDT|DDT]] debugger on the login nodes after &amp;lt;code&amp;gt;module load ddt&amp;lt;/code&amp;gt;. --&amp;gt;&lt;br /&gt;
* For short tests that do not fit on a login node, or for which you need a dedicated node, request an interactive debug job with the debugjob command:&lt;br /&gt;
 mist-login01:~$ debugjob --clean -g G&lt;br /&gt;
where G is the number of GPUs. G=1 gives an interactive session for 2 hours, G=4 gets you a single node with 4 GPUs for 30 minutes, and G=8 (the maximum) gets you 2 nodes with 4 GPUs each for 15 minutes.  The &amp;lt;tt&amp;gt;--clean&amp;lt;/tt&amp;gt; argument is optional but recommended, as it starts the session without any modules loaded, thus mimicking more closely what happens when you submit a job script. You need to load modules and activate your conda environment after a debug job starts. If the --clean flag was omitted, it is recommended to do a 'conda clean' before 'source activate ENV' in the debug job.&lt;br /&gt;
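These session lengths are consistent with the 2/n_gpu hours rule given for the debug partition in the limits table. A minimal sketch of that arithmetic, assuming the rule applies to each value of G mentioned here; the helper name debug_minutes is hypothetical and is not a Mist command:&lt;br /&gt;

```shell
#!/bin/bash
# Hypothetical helper: maximum debugjob session length in minutes,
# using the 2/n_gpu hours rule (2 hours = 120 minutes).
debug_minutes() {
    local gpus=$1
    echo $((120 / gpus))
}

# The three cases described in the text: 1, 4 and 8 GPUs.
for g in 1 4 8; do
    echo "G=$g: $(debug_minutes "$g") minutes"
done
```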
&lt;br /&gt;
= Submitting jobs =&lt;br /&gt;
Once you have compiled and tested your code or workflow on the Mist login nodes, and confirmed that it behaves correctly, you are ready to submit jobs to the cluster.  Your jobs will run on some of Mist's 53 compute nodes.  When and where your job runs is determined by the scheduler.&lt;br /&gt;
&lt;br /&gt;
Mist uses SLURM as its job scheduler. It is configured to allow only '''Single-GPU jobs''' and '''Full-node jobs (4 GPUs per node)'''.&lt;br /&gt;
&lt;br /&gt;
You submit jobs from a login node by passing a script to the sbatch command:&lt;br /&gt;
&lt;br /&gt;
mist-login01:scratch$ sbatch jobscript.sh&lt;br /&gt;
&lt;br /&gt;
This puts the job in the queue. It will run on the compute nodes in due course. In most cases, you should not submit from your $HOME directory, but rather, from your $SCRATCH directory, so that the output of your compute job can be written out (as mentioned above, $HOME is read-only on the compute nodes).&lt;br /&gt;
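As a small safeguard, a wrapper around sbatch could warn when the current directory is inside $HOME. The following is only a sketch: the function name is_under is hypothetical, and it compares path strings without resolving symbolic links.&lt;br /&gt;

```shell
#!/bin/bash
# Hypothetical guard: succeed if directory $1 is $2 itself or lies below it.
# Plain string comparison; symbolic links are not resolved.
is_under() {
    case "$1" in
        "$2"|"$2"/*) return 0 ;;
        *) return 1 ;;
    esac
}

if is_under "$PWD" "$HOME"; then
    echo "warning: submitting from \$HOME, which is read-only on compute nodes"
else
    echo "ok: current directory is not under \$HOME"
fi
```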
&lt;br /&gt;
Example job scripts can be found below.&lt;br /&gt;
Keep in mind:&lt;br /&gt;
* Scheduling is by single GPU or by full node, so request either 1 GPU or 4 GPUs per node.&lt;br /&gt;
* Your job's maximum walltime is 24 hours. &lt;br /&gt;
* Jobs must write their output to your scratch or project directory (home is read-only on compute nodes).&lt;br /&gt;
* Compute nodes have no internet access.&lt;br /&gt;
* Your job script will not remember the modules you have loaded, so it needs to contain &amp;quot;module load&amp;quot; commands of all the required modules (see examples below). &lt;br /&gt;
== SOSCIP Users ==&lt;br /&gt;
*[https://www.soscip.org SOSCIP] is a consortium to bring together industrial partners and academic researchers and provide them with sophisticated advanced computing technologies and expertise to solve social, technical and business challenges across sectors and drive economic growth.&lt;br /&gt;
&lt;br /&gt;
If you are working on a SOSCIP project, please contact [mailto:soscip-support@scinet.utoronto.ca soscip-support@scinet.utoronto.ca] to have your user account added to SOSCIP project accounts. SOSCIP users need to submit jobs with an additional SLURM flag to get higher priority:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#SBATCH -A soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt;    #e.g. soscip-3-001&lt;br /&gt;
# or, equivalently:&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Single-GPU job script ==&lt;br /&gt;
A single-GPU job gets a quarter of a node: 1 GPU + 8 CPU cores (32 hardware threads) + ~58GB CPU memory. '''Never request CPUs or memory explicitly.''' If running an MPI program, you can set --ntasks to the number of MPI ranks. '''Do NOT set --ntasks for non-MPI programs.''' &lt;br /&gt;
*If running multiple MPI ranks on one GPU, the NVIDIA Multi-Process Service (MPS) is recommended.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
#SBATCH --time=1:00:00&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt; #For SOSCIP projects only&lt;br /&gt;
&lt;br /&gt;
module load anaconda3&lt;br /&gt;
source activate conda_env&lt;br /&gt;
python code.py ...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Full-node job script ==&lt;br /&gt;
'''If you are not sure that your program can run on multiple GPUs, please follow the single-GPU job instructions above or contact SciNet/SOSCIP support.'''&lt;br /&gt;
&lt;br /&gt;
A multi-GPU job should request at least one full node (4 GPUs). You need to specify the &amp;quot;compute_full_node&amp;quot; partition in order to get all resources on a node. &lt;br /&gt;
*An example for a 1-node job:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=4 # this only affects MPI jobs&lt;br /&gt;
#SBATCH --time=1:00:00&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt; #For SOSCIP projects only&lt;br /&gt;
&lt;br /&gt;
module load &amp;lt;modules you need&amp;gt;&lt;br /&gt;
# Run your program here&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Limits ==&lt;br /&gt;
&lt;br /&gt;
There are limits to the size and duration of your jobs, the number of jobs you can run and the number of jobs you can have queued.&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!Usage&lt;br /&gt;
!Partition&lt;br /&gt;
!Running jobs&lt;br /&gt;
!Jobs in queue&lt;br /&gt;
!Min. size of jobs&lt;br /&gt;
!Max. size of jobs&lt;br /&gt;
!Min. walltime&lt;br /&gt;
!Max. walltime &lt;br /&gt;
|-&lt;br /&gt;
|Compute jobs ||compute || 100 GPUs || 1000 || 1 GPU (8&amp;amp;nbsp;cores) || default:&amp;amp;nbsp;4&amp;amp;nbsp;nodes&amp;amp;nbsp;(16&amp;amp;nbsp;GPUs) &amp;lt;br&amp;gt; with&amp;amp;nbsp;allocation:&amp;amp;nbsp;4&amp;amp;nbsp;nodes&amp;amp;nbsp;(16&amp;amp;nbsp;GPUs)|| 15 minutes || 24 hours&lt;br /&gt;
|-&lt;br /&gt;
|Testing or troubleshooting || debug || 1 || 1 || 1 GPU (8 cores) || 2 nodes (8 GPUs)|| N/A || 2/n&amp;lt;sub&amp;gt;gpu&amp;lt;/sub&amp;gt; hours&lt;br /&gt;
|-&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Even if you respect these limits, your jobs will still have to wait in the queue. The waiting time depends on many factors such as your group's allocation amount, how much allocation has been used in the recent past, the number of requested nodes and walltime, and how many other jobs are waiting in the queue.&lt;br /&gt;
&lt;br /&gt;
= Jupyter Notebooks =&lt;br /&gt;
SciNet’s [[Jupyter Hub]] is a Niagara-type node; it has a different CPU architecture and no GPUs, so Conda environments prepared on Mist will not work properly there. Users who need Jupyter Notebook to develop and test some aspects of their workflow can run their own server on the Mist login node and connect to it from outside through an SSH tunnel. Keep in mind that the login node is a shared resource, and heavy calculations should be done only on compute nodes. Processes (including the IPython kernels used by the notebooks) are limited to one hour of total CPU time: idle time does not count toward this hour, and use of multiple cores counts proportionally to the number of cores (e.g. a kernel using all 128 virtual cores on the node will be killed after 28 seconds). Idle notebooks can still burden the node by holding system and GPU memory, so please be mindful of other users and terminate notebooks when your work is done.&lt;br /&gt;
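The CPU-time limit can be worked out for any core count. A minimal sketch of the arithmetic, assuming the process keeps every core it uses fully busy; the helper name seconds_until_killed is hypothetical:&lt;br /&gt;

```shell
#!/bin/bash
# Hypothetical helper: wall-clock seconds before a login-node process
# exhausts the one-hour (3600 s) CPU-time limit, assuming it keeps the
# given number of cores fully busy. Integer division rounds down.
seconds_until_killed() {
    local cores=$1
    echo $((3600 / cores))
}

seconds_until_killed 1     # one busy core: prints 3600
seconds_until_killed 128   # all 128 virtual cores: prints 28
```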
&lt;br /&gt;
As an example, let us create a new Conda environment and activate it:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n jupyter_env python=3.7&lt;br /&gt;
source activate jupyter_env&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Install the Jupyter Notebook server:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda install notebook&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Running the notebook server ==&lt;br /&gt;
When the Conda environment is active, enter:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
jupyter-notebook&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
By default, the Jupyter Notebook server uses port 8888 (can be overridden with the &amp;lt;code&amp;gt;--port&amp;lt;/code&amp;gt; option). If another user has already started their own server, the default port may be busy, in which case the server will be listening on a different port. Once launched, the server will output some information to the terminal that will include the actual port number used and a 48-character token. For example:&lt;br /&gt;
&amp;lt;pre&amp;gt;http://localhost:8890/?token=54c4090d……&amp;lt;/pre&amp;gt;&lt;br /&gt;
In this example, the server is listening on port 8890.&lt;br /&gt;
&lt;br /&gt;
== Creating a tunnel ==&lt;br /&gt;
In order to access this port remotely (i.e. from your office or home), an [https://en.wikipedia.org/wiki/Tunneling_protocol#Secure_Shell_tunneling SSH tunnel] has to be established. Please refer to your SSH client’s documentation for instructions on how to do that. For the OpenSSH client (standard in most Linux distributions and macOS), a tunnel can be opened in a separate terminal session to the one where the Jupyter Notebook server is running. In the new terminal, issue this command:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -L8888:localhost:8890 &amp;lt;username&amp;gt;@mist.scinet.utoronto.ca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Replace &amp;lt;code&amp;gt;&amp;lt;username&amp;gt;&amp;lt;/code&amp;gt; with your actual username. The tunnel stays open as long as this SSH connection is alive. In this example, we tunnel the Mist login node’s port 8890 (where our server is assumed to be running) to our home computer’s port 8888 (any other free port is fine). The notebook can then be accessed in the browser at &amp;lt;code&amp;gt;&amp;lt;nowiki&amp;gt;http://localhost:8888&amp;lt;/nowiki&amp;gt;&amp;lt;/code&amp;gt; (followed by &amp;lt;code&amp;gt;/?token=54c4090d……&amp;lt;/code&amp;gt;, or the token can be entered on the webpage).&lt;br /&gt;
&lt;br /&gt;
== Using Jupyter on compute nodes ==&lt;br /&gt;
&lt;br /&gt;
You can use the instructions here to set up a Jupyter Notebook server on a compute node (including a [[#Testing_and_debugging|debugjob]]). '''We strongly discourage''' you from running an interactive notebook on a compute node (other than for a debugjob): scheduled jobs start at unpredictable times and are not meant to be interactive. Jupyter notebooks can instead be run non-interactively or converted to Python scripts.&lt;br /&gt;
&lt;br /&gt;
To launch the Jupyter Notebook server, load the &amp;lt;code&amp;gt;anaconda3&amp;lt;/code&amp;gt; module and activate your environment as before (by adding the appropriate lines to the submission script, if you are not using the compute node with an interactive shell). Launching the server has to be done like so:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
HOME=/dev/shm/$USER jupyter-notebook&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
That is because Jupyter will fail unless it can write to the home folder, which is read-only from compute nodes. This modification of the &amp;lt;code&amp;gt;$HOME&amp;lt;/code&amp;gt; environment variable will carry over into the notebooks, which is usually not a problem, but in case the notebook relies on this environment variable (e.g. to read certain files), it can be reset manually in the notebook (&amp;lt;code&amp;gt;import os; os.environ['HOME']=……&amp;lt;/code&amp;gt;).&lt;br /&gt;
&lt;br /&gt;
Because compute nodes are not accessible from the Internet, tunneling has to be done twice: once from the remote location (office or home) to the Mist login node, and then from the login node to the compute node. Assuming the server is running on port 8890 of the mist006 node, open the first tunnel in a new terminal session on the remote computer:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -L8888:localhost:9999 &amp;lt;username&amp;gt;@mist.scinet.utoronto.ca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
where 9999 is any available port on the Mist login node (to test port availability enter &amp;lt;code&amp;gt;ss -Hln src :9999&amp;lt;/code&amp;gt; in the terminal when connected to the Mist login node; an empty output indicates that the port is free). In the same session in the login node that was created with the above command, open the second tunnel to the compute node:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -L9999:localhost:8890 mist006&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Be aware that the second tunnel will automatically disconnect once the job on the compute node times out or is relinquished. The Jupyter Notebook server running on the compute node can now be accessed from the browser as in the previous subsection.&lt;br /&gt;
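The way the port numbers chain together across the two hops can be sketched with a small helper that only prints the two commands. The function name tunnel_commands and its argument order are hypothetical, and USERNAME is a placeholder:&lt;br /&gt;

```shell
#!/bin/bash
# Hypothetical helper: print the two ssh commands for the double tunnel.
# Arguments: local port, login-node port, compute-node port, compute node, username.
tunnel_commands() {
    local local_port=$1 login_port=$2 node_port=$3 node=$4 user=$5
    # First hop: run on the remote (office or home) computer.
    echo "ssh -L${local_port}:localhost:${login_port} ${user}@mist.scinet.utoronto.ca"
    # Second hop: run inside the login-node session opened by the first hop.
    echo "ssh -L${login_port}:localhost:${node_port} ${node}"
}

# Same ports and node as in the example above.
tunnel_commands 8888 9999 8890 mist006 USERNAME
```

The login-node port (9999 here) must appear in both commands: it is the destination of the first tunnel and the listening port of the second.&lt;br /&gt;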
&lt;br /&gt;
&lt;br /&gt;
= Support =&lt;br /&gt;
&lt;br /&gt;
SciNet inquiries:&lt;br /&gt;
* [mailto:support@scinet.utoronto.ca support@scinet.utoronto.ca]&lt;br /&gt;
&lt;br /&gt;
SOSCIP inquiries:&lt;br /&gt;
*[mailto:soscip-support@scinet.utoronto.ca soscip-support@scinet.utoronto.ca]&lt;/div&gt;</summary>
		<author><name>Ymeiron</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=3575</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=3575"/>
		<updated>2022-02-15T15:22:48Z</updated>

		<summary type="html">&lt;p&gt;Ymeiron: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
{| style=&amp;quot;border-spacing:10px; width: 95%&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:1em; padding-top:.1em; border:2px solid #0645ad; background-color:#f6f6f6; border-radius:7px&amp;quot;|&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Use &amp;quot;Up&amp;quot; or &amp;quot;Down&amp;quot;; these are templates. --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;width:100%&amp;quot; &lt;br /&gt;
|{{Up |Niagara|Niagara_Quickstart}}&lt;br /&gt;
|{{Up |Mist|Mist}}&lt;br /&gt;
|{{Up |Teach|Teach}}&lt;br /&gt;
|{{Up | Rouge|Rouge}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up |Jupyter Hub|Jupyter_Hub}}&lt;br /&gt;
|{{Up |Scheduler|Niagara_Quickstart#Submitting_jobs}}&lt;br /&gt;
|{{Up |File system|Niagara_Quickstart#Storage_and_quotas}}&lt;br /&gt;
|{{Up |Burst Buffer|Burst_Buffer}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up|HPSS|HPSS}}&lt;br /&gt;
|{{Up |Login Nodes|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up |External Network|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up |Globus |Globus}}&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Current Messages: --&amp;gt;&lt;br /&gt;
&amp;lt;b&amp;gt;Sat Feb 12 2022, 12:59 EST&amp;lt;/b&amp;gt; Jupyterhub is back up, but may have a hardware issue.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Sat Feb 12 2022, 10:36 EST&amp;lt;/b&amp;gt; Issue with the Jupyterhub, since last night.  We're investigating.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Tue Feb 1 2022 19:20 EST&amp;lt;/b&amp;gt; Maintenance finished successfully. Systems are up. &lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Tue Feb 1 2022 13:00 EST&amp;lt;/b&amp;gt; Maintenance downtime started.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Mon Jan 31 2022 13:15:00 EST&amp;lt;/b&amp;gt;: The SciNet datacentre's cooling system needs an &amp;lt;b&amp;gt;emergency repair&amp;lt;/b&amp;gt; as soon as possible.  During this repair, all systems hosted at SciNet (Niagara, Mist, Rouge, HPSS, and Teach) will need to be switched off and will be unavailable to users. Repairs will start &amp;lt;b&amp;gt;Tuesday February 1st, at 1:00 pm EST&amp;lt;/b&amp;gt;, and could take until the end of the next day.  Please check here for updates.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Sat Jan 29 2022 16:45:38 EST&amp;lt;/b&amp;gt; Fibre repaired.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Sat 29 Jan 2022 11:22:27 EST&amp;lt;/b&amp;gt; Fibre repair is underway.  Expect to have connectivity restored later today.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Fri 28 Jan 2022 07:35:01 EST&amp;lt;/b&amp;gt; The fibre optics cable that connects the SciNet datacentre was severed by uncoordinated digging at York University.  We expect repairs to happen as soon as possible.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Thu Jan 27 12:46 EST PM 2022&amp;lt;/b&amp;gt; Network issues to and from the datacentre. We are investigating.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  When removing system status entries, please archive them to: https://docs.scinet.utoronto.ca/index.php/Previous_messages --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;border-spacing: 10px;width: 100%&amp;quot;&lt;br /&gt;
|valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== QuickStart Guides ==&lt;br /&gt;
* [[Niagara Quickstart]]&lt;br /&gt;
* [[HPSS | HPSS archival storage]]&lt;br /&gt;
* [[Mist| Mist Power 9 GPU cluster]]&lt;br /&gt;
* [[Teach|Teach cluster]]&lt;br /&gt;
* [[FAQ | FAQ (frequently asked questions)]]&lt;br /&gt;
* [[Acknowledging SciNet]]&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== Tutorials, Manuals, etc. ==&lt;br /&gt;
* [https://education.scinet.utoronto.ca SciNet education material]&lt;br /&gt;
* [https://www.youtube.com/c/SciNetHPCattheUniversityofToronto SciNet's YouTube channel]&lt;br /&gt;
* [[Modules specific to Niagara|Software Modules specific to Niagara]] &lt;br /&gt;
* [[Modules for Mist]] &lt;br /&gt;
* [[Commercial software]]&lt;br /&gt;
* [[Burst Buffer]]&lt;br /&gt;
* [[SSH#SSH Keys|SSH keys]]&lt;br /&gt;
* [[SSH Tunneling]]&lt;br /&gt;
* [[SSH#Two-Factor_authentication|Two-Factor Authentication]]&lt;br /&gt;
* [[Visualization]]&lt;br /&gt;
* [[Running Serial Jobs on Niagara]]&lt;br /&gt;
* [[Jupyter Hub]]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Ymeiron</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=3572</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=3572"/>
		<updated>2022-02-12T17:59:22Z</updated>

		<summary type="html">&lt;p&gt;Ymeiron: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
{| style=&amp;quot;border-spacing:10px; width: 95%&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:1em; padding-top:.1em; border:2px solid #0645ad; background-color:#f6f6f6; border-radius:7px&amp;quot;|&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Use &amp;quot;Up&amp;quot; or &amp;quot;Down&amp;quot;; these are templates. --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;width:100%&amp;quot; &lt;br /&gt;
|{{Up |Niagara|Niagara_Quickstart}}&lt;br /&gt;
|{{Up |Mist|Mist}}&lt;br /&gt;
|{{Up |Teach|Teach}}&lt;br /&gt;
|{{Up | Rouge|Rouge}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Partial |Jupyter Hub|Jupyter_Hub}}&lt;br /&gt;
|{{Up |Scheduler|Niagara_Quickstart#Submitting_jobs}}&lt;br /&gt;
|{{Up |File system|Niagara_Quickstart#Storage_and_quotas}}&lt;br /&gt;
|{{Up |Burst Buffer|Burst_Buffer}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up|HPSS|HPSS}}&lt;br /&gt;
|{{Up |Login Nodes|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up |External Network|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up |Globus |Globus}}&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Current Messages: --&amp;gt;&lt;br /&gt;
&amp;lt;b&amp;gt;Sat Feb 12 2022, 12:59 EST&amp;lt;/b&amp;gt; Jupyterhub is back up, but may have a hardware issue.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Sat Feb 12 2022, 10:36 EST&amp;lt;/b&amp;gt; Issue with the Jupyterhub, since last night.  We're investigating.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Tue Feb 1 2022 19:20 EST&amp;lt;/b&amp;gt; Maintenance finished successfully. Systems are up. &lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Tue Feb 1 2022 13:00 EST&amp;lt;/b&amp;gt; Maintenance downtime started.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Mon Jan 31 2022 13:15:00 EST&amp;lt;/b&amp;gt;: The SciNet datacentre's cooling system needs an &amp;lt;b&amp;gt;emergency repair&amp;lt;/b&amp;gt; as soon as possible.  During this repair, all systems hosted at SciNet (Niagara, Mist, Rouge, HPSS, and Teach) will need to be switched off and will be unavailable to users. Repairs will start &amp;lt;b&amp;gt;Tuesday February 1st, at 1:00 pm EST&amp;lt;/b&amp;gt;, and could take until the end of the next day.  Please check here for updates.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Sat Jan 29 2022 16:45:38 EST&amp;lt;/b&amp;gt; Fibre repaired.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Sat 29 Jan 2022 11:22:27 EST&amp;lt;/b&amp;gt; Fibre repair is underway.  Expect to have connectivity restored later today.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Fri 28 Jan 2022 07:35:01 EST&amp;lt;/b&amp;gt; The fibre optics cable that connects the SciNet datacentre was severed by uncoordinated digging at York University.  We expect repairs to happen as soon as possible.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Thu Jan 27 2022 12:46 PM EST&amp;lt;/b&amp;gt; Network issues to and from the datacentre. We are investigating.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  When removing system status entries, please archive them to: https://docs.scinet.utoronto.ca/index.php/Previous_messages --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;border-spacing: 10px;width: 100%&amp;quot;&lt;br /&gt;
|valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== QuickStart Guides ==&lt;br /&gt;
* [[Niagara Quickstart]]&lt;br /&gt;
* [[HPSS | HPSS archival storage]]&lt;br /&gt;
* [[Mist| Mist Power 9 GPU cluster]]&lt;br /&gt;
* [[Teach|Teach cluster]]&lt;br /&gt;
* [[FAQ | FAQ (frequently asked questions)]]&lt;br /&gt;
* [[Acknowledging SciNet]]&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== Tutorials, Manuals, etc. ==&lt;br /&gt;
* [https://education.scinet.utoronto.ca SciNet education material]&lt;br /&gt;
* [https://www.youtube.com/c/SciNetHPCattheUniversityofToronto SciNet's YouTube channel]&lt;br /&gt;
* [[Modules specific to Niagara|Software Modules specific to Niagara]] &lt;br /&gt;
* [[Modules for Mist]] &lt;br /&gt;
* [[Commercial software]]&lt;br /&gt;
* [[Burst Buffer]]&lt;br /&gt;
* [[SSH#SSH Keys|SSH keys]]&lt;br /&gt;
* [[SSH Tunneling]]&lt;br /&gt;
* [[SSH#Two-Factor_authentication|Two-Factor Authentication]]&lt;br /&gt;
* [[Visualization]]&lt;br /&gt;
* [[Running Serial Jobs on Niagara]]&lt;br /&gt;
* [[Jupyter Hub]]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Ymeiron</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Database_server&amp;diff=3416</id>
		<title>Database server</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Database_server&amp;diff=3416"/>
		<updated>2022-01-10T15:22:37Z</updated>

		<summary type="html">&lt;p&gt;Ymeiron: Created page&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;A '''PostgreSQL server''' (version 14.1) is available for users whose research requires database access. To gain access, please contact [mailto:support@scinet.utoronto.ca support] with a brief description of your needs. Users should specify whether they need their own database created, or just a user (role) on the database server (e.g. when multiple users jointly access one database). By default, a database created for a user is named &amp;lt;span style=&amp;quot;color: gray; font-style: italic;&amp;quot;&amp;gt;username&amp;lt;/span&amp;gt;_db0, with the last digit incremented if more than one database is needed. The user who requests the database owns it and can grant privileges to other users as needed.&lt;br /&gt;
&lt;br /&gt;
The PostgreSQL server is running on &amp;lt;code&amp;gt;idb1.scinet.local&amp;lt;/code&amp;gt;, port 5432. SSH access to the node is not possible; only SQL connections from compatible clients are accepted. The node is only accessible internally on the SciNet network (i.e. from Niagara nodes); to access the server from outside, an [[SSH Tunneling|SSH tunnel]] has to be established through a Niagara login node.&lt;br /&gt;
&lt;br /&gt;
== Usage ==&lt;br /&gt;
The ''psql'' client program is available in the &amp;lt;code&amp;gt;postgresql&amp;lt;/code&amp;gt; module installed in &amp;lt;code&amp;gt;NiaEnv/2019b&amp;lt;/code&amp;gt;. Users can install the ''psycopg2'' package in their Python virtual environment. Querying the database (even complex queries) can be done from a Niagara login node, but compute-intensive post-processing of query results has to be done on a compute node and in parallel.&lt;br /&gt;
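&lt;br /&gt;
As a rough illustration of the outside-access route described above, the following Python sketch only builds the SSH tunnel command and a libpq-style connection URI; it does not open any connection. The login host name &amp;lt;code&amp;gt;niagara.scinet.utoronto.ca&amp;lt;/code&amp;gt;, the local port 15432, the user name &amp;lt;code&amp;gt;alice&amp;lt;/code&amp;gt;, and the assumption that the database role has the same name as the SciNet user are all illustrative choices, not confirmed settings.&lt;br /&gt;

```python
import shlex

# Server address as given on this page.
DB_HOST = "idb1.scinet.local"
DB_PORT = 5432


def tunnel_command(username, local_port=5432,
                   login_host="niagara.scinet.utoronto.ca"):
    """Return the ssh argument list that forwards local_port to the database.

    The login host is an assumption for illustration.
    """
    forward = f"{local_port}:{DB_HOST}:{DB_PORT}"
    # -N: run no remote command, -L: local port forwarding
    return ["ssh", "-N", "-L", forward, f"{username}@{login_host}"]


def connection_uri(username, local_port=5432, database=None):
    """Return a libpq connection URI that goes through the local tunnel end.

    Assumes the role name equals the user name (hypothetical) and uses the
    default database name pattern username_db0 described above.
    """
    database = database or f"{username}_db0"
    return f"postgresql://{username}@localhost:{local_port}/{database}"


if __name__ == "__main__":
    # Print the command to run in one terminal and the URI to use in another.
    print(shlex.join(tunnel_command("alice", 15432)))
    print(connection_uri("alice", 15432))
```

With the tunnel running, the printed URI could be passed to ''psql'' or to &amp;lt;code&amp;gt;psycopg2.connect&amp;lt;/code&amp;gt; on the local machine; from a Niagara node no tunnel is needed and the server address can be used directly.&lt;br /&gt;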
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Storage and backup ==&lt;br /&gt;
The database server's storage is local (i.e. on its own disks), separate from the file system shared by the other SciNet systems (GPFS). The database server data are backed up nightly as a whole, and only a few snapshots are kept (depending on storage availability). This means that if a database object is unintentionally altered or dropped, the unintended change will eventually propagate into the backups, even if no further changes are made. Time is therefore of the essence: contact [mailto:support@scinet.utoronto.ca support] immediately to have data extracted from a snapshot.&lt;br /&gt;
&lt;br /&gt;
== Quota ==&lt;br /&gt;
Database storage does not count toward a user's quota on SciNet filesystems, and the database server has no quota imposed at this time. We ask that users inform us if their database storage requirement is expected to exceed 1 TB.&lt;/div&gt;</summary>
		<author><name>Ymeiron</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=3312</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=3312"/>
		<updated>2021-11-05T23:44:37Z</updated>

		<summary type="html">&lt;p&gt;Ymeiron: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
{| style=&amp;quot;border-spacing:10px; width: 95%&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:1em; padding-top:.1em; border:2px solid #0645ad; background-color:#f6f6f6; border-radius:7px&amp;quot;|&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Use &amp;quot;Up&amp;quot; or &amp;quot;Down&amp;quot;; these are templates. --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;width:100%&amp;quot; &lt;br /&gt;
|{{Up |Niagara|Niagara_Quickstart}}&lt;br /&gt;
|{{Up |Mist|Mist}}&lt;br /&gt;
|{{Up |Teach|Teach}}&lt;br /&gt;
|{{Up |Rouge|Rouge}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up |Jupyter Hub|Jupyter_Hub}}&lt;br /&gt;
|{{Up |Scheduler|Niagara_Quickstart#Submitting_jobs}}&lt;br /&gt;
|{{Up |File system|Niagara_Quickstart#Storage_and_quotas}}&lt;br /&gt;
|{{Up |Burst Buffer|Burst_Buffer}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up|HPSS|HPSS}}&lt;br /&gt;
|{{Up |Login Nodes|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up |External Network|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up |Globus |Globus}}&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Current Messages: --&amp;gt;&lt;br /&gt;
&amp;lt;b&amp;gt;Tue Oct 12 19:35 EDT 2021 &amp;lt;/b&amp;gt; The filesystem issue from earlier in the afternoon is resolved.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Tue Oct 12 16:58 EDT 2021 &amp;lt;/b&amp;gt; We are experiencing filesystem issues; logging in to the clusters may not be possible until they are resolved.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Tue Oct 19 noon EDT - Thu Oct 21 noon EDT:&amp;lt;/b&amp;gt; &amp;lt;b&amp;gt;&amp;lt;i&amp;gt;Niagara at Scale:&amp;lt;/i&amp;gt;&amp;lt;/b&amp;gt; Only users of selected projects can run jobs (at large scale) during these 48 hours. Other users can still log in, access their files, and submit jobs to run after the event.  SOSCIP and Mist users are not affected.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Tue Oct 12 14:30 EDT 2021 &amp;lt;/b&amp;gt; Mist login node is back up.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Tue Oct 12 12:30 EDT 2021 &amp;lt;/b&amp;gt; Mist login node is down for maintenance.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Mon Sep 27 16:11 EDT 2021 &amp;lt;/b&amp;gt; HPSS is back online.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Wed Sep 23 17:23 EDT 2021 &amp;lt;/b&amp;gt; Systems being brought back online. HPSS may be down for some more days.  &lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  When removing system status entries, please archive them to: https://docs.scinet.utoronto.ca/index.php/Previous_messages --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;border-spacing: 10px;width: 100%&amp;quot;&lt;br /&gt;
|valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== QuickStart Guides ==&lt;br /&gt;
* [[Niagara Quickstart]]&lt;br /&gt;
* [[HPSS | HPSS archival storage]]&lt;br /&gt;
* [[Mist| Mist Power 9 GPU cluster]]&lt;br /&gt;
* [[Teach|Teach cluster]]&lt;br /&gt;
* [[FAQ | FAQ (frequently asked questions)]]&lt;br /&gt;
* [[Acknowledging SciNet]]&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== Tutorials, Manuals, etc. ==&lt;br /&gt;
* [https://education.scinet.utoronto.ca SciNet education material]&lt;br /&gt;
* [https://www.youtube.com/c/SciNetHPCattheUniversityofToronto SciNet's YouTube channel]&lt;br /&gt;
* [[Modules specific to Niagara|Software Modules specific to Niagara]] &lt;br /&gt;
* [[Modules for Mist]] &lt;br /&gt;
* [[Commercial software]]&lt;br /&gt;
* [[Burst Buffer]]&lt;br /&gt;
* [[SSH keys]]&lt;br /&gt;
* [[SSH Tunneling]]&lt;br /&gt;
* [[SSH#Two-Factor_authentication|Two-Factor Authentication]]&lt;br /&gt;
* [[Visualization]]&lt;br /&gt;
* [[Running Serial Jobs on Niagara]]&lt;br /&gt;
* [[Jupyter Hub]]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Ymeiron</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=3311</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=3311"/>
		<updated>2021-11-05T22:17:02Z</updated>

		<summary type="html">&lt;p&gt;Ymeiron: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
{| style=&amp;quot;border-spacing:10px; width: 95%&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:1em; padding-top:.1em; border:2px solid #0645ad; background-color:#f6f6f6; border-radius:7px&amp;quot;|&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Use &amp;quot;Up&amp;quot; or &amp;quot;Down&amp;quot;; these are templates. --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;width:100%&amp;quot; &lt;br /&gt;
|{{Up |Niagara|Niagara_Quickstart}}&lt;br /&gt;
|{{Up |Mist|Mist}}&lt;br /&gt;
|{{Up |Teach|Teach}}&lt;br /&gt;
|{{Up |Rouge|Rouge}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up |Jupyter Hub|Jupyter_Hub}}&lt;br /&gt;
|{{Up |Scheduler|Niagara_Quickstart#Submitting_jobs}}&lt;br /&gt;
|{{Down |File system|Niagara_Quickstart#Storage_and_quotas}}&lt;br /&gt;
|{{Up |Burst Buffer|Burst_Buffer}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up|HPSS|HPSS}}&lt;br /&gt;
|{{Up |Login Nodes|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up |External Network|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up |Globus |Globus}}&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Current Messages: --&amp;gt;&lt;br /&gt;
&amp;lt;b&amp;gt;Tue Oct 12 16:58 EDT 2021 &amp;lt;/b&amp;gt; We are experiencing filesystem issues; logging in to the clusters may not be possible until they are resolved.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Tue Oct 19 noon EDT - Thu Oct 21 noon EDT:&amp;lt;/b&amp;gt; &amp;lt;b&amp;gt;&amp;lt;i&amp;gt;Niagara at Scale:&amp;lt;/i&amp;gt;&amp;lt;/b&amp;gt; Only users of selected projects can run jobs (at large scale) during these 48 hours. Other users can still log in, access their files, and submit jobs to run after the event.  SOSCIP and Mist users are not affected.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Tue Oct 12 14:30 EDT 2021 &amp;lt;/b&amp;gt; Mist login node is back up.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Tue Oct 12 12:30 EDT 2021 &amp;lt;/b&amp;gt; Mist login node is down for maintenance.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Mon Sep 27 16:11 EDT 2021 &amp;lt;/b&amp;gt; HPSS is back online.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Wed Sep 23 17:23 EDT 2021 &amp;lt;/b&amp;gt; Systems being brought back online. HPSS may be down for some more days.  &lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  When removing system status entries, please archive them to: https://docs.scinet.utoronto.ca/index.php/Previous_messages --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;border-spacing: 10px;width: 100%&amp;quot;&lt;br /&gt;
|valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== QuickStart Guides ==&lt;br /&gt;
* [[Niagara Quickstart]]&lt;br /&gt;
* [[HPSS | HPSS archival storage]]&lt;br /&gt;
* [[Mist| Mist Power 9 GPU cluster]]&lt;br /&gt;
* [[Teach|Teach cluster]]&lt;br /&gt;
* [[FAQ | FAQ (frequently asked questions)]]&lt;br /&gt;
* [[Acknowledging SciNet]]&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== Tutorials, Manuals, etc. ==&lt;br /&gt;
* [https://education.scinet.utoronto.ca SciNet education material]&lt;br /&gt;
* [https://www.youtube.com/c/SciNetHPCattheUniversityofToronto SciNet's YouTube channel]&lt;br /&gt;
* [[Modules specific to Niagara|Software Modules specific to Niagara]] &lt;br /&gt;
* [[Modules for Mist]] &lt;br /&gt;
* [[Commercial software]]&lt;br /&gt;
* [[Burst Buffer]]&lt;br /&gt;
* [[SSH keys]]&lt;br /&gt;
* [[SSH Tunneling]]&lt;br /&gt;
* [[SSH#Two-Factor_authentication|Two-Factor Authentication]]&lt;br /&gt;
* [[Visualization]]&lt;br /&gt;
* [[Running Serial Jobs on Niagara]]&lt;br /&gt;
* [[Jupyter Hub]]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Ymeiron</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=3298</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=3298"/>
		<updated>2021-10-12T18:34:23Z</updated>

		<summary type="html">&lt;p&gt;Ymeiron: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
{| style=&amp;quot;border-spacing:10px; width: 95%&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:1em; padding-top:.1em; border:2px solid #0645ad; background-color:#f6f6f6; border-radius:7px&amp;quot;|&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Use &amp;quot;Up&amp;quot; or &amp;quot;Down&amp;quot;; these are templates. --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;width:100%&amp;quot; &lt;br /&gt;
|{{Up |Niagara|Niagara_Quickstart}}&lt;br /&gt;
|{{Up |Mist|Mist}}&lt;br /&gt;
|{{Up |Teach|Teach}}&lt;br /&gt;
|{{Up |Rouge|Rouge}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up |Jupyter Hub|Jupyter_Hub}}&lt;br /&gt;
|{{Up |Scheduler|Niagara_Quickstart#Submitting_jobs}}&lt;br /&gt;
|{{Up |File system|Niagara_Quickstart#Storage_and_quotas}}&lt;br /&gt;
|{{Up |Burst Buffer|Burst_Buffer}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up|HPSS|HPSS}}&lt;br /&gt;
|{{Up |Login Nodes|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up |External Network|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up |Globus |Globus}}&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Current Messages: --&amp;gt;&lt;br /&gt;
&amp;lt;b&amp;gt;Tue Oct 12 14:30 EDT 2021 &amp;lt;/b&amp;gt; Mist login node is back up.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Tue Oct 12 12:30 EDT 2021 &amp;lt;/b&amp;gt; Mist login node is down for maintenance.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Mon Sep 27 16:11 EDT 2021 &amp;lt;/b&amp;gt; HPSS is back online.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Wed Sep 23 17:23 EDT 2021 &amp;lt;/b&amp;gt; Systems being brought back online. HPSS may be down for some more days.  &lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  When removing system status entries, please archive them to: https://docs.scinet.utoronto.ca/index.php/Previous_messages --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;border-spacing: 10px;width: 100%&amp;quot;&lt;br /&gt;
|valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== QuickStart Guides ==&lt;br /&gt;
* [[Niagara Quickstart]]&lt;br /&gt;
* [[HPSS | HPSS archival storage]]&lt;br /&gt;
* [[Mist| Mist Power 9 GPU cluster]]&lt;br /&gt;
* [[Teach|Teach cluster]]&lt;br /&gt;
* [[FAQ | FAQ (frequently asked questions)]]&lt;br /&gt;
* [[Acknowledging SciNet]]&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== Tutorials, Manuals, etc. ==&lt;br /&gt;
* [https://education.scinet.utoronto.ca SciNet education material]&lt;br /&gt;
* [https://www.youtube.com/c/SciNetHPCattheUniversityofToronto SciNet's YouTube channel]&lt;br /&gt;
* [[Modules specific to Niagara|Software Modules specific to Niagara]] &lt;br /&gt;
* [[Modules for Mist]] &lt;br /&gt;
* [[Commercial software]]&lt;br /&gt;
* [[Burst Buffer]]&lt;br /&gt;
* [[SSH keys]]&lt;br /&gt;
* [[SSH Tunneling]]&lt;br /&gt;
* [[SSH#Two-Factor_authentication|Two-Factor Authentication]]&lt;br /&gt;
* [[Visualization]]&lt;br /&gt;
* [[Running Serial Jobs on Niagara]]&lt;br /&gt;
* [[Jupyter Hub]]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Ymeiron</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=3297</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=3297"/>
		<updated>2021-10-12T17:58:48Z</updated>

		<summary type="html">&lt;p&gt;Ymeiron: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
{| style=&amp;quot;border-spacing:10px; width: 95%&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:1em; padding-top:.1em; border:2px solid #0645ad; background-color:#f6f6f6; border-radius:7px&amp;quot;|&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Use &amp;quot;Up&amp;quot; or &amp;quot;Down&amp;quot;; these are templates. --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;width:100%&amp;quot; &lt;br /&gt;
|{{Up |Niagara|Niagara_Quickstart}}&lt;br /&gt;
|{{Down |Mist|Mist}}&lt;br /&gt;
|{{Up |Teach|Teach}}&lt;br /&gt;
|{{Up |Rouge|Rouge}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up |Jupyter Hub|Jupyter_Hub}}&lt;br /&gt;
|{{Up |Scheduler|Niagara_Quickstart#Submitting_jobs}}&lt;br /&gt;
|{{Up |File system|Niagara_Quickstart#Storage_and_quotas}}&lt;br /&gt;
|{{Up |Burst Buffer|Burst_Buffer}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up|HPSS|HPSS}}&lt;br /&gt;
|{{Up |Login Nodes|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up |External Network|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up |Globus |Globus}}&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Current Messages: --&amp;gt;&lt;br /&gt;
&amp;lt;b&amp;gt;Tue Oct 12 12:30 EDT 2021 &amp;lt;/b&amp;gt; Mist login node is down for maintenance.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Mon Sep 27 16:11 EDT 2021 &amp;lt;/b&amp;gt; HPSS is back online.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Wed Sep 23 17:23 EDT 2021 &amp;lt;/b&amp;gt; Systems being brought back online. HPSS may be down for some more days.  &lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  When removing system status entries, please archive them to: https://docs.scinet.utoronto.ca/index.php/Previous_messages --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;border-spacing: 10px;width: 100%&amp;quot;&lt;br /&gt;
|valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== QuickStart Guides ==&lt;br /&gt;
* [[Niagara Quickstart]]&lt;br /&gt;
* [[HPSS | HPSS archival storage]]&lt;br /&gt;
* [[Mist| Mist Power 9 GPU cluster]]&lt;br /&gt;
* [[Teach|Teach cluster]]&lt;br /&gt;
* [[FAQ | FAQ (frequently asked questions)]]&lt;br /&gt;
* [[Acknowledging SciNet]]&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== Tutorials, Manuals, etc. ==&lt;br /&gt;
* [https://education.scinet.utoronto.ca SciNet education material]&lt;br /&gt;
* [https://www.youtube.com/c/SciNetHPCattheUniversityofToronto SciNet's YouTube channel]&lt;br /&gt;
* [[Modules specific to Niagara|Software Modules specific to Niagara]] &lt;br /&gt;
* [[Modules for Mist]] &lt;br /&gt;
* [[Commercial software]]&lt;br /&gt;
* [[Burst Buffer]]&lt;br /&gt;
* [[SSH keys]]&lt;br /&gt;
* [[SSH Tunneling]]&lt;br /&gt;
* [[SSH#Two-Factor_authentication|Two-Factor Authentication]]&lt;br /&gt;
* [[Visualization]]&lt;br /&gt;
* [[Running Serial Jobs on Niagara]]&lt;br /&gt;
* [[Jupyter Hub]]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Ymeiron</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=3215</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=3215"/>
		<updated>2021-09-15T21:03:12Z</updated>

		<summary type="html">&lt;p&gt;Ymeiron: Reverted edits by Ymeiron (talk) to last revision by Nolta&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
{| style=&amp;quot;border-spacing:10px; width: 95%&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:1em; padding-top:.1em; border:2px solid #0645ad; background-color:#f6f6f6; border-radius:7px&amp;quot;|&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Use &amp;quot;Up&amp;quot; or &amp;quot;Down&amp;quot;; these are templates. --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;width:100%&amp;quot; &lt;br /&gt;
|{{Up |Niagara|Niagara_Quickstart}}&lt;br /&gt;
|{{Up |Mist|Mist}}&lt;br /&gt;
|{{Up |Teach|Teach}}&lt;br /&gt;
|{{Up |Rouge|Rouge}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up |Jupyter Hub|Jupyter_Hub}}&lt;br /&gt;
|{{up |Scheduler|Niagara_Quickstart#Submitting_jobs}}&lt;br /&gt;
|{{Down |File system|Niagara_Quickstart#Storage_and_quotas}}&lt;br /&gt;
|{{Up |Burst Buffer|Burst_Buffer}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up |HPSS|HPSS}}&lt;br /&gt;
|{{Up |Login Nodes|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up |External Network|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up|Globus|Globus}}&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Current Messages: --&amp;gt;&lt;br /&gt;
&amp;lt;b&amp;gt;Wed Sep 15 16:50 2021&amp;lt;/b&amp;gt;: filesystem issues&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Mon Sep 13 13:15:07 EDT 2021&amp;lt;/b&amp;gt; HPSS is back online.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Fri Sep 10 17:57:23 EDT 2021&amp;lt;/b&amp;gt; HPSS is offline due to unscheduled maintenance.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Wed Aug 18 16:13:42 EDT 2021&amp;lt;/b&amp;gt; The HPSS upgrade is complete.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;HPSS Downtime August 17th and 18th, 2021 (Tuesday and Wednesday):&amp;lt;/b&amp;gt; We'll be upgrading the HPSS software to version 8.3, along with all the clients (htar/hsi, vfs and Globus/dsi)&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  When removing system status entries, please archive them to: https://docs.scinet.utoronto.ca/index.php/Previous_messages --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;border-spacing: 10px;width: 100%&amp;quot;&lt;br /&gt;
|valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== QuickStart Guides ==&lt;br /&gt;
* [[Niagara Quickstart]]&lt;br /&gt;
* [[HPSS | HPSS archival storage]]&lt;br /&gt;
* [[Mist| Mist Power 9 GPU cluster]]&lt;br /&gt;
* [[Teach|Teach cluster]]&lt;br /&gt;
* [[FAQ | FAQ (frequently asked questions)]]&lt;br /&gt;
* [[Acknowledging SciNet]]&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== Tutorials, Manuals, etc. ==&lt;br /&gt;
* [https://support.scinet.utoronto.ca/education/browse.php SciNet education material]&lt;br /&gt;
* [https://www.youtube.com/c/SciNetHPCattheUniversityofToronto SciNet's YouTube channel]&lt;br /&gt;
* [[Modules specific to Niagara|Software Modules specific to Niagara]] &lt;br /&gt;
* [[Modules for Mist]] &lt;br /&gt;
* [[Commercial software]]&lt;br /&gt;
* [[Burst Buffer]]&lt;br /&gt;
* [[SSH Tunneling]]&lt;br /&gt;
* [[SSH#Two-Factor_authentication|Two-Factor Authentication]]&lt;br /&gt;
* [[Visualization]]&lt;br /&gt;
* [[Running Serial Jobs on Niagara]]&lt;br /&gt;
* [[Jupyter Hub]]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Ymeiron</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=3214</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=3214"/>
		<updated>2021-09-15T21:02:20Z</updated>

		<summary type="html">&lt;p&gt;Ymeiron: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
{| style=&amp;quot;border-spacing:10px; width: 95%&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:1em; padding-top:.1em; border:2px solid #0645ad; background-color:#f6f6f6; border-radius:7px&amp;quot;|&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Use &amp;quot;Up&amp;quot; or &amp;quot;Down&amp;quot;; these are templates. --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;width:100%&amp;quot; &lt;br /&gt;
|{{Down |Niagara|Niagara_Quickstart}}&lt;br /&gt;
|{{Down |Mist|Mist}}&lt;br /&gt;
|{{Down |Teach|Teach}}&lt;br /&gt;
|{{Down |Rouge|Rouge}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Down |Jupyter Hub|Jupyter_Hub}}&lt;br /&gt;
|{{up |Scheduler|Niagara_Quickstart#Submitting_jobs}}&lt;br /&gt;
|{{Down |File system|Niagara_Quickstart#Storage_and_quotas}}&lt;br /&gt;
|{{Up |Burst Buffer|Burst_Buffer}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up |HPSS|HPSS}}&lt;br /&gt;
|{{Down |Login Nodes|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Down |External Network|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up|Globus|Globus}}&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Current Messages: --&amp;gt;&lt;br /&gt;
&amp;lt;b&amp;gt;Wed Sep 15 16:50 2021&amp;lt;/b&amp;gt;: filesystem issues&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Wed Sep 15 17:02:00 EDT 2021&amp;lt;/b&amp;gt; We are experiencing filesystem issues that affect access to the clusters.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Mon Sep 13 13:15:07 EDT 2021&amp;lt;/b&amp;gt; HPSS is back online.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Fri Sep 10 17:57:23 EDT 2021&amp;lt;/b&amp;gt; HPSS is offline due to unscheduled maintenance.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Wed Aug 18 16:13:42 EDT 2021&amp;lt;/b&amp;gt; The HPSS upgrade is complete.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;HPSS Downtime August 17th and 18th, 2021 (Tuesday and Wednesday):&amp;lt;/b&amp;gt; We'll be upgrading the HPSS software to version 8.3, along with all the clients (htar/hsi, vfs and Globus/dsi)&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  When removing system status entries, please archive them to: https://docs.scinet.utoronto.ca/index.php/Previous_messages --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;border-spacing: 10px;width: 100%&amp;quot;&lt;br /&gt;
|valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== QuickStart Guides ==&lt;br /&gt;
* [[Niagara Quickstart]]&lt;br /&gt;
* [[HPSS | HPSS archival storage]]&lt;br /&gt;
* [[Mist| Mist Power 9 GPU cluster]]&lt;br /&gt;
* [[Teach|Teach cluster]]&lt;br /&gt;
* [[FAQ | FAQ (frequently asked questions)]]&lt;br /&gt;
* [[Acknowledging SciNet]]&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== Tutorials, Manuals, etc. ==&lt;br /&gt;
* [https://support.scinet.utoronto.ca/education/browse.php SciNet education material]&lt;br /&gt;
* [https://www.youtube.com/c/SciNetHPCattheUniversityofToronto SciNet's YouTube channel]&lt;br /&gt;
* [[Modules specific to Niagara|Software Modules specific to Niagara]] &lt;br /&gt;
* [[Modules for Mist]] &lt;br /&gt;
* [[Commercial software]]&lt;br /&gt;
* [[Burst Buffer]]&lt;br /&gt;
* [[SSH Tunneling]]&lt;br /&gt;
* [[SSH#Two-Factor_authentication|Two-Factor Authentication]]&lt;br /&gt;
* [[Visualization]]&lt;br /&gt;
* [[Running Serial Jobs on Niagara]]&lt;br /&gt;
* [[Jupyter Hub]]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Ymeiron</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Mist&amp;diff=3198</id>
		<title>Mist</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Mist&amp;diff=3198"/>
		<updated>2021-08-23T19:25:49Z</updated>

		<summary type="html">&lt;p&gt;Ymeiron: Open-CE 1.3 with CUDA 11.2&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox Computer&lt;br /&gt;
|image=[[Image:Mist.jpg|center|300px|thumb]]&lt;br /&gt;
|name=Mist&lt;br /&gt;
|installed=Dec 2019&lt;br /&gt;
|operatingsystem= Red Hat Enterprise Linux 8.2&lt;br /&gt;
|loginnode= mist.scinet.utoronto.ca&lt;br /&gt;
|nnodes=  54 IBM AC922&lt;br /&gt;
|rampernode= 256 GB  &lt;br /&gt;
|gpuspernode=4 V100-SXM2-32GB&lt;br /&gt;
|interconnect=Mellanox EDR&lt;br /&gt;
|vendorcompilers= NVCC, IBM XL&lt;br /&gt;
|queuetype=Slurm&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
=Specifications=&lt;br /&gt;
Mist is a SciNet-[[#SOSCIP Users |SOSCIP]] joint GPU cluster consisting of 54 IBM AC922 servers. Each node of the cluster has 32 IBM Power9 cores, 256 GB of RAM, and 4 NVIDIA V100-SXM2-32GB GPUs connected by NVLink. The cluster has an InfiniBand EDR interconnect providing GPUDirect RDMA capability.&lt;br /&gt;
&lt;br /&gt;
'''&amp;lt;span style=&amp;quot;background:#fc8383&amp;quot;&amp;gt;Important note:&amp;lt;/span&amp;gt;''' As of 2021, the majority of computer systems (laptops, desktops, and HPC) use microprocessors produced by Intel and AMD that implement the 64-bit x86 instruction set architecture (ISA). This ISA is incompatible with Mist, whose hardware uses the 64-bit PPC ISA (set to little-endian mode). In practice, this means that x86-compiled binaries (executables and libraries) cannot be used on Mist. For this reason, the Niagara and Compute Canada software stacks (modules) cannot be made available on Mist, and using closed-source software is only possible when the vendor provides a compatible version of their application. '''Python applications''' almost always rely on bindings to libraries originally written in C or C++, and some of these are not available on PyPI or the various Conda channels as precompiled binaries compatible with Mist. The recommended way to use Python on Mist is to create a [[#Anaconda (Python)|Conda]] environment and install packages from the anaconda (default) channel, where most popular packages have a linux-ppc64le (Mist-compatible) version available. Some popular machine learning packages should be installed from the internal [[#Open-CE|Open-CE]] channel. Where a compatible Conda package cannot be found, installing from PyPI (&amp;lt;code&amp;gt;pip install&amp;lt;/code&amp;gt;) can be attempted. Pip will attempt to compile the package’s source code if no compatible precompiled wheel is available, so a compiler module (such as &amp;lt;code&amp;gt;gcc/.core&amp;lt;/code&amp;gt;) should be loaded in advance. Some packages require tweaking of the source code or build procedure to compile successfully on Mist; please contact [[#Support|support]] if you need assistance.&lt;br /&gt;
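&lt;br /&gt;
To make the point about precompiled wheels concrete, here is a minimal Python sketch that inspects only the platform tag at the end of a wheel file name, following the standard wheel naming scheme. The function name and the example file names are hypothetical, and the check deliberately ignores the Python-version and ABI tags, so it is a rough first filter rather than what pip actually does.&lt;br /&gt;

```python
def wheel_may_run_on_mist(filename):
    """Return True if the wheel's platform tag is ppc64le or pure Python.

    Wheel names end in ...-{python tag}-{abi tag}-{platform tag}.whl, and the
    platform tag may hold several dot-separated alternatives.
    Only the platform tag is examined here (a simplification).
    """
    suffix = ".whl"
    if not filename.endswith(suffix):
        return False
    platform_tag = filename[:-len(suffix)].split("-")[-1]
    return any(
        tag == "any" or tag.endswith("_ppc64le")
        for tag in platform_tag.split(".")
    )


if __name__ == "__main__":
    # Hypothetical file names, used only to exercise the function.
    for name in (
        "numpy-1.21.2-cp38-cp38-manylinux2014_ppc64le.whl",
        "numpy-1.21.2-cp38-cp38-manylinux2014_x86_64.whl",
        "six-1.16.0-py2.py3-none-any.whl",
    ):
        print(name, wheel_may_run_on_mist(name))
```

A wheel that fails this check is one for which pip would fall back to building from source, which is when the compiler module mentioned above becomes necessary.&lt;br /&gt;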
&lt;br /&gt;
= Getting started on Mist =&lt;br /&gt;
Mist can be accessed directly via SSH:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -Y MYCCUSERNAME@mist.scinet.utoronto.ca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The Mist login node '''mist-login01''' can also be accessed via the Niagara cluster:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -Y MYCCUSERNAME@niagara.scinet.utoronto.ca&lt;br /&gt;
ssh -Y mist-login01&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
== Storage ==&lt;br /&gt;
The filesystem for Mist is shared with the Niagara cluster. See [https://docs.scinet.utoronto.ca/index.php/Niagara_Quickstart#Your_various_directories Niagara Storage] for more details.&lt;br /&gt;
&lt;br /&gt;
= Loading software modules =&lt;br /&gt;
&lt;br /&gt;
You have two options for running code on Mist: use existing software, or compile your own.  This section focuses on the former.&lt;br /&gt;
&lt;br /&gt;
Other than essentials, all installed software is made available [[Using_modules | using module commands]]. These modules set environment variables (PATH, etc.), allowing multiple, conflicting versions of a given package to be available.  A detailed explanation of the module system can be [[Using_modules | found on the modules page]] and a list of [[Modules for Mist]] is also available.&lt;br /&gt;
&lt;br /&gt;
Common module subcommands are:&lt;br /&gt;
&lt;br /&gt;
* &amp;lt;code&amp;gt;module load &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt;: load the default version of a particular software.&lt;br /&gt;
* &amp;lt;code&amp;gt;module load &amp;lt;module-name&amp;gt;/&amp;lt;module-version&amp;gt;&amp;lt;/code&amp;gt;: load a specific version of a particular software.&lt;br /&gt;
* &amp;lt;code&amp;gt;module purge&amp;lt;/code&amp;gt;: unload all currently loaded modules.&lt;br /&gt;
* &amp;lt;code&amp;gt;module spider&amp;lt;/code&amp;gt; (or &amp;lt;code&amp;gt;module spider &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt;): list available software packages.&lt;br /&gt;
* &amp;lt;code&amp;gt;module avail&amp;lt;/code&amp;gt;: list loadable software packages.&lt;br /&gt;
* &amp;lt;code&amp;gt;module list&amp;lt;/code&amp;gt;: list loaded modules.&lt;br /&gt;
&lt;br /&gt;
Along with modifying common environment variables such as PATH and LD_LIBRARY_PATH, these modules also create a SCINET_MODULENAME_ROOT environment variable, which can be used to access commonly needed software directories, such as /include and /lib.&lt;br /&gt;
&lt;br /&gt;
There are handy abbreviations for the module commands. &amp;lt;code&amp;gt;ml&amp;lt;/code&amp;gt; is the same as &amp;lt;code&amp;gt;module list&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;ml &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt; is the same as &amp;lt;code&amp;gt;module load &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt;.&lt;br /&gt;
== Tips for loading software ==&lt;br /&gt;
&lt;br /&gt;
* We advise '''''against''''' loading modules in your .bashrc.  This can lead to very confusing behaviour under certain circumstances.  Our guidelines for .bashrc files can be found [[bashrc guidelines|here]].&lt;br /&gt;
* Instead, load modules by hand when needed, or by sourcing a separate script.&lt;br /&gt;
* Load run-specific modules inside your job submission script.&lt;br /&gt;
* Short names give default versions; e.g. &amp;lt;code&amp;gt;cuda&amp;lt;/code&amp;gt; → &amp;lt;code&amp;gt;cuda/11.0.3&amp;lt;/code&amp;gt;. It is usually better to be explicit about the versions, for future reproducibility.&lt;br /&gt;
* Modules often require other modules to be loaded first.  Solve these dependencies by using [[Using_modules#Module_spider | &amp;lt;code&amp;gt;module spider&amp;lt;/code&amp;gt;]].&lt;br /&gt;
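&lt;br /&gt;
The advice on explicit versions can be checked mechanically. The following is a minimal sketch, not a SciNet tool: the function name &amp;lt;code&amp;gt;unversioned_modules&amp;lt;/code&amp;gt; is hypothetical, and it assumes that each &amp;lt;code&amp;gt;module load&amp;lt;/code&amp;gt; command sits on a single line of the job script. It prints every module that is loaded without a version.&lt;br /&gt;

```shell
#!/bin/bash
# Print every module named on a "module load" line that lacks an explicit version.
# Hypothetical helper; the first argument is the path of the job script to check.
unversioned_modules() {
    local script="$1" line word
    grep -E '^[[:space:]]*module[[:space:]]+load[[:space:]]' "$script" | while read -r line; do
        for word in $line; do
            case "$word" in
                module|load) ;;      # skip the command words themselves
                "#"*) break ;;       # ignore a trailing comment on the line
                */*) ;;              # name/version: explicit, nothing to report
                *) echo "$word" ;;   # bare name: the default version would be loaded
            esac
        done
    done
}
```

Calling &amp;lt;code&amp;gt;unversioned_modules jobscript.sh&amp;lt;/code&amp;gt; before submission lists the modules whose versions should be pinned; no output means every module is versioned.&lt;br /&gt;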
&lt;br /&gt;
= Available compilers and interpreters =&lt;br /&gt;
* The &amp;lt;tt&amp;gt;cuda&amp;lt;/tt&amp;gt; module must be loaded first for GPU software.&lt;br /&gt;
* For most compiled software, one should use the GNU compilers (&amp;lt;tt&amp;gt;gcc&amp;lt;/tt&amp;gt; for C, &amp;lt;tt&amp;gt;g++&amp;lt;/tt&amp;gt; for C++, and &amp;lt;tt&amp;gt;gfortran&amp;lt;/tt&amp;gt; for Fortran). Loading the &amp;lt;tt&amp;gt;gcc&amp;lt;/tt&amp;gt; module makes these available. &lt;br /&gt;
* The IBM XL compiler suite (&amp;lt;tt&amp;gt;xlc_r, xlc++_r, xlf_r&amp;lt;/tt&amp;gt;) is also available if you load one of the &amp;lt;tt&amp;gt;xl&amp;lt;/tt&amp;gt; modules.&lt;br /&gt;
* To compile MPI code, you must additionally load an &amp;lt;tt&amp;gt;openmpi&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;spectrum-mpi&amp;lt;/tt&amp;gt; module.&lt;br /&gt;
&lt;br /&gt;
=== CUDA ===&lt;br /&gt;
&lt;br /&gt;
The currently installed CUDA Toolkits are '''11.0.3''' and '''10.2.2 (10.2.89)''':&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load cuda/11.0.3&lt;br /&gt;
module load cuda/10.2.2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*A compiler (GCC, XL or NVHPC/PGI) module must be loaded in order to use CUDA to build any code.&lt;br /&gt;
The current NVIDIA driver version is 450.119.04.&lt;br /&gt;
&lt;br /&gt;
===GNU Compilers ===&lt;br /&gt;
&lt;br /&gt;
Available GCC modules are:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
gcc/9.3.0 (must load CUDA 11)&lt;br /&gt;
gcc/8.5.0 (must load CUDA 10)&lt;br /&gt;
gcc/10.3.0 (w/o CUDA)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== IBM XL Compilers ===&lt;br /&gt;
&lt;br /&gt;
To load the native IBM xlc/xlc++ and xlf (Fortran) compilers, run&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load xl/16.1.1.10&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
IBM XL Compilers are enabled for use with NVIDIA GPUs, including support for OpenMP GPU offloading and integration with NVIDIA's nvcc command to compile host-side code for the POWER9 CPU. Information about the IBM XL Compilers can be found at the following links:[https://www.ibm.com/support/knowledgecenter/SSXVZZ_16.1.1/com.ibm.compilers.linux.doc/welcome.html IBM XL C/C++], &lt;br /&gt;
[https://www.ibm.com/support/knowledgecenter/SSAT4T_16.1.1/com.ibm.compilers.linux.doc/welcome.html IBM XL Fortran]&lt;br /&gt;
&lt;br /&gt;
=== OpenMPI ===&lt;br /&gt;
The &amp;lt;tt&amp;gt;openmpi/&amp;lt;version&amp;gt;&amp;lt;/tt&amp;gt; module is available for different compilers, including GCC and XL. The &amp;lt;tt&amp;gt;spectrum-mpi/&amp;lt;version&amp;gt;&amp;lt;/tt&amp;gt; module provides IBM Spectrum MPI.&lt;br /&gt;
&lt;br /&gt;
=== NVHPC/PGI ===&lt;br /&gt;
The PGI compiler is provided as part of NVHPC (NVIDIA HPC SDK).&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load nvhpc/21.3&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Software =&lt;br /&gt;
== Amber20 ==&lt;br /&gt;
&lt;br /&gt;
Users who hold an Amber20 license can build Amber20 from its source code and run it on Mist. '''SOSCIP/SciNet does not provide an Amber license or source code.'''&lt;br /&gt;
&lt;br /&gt;
=== Building Amber20 ===&lt;br /&gt;
Modules that are needed for building Amber20:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load MistEnv/2021a cuda/10.2.2 gcc/8.5.0 anaconda3/2021.05 cmake/3.19.8&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Cmake configuration:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
cmake .. -DCMAKE_INSTALL_PREFIX=$HOME/where-amber-install -DCOMPILER=GNU -DMPI=FALSE -DCUDA=TRUE -DINSTALL_TESTS=TRUE -DDOWNLOAD_MINICONDA=FALSE -DOPENMP=TRUE -DNCCL=FALSE -DAPPLY_UPDATES=TRUE&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Running Amber20 ===&lt;br /&gt;
'''Amber20 does not scale beyond a single GPU on NVIDIA Pascal P100 and later GPUs such as the V100'''. It is strongly recommended to run Amber20 as a single-GPU job.&lt;br /&gt;
A job example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
#SBATCH --time=1:00:0&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP-project-ID&amp;gt;&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/10.2.2 gcc/8.5.0 anaconda3/2021.05&lt;br /&gt;
export PATH=$HOME/where-amber-install/bin:$PATH&lt;br /&gt;
export LD_LIBRARY_PATH=$HOME/where-amber-install/lib:$LD_LIBRARY_PATH&lt;br /&gt;
pmemd.cuda .... &amp;lt;parameters&amp;gt; ...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Anaconda (Python) ==&lt;br /&gt;
Anaconda is a popular distribution of the Python programming language. It contains several common Python libraries such as SciPy and NumPy as pre-built packages, which eases installation. Anaconda is provided as the '''anaconda3''' module.&lt;br /&gt;
&lt;br /&gt;
To set up your own Python environment, load the module and create a conda environment:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n myPythonEnv python=3.8&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*Note: By default, conda environments are located in '''$HOME/.conda/envs'''. The cache (downloaded tarballs and packages) is under '''$HOME/.conda/pkgs'''. You may run into disk quota problems if you create too many environments. To clean the conda cache, '''please run &amp;quot;conda clean -y --all&amp;quot; and &amp;quot;rm -rf $HOME/.conda/pkgs/*&amp;quot; after installing packages'''.&lt;br /&gt;
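&lt;br /&gt;
Before and after cleaning, it can help to see how much space the two directories named above actually occupy. The following is a small sketch using only the standard &amp;lt;code&amp;gt;du&amp;lt;/code&amp;gt; utility; the function name &amp;lt;code&amp;gt;conda_usage&amp;lt;/code&amp;gt; is hypothetical, and the default location is the one stated in the note above.&lt;br /&gt;

```shell
#!/bin/bash
# Report the disk usage of the conda environment and package-cache directories.
# The optional first argument overrides the default base directory $HOME/.conda.
conda_usage() {
    local base="${1:-$HOME/.conda}" dir
    for dir in "$base/envs" "$base/pkgs"; do
        if [ -d "$dir" ]; then
            du -sh "$dir"                   # human-readable total for this directory
        else
            echo "0 $dir (not present)"     # nothing has been created there yet
        fi
    done
}

conda_usage
```

If the &amp;lt;code&amp;gt;pkgs&amp;lt;/code&amp;gt; line still shows a large size after installation, the cache has not been cleaned yet.&lt;br /&gt;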
&lt;br /&gt;
Activate the conda environment before running python:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
source activate myPythonEnv&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Note that you SHOULD NOT use '''conda activate myPythonEnv''' to activate the environment.  This leads to all sorts of problems.  Once the environment is activated, you can update or install packages via '''conda''' or '''pip''':&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda install &amp;lt;package_name&amp;gt;    # preferred way to install packages&lt;br /&gt;
pip install &amp;lt;package_name&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*Once the installation finishes, please clean the cache:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf $HOME/.conda/pkgs/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To deactivate:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
source deactivate&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To remove a conda environment:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda remove --name myPythonEnv --all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To verify that the environment was removed, run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda info --envs&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Submitting Python Job ===&lt;br /&gt;
A single-gpu job example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
#SBATCH --time=1:00:0&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt; #For SOSCIP projects only&lt;br /&gt;
&lt;br /&gt;
module load anaconda3&lt;br /&gt;
source activate myPythonEnv&lt;br /&gt;
python code.py ...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== CuPy ==&lt;br /&gt;
[https://cupy.chainer.org CuPy] is an open-source matrix library accelerated with NVIDIA CUDA. It also uses CUDA-related libraries including cuBLAS, cuDNN, cuRand, cuSolver, cuSPARSE, cuFFT and NCCL to make full use of the GPU architecture. CuPy is an implementation of a NumPy-compatible multi-dimensional array on CUDA. It consists of the core multi-dimensional array class, cupy.ndarray, and many functions that operate on it. It supports a subset of the numpy.ndarray interface.&lt;br /&gt;
&lt;br /&gt;
CuPy can be installed into any conda environment. The Python packages numpy, six and fastrlock are required; cuDNN and NCCL are optional.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0  nccl/2.9.9 anaconda3/2021.05&lt;br /&gt;
conda create -n cupy-env python=3.8 numpy six fastrlock&lt;br /&gt;
source activate cupy-env&lt;br /&gt;
CFLAGS=&amp;quot;-I$MODULE_CUDNN_PREFIX/include -I$MODULE_NCCL_PREFIX/include -I$MODULE_CUDA_PREFIX/include&amp;quot; LDFLAGS=&amp;quot;-L$MODULE_CUDNN_PREFIX/lib64 -L$MODULE_NCCL_PREFIX/lib&amp;quot; CUDA_PATH=$MODULE_CUDA_PREFIX pip install cupy&lt;br /&gt;
#building/installing CuPy will take a few minutes&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Gromacs ==&lt;br /&gt;
[http://www.gromacs.org/ GROMACS] is a versatile package to perform molecular dynamics, i.e. simulate the Newtonian equations of motion for systems with hundreds to millions of particles. It is primarily designed for biochemical molecules like proteins, lipids and nucleic acids that have a lot of complicated bonded interactions, but since GROMACS is extremely fast at calculating the nonbonded interactions (that usually dominate simulations) many groups are also using it for research on non-biological systems, e.g. polymers.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load MistEnv/2021a cuda/10.2.2 gcc/8.5.0 gromacs/2019.6&lt;br /&gt;
module load MistEnv/2020a cuda/10.2.89 gcc/8.3.0 openmpi/3.1.5 gromacs/2019.6    # old RHEL 7 version, for testing only&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*The Thread-MPI builds of '''GROMACS 2020 and 2021''' support GPU offload of all key computational sections. The GPU is used throughout the timestep, and repeated CPU-GPU transfers are eliminated. Users are advised to verify their results carefully.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load MistEnv/2021a cuda/10.2.2 gcc/8.5.0 gromacs/2020.6&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 gromacs/2021.2&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 openmpi/4.1.1+ucx-1.10.0 gromacs/2021.2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Small/Medium Simulation ===&lt;br /&gt;
Because PME domain decomposition is not supported on GPUs, Gromacs calculates PME on the CPU when multiple GPUs are used. '''It is always recommended to use a single GPU for small and medium-sized simulations with Gromacs.''' With only 1 MPI rank (with OpenMP threads) on a single GPU, both non-bonded PP and PME are automatically offloaded to the GPU when possible.&lt;br /&gt;
* Gromacs 2019 example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/10.2.2 gcc/8.5.0 gromacs/2019.6&lt;br /&gt;
export OMP_NUM_THREADS=8&lt;br /&gt;
export OMP_PLACES=cores&lt;br /&gt;
gmx mdrun -pin off -ntmpi 1 -ntomp 8  ... &amp;lt;other parameters&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Gromacs 2020 or 2021 example: &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 gromacs/2021.2&lt;br /&gt;
export OMP_NUM_THREADS=8&lt;br /&gt;
export OMP_PLACES=cores&lt;br /&gt;
gmx mdrun -pin off -ntmpi 1 -ntomp 8 -update gpu ... &amp;lt;other parameters&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Large Simulation ===&lt;br /&gt;
If the memory available to a single-GPU job (~58GB) is not sufficient for the simulation, multiple GPUs can be used. It is suggested to start testing with one full node (4 GPUs) and force PME onto the GPU. Multiple PME ranks are not supported with PME on GPU, so if a GPU is used for the PME calculation, -npme (the number of PME ranks) must be set to 1. If PME has less work than PP, it is suggested to run multiple ranks per GPU, so that the GPU hosting the PME rank can also do some work for the PP rank(s). When running multiple MPI ranks on the same GPU, NVIDIA Multi-Process Service (MPS) must be enabled.&lt;br /&gt;
*An example using 4 GPUs, 7 PP ranks + 1 PME rank: ('''-pin on -pme gpu -npme 1''' must be added to mdrun command in order to force GPU to do PME)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=8&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3  gcc/9.4.0  openmpi/4.1.1+ucx-1.10.0 gromacs/2021.2&lt;br /&gt;
&lt;br /&gt;
mkdir -p /dev/shm/nvidia-mps&lt;br /&gt;
export CUDA_MPS_PIPE_DIRECTORY=/dev/shm/nvidia-mps&lt;br /&gt;
mkdir -p /dev/shm/nvidia-log&lt;br /&gt;
export CUDA_MPS_LOG_DIRECTORY=/dev/shm/nvidia-log&lt;br /&gt;
nvidia-cuda-mps-control -d&lt;br /&gt;
&lt;br /&gt;
export OMP_NUM_THREADS=4&lt;br /&gt;
mpirun  -bind-to none gmx_mpi mdrun -pin on -pme gpu -npme 1 ... &amp;lt;add your parameters&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*It is suggested to also test using '''--ntasks=4''' and '''OMP_NUM_THREADS=8''' if you receive a NOTE in Gromacs output saying &amp;quot;% performance was lost because the PME ranks had more work to do than the PP ranks&amp;quot;. In this case, NVIDIA MPS is not needed since there is only one MPI rank per GPU.&lt;br /&gt;
*'''Please note that the solving of PME on GPU is still only the initial version supporting this behaviour, and comes with a set of limitations outlined further below.'''&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
* Only a PME order of 4 is supported on GPUs.&lt;br /&gt;
* PME will run on a GPU only when exactly one rank has a PME task, i.e. decompositions with multiple ranks doing PME are not supported.&lt;br /&gt;
* Only single precision is supported.&lt;br /&gt;
* Free energy calculations where charges are perturbed are not supported, because only single PME grids can be calculated.&lt;br /&gt;
* Only dynamical integrators are supported (i.e. leap-frog, Velocity Verlet, stochastic dynamics).&lt;br /&gt;
* LJ PME is not supported on GPUs.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*An example using 4 GPUs, '''PME on CPU''': ('''-pin on''' must be added to mdrun command for proper CPU thread bindings)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=8&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3  gcc/9.4.0  openmpi/4.1.1+ucx-1.10.0 gromacs/2021.2&lt;br /&gt;
&lt;br /&gt;
mkdir -p /dev/shm/nvidia-mps&lt;br /&gt;
export CUDA_MPS_PIPE_DIRECTORY=/dev/shm/nvidia-mps&lt;br /&gt;
mkdir -p /dev/shm/nvidia-log&lt;br /&gt;
export CUDA_MPS_LOG_DIRECTORY=/dev/shm/nvidia-log&lt;br /&gt;
nvidia-cuda-mps-control -d&lt;br /&gt;
&lt;br /&gt;
export OMP_NUM_THREADS=4&lt;br /&gt;
mpirun -bind-to none gmx_mpi mdrun -pin on  ... &amp;lt;add your parameters&amp;gt;&lt;br /&gt;
&lt;br /&gt;
# &amp;quot;--ntasks=16, OMP_NUM_THREADS=2&amp;quot; and &amp;quot;--ntasks=4, OMP_NUM_THREADS=8&amp;quot; should also be tested.  &lt;br /&gt;
# num_Tasks(MPI_ranks) * num_OpenMP_threads = 32&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*'''NOTE: The above examples will NOT work with multiple nodes. If the simulation is too large for a single GPU node, please contact SciNet/SOSCIP support.'''&lt;br /&gt;
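&lt;br /&gt;
The rank and thread combinations suggested in the full-node examples all satisfy ranks × threads = 32. The following sketch computes the matching &amp;lt;code&amp;gt;OMP_NUM_THREADS&amp;lt;/code&amp;gt; for a given &amp;lt;code&amp;gt;--ntasks&amp;lt;/code&amp;gt; value; the function name &amp;lt;code&amp;gt;threads_per_rank&amp;lt;/code&amp;gt; is hypothetical and the default of 32 is the per-node core count given at the top of this page.&lt;br /&gt;

```shell
#!/bin/bash
# Number of OpenMP threads per MPI rank such that ranks * threads fills one node.
# The optional second argument is the core count of the node (default: 32).
threads_per_rank() {
    local ntasks="$1" cores_per_node="${2:-32}"
    if [ "$ntasks" -lt 1 ] || [ $((cores_per_node % ntasks)) -ne 0 ]; then
        echo "ntasks must divide $cores_per_node evenly"
        return 1
    fi
    echo $((cores_per_node / ntasks))
}

# The three combinations mentioned in the examples above.
for n in 4 8 16; do
    echo "--ntasks=$n OMP_NUM_THREADS=$(threads_per_rank "$n")"
done
```

Which combination performs best depends on the simulation, so each should still be benchmarked as suggested above.&lt;br /&gt;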
&lt;br /&gt;
== NAMD ==&lt;br /&gt;
[http://www.ks.uiuc.edu/Research/namd/ NAMD] is a parallel, object-oriented molecular dynamics code designed for high-performance simulation of large biomolecular systems.&lt;br /&gt;
=== 2.14 ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 spectrum-mpi/10.4.0 namd/2.14&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
==== Running with single GPU ====&lt;br /&gt;
If you have many jobs to run, it is always suggested to run with a single GPU per job. This makes jobs easier to schedule and gives better overall performance.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 spectrum-mpi/10.4.0 namd/2.14&lt;br /&gt;
scontrol show hostnames &amp;gt; nodelist-$SLURM_JOB_ID&lt;br /&gt;
&lt;br /&gt;
`which charmrun` -npernode 1 -bind-to none -hostfile nodelist-$SLURM_JOB_ID `which namd2` +idlepoll +ppn 8 +p 8 stmv.namd&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Running with one process per node (4 GPUs)====&lt;br /&gt;
An example of the job script (using 1 node, '''one process per node''',  32 CPU threads per process + 4 GPUs per process):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=1&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 spectrum-mpi/10.4.0 namd/2.14&lt;br /&gt;
scontrol show hostnames &amp;gt; nodelist-$SLURM_JOB_ID&lt;br /&gt;
&lt;br /&gt;
`which charmrun` -npernode 1 -hostfile nodelist-$SLURM_JOB_ID `which namd2` +setcpuaffinity +pemap 0-127:4 +idlepoll +ppn 32 +p $((32*SLURM_NTASKS)) stmv.namd&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
==== Running with one process per GPU (4 GPUs)====&lt;br /&gt;
NAMD may scale better when using '''one process per GPU'''. Please run your own benchmarks.&lt;br /&gt;
An example of the job script (using 1 node, '''one process per GPU''',  8 CPU threads per process):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=4&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 spectrum-mpi/10.4.0 namd/2.14&lt;br /&gt;
scontrol show hostnames &amp;gt; nodelist-$SLURM_JOB_ID&lt;br /&gt;
&lt;br /&gt;
`which charmrun` -npernode 4 -hostfile nodelist-$SLURM_JOB_ID `which namd2` +setcpuaffinity +pemap 0-127:4 +idlepoll +ppn 8 +p $((8*SLURM_NTASKS)) stmv.namd&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
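&lt;br /&gt;
The &amp;lt;code&amp;gt;+pemap 0-127:4&amp;lt;/code&amp;gt; argument in the two full-node scripts uses the Charm++ range-with-stride notation: every fourth CPU ID from 0 to 127. The sketch below expands such a map to make the selection visible; the function name &amp;lt;code&amp;gt;expand_pemap&amp;lt;/code&amp;gt; is hypothetical and only the simple &amp;lt;code&amp;gt;first-last:stride&amp;lt;/code&amp;gt; form is handled. The interpretation that each selected ID is the first hardware thread of a distinct physical core assumes that the four hardware threads of a core are numbered consecutively.&lt;br /&gt;

```shell
#!/bin/bash
# Expand a PE map of the form "first-last:stride" into a space-separated list of CPU IDs.
expand_pemap() {
    local range="${1%%:*}" stride="${1##*:}"
    local first="${range%%-*}" last="${range##*-}"
    local cpu="$first" list=""
    while [ "$cpu" -le "$last" ]; do
        list="$list$cpu "
        cpu=$((cpu + stride))
    done
    echo "${list% }"    # drop the trailing space
}

pes=$(expand_pemap 0-127:4)
set -- $pes             # split the list into positional parameters to count it
echo "$# CPU IDs selected: $pes"
```

The map yields 32 CPU IDs, which matches the total number of worker threads requested with &amp;lt;code&amp;gt;+p&amp;lt;/code&amp;gt; in both scripts (1 × 32 and 4 × 8).&lt;br /&gt;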
&lt;br /&gt;
== Open-CE ==&lt;br /&gt;
[https://github.com/open-ce/open-ce Open-CE] is an '''IBM''' repository containing the feedstock collection, environment data, and scripts for building TensorFlow, PyTorch, XGBoost, and other related packages and dependencies. Open-CE is distributed as a '''conda channel''' on the Mist cluster.&lt;br /&gt;
Available packages and versions are listed on the [https://github.com/open-ce/open-ce/releases Open-CE Releases] page. Currently only Python 3.7 and 3.8 are supported. Packages are built with CUDA 11.2, 11.0 and 10.2.&lt;br /&gt;
&lt;br /&gt;
*Packages can be installed by setting Open-CE conda channel:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda install -c /scinet/mist/ibm/open-ce/1.3 python=3.8 cudatoolkit=11.2 PACKAGE&lt;br /&gt;
# or&lt;br /&gt;
conda install -c /scinet/mist/ibm/open-ce/1.2 python=3.8 cudatoolkit=11.2 PACKAGE&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*Once the installation finishes, please clean the cache:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf $HOME/.conda/pkgs/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== PyTorch ==&lt;br /&gt;
=== Installing from IBM Open-CE Conda Channel ===&lt;br /&gt;
The easiest way to install PyTorch on Mist is to use IBM's Open-CE Conda channel: prepare a conda environment, then install PyTorch from that channel.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n pytorch_env python=3.8    # or python=3.7&lt;br /&gt;
source activate pytorch_env&lt;br /&gt;
conda install -c /scinet/mist/ibm/open-ce/1.3 pytorch=1.8.1 cudatoolkit=11.2    # cudatoolkit may also be 11.0 or 10.2&lt;br /&gt;
# or, from Open-CE 1.2:&lt;br /&gt;
conda install -c /scinet/mist/ibm/open-ce/1.2 pytorch=1.7.1 cudatoolkit=11.2    # cudatoolkit may also be 11.0 or 10.2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Once the installation finishes, please clean the cache:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf $HOME/.conda/pkgs/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To get deterministic results, add the command below to your job script before the python command (see [https://github.com/pytorch/pytorch/issues/39849 PyTorch issue 39849] for details):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
export CUBLAS_WORKSPACE_CONFIG=:4096:2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== RAPIDS ==&lt;br /&gt;
[https://rapids.ai RAPIDS] is a suite of open-source software libraries for executing end-to-end data science and analytics pipelines entirely on GPUs. The RAPIDS data science framework includes a collection of libraries: '''cuDF (GPU DataFrames)''', '''cuML (GPU Machine Learning Algorithms)''', '''cuStrings (GPU String Manipulation)''', etc.&lt;br /&gt;
&lt;br /&gt;
=== Installing from IBM Conda Channel ===&lt;br /&gt;
The easiest way to install RAPIDS on Mist is to use IBM's Conda channel: prepare a conda environment with Python 3.6 or 3.7, then install powerai-rapids from that channel.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n rapids_env python=3.7&lt;br /&gt;
source activate rapids_env&lt;br /&gt;
conda install -c https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda/ powerai-rapids&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Once the installation finishes, please clean the cache:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf $HOME/.conda/pkgs/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== TensorFlow and Keras ==&lt;br /&gt;
=== Installing from IBM Conda Channel ===&lt;br /&gt;
The easiest way to install TensorFlow and Keras on Mist is to use IBM's Open-CE Conda channel: prepare a conda environment, then install TensorFlow from that channel.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n tf_env python=3.8    # or python=3.7&lt;br /&gt;
source activate tf_env&lt;br /&gt;
conda install -c /scinet/mist/ibm/open-ce/1.3 tensorflow==2.5.1 cudatoolkit=11.0    # cudatoolkit may also be 10.2&lt;br /&gt;
# or, from Open-CE 1.2:&lt;br /&gt;
conda install -c /scinet/mist/ibm/open-ce/1.2 tensorflow==2.4.2 cudatoolkit=11.0    # cudatoolkit may also be 10.2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Once the installation finishes, please clean the cache:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf $HOME/.conda/pkgs/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
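&lt;br /&gt;
The framework versions in the PyTorch and TensorFlow sections above are tied to the Open-CE release they come from. The following sketch collects those pairings in one place and prints the corresponding install command; the function name &amp;lt;code&amp;gt;opence_install_cmd&amp;lt;/code&amp;gt; is hypothetical, only the four pairings listed on this page are known to it, and it does not check whether the requested cudatoolkit version is offered for that framework.&lt;br /&gt;

```shell
#!/bin/bash
# Print the conda command that installs a framework from a given Open-CE release.
# Arguments: framework (pytorch or tensorflow), Open-CE release (1.2 or 1.3), cudatoolkit version.
opence_install_cmd() {
    local framework="$1" release="$2" cuda="$3" version
    case "$framework/$release" in
        pytorch/1.3)    version="1.8.1" ;;
        pytorch/1.2)    version="1.7.1" ;;
        tensorflow/1.3) version="2.5.1" ;;
        tensorflow/1.2) version="2.4.2" ;;
        *) echo "unsupported combination: $framework with Open-CE $release"; return 1 ;;
    esac
    echo "conda install -c /scinet/mist/ibm/open-ce/$release $framework=$version cudatoolkit=$cuda"
}

opence_install_cmd pytorch 1.3 11.2
opence_install_cmd tensorflow 1.2 10.2
```

The function only prints the command, so it can be reviewed before being run inside an activated conda environment.&lt;br /&gt;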
&lt;br /&gt;
= Testing and debugging =&lt;br /&gt;
You really should test your code before you submit it to the cluster to know if your code is correct and what kind of resources you need.&lt;br /&gt;
* Small test jobs can be run on the login node.  Rule of thumb: tests should run no more than a couple of minutes, taking at most about 1-2GB of memory, and use no more than one gpu and a few cores.&lt;br /&gt;
&amp;lt;!-- * You can run the [[Parallel Debugging with DDT|DDT]] debugger on the login nodes after &amp;lt;code&amp;gt;module load ddt&amp;lt;/code&amp;gt;. --&amp;gt;&lt;br /&gt;
* For short tests that do not fit on a login node, or for which you need a dedicated node, request an interactive debug job with the &amp;lt;code&amp;gt;debugjob&amp;lt;/code&amp;gt; command:&lt;br /&gt;
 mist-login01:~$ debugjob --clean -g G&lt;br /&gt;
where G is the number of GPUs. G=1 gives an interactive session for 2 hours, G=4 gives a single node with 4 GPUs for 30 minutes, and G=8 (the maximum) gives 2 nodes, each with 4 GPUs, for 30 minutes.  The &amp;lt;tt&amp;gt;--clean&amp;lt;/tt&amp;gt; argument is optional but recommended, as it starts the session without any modules loaded, thus mimicking more closely what happens when you submit a job script.&lt;br /&gt;
&lt;br /&gt;
= Submitting jobs =&lt;br /&gt;
Once you have compiled and tested your code or workflow on the Mist login nodes, and confirmed that it behaves correctly, you are ready to submit jobs to the cluster.  Your jobs will run on some of Mist's 53 compute nodes.  When and where your job runs is determined by the scheduler.&lt;br /&gt;
&lt;br /&gt;
Mist uses SLURM as its job scheduler. It is configured to allow only '''Single-GPU jobs''' and '''Full-node jobs (4 GPUs per node)'''.&lt;br /&gt;
&lt;br /&gt;
You submit jobs from a login node by passing a script to the sbatch command:&lt;br /&gt;
&lt;br /&gt;
 mist-login01:scratch$ sbatch jobscript.sh&lt;br /&gt;
&lt;br /&gt;
This puts the job in the queue. It will run on the compute nodes in due course. In most cases, you should not submit from your $HOME directory, but rather from your $SCRATCH directory, so that the output of your compute job can be written out ($HOME is read-only on the compute nodes).&lt;br /&gt;
&lt;br /&gt;
Example job scripts can be found below.&lt;br /&gt;
Keep in mind:&lt;br /&gt;
* Scheduling is by single GPU or by full node, so request either 1 GPU or 4 GPUs per node.&lt;br /&gt;
* Your job's maximum walltime is 24 hours. &lt;br /&gt;
* Jobs must write their output to your scratch or project directory (home is read-only on compute nodes).&lt;br /&gt;
* Compute nodes have no internet access.&lt;br /&gt;
* Your job script will not remember the modules you have loaded, so it needs to contain &amp;quot;module load&amp;quot; commands of all the required modules (see examples below). &lt;br /&gt;
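&lt;br /&gt;
Because home is read-only on the compute nodes, a job submitted from a directory under $HOME is likely to fail when it tries to write its output. The following is a minimal sketch of a check that could be run before &amp;lt;code&amp;gt;sbatch&amp;lt;/code&amp;gt;; the function name &amp;lt;code&amp;gt;outside_home&amp;lt;/code&amp;gt; is hypothetical, and the check compares path strings only (it does not resolve symbolic links).&lt;br /&gt;

```shell
#!/bin/bash
# Return 0 if directory $1 is NOT inside the home directory $2 (default: $HOME).
outside_home() {
    local dir="${1%/}/" home="${2:-$HOME}"
    home="${home%/}/"
    case "$dir" in
        "$home"*) return 1 ;;   # path starts with the home directory
        *) return 0 ;;
    esac
}

if outside_home "$PWD"; then
    echo "OK to submit from $PWD"
else
    echo "Do not submit from $PWD: home is read-only on compute nodes, use \$SCRATCH instead"
fi
```

The trailing slash added to both paths prevents a sibling directory whose name merely begins with the home directory name from being treated as part of home.&lt;br /&gt;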
== SOSCIP Users ==&lt;br /&gt;
*[https://www.soscip.org SOSCIP] is a consortium to bring together industrial partners and academic researchers and provide them with sophisticated advanced computing technologies and expertise to solve social, technical and business challenges across sectors and drive economic growth.&lt;br /&gt;
&lt;br /&gt;
If you are working on a SOSCIP project, please contact [mailto:soscip-support@scinet.utoronto.ca soscip-support@scinet.utoronto.ca] to have your user account added to the SOSCIP project accounts. SOSCIP users need to submit jobs with an additional SLURM flag to get higher priority:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#SBATCH -A soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt;    #e.g. soscip-3-001&lt;br /&gt;
# OR&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Single-GPU job script ==&lt;br /&gt;
A single-GPU job is allocated a quarter of a node: 1 GPU, 8 CPU cores (32 hardware threads) and ~58GB of CPU memory. '''Users should never request CPUs or memory explicitly.''' When running an MPI program, set --ntasks to the number of MPI ranks. '''Do NOT set --ntasks for non-MPI programs.''' &lt;br /&gt;
*It is suggested to use NVIDIA Multi-Process Service (MPS) if running multiple MPI ranks on one GPU.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
#SBATCH --time=1:00:0&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt; #For SOSCIP projects only&lt;br /&gt;
&lt;br /&gt;
module load anaconda3&lt;br /&gt;
source activate conda_env&lt;br /&gt;
python code.py ...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Full-node job script ==&lt;br /&gt;
'''If you are not sure the program can be executed on multiple GPUs, please follow the single-gpu job instruction above or contact SciNet/SOSCIP support.'''&lt;br /&gt;
&lt;br /&gt;
A multi-GPU job should request a minimum of one full node (4 GPUs). You need to specify the &amp;quot;compute_full_node&amp;quot; partition in order to get all the resources on a node. &lt;br /&gt;
*An example for a 1-node job:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=4 #this only affects MPI job&lt;br /&gt;
#SBATCH --time=1:00:00&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt; #For SOSCIP projects only&lt;br /&gt;
&lt;br /&gt;
module load &amp;lt;modules you need&amp;gt;&lt;br /&gt;
# run your program here&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Jupyter Notebooks =&lt;br /&gt;
SciNet’s [[Jupyter Hub]] is a Niagara-type node; it has a different CPU architecture and no GPUs. Conda environments prepared on Mist will not work properly there. Users who need Jupyter Notebook to develop and test some aspects of their workflow can run their own server on the Mist login node and use an SSH tunnel to connect to it from outside. Users who choose to do so must keep in mind that the login node is a shared resource, and heavy calculations should be done only on compute nodes. Processes (including IPython kernels used by the notebooks) are limited to one hour of total CPU time: idle time does not count toward this hour, and use of multiple cores counts proportionally to the number of cores (i.e. a kernel using all 128 virtual cores on the node will be killed after 28 seconds). Idle notebooks can still burden the node by hogging system and GPU memory, so please be mindful of other users and terminate notebooks when your work is done.&lt;br /&gt;
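&lt;br /&gt;
The CPU-time limit on the login node can be turned into wall-clock time with a simple division. The following sketch reproduces the 28-second figure quoted for 128 virtual cores; the function name &amp;lt;code&amp;gt;seconds_until_killed&amp;lt;/code&amp;gt; is hypothetical, and it assumes that all the given cores are kept fully busy for the whole time.&lt;br /&gt;

```shell
#!/bin/bash
# Wall-clock seconds a process survives when it keeps $1 cores fully busy,
# given a total CPU-time budget of $2 seconds (default: 3600, i.e. one hour).
seconds_until_killed() {
    local cores="$1" budget="${2:-3600}"
    echo $((budget / cores))    # integer division, rounds down
}

for cores in 1 4 32 128; do
    echo "cores=$cores: about $(seconds_until_killed "$cores") s of wall-clock time"
done
```

A notebook kernel that uses one core only while cells are executing therefore has far more than an hour of wall-clock time available, since idle time is not counted.&lt;br /&gt;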
&lt;br /&gt;
As an example, let us create a new Conda environment and activate it:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n jupyter_env python=3.7&lt;br /&gt;
source activate jupyter_env&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Install the Jupyter Notebook server:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda install notebook&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Running the notebook server ==&lt;br /&gt;
When the Conda environment is active, enter:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
jupyter-notebook&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
By default, the Jupyter Notebook server uses port 8888 (can be overridden with the &amp;lt;code&amp;gt;--port&amp;lt;/code&amp;gt; option). If another user has already started their own server, the default port may be busy, in which case the server will be listening on a different port. Once launched, the server will output some information to the terminal that will include the actual port number used and a 48-character token. For example:&lt;br /&gt;
&amp;lt;pre&amp;gt;http://localhost:8890/?token=54c4090d……&amp;lt;/pre&amp;gt;&lt;br /&gt;
In this example, the server is listening on port 8890.&lt;br /&gt;
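&lt;br /&gt;
If you want to pick the port out of that line in a script rather than by eye, plain bash parameter expansion is enough. The following is a minimal sketch; the function name &amp;lt;code&amp;gt;port_from_url&amp;lt;/code&amp;gt; and the token value &amp;lt;code&amp;gt;EXAMPLE&amp;lt;/code&amp;gt; are placeholders made up for illustration, and the URL is assumed to have the same shape as the one shown above.&lt;br /&gt;

```shell
#!/bin/bash
# Hypothetical helper: extract the port number from a URL of the form
# http://localhost:PORT/?token=...
port_from_url() {
    local url=$1
    local rest=${url#*://}    # drop the scheme:  localhost:8890/?token=...
    rest=${rest#*:}           # drop the host:    8890/?token=...
    echo "${rest%%/*}"        # keep what precedes the first slash
}

# EXAMPLE stands in for the real token printed by the server.
port_from_url "http://localhost:8890/?token=EXAMPLE"
```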
&lt;br /&gt;
== Creating a tunnel ==&lt;br /&gt;
In order to access this port remotely (i.e. from your office or home), an [https://en.wikipedia.org/wiki/Tunneling_protocol#Secure_Shell_tunneling SSH tunnel] has to be established. Please refer to your SSH client’s documentation for instructions on how to do that. For the OpenSSH client (standard in most Linux distributions and macOS), a tunnel can be opened in a separate terminal session to the one where the Jupyter Notebook server is running. In the new terminal, issue this command:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -L8888:localhost:8890 &amp;lt;username&amp;gt;@mist.scinet.utoronto.ca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Replace &amp;lt;code&amp;gt;&amp;lt;username&amp;gt;&amp;lt;/code&amp;gt; with your actual username. The tunnel stays open as long as this SSH connection is alive. In this example, we tunnel the Mist login node’s port 8890 (where our server is assumed to be running) to our home computer’s port 8888 (any other free port is fine). The notebook can then be accessed in the browser at &amp;lt;code&amp;gt;&amp;lt;nowiki&amp;gt;http://localhost:8888&amp;lt;/nowiki&amp;gt;&amp;lt;/code&amp;gt; (followed by &amp;lt;code&amp;gt;/?token=54c4090d……&amp;lt;/code&amp;gt;, or the token can be entered on the web page).&lt;br /&gt;
&lt;br /&gt;
== Using Jupyter on compute nodes ==&lt;br /&gt;
&lt;br /&gt;
You can use the instructions here to set up a Jupyter Notebook server on a compute node (including a [[#Testing_and_debugging|debugjob]]). '''We strongly discourage''' you from running an interactive notebook on a compute node (other than for a debugjob): scheduled jobs start at arbitrary times and are not meant to be interactive. Jupyter notebooks can be run non-interactively or converted to Python scripts.&lt;br /&gt;
&lt;br /&gt;
To launch the Jupyter Notebook server, load the &amp;lt;code&amp;gt;anaconda3&amp;lt;/code&amp;gt; module and activate your environment as before (by adding the appropriate lines to the submission script, if you are not using the compute node with an interactive shell). Launching the server has to be done like so:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
HOME=/dev/shm/$USER jupyter-notebook&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
That is because Jupyter will fail unless it can write to the home folder, which is read-only from compute nodes. This modification of the &amp;lt;code&amp;gt;$HOME&amp;lt;/code&amp;gt; environment variable will carry over into the notebooks, which is usually not a problem, but in case the notebook relies on this environment variable (e.g. to read certain files), it can be reset manually in the notebook (&amp;lt;code&amp;gt;import os; os.environ['HOME']=……&amp;lt;/code&amp;gt;).&lt;br /&gt;
&lt;br /&gt;
Because compute nodes are not accessible from the Internet, tunneling has to be done twice: once from the remote location (office or home) to the Mist login node, and then from the login node to the compute node. Assuming the server is running on port 8890 of the mist006 node, open the first tunnel in a new terminal session on the remote computer:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -L8888:localhost:9999 &amp;lt;username&amp;gt;@mist.scinet.utoronto.ca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
where 9999 is any available port on the Mist login node (to test port availability, enter &amp;lt;code&amp;gt;ss -Hln src :9999&amp;lt;/code&amp;gt; in the terminal when connected to the Mist login node; empty output indicates that the port is free). In the same session on the login node that was created with the above command, open the second tunnel to the compute node:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -L9999:localhost:8890 mist006&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Be aware that the second tunnel will automatically disconnect once the job on the compute node times out or is relinquished. The Jupyter Notebook server running on the compute node can now be accessed from the browser as in the previous subsection.&lt;br /&gt;
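&lt;br /&gt;
Because the middle port has to match in both commands, it can help to generate the pair from one set of values. The sketch below only prints the two commands used above and does not run them; the function name &amp;lt;code&amp;gt;print_tunnel_commands&amp;lt;/code&amp;gt; and the username &amp;lt;code&amp;gt;myuser&amp;lt;/code&amp;gt; are hypothetical.&lt;br /&gt;

```shell
#!/bin/bash
# Hypothetical helper: print (not run) the two tunnel commands described
# above, so the port numbers stay consistent between both hops.
print_tunnel_commands() {
    local user=$1          # your username
    local local_port=$2    # free port on your own computer, e.g. 8888
    local login_port=$3    # free port on the Mist login node, e.g. 9999
    local node=$4          # compute node running the server, e.g. mist006
    local node_port=$5     # port the Jupyter server listens on, e.g. 8890
    # First hop: run on the remote (office or home) computer.
    echo "ssh -L${local_port}:localhost:${login_port} ${user}@mist.scinet.utoronto.ca"
    # Second hop: run in the login-node session opened by the first hop.
    echo "ssh -L${login_port}:localhost:${node_port} ${node}"
}

print_tunnel_commands myuser 8888 9999 mist006 8890
```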
&lt;br /&gt;
&lt;br /&gt;
= Support =&lt;br /&gt;
&lt;br /&gt;
SciNet inquiries:&lt;br /&gt;
* [mailto:support@scinet.utoronto.ca support@scinet.utoronto.ca]&lt;br /&gt;
&lt;br /&gt;
SOSCIP inquiries:&lt;br /&gt;
*[mailto:soscip-support@scinet.utoronto.ca soscip-support@scinet.utoronto.ca]&lt;/div&gt;</summary>
		<author><name>Ymeiron</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Mist&amp;diff=3179</id>
		<title>Mist</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Mist&amp;diff=3179"/>
		<updated>2021-08-09T14:37:44Z</updated>

		<summary type="html">&lt;p&gt;Ymeiron: Note about x86-ppc incompatibility and how to use Python; updated OS version; changed softwares to software&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox Computer&lt;br /&gt;
|image=[[Image:Mist.jpg|center|300px|thumb]]&lt;br /&gt;
|name=Mist&lt;br /&gt;
|installed=Dec 2019&lt;br /&gt;
|operatingsystem= Red Hat Enterprise Linux 8.2&lt;br /&gt;
|loginnode= mist.scinet.utoronto.ca&lt;br /&gt;
|nnodes=  54 IBM AC922&lt;br /&gt;
|rampernode= 256 GB  &lt;br /&gt;
|gpuspernode=4 V100-SXM2-32GB&lt;br /&gt;
|interconnect=Mellanox EDR&lt;br /&gt;
|vendorcompilers= NVCC, IBM XL&lt;br /&gt;
|queuetype=Slurm&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
=Specifications=&lt;br /&gt;
Mist is a SciNet-[[#SOSCIP Users |SOSCIP]] joint GPU cluster consisting of 54 IBM AC922 servers. Each node of the cluster has 32 IBM Power9 cores, 256GB RAM and 4 NVIDIA V100-SXM2-32GB GPUs connected by NVLink. The cluster has an InfiniBand EDR interconnect providing GPUDirect RDMA capability.&lt;br /&gt;
&lt;br /&gt;
'''&amp;lt;span style=&amp;quot;background:#fc8383&amp;quot;&amp;gt;Important note:&amp;lt;/span&amp;gt;''' the majority of computer systems as of 2021 (laptops, desktops, and HPC) use the 64-bit x86 instruction set architecture (ISA) in their microprocessors, which are produced by Intel and AMD. This ISA is incompatible with Mist, whose hardware uses the 64-bit PPC ISA (set to little-endian mode). In practice, this means that x86-compiled binaries (executables and libraries) cannot be installed on Mist. For this reason, the Niagara and Compute Canada software stacks (modules) cannot be made available on Mist, and using closed-source software is only possible when the vendor provides a compatible version of their application. '''Python applications''' almost always rely on bindings to libraries originally written in C or C++, and some of these are not available on PyPI or the various Conda channels as precompiled binaries compatible with Mist. The recommended way to use Python on Mist is to create a [[#Anaconda (Python)|Conda]] environment and install packages from the anaconda (default) channel, where most popular packages have a linux-ppc64le (Mist-compatible) version available. Some popular machine learning packages should be installed from the internal [[#Open-CE|Open-CE]] channel. Where a compatible Conda package cannot be found, installing from PyPI (&amp;lt;code&amp;gt;pip install&amp;lt;/code&amp;gt;) can be attempted. Pip will attempt to compile the package’s source code if no compatible precompiled wheel is available, so a compiler module (such as &amp;lt;code&amp;gt;gcc/.core&amp;lt;/code&amp;gt;) should be loaded in advance. Some packages require tweaking of the source code or build procedure to compile successfully on Mist; please contact [[#Support|support]] if you need assistance.&lt;br /&gt;
&lt;br /&gt;
= Getting started on Mist =&lt;br /&gt;
Mist can be accessed directly.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -Y MYCCUSERNAME@mist.scinet.utoronto.ca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The Mist login node '''mist-login01''' can also be accessed via the Niagara cluster.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -Y MYCCUSERNAME@niagara.scinet.utoronto.ca&lt;br /&gt;
ssh -Y mist-login01&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
== Storage ==&lt;br /&gt;
The filesystem for Mist is shared with the Niagara cluster. See [https://docs.scinet.utoronto.ca/index.php/Niagara_Quickstart#Your_various_directories Niagara Storage] for more details.&lt;br /&gt;
&lt;br /&gt;
= Loading software modules =&lt;br /&gt;
&lt;br /&gt;
You have two options for running code on Mist: use existing software, or compile your own.  This section focuses on the former.&lt;br /&gt;
&lt;br /&gt;
Other than essentials, all installed software is made available [[Using_modules | using module commands]]. These modules set environment variables (PATH, etc.), allowing multiple, conflicting versions of a given package to be available.  A detailed explanation of the module system can be [[Using_modules | found on the modules page]] and a list of [[Modules for Mist]] is also available.&lt;br /&gt;
&lt;br /&gt;
Common module subcommands are:&lt;br /&gt;
&lt;br /&gt;
* &amp;lt;code&amp;gt;module load &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt;: load the default version of a particular software.&lt;br /&gt;
* &amp;lt;code&amp;gt;module load &amp;lt;module-name&amp;gt;/&amp;lt;module-version&amp;gt;&amp;lt;/code&amp;gt;: load a specific version of a particular software.&lt;br /&gt;
* &amp;lt;code&amp;gt;module purge&amp;lt;/code&amp;gt;: unload all currently loaded modules.&lt;br /&gt;
* &amp;lt;code&amp;gt;module spider&amp;lt;/code&amp;gt; (or &amp;lt;code&amp;gt;module spider &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt;): list available software packages.&lt;br /&gt;
* &amp;lt;code&amp;gt;module avail&amp;lt;/code&amp;gt;: list loadable software packages.&lt;br /&gt;
* &amp;lt;code&amp;gt;module list&amp;lt;/code&amp;gt;: list loaded modules.&lt;br /&gt;
&lt;br /&gt;
Along with modifying common environment variables, such as PATH and LD_LIBRARY_PATH, these modules also create a SCINET_MODULENAME_ROOT environment variable, which can be used to access commonly needed software directories, such as /include and /lib.&lt;br /&gt;
&lt;br /&gt;
There are handy abbreviations for the module commands. &amp;lt;code&amp;gt;ml&amp;lt;/code&amp;gt; is the same as &amp;lt;code&amp;gt;module list&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;ml &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt; is the same as &amp;lt;code&amp;gt;module load &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt;.&lt;br /&gt;
== Tips for loading software ==&lt;br /&gt;
&lt;br /&gt;
* We advise '''''against''''' loading modules in your .bashrc.  This can lead to very confusing behaviour under certain circumstances.  Our guidelines for .bashrc files can be found [[bashrc guidelines|here]].&lt;br /&gt;
* Instead, load modules by hand when needed, or by sourcing a separate script.&lt;br /&gt;
* Load run-specific modules inside your job submission script.&lt;br /&gt;
* Short names give default versions; e.g. &amp;lt;code&amp;gt;cuda&amp;lt;/code&amp;gt; → &amp;lt;code&amp;gt;cuda/11.0.3&amp;lt;/code&amp;gt;. It is usually better to be explicit about the versions, for future reproducibility.&lt;br /&gt;
* Modules often require other modules to be loaded first.  Solve these dependencies by using [[Using_modules#Module_spider | &amp;lt;code&amp;gt;module spider&amp;lt;/code&amp;gt;]].&lt;br /&gt;
&lt;br /&gt;
= Available compilers and interpreters =&lt;br /&gt;
* The &amp;lt;tt&amp;gt;cuda&amp;lt;/tt&amp;gt; module has to be loaded first for GPU software.&lt;br /&gt;
* For most compiled software, one should use the GNU compilers (&amp;lt;tt&amp;gt;gcc&amp;lt;/tt&amp;gt; for C, &amp;lt;tt&amp;gt;g++&amp;lt;/tt&amp;gt; for C++, and &amp;lt;tt&amp;gt;gfortran&amp;lt;/tt&amp;gt; for Fortran). Loading the &amp;lt;tt&amp;gt;gcc&amp;lt;/tt&amp;gt; module makes these available. &lt;br /&gt;
* The IBM XL compiler suite (&amp;lt;tt&amp;gt;xlc_r, xlc++_r, xlf_r&amp;lt;/tt&amp;gt;) is also available if you load one of the &amp;lt;tt&amp;gt;xl&amp;lt;/tt&amp;gt; modules.&lt;br /&gt;
* To compile MPI code, you must additionally load an &amp;lt;tt&amp;gt;openmpi&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;spectrum-mpi&amp;lt;/tt&amp;gt; module.&lt;br /&gt;
&lt;br /&gt;
=== CUDA ===&lt;br /&gt;
&lt;br /&gt;
The currently installed CUDA Toolkits are '''11.0.3''' and '''10.2.2 (10.2.89)''':&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load cuda/11.0.3&lt;br /&gt;
module load cuda/10.2.2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*A compiler (GCC, XL or NVHPC/PGI) module must be loaded in order to use CUDA to build any code.&lt;br /&gt;
The current NVIDIA driver version is 450.119.04.&lt;br /&gt;
&lt;br /&gt;
===GNU Compilers ===&lt;br /&gt;
&lt;br /&gt;
Available GCC modules are:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
gcc/9.3.0 (must load CUDA 11)&lt;br /&gt;
gcc/8.5.0 (must load CUDA 10)&lt;br /&gt;
gcc/10.3.0 (w/o CUDA)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== IBM XL Compilers ===&lt;br /&gt;
&lt;br /&gt;
To load the native IBM xlc/xlc++ and xlf (Fortran) compilers, run&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load xl/16.1.1.10&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
IBM XL Compilers are enabled for use with NVIDIA GPUs, including support for OpenMP GPU offloading and integration with NVIDIA's nvcc command to compile host-side code for the POWER9 CPU. Information about the IBM XL Compilers can be found at the following links: [https://www.ibm.com/support/knowledgecenter/SSXVZZ_16.1.1/com.ibm.compilers.linux.doc/welcome.html IBM XL C/C++], &lt;br /&gt;
[https://www.ibm.com/support/knowledgecenter/SSAT4T_16.1.1/com.ibm.compilers.linux.doc/welcome.html IBM XL Fortran]&lt;br /&gt;
&lt;br /&gt;
=== OpenMPI ===&lt;br /&gt;
The &amp;lt;tt&amp;gt;openmpi/&amp;lt;version&amp;gt;&amp;lt;/tt&amp;gt; module is available with different compilers, including GCC and XL. The &amp;lt;tt&amp;gt;spectrum-mpi/&amp;lt;version&amp;gt;&amp;lt;/tt&amp;gt; module provides IBM Spectrum MPI.&lt;br /&gt;
&lt;br /&gt;
=== NVHPC/PGI ===&lt;br /&gt;
The PGI compiler is provided as part of NVHPC (NVIDIA HPC SDK).&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load nvhpc/21.3&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Software =&lt;br /&gt;
== Amber20 ==&lt;br /&gt;
&lt;br /&gt;
Users who hold an Amber20 license can build Amber20 from its source code and run it on Mist. '''SOSCIP/SciNet does not provide an Amber license or source code.'''&lt;br /&gt;
&lt;br /&gt;
=== Building Amber20 ===&lt;br /&gt;
Modules that are needed for building Amber20:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load MistEnv/2021a cuda/10.2.2 gcc/8.5.0 anaconda3/2021.05 cmake/3.19.8&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
CMake configuration:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
cmake .. -DCMAKE_INSTALL_PREFIX=$HOME/where-amber-install -DCOMPILER=GNU -DMPI=FALSE -DCUDA=TRUE -DINSTALL_TESTS=TRUE -DDOWNLOAD_MINICONDA=FALSE -DOPENMP=TRUE -DNCCL=FALSE -DAPPLY_UPDATES=TRUE&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Running Amber20 ===&lt;br /&gt;
'''On NVIDIA Pascal P100 and later GPUs such as the V100, Amber20 does not scale beyond a single GPU.''' It is strongly recommended to run Amber20 as a single-GPU job.&lt;br /&gt;
A job example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
#SBATCH --time=1:00:0&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP-project-ID&amp;gt;&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/10.2.2 gcc/8.5.0 anaconda3/2021.05&lt;br /&gt;
export PATH=$HOME/where-amber-install/bin:$PATH&lt;br /&gt;
export LD_LIBRARY_PATH=$HOME/where-amber-install/lib:$LD_LIBRARY_PATH&lt;br /&gt;
pmemd.cuda .... &amp;lt;parameters&amp;gt; ...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Anaconda (Python) ==&lt;br /&gt;
Anaconda is a popular distribution of the Python programming language. It contains several common Python libraries such as SciPy and NumPy as pre-built packages, which eases installation. Anaconda is provided as the '''anaconda3''' module.&lt;br /&gt;
&lt;br /&gt;
To set up your own Anaconda environment, load the module and create a conda environment:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n myPythonEnv python=3.8&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*Note: By default, conda environments are located in '''$HOME/.conda/envs'''. The cache (downloaded tarballs and packages) is under '''$HOME/.conda/pkgs'''. Users may run into disk quota problems if too many environments are created. To clean the conda cache, '''please run &amp;quot;conda clean -y --all&amp;quot; and &amp;quot;rm -rf $HOME/.conda/pkgs/*&amp;quot; after installing packages'''.&lt;br /&gt;
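&lt;br /&gt;
To see whether the cache is worth cleaning, you can check how much space it occupies first. The following is a minimal sketch using the standard &amp;lt;code&amp;gt;du&amp;lt;/code&amp;gt; utility; the function name &amp;lt;code&amp;gt;dir_usage_kb&amp;lt;/code&amp;gt; is hypothetical, and the path is the default cache location given in the note above.&lt;br /&gt;

```shell
#!/bin/bash
# Hypothetical helper: report how much disk space a directory uses, in KiB.
dir_usage_kb() {
    du -sk "$1" | cut -f1    # du prints "SIZE<TAB>PATH"; keep the size only
}

pkgs_dir="${HOME}/.conda/pkgs"    # default conda cache location (see note above)
if [ -d "${pkgs_dir}" ]; then
    echo "conda package cache: $(dir_usage_kb "${pkgs_dir}") KiB"
else
    echo "no conda package cache found at ${pkgs_dir}"
fi
```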
&lt;br /&gt;
To activate the conda environment (this must be done before running python):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
source activate myPythonEnv&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Note that you SHOULD NOT use '''conda activate myPythonEnv''' to activate the environment.  This leads to all sorts of problems.  Once the environment is activated, users can update or install packages via '''conda''' or '''pip''':&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda install &amp;lt;package_name&amp;gt;   # preferred way to install packages&lt;br /&gt;
pip install &amp;lt;package_name&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*Once the installation finishes, please clean the cache:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf $HOME/.conda/pkgs/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To deactivate:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
source deactivate&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To remove a conda environment:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda remove --name myPythonEnv --all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To verify that the environment was removed, run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda info --envs&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Submitting Python Job ===&lt;br /&gt;
A single-gpu job example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
#SBATCH --time=1:00:0&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt; #For SOSCIP projects only&lt;br /&gt;
&lt;br /&gt;
module load anaconda3&lt;br /&gt;
source activate myPythonEnv&lt;br /&gt;
python code.py ...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== CuPy ==&lt;br /&gt;
[https://cupy.chainer.org CuPy] is an open-source matrix library accelerated with NVIDIA CUDA. It also uses CUDA-related libraries including cuBLAS, cuDNN, cuRAND, cuSOLVER, cuSPARSE, cuFFT and NCCL to make full use of the GPU architecture. CuPy is an implementation of a NumPy-compatible multi-dimensional array on CUDA. It consists of the core multi-dimensional array class, cupy.ndarray, and many functions that operate on it, and it supports a subset of the numpy.ndarray interface.&lt;br /&gt;
&lt;br /&gt;
CuPy can be installed into any conda environment. The Python packages numpy, six and fastrlock are required; cuDNN and NCCL are optional.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0  nccl/2.9.9 anaconda3/2021.05&lt;br /&gt;
conda create -n cupy-env python=3.8 numpy six fastrlock&lt;br /&gt;
source activate cupy-env&lt;br /&gt;
CFLAGS=&amp;quot;-I$MODULE_CUDNN_PREFIX/include -I$MODULE_NCCL_PREFIX/include -I$MODULE_CUDA_PREFIX/include&amp;quot; LDFLAGS=&amp;quot;-L$MODULE_CUDNN_PREFIX/lib64 -L$MODULE_NCCL_PREFIX/lib&amp;quot; CUDA_PATH=$MODULE_CUDA_PREFIX pip install cupy&lt;br /&gt;
#building/installing CuPy will take a few minutes&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Gromacs ==&lt;br /&gt;
[http://www.gromacs.org/ GROMACS] is a versatile package to perform molecular dynamics, i.e. simulate the Newtonian equations of motion for systems with hundreds to millions of particles. It is primarily designed for biochemical molecules like proteins, lipids and nucleic acids that have a lot of complicated bonded interactions, but since GROMACS is extremely fast at calculating the nonbonded interactions (that usually dominate simulations) many groups are also using it for research on non-biological systems, e.g. polymers.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load MistEnv/2021a cuda/10.2.2 gcc/8.5.0 gromacs/2019.6&lt;br /&gt;
module load MistEnv/2020a cuda/10.2.89 gcc/8.3.0 openmpi/3.1.5 gromacs/2019.6   # old RHEL 7 version, for testing only&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*The thread-MPI versions of '''GROMACS 2020 and 2021''' support full GPU enablement of all key computational sections. The GPU is used throughout the timestep and repeated CPU-GPU transfers are eliminated. Users are advised to verify the results carefully.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load MistEnv/2021a cuda/10.2.2 gcc/8.5.0 gromacs/2020.6&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 gromacs/2021.2&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 openmpi/4.1.1+ucx-1.10.0 gromacs/2021.2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Small/Medium Simulation ===&lt;br /&gt;
Due to the lack of PME domain decomposition support on GPUs, Gromacs uses the CPU to calculate PME when using multiple GPUs. '''It is always recommended to use a single GPU for small and medium-sized simulations with Gromacs.''' By using only 1 MPI rank (with OpenMP threads) on a single GPU, both non-bonded PP and PME are automatically offloaded to the GPU when possible.&lt;br /&gt;
* Gromacs 2019 example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/10.2.2 gcc/8.5.0 gromacs/2019.6&lt;br /&gt;
export OMP_NUM_THREADS=8&lt;br /&gt;
export OMP_PLACES=cores&lt;br /&gt;
gmx mdrun -pin off -ntmpi 1 -ntomp 8  ... &amp;lt;other parameters&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Gromacs 2020 or 2021 example: &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 gromacs/2021.2&lt;br /&gt;
export OMP_NUM_THREADS=8&lt;br /&gt;
export OMP_PLACES=cores&lt;br /&gt;
gmx mdrun -pin off -ntmpi 1 -ntomp 8 -update gpu ... &amp;lt;other parameters&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Large Simulation ===&lt;br /&gt;
If the memory available to a single-GPU job (~58GB) is not sufficient for the simulation, multiple GPUs can be used. It is suggested to start testing with one full node with 4 GPUs and to force PME onto the GPU. Multiple PME ranks are not supported with PME on GPU, so if a GPU is used for the PME calculation, -npme (the number of PME ranks) must be set to 1. If PME has less work than PP, it is suggested to run multiple ranks per GPU, so that the GPU for the PME rank can also do some work on PP rank(s). When running multiple MPI ranks on the same GPU, NVIDIA Multi-Process Service (MPS) must be enabled.&lt;br /&gt;
*An example using 4 GPUs, 7 PP ranks + 1 PME rank: ('''-pin on -pme gpu -npme 1''' must be added to mdrun command in order to force GPU to do PME)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=8&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3  gcc/9.4.0  openmpi/4.1.1+ucx-1.10.0 gromacs/2021.2&lt;br /&gt;
&lt;br /&gt;
mkdir -p /dev/shm/nvidia-mps&lt;br /&gt;
export CUDA_MPS_PIPE_DIRECTORY=/dev/shm/nvidia-mps&lt;br /&gt;
mkdir -p /dev/shm/nvidia-log&lt;br /&gt;
export CUDA_MPS_LOG_DIRECTORY=/dev/shm/nvidia-log&lt;br /&gt;
nvidia-cuda-mps-control -d&lt;br /&gt;
&lt;br /&gt;
export OMP_NUM_THREADS=4&lt;br /&gt;
mpirun  -bind-to none gmx_mpi mdrun -pin on -pme gpu -npme 1 ... &amp;lt;add your parameters&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*It is suggested to also test using '''--ntasks=4''' and '''OMP_NUM_THREADS=8''' if you receive a NOTE in Gromacs output saying &amp;quot;% performance was lost because the PME ranks had more work to do than the PP ranks&amp;quot;. In this case, NVIDIA MPS is not needed since there is only one MPI rank per GPU.&lt;br /&gt;
*'''Please note that the solving of PME on GPU is still only the initial version supporting this behaviour, and comes with a set of limitations outlined further below.'''&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
* Only a PME order of 4 is supported on GPUs.&lt;br /&gt;
* PME will run on a GPU only when exactly one rank has a PME task, i.e. decompositions with multiple ranks doing PME are not supported.&lt;br /&gt;
* Only single precision is supported.&lt;br /&gt;
* Free energy calculations where charges are perturbed are not supported, because only single PME grids can be calculated.&lt;br /&gt;
* Only dynamical integrators are supported (i.e. leap-frog, Velocity Verlet, stochastic dynamics).&lt;br /&gt;
* LJ PME is not supported on GPUs.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*An example using 4 GPUs, '''PME on CPU''': ('''-pin on''' must be added to mdrun command for proper CPU thread bindings)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=8&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3  gcc/9.4.0  openmpi/4.1.1+ucx-1.10.0 gromacs/2021.2&lt;br /&gt;
&lt;br /&gt;
mkdir -p /dev/shm/nvidia-mps&lt;br /&gt;
export CUDA_MPS_PIPE_DIRECTORY=/dev/shm/nvidia-mps&lt;br /&gt;
mkdir -p /dev/shm/nvidia-log&lt;br /&gt;
export CUDA_MPS_LOG_DIRECTORY=/dev/shm/nvidia-log&lt;br /&gt;
nvidia-cuda-mps-control -d&lt;br /&gt;
&lt;br /&gt;
export OMP_NUM_THREADS=4&lt;br /&gt;
mpirun -bind-to none gmx_mpi mdrun -pin on  ... &amp;lt;add your parameters&amp;gt;&lt;br /&gt;
&lt;br /&gt;
# &amp;quot;--ntasks=16, OMP_NUM_THREADS=2&amp;quot; and &amp;quot;--ntasks=4, OMP_NUM_THREADS=8&amp;quot; should also be tested.  &lt;br /&gt;
# num_Tasks(MPI_ranks) * num_OpenMP_threads = 32&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*'''NOTE: The above examples will NOT work with multiple nodes. If the simulation is too large for a single GPU node, please contact SciNet/SOSCIP support.'''&lt;br /&gt;
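&lt;br /&gt;
The rule quoted in the script comments above (MPI ranks times OpenMP threads equals 32, the number of cores per node) can be turned into a small check before submitting. This is a sketch only; the function name &amp;lt;code&amp;gt;threads_per_rank&amp;lt;/code&amp;gt; is hypothetical.&lt;br /&gt;

```shell
#!/bin/bash
# Hypothetical helper: given a number of MPI ranks (--ntasks), print the
# OMP_NUM_THREADS value that keeps ranks * threads equal to the 32 cores
# of a Mist node.
threads_per_rank() {
    local ntasks=$1
    local cores=32
    # Refuse rank counts that do not divide the core count evenly.
    if [ $(( cores % ntasks )) -ne 0 ]; then
        return 1
    fi
    echo $(( cores / ntasks ))
}

# The three combinations suggested above.
for ntasks in 4 8 16; do
    echo "--ntasks=${ntasks} OMP_NUM_THREADS=$(threads_per_rank "${ntasks}")"
done
```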
&lt;br /&gt;
== NAMD ==&lt;br /&gt;
[http://www.ks.uiuc.edu/Research/namd/ NAMD] is a parallel, object-oriented molecular dynamics code designed for high-performance simulation of large biomolecular systems.&lt;br /&gt;
=== 2.14 ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 spectrum-mpi/10.4.0 namd/2.14&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
==== Running with single GPU ====&lt;br /&gt;
If you have many jobs to run, it is always suggested to use a single GPU per job. This makes jobs easier to schedule and gives better overall performance.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 spectrum-mpi/10.4.0 namd/2.14&lt;br /&gt;
scontrol show hostnames &amp;gt; nodelist-$SLURM_JOB_ID&lt;br /&gt;
&lt;br /&gt;
`which charmrun` -npernode 1 -bind-to none -hostfile nodelist-$SLURM_JOB_ID `which namd2` +idlepoll +ppn 8 +p 8 stmv.namd&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Running with one process per node (4 GPUs)====&lt;br /&gt;
An example of the job script (using 1 node, '''one process per node''',  32 CPU threads per process + 4 GPUs per process):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=1&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 spectrum-mpi/10.4.0 namd/2.14&lt;br /&gt;
scontrol show hostnames &amp;gt; nodelist-$SLURM_JOB_ID&lt;br /&gt;
&lt;br /&gt;
`which charmrun` -npernode 1 -hostfile nodelist-$SLURM_JOB_ID `which namd2` +setcpuaffinity +pemap 0-127:4 +idlepoll +ppn 32 +p $((32*SLURM_NTASKS)) stmv.namd&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
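The &amp;lt;code&amp;gt;+pemap 0-127:4&amp;lt;/code&amp;gt; option in the command above can be cross-checked against &amp;lt;code&amp;gt;+ppn 32&amp;lt;/code&amp;gt;. The sketch below assumes that the map means "hardware threads 0 to 127 in steps of 4" (an interpretation of the range syntax that is not stated on this page) and counts how many threads that selects; the function name &amp;lt;code&amp;gt;count_pemap&amp;lt;/code&amp;gt; is hypothetical.&lt;br /&gt;

```shell
#!/bin/bash
# Hypothetical helper: count the hardware threads selected by a
# first-last:stride map such as the 0-127:4 used above (assumed meaning:
# every stride-th thread from first to last, inclusive).
count_pemap() {
    local first=$1 last=$2 stride=$3
    # seq lists first, first+stride, ... up to last; wc counts the entries.
    seq "${first}" "${stride}" "${last}" | wc -l | tr -d ' '
}

count_pemap 0 127 4    # one thread out of every four
```

Under that assumption the count is 32, matching the 32 CPU threads per process requested with &amp;lt;code&amp;gt;+ppn 32&amp;lt;/code&amp;gt;.&lt;br /&gt;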
==== Running with one process per GPU (4 GPUs)====&lt;br /&gt;
NAMD may scale better if using '''one process per GPU'''. Please do your own benchmark.&lt;br /&gt;
An example of the job script (using 1 node, '''one process per GPU''',  8 CPU threads per process):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=4&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 spectrum-mpi/10.4.0 namd/2.14&lt;br /&gt;
scontrol show hostnames &amp;gt; nodelist-$SLURM_JOB_ID&lt;br /&gt;
&lt;br /&gt;
`which charmrun` -npernode 4 -hostfile nodelist-$SLURM_JOB_ID `which namd2` +setcpuaffinity +pemap 0-127:4 +idlepoll +ppn 8 +p $((8*SLURM_NTASKS)) stmv.namd&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Open-CE ==&lt;br /&gt;
[https://github.com/open-ce/open-ce Open-CE] is an '''IBM''' repo for feedstock collection, environment data, and scripts for building TensorFlow, PyTorch, XGBoost, and other related packages and dependencies. Open-CE is distributed as a '''conda channel''' on the Mist cluster.&lt;br /&gt;
Available packages and versions are listed in the [https://github.com/open-ce/open-ce/releases Open-CE Releases]. Currently only Python 3.7 and 3.8 are supported.&lt;br /&gt;
&lt;br /&gt;
*Packages can be installed by specifying the Open-CE conda channel:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda install -c /scinet/mist/ibm/open-ce/1.2.2 python=3.x PACKAGE&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*Once the installation finishes, please clean the cache:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf $HOME/.conda/pkgs/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== PyTorch ==&lt;br /&gt;
=== Installing from IBM Conda Channel ===&lt;br /&gt;
The easiest way to install PyTorch on Mist is to use IBM's Conda channel. Users need to prepare a conda environment and install PyTorch from IBM's Conda channel.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n pytorch_env python=3.7&lt;br /&gt;
source activate pytorch_env&lt;br /&gt;
conda install -c https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda/ pytorch=1.3.1&lt;br /&gt;
# Or, from the early-access channel:&lt;br /&gt;
conda install -c https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda-early-access/ pytorch=1.5.0&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
'''NEWER VERSIONS FROM OPEN-CE:'''&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n pytorch_env python=3.8  # or python=3.7&lt;br /&gt;
source activate pytorch_env&lt;br /&gt;
conda install -c /scinet/mist/ibm/open-ce/1.2.2 pytorch=1.7.1&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Once the installation finishes, please clean the cache:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf $HOME/.conda/pkgs/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To get deterministic results, add the command below to your job script before the python command; see [https://github.com/pytorch/pytorch/issues/39849] for details.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
export CUBLAS_WORKSPACE_CONFIG=:4096:2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== RAPIDS ==&lt;br /&gt;
[https://rapids.ai RAPIDS] is a suite of open-source software libraries for running end-to-end data science and analytics pipelines entirely on GPUs. The RAPIDS data science framework includes a collection of libraries: '''cuDF (GPU DataFrames)''', '''cuML (GPU Machine Learning Algorithms)''', '''cuStrings (GPU String Manipulation)''', etc.&lt;br /&gt;
&lt;br /&gt;
=== Installing from IBM Conda Channel ===&lt;br /&gt;
The easiest way to install RAPIDS on Mist is to use IBM's Conda channel. Users need to prepare a conda environment with Python 3.6 or 3.7 and install powerai-rapids from IBM's Conda channel.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n rapids_env python=3.7&lt;br /&gt;
source activate rapids_env&lt;br /&gt;
conda install -c https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda/ powerai-rapids&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Once the installation finishes, please clean the cache:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf $HOME/.conda/pkgs/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== TensorFlow and Keras ==&lt;br /&gt;
=== Installing from IBM Conda Channel ===&lt;br /&gt;
The easiest way to install TensorFlow and Keras on Mist is to use IBM's Conda channel. Users need to prepare a conda environment and install tensorflow-gpu from IBM's Conda channel.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n tf_env python=3.7&lt;br /&gt;
source activate tf_env&lt;br /&gt;
conda install -c https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda/ tensorflow-gpu==2.1.2&lt;br /&gt;
# Or, from the early-access channel:&lt;br /&gt;
conda install -c https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda-early-access/ tensorflow-gpu==2.2.0&lt;br /&gt;
# If you need a TF 1.x version:&lt;br /&gt;
conda install -c https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda/ tensorflow-gpu==1.15.4&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
'''NEWER VERSIONS FROM OPEN-CE:'''&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n tf_env python=3.8  # or python=3.7&lt;br /&gt;
source activate tf_env&lt;br /&gt;
conda install -c /scinet/mist/ibm/open-ce/1.2.2 tensorflow==2.4.2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Once the installation finishes, please clean the cache:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf $HOME/.conda/pkgs/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Testing and debugging =&lt;br /&gt;
You really should test your code before you submit it to the cluster to know if your code is correct and what kind of resources you need.&lt;br /&gt;
* Small test jobs can be run on the login node.  Rule of thumb: tests should run for no more than a couple of minutes, take at most about 1-2GB of memory, and use no more than one GPU and a few cores.&lt;br /&gt;
&amp;lt;!-- * You can run the [[Parallel Debugging with DDT|DDT]] debugger on the login nodes after &amp;lt;code&amp;gt;module load ddt&amp;lt;/code&amp;gt;. --&amp;gt;&lt;br /&gt;
* For short tests that do not fit on a login node, or for which you need a dedicated node, request an interactive debug job with the &amp;lt;tt&amp;gt;debugjob&amp;lt;/tt&amp;gt; command:&lt;br /&gt;
 mist-login01:~$ debugjob --clean -g G&lt;br /&gt;
where G is the number of GPUs. If G=1, this gives an interactive session for 2 hours, whereas G=4 gets you a single node with 4 GPUs for 30 minutes, and G=8 (the maximum) gets you 2 nodes, each with 4 GPUs, for 30 minutes.  The &amp;lt;tt&amp;gt;--clean&amp;lt;/tt&amp;gt; argument is optional but recommended, as it starts the session without any modules loaded, thus mimicking more closely what happens when you submit a job script.&lt;br /&gt;
&lt;br /&gt;
= Submitting jobs =&lt;br /&gt;
Once you have compiled and tested your code or workflow on the Mist login nodes, and confirmed that it behaves correctly, you are ready to submit jobs to the cluster.  Your jobs will run on some of Mist's 53 compute nodes.  When and where your job runs is determined by the scheduler.&lt;br /&gt;
&lt;br /&gt;
Mist uses SLURM as its job scheduler. It is configured to allow only '''Single-GPU jobs''' and '''Full-node jobs (4 GPUs per node)'''.&lt;br /&gt;
&lt;br /&gt;
You submit jobs from a login node by passing a script to the sbatch command:&lt;br /&gt;
&lt;br /&gt;
mist-login01:scratch$ sbatch jobscript.sh&lt;br /&gt;
&lt;br /&gt;
This puts the job in the queue. It will run on the compute nodes in due course. In most cases, you should not submit from your $HOME directory, but rather, from your $SCRATCH directory, so that the output of your compute job can be written out (as mentioned above, $HOME is read-only on the compute nodes).&lt;br /&gt;
&lt;br /&gt;
Example job scripts can be found below.&lt;br /&gt;
Keep in mind:&lt;br /&gt;
* Scheduling is by single GPU or by full node, so you can request only 1 GPU or 4 GPUs per node.&lt;br /&gt;
* Your job's maximum walltime is 24 hours. &lt;br /&gt;
* Jobs must write their output to your scratch or project directory (home is read-only on compute nodes).&lt;br /&gt;
* Compute nodes have no internet access.&lt;br /&gt;
* Your job script will not remember the modules you have loaded, so it needs to contain &amp;quot;module load&amp;quot; commands of all the required modules (see examples below). &lt;br /&gt;
== SOSCIP Users ==&lt;br /&gt;
*[https://www.soscip.org SOSCIP] is a consortium to bring together industrial partners and academic researchers and provide them with sophisticated advanced computing technologies and expertise to solve social, technical and business challenges across sectors and drive economic growth.&lt;br /&gt;
&lt;br /&gt;
If you are working on a SOSCIP project, please contact [mailto:soscip-support@scinet.utoronto.ca soscip-support@scinet.utoronto.ca] to have your user account added to the SOSCIP project accounts. SOSCIP users need to submit jobs with an additional SLURM flag to get higher priority:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#SBATCH -A soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt;    #e.g. soscip-3-001&lt;br /&gt;
# or, equivalently:&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Single-GPU job script ==&lt;br /&gt;
Each single-GPU job gets a quarter of a node: 1 GPU, 8 CPU cores (32 hardware threads), and ~58GB of CPU memory. '''Users should never request CPUs or memory explicitly.''' When running an MPI program, set --ntasks to the number of MPI ranks. '''Do NOT set --ntasks for non-MPI programs.''' &lt;br /&gt;
*It is suggested to use NVIDIA Multi-Process Service (MPS) if running multiple MPI ranks on one GPU.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
#SBATCH --time=1:00:0&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt; #For SOSCIP projects only&lt;br /&gt;
&lt;br /&gt;
module load anaconda3&lt;br /&gt;
source activate conda_env&lt;br /&gt;
python code.py ...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Full-node job script ==&lt;br /&gt;
'''If you are not sure the program can be executed on multiple GPUs, please follow the single-gpu job instruction above or contact SciNet/SOSCIP support.'''&lt;br /&gt;
&lt;br /&gt;
A multi-GPU job should request at least one full node (4 GPUs). Users need to specify the &amp;quot;compute_full_node&amp;quot; partition in order to get all the resources on a node. &lt;br /&gt;
*An example for a 1-node job:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=4 # this only affects MPI jobs&lt;br /&gt;
#SBATCH --time=1:00:00&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt; #For SOSCIP projects only&lt;br /&gt;
&lt;br /&gt;
module load &amp;lt;modules you need&amp;gt;&lt;br /&gt;
# Run your program here&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Jupyter Notebooks =&lt;br /&gt;
SciNet’s [[Jupyter Hub]] is a Niagara-type node; it has a different CPU architecture and no GPUs. Conda environments prepared on Mist will not work properly there. Users who need Jupyter Notebook to develop and test some aspects of their workflow can create their own server on the Mist login node and use an SSH tunnel to connect to it from outside. Users who choose to do so have to keep in mind that the login node is a shared resource, and heavy calculations should be done only on compute nodes. Processes (including IPython kernels used by the notebooks) are limited to one hour of total CPU time: idle time is not counted toward this one hour, and use of multiple cores counts proportionally to the number of cores (e.g. a kernel using all 128 virtual cores on the node will be killed after 28 seconds). Idle notebooks can still burden the node by hogging system and GPU memory; please be mindful of other users and terminate notebooks when your work is done.&lt;br /&gt;
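The proportional accounting of CPU time can be made concrete with a short calculation. This is only a sketch: it assumes that a process keeps a fixed number of cores fully busy, and the function name is invented for this example.&lt;br /&gt;

```shell
#!/bin/bash
# Wall-clock seconds until a process reaches the CPU-time limit,
# assuming it keeps the given number of cores fully busy.
cpu_limit_seconds=3600   # one hour of total CPU time, as stated above

seconds_until_killed() {
    local busy_cores="$1"
    # Integer division: fractions of a second are dropped.
    echo $((cpu_limit_seconds / busy_cores))
}

seconds_until_killed 128   # all virtual cores on the login node
seconds_until_killed 4     # a kernel keeping four cores busy
```

For 128 busy cores this prints 28, matching the figure quoted above; for 4 busy cores it prints 900, i.e. 15 minutes of wall-clock time.&lt;br /&gt;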
&lt;br /&gt;
As an example, let us create a new Conda environment and activate it:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n jupyter_env python=3.7&lt;br /&gt;
source activate jupyter_env&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Install the Jupyter Notebook server:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda install notebook&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Running the notebook server ==&lt;br /&gt;
When the Conda environment is active, enter:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
jupyter-notebook&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
By default, the Jupyter Notebook server uses port 8888 (can be overridden with the &amp;lt;code&amp;gt;--port&amp;lt;/code&amp;gt; option). If another user has already started their own server, the default port may be busy, in which case the server will be listening on a different port. Once launched, the server will output some information to the terminal that will include the actual port number used and a 48-character token. For example:&lt;br /&gt;
&amp;lt;pre&amp;gt;http://localhost:8890/?token=54c4090d……&amp;lt;/pre&amp;gt;&lt;br /&gt;
In this example, the server is listening on port 8890.&lt;br /&gt;
&lt;br /&gt;
== Creating a tunnel ==&lt;br /&gt;
In order to access this port remotely (i.e. from your office or home), an [https://en.wikipedia.org/wiki/Tunneling_protocol#Secure_Shell_tunneling SSH tunnel] has to be established. Please refer to your SSH client’s documentation for instructions on how to do that. For the OpenSSH client (standard in most Linux distributions and macOS), a tunnel can be opened in a separate terminal session to the one where the Jupyter Notebook server is running. In the new terminal, issue this command:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -L8888:localhost:8890 &amp;lt;username&amp;gt;@mist.scinet.utoronto.ca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Replace &amp;lt;code&amp;gt;&amp;lt;username&amp;gt;&amp;lt;/code&amp;gt; with your actual username. The tunnel stays open as long as this SSH connection is alive. In this example, we tunnel the Mist login node’s port 8890 (where our server is assumed to be running) to our home computer’s port 8888 (any other free port is fine). The notebook can then be accessed in the browser at the &amp;lt;code&amp;gt;&amp;lt;nowiki&amp;gt;http://localhost:8888&amp;lt;/nowiki&amp;gt;&amp;lt;/code&amp;gt; address (followed by &amp;lt;code&amp;gt;/?token=54c4090d……&amp;lt;/code&amp;gt;, or the token can be entered on the webpage).&lt;br /&gt;
&lt;br /&gt;
== Using Jupyter on compute nodes ==&lt;br /&gt;
&lt;br /&gt;
You can use the instructions here to set up a Jupyter Notebook server on a compute node (including a [[#Testing_and_debugging|debugjob]]). '''We strongly discourage''' you from running an interactive notebook on a compute node (other than for a debugjob): scheduled jobs run at arbitrary times and are not meant to be interactive. Jupyter notebooks can be run non-interactively or converted to Python scripts.&lt;br /&gt;
&lt;br /&gt;
To launch the Jupyter Notebook server, load the &amp;lt;code&amp;gt;anaconda3&amp;lt;/code&amp;gt; module and activate your environment as before (by adding the appropriate lines to the submission script, if you are not using the compute node with an interactive shell). Launching the server has to be done like so:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
HOME=/dev/shm/$USER jupyter-notebook&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
That is because Jupyter will fail unless it can write to the home folder, which is read-only from compute nodes. This modification of the &amp;lt;code&amp;gt;$HOME&amp;lt;/code&amp;gt; environment variable will carry over into the notebooks, which is usually not a problem, but in case the notebook relies on this environment variable (e.g. to read certain files), it can be reset manually in the notebook (&amp;lt;code&amp;gt;import os; os.environ['HOME']=……&amp;lt;/code&amp;gt;).&lt;br /&gt;
&lt;br /&gt;
Because compute nodes are not accessible from the Internet, tunneling has to be done twice, once from the remote location (office or home) to the Mist login node, and then from the login node to the compute node. Assuming the server is running on port 8890 of the mist006 node, open the first tunnel in a new terminal session in the remote computer:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -L8888:localhost:9999 &amp;lt;username&amp;gt;@mist.scinet.utoronto.ca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
where 9999 is any available port on the Mist login node (to test port availability enter &amp;lt;code&amp;gt;ss -Hln src :9999&amp;lt;/code&amp;gt; in the terminal when connected to the Mist login node; an empty output indicates that the port is free). In the same session in the login node that was created with the above command, open the second tunnel to the compute node:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -L9999:localhost:8890 mist006&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Be aware that the second tunnel will automatically disconnect once the job on the compute node times out or is relinquished. The Jupyter Notebook server running on the compute node can now be accessed from the browser as in the previous subsection.&lt;br /&gt;
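The port check described above (&amp;lt;code&amp;gt;ss -Hln src :9999&amp;lt;/code&amp;gt;) can be wrapped in a small loop that looks for a free port on the login node. This is a sketch only: the function name is hypothetical, and a port found this way can still be taken by another user before the tunnel is opened.&lt;br /&gt;

```shell
#!/bin/bash
# Hypothetical helper: print the first port at or above the given one
# for which "ss" reports no listening socket on this machine.
first_free_port() {
    local port="$1"
    # Non-empty output from ss means something is already listening.
    while [ -n "$(ss -Hln src ":$port")" ]; do
        port=$((port + 1))
    done
    echo "$port"
}

first_free_port 9999
```

The printed number would then replace 9999 in both tunnel commands above.&lt;br /&gt;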
&lt;br /&gt;
&lt;br /&gt;
= Support =&lt;br /&gt;
&lt;br /&gt;
SciNet inquiries:&lt;br /&gt;
* [mailto:support@scinet.utoronto.ca support@scinet.utoronto.ca]&lt;br /&gt;
&lt;br /&gt;
SOSCIP inquiries:&lt;br /&gt;
*[mailto:soscip-support@scinet.utoronto.ca soscip-support@scinet.utoronto.ca]&lt;/div&gt;</summary>
		<author><name>Ymeiron</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=3156</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=3156"/>
		<updated>2021-07-20T23:09:37Z</updated>

		<summary type="html">&lt;p&gt;Ymeiron: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
{| style=&amp;quot;border-spacing:10px; width: 95%&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:1em; padding-top:.1em; border:2px solid #0645ad; background-color:#f6f6f6; border-radius:7px&amp;quot;|&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Use &amp;quot;Up&amp;quot; or &amp;quot;Down&amp;quot;; these are templates. --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;width:100%&amp;quot; &lt;br /&gt;
|{{Up |Niagara|Niagara_Quickstart}}&lt;br /&gt;
|{{Up |Mist|Mist}}&lt;br /&gt;
|{{Up |Teach|Teach}}&lt;br /&gt;
|{{Up |Rouge|Rouge}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Down |Jupyter Hub|Jupyter_Hub}}&lt;br /&gt;
|{{up |Scheduler|Niagara_Quickstart#Submitting_jobs}}&lt;br /&gt;
|{{Up |File system|Niagara_Quickstart#Storage_and_quotas}}&lt;br /&gt;
|{{Up |Burst Buffer|Burst_Buffer}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Down |HPSS|HPSS}}&lt;br /&gt;
|{{Up |Login Nodes|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up |External Network|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Down |Globus|Globus}}&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Current Messages: --&amp;gt;&lt;br /&gt;
&amp;lt;b&amp;gt; July 20th, 2021, 7:00 PM :&amp;lt;/b&amp;gt; &amp;lt;b&amp;gt; SLURM configuration&amp;lt;/b&amp;gt; - Changed the default behaviour to kill a job step if any task exits with a non-zero exit code. If your code is able to handle failures gracefully, please add srun's option --no-kill to recover the previous default behaviour.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt; July 20th, 2021, 7:00 PM :&amp;lt;/b&amp;gt; Maintenance finished, systems are back online.   &lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;SciNet Downtime July 20th, 2021 (Tuesday):&amp;lt;/b&amp;gt; There will be a maintenance shutdown of the SciNet data center on Tuesday July 20th, starting at 7 am EDT. There will be no access to any of the SciNet systems (Niagara, Mist, HPSS, Teach cluster, or the file systems) during this time.  We expect to be able to bring the systems back online in the evening of July 20th.  The status of the Niagara cluster can be checked on status.computecanada.ca. For up-to-date and more detailed information on the status of all the SciNet systems, you can always check back here.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;June 28th, 2021, 4:06 PM:&amp;lt;/b&amp;gt; Mist OS upgrade is complete.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;May 27, 2021:&amp;lt;/b&amp;gt; Datamovers addresses have changed to improve high bandwidth connectivity and cybersecurity. The new addresses are 142.1.174.227 for nia-datamover1.scinet.utoronto.ca, and 142.1.174.228 for nia-datamover2.scinet.utoronto.ca.&lt;br /&gt;
&lt;br /&gt;
If you have jobs that need to connect to a software license server using an ssh tunnel through nia-gw (which actually resolves to datamover1 or datamover2), you may need to ask the system administrators of that license server to allow incoming connections from the new addresses above.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  When removing system status entries, please archive them to: https://docs.scinet.utoronto.ca/index.php/Previous_messages --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;border-spacing: 10px;width: 100%&amp;quot;&lt;br /&gt;
|valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== QuickStart Guides ==&lt;br /&gt;
* [[Niagara Quickstart]]&lt;br /&gt;
* [[HPSS | HPSS archival storage]]&lt;br /&gt;
* [[Mist| Mist Power 9 GPU cluster]]&lt;br /&gt;
* [[Teach|Teach cluster]]&lt;br /&gt;
* [[FAQ | FAQ (frequently asked questions)]]&lt;br /&gt;
* [[Acknowledging SciNet]]&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== Tutorials, Manuals, etc. ==&lt;br /&gt;
* [https://support.scinet.utoronto.ca/education/browse.php SciNet education material]&lt;br /&gt;
* [https://www.youtube.com/c/SciNetHPCattheUniversityofToronto SciNet's YouTube channel]&lt;br /&gt;
* [[Modules specific to Niagara|Software Modules specific to Niagara]] &lt;br /&gt;
* [[Commercial software]]&lt;br /&gt;
* [[Burst Buffer]]&lt;br /&gt;
* [[SSH Tunneling]]&lt;br /&gt;
* [[SSH#Two-Factor_authentication|Two-Factor Authentication]]&lt;br /&gt;
* [[Visualization]]&lt;br /&gt;
* [[Running Serial Jobs on Niagara]]&lt;br /&gt;
* [[Jupyter Hub]]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Ymeiron</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Mist&amp;diff=3139</id>
		<title>Mist</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Mist&amp;diff=3139"/>
		<updated>2021-07-12T19:48:32Z</updated>

		<summary type="html">&lt;p&gt;Ymeiron: Reverted edits by Ymeiron (talk) to last revision by Feimao&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox Computer&lt;br /&gt;
|image=[[Image:Mist.jpg|center|300px|thumb]]&lt;br /&gt;
|name=Mist&lt;br /&gt;
|installed=Dec 2019&lt;br /&gt;
|operatingsystem= Red Hat Enterprise Linux 7.6 &lt;br /&gt;
|loginnode= mist.scinet.utoronto.ca&lt;br /&gt;
|nnodes=  54 IBM AC922&lt;br /&gt;
|rampernode= 256 GB  &lt;br /&gt;
|gpuspernode=4 V100-SXM2-32GB&lt;br /&gt;
|interconnect=Mellanox EDR&lt;br /&gt;
|vendorcompilers= NVCC, IBM XL&lt;br /&gt;
|queuetype=Slurm&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
=Specifications=&lt;br /&gt;
Mist is a SciNet-[[#SOSCIP Users |SOSCIP]] joint GPU cluster consisting of 54 IBM AC922 servers. Each node of the cluster has 32 IBM Power9 cores, 256GB of RAM and 4 NVIDIA V100-SXM2-32GB GPUs connected by NVLink. The cluster has an InfiniBand EDR interconnect providing GPUDirect RDMA capability.&lt;br /&gt;
&lt;br /&gt;
= Getting started on Mist =&lt;br /&gt;
Mist can be accessed directly.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -Y MYCCUSERNAME@mist.scinet.utoronto.ca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The Mist login node '''mist-login01''' can also be accessed via the Niagara cluster.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -Y MYCCUSERNAME@niagara.scinet.utoronto.ca&lt;br /&gt;
ssh -Y mist-login01&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
== Storage ==&lt;br /&gt;
The filesystem for Mist is shared with the Niagara cluster. See [https://docs.scinet.utoronto.ca/index.php/Niagara_Quickstart#Your_various_directories Niagara Storage] for more details.&lt;br /&gt;
&lt;br /&gt;
= Loading software modules =&lt;br /&gt;
&lt;br /&gt;
You have two options for running code on Mist: use existing software, or compile your own.  This section focuses on the former.&lt;br /&gt;
&lt;br /&gt;
Other than essentials, all installed software is made available [[Using_modules | using module commands]]. These modules set environment variables (PATH, etc.), allowing multiple, conflicting versions of a given package to be available.  A detailed explanation of the module system can be [[Using_modules | found on the modules page]].&lt;br /&gt;
&lt;br /&gt;
Common module subcommands are:&lt;br /&gt;
&lt;br /&gt;
* &amp;lt;code&amp;gt;module load &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt;: load the default version of a particular software.&lt;br /&gt;
* &amp;lt;code&amp;gt;module load &amp;lt;module-name&amp;gt;/&amp;lt;module-version&amp;gt;&amp;lt;/code&amp;gt;: load a specific version of a particular software.&lt;br /&gt;
* &amp;lt;code&amp;gt;module purge&amp;lt;/code&amp;gt;: unload all currently loaded modules.&lt;br /&gt;
* &amp;lt;code&amp;gt;module spider&amp;lt;/code&amp;gt; (or &amp;lt;code&amp;gt;module spider &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt;): list available software packages.&lt;br /&gt;
* &amp;lt;code&amp;gt;module avail&amp;lt;/code&amp;gt;: list loadable software packages.&lt;br /&gt;
* &amp;lt;code&amp;gt;module list&amp;lt;/code&amp;gt;: list loaded modules.&lt;br /&gt;
&lt;br /&gt;
Along with modifying common environment variables, such as PATH, and LD_LIBRARY_PATH, these modules also create a SCINET_MODULENAME_ROOT environment variable, which can be used to access commonly needed software directories, such as /include and /lib.&lt;br /&gt;
&lt;br /&gt;
There are handy abbreviations for the module commands. &amp;lt;code&amp;gt;ml&amp;lt;/code&amp;gt; is the same as &amp;lt;code&amp;gt;module list&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;ml &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt; is the same as &amp;lt;code&amp;gt;module load &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt;.&lt;br /&gt;
== Tips for loading software ==&lt;br /&gt;
&lt;br /&gt;
* We advise '''''against''''' loading modules in your .bashrc.  This can lead to very confusing behaviour under certain circumstances.  Our guidelines for .bashrc files can be found [[bashrc guidelines|here]].&lt;br /&gt;
* Instead, load modules by hand when needed, or by sourcing a separate script.&lt;br /&gt;
* Load run-specific modules inside your job submission script.&lt;br /&gt;
* Short names give default versions; e.g. &amp;lt;code&amp;gt;cuda&amp;lt;/code&amp;gt; → &amp;lt;code&amp;gt;cuda/11.0.3&amp;lt;/code&amp;gt;. It is usually better to be explicit about the versions, for future reproducibility.&lt;br /&gt;
* Modules often require other modules to be loaded first.  Solve these dependencies by using [[Using_modules#Module_spider | &amp;lt;code&amp;gt;module spider&amp;lt;/code&amp;gt;]].&lt;br /&gt;
&lt;br /&gt;
= Available compilers and interpreters =&lt;br /&gt;
* The &amp;lt;tt&amp;gt;cuda&amp;lt;/tt&amp;gt; module has to be loaded first for GPU software.&lt;br /&gt;
* For most compiled software, one should use the GNU compilers (&amp;lt;tt&amp;gt;gcc&amp;lt;/tt&amp;gt; for C, &amp;lt;tt&amp;gt;g++&amp;lt;/tt&amp;gt; for C++, and &amp;lt;tt&amp;gt;gfortran&amp;lt;/tt&amp;gt; for Fortran). Loading the &amp;lt;tt&amp;gt;gcc&amp;lt;/tt&amp;gt; module makes these available. &lt;br /&gt;
* The IBM XL compiler suite (&amp;lt;tt&amp;gt;xlc_r, xlc++_r, xlf_r&amp;lt;/tt&amp;gt;) is also available if you load one of the &amp;lt;tt&amp;gt;xl&amp;lt;/tt&amp;gt; modules.&lt;br /&gt;
* To compile MPI code, you must additionally load an &amp;lt;tt&amp;gt;openmpi&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;spectrum-mpi&amp;lt;/tt&amp;gt; module.&lt;br /&gt;
&lt;br /&gt;
=== CUDA ===&lt;br /&gt;
&lt;br /&gt;
The currently installed CUDA Toolkits are '''11.0.3''' and '''10.2.2 (10.2.89)''':&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load cuda/11.0.3&lt;br /&gt;
module load cuda/10.2.2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*A compiler (GCC, XL or NVHPC/PGI) module must be loaded in order to use CUDA to build any code.&lt;br /&gt;
The current NVIDIA driver version is 450.119.04.&lt;br /&gt;
&lt;br /&gt;
===GNU Compilers ===&lt;br /&gt;
&lt;br /&gt;
Available GCC modules are:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
gcc/9.3.0 (must load CUDA 11)&lt;br /&gt;
gcc/8.5.0 (must load CUDA 10)&lt;br /&gt;
gcc/10.3.0 (w/o CUDA)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== IBM XL Compilers ===&lt;br /&gt;
&lt;br /&gt;
To load the native IBM xlc/xlc++ and xlf (Fortran) compilers, run&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load xl/16.1.1.10&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
IBM XL Compilers are enabled for use with NVIDIA GPUs, including support for OpenMP GPU offloading and integration with NVIDIA's nvcc command to compile host-side code for the POWER9 CPU. Information about the IBM XL Compilers can be found at the following links:[https://www.ibm.com/support/knowledgecenter/SSXVZZ_16.1.1/com.ibm.compilers.linux.doc/welcome.html IBM XL C/C++], &lt;br /&gt;
[https://www.ibm.com/support/knowledgecenter/SSAT4T_16.1.1/com.ibm.compilers.linux.doc/welcome.html IBM XL Fortran]&lt;br /&gt;
&lt;br /&gt;
=== OpenMPI ===&lt;br /&gt;
The &amp;lt;tt&amp;gt;openmpi/&amp;lt;version&amp;gt;&amp;lt;/tt&amp;gt; module is available for different compilers, including GCC and XL. The &amp;lt;tt&amp;gt;spectrum-mpi/&amp;lt;version&amp;gt;&amp;lt;/tt&amp;gt; module provides IBM Spectrum MPI.&lt;br /&gt;
&lt;br /&gt;
=== NVHPC/PGI ===&lt;br /&gt;
The PGI compiler is provided as part of NVHPC (NVIDIA HPC SDK).&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load nvhpc/21.3&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Softwares =&lt;br /&gt;
== Amber20 ==&lt;br /&gt;
&lt;br /&gt;
Users who hold an Amber20 license can build Amber20 from its source code and run it on Mist. '''SOSCIP/SciNet does not provide an Amber license or source code.'''&lt;br /&gt;
&lt;br /&gt;
=== Building Amber20 ===&lt;br /&gt;
Modules that are needed for building Amber20:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load MistEnv/2021a cuda/10.2.2 gcc/8.5.0 anaconda3/2021.05 cmake/3.19.8&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Cmake configuration:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
cmake .. -DCMAKE_INSTALL_PREFIX=$HOME/where-amber-install -DCOMPILER=GNU -DMPI=FALSE -DCUDA=TRUE -DINSTALL_TESTS=TRUE -DDOWNLOAD_MINICONDA=FALSE -DOPENMP=TRUE -DNCCL=FALSE -DAPPLY_UPDATES=TRUE&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Running Amber20 ===&lt;br /&gt;
'''On NVIDIA Pascal P100 and later GPUs such as the V100, Amber20 does not scale beyond a single GPU'''. It is highly recommended to run Amber20 as a single-GPU job.&lt;br /&gt;
A job example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
#SBATCH --time=1:00:0&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP-project-ID&amp;gt;&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/10.2.2 gcc/8.5.0 anaconda3/2021.05&lt;br /&gt;
export PATH=$HOME/where-amber-install/bin:$PATH&lt;br /&gt;
export LD_LIBRARY_PATH=$HOME/where-amber-install/lib:$LD_LIBRARY_PATH&lt;br /&gt;
pmemd.cuda .... &amp;lt;parameters&amp;gt; ...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Anaconda (Python) ==&lt;br /&gt;
Anaconda is a popular distribution of the Python programming language. It contains several common Python libraries such as SciPy and NumPy as pre-built packages, which eases installation. Anaconda is provided as a module: '''anaconda3'''&lt;br /&gt;
&lt;br /&gt;
To set up your own Python environment, load the module and create a conda environment:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n myPythonEnv python=3.8&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*Note: By default, conda environments are located in '''$HOME/.conda/envs'''. The cache (downloaded tarballs and packages) is under '''$HOME/.conda/pkgs'''. Users may run into disk quota problems if too many environments are created. To clean the conda cache, '''please run: &amp;quot;conda clean -y --all&amp;quot; and &amp;quot;rm -rf $HOME/.conda/pkgs/*&amp;quot; after installing packages'''.&lt;br /&gt;
&lt;br /&gt;
To activate the conda environment (this should be done before running python):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
source activate myPythonEnv&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Note that you SHOULD NOT use '''conda activate myPythonEnv''' to activate the environment.  This leads to all sorts of problems.  Once the environment is activated, users can update or install packages via '''conda''' or '''pip''':&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda install &amp;lt;package_name&amp;gt;  # preferred way to install packages&lt;br /&gt;
pip install &amp;lt;package_name&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*Once the installation finishes, please clean the cache:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf $HOME/.conda/pkgs/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To deactivate:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
source deactivate&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To remove a conda environment:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda remove --name myPythonEnv --all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To verify that the environment was removed, run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda info --envs&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
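Because environments and cached packages both count against the home quota, it can help to check how much space they use before and after cleaning. The following is a minimal sketch, assuming the default locations named in the note above; the function name is hypothetical and not part of conda or of the Mist software stack.&lt;br /&gt;

```shell
#!/bin/bash
# Report the disk usage of the conda environment and package-cache
# directories. The base directory defaults to $HOME/.conda (the default
# location described above); pass another path as the first argument.
report_conda_usage() {
    local base=${1:-$HOME/.conda}
    local sub
    for sub in envs pkgs; do
        if [ -d "$base/$sub" ]; then
            # du -sh prints one human-readable total for the directory
            du -sh "$base/$sub"
        else
            echo "missing $base/$sub"
        fi
    done
}

report_conda_usage
```

Running it again after cleaning the cache shows how much space was recovered from the package cache.&lt;br /&gt;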
&lt;br /&gt;
=== Submitting Python Job ===&lt;br /&gt;
A single-GPU job example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
#SBATCH --time=1:00:00&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt; #For SOSCIP projects only&lt;br /&gt;
&lt;br /&gt;
module load anaconda3&lt;br /&gt;
source activate myPythonEnv&lt;br /&gt;
python code.py ...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== CuPy ==&lt;br /&gt;
[https://cupy.chainer.org CuPy] is an open-source matrix library accelerated with NVIDIA CUDA. It also uses CUDA-related libraries including cuBLAS, cuDNN, cuRAND, cuSOLVER, cuSPARSE, cuFFT and NCCL to make full use of the GPU architecture. CuPy is an implementation of a NumPy-compatible multi-dimensional array on CUDA. It consists of the core multi-dimensional array class, cupy.ndarray, and many functions that operate on it, and supports a subset of the numpy.ndarray interface.&lt;br /&gt;
&lt;br /&gt;
CuPy can be installed into any conda environment. The Python packages numpy, six and fastrlock are required; cuDNN and NCCL are optional.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0  nccl/2.9.9 anaconda3/2021.05&lt;br /&gt;
conda create -n cupy-env python=3.8 numpy six fastrlock&lt;br /&gt;
source activate cupy-env&lt;br /&gt;
CFLAGS=&amp;quot;-I$MODULE_CUDNN_PREFIX/include -I$MODULE_NCCL_PREFIX/include -I$MODULE_CUDA_PREFIX/include&amp;quot; LDFLAGS=&amp;quot;-L$MODULE_CUDNN_PREFIX/lib64 -L$MODULE_NCCL_PREFIX/lib&amp;quot; CUDA_PATH=$MODULE_CUDA_PREFIX pip install cupy&lt;br /&gt;
#building/installing CuPy will take a few minutes&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Gromacs ==&lt;br /&gt;
[http://www.gromacs.org/ GROMACS] is a versatile package to perform molecular dynamics, i.e. simulate the Newtonian equations of motion for systems with hundreds to millions of particles. It is primarily designed for biochemical molecules like proteins, lipids and nucleic acids that have a lot of complicated bonded interactions, but since GROMACS is extremely fast at calculating the nonbonded interactions (that usually dominate simulations) many groups are also using it for research on non-biological systems, e.g. polymers.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load MistEnv/2021a cuda/10.2.2 gcc/8.5.0 gromacs/2019.6&lt;br /&gt;
module load MistEnv/2020a cuda/10.2.89 gcc/8.3.0 openmpi/3.1.5 gromacs/2019.6   # old RHEL 7 version, for testing only&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*The Thread-MPI builds of '''GROMACS 2020 and 2021''' support GPU offload of all key computational sections. The GPU is used throughout the timestep and repeated CPU-GPU transfers are eliminated. Users are advised to verify their results carefully.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load MistEnv/2021a cuda/10.2.2 gcc/8.5.0 gromacs/2020.6&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 gromacs/2021.2&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 openmpi/4.1.1+ucx-1.10.0 gromacs/2021.2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Small/Medium Simulation ===&lt;br /&gt;
Because PME domain decomposition is not supported on GPUs, Gromacs computes PME on the CPU when multiple GPUs are used. '''It is always recommended to use a single GPU for small and medium-sized simulations with Gromacs.''' With only 1 MPI rank (with OpenMP threads) on a single GPU, both the non-bonded PP and the PME calculations are automatically offloaded to the GPU when possible.&lt;br /&gt;
* Gromacs 2019 example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/10.2.2 gcc/8.5.0 gromacs/2019.6&lt;br /&gt;
export OMP_NUM_THREADS=8&lt;br /&gt;
export OMP_PLACES=cores&lt;br /&gt;
gmx mdrun -pin off -ntmpi 1 -ntomp 8  ... &amp;lt;other parameters&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Gromacs 2020 or 2021 example: &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 gromacs/2021.2&lt;br /&gt;
export OMP_NUM_THREADS=8&lt;br /&gt;
export OMP_PLACES=cores&lt;br /&gt;
gmx mdrun -pin off -ntmpi 1 -ntomp 8 -update gpu ... &amp;lt;other parameters&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Large Simulation ===&lt;br /&gt;
If the memory available to a single-GPU job (~58GB) is not sufficient for the simulation, multiple GPUs can be used. It is suggested to start testing with one full node (4 GPUs) and to force PME onto a GPU. Multiple PME ranks are not supported with PME on GPU, so if a GPU is used for the PME calculation, -npme (the number of PME ranks) must be set to 1. If PME has less work than PP, it is suggested to run multiple ranks per GPU, so that the GPU hosting the PME rank can also do some work for PP rank(s). When running multiple MPI ranks on the same GPU, NVIDIA Multi-Process Service (MPS) must be enabled.&lt;br /&gt;
*An example using 4 GPUs with 7 PP ranks + 1 PME rank ('''-pin on -pme gpu -npme 1''' must be added to the mdrun command to force the PME calculation onto a GPU):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=8&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3  gcc/9.4.0  openmpi/4.1.1+ucx-1.10.0 gromacs/2021.2&lt;br /&gt;
&lt;br /&gt;
mkdir -p /dev/shm/nvidia-mps&lt;br /&gt;
export CUDA_MPS_PIPE_DIRECTORY=/dev/shm/nvidia-mps&lt;br /&gt;
mkdir -p /dev/shm/nvidia-log&lt;br /&gt;
export CUDA_MPS_LOG_DIRECTORY=/dev/shm/nvidia-log&lt;br /&gt;
nvidia-cuda-mps-control -d&lt;br /&gt;
&lt;br /&gt;
export OMP_NUM_THREADS=4&lt;br /&gt;
mpirun  -bind-to none gmx_mpi mdrun -pin on -pme gpu -npme 1 ... &amp;lt;add your parameters&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*It is suggested to also test using '''--ntasks=4''' and '''OMP_NUM_THREADS=8''' if you receive a NOTE in Gromacs output saying &amp;quot;% performance was lost because the PME ranks had more work to do than the PP ranks&amp;quot;. In this case, NVIDIA MPS is not needed since there is only one MPI rank per GPU.&lt;br /&gt;
*'''Please note that PME on GPU is still an initial implementation and comes with the set of limitations outlined below.'''&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
* Only a PME order of 4 is supported on GPUs.&lt;br /&gt;
* PME will run on a GPU only when exactly one rank has a PME task, i.e. decompositions with multiple ranks doing PME are not supported.&lt;br /&gt;
* Only single precision is supported.&lt;br /&gt;
* Free energy calculations where charges are perturbed are not supported, because only single PME grids can be calculated.&lt;br /&gt;
* Only dynamical integrators are supported (i.e. leap-frog, Velocity Verlet, stochastic dynamics).&lt;br /&gt;
* LJ PME is not supported on GPUs.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*An example using 4 GPUs with '''PME on CPU''' ('''-pin on''' must be added to the mdrun command for proper CPU thread binding):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=8&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3  gcc/9.4.0  openmpi/4.1.1+ucx-1.10.0 gromacs/2021.2&lt;br /&gt;
&lt;br /&gt;
mkdir -p /dev/shm/nvidia-mps&lt;br /&gt;
export CUDA_MPS_PIPE_DIRECTORY=/dev/shm/nvidia-mps&lt;br /&gt;
mkdir -p /dev/shm/nvidia-log&lt;br /&gt;
export CUDA_MPS_LOG_DIRECTORY=/dev/shm/nvidia-log&lt;br /&gt;
nvidia-cuda-mps-control -d&lt;br /&gt;
&lt;br /&gt;
export OMP_NUM_THREADS=4&lt;br /&gt;
mpirun -bind-to none gmx_mpi mdrun -pin on  ... &amp;lt;add your parameters&amp;gt;&lt;br /&gt;
&lt;br /&gt;
# &amp;quot;--ntasks=16, OMP_NUM_THREADS=2&amp;quot; and &amp;quot;--ntasks=4, OMP_NUM_THREADS=8&amp;quot; should also be tested.  &lt;br /&gt;
# num_Tasks(MPI_ranks) * num_OpenMP_threads = 32&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*'''NOTE: The above examples will NOT work across multiple nodes. If your simulation is too large for a single GPU node, please contact SciNet/SOSCIP support.'''&lt;br /&gt;
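The combinations suggested in the script comments above all satisfy the same rule: the number of MPI ranks multiplied by the number of OpenMP threads equals 32, the number of cores on a node. The following is a minimal sketch that lists the power-of-two combinations to benchmark; the function name is hypothetical, and the output is only a list of candidates, not a performance recommendation.&lt;br /&gt;

```shell
#!/bin/bash
# List (MPI ranks, OpenMP threads) pairs whose product equals the number
# of cores on one node. Rank counts are doubled on each step, so only
# power-of-two rank counts are listed.
list_rank_thread_pairs() {
    local cores=$1
    local ranks=1
    while [ "$ranks" -le "$cores" ]; do
        if [ $(( cores % ranks )) -eq 0 ]; then
            echo "--ntasks=$ranks OMP_NUM_THREADS=$(( cores / ranks ))"
        fi
        ranks=$(( ranks * 2 ))
    done
}

# 32 cores per node, as in the comment of the example script above
list_rank_thread_pairs 32
```

With 4 GPUs per node, any combination with more than 4 ranks places several ranks on each GPU, which is the case where MPS is needed.&lt;br /&gt;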
&lt;br /&gt;
== NAMD ==&lt;br /&gt;
[http://www.ks.uiuc.edu/Research/namd/ NAMD] is a parallel, object-oriented molecular dynamics code designed for high-performance simulation of large biomolecular systems.&lt;br /&gt;
=== 2.14 ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 spectrum-mpi/10.4.0 namd/2.14&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
==== Running with single GPU ====&lt;br /&gt;
If you have many jobs to run, it is always suggested to use a single GPU per job. Such jobs are easier to schedule and give better overall performance.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 spectrum-mpi/10.4.0 namd/2.14&lt;br /&gt;
scontrol show hostnames &amp;gt; nodelist-$SLURM_JOB_ID&lt;br /&gt;
&lt;br /&gt;
`which charmrun` -npernode 1 -bind-to none -hostfile nodelist-$SLURM_JOB_ID `which namd2` +idlepoll +ppn 8 +p 8 stmv.namd&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Running with one process per node (4 GPUs)====&lt;br /&gt;
An example of the job script (using 1 node, '''one process per node''',  32 CPU threads per process + 4 GPUs per process):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=1&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 spectrum-mpi/10.4.0 namd/2.14&lt;br /&gt;
scontrol show hostnames &amp;gt; nodelist-$SLURM_JOB_ID&lt;br /&gt;
&lt;br /&gt;
`which charmrun` -npernode 1 -hostfile nodelist-$SLURM_JOB_ID `which namd2` +setcpuaffinity +pemap 0-127:4 +idlepoll +ppn 32 +p $((32*SLURM_NTASKS)) stmv.namd&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
==== Running with one process per GPU (4 GPUs)====&lt;br /&gt;
NAMD may scale better when using '''one process per GPU'''. Please run your own benchmarks.&lt;br /&gt;
An example of the job script (using 1 node, '''one process per GPU''',  8 CPU threads per process):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=4&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
&lt;br /&gt;
module load MistEnv/2021a cuda/11.0.3 gcc/9.4.0 spectrum-mpi/10.4.0 namd/2.14&lt;br /&gt;
scontrol show hostnames &amp;gt; nodelist-$SLURM_JOB_ID&lt;br /&gt;
&lt;br /&gt;
`which charmrun` -npernode 4 -hostfile nodelist-$SLURM_JOB_ID `which namd2` +setcpuaffinity +pemap 0-127:4 +idlepoll +ppn 8 +p $((8*SLURM_NTASKS)) stmv.namd&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
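In both full-node scripts above, the total number of worker threads (+p) is 32, and the +pemap argument 0-127:4 selects every fourth of the 128 hardware threads. The following is a minimal sketch that checks these two numbers agree, assuming the "first-last:stride" reading of the map; the helper names are hypothetical and are not NAMD or Charm++ commands.&lt;br /&gt;

```shell
#!/bin/bash
# Count the hardware threads selected by a "first-last:stride" map.
# Arguments: first, last, stride (0 127 4 corresponds to 0-127:4).
count_pemap_slots() {
    seq "$1" "$3" "$2" | grep -c .
}

# Total worker threads requested: processes times threads per process.
total_workers() {
    local processes=$1 threads_per_process=$2
    echo $(( processes * threads_per_process ))
}

echo "slots in 0-127:4     : $(count_pemap_slots 0 127 4)"
echo "1 process x 32 ppn   : $(total_workers 1 32)"
echo "4 processes x 8 ppn  : $(total_workers 4 8)"
```

If you change +ppn or the number of processes, the product should still match the number of slots in the map.&lt;br /&gt;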
&lt;br /&gt;
== Open-CE ==&lt;br /&gt;
[https://github.com/open-ce/open-ce Open-CE] is an '''IBM''' repository that collects feedstocks, environment data, and scripts for building TensorFlow, PyTorch, XGBoost, and other related packages and dependencies. Open-CE is distributed as a '''conda channel''' on the Mist cluster.&lt;br /&gt;
Available packages and versions are listed on the [https://github.com/open-ce/open-ce/releases Open-CE Releases] page. Currently only Python 3.7 and 3.8 are supported.&lt;br /&gt;
&lt;br /&gt;
*Packages can be installed by specifying the Open-CE conda channel:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda install -c /scinet/mist/ibm/open-ce/1.2.2 python=3.x PACKAGE&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*Once the installation finishes, please clean the cache:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf $HOME/.conda/pkgs/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== PyTorch ==&lt;br /&gt;
=== Installing from IBM Conda Channel ===&lt;br /&gt;
The easiest way to install PyTorch on Mist is from IBM's Conda channel. Prepare a conda environment and install PyTorch into it from that channel:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n pytorch_env python=3.7&lt;br /&gt;
source activate pytorch_env&lt;br /&gt;
conda install -c https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda/ pytorch=1.3.1&lt;br /&gt;
# Or, from the early-access channel:&lt;br /&gt;
conda install -c https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda-early-access/ pytorch=1.5.0&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
'''NEWER VERSIONS FROM OPEN-CE:'''&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n pytorch_env python=3.8   # or python=3.7&lt;br /&gt;
source activate pytorch_env&lt;br /&gt;
conda install -c /scinet/mist/ibm/open-ce/1.2.2 pytorch=1.7.1&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Once the installation finishes, please clean the cache:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf $HOME/.conda/pkgs/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To get deterministic results, add the command below to your job script before the python command (see [https://github.com/pytorch/pytorch/issues/39849 this PyTorch issue] for details):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
export CUBLAS_WORKSPACE_CONFIG=:4096:2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== RAPIDS ==&lt;br /&gt;
[https://rapids.ai RAPIDS] is a suite of open source software libraries for executing end-to-end data science and analytics pipelines entirely on GPUs. The RAPIDS data science framework includes a collection of libraries: '''cuDF (GPU DataFrames)''', '''cuML (GPU Machine Learning Algorithms)''', '''cuStrings (GPU String Manipulation)''', etc.&lt;br /&gt;
&lt;br /&gt;
=== Installing from IBM Conda Channel ===&lt;br /&gt;
The easiest way to install RAPIDS on Mist is from IBM's Conda channel. Prepare a conda environment with Python 3.6 or 3.7 and install powerai-rapids into it from that channel:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n rapids_env python=3.7&lt;br /&gt;
source activate rapids_env&lt;br /&gt;
conda install -c https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda/ powerai-rapids&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Once the installation finishes, please clean the cache:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf $HOME/.conda/pkgs/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== TensorFlow and Keras ==&lt;br /&gt;
=== Installing from IBM Conda Channel ===&lt;br /&gt;
The easiest way to install TensorFlow and Keras on Mist is from IBM's Conda channel. Prepare a conda environment and install tensorflow-gpu into it from that channel:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n tf_env python=3.7&lt;br /&gt;
source activate tf_env&lt;br /&gt;
conda install -c https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda/ tensorflow-gpu==2.1.2&lt;br /&gt;
# Or, from the early-access channel:&lt;br /&gt;
conda install -c https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda-early-access/ tensorflow-gpu==2.2.0&lt;br /&gt;
# If you need a TF 1.x version:&lt;br /&gt;
conda install -c https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda/ tensorflow-gpu==1.15.4&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
'''NEWER VERSIONS FROM OPEN-CE:'''&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n tf_env python=3.8   # or python=3.7&lt;br /&gt;
source activate tf_env&lt;br /&gt;
conda install -c /scinet/mist/ibm/open-ce/1.2.2 tensorflow==2.4.2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Once the installation finishes, please clean the cache:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf $HOME/.conda/pkgs/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Testing and debugging =&lt;br /&gt;
Test your code before submitting it to the cluster, both to check that it is correct and to learn what resources it needs.&lt;br /&gt;
* Small test jobs can be run on the login node. Rule of thumb: tests should run for no more than a couple of minutes, take at most about 1-2GB of memory, and use no more than one GPU and a few cores.&lt;br /&gt;
&amp;lt;!-- * You can run the [[Parallel Debugging with DDT|DDT]] debugger on the login nodes after &amp;lt;code&amp;gt;module load ddt&amp;lt;/code&amp;gt;. --&amp;gt;&lt;br /&gt;
* For short tests that do not fit on a login node, or for which you need a dedicated node, request an interactive debug job with the debugjob command:&lt;br /&gt;
 mist-login01:~$ debugjob --clean -g G&lt;br /&gt;
where G is the number of GPUs. G=1 gives an interactive session for 2 hours, G=4 gives a single node with 4 GPUs for 30 minutes, and G=8 (the maximum) gives 2 nodes, each with 4 GPUs, for 30 minutes. The &amp;lt;tt&amp;gt;--clean&amp;lt;/tt&amp;gt; argument is optional but recommended: it starts the session without any modules loaded, which more closely mimics what happens when you submit a job script.&lt;br /&gt;
&lt;br /&gt;
= Submitting jobs =&lt;br /&gt;
Once you have compiled and tested your code or workflow on the Mist login nodes, and confirmed that it behaves correctly, you are ready to submit jobs to the cluster.  Your jobs will run on some of Mist's 53 compute nodes.  When and where your job runs is determined by the scheduler.&lt;br /&gt;
&lt;br /&gt;
Mist uses SLURM as its job scheduler. It is configured to allow only '''Single-GPU jobs''' and '''Full-node jobs (4 GPUs per node)'''.&lt;br /&gt;
&lt;br /&gt;
You submit jobs from a login node by passing a script to the sbatch command:&lt;br /&gt;
&lt;br /&gt;
mist-login01:scratch$ sbatch jobscript.sh&lt;br /&gt;
&lt;br /&gt;
This puts the job in the queue. It will run on the compute nodes in due course. In most cases, you should not submit from your $HOME directory, but rather, from your $SCRATCH directory, so that the output of your compute job can be written out (as mentioned above, $HOME is read-only on the compute nodes).&lt;br /&gt;
&lt;br /&gt;
Example job scripts can be found below.&lt;br /&gt;
Keep in mind:&lt;br /&gt;
* Scheduling is by single GPU or by full node, so request either 1 GPU or 4 GPUs per node.&lt;br /&gt;
* Your job's maximum walltime is 24 hours. &lt;br /&gt;
* Jobs must write their output to your scratch or project directory (home is read-only on compute nodes).&lt;br /&gt;
* Compute nodes have no internet access.&lt;br /&gt;
* Your job script will not remember the modules you have loaded, so it needs to contain &amp;quot;module load&amp;quot; commands of all the required modules (see examples below). &lt;br /&gt;
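The first two constraints in the list above can be checked before a script is submitted. The following is a minimal sketch, assuming the walltime is given in whole hours; the function name is hypothetical and this is not a SciNet tool, so the scheduler remains the authority on what is accepted.&lt;br /&gt;

```shell
#!/bin/bash
# Check a planned request against two of the rules listed above:
# 1 or 4 GPUs per node, and at most 24 hours of walltime.
# Arguments: GPUs per node, walltime in whole hours.
check_request() {
    local gpus=$1 hours=$2
    case "$gpus" in
        1|4) ;;
        *) echo "invalid: request 1 or 4 GPUs per node"; return 1 ;;
    esac
    if [ "$hours" -gt 24 ]; then
        echo "invalid: walltime limit is 24 hours"; return 1
    fi
    echo "ok"
}

check_request 1 12 || true
check_request 2 12 || true
check_request 4 48 || true
```

The three calls print "ok", the GPU-count message and the walltime message, in that order.&lt;br /&gt;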
== SOSCIP Users ==&lt;br /&gt;
*[https://www.soscip.org SOSCIP] is a consortium that brings together industrial partners and academic researchers and provides them with advanced computing technologies and expertise to solve social, technical and business challenges across sectors and to drive economic growth.&lt;br /&gt;
&lt;br /&gt;
If you are working on a SOSCIP project, please contact [mailto:soscip-support@scinet.utoronto.ca soscip-support@scinet.utoronto.ca] to have your user account added to the SOSCIP project accounts. SOSCIP users need to submit jobs with an additional SLURM flag to get higher priority:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#SBATCH -A soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt;    #e.g. soscip-3-001&lt;br /&gt;
# or, equivalently:&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Single-GPU job script ==&lt;br /&gt;
Each single-GPU job is allocated a quarter of a node: 1 GPU, 8 CPU cores (32 hardware threads) and ~58GB of CPU memory. '''Never request CPUs or memory explicitly.''' If you are running an MPI program, you can set --ntasks to the number of MPI ranks. '''Do NOT set --ntasks for non-MPI programs.''' &lt;br /&gt;
*It is suggested to use NVIDIA Multi-Process Service (MPS) if running multiple MPI ranks on one GPU.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
#SBATCH --time=1:00:00&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt; #For SOSCIP projects only&lt;br /&gt;
&lt;br /&gt;
module load anaconda3&lt;br /&gt;
source activate conda_env&lt;br /&gt;
python code.py ...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Full-node job script ==&lt;br /&gt;
'''If you are not sure the program can be executed on multiple GPUs, please follow the single-gpu job instruction above or contact SciNet/SOSCIP support.'''&lt;br /&gt;
&lt;br /&gt;
A multi-GPU job should request a minimum of one full node (4 GPUs). You need to specify the &amp;quot;compute_full_node&amp;quot; partition in order to get all the resources on a node. &lt;br /&gt;
*An example for a 1-node job:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=4 # only affects MPI jobs&lt;br /&gt;
#SBATCH --time=1:00:00&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt; #For SOSCIP projects only&lt;br /&gt;
&lt;br /&gt;
module load &amp;lt;modules you need&amp;gt;&lt;br /&gt;
# run your program here&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Jupyter Notebooks =&lt;br /&gt;
SciNet’s [[Jupyter Hub]] is a Niagara-type node; it has a different CPU architecture and no GPUs, so Conda environments prepared on Mist will not work properly there. Users who need Jupyter Notebook to develop and test aspects of their workflow can run their own server on the Mist login node and connect to it from outside through an SSH tunnel. Keep in mind that the login node is a shared resource, and heavy calculations should be done only on compute nodes. Processes (including the IPython kernels used by the notebooks) are limited to one hour of total CPU time: idle time does not count toward this hour, and use of multiple cores counts proportionally to the number of cores (e.g. a kernel using all 128 virtual cores on the node will be killed after 28 seconds). Idle notebooks can still burden the node by holding system and GPU memory, so please be mindful of other users and terminate notebooks when your work is done.&lt;br /&gt;
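The proportional accounting described above is a simple division: the one-hour CPU-time budget divided by the number of cores kept fully busy. The following is a minimal sketch of that arithmetic; the function name is hypothetical, and the result is rounded down to whole seconds.&lt;br /&gt;

```shell
#!/bin/bash
# Wall-clock seconds until a process uses up its CPU-time budget when it
# keeps the given number of cores fully busy (integer division).
# Arguments: busy cores, budget in seconds (defaults to one hour).
seconds_until_killed() {
    local busy_cores=$1
    local budget_seconds=${2:-3600}
    echo $(( budget_seconds / busy_cores ))
}

seconds_until_killed 1     # one busy core: 3600 seconds
seconds_until_killed 8     # eight busy cores: 450 seconds
seconds_until_killed 128   # all virtual cores: 28 seconds
```

A kernel that is mostly idle, as is typical during interactive editing, uses far less CPU time than these worst-case figures suggest.&lt;br /&gt;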
&lt;br /&gt;
As an example, let us create a new Conda environment and activate it:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n jupyter_env python=3.7&lt;br /&gt;
source activate jupyter_env&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Install the Jupyter Notebook server:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda install notebook&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Running the notebook server ==&lt;br /&gt;
When the Conda environment is active, enter:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
jupyter-notebook&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
By default, the Jupyter Notebook server uses port 8888 (can be overridden with the &amp;lt;code&amp;gt;--port&amp;lt;/code&amp;gt; option). If another user has already started their own server, the default port may be busy, in which case the server will be listening on a different port. Once launched, the server will output some information to the terminal that will include the actual port number used and a 48-character token. For example:&lt;br /&gt;
&amp;lt;pre&amp;gt;http://localhost:8890/?token=54c4090d……&amp;lt;/pre&amp;gt;&lt;br /&gt;
In this example, the server is listening on port 8890.&lt;br /&gt;
&lt;br /&gt;
== Creating a tunnel ==&lt;br /&gt;
In order to access this port remotely (i.e. from your office or home), an [https://en.wikipedia.org/wiki/Tunneling_protocol#Secure_Shell_tunneling SSH tunnel] has to be established. Please refer to your SSH client’s documentation for instructions on how to do that. For the OpenSSH client (standard in most Linux distributions and macOS), a tunnel can be opened in a separate terminal session to the one where the Jupyter Notebook server is running. In the new terminal, issue this command:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -L8888:localhost:8890 &amp;lt;username&amp;gt;@mist.scinet.utoronto.ca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Replace &amp;lt;code&amp;gt;&amp;lt;username&amp;gt;&amp;lt;/code&amp;gt; with your actual username. The tunnel stays open as long as this SSH connection is alive. In this example, we tunnel the Mist login node’s port 8890 (where our server is assumed to be running) to our home computer’s port 8888 (any other free port is fine). The notebook can then be accessed in the browser at the &amp;lt;code&amp;gt;&amp;lt;nowiki&amp;gt;http://localhost:8888&amp;lt;/nowiki&amp;gt;&amp;lt;/code&amp;gt; address (followed by &amp;lt;code&amp;gt;/?token=54c4090d……&amp;lt;/code&amp;gt;, or the token can be entered on the webpage).&lt;br /&gt;
&lt;br /&gt;
== Using Jupyter on compute nodes ==&lt;br /&gt;
&lt;br /&gt;
You can use the instructions here to set up a Jupyter Notebook server on a compute node (including a [[#Testing_and_debugging|debugjob]]). '''We strongly discourage''' running an interactive notebook on a compute node (other than in a debugjob): scheduled jobs start at unpredictable times and are not meant to be interactive. Jupyter notebooks can instead be run non-interactively or converted to Python scripts.&lt;br /&gt;
&lt;br /&gt;
To launch the Jupyter Notebook server, load the &amp;lt;code&amp;gt;anaconda3&amp;lt;/code&amp;gt; module and activate your environment as before (by adding the appropriate lines to the submission script, if you are not using the compute node with an interactive shell). Launching the server has to be done like so:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
HOME=/dev/shm/$USER jupyter-notebook&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
That is because Jupyter will fail unless it can write to the home folder, which is read-only from compute nodes. This modification of the &amp;lt;code&amp;gt;$HOME&amp;lt;/code&amp;gt; environment variable will carry over into the notebooks, which is usually not a problem, but in case the notebook relies on this environment variable (e.g. to read certain files), it can be reset manually in the notebook (&amp;lt;code&amp;gt;import os; os.environ['HOME']=……&amp;lt;/code&amp;gt;).&lt;br /&gt;
&lt;br /&gt;
Because compute nodes are not accessible from the Internet, tunneling has to be done twice, once from the remote location (office or home) to the Mist login node, and then from the login node to the compute node. Assuming the server is running on port 8890 of the mist006 node, open the first tunnel in a new terminal session in the remote computer:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -L8888:localhost:9999 &amp;lt;username&amp;gt;@mist.scinet.utoronto.ca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
where 9999 is any available port on the Mist login node (to test port availability enter &amp;lt;code&amp;gt;ss -Hln src :9999&amp;lt;/code&amp;gt; in the terminal when connected to the Mist login node; an empty output indicates that the port is free). In the same session in the login node that was created with the above command, open the second tunnel to the compute node:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -L9999:localhost:8890 mist006&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Be aware that the second tunnel will automatically disconnect once the job on the compute node times out or is relinquished. The Jupyter Notebook server running on the compute node can now be accessed from the browser as in the previous subsection.&lt;br /&gt;
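The two hops only work if the ports line up: the relay port on the login node appears at the end of the first command and at the start of the second. The following is a minimal sketch that prints both commands for a given set of ports; the function name is hypothetical, and USERNAME is a placeholder for your own account name.&lt;br /&gt;

```shell
#!/bin/bash
# Print the two tunnel commands for a notebook server on a compute node.
# Arguments: local port, relay port on the login node, server port on the
# compute node, compute node name.
print_tunnel_commands() {
    local local_port=$1 relay_port=$2 server_port=$3 node=$4
    # Hop 1: run on your own computer (USERNAME is a placeholder).
    echo "ssh -L${local_port}:localhost:${relay_port} USERNAME@mist.scinet.utoronto.ca"
    # Hop 2: run inside the login-node session opened by hop 1.
    echo "ssh -L${relay_port}:localhost:${server_port} ${node}"
}

# The values used in the example above
print_tunnel_commands 8888 9999 8890 mist006
```

The function only prints the commands; it does not open any connection, so it can be run anywhere to double-check the port numbers before typing them.&lt;br /&gt;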
&lt;br /&gt;
&lt;br /&gt;
= Support =&lt;br /&gt;
&lt;br /&gt;
SciNet inquiries:&lt;br /&gt;
* [mailto:support@scinet.utoronto.ca support@scinet.utoronto.ca]&lt;br /&gt;
&lt;br /&gt;
SOSCIP inquiries:&lt;br /&gt;
*[mailto:soscip-support@scinet.utoronto.ca soscip-support@scinet.utoronto.ca]&lt;/div&gt;</summary>
		<author><name>Ymeiron</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Mist&amp;diff=3138</id>
		<title>Mist</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Mist&amp;diff=3138"/>
		<updated>2021-07-07T17:57:06Z</updated>

		<summary type="html">&lt;p&gt;Ymeiron: Reverted edits by Feimao (talk) to last revision by Ymeiron&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox Computer&lt;br /&gt;
|image=[[Image:Mist.jpg|center|300px|thumb]]&lt;br /&gt;
|name=Mist&lt;br /&gt;
|installed=Dec 2019&lt;br /&gt;
|operatingsystem= Red Hat Enterprise Linux 7.6 &lt;br /&gt;
|loginnode= mist.scinet.utoronto.ca&lt;br /&gt;
|nnodes=  54 IBM AC922&lt;br /&gt;
|rampernode= 256 GB  &lt;br /&gt;
|gpuspernode=4 V100-SXM2-32GB&lt;br /&gt;
|interconnect=Mellanox EDR&lt;br /&gt;
|vendorcompilers= NVCC, IBM XL&lt;br /&gt;
|queuetype=Slurm&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
=Specifications=&lt;br /&gt;
Mist is a SciNet-[[#SOSCIP Users |SOSCIP]] joint GPU cluster consisting of 54 IBM AC922 servers. Each node of the cluster has 32 IBM Power9 cores, 256GB RAM and 4 NVIDIA V100-SXM2-32GB GPUs connected by NVLink. The cluster has an InfiniBand EDR interconnect providing GPU-Direct RDMA capability.&lt;br /&gt;
&lt;br /&gt;
= Getting started on Mist =&lt;br /&gt;
Mist can be accessed directly via SSH:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -Y MYCCUSERNAME@mist.scinet.utoronto.ca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The Mist login node '''mist-login01''' can also be accessed via the Niagara cluster:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -Y MYCCUSERNAME@niagara.scinet.utoronto.ca&lt;br /&gt;
ssh -Y mist-login01&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
== Storage ==&lt;br /&gt;
The filesystem for Mist is shared with the Niagara cluster. See [https://docs.scinet.utoronto.ca/index.php/Niagara_Quickstart#Your_various_directories Niagara Storage] for more details.&lt;br /&gt;
&lt;br /&gt;
= Loading software modules =&lt;br /&gt;
&lt;br /&gt;
You have two options for running code on Mist: use existing software, or compile your own.  This section focuses on the former.&lt;br /&gt;
&lt;br /&gt;
Other than essentials, all installed software is made available [[Using_modules | using module commands]]. These modules set environment variables (PATH, etc.), allowing multiple, conflicting versions of a given package to be available.  A detailed explanation of the module system can be [[Using_modules | found on the modules page]].&lt;br /&gt;
&lt;br /&gt;
Common module subcommands are:&lt;br /&gt;
&lt;br /&gt;
* &amp;lt;code&amp;gt;module load &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt;: load the default version of a software package.&lt;br /&gt;
* &amp;lt;code&amp;gt;module load &amp;lt;module-name&amp;gt;/&amp;lt;module-version&amp;gt;&amp;lt;/code&amp;gt;: load a specific version of a software package.&lt;br /&gt;
* &amp;lt;code&amp;gt;module purge&amp;lt;/code&amp;gt;: unload all currently loaded modules.&lt;br /&gt;
* &amp;lt;code&amp;gt;module spider&amp;lt;/code&amp;gt; (or &amp;lt;code&amp;gt;module spider &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt;): list available software packages.&lt;br /&gt;
* &amp;lt;code&amp;gt;module avail&amp;lt;/code&amp;gt;: list loadable software packages.&lt;br /&gt;
* &amp;lt;code&amp;gt;module list&amp;lt;/code&amp;gt;: list loaded modules.&lt;br /&gt;
&lt;br /&gt;
Along with modifying common environment variables such as PATH and LD_LIBRARY_PATH, these modules also create a SCINET_MODULENAME_ROOT environment variable, which can be used to access commonly needed software directories, such as /include and /lib.&lt;br /&gt;
&lt;br /&gt;
There are handy abbreviations for the module commands. &amp;lt;code&amp;gt;ml&amp;lt;/code&amp;gt; is the same as &amp;lt;code&amp;gt;module list&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;ml &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt; is the same as &amp;lt;code&amp;gt;module load &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt;.&lt;br /&gt;
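&lt;br /&gt;
As a rough illustration of the commands above, the following session loads explicit module versions and then inspects the root variable that a module defines. The module versions are taken from other examples on this page; the variable name &amp;lt;code&amp;gt;SCINET_CUDA_ROOT&amp;lt;/code&amp;gt; follows the SCINET_MODULENAME_ROOT pattern, and the directory layout underneath it is an assumption that should be checked on the system.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module purge                        # unload all currently loaded modules&lt;br /&gt;
module load cuda/10.2.89 gcc/8.4.0  # explicit versions, for reproducibility&lt;br /&gt;
ml                                  # same as: module list&lt;br /&gt;
echo $SCINET_CUDA_ROOT              # installation directory set by the cuda module&lt;br /&gt;
ls $SCINET_CUDA_ROOT/include        # assumed layout: headers under /include&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;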
== Tips for loading software ==&lt;br /&gt;
&lt;br /&gt;
* We advise '''''against''''' loading modules in your .bashrc.  This can lead to very confusing behaviour under certain circumstances.  Our guidelines for .bashrc files can be found [[bashrc guidelines|here]].&lt;br /&gt;
* Instead, load modules by hand when needed, or by sourcing a separate script.&lt;br /&gt;
* Load run-specific modules inside your job submission script.&lt;br /&gt;
* Short names give default versions; e.g. &amp;lt;code&amp;gt;cuda&amp;lt;/code&amp;gt; → &amp;lt;code&amp;gt;cuda/10.2.89&amp;lt;/code&amp;gt;. It is usually better to be explicit about the versions, for future reproducibility.&lt;br /&gt;
* Modules often require other modules to be loaded first.  Solve these dependencies by using [[Using_modules#Module_spider | &amp;lt;code&amp;gt;module spider&amp;lt;/code&amp;gt;]].&lt;br /&gt;
&lt;br /&gt;
= Available compilers and interpreters =&lt;br /&gt;
* The &amp;lt;tt&amp;gt;cuda&amp;lt;/tt&amp;gt; module has to be loaded first for GPU software.&lt;br /&gt;
* For most compiled software, one should use the GNU compilers (&amp;lt;tt&amp;gt;gcc&amp;lt;/tt&amp;gt; for C, &amp;lt;tt&amp;gt;g++&amp;lt;/tt&amp;gt; for C++, and &amp;lt;tt&amp;gt;gfortran&amp;lt;/tt&amp;gt; for Fortran). Loading a &amp;lt;tt&amp;gt;gcc&amp;lt;/tt&amp;gt; module makes these available.&lt;br /&gt;
* The IBM XL compiler suite (&amp;lt;tt&amp;gt;xlc_r, xlc++_r, xlf_r&amp;lt;/tt&amp;gt;) is also available, if you load one of the &amp;lt;tt&amp;gt;xl&amp;lt;/tt&amp;gt; modules.&lt;br /&gt;
* To compile MPI code, you must additionally load an &amp;lt;tt&amp;gt;openmpi&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;spectrum-mpi&amp;lt;/tt&amp;gt; module.&lt;br /&gt;
&lt;br /&gt;
=== CUDA ===&lt;br /&gt;
&lt;br /&gt;
The currently installed CUDA Toolkits are '''10.1.243''' and '''10.2.89 (default)''':&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load cuda/&amp;lt;version&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*A compiler (GCC, XL or PGI) module must be loaded in order to use CUDA to build any code.&lt;br /&gt;
The current NVIDIA driver version is 440.33.01.&lt;br /&gt;
&lt;br /&gt;
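A minimal sketch of building a CUDA program with these modules is given here. The source file name &amp;lt;code&amp;gt;saxpy.cu&amp;lt;/code&amp;gt; is hypothetical; &amp;lt;code&amp;gt;-arch=sm_70&amp;lt;/code&amp;gt; targets compute capability 7.0, which is that of the V100 GPUs in Mist.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load cuda/10.2.89 gcc/8.4.0&lt;br /&gt;
nvcc -arch=sm_70 -O2 -o saxpy saxpy.cu   # saxpy.cu is a placeholder for your own source file&lt;br /&gt;
./saxpy                                  # run only short tests on the login node&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;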
===GNU Compilers ===&lt;br /&gt;
&lt;br /&gt;
Available GCC modules are:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
gcc/7.5.0&lt;br /&gt;
gcc/8.4.0&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== IBM XL Compilers ===&lt;br /&gt;
&lt;br /&gt;
To load the native IBM xlc/xlc++ and xlf (Fortran) compilers, run&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load xl/16.1.1.3&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
IBM XL Compilers are enabled for use with NVIDIA GPUs, including support for OpenMP GPU offloading and integration with NVIDIA's nvcc command to compile host-side code for the POWER9 CPU. Information about the IBM XL Compilers can be found at the following links:[https://www.ibm.com/support/knowledgecenter/SSXVZZ_16.1.1/com.ibm.compilers.linux.doc/welcome.html IBM XL C/C++], &lt;br /&gt;
[https://www.ibm.com/support/knowledgecenter/SSAT4T_16.1.1/com.ibm.compilers.linux.doc/welcome.html IBM XL Fortran]&lt;br /&gt;
&lt;br /&gt;
=== OpenMPI ===&lt;br /&gt;
An &amp;lt;tt&amp;gt;openmpi/&amp;lt;version&amp;gt;&amp;lt;/tt&amp;gt; module is available for different compilers, including GCC and XL. The &amp;lt;tt&amp;gt;spectrum-mpi/&amp;lt;version&amp;gt;&amp;lt;/tt&amp;gt; module provides IBM Spectrum MPI.&lt;br /&gt;
&lt;br /&gt;
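As a sketch, an MPI program could be compiled with the OpenMPI wrapper compiler after loading a compiler and an MPI module. The module combination is the one used in other examples on this page; &amp;lt;code&amp;gt;hello_mpi.c&amp;lt;/code&amp;gt; is a hypothetical source file.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load cuda/10.2.89 gcc/8.4.0 openmpi/4.0.3&lt;br /&gt;
mpicc -O2 -o hello_mpi hello_mpi.c   # mpicc wraps the loaded gcc&lt;br /&gt;
mpirun -np 2 ./hello_mpi             # brief two-rank test&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;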
=== PGI ===&lt;br /&gt;
To load the PGI compiler and its own OpenMPI environment, run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load pgi/19.10&lt;br /&gt;
module load pgi-openmpi/3.1.3&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Softwares =&lt;br /&gt;
== Amber20 ==&lt;br /&gt;
&lt;br /&gt;
Users who hold an Amber20 license can build Amber20 from its source code and run it on Mist. '''SOSCIP/SciNet doesn't provide an Amber license or source code.'''&lt;br /&gt;
&lt;br /&gt;
=== Building Amber20 ===&lt;br /&gt;
Modules that are needed for building Amber20:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
1) MistEnv/2020a (S)   2) cuda/10.2.89   3) gcc/8.4.0   4) cmake/3.16.3   5) openmpi/4.0.3   6) anaconda3/2019.10   7) nccl/2.5.6&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Cmake configuration:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
cmake .. -DCMAKE_INSTALL_PREFIX=$HOME/where-amber-install -DCOMPILER=GNU -DMPI=TRUE -DCUDA=TRUE -DINSTALL_TESTS=TRUE -DDOWNLOAD_MINICONDA=FALSE -DOPENMP=TRUE -DNCCL=TRUE -DAPPLY_UPDATES=TRUE&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Running Amber20 ===&lt;br /&gt;
'''On NVIDIA Pascal and later GPUs, Amber does not scale beyond a single GPU'''. It is strongly recommended to run Amber20 as a single-GPU job.&lt;br /&gt;
An example job script:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
#SBATCH --time=1:00:0&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP-project-ID&amp;gt;&lt;br /&gt;
&lt;br /&gt;
module load cuda/10.2.89 gcc/8.4.0 openmpi/4.0.3 nccl/2.5.6&lt;br /&gt;
export PATH=$HOME/where-amber-install/bin:$PATH&lt;br /&gt;
export LD_LIBRARY_PATH=$HOME/where-amber-install/lib:$LD_LIBRARY_PATH&lt;br /&gt;
pmemd.cuda .... &amp;lt;parameters&amp;gt; ...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Anaconda (Python) ==&lt;br /&gt;
Anaconda is a popular distribution of the Python programming language. It contains several common Python libraries such as SciPy and NumPy as pre-built packages, which eases installation. Anaconda is provided as the '''anaconda3''' module.&lt;br /&gt;
&lt;br /&gt;
To set up your own Python environment, load the module and create a conda environment:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n myPythonEnv python=3.7&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*Note: By default, conda environments are located in '''$HOME/.conda/envs'''. The cache (downloaded tarballs and packages) is under '''$HOME/.conda/pkgs'''. You may run into disk quota problems if you create too many environments. To clean the conda cache, '''please run &amp;quot;conda clean -y --all&amp;quot; and &amp;quot;rm -rf $HOME/.conda/pkgs/*&amp;quot; after installing packages'''.&lt;br /&gt;
&lt;br /&gt;
To activate the conda environment (this should be done before running python):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
source activate myPythonEnv&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Note that you SHOULD NOT use '''conda activate myPythonEnv''' to activate the environment, as this leads to all sorts of problems. Once the environment is activated, you can update or install packages via '''conda''' or '''pip''':&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda install &amp;lt;package_name&amp;gt;   # preferred way to install packages&lt;br /&gt;
pip install &amp;lt;package_name&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To deactivate:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
source deactivate&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To remove a conda environment:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda remove --name myPythonEnv --all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To verify that the environment was removed, run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda info --envs&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Submitting Python Job ===&lt;br /&gt;
A single-gpu job example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
#SBATCH --time=1:00:0&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt; #For SOSCIP projects only&lt;br /&gt;
&lt;br /&gt;
module load anaconda3&lt;br /&gt;
source activate myPythonEnv&lt;br /&gt;
python code.py ...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== CuPy ==&lt;br /&gt;
[https://cupy.chainer.org CuPy] is an open-source matrix library accelerated with NVIDIA CUDA. It also uses CUDA-related libraries including cuBLAS, cuDNN, cuRAND, cuSOLVER, cuSPARSE, cuFFT and NCCL to make full use of the GPU architecture. CuPy is an implementation of a NumPy-compatible multi-dimensional array on CUDA. It consists of the core multi-dimensional array class, cupy.ndarray, and many functions that operate on it, and it supports a subset of the numpy.ndarray interface.&lt;br /&gt;
&lt;br /&gt;
CuPy can be installed into any conda environment. The Python packages numpy, six and fastrlock are required. cuDNN and NCCL are optional.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3/2019.10 cuda/10.2.89 gcc/7.5.0 cudnn/7.6.5.32  nccl/2.5.6 &lt;br /&gt;
conda create -n cupy-env python=3.7 numpy six fastrlock&lt;br /&gt;
source activate cupy-env&lt;br /&gt;
CFLAGS=&amp;quot;-I$SCINET_CUDNN_ROOT/include -I$SCINET_NCCL_ROOT/include -I$SCINET_CUDA_ROOT/include&amp;quot; LDFLAGS=&amp;quot;-L$SCINET_CUDNN_ROOT/lib64 -L$SCINET_NCCL_ROOT/lib&amp;quot; CUDA_PATH=$SCINET_CUDA_ROOT pip install cupy&lt;br /&gt;
#building/installing CuPy will take a few minutes&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
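A quick way to check the installation is sketched here, assuming the build above succeeded and a GPU is visible to the session. The one-line Python test is only an illustration; it creates a small array on the GPU and prints its sum.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3/2019.10 cuda/10.2.89 gcc/7.5.0 cudnn/7.6.5.32 nccl/2.5.6&lt;br /&gt;
source activate cupy-env&lt;br /&gt;
# allocate an array on the GPU and reduce it there&lt;br /&gt;
python -c 'import cupy as cp; x = cp.arange(10); print(x.sum())'&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;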
== Gromacs ==&lt;br /&gt;
[http://www.gromacs.org/ GROMACS] is a versatile package to perform molecular dynamics, i.e. simulate the Newtonian equations of motion for systems with hundreds to millions of particles. It is primarily designed for biochemical molecules like proteins, lipids and nucleic acids that have a lot of complicated bonded interactions, but since GROMACS is extremely fast at calculating the nonbonded interactions (that usually dominate simulations) many groups are also using it for research on non-biological systems, e.g. polymers.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load cuda/10.2.89  gcc/8.3.0  openmpi/3.1.5 gromacs/2019.5&lt;br /&gt;
module load cuda/10.2.89  gcc/8.3.0  openmpi/3.1.5 gromacs/2019.6&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*The '''GROMACS 2020''' thread-MPI version supports full GPU enablement of all key computational sections. The GPU is used throughout the timestep and repeated CPU-GPU transfers are eliminated. '''Currently only single-GPU is supported on Mist'''. Users are advised to verify their results carefully.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load cuda/10.2.89 gcc/8.4.0 openmpi/4.0.3 gromacs/2020.4&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Small/Medium Simulation ===&lt;br /&gt;
Due to the lack of PME domain decomposition support on GPUs, Gromacs calculates PME on the CPU when using multiple GPUs. '''It is always recommended to use a single GPU for small and medium-sized simulations with Gromacs.''' With only 1 MPI rank (with OpenMP threads) on a single GPU, both non-bonded PP and PME are automatically offloaded to the GPU when possible.&lt;br /&gt;
* A single-GPU Gromacs job must request '''--ntasks=32''' even though only 1 MPI rank will be launched by the mpirun command. '''OMP_PLACES''' must be set to cores to place OpenMP threads on physical CPU cores. '''-bind-to none''' and '''-pin off''' must be set to avoid CPU affinity conflicts among OpenMP, MPI and Gromacs. '''OMP_NUM_THREADS''' must be set to 8 to get optimal performance.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
#SBATCH --ntasks=32&lt;br /&gt;
&lt;br /&gt;
module load cuda/10.2.89  gcc/8.3.0  openmpi/3.1.5 gromacs/2019.6&lt;br /&gt;
export OMP_NUM_THREADS=8&lt;br /&gt;
export OMP_PLACES=cores&lt;br /&gt;
mpirun -np 1 -bind-to none gmx_mpi mdrun -pin off -ntomp 8 ... &amp;lt;other parameters&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
* Gromacs 2020 example (the OpenMPI module should be loaded, but mpirun should NOT be used):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
&lt;br /&gt;
module load cuda/10.2.89 gcc/8.4.0 openmpi/4.0.3 gromacs/2020.4&lt;br /&gt;
export OMP_NUM_THREADS=8&lt;br /&gt;
export OMP_PLACES=cores&lt;br /&gt;
gmx mdrun -pin off -ntmpi 1 -ntomp 8 -update gpu ... &amp;lt;other parameters&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Large Simulation ===&lt;br /&gt;
If the memory available to a single-GPU job (~58GB) is not sufficient for the simulation, multiple GPUs can be used. It is suggested to start testing with one full node with 4 GPUs and to force PME onto a GPU. Multiple PME ranks are not supported with PME on GPU, so if a GPU is used for the PME calculation, -npme (the number of PME ranks) must be set to 1. If PME has less work than PP, it is suggested to run multiple ranks per GPU, so that the GPU used by the PME rank can also do some work for PP rank(s). When running multiple MPI ranks on the same GPU, NVIDIA Multi-Process Service (MPS) must be enabled.&lt;br /&gt;
*An example using 4 GPUs, 7 PP ranks + 1 PME rank: ('''-pin on -pme gpu -npme 1''' must be added to the mdrun command in order to force the GPU to do PME)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=8&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
&lt;br /&gt;
module load cuda/10.2.89  gcc/8.3.0  openmpi/3.1.5 gromacs/2019.6&lt;br /&gt;
&lt;br /&gt;
mkdir -p /dev/shm/nvidia-mps&lt;br /&gt;
export CUDA_MPS_PIPE_DIRECTORY=/dev/shm/nvidia-mps&lt;br /&gt;
mkdir -p /dev/shm/nvidia-log&lt;br /&gt;
export CUDA_MPS_LOG_DIRECTORY=/dev/shm/nvidia-log&lt;br /&gt;
nvidia-cuda-mps-control -d&lt;br /&gt;
&lt;br /&gt;
export OMP_NUM_THREADS=4&lt;br /&gt;
mpirun  -bind-to none gmx_mpi mdrun -pin on -pme gpu -npme 1 ... &amp;lt;add your parameters&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*It is suggested to also test using '''--ntasks=4''' and '''OMP_NUM_THREADS=8''' if you receive a NOTE in Gromacs output saying &amp;quot;% performance was lost because the PME ranks had more work to do than the PP ranks&amp;quot;. In this case, NVIDIA MPS is not needed since there is only one MPI rank per GPU.&lt;br /&gt;
*'''Please note that the solving of PME on GPU is still only the initial version supporting this behaviour, and comes with a set of limitations outlined further below.'''&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
* Only a PME order of 4 is supported on GPUs.&lt;br /&gt;
* PME will run on a GPU only when exactly one rank has a PME task, ie. decompositions with multiple ranks doing PME are not supported.&lt;br /&gt;
* Only single precision is supported.&lt;br /&gt;
* Free energy calculations where charges are perturbed are not supported, because only single PME grids can be calculated.&lt;br /&gt;
* Only dynamical integrators are supported (ie. leap-frog, Velocity Verlet, stochastic dynamics)&lt;br /&gt;
* LJ PME is not supported on GPUs.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*An example using 4 GPUs, '''PME on CPU''': ('''-pin on''' must be added to mdrun command for proper CPU thread bindings)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=8&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
&lt;br /&gt;
module load cuda/10.2.89  gcc/8.3.0  openmpi/3.1.5 gromacs/2019.6&lt;br /&gt;
&lt;br /&gt;
mkdir -p /dev/shm/nvidia-mps&lt;br /&gt;
export CUDA_MPS_PIPE_DIRECTORY=/dev/shm/nvidia-mps&lt;br /&gt;
mkdir -p /dev/shm/nvidia-log&lt;br /&gt;
export CUDA_MPS_LOG_DIRECTORY=/dev/shm/nvidia-log&lt;br /&gt;
nvidia-cuda-mps-control -d&lt;br /&gt;
&lt;br /&gt;
export OMP_NUM_THREADS=4&lt;br /&gt;
mpirun -bind-to none gmx_mpi mdrun -pin on  ... &amp;lt;add your parameters&amp;gt;&lt;br /&gt;
&lt;br /&gt;
# &amp;quot;--ntasks=16, OMP_NUM_THREADS=2&amp;quot; and &amp;quot;--ntasks=4, OMP_NUM_THREADS=8&amp;quot; should also be tested.  &lt;br /&gt;
# num_Tasks(MPI_ranks) * num_OpenMP_threads = 32&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*'''NOTE: The above examples will NOT work with multiple nodes. If your simulation is too large for a single GPU node, please contact SciNet/SOSCIP support.'''&lt;br /&gt;
&lt;br /&gt;
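The rule that the number of MPI ranks times the number of OpenMP threads should equal 32 on a full node can be tabulated with a short loop. This is only an arithmetic helper for choosing the combinations to benchmark, not part of any Gromacs tooling; the variable names are illustrative.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
cores=32   # ranks * threads should equal this on a full node&lt;br /&gt;
for ntasks in 4 8 16; do&lt;br /&gt;
    # integer division gives the matching OpenMP thread count&lt;br /&gt;
    echo ntasks=$ntasks OMP_NUM_THREADS=$((cores / ntasks))&lt;br /&gt;
done&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;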
== IBM Watson Machine Learning Community Edition (PowerAI) ==&lt;br /&gt;
[https://developer.ibm.com/linuxonpower/deep-learning-powerai/releases/ IBM Watson Machine Learning Community Edition (PowerAI)] contains many popular ML packages including TensorFlow, PyTorch, XGBoost and RAPIDS. It is distributed through the IBM Conda channel. To install packages from PowerAI, specify the IBM Conda channel when using Anaconda:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
&lt;br /&gt;
conda create --name wmlce_env -c https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda &amp;lt;package_name&amp;gt;   # e.g. powerai, tensorflow-gpu, keras, pytorch, powerai-rapids, py-xgboost-gpu, etc.&lt;br /&gt;
&lt;br /&gt;
source activate wmlce_env &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*The WML CE Early Access Conda channel (https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda-early-access/) makes new versions of frameworks available in advance of formal WML CE releases. Easy upgrade between packages in the main and Early Access channels is not guaranteed. Using a separate conda environment for Early Access packages is recommended.&lt;br /&gt;
&lt;br /&gt;
== NAMD ==&lt;br /&gt;
[http://www.ks.uiuc.edu/Research/namd/ NAMD] is a parallel, object-oriented molecular dynamics code designed for high-performance simulation of large biomolecular systems.&lt;br /&gt;
=== v2.13 ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load cuda/10.2.89 gcc/7.5.0 fftw/3.3.8 spectrum-mpi/10.3.1  namd/2.13&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
==== Running with one process per node====&lt;br /&gt;
An example of the job script (using 1 node, '''one process per node''',  32 CPU threads per process + 4 GPUs per process):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=1&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
&lt;br /&gt;
module load cuda/10.2.89 gcc/7.5.0 fftw/3.3.8 spectrum-mpi/10.3.1  namd/2.13&lt;br /&gt;
scontrol show hostnames &amp;gt; nodelist-$SLURM_JOB_ID&lt;br /&gt;
&lt;br /&gt;
`which charmrun` -npernode 1 -hostfile nodelist-$SLURM_JOB_ID `which namd2` +setcpuaffinity +pemap 0-127:4 +idlepoll +ppn 32 +p $((32*SLURM_NTASKS)) stmv.namd&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
==== Running with one process per GPU ====&lt;br /&gt;
NAMD may scale better if using '''one process per GPU'''. Please do your own benchmark.&lt;br /&gt;
An example of the job script (using 1 node, '''one process per GPU''',  8 CPU threads per process):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=4&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
&lt;br /&gt;
module load cuda/10.2.89 gcc/7.5.0 fftw/3.3.8 spectrum-mpi/10.3.1  namd/2.13&lt;br /&gt;
scontrol show hostnames &amp;gt; nodelist-$SLURM_JOB_ID&lt;br /&gt;
&lt;br /&gt;
`which charmrun` -npernode 4 -hostfile nodelist-$SLURM_JOB_ID `which namd2` +setcpuaffinity +pemap 0-127:4 +idlepoll +ppn 8 +p $((8*SLURM_NTASKS)) stmv.namd&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== PyTorch ==&lt;br /&gt;
=== Installing from IBM Conda Channel ===&lt;br /&gt;
The easiest way to install PyTorch on Mist is to use IBM's Conda channel. Prepare a conda environment with Python 3.6 or 3.7 and install PyTorch from IBM's Conda channel:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n pytorch_env python=3.7&lt;br /&gt;
source activate pytorch_env&lt;br /&gt;
conda install -c https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda/ pytorch=1.3.1 &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
'''FOR NEWER VERSIONS:'''&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda install -c https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda-early-access/ pytorch=1.5.0&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Once the installation finishes, please clean the cache:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf $HOME/.conda/pkgs/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== RAPIDS ==&lt;br /&gt;
The [https://rapids.ai RAPIDS] is a suite of open source software libraries that gives you the freedom to execute end-to-end data science and analytics pipelines entirely on GPUs. The RAPIDS data science framework includes a collection of libraries: '''cuDF(GPU DataFrames)''', '''cuML(GPU Machine Learning Algorithms)''', '''cuStrings(GPU String Manipulation)''', etc.&lt;br /&gt;
&lt;br /&gt;
=== Installing from IBM Conda Channel ===&lt;br /&gt;
The easiest way to install RAPIDS on Mist is to use IBM's Conda channel. Prepare a conda environment with Python 3.6 or 3.7 and install powerai-rapids from IBM's Conda channel:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n rapids_env python=3.7&lt;br /&gt;
source activate rapids_env&lt;br /&gt;
conda install -c https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda/ powerai-rapids&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Once the installation finishes, please clean the cache:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf $HOME/.conda/pkgs/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== TensorFlow and Keras ==&lt;br /&gt;
=== Installing from IBM Conda Channel ===&lt;br /&gt;
The easiest way to install TensorFlow and Keras on Mist is to use IBM's Conda channel. Prepare a conda environment with Python 3.6 or 3.7 and install tensorflow-gpu from IBM's Conda channel:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n tf_env python=3.7&lt;br /&gt;
source activate tf_env&lt;br /&gt;
conda install -c https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda/ tensorflow-gpu==2.1.2&lt;br /&gt;
# If you need a TF 1.x version:&lt;br /&gt;
conda install -c https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda/ tensorflow-gpu==1.15.4&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
'''FOR NEWER VERSIONS:'''&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda install -c https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda-early-access/  tensorflow-gpu==2.2.0&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Once the installation finishes, please clean the cache:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf $HOME/.conda/pkgs/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Testing and debugging =&lt;br /&gt;
You really should test your code before you submit it to the cluster to know if your code is correct and what kind of resources you need.&lt;br /&gt;
* Small test jobs can be run on the login node.  Rule of thumb: tests should run no more than a couple of minutes, taking at most about 1-2GB of memory, and use no more than one gpu and a few cores.&lt;br /&gt;
&amp;lt;!-- * You can run the [[Parallel Debugging with DDT|DDT]] debugger on the login nodes after &amp;lt;code&amp;gt;module load ddt&amp;lt;/code&amp;gt;. --&amp;gt;&lt;br /&gt;
* For short tests that do not fit on a login node, or for which you need a dedicated node, request an interactive debug job with the debugjob command:&lt;br /&gt;
 mist-login01:~$ debugjob --clean -g G&lt;br /&gt;
where G is the number of GPUs. G=1 gives an interactive session for 2 hours, G=4 gives a single node with 4 GPUs for 30 minutes, and G=8 (the maximum) gives 2 nodes, each with 4 GPUs, for 30 minutes.  The &amp;lt;tt&amp;gt;--clean&amp;lt;/tt&amp;gt; argument is optional but recommended, as it starts the session without any modules loaded, thus mimicking more closely what happens when you submit a job script.&lt;br /&gt;
&lt;br /&gt;
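For instance, a short interactive test on one GPU might look like the following sketch. The environment name &amp;lt;code&amp;gt;myPythonEnv&amp;lt;/code&amp;gt; and the script &amp;lt;code&amp;gt;code.py&amp;lt;/code&amp;gt; are the placeholder names used in the Anaconda section; &amp;lt;code&amp;gt;nvidia-smi&amp;lt;/code&amp;gt; is the standard NVIDIA utility for listing the GPUs visible to the session.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
debugjob --clean -g 1        # run on the login node; opens a shell on a compute node&lt;br /&gt;
# the remaining commands are typed in the interactive session:&lt;br /&gt;
module load anaconda3&lt;br /&gt;
source activate myPythonEnv&lt;br /&gt;
nvidia-smi                   # confirm that one GPU is visible&lt;br /&gt;
python code.py&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;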
= Submitting jobs =&lt;br /&gt;
Once you have compiled and tested your code or workflow on the Mist login nodes, and confirmed that it behaves correctly, you are ready to submit jobs to the cluster.  Your jobs will run on some of Mist's 53 compute nodes.  When and where your job runs is determined by the scheduler.&lt;br /&gt;
&lt;br /&gt;
Mist uses SLURM as its job scheduler. It is configured to allow only '''Single-GPU jobs''' and '''Full-node jobs (4 GPUs per node)'''.&lt;br /&gt;
&lt;br /&gt;
You submit jobs from a login node by passing a script to the sbatch command:&lt;br /&gt;
&lt;br /&gt;
mist-login01:scratch$ sbatch jobscript.sh&lt;br /&gt;
&lt;br /&gt;
This puts the job in the queue. It will run on the compute nodes in due course. In most cases, you should not submit from your $HOME directory, but rather from your $SCRATCH directory, so that the output of your compute job can be written out ($HOME is read-only on the compute nodes).&lt;br /&gt;
&lt;br /&gt;
Example job scripts can be found below.&lt;br /&gt;
Keep in mind:&lt;br /&gt;
* Scheduling is by single GPU or by full node, so request either 1 GPU or 4 GPUs per node.&lt;br /&gt;
* Your job's maximum walltime is 24 hours. &lt;br /&gt;
* Jobs must write their output to your scratch or project directory (home is read-only on compute nodes).&lt;br /&gt;
* Compute nodes have no internet access.&lt;br /&gt;
* Your job script will not remember the modules you have loaded, so it needs to contain &amp;quot;module load&amp;quot; commands of all the required modules (see examples below). &lt;br /&gt;
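&lt;br /&gt;
After submission, the standard Slurm commands can be used to follow a job. The sketch below is not specific to Mist; JOBID stands for the job number that sbatch prints, and jobscript.sh is the script name from the example above.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
sbatch jobscript.sh        # prints: Submitted batch job JOBID&lt;br /&gt;
squeue -u $USER            # list your pending and running jobs&lt;br /&gt;
scontrol show job JOBID    # detailed information about one job&lt;br /&gt;
scancel JOBID              # cancel the job if needed&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;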
== SOSCIP Users ==&lt;br /&gt;
*[https://www.soscip.org SOSCIP] is a consortium to bring together industrial partners and academic researchers and provide them with sophisticated advanced computing technologies and expertise to solve social, technical and business challenges across sectors and drive economic growth.&lt;br /&gt;
&lt;br /&gt;
If you are working on a SOSCIP project, please contact [mailto:soscip-support@scinet.utoronto.ca soscip-support@scinet.utoronto.ca] to have your user account added to the SOSCIP project accounts. SOSCIP users need to submit jobs with an additional SLURM flag to get higher priority:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#SBATCH -A soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt;    #e.g. soscip-3-001&lt;br /&gt;
# or, equivalently:&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Single-GPU job script ==&lt;br /&gt;
A single-GPU job gets a quarter of a node: 1 GPU, 8 CPU cores (32 hardware threads) and ~58GB of CPU memory. '''Users should never request CPUs or memory explicitly.''' If running an MPI program, you can set --ntasks to the number of MPI ranks. '''Do NOT set --ntasks for non-MPI programs.''' &lt;br /&gt;
*It is suggested to use NVIDIA Multi-Process Service (MPS) if running multiple MPI ranks on one GPU.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
#SBATCH --time=1:00:0&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt; #For SOSCIP projects only&lt;br /&gt;
&lt;br /&gt;
module load anaconda3&lt;br /&gt;
source activate conda_env&lt;br /&gt;
python code.py ...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Full-node job script ==&lt;br /&gt;
'''If you are not sure the program can be executed on multiple GPUs, please follow the single-gpu job instruction above or contact SciNet/SOSCIP support.'''&lt;br /&gt;
&lt;br /&gt;
A multi-GPU job should request a minimum of one full node (4 GPUs). You need to specify the &amp;quot;compute_full_node&amp;quot; partition in order to get all the resources on a node. &lt;br /&gt;
*An example for a 2-node, 8-rank OpenMPI job: (Each rank binds to 1 GPU and 8 physical CPU cores in this case)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=2&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=8&lt;br /&gt;
#SBATCH --time=1:00:00&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt; #For SOSCIP projects only&lt;br /&gt;
&lt;br /&gt;
module load cuda/10.2.89 gcc/8.3.0 openmpi/3.1.5&lt;br /&gt;
&lt;br /&gt;
mpirun -bind-to core -map-by slot:PE=8 -report-bindings ./program&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Jupyter Notebooks =&lt;br /&gt;
SciNet’s [[Jupyter Hub]] is a Niagara-type node; it has a different CPU architecture and no GPUs. Conda environments prepared on Mist will not work there properly. Users who need to use Jupyter Notebook to develop and test some aspects of their workflow can create their own server on the Mist login node and use an SSH tunnel to connect to it from outside. Users who choose to do so have to keep in mind that the login node is a shared resource, and heavy calculations should be done only on compute nodes. Processes (including IPython kernels used by the notebooks) are limited to one hour of total CPU time: idle time will not be counted toward this one hour, and use of multiple cores will count proportionally to the number of cores (i.e. a kernel using all 128 virtual cores on the node will be killed after 28 seconds). Idle notebooks can still burden the node by hogging system and GPU memory; please be mindful of other users and terminate notebooks when work is done.&lt;br /&gt;
&lt;br /&gt;
As an example, let us create a new Conda environment and activate it:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n jupyter_env python=3.7&lt;br /&gt;
source activate jupyter_env&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Install the Jupyter Notebook server:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda install notebook&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Running the notebook server ==&lt;br /&gt;
When the Conda environment is active, enter:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
jupyter-notebook&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
By default, the Jupyter Notebook server uses port 8888 (can be overridden with the &amp;lt;code&amp;gt;--port&amp;lt;/code&amp;gt; option). If another user has already started their own server, the default port may be busy, in which case the server will be listening on a different port. Once launched, the server will output some information to the terminal that will include the actual port number used and a 48-character token. For example:&lt;br /&gt;
&amp;lt;pre&amp;gt;http://localhost:8890/?token=54c4090d……&amp;lt;/pre&amp;gt;&lt;br /&gt;
In this example, the server is listening on port 8890.&lt;br /&gt;
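&lt;br /&gt;
If you want to pick the port out of that URL in a script, shell parameter expansion is enough. This is a minimal sketch: &amp;lt;code&amp;gt;jupyter_port&amp;lt;/code&amp;gt; is a hypothetical helper, not part of Jupyter, and the token in the sample URL is a placeholder.&lt;br /&gt;
```shell
# Hypothetical helper: print the port number contained in a Jupyter server URL.
jupyter_port() (
    url="$1"
    hostport="${url#*://}"       # drop the scheme, leaving host:port/path
    hostport="${hostport%%/*}"   # drop the path and query, leaving host:port
    echo "${hostport##*:}"       # keep what follows the last colon
)

jupyter_port "http://localhost:8890/?token=PLACEHOLDER"   # prints 8890
```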
&lt;br /&gt;
== Creating a tunnel ==&lt;br /&gt;
In order to access this port remotely (i.e. from your office or home), an [https://en.wikipedia.org/wiki/Tunneling_protocol#Secure_Shell_tunneling SSH tunnel] has to be established. Please refer to your SSH client’s documentation for instructions on how to do that. For the OpenSSH client (standard in most Linux distributions and macOS), a tunnel can be opened in a terminal session separate from the one where the Jupyter Notebook server is running. In the new terminal, issue this command:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -L8888:localhost:8890 &amp;lt;username&amp;gt;@mist.scinet.utoronto.ca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Replace &amp;lt;code&amp;gt;&amp;lt;username&amp;gt;&amp;lt;/code&amp;gt; with your actual username. The tunnel stays open as long as this SSH connection is alive. In this example, we tunnel the Mist login node’s port 8890 (where our server is assumed to be running) to our home computer’s port 8888 (any other free port is fine). The notebook can then be accessed in the browser at &amp;lt;code&amp;gt;&amp;lt;nowiki&amp;gt;http://localhost:8888&amp;lt;/nowiki&amp;gt;&amp;lt;/code&amp;gt; (followed by &amp;lt;code&amp;gt;/?token=54c4090d……&amp;lt;/code&amp;gt;; alternatively, the token can be entered on the webpage).&lt;br /&gt;
&lt;br /&gt;
== Using Jupyter on compute nodes ==&lt;br /&gt;
&lt;br /&gt;
You can use the instructions here to set up a Jupyter Notebook server on a compute node (including a [[#Testing_and_debugging|debugjob]]). '''We strongly discourage''' you from running an interactive notebook on a compute node (other than for a debugjob): scheduled jobs run at arbitrary times and are not meant to be interactive. Jupyter notebooks can be run non-interactively or converted to Python scripts.&lt;br /&gt;
&lt;br /&gt;
To launch the Jupyter Notebook server, load the &amp;lt;code&amp;gt;anaconda3&amp;lt;/code&amp;gt; module and activate your environment as before (by adding the appropriate lines to the submission script, if you are not using the compute node with an interactive shell). Launching the server has to be done like so:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
HOME=/dev/shm/$USER jupyter-notebook&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
That is because Jupyter will fail unless it can write to the home folder, which is read-only from compute nodes. This modification of the &amp;lt;code&amp;gt;$HOME&amp;lt;/code&amp;gt; environment variable will carry over into the notebooks, which is usually not a problem, but in case the notebook relies on this environment variable (e.g. to read certain files), it can be reset manually in the notebook (&amp;lt;code&amp;gt;import os; os.environ['HOME']=……&amp;lt;/code&amp;gt;).&lt;br /&gt;
&lt;br /&gt;
Because compute nodes are not accessible from the Internet, tunneling has to be done twice: once from the remote location (office or home) to the Mist login node, and then from the login node to the compute node. Assuming the server is running on port 8890 of the mist006 node, open the first tunnel in a new terminal session on the remote computer:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -L8888:localhost:9999 &amp;lt;username&amp;gt;@mist.scinet.utoronto.ca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
where 9999 is any available port on the Mist login node (to test port availability enter &amp;lt;code&amp;gt;ss -Hln src :9999&amp;lt;/code&amp;gt; in the terminal when connected to the Mist login node; an empty output indicates that the port is free). In the same session in the login node that was created with the above command, open the second tunnel to the compute node:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -L9999:localhost:8890 mist006&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Be aware that the second tunnel will automatically disconnect once the job on the compute node times out or is relinquished. The Jupyter Notebook server running on the compute node can now be accessed from the browser as in the previous subsection.&lt;br /&gt;
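&lt;br /&gt;
To keep the three port numbers straight, the two hops can be generated from one set of parameters. The sketch below only prints the commands; it does not open any connection. The helper name &amp;lt;code&amp;gt;double_tunnel&amp;lt;/code&amp;gt; and the username &amp;lt;code&amp;gt;myuser&amp;lt;/code&amp;gt; are hypothetical, and the ports and node name are the ones assumed in the example above.&lt;br /&gt;
```shell
# Hypothetical helper: print the two ssh commands for the double tunnel.
# Arguments: local port, login-node port, compute-node port, compute node, username.
double_tunnel() (
    local_port="$1"; login_port="$2"; node_port="$3"; node="$4"; user="$5"
    # First hop, to be run on the remote (home or office) computer.
    echo "ssh -L${local_port}:localhost:${login_port} ${user}@mist.scinet.utoronto.ca"
    # Second hop, to be run inside the login-node session opened by the first command.
    echo "ssh -L${login_port}:localhost:${node_port} ${node}"
)

double_tunnel 8888 9999 8890 mist006 myuser
```
The login-node port (9999 here) must appear on the right-hand side of the first command and on the left-hand side of the second, which is the detail most easily mistyped by hand.&lt;br /&gt;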
&lt;br /&gt;
&lt;br /&gt;
= Support =&lt;br /&gt;
&lt;br /&gt;
SciNet inquiries:&lt;br /&gt;
* [mailto:support@scinet.utoronto.ca support@scinet.utoronto.ca]&lt;br /&gt;
&lt;br /&gt;
SOSCIP inquiries:&lt;br /&gt;
*[mailto:soscip-support@scinet.utoronto.ca soscip-support@scinet.utoronto.ca]&lt;/div&gt;</summary>
		<author><name>Ymeiron</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=3130</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=3130"/>
		<updated>2021-06-28T21:11:49Z</updated>

		<summary type="html">&lt;p&gt;Ymeiron: Mist back up&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
{| style=&amp;quot;border-spacing:10px; width: 95%&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:1em; padding-top:.1em; border:2px solid #0645ad; background-color:#f6f6f6; border-radius:7px&amp;quot;|&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Use &amp;quot;Up&amp;quot; or &amp;quot;Down&amp;quot;; these are templates. --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;width:100%&amp;quot; &lt;br /&gt;
|{{Up|Niagara|Niagara_Quickstart}}&lt;br /&gt;
|{{Up |Mist|Mist}}&lt;br /&gt;
|{{Up |Teach|Teach}}&lt;br /&gt;
|{{Up |Rouge|Rouge}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up |Jupyter Hub|Jupyter_Hub}}&lt;br /&gt;
|{{Up |Scheduler|Niagara_Quickstart#Submitting_jobs}}&lt;br /&gt;
|{{Up |File system|Niagara_Quickstart#Storage_and_quotas}}&lt;br /&gt;
|{{Up |Burst Buffer|Burst_Buffer}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up |HPSS|HPSS}}&lt;br /&gt;
|{{Up |Login Nodes|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up |External Network|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up |Globus|Globus}}&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Current Messages: --&amp;gt;&lt;br /&gt;
&amp;lt;b&amp;gt;June 28th, 2021, 4:06 PM:&amp;lt;/b&amp;gt; Mist OS upgrade is complete.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;June 28th, 2021, 9:00 AM:&amp;lt;/b&amp;gt; Mist is under maintenance. OS upgrading from RHEL 7 to 8.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;June 11th, 2021, 8:30 AM:&amp;lt;/b&amp;gt; Maintenance complete. Systems are up.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--&amp;lt;b&amp;gt;June 9th to 10th, 2021:&amp;lt;/b&amp;gt; The SciNet datacentre will have a scheduled maintenance shutdown.  Niagara, Mist, Rouge, HPSS, login nodes, the file systems, and hosted systems will all be offline during the shutdown starting at 7AM EDT on Wednesday June 9th. We expect the systems to be back up in the morning of Friday June 11th.  Check here for updates.--&amp;gt;&lt;br /&gt;
&amp;lt;b&amp;gt;May 27, 2021:&amp;lt;/b&amp;gt; The datamover addresses have changed to improve high-bandwidth connectivity and cybersecurity. The new addresses are 142.1.174.227 for nia-datamover1.scinet.utoronto.ca, and 142.1.174.228 for nia-datamover2.scinet.utoronto.ca.&lt;br /&gt;
&lt;br /&gt;
If you have jobs that need to connect to a software license server using an ssh tunnel through nia-gw (which actually resolves to datamover1 or datamover2), you may need to ask the system administrators of that license server to allow incoming connections from the new addresses above.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  When removing system status entries, please archive them to: https://docs.scinet.utoronto.ca/index.php/Previous_messages --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;border-spacing: 10px;width: 100%&amp;quot;&lt;br /&gt;
|valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== QuickStart Guides ==&lt;br /&gt;
* [[Niagara Quickstart]]&lt;br /&gt;
* [[HPSS | HPSS archival storage]]&lt;br /&gt;
* [[Mist| Mist Power 9 GPU cluster]]&lt;br /&gt;
* [[Teach|Teach cluster]]&lt;br /&gt;
* [[FAQ | FAQ (frequently asked questions)]]&lt;br /&gt;
* [[Acknowledging SciNet]]&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== Tutorials, Manuals, etc. ==&lt;br /&gt;
* [https://support.scinet.utoronto.ca/education/browse.php SciNet education material]&lt;br /&gt;
* [https://www.youtube.com/c/SciNetHPCattheUniversityofToronto SciNet's YouTube channel]&lt;br /&gt;
* [[Modules specific to Niagara|Software Modules specific to Niagara]] &lt;br /&gt;
* [[Commercial software]]&lt;br /&gt;
* [[Burst Buffer]]&lt;br /&gt;
* [[SSH Tunneling]]&lt;br /&gt;
* [[SSH#Two-Factor_authentication|Two-Factor Authentication]]&lt;br /&gt;
* [[Visualization]]&lt;br /&gt;
* [[Running Serial Jobs on Niagara]]&lt;br /&gt;
* [[Jupyter Hub]]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Ymeiron</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Rouge&amp;diff=3018</id>
		<title>Rouge</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Rouge&amp;diff=3018"/>
		<updated>2021-05-08T20:16:46Z</updated>

		<summary type="html">&lt;p&gt;Ymeiron: Clarified compiler section&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox Computer&lt;br /&gt;
|image=[[File:Amd1.jpeg|center|300px|thumb]] &lt;br /&gt;
|name=Rouge&lt;br /&gt;
|installed=March 2021&lt;br /&gt;
|operatingsystem= Linux (CentOS 7.6)&lt;br /&gt;
|loginnode= rouge-login01&lt;br /&gt;
|nnodes=20 &lt;br /&gt;
|gpuspernode=8 MI50-32GB&lt;br /&gt;
|rampernode=512 GB&lt;br /&gt;
|corespernode=48 &lt;br /&gt;
|interconnect=Infiniband (2xEDR)&lt;br /&gt;
|vendorcompilers=rocm/gcc&lt;br /&gt;
|queuetype=slurm&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
= Specifications=&lt;br /&gt;
&lt;br /&gt;
The Rouge cluster was donated to the University of Toronto by AMD as part of their [https://www.amd.com/en/corporate/hpc-fund#:~:text=The%20goal%20of%20the%20AMD,potential%20threats%20to%20global%20health COVID-19 HPC Fund ] support program.  The cluster consists of 20 x86_64 nodes each with a single AMD EPYC 7642 48-Core CPU running at 2.3GHz with 512GB of RAM and 8 Radeon Instinct MI50 GPUs per node.&lt;br /&gt;
 &lt;br /&gt;
The nodes are interconnected with 2xHDR100 Infiniband for internode communications and disk I/O to the SciNet Niagara filesystems.  In total this cluster contains 960 CPU cores and 160 GPUs. &lt;br /&gt;
&lt;br /&gt;
Access and support requests should be sent to '''support@scinet.utoronto.ca'''.&lt;br /&gt;
&lt;br /&gt;
= Getting started on Rouge =&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
Rouge can be accessed directly.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -Y MYCCUSERNAME@rouge.scinet.utoronto.ca&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Rouge login node '''rouge-login01''' can be accessed via the Niagara cluster.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -Y MYCCUSERNAME@niagara.scinet.utoronto.ca&lt;br /&gt;
ssh -Y rouge-login01&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Storage ==&lt;br /&gt;
&lt;br /&gt;
The filesystem for Rouge is currently shared with Niagara cluster. See [https://docs.scinet.utoronto.ca/index.php/Niagara_Quickstart#Your_various_directories Niagara Storage] for more details.&lt;br /&gt;
&lt;br /&gt;
= Loading software modules =&lt;br /&gt;
&lt;br /&gt;
You have two options for running code on Rouge: use existing software, or compile your own.  This section focuses on the former.&lt;br /&gt;
&lt;br /&gt;
Other than essentials, all installed software is made available [[Using_modules | using module commands]]. These modules set environment variables (PATH, etc.), allowing multiple, conflicting versions of a given package to be available.  A detailed explanation of the module system can be [[Using_modules | found on the modules page]].&lt;br /&gt;
&lt;br /&gt;
Common module subcommands are:&lt;br /&gt;
&lt;br /&gt;
* &amp;lt;code&amp;gt;module load &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt;: load the default version of a particular software.&lt;br /&gt;
* &amp;lt;code&amp;gt;module load &amp;lt;module-name&amp;gt;/&amp;lt;module-version&amp;gt;&amp;lt;/code&amp;gt;: load a specific version of a particular software.&lt;br /&gt;
* &amp;lt;code&amp;gt;module purge&amp;lt;/code&amp;gt;: unload all currently loaded modules.&lt;br /&gt;
* &amp;lt;code&amp;gt;module spider&amp;lt;/code&amp;gt; (or &amp;lt;code&amp;gt;module spider &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt;): list available software packages.&lt;br /&gt;
* &amp;lt;code&amp;gt;module avail&amp;lt;/code&amp;gt;: list loadable software packages.&lt;br /&gt;
* &amp;lt;code&amp;gt;module list&amp;lt;/code&amp;gt;: list loaded modules.&lt;br /&gt;
&lt;br /&gt;
Along with modifying common environment variables, such as PATH, and LD_LIBRARY_PATH, these modules also create a SCINET_MODULENAME_ROOT environment variable, which can be used to access commonly needed software directories, such as /include and /lib.&lt;br /&gt;
&lt;br /&gt;
There are handy abbreviations for the module commands. &amp;lt;code&amp;gt;ml&amp;lt;/code&amp;gt; is the same as &amp;lt;code&amp;gt;module list&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;ml &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt; is the same as &amp;lt;code&amp;gt;module load &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
= Available compilers and interpreters =&lt;br /&gt;
&lt;br /&gt;
* The &amp;lt;tt&amp;gt;rocm&amp;lt;/tt&amp;gt; module has to be loaded first for GPU software.&lt;br /&gt;
* To compile MPI code, you must additionally load an &amp;lt;tt&amp;gt;openmpi&amp;lt;/tt&amp;gt; module.&lt;br /&gt;
&lt;br /&gt;
=== ROCm ===&lt;br /&gt;
&lt;br /&gt;
The currently installed ROCm Toolkit version is '''4.1.0'''.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load rocm/&amp;lt;version&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*A compiler (GCC or rocm-clang) module must be loaded in order to use ROCm to build any code.&lt;br /&gt;
&lt;br /&gt;
The current AMD driver version is 5.9.15.  Use '''rocm-smi -a''' for full details.&lt;br /&gt;
&lt;br /&gt;
===Other Compilers and Tools ===&lt;br /&gt;
&lt;br /&gt;
Available compiler modules are:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;gcc/10.3.0&amp;lt;/code&amp;gt; GNU Compiler Collection&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;rocm-clang/4.1.0&amp;lt;/code&amp;gt; Clang&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;hipify-clang/12.0.0&amp;lt;/code&amp;gt; Tool for translating CUDA sources into HIP sources&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;aocc/3.0.0&amp;lt;/code&amp;gt; AMD Optimizing C/C++ Compiler (Clang-based)&lt;br /&gt;
&lt;br /&gt;
=== OpenMPI ===&lt;br /&gt;
&amp;lt;tt&amp;gt;openmpi/&amp;lt;version&amp;gt;&amp;lt;/tt&amp;gt; module is available with different compilers.&lt;br /&gt;
&lt;br /&gt;
= Software =&lt;br /&gt;
&lt;br /&gt;
== Singularity Containers ==&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
/scinet/rouge/amd/containers/gromacs.rocm401.ubuntu18.sif&lt;br /&gt;
/scinet/rouge/amd/containers/lammps.rocm401.ubuntu18.sif&lt;br /&gt;
/scinet/rouge/amd/containers/namd.rocm401.ubuntu18.sif&lt;br /&gt;
/scinet/rouge/amd/containers/openmm.rocm401.ubuntu18.sif&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== GROMACS ==&lt;br /&gt;
The HIP version of GROMACS 2020.3 (better performance than OpenCL version) is provided by AMD in a container. Currently it is suggested to use a single GPU for all simulations.&lt;br /&gt;
Job example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=1:00:00&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
&lt;br /&gt;
export SINGULARITY_HOME=$SLURM_SUBMIT_DIR&lt;br /&gt;
&lt;br /&gt;
singularity exec -B /home -B /scratch --env OMP_PLACES=cores /scinet/rouge/amd/containers/gromacs.rocm401.ubuntu18.sif gmx mdrun -pin off -ntmpi 1 -ntomp 6 ......&lt;br /&gt;
&lt;br /&gt;
# setting '-ntomp 4' might give better performance, do your own benchmark. not recommended to set larger than 6 for single GPU job&lt;br /&gt;
# if you worry about 'GPU update with domain decomposition lacks substantial testing and should be used with caution.' warning message (if there is any), add '-update cpu' to override&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== NAMD ==&lt;br /&gt;
The HIP version of NAMD (3.0a) is provided by AMD in a container. Currently it is suggested to use a single GPU for all simulations.&lt;br /&gt;
Job example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=1:00:00&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
&lt;br /&gt;
export SINGULARITY_HOME=$SLURM_SUBMIT_DIR&lt;br /&gt;
&lt;br /&gt;
singularity exec -B /home -B /scratch --env LD_LIBRARY_PATH=/opt/rocm/lib:/.singularity.d/libs /scinet/rouge/amd/containers/namd.rocm401.ubuntu18.sif namd2 +idlepoll +p 12 stmv.namd&lt;br /&gt;
# do not set +p flag larger than 12, there are only 6 cores (12 threads) per single GPU job.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== PyTorch ==&lt;br /&gt;
Install PyTorch into a python virtual environment:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load python gcc&lt;br /&gt;
mkdir -p ~/.virtualenvs&lt;br /&gt;
virtualenv --system-site-packages ~/.virtualenvs/pytorch-rocm&lt;br /&gt;
source ~/.virtualenvs/pytorch-rocm/bin/activate&lt;br /&gt;
pip3 install torch -f https://download.pytorch.org/whl/rocm4.0.1/torch_stable.html&lt;br /&gt;
pip3 install ninja &amp;amp;&amp;amp; pip3 install 'git+https://github.com/pytorch/vision.git@v0.9.1'&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Run PyTorch job with single GPU:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=1:00:00&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
&lt;br /&gt;
module load python gcc&lt;br /&gt;
source ~/.virtualenvs/pytorch-rocm/bin/activate&lt;br /&gt;
python code.py&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Testing and debugging =&lt;br /&gt;
&lt;br /&gt;
You really should test your code before you submit it to the cluster to know if your code is correct and what kind of resources you need.&lt;br /&gt;
* Small test jobs can be run on the login node.  Rule of thumb: tests should run no more than a couple of minutes, taking at most about 1-2GB of memory, and use no more than one gpu and a few cores.&lt;br /&gt;
&lt;br /&gt;
* For short tests that do not fit on a login node, or for which you need a dedicated node, request an interactive debug job with the &amp;lt;tt&amp;gt;debugjob&amp;lt;/tt&amp;gt; command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
rouge-login01:~$ debugjob --clean -g G=1&lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
&lt;br /&gt;
where G is the number of GPUs.  G=1 gives an interactive session for 2 hours, G=4 gets you a node with 4 GPUs for 30 minutes, and G=8 (the maximum) gets you a full node with 8 GPUs for 30 minutes.  The &amp;lt;tt&amp;gt;--clean&amp;lt;/tt&amp;gt; argument is optional but recommended, as it will start the session without any modules loaded, thus mimicking more closely what happens when you submit a job script.&lt;br /&gt;
&lt;br /&gt;
= Submitting jobs =&lt;br /&gt;
Once you have compiled and tested your code or workflow on the Rouge login nodes, and confirmed that it behaves correctly, you are ready to submit jobs to the cluster.  Your jobs will run on one of Rouge's 20 compute nodes.  When and where your job runs is determined by the scheduler.&lt;br /&gt;
&lt;br /&gt;
Rouge uses SLURM as its job scheduler. &lt;br /&gt;
&lt;br /&gt;
You submit jobs from a login node by passing a script to the sbatch command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
rouge-login01:scratch$ sbatch jobscript.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This puts the job in the queue. It will run on the compute nodes in due course. In most cases, you should not submit from your $HOME directory, but rather, from your $SCRATCH directory, so that the output of your compute job can be written out (as mentioned above, $HOME is read-only on the compute nodes).&lt;br /&gt;
&lt;br /&gt;
Example job scripts can be found below.&lt;br /&gt;
Keep in mind:&lt;br /&gt;
* Scheduling is by GPU, each with 6 CPU cores.&lt;br /&gt;
* Your job's maximum walltime is 24 hours. &lt;br /&gt;
* Jobs must write their output to your scratch or project directory (home is read-only on compute nodes).&lt;br /&gt;
* Compute nodes have no internet access.&lt;br /&gt;
* Your job script will not remember the modules you have loaded, so it needs to contain &amp;quot;module load&amp;quot; commands of all the required modules (see examples below).&lt;br /&gt;
&lt;br /&gt;
== Single-GPU job script ==&lt;br /&gt;
For a single-GPU job, each job gets 1/8 of a node: 1 GPU, 6 CPU cores (12 threads) and ~64GB of CPU memory. '''Users should never request CPUs or memory explicitly.''' If you are running an MPI program, you can set --ntasks to the number of MPI ranks. '''Do NOT set --ntasks for non-MPI programs.''' &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
#SBATCH --time=1:00:00&lt;br /&gt;
&lt;br /&gt;
module load &amp;lt;modules you need&amp;gt;&lt;br /&gt;
Run your program&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
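&lt;br /&gt;
The 1/8 share quoted above follows from dividing a node's resources by its 8 GPUs. The sketch below redoes that arithmetic with the node totals from the specifications (48 cores, 512 GB); the helper name &amp;lt;code&amp;gt;gpu_share&amp;lt;/code&amp;gt; is hypothetical, and the linear scaling with GPU count and the 2 threads per core are assumptions of this example rather than a statement of scheduler policy.&lt;br /&gt;
```shell
# Node totals taken from the Rouge specifications in this document.
NODE_GPUS=8
NODE_CORES=48
NODE_MEM_GB=512

# Print the CPU cores, hardware threads and memory (GB) that accompany G GPUs,
# assuming resources scale linearly with the GPU count and 2 threads per core.
gpu_share() (
    gpus="$1"
    cores=$(( NODE_CORES * gpus / NODE_GPUS ))
    threads=$(( cores * 2 ))
    mem=$(( NODE_MEM_GB * gpus / NODE_GPUS ))
    echo "${gpus} GPU(s): ${cores} cores, ${threads} threads, about ${mem} GB"
)

gpu_share 1   # prints: 1 GPU(s): 6 cores, 12 threads, about 64 GB
```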
&lt;br /&gt;
== Full-node job script ==&lt;br /&gt;
'''If you are not sure the program can be executed on multiple GPUs, please follow the single-gpu job instruction above or contact SciNet support.'''&lt;br /&gt;
&lt;br /&gt;
A multi-GPU job should ask for a minimum of one full node (8 GPUs). Users need to specify the &amp;quot;compute_full_node&amp;quot; partition in order to get all resources on a node. &lt;br /&gt;
*An example for a 1-node job:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=8&lt;br /&gt;
#SBATCH --ntasks=8 #this only affects MPI job&lt;br /&gt;
#SBATCH --time=1:00:00&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
&lt;br /&gt;
module load &amp;lt;modules you need&amp;gt;&lt;br /&gt;
Run your program&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Ymeiron</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Visualization&amp;diff=2948</id>
		<title>Visualization</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Visualization&amp;diff=2948"/>
		<updated>2021-02-25T16:36:09Z</updated>

		<summary type="html">&lt;p&gt;Ymeiron: Indicating that ParaView and VisIt are currently available only in NiaEnv/2018a&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Software Available ==&lt;br /&gt;
We have recent versions of the open source visualization suites installed on Niagara: VMD (1.9.4) is available as a module in the default NiaEnv/'''2019b''' environment. The NiaEnv/'''2018a''' environment additionally has modules for VisIt (2.13.1) and ParaView (5.5.0, and 5.6.0 for offscreen rendering only).&lt;br /&gt;
&lt;br /&gt;
Note that when using ParaView you need to explicitly specify one of the Mesa flags in order to avoid trying to use OpenGL, i.e.,&lt;br /&gt;
after loading the paraview module, use the following command:&lt;br /&gt;
&lt;br /&gt;
  paraview --mesa-swr&lt;br /&gt;
&lt;br /&gt;
Note that Niagara has neither specialized nodes nor specially designated hardware for visualization, so if you want to perform interactive visualization or exploration of your data you will need to submit an interactive job (debug job, see [https://docs.scinet.utoronto.ca/index.php/Niagara_Quickstart#Testing Niagara Quickstart]).&lt;br /&gt;
For the same reason, you won't be able to request or use GPUs for rendering, as there are none!&lt;br /&gt;
&lt;br /&gt;
== Interactive Visualization ==&lt;br /&gt;
Runtime is limited on the login nodes, so you will need to request a testing job to have more time for exploring and visualizing your data.&lt;br /&gt;
Doing so also gives you access to the 40 cores of each of the nodes requested.&lt;br /&gt;
To perform an interactive visualization session in this way, please follow these steps:&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; ssh into niagara.scinet.utoronto.ca with the -X/-Y flag for [https://docs.scinet.utoronto.ca/index.php/SSH#X11_Forwarding X-forwarding]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; request an interactive job, i.e.&amp;lt;/li&amp;gt;&lt;br /&gt;
   debugjob&lt;br /&gt;
This will connect you to a node; for the sake of argument, let's call it &amp;quot;niaXYZW&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; run your favourite visualization program, e.g. VisIt/ParaView &amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
   module load NiaEnv/2018a visit&lt;br /&gt;
   visit&lt;br /&gt;
&lt;br /&gt;
   module load NiaEnv/2018a paraview&lt;br /&gt;
   paraview --mesa-swr&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; exit the debug session.&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Parallel Visualization with VisIt ==&lt;br /&gt;
In order to be able to use VisIt's parallel rendering capabilities, the following command must be issued after loading the corresponding VisIt module:&lt;br /&gt;
&lt;br /&gt;
  export  LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${SCINET_VISIT_BASE}/2.13.1/linux-x86_64/lib&lt;br /&gt;
&lt;br /&gt;
In this case, the command is used for the visit/2.13.1 module from the NiaEnv/2018a environment.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Remote Visualization -- Client-Server Mode ==&lt;br /&gt;
Both VisIt and ParaView support &amp;quot;remote visualization&amp;quot; protocols, and you can use any of them.&lt;br /&gt;
This includes:&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;  accessing data remotely, i.e. stored on the cluster&lt;br /&gt;
&amp;lt;li&amp;gt; rendering visualizations using the compute nodes as rendering engines&lt;br /&gt;
&amp;lt;li&amp;gt; or both&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== VisIt Client-Server Configuration ===&lt;br /&gt;
To allow VisIt to connect to the Niagara cluster, you need to set up a &amp;quot;Host Configuration&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Choose '''one''' of the methods below:&lt;br /&gt;
&lt;br /&gt;
====Niagara Host Configuration File====&lt;br /&gt;
You can simply download the Niagara host file: right-click on the following link, [https://support.scinet.utoronto.ca/~mponce/viz/host_niagara.xml host_niagara.xml], and select &amp;quot;Save as...&amp;quot;. &lt;br /&gt;
Depending on the OS you are using on your local machine:&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; on a Linux/Mac OS place this file in &amp;lt;code&amp;gt;~/.visit/hosts/&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; on a Windows machine, place the file in  &amp;lt;code&amp;gt;My Documents\VisIt 2.13.0\hosts\&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Restart VisIt and check that the niagara profile is available in your hosts.&lt;br /&gt;
&lt;br /&gt;
====Manual Niagara Host Configuration====&lt;br /&gt;
If you prefer to set up the server yourself instead of using the configuration file from the previous section, follow these steps.&lt;br /&gt;
Open VisIt in your computer, go to the 'Options' menu, and click on &amp;quot;Host profiles...&amp;quot;&lt;br /&gt;
Then click on 'New Host' and select:&lt;br /&gt;
&lt;br /&gt;
 Host nickname = niagara&lt;br /&gt;
 Remote host name = niagara.scinet.utoronto.ca&lt;br /&gt;
 Username = Enter_Your_OWN_username_HERE&lt;br /&gt;
 Path to VisIt installation = /scinet/niagara/software/2018a/opt/base/visit/2.13.1&lt;br /&gt;
&lt;br /&gt;
Click on the &amp;quot;&amp;lt;code&amp;gt;Tunnel data connections through SSH&amp;lt;/code&amp;gt;&amp;quot; option, and then hit Apply!&lt;br /&gt;
&lt;br /&gt;
{| align=&amp;quot;center&amp;quot;&lt;br /&gt;
| [[File:Visit_niagara-01.png|480px|]]&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Now on the top of the window click on 'Launch Profiles' tab.&lt;br /&gt;
You will have to create two profiles:&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; &amp;lt;code&amp;gt;login&amp;lt;/code&amp;gt;: for connecting through the login nodes and accessing data &amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; &amp;lt;code&amp;gt;slurm&amp;lt;/code&amp;gt;: for using compute nodes as rendering engines &amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
To do so, click on 'New Profile' and set the corresponding profile name, i.e. login or slurm.&lt;br /&gt;
Then click on the Parallel tab and enable the &amp;quot;Launch parallel engine&amp;quot; option.&lt;br /&gt;
&lt;br /&gt;
For the slurm profile, you will need to set the parameters as seen below:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;br style=&amp;quot;clear:both&amp;quot; /&amp;gt;&lt;br /&gt;
{| align=&amp;quot;center&amp;quot;&lt;br /&gt;
| [[File:Visit_niagara-02.png|400px|]]&lt;br /&gt;
| [[File:Visit_niagara-03.png|400px|]]&lt;br /&gt;
|}&lt;br /&gt;
&amp;lt;br style=&amp;quot;clear:both&amp;quot; /&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Finally, after you are done with these changes, go to the &amp;quot;Options&amp;quot; menu and select &amp;quot;Save settings&amp;quot;, so that your changes are saved and available next time you relaunch VisIt.&lt;br /&gt;
&lt;br /&gt;
=== ParaView Client-Server Configuration ===&lt;br /&gt;
As with VisIt, you will need to start a &amp;lt;code&amp;gt;debugjob&amp;lt;/code&amp;gt; in order to use a compute node to access files and compute resources.&lt;br /&gt;
Here are the steps to follow:&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Launch an interactive job (debugjob) on Niagara,&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
  debugjob&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; After getting a compute node, let's say niaXYZW, load the ParaView &amp;quot;offscreen&amp;quot; module and start a ParaView server,&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
  module load NiaEnv/2018a paraview-offscreen/5.6.0&lt;br /&gt;
  pvserver --mesa-swr-avx2&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;--mesa-swr-avx2&amp;lt;/code&amp;gt; flag has been reported to offer faster software rendering using the OpenSWR library.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; Now, you have to wait a few seconds for the server to be ready to accept client connections.&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
  Waiting for client...&lt;br /&gt;
  Connection URL: cs://niaXYZW.scinet.local:11111&lt;br /&gt;
  Accepting connection(s): niaXYZW.scinet.local:11111&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; Open a new terminal without closing your debugjob, and ssh into Niagara using the following command,&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
  ssh YOURusername@niagara.scinet.utoronto.ca -L11111:niaXYZW:11111 -N&lt;br /&gt;
&lt;br /&gt;
This will establish a tunnel mapping port 11111 on your computer (&amp;lt;code&amp;gt;localhost&amp;lt;/code&amp;gt;) to port 11111 on Niagara's compute node, &amp;lt;code&amp;gt;niaXYZW&amp;lt;/code&amp;gt;, where the ParaView server will be waiting for connections.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; Start ParaView on your local computer, go to &amp;quot;File -&amp;gt; Connect&amp;quot; and click on 'Add Server'.&lt;br /&gt;
You will need to point ParaView to your local port &amp;lt;code&amp;gt;11111&amp;lt;/code&amp;gt;, so you can do something like&amp;lt;/li&amp;gt;&lt;br /&gt;
 name = niagara&lt;br /&gt;
 server type = Client/Server&lt;br /&gt;
 host = localhost&lt;br /&gt;
 port = 11111&lt;br /&gt;
then click Configure, select &amp;lt;code&amp;gt;Manual&amp;lt;/code&amp;gt; and click Save.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; Once the remote server is added to the configuration, simply select the server from the list and click Connect.&lt;br /&gt;
The first terminal window that read &amp;lt;code&amp;gt;Accepting connection...&amp;lt;/code&amp;gt; will now read &amp;lt;code&amp;gt;Client connected&amp;lt;/code&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; Open a file in ParaView (it will point you to the remote filesystem) and visualize it as usual.&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
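&lt;br /&gt;
The tunnel from step 4 can also be assembled programmatically. The following is a minimal Python sketch, not part of the SciNet tooling: the function name &amp;lt;code&amp;gt;tunnel_command&amp;lt;/code&amp;gt; and its parameters are hypothetical, and it only rebuilds the &amp;lt;code&amp;gt;ssh&amp;lt;/code&amp;gt; command shown above from a username, the compute node name and the port.&lt;br /&gt;
&lt;br /&gt;
```python
import shlex


def tunnel_command(username, compute_node, port=11111,
                   login_host="niagara.scinet.utoronto.ca"):
    """Return the ssh argument list that forwards a local port to pvserver.

    Hypothetical helper: it only reproduces the command from step 4.
    """
    # Same port number on both ends, as in the instructions above.
    forward = "{0}:{1}:{0}".format(port, compute_node)
    return ["ssh", "{}@{}".format(username, login_host), "-L", forward, "-N"]


if __name__ == "__main__":
    # niaXYZW stands for the compute node assigned to your debugjob.
    print(shlex.join(tunnel_command("YOURusername", "niaXYZW")))
```
&lt;br /&gt;
The argument list can be passed to &amp;lt;code&amp;gt;subprocess.run&amp;lt;/code&amp;gt; or printed and pasted into a terminal; either way the tunnel must stay open for as long as the ParaView client is connected.&lt;br /&gt;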
&lt;br /&gt;
====Multiple CPUs====&lt;br /&gt;
To perform parallel rendering on multiple CPUs, run &amp;lt;code&amp;gt;pvserver&amp;lt;/code&amp;gt; with &amp;lt;code&amp;gt;mpiexec&amp;lt;/code&amp;gt;, i.e. either submit a job script or request an interactive job (replacing N with the number of nodes, so that N*40 becomes an actual number) using&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt; salloc --ntasks=N*40 --nodes=N --time=1:00:00&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
  module load NiaEnv/2018a paraview-offscreen/5.6.0&lt;br /&gt;
  mpiexec pvserver --mesa-swr-avx2&lt;br /&gt;
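&lt;br /&gt;
Since &amp;lt;code&amp;gt;--ntasks=N*40&amp;lt;/code&amp;gt; is a placeholder rather than something to type literally, the arithmetic can be sketched as follows. This is an illustrative Python helper only: the name &amp;lt;code&amp;gt;salloc_command&amp;lt;/code&amp;gt; is hypothetical, and the default of 40 tasks per node is taken from the N*40 in the command above.&lt;br /&gt;
&lt;br /&gt;
```python
def salloc_command(nodes, tasks_per_node=40, walltime="1:00:00"):
    """Build the salloc request for a parallel pvserver session.

    tasks_per_node=40 mirrors the N*40 placeholder in the text; it is
    an assumption to adjust if your nodes differ.
    """
    ntasks = nodes * tasks_per_node
    return "salloc --ntasks={} --nodes={} --time={}".format(
        ntasks, nodes, walltime)


if __name__ == "__main__":
    # Two nodes with 40 tasks each give 80 tasks in total.
    print(salloc_command(2))
```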
&lt;br /&gt;
=== Final Considerations ===&lt;br /&gt;
Both VisIt and ParaView usually require the local client and the remote host to run the same version. Please stick to matching versions to avoid incompatibilities that can cause problems when connecting.&lt;br /&gt;
&lt;br /&gt;
== Batch mode and Scripting ==&lt;br /&gt;
Both VisIt and ParaView support batch processing through scripting in different languages.&lt;br /&gt;
&lt;br /&gt;
=== VisIt ===&lt;br /&gt;
To launch the Python interpreter for VisIt, run the following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt; visit -cli -no-win -norun engine_par &amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The 'engine_par' flag is needed in order to run the visualization engine in parallel.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
In order to execute scripts in batch mode, use the following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt; visit -cli -no-win -norun engine_par  -s scriptName.py&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note that to run in parallel, an MPI module must be loaded in addition to the VisIt module, e.g.&lt;br /&gt;
&lt;br /&gt;
   module load NiaEnv/2018a&lt;br /&gt;
   module load visit&lt;br /&gt;
   module load intel/2018.2 intelmpi/2018.2 &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Alternatively, to request cluster resources from within the script itself, add the following line to the script:&lt;br /&gt;
&lt;br /&gt;
   OpenComputeEngine(&amp;quot;localhost&amp;quot;, (&amp;quot;-l&amp;quot;, &amp;quot;srun&amp;quot;, &amp;quot;-np&amp;quot;, &amp;quot;40&amp;quot;))&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
References:&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
[https://www.visitusers.org/index.php?title=ParallelPorting#Making_sure_you_actually_have_a_parallel_engine https://www.visitusers.org/index.php?title=ParallelPorting#Making_sure_you_actually_have_a_parallel_engine]&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
[https://www.visitusers.org/index.php?title=VisIt-tutorial-Python-scripting https://www.visitusers.org/index.php?title=VisIt-tutorial-Python-scripting]&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
[https://www.visitusers.org/index.php?title=Python_examples https://www.visitusers.org/index.php?title=Python_examples]&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
[https://www.visitusers.org/index.php?title=VisIt-tutorial-Advanced-scripting https://www.visitusers.org/index.php?title=VisIt-tutorial-Advanced-scripting]&lt;br /&gt;
&lt;br /&gt;
=== ParaView ===&lt;br /&gt;
ParaView offers a Python interpreter, &amp;lt;code&amp;gt;pvbatch&amp;lt;/code&amp;gt;, for running scripts in batch mode:&lt;br /&gt;
&lt;br /&gt;
 pvbatch --mesa-swr-avx2 --force-offscreen-rendering scriptName.py&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
References:&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
[https://www.paraview.org/Wiki/ParaView_and_Batch https://www.paraview.org/Wiki/ParaView_and_Batch]&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
[https://www.paraview.org/Wiki/ParaView/Python_Scripting https://www.paraview.org/Wiki/ParaView/Python_Scripting]&lt;br /&gt;
&lt;br /&gt;
== Other Versions ==&lt;br /&gt;
Alternatively, you can use the visualization modules available on the CCEnv stack: load the CCEnv module and select your preferred visualization module.&lt;/div&gt;</summary>
		<author><name>Ymeiron</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Jupyter_Hub&amp;diff=2924</id>
		<title>Jupyter Hub</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Jupyter_Hub&amp;diff=2924"/>
		<updated>2021-01-28T02:26:11Z</updated>

		<summary type="html">&lt;p&gt;Ymeiron: Added a note about Mist&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Niagara has a node which has been designated a Jupyter Hub.  This node can be used to run your Jupyter Notebook sessions.&lt;br /&gt;
&lt;br /&gt;
==Jupyter Hub node on Niagara==&lt;br /&gt;
The Niagara Jupyter Hub server is a single node, with 1TB of memory and 80 (hyperthreaded) cores.  You can access the Jupyter Hub directly, from outside of SciNet, via [https://jupyter.scinet.utoronto.ca jupyter.scinet.utoronto.ca].  &lt;br /&gt;
&lt;br /&gt;
* Point your browser to '[https://jupyter.scinet.utoronto.ca jupyter.scinet.utoronto.ca]' and log in with your Compute Canada account.&lt;br /&gt;
* The browser should now show the files in your $HOME on Niagara. (If not, try reloading the page; it may have timed out.)&lt;br /&gt;
* To see your files on $SCRATCH, you need to have a symbolic link to $SCRATCH in your $HOME folder. This can be done by typing, once, in a terminal:&lt;br /&gt;
&lt;br /&gt;
    ln -sT $SCRATCH $HOME/scratch&lt;br /&gt;
&lt;br /&gt;
* You can open or create notebooks using Python (v3.8.5), R (3.6.8 or 4.0.3) and Julia (v1.5.2).&lt;br /&gt;
&lt;br /&gt;
* Many Python and R packages are already pre-installed.&lt;br /&gt;
&lt;br /&gt;
[[File:Jupyterscreen5.png | 800px]]&lt;br /&gt;
&lt;br /&gt;
===Tips to get started===&lt;br /&gt;
&lt;br /&gt;
* Jupyter can also browse your (Niagara) files and edit them.&lt;br /&gt;
* Use the 'new' button to create a new python notebook.&lt;br /&gt;
* Give your notebooks reasonable names.&lt;br /&gt;
* To execute a Python input line, press `Shift-Enter`.&lt;br /&gt;
* Save your work periodically (even though there is autosave).&lt;br /&gt;
* To work similarly to `ipython --pylab`, do:&lt;br /&gt;
&lt;br /&gt;
 In [1]: from pylab import *&lt;br /&gt;
         %matplotlib notebook&lt;br /&gt;
&lt;br /&gt;
==Using virtual environments in the JupyterHub==&lt;br /&gt;
&lt;br /&gt;
===Starting a new virtual environment from the JupyterHub===&lt;br /&gt;
&lt;br /&gt;
To start a new virtual environment from the JupyterHub, just start a New Terminal session, and on the command-line, use something like this:&lt;br /&gt;
&lt;br /&gt;
 module load NiaEnv/2019b python/3.8.5    # or: module load CCEnv nixpkgs python/3.6.3&lt;br /&gt;
 virtualenv --system-site-packages ~/.virtualenvs/NEWENVNAME&lt;br /&gt;
 source ~/.virtualenvs/NEWENVNAME/bin/activate&lt;br /&gt;
 pip install ipykernel&lt;br /&gt;
 python -m ipykernel install --user --name=NEWENVNAME&lt;br /&gt;
&lt;br /&gt;
Then, after reloading the JupyterHub page, you should see a &amp;quot;NEWENVNAME&amp;quot; menu item in the &amp;quot;New&amp;quot; dropdown button of the JupyterHub. &lt;br /&gt;
&lt;br /&gt;
===Existing environments===&lt;br /&gt;
&lt;br /&gt;
If you have already [[Installing_your_own_Python_Modules | created a virtual environment]] on Niagara, and wish to use it in the notebook, you need to install an additional package and make this environment known to the JupyterHub.  After activating your environment on the command line, type the following:&lt;br /&gt;
&lt;br /&gt;
 pip install ipykernel&lt;br /&gt;
 python -m ipykernel install --user --name=ENVNAME&lt;br /&gt;
 venv2jup&lt;br /&gt;
&lt;br /&gt;
Then, after reloading the JupyterHub page, you should see an &amp;quot;ENVNAME&amp;quot; menu item in the &amp;quot;New&amp;quot; dropdown button of the JupyterHub. &lt;br /&gt;
&lt;br /&gt;
Note: this works with virtual environments that are created from the python modules in the NiaEnv/2019b stack and the CCEnv stack, but not with those from the NiaEnv/2018a stack.&lt;br /&gt;
&lt;br /&gt;
==Using the Jupyter Hub responsibly==&lt;br /&gt;
&lt;br /&gt;
Jupyter notebooks are a useful environment for data exploration, pipeline development, and other hands-on work.  Such notebooks are not, however, intended for heavy production data crunching.  If you need to do heavy data crunching you should develop a script and run such work on the compute nodes.&lt;br /&gt;
&lt;br /&gt;
Furthermore, the Jupyter Hub is a shared resource.  Other users will be using the node at the same time you are using it.  Please do not use more than a few cores at a time, and do not use an excessive amount of memory.&lt;br /&gt;
&lt;br /&gt;
==Advantages and Disadvantages of a Notebook Environment==&lt;br /&gt;
&lt;br /&gt;
Drawbacks:&lt;br /&gt;
* Notebook files (.ipynb) are not scripts.&lt;br /&gt;
* Notebooks do not (always) work well with version control.&lt;br /&gt;
* The environment is designed to run in browser.&lt;br /&gt;
* The back-end runs on shared resources.&lt;br /&gt;
* Graphics are inline, which is great for quick exploration but makes tweaking a plot harder (IPython+X works better for this).&lt;br /&gt;
* You can jump around in the notebook, and execute different parts: it can be hard to keep track of what you did.&lt;br /&gt;
&lt;br /&gt;
Advantages:&lt;br /&gt;
* You can jump around in the notebook, and execute different parts: Easier exploration, experimentation and debugging.&lt;br /&gt;
* Auto-save.&lt;br /&gt;
* You can rerun parts of your code (while, e.g., keeping large data in memory)&lt;br /&gt;
* You can add text portions, making your notebook more like an article.&lt;br /&gt;
* This in turn can be useful for sharing, demos, teaching, ...&lt;br /&gt;
* You can still export as a script.&lt;br /&gt;
* Also has a terminal.&lt;br /&gt;
&lt;br /&gt;
= Jupyter on Mist =&lt;br /&gt;
The Mist GPU cluster has a different CPU architecture; Conda environments prepared on Mist will not work properly on the Niagara Jupyter Hub. See [[Mist#Jupyter_Notebooks|here]] for how to run a Jupyter instance on Mist.&lt;/div&gt;</summary>
		<author><name>Ymeiron</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Mist&amp;diff=2923</id>
		<title>Mist</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Mist&amp;diff=2923"/>
		<updated>2021-01-28T02:16:02Z</updated>

		<summary type="html">&lt;p&gt;Ymeiron: Added a section on Jupyter Notebooks&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox Computer&lt;br /&gt;
|image=[[Image:Mist.jpg|center|300px|thumb]]&lt;br /&gt;
|name=Mist&lt;br /&gt;
|installed=Dec 2019&lt;br /&gt;
|operatingsystem= Red Hat Enterprise Linux 7.6 &lt;br /&gt;
|loginnode= mist.scinet.utoronto.ca&lt;br /&gt;
|nnodes=  54 IBM AC922&lt;br /&gt;
|rampernode= 256 GB  &lt;br /&gt;
|gpuspernode=4 V100-SXM2-32GB&lt;br /&gt;
|interconnect=Mellanox EDR&lt;br /&gt;
|vendorcompilers= NVCC, IBM XL&lt;br /&gt;
|queuetype=Slurm&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
=Specifications=&lt;br /&gt;
Mist is a SciNet-[[#SOSCIP Users |SOSCIP]] joint GPU cluster consisting of 54 IBM AC922 servers. Each node of the cluster has 32 IBM Power9 cores, 256GB RAM and 4 NVIDIA V100-SXM2-32GB GPUs connected by NVLink. The cluster has an InfiniBand EDR interconnect providing GPU-Direct RDMA capability.&lt;br /&gt;
&lt;br /&gt;
= Getting started on Mist =&lt;br /&gt;
Mist can be accessed directly:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -Y MYCCUSERNAME@mist.scinet.utoronto.ca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The Mist login node '''mist-login01''' can also be accessed via the Niagara cluster:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -Y MYCCUSERNAME@niagara.scinet.utoronto.ca&lt;br /&gt;
ssh -Y mist-login01&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
== Storage ==&lt;br /&gt;
The filesystem for Mist is shared with the Niagara cluster. See [https://docs.scinet.utoronto.ca/index.php/Niagara_Quickstart#Your_various_directories Niagara Storage] for more details.&lt;br /&gt;
&lt;br /&gt;
= Loading software modules =&lt;br /&gt;
&lt;br /&gt;
You have two options for running code on Mist: use existing software, or compile your own.  This section focuses on the former.&lt;br /&gt;
&lt;br /&gt;
Other than essentials, all installed software is made available [[Using_modules | using module commands]]. These modules set environment variables (PATH, etc.), allowing multiple, conflicting versions of a given package to be available.  A detailed explanation of the module system can be [[Using_modules | found on the modules page]].&lt;br /&gt;
&lt;br /&gt;
Common module subcommands are:&lt;br /&gt;
&lt;br /&gt;
* &amp;lt;code&amp;gt;module load &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt;: load the default version of a particular software.&lt;br /&gt;
* &amp;lt;code&amp;gt;module load &amp;lt;module-name&amp;gt;/&amp;lt;module-version&amp;gt;&amp;lt;/code&amp;gt;: load a specific version of a particular software.&lt;br /&gt;
* &amp;lt;code&amp;gt;module purge&amp;lt;/code&amp;gt;: unload all currently loaded modules.&lt;br /&gt;
* &amp;lt;code&amp;gt;module spider&amp;lt;/code&amp;gt; (or &amp;lt;code&amp;gt;module spider &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt;): list available software packages.&lt;br /&gt;
* &amp;lt;code&amp;gt;module avail&amp;lt;/code&amp;gt;: list loadable software packages.&lt;br /&gt;
* &amp;lt;code&amp;gt;module list&amp;lt;/code&amp;gt;: list loaded modules.&lt;br /&gt;
&lt;br /&gt;
Along with modifying common environment variables, such as PATH, and LD_LIBRARY_PATH, these modules also create a SCINET_MODULENAME_ROOT environment variable, which can be used to access commonly needed software directories, such as /include and /lib.&lt;br /&gt;
&lt;br /&gt;
There are handy abbreviations for the module commands. &amp;lt;code&amp;gt;ml&amp;lt;/code&amp;gt; is the same as &amp;lt;code&amp;gt;module list&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;ml &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt; is the same as &amp;lt;code&amp;gt;module load &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt;.&lt;br /&gt;
== Tips for loading software ==&lt;br /&gt;
&lt;br /&gt;
* We advise '''''against''''' loading modules in your .bashrc.  This can lead to very confusing behaviour under certain circumstances.  Our guidelines for .bashrc files can be found [[bashrc guidelines|here]].&lt;br /&gt;
* Instead, load modules by hand when needed, or by sourcing a separate script.&lt;br /&gt;
* Load run-specific modules inside your job submission script.&lt;br /&gt;
* Short names give default versions; e.g. &amp;lt;code&amp;gt;cuda&amp;lt;/code&amp;gt; → &amp;lt;code&amp;gt;cuda/10.1.243&amp;lt;/code&amp;gt;. It is usually better to be explicit about the versions, for future reproducibility.&lt;br /&gt;
* Modules often require other modules to be loaded first.  Solve these dependencies by using [[Using_modules#Module_spider | &amp;lt;code&amp;gt;module spider&amp;lt;/code&amp;gt;]].&lt;br /&gt;
&lt;br /&gt;
= Available compilers and interpreters =&lt;br /&gt;
* The &amp;lt;tt&amp;gt;cuda&amp;lt;/tt&amp;gt; module has to be loaded first for GPU software.&lt;br /&gt;
* For most compiled software, one should use the GNU compilers (&amp;lt;tt&amp;gt;gcc&amp;lt;/tt&amp;gt; for C, &amp;lt;tt&amp;gt;g++&amp;lt;/tt&amp;gt; for C++, and &amp;lt;tt&amp;gt;gfortran&amp;lt;/tt&amp;gt; for Fortran). Loading the &amp;lt;tt&amp;gt;gcc&amp;lt;/tt&amp;gt; module makes these available. &lt;br /&gt;
* The IBM XL compiler suite (&amp;lt;tt&amp;gt;xlc_r, xlc++_r, xlf_r&amp;lt;/tt&amp;gt;) is also available, if you load one of the &amp;lt;tt&amp;gt;xl&amp;lt;/tt&amp;gt; modules.&lt;br /&gt;
* To compile MPI code, you must additionally load an &amp;lt;tt&amp;gt;openmpi&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;spectrum-mpi&amp;lt;/tt&amp;gt; module.&lt;br /&gt;
&lt;br /&gt;
=== CUDA ===&lt;br /&gt;
&lt;br /&gt;
The currently installed CUDA Toolkits are '''10.1.243''' and '''10.2.89 (default)''':&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load cuda/&amp;lt;version&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*A compiler (GCC, XL or PGI) module must be loaded in order to use CUDA to build any code.&lt;br /&gt;
The current NVIDIA driver version is 440.33.01.&lt;br /&gt;
&lt;br /&gt;
===GNU Compilers ===&lt;br /&gt;
&lt;br /&gt;
Available GCC modules are:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
gcc/7.5.0&lt;br /&gt;
gcc/8.4.0&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== IBM XL Compilers ===&lt;br /&gt;
&lt;br /&gt;
To load the native IBM xlc/xlc++ and xlf (Fortran) compilers, run&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load xl/16.1.1.3&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
IBM XL Compilers are enabled for use with NVIDIA GPUs, including support for OpenMP GPU offloading and integration with NVIDIA's nvcc command to compile host-side code for the POWER9 CPU. Information about the IBM XL Compilers can be found at the following links:[https://www.ibm.com/support/knowledgecenter/SSXVZZ_16.1.1/com.ibm.compilers.linux.doc/welcome.html IBM XL C/C++], &lt;br /&gt;
[https://www.ibm.com/support/knowledgecenter/SSAT4T_16.1.1/com.ibm.compilers.linux.doc/welcome.html IBM XL Fortran]&lt;br /&gt;
&lt;br /&gt;
=== OpenMPI ===&lt;br /&gt;
The &amp;lt;tt&amp;gt;openmpi/&amp;lt;version&amp;gt;&amp;lt;/tt&amp;gt; module is available with different compilers including GCC and XL. The &amp;lt;tt&amp;gt;spectrum-mpi/&amp;lt;version&amp;gt;&amp;lt;/tt&amp;gt; module provides IBM Spectrum MPI.&lt;br /&gt;
&lt;br /&gt;
=== PGI ===&lt;br /&gt;
To load the PGI compiler and its own OpenMPI environment, run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load pgi/19.10&lt;br /&gt;
module load pgi-openmpi/3.1.3&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Softwares =&lt;br /&gt;
== Amber20 ==&lt;br /&gt;
&lt;br /&gt;
Users who hold an Amber20 license can build Amber20 from its source code and run it on Mist. '''SOSCIP/SciNet does not provide an Amber license or source code.'''&lt;br /&gt;
&lt;br /&gt;
=== Building Amber20 ===&lt;br /&gt;
Modules that are needed for building Amber20:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
1) MistEnv/2020a (S)   2) cuda/10.2.89   3) gcc/8.4.0   4) cmake/3.16.3   5) openmpi/4.0.3   6) anaconda3/2019.10   7) nccl/2.5.6&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
CMake configuration:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
cmake .. -DCMAKE_INSTALL_PREFIX=$HOME/where-amber-install -DCOMPILER=GNU -DMPI=TRUE -DCUDA=TRUE -DINSTALL_TESTS=TRUE -DDOWNLOAD_MINICONDA=FALSE -DOPENMP=TRUE -DNCCL=TRUE -DAPPLY_UPDATES=TRUE&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Running Amber20 ===&lt;br /&gt;
'''On NVIDIA Pascal and later GPUs, Amber20 does not scale beyond a single GPU'''. It is highly recommended to run Amber20 as a single-GPU job.&lt;br /&gt;
A job example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
#SBATCH --time=1:00:0&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP-project-ID&amp;gt;&lt;br /&gt;
&lt;br /&gt;
module load cuda/10.2.89 gcc/8.4.0 openmpi/4.0.3 nccl/2.5.6&lt;br /&gt;
export PATH=$HOME/where-amber-install/bin:$PATH&lt;br /&gt;
export LD_LIBRARY_PATH=$HOME/where-amber-install/lib:$LD_LIBRARY_PATH&lt;br /&gt;
pmemd.cuda .... &amp;lt;parameters&amp;gt; ...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Anaconda (Python) ==&lt;br /&gt;
Anaconda is a popular distribution of the Python programming language. It contains several common Python libraries such as SciPy and NumPy as pre-built packages, which eases installation. Anaconda is provided as a module: '''anaconda3'''&lt;br /&gt;
&lt;br /&gt;
To set up your own Python environment with Anaconda, load the module and create a conda environment:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n myPythonEnv python=3.7&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*Note: By default, conda environments are located in '''$HOME/.conda/envs'''. The cache (downloaded tarballs and packages) is under '''$HOME/.conda/pkgs'''. You may run into disk quota problems if too many environments are created. To clean the conda cache, '''please run &amp;quot;conda clean -y --all&amp;quot; and &amp;quot;rm -rf $HOME/.conda/pkgs/*&amp;quot; after installing packages'''.&lt;br /&gt;
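&lt;br /&gt;
To see how much of your quota the conda directories actually occupy before cleaning, a small script along these lines may help. It is a hedged sketch only: the helper name &amp;lt;code&amp;gt;directory_size_bytes&amp;lt;/code&amp;gt; is hypothetical, and the paths are the default locations mentioned in the note above.&lt;br /&gt;
&lt;br /&gt;
```python
import os


def directory_size_bytes(root):
    """Sum the sizes of all regular files below root, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


if __name__ == "__main__":
    # Default conda locations from the note above; a missing
    # directory simply reports 0.0 MiB.
    conda_dir = os.path.join(os.path.expanduser("~"), ".conda")
    for sub in ("envs", "pkgs"):
        path = os.path.join(conda_dir, sub)
        size_mib = directory_size_bytes(path) / 2**20
        print("{}: {:.1f} MiB".format(path, size_mib))
```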
&lt;br /&gt;
To activate the conda environment (this must be done before running Python):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
source activate myPythonEnv&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Note that you SHOULD NOT use '''conda activate myPythonEnv''' to activate the environment.  This leads to all sorts of problems.  Once the environment is activated, you can update or install packages via '''conda''' or '''pip''':&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda install  &amp;lt;package_name&amp;gt; (preferred way to install packages)&lt;br /&gt;
pip install &amp;lt;package_name&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To deactivate:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
source deactivate&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To remove a conda environment:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda remove --name myPythonEnv --all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To verify that the environment was removed, run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda info --envs&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Submitting Python Job ===&lt;br /&gt;
A single-gpu job example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
#SBATCH --time=1:00:0&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt; #For SOSCIP projects only&lt;br /&gt;
&lt;br /&gt;
module load anaconda3&lt;br /&gt;
source activate myPythonEnv&lt;br /&gt;
python code.py ...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== CuPy ==&lt;br /&gt;
[https://cupy.chainer.org CuPy] is an open-source matrix library accelerated with NVIDIA CUDA. It also uses CUDA-related libraries including cuBLAS, cuDNN, cuRand, cuSolver, cuSPARSE, cuFFT and NCCL to make full use of the GPU architecture. CuPy is an implementation of NumPy-compatible multi-dimensional array on CUDA. CuPy consists of the core multi-dimensional array class, cupy.ndarray, and many functions on it. It supports a subset of numpy.ndarray interface.&lt;br /&gt;
&lt;br /&gt;
CuPy can be installed into any conda environment. The Python packages numpy, six and fastrlock are required; cuDNN and NCCL are optional.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3/2019.10 cuda/10.2.89 gcc/7.5.0 cudnn/7.6.5.32  nccl/2.5.6 &lt;br /&gt;
conda create -n cupy-env python=3.7 numpy six fastrlock&lt;br /&gt;
source activate cupy-env&lt;br /&gt;
CFLAGS=&amp;quot;-I$SCINET_CUDNN_ROOT/include -I$SCINET_NCCL_ROOT/include -I$SCINET_CUDA_ROOT/include&amp;quot; LDFLAGS=&amp;quot;-L$SCINET_CUDNN_ROOT/lib64 -L$SCINET_NCCL_ROOT/lib&amp;quot; CUDA_PATH=$SCINET_CUDA_ROOT pip install cupy&lt;br /&gt;
#building/installing CuPy will take a few minutes&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Gromacs ==&lt;br /&gt;
[http://www.gromacs.org/ GROMACS] is a versatile package to perform molecular dynamics, i.e. simulate the Newtonian equations of motion for systems with hundreds to millions of particles. It is primarily designed for biochemical molecules like proteins, lipids and nucleic acids that have a lot of complicated bonded interactions, but since GROMACS is extremely fast at calculating the nonbonded interactions (that usually dominate simulations) many groups are also using it for research on non-biological systems, e.g. polymers.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load cuda/10.2.89  gcc/8.3.0  openmpi/3.1.5 gromacs/2019.5&lt;br /&gt;
module load cuda/10.2.89  gcc/8.3.0  openmpi/3.1.5 gromacs/2019.6&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*The '''GROMACS 2020''' Thread-MPI version supports full GPU enablement of all key computational sections. The GPU is used throughout the timestep and repeated CPU-GPU transfers are eliminated. '''Currently only single-GPU runs are supported on Mist'''. Users are advised to carefully verify the results.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load cuda/10.2.89 gcc/8.4.0 openmpi/4.0.3 gromacs/2020.4&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Small/Medium Simulation ===&lt;br /&gt;
Because PME domain decomposition is not supported on GPUs, Gromacs calculates PME on the CPU when using multiple GPUs. '''It is always recommended to use a single GPU for small and medium-sized simulations with Gromacs.''' With only 1 MPI rank (with OpenMP threads) on a single GPU, both non-bonded PP and PME are automatically offloaded to the GPU when possible.&lt;br /&gt;
* A single-GPU Gromacs job must request '''--ntasks=32''' even though only 1 MPI rank will be launched by the mpirun command. '''OMP_PLACES''' must be set to '''cores''' to place OpenMP threads on physical CPU cores. '''-bind-to none''' and '''-pin off''' must be set to avoid CPU affinity conflicts among OpenMP, MPI and Gromacs. '''OMP_NUM_THREADS''' must be set to 8 to get optimal performance.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
#SBATCH --ntasks=32&lt;br /&gt;
&lt;br /&gt;
module load cuda/10.2.89  gcc/8.3.0  openmpi/3.1.5 gromacs/2019.6&lt;br /&gt;
export OMP_NUM_THREADS=8&lt;br /&gt;
export OMP_PLACES=cores&lt;br /&gt;
mpirun -np 1 -bind-to none gmx_mpi mdrun -pin off -ntomp 8 ... &amp;lt;other parameters&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
* Gromacs 2020 example (the OpenMPI module should be loaded, but mpirun should NOT be used):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
&lt;br /&gt;
module load cuda/10.2.89 gcc/8.4.0 openmpi/4.0.3 gromacs/2020.4&lt;br /&gt;
export OMP_NUM_THREADS=8&lt;br /&gt;
export OMP_PLACES=cores&lt;br /&gt;
gmx mdrun -pin off -ntmpi 1 -ntomp 8 -update gpu ... &amp;lt;other parameters&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Large Simulation ===&lt;br /&gt;
If the memory available to a single-GPU job (~58GB) is not sufficient for the simulation, multiple GPUs can be used. It is suggested to start testing with one full node with 4 GPUs and to force PME onto the GPU. Multiple PME ranks are not supported with PME on GPU, so if a GPU is used for the PME calculation, -npme (the number of PME ranks) must be set to 1. If PME has less work than PP, it is suggested to run multiple ranks per GPU, so that the GPU assigned to the PME rank can also do some work for PP rank(s). When running multiple MPI ranks on the same GPU, NVIDIA Multi-Process Service (MPS) must be enabled.&lt;br /&gt;
*An example using 4 GPUs, 7 PP ranks + 1 PME rank: ('''-pin on -pme gpu -npme 1''' must be added to mdrun command in order to force GPU to do PME)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=8&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
&lt;br /&gt;
module load cuda/10.2.89  gcc/8.3.0  openmpi/3.1.5 gromacs/2019.6&lt;br /&gt;
&lt;br /&gt;
mkdir -p /dev/shm/nvidia-mps&lt;br /&gt;
export CUDA_MPS_PIPE_DIRECTORY=/dev/shm/nvidia-mps&lt;br /&gt;
mkdir -p /dev/shm/nvidia-log&lt;br /&gt;
export CUDA_MPS_LOG_DIRECTORY=/dev/shm/nvidia-log&lt;br /&gt;
nvidia-cuda-mps-control -d&lt;br /&gt;
&lt;br /&gt;
export OMP_NUM_THREADS=4&lt;br /&gt;
mpirun  -bind-to none gmx_mpi mdrun -pin on -pme gpu -npme 1 ... &amp;lt;add your parameters&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*It is suggested to also test using '''--ntasks=4''' and '''OMP_NUM_THREADS=8''' if you receive a NOTE in Gromacs output saying &amp;quot;% performance was lost because the PME ranks had more work to do than the PP ranks&amp;quot;. In this case, NVIDIA MPS is not needed since there is only one MPI rank per GPU.&lt;br /&gt;
*'''Please note that the solving of PME on GPU is still only the initial version supporting this behaviour, and comes with a set of limitations outlined further below.'''&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
* Only a PME order of 4 is supported on GPUs.&lt;br /&gt;
* PME will run on a GPU only when exactly one rank has a PME task, ie. decompositions with multiple ranks doing PME are not supported.&lt;br /&gt;
* Only single precision is supported.&lt;br /&gt;
* Free energy calculations where charges are perturbed are not supported, because only single PME grids can be calculated.&lt;br /&gt;
* Only dynamical integrators are supported (ie. leap-frog, Velocity Verlet, stochastic dynamics)&lt;br /&gt;
* LJ PME is not supported on GPUs.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*An example using 4 GPUs, '''PME on CPU''': ('''-pin on''' must be added to mdrun command for proper CPU thread bindings)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=8&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
&lt;br /&gt;
module load cuda/10.2.89  gcc/8.3.0  openmpi/3.1.5 gromacs/2019.6&lt;br /&gt;
&lt;br /&gt;
mkdir -p /dev/shm/nvidia-mps&lt;br /&gt;
export CUDA_MPS_PIPE_DIRECTORY=/dev/shm/nvidia-mps&lt;br /&gt;
mkdir -p /dev/shm/nvidia-log&lt;br /&gt;
export CUDA_MPS_LOG_DIRECTORY=/dev/shm/nvidia-log&lt;br /&gt;
nvidia-cuda-mps-control -d&lt;br /&gt;
&lt;br /&gt;
export OMP_NUM_THREADS=4&lt;br /&gt;
mpirun -bind-to none gmx_mpi mdrun -pin on  ... &amp;lt;add your parameters&amp;gt;&lt;br /&gt;
&lt;br /&gt;
# &amp;quot;--ntasks=16, OMP_NUM_THREADS=2&amp;quot; and &amp;quot;--ntasks=4, OMP_NUM_THREADS=8&amp;quot; should also be tested.  &lt;br /&gt;
# num_Tasks(MPI_ranks) * num_OpenMP_threads = 32&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*'''NOTE: The above examples will NOT work with multiple nodes. If the simulation is too large for a single GPU node, please contact SciNet/SOSCIP support.'''&lt;br /&gt;
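&lt;br /&gt;
The comment at the end of the last job script states that the number of MPI ranks multiplied by the number of OpenMP threads should equal 32. As a rough aid for choosing combinations to benchmark, the Python sketch below lists every split that satisfies this rule; the function name &amp;lt;code&amp;gt;rank_thread_splits&amp;lt;/code&amp;gt; is hypothetical, and which split performs best still has to be measured for your simulation.&lt;br /&gt;
&lt;br /&gt;
```python
def rank_thread_splits(cores=32):
    """Return every (mpi_ranks, omp_threads) pair whose product equals cores."""
    return [(ranks, cores // ranks)
            for ranks in range(1, cores + 1)
            if cores % ranks == 0]


if __name__ == "__main__":
    # Each line corresponds to one --ntasks / OMP_NUM_THREADS setting.
    for ranks, threads in rank_thread_splits():
        print("--ntasks={:2d}  OMP_NUM_THREADS={}".format(ranks, threads))
```
&lt;br /&gt;
For 32 cores this includes the three combinations mentioned in the job script comments (4, 8 and 16 tasks).&lt;br /&gt;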
&lt;br /&gt;
== IBM Watson Machine Learning Community Edition (PowerAI) ==&lt;br /&gt;
[https://developer.ibm.com/linuxonpower/deep-learning-powerai/releases/ IBM Watson Machine Learning Community Edition (PowerAI)] contains many popular ML packages including TensorFlow, PyTorch, XGBoost and RAPIDS. It is distributed through the IBM Conda channel. To install packages from PowerAI, you need to specify the IBM Conda channel when using Anaconda.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
&lt;br /&gt;
conda create --name wmlce_env -c https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda &amp;lt;package_name&amp;gt; (e.g. powerai, tensorflow-gpu, keras, pytorch, powerai-rapids, py-xgboost-gpu,  etc)&lt;br /&gt;
&lt;br /&gt;
source activate wmlce_env &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*The WML CE Early Access Conda channel (https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda-early-access/) makes new versions of frameworks available in advance of formal WML CE releases. Easy upgrade between packages in the main and Early Access channels is not guaranteed. Using a separate conda environment for Early Access packages is recommended.&lt;br /&gt;
&lt;br /&gt;
== NAMD ==&lt;br /&gt;
[http://www.ks.uiuc.edu/Research/namd/ NAMD] is a parallel, object-oriented molecular dynamics code designed for high-performance simulation of large biomolecular systems.&lt;br /&gt;
=== v2.13 ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load cuda/10.2.89 gcc/7.5.0 fftw/3.3.8 spectrum-mpi/10.3.1  namd/2.13&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
==== Running with one process per node====&lt;br /&gt;
An example of the job script (using 1 node, '''one process per node''',  32 CPU threads per process + 4 GPUs per process):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=1&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
&lt;br /&gt;
module load cuda/10.2.89 gcc/7.5.0 fftw/3.3.8 spectrum-mpi/10.3.1  namd/2.13&lt;br /&gt;
scontrol show hostnames &amp;gt; nodelist-$SLURM_JOB_ID&lt;br /&gt;
&lt;br /&gt;
`which charmrun` -npernode 1 -hostfile nodelist-$SLURM_JOB_ID `which namd2` +setcpuaffinity +pemap 0-127:4 +idlepoll +ppn 32 +p $((32*SLURM_NTASKS)) stmv.namd&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
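The &amp;lt;code&amp;gt;+pemap 0-127:4&amp;lt;/code&amp;gt; argument in the script above selects every fourth hardware thread. The Python sketch below expands such a map, assuming the Charm++ &amp;lt;code&amp;gt;first-last:stride&amp;lt;/code&amp;gt; range syntax and assuming (this page does not state it) that each physical core owns four consecutive thread IDs. The helper name &amp;lt;code&amp;gt;expand_pemap&amp;lt;/code&amp;gt; is hypothetical and handles only this simple form.&lt;br /&gt;

```python
def expand_pemap(spec):
    # Expand a simple first-last or first-last:stride map into CPU IDs.
    cpu_range, _, stride = spec.partition(':')
    first, last = (int(x) for x in cpu_range.split('-'))
    step = int(stride) if stride else 1
    return list(range(first, last + 1, step))


if __name__ == '__main__':
    cpus = expand_pemap('0-127:4')
    # Show how many IDs are selected, the first few, and the last one.
    print(len(cpus), cpus[:4], cpus[-1])
```

The map expands to 32 IDs, which matches the &amp;lt;code&amp;gt;+ppn 32&amp;lt;/code&amp;gt; worker threads requested in the script.&lt;br /&gt;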
==== Running with one process per GPU ====&lt;br /&gt;
NAMD may scale better if using '''one process per GPU'''. Please do your own benchmark.&lt;br /&gt;
An example of the job script (using 1 node, '''one process per GPU''',  8 CPU threads per process):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=4&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
&lt;br /&gt;
module load cuda/10.2.89 gcc/7.5.0 fftw/3.3.8 spectrum-mpi/10.3.1  namd/2.13&lt;br /&gt;
scontrol show hostnames &amp;gt; nodelist-$SLURM_JOB_ID&lt;br /&gt;
&lt;br /&gt;
`which charmrun` -npernode 4 -hostfile nodelist-$SLURM_JOB_ID `which namd2` +setcpuaffinity +pemap 0-127:4 +idlepoll +ppn 8 +p $((8*SLURM_NTASKS)) stmv.namd&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== PyTorch ==&lt;br /&gt;
=== Installing from IBM Conda Channel ===&lt;br /&gt;
The easiest way to install PyTorch on Mist is to use IBM's Conda channel. Prepare a conda environment with Python 3.6 or 3.7, then install PyTorch from IBM's Conda channel.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n pytorch_env python=3.7&lt;br /&gt;
source activate pytorch_env&lt;br /&gt;
conda install -c https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda/ pytorch=1.3.1 &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
'''FOR NEWER VERSIONS:'''&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda install -c https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda-early-access/ pytorch=1.5.0&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Once the installation finishes, please clean the cache:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf $HOME/.conda/pkgs/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== RAPIDS ==&lt;br /&gt;
[https://rapids.ai RAPIDS] is a suite of open source software libraries for executing end-to-end data science and analytics pipelines entirely on GPUs. The RAPIDS data science framework includes a collection of libraries: '''cuDF (GPU DataFrames)''', '''cuML (GPU Machine Learning Algorithms)''', '''cuStrings (GPU String Manipulation)''', etc.&lt;br /&gt;
&lt;br /&gt;
=== Installing from IBM Conda Channel ===&lt;br /&gt;
The easiest way to install RAPIDS on Mist is to use IBM's Conda channel. Prepare a conda environment with Python 3.6 or 3.7, then install powerai-rapids from IBM's Conda channel.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n rapids_env python=3.7&lt;br /&gt;
source activate rapids_env&lt;br /&gt;
conda install -c https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda/ powerai-rapids&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Once the installation finishes, please clean the cache:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf $HOME/.conda/pkgs/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== TensorFlow and Keras ==&lt;br /&gt;
=== Installing from IBM Conda Channel ===&lt;br /&gt;
The easiest way to install TensorFlow and Keras on Mist is to use IBM's Conda channel. Prepare a conda environment with Python 3.6 or 3.7, then install tensorflow-gpu from IBM's Conda channel.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n tf_env python=3.7&lt;br /&gt;
source activate tf_env&lt;br /&gt;
conda install -c https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda/ tensorflow-gpu==2.1.2&lt;br /&gt;
# If you need a TF 1.x version instead:&lt;br /&gt;
conda install -c https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda/ tensorflow-gpu==1.15.4&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
'''FOR NEWER VERSIONS:'''&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda install -c https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda-early-access/  tensorflow-gpu==2.2.0&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Once the installation finishes, please clean the cache:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf $HOME/.conda/pkgs/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Testing and debugging =&lt;br /&gt;
Test your code before you submit it to the cluster, both to check that it is correct and to find out what resources it needs.&lt;br /&gt;
* Small test jobs can be run on the login node.  Rule of thumb: tests should run no more than a couple of minutes, take at most about 1-2GB of memory, and use no more than one GPU and a few cores.&lt;br /&gt;
&amp;lt;!-- * You can run the [[Parallel Debugging with DDT|DDT]] debugger on the login nodes after &amp;lt;code&amp;gt;module load ddt&amp;lt;/code&amp;gt;. --&amp;gt;&lt;br /&gt;
* For short tests that do not fit on a login node, or for which you need a dedicated node, request an interactive debug job with the debugjob command:&lt;br /&gt;
 mist-login01:~$ debugjob --clean -g G&lt;br /&gt;
where G is the number of GPUs. G=1 gives an interactive session for 2 hours, G=4 gives a single node with 4 GPUs for 30 minutes, and G=8 (the maximum) gives 2 nodes, each with 4 GPUs, for 30 minutes.  The &amp;lt;tt&amp;gt;--clean&amp;lt;/tt&amp;gt; argument is optional but recommended, as it starts the session without any modules loaded, mimicking more closely what happens when you submit a job script.&lt;br /&gt;
&lt;br /&gt;
= Submitting jobs =&lt;br /&gt;
Once you have compiled and tested your code or workflow on the Mist login nodes, and confirmed that it behaves correctly, you are ready to submit jobs to the cluster.  Your jobs will run on some of Mist's 53 compute nodes.  When and where your job runs is determined by the scheduler.&lt;br /&gt;
&lt;br /&gt;
Mist uses SLURM as its job scheduler. It is configured to allow only '''Single-GPU jobs''' and '''Full-node jobs (4 GPUs per node)'''.&lt;br /&gt;
&lt;br /&gt;
You submit jobs from a login node by passing a script to the sbatch command:&lt;br /&gt;
&lt;br /&gt;
mist-login01:scratch$ sbatch jobscript.sh&lt;br /&gt;
&lt;br /&gt;
This puts the job in the queue. It will run on the compute nodes in due course. In most cases, you should not submit from your $HOME directory, but rather, from your $SCRATCH directory, so that the output of your compute job can be written out (as mentioned above, $HOME is read-only on the compute nodes).&lt;br /&gt;
&lt;br /&gt;
Example job scripts can be found below.&lt;br /&gt;
Keep in mind:&lt;br /&gt;
* Scheduling is by single GPU or by full node, so request either 1 GPU or 4 GPUs per node.&lt;br /&gt;
* Your job's maximum walltime is 24 hours. &lt;br /&gt;
* Jobs must write their output to your scratch or project directory (home is read-only on compute nodes).&lt;br /&gt;
* Compute nodes have no internet access.&lt;br /&gt;
* Your job script will not remember the modules you have loaded, so it needs to contain &amp;quot;module load&amp;quot; commands of all the required modules (see examples below). &lt;br /&gt;
== SOSCIP Users ==&lt;br /&gt;
*[https://www.soscip.org SOSCIP] is a consortium to bring together industrial partners and academic researchers and provide them with sophisticated advanced computing technologies and expertise to solve social, technical and business challenges across sectors and drive economic growth.&lt;br /&gt;
&lt;br /&gt;
If you are working on a SOSCIP project, please contact [mailto:soscip-support@scinet.utoronto.ca soscip-support@scinet.utoronto.ca] to have your user account added to SOSCIP project accounts. SOSCIP users need to submit jobs with an additional SLURM flag to get higher priority:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#SBATCH -A soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt;    #e.g. soscip-3-001&lt;br /&gt;
# or, equivalently:&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Single-GPU job script ==&lt;br /&gt;
Each single-GPU job gets a quarter of a node: 1 GPU, 8 CPU cores (32 hardware threads) and ~58GB of CPU memory. '''Users should never request CPUs or memory explicitly.''' When running an MPI program, set --ntasks to the number of MPI ranks. '''Do NOT set --ntasks for non-MPI programs.''' &lt;br /&gt;
*Using the NVIDIA Multi-Process Service (MPS) is suggested when running multiple MPI ranks on one GPU.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
#SBATCH --time=1:00:0&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt; #For SOSCIP projects only&lt;br /&gt;
&lt;br /&gt;
module load anaconda3&lt;br /&gt;
source activate conda_env&lt;br /&gt;
python code.py ...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Full-node job script ==&lt;br /&gt;
'''If you are not sure the program can be executed on multiple GPUs, please follow the single-gpu job instruction above or contact SciNet/SOSCIP support.'''&lt;br /&gt;
&lt;br /&gt;
A multi-GPU job should request at least one full node (4 GPUs). Users need to specify the &amp;quot;compute_full_node&amp;quot; partition in order to get all resources on a node. &lt;br /&gt;
*An example for a 2-node, 8-rank OpenMPI job: (Each rank binds to 1 GPU and 8 physical CPU cores in this case)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=2&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=8&lt;br /&gt;
#SBATCH --time=1:00:00&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt; #For SOSCIP projects only&lt;br /&gt;
&lt;br /&gt;
module load cuda/10.2.89 gcc/8.3.0 openmpi/3.1.5&lt;br /&gt;
&lt;br /&gt;
mpirun -bind-to core -map-by slot:PE=8 -report-bindings ./program&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Jupyter Notebooks =&lt;br /&gt;
SciNet’s [[Jupyter Hub]] is a Niagara-type node; it has a different CPU architecture and no GPUs. Conda environments prepared on Mist will not work there properly. Users who need to use Jupyter Notebook to develop and test some aspects of their workflow can create their own server on the Mist login node and use an SSH tunnel to connect to it from outside. Users who choose to do so have to keep in mind that the login node is a shared resource, and heavy calculations should be done only on compute nodes. Processes (including IPython kernels used by the notebooks) are limited to one hour of total CPU time: idle time will not be counted toward this one hour, and use of multiple cores will count proportionally to the number of cores (i.e. a kernel using all 128 virtual cores on the node will be killed after about 28 seconds). Idle notebooks can still burden the node by hogging system and GPU memory, so please be mindful of other users and terminate notebooks when work is done.&lt;br /&gt;
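The CPU-time limit described above can be turned into a quick estimate. This Python sketch assumes the limit is exactly 3600 CPU-seconds and that the process keeps all of its cores fully busy the whole time; the constant and function names are hypothetical.&lt;br /&gt;

```python
CPU_TIME_LIMIT_SECONDS = 3600  # one hour of total CPU time, as stated above


def seconds_until_killed(busy_cores):
    # Wall-clock seconds before the limit is reached when busy_cores
    # cores are fully loaded; CPU time accrues once per busy core.
    return CPU_TIME_LIMIT_SECONDS / busy_cores


if __name__ == '__main__':
    for cores in (1, 4, 128):
        print(cores, 'cores:', round(seconds_until_killed(cores), 1), 'seconds')
```

With 128 busy cores the estimate is 28.125 seconds, consistent with the figure quoted in the paragraph above.&lt;br /&gt;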
&lt;br /&gt;
As an example, let us create a new Conda environment and activate it:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n jupyter_env python=3.7&lt;br /&gt;
source activate jupyter_env&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Install the Jupyter Notebook server:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda install notebook&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Running the notebook server ==&lt;br /&gt;
When the Conda environment is active, enter:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
jupyter-notebook&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
By default, the Jupyter Notebook server uses port 8888 (can be overridden with the &amp;lt;code&amp;gt;--port&amp;lt;/code&amp;gt; option). If another user has already started their own server, the default port may be busy, in which case the server will be listening on a different port. Once launched, the server will output some information to the terminal that will include the actual port number used and a 48-character token. For example:&lt;br /&gt;
&amp;lt;pre&amp;gt;http://localhost:8890/?token=54c4090d……&amp;lt;/pre&amp;gt;&lt;br /&gt;
In this example, the server is listening on port 8890.&lt;br /&gt;
&lt;br /&gt;
== Creating a tunnel ==&lt;br /&gt;
In order to access this port remotely (i.e. from your office or home), an [https://en.wikipedia.org/wiki/Tunneling_protocol#Secure_Shell_tunneling SSH tunnel] has to be established. Please refer to your SSH client’s documentation for instructions on how to do that. For the OpenSSH client (standard in most Linux distributions and macOS), a tunnel can be opened in a separate terminal session to the one where the Jupyter Notebook server is running. In the new terminal, issue this command:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -L8888:localhost:8890 &amp;lt;username&amp;gt;@mist.scinet.utoronto.ca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
(Replace &amp;lt;code&amp;gt;&amp;lt;username&amp;gt;&amp;lt;/code&amp;gt; with your actual username.) The tunnel is open as long as this SSH connection is alive. In this example, we tunnel the Mist login node’s port 8890 (where our server is assumed to be running) to our home computer’s port 8888 (any other free port is fine). The notebook can be accessed in the browser at the &amp;lt;code&amp;gt;&amp;lt;nowiki&amp;gt;http://localhost:8888&amp;lt;/nowiki&amp;gt;&amp;lt;/code&amp;gt; address (followed by &amp;lt;code&amp;gt;/?token=54c4090d……&amp;lt;/code&amp;gt;, or the token can be input on the webpage).&lt;br /&gt;
&lt;br /&gt;
== Using Jupyter on compute nodes ==&lt;br /&gt;
&lt;br /&gt;
You can use the instructions here to set up a Jupyter Notebook server on a compute node (including a [[#Testing_and_debugging|debugjob]]). '''We strongly discourage''' you from running an interactive notebook on a compute node (other than for a debugjob): scheduled jobs start at unpredictable times and are not meant to be interactive. Jupyter notebooks can be run non-interactively or converted to Python scripts.&lt;br /&gt;
&lt;br /&gt;
To launch the Jupyter Notebook server, load the &amp;lt;code&amp;gt;anaconda3&amp;lt;/code&amp;gt; module and activate your environment as before (by adding the appropriate lines to the submission script, if you are not using the compute node with an interactive shell). Launching the server has to be done like so:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
HOME=/dev/shm/$USER jupyter-notebook&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
That is because Jupyter will fail unless it can write to the home folder, which is read-only from compute nodes. This modification of the &amp;lt;code&amp;gt;$HOME&amp;lt;/code&amp;gt; environment variable will carry over into the notebooks, which is usually not a problem, but in case the notebook relies on this environment variable (e.g. to read certain files), it can be reset manually in the notebook (&amp;lt;code&amp;gt;import os; os.environ['HOME']=……&amp;lt;/code&amp;gt;).&lt;br /&gt;
&lt;br /&gt;
Because compute nodes are not accessible from the Internet, tunneling has to be done twice, once from the remote location (office or home) to the Mist login node, and then from the login node to the compute node. Assuming the server is running on port 8890 of the mist006 node, open the first tunnel in a new terminal session in the remote computer:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -L8888:localhost:9999 &amp;lt;username&amp;gt;@mist.scinet.utoronto.ca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
where 9999 is any available port on the Mist login node (to test port availability enter &amp;lt;code&amp;gt;ss -Hln src :9999&amp;lt;/code&amp;gt; in the terminal when connected to the Mist login node; an empty output indicates that the port is free). In the same session in the login node that was created with the above command, open the second tunnel to the compute node:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -L9999:localhost:8890 mist006&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Be aware that the second tunnel will automatically disconnect once the job on the compute node times out or is relinquished. The Jupyter Notebook server running on the compute node can now be accessed from the browser as in the previous subsection.&lt;br /&gt;
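As a minimal sketch of how the two tunnels chain together, the Python snippet below assembles both commands from the port numbers used in this example. The function name &amp;lt;code&amp;gt;tunnel_commands&amp;lt;/code&amp;gt; and the USERNAME placeholder are hypothetical; the ports, host name and node name are the ones quoted above.&lt;br /&gt;

```python
def tunnel_commands(user, node, local_port=8888, login_port=9999, server_port=8890):
    # First hop: remote computer to the Mist login node.
    first = f'ssh -L{local_port}:localhost:{login_port} {user}@mist.scinet.utoronto.ca'
    # Second hop: login node to the compute node running the server.
    second = f'ssh -L{login_port}:localhost:{server_port} {node}'
    return first, second


if __name__ == '__main__':
    # USERNAME is a placeholder; mist006 is the example node used above.
    for command in tunnel_commands('USERNAME', 'mist006'):
        print(command)
```

The login-node port appears in both commands, which is what links the first tunnel to the second.&lt;br /&gt;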
&lt;br /&gt;
&lt;br /&gt;
= Support =&lt;br /&gt;
&lt;br /&gt;
SciNet inquiries:&lt;br /&gt;
* [mailto:support@scinet.utoronto.ca support@scinet.utoronto.ca]&lt;br /&gt;
&lt;br /&gt;
SOSCIP inquiries:&lt;br /&gt;
*[mailto:soscip-support@scinet.utoronto.ca soscip-support@scinet.utoronto.ca]&lt;/div&gt;</summary>
		<author><name>Ymeiron</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Mist&amp;diff=2919</id>
		<title>Mist</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Mist&amp;diff=2919"/>
		<updated>2021-01-25T22:33:50Z</updated>

		<summary type="html">&lt;p&gt;Ymeiron: Added = after --account&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox Computer&lt;br /&gt;
|image=[[Image:Mist.jpg|center|300px|thumb]]&lt;br /&gt;
|name=Mist&lt;br /&gt;
|installed=Dec 2019&lt;br /&gt;
|operatingsystem= Red Hat Enterprise Linux 7.6 &lt;br /&gt;
|loginnode= mist.scinet.utoronto.ca&lt;br /&gt;
|nnodes=  54 IBM AC922&lt;br /&gt;
|rampernode= 256 GB  &lt;br /&gt;
|gpuspernode=4 V100-SXM2-32GB&lt;br /&gt;
|interconnect=Mellanox EDR&lt;br /&gt;
|vendorcompilers= NVCC, IBM XL&lt;br /&gt;
|queuetype=Slurm&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
=Specifications=&lt;br /&gt;
Mist is a SciNet-[[#SOSCIP Users |SOSCIP]] joint GPU cluster consisting of 54 IBM AC922 servers. Each node of the cluster has 32 IBM Power9 cores, 256GB RAM and 4 NVIDIA V100-SXM2-32GB GPUs connected by NVLink. The cluster has an InfiniBand EDR interconnect providing GPUDirect RDMA capability.&lt;br /&gt;
&lt;br /&gt;
= Getting started on Mist =&lt;br /&gt;
Mist can be accessed directly.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -Y MYCCUSERNAME@mist.scinet.utoronto.ca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The Mist login node '''mist-login01''' can also be accessed via the Niagara cluster.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -Y MYCCUSERNAME@niagara.scinet.utoronto.ca&lt;br /&gt;
ssh -Y mist-login01&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
== Storage ==&lt;br /&gt;
The filesystem for Mist is shared with the Niagara cluster. See [https://docs.scinet.utoronto.ca/index.php/Niagara_Quickstart#Your_various_directories Niagara Storage] for more details.&lt;br /&gt;
&lt;br /&gt;
= Loading software modules =&lt;br /&gt;
&lt;br /&gt;
You have two options for running code on Mist: use existing software, or compile your own.  This section focuses on the former.&lt;br /&gt;
&lt;br /&gt;
Other than essentials, all installed software is made available [[Using_modules | using module commands]]. These modules set environment variables (PATH, etc.), allowing multiple, conflicting versions of a given package to be available.  A detailed explanation of the module system can be [[Using_modules | found on the modules page]].&lt;br /&gt;
&lt;br /&gt;
Common module subcommands are:&lt;br /&gt;
&lt;br /&gt;
* &amp;lt;code&amp;gt;module load &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt;: load the default version of a particular software.&lt;br /&gt;
* &amp;lt;code&amp;gt;module load &amp;lt;module-name&amp;gt;/&amp;lt;module-version&amp;gt;&amp;lt;/code&amp;gt;: load a specific version of a particular software.&lt;br /&gt;
* &amp;lt;code&amp;gt;module purge&amp;lt;/code&amp;gt;: unload all currently loaded modules.&lt;br /&gt;
* &amp;lt;code&amp;gt;module spider&amp;lt;/code&amp;gt; (or &amp;lt;code&amp;gt;module spider &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt;): list available software packages.&lt;br /&gt;
* &amp;lt;code&amp;gt;module avail&amp;lt;/code&amp;gt;: list loadable software packages.&lt;br /&gt;
* &amp;lt;code&amp;gt;module list&amp;lt;/code&amp;gt;: list loaded modules.&lt;br /&gt;
&lt;br /&gt;
Along with modifying common environment variables, such as PATH and LD_LIBRARY_PATH, these modules also create a SCINET_MODULENAME_ROOT environment variable, which can be used to access commonly needed software directories, such as /include and /lib.&lt;br /&gt;
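As an illustration, the Python sketch below looks up such a variable and derives the include and lib paths from it. It assumes the variable name is formed by upper-casing the module name (as in SCINET_CUDA_ROOT); the function name &amp;lt;code&amp;gt;module_dirs&amp;lt;/code&amp;gt; is hypothetical, and some modules may keep their libraries in lib64 rather than lib.&lt;br /&gt;

```python
import os


def module_dirs(module_name, environ=None):
    # Return (include, lib) paths under SCINET_MODULENAME_ROOT,
    # or None when the variable is not set (module not loaded).
    environ = os.environ if environ is None else environ
    root = environ.get(f'SCINET_{module_name.upper()}_ROOT')
    if root is None:
        return None
    return os.path.join(root, 'include'), os.path.join(root, 'lib')


if __name__ == '__main__':
    # Prints None unless the cuda module has been loaded in this shell.
    print(module_dirs('cuda'))
```

The optional second argument lets the lookup be tried against a plain dictionary instead of the real environment.&lt;br /&gt;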
&lt;br /&gt;
There are handy abbreviations for the module commands. &amp;lt;code&amp;gt;ml&amp;lt;/code&amp;gt; is the same as &amp;lt;code&amp;gt;module list&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;ml &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt; is the same as &amp;lt;code&amp;gt;module load &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt;.&lt;br /&gt;
== Tips for loading software ==&lt;br /&gt;
&lt;br /&gt;
* We advise '''''against''''' loading modules in your .bashrc.  This can lead to very confusing behaviour under certain circumstances.  Our guidelines for .bashrc files can be found [[bashrc guidelines|here]].&lt;br /&gt;
* Instead, load modules by hand when needed, or by sourcing a separate script.&lt;br /&gt;
* Load run-specific modules inside your job submission script.&lt;br /&gt;
* Short names give default versions; e.g. &amp;lt;code&amp;gt;cuda&amp;lt;/code&amp;gt; → &amp;lt;code&amp;gt;cuda/10.2.89&amp;lt;/code&amp;gt;. It is usually better to be explicit about the versions, for future reproducibility.&lt;br /&gt;
* Modules often require other modules to be loaded first.  Solve these dependencies by using [[Using_modules#Module_spider | &amp;lt;code&amp;gt;module spider&amp;lt;/code&amp;gt;]].&lt;br /&gt;
&lt;br /&gt;
= Available compilers and interpreters =&lt;br /&gt;
* The &amp;lt;tt&amp;gt;cuda&amp;lt;/tt&amp;gt; module has to be loaded first for GPU software.&lt;br /&gt;
* For most compiled software, one should use the GNU compilers (&amp;lt;tt&amp;gt;gcc&amp;lt;/tt&amp;gt; for C, &amp;lt;tt&amp;gt;g++&amp;lt;/tt&amp;gt; for C++, and &amp;lt;tt&amp;gt;gfortran&amp;lt;/tt&amp;gt; for Fortran). Loading a &amp;lt;tt&amp;gt;gcc&amp;lt;/tt&amp;gt; module makes these available. &lt;br /&gt;
* The IBM XL compiler suite (&amp;lt;tt&amp;gt;xlc_r, xlc++_r, xlf_r&amp;lt;/tt&amp;gt;) is also available, if you load one of the &amp;lt;tt&amp;gt;xl&amp;lt;/tt&amp;gt; modules.&lt;br /&gt;
* To compile MPI code, you must additionally load an &amp;lt;tt&amp;gt;openmpi&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;spectrum-mpi&amp;lt;/tt&amp;gt; module.&lt;br /&gt;
&lt;br /&gt;
=== CUDA ===&lt;br /&gt;
&lt;br /&gt;
The currently installed CUDA Toolkits are '''10.1.243''' and '''10.2.89 (default)'''.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load cuda/&amp;lt;version&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*A compiler (GCC, XL or PGI) module must be loaded in order to use CUDA to build any code.&lt;br /&gt;
The current NVIDIA driver version is 440.33.01.&lt;br /&gt;
&lt;br /&gt;
===GNU Compilers ===&lt;br /&gt;
&lt;br /&gt;
Available GCC modules are:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
gcc/7.5.0&lt;br /&gt;
gcc/8.4.0&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== IBM XL Compilers ===&lt;br /&gt;
&lt;br /&gt;
To load the native IBM xlc/xlc++ and xlf (Fortran) compilers, run&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load xl/16.1.1.3&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
IBM XL Compilers are enabled for use with NVIDIA GPUs, including support for OpenMP GPU offloading and integration with NVIDIA's nvcc command to compile host-side code for the POWER9 CPU. Information about the IBM XL Compilers can be found at the following links:[https://www.ibm.com/support/knowledgecenter/SSXVZZ_16.1.1/com.ibm.compilers.linux.doc/welcome.html IBM XL C/C++], &lt;br /&gt;
[https://www.ibm.com/support/knowledgecenter/SSAT4T_16.1.1/com.ibm.compilers.linux.doc/welcome.html IBM XL Fortran]&lt;br /&gt;
&lt;br /&gt;
=== OpenMPI ===&lt;br /&gt;
The &amp;lt;tt&amp;gt;openmpi/&amp;lt;version&amp;gt;&amp;lt;/tt&amp;gt; module is available with different compilers, including GCC and XL. The &amp;lt;tt&amp;gt;spectrum-mpi/&amp;lt;version&amp;gt;&amp;lt;/tt&amp;gt; module provides IBM Spectrum MPI.&lt;br /&gt;
&lt;br /&gt;
=== PGI ===&lt;br /&gt;
To load the PGI compiler and its own OpenMPI environment, run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load pgi/19.10&lt;br /&gt;
module load pgi-openmpi/3.1.3&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Softwares =&lt;br /&gt;
== Amber20 ==&lt;br /&gt;
&lt;br /&gt;
Users who hold an Amber20 license can build Amber20 from its source code and run it on Mist. '''SOSCIP/SciNet doesn't provide an Amber license or source code.'''&lt;br /&gt;
&lt;br /&gt;
=== Building Amber20 ===&lt;br /&gt;
Modules that are needed for building Amber20:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
1) MistEnv/2020a (S)   2) cuda/10.2.89   3) gcc/8.4.0   4) cmake/3.16.3   5) openmpi/4.0.3   6) anaconda3/2019.10   7) nccl/2.5.6&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Cmake configuration:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
cmake .. -DCMAKE_INSTALL_PREFIX=$HOME/where-amber-install -DCOMPILER=GNU -DMPI=TRUE -DCUDA=TRUE -DINSTALL_TESTS=TRUE -DDOWNLOAD_MINICONDA=FALSE -DOPENMP=TRUE -DNCCL=TRUE -DAPPLY_UPDATES=TRUE&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Running Amber20 ===&lt;br /&gt;
'''On NVIDIA Pascal and later GPUs, Amber20 does not scale beyond a single GPU'''. It is strongly recommended to run Amber20 as a single-GPU job.&lt;br /&gt;
A job example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
#SBATCH --time=1:00:0&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP-project-ID&amp;gt;&lt;br /&gt;
&lt;br /&gt;
module load cuda/10.2.89 gcc/8.4.0 openmpi/4.0.3 nccl/2.5.6&lt;br /&gt;
export PATH=$HOME/where-amber-install/bin:$PATH&lt;br /&gt;
export LD_LIBRARY_PATH=$HOME/where-amber-install/lib:$LD_LIBRARY_PATH&lt;br /&gt;
pmemd.cuda .... &amp;lt;parameters&amp;gt; ...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Anaconda (Python) ==&lt;br /&gt;
Anaconda is a popular distribution of the Python programming language. It contains several common Python libraries such as SciPy and NumPy as pre-built packages, which eases installation. Anaconda is provided as a module: '''anaconda3'''&lt;br /&gt;
&lt;br /&gt;
To set up your own Python environment, load the module and create a conda environment:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n myPythonEnv python=3.7&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*Note: By default, conda environments are located in '''$HOME/.conda/envs'''. The cache (downloaded tarballs and packages) is under '''$HOME/.conda/pkgs'''. You may run into disk quota problems if too many environments are created. To clean the conda cache, '''please run: &amp;quot;conda clean -y --all&amp;quot; and &amp;quot;rm -rf $HOME/.conda/pkgs/*&amp;quot; after installing packages'''.&lt;br /&gt;
&lt;br /&gt;
To activate the conda environment: (should be activated before running python)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
source activate myPythonEnv&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Note that you SHOULD NOT use '''conda activate myPythonEnv''' to activate the environment.  This leads to all sorts of problems.  Once the environment is activated, you can update or install packages via '''conda''' or '''pip''':&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda install &amp;lt;package_name&amp;gt;   # preferred way to install packages&lt;br /&gt;
pip install &amp;lt;package_name&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To deactivate:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
source deactivate&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To remove a conda environment:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda remove --name myPythonEnv --all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To verify that the environment was removed, run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda info --envs&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Submitting Python Job ===&lt;br /&gt;
A single-gpu job example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
#SBATCH --time=1:00:0&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt; #For SOSCIP projects only&lt;br /&gt;
&lt;br /&gt;
module load anaconda3&lt;br /&gt;
source activate myPythonEnv&lt;br /&gt;
python code.py ...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== CuPy ==&lt;br /&gt;
[https://cupy.chainer.org CuPy] is an open-source matrix library accelerated with NVIDIA CUDA. It also uses CUDA-related libraries including cuBLAS, cuDNN, cuRand, cuSolver, cuSPARSE, cuFFT and NCCL to make full use of the GPU architecture. CuPy is an implementation of NumPy-compatible multi-dimensional array on CUDA. CuPy consists of the core multi-dimensional array class, cupy.ndarray, and many functions on it. It supports a subset of numpy.ndarray interface.&lt;br /&gt;
&lt;br /&gt;
CuPy can be installed into any conda environment. The Python packages numpy, six and fastrlock are required. cuDNN and NCCL are optional.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3/2019.10 cuda/10.2.89 gcc/7.5.0 cudnn/7.6.5.32  nccl/2.5.6 &lt;br /&gt;
conda create -n cupy-env python=3.7 numpy six fastrlock&lt;br /&gt;
source activate cupy-env&lt;br /&gt;
CFLAGS=&amp;quot;-I$SCINET_CUDNN_ROOT/include -I$SCINET_NCCL_ROOT/include -I$SCINET_CUDA_ROOT/include&amp;quot; LDFLAGS=&amp;quot;-L$SCINET_CUDNN_ROOT/lib64 -L$SCINET_NCCL_ROOT/lib&amp;quot; CUDA_PATH=$SCINET_CUDA_ROOT pip install cupy&lt;br /&gt;
#building/installing CuPy will take a few minutes&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Gromacs ==&lt;br /&gt;
[http://www.gromacs.org/ GROMACS] is a versatile package to perform molecular dynamics, i.e. simulate the Newtonian equations of motion for systems with hundreds to millions of particles. It is primarily designed for biochemical molecules like proteins, lipids and nucleic acids that have a lot of complicated bonded interactions, but since GROMACS is extremely fast at calculating the nonbonded interactions (that usually dominate simulations) many groups are also using it for research on non-biological systems, e.g. polymers.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load cuda/10.2.89  gcc/8.3.0  openmpi/3.1.5 gromacs/2019.5&lt;br /&gt;
module load cuda/10.2.89  gcc/8.3.0  openmpi/3.1.5 gromacs/2019.6&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*The '''GROMACS 2020''' Thread-MPI version supports full GPU enablement of all key computational sections. The GPU is used throughout the timestep and repeated CPU-GPU transfers are eliminated. '''Currently only single-GPU is supported on Mist'''. Users are advised to verify the results carefully.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load cuda/10.2.89 gcc/8.4.0 openmpi/4.0.3 gromacs/2020.4&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Small/Medium Simulation ===&lt;br /&gt;
Due to the lack of PME domain decomposition support on GPU, Gromacs uses the CPU to calculate PME when using multiple GPUs. '''It is always recommended to use a single GPU for small and medium sized simulations with Gromacs.''' By using only 1 MPI rank (with OpenMP threads) on a single GPU, both non-bonded PP and PME are automatically offloaded to the GPU when possible.&lt;br /&gt;
* A single-GPU Gromacs job must request '''--ntasks=32''' even though only 1 MPI rank will be launched by the mpirun command. '''OMP_PLACES''' must be set to cores to place OpenMP threads on physical CPU cores. '''-bind-to none''' and '''-pin off''' must be set to avoid CPU affinity conflicts among OpenMP, MPI and Gromacs. '''OMP_NUM_THREADS''' must be set to 8 to get optimal performance.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
#SBATCH --ntasks=32&lt;br /&gt;
&lt;br /&gt;
module load cuda/10.2.89  gcc/8.3.0  openmpi/3.1.5 gromacs/2019.6&lt;br /&gt;
export OMP_NUM_THREADS=8&lt;br /&gt;
export OMP_PLACES=cores&lt;br /&gt;
mpirun -np 1 -bind-to none gmx_mpi mdrun -pin off -ntomp 8 ... &amp;lt;other parameters&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
* Gromacs 2020 example (the OpenMPI module should be loaded, but mpirun should NOT be used):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
&lt;br /&gt;
module load cuda/10.2.89 gcc/8.4.0 openmpi/4.0.3 gromacs/2020.4&lt;br /&gt;
export OMP_NUM_THREADS=8&lt;br /&gt;
export OMP_PLACES=cores&lt;br /&gt;
gmx mdrun -pin off -ntmpi 1 -ntomp 8 -update gpu ... &amp;lt;other parameters&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Large Simulation ===&lt;br /&gt;
If the memory available to a single-GPU job (~58GB) is not sufficient for the simulation, multiple GPUs can be used. It is suggested to start testing with one full node with 4 GPUs and force PME on GPU. Multiple PME ranks are not supported with PME on GPU, so if a GPU is used for the PME calculation, -npme (the number of PME ranks) must be set to 1. If PME has less work than PP, it is suggested to run multiple ranks per GPU, so that the GPU for the PME rank can also do some work for the PP rank(s). When running multiple MPI ranks on the same GPU, NVIDIA Multi-Process Service (MPS) must be enabled.&lt;br /&gt;
*An example using 4 GPUs, 7 PP ranks + 1 PME rank: ('''-pin on -pme gpu -npme 1''' must be added to mdrun command in order to force GPU to do PME)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=8&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
&lt;br /&gt;
module load cuda/10.2.89  gcc/8.3.0  openmpi/3.1.5 gromacs/2019.6&lt;br /&gt;
&lt;br /&gt;
mkdir -p /dev/shm/nvidia-mps&lt;br /&gt;
export CUDA_MPS_PIPE_DIRECTORY=/dev/shm/nvidia-mps&lt;br /&gt;
mkdir -p /dev/shm/nvidia-log&lt;br /&gt;
export CUDA_MPS_LOG_DIRECTORY=/dev/shm/nvidia-log&lt;br /&gt;
nvidia-cuda-mps-control -d&lt;br /&gt;
&lt;br /&gt;
export OMP_NUM_THREADS=4&lt;br /&gt;
mpirun  -bind-to none gmx_mpi mdrun -pin on -pme gpu -npme 1 ... &amp;lt;add your parameters&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*It is suggested to also test using '''--ntasks=4''' and '''OMP_NUM_THREADS=8''' if you receive a NOTE in Gromacs output saying &amp;quot;% performance was lost because the PME ranks had more work to do than the PP ranks&amp;quot;. In this case, NVIDIA MPS is not needed since there is only one MPI rank per GPU.&lt;br /&gt;
*'''Please note that the solving of PME on GPU is still only the initial version supporting this behaviour, and comes with a set of limitations outlined further below.'''&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
* Only a PME order of 4 is supported on GPUs.&lt;br /&gt;
* PME will run on a GPU only when exactly one rank has a PME task, i.e. decompositions with multiple ranks doing PME are not supported.&lt;br /&gt;
* Only single precision is supported.&lt;br /&gt;
* Free energy calculations where charges are perturbed are not supported, because only single PME grids can be calculated.&lt;br /&gt;
* Only dynamical integrators are supported (i.e. leap-frog, Velocity Verlet, stochastic dynamics).&lt;br /&gt;
* LJ PME is not supported on GPUs.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*An example using 4 GPUs, '''PME on CPU''': ('''-pin on''' must be added to mdrun command for proper CPU thread bindings)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=8&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
&lt;br /&gt;
module load cuda/10.2.89  gcc/8.3.0  openmpi/3.1.5 gromacs/2019.6&lt;br /&gt;
&lt;br /&gt;
mkdir -p /dev/shm/nvidia-mps&lt;br /&gt;
export CUDA_MPS_PIPE_DIRECTORY=/dev/shm/nvidia-mps&lt;br /&gt;
mkdir -p /dev/shm/nvidia-log&lt;br /&gt;
export CUDA_MPS_LOG_DIRECTORY=/dev/shm/nvidia-log&lt;br /&gt;
nvidia-cuda-mps-control -d&lt;br /&gt;
&lt;br /&gt;
export OMP_NUM_THREADS=4&lt;br /&gt;
mpirun -bind-to none gmx_mpi mdrun -pin on  ... &amp;lt;add your parameters&amp;gt;&lt;br /&gt;
&lt;br /&gt;
# &amp;quot;--ntasks=16, OMP_NUM_THREADS=2&amp;quot; and &amp;quot;--ntasks=4, OMP_NUM_THREADS=8&amp;quot; should also be tested.  &lt;br /&gt;
# num_Tasks(MPI_ranks) * num_OpenMP_threads = 32&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*'''NOTE: The above examples will NOT work with multiple nodes. If simulation is too large for a single GPU node, please contact SciNet/SOSCIP support.'''&lt;br /&gt;
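&lt;br /&gt;
The rule num_Tasks(MPI_ranks) * num_OpenMP_threads = 32 quoted in the script comment above can be enumerated with a few lines of Python. This is only a sketch: the function name rank_thread_layouts is hypothetical, and the requirement that every GPU hosts the same number of ranks is an assumption made here, not something the page states.&lt;br /&gt;

```python
CORES_PER_NODE = 32  # physical CPU cores per Mist node
GPUS_PER_NODE = 4    # GPUs per Mist node


def rank_thread_layouts(cores=CORES_PER_NODE, gpus=GPUS_PER_NODE):
    """List (ntasks, omp_threads) pairs with ntasks * omp_threads == cores.

    Assumption: ntasks is a multiple of the GPU count, so that every GPU
    serves the same number of MPI ranks.
    """
    layouts = []
    for ntasks in range(gpus, cores + 1, gpus):
        if cores % ntasks == 0:
            layouts.append((ntasks, cores // ntasks))
    return layouts


layouts = rank_thread_layouts()
for ntasks, threads in layouts:
    print(f'--ntasks={ntasks}  OMP_NUM_THREADS={threads}')
```

The list includes the three combinations suggested for testing above (4x8, 8x4 and 16x2). Which one performs best depends on the simulation and has to be benchmarked.&lt;br /&gt;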
&lt;br /&gt;
== IBM Watson Machine Learning Community Edition (PowerAI) ==&lt;br /&gt;
[https://developer.ibm.com/linuxonpower/deep-learning-powerai/releases/ IBM Watson Machine Learning Community Edition (PowerAI)] contains many popular ML packages including TensorFlow, PyTorch, XGBoost and RAPIDS. It is distributed through the IBM Conda channel. To install packages from PowerAI, users need to specify the IBM Conda channel when using Anaconda.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
&lt;br /&gt;
conda create --name wmlce_env -c https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda &amp;lt;package_name&amp;gt; (e.g. powerai, tensorflow-gpu, keras, pytorch, powerai-rapids, py-xgboost-gpu,  etc)&lt;br /&gt;
&lt;br /&gt;
source activate wmlce_env &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*The WML CE Early Access Conda channel (https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda-early-access/) makes new versions of frameworks available in advance of formal WML CE releases. Easy upgrade between packages in the main and Early Access channels is not guaranteed. Using a separate conda environment for Early Access packages is recommended.&lt;br /&gt;
&lt;br /&gt;
== NAMD ==&lt;br /&gt;
[http://www.ks.uiuc.edu/Research/namd/ NAMD] is a parallel, object-oriented molecular dynamics code designed for high-performance simulation of large biomolecular systems.&lt;br /&gt;
=== v2.13 ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load cuda/10.2.89 gcc/7.5.0 fftw/3.3.8 spectrum-mpi/10.3.1  namd/2.13&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
==== Running with one process per node====&lt;br /&gt;
An example of the job script (using 1 node, '''one process per node''',  32 CPU threads per process + 4 GPUs per process):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=1&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
&lt;br /&gt;
module load cuda/10.2.89 gcc/7.5.0 fftw/3.3.8 spectrum-mpi/10.3.1  namd/2.13&lt;br /&gt;
scontrol show hostnames &amp;gt; nodelist-$SLURM_JOB_ID&lt;br /&gt;
&lt;br /&gt;
`which charmrun` -npernode 1 -hostfile nodelist-$SLURM_JOB_ID `which namd2` +setcpuaffinity +pemap 0-127:4 +idlepoll +ppn 32 +p $((32*SLURM_NTASKS)) stmv.namd&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
==== Running with one process per GPU ====&lt;br /&gt;
NAMD may scale better if using '''one process per GPU'''. Please do your own benchmark.&lt;br /&gt;
An example of the job script (using 1 node, '''one process per GPU''',  8 CPU threads per process):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=4&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
&lt;br /&gt;
module load cuda/10.2.89 gcc/7.5.0 fftw/3.3.8 spectrum-mpi/10.3.1  namd/2.13&lt;br /&gt;
scontrol show hostnames &amp;gt; nodelist-$SLURM_JOB_ID&lt;br /&gt;
&lt;br /&gt;
`which charmrun` -npernode 4 -hostfile nodelist-$SLURM_JOB_ID `which namd2` +setcpuaffinity +pemap 0-127:4 +idlepoll +ppn 8 +p $((8*SLURM_NTASKS)) stmv.namd&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== PyTorch ==&lt;br /&gt;
=== Installing from IBM Conda Channel ===&lt;br /&gt;
The easiest way to install PyTorch on Mist is to use IBM's Conda channel. Users need to prepare a conda environment with Python 3.6 or 3.7 and install PyTorch from IBM's Conda channel.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n pytorch_env python=3.7&lt;br /&gt;
source activate pytorch_env&lt;br /&gt;
conda install -c https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda/ pytorch=1.3.1 &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
'''FOR NEWER VERSIONS:'''&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda install -c https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda-early-access/ pytorch=1.5.0&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Once the installation finishes, please clean the cache:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf $HOME/.conda/pkgs/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== RAPIDS ==&lt;br /&gt;
[https://rapids.ai RAPIDS] is a suite of open source software libraries that lets you execute end-to-end data science and analytics pipelines entirely on GPUs. The RAPIDS data science framework includes a collection of libraries: '''cuDF (GPU DataFrames)''', '''cuML (GPU Machine Learning Algorithms)''', '''cuStrings (GPU String Manipulation)''', etc.&lt;br /&gt;
&lt;br /&gt;
=== Installing from IBM Conda Channel ===&lt;br /&gt;
The easiest way to install RAPIDS on Mist is to use IBM's Conda channel. Users need to prepare a conda environment with Python 3.6 or 3.7 and install powerai-rapids from IBM's Conda channel.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n rapids_env python=3.7&lt;br /&gt;
source activate rapids_env&lt;br /&gt;
conda install -c https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda/ powerai-rapids&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Once the installation finishes, please clean the cache:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf $HOME/.conda/pkgs/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== TensorFlow and Keras ==&lt;br /&gt;
=== Installing from IBM Conda Channel ===&lt;br /&gt;
The easiest way to install TensorFlow and Keras on Mist is to use IBM's Conda channel. Users need to prepare a conda environment with Python 3.6 or 3.7 and install tensorflow-gpu from IBM's Conda channel.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n tf_env python=3.7&lt;br /&gt;
source activate tf_env&lt;br /&gt;
conda install -c https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda/ tensorflow-gpu==2.1.2&lt;br /&gt;
# If you need a TF 1.x version:&lt;br /&gt;
conda install -c https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda/ tensorflow-gpu==1.15.4&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
'''FOR NEWER VERSIONS:'''&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda install -c https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda-early-access/  tensorflow-gpu==2.2.0&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Once the installation finishes, please clean the cache:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf $HOME/.conda/pkgs/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Testing and debugging =&lt;br /&gt;
You really should test your code before you submit it to the cluster to know if your code is correct and what kind of resources you need.&lt;br /&gt;
* Small test jobs can be run on the login node.  Rule of thumb: tests should run no more than a couple of minutes, taking at most about 1-2GB of memory, and use no more than one gpu and a few cores.&lt;br /&gt;
&amp;lt;!-- * You can run the [[Parallel Debugging with DDT|DDT]] debugger on the login nodes after &amp;lt;code&amp;gt;module load ddt&amp;lt;/code&amp;gt;. --&amp;gt;&lt;br /&gt;
* For short tests that do not fit on a login node, or for which you need a dedicated node, request an interactive debug job with the debugjob command:&lt;br /&gt;
 mist-login01:~$ debugjob --clean -g G&lt;br /&gt;
where G is the number of GPUs. G=1 gives an interactive session for 2 hours, G=4 gets you a single node with 4 GPUs for 30 minutes, and G=8 (the maximum) gets you 2 nodes, each with 4 GPUs, for 30 minutes.  The &amp;lt;tt&amp;gt;--clean&amp;lt;/tt&amp;gt; argument is optional but recommended, as it starts the session without any modules loaded, thus mimicking more closely what happens when you submit a job script.&lt;br /&gt;
&lt;br /&gt;
= Submitting jobs =&lt;br /&gt;
Once you have compiled and tested your code or workflow on the Mist login nodes, and confirmed that it behaves correctly, you are ready to submit jobs to the cluster.  Your jobs will run on some of Mist's 53 compute nodes.  When and where your job runs is determined by the scheduler.&lt;br /&gt;
&lt;br /&gt;
Mist uses SLURM as its job scheduler. It is configured to allow only '''Single-GPU jobs''' and '''Full-node jobs (4 GPUs per node)'''.&lt;br /&gt;
&lt;br /&gt;
You submit jobs from a login node by passing a script to the sbatch command:&lt;br /&gt;
&lt;br /&gt;
mist-login01:scratch$ sbatch jobscript.sh&lt;br /&gt;
&lt;br /&gt;
This puts the job in the queue. It will run on the compute nodes in due course. In most cases, you should not submit from your $HOME directory, but rather, from your $SCRATCH directory, so that the output of your compute job can be written out (as mentioned above, $HOME is read-only on the compute nodes).&lt;br /&gt;
&lt;br /&gt;
Example job scripts can be found below.&lt;br /&gt;
Keep in mind:&lt;br /&gt;
* Scheduling is by single GPU or by full node, so you request either 1 GPU or 4 GPUs per node.&lt;br /&gt;
* Your job's maximum walltime is 24 hours. &lt;br /&gt;
* Jobs must write their output to your scratch or project directory (home is read-only on compute nodes).&lt;br /&gt;
* Compute nodes have no internet access.&lt;br /&gt;
* Your job script will not remember the modules you have loaded, so it needs to contain &amp;quot;module load&amp;quot; commands of all the required modules (see examples below). &lt;br /&gt;
== SOSCIP Users ==&lt;br /&gt;
*[https://www.soscip.org SOSCIP] is a consortium to bring together industrial partners and academic researchers and provide them with sophisticated advanced computing technologies and expertise to solve social, technical and business challenges across sectors and drive economic growth.&lt;br /&gt;
&lt;br /&gt;
If you are working on a SOSCIP project, please contact [mailto:soscip-support@scinet.utoronto.ca soscip-support@scinet.utoronto.ca] to have your user account added to SOSCIP project accounts. SOSCIP users need to submit jobs with an additional SLURM flag to get higher priority:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#SBATCH -A soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt;    #e.g. soscip-3-001&lt;br /&gt;
OR&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Single-GPU job script ==&lt;br /&gt;
A single-GPU job gets a quarter of a node: 1 GPU, 8 CPU cores (32 threads) and ~58GB of CPU memory. '''Users should never request CPUs or memory explicitly.''' When running an MPI program, users can set --ntasks to the number of MPI ranks. '''Do NOT set --ntasks for non-MPI programs.''' &lt;br /&gt;
*It is suggested to use NVIDIA Multi-Process Service (MPS) if running multiple MPI ranks on one GPU.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
#SBATCH --time=1:00:0&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt; #For SOSCIP projects only&lt;br /&gt;
&lt;br /&gt;
module load anaconda3&lt;br /&gt;
source activate conda_env&lt;br /&gt;
python code.py ...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Full-node job script ==&lt;br /&gt;
'''If you are not sure the program can be executed on multiple GPUs, please follow the single-gpu job instruction above or contact SciNet/SOSCIP support.'''&lt;br /&gt;
&lt;br /&gt;
A multi-GPU job should request a minimum of one full node (4 GPUs). Users need to specify the &amp;quot;compute_full_node&amp;quot; partition in order to get all resources on a node. &lt;br /&gt;
*An example for a 2-node, 8-rank OpenMPI job: (Each rank binds to 1 GPU and 8 physical CPU cores in this case)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=2&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=8&lt;br /&gt;
#SBATCH --time=1:00:00&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt; #For SOSCIP projects only&lt;br /&gt;
&lt;br /&gt;
module load cuda/10.2.89 gcc/8.3.0 openmpi/3.1.5&lt;br /&gt;
&lt;br /&gt;
mpirun -bind-to core -map-by slot:PE=8 -report-bindings ./program&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Support =&lt;br /&gt;
&lt;br /&gt;
SciNet inquiries:&lt;br /&gt;
* [mailto:support@scinet.utoronto.ca support@scinet.utoronto.ca]&lt;br /&gt;
* [mailto:niagara@computecanada.ca niagara@computecanada.ca]&lt;br /&gt;
&lt;br /&gt;
SOSCIP inquiries:&lt;br /&gt;
*[mailto:soscip-support@scinet.utoronto.ca soscip-support@scinet.utoronto.ca]&lt;/div&gt;</summary>
		<author><name>Ymeiron</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Mist&amp;diff=2918</id>
		<title>Mist</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Mist&amp;diff=2918"/>
		<updated>2021-01-25T22:32:56Z</updated>

		<summary type="html">&lt;p&gt;Ymeiron: Changed -A to --account&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox Computer&lt;br /&gt;
|image=[[Image:Mist.jpg|center|300px|thumb]]&lt;br /&gt;
|name=Mist&lt;br /&gt;
|installed=Dec 2019&lt;br /&gt;
|operatingsystem= Red Hat Enterprise Linux 7.6 &lt;br /&gt;
|loginnode= mist.scinet.utoronto.ca&lt;br /&gt;
|nnodes=  54 IBM AC922&lt;br /&gt;
|rampernode= 256 GB  &lt;br /&gt;
|gpuspernode=4 V100-SXM2-32GB&lt;br /&gt;
|interconnect=Mellanox EDR&lt;br /&gt;
|vendorcompilers= NVCC, IBM XL&lt;br /&gt;
|queuetype=Slurm&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
=Specifications=&lt;br /&gt;
Mist is a SciNet-[[#SOSCIP Users |SOSCIP]] joint GPU cluster consisting of 54 IBM AC922 servers. Each node of the cluster has 32 IBM Power9 cores, 256GB RAM and 4 NVIDIA V100-SXM2-32GB GPUs connected by NVLink. The cluster has an InfiniBand EDR interconnect providing GPU-Direct RDMA capability.&lt;br /&gt;
&lt;br /&gt;
= Getting started on Mist =&lt;br /&gt;
Mist can be accessed directly.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -Y MYCCUSERNAME@mist.scinet.utoronto.ca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Mist login node '''mist-login01''' can also be accessed via Niagara cluster.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -Y MYCCUSERNAME@niagara.scinet.utoronto.ca&lt;br /&gt;
ssh -Y mist-login01&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
== Storage ==&lt;br /&gt;
The filesystem for Mist is shared with Niagara cluster. See [https://docs.scinet.utoronto.ca/index.php/Niagara_Quickstart#Your_various_directories Niagara Storage] for more details.&lt;br /&gt;
&lt;br /&gt;
= Loading software modules =&lt;br /&gt;
&lt;br /&gt;
You have two options for running code on Mist: use existing software, or compile your own.  This section focuses on the former.&lt;br /&gt;
&lt;br /&gt;
Other than essentials, all installed software is made available [[Using_modules | using module commands]]. These modules set environment variables (PATH, etc.), allowing multiple, conflicting versions of a given package to be available.  A detailed explanation of the module system can be [[Using_modules | found on the modules page]].&lt;br /&gt;
&lt;br /&gt;
Common module subcommands are:&lt;br /&gt;
&lt;br /&gt;
* &amp;lt;code&amp;gt;module load &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt;: load the default version of a particular software.&lt;br /&gt;
* &amp;lt;code&amp;gt;module load &amp;lt;module-name&amp;gt;/&amp;lt;module-version&amp;gt;&amp;lt;/code&amp;gt;: load a specific version of a particular software.&lt;br /&gt;
* &amp;lt;code&amp;gt;module purge&amp;lt;/code&amp;gt;: unload all currently loaded modules.&lt;br /&gt;
* &amp;lt;code&amp;gt;module spider&amp;lt;/code&amp;gt; (or &amp;lt;code&amp;gt;module spider &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt;): list available software packages.&lt;br /&gt;
* &amp;lt;code&amp;gt;module avail&amp;lt;/code&amp;gt;: list loadable software packages.&lt;br /&gt;
* &amp;lt;code&amp;gt;module list&amp;lt;/code&amp;gt;: list loaded modules.&lt;br /&gt;
&lt;br /&gt;
Along with modifying common environment variables, such as PATH, and LD_LIBRARY_PATH, these modules also create a SCINET_MODULENAME_ROOT environment variable, which can be used to access commonly needed software directories, such as /include and /lib.&lt;br /&gt;
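&lt;br /&gt;
As an illustration of the SCINET_MODULENAME_ROOT convention, the following sketch looks up the include and lib directories of a loaded module from Python. The function name module_dirs is hypothetical, and the exact subdirectory names are an assumption: some modules may use lib64 instead of lib.&lt;br /&gt;

```python
import os


def module_dirs(module_name, env=None):
    """Return include and lib paths under SCINET_MODULENAME_ROOT, or None."""
    env = os.environ if env is None else env
    root = env.get(f'SCINET_{module_name.upper()}_ROOT')
    if root is None:
        return None  # module not loaded, or it does not define the variable
    return {'include': os.path.join(root, 'include'),
            'lib': os.path.join(root, 'lib')}


# Prints None unless the cuda module is loaded in the current shell.
print(module_dirs('cuda'))
```

The same variables can be used directly in build commands, for example in compiler include and library flags.&lt;br /&gt;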
&lt;br /&gt;
There are handy abbreviations for the module commands. &amp;lt;code&amp;gt;ml&amp;lt;/code&amp;gt; is the same as &amp;lt;code&amp;gt;module list&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;ml &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt; is the same as &amp;lt;code&amp;gt;module load &amp;lt;module-name&amp;gt;&amp;lt;/code&amp;gt;.&lt;br /&gt;
== Tips for loading software ==&lt;br /&gt;
&lt;br /&gt;
* We advise '''''against''''' loading modules in your .bashrc.  This can lead to very confusing behaviour under certain circumstances.  Our guidelines for .bashrc files can be found [[bashrc guidelines|here]].&lt;br /&gt;
* Instead, load modules by hand when needed, or by sourcing a separate script.&lt;br /&gt;
* Load run-specific modules inside your job submission script.&lt;br /&gt;
* Short names give default versions; e.g. &amp;lt;code&amp;gt;cuda&amp;lt;/code&amp;gt; → &amp;lt;code&amp;gt;cuda/10.1.243&amp;lt;/code&amp;gt;. It is usually better to be explicit about the versions, for future reproducibility.&lt;br /&gt;
* Modules often require other modules to be loaded first.  Solve these dependencies by using [[Using_modules#Module_spider | &amp;lt;code&amp;gt;module spider&amp;lt;/code&amp;gt;]].&lt;br /&gt;
&lt;br /&gt;
= Available compilers and interpreters =&lt;br /&gt;
* The &amp;lt;tt&amp;gt;cuda&amp;lt;/tt&amp;gt; module has to be loaded first for GPU software.&lt;br /&gt;
* For most compiled software, one should use the GNU compilers (&amp;lt;tt&amp;gt;gcc&amp;lt;/tt&amp;gt; for C, &amp;lt;tt&amp;gt;g++&amp;lt;/tt&amp;gt; for C++, and &amp;lt;tt&amp;gt;gfortran&amp;lt;/tt&amp;gt; for Fortran). Loading &amp;lt;tt&amp;gt;gcc&amp;lt;/tt&amp;gt; module makes these available. &lt;br /&gt;
* The IBM XL compiler suite (&amp;lt;tt&amp;gt;xlc_r, xlc++_r, xlf_r&amp;lt;/tt&amp;gt;) is also available, if you load one of the &amp;lt;tt&amp;gt;xl&amp;lt;/tt&amp;gt; modules.&lt;br /&gt;
* To compile mpi code, you must additionally load an &amp;lt;tt&amp;gt;openmpi&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;spectrum-mpi&amp;lt;/tt&amp;gt; module.&lt;br /&gt;
&lt;br /&gt;
=== CUDA ===&lt;br /&gt;
&lt;br /&gt;
The currently installed CUDA Toolkits are '''10.1.243''' and '''10.2.89 (default)'''&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load cuda/&amp;lt;version&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*A compiler (GCC, XL or PGI) module must be loaded in order to use CUDA to build any code.&lt;br /&gt;
The current NVIDIA driver version is 440.33.01.&lt;br /&gt;
&lt;br /&gt;
===GNU Compilers ===&lt;br /&gt;
&lt;br /&gt;
Available GCC modules are:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
gcc/7.5.0&lt;br /&gt;
gcc/8.4.0&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== IBM XL Compilers ===&lt;br /&gt;
&lt;br /&gt;
To load the native IBM xlc/xlc++ and xlf (Fortran) compilers, run&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load xl/16.1.1.3&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
IBM XL Compilers are enabled for use with NVIDIA GPUs, including support for OpenMP GPU offloading and integration with NVIDIA's nvcc command to compile host-side code for the POWER9 CPU. Information about the IBM XL Compilers can be found at the following links:[https://www.ibm.com/support/knowledgecenter/SSXVZZ_16.1.1/com.ibm.compilers.linux.doc/welcome.html IBM XL C/C++], &lt;br /&gt;
[https://www.ibm.com/support/knowledgecenter/SSAT4T_16.1.1/com.ibm.compilers.linux.doc/welcome.html IBM XL Fortran]&lt;br /&gt;
&lt;br /&gt;
=== OpenMPI ===&lt;br /&gt;
The &amp;lt;tt&amp;gt;openmpi/&amp;lt;version&amp;gt;&amp;lt;/tt&amp;gt; module is available with different compilers including GCC and XL. The &amp;lt;tt&amp;gt;spectrum-mpi/&amp;lt;version&amp;gt;&amp;lt;/tt&amp;gt; module provides IBM Spectrum MPI.&lt;br /&gt;
&lt;br /&gt;
=== PGI ===&lt;br /&gt;
To load PGI compiler and its own OpenMPI environment, run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load pgi/19.10&lt;br /&gt;
module load pgi-openmpi/3.1.3&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Software =&lt;br /&gt;
== Amber20 ==&lt;br /&gt;
&lt;br /&gt;
Users who hold an Amber20 license can build Amber20 from its source code and run it on Mist. '''SOSCIP/SciNet doesn't provide an Amber license or source code.'''&lt;br /&gt;
&lt;br /&gt;
=== Building Amber20 ===&lt;br /&gt;
Modules that are needed for building Amber20:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
1) MistEnv/2020a (S)   2) cuda/10.2.89   3) gcc/8.4.0   4) cmake/3.16.3   5) openmpi/4.0.3   6) anaconda3/2019.10   7) nccl/2.5.6&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Cmake configuration:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
cmake .. -DCMAKE_INSTALL_PREFIX=$HOME/where-amber-install -DCOMPILER=GNU -DMPI=TRUE -DCUDA=TRUE -DINSTALL_TESTS=TRUE -DDOWNLOAD_MINICONDA=FALSE -DOPENMP=TRUE -DNCCL=TRUE -DAPPLY_UPDATES=TRUE&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Running Amber20 ===&lt;br /&gt;
'''NVIDIA Pascal and later GPUs do not scale beyond a single GPU'''. It is strongly suggested to run Amber20 as a single-GPU job.&lt;br /&gt;
A job example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
#SBATCH --time=1:00:0&lt;br /&gt;
#SBATCH --account soscip-&amp;lt;SOSCIP-project-ID&amp;gt;&lt;br /&gt;
&lt;br /&gt;
module load cuda/10.2.89 gcc/8.4.0 openmpi/4.0.3 nccl/2.5.6&lt;br /&gt;
export PATH=$HOME/where-amber-install/bin:$PATH&lt;br /&gt;
export LD_LIBRARY_PATH=$HOME/where-amber-install/lib:$LD_LIBRARY_PATH&lt;br /&gt;
pmemd.cuda .... &amp;lt;parameters&amp;gt; ...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Anaconda (Python) ==&lt;br /&gt;
Anaconda is a popular distribution of the Python programming language. It contains several common Python libraries such as SciPy and NumPy as pre-built packages, which eases installation. Anaconda is provided as modules: '''anaconda3'''&lt;br /&gt;
&lt;br /&gt;
To use Anaconda, users need to load the module and create a conda environment:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n myPythonEnv python=3.7&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*Note: By default, conda environments are located in '''$HOME/.conda/envs'''. The cache (downloaded tarballs and packages) is under '''$HOME/.conda/pkgs'''. Users may run into problems with their disk quota if too many environments are created. To clean the conda cache, '''please run: &amp;quot;conda clean -y --all&amp;quot; and &amp;quot;rm -rf $HOME/.conda/pkgs/*&amp;quot; after installing packages'''.&lt;br /&gt;
&lt;br /&gt;
To activate the conda environment: (should be activated before running python)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
source activate myPythonEnv&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Note that you SHOULD NOT use '''conda activate myPythonEnv''' to activate the environment.  This leads to all sorts of problems.  Once the environment is activated, users can update or install packages via '''conda''' or '''pip''':&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda install  &amp;lt;package_name&amp;gt; (preferred way to install packages)&lt;br /&gt;
pip install &amp;lt;package_name&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To deactivate:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
source deactivate&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To remove a conda environment:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda remove --name myPythonEnv --all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To verify that the environment was removed, run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda info --envs&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Submitting Python Job ===&lt;br /&gt;
A single-gpu job example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
#SBATCH --time=1:00:0&lt;br /&gt;
#SBATCH --account soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt; #For SOSCIP projects only&lt;br /&gt;
&lt;br /&gt;
module load anaconda3&lt;br /&gt;
source activate myPythonEnv&lt;br /&gt;
python code.py ...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== CuPy ==&lt;br /&gt;
[https://cupy.chainer.org CuPy] is an open-source matrix library accelerated with NVIDIA CUDA. It also uses CUDA-related libraries including cuBLAS, cuDNN, cuRand, cuSolver, cuSPARSE, cuFFT and NCCL to make full use of the GPU architecture. CuPy is an implementation of NumPy-compatible multi-dimensional array on CUDA. CuPy consists of the core multi-dimensional array class, cupy.ndarray, and many functions on it. It supports a subset of numpy.ndarray interface.&lt;br /&gt;
&lt;br /&gt;
CuPy can be installed into any conda environment. The Python packages numpy, six and fastrlock are required; cuDNN and NCCL are optional.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3/2019.10 cuda/10.2.89 gcc/7.5.0 cudnn/7.6.5.32  nccl/2.5.6 &lt;br /&gt;
conda create -n cupy-env python=3.7 numpy six fastrlock&lt;br /&gt;
source activate cupy-env&lt;br /&gt;
CFLAGS=&amp;quot;-I$SCINET_CUDNN_ROOT/include -I$SCINET_NCCL_ROOT/include -I$SCINET_CUDA_ROOT/include&amp;quot; LDFLAGS=&amp;quot;-L$SCINET_CUDNN_ROOT/lib64 -L$SCINET_NCCL_ROOT/lib&amp;quot; CUDA_PATH=$SCINET_CUDA_ROOT pip install cupy&lt;br /&gt;
#building/installing CuPy will take a few minutes&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Gromacs ==&lt;br /&gt;
[http://www.gromacs.org/ GROMACS] is a versatile package to perform molecular dynamics, i.e. simulate the Newtonian equations of motion for systems with hundreds to millions of particles. It is primarily designed for biochemical molecules like proteins, lipids and nucleic acids that have a lot of complicated bonded interactions, but since GROMACS is extremely fast at calculating the nonbonded interactions (that usually dominate simulations) many groups are also using it for research on non-biological systems, e.g. polymers.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load cuda/10.2.89  gcc/8.3.0  openmpi/3.1.5 gromacs/2019.5&lt;br /&gt;
module load cuda/10.2.89  gcc/8.3.0  openmpi/3.1.5 gromacs/2019.6&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*The '''GROMACS 2020''' thread-MPI version supports full GPU enablement of all key computational sections. The GPU is used throughout the timestep and repeated CPU-GPU transfers are eliminated. '''Currently only single-GPU runs are supported on Mist'''. Users are advised to verify their results carefully.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load cuda/10.2.89 gcc/8.4.0 openmpi/4.0.3 gromacs/2020.4&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Small/Medium Simulation ===&lt;br /&gt;
Due to the lack of PME domain decomposition support on GPUs, Gromacs uses the CPU to calculate PME when running on multiple GPUs. '''It is always recommended to use a single GPU for small and medium-sized simulations with Gromacs.''' With only 1 MPI rank (with OpenMP threads) on a single GPU, both non-bonded PP and PME are automatically offloaded to the GPU when possible.&lt;br /&gt;
* A single-GPU Gromacs job must request '''--ntasks=32''' even though only 1 MPI rank will be launched by the mpirun command. '''OMP_PLACES''' must be set to cores to pin OpenMP threads to physical CPU cores. '''-bind-to none''' and '''-pin off''' must be set to avoid CPU affinity conflicts among OpenMP, MPI and Gromacs. '''OMP_NUM_THREADS''' must be set to 8 to get optimal performance.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
#SBATCH --ntasks=32&lt;br /&gt;
&lt;br /&gt;
module load cuda/10.2.89  gcc/8.3.0  openmpi/3.1.5 gromacs/2019.6&lt;br /&gt;
export OMP_NUM_THREADS=8&lt;br /&gt;
export OMP_PLACES=cores&lt;br /&gt;
mpirun -np 1 -bind-to none gmx_mpi mdrun -pin off -ntomp 8 ... &amp;lt;other parameters&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
* Gromacs 2020 example (the OpenMPI module should be loaded, but mpirun should NOT be used):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
&lt;br /&gt;
module load cuda/10.2.89 gcc/8.4.0 openmpi/4.0.3 gromacs/2020.4&lt;br /&gt;
export OMP_NUM_THREADS=8&lt;br /&gt;
export OMP_PLACES=cores&lt;br /&gt;
gmx mdrun -pin off -ntmpi 1 -ntomp 8 -update gpu ... &amp;lt;other parameters&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Large Simulation ===&lt;br /&gt;
If the memory available to a single-GPU job (~58GB) is not sufficient for the simulation, multiple GPUs can be used. It is suggested to start testing with one full node with 4 GPUs and to force PME onto the GPU. Multiple PME ranks are not supported with PME on GPU, so if a GPU is used for the PME calculation, -npme (the number of PME ranks) must be set to 1. If PME has less work than PP, it is suggested to run multiple ranks per GPU, so that the GPU hosting the PME rank can also do some work for PP rank(s). When running multiple MPI ranks on the same GPU, NVIDIA Multi-Process Service (MPS) must be enabled.&lt;br /&gt;
*An example using 4 GPUs, 7 PP ranks + 1 PME rank ('''-pin on -pme gpu -npme 1''' must be added to the mdrun command to force PME onto the GPU):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=8&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
&lt;br /&gt;
module load cuda/10.2.89  gcc/8.3.0  openmpi/3.1.5 gromacs/2019.6&lt;br /&gt;
&lt;br /&gt;
mkdir -p /dev/shm/nvidia-mps&lt;br /&gt;
export CUDA_MPS_PIPE_DIRECTORY=/dev/shm/nvidia-mps&lt;br /&gt;
mkdir -p /dev/shm/nvidia-log&lt;br /&gt;
export CUDA_MPS_LOG_DIRECTORY=/dev/shm/nvidia-log&lt;br /&gt;
nvidia-cuda-mps-control -d&lt;br /&gt;
&lt;br /&gt;
export OMP_NUM_THREADS=4&lt;br /&gt;
mpirun  -bind-to none gmx_mpi mdrun -pin on -pme gpu -npme 1 ... &amp;lt;add your parameters&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*It is suggested to also test using '''--ntasks=4''' and '''OMP_NUM_THREADS=8''' if you receive a NOTE in Gromacs output saying &amp;quot;% performance was lost because the PME ranks had more work to do than the PP ranks&amp;quot;. In this case, NVIDIA MPS is not needed since there is only one MPI rank per GPU.&lt;br /&gt;
*'''Please note that PME on GPU is still an initial implementation and comes with the set of limitations outlined below.'''&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
* Only a PME order of 4 is supported on GPUs.&lt;br /&gt;
* PME will run on a GPU only when exactly one rank has a PME task, i.e. decompositions with multiple ranks doing PME are not supported.&lt;br /&gt;
* Only single precision is supported.&lt;br /&gt;
* Free energy calculations where charges are perturbed are not supported, because only single PME grids can be calculated.&lt;br /&gt;
* Only dynamical integrators are supported (i.e. leap-frog, Velocity Verlet, stochastic dynamics).&lt;br /&gt;
* LJ PME is not supported on GPUs.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*An example using 4 GPUs with '''PME on CPU''' ('''-pin on''' must be added to the mdrun command for proper CPU thread binding):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=8&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
&lt;br /&gt;
module load cuda/10.2.89  gcc/8.3.0  openmpi/3.1.5 gromacs/2019.6&lt;br /&gt;
&lt;br /&gt;
mkdir -p /dev/shm/nvidia-mps&lt;br /&gt;
export CUDA_MPS_PIPE_DIRECTORY=/dev/shm/nvidia-mps&lt;br /&gt;
mkdir -p /dev/shm/nvidia-log&lt;br /&gt;
export CUDA_MPS_LOG_DIRECTORY=/dev/shm/nvidia-log&lt;br /&gt;
nvidia-cuda-mps-control -d&lt;br /&gt;
&lt;br /&gt;
export OMP_NUM_THREADS=4&lt;br /&gt;
mpirun -bind-to none gmx_mpi mdrun -pin on  ... &amp;lt;add your parameters&amp;gt;&lt;br /&gt;
&lt;br /&gt;
# &amp;quot;--ntasks=16, OMP_NUM_THREADS=2&amp;quot; and &amp;quot;--ntasks=4, OMP_NUM_THREADS=8&amp;quot; should also be tested.  &lt;br /&gt;
# num_Tasks(MPI_ranks) * num_OpenMP_threads = 32&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*'''NOTE: The above examples will NOT work with multiple nodes. If your simulation is too large for a single GPU node, please contact SciNet/SOSCIP support.'''&lt;br /&gt;
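The constraint noted in the PME-on-CPU job script (MPI ranks times OpenMP threads equals 32) can be enumerated with a short script. This is only a sketch: the value 32 is taken from that script's comment, and the function name rank_thread_pairs is hypothetical.&lt;br /&gt;

```python
# Sketch: list (--ntasks, OMP_NUM_THREADS) pairs whose product is 32,
# matching "num_Tasks(MPI_ranks) * num_OpenMP_threads = 32" in the job script.
CORES_PER_NODE = 32  # value taken from the job-script comment


def rank_thread_pairs(total=CORES_PER_NODE):
    """Return every (ranks, threads) pair with ranks * threads == total."""
    return [(r, total // r) for r in range(1, total + 1) if total % r == 0]


for ranks, threads in rank_thread_pairs():
    print(f"--ntasks={ranks}  OMP_NUM_THREADS={threads}")
```

Of the pairs printed, this page suggests testing (4, 8), (8, 4) and (16, 2); the remaining pairs are listed only for completeness and may not be sensible on a 4-GPU node.&lt;br /&gt;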
&lt;br /&gt;
== IBM Watson Machine Learning Community Edition (PowerAI) ==&lt;br /&gt;
[https://developer.ibm.com/linuxonpower/deep-learning-powerai/releases/ IBM Watson Machine Learning Community Edition (PowerAI)] contains many popular ML packages, including TensorFlow, PyTorch, XGBoost and RAPIDS. It is distributed through the IBM Conda channel. To install packages from PowerAI, users need to specify the IBM Conda channel when using Anaconda.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
&lt;br /&gt;
# Replace &amp;lt;package_name&amp;gt; with e.g. powerai, tensorflow-gpu, keras, pytorch, powerai-rapids, py-xgboost-gpu, etc.&lt;br /&gt;
conda create --name wmlce_env -c https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda &amp;lt;package_name&amp;gt;&lt;br /&gt;
&lt;br /&gt;
source activate wmlce_env &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*The WML CE Early Access Conda channel (https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda-early-access/) makes new versions of frameworks available in advance of formal WML CE releases. Easy upgrade between packages in the main and Early Access channels is not guaranteed. Using a separate conda environment for Early Access packages is recommended.&lt;br /&gt;
&lt;br /&gt;
== NAMD ==&lt;br /&gt;
[http://www.ks.uiuc.edu/Research/namd/ NAMD] is a parallel, object-oriented molecular dynamics code designed for high-performance simulation of large biomolecular systems.&lt;br /&gt;
=== v2.13 ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load cuda/10.2.89 gcc/7.5.0 fftw/3.3.8 spectrum-mpi/10.3.1  namd/2.13&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
==== Running with one process per node====&lt;br /&gt;
An example of the job script (using 1 node, '''one process per node''',  32 CPU threads per process + 4 GPUs per process):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=1&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
&lt;br /&gt;
module load cuda/10.2.89 gcc/7.5.0 fftw/3.3.8 spectrum-mpi/10.3.1  namd/2.13&lt;br /&gt;
scontrol show hostnames &amp;gt; nodelist-$SLURM_JOB_ID&lt;br /&gt;
&lt;br /&gt;
`which charmrun` -npernode 1 -hostfile nodelist-$SLURM_JOB_ID `which namd2` +setcpuaffinity +pemap 0-127:4 +idlepoll +ppn 32 +p $((32*SLURM_NTASKS)) stmv.namd&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
==== Running with one process per GPU ====&lt;br /&gt;
NAMD may scale better when using '''one process per GPU'''. Please run your own benchmarks.&lt;br /&gt;
An example of the job script (using 1 node, '''one process per GPU''',  8 CPU threads per process):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=20:00&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=4&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
&lt;br /&gt;
module load cuda/10.2.89 gcc/7.5.0 fftw/3.3.8 spectrum-mpi/10.3.1  namd/2.13&lt;br /&gt;
scontrol show hostnames &amp;gt; nodelist-$SLURM_JOB_ID&lt;br /&gt;
&lt;br /&gt;
`which charmrun` -npernode 4 -hostfile nodelist-$SLURM_JOB_ID `which namd2` +setcpuaffinity +pemap 0-127:4 +idlepoll +ppn 8 +p $((8*SLURM_NTASKS)) stmv.namd&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== PyTorch ==&lt;br /&gt;
=== Installing from IBM Conda Channel ===&lt;br /&gt;
The easiest way to install PyTorch on Mist is to use IBM's Conda channel. Users need to prepare a conda environment with Python 3.6 or 3.7 and install PyTorch from IBM's Conda channel.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n pytorch_env python=3.7&lt;br /&gt;
source activate pytorch_env&lt;br /&gt;
conda install -c https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda/ pytorch=1.3.1 &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
'''FOR NEWER VERSIONS:'''&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda install -c https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda-early-access/ pytorch=1.5.0&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Once the installation finishes, please clean the cache:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf $HOME/.conda/pkgs/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== RAPIDS ==&lt;br /&gt;
[https://rapids.ai RAPIDS] is a suite of open-source software libraries for executing end-to-end data science and analytics pipelines entirely on GPUs. The RAPIDS data science framework includes a collection of libraries: '''cuDF (GPU DataFrames)''', '''cuML (GPU Machine Learning Algorithms)''', '''cuStrings (GPU String Manipulation)''', etc.&lt;br /&gt;
&lt;br /&gt;
=== Installing from IBM Conda Channel ===&lt;br /&gt;
The easiest way to install RAPIDS on Mist is to use IBM's Conda channel. Users need to prepare a conda environment with Python 3.6 or 3.7 and install powerai-rapids from IBM's Conda channel.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n rapids_env python=3.7&lt;br /&gt;
source activate rapids_env&lt;br /&gt;
conda install -c https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda/ powerai-rapids&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Once the installation finishes, please clean the cache:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf $HOME/.conda/pkgs/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== TensorFlow and Keras ==&lt;br /&gt;
=== Installing from IBM Conda Channel ===&lt;br /&gt;
The easiest way to install TensorFlow and Keras on Mist is to use IBM's Conda channel. Users need to prepare a conda environment with Python 3.6 or 3.7 and install tensorflow-gpu from IBM's Conda channel.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anaconda3&lt;br /&gt;
conda create -n tf_env python=3.7&lt;br /&gt;
source activate tf_env&lt;br /&gt;
conda install -c https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda/ tensorflow-gpu==2.1.2&lt;br /&gt;
# If you need a TF 1.x version:&lt;br /&gt;
conda install -c https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda/ tensorflow-gpu==1.15.4&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
'''FOR NEWER VERSIONS:'''&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda install -c https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda-early-access/  tensorflow-gpu==2.2.0&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Once the installation finishes, please clean the cache:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda clean -y --all&lt;br /&gt;
rm -rf $HOME/.conda/pkgs/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Testing and debugging =&lt;br /&gt;
Test your code before you submit it to the cluster, so that you know it is correct and what resources it needs.&lt;br /&gt;
* Small test jobs can be run on the login node.  Rule of thumb: tests should run no more than a couple of minutes, take at most about 1-2GB of memory, and use no more than one GPU and a few cores.&lt;br /&gt;
&amp;lt;!-- * You can run the [[Parallel Debugging with DDT|DDT]] debugger on the login nodes after &amp;lt;code&amp;gt;module load ddt&amp;lt;/code&amp;gt;. --&amp;gt;&lt;br /&gt;
* For short tests that do not fit on a login node, or for which you need a dedicated node, request an interactive debug job with the debugjob command:&lt;br /&gt;
 mist-login01:~$ debugjob --clean -g G&lt;br /&gt;
where G is the number of GPUs. G=1 gives an interactive session for 2 hours, G=4 gives a single node with 4 GPUs for 30 minutes, and G=8 (the maximum) gives 2 nodes, each with 4 GPUs, for 30 minutes.  The &amp;lt;tt&amp;gt;--clean&amp;lt;/tt&amp;gt; argument is optional but recommended, as it starts the session without any modules loaded and thus more closely mimics what happens when you submit a job script.&lt;br /&gt;
&lt;br /&gt;
= Submitting jobs =&lt;br /&gt;
Once you have compiled and tested your code or workflow on the Mist login nodes, and confirmed that it behaves correctly, you are ready to submit jobs to the cluster.  Your jobs will run on some of Mist's 53 compute nodes.  When and where your job runs is determined by the scheduler.&lt;br /&gt;
&lt;br /&gt;
Mist uses SLURM as its job scheduler. It is configured to allow only '''Single-GPU jobs''' and '''Full-node jobs (4 GPUs per node)'''.&lt;br /&gt;
&lt;br /&gt;
You submit jobs from a login node by passing a script to the sbatch command:&lt;br /&gt;
&lt;br /&gt;
mist-login01:scratch$ sbatch jobscript.sh&lt;br /&gt;
&lt;br /&gt;
This puts the job in the queue. It will run on the compute nodes in due course. In most cases, you should not submit from your $HOME directory, but rather, from your $SCRATCH directory, so that the output of your compute job can be written out (as mentioned above, $HOME is read-only on the compute nodes).&lt;br /&gt;
&lt;br /&gt;
Example job scripts can be found below.&lt;br /&gt;
Keep in mind:&lt;br /&gt;
* Scheduling is by single GPU or by full node, so you request either 1 GPU or 4 GPUs per node.&lt;br /&gt;
* Your job's maximum walltime is 24 hours. &lt;br /&gt;
* Jobs must write their output to your scratch or project directory (home is read-only on compute nodes).&lt;br /&gt;
* Compute nodes have no internet access.&lt;br /&gt;
* Your job script will not remember the modules you have loaded, so it needs to contain &amp;quot;module load&amp;quot; commands of all the required modules (see examples below). &lt;br /&gt;
== SOSCIP Users ==&lt;br /&gt;
*[https://www.soscip.org SOSCIP] is a consortium to bring together industrial partners and academic researchers and provide them with sophisticated advanced computing technologies and expertise to solve social, technical and business challenges across sectors and drive economic growth.&lt;br /&gt;
&lt;br /&gt;
If you are working on a SOSCIP project, please contact [mailto:soscip-support@scinet.utoronto.ca soscip-support@scinet.utoronto.ca] to have your user account added to the SOSCIP project accounts. SOSCIP users need to submit jobs with an additional SLURM flag to get higher priority:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#SBATCH -A soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt;    #e.g. soscip-3-001&lt;br /&gt;
# or&lt;br /&gt;
#SBATCH --account=soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Single-GPU job script ==&lt;br /&gt;
A single-GPU job gets a quarter of a node: 1 GPU, 8 CPU cores (32 hardware threads) and ~58GB of CPU memory. '''Users should never request CPUs or memory explicitly.''' When running an MPI program, users can set --ntasks to the number of MPI ranks. '''Do NOT set --ntasks for non-MPI programs.''' &lt;br /&gt;
*It is suggested to use NVIDIA Multi-Process Service (MPS) if running multiple MPI ranks on one GPU.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --gpus-per-node=1&lt;br /&gt;
#SBATCH --time=1:00:00&lt;br /&gt;
#SBATCH --account soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt; #For SOSCIP projects only&lt;br /&gt;
&lt;br /&gt;
module load anaconda3&lt;br /&gt;
source activate conda_env&lt;br /&gt;
python code.py ...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Full-node job script ==&lt;br /&gt;
'''If you are not sure whether the program can run on multiple GPUs, please follow the single-GPU job instructions above or contact SciNet/SOSCIP support.'''&lt;br /&gt;
&lt;br /&gt;
A multi-GPU job should request a minimum of one full node (4 GPUs). Users need to specify the &amp;quot;compute_full_node&amp;quot; partition in order to get all resources on a node. &lt;br /&gt;
*An example for a 2-node, 8-rank OpenMPI job (each rank binds to 1 GPU and 8 physical CPU cores in this case):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --nodes=2&lt;br /&gt;
#SBATCH --gpus-per-node=4&lt;br /&gt;
#SBATCH --ntasks=8&lt;br /&gt;
#SBATCH --time=1:00:00&lt;br /&gt;
#SBATCH -p compute_full_node&lt;br /&gt;
#SBATCH --account soscip-&amp;lt;SOSCIP_PROJECT_ID&amp;gt; #For SOSCIP projects only&lt;br /&gt;
&lt;br /&gt;
module load cuda/10.2.89 gcc/8.3.0 openmpi/3.1.5&lt;br /&gt;
&lt;br /&gt;
mpirun -bind-to core -map-by slot:PE=8 -report-bindings ./program&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
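To check the arithmetic behind the script above, the following sketch computes how the ranks, GPUs and physical cores are divided. The per-node figures (4 GPUs, 32 physical cores) are inferred from this page; the function name layout and the dictionary keys are hypothetical.&lt;br /&gt;

```python
# Sketch: resource split for a full-node job (figures inferred from this page).
GPUS_PER_NODE = 4
CORES_PER_NODE = 32  # physical cores; a quarter node is 8 cores


def layout(nodes, ntasks):
    """Return ranks per node, plus cores and GPUs per rank."""
    if ntasks % nodes != 0:
        raise ValueError("ntasks must divide evenly across nodes")
    ranks_per_node = ntasks // nodes
    if CORES_PER_NODE % ranks_per_node != 0:
        raise ValueError("ranks per node must divide the core count")
    return {
        "ranks_per_node": ranks_per_node,
        "cores_per_rank": CORES_PER_NODE // ranks_per_node,  # value for PE=
        "gpus_per_rank": GPUS_PER_NODE / ranks_per_node,
    }


print(layout(nodes=2, ntasks=8))
```

For the 2-node, 8-rank script above this gives 4 ranks per node and 8 cores per rank, which matches the PE=8 value passed to mpirun.&lt;br /&gt;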
&lt;br /&gt;
= Support =&lt;br /&gt;
&lt;br /&gt;
SciNet inquiries:&lt;br /&gt;
* [mailto:support@scinet.utoronto.ca support@scinet.utoronto.ca]&lt;br /&gt;
* [mailto:niagara@computecanada.ca niagara@computecanada.ca]&lt;br /&gt;
&lt;br /&gt;
SOSCIP inquiries:&lt;br /&gt;
*[mailto:soscip-support@scinet.utoronto.ca soscip-support@scinet.utoronto.ca]&lt;/div&gt;</summary>
		<author><name>Ymeiron</name></author>
	</entry>
	<entry>
		<id>https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=2740</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://docs.scinet.utoronto.ca/index.php?title=Main_Page&amp;diff=2740"/>
		<updated>2020-08-13T12:25:01Z</updated>

		<summary type="html">&lt;p&gt;Ymeiron: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
{| style=&amp;quot;border-spacing:10px; width: 95%&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:1em; padding-top:.1em; border:2px solid #0645ad; background-color:#f6f6f6; border-radius:7px&amp;quot;|&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Use &amp;quot;Up&amp;quot; or &amp;quot;Down&amp;quot;; these are templates. --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;width:100%&amp;quot; &lt;br /&gt;
|{{Up|Niagara|Niagara_Quickstart}}&lt;br /&gt;
|{{up|HPSS|HPSS}}&lt;br /&gt;
|{{Down|Mist|Mist}}&lt;br /&gt;
|{{Up|Teach|Teach}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up|Jupyter Hub|Jupyter_Hub}}&lt;br /&gt;
|{{Down|Scheduler|Niagara_Quickstart#Submitting_jobs}}&lt;br /&gt;
|{{Up|File system|Niagara_Quickstart#Storage_and_quotas}}&lt;br /&gt;
|{{Up|Burst Buffer|Burst_Buffer}}&lt;br /&gt;
|-&lt;br /&gt;
|{{Up|Login Nodes|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up|External Network|Niagara_Quickstart#Logging_in}} &lt;br /&gt;
|{{Up|Globus|Globus}}&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Current Messages: --&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;August 10, 2020, 7:30 PM EST:&amp;lt;/b&amp;gt; Scheduler fully operational again.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;August 10, 2020, 3:00 PM EST:&amp;lt;/b&amp;gt; Scheduler partially functional: jobs can be submitted and are running.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;August 10, 2020, 2:00 PM EST:&amp;lt;/b&amp;gt; Scheduler is temporarily inoperational.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;August 7, 2020, 9:15 PM EST:&amp;lt;/b&amp;gt; Network is fixed, scheduler and other services are coming back.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;August 7, 2020, 8:20 PM EST:&amp;lt;/b&amp;gt; Disruption of part of the network in the data centre.  Causes issue with the scheduler, the mist login node, and possibly others. We are investigating.&lt;br /&gt;
 &lt;br /&gt;
&amp;lt;!--  When removing system status entries, please archive them to: https://docs.scinet.utoronto.ca/index.php/Previous_messages --&amp;gt;&lt;br /&gt;
{|style=&amp;quot;border-spacing: 10px;width: 100%&amp;quot;&lt;br /&gt;
|valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== QuickStart Guides ==&lt;br /&gt;
* [[Niagara Quickstart]]&lt;br /&gt;
* [[HPSS | HPSS archival storage]]&lt;br /&gt;
* [[SOSCIP_GPU | SOSCIP GPU cluster]]&lt;br /&gt;
* [[Mist| Mist Power 9 GPU cluster]]&lt;br /&gt;
* [[Teach|Teach cluster]]&lt;br /&gt;
* [[FAQ | FAQ (frequently asked questions)]]&lt;br /&gt;
* [[Acknowledging SciNet]]&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;margin: 1em; padding:1em; padding-top:.1em; border:2px solid #000; background-color:#fff; border-radius:7px; width: 49.5%&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== Tutorials, Manuals, etc. ==&lt;br /&gt;
* [https://support.scinet.utoronto.ca/education/browse.php SciNet education material]&lt;br /&gt;
* [https://www.youtube.com/c/SciNetHPCattheUniversityofToronto SciNet's YouTube channel]&lt;br /&gt;
* [[Modules specific to Niagara|Software Modules specific to Niagara]] &lt;br /&gt;
* [[Commercial software]]&lt;br /&gt;
* [[Burst Buffer]]&lt;br /&gt;
* [[SSH Tunneling]]&lt;br /&gt;
* [[SSH#Two-Factor_authentication|Two-Factor Authentication]]&lt;br /&gt;
* [[Visualization]]&lt;br /&gt;
* [[Running Serial Jobs on Niagara]]&lt;br /&gt;
* [[Jupyter Hub]]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Ymeiron</name></author>
	</entry>
</feed>