Python

From SciNet Users Documentation

Python is a programming language that continues to grow in popularity for scientific computing. Code is very quick to write in Python, but the resulting software is generally much slower than equivalent C or Fortran; be wary of doing too much compute-intensive work in Python.
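
As a rough illustration of that cost, the sketch below times the same summation written as an explicit Python loop and handed to the built-in sum(), which runs in compiled code. The function names and the problem size are chosen here for illustration only, and the timings will differ from machine to machine.

```python
import timeit


def total_loop(n):
    """Add the integers 0..n-1 with an explicit Python loop."""
    total = 0
    for i in range(n):
        total += i
    return total


def total_builtin(n):
    """Same result, but the accumulation happens inside the built-in sum()."""
    return sum(range(n))


if __name__ == "__main__":
    n = 1_000_000  # illustrative problem size
    # Time five repetitions of each variant and report the totals.
    t_loop = timeit.timeit(lambda: total_loop(n), number=5)
    t_builtin = timeit.timeit(lambda: total_builtin(n), number=5)
    print(f"explicit loop: {t_loop:.3f} s")
    print(f"built-in sum:  {t_builtin:.3f} s")
```

The same idea applies more generally: where possible, let compiled library routines do the heavy arithmetic rather than Python-level loops.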


Python on Niagara

We currently have three families of Python installed on Niagara.

  • Anaconda
  • Intel Python
  • regular Python

Here we describe the differences between these packages.

Anaconda

Anaconda is a pre-assembled, self-consistent set of commonly used Python packages. The source for this collection is here. There are two types of Anaconda Python available:

  • The whole Anaconda software stack (for example, the anaconda3/5.1.0 module loads Python 3.6).
  • Anaconda's Python, with all the Python packages, but without the rest of the Anaconda stack (gcc, bzip2, HDF5/NetCDF tools, etc.). These are the python/2.7.14-anaconda5.1.0 and python/3.6.4-anaconda5.1.0 modules.

As of 9 July 2018 the following Anaconda modules are available:

   $ module avail anaconda
   ----------------- /scinet/niagara/software/2018a/modules/base ------------------
    anaconda2/5.1.0    python/2.7.14-anaconda5.1.0    r/3.4.3-anaconda5.1.0
    anaconda3/5.1.0    python/3.6.4-anaconda5.1.0

Note that none of these modules requires a compiler to be loaded. Also note the presence of the R module: Anaconda now also comes with R, and this package is the R analogue of the Anaconda Python modules.

You load the module in the usual way:

    $ module load anaconda3/5.1.0
    $ python
    >>>

We advise against installing your own Anaconda or Miniconda in your home directory. Instead, start from the anaconda3 module and use conda environments. Installing your own Anaconda or Miniconda would put many more files in your $HOME directory, which might cause trouble with the quota on the number of files you can have on $HOME.

Similarly, we urge you to remove any conda environments that you are not using, to help reduce the number of files on the $HOME file system.
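
To see which environments contribute most to your file count, a short script along the following lines can help. This is only a sketch: it assumes your conda environments are stored under ~/.conda/envs, which may not match your setup, and the function names are made up for this example.

```python
import os


def count_files(root):
    """Count the non-directory entries found anywhere under root."""
    total = 0
    # os.walk does not follow symbolic links to directories by default.
    for _dirpath, _dirnames, filenames in os.walk(root):
        total += len(filenames)
    return total


def files_per_environment(envs_dir):
    """Map each subdirectory of envs_dir to the number of files it holds."""
    counts = {}
    if not os.path.isdir(envs_dir):
        return counts
    for name in sorted(os.listdir(envs_dir)):
        path = os.path.join(envs_dir, name)
        if os.path.isdir(path):
            counts[name] = count_files(path)
    return counts


if __name__ == "__main__":
    # Assumption: environments live in ~/.conda/envs; adjust the path if yours do not.
    envs_dir = os.path.join(os.path.expanduser("~"), ".conda", "envs")
    for name, n in files_per_environment(envs_dir).items():
        print(f"{name}: {n} files")
```

Environments with large counts that you no longer use are the best candidates for removal.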

Intel Python

The Intel Python modules are based on the Anaconda package. Intel has modified the package and optimized its libraries to use the Intel Math Kernel Library (MKL), which should make them faster than the Anaconda modules for some calculations.

As of 9 July 2018 the following Intel Python modules are available:

   $ module avail intelpython
   ----------------- /scinet/niagara/software/2018a/modules/base ------------------
    intelpython2/2018.2    intelpython3/2018.2

Regular Python

The base Python program has also been installed from source. This installation comes with no Python packages installed other than virtualenv and pip. You can use this module, in concert with virtualenv and pip, to build your own virtual environment.

    $ module avail python
    ----------------- /scinet/niagara/software/2018a/modules/base ------------------
    intelpython2/2018.2            python/2.7.14
    intelpython3/2018.2            python/3.6.4-anaconda5.1.0
    python/2.7.14-anaconda5.1.0    python/3.6.5               (D)
    $ module load python/3.6.5
    $ python
    >>>

Running serial Python jobs

As with all serial jobs, if your Python computations do not use multiple cores, you should bundle them together so that all 40 cores of a node are performing work. Examples of this can be found on this page.
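
One way to bundle independent computations from within Python itself is the standard library's multiprocessing module. The sketch below takes that approach, which is not necessarily the one described on the page referred to above. The run_case function is a hypothetical stand-in for your own serial computation, and the list of cases is made up.

```python
import math
import os
from multiprocessing import Pool


def run_case(n):
    """Hypothetical stand-in for one independent serial computation."""
    return sum(math.sqrt(i) for i in range(n))


def run_all(cases, workers=40):
    """Run every case in a pool of worker processes; results keep input order."""
    with Pool(processes=workers) as pool:
        return pool.map(run_case, cases)


if __name__ == "__main__":
    # Illustrative inputs: 80 independent cases of similar cost.
    cases = [100_000 + i for i in range(80)]
    # Use up to 40 workers (one per core of a Niagara node), fewer on smaller machines.
    workers = min(40, os.cpu_count() or 1)
    results = run_all(cases, workers=workers)
    print(f"finished {len(results)} cases using {workers} workers")
```

This works best when there are at least as many cases as cores and the cases take similar amounts of time, so that no core sits idle waiting for the rest.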

Installing your own Python Modules

If you need to install your own Python modules, either in Anaconda or in regular Python, you should set up a virtual environment (or a 'conda' environment if you're using Anaconda). Visit the PythonVirtualEnv page for instructions on how to set this up.
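
Before installing packages, it can be worth confirming which interpreter and environment are actually active. The following sketch does that using only the standard library. It relies on two assumptions: that virtualenv or venv environments can be recognized by comparing sys.prefix with sys.base_prefix (older virtualenv releases set sys.real_prefix instead), and that an activated conda environment sets the CONDA_PREFIX environment variable. The function names are illustrative.

```python
import os
import sys


def in_virtual_environment():
    """Return True if the interpreter runs inside a virtualenv or venv environment."""
    # Older virtualenv releases set sys.real_prefix; newer ones and venv make
    # sys.prefix differ from sys.base_prefix.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or base_prefix != sys.prefix


def active_conda_prefix():
    """Return the active conda environment's path, or None if CONDA_PREFIX is unset."""
    return os.environ.get("CONDA_PREFIX")


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("version:", sys.version.split()[0])
    print("virtual environment active:", in_virtual_environment())
    print("conda prefix:", active_conda_prefix())
```

If neither check reports an active environment, pip would target the module's own installation rather than an environment of your own.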

Using a Jupyter Notebook

You may develop your Python scripts in a Jupyter Notebook on Niagara. A node has been set aside as a Jupyter Hub. See this page for details on how to access that node and develop your code.

Producing Matplotlib Figures on Niagara Compute Nodes and in Job Scripts

The conventional way of producing figures from Python using matplotlib, i.e.,

   import matplotlib.pyplot as plt
   plt.plot(.....)
   plt.savefig(...)

will not work on the Niagara compute nodes. The reason is that pyplot will try to open the figure in a window on the screen, but the compute nodes do not have screens or window managers. There is an easy workaround, however, that sets up a different 'backend' to matplotlib, one that does not try to open a window, as follows:

   import matplotlib as mpl
   mpl.use('Agg')
   import matplotlib.pyplot as plt
   plt.plot(.....)
   plt.savefig(...)

It is essential that the mpl.use('Agg') call comes before pyplot is imported.

SciNet's Python Classes

There is a dizzying amount of documentation available for programming in Python on the Python.org webpage. That being said, each fall SciNet runs two 4-week classes on using Python for research:

  • SCMP142: Introduction to Programming with Python. This class is intended for those with little-to-no programming experience who wish to learn how to program.
  • SCMP112: Introduction to Scientific Computing with Python. This class focusses on using Python to perform research computing.

An excellent set of material for teaching scientists to program in Python is also available at the Software Carpentry homepage.