Difference between revisions of "Installing your own Python Modules"

From SciNet Users Documentation
Latest revision as of 17:26, 5 December 2020

There are many optional and conflicting packages for Python that users could potentially want (see e.g. http://pypi.python.org/pypi). Therefore, users need to install these additional packages locally in their home directories. In fact, there is no choice, as users do not have permissions to install packages system-wide.

Python provides a number of ways to install packages, the most common of which are the pip and conda commands. By default, these commands would install in the same directory as the one in which the python executable lives, but python provides a number of ways for users to install libraries in their home directories instead.

One way to do this is with pip's --user option, but you shouldn't use it. That approach has been mostly superseded by virtual environments, and we do not recommend the --user option because it can interfere with other Python environments.
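To see whether earlier --user installs might affect you, it can help to look at the per-user site directory. The following is a minimal sketch, assuming a python3 executable is on your PATH; the variable name user_site is only illustrative.

```shell
# Ask Python where "pip install --user" would place packages.
# The "|| true" keeps the script going if python3 exits with a non-zero
# status, which it may do when the user site directory is disabled.
user_site="$(python3 -m site --user-site || true)"
echo "User site directory: ${user_site}"

# List what is installed there, if anything.
if [ -d "${user_site}" ]; then
    ls "${user_site}"
else
    echo "No --user packages found."
fi
```

If packages show up there and cause trouble, setting the environment variable PYTHONNOUSERSITE=1 tells Python to skip that directory.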

Virtual environments are a standard in Python to create isolated Python environments. This is useful when certain modules or certain versions of modules are not available in the default python environment.

Virtual environments can be used either with the regular python modules or the intelpython/anaconda modules.

Using Virtualenv in Regular Python

Creation

First load a python module, e.g.

   module load NiaEnv/2019b python/3.6

or

   module load NiaEnv/2019b python/3.8

Then create a directory for the virtual environments. One can put a virtual environment anywhere, but this directory structure is recommended:

   mkdir ~/.virtualenvs
   cd ~/.virtualenvs

Now create your first virtualenv, here called myenv (choose any name you like):

   virtualenv --system-site-packages ~/.virtualenvs/myenv

The "--system-site-packages" flag will use the system-installed versions of packages rather than installing them anew (the list of these packages can be found on the Python wiki page). This will result in fewer files created in your virtual environment. After that you can activate that virtual environment:

   source ~/.virtualenvs/myenv/bin/activate 

As you are in the virtualenv now, you can just type pip install <required module> to install any module into your virtual environment.

To go back to the normal python installation simply type

   deactivate
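The creation steps above can be collected into one script that is safe to run more than once. This is only a sketch: ENV_DIR is an illustrative variable name, and the guards around module and virtualenv are there so that the script does nothing harmful on a machine where those commands are missing.

```shell
#!/bin/bash
# Location of the environment, following the directory layout recommended above.
ENV_DIR="$HOME/.virtualenvs/myenv"

# Load a python module if the module command exists (i.e. on the cluster).
if command -v module >/dev/null 2>&1; then
    module load NiaEnv/2019b python/3.8
fi

# mkdir -p does not complain if the directory already exists.
mkdir -p "$HOME/.virtualenvs"

# Create the environment only if it is not there yet.
if [ ! -f "$ENV_DIR/bin/activate" ] && command -v virtualenv >/dev/null 2>&1; then
    virtualenv --system-site-packages "$ENV_DIR"
fi

# Activate it, show which environment is active, and leave it again.
if [ -f "$ENV_DIR/bin/activate" ]; then
    source "$ENV_DIR/bin/activate"
    echo "Active environment: $VIRTUAL_ENV"
    deactivate
fi
```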

Command line and job usage

You need to activate the appropriate environment every time you log in, and at the start of all your job scripts. However, the installation of packages only needs to be done once. In the NiaEnv/2019b stack, it is not necessary to load the python module before activating the environment, while in the NiaEnv/2018a stack, you need to load the python module before activating the environment.
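A job script that follows this advice might start as sketched below. This assumes jobs are submitted through Slurm, so the #SBATCH values are placeholders, and activate_env is a hypothetical helper introduced here for illustration, not a SciNet command.

```shell
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --time=00:15:00
#SBATCH --job-name=venv-example

# Hypothetical helper: activate the environment in the given directory,
# or report a problem and return a non-zero status.
activate_env() {
    local env_dir="$1"
    if [ -f "$env_dir/bin/activate" ]; then
        source "$env_dir/bin/activate"
    else
        echo "No virtual environment found in $env_dir" >&2
        return 1
    fi
}

# In the NiaEnv/2018a stack, the "module load" of the python module
# would go here, before the activation.

if activate_env "$HOME/.virtualenvs/myenv"; then
    # Replace this line with the actual work of the job.
    python --version
fi
```

Checking the return status of the activation avoids silently running the job with the wrong Python when the environment path is mistyped.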

Usage of your virtual environment by others

Sharing a virtual environment with another user is easy. As long as the directory containing the virtual environment is readable by that other user (which on Niagara is the default when that user is in the same group as the directory), then they simply have to source the activate file in the bin directory of that environment, e.g.

   source /home/g/group/user/.virtualenvs/myenv/bin/activate
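Before sourcing someone else's environment, the other user can check that it is actually readable. The sketch below uses a hypothetical helper named can_use_env; the path is the example path from above.

```shell
# Hypothetical helper: succeeds if the activate script can be read and
# the bin directory can be entered by the current user.
can_use_env() {
    local env_dir="$1"
    [ -r "$env_dir/bin/activate" ] && [ -x "$env_dir/bin" ]
}

SHARED_ENV="/home/g/group/user/.virtualenvs/myenv"
if can_use_env "$SHARED_ENV"; then
    echo "OK: run 'source $SHARED_ENV/bin/activate'"
else
    echo "Cannot read $SHARED_ENV; ask the owner to check its permissions."
fi
```

If the check fails, the owner could inspect the permissions with ls -ld and, if needed, grant group read access with chmod -R g+rX on the environment directory.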

Usage in the Jupyter Hub

You can use your virtual environment in Niagara's Jupyter Hub, but a few additional steps are required to make the JupyterHub aware of your environment and to offer it as one of the possible "kernels" for new notebooks.

After having activated your environment, execute the following three commands:

   pip install ipykernel
   python -m ipykernel install --name NAME --user
   venv2jup

The first installs the packages needed to interface with jupyter as a kernel, the second puts an entry in the .local/share/jupyter directory in your home directory, in which the jupyterhub looks for possible kernels. The final command corrects some paths and checks that everything is set up properly. This procedure works for NiaEnv/2019b, but may fail for NiaEnv/2018a.
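To confirm that the kernel was registered, one can list the per-user kernel directory. The sketch below assumes the usual default location on Linux for kernels installed with --user, namely $HOME/.local/share/jupyter/kernels; list_user_kernels is a hypothetical helper name.

```shell
# Hypothetical helper: print one kernel name per line, or fail if the
# kernel directory does not exist. An alternative directory can be
# passed as the first argument.
list_user_kernels() {
    local kernel_dir="${1:-$HOME/.local/share/jupyter/kernels}"
    if [ -d "$kernel_dir" ]; then
        ls -1 "$kernel_dir"
    else
        echo "No user kernels found in $kernel_dir" >&2
        return 1
    fi
}

list_user_kernels || echo "Nothing registered yet."
```

If the jupyter command is available in your environment, jupyter kernelspec list gives similar information.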

For conda environments that were installed in .conda/envs, the jupyter notebook should pick them up automatically.

Using Virtual Environments in Intelpython/Anaconda

Creation

One can use the same kind of virtual environments for the intelpython and conda modules as for regular modules. However, environments are also built into Anaconda, see https://conda.io/docs/user-guide/tasks/manage-environments.html. These "conda environments" are not the same as regular virtual environments, as they can contain general packages, such as compilers. This makes conda environments much more flexible, but it also means that they do not cooperate well with other software modules on Niagara. Therefore, you should always use regular virtual environments and pip on Niagara and not conda, unless you have a good reason not to.

First, you just need to load a conda-like module, e.g.

   module load NiaEnv/2019b intelpython3

Then, you create a conda environment:

   conda create -n myPythonEnv python=3.6

(conda puts the environment in the directory $HOME/.conda/envs/myPythonEnv)

Next, you activate your conda environment:

   source activate myPythonEnv

At this point you are in your own environment and can just do the installation of any package that you need, e.g.

   pip install myFAVpackage

or

   conda install myFAVpackage

To go back to the normal python installation, type

   source deactivate
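Because conda places environments under $HOME/.conda/envs, as noted above, a script can check whether an environment already exists before trying to create it. This is a sketch only; conda_env_exists is a hypothetical helper, not a conda command.

```shell
# Hypothetical helper: succeeds if an environment directory with the
# given name exists. The second argument optionally overrides the
# directory that holds the environments.
conda_env_exists() {
    local name="$1"
    local envs_dir="${2:-$HOME/.conda/envs}"
    [ -d "$envs_dir/$name" ]
}

if conda_env_exists myPythonEnv; then
    echo "myPythonEnv exists; activate it with: source activate myPythonEnv"
else
    echo "myPythonEnv not found; create it with: conda create -n myPythonEnv python=3.6"
fi
```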

Command line and job usage

You need to load the intelpython/anaconda module and activate the appropriate environment every time you log in, and at the start of all your job scripts. However, the installation of packages only needs to be done once.

Usage in the Jupyter Hub

You can use conda environments in Niagara's Jupyter Hub. If they were installed in .conda/envs, the jupyter notebook should pick them up automatically.

Installing the Scientific Python Suite

For many scientific codes the packages numpy, scipy, matplotlib, pandas and ipython are used. Versions of these are already in the python modules (except for the regular python modules in the NiaEnv/2018a stack).

However, if you need different versions, you could start your virtual environment without --system-site-packages. In that case, for regular python modules, please install the versions of packages with an intel- prefix, if they exist, so that you will get the most optimized version of each package.
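To find out which versions of these packages the currently active environment provides, a short loop like the following can be used. It is a sketch; report_versions is a hypothetical helper, and the ipython package is imported under its module name IPython.

```shell
# Hypothetical helper: print one line per package with its version, or
# "not available" if the import fails (or no python command is found).
report_versions() {
    local pkg
    for pkg in numpy scipy matplotlib pandas IPython; do
        python -c "import $pkg; print('$pkg', $pkg.__version__)" 2>/dev/null \
            || echo "$pkg not available"
    done
}

report_versions
```

Running it once with the environment activated and once without shows which packages come from the python module and which from the environment.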