Jupyter Hub

From SciNet Users Documentation
Revision as of 02:23, 29 May 2019 by Rzon (talk | contribs) (Using virtual environments in the JupyterHub)

Niagara has a node that has been designated as a Jupyter Hub. This node can be used to run your Jupyter Notebook sessions.

Jupyter Hub node on Niagara

The Niagara Jupyter Hub server is a single node, with 93 GB of memory and 20 cores. You can access the node directly, from outside of SciNet, via jupyter.scinet.utoronto.ca.

  • Point your browser to 'jupyter.scinet.utoronto.ca' and log in with your Compute Canada account.
  • The browser should now show the files in your $HOME on Niagara. (If not, try reloading the page; it may have timed out.)
  • To see your files on $SCRATCH, you need to have a symbolic link to $SCRATCH in your $HOME folder. This can be done by typing, once, in a terminal:
   ln -sT $SCRATCH $HOME/scratch
  • You can open or create Python 2, Python 3, and R notebooks.
  • Many Python packages are pre-installed.
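
The symbolic link from the `ln` command above can also be created from a notebook cell. The following is a minimal sketch using only the Python standard library; the helper name `link_scratch` is hypothetical, and it assumes the scratch and home directories are passed in as plain paths.

```python
import os


def link_scratch(scratch, home, name="scratch"):
    """Create home/name as a symbolic link to scratch (hypothetical helper).

    If something already exists at that path, it is left untouched.
    Returns the path of the link.
    """
    link = os.path.join(home, name)
    # lexists() is also true for a dangling symlink, which exists() would miss.
    if os.path.lexists(link):
        return link
    os.symlink(scratch, link)
    return link
```

On Niagara this could be called as `link_scratch(os.environ["SCRATCH"], os.environ["HOME"])`, assuming both variables are set in the notebook's environment.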

[Screenshot: Jupyterscreen5.png]

Tips to get started

  • Jupyter can also browse your (Niagara) files and edit them.
  • Use the 'New' button to create a new Python notebook.
  • Click on the 'Conda' tab to see your existing conda environments.
  • Give your notebooks reasonable names.
  • To execute a Python input line, press `Shift-Enter`.
  • Save your work periodically (even though there is autosave).
  • To work similarly to `ipython --pylab`, do:
In [1]: from pylab import *
        %matplotlib notebook

Using virtual environments in the JupyterHub

Starting a new virtual environment from the JupyterHub

To start a new virtual environment from the JupyterHub, open a terminal session ('New' button, then 'Terminal') and run something like this on the command line:

module load NiaEnv/2019b python/3.6.8    # or: module load CCEnv nixpkgs python/3.6.3
virtualenv --system-site-packages ~/.virtualenvs/NEWENVNAME
source ~/.virtualenvs/NEWENVNAME/bin/activate
pip install ipykernel
python -m ipykernel install --user --name=NEWENVNAME

After reloading the JupyterHub page, you should see a "NEWENVNAME" menu item in the "New" dropdown button of the JupyterHub.
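
To see what the last command registers, it can help to look at the kernel specification it writes. The sketch below builds a dictionary that is roughly what `ipykernel install` stores in a `kernel.json` file. The function name `kernel_spec_for` is hypothetical, and the exact fields may differ between ipykernel versions.

```python
import json
import os


def kernel_spec_for(env_dir, name):
    """Approximate the kernel.json contents for a virtual environment.

    env_dir is the environment's directory, name the value given to --name.
    This is an illustration, not the code ipykernel itself runs.
    """
    python = os.path.join(env_dir, "bin", "python")
    return {
        # Jupyter replaces {connection_file} when it launches the kernel.
        "argv": [python, "-m", "ipykernel_launcher", "-f", "{connection_file}"],
        "display_name": name,
        "language": "python",
    }


env = os.path.expanduser("~/.virtualenvs/NEWENVNAME")
print(json.dumps(kernel_spec_for(env, "NEWENVNAME"), indent=1))
```

The important part is that `argv` points at the Python interpreter inside the virtual environment, so notebooks started with this kernel see the packages installed there.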

Existing environments

If you have already created a virtual environment on Niagara and wish to use it in a notebook, you need to install an additional package and make the environment known to the JupyterHub. After activating your environment on the command line, type the following:

pip install ipykernel
python -m ipykernel install --user --name=ENVNAME

Then, after reloading the JupyterHub page, you should see an "ENVNAME" menu item in the "New" dropdown button of the JupyterHub.

Note: this works with virtual environments that are created from the python modules in the NiaEnv/2019b stack and the CCEnv stack, but not with those from the NiaEnv/2018a stack.
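
If an environment does not show up in the "New" menu, one thing to check is whether it was registered at all. The usual tool for this is `jupyter kernelspec list` in a terminal. The sketch below does a similar check with the standard library only. The function name `list_user_kernels` is hypothetical, and the default location `~/.local/share/jupyter/kernels` is an assumption based on the usual Linux layout for `--user` installs.

```python
import json
from pathlib import Path


def list_user_kernels(kernels_dir=None):
    """Map kernel directory names to their display names.

    kernels_dir defaults to the assumed per-user location on Linux.
    Returns an empty dict if the directory does not exist.
    """
    if kernels_dir is None:
        root = Path.home() / ".local" / "share" / "jupyter" / "kernels"
    else:
        root = Path(kernels_dir)
    kernels = {}
    if not root.is_dir():
        return kernels
    for spec_file in sorted(root.glob("*/kernel.json")):
        with open(spec_file, encoding="utf-8") as f:
            spec = json.load(f)
        name = spec_file.parent.name
        kernels[name] = spec.get("display_name", name)
    return kernels
```

Calling `list_user_kernels()` in a notebook cell should then include an entry for each environment registered with the commands above, provided the assumed directory is the one in use.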

Using the Jupyter Hub responsibly

Jupyter notebooks are a useful environment for data exploration, pipeline development, and other hands-on work. They are not, however, intended for heavy production data crunching. If you need to do heavy data crunching, you should develop a script and run that work on the compute nodes.

Furthermore, the Jupyter Hub is a shared resource. Other users will be using the node at the same time you are using it. Please do not use more than a few cores at a time, and do not use an excessive amount of memory.
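
One way to follow this advice inside a notebook is to cap the number of threads that numerical libraries may start, and to keep an eye on memory use. The sketch below is a minimal illustration; the helper names and the choice of two threads are assumptions rather than a SciNet policy. The environment variables are the ones commonly read by OpenMP, OpenBLAS and MKL, and they only take effect if set before the numerical libraries are imported.

```python
import os
import resource

# Variables commonly honoured by OpenMP-, OpenBLAS- and MKL-based libraries.
THREAD_VARS = ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS")


def limit_threads(n):
    """Ask threaded numerical libraries to use at most n threads.

    Call this in the first cell, before importing numpy or similar packages.
    """
    for var in THREAD_VARS:
        os.environ[var] = str(n)


def peak_memory_mb():
    """Return the peak resident memory of this kernel process in MB.

    On Linux, ru_maxrss is reported in kilobytes.
    """
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024


limit_threads(2)  # illustrative value for "a few cores"
print(f"Peak memory so far: {peak_memory_mb():.0f} MB")
```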

Advantages and Disadvantages of a Notebook Environment

Drawbacks:

  • Notebook files (.ipynb) are not scripts.
  • Notebooks do not (always) work well with version control.
  • The environment is designed to run in a browser.
  • The back-end runs on shared resources.
  • Graphics are inline, which is great for quick exploration but makes tweaking a plot harder (IPython+X works better for this).
  • You can jump around in the notebook, and execute different parts: it can be hard to keep track of what you did.

Advantages:

  • You can jump around in the notebook, and execute different parts: this makes exploration, experimentation and debugging easier.
  • Auto-save.
  • You can rerun parts of your code (while, e.g., keeping large data in memory).
  • You can add text portions, making your notebook more like an article.
  • This in turn can be useful for sharing, demos, and teaching.
  • You can still export as a script.
  • The JupyterHub also provides a terminal.
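
On the point that a notebook can still be exported as a script: this is normally done from the notebook's File menu or with `jupyter nbconvert --to script`. To illustrate what such an export involves, the sketch below extracts the code cells from an .ipynb file using only the standard library. The function name `notebook_to_script` is hypothetical, and it assumes the usual version 4 notebook layout with a top-level "cells" list.

```python
import json


def notebook_to_script(ipynb_path, script_path):
    """Write the code cells of a notebook to a plain text script.

    Markdown cells and outputs are skipped. Returns the number of code
    cells written. IPython magics such as %matplotlib are copied as-is,
    so the result may need editing before it runs under plain Python.
    """
    with open(ipynb_path, encoding="utf-8") as f:
        notebook = json.load(f)
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # The source may be stored as one string or as a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        chunks.append(source.rstrip("\n"))
    with open(script_path, "w", encoding="utf-8") as f:
        f.write("\n\n".join(chunks) + "\n")
    return len(chunks)
```

A script produced this way can then be submitted to the compute nodes for heavier work, which a notebook on the shared hub is not meant for.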