Parallel Debugging with DDT

From SciNet Users Documentation
Revision as of 19:01, 12 December 2018 by Rzon

ARM DDT

For parallel debugging, SciNet has DDT ("Distributed Debugging Tool") installed on all our clusters. DDT is a powerful, GUI-based commercial debugger by ARM (formerly by Allinea). It supports the programming languages C, C++, and Fortran, and the parallel programming paradigms MPI, OpenMP, and CUDA. DDT can also be very useful for serial programs. Because its interface is graphical, DDT needs X11 graphics support: pass the '-X' or '-Y' option to your ssh commands so that the graphics can find their way back to your screen ("X forwarding").
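One quick way to confirm that X forwarding is active is to check the DISPLAY environment variable, which ssh sets on the remote side when '-X' or '-Y' is in effect. The helper below is a minimal sketch; the function name x11_status is made up for illustration and is not part of DDT or the SciNet environment.

```shell
# x11_status: hypothetical helper, not part of DDT or the SciNet setup.
# ssh sets DISPLAY on the remote host when X forwarding (-X or -Y) is active.
x11_status() {
    if [ -n "${DISPLAY:-}" ]; then
        echo "X11 display available: ${DISPLAY}"
    else
        echo "DISPLAY is not set; reconnect with ssh -X or ssh -Y"
        return 1
    fi
}

# Report the status; '|| true' keeps a script running under 'set -e' alive.
x11_status || true
```

If the function reports that DISPLAY is not set, log out and reconnect with forwarding enabled before starting the debugger.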

The most recent version of ddt installed on Niagara is DDT 18.2. The ddt license allows a total of up to 128 processes to be debugged simultaneously; this limit is shared among all users.

To use ddt, ssh in with X forwarding enabled, load your usual compiler and MPI modules, compile your code with '-g', and load the ddt module:

module load ddt
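As an illustration of the compile step, the sketch below prints a debug-friendly compile line for a single source file. The names here are assumptions, not taken from this page: mpicc is the C compiler wrapper that MPI modules commonly provide, myapp.c is a placeholder source file, and -O0 is an optional addition that keeps optimized-away variables and reordered lines from confusing the debugger. Only '-g' is actually required.

```shell
# debug_compile_cmd: hypothetical helper that prints (does not run) a compile line.
# $1 = compiler or MPI wrapper (assumed: mpicc), $2 = source file (placeholder).
debug_compile_cmd() {
    dcc_compiler="$1"
    dcc_src="$2"
    dcc_exe="${dcc_src%.*}"    # strip the extension: myapp.c -> myapp
    # -g adds the debug symbols DDT needs; -O0 is an optional assumption here.
    echo "${dcc_compiler} -g -O0 -o ${dcc_exe} ${dcc_src}"
}

debug_compile_cmd mpicc myapp.c
```

The printed line can be pasted into the shell once it matches your own compiler and file names.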

You can then start ddt with one of the following commands:

ddt

ddt <executable compiled with -g flag>

ddt <executable compiled with -g flag> <arguments>

The first time you run DDT, it will set up configuration files. It puts these in the hidden directory $SCRATCH/.allinea.

Note that most users will debug on the login nodes of the clusters (nia-login0{1-3,5-7}), but this is only appropriate if the number of MPI processes and threads is small and the memory usage is not too large. If your debugging requires more resources, you should run it through the queue. On Niagara, an interactive debug session will suit most debugging purposes.

Parallel Debugging in an Interactive Session on Niagara

By requesting a job from the 'debug' partition on Niagara, you can have access to at most 4 nodes, i.e., a total of 160 physical cores (or 320 virtual cores, using hyperthreading), for your exclusive, interactive use. Starting from a Niagara login node, you would request a debug session with the following command:

debugjob <numberofnodes>

where numberofnodes is 1, 2, 3, or 4.
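The core counts quoted above imply 40 physical cores and 80 hyperthreads per node (160 / 4 and 320 / 4), assuming all debug nodes are identical. Under that assumption, the sketch below validates a node count and prints the resources it corresponds to. The name debug_cores is a hypothetical helper, and the function does not call debugjob itself.

```shell
# debug_cores: hypothetical helper; per-node figures are inferred from the
# 4-node totals on this page (160 physical cores, 320 hyperthreads).
debug_cores() {
    dc_nodes="$1"
    case "$dc_nodes" in
        1|2|3|4) ;;
        *) echo "numberofnodes must be 1, 2, 3, or 4" >&2; return 1 ;;
    esac
    echo "${dc_nodes} node(s): $((dc_nodes * 40)) physical cores, $((dc_nodes * 80)) hyperthreads"
}

debug_cores 2
```

Keep in mind that the DDT license limit of 128 processes still applies, whatever the number of cores in the session.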

This command will get you a prompt on a compute node (or on the 'head' node if you've asked for more than one node). Reload any modules that your application needs (e.g. module load intel openmpi), as well as the ddt module.

Note that on compute nodes, $HOME is read-only, so unless your code is on $SCRATCH, you cannot recompile it (with '-g') in the debug session; this should have been done on a login node.
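Before attempting to rebuild inside a debug session, it can help to check whether the build directory is writable from the compute node. A minimal sketch follows; can_build_in is a hypothetical helper name, and the check relies only on the standard shell test '-w'.

```shell
# can_build_in: hypothetical helper. Succeeds if the directory exists and is
# writable, which on compute nodes is expected for $SCRATCH but not for $HOME.
can_build_in() {
    cbi_dir="${1:-.}"
    if [ -d "$cbi_dir" ] && [ -w "$cbi_dir" ]; then
        echo "writable: ${cbi_dir}"
    else
        echo "not writable: ${cbi_dir} (recompile on a login node instead)"
        return 1
    fi
}

# Check $SCRATCH if it is defined, otherwise the current directory.
can_build_in "${SCRATCH:-$PWD}" || true
```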