Some Sample Codes for MPI on kali


This page can be reached via kali's webpage at http://www.math.umbc.edu/kali.

Purpose of this Document

This page collects a few sample codes that I have used myself to test whether I can compile and run MPI code successfully. In principle, the codes can be used on any parallel cluster with MPI, but the scheduler scripts and the usage information included here are particular to kali.

If you find mistakes on this page or have suggestions, please contact me.


Overview

When using a new system or a new programming language for the first time, it is a good idea to start with some really simple test codes. The point of these tests is to confirm that the software setup in your account is correct. The codes are all plain ANSI C, and I use the compile command mpicc. It is assumed here that you have gone through the steps outlined in the initial setup, so that the command mpicc is now well-defined in your account.

Also, I assume that you have read the information on how to run on kali using our scheduler, so that you know how to use the submission scripts included with each code for your convenience. Each script is written to request 4 processes (hence the p4 extension); for a full test, one should repeat each test with modified scripts for the other cases, from 1 process to 64 processes.

The following codes assume that you can compile and run serial code. Starting from there, a Hello-world program comes first, followed by a progression of more involved codes that test particular features. The documentation here only discusses my motivation for each code; to see exactly what it does, you have to read the source.

A final suggestion is to create a subdirectory for these tests, containing one subdirectory for each test. The main point is to avoid running the tests from the home directory, because doing so may hide problems with, for instance, the redirection of stdout and stderr into files and their movement to the correct directory.


Note on Files

In each case, there is a source code in C and a qsub submission script; for the third example, there is additionally an input file. In each case, I also post the stdout file, ending in OU, for reference. I am not posting the stderr files, ending in ER, because they were empty in all cases, that is, no errors occurred.
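To give an idea of what such a submission script looks like, here is a generic sketch in PBS style; the job name, the resource request, and the mpirun call are assumptions here, and the actual scripts posted with each code (and kali's scheduler syntax) may differ in detail.

```shell
#!/bin/bash
# Generic sketch of a qsub submission script requesting 4 processes;
# the directives below are assumptions, not kali's actual settings.
#PBS -N hello
#PBS -l nodes=4
# change to the directory the job was submitted from, so that
# stdout/stderr and any output files end up in the right place
cd $PBS_O_WORKDIR
mpirun -np 4 ./hello
```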


Hello-World with MPI

The simplest MPI program would contain only the MPI_Init and MPI_Finalize commands; or actually, just to test mpicc, one could use a C code without any MPI commands at all and with only the #include "mpi.h" line added. We do not want to be this paranoid here, though.

So, we start here with the next more complicated MPI program that you can write: it just prints out a string. To make this string more interesting and useful as a test, it already incorporates useful MPI commands and outputs their results. In particular, on a system with a scheduler, it is useful to be able to see which node each process runs on; for instance, this allows you to check that the scheduler did not run more than one process per CPU. (Hence, this is actually more advanced than a true Hello-world program, which would only print out a constant string.)

Notice that our procedure for using the scheduler includes the redirection of stdout and stderr, so one of the not-so-trivial issues is to test first of all whether these files end up in the current directory, as desired.
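The program described above can be sketched as follows; this is a minimal version of the idea, and the actual hello.c posted below may differ in detail.

```c
/* Minimal sketch of an MPI hello-world that also reports the rank,
   the total number of processes, and the node each process runs on. */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int id, np, namelen;
    char name[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &id);     /* rank of this process */
    MPI_Comm_size(MPI_COMM_WORLD, &np);     /* number of processes */
    MPI_Get_processor_name(name, &namelen); /* node this process runs on */

    printf("Hello from process %04d out of %04d on node %s\n",
           id, np, name);

    MPI_Finalize();
    return 0;
}
```

Note that the order in which the processes' lines appear in the stdout file is unpredictable, since each process prints independently.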

Files:


List of Nodes Used

I have found it very useful to record in the log file of my code's stdout the directory in which a run was conducted. This information is obtained by a system call to pwd. The point is that in production runs, you will run many instances of your code, typically in different directories that are distinguished only by subtle differences in their input files; when you later transfer or combine the results, it is good to be certain about which directory a result came from. That is why the code outputs this information first to stdout.

Additionally, I want to output the nodes used, but in a useful and readable order (compare the order of the previous sample code's output!). To get the order right, I use MPI_Send/MPI_Recv command pairs; this also tests the next level of MPI commands beyond hello.c, namely actual communication commands. Notice that this output is encapsulated in a function; in fact, I use this function, and the above output from pwd, in all my parallel codes. For good measure, this code also tests the equally basic MPI_Bcast communication command.
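The core idea of the ordered output can be sketched as follows: process 0 prints the working directory first, then receives each node name in rank order and prints it. This is only an illustration of the technique described above (without the function encapsulation or the MPI_Bcast test), and the actual posted source may differ in detail.

```c
/* Sketch: print pwd first, then the node names in rank order by
   having process 0 receive each name via MPI_Send/MPI_Recv pairs. */
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int id, np, ip, namelen;
    char name[MPI_MAX_PROCESSOR_NAME];
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &id);
    MPI_Comm_size(MPI_COMM_WORLD, &np);
    MPI_Get_processor_name(name, &namelen);

    if (id == 0) {
        /* record the run directory first, via a system call to pwd */
        system("pwd");
        printf("process %04d on node %s\n", 0, name);
        for (ip = 1; ip < np; ip++) {
            /* receive the node name from process ip, in rank order */
            MPI_Recv(name, MPI_MAX_PROCESSOR_NAME, MPI_CHAR, ip, 0,
                     MPI_COMM_WORLD, &status);
            printf("process %04d on node %s\n", ip, name);
        }
    } else {
        /* send my node name to process 0 */
        MPI_Send(name, MPI_MAX_PROCESSOR_NAME, MPI_CHAR, 0, 0,
                 MPI_COMM_WORLD);
    }

    MPI_Finalize();
    return 0;
}
```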

Files:


I/O from Input File and to Multiple Output Files

A typical program in my research reads an input file (containing a few parameter values) and creates output files with the data generated by the code. More precisely, on a parallel cluster, I need to be sure that every parallel process can read the same input file, but is able to write to a distinct output file; the output files are distinguished by the process number in their file names. This is what this code tests.

After a run of this code, you should have as many output files as processes used, called testio-p00.out, testio-p01.out, etc. One of the more subtle points of this test is to make sure that the input file is found and that the output files all end up in the correct directory.
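The I/O pattern described above can be sketched as follows; the input file name testio.in and its contents (a single integer) are assumptions here for illustration, and the actual posted source and input file may differ in detail.

```c
/* Sketch: every process reads the same input file, and each process
   writes its own output file named by its process number
   (testio-p00.out, testio-p01.out, ...). */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int id, np, n = 0;
    char filename[64];
    FILE *in, *out;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &id);
    MPI_Comm_size(MPI_COMM_WORLD, &np);

    /* every process opens and reads the same input file; here it is
       assumed to contain one integer parameter */
    in = fopen("testio.in", "r");
    if (in != NULL) {
        fscanf(in, "%d", &n);
        fclose(in);
    }

    /* each process writes its own output file, distinguished by the
       process number in the file name */
    sprintf(filename, "testio-p%02d.out", id);
    out = fopen(filename, "w");
    if (out != NULL) {
        fprintf(out, "process %02d of %02d read n = %d\n", id, np, n);
        fclose(out);
    }

    MPI_Finalize();
    return 0;
}
```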

Files:


Copyright © 2005 by Matthias K. Gobbert. All Rights Reserved.
This page version 1.1, May 2005.