Pvm

For parallel programming we use pvm (Parallel Virtual Machine).

Communication primitives for c++.
Run your parallel program
Debugging
Program example
Program example with EPPA code instrumentation
More PVM
PVM Configuration.

Installation under Windows.

Communication primitives for c++.

See also the pvm manual available at the lab:

Chapter 5: for explanation (all except 5.7)
Chapter 6: example programs
Appendix B: reference manual of all pvm functions

Run your parallel program

On our LINUX system: Computers you can use to run in parallel: parallel1, parallel2 ... parallel9 (Some might not be running) 9/7/2004 SHOULD BE OK now (mail if problems)
Open 3 consoles:

To run your master program (like a normal program).
To run pvm:

type pvm: you'll get the pvm console; these are the commands you can use in pvm (see chapter 3.7 in the pvm manual):

add machines to the parallel virtual machine: add parallel1 parallel4
to show all the machines of the parallel virtual machine: conf
to show all the running processes with their tids: ps -a
to stop a running process: kill process_tid
to stop all running processes: reset
to show all running jobs: jobs
to quit pvm with stopping pvm (terminate): halt
to quit pvm without stopping pvm: quit

To show the log file (with the messages from the slaves: pvm errors and slave output):

cd /tmp (go to the temporary directory)
tail -f pvmlogfile

with this command you always see the tail of the file, and the display is updated whenever pvm writes something to it (the -f option stands for follow).
each user has their own pvm log file (something like pvml.19441); this file name is always the same

for the student logins, this number is pvml.(19440 + the number of your login)
or type ls pvml* to list all log files

use Control-c to stop this
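
The naming rule for the student log files can be written as a small helper. This is only a sketch: the function name pvmLogFile is made up for illustration and is not part of PVM; the offset 19440 and the /tmp directory are taken from the description above.

```cpp
#include <string>

// Hypothetical helper (not part of PVM): builds the log file name for a
// student login, following the rule pvml.(19440 + login number) in /tmp.
std::string pvmLogFile(int loginNumber) {
    const int base = 19440;  // offset stated above for the student logins
    return "/tmp/pvml." + std::to_string(base + loginNumber);
}
```

For login number 1 this gives /tmp/pvml.19441, the name you would pass to tail -f.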

Terminate:

Don't forget to halt pvm when you have finished experimenting!!
Otherwise your pvm processes keep on running, even after you log out!

Cluster ONLY ACCESSIBLE FROM OUR LINUX SYSTEM 9/9/2004 PROBLEMS WITH node6, node8 & node9... they are unusable for the moment

Do your final experiments (performance measurements) on the cluster (ask your assistant for permission first):

our 8 dedicated cluster computers (333MHz) are called node2, node3, ... , node9

note that node6 is a little slower

our old server parallel6 (biprocessor: 2 x 300MHz) is also available on the cluster as node1
they are connected by a 100Mb/s non-blocking crossbar switch (a dynamic interconnection network)
our server (parallel1) is also connected to this switch
the crunch computers are only accessible from the server!! So work like this:

run your master program on parallel1
also start pvm from parallel1 and add the cluster computers from there

when adding the nodes the first time, you will be asked (by RSA) if you want to continue, answer with yes. Enter your password when asked.

to do this: open a secure shell from your computer to parallel1 with the command ssh parallel1 (in a console)
so run all 3 consoles (used to run your parallel program) on parallel1 (also the one for the log file)

Debugging in pvm

To debug your master program, just use gdb (see doc).
To debug your slaves, start the slave programs with the PvmTaskDebug flag (together with the normal PvmTaskDefault flag) and spawn the slaves on the same machine as the master (don't specify the machine name), as follows:

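int flags = PvmTaskDefault | PvmTaskDebug;  // default spawn, but start each slave under the debugger
string machine = "";                        // empty: do not specify a machine name
// pvm_spawn expects a char*, and a std::string cannot be cast to char* directly, so pass its buffer
pvm_spawn(task, 0, flags, const_cast<char*>(machine.c_str()), nbr, tids);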

When the master spawns the slave, it will be in a gdb debug window, where you can debug your program as always with gdb.

You should start the slave program yourself with r (run), after having put breakpoints.
Quit the debug window with q, and then press ENTER to close the window.

Program Example

Download the code

The .h files, the .cc files and the project file

tasks.cc, tasks.h (the problems)
parallelAlg.cc, parallelAlg.h (the parallel algorithm)
problem.cc (contains the main, connects the task with the algorithm)
standardlib.h, standardlib.cc (standard functions that we use at our lab)
problem.pro (project file, used to create the makefile with qmake)
Create_tar_file.sh (shell script with command to create the compressed archive file with all these files, see doc about LINUX scripts)

Download the code all at once with this compressed archive file: pvmExample.tgz, extract the files in a new directory with command: tar -zxf pvmExample.tgz

Run the program

create the makefile starting from the project file problem.pro with qmake and compile it (note that your executable problem is put in the pvm3/bin/LINUX directory.)

qmake problem.pro
make all

run your parallel program
try to understand the program (documentation)
Study it:

See all program arguments: problem -h
Give the array size: problem -s30
Run on more computers: problem -p3
Run in debug mode (see Debugging in pvm): problem -d

Also run the master under gdb: gdb problem

Check our Quicklist and

page!!

Exercises

Change the task of the parallel program, so that it checks whether the parameter is a prime
Do a small performance analysis:

Count the number of computations that each slave performs
Also measure the computation time of each slave
Is there a load imbalance?

Measure the communication time and the ratio communication versus computation
Solve the load imbalance problem!

eg: divide the given range in a great number (> p) of subranges. Send the first subranges to the slaves, send the next subranges whenever a slave has finished with its subrange.
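
The hint above can be sketched without any PVM calls. The code below is an illustration under assumptions, not the solution of the example program: splitRange and simulateDispatch are hypothetical names, and the cost values stand in for measured computation times. In the real program the master would send each subrange with pvm_send and wait for a result before handing out the next one.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Hypothetical helper: cut the half-open range [begin, end) into `chunks`
// subranges whose sizes differ by at most one element.
std::vector<std::pair<long, long>> splitRange(long begin, long end, int chunks) {
    std::vector<std::pair<long, long>> parts;
    const long total = end - begin;
    if (total <= 0 || chunks <= 0) return parts;
    chunks = static_cast<int>(std::min<long>(chunks, total));
    const long base = total / chunks;
    const long extra = total % chunks;  // the first `extra` parts get one more element
    long start = begin;
    for (int i = 0; i < chunks; ++i) {
        const long size = base + (i < extra ? 1 : 0);
        parts.emplace_back(start, start + size);
        start += size;
    }
    return parts;
}

// Simulates the dispatch rule: each subrange goes to the slave that becomes
// free first. cost[i] is an assumed computation time for subrange i.
// Returns the time at which each of the p slaves finishes.
std::vector<double> simulateDispatch(const std::vector<double>& cost, int p) {
    if (p <= 0) return {};
    std::vector<double> busyUntil(p, 0.0);
    for (double c : cost) {
        auto next = std::min_element(busyUntil.begin(), busyUntil.end());
        *next += c;  // this slave is busy for c more time units
    }
    return busyUntil;
}
```

With costs 4, 1, 1, 1, 1 and two slaves, the simulation lets both slaves finish at time 4: the slave that received the expensive subrange gets no further work while the other one processes the cheap ones.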

Program Example with EPPA instrumentation

On our LINUX system
EPPA is installed under the $PARDIR directory.
Extract pvmExampleWithEPPA.tgz in a new directory with command: tar -zxf pvmExampleWithEPPA.tgz
The example problemEPPA is the same as above, but now instrumented with EPPA. Run problemEPPA with the -m option to activate the EPPA probe. View your results with the EPPA tool (the command EPPA in a console or the icon in the toolbar).

On another system (with QT, pvm and mysql), EPPA can be installed

Compressed archive file: EPPAFiles.tgz,

download into your root directory
extract in your root with command: tar -zxf EPPAFiles.tgz
go to the pvm3 directory: cd pvm3
Create everything: make all (if the makefile is incorrect, recreate it (command qmake) and make (command make all) in the following directories: pvm3/parlib/src, pvm3/eppa and pvm3/pvmEPPA)

Contents

pvm3/bin/LINUX: put here your pvm executables
pvm3/pvmEPPA: the above example program, now instrumented with EPPA (run with the -m option to activate the EPPA probe)
pvm3/eppa: the performance tool EPPA to view your experiments
pvm3/parlib: our library with general files

pvm3/parlib/include: all header files to be included, interesting ones:

standardlib.h: general things
EPPAInstrumentation.h: to instrument your code with an EPPAProbe
randlib.h for a random generator

pvm3/parlib/lib: the library parlib.a that should be linked with your objects
pvm3/parlib/src: the source file

More Pvm

PVM reference manual

There are basically 3 pvm receive functions:

pvm_recv is a blocking receive, it waits until the message has arrived
pvm_nrecv is non-blocking, it checks for a message and returns a positive buffer-id if a message has arrived

use pvm_bufinfo to check the sender_tid and tag of the message

pvm_trecv waits for a message for at most a time t

A pvm_send causes pvm to start sending the message in the background while your program continues. When you pack your data before sending, it can either be copied to a buffer or be sent directly from where it is. The latter is faster, since no copy operation is needed, but then you must make sure that the data is not accessed or modified while the send operation is executing (use pvm_probe to test whether the message has arrived)! For this, use the encoding PvmDataInPlace instead of PvmDataDefault in your pvm_initsend.
Use pvm_config to get all machines added to pvm
Use pvm_tasks to get all tasks running

Advanced Pvm

pvm_tidtohost: to know on which machine a process (with a certain tid) is running. It returns the daemon tid of the machine; the name of the machine can then be retrieved with pvm_config.
Use pvm_hostsync to see how much the clocks of the machines differ
Check with pvm_probe whether a message has arrived
Use pvm_catchout to write slave output to a specific file (instead of the default log file of /tmp)

Configuration

To use pvm you should configure your account.

Directories

In order to make PVM work you need first to create a directory called pvm3/bin/LINUX. Do this by typing the following commands in a console (or use the file manager):

cd (= goes to your home directory)
mkdir pvm3
cd pvm3
mkdir bin
cd bin
mkdir LINUX

When you want to start a process by PVM (eg a master that starts a slave), pvm will look for the slave in the pvm3/bin directory! Based on the architecture of your machine PVM will choose a sub directory in that directory. To do so it will look at the PVM_ARCH environment variable, which is in our case LINUX (see next paragraph).
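
As an illustration of this lookup rule, the directory can be composed from the environment. This is a sketch under assumptions: pvmBinDir is a hypothetical helper, not a PVM function, and it only models the default location described here ($HOME/pvm3/bin/$PVM_ARCH).

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper (not part of PVM): the directory in which, according
// to the rule above, pvm looks for the executables it has to start.
// Returns an empty string when HOME or PVM_ARCH is not set.
std::string pvmBinDir() {
    const char* home = std::getenv("HOME");
    const char* arch = std::getenv("PVM_ARCH");
    if (home == nullptr || arch == nullptr) return "";
    return std::string(home) + "/pvm3/bin/" + arch;
}
```

With PVM_ARCH set to LINUX, this yields the pvm3/bin/LINUX directory created above; an empty result points to a missing environment variable.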

Also add this directory to your path (see linux configuration): add the line $(HOME)/pvm3/bin/LINUX to your .path.local file

Remark: sometimes we use a hostf file in our home directory to indicate this path (for the exe's).

PVM variables

In order to work, PVM requires 2 environment variables to be set, namely PVM_ROOT and PVM_ARCH. These are used by the PVM daemon to find where the different files are located on the system. By default they have the following values on our machines:

PVM_ROOT = /usr/shared/pvm3
PVM_ARCH = LINUX

At this point PVM should work for you. Try typing pvm at the prompt; you should get a line like this:

pvm>

For further operations I refer you to the user manual.

.rhosts file

PVM uses remote shell to start programs on the different machines that constitute a PVM. It is therefore important to ensure that you can indeed do a remote shell to another computer. Choose a computer other than the one you are logged in on and type:

ssh other_computer_name date

This line runs the program date on the remote machine and sends the output of that program back to the local machine. For this to work, you should add the name of your local machine to the .rhosts file in your home directory. Be aware that this is a hidden file: use ls -al if you want to make sure it exists. Simply put in this file the names of all the machines that you allow to do a remote shell to the local machine.
You can create the .rhosts file with: echo $HOSTNAME $LOGNAME > .rhosts

How does it work?
When you issue an rsh to a machine, that machine checks its .rhosts file to see which machines have permission to request the execution of a program. Suppose machine A wants to run a program on machine B. Then the name of machine A must be present in the .rhosts file of machine B. So the remote machine holds the list of machines that can start programs on it. The opposite would be quite dangerous: in that scenario it would suffice to create a file on one machine and add the names of other machines to it in order to run programs on any of them.
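
The check that machine B performs can be modelled in a few lines. This is a simplified sketch, not the actual rsh implementation: rhostsAllows is a hypothetical name, the file is assumed to contain lines of the form "host" or "host user", and wildcards and netgroups are ignored. A line with only a host name is assumed to admit the user with the same login name as the owner of the file.

```cpp
#include <sstream>
#include <string>

// Simplified model: may `remoteUser` on machine `host` (machine A) run
// programs for `localUser` on the machine that holds this .rhosts (machine B)?
bool rhostsAllows(const std::string& rhostsText, const std::string& host,
                  const std::string& remoteUser, const std::string& localUser) {
    std::istringstream lines(rhostsText);
    std::string line;
    while (std::getline(lines, line)) {
        std::istringstream fields(line);
        std::string entryHost, entryUser;
        fields >> entryHost >> entryUser;  // entryUser stays empty if absent
        if (entryHost != host) continue;
        // host-only entry: assumed to admit the same login name only
        const std::string& allowed = entryUser.empty() ? localUser : entryUser;
        if (allowed == remoteUser) return true;
    }
    return false;
}
```

The decision is taken entirely from the file on machine B, which is why creating a .rhosts file on machine A grants A nothing.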

Once you can do a remote shell, you are set. I recommend putting all the machines of our lab into your .rhosts file. This way you will be able to start PVM on all the machines you want.

Installation of PVM 3 on Windows 9x

IMPORTANT: Install PVM only on machines that have Visual C++ 5 or greater, because the pvm libraries in this distribution are incompatible with other compilers.

The self-installation program can be downloaded at:
http://www.epm.ornl.gov/~sscott/PVM/Software/ParallelVirtualMachine3.4.3.zip
This program will install all the necessary files.

After installation, check that the following environment variables are set. If not, put them in the autoexec.bat:

set PVM_ROOT=C:\Program Files\PVM3.4
set PVM_ARCH=WIN32

Reboot the machine. Run the PVM console and you are ready to execute any pvm program.

For more install information, check out the links page...