How To Configure Mpi On Windows

How to configure MPI on Windows

Known to work with HTCondor version: 7.5

Note: The good folks at Kitware have also posted a comprehensive webpage detailing their setup using mpich2 with HTCondor on Windows

Installation

Download a copy of MPICH2 from the Argonne site:

http://www.mcs.anl.gov/research/projects/mpich2/downloads/tarballs/1.2.1/mpich2-1.2.1-win-ia32.msi

During the install, make sure to enable use of the tools by all users, and not just yourself. Once the installation is compleate, ensure that the MPICH pass phrase is the same on all machines in the pool:

Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SOFTWARE\MPICH\SMPD]
"phrase"="behappy"

The final step is to disble the SMPD service. Using the Microsoft Managment Console , open the Services Common Console Document . Find the entry labelled MPICH2 Process Manager, Argonne National Lab and modify it's properties such that it is inactive and will no longer start on boot.

We do this so that HTCondor can manage the MPI processes itself. That is, we will configure HTCondor such that it can start, stop and monitor any or all of the MPI related process.

Configuration

The aim of installing MPICH2 is to use it in conjunction with HTCondor. To this end, we need to make some changes to HTCondor's configuration, as well as edit some helper scripts.

The following will let HTCondor manage the smpd.exe daemon rather than the Windows Service Manager:

##
# Tell the condor_master what binary to use for MPICH2's
# comunication daemon.
##
SMPD_SERVER = C:\Program Files\MPICH2\bin\smpd.exe
SMPD_SERVER_ARGS = -p 6666 -d 1

##
# We are letting HTCondor spawn and manage the smpd.exe server
# for us, so we need to tell the condor_master on this machine
# to spawn a smpd.exe server, in addition to any other daemons
# it is configured to run.
##
DAEMON_LIST = $(DAEMON_LIST), SMPD_SERVER

Unlike the HTCondor daemons, we cannot define a log file for the smpd.exe daemon in the HTCondor configuration file, ao we must do this manually:

C:\> smpd.exe -set logfile C:\condor\log\SmpdLog
logfile = C:\condor\log\SmpdLog

Note that we only need to do this once, since the option will be saved persistently in the registry.

We use the following configuration on the dedicated pool machines that will be used for MPI jobs:

DAEMON_LIST 	= MASTER, STARTD

WANT_SUSPEND	= False
WANT_VACATE	= False
START		= True
SUSPEND		= False
CONTINUE	= True
PREEMPT		= False
KILL		= False
RANK		= 0

DedicatedScheduler = "DedicatedScheduler@real.host.name"
STARTD_EXPRS 	= $(STARTD_EXPRS), DedicatedScheduler

This configuration will allow HTCondor to always runs jobs.

Helper Script

Once HTCondor has been configured to run parallel jobs, we must create a helper script such that HTCondor can use it to interact with the new MPICH2 installation.

The following can be used as a base for a more complex driver:

set _CONDOR_PROCNO=%_CONDOR_PROCNO%
set _CONDOR_NPROCS=%_CONDOR_NPROCS%

REM If not the head node, just sleep forever
if not [%_CONDOR_PROCNO%] == [0] copy con nothing

REM Set this to the bin directory of MPICH installation
set MPDIR="C:\Program Files\MPICH2\bin"

REM run the actual mpijob
%MPDIR%\mpiexec.exe -n %_CONDOR_NPROCS% -p 6666 %*

exit 0

Submitting a an MPI Job

For our purposes we will use the sample shipped with the Windows version of MPICH2 to demostrate how to use HTCondor and MPI on Windows:

C:\demo> cp "C:\Program Files\MPICH2\examples\cpi.exe" .
       1 file(s) copied.

We also need to create an input file:

10
100
1000
10000000000000
0

Where the last input, 0 , terminates the cpi.exe application.

All that remains is to define a submit file:

universe = parallel
executable = mp2script.bat
arguments = cpi.exe
machine_count = 1
input = input.file
output = out.$(NODE).log
error = error.$(NODE).log
log = work.$(NODE).log
should_transfer_files = yes
when_to_transfer_output = on_exit
transfer_input_files = cpi.exe
queue

Where mp2script.bat is the helper script, and input.file is the input file.

Now we can submit our MPI application to HTCondor:

C:\demo> condor_submit parallel.submit
Submitting job(s).
Logging submit event(s).
1 job(s) submitted to cluster 18.

...

C:\demo> condor_q
-- Submitter: marge : <172.16.46.137:4301> : marge
ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
  18.0   Administrator   2/17 16:10   0+00:00:11 R  0   0.0  mp2script.bat cpi.

1 jobs; 0 idle, 1 running, 0 held

...

C:\demo> condor_q
-- Submitter: marge : <172.16.46.137:4301> : marge
ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD

0 jobs; 0 idle, 0 running, 0 held

Look at out.log to see the result of the run:

C:\demo> more out.log
Enter the number of intervals: (0 quits) pi is approximately 3.1424259850010987, \
Error is 0.0008333314113056 wall clock time = 0.000014
Enter the number of intervals: (0 quits) pi is approximately 3.1416009869231254, \
Error is 0.0000083333333323 wall clock time = 0.000007
Enter the number of intervals: (0 quits) pi is approximately 3.1415927369231227, \
Error is 0.0000000833333296 wall clock time = 0.000637
Enter the number of intervals: (0 quits) pi is approximately 3.1415926535898295, \
Error is 0.0000000000000364 wall clock time = 12.460265
Enter the number of intervals: (0 quits)

Where the output are is result of an inputs of 10 , 100 , 1000 , 10000000000000 , and 0 respectively.