Escali 4.4 manual Mpimon monitor program, Basic usage, Identity of parallel processes


Section: 3.3 Running Scali MPI Connect programs

<pid> is the Unix process identifier of the monitor program mpimon.

<nodename> is the name of the node where mpimon is running.

Note: SMC requires a homogeneous file system image, i.e. a file system that provides the same path and program names on all nodes of the cluster on which SMC is installed.

3.3.2 mpimon - monitor program

The control and start-up of a Scali MPI Connect application are monitored by mpimon. A complete listing of mpimon options can be found in “Mpimon options” on page 34.

3.3.2.1 Basic usage

Normally mpimon is invoked as:

mpimon <userprogram> <program options> -- <nodename> [<count>] [<nodename> [<count>]]...

where

<userprogram> is the name of the application.

<program options> are the options passed to the application.

-- is the separator that ends the application options.

<nodename> [<count>] is the name of a node and the number of MPI processes to run on that node. This pair can occur several times in the list. MPI processes are given ranks sequentially according to the list of node-number pairs. The <count> is optional and defaults to 1.
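The invocation syntax above can be assembled programmatically. The following is a minimal Python sketch, not part of Scali MPI Connect: the helper name build_mpimon_command and its parameters are hypothetical, and it only builds the argument list described above (program, program options, the -- separator, then node names with optional counts). It does not start mpimon.

```python
import shlex


def build_mpimon_command(program, program_options, placements):
    """Build an mpimon argument list (hypothetical helper, for illustration).

    placements is a list of (nodename, count) pairs in the order in which
    they should appear after the "--" separator.
    """
    argv = ["mpimon", program, *program_options, "--"]
    for nodename, count in placements:
        argv.append(nodename)
        # <count> is optional and defaults to 1, so it is only emitted
        # when it differs from the default.
        if count != 1:
            argv.append(str(count))
    return argv


cmd = build_mpimon_command(
    "/opt/scali/examples/bin/hello", [], [("hugin", 4), ("munin", 4)]
)
# Print the command line as it would be typed in a shell.
print(shlex.join(cmd))
```

The resulting list could be passed to a process launcher such as subprocess.run on a system where mpimon is installed.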

Examples:

Starting the program “/opt/scali/examples/bin/hello” on a node called “hugin”:

mpimon /opt/scali/examples/bin/hello -- hugin

Starting the same program with two processes on the same node:

mpimon /opt/scali/examples/bin/hello -- hugin 2

Starting the same program on two different nodes, “hugin” and “munin”:

mpimon /opt/scali/examples/bin/hello -- hugin munin

Starting the same program on two different nodes with 4 processes on each:

mpimon /opt/scali/examples/bin/hello -- hugin 4 munin 4

Bracket expansion and grouping (if configured) can also be used:

mpimon /opt/scali/examples/bin/hello -- node[1-16] 2 node[17-32] 1

For more information regarding bracket expansion and grouping, refer to Appendix D.
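As a rough illustration of what a bracket expression such as node[1-16] stands for, the Python sketch below expands a single numeric range into individual node names. This is an assumption based only on the example above: the function expand_brackets is hypothetical, and the actual rules (including grouping and any further forms of bracket syntax) are those defined in Appendix D, which this sketch does not attempt to reproduce.

```python
import re

# Matches one numeric range in square brackets, e.g. "node[1-16]".
# Assumption: only this simple form is handled here.
_RANGE = re.compile(r"^(.*)\[(\d+)-(\d+)\](.*)$")


def expand_brackets(spec):
    """Expand "prefix[a-b]suffix" into a list of names (hypothetical helper)."""
    match = _RANGE.match(spec)
    if match is None:
        # No bracket range present: the name stands for itself.
        return [spec]
    prefix, low, high, suffix = match.groups()
    return [f"{prefix}{n}{suffix}" for n in range(int(low), int(high) + 1)]


print(expand_brackets("node[1-4]"))
print(expand_brackets("hugin"))
```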

3.3.2.2 Identity of parallel processes

The identification of nodes and the number of processes to run on each particular node translates directly into the rank of the MPI processes. For example, specifying n1 2 n2 2 will place processes 0 and 1 on node n1 and processes 2 and 3 on node n2. On the other hand, specifying n1 1 n2 1 n1 1 n2 1 will place processes 0 and 2 on node n1, while processes 1 and 3 are placed on node n2.
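The rank assignment described above can be reproduced with a short Python sketch. It is illustrative only and not part of Scali MPI Connect: the function names are hypothetical, and the parser assumes that a token consisting only of digits is a <count> for the preceding node name (so a node whose name is purely numeric would be misread).

```python
def parse_node_list(tokens):
    """Turn ["n1", "2", "n2"] into [("n1", 2), ("n2", 1)] (hypothetical helper)."""
    placements = []
    i = 0
    while i < len(tokens):
        node = tokens[i]
        count = 1  # <count> defaults to 1 when omitted
        # Assumption: an all-digit token following a node name is its count.
        if i + 1 < len(tokens) and tokens[i + 1].isdigit():
            count = int(tokens[i + 1])
            i += 1
        placements.append((node, count))
        i += 1
    return placements


def ranks_by_node(placements):
    """Assign ranks sequentially in list order and group them per node."""
    result = {}
    rank = 0
    for node, count in placements:
        for _ in range(count):
            result.setdefault(node, []).append(rank)
            rank += 1
    return result


print(ranks_by_node(parse_node_list("n1 2 n2 2".split())))
print(ranks_by_node(parse_node_list("n1 1 n2 1 n1 1 n2 1".split())))
```

The first call groups ranks 0 and 1 on n1 and ranks 2 and 3 on n2; the second interleaves them, giving n1 ranks 0 and 2 and n2 ranks 1 and 3, matching the two cases in the text.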

Scali MPI Connect Release 4.4 Users Guide
