
1: MPI_Comm_rank          1    3.1us    3.1us        1    3.1us    3.1us
1: MPI_Comm_size          1    1.5us    1.5us        1    1.5us    1.5us
1: MPI_Gather             1  109.9us  109.9us        1  109.9us  109.9us
1: MPI_Init               1     1.0s     1.0s        1     1.0s     1.0s
1: MPI_Keyval_free        1    1.2us    1.2us        1    1.2us    1.2us
1: MPI_Reduce             1   51.5us   51.5us        1   51.5us   51.5us
1: MPI_Scatter            1  138.7us  138.7us        1  138.7us  138.7us
1: Sum                    9     1.0s  112.8ms        9     1.0s  112.8ms
1: Overhead               0    0.0ns                 9   27.2us    3.0us
1: =====================================================================

0:13.26.40            -------------Delta----------  ---------Total----------
0: Init+0.111598 s     #calls     time  tim/cal     #calls     time  tim/cal
0: MPI_Bcast                2   79.6us   39.8us          2   79.6us   39.8us
0: MPI_Comm_rank            1    3.3us    3.3us          1    3.3us    3.3us
0: MPI_Comm_size            1    1.4us    1.4us          1    1.4us    1.4us
0: MPI_Gather               1  648.8us  648.8us          1  648.8us  648.8us
0: MPI_Init                 1  965.9ms  965.9ms          1  965.9ms  965.9ms
0: MPI_Keyval_free          1    1.1us    1.1us          1    1.1us    1.1us
0: MPI_Reduce               1   37.6ms   37.6ms          1   37.6ms   37.6ms
0: MPI_Scatter              1  258.1us  258.1us          1  258.1us  258.1us
0: Sum                      9     1.0s  111.6ms          9     1.0s  111.6ms
0: Overhead                 0    0.0ns                   9   35.6us    4.0us
0: =====================================================================

The <seconds> field can be set to a large number in order to collect only final statistics.
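As a sketch (assuming SCAMPI_TIMING accepts the -s <seconds> option string described in this chapter; the exact option syntax should be checked against the timing description), the environment could be set before launching the job:

```shell
# Assumption: SCAMPI_TIMING carries the "-s <seconds>" interval option;
# the option string below is illustrative, not confirmed.
export SCAMPI_TIMING="-s 86400"   # interval longer than any run, so only
                                  # the final cumulative statistics print
```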

We see that the output gives statistics about which MPI calls are used, with their frequency and timing: both delta numbers since the last printout and the total accumulated statistics. By setting the printout interval (-s <seconds>) to a large number, only the cumulative statistics at the end are printed. The timings are presented per process, and with many processes this can yield a huge amount of output. SCAMPI_TIMING has many options for reducing it: a selection parameter restricts timing to only those MPI processes to be monitored, and selected MPI calls can be screened away either before or after a certain number of calls, or within an interval of calls. Some examples are:

The rest of the format has the following fields:

<MPIcall><Dcalls><Dtime><Dfreq> <Tcalls><Ttime><Tfreq>

where

<MPIcall>   is the name of the MPI call
<Dcalls>    is the number of calls to <MPIcall> since the last printout
<Dtime>     is the sum of the execution time of calls to <MPIcall> since the last printout
<Dfreq>     is the average time per call for calls to <MPIcall> since the last printout
<Tcalls>    is the total number of calls to <MPIcall>
<Ttime>     is the total sum of the execution time of calls to <MPIcall>
<Tfreq>     is the average time per call over all calls to <MPIcall>

After all detail lines (one per MPI call that has been called since the last printout), there will be a line with the sum of all calls, followed by a line giving the overhead introduced by obtaining the timing measurements.
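The fields above are simple to pull apart programmatically. The following sketch (the helper names and unit table are our own, not part of SMC) parses one detail line into the seven fields and converts the suffixed time values (ns/us/ms/s, as in the sample output above) to seconds:

```python
import re

# Scale factors for the unit suffixes seen in SCAMPI_TIMING output.
UNITS = {"ns": 1e-9, "us": 1e-6, "ms": 1e-3, "s": 1.0}

def to_seconds(text):
    """Convert a value like '37.6ms' or '1.0s' to seconds."""
    m = re.fullmatch(r"([\d.]+)(ns|us|ms|s)", text)
    if m is None:
        raise ValueError(f"unrecognized time value: {text!r}")
    return float(m.group(1)) * UNITS[m.group(2)]

def parse_detail_line(line):
    """Split '<rank>: <MPIcall> <Dcalls> <Dtime> <Dfreq> <Tcalls> <Ttime> <Tfreq>'."""
    head, rest = line.split(":", 1)
    fields = rest.split()
    return {"rank": int(head), "call": fields[0],
            "Dcalls": int(fields[1]),
            "Dtime": to_seconds(fields[2]), "Dfreq": to_seconds(fields[3]),
            "Tcalls": int(fields[4]),
            "Ttime": to_seconds(fields[5]), "Tfreq": to_seconds(fields[6])}

stats = parse_detail_line("0: MPI_Reduce 1 37.6ms 37.6ms 1 37.6ms 37.6ms")
# <Tfreq> is the average time per call, so for a single call it equals <Ttime>.
assert abs(stats["Tfreq"] - stats["Ttime"] / stats["Tcalls"]) < 1e-9
```

The same routine applies to both the delta and total halves of each line, since both use the calls/time/time-per-call triple.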

The second part, containing the buffer statistics, has two types of lines: one for receives and one for sends.

Scali MPI Connect Release 4.4 Users Guide
