
Chapter 5

Tuning SMC to your application
Scali MPI Connect allows the user to exercise control over the communication mechanisms through adjustment of the thresholds that steer which mechanism to use for a particular message. This is one technique that can be used to improve performance of parallel applications on a cluster.

Forcing size parameters on mpimon is usually not necessary; it is only a means of optimising SMC for a particular application, based on knowledge of its communication patterns. For unsafe MPI programs, however, it may be necessary to adjust buffering to allow the program to complete.
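All of the thresholds and buffer sizes discussed in this chapter are given as options to mpimon. As a sketch of the general form (the program name and node names are placeholders, and the exact option spelling should be checked against the list of mpimon options):

    mpimon -channel_inline_threshold 2k ./myprogram -- node1 node2

Numeric option values accept postfixes such as k and M, as described in the section on giving numeric values to mpimon.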

5.1 Tuning communication resources

The communication resources allocated by Scali MPI Connect are shared among the MPI processes in the node.

Communication buffer adaptation: If the communication behaviour of the application is known, explicitly providing buffer-size settings to mpimon to match the requirements of the application will in most cases improve performance.

Example: an application sending only 900-byte messages. A matching mpimon invocation is sketched after this list.

- Set channel_inline_threshold to 964 (64 added for alignment) and increase the channel-size significantly (32-128 k).
- Set eager_size to 1k and eager_count high (16 or more).
- If all messages can be buffered, transporter-size and transporter-count can be set to low values to reduce shared memory consumption.
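A possible invocation for this example (option names as used above; the program name, node names, and the low transporter values are only illustrative, and the exact syntax should be checked against the list of mpimon options):

    mpimon -channel_inline_threshold 964 -channel_size 64k \
           -eager_size 1k -eager_count 16 \
           -transporter_size 4k -transporter_count 2 \
           ./myprogram -- node1 node2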

How do I control shared memory usage? By adjusting the SMC buffer sizes, as described above.

How do I calculate shared memory usage? The buffer space required by a communication channel is approximately:

chunk-size = (2 * channel-entry-size * channel-entry-count)
             + (transporter-size * transporter-count)
             + (eager-size * eager-count)
             + 4096 (give or take a few bytes)

total-usage = chunk-size * no-of-processes
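As a worked example with purely illustrative values, take channel-entry-size = 128, channel-entry-count = 64, transporter-size = 16k, transporter-count = 8, eager-size = 64k and eager-count = 8:

    chunk-size = (2 * 128 * 64) + (16384 * 8) + (65536 * 8) + 4096
               = 16384 + 131072 + 524288 + 4096
               = 675840 bytes (about 660 Kbytes)

With 4 MPI processes on the node, total-usage is 4 * 675840, roughly 2.6 Mbytes of shared memory.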

5.1.1 Automatic buffer management

The pool-size is a limit on the total amount of shared memory. The automatic buffer size computation is based on full connectivity, i.e. all processes communicating with all others. Given a total pool of memory dedicated to communication, each communication channel is restricted to a partition of that pool (where P is the number of processes):

chunk = inter_pool_size / P

The automatic approach is to downsize all buffers associated with a communication channel until the channel fits in its part of the pool. The automatic chunk size is calculated so that a complete communication channel fits within it.
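For example, assuming an inter_pool_size of 32M and P = 16 processes, each communication channel is limited to chunk = 32M / 16 = 2M, and the channel, eager and transporter buffers are scaled down until the chunk-size given by the formula in section 5.1 fits within those 2 Mbytes.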
