
3.8 Controlling communication resources

Although it is normally not necessary to set buffer parameters when running applications, it can be done, e.g., for performance reasons. Scali MPI Connect automatically adjusts communication resources based on the number of processes on each node and on pool_size and chunk_size.

The built-in devices SMP and TCP/IP use a simplified protocol based on serial transfers. This can be visualized as data being written into one end of a pipe and read from the other end. Messages arriving out of order are buffered by the reader. The names of these standard devices are SMP for intra-node communication and TCP for node-to-node communication.

The size of the buffer inside the pipe can be adjusted by setting the following environment variables:

SCAFUN_TCP_TXBUFSZ - Sets the size of the transmit buffer.

SCAFUN_TCP_RXBUFSZ - Sets the size of the receive buffer.

SCAFUN_SMP_BUFSZ - Sets the size of the buffer for intranode-communication.
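As an illustration, the three variables could be set in the shell before launching a job. The sizes below are placeholders chosen for the example only; the actual defaults and valid ranges for each architecture and network are given in the Scali MPI Connect Release Notes:

```shell
# Illustrative buffer sizes only; consult the Release Notes for the
# actual defaults and valid ranges on your platform.
export SCAFUN_TCP_TXBUFSZ=262144   # 256 KiB TCP transmit buffer
export SCAFUN_TCP_RXBUFSZ=262144   # 256 KiB TCP receive buffer
export SCAFUN_SMP_BUFSZ=1048576    # 1 MiB intra-node (SMP) buffer
echo "$SCAFUN_TCP_TXBUFSZ $SCAFUN_TCP_RXBUFSZ $SCAFUN_SMP_BUFSZ"
```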

The ringbuffers are divided into equally sized entries. The entry size differs between architectures and networks; see the Scali MPI Connect Release Notes for details. An entry in the ringbuffer, which is used to hold the information forming the message envelope, is reserved each time a message is sent, and is used by the inline protocol, the eagerbuffering protocol, and the transporter protocol. In addition, one or more entries are used by the inline protocol for the application data being transmitted.

mpimon has the following interface for the eagerbuffer and channel thresholds:

Channel threshold definitions

-channel_inline_threshold <size> to set threshold for inlining

Eager threshold definitions

-eager_threshold <size> to set threshold for eager buffering
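A sketch of such an invocation, assuming the mpimon syntax `mpimon <options> <program> -- <node> <count> ...` described elsewhere in this guide. The program name, node names, process counts, and threshold values below are placeholders, not recommendations:

```shell
# Hypothetical example: set the inline threshold to 512 bytes and the
# eager-buffering threshold to 16 KiB for a run on two nodes.
# ./hello, node1, node2, and the counts are placeholders.
mpimon -channel_inline_threshold 512 -eager_threshold 16384 \
    ./hello -- node1 2 node2 2
```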

3.8.1 Communication resources on DAT-devices

All resources (buffers) used by SMC reside in shared memory in the nodes. This way multiple processes (typically when a node has multiple CPUs) can share the communication resources.

SMC operates on a buffer pool. The pool is divided into equally sized parts called chunks. SMC uses one chunk per connection to other processes. The mpimon option pool_size limits the total size of the pool, and chunk_size limits the block of memory that can be allocated for a single connection.

To set the pool size and the chunk size, specify:

-pool_size <size> to set the buffer pool size

-chunk_size <size> to set the chunk size
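For example, both limits could be passed on the mpimon command line. This sketch assumes mpimon accepts the size postfixes (e.g., K, M) described under "Giving numeric values to mpimon"; the sizes, program name, and node list are placeholders:

```shell
# Hypothetical example: cap the shared-memory pool at 32 MiB and each
# per-connection chunk at 1 MiB. ./hello and the node list are placeholders.
mpimon -pool_size 32M -chunk_size 1M ./hello -- node1 2 node2 2
```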

Scali MPI Connect Release 4.4 Users Guide

