

Chapter 2 Description of Scali MPI Connect

This chapter describes how Scali MPI Connect (SMC) operates. SMC consists of libraries to be linked and loaded with user application program(s), and a set of executables which control the start-up and execution of the user application program(s). The relationships between these components, and their interfaces, are described in this chapter. Understanding this chapter is necessary in order to control the execution of parallel processes and to tune Scali MPI Connect for optimal application performance.

2.1 Scali MPI Connect components

Scali MPI Connect consists of a number of programs, daemons, libraries, include files, and configuration files that together implement the MPI functionality needed by applications. Starting an application relies on the following daemons and launchers:

mpimon is a monitor program which is the user’s interface for running the application program.

mpisubmon is a submonitor program which controls the execution of application programs. One submonitor program is started on each node per run.

mpiboot is a bootstrap program used when running in manual/debug mode.

mpid is a daemon program running on all nodes that are able to run SMC. mpid is used for starting the mpisubmon programs (to avoid using Unix facilities like the remote shell rsh). mpid is started automatically when a node boots, and must run at all times.

Figure 2-1: The way from application startup to execution
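The division of roles among mpimon, mpid and mpisubmon can be illustrated with a small toy model. This is only a sketch under stated assumptions, not SMC code: the names `Submonitor`, `plan_launch` and `describe` are hypothetical, and assigning consecutive ranks in node order is an assumption made for illustration. The only facts taken from the text are that the user starts mpimon, that mpid on each node starts mpisubmon, and that one mpisubmon runs on each node per run.

```python
from dataclasses import dataclass, field


@dataclass
class Submonitor:
    """Hypothetical stand-in for one mpisubmon instance (one per node per run)."""
    node: str
    ranks: list = field(default_factory=list)


def plan_launch(processes_per_node):
    """Build one Submonitor per node from a {node name: process count} mapping.

    Assumption for illustration only: ranks are numbered consecutively
    in the order the nodes are given.
    """
    plan = []
    next_rank = 0
    for node, count in processes_per_node.items():
        if count < 1:
            raise ValueError(f"node {node!r} needs at least one process")
        plan.append(Submonitor(node, list(range(next_rank, next_rank + count))))
        next_rank += count
    return plan


def describe(plan):
    """Return the start-up chain as human-readable lines."""
    lines = ["mpimon: started by the user"]
    for sub in plan:
        # mpid is already running on the node and starts the submonitor there.
        lines.append(f"mpid on {sub.node}: starts mpisubmon")
        lines.append(f"mpisubmon on {sub.node}: controls ranks {sub.ranks}")
    return lines


if __name__ == "__main__":
    for line in describe(plan_launch({"node1": 2, "node2": 1})):
        print(line)
```

The model keeps a single monitor at the top and exactly one submonitor entry per node, however many processes that node runs, which mirrors the structure described in the component list.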

Scali MPI Connect Release 4.4 Users Guide


Copyright 1999-2005 Scali AS. All rights reserved.