Scali MPI Connect 4.4 manual: Myrinet, Infiniband, 2.2.4.1 GM, 2.2.5.1 IB


Section: 2.2 SMC network devices

root# detstat -r det0 # reset statistics for the det0 device.

root# detstat -r -a # reset statistics for all DET devices.

2.2.4 Myrinet

2.2.4.1 GM

This is an RDMA-capable device that uses the Myricom GM driver and library. A GM release above 2.0 is required. The device is straightforward and requires no configuration other than the presence of the libgm.so library in the library path (see /etc/ld.so.conf).
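The library-path requirement above can be checked with a small script. The following is only a sketch: the helper name lib_in_listing is hypothetical (it is not part of SMC or GM), and it assumes a glibc-based Linux node where "ldconfig -p" prints the dynamic linker cache.

```shell
#!/bin/sh
# Hypothetical helper (not part of SMC or GM): reads a linker-cache
# listing on stdin and prints "yes" if the named library appears in it,
# otherwise "no".
lib_in_listing() {
    if grep -q -F -- "$1"; then
        echo yes
    else
        echo no
    fi
}

# On a glibc-based Linux node, "ldconfig -p" prints the linker cache.
# The check is skipped when ldconfig is not on the PATH.
if command -v ldconfig >/dev/null 2>&1; then
    echo "libgm.so in linker cache: $(ldconfig -p | lib_in_listing libgm.so)"
fi
```

A "no" result only means the library is not in the linker cache; it may still be reachable through other means such as LD_LIBRARY_PATH. If the directory holding libgm.so was just added to /etc/ld.so.conf, rerun ldconfig as root before checking again.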

Note:

Myricom GM software is not provided by Scali. If you have purchased a Myrinet interconnect, you have the right to use the GM source, and a source tarball is available from Myricom. You must obtain the GM source because it has to be compiled for each kernel version. Scali provides tools for generating binary RPMs to ease installation and management. These tools are provided in the scagmbuilder package; see the Release Notes/Readme file for detailed instructions.

If you used Scali Manage to install your compute nodes, and supplied it with the GM source tarball, the installation is already complete.

2.2.5 Infiniband

2.2.5.1 IB

Infiniband is a relatively new interconnect that has been available since 2002 and became affordable in 2003. On PCI-X based systems you can expect latencies of around 5 µs and bandwidth of up to 700-800 MB/s (note that performance results may vary depending on the processors, memory subsystem, and the PCI bridge in the chipset).
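To get a feel for what these figures mean for message transfer times, the sketch below applies a simple "latency plus size divided by bandwidth" model. The model and the function name transfer_us are assumptions made for illustration, not an SMC formula, and the bandwidth figure is taken to be in megabytes per second (750 is used as a midpoint of the quoted range).

```shell
#!/bin/sh
# Estimated one-way transfer time in microseconds under a simple
# latency + size/bandwidth model (an assumption, not an SMC formula).
#   $1 = message size in bytes
#   $2 = latency in microseconds
#   $3 = bandwidth in MB/s (10^6 bytes per second)
# bytes / (MB/s) yields microseconds directly, so no scaling is needed.
transfer_us() {
    awk -v bytes="$1" -v lat="$2" -v bw="$3" \
        'BEGIN { printf "%.1f\n", lat + bytes / bw }'
}

transfer_us 8 5 750          # small message: latency dominates
transfer_us 1048576 5 750    # 1 MiB message: bandwidth dominates
```

Real transfer times also depend on the protocol SMC selects for a given message size and on the hardware factors mentioned above, so treat the output as a rough estimate only.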

Various Infiniband vendors provide slightly different hardware and software environments. Scali has established relationships with the following vendors: Mellanox, Silverstorm, Cisco, and Voltaire.

See the release notes for the exact versions of the software stacks that are supported. Scali provides a utility known as ScaIBbuilder that performs an automated install of some of these stacks (see the ScaIBbuilder release notes).

The different vendors' InfiniBand switches vary in feature sets, but the most important difference is whether they have a built-in subnet manager. An InfiniBand network must have a subnet manager (SM); if the switches do not come with a built-in SM, one has to be started on a node attached to the IB network. The usual choices for a software SM are OpenSM and minism. If you have SM-less switches, your vendor will provide an SM as part of its software bundle.
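When the switches have no built-in SM, it can be useful to check whether a software SM is running on a given node. The sketch below assumes that the SM processes are named opensm and minism, matching the names used above; the helper name find_software_sm is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads process names (one per line) on stdin and
# prints the first known software subnet manager found, or "none".
# Assumes the processes are named "opensm" and "minism".
find_software_sm() {
    awk '$1 == "opensm" || $1 == "minism" { print $1; found = 1; exit }
         END { if (!found) print "none" }'
}

# Demonstration with a sample process list:
printf 'init\nsshd\nopensm\n' | find_software_sm

# On a live node, feed it the real process names instead, for example:
#   ps -e -o comm= | find_software_sm
```

Only one SM needs to be active on the IB network, so "none" on a particular node does not by itself mean the network lacks a subnet manager.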

SMC uses either the uDAPL (User Direct Access Programming Library) supplied by the IB vendor, or the low-level VAPI/IBA layer. DAT is an established standard and is guaranteed to work with SMC. Better performance is usually achieved with the VAPI/IBT interfaces; however, VAPI is an API that is in flux, and SMC is not guaranteed to work with all current or future versions of VAPI.

Scali MPI Connect Release 4.4 Users Guide

