PAR Technologies V5 manual Insight ParaStation5, ParaStation5 pscom communication library

Chapter 5. Insight ParaStation5

This chapter provides more technical details and background information about ParaStation5.

5.1. ParaStation5 pscom communication library

The ParaStation communication library libpscom offers secure and reliable end-to-end connectivity. It hides the actual transport and communication characteristics from the application and from higher-level libraries.

The libpscom library supports a wide range of interconnects and protocols for data transfers. Using a generic plug-in system, this library may open connections using the following networks and protocols:

TCP: uses standard TCP/IP sockets to transfer data. This protocol may use any interconnect. Support for this protocol is built into libpscom.

P4sock: uses an optimized network protocol for Ethernet (see Section 5.2, “ParaStation5 protocol p4sock”, below). Support for this protocol is built into libpscom.

InfiniBand: libpscom can transfer data over InfiniBand, based on a vapi kernel layer and a libvapi library that are typically provided by the hardware vendor. The corresponding plug-in library is called libpscom4vapi.

Myrinet: using the GM library and kernel-level module, the libpscom library is able to use Myrinet for data transfer. The corresponding plug-in library is called libpscom4gm.

Shared Memory: for communication within an SMP node, the libpscom library uses shared memory. Support for this protocol is built into libpscom.

DAPL: The libpscom supports a DAPL transport layer. Using the libpscom4dapl plug-in, it may transfer data across various networks such as InfiniBand or 10G Ethernet using a vendor-provided libdapl.

QsNet: The libpscom supports the QsNetII transport layer. Using the libpscom4elan plug-in, it may transfer data using the libelan.

The interconnect and protocol used between two distinct processes are chosen when the connection between those processes is opened. Depending on available hardware, configuration (see Section 4.1, “Configuration of the ParaStation system”) and current environment variables (see Section 5.8, “Controlling ParaStation5 communication paths”), the library automatically selects the fastest available communication path.
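The selection step described above can be pictured with a minimal sketch. It is not the libpscom implementation: the preference order, the short path names and the function select_path are all hypothetical and serve only to illustrate "pick the fastest path that is both available and not disabled".

```python
from typing import Iterable, Optional

# Hypothetical preference order, fastest first. The ordering libpscom
# actually uses is not stated in this section; this list is an assumption.
PREFERENCE = ["shm", "infiniband", "myrinet", "dapl", "elan", "p4sock", "tcp"]


def select_path(
    available: Iterable[str],
    disabled: Iterable[str] = (),
    same_node: bool = False,
) -> Optional[str]:
    """Return the first preferred path that is available and not disabled.

    available -- paths for which hardware and plug-in are present
    disabled  -- paths switched off by configuration or environment
    same_node -- True if both processes run on the same SMP node
    """
    usable = set(available) - set(disabled)
    for name in PREFERENCE:
        # Shared memory only makes sense between processes on one node.
        if name == "shm" and not same_node:
            continue
        if name in usable:
            return name
    return None  # no usable path: the connection cannot be opened


if __name__ == "__main__":
    print(select_path(["tcp", "p4sock", "shm"], same_node=True))
    print(select_path(["tcp", "p4sock"], disabled=["p4sock"]))
```

In this sketch, disabling a path simply removes it from the candidates, so the next entry in the preference list is chosen. How paths are really enabled or disabled is described in Section 5.8.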

The library routines for sending and receiving data handle arbitrarily large buffers. If necessary, the buffers will be fragmented and reassembled to meet the requirements of the underlying transport.
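As an illustration of fragmentation and reassembly, the following sketch splits a buffer into pieces no larger than a given limit and joins them again. The function names and the max_fragment parameter are hypothetical; the real fragment sizes depend on the transport and are not given in this section. The sketch also assumes in-order delivery, which a reliable transport provides.

```python
from typing import Iterable, Iterator


def fragment(buffer: bytes, max_fragment: int) -> Iterator[bytes]:
    """Yield consecutive slices of buffer, each at most max_fragment bytes."""
    if max_fragment <= 0:
        raise ValueError("max_fragment must be positive")
    for offset in range(0, len(buffer), max_fragment):
        yield buffer[offset:offset + max_fragment]


def reassemble(fragments: Iterable[bytes]) -> bytes:
    """Concatenate fragments received in order back into one buffer."""
    return b"".join(fragments)


if __name__ == "__main__":
    data = bytes(range(256)) * 10          # 2560 bytes of sample data
    parts = list(fragment(data, 1000))     # limit chosen for illustration
    print([len(p) for p in parts])
    print(reassemble(parts) == data)
```

The application only sees the complete buffer on both sides; the split into fragments stays inside the library.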

The application is dynamically linked with the libpscom.so library. At runtime, this library loads the plug-ins for the various interconnects listed above. For more information on controlling ParaStation communication paths, refer to Section 5.8, “Controlling ParaStation5 communication paths”.

5.2. ParaStation5 protocol p4sock

ParaStation5 provides its own communication protocol for Ethernet, called p4sock. This protocol is designed for extremely fast and reliable communication within a closed and homogeneous compute cluster environment.

The protocol implements a reliable, connection-oriented communication layer, especially designed for very low overhead. As a result, it delivers very low latencies.

The p4sock protocol is encapsulated within the kernel module p4sock.ko. This module is loaded on system startup or whenever the ParaStation5 daemon psid(8) starts up and the p4sock protocol is enabled within the configuration file parastation.conf(5).
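Whether the module is currently loaded can be checked by looking for its name in the kernel's module list. The sketch below parses text in the format of /proc/modules; the helper name module_loaded and the sample text (including sizes and addresses) are illustrative assumptions, not output from a real system.

```python
def module_loaded(name: str, proc_modules_text: str) -> bool:
    """Return True if a module called name appears in /proc/modules text.

    Each line of /proc/modules starts with the module name, followed by
    further whitespace-separated fields (size, use count, ...).
    """
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields and fields[0] == name:
            return True
    return False


# Made-up sample in /proc/modules format, for illustration only.
SAMPLE = """\
p4sock 45056 0 - Live 0x0000000000000000
e1000 131072 0 - Live 0x0000000000000000
"""

if __name__ == "__main__":
    print(module_loaded("p4sock", SAMPLE))
    # On a Linux node, the real list can be read like this:
    # with open("/proc/modules") as f:
    #     print(module_loaded("p4sock", f.read()))
```

If the module is missing although p4sock is enabled in parastation.conf(5), restarting psid(8) is the first thing to try, since the daemon loads the module on startup.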

ParaStation5 Administrator's Guide
