PAR Technologies V5 manual Installation, Prerequisites, Hardware

Chapter 3. Installation

This chapter describes the installation of ParaStation5. First, the prerequisites for using ParaStation5 are discussed. Next, the directory structure of all installed components is explained. Finally, the installation using RPM packages is described in detail.

The less automated the chosen installation method is, the more the installation process can be customized. On the other hand, even the most automated method, installation via RPM, gives a suitable result in most cases.

For a quick installation guide refer to Appendix A, Quick Installation Guide.

3.1. Prerequisites

Before the ParaStation5 communication system can be installed on a set of nodes, a few prerequisites have to be met.

Hardware

The cluster must have a homogeneous processor architecture, i.e. Intel IA32 and AMD IA32 can be used together, but not Intel IA32 and IA64 (see footnote 1). The processor architectures supported so far are:

i586: Intel IA32 (including AMD Athlon)

ia64: Intel IA64

x86_64: Intel EM64T and AMD64

ppc: IBM Power4 and Power5

Multi-core CPUs are supported, as well as single and multi-CPU (SMP) nodes.
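
The homogeneity requirement can be checked before installation with a short script. The following is only a sketch: the mapping from `uname -m` machine strings to the four architecture labels listed above is an assumption made for illustration (in particular, treating `ppc64` as `ppc`), and the function names and host names are hypothetical, not part of ParaStation5.

```python
# Assumed mapping from `uname -m` output to the architecture labels
# listed above. This table is illustrative, not taken from ParaStation5.
ARCH_LABEL = {
    "i586": "i586",
    "i686": "i586",
    "ia64": "ia64",
    "x86_64": "x86_64",
    "ppc": "ppc",
    "ppc64": "ppc",   # assumption: 64-bit Power reported as ppc64
}


def arch_label(machine):
    """Return the architecture label for a machine string, or None if unknown."""
    return ARCH_LABEL.get(machine)


def check_homogeneous(node_machines):
    """Group nodes by architecture label.

    node_machines maps a host name to its `uname -m` string.
    Returns (ok, groups): ok is True only if every node maps to the
    same known label; groups maps each label (or None) to its hosts.
    """
    groups = {}
    for host, machine in node_machines.items():
        groups.setdefault(arch_label(machine), []).append(host)
    ok = len(groups) == 1 and None not in groups
    return ok, groups


if __name__ == "__main__":
    # Hypothetical inventory: two IA32 nodes and one IA64 node.
    ok, groups = check_homogeneous(
        {"node01": "i686", "node02": "i586", "node03": "ia64"}
    )
    print("homogeneous" if ok else "mixed architectures", groups)
```

The machine strings would typically be collected by running `uname -m` on every node; how that is done (ssh, a parallel shell, an inventory file) is left open here.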

Furthermore, the nodes need to be interconnected. In principle, ParaStation5 uses two different kinds of interconnects:

First, a so-called administration network handles all administrative tasks within a cluster. Besides commonly used services such as sharing NFS partitions or NIS tables, on a ParaStation cluster this also includes the inter-daemon communication used to implement effective cluster administration and parallel task handling. The administration network is usually implemented using Fast or Gigabit Ethernet.

Second, a high-speed interconnect is required for high-bandwidth, low-latency communication within parallel applications. While historically this kind of communication was usually done using specialized high-speed networks like Myrinet, nowadays Gigabit Ethernet is a much cheaper and only slightly slower alternative. ParaStation5 currently supports Ethernet (Fast, Gigabit and 10G Ethernet), Myrinet, InfiniBand, QsNetII and Shared Memory.

If IP connections over the high-speed interconnect are available, two distinct networks are not strictly required. Instead, one physical network can be used for both tasks. IP connections are usually configured by default in the case of Ethernet. For other networks, particular measures have to be taken in order to enable IP over these interconnects.
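
The rule for when a single physical network suffices can be written down as a small planning helper. This is a sketch under stated assumptions: the function name and its parameters are hypothetical, and the default (IP is assumed to be available only on Ethernet unless stated otherwise) simply mirrors the paragraph above rather than any ParaStation5 behaviour.

```python
def physical_networks_needed(interconnect, ip_over_interconnect=None):
    """Return 1 if one physical network can carry both administrative
    and high-speed traffic, otherwise 2.

    interconnect: name of the high-speed interconnect, e.g. "Ethernet".
    ip_over_interconnect: True/False if known; if None, IP is assumed
    to be configured only for Ethernet (illustrative default).
    """
    if ip_over_interconnect is None:
        ip_over_interconnect = interconnect.strip().lower() == "ethernet"
    return 1 if ip_over_interconnect else 2


if __name__ == "__main__":
    for name, ip in [("Ethernet", None), ("Myrinet", None), ("InfiniBand", True)]:
        print(name, physical_networks_needed(name, ip))
```

Passing `ip_over_interconnect=True` covers the case where IP has been explicitly enabled on a non-Ethernet interconnect.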

1. It is possible to span a ParaStation cluster across multiple processor architectures. The daemons will communicate with each other, but this is currently not a supported configuration. For more details, please contact <support@par-tec.com>.

ParaStation5 Administrator's Guide
