PAR Technologies V5 manual Configuration of the ParaStation system, Copy template, HWType


Chapter 4. Configuration

After the ParaStation software has been installed successfully, only a few modifications to the configuration file parastation.conf(5) are needed to enable ParaStation on the local cluster.

4.1. Configuration of the ParaStation system

This section describes the basic configuration procedure needed to enable ParaStation. It covers the configuration of ParaStation5 using TCP/IP (Ethernet) and the optimized ParaStation5 protocol p4sock.

Most of the configuration work consists of editing the central configuration file parastation.conf, which is located in /etc.

A template file can be found in /opt/parastation/config/parastation.conf.tmpl. Copy this file to /etc/parastation.conf and edit it as appropriate.

This section describes all parameters of /etc/parastation.conf necessary to customize ParaStation for a basic cluster environment. A detailed description of all possible configuration parameters in the configuration file can be found within the parastation.conf(5) manual page.

The following steps have to be executed on the frontend node to configure the ParaStation daemon psid(8):

1. Copy template

Copy the file /opt/parastation/config/parastation.conf.tmpl to /etc/parastation.conf.

The template file contains all parameters known to the ParaStation daemon psid(8). Most of these parameters appear in commented-out lines that show their default values. Only those that have to be modified in order to adapt ParaStation to the local environment are enabled. In addition, each parameter is illustrated by an example in the comments. A more detailed description of all parameters can be found in the parastation.conf(5) manual page.

The template file is a good starting point for creating a working ParaStation configuration for your cluster. Besides basic information about the cluster, the template defines all hardware components ParaStation is able to handle. Since writing these definitions from scratch requires deeper knowledge of ParaStation, it is easiest to start from a copy of the template.
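The copy step can be sketched as a small shell function. This is a minimal sketch, not part of ParaStation: the function name install_template and the .bak backup are assumptions made for illustration, and only the two default paths come from this guide.

```shell
#!/bin/sh
# install_template [TEMPLATE] [TARGET]
# Copies the configuration template into place. The function name and the
# .bak backup are illustrative assumptions, not ParaStation tooling.
install_template() {
    template="${1:-/opt/parastation/config/parastation.conf.tmpl}"
    target="${2:-/etc/parastation.conf}"

    # Refuse to continue if the template cannot be read.
    if [ ! -r "$template" ]; then
        echo "template not readable: $template" >&2
        return 1
    fi

    # Keep a copy of an existing configuration before overwriting it.
    if [ -e "$target" ]; then
        cp -p "$target" "$target.bak" || return 1
    fi

    cp "$template" "$target"
}
```

Called without arguments as root on the frontend node, install_template uses the paths given above. Passing two arguments allows a dry run against a scratch directory first.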

2. Define number of nodes

The parameter NrOfNodes has to be set to the actual number of nodes within the cluster. Frontend nodes count as part of the cluster. For example, if the cluster contains 8 nodes with a fast interconnect plus one frontend node, NrOfNodes has to be set to 9 so that parallel tasks can be started from the frontend.
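The arithmetic of this example can be sketched in shell. The helper name nr_of_nodes_line is hypothetical, and the printed form "NrOfNodes num" is assumed to match the syntax documented in parastation.conf(5).

```shell
#!/bin/sh
# nr_of_nodes_line COMPUTE_NODES FRONTEND_NODES
# Prints the NrOfNodes line for a cluster; frontend nodes are counted too.
# Hypothetical helper for illustration only.
nr_of_nodes_line() {
    compute_nodes="$1"
    frontend_nodes="$2"
    echo "NrOfNodes $((compute_nodes + frontend_nodes))"
}
```

For the cluster described above, nr_of_nodes_line 8 1 prints the line NrOfNodes 9.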

3. HWType

The HWType parameter tells ParaStation which general kind of communication hardware should be used. This setting can be overridden on a per-node basis within the nodes section (see below).

For clusters running ParaStation5 with the optimized ParaStation communication stack on Ethernet hardware of any flavor, this parameter has to be set to:

HWType { p4sock ethernet }

This will use the optimized ParaStation protocol, if available. Otherwise, TCP/IP will be used.
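Because most parameters in the template are commented out, it can help to confirm which settings are actually active after editing. The following is a minimal sketch; the helper name active_setting is hypothetical, and it assumes that comment lines begin with # and that the parameter name is written exactly as shown in this guide.

```shell
#!/bin/sh
# active_setting PARAMETER [CONFIGFILE]
# Prints the uncommented lines that set PARAMETER. Lines starting with '#'
# never match, because the parameter name must be the first word on the line.
# Hypothetical helper for illustration only.
active_setting() {
    grep -E "^[[:space:]]*$1([[:space:]]|\$)" "${2:-/etc/parastation.conf}"
}
```

With the settings from steps 2 and 3 in place, active_setting HWType should print the HWType line shown above, and active_setting NrOfNodes should print the node count.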

ParaStation5 Administrator's Guide
