PAR Technologies V5 Enable optimized network drivers, Hostname id HWType runJob starter accounter


The values that can be assigned to the HWType parameter must be defined within the parastation.conf configuration file. Check the Hardware sections of this file to find out which hardware types are actually defined.

Other possible types are: mvapi, openib, gm, ipath, elan, dapl.

To enable shared memory communication used within SMP nodes, no dedicated hardware entry is required. Shared memory support is always enabled by default. As there are no options for shared memory, no dedicated hardware section for this kind of interconnect is provided.

4. Define Nodes

Furthermore, ParaStation has to be told which nodes are part of the cluster. The usual way of using the Nodes parameter is the environment mode, which is already enabled in the template file.

The general syntax of the Nodes environment is one entry per line. Each entry has the form

hostname id [HWType] [runJobs] [starter] [accounter]

This registers the node hostname with the ParaStation system under the ParaStation ID id. The ParaStation ID has to be an integer between 0 and NrOfNodes-1.

For each cluster node defined within the Nodes environment at least the hostname of the node and the ParaStation ID of this node have to be given. The optional parameters HWType, runJobs, starter and accounter may be ignored for now. For a detailed description of these parameters refer to the parastation.conf(5) manual page.
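The rules above (hostname and ID are mandatory, the ID is an integer between 0 and NrOfNodes-1) can be checked mechanically before the file is distributed. The following is only a minimal sketch: the function names parse_node_entry and validate_entries are hypothetical helpers and not part of ParaStation, and the check that every ID is used only once is an assumption not stated in the text.

```python
def parse_node_entry(line):
    """Split one 'hostname id [optional fields...]' entry into its parts."""
    fields = line.split()
    if len(fields) < 2:
        raise ValueError(f"entry needs at least a hostname and an id: {line!r}")
    hostname, raw_id = fields[0], fields[1]
    try:
        node_id = int(raw_id)
    except ValueError:
        raise ValueError(f"ParaStation ID is not an integer: {raw_id!r}") from None
    # Optional fields are returned untouched; they are not interpreted here.
    return hostname, node_id, fields[2:]


def validate_entries(lines, nr_of_nodes):
    """Return {id: hostname} or raise ValueError on an invalid entry."""
    nodes = {}
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        hostname, node_id, _optional = parse_node_entry(line)
        if not 0 <= node_id <= nr_of_nodes - 1:
            raise ValueError(
                f"ID {node_id} of {hostname} is outside 0..{nr_of_nodes - 1}"
            )
        if node_id in nodes:  # assumption: each ID may be used only once
            raise ValueError(f"ID {node_id} is used by more than one node")
        nodes[node_id] = hostname
    return nodes


if __name__ == "__main__":
    # Hypothetical host names for illustration only.
    print(validate_entries(["frontend 0", "node01 1", "node02 2"], nr_of_nodes=3))
```

This does not replace the test_config command shipped with ParaStation; it only illustrates the ID range rule.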

Usually the nodes are listed in order of increasing ParaStation ID, beginning with 0 for the first node. If a front end node exists and should be integrated into the ParaStation system, it should usually be configured with ID 0.

Within an Ethernet cluster the mapping between hostnames and ParaStation ID is completely unrestricted.
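For larger clusters the entries can be generated instead of typed by hand. The sketch below follows the convention described above, numbering nodes in increasing order and giving an optional front end the ID 0. The helper name node_entries and the host names are assumptions made for illustration; only the mandatory hostname and id fields are written.

```python
def node_entries(compute_hosts, frontend=None):
    """Build 'hostname id' lines, with the front end (if any) at ID 0."""
    hosts = ([frontend] if frontend else []) + list(compute_hosts)
    return [f"{host} {node_id}" for node_id, host in enumerate(hosts)]


if __name__ == "__main__":
    # Hypothetical cluster: one front end and four compute nodes.
    compute = [f"node{n:02d}" for n in range(1, 5)]
    for entry in node_entries(compute, frontend="frontend"):
        print(entry)
```

The resulting lines are meant to be pasted into the Nodes environment; NrOfNodes then has to be at least the number of generated entries.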

5. More options

More configuration options may be set as described in the configuration file parastation.conf. For details refer to the parastation.conf(5) manual page.

If the vapi (HWType ib) or DAPL (HWType dapl) layers are used for communication, e.g. over InfiniBand or 10G Ethernet, the amount of lockable memory must be increased. To do so, set the rlimit memlock option in the configuration file.
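To see which lockable-memory limit is currently in effect, the limit can be queried from a process on the node. The sketch below uses the getrlimit call of the Python standard library resource module (Unix only). It reports the limit of the calling process, which is an assumption-laden proxy: processes started by the ParaStation daemon may run with a different limit, so treat the output as a hint only.

```python
import resource


def memlock_limits():
    """Return the (soft, hard) RLIMIT_MEMLOCK values of this process in bytes."""
    return resource.getrlimit(resource.RLIMIT_MEMLOCK)


def describe(limit):
    """Render a limit value, mapping RLIM_INFINITY to 'unlimited'."""
    if limit == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{limit} bytes"


if __name__ == "__main__":
    soft, hard = memlock_limits()
    print("soft memlock limit:", describe(soft))
    print("hard memlock limit:", describe(hard))
```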

6. Copy configuration file to all other nodes

The modified configuration file must be copied to all other nodes of the cluster, e.g. using psh. Afterwards, restart all ParaStation daemons.
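Where psh is not yet usable, the copy step can also be scripted with a generic tool. The following sketch only builds and prints the commands (a dry run) and deliberately uses scp rather than psh, whose syntax is not shown here. The path /etc/parastation.conf, the host names, and the helper copy_commands are assumptions for illustration; adjust them to the actual installation.

```python
import shlex


def copy_commands(config_path, hosts):
    """Return one scp argument list per host, copying the file to the same path."""
    return [["scp", config_path, f"{host}:{config_path}"] for host in hosts]


if __name__ == "__main__":
    # Assumed file location and hypothetical host names.
    config = "/etc/parastation.conf"
    for cmd in copy_commands(config, ["node01", "node02", "node03"]):
        # Dry run: print the command instead of executing it.
        print(shlex.join(cmd))
```

To actually run the commands, each argument list could be passed to subprocess.run(cmd, check=True). The daemons still have to be restarted afterwards as described above.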

In order to verify the configuration, the command

# /opt/parastation/bin/test_config

can be run. This command analyzes the configuration file and reports any configuration errors. After finishing these steps, the configuration of ParaStation is done.

4.2. Enable optimized network drivers

As explained in the previous chapter, ParaStation5 comes with its own versions of adapted network drivers for Intel (e1000) and Broadcom (bcm5700) NICs. If the optimized ParaStation protocol p4sock is used to


ParaStation5 Administrator's Guide
