PAR Technologies V5 manual Accounter, NrOfNodes num

accounter

This is a pseudo communication layer rather than actual communication hardware. It is used only to configure nodes running the ParaStation accounting daemon and should appear only in the Nodes entry of such a node.

NrOfNodes num

Define the number of connected nodes, including the frontend node. The nodes will be numbered 0 ... num-1.

There is no default value for NrOfNodes; it must always be declared in the configuration file.

NrOfNodes has to be declared before any Nodes entry.
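
The numbering rule can be illustrated with a minimal Python sketch. This is not ParaStation code; the helper name node_ids and the lower bound of 1 are assumptions made for illustration only.

```python
def node_ids(nr_of_nodes: int) -> range:
    """Return the node IDs implied by 'NrOfNodes nr_of_nodes': 0 ... nr_of_nodes-1."""
    # Assumption: at least one node (the frontend) is always counted.
    if nr_of_nodes < 1:
        raise ValueError("NrOfNodes must be at least 1")
    return range(nr_of_nodes)


ids = node_ids(4)      # e.g. a frontend plus three compute nodes
print(list(ids))       # [0, 1, 2, 3]
print(4 in ids)        # False: an ID of 4 would require NrOfNodes 5 or more
```

With NrOfNodes set to 4, the highest usable node ID is 3, since the frontend is part of the count.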

HWType { ethernet | p4sock | openib | mvapi | gm | elan | dapl | none }

HWType { { ethernet | p4sock | openib | mvapi | gm | elan | dapl | none }... }

Define the default communication hardware available on the nodes of the ParaStation cluster. This may be overridden by an explicit HWType option in a Node statement.

The hardware types used within this command must be defined beforehand in Hardware declarations.

Further hardware declarations may be defined by the user, but this is largely undocumented.

It is possible to enable more than one hardware type, either as default or on a per node basis.

The default value of HWType is none.
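
The way a per-node HWType relates to the cluster-wide default can be modelled as follows. This is a hedged Python sketch, not the daemon's actual logic: the function effective_hwtype is hypothetical, and it assumes that a per-node setting replaces the default rather than adding to it.

```python
# Hardware names taken from the HWType synopsis above.
BUILTIN_HW = ("ethernet", "p4sock", "openib", "mvapi", "gm", "elan", "dapl", "none")


def effective_hwtype(cluster_default=None, node_override=None, declared=BUILTIN_HW):
    """Return the list of hardware types a node ends up with.

    Assumption: a per-node HWType replaces the cluster-wide one. If neither
    is given, the documented default 'none' applies.
    """
    if node_override is not None:
        chosen = list(node_override)
    elif cluster_default is not None:
        chosen = list(cluster_default)
    else:
        chosen = ["none"]
    # Every type must have been declared beforehand.
    undeclared = [hw for hw in chosen if hw not in declared]
    if undeclared:
        raise ValueError(f"hardware not declared beforehand: {undeclared}")
    return chosen


print(effective_hwtype())                                    # ['none']
print(effective_hwtype(["p4sock", "ethernet"]))              # ['p4sock', 'ethernet']
print(effective_hwtype(["p4sock", "ethernet"], ["openib"]))  # ['openib']
```

The second call shows more than one hardware type enabled as the default; the third shows a single node deviating from it.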

starter { true | yes | 1 | false | no | 0 }

If the argument is one of yes, true or 1, all nodes declared within a Node statement will be allowed to start parallel tasks, unless otherwise stated.

If the argument is one of no, false or 0, starting will not be allowed.

It might be useful to prohibit the startup of parallel tasks from the frontend machine if a batch system is used. This forces all users to go through the batch system in order to start their tasks. Otherwise it would be possible to circumvent the batch system by starting parallel tasks directly from the frontend machine.

The default is to allow the starting of parallel tasks from all nodes.
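
The accepted argument values can be summarised in a small Python sketch. The function parse_flag is hypothetical and only mirrors the values listed above; exact, case-sensitive matching is an assumption, since case handling is not specified here.

```python
TRUE_TOKENS = {"true", "yes", "1"}
FALSE_TOKENS = {"false", "no", "0"}


def parse_flag(token: str) -> bool:
    """Map an argument of 'starter' (or 'runJobs') to a boolean."""
    # Assumption: tokens are matched exactly as written in the synopsis.
    if token in TRUE_TOKENS:
        return True
    if token in FALSE_TOKENS:
        return False
    raise ValueError(f"not a valid argument: {token!r}")


print(parse_flag("yes"))  # True
print(parse_flag("0"))    # False
```

Any other value, such as "maybe", is rejected in this sketch with a ValueError.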

runJobs { true | yes | 1 | false | no | 0 }

If the argument is one of yes, true or 1, all nodes declared within a Node statement will be allowed to run processes of parallel tasks, unless otherwise stated.

If the argument is one of no, false or 0, ParaStation will not start processes on these nodes.

It might be useful to prohibit the start of processes on a frontend machine, since this machine is usually reserved for interactive work done by the users. Even if the execution of processes is forbidden on a particular node, parallel tasks may still be started from that node.

The default is to allow all nodes to run processes of parallel tasks.
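
The independence of starter and runJobs can be modelled with a short Python sketch. The class NodePolicy, the helper functions and the host names are hypothetical; only the two defaults (both enabled) are taken from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodePolicy:
    starter: bool = True    # documented default: starting tasks is allowed
    run_jobs: bool = True   # documented default: running processes is allowed


def may_start_from(cluster, name):
    """True if parallel tasks may be started from the named node."""
    return cluster[name].starter


def execution_nodes(cluster):
    """Names of all nodes on which processes of parallel tasks may run."""
    return [name for name, policy in cluster.items() if policy.run_jobs]


# A frontend reserved for interactive work: no processes, but starting is allowed.
cluster = {
    "frontend": NodePolicy(run_jobs=False),
    "node01": NodePolicy(),
    "node02": NodePolicy(),
}

print(may_start_from(cluster, "frontend"))  # True
print(execution_nodes(cluster))             # ['node01', 'node02']
```

In a batch-system setup, the frontend entry would instead use NodePolicy(starter=False), so that tasks cannot be started from it directly.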

ParaStation5 Administrator's Guide
