PAR Technologies V5 manual Processes maxprocs, CPUmap map

Page 47

This only comes into play, if the user does not define a sorting strategy explicitely via PSI_NODES_SORT. Be aware of the fact that using a batch-system like PBS or LSF *will* set the strategy explicitely, namely to NONE.

overbook { true yes 1 false no 0 }

If the argument is one of yes, true or 1, all nodes may be overbooked by the user using the PSI_OVERBOOK environment variable.

If the argument is one of no, false or 0, ParaStation will deny overbooking of the nodes, even if PSI_OVERBOOK is set.

It might be useful to prohibit the start of processes on a frontend machine since usually this machine is reserved for interactive work done by the users. When the execution of processes is forbidden on a distinct node, parallel task might be started from this node anyhow.

The default is to allow all nodes to run processes of parallel tasks.

processes maxprocs

Define the maximum number of processes per node.

This parameter can be set during runtime via the set maxproc directive within the ParaStation administration and management tool psiadmin(1).

pinProcs { true yes 1 false no 0 }

Enables or disables process pinning for compute tasks. If enabled, tasks will be pinned down to particular CPU-slots. The mapping between those CPU-slots and physical CPUs and cores is made using a mapping list. See CPUmap below.

The pinProcs parameter can be set during runtime via the set pinprocs directive within the ParaStation administration and management tool psiadmin(1).

bindMem { true yes 1 false no 0 }

This parameter must be set to true if nodes providing non-Uniform memory access (NUMA) should use 'local' memory for the tasks.

This parameter can be set during runtime via the set bindmem directive within the ParaStation administration and management tool psiadmin(1).

CPUmap { map }

Set the map used to assign CPU-slots to physical cores to map. Map is a quoted string containing a space-separated permutation of the number 0 to Ncore-1. Here Ncore is the number of physical cores available on this node. The number of cores within a distinct node may be determined via list hw. The first number in map is the number of the physical core the first CPU-slot will be mapped to, and so on.

This parameter can be set during runtime via the set bindmem directive within the ParaStation administration and management tool psiadmin(1).

supplGrps { true yes 1 false no 0 }

This parameter must be set to true if processes spawned by ParaStation should belong to all groups defined for this user. Otherwise, they will only belong to the primary group.

This parameter can be set during runtime via the set supplementaryGroups directive within the ParaStation administration and management tool psiadmin(1).

ParaStation5 Administrator's Guide

43

Image 47
Contents Administrators Guide Info@par-tec.com ParaStation5 Administrators GuideTable of Contents Problem different groups of nodes are seen as up or down History of ParaStation IntroductionAbout this document Kernel modules Technical overviewRuntime daemon LibrariesLicense Hardware InstallationPrerequisites Kernel version Directory structureSoftware Mpi2, mpi2-intel, mpi2-pgi, mpi2-psc Installation via RPM packagesGetting the ParaStation5 RPM packages ManFile Version Installing the RPMsCompiling the ParaStation5 packages from source ParaStation entries Installing the documentationEtc/init.d/xinetd reload # rpm -Uv psmpi2.5.0.0-1.i586.rpm Installing MPIFurther steps # rpm -Uv psdoc-5.0.0-1.noarch.rpmUninstalling ParaStation5 ParaStation5 Administrators Guide Define Number of nodes ConfigurationConfiguration of the ParaStation system Copy templateHostname id HWType runJob starter accounter Enable optimized network drivers# /opt/parastation/bin/testconfig Testing the installation # /opt/parastation/bin/testnodes -np nodes # /opt/parastation/bin/psiadmin -s -c listParaStation5 pscom communication library Insight ParaStation5# cat /proc/sys/ps4/state/connections # echo 10 /proc/sys/ps4/state/ResendTimeoutDirectory /proc/sys/ps4/state Directory /proc/sys/ps4/local Controlling process placementExporting environment variables for a task Using the ParaStation5 queuing facilityUsing non-ParaStationapplications Export LDPRELOAD=/opt/parastation/lib64/libp4tcp.so Controlling ParaStation5 communication pathsExport PSPLIB=/opt/parastation/lib64/libpscomopenib.so Authentication within ParaStation5Pspshm or Pspsharedmem PSPP4S or PSPP4SOCKNodes and CPUs Homogeneous user ID spaceSingle system view Parallel shell toolPSIRARGPRE0=/some/path/env2tok Integration with AFSIntegrating external queuing systems Tok2envMulticasts Route add -net 224.0.0.0 netmask 240.0.0.0 dev ethX Using ParaStation accountingCopying files in parallel # UseMCastSpawning processes belonging to all groups Using ParaStation process pinningUsing memory binding Changing the default ports for psid8Port Problem node shown as down TroubleshootingProblem psiadmin returns error Problem cannot start process on frontend Problem cannot start parallel taskProblem bad performance Problem different groups of nodes are seen as up or downProblem psid does not startup, reports port in use Problem pssh failsProblem processes cannot access files on remote nodes Reference Pages ParaStation5 Administrators Guide Parameters InstallDir inst-dir , InstallationDir inst-dirParastation.conf DescriptionStatusscript SetupscriptStartscript StopscriptElan P4sockOpenib MvapiNrOfNodes num Accounter$GENERATE 1-96 node$0,2 $0 Node node17 16 HWType ethernet p4sock starter yes runJobs noMCastGroup group-num SelectTime timeDeadInterval num LogLevel numMemLock size Core sizeCPUTime time DataSize sizeProc CPUmap map Processes maxprocsRdpResendTimeout ms RdpTimeout msStatusTimeout ms RdpClosedTimeout msSee also ErrorsParaStation5 Administrators Guide Options PsiadminSynopsis Extended description Standard ErrorStandard Input Standard OutputAllproc cnt count ExitAll Load Count hw hwDown HardwareQuit RdpSummary max max Maxproc nodes Accounters nodesUser nodes Group nodesNodesSort nodes Master nodesFreeOnSuspend nodes HandleOldBins nodesRlrss nodes Cpumap nodesRdpResendTimeout nodes RdpTimeout nodesStatusTimeout nodes RdpClosedTimeout nodesRestart nodes Resolve nodesPsiddebug mask nodes Selecttime time nodesPattern Name Description HandleOldBins 0 1 nodes Rdpmaxretrans val nodes RdpResendTimeout ms nodes RdpTimeout ms nodesStatusTimeout ms nodes RdpClosedTimeout ms nodesVerbose FilesQuiet NormalPsid Logfile=file Configfile=fileDebug=level ? , --usage Show a help message TestconfigFilename NumParaStation5 Administrators Guide Map TestnodesNp num Cnt numParaStation5 Administrators Guide Testpse -npnum TestpseParaStation5 Administrators Guide ?,--help P4statSock NetParaStation5 Administrators Guide Delete P4tcpAdd ParaStation5 Administrators Guide Pattern Description PsaccounterVar/account/yyyymmdd Accounting files, one per day DumpcoreCoredir=dir ?, --helpPsaccview Ls,--ltotsum Lj,--ljobsLu,--ltotuser Lg,--ltotgroupEnd CpuweightAqtime CputimeInitialization file Mlisten ParaStation5 Administrators Guide Appendix A. Quick Installation Guide Testing # /opt/parastation/bin/psiadmin psiadmin add# chkconfig -a /etc/init.d/parastation Appendix B. ParaStation license Page Page Page Changes to the runtime environment Building and installing ParaStation5 packages# psiadmin -s Appendix C. Upgrading ParaStation4 to ParaStation5Page ARP GlossarySee ParaStation Logger To share a common address space within a node ParaStation5 Administrators Guide