PAR Technologies V5 manual Table of Contents


Table of Contents

1. Introduction ..... 1
1.1. What is ParaStation ..... 1
1.2. The history of ParaStation ..... 1
1.3. About this document ..... 2
2. Technical overview ..... 3
2.1. Runtime daemon ..... 3
2.2. Libraries ..... 3
2.3. Kernel modules ..... 3
2.4. License ..... 4
3. Installation ..... 5
3.1. Prerequisites ..... 5
3.2. Directory structure ..... 6
3.3. Installation via RPM packages ..... 7
3.4. Installing the documentation ..... 9
3.5. Installing MPI ..... 10
3.6. Further steps ..... 10
3.7. Uninstalling ParaStation5 ..... 11
4. Configuration ..... 13
4.1. Configuration of the ParaStation system ..... 13
4.2. Enable optimized network drivers ..... 14
4.3. Testing the installation ..... 15
5. Insight ParaStation5 ..... 17
5.1. ParaStation5 pscom communication library ..... 17
5.2. ParaStation5 protocol p4sock ..... 17
5.2.1. Directory /proc/sys/ps4/state ..... 18
5.2.2. Directory /proc/sys/ps4/ether ..... 18
5.2.3. Directory /proc/sys/ps4/local ..... 19
5.2.4. p4stat ..... 19
5.3. Controlling process placement ..... 19
5.4. Using the ParaStation5 queuing facility ..... 20
5.5. Exporting environment variables for a task ..... 20
5.6. Using non-ParaStation applications ..... 20
5.7. ParaStation5 TCP bypass ..... 21
5.8. Controlling ParaStation5 communication paths ..... 21
5.9. Authentication within ParaStation5 ..... 22
5.10. Homogeneous user ID space ..... 23
5.11. Single system view ..... 23
5.12. Parallel shell tool ..... 23
5.13. Nodes and CPUs ..... 23
5.14. Integration with AFS ..... 24
5.15. Integrating external queuing systems ..... 24
5.15.1. Integration with PBS PRO ..... 25
5.15.2. Integration with OpenPBS ..... 25
5.15.3. Integration with Torque ..... 25
5.15.4. Integration with LSF ..... 25
5.15.5. Integration with LoadLeveler ..... 25
5.16. Multicasts ..... 25
5.17. Copying files in parallel ..... 26
5.18. Using ParaStation accounting ..... 26
5.19. Using ParaStation process pinning ..... 27
5.20. Using memory binding ..... 27
5.21. Spawning processes belonging to all groups ..... 27
5.22. Changing the default ports for psid(8) ..... 27
6. Troubleshooting ..... 29
6.1. Problem: psiadmin returns error ..... 29

ParaStation5 Administrator's Guide
