PAR Technologies V5 manual Glossary, Arp


Glossary

Address Resolution Protocol

A sending host decides, through a protocol's routing mechanism, that it wants to transmit to a target host located somewhere on a connected piece of a physical network. To actually transmit the hardware packet, a hardware address usually must be generated. In the case of Ethernet, this is a 48-bit Ethernet address. The addresses of hosts within a protocol are not always compatible with the corresponding hardware addresses (being of different lengths or values).

The Address Resolution Protocol (ARP) is used by the sending host to resolve the Ethernet address of the target host from its IP address. It is described in RFC 826. ARP is part of the TCP/IP protocol family.
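
As an illustration of the address mismatch described above, the following minimal sketch packs the 28-byte ARP request payload that RFC 826 defines for IPv4 over Ethernet. It is not part of ParaStation; the function name build_arp_request and the example MAC and IP addresses are assumptions chosen for this example. The sketch only builds the payload and does not send anything on the network.

```python
import socket
import struct


def build_arp_request(sender_mac: bytes, sender_ip: str, target_ip: str) -> bytes:
    """Pack an ARP request payload for IPv4 over Ethernet (RFC 826 layout)."""
    return struct.pack(
        "!HHBBH6s4s6s4s",
        1,                            # hardware type: 1 = Ethernet
        0x0800,                       # protocol type: IPv4
        6,                            # hardware address length (48 bit)
        4,                            # protocol address length (32 bit)
        1,                            # opcode: 1 = request
        sender_mac,                   # sender hardware address
        socket.inet_aton(sender_ip),  # sender protocol address
        b"\x00" * 6,                  # target hardware address: not yet known
        socket.inet_aton(target_ip),  # target protocol address to resolve
    )


# Hypothetical addresses, used only for illustration.
packet = build_arp_request(bytes.fromhex("020000000001"), "192.168.1.10", "192.168.1.20")
print(len(packet), packet.hex())
```

The target hardware address field is left as zeros in a request; the host owning the target IP address fills in its own 48-bit address in the reply.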

Administration Network

The administration network is used for exchanging (meta)data for administrative tasks between cluster nodes.

This network typically carries only a moderate data rate and can be entirely separated from the data network. Almost always, Ethernet (Fast or, increasingly, Gigabit) is used for this purpose.

Administrative Task

A single process running on one of the compute nodes within the cluster. This process does not communicate with other processes using MPI.

This task will not be accounted for within the ParaStation process management, i.e. it will not allocate a dedicated CPU. Thus, administrative tasks may be started in addition to parallel tasks.

See also Serial Task for tasks accounted for by ParaStation.

admin-task

See Administrative Task.

ARP

See Address Resolution Protocol.

Data Network

The data network is used for exchanging data between the compute processes on the cluster nodes. Typically, high bandwidth and low latency are required for this kind of network.

Interconnect types used for this network are Myrinet or InfiniBand, and (Gigabit) Ethernet for moderate bandwidth and latency requirements.

Especially for Ethernet-based clusters, the administration and data networks are often collapsed into a single interconnect.

CPU

Modern multi-core CPUs provide multiple CPU cores within a physical CPU package. Within this document, the term CPU will be used to refer to an independent computing core, regardless of the physical packaging.
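
To make the distinction between packages and cores concrete, the sketch below counts both from text in the format of /proc/cpuinfo on x86 Linux. This is only an illustration and is not how ParaStation itself determines the number of CPUs; the function name count_cpus and the sample text are assumptions made for this example.

```python
# Hypothetical excerpt in /proc/cpuinfo format: two packages with two cores each.
SAMPLE_CPUINFO = """\
processor   : 0
physical id : 0
core id     : 0

processor   : 1
physical id : 0
core id     : 1

processor   : 2
physical id : 1
core id     : 0

processor   : 3
physical id : 1
core id     : 1
"""


def count_cpus(cpuinfo_text: str) -> tuple[int, int]:
    """Return (physical packages, independent cores) found in the text."""
    packages = set()
    cores = set()
    # Each blank-line-separated block describes one logical processor.
    for block in cpuinfo_text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
        if "physical id" in fields and "core id" in fields:
            packages.add(fields["physical id"])
            # A core is identified by its package and its core id, so
            # hardware threads sharing one core are counted once.
            cores.add((fields["physical id"], fields["core id"]))
    return len(packages), len(cores)


packages, cores = count_cpus(SAMPLE_CPUINFO)
print(f"{packages} packages, {cores} CPUs in the sense of this glossary")
```

On a real Linux node, the same function could be applied to the contents of /proc/cpuinfo, provided the physical id and core id fields are present.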

DMA

See Direct Memory Access.

Direct Memory Access

In the old days, devices within a computer were not able to put data into memory on their own; the CPU had to fetch it from them and store it at the final destination manually.

Nowadays, devices such as Ethernet cards, hard disk controllers, Myrinet cards, etc. are capable of storing chunks of data into memory on their own. E.g. a disk controller is told to fetch an amount of memory from a hard disk and

ParaStation5 Administrator's Guide
