# lsid

Platform LSF HPC 6.2 for SLURM, LSF_build_date

Copyright 1992-2005 Platform Computing Corporation

My cluster name is hptclsf

My master name is lsfhost.localdomain

2. Verify that the lsf partition exists and all nodes are in the idle state:

# sinfo

PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
lsf       up    infinite  8     idle  n[1-8]
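The check in step 2 can also be scripted. The following is a minimal sketch and is not part of the HP XC software: all_idle is a hypothetical helper that reads default-format sinfo output on standard input and assumes that STATE is the fifth column, as in the sample output.

```shell
# Hypothetical helper (not an HP XC or SLURM command).
# Reads default-format sinfo output on stdin; succeeds only if the named
# partition appears and every one of its rows reports the idle state.
all_idle() {
    awk -v part="$1" '
        NR == 1 { next }                  # skip the header row
        {
            name = $1
            sub(/\*$/, "", name)          # sinfo marks the default partition with *
            if (name == part) {
                seen = 1
                if ($5 != "idle") bad = 1
            }
        }
        END { if (!seen || bad) exit 1 }
    '
}

# On a live system:
#   sinfo | all_idle lsf && echo "lsf: all nodes idle"
```

The helper also fails when the partition is missing altogether, so a single exit status covers both conditions named in the step.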

3. Confirm that the ncpus value matches the expected total number of available processors:

# lshosts

HOST_NAME    type     model     cpuf  ncpus  maxmem  maxswp  server  RESOURCES
lsfhost.loc  SLINUX6  Opteron8  16.0  60     3649M   -       Yes     (slurm)
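To compare ncpus against a known processor count without reading the table by eye, something like the following sketch might help. The helper name ncpus_of_first_host and the EXPECTED variable are hypothetical; the helper only assumes that the first line of the lshosts output is a header row containing the column name ncpus.

```shell
# Hypothetical helper (not an LSF command).
# Reads lshosts output on stdin and prints the ncpus value of the first
# host row, locating the column by its name in the header row.
ncpus_of_first_host() {
    awk '
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == "ncpus") col = i; next }
        col     { print $col; exit }
    '
}

# On a live system, with EXPECTED set to the processor count for your cluster:
#   [ "$(lshosts | ncpus_of_first_host)" -eq "$EXPECTED" ] || echo "ncpus mismatch"
```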

4. Verify the dynamic resource information:

# bhosts

HOST_NAME           STATUS  JL/U  MAX  NJOBS  RUN  SSUSP  USUSP  RSV
lsfhost.localdomai  ok      -     16   0      0    0      0      0

See the troubleshooting information in the HP XC System Software Administration Guide if you do not receive a status of ok from the bhosts command.
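The status check in step 4 can be automated along the following lines. This is only a sketch: bhosts_all_ok is a hypothetical helper, and it assumes that STATUS is the second column of the bhosts output, as in the sample.

```shell
# Hypothetical helper (not an LSF command).
# Reads bhosts output on stdin, prints each host whose STATUS is not ok,
# and fails if it printed any.
bhosts_all_ok() {
    awk '
        NR == 1 { next }                  # skip the header row
        NF == 0 { next }                  # ignore blank lines
        $2 != "ok" { print $1 ": " $2; bad = 1 }
        END { if (bad) exit 1 }
    '
}

# On a live system:
#   bhosts | bhosts_all_ok || echo "see the troubleshooting information"
```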

5.11 Running the OVP to Verify Software and Hardware Components

The Operation Verification Program (OVP) verifies the major HP XC software and hardware components to provide a level of confidence that the system has been installed and configured correctly.

The OVP performs tests to verify the following:

The interconnect is functional.

Network connectivity has been established.

The administration network is operational.

A valid license key file is installed and the license manager servers are up.

All compute nodes are responding and are available to run applications.

SLURM control daemons are responding and partitioning is valid if LSF-HPC with SLURM is configured.

CPU usage on all nodes except the head node (by default).

Memory usage on all compute nodes except the head node (by default).

Start the Operation Verification Program

To start the OVP, follow these steps:

1. Log in as the root user on the head node.

2. Start the OVP with no component-specific options to test the entire system:

# ovp [--verbose [--verbose]] [--timeout=0]

3. Follow along with the OVP command output.

4. Examine the test results to ensure that all tests passed. Test results are stored in a date-stamped log file located in the /hptc_cluster/adm/logs/ovp directory.

Test failures and warnings are clearly reported in the log file, which also contains some troubleshooting information. In some cases, the errors might be obvious and the test output terse.
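Because each OVP run writes its own log file, it can be convenient to locate the newest one by modification time. The following sketch assumes nothing about the log file names; latest_log is a hypothetical helper, and only the directory path comes from this guide.

```shell
# Hypothetical helper (not part of the OVP).
# Prints the path of the most recently modified entry in a directory,
# defaulting to the OVP log directory named in this guide.
latest_log() (
    dir=${1:-/hptc_cluster/adm/logs/ovp}
    newest=$(ls -t "$dir" | head -n 1)    # ls -t sorts newest first
    [ -n "$newest" ] && printf '%s/%s\n' "$dir" "$newest"
)

# On a live system:
#   less "$(latest_log)"
```

The helper returns a nonzero status when the directory is empty or missing, so a calling script can tell that no OVP run has been logged yet.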

The format of the OVP log file name includes the following:

The internal name of the head node.

The OVP run date in MMDDYY format.
