HP Cluster Test Software manual Inspectibfabric.pl

node6 HCA-1    1    0    0    0    0    0
node8 HCA-1    1    0    0    0    0    0
node3 HCA-1    1    0    0    0    0    0
node4 HCA-1    1    0    0    0    0    0
node8 HCA-1    1    0    0    0    0    0
node7 HCA-1    1    0    0    0    0    0

inspect_ib_fabric.pl

Description – The inspect_ib_fabric.pl utility is provided as an additional tool for checking for errors in the InfiniBand fabric. This utility invokes ibnetdiscover and perfquery to detect components in the fabric and check their port counters. This information is then displayed in various formats, including one that shows errors on an InfiniBand link basis, depending on which output format flags are specified.

Usage –

# inspect_ib_fabric.pl [-(details|summary|links|linkerrs|mapping|full)] [-scan=<file>] [-map=<file>] [-refresh] [-nocounters] [-swirate=<rate>] [-hcarate=<rate>] [-rate=<rate>]

Output Format Options:

-details – Displays each InfiniBand switch and HCA, along with a list of active ports and their error counters. Includes GUID, LID, and total port count information.

-summary – Displays a single-line entry for each InfiniBand component detected in the fabric. Includes GUID, name, active/available port count, and total error count.

-links – Displays each physical link between the InfiniBand components in the fabric. Each link is depicted by one of ‘<====>’, ‘<**==>’, ‘<==**>’, or ‘<****>’. A ‘**’ in the link depiction indicates an error on that side of the link. Links are displayed using the component name. The detected link speed is also shown.

-linkerrs – Displays only the links with errors and provides a detailed view of each link error.

-mapping – Displays each InfiniBand component along with the name being used to identify that component.

-full – Default. Displays all of the above formats.
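
The four link depictions listed for -links can be read mechanically. The following is a minimal Python sketch of that reading, assuming the markers appear exactly as written above; the function name classify_link and the tuple it returns are illustrative and are not part of inspect_ib_fabric.pl.

```python
# Sketch only: interprets the link markers described for the -links format.
# classify_link and its return shape are hypothetical helpers, not part of
# the inspect_ib_fabric.pl utility.

LINK_MARKERS = {
    "<====>": (False, False),  # no errors on either side of the link
    "<**==>": (True, False),   # error on the left-hand side
    "<==**>": (False, True),   # error on the right-hand side
    "<****>": (True, True),    # errors on both sides
}


def classify_link(marker):
    """Return (left_has_error, right_has_error) for a link marker."""
    try:
        return LINK_MARKERS[marker]
    except KeyError:
        raise ValueError(f"unrecognized link marker: {marker!r}")
```

A script that post-processes saved -links output could use such a helper to decide which end of a link to inspect first.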

Fabric Scan Options:

-scan=<file> – Specifies the ibnetdiscover input/output file. By default, the output file is /opt/clustertest/logs/ibnetdiscover.log.

-map=<file> – Specifies a node-name map file to use with the ibnetdiscover --node-name-map option. This file is used to override the default description text that is tied to each GUID.

-refresh – When an ibnetdiscover input file is specified with -scan, this option skips running ibnetdiscover to generate a new file, so the InfiniBand fabric is not scanned.

-nocounters – Does not collect port counter information.

Expected Link Rate Options:

-swirate=<rate> – Sets the expected switch-to-switch link rate (for example, ‘4xDDR’).

-hcarate=<rate> – Sets the expected switch-to-HCA link rate (for example, ‘4xDDR’).

-rate=<rate> – Sets the expected switch-to-switch and switch-to-HCA link rate. The default expected link rate is ‘4xQDR’.
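
When the utility is driven from a script, the options above can be assembled into an argument list. The following Python sketch shows one way to do that. It assumes that inspect_ib_fabric.pl is on the PATH and that a single output format flag is passed per run; the helper name build_command and its parameter names are hypothetical. The sketch only builds the argument list and does not run the utility.

```python
# Sketch only: assembles an inspect_ib_fabric.pl command line from the
# documented options. build_command and its parameters are hypothetical;
# the flag spellings are taken from the usage line in this section.

OUTPUT_FORMATS = ("details", "summary", "links", "linkerrs", "mapping", "full")


def build_command(output_format="full", scan=None, node_map=None,
                  refresh=False, nocounters=False,
                  swirate=None, hcarate=None, rate=None):
    """Return the argument list for one inspect_ib_fabric.pl invocation."""
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"unknown output format: {output_format!r}")

    # Assumption: the script is found on PATH under this name.
    argv = ["inspect_ib_fabric.pl", f"-{output_format}"]

    # Fabric scan options
    if scan is not None:
        argv.append(f"-scan={scan}")
    if node_map is not None:
        argv.append(f"-map={node_map}")
    if refresh:
        argv.append("-refresh")
    if nocounters:
        argv.append("-nocounters")

    # Expected link rate options
    if swirate is not None:
        argv.append(f"-swirate={swirate}")
    if hcarate is not None:
        argv.append(f"-hcarate={hcarate}")
    if rate is not None:
        argv.append(f"-rate={rate}")

    return argv
```

For example, build_command("linkerrs", rate="4xDDR") yields the arguments for showing only links with errors while expecting a 4xDDR rate on all links. The resulting list can be passed to subprocess.run on a node where the utility is installed.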

Naming and mapping – The inspect_ib_fabric.pl utility identifies GUIDs in the InfiniBand fabric by the description text common to other InfiniBand utilities and by a generated name. The generated name is in the format SWxxxyy or HCAxxxyy for switches and HCAs respectively.

Whenever possible, inspect_ib_fabric.pl attempts to group InfiniBand components together using the system GUID. If multiple components are detected in the fabric with the same system GUID, then they will use the same xxx identifier. The yy identifier is used to uniquely identify each component with the same system GUID. For example, if a switch with a fabric board and two line boards were discovered in the fabric utilizing the same system GUID, they would be named
