
C.6.1 Broken Intermediate Link

Sometimes message traffic passes through the fabric while other traffic appears to be blocked. In this case, MPI jobs fail to run.

In large cluster configurations, switches may be attached to other switches in order to supply the necessary inter-node connectivity. Problems with these inter-switch (or intermediate) links are sometimes more difficult to diagnose than the failure of a final link between a switch and a node. The failure of an intermediate link may allow some traffic to pass through the fabric while other traffic is blocked or degraded.

If you encounter such behavior in a multi-layer fabric, check that all switch cable connections are correct. Per-port statistics are available on managed switches and may help with debugging. Contact your switch vendor for more information.
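If the host has the standard OpenFabrics InfiniBand diagnostic utilities installed (an assumption; availability, package names, and output formats vary by distribution and release), commands such as the following offer one way to narrow down a failing intermediate link:

$ iblinkinfo        # list every fabric link with its state, width, and speed
$ perfquery 4 1     # read the error counters of the port at LID 4, port 1 (example values)

A broken or marginal intermediate link typically shows up as a Down or degraded port in the iblinkinfo output, or as rapidly increasing error counters (for example, SymbolErrorCounter or LinkDownedCounter) reported by perfquery.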

C.7 Performance Issues

This section covers performance issues that are currently being addressed.

C.7.1 MVAPICH Performance Issues

Performance tuning for MVAPICH over OpenFabrics over InfiniPath has not yet been done. Improved performance will be delivered in future releases.

C.8 InfiniPath MPI Troubleshooting

Problems specific to compiling and running MPI programs are detailed below.

C.8.1 Mixed Releases of MPI RPMs

Make sure that all of the MPI RPMs are from the same release. If components of the MPI RPMs come from different releases, mpirun reports an error. The following is a sample of the message produced when mpirun from release 1.3 is used with a 2.0 library:

$ mpirun -np 2 -m ~/tmp/x2 osu_latency

MPI_runscript-xqa-14.0: ssh -x> Cannot detect InfiniPath interconnect.

MPI_runscript-xqa-14.0: ssh -x> Seek help on loading InfiniPath interconnect driver.

MPI_runscript-xqa-15.1: ssh -x> Cannot detect InfiniPath interconnect.

MPI_runscript-xqa-15.1: ssh -x> Seek help on loading InfiniPath interconnect driver.

MPIRUN: Node program(s) exited during connection setup
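One way to verify that all installed packages come from the same release is to query the RPM database. The grep patterns below are only examples; the actual package names depend on how InfiniPath MPI is packaged on your system:

$ rpm -qa | grep -i infinipath     # list installed InfiniPath packages and their versions
$ rpm -qa | grep -i mpi            # list installed MPI packages and their versions

All of the listed packages should report the same release number. If they do not, remove the mismatched packages and reinstall the complete set from a single release.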
