Send documentation comments to mdsfeedback-doc@cisco.com.
Cisco MDS 9000 Family Fabric Manager Configuration Guide
OL-6965-03, Cisco MDS SAN-OS Release 2.x
Chapter 35 Troubleshooting Your Fabric
Online System Health Management
The Online Health Management System (system health) is a hardware fault detection and recovery
feature. It ensures the general health of switching, services, and supervisor modules in any switch in the
Cisco MDS 9000 Family running Cisco MDS SAN-OS Release 1.3(4) or later.
The system health application runs on all Cisco MDS modules and performs multiple tests on each module
to check individual module components and system hardware. The tests run at preconfigured intervals,
cover all major fault points, and isolate any failing component in the MDS switch. The system health
application running on the active supervisor maintains control over the system health components running on
all other modules in the switch. The system health application running on the standby supervisor module
monitors only the standby supervisor module, and only if that module is available in the HA standby mode.
On detecting a fault, the system health application attempts the following recovery actions:
• Sends Call Home and system messages and exception logs as soon as it detects a failure.
• Shuts down the failing module or component (such as an interface).
• Isolates failed ports from further testing.
• Reports the failure to the appropriate software component.
• Switches to the standby supervisor module if an error is detected on the active supervisor module,
and a standby supervisor module exists in the Cisco MDS switch. After the switchover, the new
active supervisor module restarts the active supervisor tests.
• Reloads the switch if a standby supervisor module does not exist in the switch.
• Provides CLI support to view, test, and obtain test run statistics or change the system health test
configuration on the switch.
• Performs tests to focus on the problem area.
• Retrieves its configuration information from persistent storage.
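The supervisor-related recovery actions above can be read as a small decision rule. The following Python sketch illustrates that reading only; it is not Cisco code. The names `Fault`, `Action`, and `choose_recovery` are hypothetical, and the sketch assumes that the reload applies when the fault is on the active supervisor and no standby supervisor exists.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    """Hypothetical labels for three of the recovery actions listed above."""
    SWITCHOVER = "switch to the standby supervisor module"
    RELOAD = "reload the switch"
    SHUT_DOWN = "shut down the failing module or component"


@dataclass
class Fault:
    # True when the error was detected on the active supervisor module.
    on_active_supervisor: bool


def choose_recovery(fault: Fault, standby_available: bool) -> Action:
    """Pick a recovery action for a detected fault (illustrative assumption)."""
    if fault.on_active_supervisor:
        # A standby supervisor takes over if one exists; otherwise reload.
        return Action.SWITCHOVER if standby_available else Action.RELOAD
    # Faults elsewhere are contained by shutting down the failing part.
    return Action.SHUT_DOWN


if __name__ == "__main__":
    print(choose_recovery(Fault(on_active_supervisor=True), standby_available=False))
```

The real feature also sends Call Home messages, isolates failed ports, and reports to other software components; those steps are omitted here to keep the decision rule visible.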
Each module is configured to run the tests relevant to that module. You can change the default parameters
of the tests in each module as required.
By default, the system health feature is enabled in each switch in the Cisco MDS 9000 Family.
Loopback Test Configuration Frequency
Loopback tests are designed to identify hardware errors in the data path in the modules and the control
path in the supervisors. One loopback frame is sent to each module at a preconfigured frequency; it
passes through each configured interface and returns to the supervisor module.
The loopback tests can be run at frequencies ranging from 5 seconds (default) to 255 seconds. The
configured value is used for all modules. To configure the frequency of loopback tests, refer to the Cisco
MDS 9000 Family Configuration Guide.
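When scripting around this setting, it can help to check a candidate value against the documented range before applying it. The sketch below is a minimal illustration in Python; the function name `validate_loopback_frequency` and the constants are hypothetical and are not part of any Cisco tool. Only the 5 to 255 second range and the 5-second default come from the text above.

```python
MIN_FREQUENCY_S = 5      # lowest documented loopback frequency, in seconds
MAX_FREQUENCY_S = 255    # highest documented loopback frequency, in seconds
DEFAULT_FREQUENCY_S = 5  # documented default


def validate_loopback_frequency(seconds: int = DEFAULT_FREQUENCY_S) -> int:
    """Return `seconds` if it lies in the documented range, else raise."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(seconds, bool) or not isinstance(seconds, int):
        raise TypeError("loopback frequency must be a whole number of seconds")
    if not MIN_FREQUENCY_S <= seconds <= MAX_FREQUENCY_S:
        raise ValueError(
            f"loopback frequency must be between {MIN_FREQUENCY_S} "
            f"and {MAX_FREQUENCY_S} seconds, got {seconds}"
        )
    return seconds


if __name__ == "__main__":
    print(validate_loopback_frequency(50))
```

Because the configured value is used for all modules, a single check of this kind covers the whole switch.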
Performing Internal Loopbacks
Internal loopback tests send and receive FC2 frames to and from the same ports and report the round-trip
time in microseconds. These tests are available for both Fibre Channel and iSCSI interfaces.
Choose Interface > Diagnostics > Internal to perform an internal loopback test from Device Manager.
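To make the idea of a round-trip measurement concrete, the following Python sketch times a frame that is sent to a port and returned by it. It is a conceptual illustration only: `measure_round_trip_us` and `fake_port` are hypothetical names, the "port" is an in-memory function rather than a Fibre Channel or iSCSI interface, and the sketch does not reflect how Device Manager implements the test.

```python
import time
from typing import Callable, Tuple


def measure_round_trip_us(
    send_and_receive: Callable[[bytes], bytes], frame: bytes
) -> Tuple[bool, float]:
    """Send `frame`, wait for the echo, and return (intact, round trip in microseconds)."""
    start_ns = time.perf_counter_ns()
    echoed = send_and_receive(frame)
    elapsed_ns = time.perf_counter_ns() - start_ns
    # The test passes only if the frame comes back unchanged.
    return echoed == frame, elapsed_ns / 1000.0


def fake_port(frame: bytes) -> bytes:
    """Stand-in for a port in internal loopback: returns the frame as received."""
    return frame


if __name__ == "__main__":
    intact, rtt_us = measure_round_trip_us(fake_port, b"test-frame")
    print(f"frame intact: {intact}, round trip: {rtt_us:.1f} microseconds")
```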