Troubleshooting

1. Observe the problem.

2. Isolate the problem.

Observing the Problem

Take a step back and think about what is being seen. Ask the following questions:

If the SAN has been working, what has changed? Apart from a catastrophic hardware failure, SANs typically do not just stop working – something has changed. For example, a commonly overlooked SAN change is a switch firmware upgrade. Even something as seemingly innocuous as an upgrade to disk drive firmware in a RAID storage system can have unexpected effects on a SAN.
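One way to answer "what has changed?" is to compare a recorded inventory of component firmware versions from before the problem with one taken after it. The following is a minimal sketch of that comparison. The component names, version strings, and the `diff_inventory` helper are all hypothetical and are not part of any Emulex tool.

```python
def diff_inventory(before, after):
    """Return (component, old_version, new_version) for every difference."""
    changes = []
    # Walk the union of component names so additions and removals are caught too.
    for component in sorted(set(before) | set(after)):
        old = before.get(component)  # None if the component is new
        new = after.get(component)   # None if the component was removed
        if old != new:
            changes.append((component, old, new))
    return changes


# Hypothetical snapshots; names and versions are invented for illustration.
before = {"switch-a": "7.4.1", "hba-0": "2.82", "raid-disk-fw": "A1"}
after = {"switch-a": "8.0.2", "hba-0": "2.82", "raid-disk-fw": "A2"}

changes = diff_inventory(before, after)
for component, old, new in changes:
    print(f"{component}: {old} -> {new}")
```

Keeping such a snapshot under change control makes it easier to spot an overlooked switch or disk drive firmware upgrade.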

What is the observed behavior compared to the expected behavior? For example, a planned storage failover during system maintenance takes eight minutes, but was expected to complete in two minutes. Observe behavior on two levels:

The overall problem. For example, users experienced an outage of eight minutes.

The exact problem. For example, the storage controller is reporting an error.

Is the expected behavior supported by the storage and system providers? For example, do the storage and path management software providers support a manually initiated path failover of two minutes? A storage provider might support two-minute failovers only when its internal Redundant Array of Independent Disks (RAID) controller cache is disabled.

What are the exact symptoms? Make a list. Examples:

The mouse pointer stopped moving for 30 seconds immediately after failover was initiated, then changed to an hourglass until the failover completed.

Path management software reported errors in the system error log 60 seconds after the failover was initiated.

Adapter FC link to switch dropped and did not recover immediately after failover was initiated.
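A symptom list is easier to reason about when each entry is recorded as an offset from the moment failover was initiated. The sketch below shows one possible way to build such a timeline. The timestamps are invented, and `build_timeline` is a hypothetical helper, not a vendor utility.

```python
from datetime import datetime


def build_timeline(failover_start, events):
    """Return (offset_seconds, description) pairs sorted by offset."""
    timeline = []
    for timestamp, description in events:
        offset = (timestamp - failover_start).total_seconds()
        timeline.append((offset, description))
    # Sorting tuples orders by offset first, then by description for ties.
    return sorted(timeline)


# Hypothetical observation times for the example symptoms.
failover_start = datetime(2024, 1, 1, 2, 0, 0)
events = [
    (datetime(2024, 1, 1, 2, 1, 0), "Path management software logged errors"),
    (datetime(2024, 1, 1, 2, 0, 0), "Adapter FC link to switch dropped"),
    (datetime(2024, 1, 1, 2, 0, 0), "Mouse pointer stopped moving"),
]

timeline = build_timeline(failover_start, events)
for offset, description in timeline:
    print(f"+{offset:.0f}s  {description}")
```

Ordering symptoms this way can help show which one appeared first, which is often a useful starting point when isolating the problem.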

Is the problem repeatable? If yes, can it be repeated on a non-production test system? Diagnosis often requires information such as system error logs. Because production systems are generally not set up to collect this information during normal operation, it is important to be able to configure a non-production test system to collect data and then recreate the problem there.
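Once the problem has been recreated on a test system, it can help to extract only the log entries recorded around the failover. The following sketch assumes a hypothetical log format in which each line begins with an ISO 8601 timestamp followed by a space. Real system error logs differ by platform, so the parsing would need to be adapted.

```python
from datetime import datetime, timedelta


def lines_in_window(lines, start, window):
    """Return log lines whose leading timestamp falls in [start, start + window]."""
    end = start + window
    selected = []
    for line in lines:
        stamp, _, _message = line.partition(" ")
        try:
            when = datetime.fromisoformat(stamp)
        except ValueError:
            continue  # skip lines without a parseable leading timestamp
        if start <= when <= end:
            selected.append(line)
    return selected


# Invented sample data in the assumed format.
sample_log = [
    "2024-01-01T01:59:30 adapter link up",
    "2024-01-01T02:00:05 adapter link down",
    "2024-01-01T02:01:00 path management error",
    "continuation line without timestamp",
    "2024-01-01T02:15:00 adapter link up",
]

start = datetime(2024, 1, 1, 2, 0, 0)
captured = lines_in_window(sample_log, start, timedelta(minutes=10))
for line in captured:
    print(line)
```

The ten-minute window here is arbitrary. Choose one that comfortably covers the observed failover duration.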

What do the LEDs on the adapter indicate? Check the adapter and switch LEDs to determine their status. If the adapter LEDs stop flashing or flash an error code, the adapter may need to be returned to Emulex for repair. See “LED Reference Information” on page 18.

Troubleshooting and Maintenance Manual for LightPulse Adapters
