Troubleshooting A Damaged Array
CAUTION: Many repairs may only be done by a certified service technician. You should only perform troubleshooting and simple repairs as authorized in your product documentation, or as directed by the online or telephone service and support team. Damage due to servicing that is not authorized by Dell is not covered by your warranty. Read and follow the safety instructions that came with the product.
1. Ensure that the following components are properly installed:
• Physical disks
• RAID controller modules
• Power supply modules
• Cooling fan module
2. Ensure that all the cables are properly connected and that there are no damaged pins in the connectors.
3. Run the diagnostics available in Dell PowerVault Modular Disk (MD) Storage Manager.
4. In the Array Management Window (AMW), select a component in the Hardware pane of the Hardware tab.
5. Select Hardware → RAID Controller Module → Advanced → Run Diagnostics → RAID Controller Module.
Controller Failure Conditions
Certain events can cause a RAID controller module to fail, shut down, or both. Unrecoverable ECC memory or PCI errors, or critical physical conditions, can cause a lockdown. If your RAID storage array is configured for redundant access and cache mirroring, the surviving controller can normally recover without data loss or shutdown.
Critical Conditions
The storage array generates a critical event if the RAID controller module detects a critical condition that could cause immediate failure of the array and/or loss of data. The storage array is in a critical condition if one of the following occurs:
• More than one fan has failed
• Any midplane temperature sensor is in the critical range
• Midplane/power supply module failure
• Two or more temperature sensors are unreadable
• Failure to detect, or inability to communicate with, the peer port
NOTE: If both RAID controller modules fail simultaneously, the enclosure cannot issue critical or noncritical event alarms for any enclosure component.
Noncritical Conditions
A noncritical condition is an event or status that does not cause immediate failure, but must be corrected to ensure continued reliability of the storage array. Examples of noncritical events include the following:
• One power supply module has failed
• One cooling fan module has failed
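The critical and noncritical conditions listed above can be read as a simple decision rule. The following Python sketch is illustrative only: the EnclosureStatus class, its field names, and the classify function are hypothetical and are not part of MD Storage Manager or any Dell API. It assumes that the "Midplane/power supply module failure" item can be modeled as a single flag, and that any state not named in the lists above is reported as "none".

```python
from dataclasses import dataclass


@dataclass
class EnclosureStatus:
    """Hypothetical snapshot of enclosure health (not a Dell API)."""
    failed_fans: int = 0
    midplane_sensors_in_critical_range: int = 0
    # Assumption: the "Midplane/power supply module failure" item is one flag.
    midplane_or_power_supply_failure: bool = False
    unreadable_temp_sensors: int = 0
    peer_port_ok: bool = True
    # Noncritical item: exactly one power supply module has failed.
    single_power_supply_failed: bool = False


def classify(status: EnclosureStatus) -> str:
    """Return 'critical', 'noncritical', or 'none' for the listed conditions."""
    critical = (
        status.failed_fans > 1
        or status.midplane_sensors_in_critical_range > 0
        or status.midplane_or_power_supply_failure
        or status.unreadable_temp_sensors >= 2
        or not status.peer_port_ok
    )
    if critical:
        # A critical condition takes precedence over any noncritical one.
        return "critical"
    if status.failed_fans == 1 or status.single_power_supply_failed:
        return "noncritical"
    return "none"


if __name__ == "__main__":
    print(classify(EnclosureStatus(failed_fans=1)))
    print(classify(EnclosureStatus(failed_fans=2)))
```

The sketch checks critical conditions first, so an enclosure with one failed power supply module and an unreachable peer port is reported as critical rather than noncritical.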