Hard Drive Installation and Replacement

Drive Failure During Rebuild

If another drive in the array fails while fault tolerance is unavailable during rebuild, a fatal system error may occur. If this happens, all data on the array is lost. In exceptional cases, however, failure of another drive need not lead to a fatal system error. These exceptions include:

Failure after activation of a spare drive

Failure of a drive that is not mirrored to any other failed drives (in a RAID 1+0 configuration)

Failure of a second drive in a RAID ADG configuration

Minimizing Fatal System Errors During Rebuild

When a hard drive is replaced, the controller gathers fault-tolerance data from the remaining drives in the array. It then uses this data to rebuild the missing data (originally on the failed drive) onto the replacement drive. If more than one drive is removed at a time, the fault-tolerance data is incomplete. The missing data cannot then be reconstructed and is likely to be permanently lost.
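
The effect of incomplete fault-tolerance data can be illustrated with a minimal sketch. It assumes a RAID 5-style scheme in which the parity block is the XOR of the data blocks in a stripe; the guide does not describe the controller's internal method, and the helper name xor_blocks and the block contents are hypothetical.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe spread across three data drives (illustrative contents only).
data = [b"\x0a\x0b", b"\x10\x20", b"\x03\x04"]
parity = xor_blocks(data)

# Drive 1 is replaced: XOR of the surviving blocks and the parity block
# restores the block that was on the failed drive.
rebuilt = xor_blocks([data[0], data[2], parity])

# Drives 1 and 2 are both missing: the survivors yield only the XOR of the
# two lost blocks, which is not enough to recover either block on its own.
combined = xor_blocks([data[0], parity])
```

With one drive missing, rebuilt equals the lost block. With two drives missing, combined mixes both lost blocks together, which is why removing more than one drive at a time risks permanent data loss under this assumed scheme.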

To minimize the likelihood of fatal system errors, take these precautions when removing failed drives:

Do not remove a degraded drive if any other member of the array is offline (the Online LED is off). In this condition, no other drive in the array can be removed without data loss.

There are some exceptions:

When RAID 1+0 is used, drives are mirrored in pairs. Several drives can be in a failed condition simultaneously (and they can all be replaced simultaneously) without data loss, as long as no two failed drives belong to the same mirrored pair.

When RAID ADG is used, two drives can fail simultaneously (and be replaced simultaneously) without data loss.

If an online spare has an unlit Online LED (it is offline), the degraded drive can still be replaced.
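
The rule and its exceptions can be summarized as a small check. This is only a sketch of the logic described above, not controller firmware or a utility supplied with the controller: the function name tolerates_failures, the drive numbering, and the mirror_pairs layout are hypothetical, the single-failure limit for RAID 5 is an assumption consistent with the general rule, and online spares are not modeled.

```python
def tolerates_failures(level, failed, mirror_pairs=()):
    """Return True if the array keeps its data with these drives failed or removed."""
    failed = set(failed)
    if level == "RAID 1+0":
        # Safe as long as no mirrored pair has lost both of its members.
        return not any(a in failed and b in failed for a, b in mirror_pairs)
    if level == "RAID ADG":
        # Two drives can be failed (and replaced) at the same time.
        return len(failed) <= 2
    if level == "RAID 5":
        # Assumption: only one drive can be missing at a time.
        return len(failed) <= 1
    raise ValueError(f"unsupported level: {level}")

pairs = [(0, 1), (2, 3), (4, 5)]  # hypothetical mirrored pairs
one_per_pair = tolerates_failures("RAID 1+0", {0, 2, 4}, pairs)
same_pair = tolerates_failures("RAID 1+0", {0, 1}, pairs)
adg_two = tolerates_failures("RAID ADG", {0, 3})
raid5_two = tolerates_failures("RAID 5", {0, 3})
```

Here one_per_pair and adg_two are True, while same_pair and raid5_two are False, matching the cases in which drives can or cannot be removed together without data loss.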

HP Smart Array 641/642 Controller User Guide