If you insert a hot-pluggable drive into a drive bay while the system power is on, all disk activity in the array pauses for 1 or 2 seconds while the new drive is initializing. When the drive is ready, data recovery to the replacement drive begins automatically if the array is in a fault-tolerant configuration.

If you replace a drive belonging to a fault-tolerant configuration while the system power is off, a POST message appears when the system is next powered up. This message prompts you to press the F1 key to start automatic data recovery. If you do not enable automatic data recovery, the logical volume remains in a ready-to-recover condition and the same POST message appears whenever the system is restarted.

Before replacing drives

Open Systems Insight Manager, and inspect the Error Counter window for each physical drive in the same array to confirm that no other drives have any errors. For more information, see the Systems Insight Manager documentation on the Management CD.

Be sure that the array has a current, valid backup.

Confirm that the replacement drive is of the same type as the degraded drive (either SAS or SATA and either hard drive or solid state drive).

Use replacement drives that have a capacity equal to or larger than the capacity of the smallest drive in the array. The controller immediately fails drives that have insufficient capacity.

In systems that use external data storage, be sure that the server is the first unit to be powered down and the last unit to be powered up. Taking this precaution ensures that the system does not erroneously mark the drives as failed when the server is powered up.
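The drive type and capacity checks above can be expressed as a short pre-check. The following Python sketch is illustrative only: the Drive record, its field names, and the replacement_problems function are hypothetical and are not part of any controller utility.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Drive:
    """Hypothetical drive record used only for this illustration."""
    interface: str    # "SAS" or "SATA"
    media: str        # "HDD" (hard drive) or "SSD" (solid state drive)
    capacity_gb: int


def replacement_problems(replacement, degraded, array_drives):
    """Return the reasons a replacement drive is unsuitable (empty list if none)."""
    problems = []
    # The replacement must match the degraded drive in interface and media type.
    if replacement.interface != degraded.interface:
        problems.append("interface differs from the degraded drive")
    if replacement.media != degraded.media:
        problems.append("media type differs from the degraded drive")
    # Capacity must be equal to or larger than the smallest drive in the array.
    smallest = min(d.capacity_gb for d in array_drives)
    if replacement.capacity_gb < smallest:
        problems.append("capacity is smaller than the smallest drive in the array")
    return problems
```

An empty result means that the replacement drive passes these two checks. The sketch does not cover the backup, error counter, or power sequencing precautions.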

In some situations, you can replace more than one drive at a time without data loss. For example:

In RAID 1+0 configurations, drives are mirrored in pairs. You can replace several drives simultaneously if they are not mirrored to other removed or failed drives.

In RAID 50 configurations, drives are arranged in parity groups. You can replace several drives simultaneously if the drives belong to different parity groups. If two drives belong to the same parity group, replace those drives one at a time.

In RAID 6 configurations, you can replace any two drives simultaneously.

In RAID 60 configurations, drives are arranged in parity groups. You can replace several drives simultaneously if no more than two of the drives being replaced belong to the same parity group.
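The guidelines above reduce to a limit on how many drives can be missing from one group at the same time. The following Python sketch shows one way to check a planned replacement against those limits. The function name, the RAID level labels, and the group numbering are assumptions made for this illustration. The sketch also assumes that drives that have already failed are counted together with the drives being removed, which the guidelines state explicitly only for RAID 1+0.

```python
from collections import Counter

# Maximum number of drives that can be missing from one group at the same time,
# taken from the guidelines above. A group is a mirrored pair (RAID 1+0) or a
# parity group (RAID 50 and RAID 60). A RAID 6 array is treated as one group.
MAX_MISSING_PER_GROUP = {"1+0": 1, "50": 1, "6": 2, "60": 2}


def can_replace_together(raid_level, groups_of_missing_drives):
    """Check whether a set of drives can be replaced simultaneously.

    groups_of_missing_drives holds one group identifier for each drive that is
    being replaced or has already failed (hypothetical numbering).
    """
    limit = MAX_MISSING_PER_GROUP[raid_level]
    if raid_level == "6":
        # Group identifiers are ignored: the limit applies to the whole array.
        return len(groups_of_missing_drives) <= limit
    counts = Counter(groups_of_missing_drives)
    return all(count <= limit for count in counts.values())
```

For example, can_replace_together("50", [0, 1]) is True because the two drives belong to different parity groups, and can_replace_together("50", [1, 1]) is False because both drives belong to parity group 1.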

To remove more drives from an array than the fault tolerance method can support, follow the previous guidelines for removing several drives simultaneously, and then wait until the rebuild is complete (as indicated by the drive LEDs) before removing additional drives.

However, if fault tolerance has been compromised, and you must replace more drives than the fault tolerance method can support, delay drive replacement until after you attempt to recover the data (refer to "Recovering from compromised fault tolerance" on page 82).

Automatic data recovery (rebuild)

When you replace a drive in an array, the controller uses the fault-tolerance information on the remaining drives in the array to reconstruct the missing data (the data that was originally on the replaced drive) and then write the data to the replacement drive. This process is called automatic data recovery or rebuild. If fault tolerance is compromised, the controller cannot reconstruct the data, and the data is likely lost permanently.
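As an illustration of how fault-tolerance information allows missing data to be reconstructed, single-parity RAID levels such as RAID 5 commonly store the bytewise XOR of the data blocks in each stripe. The following Python sketch demonstrates that principle with made-up block contents. It is not the controller's rebuild implementation, and the xor_blocks helper is hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe spread across three data drives (illustrative values only).
data = [b"\x10\x20", b"\x0f\x0f", b"\xaa\x55"]
parity = xor_blocks(data)

# The drive holding data[1] is replaced. Its block is rebuilt from the
# surviving data blocks and the parity block.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
```

In this sketch, rebuilt is identical to data[1]. If two blocks of the same stripe were missing, a single parity block would not be enough to recover either one, which corresponds to the compromised fault tolerance case described above.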
