communication, such as a cable failure, can cause the controller to fail numerous disk drives. Once the loss of communication is resolved, rebuilding all the failed drives can take many weeks.
The rebuild journals contain bitmaps that indicate which portions of the disks in a tier have been updated with new data while a disk was failed or replaced. The system uses the information in the journals to reduce the rebuild time of drives that have not been swapped out. This can dramatically lower rebuild time, since only portions of the tier may have been updated while the drive was failed or replaced.
The granularity of the journal will be 4 MB of data on a single disk, or 32 MB of host data. Thus a single host write will force the system to rebuild a minimum of 4 MB of data on the disk. A new host write into a 4 MB section that has already been journaled will not cause a new journal entry. The system will automatically update journals when disks are failed or replaced, regardless of whether journaling is enabled.
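The bitmap behavior described above can be illustrated with a minimal sketch. It assumes one bit per 4 MB section of a single disk. The class name RebuildJournal and its methods are hypothetical and are not taken from the controller firmware; only the 4 MB granularity comes from the text.

```python
from typing import List

CHUNK_BYTES = 4 * 1024 * 1024  # 4 MB journal granularity per disk (from the text)


class RebuildJournal:
    """Hypothetical per-disk dirty bitmap: one bit per 4 MB section."""

    def __init__(self, disk_bytes: int) -> None:
        # Ceiling division so a partial trailing section still gets a bit.
        self.num_chunks = -(-disk_bytes // CHUNK_BYTES)
        self.bits = bytearray(-(-self.num_chunks // 8))

    def record_write(self, offset: int, length: int) -> None:
        """Mark every 4 MB section touched by a write at byte `offset`."""
        if length <= 0:
            return
        first = offset // CHUNK_BYTES
        last = (offset + length - 1) // CHUNK_BYTES
        for chunk in range(first, last + 1):
            # Setting an already-set bit changes nothing, which mirrors
            # "will not cause a new journal entry" for a journaled section.
            self.bits[chunk // 8] |= 1 << (chunk % 8)

    def dirty_chunks(self) -> List[int]:
        """Return the indices of the sections that must be rebuilt."""
        return [c for c in range(self.num_chunks)
                if self.bits[c // 8] & (1 << (c % 8))]

    def bytes_to_rebuild(self) -> int:
        """Total bytes to rebuild on this disk when the journal is used."""
        return len(self.dirty_chunks()) * CHUNK_BYTES


journal = RebuildJournal(64 * 1024 * 1024)   # illustrative 64 MB disk
journal.record_write(0, 512)                 # small write dirties one section
journal.record_write(1024, 512)              # same section, no new entry
print(journal.bytes_to_rebuild())            # one 4 MB section, in bytes
```

Even a 512-byte write marks a full 4 MB section, which is why a single host write forces a minimum 4 MB rebuild on the disk.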
To ensure that the journals are correct, the system carefully monitors the state of the journals and will automatically invalidate or disable the journals if it detects a condition where the journal cannot be used or journal information could potentially be lost.
The following summarizes the limitations that apply to journaling:
• Rebuild journaling will automatically be disabled if the failed disk is swapped with a new disk. The system will track the serial number of the disks when they are failed and will force a rebuild of the entire disk if the serial number changes.
• Rebuild journaling will not be used when a failed disk is replaced by a spare. The rebuild journal can be used when rebuilding a replaced disk that has not been swapped.
• The system will invalidate the journal on tiers that have failed or replaced disks on boot up. This is required because the system does not save the journal information.
• Rebuild journaling will be managed by the controller that owns the tier. If a controller is failed, then the journals on the tiers owned by that controller will be invalidated.
• The system tracks the original owner of a tier when a drive is failed, so changing the ownership of the tier will disable use of the journal for rebuilds on that tier.
• Rebuild journaling will be disabled when rebuilding disks that are failed due to a change in the parity mode of the tier.
• Use of the rebuild journal will be temporarily disabled if the system is rebuilding a LUN that is a backup LUN in a mirror group.
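The limitations above amount to a set of checks made before a journal-assisted rebuild. The following sketch collects them in one function, purely as an illustration. The RebuildContext fields and the journal_blockers function are hypothetical names, and the way the real controller evaluates these conditions is not described in this section.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RebuildContext:
    """Hypothetical snapshot of the state consulted before a rebuild."""
    journal_valid: bool                 # False after a reboot or owning-controller failure
    serial_at_failure: str              # disk serial number recorded when the disk failed
    serial_now: str                     # serial number of the disk now in the slot
    rebuilding_onto_spare: bool         # failed disk was replaced by a spare
    owner_at_failure: str               # controller that owned the tier at failure time
    owner_now: str                      # controller that owns the tier now
    failed_by_parity_mode_change: bool  # disk failed due to a parity mode change
    rebuilding_mirror_backup_lun: bool  # rebuilding a backup LUN in a mirror group


def journal_blockers(ctx: RebuildContext) -> List[str]:
    """Return the reasons the journal cannot be used; empty means usable."""
    blockers = []
    if not ctx.journal_valid:
        blockers.append("journal invalidated (reboot or controller failure)")
    if ctx.serial_now != ctx.serial_at_failure:
        blockers.append("disk swapped (serial number changed)")
    if ctx.rebuilding_onto_spare:
        blockers.append("failed disk replaced by a spare")
    if ctx.owner_now != ctx.owner_at_failure:
        blockers.append("tier ownership changed")
    if ctx.failed_by_parity_mode_change:
        blockers.append("disk failed due to parity mode change")
    if ctx.rebuilding_mirror_backup_lun:
        blockers.append("rebuilding a mirror group backup LUN")
    return blockers


same_disk = RebuildContext(True, "SN1", "SN1", False, "A", "A", False, False)
swapped = RebuildContext(True, "SN1", "SN2", False, "A", "A", False, False)
print(journal_blockers(same_disk))  # no blockers: journal can shorten the rebuild
print(journal_blockers(swapped))    # serial changed: full rebuild required
```

Returning the list of reasons rather than a single boolean makes it easier to see which limitation forced a full rebuild.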