IBM DS8000 manual: Predictive Failure Analysis (PFA), Floating spares, Hot pluggable DDMs

Models: DS8000


On the ESS 800, the spare creation policy was to have four spare DDMs on each SSA loop for each DDM type. This meant that a single SSA loop could hold 12 spare DDMs if you chose to populate it with three different DDM sizes. The DS8000 is designed to avoid this. A minimum of one spare is created for each array site defined until the following conditions are met:

• A minimum of 4 spares per DA pair

• A minimum of 4 spares of the largest capacity array site on the DA pair

• A minimum of 2 spares of capacity and RPM greater than or equal to the fastest array site of any given capacity on the DA pair
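The three conditions above can be sketched as a simple check for one DA pair. This is only an illustration, not the DS8000 microcode: the names Ddm and spare_conditions_met are hypothetical, and the reading of the third condition (for each capacity present, at least two spares that match or exceed that capacity and the fastest RPM found at that capacity) is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ddm:
    """Hypothetical description of a drive type: capacity and spindle speed."""
    capacity_gb: int
    rpm: int


def spare_conditions_met(spares, array_site_types):
    """Return True once all three spare conditions hold for one DA pair.

    spares: list of Ddm currently assigned as spares.
    array_site_types: list of Ddm, one entry per array site on the DA pair.
    """
    # Condition 1: at least 4 spares on the DA pair.
    enough_total = len(spares) >= 4

    # Condition 2: at least 4 spares as large as the largest array site.
    largest = max(t.capacity_gb for t in array_site_types)
    enough_large = sum(s.capacity_gb >= largest for s in spares) >= 4

    # Condition 3 (assumed reading): for each capacity, find the fastest
    # array site and require 2 spares that match or exceed it on both axes.
    fastest_by_capacity = {}
    for t in array_site_types:
        current = fastest_by_capacity.get(t.capacity_gb, 0)
        fastest_by_capacity[t.capacity_gb] = max(current, t.rpm)
    enough_fast = all(
        sum(s.capacity_gb >= cap and s.rpm >= rpm for s in spares) >= 2
        for cap, rpm in fastest_by_capacity.items()
    )

    return enough_total and enough_large and enough_fast
```

Under this reading, a DA pair with 146 GB 10k RPM and 73 GB 15k RPM array sites is not satisfied by four 146 GB 10k RPM spares alone, because no spare matches the speed of the 15k RPM sites.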

Floating spares

The DS8000 implements a smart floating technique for spare DDMs. On an ESS 800, the spare floats: when a DDM fails and its data is rebuilt onto a spare, the replacement disk installed later becomes the new spare. The data is not migrated to another DDM, such as the DDM in the position the failed DDM originally occupied. In other words, on an ESS 800 there is no post-repair processing.

The DS8000 microcode may choose to allow the hot spare to remain where it has been moved, but it may instead choose to migrate the spare to a better position. This is done to better balance the spares across the DA pairs, the loops, and the enclosures. It may be preferable for a DDM that is currently in use as an array member to be converted to a spare. In this case the data on that DDM is migrated in the background onto an existing spare. This process does not fail the disk that is being migrated, though it does reduce the number of available spares in the DS8000 until the migration is complete.

A smart process is used to ensure that the larger or higher RPM DDMs always act as spares. This is preferable because if the contents of a 146 GB DDM were rebuilt onto a 300 GB DDM, approximately half of the 300 GB DDM would be wasted, since that space is not needed. In addition, the failed 146 GB DDM will be replaced with a new 146 GB DDM, so the DS8000 microcode will most likely migrate the data back onto the newly installed 146 GB DDM. When this process completes, the 146 GB DDM rejoins the array and the 300 GB DDM becomes the spare again. Another example is a 73 GB 15k RPM DDM that fails onto a 146 GB 10k RPM DDM. The data has now moved to a slower DDM, but the replacement DDM will be the same type as the failed DDM, so the array would end up with a mix of RPMs, which is not desirable. Again, a smart migration of the data is performed once suitable spares have become available.
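The two examples above can be worked through in a short sketch. The actual decision logic of the DS8000 microcode is not described here, so the rule in should_migrate_back is an assumption: migrate when the spare in use is larger than, or a different speed from, the array's drive type, and the replacement drive matches the array. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ddm:
    """Hypothetical description of a drive type: capacity and spindle speed."""
    capacity_gb: int
    rpm: int


def wasted_fraction(failed, spare):
    """Fraction of the spare left unused after rebuilding the failed DDM onto it."""
    unused_gb = max(0, spare.capacity_gb - failed.capacity_gb)
    return unused_gb / spare.capacity_gb


def should_migrate_back(array_type, spare_in_use, replacement):
    """Assumed rule: migrate back when the spare does not match the array
    and the newly installed replacement does."""
    mismatched = (
        spare_in_use.capacity_gb > array_type.capacity_gb
        or spare_in_use.rpm != array_type.rpm
    )
    return mismatched and replacement == array_type


# 146 GB rebuilt onto 300 GB: (300 - 146) / 300 of the spare is unused.
print(round(wasted_fraction(Ddm(146, 10000), Ddm(300, 10000)), 2))  # 0.51

# 73 GB 15k RPM rebuilt onto 146 GB 10k RPM: the array now mixes RPMs.
print(should_migrate_back(Ddm(73, 15000), Ddm(146, 10000), Ddm(73, 15000)))  # True
```

The first result matches the "approximately half" figure in the text: 154 GB of the 300 GB spare goes unused.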

Hot pluggable DDMs

Replacement of a failed drive does not affect the operation of the DS8000 because the drives are fully hot pluggable. Because each disk plugs into a switch, there is no loop break associated with the removal or replacement of a disk. In addition, there is no potentially disruptive loop initialization process.

4.6.5 Predictive Failure Analysis® (PFA)

The drives used in the DS8000 incorporate Predictive Failure Analysis (PFA) and can anticipate certain forms of failures by keeping internal statistics of read and write errors. If the error rates exceed predetermined threshold values, the drive will be nominated for replacement. Because the drive has not yet failed, data can be copied directly to a spare drive. This avoids using RAID recovery to reconstruct all of the data onto the spare drive.
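The threshold idea behind PFA can be illustrated with a minimal sketch. The real statistics and threshold values are internal to the drive and are not given here, so the counters, the DriveStats name, and the threshold constants below are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical thresholds: errors per operation. Real values are drive-internal.
READ_ERROR_THRESHOLD = 1e-4
WRITE_ERROR_THRESHOLD = 1e-4


@dataclass
class DriveStats:
    """Hypothetical running counters a drive might keep."""
    reads: int = 0
    writes: int = 0
    read_errors: int = 0
    write_errors: int = 0


def nominate_for_replacement(stats):
    """Return True when either error rate exceeds its threshold."""
    read_rate = stats.read_errors / stats.reads if stats.reads else 0.0
    write_rate = stats.write_errors / stats.writes if stats.writes else 0.0
    return read_rate > READ_ERROR_THRESHOLD or write_rate > WRITE_ERROR_THRESHOLD
```

A drive flagged this way is still readable, which is why its data can be copied straight to a spare instead of being reconstructed through RAID recovery.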

78 DS8000 Series: Concepts and Architecture
