Hitachi Lightning 9900™ User and Reference Guide 9
1.2 Reliability, Availability, and Serviceability
The Lightning 9900™ subsystem is not expected to fail in any way that would interrupt user
access to data. The 9900 can sustain multiple component failures and still continue to
provide full access to all stored user data. Note: While access to user data is never
compromised, the failure of a key component can degrade performance.
The reliability, availability, and serviceability features of the 9900 subsystem include:
Full fault-tolerance. The 9900 subsystem provides full fault-tolerance capability for all
critical components. The disk drives are protected against error and failure by enhanced
RAID technologies and dynamic scrubbing and sparing. The 9900 uses component and
function redundancy to provide full fault-tolerance for all other subsystem components
(microprocessors, control storage, power supplies, etc.). The 9900 has no active single
point of component failure and is designed to provide continuous access to all user data.
Separate power supply systems. Each storage cluster is powered by a separate set of
power supplies. Each set can power the entire subsystem by itself in the unlikely event
that the other set fails. The power supplies of each set can be connected across
power boundaries, so that each set can continue to provide power if a power outage
occurs. The 9900 can sustain the loss of multiple power supplies and still continue
operation.
Dynamic scrubbing and sparing for disk drives. The 9900 uses special diagnostic
techniques and dynamic scrubbing to detect and correct disk errors. Dynamic sparing is
invoked automatically if needed. The 9960 can be configured with up to sixteen spare
disk drives, and any spare disk can back up any other disk of the same capacity, even if
the failed disk and spare disk are in different array domains (attached to different ACP
pairs).
Dynamic duplex cache. The 9900 cache is divided into two equal segments on separate
power boundaries. The 9900 places all write data in both cache segments with one
internal write operation, so the data is always duplicated (duplexed) across power
boundaries. If one copy of write data is defective or lost, the other copy is immediately
destaged to disk. This duplex design ensures full data integrity in the event of a cache or
power failure.
Remote copy features. The Hitachi TrueCopy and Hitachi Extended Remote Copy
(HXRC) data movement features enable the user to set up and maintain duplicate copies
of S/390® and open-system data over extended distances. In the event of a system
failure or site disaster, the secondary copy of data can be invoked rapidly, allowing
applications to be recovered with guaranteed data integrity.
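
The spare-selection rule described under dynamic sparing (any spare can back up any disk of the same capacity, regardless of array domain) can be illustrated with a short sketch. This is not the subsystem's microcode: the `Disk` class, the `pick_spare` function, and the capacity values are hypothetical names and numbers chosen for illustration only. The sixteen-spare limit is the 9960 figure given above.

```python
from dataclasses import dataclass
from typing import List, Optional

MAX_SPARES = 16  # 9960 maximum stated in the text


@dataclass
class Disk:
    disk_id: str        # hypothetical identifier
    capacity_gb: int    # illustrative capacity value
    acp_pair: int       # array domain the disk is attached to
    in_use: bool = False


def pick_spare(failed: Disk, spares: List[Disk]) -> Optional[Disk]:
    """Return the first free spare whose capacity matches the failed disk.

    The array domain (acp_pair) is deliberately not compared, because the
    text says a spare may sit behind a different ACP pair.
    """
    if len(spares) > MAX_SPARES:
        raise ValueError("more spares configured than the stated maximum")
    for spare in spares:
        if not spare.in_use and spare.capacity_gb == failed.capacity_gb:
            spare.in_use = True  # mark the spare as taken
            return spare
    return None  # no matching spare is available


# Usage: the failed disk is in domain 0, the only matching spare is in domain 1.
spares = [Disk("S0", 73, acp_pair=0), Disk("S1", 18, acp_pair=1)]
failed = Disk("D7", 18, acp_pair=0)
chosen = pick_spare(failed, spares)
```

In this sketch the match is made on capacity alone, so `chosen` is the spare in the other array domain. How the real subsystem orders candidate spares is not stated in this section, so the first-match order here is an assumption.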
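
The dynamic duplex cache behavior described above can also be modeled in a few lines. This is a simplified sketch under stated assumptions, not a description of the actual cache hardware: the `DuplexCache` class and its method names are hypothetical, and the Python loop stands in for what the text describes as a single internal write operation.

```python
class DuplexCache:
    """Toy model of a write cache duplexed across two segments."""

    def __init__(self):
        # Two equal segments, assumed to sit on separate power boundaries.
        self.segments = [{}, {}]
        # Stand-in for the disk drives that receive destaged data.
        self.disk = {}

    def write(self, block, data):
        # Every write lands in both segments, so two copies always exist.
        for segment in self.segments:
            segment[block] = data

    def lose_segment(self, index):
        # Simulate loss of one segment (cache or power failure), then
        # destage the surviving copy of each block to disk right away.
        self.segments[index].clear()
        survivor = self.segments[1 - index]
        for block, data in survivor.items():
            self.disk[block] = data


# Usage: write one block, fail segment 0, and confirm the data reached disk.
cache = DuplexCache()
cache.write(42, b"payload")
cache.lose_segment(0)
```

After `lose_segment(0)` the block is present in `cache.disk`, which mirrors the statement that the remaining copy is immediately destaged when one copy is defective or lost.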