Sizing, Considerations, and Recommendations 37
5.2.3 Hard Disks
In today’s environment, hard disks are becoming more and more reliable, with an
average life expectancy around 100,000 hours. Your selection of hard disks
should provide you with the confidence and assurance of data integrity.
Unexpected data errors are a serious issue for any business, and disk
failures frustrate customers, IT managers, and system administrators alike.
So what if the disk could tell you it was having problems before it failed, or
before a piece of data became unreadable? Businesses today run critical
applications and store critical data while maintaining a 24x7x365
operation. IBM hard disks have Self-Monitoring, Analysis and Reporting
Technology (SMART) built in as standard to alert you to potential
problems.
Predictive Failure Analysis (PFA) within the SMART specification provides early
warning of some hard-disk drive failures. This allows critical data to be protected.
SMART is the industry-standard reliability prediction indicator for hard disk
drives. IBM paved the way for SMART by marketing the industry’s first
failure-prediction capability for SCSI hard disk drives.
Regular backups, combined with SMART-capable hard-disk drives, help
safeguard against loss of data. There are two kinds of hard-disk drive failures:
unpredictable and predictable. As you might expect, unpredictable failures
happen quickly, without advance warning. These failures can be caused by static
electricity, handling damage or thermal-related solder problems. Predictable
failures, on the other hand, are the types of failures that SMART attempts to
detect. These failures result from the gradual degradation of the drive’s
performance.
SMART-capable drives use a variety of techniques to monitor data availability.
These techniques vary from one manufacturer to another. For example, a SMART
drive might monitor the fly height of the head above the magnetic media. If the
head starts to fly too high or too low, there’s a good chance the drive could fail.
Other drives might monitor different conditions, such as ECC circuitry on the
hard-drive card or soft-error rates. Depending on the manufacturer and model, a
drive might monitor all, some, or none of these conditions.
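As a rough illustration of the fly-height example, the following sketch flags any reading that drifts outside an allowed band. The band limits, the nanometer units, and the function name are hypothetical and chosen only for illustration; real drives perform this kind of check in firmware using manufacturer-specific values.

```python
def fly_height_in_band(reading_nm: float,
                       low_nm: float = 8.0,
                       high_nm: float = 12.0) -> bool:
    """Return True when the reading lies inside the allowed band (inclusive).

    The default limits are made-up numbers, not values from any drive.
    """
    return low_nm <= reading_nm <= high_nm


# Hypothetical sequence of fly-height samples.
readings = [10.1, 9.8, 12.6, 7.4]

# Keep only the samples where the head flies too high or too low.
suspect = [r for r in readings if not fly_height_in_band(r)]
print(suspect)  # [12.6, 7.4]
```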
Internal hard drives also support the SCSI Accessed Fault Tolerant Enclosure
(SAF-TE) standard to protect hard drive data if failures occur. If one of IBM’s
SMART-capable drives predicts it is going to fail while it’s still under warranty,
IBM will repair or replace it at no additional cost to you. PFA keeps track of key
parameters of the drive over time. If any of these parameters exceeds
its predetermined threshold, drive logic alerts the system so that
remedial action can be taken before a physical failure occurs. The
alert can be sent to systems management software, such as IBM Netfinity
Manager, which can then carry out predetermined actions.
Predictive Failure Analysis monitors for conditions such as:
Spindle motor problems (torque and speed)
Hardware problems (electronic and logic)
Channel problems (noise, asymmetry, pre-comp, DC offset)
Fly height change problems
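The threshold behavior described above can be sketched as follows. This is a minimal illustration, not IBM's PFA implementation: the class name, parameter names, threshold values, and alert callback are all assumptions, and the callback merely stands in for forwarding an alert to systems management software.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class PfaMonitor:
    """Hypothetical tracker: records parameter samples and alerts on a threshold breach."""
    thresholds: Dict[str, float]                    # per-parameter upper limits (assumed values)
    on_alert: Callable[[str, float, float], None]   # stand-in for the management software hook
    history: Dict[str, List[float]] = field(default_factory=dict)

    def record(self, name: str, value: float) -> bool:
        """Store a sample and return True if it exceeded its threshold."""
        self.history.setdefault(name, []).append(value)
        limit = self.thresholds[name]
        if value > limit:
            self.on_alert(name, value, limit)
            return True
        return False


alerts = []
monitor = PfaMonitor(
    thresholds={"spindle_speed_error_pct": 2.0, "channel_noise": 0.5},
    on_alert=lambda name, value, limit: alerts.append((name, value, limit)),
)

monitor.record("spindle_speed_error_pct", 0.4)  # within limit, no alert
monitor.record("channel_noise", 0.7)            # exceeds limit, alert raised
print(alerts)  # [('channel_noise', 0.7, 0.5)]
```

Keeping the sample history alongside the thresholds reflects the idea that parameters are tracked over time, while the alert itself fires only when a limit is crossed.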