5.2.4 S.M.A.R.T.
S.M.A.R.T. is an acronym for Self-Monitoring Analysis and Reporting Technology.
Note. The drive’s firmware monitors specific attributes for degradation over time but can’t predict instantaneous drive failures.
Each monitored attribute has been selected to monitor a specific set of failure conditions in the operating performance of the drive, and the thresholds are optimized to minimize “false” and “failed” predictions.
Controlling S.M.A.R.T.
The operating mode of S.M.A.R.T. is controlled by the DEXCPT and PERF bits on the Informational Exceptions Control mode page (1Ch). Use the DEXCPT bit to enable or disable the S.M.A.R.T. feature. Setting the DEXCPT bit disables all S.M.A.R.T. functions. When enabled, S.M.A.R.T. collects
You can measure
You can interrogate the drive through the host to determine the time remaining before the next scheduled measurement and data logging process occurs. To accomplish this, issue a Log Sense command to log page 0x3E. This allows you to control when S.M.A.R.T. interruptions occur. Forcing S.M.A.R.T. with the RTZ command resets the timer.
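The Log Sense request described above can be sketched as a 10-byte command descriptor block (CDB). This is a minimal sketch that assumes the standard SPC layout of the LOG SENSE command (opcode 4Dh, page control in bits 7-6 and page code in bits 5-0 of byte 2, allocation length in bytes 7-8). The function name, the default allocation length of 512 bytes, and the default page control value are illustrative assumptions, not values taken from this manual. The layout of the data returned for page 0x3E is vendor specific and is not decoded here.

```python
import struct

LOG_SENSE_OPCODE = 0x4D  # LOG SENSE operation code per SPC


def build_log_sense_cdb(page_code: int, alloc_len: int = 512,
                        page_control: int = 0b01) -> bytes:
    """Build a 10-byte LOG SENSE CDB (hypothetical helper, not a drive API).

    page_control 01b requests current cumulative values; alloc_len is the
    number of bytes the host is prepared to receive (assumed default).
    """
    if not 0 <= page_code <= 0x3F:
        raise ValueError("page code must fit in 6 bits")
    if not 0 <= page_control <= 0b11:
        raise ValueError("page control must fit in 2 bits")
    if not 0 <= alloc_len <= 0xFFFF:
        raise ValueError("allocation length must fit in 16 bits")
    return struct.pack(
        ">BBBBBHHB",
        LOG_SENSE_OPCODE,
        0x00,                              # PPC = 0, SP = 0
        (page_control << 6) | page_code,   # PC and PAGE CODE
        0x00,                              # subpage code
        0x00,                              # reserved
        0x0000,                            # parameter pointer
        alloc_len,                         # allocation length
        0x00,                              # control byte
    )


cdb = build_log_sense_cdb(0x3E)
print(cdb.hex(" "))
```

Sending the CDB to the drive requires a SCSI pass-through mechanism on the host, which is outside the scope of this sketch.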
Performance impact
S.M.A.R.T. attribute data is saved to the disk so that the events that caused a predictive failure can be recreated. The drive measures and saves parameters once every two hours subject to an idle period on the drive interfaces. The process of measuring
Maximum processing delay
| | DEXCPT = 0, PERF = 1 | DEXCPT = 0, PERF = 0 |
| S.M.A.R.T. delay times | 42 ms | 163 ms |
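The relationship between the DEXCPT and PERF bits and the delay values in the table above can be sketched as a small lookup. This is an illustration only. It assumes the standard SPC layout of the Informational Exceptions Control mode page, in which PERF is bit 7 and DEXCPT is bit 3 of byte 2; confirm the layout against the SCSI interface documentation for the drive. The function name is hypothetical, and the input is assumed to be the mode page itself, with any mode parameter header and block descriptors already removed.

```python
from typing import Optional

IE_CONTROL_PAGE_CODE = 0x1C
PERF_MASK = 0x80    # byte 2, bit 7 (assumed SPC layout)
DEXCPT_MASK = 0x08  # byte 2, bit 3 (assumed SPC layout)

# Maximum processing delay in milliseconds from the table, keyed by PERF.
MAX_DELAY_MS = {True: 42, False: 163}


def smart_max_delay_ms(page: bytes) -> Optional[int]:
    """Return the table's maximum S.M.A.R.T. delay, or None if disabled."""
    if len(page) < 3 or (page[0] & 0x3F) != IE_CONTROL_PAGE_CODE:
        raise ValueError("not an Informational Exceptions Control page")
    flags = page[2]
    if flags & DEXCPT_MASK:
        # DEXCPT = 1 disables all S.M.A.R.T. functions.
        return None
    return MAX_DELAY_MS[bool(flags & PERF_MASK)]


# Example pages: page code 1Ch, page length 0Ah, then the flags byte.
print(smart_max_delay_ms(bytes([0x1C, 0x0A, 0x80])))
print(smart_max_delay_ms(bytes([0x1C, 0x0A, 0x00])))
```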
5.2.5 Reporting control
Reporting is controlled by the MRIE bits in the Informational Exceptions Control mode page (1Ch). Subject to the reporting method, the firmware will issue to the host an
Determining rate
S.M.A.R.T. monitors the rate at which errors occur and signals a predictive failure if the rate of degraded errors increases to an unacceptable level. To determine rate, error events are logged and compared to the number of total operations for a given attribute. The interval defines the number of operations over which to measure the rate. The counter that keeps track of the current number of operations is referred to as the Interval Counter.
S.M.A.R.T. measures error rates. All errors for each monitored attribute are recorded. A counter keeps track of the number of errors for the current interval. This counter is referred to as the Failure Counter.
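The two counters described above can be sketched as follows. This is an illustrative model, not the drive firmware's implementation. The class and method names are hypothetical, and resetting both counters when the interval completes is an assumption made so that each interval is measured independently.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class AttributeCounters:
    """Per-attribute counters (hypothetical model of the description above)."""
    interval: int              # number of operations over which to measure
    interval_counter: int = 0  # operations seen in the current interval
    failure_counter: int = 0   # errors seen in the current interval

    def record(self, is_error: bool) -> Optional[Tuple[int, int]]:
        """Log one operation.

        Returns (errors, operations) when an interval completes,
        otherwise None.
        """
        self.interval_counter += 1
        if is_error:
            self.failure_counter += 1
        if self.interval_counter < self.interval:
            return None
        completed = (self.failure_counter, self.interval_counter)
        # Assumption: both counters restart for the next interval.
        self.interval_counter = 0
        self.failure_counter = 0
        return completed

    def error_rate(self) -> float:
        """Errors per operation so far in the current interval."""
        if self.interval_counter == 0:
            return 0.0
        return self.failure_counter / self.interval_counter


counters = AttributeCounters(interval=4)
for outcome in (False, True, False, True):
    result = counters.record(outcome)
print(result)
```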
Error rate is the number of errors per operation. The algorithm that S.M.A.R.T. uses to record rates of error is to set thresholds for the number of errors and their interval. If the number of errors exceeds the threshold before
14 | Cheetah NS 10K.2 SAS Product Manual, Rev. C |