CHEETAH 15K.7 FC PRODUCT MANUAL, REV. E 16
Controlling S.M.A.R.T.
The operating mode of S.M.A.R.T. is controlled by the DEXCPT and PERF bits on the Informational Exceptions Control
mode page (1Ch). Use the DEXCPT bit to enable or disable the S.M.A.R.T. feature. Setting the DEXCPT bit disables all
S.M.A.R.T. functions. When enabled, S.M.A.R.T. collects on-line data as the drive performs normal read and write
operations. When the PERF bit is set, the drive is considered to be in “On-line Mode Only” and will not perform off-line
functions.
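The relationship between the two bits and the resulting operating mode can be sketched as follows. This is an illustrative sketch only: the bit positions (PERF in bit 7 and DEXCPT in bit 3 of byte 2) are assumed from the standard SCSI layout of the Informational Exceptions Control mode page and are not stated in this manual, and the function name `smart_operating_mode` is hypothetical. Confirm the layout against the SCSI interface manual for the drive before relying on it.

```python
PAGE_CODE_IEC = 0x1C   # Informational Exceptions Control mode page
PERF_MASK = 0x80       # byte 2, bit 7 (assumed standard SCSI layout)
DEXCPT_MASK = 0x08     # byte 2, bit 3 (assumed standard SCSI layout)


def smart_operating_mode(page: bytes) -> str:
    """Classify the S.M.A.R.T. operating mode from raw mode page 1Ch bytes.

    `page` starts at the page header (byte 0 = page code, byte 1 = length).
    """
    if len(page) < 3 or (page[0] & 0x3F) != PAGE_CODE_IEC:
        raise ValueError("not an Informational Exceptions Control mode page")
    flags = page[2]
    if flags & DEXCPT_MASK:
        # DEXCPT set: all S.M.A.R.T. functions are disabled.
        return "disabled"
    if flags & PERF_MASK:
        # PERF set: on-line data collection only, no off-line functions.
        return "on-line only"
    return "fully enabled"
```

For example, a page whose byte 2 is 80h (PERF = 1, DEXCPT = 0) is classified as "on-line only".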
You can measure off-line attributes and force the drive to save the data by using the Rezero Unit command. Forcing
S.M.A.R.T. resets the timer so that the next scheduled interrupt is in two hours.
The host can interrogate the drive to determine the time remaining before the next scheduled measurement and
data logging process occurs. To do this, issue a Log Sense command to log page 0x3E. This allows you to control
when S.M.A.R.T. interruptions occur: forcing S.M.A.R.T. with the Rezero Unit (RTZ) command resets the timer.
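The two commands mentioned above can be sketched as raw command descriptor blocks (CDBs). This is a minimal sketch that assumes the standard SCSI CDB layouts for Log Sense (opcode 4Dh, 10 bytes) and Rezero Unit (opcode 01h, 6 bytes). The helper names are hypothetical, and the sketch only builds the CDBs; sending them requires a host-specific pass-through interface, and the contents of log page 0x3E are vendor-specific and are not decoded here.

```python
import struct

LOG_SENSE_OPCODE = 0x4D    # standard SCSI LOG SENSE
REZERO_UNIT_OPCODE = 0x01  # standard SCSI REZERO UNIT


def log_sense_cdb(page_code: int, alloc_len: int, pc: int = 0b01) -> bytes:
    """Build a 10-byte LOG SENSE CDB.

    pc is the page control field; 01b requests current cumulative values.
    """
    if not 0 <= page_code <= 0x3F:
        raise ValueError("page code must fit in 6 bits")
    if not 0 <= pc <= 0b11:
        raise ValueError("page control must fit in 2 bits")
    if not 0 <= alloc_len <= 0xFFFF:
        raise ValueError("allocation length must fit in 16 bits")
    return struct.pack(
        ">BBBBBHHB",
        LOG_SENSE_OPCODE,
        0x00,                      # PPC and SP bits cleared
        (pc << 6) | page_code,     # page control + page code
        0x00, 0x00,                # bytes 3-4, left at zero
        0x0000,                    # parameter pointer
        alloc_len,                 # allocation length
        0x00,                      # control byte
    )


def rezero_unit_cdb() -> bytes:
    """Build a 6-byte REZERO UNIT CDB, used here to force S.M.A.R.T."""
    return bytes([REZERO_UNIT_OPCODE, 0, 0, 0, 0, 0])


# Request up to 512 bytes of log page 0x3E.
cdb = log_sense_cdb(0x3E, 512)
```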
Performance impact
S.M.A.R.T. attribute data is saved to the disk so that the events that caused a predictive failure can be recreated. The drive
measures and saves parameters once every two hours subject to an idle period on the FC-AL bus. The process of
measuring off-line attribute data and saving data to the disk is uninterruptible. The maximum processing delays are
summarized in Table 2.
Reporting control
Reporting is controlled by the MRIE bits in the Informational Exceptions Control mode page (1Ch). Subject to the reporting
method, the firmware will issue to the host an 01-5Dxx sense code. The error code is preserved through bus resets and
power cycles.
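A host can recognize the 01-5Dxx sense code by checking the sense key and additional sense code in the returned sense data. The sketch below assumes fixed-format sense data (response code 70h or 71h) with the sense key in the low nibble of byte 2 and the additional sense code in byte 12, as defined by the SCSI standard. The function name `is_smart_trip` is hypothetical, and descriptor-format sense data is not handled.

```python
SENSE_KEY_RECOVERED_ERROR = 0x01
ASC_FAILURE_PREDICTION = 0x5D


def is_smart_trip(sense: bytes) -> bool:
    """Return True if fixed-format sense data carries an 01-5Dxx code."""
    if len(sense) < 14:
        return False                      # too short to hold ASC/ASCQ
    response_code = sense[0] & 0x7F       # mask off the VALID bit
    if response_code not in (0x70, 0x71):
        return False                      # not fixed-format sense data
    sense_key = sense[2] & 0x0F
    asc = sense[12]
    # The ASCQ (byte 13) is the "xx" part and is not checked.
    return sense_key == SENSE_KEY_RECOVERED_ERROR and asc == ASC_FAILURE_PREDICTION
```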
Determining rate
S.M.A.R.T. monitors the rate at which errors occur and signals a predictive failure if the rate of degraded errors increases to
an unacceptable level. To determine rate, error events are logged and compared to the number of total operations for a given
attribute. The interval defines the number of operations over which to measure the rate. The counter that keeps track of the
current number of operations is referred to as the Interval Counter.
S.M.A.R.T. measures error rates. All errors for each monitored attribute are recorded. A counter keeps track of the number of
errors for the current interval. This counter is referred to as the Failure Counter.
Error rate is the number of errors per operation. To track error rates, S.M.A.R.T. sets a threshold for the number of
errors and an interval over which they are counted. If the number of errors exceeds the threshold before the interval
expires, the error rate is considered to be unacceptable. If the number of errors does not exceed the threshold before the
interval expires, the error rate is considered to be acceptable. In either case, the interval and failure counters are reset and
the process starts over.
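The interplay of the Interval Counter and the Failure Counter can be sketched as follows. This is an illustration of the description above, not the drive firmware: the class name, method name, and parameter values are hypothetical, and the sketch assumes the verdict is issued as soon as the threshold is exceeded rather than at the end of the interval.

```python
from typing import Optional


class AttributeRateMonitor:
    """Tracks one attribute's error rate over a fixed interval of operations."""

    def __init__(self, interval: int, threshold: int) -> None:
        self.interval = interval        # operations per measurement interval
        self.threshold = threshold      # errors tolerated within one interval
        self.interval_counter = 0       # operations seen in current interval
        self.failure_counter = 0        # errors seen in current interval

    def record(self, error: bool) -> Optional[str]:
        """Record one operation; return a verdict when the interval resolves."""
        self.interval_counter += 1
        if error:
            self.failure_counter += 1

        if self.failure_counter > self.threshold:
            verdict = "unacceptable"    # threshold exceeded before expiry
        elif self.interval_counter >= self.interval:
            verdict = "acceptable"      # interval expired under threshold
        else:
            return None                 # interval still in progress

        # In either case both counters are reset and the process restarts.
        self.interval_counter = 0
        self.failure_counter = 0
        return verdict
```

With an interval of 5 operations and a threshold of 1 error, a second error within the interval yields "unacceptable" immediately, while five operations with at most one error yield "acceptable".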
Predictive failures
S.M.A.R.T. signals predictive failures when the drive is performing unacceptably for a period of time. The firmware keeps a
running count of the number of times the error rate for each attribute is unacceptable. To accomplish this, a counter is
incremented each time the error rate is unacceptable and decremented (but never below zero) whenever the error rate is
acceptable. If the counter continually increments such that it reaches the predictive threshold, a predictive failure is signaled.
This counter is referred to as the Failure History Counter. There is a separate Failure History Counter for each attribute.
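The behavior of the Failure History Counter can be sketched as follows. This is an illustrative sketch only: the class name, method name, and the threshold value are hypothetical and do not reflect actual firmware values.

```python
class FailureHistoryCounter:
    """Running count of unacceptable error-rate intervals for one attribute."""

    def __init__(self, predictive_threshold: int) -> None:
        self.predictive_threshold = predictive_threshold
        self.count = 0

    def update(self, rate_acceptable: bool) -> bool:
        """Apply one interval verdict; return True if a predictive failure
        should be signaled."""
        if rate_acceptable:
            # Decrement, but never below zero.
            self.count = max(0, self.count - 1)
        else:
            self.count += 1
        return self.count >= self.predictive_threshold


# One counter is kept per attribute; the threshold of 3 is an example value.
fhc = FailureHistoryCounter(predictive_threshold=3)
```

Because acceptable intervals decrement the counter, isolated bad intervals do not trigger a predictive failure; only sustained unacceptable performance drives the counter up to the threshold.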
Table 2: Maximum processing delay

                          On-line only delay      Fully-enabled delay
                          DEXCPT = 0, PERF = 1    DEXCPT = 0, PERF = 0
S.M.A.R.T. delay times    42 milliseconds         163 milliseconds