2-8 SPARC Enterprise M4000/M5000 Servers Overview Guide • December 2010
2.4 Reliability, Availability, and Serviceability
Reliability, availability, and serviceability (RAS) are aspects of the system design that
affect the ability of the system to:
Operate without stopping
Remain accessible and usable
Minimize the time necessary to service the system
TABLE 2-2 defines each RAS feature.

2.4.1 Reliability

Reliability represents the length of time the midrange server can operate normally
without failure.
To improve quality, adequate components must be selected with consideration given
to the product service life and the required response in case of a failure. In
evaluations such as stress tests that check the service life, components and products
are inspected to determine whether they meet the target reliability levels.
Reliability is equally important to both hardware and software. Naturally,
trouble-free software is desired, but eliminating all software problems is difficult.
Installing the functions below leads to reliability improvements in the field.
Cooperates with XSCF firmware to periodically check whether the software,
including the domain OS, is running (host watchdog monitoring).
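
The idea behind host watchdog monitoring can be illustrated with a minimal sketch. This is not the XSCF firmware interface: the class name HostWatchdog, its methods, and the timeout value are hypothetical and exist only to show how a periodic liveness check against a heartbeat might work.

```python
import time


class HostWatchdog:
    """Hypothetical heartbeat monitor; NOT the actual XSCF mechanism."""

    def __init__(self, timeout_s, clock=time.monotonic):
        # timeout_s: assumed maximum allowed gap between heartbeats, in seconds
        self.timeout_s = timeout_s
        self._clock = clock
        self._last_beat = clock()

    def heartbeat(self):
        """Called by the monitored software to show it is still running."""
        self._last_beat = self._clock()

    def is_alive(self):
        """Periodic check: True if a heartbeat arrived within the timeout."""
        return (self._clock() - self._last_beat) <= self.timeout_s
```

In this sketch, the monitored software calls heartbeat() at regular intervals, and the monitor calls is_alive() on its own schedule. A False result indicates that the software has stopped responding and that a recovery action may be needed.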

TABLE 2-2 RAS Definitions

RAS Feature      Description
Reliability      Length of time the midrange server can operate normally without
                 failure. The ability to detect failures with accuracy.
Availability     Ratio of time during which the system is accessible and usable.
Serviceability   Time required for the system to be recovered by specific
                 maintenance after a failure occurs.
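
The availability definition in TABLE 2-2 can be read as usable time divided by total observed time. The sketch below shows that arithmetic. The function name and the choice of hours as the unit are assumptions made for illustration, and the guide itself does not give a formula.

```python
def availability(uptime_hours, downtime_hours):
    """Fraction of the observation period during which the system was usable.

    Illustrative helper only; the name and units are assumptions.
    """
    total = uptime_hours + downtime_hours
    if total <= 0:
        raise ValueError("observation period must be positive")
    return uptime_hours / total
```

For example, a system that was usable for 99 hours and down for 1 hour over a 100-hour period has an availability of 0.99. Shortening the recovery time, which is the serviceability measure in the same table, reduces the downtime term and so raises this ratio.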