Sun Microsystems CP3260 manual Automatic System Recovery

Models: CP3260


Any CPU failed

All logical memory banks failed

Flash RAM cyclical redundancy check (CRC) failure

Critical field-replaceable unit (FRU) PROM configuration data failure

Critical application-specific integrated circuit (ASIC) failure

4.5 Automatic System Recovery

Automatic system recovery (ASR) consists of self-test features and an autoconfiguration capability that detect failed hardware components and unconfigure them. With ASR enabled, the server can resume operating after certain nonfatal hardware faults or failures occur.

If a component is monitored by ASR and the server is capable of operating without it, the server automatically reboots if that component develops a fault or fails. This capability prevents a faulty hardware component from stopping operation of the entire system or causing the system to fail repeatedly.

If a fault is detected during the power-on sequence, the faulty component is disabled. If the system remains capable of functioning, the boot sequence continues.
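The power-on behavior described above can be modeled as a simple decision loop. The following is a minimal sketch in Python, not the firmware's actual logic: the `Component` class, its `required` flag, and the `power_on` function are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Component:
    """Hypothetical model of a hardware component checked at power-on."""
    name: str
    passed_self_test: bool
    required: bool        # assumption: True if the system cannot run without it
    enabled: bool = True


def power_on(components):
    """Disable faulty components; return True if the boot can continue."""
    can_boot = True
    for c in components:
        if not c.passed_self_test:
            c.enabled = False       # the faulty component is disabled
            if c.required:
                can_boot = False    # the system cannot function without it
    return can_boot


if __name__ == "__main__":
    parts = [
        Component("memory-bank-0", passed_self_test=True, required=False),
        Component("memory-bank-1", passed_self_test=False, required=False),
    ]
    print(power_on(parts))
    print([c.name for c in parts if c.enabled])
```

In this model, a failed component that the system can run without is simply switched off and the boot proceeds; a failed component marked `required` stops the boot.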

To support this degraded boot capability, the OpenBoot firmware uses the IEEE 1275 client interface (by means of the device tree) to mark a device as either failed or disabled by creating an appropriate status property in the device tree node. The Solaris OS does not activate a driver for any subsystem marked in this way.
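To illustrate the status-property mechanism, here is a minimal Python model of a device tree. It is a sketch under stated assumptions, not OpenBoot or Solaris code: the `Node` class and the `mark` and `attachable` helpers are hypothetical, and the device names are made up. The strings "okay", "fail", and "disabled" follow the values that IEEE 1275 defines for the `status` property.

```python
class Node:
    """Hypothetical device tree node: a name, properties, and children."""

    def __init__(self, name, children=None):
        self.name = name
        self.props = {}                  # a missing "status" is treated as okay
        self.children = children or []


def mark(node, status):
    """Record a fault by creating a status property on the node."""
    if status not in ("fail", "disabled"):
        raise ValueError("status must be 'fail' or 'disabled'")
    node.props["status"] = status


def attachable(root):
    """Return the names of nodes an OS could attach drivers to.

    A marked node is skipped together with its subtree (an assumption of
    this sketch: children of a failed device are treated as unusable).
    """
    if root.props.get("status", "okay") != "okay":
        return []
    names = [root.name]
    for child in root.children:
        names.extend(attachable(child))
    return names


if __name__ == "__main__":
    net = Node("network")
    disk = Node("disk")
    root = Node("/", [net, disk])
    mark(net, "fail")
    print(attachable(root))
```

The point of the design is that the fault is recorded once, in the tree, and every later consumer of the tree only has to check one property before using a device.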

As long as a failed component is electrically dormant (not causing random bus errors or signal noise, for example), the system reboots automatically and resumes operation while a service call is made.

Once a failed or disabled device is replaced with a new one, the OpenBoot firmware automatically modifies the status of the device upon reboot.

Note – ASR is not enabled until you activate it (see Section 4.5.1.1, “To Enable Automatic System Recovery” on page 4-17).

4-16 Netra CP3260 Blade Server User’s Guide • April 2009
