Error Handling Summary

Error handling during the power-on sequence falls into one of the following three cases:

If no errors are detected by POST or OpenBoot Diagnostics, the system attempts to boot if auto-boot? is true.

If only nonfatal errors are detected by POST or OpenBoot Diagnostics, the system attempts to boot if both auto-boot? and auto-boot-on-error? are true. Nonfatal errors include the following:

Ethernet interface failure.

Serial interface failure.

PCI-Express card failure.

Memory failure. When a DIMM fails, the firmware unconfigures the entire logical bank associated with the failed module. Another nonfailing logical bank must be present for the system to attempt a degraded boot. Note that certain DIMM failures might not be diagnosable to a single DIMM. These failures are fatal and result in both logical banks being unconfigured.

Note – If POST or OpenBoot Diagnostics detects a nonfatal error associated with the normal boot device, the OpenBoot firmware automatically unconfigures the failed device and tries the next-in-line boot device, as specified by the boot-device configuration variable.

If a fatal error is detected by POST or OpenBoot Diagnostics, the system does not boot regardless of the settings of auto-boot? or auto-boot-on-error?. Fatal nonrecoverable errors include the following:

Any CPU failed

All logical memory banks failed

Flash RAM cyclical redundancy check (CRC) failure

Critical field-replaceable unit (FRU) PROM configuration data failure

Critical system configuration SEEPROM read failure

Critical application-specific integrated circuit (ASIC) failure

For more information about troubleshooting fatal errors, refer to the service manual for your server.
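
The three cases above can be summarized as a small decision function. The following Python sketch is illustrative only and is not part of the firmware: the names Severity and attempts_boot are hypothetical, and the two boolean parameters stand in for the auto-boot? and auto-boot-on-error? configuration variables.

```python
from enum import Enum


class Severity(Enum):
    """Worst error class reported by POST or OpenBoot Diagnostics (hypothetical model)."""
    NONE = "none"          # no errors detected
    NONFATAL = "nonfatal"  # for example, Ethernet, serial, or PCI-Express card failure
    FATAL = "fatal"        # for example, any CPU failed or all logical memory banks failed


def attempts_boot(severity: Severity, auto_boot: bool, auto_boot_on_error: bool) -> bool:
    """Return True if the system would attempt to boot after diagnostics.

    auto_boot models auto-boot? and auto_boot_on_error models auto-boot-on-error?.
    """
    if severity is Severity.FATAL:
        # Fatal errors block the boot regardless of either variable.
        return False
    if severity is Severity.NONFATAL:
        # A degraded boot requires both variables to be true.
        return auto_boot and auto_boot_on_error
    # No errors: only auto-boot? matters.
    return auto_boot
```

For instance, attempts_boot(Severity.NONFATAL, True, False) evaluates to False in this model, which matches the second case: a nonfatal error with auto-boot-on-error? set to false leaves the system unbooted.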

Reset Scenarios

Three ALOM CMT configuration variables, diag_mode, diag_level, and diag_trigger, control whether the system runs firmware diagnostics in response to system reset events.

32 SPARC Enterprise T1000 Server Administration Guide • April 2007