Error Reporting and Handling

Intel® Server Board SE7520JR2

system reset (ASR). The Sahalee BMC retains status bits that the BIOS can read later in POST in order to disable the previously failing processor, log the appropriate event into the System Event Log (SEL), and display an appropriate error message to the user.

The BIOS provides options to control the policy applied to FRB2 failures. By default, an FRB2 failure results in the failing processor being disabled during the next reboot. This policy can be overridden in one of two ways: the BSP can be prevented from ever being disabled because of an FRB2 failure, or the BSP can be disabled only after three consecutive FRB2 failures. These options may be useful in systems that experience fatal errors during POST that are not indicative of a bad processor. Selecting this policy is an advanced feature, and the setting should be modified only by a qualified system administrator. The mBMC does not support the option to disable the BSP.
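The three FRB2 policies described above can be illustrated with a short sketch. This is not BIOS code. The names Frb2Policy and should_disable_bsp are hypothetical and were chosen for this example only, and the sketch does not model the mBMC case.

```python
from enum import Enum


class Frb2Policy(Enum):
    """Hypothetical names for the three FRB2 policies described in the text."""
    DISABLE_ON_NEXT_REBOOT = "default"          # default policy
    NEVER_DISABLE_BSP = "never"                 # override: never disable the BSP
    DISABLE_AFTER_THREE_FAILURES = "after_three"  # override: three consecutive failures


def should_disable_bsp(policy: Frb2Policy, consecutive_failures: int) -> bool:
    """Return True if the failing BSP would be disabled on the next reboot.

    consecutive_failures is the number of consecutive FRB2 failures
    recorded for the BSP (an assumed input for this sketch).
    """
    if consecutive_failures < 1:
        return False  # no FRB2 failure, nothing to disable
    if policy is Frb2Policy.NEVER_DISABLE_BSP:
        return False
    if policy is Frb2Policy.DISABLE_AFTER_THREE_FAILURES:
        return consecutive_failures >= 3
    return True  # default: disable after any FRB2 failure
```

For example, under the three-failure policy a second consecutive failure leaves the BSP enabled, while a third disables it.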

6.1.3 FRB3 – BSP Reset Failures

The BIOS and firmware provide a feature to guarantee that the system boots, even if one or more processors fail during POST. The Sahalee BMC contains two watchdog timers that can be configured to reset the system upon time-out; with no Intel® Management Module, only one watchdog timer is present. The first timer (FRB3) starts counting down whenever the system comes out of hard reset. If the BSP successfully resets and begins executing, the BIOS disables the FRB3 timer in the BMC and the system continues executing POST. If the timer expires because the BSP fails to fetch or execute BIOS code, the Sahalee BMC resets the system and disables the failed processor. The Sahalee BMC continues to change the bootstrap processor until the BIOS successfully disables the FRB3 timer. If the BMC fails to find a good processor, it generates beep codes on the system speaker and continues to cycle until it finds one. The process of cycling through all the processors is repeated upon system reset or power cycle. Soft resets do not affect the FRB3 timer. The duration of the FRB3 timer is set by system firmware. The mBMC also supports the algorithm described above, except that it does not disable the processor and the failure is logged as an FRB2 failure.
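The FRB3 cycling behavior can be sketched as a small simulation. This is an illustration under stated assumptions, not BMC firmware. The function name frb3_select_bsp, the boolean per-processor model, and the "FRB3" event label for the Sahalee BMC are all hypothetical; the text states only that the mBMC logs the failure as FRB2. The sketch makes a single pass over the processors, whereas the text says the BMC keeps cycling.

```python
def frb3_select_bsp(processor_ok, disables_processors=True):
    """Simulate one pass of FRB3 bootstrap processor selection.

    processor_ok: list of bools; True means that processor can fetch and
        execute BIOS code (a simplified, assumed model).
    disables_processors: True models the Sahalee BMC, False models the
        mBMC, which does not disable the failed processor.

    Returns (bsp_index or None, disabled_indices, logged_events).
    """
    disabled = []
    events = []
    for cpu, ok in enumerate(processor_ok):
        # The FRB3 timer starts when the system comes out of hard reset.
        if ok:
            # The BIOS disables the FRB3 timer and POST continues.
            return cpu, disabled, events
        # Timer expired: the BMC resets the system and tries the next BSP.
        if disables_processors:
            disabled.append(cpu)
            events.append(("FRB3", cpu))  # label is an assumption
        else:
            events.append(("FRB2", cpu))  # mBMC logs an FRB2 failure
    # No good processor found in this pass: the BMC would generate
    # beep codes and keep cycling (not modeled here).
    return None, disabled, events
```

With processor_ok = [False, True], the sketch selects processor 1 as BSP; the Sahalee model also reports processor 0 as disabled, while the mBMC model leaves it enabled and records an FRB2 event.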

6.1.4 OS Watchdog Timer – Operating System Load Failures

The OS Watchdog Timer feature is designed to allow watchdog timer protection of the operating system load process. This is done in conjunction with an operating system-present device driver or application that will disable the watchdog timer once the operating system has successfully loaded. If the operating system load process fails, the BMC will reset the system.

The BIOS shall disable the OS Watchdog Timer before handing control to the OS loader if it determines that the system is booting from removable media or if it cannot determine the media type.

If the BIOS is going to boot from a known hard drive, it reads the user option for the OS Watchdog Timer for HDD Boots. If this option is disabled, the BIOS ensures the watchdog timer is disabled and then boots. Otherwise, the BIOS reads the enabled time value from the option and sets the OS Watchdog Timer to that value (5, 10, 15, or 20 minutes) before trying to load the operating system. If the OS Watchdog Timer is enabled, the timer is repurposed as an OS Watchdog Timer and is referred to by that title as well.

WARNING: The BIOS may incorrectly determine that removable media is a hard drive if the media emulates a hard drive. In this case, the OS Watchdog Timer will not be automatically disabled.
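The boot-media decision described in this section can be summarized in a short sketch. It is illustrative only. BootMedia, VALID_TIMEOUTS_MIN, and os_watchdog_minutes are hypothetical names, and representing the disabled HDD option as None is an assumption made for this example.

```python
from enum import Enum
from typing import Optional


class BootMedia(Enum):
    """Media types as the BIOS classifies them (hypothetical names)."""
    HARD_DRIVE = "hard_drive"
    REMOVABLE = "removable"
    UNKNOWN = "unknown"


# Timer values listed in the text, in minutes.
VALID_TIMEOUTS_MIN = (5, 10, 15, 20)


def os_watchdog_minutes(media: BootMedia,
                        hdd_option_minutes: Optional[int]) -> Optional[int]:
    """Return the OS Watchdog Timer value to arm, or None to disable it.

    hdd_option_minutes models the "OS Watchdog Timer for HDD Boots"
    user option; None stands for "disabled" in this sketch.
    """
    # Removable or undetermined media: the timer is always disabled.
    if media is not BootMedia.HARD_DRIVE:
        return None
    # Known hard drive, option disabled: ensure the timer is disabled.
    if hdd_option_minutes is None:
        return None
    if hdd_option_minutes not in VALID_TIMEOUTS_MIN:
        raise ValueError("timeout must be 5, 10, 15, or 20 minutes")
    # Known hard drive, option enabled: arm the timer before OS load.
    return hdd_option_minutes
```

Note that removable media emulating a hard drive would be classified as HARD_DRIVE here, which mirrors the warning above: the timer stays armed in that case.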


Revision 1.0

 

C78844-002