Service Processor System Monitoring - Surveillance
Surveillance is a function in which the Service Processor monitors the system, and
the system monitors the Service Processor. This monitoring is accomplished by
periodic samplings called heartbeats.
Surveillance is available during two phases:
1. System firmware bringup (automatic)
2. Operating system runtime (optional)
System Firmware Surveillance: Provides the Service Processor with a means
to detect boot failures while the system firmware is running.
System firmware surveillance is automatically enabled during system power-on. It
cannot be disabled through a user-selectable option.
If the Service Processor detects no heartbeats during system IPL (for 7 minutes), it
cycles the system power to attempt a reboot. The maximum number of retries is set
from the Service Processor menus. If the fail condition repeats, the Service
Processor leaves the machine powered on, logs an error, and offers menus to the
user. If Call-out is enabled, the Service Processor calls to report the failure and
displays the operating system surveillance failure code on the operator panel.
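The retry behavior described above can be sketched as a small simulation. This is an illustrative model only, not Service Processor firmware: the function name, the action strings, and the reading that the final failure actions happen once the configured retries are used up are all assumptions. Only the 7-minute timeout and the order of the actions come from the text.

```python
# Illustrative model of system firmware surveillance (not actual firmware).
IPL_HEARTBEAT_TIMEOUT_MIN = 7  # from the text: no heartbeats for 7 minutes


def firmware_surveillance(heartbeat_gaps_min, max_retries, call_out_enabled=False):
    """Return the actions taken over a series of boot attempts.

    heartbeat_gaps_min: one entry per boot attempt, giving the minutes until
    the first heartbeat arrives, or None if no heartbeat ever arrives.
    max_retries: hypothetical stand-in for the value set in the menus.
    """
    actions = []
    retries_used = 0
    for gap in heartbeat_gaps_min:
        if gap is not None and gap < IPL_HEARTBEAT_TIMEOUT_MIN:
            actions.append("boot ok")
            return actions
        if retries_used < max_retries:
            # Heartbeat timeout: cycle power to attempt a reboot.
            retries_used += 1
            actions.append("cycle power")
            continue
        # Assumption: retries exhausted, so stop retrying and report.
        actions += ["leave powered on", "log error", "offer menus"]
        if call_out_enabled:
            actions += ["call out", "show failure code on operator panel"]
        return actions
    return actions
```

For example, `firmware_surveillance([None, 3], max_retries=2)` models a first boot attempt that never sends a heartbeat followed by a successful second attempt, and returns `["cycle power", "boot ok"]`.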
Operating System Surveillance: Provides the Service Processor with a means
to detect hang conditions and hardware or software failures while the operating
system is running. It also provides the operating system with a means to detect a
Service Processor failure by the lack of a return heartbeat.
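The two-way nature of this monitoring can be illustrated with a minimal sketch in which each side tracks the last heartbeat it received from the other. The class name, method names, and the timestamps used below are hypothetical and chosen only for illustration; the actual heartbeat mechanism is not described in this section.

```python
# Illustrative two-way heartbeat check (hypothetical names, not a real API).
class HeartbeatMonitor:
    """Tracks the last heartbeat seen from a peer and flags a timeout."""

    def __init__(self, interval_min):
        self.interval_min = interval_min  # maximum wait, in minutes
        self.last_seen_min = 0.0          # time of the last heartbeat received

    def beat(self, now_min):
        """Record a heartbeat received from the peer at time now_min."""
        self.last_seen_min = now_min

    def timed_out(self, now_min):
        """True if the peer has been silent longer than the interval."""
        return now_min - self.last_seen_min > self.interval_min


# The Service Processor watches the operating system, and vice versa.
sp_watching_os = HeartbeatMonitor(interval_min=2.0)
os_watching_sp = HeartbeatMonitor(interval_min=2.0)

# The operating system sends heartbeats at minutes 1 and 2, then hangs.
for t in (1.0, 2.0):
    sp_watching_os.beat(t)

# The Service Processor keeps returning heartbeats through minute 5.
for t in (1.0, 2.0, 3.0, 4.0, 5.0):
    os_watching_sp.beat(t)

os_hang_detected = sp_watching_os.timed_out(5.0)    # True: 3 minutes of silence
sp_failure_detected = os_watching_sp.timed_out(5.0)  # False: heartbeat just seen
```

Using the same small class for both directions keeps the sketch symmetric: a hang on either side shows up as a missing heartbeat on the other.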
Operating system surveillance is not enabled by default. This is to allow the user to
run operating systems that do not support this Service Processor option.
Operating system surveillance can be enabled and disabled via:
Service Processor menus
Service Processor service aids
Three parameters must be set for operating system surveillance:
1. Surveillance enable/disable
2. Surveillance interval
This is the maximum time in minutes the Service Processor should wait for a
heartbeat from the operating system before timeout.
3-28 RS/6000 Enterprise Server Model H Series User's Guide