LAN cards

Power sources

All cables

Disk interface cards

Some monitoring can be done through simple physical inspection, but for the most comprehensive monitoring, you should examine the system log file (/var/adm/syslog/syslog.log) periodically for reports on all configured HA devices. The presence of errors relating to a device will show the need for maintenance.
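A periodic check of the log can be scripted. The following is a minimal sketch in Python, not a Serviceguard or HP-UX facility. It assumes that a relevant entry can be recognized by a severity word and a device keyword on the same line. The keyword lists and the function names find_device_errors and scan_syslog are illustrative assumptions and should be adapted to the messages your devices actually write.

```python
import re

# Assumed keywords; adjust them to the devices and message formats on your system.
SEVERITY = re.compile(r"\b(errors?|fail(ed|ure)?|critical)\b", re.IGNORECASE)
DEVICES = re.compile(r"\b(lan\d*|disk|scsi|lvm|power)\b", re.IGNORECASE)


def find_device_errors(lines):
    """Return (line_number, text) pairs for lines that contain
    both a severity word and a device keyword."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if SEVERITY.search(line) and DEVICES.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits


def scan_syslog(path="/var/adm/syslog/syslog.log"):
    """Scan the system log file named in the text above."""
    with open(path, errors="replace") as log:
        return find_device_errors(log)
```

Run from a scheduled job, scan_syslog() returns the matching lines so that they can be mailed or displayed to an operator. An empty list means only that no line matched the assumed keywords.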

When the proper redundancy has been configured, failures can occur with no external symptoms, so proper monitoring is important. For example, if a Fibre Channel switch in a redundant mass storage configuration fails, LVM will automatically fail over to the alternate path through another Fibre Channel switch. Without monitoring, however, you may not know that the failure has occurred, since the applications are still running normally. At this point no redundant path remains to absorb another failure, so the mass storage configuration is vulnerable.
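The reasoning in this example can be sketched as a small classification: a device whose applications are still running may nevertheless be down to its last path. The Python function below is an illustration only. The name redundancy_status, the status labels, and the idea of supplying path states as a dictionary are assumptions; how the up or down state of each path is obtained is outside the sketch.

```python
def redundancy_status(path_states):
    """Classify a device from the state of each of its paths.

    path_states maps a path name to True (usable) or False (failed).
    """
    usable = sum(1 for up in path_states.values() if up)
    if usable == 0:
        return "failed"       # no path left; access to the device is lost
    if usable == 1:
        return "vulnerable"   # still working, but one more failure loses access
    if usable < len(path_states):
        return "degraded"     # some paths lost, redundancy still present
    return "redundant"        # every configured path is usable
```

In the Fibre Channel example, a device with one failed switch and one working switch would be reported as "vulnerable" even though the applications see no error.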

Using System Fault Management Service

System Fault Management (SFM) monitors the health of HP servers running HP-UX. SFM retrieves information about a system’s hardware devices, such as CPU, memory, power supply, and cooling devices. SFM operates within the Web-Based Enterprise Management (WBEM) environment; WBEM is an industry-wide, standards-based initiative to aid the management of large-scale systems. SFM provides the same features and benefits as the EMS Hardware Monitors.

These system devices can be monitored in Serviceguard by configuring generic resources. See “Using the Generic Resources Monitoring Service” (page 57).

See the System Fault Management Administrator Guide at http://www.hp.com/go/hpux-diagnostics-docs.

Using Event Monitoring Service

Event Monitoring Service (EMS) allows you to configure monitors of specific devices and system resources. You can direct alerts to an administrative workstation, where operators can be notified of any further action required if a problem occurs. For example, you could configure a disk monitor to report when a mirror is lost from a mirrored volume group used in the cluster.

See the manual Using High Availability Monitors at the address given in the preface to this manual.

Using EMS (Event Monitoring Service) Hardware Monitors

A set of hardware monitors is available for monitoring and reporting on memory, CPU, and many other system values. Some of these monitors are supplied with specific hardware products.

Hardware Monitors and Persistence Requests

When hardware monitors are disabled using the monconfig tool, associated hardware monitor persistent requests are removed from the persistence files. When hardware monitoring is re-enabled, the monitor requests that were initialized using the monconfig tool are re-created.

However, hardware monitor requests created using Serviceguard Manager, or established when Serviceguard is started, are not re-created. These requests are related to the psmmon hardware monitor.

To re-create the persistence monitor requests, halt Serviceguard on the node, and then restart it.
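The halt-and-restart step can be sketched in Python as follows. This is an illustration under stated assumptions, not a procedure taken from this manual: it uses the Serviceguard commands cmhaltnode and cmrunnode with only a node name, and omits any options (for example, those needed when packages are running on the node), so check the command manpages before use. The function names restart_commands and restart_serviceguard are hypothetical, and the sketch only prints the commands unless dry_run is set to False.

```python
import subprocess


def restart_commands(node):
    """Return the command lines that halt, then restart, Serviceguard on one node.

    Assumption: no extra options are required for this node.
    """
    return [["cmhaltnode", node], ["cmrunnode", node]]


def restart_serviceguard(node, dry_run=True):
    """Print the commands (dry run) or execute them in order."""
    for command in restart_commands(node):
        if dry_run:
            print(" ".join(command))
        else:
            # check=True stops the sequence if a command exits with an error
            subprocess.run(command, check=True)
```

Keeping the command list separate from its execution lets an operator review exactly what would run on the node before changing dry_run.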

