Table 6 Monitored States and Possible Causes (continued)

| Cluster Event (Old state -> New state) | Cluster-related Causes | Network-related Causes |
|---|---|---|
| Unreachable -> Up | Cluster nodes were rebooted and the cluster started | Network came up and the cluster was already running |
| Error -> Up | Error resolved, cluster is up | Network problem was fixed, cluster is up |

NOTE: There is only one condition under which cmclsentryd will determine that the cluster has Error status: all nodes are unreachable except those which have Serviceguard Error status. (If any nodes are Down or Up, then the cluster status will take one of those values, rather than Error.)
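The rule in the NOTE above can be sketched as a small function. This is only an illustration, not the cmclsentryd implementation: the function name cluster_status and the state strings are hypothetical, the order in which Up and Down are checked is an assumption (the NOTE does not say which takes precedence), and returning Unreachable when every node is unreachable is likewise assumed.

```python
from collections.abc import Iterable

# State names used for illustration only; the real monitor's internal
# representation is not described in this section.
KNOWN_STATES = {"Up", "Down", "Error", "Unreachable"}


def cluster_status(node_states: Iterable[str]) -> str:
    """Combine per-node states into one cluster status (illustrative sketch)."""
    states = set(node_states)
    if not states:
        raise ValueError("at least one node state is required")
    unknown = states - KNOWN_STATES
    if unknown:
        raise ValueError(f"unknown node state(s): {sorted(unknown)}")

    # Any Up or Down node decides the result, so Error is never reported
    # while such a node is visible. Checking Up before Down is an assumption.
    for decisive in ("Up", "Down"):
        if decisive in states:
            return decisive

    # Only Error and Unreachable nodes remain. Error is reported when at
    # least one node has Error status and all others are unreachable.
    if "Error" in states:
        return "Error"

    # Every node is Unreachable (assumed outcome, not stated in the NOTE).
    return "Unreachable"
```

For example, a cluster whose nodes report Error and Unreachable yields Error, while adding a single Down node to the same set changes the result to Down.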

Interpreting the Significance of Cluster Events

Because some cluster events (for example, Up -> Unreachable) can be caused by changes in either a cluster state or a network state, additional independent information is required to achieve the primary objective of determining whether you need to recover a cluster’s applications. Sources of independent information include:

Contact with the network provider

Contact with the administrator of the monitored cluster

Contact with the local cluster administrator

Contact with company executives

When problematic cluster events persist, obtain as much information as possible, including authorization to recover if your business practices require it, and then issue the Continentalclusters recovery command, cmrecovercl.
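To illustrate why a cluster event alone is not conclusive, the two Table 6 rows shown on this page can be held in a lookup that returns both the cluster-related and the network-related explanation for a transition. This is a minimal sketch for reasoning about events, not part of Continentalclusters; the names POSSIBLE_CAUSES, possible_causes, and is_ambiguous are hypothetical, and only the two rows visible here are included.

```python
# Possible causes per (old state, new state), copied from the two
# Table 6 rows on this page. Other transitions are intentionally omitted.
POSSIBLE_CAUSES = {
    ("Unreachable", "Up"): {
        "cluster": "Cluster nodes were rebooted and the cluster started",
        "network": "Network came up and the cluster was already running",
    },
    ("Error", "Up"): {
        "cluster": "Error resolved, cluster is up",
        "network": "Network problem was fixed, cluster is up",
    },
}


def possible_causes(old_state: str, new_state: str) -> dict[str, str]:
    """Return the listed causes for a transition, or {} if it is not in the lookup."""
    return dict(POSSIBLE_CAUSES.get((old_state, new_state), {}))


def is_ambiguous(old_state: str, new_state: str) -> bool:
    """True when both a cluster-related and a network-related cause are listed."""
    return len(possible_causes(old_state, new_state)) > 1


if __name__ == "__main__":
    for kind, cause in possible_causes("Unreachable", "Up").items():
        print(f"{kind}: {cause}")
```

An ambiguous transition is the signal to gather independent information from the sources listed above before deciding whether to recover.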

How Notifications Work

A central part of the operation of Continentalclusters is the transmission of notifications following the detection of a cluster event. Notifications occur at specifically coded times, and at two different levels:

Alert: when a cluster event should be considered noteworthy.

Alarm: when an event shows evidence of a cluster failure.

Notifications are typically sent as:

Email messages

SNMP traps

Text log files

OPC messages to OpenView IT/Operations

In addition, notifications are sent to the eventlog file located in the /var/opt/resmon/log/cc directory on the system where monitoring is taking place.

NOTE: An email message can be sent to an address supplied by a pager service that will forward the message to a specified pager system. (Contact your pager service provider for more information.)
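The idea of notifications that occur at coded times and at two levels can be sketched as follows. This is a hypothetical model, not the Continentalclusters configuration format or its monitor logic: the Notification class, the due_notifications function, the minute thresholds, and the destination strings are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Notification:
    level: str          # "Alert" or "Alarm"
    after_minutes: int  # how long the cluster event must persist first
    destination: str    # hypothetical label, e.g. an email address


def due_notifications(schedule: list[Notification],
                      minutes_elapsed: int) -> list[Notification]:
    """Return every notification whose coded time has been reached."""
    return [n for n in schedule if n.after_minutes <= minutes_elapsed]


if __name__ == "__main__":
    # Hypothetical schedule: an informational Alert first, an Alarm later.
    schedule = [
        Notification("Alert", 5, "email:admin@example.com"),
        Notification("Alarm", 15, "email:oncall@example.com"),
    ]
    for n in due_notifications(schedule, minutes_elapsed=10):
        print(n.level, "->", n.destination)
```

In this sketch an event that has persisted for 10 minutes has triggered only the Alert; the Alarm becomes due once the event has lasted 15 minutes.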

Alerts

Alerts are intended to be informational. Some typical uses of alerts include:

Notification that a cluster has been halted for a significant amount of time.

Notification that a cluster has come up after being down or unreachable.

