#### #### definitions will help in making the decision   #### ####
#### #### whether or not to issue the cmrecovercl(1m)    #### ####
#### #### command. Each monitoring definition specifies  #### ####
#### #### a cluster event along with the messages        #### ####
#### #### that should be sent to system administrators   #### ####
#### #### or other IT staff.                             #### ####
#### #### All messages are appended to the default log   #### ####
#### #### /var/opt/resmon/log/cc/eventlog as well as to  #### ####
#### #### the destination you specify below.             #### ####
#### ####                                                #### ####

#### #### A cluster event takes place when a monitor     #### ####
#### #### that is located on one cluster detects a       #### ####
#### #### significant change in the condition of         #### ####
#### #### another cluster. The monitored cluster         #### ####
#### #### conditions are:                                #### ####
#### #### UNREACHABLE - the cluster is unreachable.      #### ####
#### #### This will occur when the communication link    #### ####
#### #### to the cluster has gone down, as in a WAN      #### ####
#### #### failure, or when all the nodes in the          #### ####
#### #### cluster have failed.                           #### ####
#### #### DOWN - the cluster is down but nodes are       #### ####
#### #### responding. This will occur when the cluster   #### ####
#### #### is halted, but some or all of the member       #### ####
#### #### nodes are booted and communicating with the    #### ####
#### #### monitoring cluster.                            #### ####
#### #### UP - the cluster is up.                        #### ####
#### #### ERROR - there is a mismatch of cluster         #### ####
#### #### versions or a security error.                  #### ####
#### ####                                                #### ####

#### #### A change from one of these conditions to       #### ####
#### #### another one is a cluster event. You can        #### ####
#### #### define alert or alarm states based on the      #### ####
#### #### length of time since the cluster event was     #### ####
#### #### observed. Some events are noteworthy at the    #### ####
#### #### time they occur, and some are noteworthy       #### ####
#### #### when they persist over time. Setting the       #### ####
#### #### elapsed time to zero results in a message      #### ####
#### #### being sent as soon as the event takes place.   #### ####
#### #### Setting the elapsed time to 5 minutes results  #### ####
#### #### in a message being sent when the condition     #### ####
#### #### has persisted for 5 minutes.                   #### ####
#### ####                                                #### ####
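The event and elapsed-time rules in the template text can be illustrated with a minimal Python sketch. It is not part of Continentalclusters and does not use the configuration file syntax. The names Condition, is_cluster_event, and due_thresholds, and the use of seconds as the time unit, are assumptions made only for illustration.

```python
from enum import Enum


class Condition(Enum):
    """The four monitored cluster conditions named in the template text."""
    UNREACHABLE = "UNREACHABLE"
    DOWN = "DOWN"
    UP = "UP"
    ERROR = "ERROR"


def is_cluster_event(previous: Condition, current: Condition) -> bool:
    """A change from one condition to another one is a cluster event."""
    return previous is not current


def due_thresholds(thresholds_sec, persisted_sec):
    """Return, in ascending order, the configured elapsed times
    (hypothetical, in seconds) that the condition has already reached."""
    return sorted(t for t in thresholds_sec if t <= persisted_sec)


# Elapsed time 0: a message is due as soon as the event takes place.
print(due_thresholds([0, 300], persisted_sec=0))    # [0]
# Elapsed time 5 minutes: due once the condition has persisted for 300 s.
print(due_thresholds([0, 300], persisted_sec=300))  # [0, 300]
```

In this sketch a monitor would call is_cluster_event each time it samples the remote cluster, restart its timer when the condition changes, and send the message attached to each threshold that due_thresholds reports.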

#### #### An alert is intended as informational only.    #### ####
#### #### Alerts may be sent for any type of cluster     #### ####
#### #### condition. For an alert, a notification is     #### ####
#### #### sent to a system administrator or other        #### ####
#### #### destination. Alerts are not intended to        #### ####
#### #### indicate the need for recovery. The            #### ####
#### #### cmrecovercl(1m) command is disabled.           #### ####
#### ####                                                #### ####

#### #### An alarm is an indication that a condition     #### ####
#### #### exists that may require recovery. For an       #### ####
#### #### alarm, a notification is sent, and in          #### ####
#### #### addition, the cmrecovercl(1m) command is       #### ####
#### #### enabled for immediate execution, allowing      #### ####
#### #### the administrator to carry out cluster         #### ####
#### #### recovery. An alarm can only be defined for     #### ####
#### #### an UNREACHABLE or DOWN condition in the        #### ####
#### #### monitored cluster.                             #### ####
#### ####                                                #### ####
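The difference between an alert and an alarm can be summarized in a short Python sketch. This is an illustration under stated assumptions, not the product's implementation: the Definition class, its fields, and the enables_recovery property are hypothetical names, and the check simply restates the rule that an alarm can only be defined for an UNREACHABLE or DOWN condition.

```python
from dataclasses import dataclass

# Conditions named in the template text.
ALL_CONDITIONS = {"UNREACHABLE", "DOWN", "UP", "ERROR"}
# Conditions for which the text allows an alarm to be defined.
ALARM_CONDITIONS = {"UNREACHABLE", "DOWN"}


@dataclass(frozen=True)
class Definition:
    """Hypothetical in-memory form of one alert or alarm definition."""
    kind: str       # "alert" or "alarm"
    condition: str  # one of ALL_CONDITIONS

    def __post_init__(self):
        if self.kind not in ("alert", "alarm"):
            raise ValueError(f"unknown kind: {self.kind}")
        if self.condition not in ALL_CONDITIONS:
            raise ValueError(f"unknown condition: {self.condition}")
        if self.kind == "alarm" and self.condition not in ALARM_CONDITIONS:
            raise ValueError(
                "an alarm can only be defined for UNREACHABLE or DOWN")

    @property
    def enables_recovery(self) -> bool:
        """Only an alarm enables cmrecovercl(1m); an alert is informational."""
        return self.kind == "alarm"
```

For example, Definition("alert", "UP") is accepted and leaves recovery disabled, Definition("alarm", "DOWN") is accepted and enables recovery, and Definition("alarm", "UP") raises ValueError.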

#### #### A notification defines a message that is       #### ####
#### #### appended to the log file                       #### ####
#### #### /var/opt/resmon/log/cc/eventlog and sent       #### ####
#### #### to other specified destinations, including     #### ####
#### #### email addresses, SNMP traps, the system        #### ####
#### #### console, or the syslog file. The message       #### ####
#### #### string in a notification can be no more than   #### ####
#### #### 170 characters. Enter notifications in one of  #### ####
#### #### the following forms:                           #### ####
