commands in their external scripts. If a timeout is not specified and your configuration has a command loop as described above, inconsistent results can occur, including a hung cluster.

Determining Why a Package Has Shut Down

You can use an external script (or the CUSTOMER DEFINED FUNCTIONS area of a legacy package control script) to find out why a package has shut down.

Serviceguard sets the environment variable SG_HALT_REASON in the package control script to one of the following values when the package halts:

failure - set if the package halts because of the failure of a subnet, resource, or service it depends on

user_halt - set if the package is halted by a cmhaltpkg or cmhaltnode command, or by corresponding actions in Serviceguard Manager

automatic_halt - set if the package is failed over automatically because of the failure of a package it depends on, or is failed back to its primary node automatically (failback_policy = automatic)

You can add custom code to the package to interrogate this variable, determine why the package halted, and take appropriate action. For legacy packages, put the code in the customer_defined_halt_cmds() function in the CUSTOMER DEFINED FUNCTIONS area of the package control script; see “Adding Customer Defined Functions to the Package Control Script” (page 307). For modular packages, put the code in the package’s external script; see “About External Scripts” (page 151).

For example, if a database package is being halted by an administrator (SG_HALT_REASON set to user_halt), you would probably want the custom code to perform an orderly shutdown of the database; on the other hand, a forced shutdown might be needed if SG_HALT_REASON is set to failure, indicating that the package is halting abnormally (for example, because of the failure of a service it depends on).
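The database example above might be sketched as follows for a legacy package. This is a minimal illustration, not the product's reference code: choose_shutdown_mode is a hypothetical helper, the echo line stands in for whatever shutdown command your database actually provides, and treating automatic_halt (or an unset value) as an orderly shutdown is an assumption made only for this sketch.

```shell
# Hypothetical helper: map SG_HALT_REASON to a shutdown mode.
choose_shutdown_mode() {
    case "${SG_HALT_REASON:-}" in
        user_halt) echo "orderly" ;;  # administrator halted the package
        failure)   echo "forced" ;;   # package is halting abnormally
        *)         echo "orderly" ;;  # assumption: automatic_halt or unset
    esac
}

# Function name taken from the CUSTOMER DEFINED FUNCTIONS area of a
# legacy package control script.
customer_defined_halt_cmds() {
    mode=$(choose_shutdown_mode)
    # Placeholder: replace this line with your database's real
    # orderly or forced shutdown command.
    echo "halting database: mode=$mode reason=${SG_HALT_REASON:-unset}"
}
```

For a modular package, the same case statement could be placed in the package's external script instead.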

last_halt_failed

cmviewcl -v -f line displays a last_halt_failed flag.

NOTE: last_halt_failed appears only in the line output of cmviewcl, not the default tabular format; you must use the -v and -f line options to see it.

The value of last_halt_failed is no if the halt script ran successfully, or was not run since the node joined the cluster, or was not run since the package was configured to run on the node; otherwise it is yes.
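As a rough sketch, you could filter the line output for entries whose halt script failed. The helper name halt_failures is hypothetical, and the pattern assumes the flag is printed in name=value form (last_halt_failed=yes); check the actual output on your system before relying on it.

```shell
# Hypothetical helper: read cmviewcl line output on standard input and
# print only the entries that report a failed halt script.
# Assumption: the flag appears as the text "last_halt_failed=yes".
halt_failures() {
    grep 'last_halt_failed=yes'
}
```

A typical invocation would be cmviewcl -v -f line | halt_failures; the function prints nothing and returns a non-zero status when no entry matches.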

About Cross-Subnet Failover

It is possible to configure a cluster that spans subnets joined by a router, with some nodes using one subnet and some another. This is known as a cross-subnet configuration; see “Cross-Subnet Configurations” (page 30). In this context, you can configure packages to fail over from a node on one subnet to a node on another.

The implications for configuring a package for cross-subnet failover are as follows:

For modular packages, you must configure two new parameters in the package configuration file to allow packages to fail over across subnets:

ip_subnet_node (page 243) - to indicate which nodes a subnet is configured on

monitored_subnet_access (page 241) - to indicate whether a monitored subnet is configured on all nodes (FULL) or only some (PARTIAL). (Leaving

