The purpose of the NETWORK_AUTO_FAILBACK parameter is to allow you to control Serviceguard's behavior if you find network interfaces are experiencing a “ping-pong” effect, switching back and forth unduly between primary and standby interfaces. If you are not seeing any such problem, leave NETWORK_AUTO_FAILBACK set to YES. You can track switching behavior in the syslog file.

NOTE: The NETWORK_AUTO_FAILBACK setting applies only to link-level failures, not to failures at the IP level; see “Monitoring LAN Interfaces and Detecting Failure: IP Level” (page 73) for more information about such failures. For more information about the cluster configuration file, see “Cluster Configuration Parameters” (page 109).

Remote Switching

A remote switch (that is, a package switch) involves moving packages to a new system. In the most common configuration, in which all nodes are on the same subnet(s), the package IP (relocatable IP; see “Stationary and Relocatable IP Addresses” (page 67)) moves as well, and the new system must already have the subnet configured and working properly; otherwise the packages will not be started.

NOTE: It is possible to configure a cluster that spans subnets joined by a router, with some nodes using one subnet and some another. This is called a cross-subnet configuration. In this context, you can configure packages to fail over from a node on one subnet to a node on another, and you will need to configure a relocatable address for each subnet the package is configured to start on; see “About Cross-Subnet Failover” (page 154), and in particular the subsection “Implications for Application Deployment” (page 155).

When a remote switch occurs, TCP connections are lost. TCP applications must reconnect to regain connectivity; this is not handled automatically. Note that if the package is dependent on multiple subnets (specified as monitored_subnets in the package configuration file), all those subnets must normally be available on the target node before the package will be started. (In a cross-subnet configuration, all subnets configured on that node, and identified as monitored subnets in the package configuration file, must be available.)
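The subnet rule above can be expressed as a simple set check. The following Python sketch is illustrative only and is not Serviceguard code: the function name and its parameters are hypothetical, and subnets are represented as plain strings. It assumes that, in a cross-subnet configuration, only the monitored subnets that are also configured on the target node need to be available.

```python
from typing import Iterable, Optional


def subnets_ready_for_package(
    monitored: Iterable[str],
    available_on_target: Iterable[str],
    configured_on_target: Optional[Iterable[str]] = None,
) -> bool:
    """Return True if the target node satisfies the package's subnet rule.

    monitored            -- subnets the package lists as monitored subnets
    available_on_target  -- subnets currently up on the target node
    configured_on_target -- pass this only for a cross-subnet configuration;
                            the requirement is then limited to the monitored
                            subnets that are configured on that node
    """
    required = set(monitored)
    if configured_on_target is not None:
        # Cross-subnet case: ignore monitored subnets this node does not have.
        required &= set(configured_on_target)
    return required <= set(available_on_target)


monitored = ["192.0.2.0", "198.51.100.0"]

# Same-subnet cluster: one monitored subnet is down on the target node.
print(subnets_ready_for_package(monitored, ["192.0.2.0"]))  # False

# Cross-subnet cluster: the target node is configured with only one of the
# monitored subnets, and that subnet is up.
print(subnets_ready_for_package(
    monitored,
    available_on_target=["198.51.100.0"],
    configured_on_target=["198.51.100.0"],
))  # True
```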

Note that remote switching is supported only between LANs of the same type. For example, a remote switchover between an Ethernet interface on one machine and an IPoIB interface on the failover machine is not supported. The remote switching of relocatable IP addresses is shown in Figure 14 and Figure 15.

Address Resolution Messages after Switching on the Same Subnet

When a relocatable IPv4 address is moved to a new interface, either locally or remotely, an ARP message is broadcast to indicate the new mapping between IP address and link layer address. An ARP message is sent for each IPv4 address that has been moved. All systems receiving the broadcast should update the associated ARP cache entry to reflect the change. Currently, the ARP messages are sent at the time the IP address is added to the new system. An ARP message is sent in the form of an ARP request. The sender and receiver protocol address fields of the ARP request message are both set to the same relocatable IP address. This ensures that nodes receiving the message will not send replies.
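To make the message layout concrete, the following Python sketch builds the bytes of an Ethernet frame carrying an ARP request in which the sender and target protocol address fields hold the same IPv4 address. It is a minimal illustration, not the code Serviceguard uses: the function name, the sample MAC address, and the sample IP address are hypothetical, and filling the target hardware address with zeros is an assumption (the text above does not say what Serviceguard places in that field). The sketch only constructs the frame; it does not send it.

```python
import socket
import struct


def build_arp_announcement(mac: str, ip: str) -> bytes:
    """Build a broadcast Ethernet frame holding an ARP request whose sender
    and target protocol addresses are both set to `ip`."""
    hw_addr = bytes.fromhex(mac.replace(":", ""))
    ip_addr = socket.inet_aton(ip)

    # Ethernet header: broadcast destination, our source, EtherType 0x0806 (ARP).
    eth_header = b"\xff" * 6 + hw_addr + struct.pack("!H", 0x0806)

    # ARP fixed header: hardware type 1 (Ethernet), protocol type 0x0800 (IPv4),
    # hardware length 6, protocol length 4, operation 1 (request).
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)
    arp += hw_addr + ip_addr      # sender hardware and protocol address
    arp += b"\x00" * 6 + ip_addr  # target hardware (assumed zero) and protocol address

    return eth_header + arp


frame = build_arp_announcement("00:11:22:33:44:55", "192.0.2.10")
print(len(frame))                    # 42
print(frame[28:32] == frame[38:42])  # True
```

Bytes 28 to 31 of the frame hold the sender protocol address and bytes 38 to 41 hold the target protocol address, so the final comparison confirms that both fields carry the same relocatable address.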

IPv6 does not use ARP; instead, IPv6 nodes use Neighbor Discovery Protocol (NDP) messages to determine the link-layer addresses of their neighbors.

Monitoring LAN Interfaces and Detecting Failure: IP Level

In addition to monitoring network interfaces at the link level, Serviceguard can also monitor the IP level, checking Layer 3 health and connectivity for both IPv4 and IPv6 subnets. This is done by the IP Monitor, which is configurable: you can enable IP monitoring for any subnet configured into the cluster, but you do not have to monitor any. You can configure IP monitoring for a subnet, or turn off monitoring, while the cluster is running.

How the Network Manager Works 73
