Alternate Configuration—Fast Reconfiguration with Low Node Member Timeout

High RAC-IC traffic can interfere with SG-HB traffic and cause unnecessary member timeouts when the Serviceguard cluster configuration parameter MEMBER_TIMEOUT is set low. If MEMBER_TIMEOUT cannot be increased, dedicating an additional network to SG-HB alone avoids these unnecessary member timeouts when RAC-IC traffic is high. This configuration is for clusters of two or more nodes that have high RAC-IC traffic, need faster failover (a low value for the Serviceguard configuration parameter MEMBER_TIMEOUT), or both.

NOTE: Starting with Serviceguard A.11.19, the faster failover capability is in core Serviceguard. This configuration can be used for faster failover.
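
As a rough illustration of the layout described above, a cluster configuration file fragment might look like the following sketch. The node name, interface names, IP addresses, and MEMBER_TIMEOUT value are hypothetical. The sketch assumes that the second heartbeat subnet is the one that also carries RAC-IC and CSS-HB traffic, and that MEMBER_TIMEOUT is expressed in microseconds. Check the parameter descriptions for the installed Serviceguard release before using any of these values.

```
# Hypothetical cluster configuration fragment (one node shown).
# All names, addresses, and values are illustrative assumptions.

NODE_NAME                 node1

  # Primary interface on the subnet dedicated to SG-HB
  NETWORK_INTERFACE       lan1
    HEARTBEAT_IP          192.168.1.1

  # Standby interface for lan1 (no IP address configured)
  NETWORK_INTERFACE       lan2

  # Primary interface on the subnet assumed to carry RAC-IC, CSS-HB,
  # and a second SG-HB
  NETWORK_INTERFACE       lan3
    HEARTBEAT_IP          192.168.2.1

  # Standby interface for lan3 (no IP address configured)
  NETWORK_INTERFACE       lan4

# Low member timeout for faster failover (assumed to be in microseconds)
MEMBER_TIMEOUT            3000000
```

With heartbeat configured on both subnets, losing either subnet alone leaves one heartbeat path intact.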

Figure 8 SG-HB/RAC-IC Traffic Separation

Each primary and standby pair protects against a single failure. With SG-HB on more than one subnet, a single subnet failure does not trigger a Serviceguard reconfiguration. If the subnet carrying CSS-HB fails and subnet monitoring is not used, CSS resolves the interconnect subnet failure with a CSS cluster reconfiguration. CSS waits for the CSS misscount time interval before handling the failure, which it does by bringing down the node on which the CSS-HB subnet has failed.

The default value of CSS misscount in SGeRAC configurations is 600 seconds.

As shown in Figure 8, CLUSTER_INTERCONNECT_SUBNET can be used in conjunction with the NODE_FAIL_FAST_ENABLED package configuration parameter to monitor the CSS-HB network. A failure of the CSS-HB subnet on a node should be handled by bringing down that node. Therefore, set NODE_FAIL_FAST_ENABLED to YES for the package monitoring the CSS-HB subnet. When Oracle Clusterware is configured as a multi-node package and CLUSTER_INTERCONNECT_SUBNET is used to monitor the CSS-HB subnet, a failure of the monitored subnet on a node brings down both the instance of the multi-node package on that node and the node itself.
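
The monitoring arrangement described above can be sketched as a package configuration fragment. This is a minimal sketch only: the package name and subnet address are hypothetical, and the uppercase (legacy-style) parameter spelling is assumed because it matches the names used in this section.

```
# Hypothetical fragment of an Oracle Clusterware multi-node package
# configuration file. Package name and subnet are illustrative assumptions.

PACKAGE_NAME                  crs_pkg
PACKAGE_TYPE                  MULTI_NODE

# Bring the node down if this package fails on it
NODE_FAIL_FAST_ENABLED        YES

# Subnet carrying CSS-HB; its failure on a node fails the package instance there
CLUSTER_INTERCONNECT_SUBNET   192.168.2.0
```

In this sketch, the subnet failure is acted on by Serviceguard as soon as it is detected, without waiting for the CSS misscount interval to expire.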

A failure of the CSS-HB subnet on all nodes causes the multi-node package to fail on the nodes one by one, each failure bringing down its node, until only one instance of the multi-node package and its node remain to provide services to the clients.

Use a separate package to monitor only the CSS-HB subnet, and have the Oracle Clusterware multi-node package depend on the package monitoring the CSS-HB subnet. The NODE_FAIL_FAST_ENABLED parameter is set to NO for the Oracle Clusterware package, and is set to YES for the package
