If, after cleaning up the node on which the timeout occurred, you want that node to be available as an alternate for running the package, remember to re-enable the package to run on the node:

cmmodpkg -e -n <node-name> <package-name>

If the package failed because of a failure in a generic resource of evaluation type during_package_start, then the status of the generic resource will be 'down'. The switching parameter will be disabled on the node.

To enable switching:

First fix the resource that went down.

Once the resource is available, set the status of the generic resource to 'up' by running: cmsetresource -r <resource_name> -s up

NOTE: If the resource is an extended resource, make sure you set the generic resource to a value that satisfies the generic_resource_up_criteria.

Re-enable the package to run on the node by running: cmmodpkg -e -n <node-name> <package-name>
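The last two steps above can be sketched as a small shell function. This is only an illustration, not part of Serviceguard: the names reenable_package, run, and DRY_RUN are hypothetical, and the function assumes the failed resource has already been fixed and that it is a simple resource whose status can be set to 'up'. By default it only prints the commands so that they can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper: print each command when DRY_RUN is 1 (the default),
# otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical wrapper around the two commands given in the steps above.
# Arguments: <resource_name> <node-name> <package-name>
reenable_package() {
    resource=$1
    node=$2
    package=$3
    # Mark the generic resource as 'up', then re-enable the package on the node.
    # The second command runs only if the first one succeeds.
    run cmsetresource -r "$resource" -s up &&
        run cmmodpkg -e -n "$node" "$package"
}
```

Calling reenable_package <resource_name> <node-name> <package-name> prints the two commands. Setting DRY_RUN=0 in the environment would execute them instead.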

The default Serviceguard control scripts are designed to take the straightforward steps needed to get an application running or stopped. If the package administrator specifies a time limit within which these steps need to occur and that limit is subsequently exceeded for any reason, Serviceguard takes the conservative approach that the control script logic must be either hung or defective in some way. At that point the control script cannot be trusted to perform cleanup actions correctly, so the script is terminated and the package administrator is given the opportunity to assess what cleanup steps must be taken.

If you want the package to switch automatically in the event of a control script timeout, set the node_fail_fast_enabled parameter (page 244) to yes. Serviceguard will then cause the node where the control script timed out to halt (system reset). This effectively cleans up any side effects of the package's run or halt attempt, but remember that the system reset will cause all the packages running on that node to halt abruptly. The package will then be automatically restarted on any available alternate node for which it is configured. For more information, see “Responses to Package and Service Failures” (page 90).
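Before relying on this behavior, it can help to confirm what a package configuration file actually specifies. The following is a minimal sketch, assuming the file holds one "parameter value" pair per line separated by whitespace, with comment lines starting with "#". The function name fail_fast_setting is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the value of node_fail_fast_enabled found in the
# package configuration file given as $1, or "unset" if it is not present.
# Assumption: "parameter value" lines; lines starting with "#" are comments.
fail_fast_setting() {
    awk '
        # A commented-out line has "#" in its first field, so it never matches.
        $1 == "node_fail_fast_enabled" { value = $2 }
        END { print (value == "" ? "unset" : value) }
    ' "$1"
}
```

Running fail_fast_setting on a package configuration file prints yes only when the parameter is explicitly set to yes in that file. A result of unset means the file does not set the parameter at all.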

Problems with Cluster File System (CFS)

NOTE: Check the Serviceguard/SGeRAC/SMS/Serviceguard Manager Plug-in Compatibility and Feature Matrix and the latest Release Notes for your version of Serviceguard for up-to-date information about support for CFS (http://www.hp.com/go/hpux-serviceguard-docs).

If you have a system multi-node package for Veritas CFS, you may not be able to start the cluster until SG-CFS-pkg starts. Check SG-CFS-pkg.log for errors.

You will have trouble running the cluster if there is a discrepancy between the CFS cluster and the Serviceguard cluster. To check, use gabconfig -a.

The ports that must be up are:

1. a: llt, gab

2. b: vxfen

3. v, w: cvm

4. f: cfs
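The port check above can be scripted. The following is a sketch rather than a supported tool: it assumes that gabconfig -a prints one line per active port beginning with "Port" followed by the port letter, and the function name check_gab_ports is hypothetical. It reads the command output on standard input and reports any of the required ports (a, b, v, w, f) that do not appear.

```shell
#!/bin/sh
# Hypothetical helper: read `gabconfig -a` output on standard input and report
# which of the required ports are missing.
# Assumption: each active port is listed on a line starting "Port <letter> ".
check_gab_ports() {
    required="a b v w f"    # a: llt, gab; b: vxfen; v and w: cvm; f: cfs
    output=$(cat)
    missing=""
    for port in $required; do
        if ! printf '%s\n' "$output" | grep -q "^Port $port "; then
            missing="$missing $port"
        fi
    done
    if [ -n "$missing" ]; then
        echo "Missing GAB ports:$missing"
        return 1
    fi
    echo "All required GAB ports are up"
    return 0
}
```

A typical invocation would be gabconfig -a | check_gab_ports. A non-zero exit status means at least one required port was not listed.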

