HP Serviceguard Continentalcluster 209

NOTE: The preceding steps are automated if the auto variable AUTO_PSUEPSUS is left at its default value of 1. Once the Continuous Access link failure has been fixed, you only need to halt the package at the failover site and restart it on the primary site. However, to reduce application downtime, manually run pairresync before failback.

Full resynchronization must be manually initiated (as described in the next section) after repairing the following failures:

•failure of the recovery P9000 or XP disk array for a given application package followed by application startup on a primary host

•failure of all Continuous Access links with Fence Level NEVER or ASYNC with restart of the application on a primary host

Pairs must be manually recreated if both the primary and recovery P9000 or XP disk arrays are in the SMPL (simplex) state.

Periodically review the following files for messages, warnings, and recommended actions, especially after system, data center, or application failures:

•/var/adm/syslog/syslog.log

•/etc/cmcluster/<package-name>/<package-name>.log

•/etc/cmcluster/<bkpackage-name>/<bkpackage-name>.log
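The periodic review of the files listed above can be partly scripted. The following is a minimal sketch, not part of the product: the function name scan_logs and the search pattern (lines containing "warning" or "error") are assumptions chosen for illustration, and the pattern should be adjusted to the messages that matter at your site.

```shell
#!/bin/sh
# Sketch only: scan_logs is a hypothetical helper, not a Serviceguard command.
# It prints every line containing "warning" or "error" (case-insensitive)
# from each readable log file passed as an argument.
scan_logs() {
    for log in "$@"; do
        if [ -r "$log" ]; then
            # Adding /dev/null as a second file makes grep prefix each
            # match with the name of the file it came from.
            grep -iE 'warning|error' "$log" /dev/null || :
        else
            echo "skipping unreadable file: $log" >&2
        fi
    done
}

# Example call (pkgA is a hypothetical package name):
#   scan_logs /var/adm/syslog/syslog.log /etc/cmcluster/pkgA/pkgA.log
```

Matches are printed as file:line, so output from several logs can be read together or redirected to a report file.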

Using the pairresync Command

The pairresync command can be used with special options after a failover in which the recovery site has started the application and has processed transaction data on the disk at the recovery site, but the disks on the primary site are intact. After the Continuous Access link is fixed, depending on which site you are on, use the pairresync command in one of the following two ways:

•pairresync -swapp—from the primary site.

•pairresync -swaps—from the failover site.

These options take advantage of the fact that the recovery site maintains a bitmap of the modified data sectors on the recovery array. Either version of the command swaps the personalities of the volumes, with the PVOL becoming the SVOL and the SVOL becoming the PVOL. With the personalities swapped, any data that has been written to the volume on the failover site (now the PVOL) is copied back to the SVOL, which now resides on the primary site. During this time the package continues running on the failover site. After resynchronization is complete, you can halt the package on the failover site and restart it on the primary site. Metrocluster then swaps the personalities of the PVOL and the SVOL again, returning PVOL status to the primary site.
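The choice between the two options depends only on where the command is run. The following minimal sketch shows that choice as a shell function that prints the command line instead of executing it. The function name resync_command, the site labels "primary" and "failover", and the device group name are hypothetical; the sketch also assumes the device group is selected with the -g option, as is usual for Raid Manager pair commands.

```shell
#!/bin/sh
# Sketch only: resync_command is a hypothetical helper. It prints the
# pairresync command line appropriate for the site it is run from,
# without executing it.
#   $1  site label: "primary" or "failover" (labels are assumptions)
#   $2  Raid Manager device group name (hypothetical in the example below)
resync_command() {
    site=$1
    group=$2
    case "$site" in
        primary)
            # Run from the primary site: swap using -swapp.
            echo "pairresync -g $group -swapp"
            ;;
        failover)
            # Run from the failover site: swap using -swaps.
            echo "pairresync -g $group -swaps"
            ;;
        *)
            echo "unknown site: $site" >&2
            return 1
            ;;
    esac
}

# Example call (pkgdg is a hypothetical device group):
#   resync_command failover pkgdg
```

Printing the command first lets an operator confirm the site and device group before running it by hand.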

Some Further Points

•This toolkit may increase package startup time by 5 minutes or more. Packages with many disk devices will take longer to start up than those with fewer devices due to the time needed to get device status from the P9000 or XP disk array or to synchronize.

NOTE: Long delays in package startup occur when recovering from broken pair affinity.

•The value of RUN_SCRIPT_TIMEOUT in the package ASCII file should be set to NO_TIMEOUT or to a value large enough to allow for the extra startup time needed to get status from the P9000 or XP disk array. (See the previous paragraph for more information on the extra startup time.)

•Online cluster configuration changes may require a Raid Manager configuration file to be changed. Whenever the configuration file is changed, the Raid Manager instance must be
