The following is a partial list of failures that require full resynchronization to restore disaster-tolerant data protection. After the failure is repaired, resynchronization is initiated automatically when the application package is moved back to its primary host.

Failure of the entire primary Data Center for a given application package.

Failure of all of the primary hosts for a given application package.

Failure of the primary P9000 and XP disk array for a given application package.

Failure of all Continuous Access links with application restart on a secondary host.

NOTE: The preceding steps are automated provided the auto variable AUTO_PSUEPSUS is set to its default value of 1. After the Continuous Access link failure is fixed, you must halt the package at the failover site and restart it on the primary site. To reduce downtime, manually invoke pairresync before failback.
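The halt-and-restart sequence in the note can be sketched as a dry run that only prints the commands. This is a minimal sketch, not a documented procedure: it assumes the standard Serviceguard commands cmhaltpkg and cmrunpkg, and the function name failback_plan and both of its arguments are hypothetical placeholders.

```shell
#!/bin/sh
# Dry-run sketch: print the failback commands instead of executing them.
# failback_plan, pkg, and primary_node are hypothetical names for illustration.
failback_plan() {
    pkg=$1
    primary_node=$2
    # Halt the package where it is currently running (the failover site).
    echo "cmhaltpkg $pkg"
    # Restart the package on a node at the primary site.
    echo "cmrunpkg -n $primary_node $pkg"
}
```

For example, failback_plan pkgA node1 prints the two commands so that they can be reviewed before being run by hand.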

Full resynchronization must be manually initiated (as described in the next section) after repairing the following failures:

Failure of the recovery P9000 and XP disk array for a given application package followed by application startup on a primary host.

Failure of all Continuous Access links with Fence Level NEVER or ASYNC with restart of the application on a primary host.

Pairs must be manually recreated if both the primary and recovery P9000 and XP disk arrays are in the SMPL (simplex) state.

Periodically review the following files for messages, warnings, and recommended actions. HP recommends reviewing these files after system, data center, and application failures.

/var/adm/syslog/syslog.log

/etc/cmcluster/<package-name>/<package-name>.log

/etc/cmcluster/<bkpackage-name>/<bkpackage-name>.log
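The periodic review of these files could be partly scripted. The following is a minimal sketch only: the function name scan_logs is hypothetical, and the search pattern (lines containing "warning" or "error") is an assumption about what is worth flagging, not a list of the messages the product actually writes.

```shell
#!/bin/sh
# Sketch: print lines that look like warnings or errors from each log file given.
# scan_logs is a hypothetical helper; the pattern is an assumption.
scan_logs() {
    for log in "$@"; do
        if [ -r "$log" ]; then
            echo "== $log"
            # grep exits non-zero when nothing matches; that is not a failure here.
            grep -i -E 'warning|error' "$log" || true
        else
            echo "== $log (not readable, skipped)"
        fi
    done
    return 0
}
```

A call might look like scan_logs /var/adm/syslog/syslog.log "/etc/cmcluster/$PKG/$PKG.log", where PKG is a placeholder for the package name.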

Using the pairresync command

The pairresync command can be used with special options after a failover in which the recovery site has started the application and has processed transaction data on the disk at the recovery site, but the disks on the primary site are intact. After the Continuous Access link is fixed, depending on which site you are on, use the pairresync command in one of the following two ways:

pairresync -swapp: run from the primary site.

pairresync -swaps: run from the failover site.

These options take advantage of the fact that the recovery site maintains a bitmap of the modified data sectors on the recovery array. Either version of the command swaps the personalities of the volumes, so that the PVOL becomes the SVOL and the SVOL becomes the PVOL. With the personalities swapped, data written to the volume on the failover site (now the PVOL) is copied to the SVOL, which now resides on the primary site. During this time, the package continues running on the failover site. After resynchronization is complete, you can halt the package on the failover site and restart it on the primary site. When the package restarts, Metrocluster swaps the personalities of the PVOL and the SVOL again, returning PVOL status to the primary site.
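The choice between the two options depends only on where the command is run, which can be sketched as a small helper that prints the command rather than executing it. This is an illustration under assumptions: the function name resync_command and the device group argument are hypothetical, and the sketch assumes the volumes are addressed through a RAID Manager device group passed with -g.

```shell
#!/bin/sh
# Sketch: pick the pairresync swap option from the site the command is run on.
# resync_command and devgrp are hypothetical names for illustration.
resync_command() {
    site=$1
    devgrp=$2
    case "$site" in
        primary)  echo "pairresync -g $devgrp -swapp" ;;  # run from the primary site
        failover) echo "pairresync -g $devgrp -swaps" ;;  # run from the failover site
        *)        echo "unknown site: $site" >&2; return 1 ;;
    esac
}
```

For example, resync_command failover dg1 prints the -swaps form for a device group named dg1. Printing the command first leaves the decision to run it with the administrator, which fits a step that reverses the copy direction.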

Additional points

This toolkit might increase package startup time by 5 minutes or more. Packages with many disk devices will take longer to start up than those with fewer devices because of the time required to get device status from the P9000 and XP disk array or to synchronize.
