Failure of the entire secondary data center for a given application package

Failure of the secondary P9000 or XP Series disk array for a given application package while the application is running on a primary host

The following is a partial list of failures that require full resynchronization to restore disaster-tolerant data protection. For these failures, full resynchronization is initiated automatically when you move the application package back to its primary host after repairing the failure:

Failure of the entire primary data center for a given application package

Failure of all of the primary hosts for a given application package

Failure of the primary P9000 or XP Series disk array for a given application package

Failure of all Continuous Access links with restart of the application on a secondary host

Pairs must be manually recreated if both the primary and secondary P9000 or XP Series disk arrays are in the SMPL (simplex) state. Periodically review syslog.log and /etc/cmcluster/pkgname/pkgname.log for messages, warnings, and recommended actions. Also review these files after any system, data center, or application failure.
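The log review described above can be partly scripted. The following is a minimal sketch, not part of the product: the function name scan_logs and the keyword list (warning, error, fail) are illustrative assumptions, and the keywords may not match every message that Metrocluster or Serviceguard writes. The syslog path in the usage comment is the usual HP-UX location and is also an assumption.

```shell
#!/bin/sh
# scan_logs FILE...
# Print every line that mentions a warning, error, or failure, prefixed
# with the name of the file it came from. The keyword list is an
# illustrative assumption; adjust it to the messages you care about.
scan_logs() {
    for f in "$@"; do
        if [ -r "$f" ]; then
            # Passing /dev/null as a second file makes grep print the
            # file name in front of each matching line.
            grep -i -E 'warning|error|fail' "$f" /dev/null || true
        else
            echo "skipped (not readable): $f" >&2
        fi
    done
    return 0
}

# Example use (pkgname is a placeholder for your package name):
#   scan_logs /var/adm/syslog/syslog.log /etc/cmcluster/pkgname/pkgname.log
```

A script like this only narrows down what to read. The recommended actions in the log files still need to be reviewed by an administrator.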

Full resynchronization must be manually initiated after repairing the following failures:

Failure of the secondary P9000 or XP Series disk array for a given application package followed by application startup on a primary host

Failure of all Continuous Access links with Fence Level NEVER or ASYNC, followed by restart of the application on a primary host

Using the pairresync Command

The pairresync command can be used with special options after a failover in which the recovery site has started the application and processed transaction data on its disks, while the disks at the primary site remain intact. After the Continuous Access link is repaired, run the pairresync command in one of the following two ways, depending on which site you are on:

pairresync -swapp (run from the primary site)

pairresync -swaps (run from the failover site)
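The choice between the two options can be sketched as a small helper. This is a hedged illustration only: the function names swap_option and resync_command and the device group name pkgdg are hypothetical, and the sketch assumes the device group is passed with the -g option. The helper prints the command instead of running it, so it can be checked before use.

```shell
#!/bin/sh
# swap_option SITE
# Print the pairresync swap option for the site the command is run from.
swap_option() {
    case "$1" in
        primary)  echo "-swapp" ;;   # run from the primary site
        failover) echo "-swaps" ;;   # run from the failover site
        *) echo "unknown site: $1" >&2; return 1 ;;
    esac
}

# resync_command SITE DEVICE_GROUP
# Print (do not run) the resynchronization command for this site.
# DEVICE_GROUP is whatever group name your Raid Manager configuration uses.
resync_command() {
    opt=$(swap_option "$1") || return 1
    echo "pairresync -g $2 $opt"
}

# Example use (pkgdg is a hypothetical device group name):
#   resync_command failover pkgdg
```

Printing the command first is a deliberate choice here, because a swap resynchronization reverses the copy direction and is worth confirming before it is executed.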

These options take advantage of the fact that the recovery site maintains a bitmap of the modified data sectors on the recovery array. Either version of the command swaps the personalities of the volumes, with the PVOL becoming the SVOL and the SVOL becoming the PVOL. With the personalities swapped, any data that has been written to the volume on the failover site (now the PVOL) is copied back to the SVOL (now on the primary site). During this time the package continues running on the failover site. After resynchronization is complete, you can halt the package on the failover site and restart it on the primary site. Metrocluster will then swap the personalities between the PVOL and the SVOL, returning PVOL status to the primary site.

NOTE: The preceding steps are automated provided that the AUTO_PSUEPSUS variable is set to its default value of 1. Once the Continuous Access link failure has been fixed, you only need to halt the package on the target disk site and restart it on the source disk site. However, to reduce application downtime, manually invoke pairresync before failback.

Failback

After resynchronization is complete, you can halt the package on the failover site, and restart it on the primary site. Metrocluster will then swap the personalities between the PVOL and the SVOL, returning PVOL status to the primary site.
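The failback sequence can be sketched as follows. This is a minimal illustration under stated assumptions: the function name failback_plan, the package name, and the node name are hypothetical, and the sketch assumes the standard Serviceguard commands cmhaltpkg and cmrunpkg are used to halt and start the package. It prints the commands in order instead of running them, and it does not check that resynchronization has finished.

```shell
#!/bin/sh
# failback_plan PACKAGE PRIMARY_NODE
# Print, in order, the commands for moving a package back to the primary
# site once resynchronization is complete. Nothing is executed.
failback_plan() {
    pkg=$1
    primary_node=$2
    if [ -z "$pkg" ] || [ -z "$primary_node" ]; then
        echo "usage: failback_plan PACKAGE PRIMARY_NODE" >&2
        return 1
    fi
    # Step 1: halt the package where it is running (the failover site).
    echo "cmhaltpkg $pkg"
    # Step 2: start the package on a node at the primary site.
    echo "cmrunpkg -n $primary_node $pkg"
}

# Example use (pkgA and node1 are placeholder names):
#   failback_plan pkgA node1
```

Confirm that the pair has finished resynchronizing before running the printed commands, since the application is unavailable between the halt and the restart.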
