Remote Copy Link Failure and Resume Modes

When the link fails, replication stops and snapshots are created for all the primary volumes, but not for the secondary volumes. When replication is restarted for a volume, all differences between the base volume and the snapshot taken when replication stopped are sent over to resynchronize the secondary volume with the primary volume.

When the Remote Copy links are recovered, HP 3PAR Remote Copy automatically restarts replication if the auto_recover policy is set. If the auto_recover policy is not set, you can copy any writes from the primary group to the secondary group after the links are restored by running the startrcopygroup command on the system that holds the primary group, which resynchronizes the primary and secondary groups.
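For example, a minimal sketch of the manual resynchronization, assuming the primary Remote Copy volume group is named rcgroup1 (a hypothetical name), run on the system that holds the primary group:

    # startrcopygroup rcgroup1

Exact options can vary between HP 3PAR OS releases; see the HP 3PAR CLI reference for your release.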

Restoring replication after a failover

When the primary package fails over to the remote site while the links are down or the primary storage system is down, Metrocluster runs the setrcopygroup failover command. This command changes the role of the Remote Copy volume group on the storage system in the recovery site from Secondary to Primary-Rev. In this role, data is not replicated from the recovery site to the primary site. After the links are restored or the primary storage system is restored, manually run the setrcopygroup recover command on the storage system in the recovery site to resynchronize the data from the recovery site to the primary site. This changes the role of the Remote Copy volume group on the storage system in the primary site from Primary to Secondary-Rev.
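For example, a minimal sketch of the recovery step, assuming the Remote Copy volume group is named rcgroup1 (a hypothetical name), run from the HP 3PAR CLI on the storage system in the recovery site:

    # setrcopygroup recover rcgroup1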

CAUTION: When the roles are Secondary-Rev and Primary-Rev, a disaster on the recovery site results in a failure of the Metrocluster package. To avoid this, immediately halt the package on the recovery site and start it on the primary site. This restores the Remote Copy volume groups to their original roles of Primary and Secondary.
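For example, a minimal sketch of this sequence, assuming a Metrocluster package named pkg1 and a primary-site node named node1 (both hypothetical names):

    # cmhaltpkg pkg1
    # cmrunpkg -n node1 pkg1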

Administering Continentalclusters using SADTA configuration

This section describes the procedures that must be followed to administer a SADTA configuration in which complex workloads other than Oracle RAC are configured.

Maintaining a Node

To perform maintenance procedures on a cluster node, the node must be removed from the cluster. Run the cmhaltnode -f command to move the node out of the cluster. This command halts the complex workload package instance running on the node. As long as there are other nodes in the site and the Site Controller Package is still running on the site, the site aware disaster recovery workload continues to run on the same site with one less instance.
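For example, a minimal sketch, assuming the node to be maintained is named node1 (a hypothetical name):

    # cmhaltnode -f node1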

Once the node maintenance procedures are complete, join the node to the cluster using the cmrunnode command. If the Site Controller Package is running on the site to which the node belongs, the active complex-workload package instances on the site must be started manually on the restarted node because the auto_run flag is set to no.
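For example, a minimal sketch, assuming the restarted node is named node1 and the complex-workload package instance is named wl_pkg1 (both hypothetical names):

    # cmrunnode node1
    # cmrunpkg -n node1 wl_pkg1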

Before halting a node in the cluster, the Site Controller Package must be moved to a different node in the site. However, if the node to be halted is the last surviving node in the site, the Site Controller Packages running on it must be moved to the other site. In such scenarios, the site aware disaster recovery workload must be moved to the remote site before halting the node in the cluster. For more information about moving a site aware disaster recovery complex workload to a remote cluster, see “Moving a Complex Workload to the Recovery Cluster” (page 70).
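For example, a minimal sketch of moving the Site Controller Package to another node in the same site before the node is halted, assuming the package is named sc_pkg1 and the alternate node is named node2 (both hypothetical names):

    # cmhaltpkg sc_pkg1
    # cmrunpkg -n node2 sc_pkg1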

Maintaining the Site

A maintenance operation at a site might require that all the nodes on that site be brought down. In such scenarios, the site aware disaster tolerant workload can be started on the other site in the recovery cluster.
