Disaster Tolerance and Recovery in a Serviceguard Cluster

Understanding Types of Disaster Tolerant Clusters

Benefits of CLX

CLX offers a more resilient solution than Extended Distance Cluster, as it provides complete integration between Serviceguard’s application package and the data replication subsystem. CLX queries the storage subsystem to determine the state of the data on the arrays.

CLX knows that application package data is replicated between two data centers. It uses this knowledge to evaluate the status of the local and remote copies of the data: whether the local site holds the primary or the secondary copy, whether the local data is consistent, and whether the local data is current. Depending on the result of this evaluation, CLX decides whether it is safe to start the application package, whether the data must be resynchronized before the package can start, or whether manual intervention is required to determine the state of the data before the application package is started.
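The three possible outcomes of this evaluation can be illustrated with a minimal Python sketch. It is not CLX code: the names `Decision` and `evaluate` are hypothetical, the primary/secondary role is left out for brevity, and the mapping from data state to outcome is an assumption made for illustration only.

```python
from enum import Enum
from typing import Optional


class Decision(Enum):
    START = "start package"
    RESYNC_THEN_START = "resynchronize data, then start package"
    MANUAL = "manual intervention required"


def evaluate(consistent: Optional[bool], current: Optional[bool]) -> Decision:
    """Map the state of the local data copy to a startup decision.

    None means the state could not be determined.
    The mapping below is an assumed, simplified policy, not CLX's actual rules.
    """
    if consistent is None or current is None:
        # The state of the data is unknown: a person has to decide.
        return Decision.MANUAL
    if not consistent:
        # Assumption: inconsistent data is never started automatically.
        return Decision.MANUAL
    if not current:
        # Consistent but stale: bring the copy up to date first.
        return Decision.RESYNC_THEN_START
    return Decision.START


if __name__ == "__main__":
    print(evaluate(consistent=True, current=True).value)
    print(evaluate(consistent=True, current=False).value)
    print(evaluate(consistent=True, current=None).value)
```

The point of the sketch is only that the decision depends on the data state reported by the storage subsystem, not on the application itself.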

CLX lets you customize the startup behavior of application packages according to your requirements, such as data currency or application availability. By default, CLX prioritizes data consistency and data currency over application availability. If, however, you choose to prioritize availability over currency, you can configure CLX to start the application package even when the data cannot be determined to be fully current, provided that the data is consistent.
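The effect of this choice can be sketched in a few lines of Python. This is an illustration under stated assumptions, not CLX configuration syntax: `Priority` and `may_start` are hypothetical names, and the real product expresses this policy through its own configuration parameters, which are not shown here.

```python
from enum import Enum
from typing import Optional


class Priority(Enum):
    DATA_CURRENCY = "currency"      # the default behavior described above
    AVAILABILITY = "availability"   # opt-in: accept possibly stale data


def may_start(consistent: bool,
              current: Optional[bool],
              priority: Priority = Priority.DATA_CURRENCY) -> bool:
    """Return True if the package may start automatically.

    current is None when currency cannot be determined.
    Hypothetical helper; the rules are a simplification of the text above.
    """
    if not consistent:
        # Consistency is required under either priority.
        return False
    if current is True:
        return True
    # Data is consistent, but not confirmed to be fully current.
    return priority is Priority.AVAILABILITY


if __name__ == "__main__":
    print(may_start(True, None))                         # default policy
    print(may_start(True, None, Priority.AVAILABILITY))  # availability first
```

In this sketch the availability setting only relaxes the currency check, never the consistency check, which mirrors the condition stated in the paragraph above.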

CLX XP supports both synchronous and asynchronous replication modes, so you can choose whether to prioritize data currency (synchronous) or performance (asynchronous) between the data centers.

Because data replication and resynchronization are performed by the storage subsystem, CLX may provide significantly better performance than Extended Distance Cluster during recovery. Unlike Extended Distance Cluster, CLX does not require any additional CPU time for data replication, which minimizes the impact on the host.

There is little or no lag time writing to the replica, so the data remains current.

Data can be copied in both directions, so that if the primary site fails and the replica takes over, data can be copied back to the primary site when it comes back up.
