HP serviceguard t2808-90006 manual Differences Between Extended Distance Cluster and CLX


Disaster Tolerance and Recovery in a Serviceguard Cluster

Understanding Types of Disaster Tolerant Clusters

•Disk resynchronization is independent of CPU failure (that is, if the hosts at the primary site fail but the disk remains up, the disk knows it does not have to be resynchronized).

Differences Between Extended Distance Cluster and CLX

The major differences between an Extended Distance Cluster and a CLX cluster are:

•The methods used to replicate data between the storage devices in the two data centers. The two basic methods available for replicating data between the data centers for Linux clusters are host-based and storage array-based. Extended Distance Cluster always uses host-based replication (MD mirroring on Linux). Any mix of Serviceguard-supported Fibre Channel storage can be used in an Extended Distance Cluster. CLX always uses array-based replication/mirroring, and requires storage from the same vendor in both data centers (that is, a pair of XPs with Continuous Access, or a pair of EVAs with Continuous Access).

•Data centers in an Extended Distance Cluster can be up to 100 km apart, whereas the distance between data centers in a CLX cluster is limited by the shortest of the following distances:

—Maximum distance that guarantees a network latency of no more than 200 ms

—Maximum distance supported by the data replication link

—Maximum supported distance for DWDM as stated by the provider

•In an Extended Distance Cluster, there is no built-in mechanism for determining the state of the data being replicated. When an application fails over from one data center to another, the package is allowed to start up if the volume group(s) can be activated. A CLX implementation provides a higher degree of data integrity; that is, the application is only allowed to start up based on the state of the data and the disk arrays.
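The start-up rule described in the last bullet can be sketched as follows. This is only an illustrative model, not Serviceguard or CLX code: the names `ReplicaState`, `xdc_may_start`, and `clx_may_start` are hypothetical, and the CLX check is reduced to two assumed flags (data is current, array is healthy), whereas the real product evaluates array-specific states.

```python
from dataclasses import dataclass


@dataclass
class ReplicaState:
    """Hypothetical snapshot of what a node sees at failover time."""
    volume_groups_activate: bool  # can the volume group(s) be activated?
    data_is_current: bool         # is the local copy known to be up to date?
    array_healthy: bool           # is the local disk array in a usable state?


def xdc_may_start(state: ReplicaState) -> bool:
    # Extended Distance Cluster: no built-in check of the replicated data;
    # the package may start if the volume group(s) can be activated.
    return state.volume_groups_activate


def clx_may_start(state: ReplicaState) -> bool:
    # CLX (simplified assumption): start-up additionally depends on the
    # state of the data and of the disk arrays.
    return (state.volume_groups_activate
            and state.data_is_current
            and state.array_healthy)


# A node whose volume groups activate but whose data copy is stale.
stale = ReplicaState(volume_groups_activate=True,
                     data_is_current=False,
                     array_healthy=True)
print(xdc_may_start(stale))  # True
print(clx_may_start(stale))  # False
```

In this model the same stale copy is accepted by the Extended Distance Cluster rule and rejected by the CLX rule, which is the difference in data integrity the bullet describes.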

It is possible for data to be updated on the disk system local to a server running a package without the remote data being updated. This happens if the data link between the sites is lost, usually as a precursor to a site going down. If that occurs and the site with the latest data then goes down, that data is lost. The period of time from the loss of the link to the site going down is called the "recovery point". An
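As a small worked example of the interval defined above, the sketch below computes the time between the loss of the inter-site link and the loss of the site. The function name `unreplicated_window` and the timestamps are hypothetical and chosen only for illustration; updates written locally during this window are the ones that would be lost.

```python
from datetime import datetime, timedelta


def unreplicated_window(link_lost: datetime, site_down: datetime) -> timedelta:
    """Return the time between loss of the inter-site link and loss of the site."""
    if site_down < link_lost:
        raise ValueError("site_down must not precede link_lost")
    return site_down - link_lost


# Hypothetical timestamps: the link drops at 10:00:00, the site fails at 10:07:30.
link_lost = datetime(2024, 1, 1, 10, 0, 0)
site_down = datetime(2024, 1, 1, 10, 7, 30)
window = unreplicated_window(link_lost, site_down)
print(window.total_seconds())  # 450.0
```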


Chapter 1
