HP serviceguard t2808-90006 manual Managing a Disaster Tolerant Environment

Page 48

Disaster Tolerance and Recovery in a Serviceguard Cluster

Managing a Disaster Tolerant Environment

Managing a Disaster Tolerant Environment

In addition to the changes in hardware and software to create a disaster tolerant architecture, there are also changes in the way you manage the environment. Configuration of a disaster tolerant architecture needs to be carefully planned, implemented and maintained. There are additional resources needed, and additional decisions to make concerning the maintenance of a disaster tolerant architecture:

Manage it in-house, or hire a service?

Hiring a service can remove the burden of maintaining the capital equipment needed to recover from a disaster. Most disaster recovery services provide their own off-site equipment, which reduces maintenance costs. Often the disaster recovery site and equipment are shared by many companies, further reducing cost.

Managing disaster recovery in-house gives complete control over the type of redundant equipment used and the methods used to recover from disaster, giving you complete control over all means of recovery.

Implement automated or manual recovery?

Manual recovery costs less to implement and gives more flexibility in making decisions while recovering from a disaster. Evaluating the data and making decisions can add to recovery time, but it is justified in some situations, for example if applications compete for resources following a disaster and one of them has to be halted.

Automated recovery reduces the amount of time and in most cases eliminates human intervention needed to recover from a disaster. You may want to automate recovery for any number of reasons:

Automated recovery is usually faster.

Staff may not be available for manual recovery, as is the case with “lights-out” data centers.

Reduction in human intervention is also a reduction in human error. Disasters don’t happen often, so lack of practice and the stressfulness of the situation may increase the potential for human error.

Automated recovery procedures and processes can be transparent to the clients.

48

Chapter 1

Image 48
Contents Page Legal Notices Contents Disaster Scenarios and Their Handling Managing an MD Device Contents Contents Printing History Editions and ReleasesHP Printing Division Intended Audience Document OrganizationPage Related Page Disaster Tolerance Evaluating the Need for Disaster Tolerance Evaluating the Need for Disaster Tolerance What is a Disaster Tolerant Architecture? High Availability ArchitectureNode 1 fails Pkg B Client ConnectionsDisaster Tolerant Architecture Understanding Types of Disaster Tolerant Clusters Extended Distance ClustersFrom both storage devices Extended Distance Cluster Two Data Center Setup Benefits of Extended Distance Cluster Cluster Extension CLX Cluster Shows a CLX for a Linux Serviceguard cluster architecture CLX for Linux Serviceguard ClusterBenefits of CLX Differences Between Extended Distance Cluster and CLX Continental Cluster Los Angeles Cluster New York ClusterData Cent er a Data Center B Continental ClusterBenefits of Continentalclusters Comparison of Disaster Tolerant Solutions Continental Cluster With Cascading FailoverComparison of Disaster Tolerant Cluster Solutions Attributes Extended DistanceContinentalclusters Cluster HP-UX onlyUnderstanding Types of Disaster Tolerant Clusters Understanding Types of Disaster Tolerant Clusters Understanding Types of Disaster Tolerant Clusters WAN EVA Disaster Tolerant Architecture Guidelines Protecting Nodes through Geographic DispersionProtecting Data through Replication Off-line Data ReplicationOn-line Data Replication Physical Data ReplicationAdvantages of physical replication in hardware are Disadvantages of physical replication in hardware areAdvantages of physical replication in software are Disadvantages of physical replication in software are Logical Data ReplicationDisadvantages of logical replication are Using Alternative Power Sources Ideal Data ReplicationAlternative Power Sources Power Circuit 1 nodeData Center a Node 3 Power Circuit Creating Highly Available NetworkingDisaster Tolerant Local Area Networking Disaster Tolerant Wide Area NetworkingDisaster Tolerant Cluster Limitations Manage it in-house, or hire a service? Managing a Disaster Tolerant EnvironmentHow is the cluster maintained? Additional Disaster Tolerant Solutions Information Building an Extended Distance Types of Data Link for Storage Networking DwdmTwo Data Center and Quorum Service Location Architectures Two Data Center and Quorum Service Location Architectures Two Data Centers and Third Location with Dwdm and Quorum ServerTwo Data Center and Quorum Service Location Architectures Rules for Separate Network and Data Links Guidelines on Dwdm Links for Network and Data Guidelines on Dwdm Links for Network and Data Guidelines on Dwdm Links for Network and Data Chapter Configuring your Environment Understanding Software RAID Installing the Extended Distance Cluster Software Installing XDCSupported Operating Systems PrerequisitesVerifying the XDC Installation # rpm -Uvh xdc-A.01.00-0.rhel4.noarch.rpmInstalling the Extended Distance Cluster Software Configuring the Environment Configuring the Environment Configuring the Environment Configuring Multiple Paths to Storage Setting the Value of the Link Down Timeout ParameterCluster Reformation Time and Timeout Values Using Persistent Device Names Http//docs.hp.comCreating a Multiple Disk Device To Create and Assemble an MD Device# mdadm -A -R /dev/md0 /dev/hpdev/sde1 /dev/hpdev/sdf1 Chapter Linux #RAIDTAB= # MD RAID Commands Creating and Editing the Package Control Scripts To Create a Package Control ScriptTo Edit the Datarep Variable To Edit the Xdcconfig File parameter To Configure the RAID Monitoring ServiceEditing the raid.conf File Cases to Consider when Setting Rpotarget RPO Target Definitions Chapter Multipledevices and Componentdevices Raidmonitorinterval Configuring your Environment for Software RAID What happens when this disaster occurs Recovery ProcessDisaster Scenario Disaster Scenarios and Their Handling Disaster Scenarios and Their Handling# mdadm --remove /dev/md0 # mdadm -add /dev/md0 Dev/hpdev/mylink-sdf P1 uses a mirror md0 Run the following command to S2 is non-current by less # cmrunpkg packagename Execute the commands that With md0 consisting of only N1, for example Becomes accessible from N2 Center Disaster Scenarios and Their Handling Managing an MD Device Viewing the Status of the MD Device Cat /proc/mdstatStopping the MD Device Example A-1 Stopping the MD Device /dev/md0Starting the MD Device Example A-2 Starting the MD Device /dev/md0Removing and Adding an MD Mirror Component Disk # udevinfo -q symlink -n sdc1Adding a Mirror Component Device # mdadm --remove /dev/md0 /dev/hpdev/sdeIndex 104