Disaster Tolerance and Recovery in a Serviceguard Cluster
Managing a Disaster Tolerant Environment
Creating a disaster tolerant architecture changes not only the hardware and software but also the way you manage the environment. The configuration must be carefully planned, implemented, and maintained. It also requires additional resources and additional decisions about how the architecture is maintained:
•Manage it in-house, or hire a service?
Hiring a service can remove the burden of maintaining the capital equipment needed to recover from a disaster. Most disaster recovery services provide their own
Managing disaster recovery
•Implement automated or manual recovery?
Manual recovery costs less to implement and gives more flexibility in making decisions while recovering from a disaster. Evaluating the data and making decisions can add to recovery time, but the delay is justified in some situations, for example, if applications compete for resources following a disaster and one of them has to be halted.
Automated recovery reduces recovery time and, in most cases, eliminates the human intervention needed to recover from a disaster. You may want to automate recovery for several reasons:
—Automated recovery is usually faster.
—Staff may not be available for manual recovery, as is the case with
—Reduction in human intervention is also a reduction in human error. Disasters don’t happen often, so lack of practice and the stressfulness of the situation may increase the potential for human error.
—Automated recovery procedures and processes can be transparent to the clients.