Disaster Scenarios and Their Handling

Table 4-1 Disaster Scenarios and Their Handling (Continued)

Columns: Disaster Scenario | What Happens When This Disaster Occurs | Recovery Process

Disaster Scenario:
In this case, the package (P1) is initially running on node N1. P1 uses a mirror md0 consisting of S1 (local to node N1, for example /dev/hpdev/mylink-sde) and S2 (local to node N2). The first failure occurs when all Ethernet links between the two data centers fail.

What Happens When This Disaster Occurs:
With this failure, the heartbeat exchange between N1 and N2 is lost. Both nodes then try to reach the Quorum server. If N1 reaches the Quorum server first, the package continues to run on N1 with S1 and S2 while N2 is rebooted. If N2 reaches the Quorum server first, the package fails over to N2 and starts running with both S1 and S2, and N1 is rebooted.

Recovery Process:
Complete the following steps to initiate a recovery:
1. You only need to restore the Ethernet links between the data centers so that N1 and N2 can exchange heartbeats.
2. After restoring the links, you must add the node that was rebooted back into the cluster. Run the cmrunnode command to add the node to the cluster.

NOTE: If this failure is a precursor to a site failure, and if the Quorum Service arbitration selects the site that is likely to have a failure, it is possible that the entire cluster will go down.
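The arbitration outcome described for this scenario can be sketched as a small model. This is an illustrative simulation only, not Serviceguard code: the Outcome class and the arbitrate function are hypothetical names, and the model assumes exactly two nodes that race for the Quorum server.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    package_node: str     # node that runs P1 after arbitration
    rebooted_node: str    # node that lost arbitration and is rebooted
    mirror_halves: tuple  # halves of md0 still in use by the package


def arbitrate(first_to_reach_quorum: str, nodes=("N1", "N2")) -> Outcome:
    """Model the two-node race for the Quorum server (hypothetical helper).

    The node that reaches the Quorum server first keeps or takes over the
    package; the other node is rebooted. Only the inter-site Ethernet
    links have failed, so both mirror halves remain in use.
    """
    if first_to_reach_quorum not in nodes:
        raise ValueError(f"unknown node: {first_to_reach_quorum}")
    loser = next(n for n in nodes if n != first_to_reach_quorum)
    return Outcome(first_to_reach_quorum, loser, ("S1", "S2"))


# N1 wins: P1 stays on N1 and N2 is rebooted.
stay = arbitrate("N1")
# N2 wins: P1 fails over to N2 and N1 is rebooted.
failover = arbitrate("N2")
```

In both branches the package ends up on a single node with S1 and S2, which is why recovery only requires restoring the links and rejoining the rebooted node.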


Disaster Scenario:
In this case, the package (P1) is initially running on node N1. P1 uses a mirror md0 consisting of S1 (local to node N1, for example /dev/hpdev/mylink-sde) and S2 (local to node N2). The first failure occurs when the Ethernet links from N1 to the Ethernet switch in data center 1 fail.

What Happens When This Disaster Occurs:
With this failure, the heartbeat exchange between N1 and N2 is lost. N2 reaches the Quorum server, because it is the only node that still has access to it. The package fails over to N2 and starts running with both S1 and S2, while N1 is rebooted.

Recovery Process:
Complete the following procedure to initiate a recovery:
1. Restore the Ethernet links from N1 to the switch in data center 1.
2. After restoring the links, you must add the node that was rebooted back into the cluster. Run the cmrunnode command to add the node to the cluster.
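The behavior and recovery order for this scenario can be sketched as follows. This is a minimal model under stated assumptions: surviving_node and recovery_commands are hypothetical helpers, and passing the node name as an argument to cmrunnode is an assumption that should be checked against the cmrunnode manual page. The sketch only builds the command list and does not execute anything.

```python
def surviving_node(can_reach_quorum: dict) -> str:
    """Return the single node that can still reach the Quorum server.

    Hypothetical helper: it covers only the case in this scenario, where
    exactly one node keeps its path to the Quorum server.
    """
    reachable = [node for node, ok in can_reach_quorum.items() if ok]
    if len(reachable) != 1:
        raise ValueError("this sketch covers only the single-survivor case")
    return reachable[0]


def recovery_commands(links_restored: bool, rebooted_node: str) -> list:
    """Build the rejoin step as argv lists, in the order given above.

    Step 1 (restoring the Ethernet links) is a physical repair, so it is
    represented only by the links_restored flag. Step 2 is cmrunnode; the
    node-name argument is an assumption, not confirmed by the source.
    """
    if not links_restored:
        return []  # do not attempt to rejoin before the links are back
    return [["cmrunnode", rebooted_node]]


# N1 has lost its links to the switch, so only N2 reaches the Quorum server.
reach = {"N1": False, "N2": True}
winner = surviving_node(reach)
rebooted = next(node for node in reach if node != winner)
plan = recovery_commands(links_restored=True, rebooted_node=rebooted)
```

Returning an empty plan until the links are restored mirrors the order of the procedure: the rebooted node is added back only after heartbeats can be exchanged again.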

