Disaster Scenarios and Their Handling

Table: Disaster Scenarios and Their Handling (Continued)

| Disaster Scenario | What Happens When This Disaster Occurs | Recovery Process |
|---|---|---|
| In this case, initially the package (P1) is running on node N1. P1 uses a mirror md0 consisting of S1 (local to node N1, for example …) and S2 (local to node N2). The first failure occurs when all Ethernet links between the two data centers fail. | With this failure, the heartbeat exchange between N1 and N2 is lost. This results in both nodes trying to get to the Quorum server. If N1 accesses the Quorum server first, the package continues to run on N1 with S1 and S2 while N2 is rebooted. If N2 accesses the Quorum server first, the package fails over to N2 and starts running with both S1 and S2, and N1 is rebooted. | Complete the following steps to initiate a recovery: 1. You only need to restore the Ethernet links between the data centers so that N1 and N2 can exchange heartbeats. 2. After restoring the links, you must add the node that was rebooted as part of the cluster. Run the cmrunnode command to add the node to the cluster. NOTE: If this failure is a precursor to a site failure, and if the Quorum Service arbitration selects the site that is likely to have a failure, it is possible that the entire cluster will go down. |
| In this case, initially the package (P1) is running on node N1. P1 uses a mirror md0 consisting of S1 (local to node N1, say …) and S2 (local to node N2). The first failure occurs when the Ethernet links from N1 to the Ethernet switch in data center 1 fail. | With this failure, the heartbeat exchange between N1 and N2 is lost. N2 accesses the Quorum server, as it is the only node that has access to the Quorum server. The package fails over to N2 and starts running with both S1 and S2 while N1 gets rebooted. | Complete the following procedure to initiate a recovery: 1. Restore the Ethernet links from N1 to the switch in data center 1. 2. After restoring the links, you must add the node that was rebooted as part of the cluster. Run the cmrunnode command to add the node to the cluster. |
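
The arbitration outcomes described in the two scenarios can be summarized as a small model. The following Python sketch is illustrative only and is not Serviceguard code: the names `Outcome`, `arbitrate`, and `recovery_steps` are hypothetical, and it assumes exactly two nodes (N1 and N2) where the first node to reach the Quorum server keeps the package and the other node is rebooted. Passing a node name to `cmrunnode` is also an assumption; the table only says to run the command.

```python
from dataclasses import dataclass
from typing import List, Sequence

NODES = ("N1", "N2")


@dataclass(frozen=True)
class Outcome:
    package_node: str   # node running P1 after arbitration
    rebooted_node: str  # node that lost arbitration and is rebooted
    failed_over: bool   # True if P1 moved away from its original node


def arbitrate(current_owner: str, quorum_arrivals: Sequence[str]) -> Outcome:
    """Return the outcome after N1 and N2 lose heartbeat.

    quorum_arrivals lists the nodes that can still reach the Quorum
    server, in the order their requests arrive; the first one wins.
    """
    if current_owner not in NODES:
        raise ValueError(f"unknown node: {current_owner}")
    if not quorum_arrivals or quorum_arrivals[0] not in NODES:
        raise ValueError("case not covered by the two scenarios")
    winner = quorum_arrivals[0]
    loser = "N2" if winner == "N1" else "N1"
    return Outcome(winner, loser, failed_over=(winner != current_owner))


def recovery_steps(outcome: Outcome, links_to_restore: str) -> List[str]:
    """List the recovery actions from the table for a given outcome."""
    return [
        f"Restore {links_to_restore}",
        # Assumption: the rebooted node's name is passed as an argument;
        # the table only says to run cmrunnode.
        f"cmrunnode {outcome.rebooted_node}",
    ]


# Scenario 1: all inter-site links fail and N2 happens to reach the Quorum server first.
row1 = arbitrate("N1", ["N2", "N1"])
# Scenario 2: N1 loses its links to the switch, so only N2 reaches the Quorum server.
row2 = arbitrate("N1", ["N2"])
print(row1)
print(recovery_steps(row2, "the Ethernet links from N1 to the switch in data center 1"))
```

In both scenarios the recovery has the same shape: restore the failed links first, then rejoin whichever node was rebooted.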