Disaster Scenarios and Their Handling

Table: Disaster Scenarios and Their Handling (Continued)

Disaster Scenario

In this case, the package (P1) runs with [...] seconds.

In this case, initially the package (P1) is running on node N1. P1 uses a mirror md0 consisting of S1 (local to node N1, for example [...]) and S2 (local to node N2). The first failure occurs when all FC links between the two data centers fail, causing N1 to lose access to S2 and N2 to lose access to S1. Immediately afterwards, a second failure occurs where node N1 goes down because of a power failure.

After N1 is repaired and brought back into the cluster, package switching of P1 to N1 is enabled.

IMPORTANT: While it is not a good idea to enable package switching of P1 to N1, it is described here to show recovery from an operator error.

The FC links between the data centers are not repaired and N2 becomes inaccessible because of a power failure.

What Happens When This Disaster Occurs

When the first failure occurs, the package (P1) continues to run on N1 with md0 consisting of only S1. When the second failure occurs, the package fails over to N2 and starts with S2.

When N2 fails, the package does not start on node N1 because a package is allowed to start only once with a single disk. You must repair this failure, and both disks must be synchronized and be part of the MD array before another failure of the same pattern occurs.

In this failure scenario, only S1 is available to P1 on N1, as the FC links between the data centers are not repaired. As P1 started once with S2 on N2, it cannot start on N1 until both disks are available.

Recovery Process

Complete the following steps to initiate a recovery:

1. Restore the FC links between the data centers. As a result, S2 becomes available to N1 and S1 becomes accessible from N2.

2. To start the package P1 on N1, check the package log file in the package directory and run the commands that appear there to force a package start.

When the package starts up on N1, it automatically adds S2 back into the array and the [...]

When [...] the extended distance cluster detects and accepts S1 as part of md0.
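
The start rule in this scenario (a package may start only once with a single disk, and not again until both disks are available) can be illustrated with a small Python model. This is only a sketch for reasoning about the sequence of failures; it is not how the cluster software implements the check. The class `MirroredPackage` and its methods are hypothetical names introduced here, and the model deliberately ignores the forced start described in step 2 of the recovery process.

```python
from dataclasses import dataclass


@dataclass
class MirroredPackage:
    """Toy model of a package whose data lives on a two-disk mirror (md0)."""
    name: str
    mirror: frozenset = frozenset({"S1", "S2"})
    # Set once the package has been started on only one half of the mirror.
    single_disk_start_used: bool = False

    def can_start(self, visible):
        """Return True if a start is allowed with the given visible disks."""
        visible = set(visible) & self.mirror
        if not visible:
            return False            # no mirror half reachable at all
        if visible == self.mirror:
            return True             # both halves available
        # Only one half is visible: allowed a single time.
        return not self.single_disk_start_used

    def start(self, visible):
        """Start the package and return the disks it assembles md0 from."""
        if not self.can_start(visible):
            raise RuntimeError(
                f"{self.name}: already started once with a single disk")
        visible = set(visible) & self.mirror
        if visible != self.mirror:
            self.single_disk_start_used = True
        return sorted(visible)

    def resynchronized(self):
        """Both disks are synchronized and part of the array again."""
        self.single_disk_start_used = False


if __name__ == "__main__":
    p1 = MirroredPackage("P1")
    # FC links are down and N1 has failed: P1 fails over to N2, which sees S2.
    print(p1.start({"S2"}))
    # N2 fails, N1 is back but still sees only S1: the start is refused.
    print(p1.can_start({"S1"}))
    # FC links restored: both disks are visible again.
    print(p1.can_start({"S1", "S2"}))
```

In this model, the failover to N2 uses up the single-disk start, so the later attempt on N1 with only S1 is refused until the FC links are restored and both halves of md0 are visible.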