Table 4-1 Disaster Scenarios and Their Handling (Continued)

Disaster Scenario

This is a multiple failure scenario where the failures occur in a particular sequence in the configuration that corresponds to figure 2 where Ethernet and FC links do not go over DWDM.

The package (P1) is running on a node (N1). P1 uses a mirror md0 consisting of S1 (local to node N1, say /dev/hpdev/mylink-sde) and S2 (local to node N2).

The first failure occurs with all FC links between the two data centers failing, causing N1 to lose access to S2 and N2 to lose access to S1.

After recovery for the first failure has been initiated, the second failure occurs when re-mirroring is in progress and N1 goes down.

What Happens When This Disaster Occurs

The package (P1) continues to run on N1 after the first failure, with md0 consisting of only S1.

After the second failure, the package (P1) fails over to N2 and starts with S1. Since S2 is also accessible, the extended distance cluster adds S2 and starts re-mirroring of S2.

Recovery Process

For the first failure scenario, complete the following procedure to initiate a recovery:

1. Restore the links in both directions between the data centers. As a result, S2 (/dev/hpdev/mylink-sdf) is accessible from N1 and S1 is accessible from N2.

2. Run the following commands to remove and add S2 to md0 on N1:

# mdadm --remove /dev/md0 /dev/hpdev/mylink-sdf

# mdadm --add /dev/md0 /dev/hpdev/mylink-sdf

The re-mirroring process is initiated. The re-mirroring process starts from the beginning on N2 after the second failure. When it completes, the extended distance cluster detects S2 and accepts it as part of md0 again.
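Because the second failure in this scenario occurs while re-mirroring is still in progress, it can help to check the state of md0 before and after the recovery commands. The following is a minimal sketch, assuming the standard Linux MD status file /proc/mdstat is available on the node. The helper name md_remirror_state is hypothetical and is not part of mdadm or the cluster software. It only reads status text and does not change the mirror.

```shell
#!/bin/sh
# Sketch: report the re-mirroring state of an MD device from mdstat-format text.
# md_remirror_state is a hypothetical helper name (an assumption for illustration).
# Usage: md_remirror_state DEVICE [MDSTAT_FILE]   (MDSTAT_FILE defaults to /proc/mdstat)
md_remirror_state() {
    dev=$1
    file=${2:-/proc/mdstat}
    awk -v dev="$dev" '
        # A device section starts with "<name> :" and ends at the next blank line.
        $1 == dev && $2 == ":" { in_dev = 1; next }
        in_dev && NF == 0      { in_dev = 0 }
        # While rebuilding, the kernel prints a "recovery = NN.N%" or "resync = NN.N%" line.
        in_dev && match($0, /(recovery|resync) *= *[0-9.]+%/) {
            print "remirroring " substr($0, RSTART, RLENGTH)
            found = 1
            exit
        }
        # An underscore inside the [UU]-style status marks a missing mirror half.
        in_dev && match($0, /\[[U_]+\]/) {
            status = substr($0, RSTART, RLENGTH)
        }
        END {
            if (found) exit
            if (status == "")            print "unknown"
            else if (index(status, "_")) print "degraded " status
            else                         print "in-sync " status
        }
    ' "$file"
}
```

On the node that currently runs P1, calling md_remirror_state md0 would report one of three states: degraded while md0 consists of only S1, remirroring after S2 has been added back, and in-sync once S2 is accepted as part of md0 again. The mdadm --detail /dev/md0 command gives a more verbose view of the same information.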

 

 

 

 
