Administering sites and remote mirrors

Fire drill — testing the configuration

Caution: To avoid potential loss of service or data, it is recommended that you do not use these procedures on a live system.

After validating the consistency of the volumes and disk groups at your sites, you should validate the procedures that you will use in the event of the various possible types of failure. A fire drill allows you to test that a site can be brought up cleanly during recovery from a disaster scenario such as site failure.

Simulating site failure

To simulate the failure of a site, use the following command to detach all the devices at a specified site:

# vxdg -g diskgroup [-f] detachsite sitename

The -f option must be specified if any plexes configured on storage at the site are currently online.
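As a rough illustration of the syntax above, the following sketch builds the detachsite command line and prints it for review instead of running it. The function name detachsite_cmd, the disk group testdg, and the site siteA are hypothetical names chosen for this example; only the vxdg syntax itself comes from this section.

```shell
#!/bin/sh
# Sketch: build the detachsite command for a fire drill.
# detachsite_cmd, "testdg" and "siteA" are hypothetical names.
# The function only prints the command; it never runs vxdg itself.

detachsite_cmd() {
    dg=$1
    site=$2
    force=${3:-no}    # pass "force" if plexes at the site are online

    if [ -z "$dg" ] || [ -z "$site" ]; then
        echo "usage: detachsite_cmd diskgroup sitename [force]" >&2
        return 1
    fi

    if [ "$force" = "force" ]; then
        echo "vxdg -g $dg -f detachsite $site"
    else
        echo "vxdg -g $dg detachsite $site"
    fi
}

# Print the command for review; run it by hand on a test system only.
detachsite_cmd testdg siteA force
```

Printing the command rather than executing it keeps the sketch safe to try anywhere, in line with the caution against using these procedures on a live system.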

Recovery from simulated site failure

Use the following commands to reattach a site and recover the disk group:

# vxdg -g diskgroup [-o overridessb] reattachsite sitename
# vxrecover -g diskgroup

It may be necessary to specify the -o overridessb option if a serial split-brain condition is indicated.
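The recovery sequence can be sketched in the same dry-run style: reattach the site first, then recover the disk group. The function name reattach_cmds, the disk group testdg, and the site siteA are hypothetical; the commands printed follow the syntax shown in this section and are not executed by the sketch.

```shell
#!/bin/sh
# Sketch: print the two recovery commands in the order they are run.
# reattach_cmds, "testdg" and "siteA" are hypothetical names.
# Nothing is executed; the output is meant for review.

reattach_cmds() {
    dg=$1
    site=$2
    override=${3:-no}    # pass "overridessb" if serial split-brain is indicated

    if [ -z "$dg" ] || [ -z "$site" ]; then
        echo "usage: reattach_cmds diskgroup sitename [overridessb]" >&2
        return 1
    fi

    # Step 1: reattach the site.
    if [ "$override" = "overridessb" ]; then
        echo "vxdg -g $dg -o overridessb reattachsite $site"
    else
        echo "vxdg -g $dg reattachsite $site"
    fi

    # Step 2: recover the disk group.
    echo "vxrecover -g $dg"
}

reattach_cmds testdg siteA
```

Passing overridessb as the third argument adds the -o overridessb option to the first command only; the vxrecover line is unchanged.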

Automatic site reattachment

The site reattachment daemon, vxsited, provides automatic reattachment of sites. vxsited uses the vxnotify mechanism to monitor storage coming back online on a site after a previous failure, and to restore redundancy of mirrors across sites.

If the hot-relocation daemon, vxrelocd, is running, vxsited attempts to reattach the site, and allows vxrelocd to try to use the available disks in the disk group to relocate the failed subdisks. If vxrelocd succeeds in relocating the failed subdisks, it starts the recovery of the plexes at the site. When all the plexes have been recovered, the plexes are put into the ACTIVE state, and the state of the site is set to ACTIVE.
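Because the behavior of vxsited depends on whether vxrelocd is running, it can help to check for the daemon before a fire drill. The sketch below is one possible way to do that, assuming the daemon shows up under the name vxrelocd in the output of ps -ef; relocd_running is a hypothetical helper, not a VxVM command.

```shell
#!/bin/sh
# Sketch: report which vxsited behavior to expect, based on whether
# vxrelocd appears in a process listing read from standard input.
# relocd_running is a hypothetical helper, not a VxVM command.

relocd_running() {
    # The bracket keeps the pattern from matching the grep process itself.
    grep -q '[v]xrelocd'
}

if ps -ef | relocd_running; then
    echo "vxrelocd is running: vxsited attempts reattachment and lets vxrelocd relocate failed subdisks"
else
    echo "vxrelocd is not running: vxsited reattaches the site only when all its disks are accessible"
fi
```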

If vxrelocd is not running, vxsited reattaches a site only when all the disks at that site become accessible. After reattachment succeeds, vxsited sets the