Coping with machine failures, Member failures

VSS snap of volumes hosted on dynamic disks in the Windows guest: The vm-snapshot-with-quiesce CLI command and the XenServer VSS hardware provider do not support snapshots of volumes hosted on dynamic disks on the Windows VM.

Coping with machine failures

This section provides details of how to recover from various failure scenarios. All failure recovery scenarios require the use of one or more of the backup types listed in the section called “Backups”.

Member failures

In the absence of HA, master nodes detect the failures of members by receiving regular heartbeat messages. If no heartbeat has been received for 200 seconds, the master assumes the member is dead. There are two ways to recover from this problem:

•Repair the dead host (for example, by physically rebooting it). When the connection to the member is restored, the master will mark the member as alive again.

•Shut down the host and instruct the master to forget about the member node using the xe host-forget CLI command. Once the member has been forgotten, all the VMs which were running there will be marked as offline and can be restarted on other XenServer hosts. Note that it is very important to ensure that the XenServer host is actually offline; otherwise, VM data corruption might occur. Be careful not to split your pool into multiple pools of a single host by using xe host-forget, since this could result in them all mapping the same shared storage and corrupting VM data.

Warning:

•If you are going to use the forgotten host as a XenServer host again, perform a fresh installation of the XenServer software.

•Do not use the xe host-forget command if HA is enabled on the pool. Disable HA first, then forget the host, and then re-enable HA.
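
The member-failure rules above can be sketched in a few lines of Python. This is a toy model, not the XenServer implementation: the function names, the HEARTBEAT_TIMEOUT_S constant and the idea of returning command lists instead of running them are assumptions made for illustration. The only details taken from the text are the 200-second timeout and the order "disable HA, forget the host, re-enable HA". The xe pool-ha-disable and xe pool-ha-enable commands and the heartbeat-sr-uuids parameter are standard xe CLI features that this section does not itself mention.

```python
from typing import List, Optional

HEARTBEAT_TIMEOUT_S = 200  # timeout quoted in the text; the constant name is hypothetical


def is_presumed_dead(last_heartbeat_s: float, now_s: float,
                     timeout_s: float = HEARTBEAT_TIMEOUT_S) -> bool:
    """Toy model: a member is presumed dead once no heartbeat
    has been seen for timeout_s seconds."""
    return (now_s - last_heartbeat_s) >= timeout_s


def forget_host_commands(host_uuid: str, ha_enabled: bool,
                         heartbeat_sr_uuid: Optional[str] = None) -> List[List[str]]:
    """Build (but do not run) the xe commands for forgetting a dead member.

    The caller is responsible for confirming that the host is really offline.
    """
    commands: List[List[str]] = []
    if ha_enabled:
        # HA must be disabled before host-forget is used.
        commands.append(["xe", "pool-ha-disable"])
    commands.append(["xe", "host-forget", f"uuid={host_uuid}"])
    if ha_enabled:
        # Re-enable HA afterwards; the heartbeat SR is optional in this sketch.
        enable = ["xe", "pool-ha-enable"]
        if heartbeat_sr_uuid:
            enable.append(f"heartbeat-sr-uuids={heartbeat_sr_uuid}")
        commands.append(enable)
    return commands
```

Each returned list could be passed to subprocess.run on the pool master, one at a time, after an operator has reviewed it.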

When a member XenServer host fails, there may be VMs still registered in the running state. If you are sure that the member XenServer host is definitely down, and that the VMs have not been brought up on another XenServer host in the pool, use the xe vm-reset-powerstate CLI command to set the power state of the VMs to halted. See the section called “vm-reset-powerstate” for more details.

Warning:

Incorrect use of this command can lead to data corruption. Only use this command if absolutely necessary.
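
As a hedged illustration of the two preconditions above, the following Python sketch refuses to build any xe vm-reset-powerstate command unless the operator has explicitly confirmed both of them. The function name and the two boolean parameters are hypothetical, and the sketch only constructs the command lines; it does not run them or verify the conditions itself.

```python
from typing import Iterable, List


def reset_powerstate_commands(vm_uuids: Iterable[str],
                              host_confirmed_down: bool,
                              vms_not_running_elsewhere: bool) -> List[List[str]]:
    """Build (but do not run) one xe vm-reset-powerstate command per VM.

    Both flags must be set by a human who has checked the pool; this
    function cannot verify them.
    """
    if not (host_confirmed_down and vms_not_running_elsewhere):
        raise ValueError(
            "refusing to reset power state: confirm the host is down and "
            "the VMs are not running on another host first"
        )
    # --force is needed because the VM's host cannot be contacted.
    return [["xe", "vm-reset-powerstate", f"uuid={uuid}", "--force"]
            for uuid in vm_uuids]
```

Making the confirmation explicit in the call signature is a design choice of this sketch, intended to mirror the warning that the command should be used only when absolutely necessary.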

Master failures

Every member of a resource pool contains all the information necessary to take over the role of master if required. When a master node fails, the following sequence of events occurs:

1.If HA is enabled, another master is elected automatically.

2.If HA is not enabled, each member will wait for the master to return.

If the master comes back up at this point, it re-establishes communication with its members, and operation returns to normal.

If the master is really dead, choose one of the members and run the command xe pool-emergency-transition-to-master on it. Once it has become the master, run the command xe pool-recover-slaves and the members will now point to the new master.
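
The decision described above can be summarised as a small Python sketch. The function name and its two boolean inputs are assumptions made for illustration; the two xe commands are the ones named in the text, and the sketch only returns them rather than running them.

```python
from typing import List


def master_recovery_commands(ha_enabled: bool,
                             master_came_back: bool) -> List[List[str]]:
    """Return the manual xe commands needed after a master failure.

    An empty list means no manual action is described for that case.
    """
    if ha_enabled:
        # With HA enabled, another master is elected automatically.
        return []
    if master_came_back:
        # The master re-establishes communication with its members itself.
        return []
    # Master is really dead: run both commands on the chosen member,
    # the second only after the first has made it the new master.
    return [
        ["xe", "pool-emergency-transition-to-master"],
        ["xe", "pool-recover-slaves"],
    ]
```

Deciding that the master is really dead remains a judgement for the administrator; the master_came_back flag merely records that judgement.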


Citrix Systems 5.6 manual Coping with machine failures, Member failures, Master failures
