VSS snapshots of volumes hosted on dynamic disks in the Windows guest: The vm-snapshot-with-quiesce CLI
command and the XenServer VSS hardware provider do not support snapshots of volumes hosted on dynamic disks
on the Windows VM.
Coping with machine failures
This section provides details of how to recover from various failure scenarios. All failure recovery scenarios
require the use of one or more of the backup types listed in the section called “Backups”.

Member failures

In the absence of HA, master nodes detect the failures of members by receiving regular heartbeat messages.
If no heartbeat has been received for 200 seconds, the master assumes the member is dead. There are
two ways to recover from this problem:
Repair the dead host (e.g. by physically rebooting it). When the connection to the member is restored,
the master will mark the member as alive again.
Shut down the host and instruct the master to forget about the member node using the xe host-forget CLI
command. Once the member has been forgotten, all the VMs which were running there are marked
as offline and can be restarted on other XenServer hosts. It is very important to ensure that the
XenServer host is actually offline; otherwise, VM data corruption might occur. Be careful not to split your
pool into multiple single-host pools by using xe host-forget, since this could result in them all
mapping the same shared storage and corrupting VM data.
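The second option can be sketched as a small shell helper, run on the master. This is a minimal sketch, not a definitive procedure: the function name forget_dead_host is hypothetical, and it assumes that the host-metrics-live host parameter reflects the master's view of the member. That flag only shows what the master believes, so you must still confirm for yourself that the host is physically powered off.

```shell
# Hypothetical helper: forget a member only if the master reports it as not live.
# Usage: forget_dead_host <host-uuid>
forget_dead_host() {
    host_uuid="$1"
    if [ -z "$host_uuid" ]; then
        echo "usage: forget_dead_host <host-uuid>" >&2
        return 2
    fi

    # Assumption: host-metrics-live reads "false" once the master
    # considers the member dead.
    live=$(xe host-param-get uuid="$host_uuid" param-name=host-metrics-live)
    if [ "$live" != "false" ]; then
        echo "host $host_uuid is still reported live; not forgetting it" >&2
        return 1
    fi

    # xe may ask for interactive confirmation before forgetting the host.
    xe host-forget uuid="$host_uuid"
}
```

The UUID of the failed member can be looked up beforehand with xe host-list.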
Warning:
If you are going to use the forgotten host as a XenServer host again, perform a fresh installation of the
XenServer software.
Do not use the xe host-forget command if HA is enabled on the pool. Disable HA first, then forget the host,
and then re-enable HA.
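The disable, forget, re-enable order from this warning might look like the following sketch. The function name forget_host_with_ha is hypothetical, and the sketch assumes you already know the UUID of the SR used for the HA heartbeat. If host-forget fails, the sketch stops and leaves HA disabled so that you can investigate before re-enabling it.

```shell
# Hypothetical helper: forget a host on a pool that has HA enabled.
# Usage: forget_host_with_ha <host-uuid> <heartbeat-sr-uuid>
forget_host_with_ha() {
    host_uuid="$1"
    heartbeat_sr_uuid="$2"
    if [ -z "$host_uuid" ] || [ -z "$heartbeat_sr_uuid" ]; then
        echo "usage: forget_host_with_ha <host-uuid> <heartbeat-sr-uuid>" >&2
        return 2
    fi

    # 1. Disable HA before touching pool membership.
    xe pool-ha-disable || return 1

    # 2. Forget the failed host (xe may ask for confirmation).
    if ! xe host-forget uuid="$host_uuid"; then
        echo "host-forget failed; HA is still disabled" >&2
        return 1
    fi

    # 3. Re-enable HA on the chosen heartbeat SR.
    xe pool-ha-enable heartbeat-sr-uuids="$heartbeat_sr_uuid"
}
```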
When a member XenServer host fails, there may be VMs still registered in the running state. If you are sure
that the member XenServer host is definitely down, and that the VMs have not been brought up on another
XenServer host in the pool, use the xe vm-reset-powerstate CLI command to set the power state of the
VMs to halted. See the section called “vm-reset-powerstate” for more details.
Warning:
Incorrect use of this command can lead to data corruption. Only use this command if absolutely necessary.
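As an illustration only, the following sketch resets every VM that is still recorded as running on a failed member. The function name reset_vms_on_dead_host is hypothetical. The sketch assumes you have already verified that the host is down and that none of these VMs is running elsewhere; it performs no such check itself.

```shell
# Hypothetical helper: mark VMs left "running" on a dead member as halted.
# Usage: reset_vms_on_dead_host <host-uuid>
reset_vms_on_dead_host() {
    host_uuid="$1"
    if [ -z "$host_uuid" ]; then
        echo "usage: reset_vms_on_dead_host <host-uuid>" >&2
        return 2
    fi

    # --minimal prints the matching UUIDs as one comma-separated line.
    # The control domain of the failed host is excluded.
    vm_uuids=$(xe vm-list resident-on="$host_uuid" power-state=running \
        is-control-domain=false --minimal)

    for vm_uuid in $(echo "$vm_uuids" | tr ',' ' '); do
        # force=true is required because this bypasses the normal shutdown path.
        xe vm-reset-powerstate uuid="$vm_uuid" force=true
    done
}
```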

Master failures

Every member of a resource pool contains all the information necessary to take over the role of master if
required. When a master node fails, the following sequence of events occurs:
1. If HA is enabled, another master is elected automatically.
2. If HA is not enabled, each member will wait for the master to return.
If the master comes back up at this point, it re-establishes communication with its members, and operation
returns to normal.
If the master is really dead, choose one of the members and run the command xe pool-emergency-transition-to-master
on it. Once it has become the master, run the command xe pool-recover-slaves
and the members will now point to the new master.
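The manual recovery can be sketched as two commands run in order on the chosen member. The wrapper name promote_to_master is hypothetical. The sketch assumes HA is not enabled and that you have confirmed the old master will not come back on its own.

```shell
# Hypothetical helper: run on the member that should become the new master.
# Usage: promote_to_master
promote_to_master() {
    # Promote this member; stop here if the transition fails.
    if ! xe pool-emergency-transition-to-master; then
        echo "emergency transition failed; members not redirected" >&2
        return 1
    fi

    # Tell the remaining members to point at this host as their master.
    xe pool-recover-slaves
}
```

Running xe host-list afterwards is one way to check that the members are visible from the new master.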