cairoti I have had this happen to me.... With HA enabled, when the master fails, a new pool member becomes the new master. With HA enabled and shared storage, designated HA VMs should be restarted on the pool.
If you have stand-alone (per server) storage then VMs on the dead server can't be restarted because their storage is gone.
If you don't have HA enabled then you need to manually force a new master and restart failed VMs.
To force a new master on a pool member use the command xe pool-emergency-transition-to-master
from a XCP console. You can't do much to the pool without a master. If your old master just needs to reboot quickly then just do that without changing the pool. If you master is going to be down for a while (more than a few minutes) then pick a new master beforehand, or after as a forced change.
It may take a while to force another pool member to become the new master as the members have to timeout contacting the old dead master.
With a failed master or pool member you have two choices:
- Kick out the dead host from the pool forever (from the new master). After you fix/replace the dead host you need to REFORMAT and START OVER with a NEW install of XCP (there may be a reset command, not sure). You can then add the NEW host to the existing pool. You CAN NOT re-add a host that has been kicked out from the pool. IF you reformat the dead host and start over with a clean install, you need to kick the old host from the pool as the new server has a new UUID and you can't replace a pool member (just kick old and add new).
- Don't kick out the dead host and fix the hardware (as long as the drives boot the same XCP data as before). Just turn on the host and it should join the pool and know that it was the master, but there is a new master now and it should just become a normal pool host.
If you fix the old master (and did NOT kick it from the pool) then boot it with the same disks, it should just connect to the pool and know there is a newer master. You can re-designate it as the new master if you wish or just let it be a pool member.
Yes, I have done this (again, just now as a test) with XCP 8.2.1