@Denson After enabling HA, the host has to be manually rebooted, which I think you're already aware of.
OK, well, that leaves pretty much a network issue. Hmm. The statefile created OK on the shared SR within the pool?
The "not found" error message doesn't give you much to go on, unfortunately.

Posts
-
RE: Unable to enable HA on a XCP-ng 8.2.1 Compute Pool
@Denson When you enable HA, note that the host has to be rebooted for HA to take effect. Was that the case?
-
RE: Issue with SR and coalesce
@nikade I'm still wondering if one of the hosts isn't connected to that SR properly. Re-creating the SR from scratch would do the trick, but it's a lot of work shuffling all the VMs to different SR storage. Might be worth it, of course, if it fixes the issue.
-
RE: Issue with SR and coalesce
@Byte_Smarter It may be that the SR is not mounted on one of the hosts in your pool?
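One way to check that from the CLI is to look at the SR's PBDs, since there is one per host. This is only a rough sketch: the function names are made up for illustration, and it assumes you run it on a pool host with the SR UUID as the argument.

```shell
# Hypothetical helper: count entries in xe's comma-separated --minimal output.
count_csv() {
    if [ -z "$1" ]; then
        echo 0
    else
        echo "$1" | tr ',' '\n' | wc -l | tr -d ' '
    fi
}

# Hypothetical wrapper: list the PBDs of an SR that are not currently attached.
# Needs the xe CLI, so run it on a pool host.
unplugged_pbds() {
    sr_uuid=$1
    pbds=$(xe pbd-list sr-uuid="$sr_uuid" currently-attached=false --minimal)
    echo "unplugged PBDs: $(count_csv "$pbds")"
    if [ -n "$pbds" ]; then
        echo "$pbds" | tr ',' '\n'
    fi
}
```

If that reports anything other than zero, "xe pbd-plug uuid=<pbd-uuid>" on the listed PBD would be the next thing to try.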
-
RE: Issue with SR and coalesce
@Byte_Smarter There's an orphan VDI locator in Xen Orchestra, I believe. If the VDI is no longer associated with a VM, it's likely an orphan.
You can also use the command "vhd-util scan -a -p /var/run/sr-mount/<UUID-of-the-SR>" to look for orphans. -
RE: Issue with SR and coalesce
@nikade Oh, sorry, right...
What happens if you run a manual "xe sr-scan uuid=UUID-of-SR" ?
Do you have orphan VDIs? You said earlier there were no remnant snapshots. -
RE: Issue with SR and coalesce
@Byte_Smarter Hmmm, can you migrate any VMs' storage to other SRs to free up more space?
-
RE: Issue with SR and coalesce
@Byte_Smarter Sure, that's of course possible. Does "xe task-list" show any currently running tasks? Anything else of possible value in the logs?
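To narrow the task list down to what is still running, something like this should work. It's just a sketch; the function name is made up, and it assumes the usual xe field filtering on "status".

```shell
# Hypothetical wrapper: show only tasks that are still pending,
# with their UUID, name and progress (0 to 1).
pending_tasks() {
    xe task-list status=pending params=uuid,name-label,progress
}
```

Run "pending_tasks" on the pool master. An empty result would mean nothing is currently in flight as far as XAPI is concerned.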
-
RE: Issue with SR and coalesce
@lucasljorge That may be the issue. That's pretty full for a coalesce to work!
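If you want to put a number on it from the CLI, a rough sketch like the one below could help. The function name is made up for illustration; it assumes the SR reports "physical-utilisation" and "physical-size" in bytes, and that you pass in the SR's UUID.

```shell
# Hypothetical helper: print how full an SR is, as a whole-number percentage.
# Needs the xe CLI, so run it on a pool host.
sr_usage() {
    sr_uuid=$1
    used=$(xe sr-param-get uuid="$sr_uuid" param-name=physical-utilisation)
    size=$(xe sr-param-get uuid="$sr_uuid" param-name=physical-size)
    # Some SR types do not report a usable size; avoid dividing by zero.
    if [ "$size" -le 0 ]; then
        echo "unknown"
        return 1
    fi
    echo $(( used * 100 / size ))
}
```

Anything at or above roughly 90 would put you in the range where a coalesce can struggle.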
-
RE: Issue with SR and coalesce
@lucasljorge How full is the storage, percentage-wise? If over around 90%, a coalesce operation sometimes will not work. You may have to shuffle some of your VM storage to a different SR.
If your host is not responding, you may have to do a reboot to clear out stuck tasks if the "xe task-cancel" command isn't working. -
RE: Unable to enable HA on a XCP-ng 8.2.1 Compute Pool
@Denson Are all hosts properly time synchronized to NTP? Make sure they are all within reasonable limits of each other.
Might be a network thing -- are all interfaces configured alike on all hosts? Can the hosts all ping each other? -
RE: The HA doesn't work
@sixela Hmmm ... that SR backend error makes me wonder if the place where you designate HA info to be stored (the so-called "heartbeat SR") might be corrupted or such?
-
RE: The HA doesn't work
How many hosts in your pool? For HA to work out of the box, you need at least three hosts in a pool. Also, are all your hosts properly time synchronized to the same time source?
They need to be very close in time to each other for HA to work properly. Note that when HA is first enabled on a given host, it has to be rebooted for HA to function. -
RE: Rolling pool update failed to migrate VMs back
Ever since the early days of XenServer, I have always done the upgrade procedure manually, starting of course with the pool master, and manually migrating VMs to other hosts to make sure they all remain running, while tracking which VMs should run on which host (the so-called host affinity setting). This can be done on individual VMs with the command:
xe vm-param-set uuid=<vm_uuid> affinity=<host_uuid>
That way, you can make sure all VMs are successfully migrated off any given host before it's updated. -
RE: SR Garbage Collection running permanently
@Tristis-Oris Do first verify you have good backups before considering deleting snapshots. You could also just export the snapshots associated with the VMs.
As to the GUI vs. CLI, it should do the same thing, but if it runs, it should show up in the task list. -
RE: SR Garbage Collection running permanently
@Tristis-Oris By manually, do you mean from the CLI vs. from the GUI? If so, then:
xe sr-scan uuid=<sr_uuid>
Check your logs (probably /var/log/SMlog) and run "xe task-list" to see what, if anything, is active. -
RE: SR Garbage Collection running permanently
@Tristis-Oris It wouldn't hurt to do a manual cleanup. Not sure if a reboot might help, but strange that no task is showing as active. Do you have other SRs on which you can try a scan/coalesce?
Are there any VMs in a weird power state? -
RE: SR Garbage Collection running permanently
@Tristis-Oris Note that if the SR storage device is around 90% or more full, a coalesce may not work. You have to either delete or move enough data so that there is adequate free space.
How full is the SR? That said, a coalesce process can take up to 24 hours to complete. I wonder if this shows up and with what progress when you run "xe task-list" ? -
RE: XCP-ng host - Power management
@abudef Yeah, nested virtualization has its own issues. I think it was possible with at least some versions of XenServer, but it's something not well supported.
Changing those parameters also works on native XCP-ng, so I'm not sure what the advantage is of putting XCP-ng on top of ESXi. Maybe you can clarify that?