Server will not migrate VMs to enter maintenance mode
In any case, the error message isn't clear and doesn't help address the issue. As @Danp pointed out, the issue created by @olivierlambert should at least better define the root cause.
@olivierlambert said in Server will not migrate VMs to enter maintenance mode:
I have three hosts with 1.5 TB of memory, and I have 2 VMs running on the hosts using less than 10 GB of RAM, so memory is not the issue. I can manually migrate the VM and the host will go into maintenance mode.
The error is bogus. The issue may be more related to the XO VM running on the host, with the host failing to suspend the VMs on it.
The error is coming from XCP-ng (it's in all caps), not from XO.
Unsure why this detail wasn't included here, but his support ticket indicates that HA is enabled on the pool.
@Danp OK, so here is the situation....
Manual migrations work regardless of the VM HA state.
Auto-migration fails if anything but Restart is selected for the VM HA mode.
Cluster in HA, VMs in best effort HA mode - Memory Error is thrown
Cluster in HA, VMs in disabled HA mode - Memory Error is thrown
Cluster in HA, VMs in restart HA mode - No Memory Error is thrown
Last test: cluster without HA.
OK, so with both VMs set to Best Effort and HA disabled on the pool, the error does not occur. If the pool is set to HA the VMs need to be set to Restart the error occurs
Only one setting works when the pool is in HA mode: the VMs set to Restart.
If I disable the HA setting on the pool, the VMs migrate as needed and no errors occur, regardless of the VMs' HA settings.
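The test results above can be summarized as a small lookup. This is only a minimal Python sketch restating what was observed in this thread; the function name and mode strings are made up for illustration and are not part of any XCP-ng or XO API.

```python
# Observations reported in this thread, encoded as a lookup.
# All names are illustrative only; nothing here talks to XCP-ng or XO.

VM_HA_MODES = ("restart", "best-effort", "disabled")


def evacuation_observed_ok(pool_ha_enabled: bool, vm_ha_mode: str) -> bool:
    """Return True if host evacuation was reported to work for this combination."""
    if vm_ha_mode not in VM_HA_MODES:
        raise ValueError(f"unknown VM HA mode: {vm_ha_mode!r}")
    if not pool_ha_enabled:
        # With HA disabled on the pool, VMs migrated regardless of their HA setting.
        return True
    # With HA enabled on the pool, only VMs in Restart mode avoided the memory error.
    return vm_ha_mode == "restart"


if __name__ == "__main__":
    for pool_ha in (True, False):
        for mode in VM_HA_MODES:
            result = "ok" if evacuation_observed_ok(pool_ha, mode) else "memory error"
            print(f"pool HA={pool_ha!s:<5} VM HA={mode:<11} -> {result}")
```

It only records the behaviour seen on this one pool and should not be read as a description of how the HA planner is meant to work.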
@pmcgrail said in Server will not migrate VMs to enter maintenance mode:
If the pool is set to HA the VMs need to be set to Restart the error occurs
If the pool is set to HA, the VMs need to be set to Restart or the error occurs.
So it's clearly as @Danp suggested.
I just sent a clarification upstream with your report, @pmcgrail.
Apologies for jumping in, but what are the plans to resolve this issue? Any update will be appreciated.
What's your issue functionally speaking?
@olivierlambert - With the host not able to evacuate, I will have to manually move VMs to other hosts in the pool and then perform maintenance on the host. Imagine having to do this for a few hundred VMs and multiple physical hosts.
Also, the issue is not just the evacuation. If HA is enabled on the pool and you disable the HA property on all VMs, the host can be put into maintenance mode fine. But when I tried to disable maintenance mode, I got this:
"code": "HA_OPERATION_WOULD_BREAK_FAILOVER_PLAN"
So I disabled HA on the pool, and then disabling maintenance mode on the host worked fine.
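The order that worked in the post above could be sketched as a dry run that only builds the `xe` command lines and never executes them. This rests on assumptions: that XO's "disable maintenance mode" corresponds to `xe host-enable`, and that the helper name and the UUID placeholder are hypothetical.

```python
from typing import List

# Dry-run sketch: builds command lines only, it never runs them.
# Assumption: "disable maintenance mode" in XO corresponds to `xe host-enable`.


def exit_maintenance_commands(host_uuid: str) -> List[List[str]]:
    """Return xe commands in the order that worked in the post above."""
    if not host_uuid:
        raise ValueError("host_uuid must not be empty")
    return [
        ["xe", "pool-ha-disable"],                   # 1. turn HA off on the pool
        ["xe", "host-enable", f"uuid={host_uuid}"],  # 2. take the host out of maintenance
    ]


if __name__ == "__main__":
    # "<host-uuid>" is a placeholder, not a real identifier.
    for cmd in exit_maintenance_commands("<host-uuid>"):
        print(" ".join(cmd))
```

Note that this leaves the pool without HA; re-enabling it afterwards is not covered by the post and is left out here.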
I think the whole HA feature needs to be fully validated and every aspect cross-checked; otherwise I don't think it's production ready.
For smaller environments it might not be too much of a pain, but even for a medium environment this needs to be fixed.
Please open a support ticket. Obviously, for a large infrastructure it would make sense to be sure it's not behaving like that, or to make sure XO disables things in the correct order before trying to evacuate a host.