@aflons @jgrafton First of all, I would like to thank both of you very much for replying so quickly to this old thread!
Our failure rate is roughly 1 frozen VM per 90 Rocky 8 VMs per day, which is not tolerable. We have hundreds more Rocky 8 VMs on VMware waiting for migration to XCP-ng.
I tried to summarise our options:
Our kernels are pretty fresh, but we can try the very latest available for Rocky 8.
Upgrading to Rocky 9 in the short term is not an option. We have to migrate the Rocky 8 VMs from VMware to XCP-ng first, then we can think about switching to Rocky 9 later.
VMware Tools are removed as part of the migration procedure.
We are already on shared lvmohba storage, which is production-grade, all-SSD Hitachi Vantara storage, the same as we used under VMware, so I see no room for change or improvement here.
As a last resort, we can try disabling the load-balancing plugin and rebooting monthly during our maintenance window, but this would be an ugly workaround.
Is there anything I forgot?
@jgrafton Was there any useful suggestion or conclusion in your Vates support ticket #7726289? I am afraid that we are facing a tricky interoperability issue between the Xen hypervisor and the 4.18.0 kernel, and both components are independent of XCP-ng and Vates.