XCP-ng 8.2 Rolling Update Error
With the new updates that just came in, I wanted to use the new rolling update feature.
When I do it, I get the following error.
I'm not sure what that UUID is and I can't find it; it's not one of my hosts. Each of my hosts has enough RAM to run all of the VMs that I have. All of the VMs are running on an NFS SR.
I'm happy to pull any other logs to help.
- It's an OpaqueRef, not a UUID
- It might make some checks on the static max memory itself. Can you double check on that?
- Is there a way to look up OpaqueRef and does it point to a specific service/host/vm/etc?
- I checked all running VMs, including XO and XCP-ng: 33GB assigned / being used. Each host has 64GB of RAM. I do have cores over-provisioned if all the VMs are forced onto one host, but I can fix that. With a rolling update dividing the VMs among 2 of the 3 hosts at any given time, the cores shouldn't be over-provisioned.
I don't remember where to check OpaqueRef
Try to find the error in the XCP-ng host logs; you might find more details there.
There are more debug messages but these seem like the most relevant.
Apr 3 20:02:18 localhost xapi: [debug||1864 ||vgpuops] vGPUs allocated to VM (OpaqueRef:ee6254af-44af-4bcf-8382-d9b7542d2b3c) are:
Apr 3 20:02:56 localhost xapi: [ warn||1864 ||rbac_audit] cannot marshall arguments for the action VM.migrate_send: name and value list lengths don't match. str_names=[session_id,vm,dest,live,vdi_map,vif_map,options,vgpu_map,], xml_values=[S(OpaqueRef:4489cc5b-fbe9-44d4-a60f-627b8960338f),S(OpaqueRef:ee6254af-44af-4bcf-8382-d9b7542d2b3c),{SM:S(;host:S(OpaqueRef:ac2e50bc-1f80-417a-8eea-a7ce955d4c24);xenops:S(;session_id:S(OpaqueRef:cd0a7342-5155-434d-a8ca-2e9f7dcf8460);master:S(},B(true),{OpaqueRef:a4398d4e-946c-4c83-ad5f-b4dc1c3ddeee:S(OpaqueRef:4c00dfd9-a30c-47a4-8367-e8a7d7c53cb6);OpaqueRef:188682c3-1b45-4ae4-98cf-0dfff1d90d56:S(OpaqueRef:4c00dfd9-a30c-47a4-8367-e8a7d7c53cb6)},{},{force:S(false)},]
Indeed, those 2 lines aren't relevant at all. Can you copy more lines, into https://paste.vates.fr/ for example?
https://paste.vates.fr/?f2f854418b6493c8#B1Wrjkb9wVQTY1BKvTvcz1r4WZmXHnTB2n52UjLx4Rof
I used:
cat /var/log/xensource.log | grep -i ee6254af-44af-4bcf-8382-d9b7542d2b3c
The logs I sent are from host 1, not sure if I need to run that command on all of the hosts.
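For anyone following along, here is a rough sketch of how to see which OpaqueRefs show up in a chunk of log before grepping for one of them. The function name `list_opaque_refs` is made up for this example, and it only assumes the refs look like the ones in the lines above (`OpaqueRef:` followed by a 36-character UUID-style string).

```shell
# Hypothetical helper: print each distinct OpaqueRef found on stdin, sorted.
# Assumes refs have the form "OpaqueRef:" plus a 36-character hex/hyphen string,
# as in the log lines quoted in this thread.
list_opaque_refs() {
  grep -oE 'OpaqueRef:[0-9a-f-]{36}' | sort -u
}
```

It reads from stdin, so on a host it could be fed the same file as the command above, e.g. `list_opaque_refs < /var/log/xensource.log`.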
You are in HA. Please disable HA first before doing a rolling pool update.
This error comes from HA: while one host is down for its reboot, if one of the remaining hosts fails, there won't be enough memory left to carry out the HA plan.
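To make that reasoning concrete, here is a deliberately simplified capacity check. It is not how XAPI's HA planner actually works (the real one places VMs host by host; this just compares totals), and the function name and all inputs are assumptions for illustration: number of hosts, RAM per host in GiB, summed VM memory in GiB, and how many host failures the plan must survive.

```shell
# Simplified sketch, NOT the real XAPI HA planner: compares total memory only.
# Arguments (all hypothetical inputs, whole numbers):
#   $1 hosts in the pool, $2 RAM per host (GiB),
#   $3 summed memory of the protected VMs (GiB), $4 host failures to tolerate
ha_plan_fits() {
  hosts=$1
  host_ram_gib=$2
  vm_ram_gib=$3
  tolerate=$4
  # One host is offline for its reboot, and the plan must still
  # survive $tolerate further host failures.
  survivors=$((hosts - 1 - tolerate))
  [ "$survivors" -gt 0 ] && [ $((survivors * host_ram_gib)) -ge "$vm_ram_gib" ]
}

# Example call: 3 hosts with 64 GiB each, 33 GiB of VMs, tolerate 1 failure
ha_plan_fits 3 64 33 1 && echo "plan fits" || echo "plan does not fit"
```

With those example figures the naive count passes, but it fails as soon as the failures-to-tolerate value is 2 or the summed VM memory exceeds what a single host can hold, so the result depends heavily on which memory figure and which HA settings are plugged in.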
edit: I'm opening an issue on XO GH repo to check and disable HA before doing RPU. Thanks for the feedback!
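Until XO does that check itself, something along these lines could be run from a host console before starting the rolling pool update. It is only a sketch: `disable_ha_if_enabled` is a made-up name, and it assumes the `xe` CLI is available and that `xe pool-list --minimal` returns a single pool UUID.

```shell
# Sketch: disable HA only if it is currently enabled on the pool.
# disable_ha_if_enabled is a hypothetical helper name; run where the xe CLI works.
disable_ha_if_enabled() {
  pool_uuid=$(xe pool-list --minimal)
  ha_enabled=$(xe pool-param-get uuid="$pool_uuid" param-name=ha-enabled)
  if [ "$ha_enabled" = "true" ]; then
    xe pool-ha-disable
    echo "HA disabled for pool $pool_uuid"
  else
    echo "HA already disabled for pool $pool_uuid"
  fi
}
```

HA stays off after the update, so it has to be turned back on afterwards if you still want it, for example with `xe pool-ha-enable heartbeat-sr-uuids=<sr-uuid>` using your own heartbeat SR.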
Awesome, thanks for helping to troubleshoot. I'll go ahead and disable that, then do the update. I've been having to switch between XO and XCP-ng Center to get certain features to work.
- The HA configuration in Center is super easy, with good feedback on the heartbeat. Will this type of info be incorporated into XO?
- While trying to troubleshoot this, I was looking at the VM memory options in XO vs XCP-ng Center; setting fixed vs automatic memory is much easier in Center.
- XO detects that all of my VMs properly have the management agent installed (with ip, os, and agent version being reported), XCP-ng Center only reports 1/4 of the VMs having the management agent installed.
We are aware of those shortcomings and they'll be fixed in XO 6.
Great to hear, looking forward to the updates.