XCP-ng 8.2 Rolling Update Error
-
- Is there a way to look up an OpaqueRef, and does it point to a specific service/host/VM/etc.?
- I checked all running VMs, including XO and XCP-ng: 33GB is assigned and in use. Each host has 64GB of RAM. I do have cores over-provisioned if all the VMs are forced onto one host, but I can fix that. With a rolling update dividing the VMs among 2 of the 3 hosts at any given time, cores shouldn't be over-provisioned.
-
I don't remember where to check OpaqueRef.
Try to find the error in the XCP-ng host logs (/var/log/xensource.log); you might get more details there.
There are more debug messages but these seem like the most relevant.
Apr 3 20:02:18 localhost xapi: [debug||1864 ||vgpuops] vGPUs allocated to VM (OpaqueRef:ee6254af-44af-4bcf-8382-d9b7542d2b3c) are:
Apr 3 20:02:56 localhost xapi: [ warn||1864 ||rbac_audit] cannot marshall arguments for the action VM.migrate_send: name and value list lengths don't match. str_names=[session_id,vm,dest,live,vdi_map,vif_map,options,vgpu_map,], xml_values=[S(OpaqueRef:4489cc5b-fbe9-44d4-a60f-627b8960338f),S(OpaqueRef:ee6254af-44af-4bcf-8382-d9b7542d2b3c),{SM:S(http://10.100.10.31/services/SM?session_id=OpaqueRef:cd0a7342-5155-434d-a8ca-2e9f7dcf8460);host:S(OpaqueRef:ac2e50bc-1f80-417a-8eea-a7ce955d4c24);xenops:S(http://10.100.10.31/services/xenops?session_id=OpaqueRef:cd0a7342-5155-434d-a8ca-2e9f7dcf8460);session_id:S(OpaqueRef:cd0a7342-5155-434d-a8ca-2e9f7dcf8460);master:S(http://10.100.10.31/)},B(true),{OpaqueRef:a4398d4e-946c-4c83-ad5f-b4dc1c3ddeee:S(OpaqueRef:4c00dfd9-a30c-47a4-8367-e8a7d7c53cb6);OpaqueRef:188682c3-1b45-4ae4-98cf-0dfff1d90d56:S(OpaqueRef:4c00dfd9-a30c-47a4-8367-e8a7d7c53cb6)},{},{force:S(false)},]
-
Indeed, those 2 lines aren't relevant at all. Can you paste more lines, for example at https://paste.vates.fr/ ?
-
@olivierlambert
https://paste.vates.fr/?f2f854418b6493c8#B1Wrjkb9wVQTY1BKvTvcz1r4WZmXHnTB2n52UjLx4RofI
I used:
cat /var/log/xensource.log | grep -i ee6254af-44af-4bcf-8382-d9b7542d2b3c
The logs I sent are from host 1; I'm not sure if I need to run that command on all of the hosts.
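For anyone following along, here is a rough sketch of how to see which references a log mentions before grepping for one of them. The function name `refs_in_log` is made up for this example; it only uses grep, sort, and uniq, and it assumes OpaqueRefs always have the UUID-like shape seen in the lines above.

```shell
# Hypothetical helper (not part of XCP-ng): list each distinct OpaqueRef
# found in a log file, with the number of times it occurs, most frequent first.
refs_in_log() {
    # $1: path to a log file, e.g. /var/log/xensource.log
    grep -oE 'OpaqueRef:[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}' "$1" \
        | sort \
        | uniq -c \
        | sort -rn
}
```

Running `refs_in_log /var/log/xensource.log` on the host being inspected prints one line per reference. In the warning above, the order of `str_names` suggests that the second value, the `ee6254af...` reference, is the `vm` argument of `VM.migrate_send`.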
-
You have HA enabled. Please disable HA before doing a rolling pool update.
This error comes from HA: while one host is down for its reboot, a failure on one of the remaining hosts would leave too little memory to carry out the HA plan.
Edit: I'm opening an issue on the XO GitHub repo so that XO checks for and disables HA before doing an RPU. Thanks for the feedback!
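As a minimal sketch of the "disable HA first" step from the command line, assuming you run it as root on a pool member: the `xe` subcommands used here (`pool-list`, `pool-param-get`, `pool-ha-disable`) are the standard CLI ones, while the function names are only illustrative. Doing the same from the XO or XCP-ng Center UI should work just as well.

```shell
# Sketch only: helper names are hypothetical, the xe subcommands are standard.

# Print the UUID of the pool this host belongs to.
pool_uuid() {
    xe pool-list --minimal
}

# Print "true" or "false" depending on whether HA is enabled on the pool.
ha_enabled() {
    xe pool-param-get uuid="$(pool_uuid)" param-name=ha-enabled
}

# Disable HA only if it is currently enabled.
disable_ha_if_enabled() {
    if [ "$(ha_enabled)" = "true" ]; then
        xe pool-ha-disable
    else
        echo "HA already disabled"
    fi
}
```

Call `disable_ha_if_enabled` before starting the rolling pool update. Once all hosts are updated, HA can be turned back on, for instance with `xe pool-ha-enable heartbeat-sr-uuids=<sr-uuid>`, where the SR UUID is whatever heartbeat SR you used before.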
-
@olivierlambert
Awesome, thanks for helping to troubleshoot. I'll go ahead and disable that, then do the update. I've been having to switch between XO and XCP-ng Center to get certain features to work.
- The HA configuration in Center is super easy, with good feedback on the heartbeat. Will this type of info be incorporated into XO?
- While trying to troubleshoot this, I was looking at the VM memory options. Comparing XO with XCP-ng Center, setting fixed vs. automatic memory is much easier in Center.
- XO correctly detects that all of my VMs have the management agent installed (with IP, OS, and agent version reported); XCP-ng Center only reports 1/4 of the VMs as having the management agent installed.
-
We are aware of those shortcomings, and they'll be fixed in XO 6.
-
Great to hear, looking forward to the updates.