Rolling Pool Update - failing
-
@Danp said in Rolling Pool Update - failing:
- How long ago was the last host added to this pool?
They were added last week (https://xcp-ng.org/forum/topic/8916/vsan-to-xcp-ng-xostor-homelab/8)
- When were these hosts last rebooted?
Yesterday evening
- Is this XCP-ng 8.2.1? Fully patched? Rebooted since patched?
XCP-ng 8.3 beta 2
- XOA or XO from sources? What version / commit?
XOA 5.94, build 20240401
What happens if you reboot the VM and then retry the RPU? Does the error still occur?
Let me try this
-
@Danp said in Rolling Pool Update - failing:
What happens if you reboot the VM and then retry the RPU? Does the error still occur?
Thanks @Danp, rebooting XOA "fixed" the RPU error.
-
@Danp I think this RPU is "stuck"?
-
@fatek Looking at the entry under XO Tasks, it shows that it completed in 22 minutes. Those appear to be tasks related to the host patching.
Since v8.3 is still beta software, YMMV when performing these types of tasks. What is the status of your hosts? Have they rebooted? Are they stuck in maintenance mode? Etc.
-
I restarted the toolstack on the master & ran the RPU again.
It looks like it is making progress.
I will report back shortly.
Does an RPU not reboot the hosts automagically?
-
I assume the RPU will also take care of outstanding patches on the host, is that correct?
-
@fatek Yes, it should evacuate each host, starting with the pool master. Then patch the host, followed by a reboot.
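To make the order concrete, here is a minimal sketch of the sequence described above. `rpu_plan` is a hypothetical helper name, not something XO or XCP-ng ships. It only prints the steps and does not talk to XAPI or touch any host.

```shell
#!/bin/sh
# Hypothetical helper: prints the per-host order described above.
# It only echoes a plan; nothing is executed against the pool.
rpu_plan() {
    master="$1"
    shift
    # Pool master first, then the remaining hosts in the order given.
    for host in "$master" "$@"; do
        echo "evacuate $host"
        echo "patch $host"
        echo "reboot $host"
    done
}
```

Calling `rpu_plan master host2 host3` lists the three steps for the master before moving on to the other hosts.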
-
@Danp The RPU did not patch the hosts; all hosts still show outstanding patches.
I'll reboot all the hosts; maybe they just need a 'lil kick in the ass!
-
@fatek said in Rolling Pool Update - failing:
CANNOT_EVACUATE_HOST
This error means there are VMs that can't be live migrated elsewhere. In that case, RPU can't work.
The detailed reason is the CPU feature set a VM booted with. You probably added a host to the pool after booting some VMs, so they are using more advanced CPU features than are now available across the pool. The trick is to shut down and then immediately boot those VMs, and the problem will be solved.
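As a rough sketch of that trick from the host CLI, assuming you run it on a pool member where `xe` is available: `run`, `blocking_vms` and `cold_restart_vm` are hypothetical helper names, and the UUIDs are placeholders you supply. Setting `DRY_RUN=1` prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch only. Set DRY_RUN=1 to print the commands instead of running them.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# List the VMs that block evacuation of a host ($1 = host UUID).
blocking_vms() {
    run xe host-get-vms-which-prevent-evacuation uuid="$1"
}

# Full shutdown followed by a fresh boot ($1 = VM UUID), so the VM
# starts again with the CPU features the pool currently offers.
cold_restart_vm() {
    run xe vm-shutdown uuid="$1" && run xe vm-start uuid="$1"
}
```

For example, `DRY_RUN=1 cold_restart_vm <vm-uuid>` shows the two `xe` calls without touching the VM.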
-
@olivierlambert That is what I was thinking as well, but he said that all hosts had been rebooted the previous day.
-
Even though the RPU showed a completed status of 22 mins, 1 of the 4 hosts was in emergency mode & 2 others were disabled.
I was able to fix the situation with the following commands:
yum-complete-transaction --cleanup-only
yum update
reboot
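The same recovery steps can be wrapped so that each one only runs if the previous one succeeded. This is a minimal sketch, meant to be run as root on the affected host. `run` and `recover_yum` are hypothetical helper names, and `DRY_RUN=1` prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch only. Set DRY_RUN=1 to print the commands instead of running them.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

recover_yum() {
    # Clean up the unfinished transactions yum reports.
    run yum-complete-transaction --cleanup-only || return 1
    # Apply the outstanding patches.
    run yum update || return 1
    # Reboot so the patched components are actually loaded.
    run reboot
}
```

Try `DRY_RUN=1 recover_yum` first to see what would be executed.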
-
That's weird. So you have things unfinished in yum?
-
Yes, it said there were a few uncompleted transactions in yum.
After the cleanup operation, I was able to patch the hosts, reboot & now it's time to install XOSTOR 1.0!
*remember, I'm on XCP-ng 8.3 beta 2
-
About XOSTOR on 8.3: you'll hit bugs. It's not fully up to date, because we prioritize bug fixes on 8.2. So as long as it's test and not prod, that's fine.
edit: for yum, we have identified an issue where XO could fail without waiting for yum to finish. It will be solved ASAP.
-
It is test, not prod.
Next week, I'll probably tear it down & re-install with 8.2