Wyse 5070 VMs won't boot after BIOS update 1.27
-
I wanted to share this with the group because it threw me for a loop for the better part of three days, & I know there are quite a few other homelab aficionados who repurposed these little thin clients because they are just so small & versatile.
TL;DR: a microcode update was the culprit. Hopefully Dell either pushes an update or it otherwise gets resolved, or...I'll be insecure, I guess.
Fixed it by rolling back to version 2:2.1-26.xs28.1.xcpng8.2
yum downgrade microcode_ctl 2:2.1-26.xs28.1.xcpng8.2
yum downgrade microcode_ctl
might work as well, but that really depends on your upgrade path.
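For anyone wanting to script the rollback across several hosts, here is a minimal sketch of the idea above: compare the installed microcode_ctl build with the known-good one, and only downgrade when they differ. The function names and the DRY_RUN switch are made up for illustration. The rpm query format and the name-version-release package spec passed to yum are assumptions about how to express this, so try it on one host first.

```shell
#!/bin/sh
# Known-good build from this thread (epoch 2, written here without the "2:" prefix).
KNOWN_GOOD="2.1-26.xs28.1.xcpng8.2"

# Hypothetical helper: prints version-release of the installed microcode_ctl package.
installed_microcode() {
    rpm -q --qf '%{VERSION}-%{RELEASE}\n' microcode_ctl
}

# Hypothetical helper: succeeds (exit 0) when the given build is NOT the known-good one.
needs_rollback() {
    [ "$1" != "$KNOWN_GOOD" ]
}

# Hypothetical helper: prints the yum command, and runs it only when DRY_RUN=0.
rollback_microcode() {
    cmd="yum downgrade -y microcode_ctl-$KNOWN_GOOD"
    echo "$cmd"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        $cmd
    fi
}
```

Something like `needs_rollback "$(installed_microcode)" && DRY_RUN=0 rollback_microcode` would then do the actual downgrade. Without DRY_RUN=0 it only prints the command, which is the safer default. The host most likely needs a reboot afterwards for the older microcode to be loaded.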
Details:
After a recent update I discovered that none of my virtual machines would boot, save for one that would boot sporadically. That one was accidentally set to PV drivers...not entirely sure how that happened, but it could also have just loaded that way when I imported the cloud image from Harvester.
The symptoms:
No keyboard or mouse input for Windows or Linux.
Windows VMs would boot & do one of two things: the screen would show "Guest has not initialised the display (yet)", or they would get to the point where the drive was initialized & hang...they would basically max out what CPU they were allowed.
Linux VMs would boot & always hang at the same place, right after GRUB loaded the initial ramdisk.
Tried every BIOS setting I could, no joy.
Hope you other poor stuck souls stumble on this & save yourselves some time.
-
@stormi I'm not sure there are many things we can do about it. Maybe a proper/documented way to pin the latest working microcode?
-
@olivierlambert Completely agree, years-old consumer/non-HCL-compliant hardware shouldn't rate in overall development progress & a pin is about the best we should hope for.
It might just be time for some home labbers to start shopping & retire some gear!
The XCP-NG community & platform are amazing...keep up the great work!
-
Making sure (how?) this topic is indexed by search engines would be a good starting point.
-
My search engine just made me discover this one: https://xcp-ng.org/forum/topic/8584/dell-wyse-fw-update-breaks-vm-booting-console-frozen-tianocore-edk2-related
-
So, could you get some logs from the boot failures?
- head /proc/cpuinfo
- full output of xl dmesg, including booting the problem VM with the good and bad ucode in place
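If it helps, the two outputs requested above could be captured with a small script along these lines. This is only a sketch: the helper names, the "good-ucode"/"bad-ucode" labels and the file naming scheme are invented for illustration, and it assumes it is run as root in dom0, where xl is available.

```shell
#!/bin/sh
# Hypothetical helper: builds a file name such as "myhost_bad-ucode_xl_dmesg.txt".
# $1 = host name, $2 = label ("good-ucode" or "bad-ucode"), $3 = kind of log
log_name() {
    printf '%s_%s_%s.txt\n' "$1" "$2" "$3"
}

# Hypothetical helper: saves both outputs into the directory given as $2
# (defaults to the current directory).
collect_logs() {
    label="$1"
    outdir="${2:-.}"
    host="$(hostname)"
    head /proc/cpuinfo > "$outdir/$(log_name "$host" "$label" cpuinfo)"
    xl dmesg > "$outdir/$(log_name "$host" "$label" xl_dmesg)"
}
```

The idea would be to run `collect_logs bad-ucode /tmp` after a VM start has failed with the bad microcode, and `collect_logs good-ucode /tmp` after the same VM has booted with the good one, so the two sets of files can be compared.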
-
@stormi said in Wyse 5070 VMs won't boot after BIOS update 1.27:
head /proc/cpuinfo
Here is the info for my 3 nodes.
xenserver01.trc.blox_cpuinfo.txt
xenserver01.trc.blox_xl_dmesg.txt
xenserver02.trc.blox_cpuinfo.txt
xenserver02.trc.blox_xl_dmesg.txt
xenserver03.trc.blox_cpuinfo.txt
xenserver03.trc.blox_xl_dmesg.txt
-
@t-chamberlain Is the output of
xl dmesg
with or without the bad microcode? Did a VM fail to boot since the hosts were booted?
-
Those are from the hosts with the microcode downgraded. I can upgrade the packages & get those outputs as well if need be.
-
@stormi Sorry about that, I only answered part of the question. I did have a VM fail to boot, but it was because of an underlying issue with drbd/xostor & I don't believe it was related to the microcode.
-
@t-chamberlain Yes, please. One host, with one failed VM start, would be enough
-
Here is the xl_dmesg output after the microcode update.
-
You can see the vms just kind of hang here.
-
@t-chamberlain Was the xl dmesg output produced before or after the VM hung?
-
@stormi At that point the VM dsk05002 had been hung for about 20 minutes. I don't know whether an alert event is ever generated when VMs are stuck like this.
If need be I can boot a VM & just let it go.
-
@t-chamberlain No need to, unless you have some doubts. For now the conclusion will be that
xl dmesg
doesn't output anything particular when the VM hangs.
-
However, logs from a debug Xen might give more clues.
If you can, please follow the instructions given by @andyhhp - a Xen developer - at https://xcp-ng.org/forum/post/74855.
-
@t-chamberlain In addition to the XTF testing, could you also please (with the bad microcode) try booting Xen with
spec-ctrl=no-verw
on the command line, and see whether that changes the behaviour of your regular VMs? Please capture
xl dmesg
from this run too.
-
Doc about XTF testing: https://docs.xcp-ng.org/project/development-process/tests/#test-the-xen-hypervisor-itself
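As for putting spec-ctrl=no-verw on the Xen command line, here is a hedged sketch of one way to do it persistently. It assumes XCP-ng's xen-cmdline helper is at the path below and accepts --set-xen and --delete-xen. The two function names are invented for illustration. The functions only print the commands so they can be reviewed before being run by hand, and a reboot of the host would be needed for the change to take effect.

```shell
#!/bin/sh
# Assumption: XCP-ng's helper for editing the Xen boot command line lives here.
XEN_CMDLINE="/opt/xensource/libexec/xen-cmdline"

# Hypothetical helper: prints the command that would add an option
# (e.g. "spec-ctrl=no-verw") to the Xen command line.
set_xen_option_cmd() {
    printf '%s --set-xen %s\n' "$XEN_CMDLINE" "$1"
}

# Hypothetical helper: prints the command that would remove an option
# (given by name, e.g. "spec-ctrl") once the test is over.
delete_xen_option_cmd() {
    printf '%s --delete-xen %s\n' "$XEN_CMDLINE" "$1"
}
```

Calling `set_xen_option_cmd spec-ctrl=no-verw` prints the command to run before rebooting into the test, and `delete_xen_option_cmd spec-ctrl` prints the one that should undo it afterwards.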
-
@t-chamberlain I've got a fix from Intel, and @stormi has packaged it.
yum update microcode_ctl --enablerepo=xcp-ng-testing
should get you
microcode_ctl-2.1-26.xs29.2.xcpng8.2
which has the fixed microcode for this issue in it.