Dell Wyse FW update breaks VM booting; console frozen; TianoCore/EDK2 related?
-
Hiya.
Just came out of a couple weekends' worth of troubleshooting an issue booting VMs on a clean install of XCP-ng.
The good news is that my original issue is resolved for now -- but the fix involves downgrading my system's firmware, which, depending on where the bug lies, might be of interest here. At the very least, hopefully this helps others with similar issues and maybe spurs on a solution that doesn't lock me into older firmware.
TLDR: A Dell firmware update that claims to resolve vulnerabilities (among others) related to TianoCore and EDK2 (DSA-2023-344) breaks VMs' ability to boot and freezes the console. For future reference, version 1.27.0 currently works as of 3/10.
System:
A Dell Wyse 5070 Extended; this is a re-purposed thin client that has gained popularity in home-lab circles as a low-power, x86-64 alternative to comparable SBCs like the Pi. Personally, I'm hoping to utilize this as a failover VM host in my network.
It is not on the Xen HCL, but its Intel J5005 is perfectly capable of HVM -- and as part of my troubleshooting, it worked fine with the other hypervisors I tested. Other tests I've run, such as memtest, drive health, etc., all came back passing.
The only pain point that stood out to me initially was that this system does away with legacy boot support entirely in favor of EFI-only booting -- this may be related to the issue I'm experiencing, but without the ability to try a legacy boot, I have no way to test.
Behavior:
After a successful install of either 8.2.1 or 8.3 beta, attempts to spin up a new VM from XO are met with the CPU usage pegged at 100% and a VM console that appears 'frozen' -- unable to accept any input after a second or two of activity, usually freezing on the installer menu of whatever ISO I've loaded up.
I initially thought this might be an issue with the console itself, but the behavior is consistent across XO/XOA, XCP-ng Center for Windows, as well as the XO-lite preview on 8.3.
Ended up messing around with a ton of VM settings, system UEFI options (C-states, SpeedStep, etc.), GRUB options (attempting both Safe Mode and the alt kernel), as well as swapping out system components for hours.
My breakthrough came with trying a FW downgrade. When I bought the system, it came pre-loaded with Dell FW version 1.28.0 -- this was only one revision behind the latest 1.29.0 (only released days ago it seems). Both of these versions exhibited the same freezing issue. However, downgrading to 1.27.0 seems to totally fix everything.
Version 1.27.0: (WORKING)
https://www.dell.com/support/home/en-us/drivers/driversdetails?driverid=hwdwd&oscode=biosa
Version 1.28.0: (NOT WORKING)
https://www.dell.com/support/home/en-us/drivers/driversdetails?driverid=dpfjj&oscode=biosa
Security Advisory 1.28.0 supposedly fixes: (DSA-2023-344)
https://www.dell.com/support/kbdoc/en-bb/000217986/dsa-2023-344
I'll admit that I'm not versed enough in the details to know what exactly is changed with this FW update -- but it seems related to recently-discovered "PixieFail" vulnerabilities with TianoCore/EDK2.
You may be aware of all this already, but I wanted to share my experience, findings and temporary solution for my particular system. If anyone has any insight or suggestions that they're willing to share, I'm all ears.
Thanks!
-
Thank you for this information @rubberhose
I've been eyeing these Wyse 5070 units as well, and I might have been "devastated" if I hadn't seen your post!
BTW, is 1.29 also not working?
-
@ambad4u For sure! Yeah, 1.29 does not work for me either. Going to keep an eye out for any new versions and report back.
For the record, I only have this one unit to test, so there's always the chance mine could be bunk. Luckily these units make it easy to change out the firmware in either direction, so if anyone is willing to (carefully) test and confirm, it would be appreciated.
Been running some benches and it's a fine little machine otherwise.
-
@rubberhose I've got a fix from Intel, and @stormi has packaged it.
yum update microcode_ctl --enablerepo=xcp-ng-testing
should get you
microcode_ctl-2.1-26.xs29.2.xcpng8.2
which has the fixed microcode for this issue in it.
When you've got that installed, it should be safe to update back to the latest firmware.
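For anyone who wants to double-check that the package actually landed before flashing newer firmware, here is a rough sketch. It assumes `rpm -q microcode_ctl` prints the name-version-release string quoted above followed by the architecture; the helper name `has_fixed_microcode` is made up for illustration. It only recognises that exact build, so a later build that also carries the fix would be reported as not matching.

```shell
#!/bin/sh
# Expected package string, copied from the post above.
EXPECTED="microcode_ctl-2.1-26.xs29.2.xcpng8.2"

# has_fixed_microcode PKG (hypothetical helper)
# Succeeds if PKG, as printed by `rpm -q microcode_ctl`, starts with the
# expected name-version-release string. Any suffix (e.g. the arch) is ignored.
has_fixed_microcode() {
    case "$1" in
        "$EXPECTED"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Only query the package database when rpm is actually available.
if command -v rpm >/dev/null 2>&1; then
    installed="$(rpm -q microcode_ctl 2>/dev/null || true)"
    if has_fixed_microcode "$installed"; then
        echo "fixed microcode package installed: $installed"
    else
        echo "not the expected build (found: ${installed:-none})"
    fi
fi
```

Run it on the host after the `yum update` above. Note that it only checks the installed package, not which microcode revision the CPU is currently running, and a host reboot is presumably needed before the new microcode is loaded.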
-
Hello! My apologies -- I hadn't gotten around to trying this solution out until now.
I just did and it appears to have fixed the issue! VMs are booting correctly and I can interact with them normally through the console.
Can also confirm it works on both the 1.29 Dell FW from my original post and the latest 1.32 FW available as of this post.
Many thanks @andyhhp and @stormi for both your time and effort in packaging this -- I greatly appreciate it!