[RHEL kernel bug] XCP vm fails to boot after newest kernel applied.
-
Wasn't a choice I wanted either, but decided I had to do it. Went with HP T740 for everything which is way less than I'd really like, but seems to be working so far for my lab.
I'll probably load up an Alma 9 (assuming it is affected too) later today. If anyone knows that this is limited to Alma 8, then I'll switch gears and use 8. Need a simple LAMP stack running to test something, and need some other VMs for backup testing too.
-
Can you please let me know where this bug is filed (I see some numbers mentioned above), so I can keep an eye on it out of curiosity? A PM is fine as well.
Thanks!
-
@bberndt
I just installed a fresh Alma 8, did yum update to see what would happen, and it's still working. Gave out the same kernel as above. This was installed UEFI on XCP-ng 8.3 which was a fresh install a few days ago from a nightly (near release?) ISO. It was installed to an NFS share, 2 cores and 4GB with an Intel i1000 interface. Xenserver tools 8.4.0-1 installed.
There are no extra packages installed yet, could this be a package conflict?
Anything else I can check to see why mine works and others are failing?
-
@Greg_E
I checked a few of mine, and they all appear to be BIOS mode, not UEFI.
I've had a couple of hardware machines as well that updated OK. I know at least one was UEFI.
-
@bberndt that's why I mentioned uefi, wondering if legacy is part of the problem. I won't have time to fiddle with this for a while, broke a couple things today that I need to fix, and need to set up glpi for some testing.
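
For anyone comparing notes on UEFI versus BIOS guests, here is a minimal sketch of how the boot mode could be checked from inside the Linux VM. It relies on the fact that the kernel exposes `/sys/firmware/efi` only when it was booted through UEFI. The function name `boot_mode` and its `sysfs_root` parameter are made up for illustration and are not part of any XCP-ng or guest-tools API.

```python
from pathlib import Path


def boot_mode(sysfs_root: str = "/sys") -> str:
    """Return "UEFI" if the running Linux system was booted via UEFI, else "BIOS".

    sysfs_root is a hypothetical parameter so the check can be pointed at
    another directory tree; on a real guest the default "/sys" is what you want.
    """
    # Linux creates <sysfs>/firmware/efi only on UEFI boots.
    efi_dir = Path(sysfs_root) / "firmware" / "efi"
    return "UEFI" if efi_dir.is_dir() else "BIOS"


if __name__ == "__main__":
    print(f"This guest booted in {boot_mode()} mode")
```

Running this on each affected and unaffected VM would at least give a consistent way to report the boot mode alongside the kernel version.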
-
CCing @anthonyper who is tasked with following this regression closely and letting us know about any progress.
-
@Greg_E said in XCP vm fails to boot after newest kernel applied.:
@bberndt that's why I mentioned uefi, wondering if legacy is part of the problem. I won't have time to fiddle with this for a while, broke a couple things today that I need to fix, and need to set up glpi for some testing.
Made a new Rocky Linux 8 install on a lab host. UEFI boot mode and mostly defaults, on XCP-ng 8.2 on an E5 2620 v0 host.
Used the guest tools from the Rocky and/or EPEL repository (added the EPEL repo and then installed xe-guest-utilities).
Does NOT boot after updating to the latest kernel.
-
Is there something different about Alma? Friday I may have some time to fiddle and can try a Rocky 8 to see what happens.
Is this an 8.2 and 8.3 issue or just 8.2? I have 8.2 in production and could try there, 8.3 in my lab with a very fresh build.
-
@Greg_E said in XCP vm fails to boot after newest kernel applied.:
Is there something different about Alma? Friday I may have some time to fiddle and can try a Rocky 8 to see what happens.
Is this an 8.2 and 8.3 issue or just 8.2? I have 8.2 in production and could try there, 8.3 in my lab with a very fresh build.
Migrated to an XCP-ng 8.3 host (Xeon E5-2689 v4).
No change.
-
If I find an hour, I'll give Rocky 8 a try.
Could you use LEAPP to migrate that to Alma? Maybe they are doing something differently, which is why mine are working.
I can tell you, there is nothing special that I'm doing, my systems are as vanilla as they get.
-
@Greg_E said in XCP vm fails to boot after newest kernel applied.:
If I find an hour, I'll give Rocky 8 a try.
Could you use LEAPP to migrate that to Alma? Maybe they are doing something differently, which is why mine are working.
I can tell you, there is nothing special that I'm doing, my systems are as vanilla as they get.
@Greg_E
Google AI says something that sounds familiar to me:
Rocky Linux strives for 1:1 bug-for-bug compatibility with RHEL, while Alma Linux is more of an RHEL rebuild, making some adjustments and adding its own features
-
Ok, that might explain the difference.
Would a LEAPP from Rocky 8 up to Alma 9 be possible and solve the issue?
-
@Greg_E said in XCP vm fails to boot after newest kernel applied.:
@bberndt
I just installed a fresh Alma 8, did yum update to see what would happen, and it's still working. Gave out the same kernel as above. This was installed UEFI on XCP-ng 8.3 which was a fresh install a few days ago from a nightly (near release?) ISO. It was installed to an NFS share, 2 cores and 4GB with an Intel i1000 interface. Xenserver tools 8.4.0-1 installed.
There are no extra packages installed yet, could this be a package conflict?
Anything else I can check to see why mine works and others are failing?
Hmm... I did the same and that VM died just like any other with the broken kernel.
Although I used the Cloud Image, so it was not a completely new install from scratch.
Maybe I'll build a new cloud image and see what happens then.
-
How many of these that are failing have been upgraded from a previous version? Could it be something left over from EL7 or early EL8?
My Alma 8.10 base install is still running fine, did a yum update to apply a few more things and reboot and still working with the same kernel version above. But again, this was a clean fresh install, not something that's been running for a while.
-
FYI, there's a patch submitted to linux-stable (6.6 and earlier) but not yet in a stable release:
https://lore.kernel.org/stable/20250411160833.12944-1-jason.andryuk@amd.com/
I guess we'll have to wait until this is picked up by Linux, then Red Hat will have to pick that as well.
-
@Greg_E said in XCP vm fails to boot after newest kernel applied.:
Ok, that might explain the difference.
Would a LEAPP from Rocky 8 up to Alma 9 be possible and solve the issue?
I did a migration (not LEAPP, but a migration script from Alma) from Rocky 8 to Alma 8, and it died. None of my Rocky 9's have had a problem so far, and of course they end up with a completely different kernel.
-
@Greg_E They have all been running for a while, but none were upgrades from RHEL 7 to RHEL 8. I don't do LEAPPs; they give me errors more often than they work. What I did do on one was replace the system VDI with a newer one, but that was Alma 8 to Alma 9 and that one has no issues.
-
@bberndt your Rocky/Alma 9 will be fine since it's a bug in the kernel for 8 only.
-
This is just a "me too" reply to indicate that I am also experiencing the boot failure immediately after upgrading AlmaLinux 8.10 to the latest kernel, 4.18.0-553.50.1. Hopefully, the kernel fix will get integrated soon into the affected and popular Red Hat 8 derivatives so that Alma and Rocky 8 et al. can continue to run on our favorite hypervisor. This is a bad one.
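
Since some same-kernel reports in this thread disagree, a small sketch like the one below could help confirm that a VM really is on the build named above before comparing results. It only checks for an exact match with 4.18.0-553.50.1; the thread does not state the full range of affected builds, so that is an assumption, and the helper names (`parse_kernel_release`, `is_reported_build`) are hypothetical.

```python
import platform
import re
from typing import Tuple

# Build reported as failing in this thread. Whether earlier or later
# builds are also affected is NOT established here (assumption).
REPORTED_BUILD = (4, 18, 0, 553, 50, 1)


def parse_kernel_release(release: str) -> Tuple[int, ...]:
    """Return the leading numeric fields of a kernel release string.

    Parsing stops at the first non-numeric field, such as "el8_10".
    """
    match = re.match(r"\d+(?:[.-]\d+)*", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return tuple(int(part) for part in re.split(r"[.-]", match.group(0)))


def is_reported_build(release: str) -> bool:
    """True if the release string matches the build reported in this thread."""
    return parse_kernel_release(release) == REPORTED_BUILD


if __name__ == "__main__":
    running = platform.release()
    status = "matches" if is_reported_build(running) else "does not match"
    print(f"Running kernel {running} {status} the reported build")
```

Run it inside the guest before rebooting into the new kernel, or from a rescue boot of the older kernel, and include the output when posting results.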