XCP-ng

    [RHEL kernel bug] XCP vm fails to boot after newest kernel applied.

    Scheduled Pinned Locked Moved Solved Compute
    49 Posts 12 Posters 1.8k Views 14 Watching
      edsilber

      We have a ticket from some users who were unable to boot their Xen VMs after updating to the newest Red Hat kernel (4.18.0-553.50.1.el8_10.x86_64). That same kernel seems to work OK on VMware. We've seen references to the problem in the wild: https://forums.almalinux.org/t/kernel-panic-4-18-0-553-50-1-el8-10-x86-64-on-xenserver-8-4/5754
      We're telling our users to hold off on that patch until we know more.
      The Xen security advisory is from back in December, but it looks like the implementation of the fix has broken something: https://xenbits.xen.org/xsa/advisory-466.html
      Any workarounds known?

        olivierlambert Vates 🪐 Co-Founder CEO

        Hi,

        For such issues, it's a lot better to create a support ticket. We could point you to get more detailed output (or serial) to understand exactly what's causing the boot to fail 🙂

          edsilber @olivierlambert

          @olivierlambert said in XCP vm fails to boot after newest kernel applied.:

          For such issues, it's a lot better to create a support ticket. We could point you to get more detailed output (or serial) to understand exactly what's causing the boot to fail

          Sure thing, we were just trying to triage this morning. "Yes this seems to be real & Don't update until we know more" bought us a little time until we could move to next steps. Ticket incoming...

            edsilber @edsilber

            @edsilber Ticket#7737661 submitted

              olivierlambert Vates 🪐 Co-Founder CEO

              Investigating, thanks for the report!

                yomeyo

                We have the same issue here with the latest kernel since installing it yesterday.
                4.18.0-553.50.1.el8_10.x86_64 does not boot while 4.18.0-553.47.1.el8_10.x86_64 boots fine.
                Is there a fix or workaround available yet?
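While waiting for a fix, one way to act on the "hold off" advice is to flag the reported kernel among the installed ones and pick the newest other kernel to boot. The following is only a minimal sketch: the helper names are hypothetical, the only kernel treated as bad is the one reported in this thread, and the ordering is a rough numeric comparison rather than a full RPM version comparison.

```python
import re
from typing import List, Optional

# Kernel reported in this thread as failing to boot on Xen guests.
# Assumption: no other release is affected.
REPORTED_BAD = "4.18.0-553.50.1.el8_10"


def strip_arch(release: str) -> str:
    """Drop a trailing architecture suffix such as '.x86_64'."""
    for arch in (".x86_64", ".aarch64"):
        if release.endswith(arch):
            return release[: -len(arch)]
    return release


def version_key(release: str) -> tuple:
    """Turn '4.18.0-553.47.1.el8_10' into a tuple of ints for rough ordering."""
    return tuple(int(n) for n in re.findall(r"\d+", strip_arch(release)))


def is_reported_bad(release: str) -> bool:
    """True if the release matches the kernel reported as broken."""
    return strip_arch(release) == REPORTED_BAD


def newest_usable(installed: List[str]) -> Optional[str]:
    """Return the newest installed kernel that is not the reported one."""
    candidates = [r for r in installed if not is_reported_bad(r)]
    return max(candidates, key=version_key, default=None)
```

On a running guest, `is_reported_bad(platform.release())` (after `import platform`) would tell you whether the currently booted kernel is the reported one.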

                  kpark

                  Ticket 7737647, same issue.

                    olivierlambert Vates 🪐 Co-Founder CEO

                    This is a dual bug:

                    1. In the upstream Linux PV driver
                    2. In how it was backported by Red Hat

                    Both upstream and Red Hat are aware. Avoid the update for now, until we have confirmation that it's fixed.

                    @stormi we need someone assigned to this to monitor and report any news and progress.

                      stormi Vates 🪐 XCP-ng Team

                      I also passed the information on to the AlmaLinux maintainers; they're now aware of the situation.

                        bberndt

                        Just popping in to say I also ran into this today.
                        Yesterday and today were the time I had set aside to update my VMs.

                        Last night they were updating to 4.18.0-553.47.1.el8_10.x86_64;
                        today, to 4.18.0-553.50.1.el8_10.x86_64.

                        While looking around, I thought this was related to support for x86-64-v2, or the lack thereof. The CPUs on the affected hosts are an E5-2620 v0 and an E5-2430 v2. Moving one of the problem VMs (all Rocky Linux 8, I believe) to a Xeon Silver 4210 did not help. So that turned out not to be the issue, but it is something to be aware of: RHEL 10 is dropping this v2 support, which would affect me.
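For anyone who wants to rule out the x86-64-v2 question on their own guests, here is a rough sketch that compares the `flags` line of `/proc/cpuinfo` against the features commonly listed for the x86-64-v2 level. The function names are hypothetical, and the flag set is an assumption based on the Linux spellings of those features (SSE3 appears as `pni`), so treat the result as a hint rather than a definitive answer.

```python
# Linux /proc/cpuinfo names for the features commonly listed for x86-64-v2.
# Assumption: this list mirrors the published level definition.
V2_FLAGS = {"cx16", "lahf_lm", "popcnt", "pni", "sse4_1", "sse4_2", "ssse3"}


def cpu_flags(cpuinfo_text: str) -> set:
    """Return the CPU flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            return set(line.split(":", 1)[1].split())
    return set()


def missing_v2_flags(cpuinfo_text: str) -> set:
    """Return the x86-64-v2 flags that the CPU does not advertise."""
    return V2_FLAGS - cpu_flags(cpuinfo_text)
```

A typical call would be `missing_v2_flags(open("/proc/cpuinfo").read())`; an empty set suggests the vCPU exposes everything in the list above.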

                          Greg_E @bberndt

                          @bberndt

                          Not to go off topic, but the lack of v2 and the lack of UEFI were reasons I moved my lab to newer hardware.

                          As of yesterday, the XenServer management agent was still at v8.4, so I'm guessing the drivers haven't changed either.

                          I have one Alma 9 VM in my vSphere lab; I'll have to update it on Monday and see if it breaks. Debian 12 on XCP-ng 8.3 was not affected as of yesterday.

                            bberndt @Greg_E

                            @Greg_E said in XCP vm fails to boot after newest kernel applied.:

                            @bberndt

                            Not to go off topic, but lack of v2 and lack of uefi were reasons I moved my lab to newer hardware.

                            @Greg_E you're not wrong, but I don't get to hold the checkbook.

                              Greg_E @bberndt

                              @bberndt

                              Wasn't a choice I wanted either, but decided I had to do it. Went with HP T740 for everything which is way less than I'd really like, but seems to be working so far for my lab.

                              I'll probably load up an Alma 9 (assuming it is affected too) later today. If anyone knows that this is limited to Alma 8, then I'll switch gears and use 8. Need a simple LAMP stack running to test something, and need some other VMs for backup testing too.

                                bberndt @Greg_E

                                @olivierlambert @kpark

                                Can you please let me know where this bug is filed (I see some numbers mentioned above), so I might keep an eye on it for my own curiosity. A PM is fine as well.

                                Thanks!

                                  Greg_E @bberndt

                                  @bberndt
                                  I just installed a fresh Alma 8 and ran yum update to see what would happen, and it's still working. The update pulled in the same kernel as above.

                                  Alma8.png

                                  This was installed in UEFI mode on XCP-ng 8.3, itself a fresh install a few days ago from a nightly (near release?) ISO. The VM was installed to an NFS share, with 2 cores, 4GB and an Intel i1000 interface. XenServer tools 8.4.0-1 are installed.

                                  There are no extra packages installed yet. Could this be a package conflict?

                                  Anything else I can check to see why mine works and others are failing?

                                    bberndt @Greg_E

                                    @Greg_E
                                    I checked a few of mine, and they all appear to be in BIOS mode, not UEFI.
                                    I've also had a couple of physical machines that updated OK. I know at least one of them was UEFI.
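When comparing working and failing guests, it helps to record the boot mode the same way on each one. On Linux, the directory `/sys/firmware/efi` exists only when the system was booted via UEFI, so a small check like the sketch below can be run inside each guest. The function name and the `sys_root` parameter are hypothetical, added only to make the check easy to test.

```python
import os


def boot_mode(sys_root: str = "/sys") -> str:
    """Return 'UEFI' if the EFI sysfs directory exists, else 'BIOS'."""
    efi_dir = os.path.join(sys_root, "firmware", "efi")
    return "UEFI" if os.path.isdir(efi_dir) else "BIOS"


if __name__ == "__main__":
    # Print the boot mode of the machine this script runs on.
    print(boot_mode())
```

This only reports how the guest itself booted; it says nothing about why one mode might be affected and the other not.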

                                      Greg_E @bberndt

                                      @bberndt That's why I mentioned UEFI; I'm wondering if legacy boot is part of the problem. I won't have time to fiddle with this for a while: I broke a couple of things today that I need to fix, and I need to set up GLPI for some testing.

                                        stormi Vates 🪐 XCP-ng Team @bberndt

                                        @bberndt https://bugzilla.redhat.com/show_bug.cgi?id=2331326

                                        There's also a KB now: https://access.redhat.com/solutions/7116307

                                          stormi Vates 🪐 XCP-ng Team

                                          CCing @anthonyper who is tasked with following this regression closely and letting us know about any progress.

                                            bberndt @Greg_E

                                            @Greg_E said in XCP vm fails to boot after newest kernel applied.:

                                            @bberndt that's why I mentioned uefi, wondering if legacy is part of the problem. I won't have time to fiddle with this for a while, broke a couple things today that I need to fix, and need to set up glpi for some testing.

                                            I made a new Rocky Linux 8 install on a lab host: UEFI boot mode and mostly defaults, on XCP-ng 8.2 on an E5-2620 v0 host.
                                            I used the guest tools from the Rocky and/or EPEL repository (I added the EPEL repo and then installed xe-guest-utilities).
                                            It does NOT boot after updating to the latest kernel.
