XCP-ng

    Server 2016 BSOD Loop

    Compute

      guiltykeyboard

      On 11/12, one of my Server 2016 VMs got stuck in a boot loop.

      It starts and then hits a BSOD with the stop code KMODE_EXCEPTION_NOT_HANDLED.

      Taking the VHD from our storage server and copying it to a Hyper-V server allows it to boot, so the issue seems to be with XCP-ng.

      My two hosts run XCP-ng 8.3.0 with the latest patches applied. I've had this VM for a few years, and it seemed fine after the migration to 8.3.0, until it wasn't. None of our other VMs have this issue.

      This VM has Secure Boot disabled and does not have a vTPM. There are no passthrough PCI or USB devices.

        guiltykeyboard

        99dea17e-ae5a-4e4e-a2a9-5f7eab98904f-image.png

          guiltykeyboard

          Additionally, I just did a fresh install of Server 2016 in a new VM and hit the exact same stop code as soon as the OS install finished and the VM rebooted for the first time, with no drivers or anything else installed.

          This seems to be an issue with Server 2016 VMs.

            dinhngtu Vates 🪐 XCP-ng Team @guiltykeyboard

            Could you describe the hardware being used?

              guiltykeyboard @dinhngtu

              @dinhngtu Dell R660 with 2x Intel(R) Xeon(R) Gold 5416S

                guiltykeyboard @guiltykeyboard

                @guiltykeyboard Both of my hosts are identical.

                  dinhngtu Vates 🪐 XCP-ng Team

                  I think it's a compatibility problem with newer Intel CPUs. We have some experimental patches available. Do you have an identical host for testing purposes? Could you install the following RPMs and see if it helps?

                  https://nextcloud.vates.fr/index.php/s/MYZpPgQCRwWnDYq

                  If you don't want to install test RPMs, you could try the following command instead:

                  xe vm-param-add uuid=... param-name=platform msr-relaxed=true
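
A minimal sketch of applying that flag and reading it back, assuming a POSIX-style shell on the host. The `run` and `set_msr_relaxed` helpers and the `VM_UUID` variable are hypothetical names used only for illustration, and the UUID stays a placeholder. By default the script only prints the `xe` commands so they can be reviewed first. The assumption here is that a platform change applies on the next full VM start (shut down, then start) rather than on a reboot from inside the guest.

```shell
#!/bin/sh
# Sketch only: prints the xe commands by default.
# Set DRY_RUN=0 on the host to actually execute them.
DRY_RUN="${DRY_RUN:-1}"

# Placeholder; replace with the UUID of the affected VM.
VM_UUID="${VM_UUID:-REPLACE_WITH_VM_UUID}"

# Hypothetical helper: echo the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: add the platform key from the thread, then read it back.
set_msr_relaxed() {
    vm_uuid="$1"
    run xe vm-param-add uuid="$vm_uuid" param-name=platform msr-relaxed=true
    run xe vm-param-get uuid="$vm_uuid" param-name=platform param-key=msr-relaxed
}

set_msr_relaxed "$VM_UUID"
```

Running it with `DRY_RUN=0 VM_UUID=<uuid of the VM>` on the host would execute both commands; the second one should report the stored value if the key was added.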
                  
                    guiltykeyboard @dinhngtu

                    @dinhngtu I can run all of my VMs on one host while testing on the other.

                    Which of the two methods should I try first?

                      dinhngtu Vates 🪐 XCP-ng Team

                      Please try the RPMs first; they contain the fixes that we'd like to integrate with Xen upstream. Also, please note that you need to reboot the host after installing these patches.

                        guiltykeyboard @dinhngtu

                        @dinhngtu Is there any special process for installing these, or is it just a matter of installing all of the RPMs one at a time?

                          dinhngtu Vates 🪐 XCP-ng Team

                          You can use the following command:

                          yum localinstall xen-{dom0-libs,dom0-tools,hypervisor,libs,tools}-4.17.5-4.0.lbr.19.xcpng8.3.x86_64.rpm
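
That command relies on shell brace expansion, so all five packages are passed to a single yum invocation instead of being installed one at a time. A minimal sketch, assuming the RPMs were downloaded into the current directory: it lists the five file names the pattern expands to and reports any that are missing before yum is run. The `rpm_files` and `missing_rpms` helper names are hypothetical, and nothing here installs anything.

```shell
#!/bin/sh
# Version string copied from the yum command above.
VERSION="4.17.5-4.0.lbr.19.xcpng8.3.x86_64"

# Print the five file names that the brace pattern expands to.
rpm_files() {
    for pkg in dom0-libs dom0-tools hypervisor libs tools; do
        echo "xen-${pkg}-${VERSION}.rpm"
    done
}

# Print only the files that are not present in the current directory.
missing_rpms() {
    for f in $(rpm_files); do
        [ -f "$f" ] || echo "$f"
    done
}

missing="$(missing_rpms)"
if [ -n "$missing" ]; then
    echo "Missing RPMs:"
    echo "$missing"
else
    echo "All five RPMs found; ready for yum localinstall."
fi
```

As noted earlier in the thread, the host needs a reboot after the packages are installed.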
                          
                            guiltykeyboard @dinhngtu

                            @dinhngtu said in Server 2016 BSOD Loop:

                            yum localinstall xen-{dom0-libs,dom0-tools,hypervisor,libs,tools}-4.17.5-4.0.lbr.19.xcpng8.3.x86_64.rpm

                            Installing those RPMs, restarting the host, and then starting the 2016 VM on that host did not fix the issue. The result was the same.

                            One difference: this time the VM displayed "preparing devices" during startup before it hit the BSOD.

                              dinhngtu Vates 🪐 XCP-ng Team

                              Could you try it again with a fresh 2016 VM? I'll try to reproduce the issue.

                                guiltykeyboard @dinhngtu

                                @dinhngtu Same problem on a fresh install.

                                  guiltykeyboard @guiltykeyboard

                                  @guiltykeyboard This was not an issue on 8.2.*, and it wasn't an issue when we originally upgraded to 8.3 either.

                                  I think the latest round of patches might have introduced the problem.

                                    dinhngtu Vates 🪐 XCP-ng Team

                                    Could you collect a crash dump using a boot CD?
