XCP-ng

    Wyse 5070 VMs won't boot after updating BIOS to 1.27

      t.chamberlain

      Wanted to share this with the group because it had me thrown for a loop for the better part of three days, & I know there are quite a few other homelab aficionados who have repurposed these little thin clients because they are just so small & versatile.

      TL;DR: a microcode update was the culprit. Hopefully Dell either pushes an update or it otherwise gets resolved, or...I'll be insecure, I guess.

      Fixed it by rolling back to version 2:2.1-26.xs28.1.xcpng8.2

      yum downgrade microcode_ctl-2.1-26.xs28.1.xcpng8.2

      yum downgrade microcode_ctl

      might work as well, but that really depends on your upgrade path.
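
For anyone scripting this across several hosts, the rollback can be sketched as below. This is only a sketch: `run`, `rollback_microcode`, `update_without_microcode` and the `DRY_RUN` variable are names made up for illustration, and holding the package back with yum's `--exclude` option is one possible stop-gap, not an official XCP-ng procedure.

```shell
#!/bin/sh
# Known-good build from this post, as version-release (the "2:" epoch is left out).
GOOD_VR="2.1-26.xs28.1.xcpng8.2"

# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Downgrade to the known-good build, using yum's name-version-release form.
rollback_microcode() {
    run yum downgrade "microcode_ctl-$GOOD_VR"
}

# Apply other updates while holding microcode_ctl back for this one run.
update_without_microcode() {
    run yum update --exclude=microcode_ctl
}
```

Setting `DRY_RUN=1` before calling either function only prints the command, which makes it easy to check what would run before touching a host.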

      Detail:

      After a recent update I discovered that none of my virtual machines would boot, save for one that would boot sporadically. It was accidentally set to PV drivers...not entirely sure how that happened, but it could also have just loaded that way when I imported the cloud image from Harvester.

      The symptoms:

      No keyboard or mouse input for Windows or Linux.

      Windows VMs would boot & do one of two things: the screen would show "Guest has not initialised the display (yet)", or they would get to the point where the drive was initialized & hang...they would basically max out what CPU they were allowed.

      Linux VMs would boot & always hang at the same place, right after GRUB loaded the initial ramdisk.

      I tried every BIOS setting I could, with no joy.

      Hope you other poor stuck souls stumble on this & save yourselves some time.

        olivierlambert Vates 🪐 Co-Founder CEO

        @stormi I'm not sure there are many things we can do about it. Maybe a proper, documented way to pin the latest working microcode?

          t.chamberlain @olivierlambert

          @olivierlambert Completely agree. Years-old consumer hardware that isn't on the HCL shouldn't weigh on overall development progress, & a pin is about the best we should hope for.

          It might just be time for some home labbers to start shopping & retire some gear!

          The XCP-NG community & platform are amazing...keep up the great work!

            stormi Vates 🪐 XCP-ng Team

            Making sure (how?) this topic is indexed by search engines would be a good starting point.

              stormi Vates 🪐 XCP-ng Team @stormi

              My search engine just made me discover this one: https://xcp-ng.org/forum/topic/8584/dell-wyse-fw-update-breaks-vm-booting-console-frozen-tianocore-edk2-related

                stormi Vates 🪐 XCP-ng Team

                So, could you get some logs from the boot failures?

                • head /proc/cpuinfo
                • full output of xl dmesg including booting the problem VM with the good and bad ucode in place
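
A minimal sketch of gathering both outputs in one go, assuming it is run as root in dom0 where `xl` is available. The helper names `log_name` and `collect_logs` and the `<host>_<label>_<kind>.txt` naming are illustrative only.

```shell
#!/bin/sh
# Build a file name of the form <host>_<label>_<kind>.txt.
log_name() {
    printf '%s_%s_%s.txt\n' "$1" "$2" "$3"
}

# Save the requested outputs into directory $1, tagged with label $2
# (for example "good_ucode" or "bad_ucode").
# Run it after trying to start the problem VM, so xl dmesg covers that attempt.
collect_logs() {
    out_dir="$1"
    label="$2"
    host="$(hostname)"
    head /proc/cpuinfo > "$out_dir/$(log_name "$host" "$label" cpuinfo)"
    xl dmesg > "$out_dir/$(log_name "$host" "$label" xl_dmesg)"
}
```

For example, `collect_logs /tmp bad_ucode` would be run once with the problematic microcode installed, and `collect_logs /tmp good_ucode` once after the downgrade.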
                  t.chamberlain

                  @stormi said in Wyse 5070 VMs won't boot after updating BIOS to 1.27:

                  head /proc/cpuinfo

                  Here is the info for my 3 nodes.

                  xenserver01.trc.blox_cpuinfo.txt
                  xenserver01.trc.blox_xl_dmesg.txt
                  xenserver02.trc.blox_cpuinfo.txt
                  xenserver02.trc.blox_xl_dmesg.txt
                  xenserver03.trc.blox_cpuinfo.txt
                  xenserver03.trc.blox_xl_dmesg.txt

                    stormi Vates 🪐 XCP-ng Team @t.chamberlain

                    @t-chamberlain Is the output of xl dmesg with, or without the bad microcode? Did a VM fail to boot since the hosts were booted?

                      t.chamberlain @stormi

                      @stormi

                      Those are from the hosts with the microcode downgraded. I can upgrade the packages & get those outputs as well if need be.

                        t.chamberlain @stormi

                        @stormi Sorry about that, I only answered part of the question. I did have a VM fail to boot, but it was because of an underlying issue with DRBD/XOSTOR & I don't believe it was related to the microcode.

                          stormi Vates 🪐 XCP-ng Team @t.chamberlain

                          @t-chamberlain Yes, please. One host, with one failed VM start, would be enough

                            t.chamberlain @stormi

                            @stormi

                            Here is the xl_dmesg output after the microcode update.

                            xenserver02.trc.blox_microcode_upd_xl_dmesg.txt

                              t.chamberlain @stormi

                              @stormi

                              You can see the vms just kind of hang here.

                              374c345a-8aef-4d42-8aaa-d997590415a6-image.png

                                stormi Vates 🪐 XCP-ng Team @t.chamberlain

                                @t-chamberlain Was the xl dmesg output produced before or after the VM hung?

                                  t.chamberlain @stormi

                                  @stormi At that point the VM dsk05002 had been hung for about 20 minutes. I don't know whether an alert event is ever generated when VMs are in this state.

                                  If need be I can boot a vm & just let it go.

                                    stormi Vates 🪐 XCP-ng Team @t.chamberlain

                                    @t-chamberlain No need to, unless you have some doubts. For now the conclusion is that xl dmesg doesn't output anything in particular when the VM hangs.

                                      stormi Vates 🪐 XCP-ng Team

                                      However, logs from a debug Xen might give more clues.

                                      If you can, please follow the instructions given by @andyhhp - a Xen developer - at https://xcp-ng.org/forum/post/74855.

                                        andyhhp Xen Guru @stormi

                                        @t-chamberlain In addition to the XTF testing, could you also please (with the bad microcode) try booting Xen with spec-ctrl=no-verw on the command line, and seeing whether that changes the behaviour of your regular VMs? Please capture xl dmesg from this run too.
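
For reference, a sketch of how this could be done on an XCP-ng host. It assumes the `/opt/xensource/libexec/xen-cmdline` helper is present for editing the Xen command line and that `xl info` reports a `xen_commandline` field; `has_xen_option` is a hypothetical helper used only to confirm, after the reboot, that the option took effect.

```shell
#!/bin/sh
# Return 0 if the command-line string $1 contains option $2 as a whole word.
has_xen_option() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Assumed steps on the host, run as root (left as comments on purpose):
#   /opt/xensource/libexec/xen-cmdline --set-xen spec-ctrl=no-verw
#   reboot
#   cmdline="$(xl info | sed -n 's/^xen_commandline *: *//p')"
#   has_xen_option "$cmdline" spec-ctrl=no-verw || echo "option not active"
#   (start the regular VMs, then capture the log)
#   xl dmesg > xl_dmesg_no_verw.txt
```

The whole-word match keeps the check from passing on a longer option that merely contains the same text.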

                                          stormi Vates 🪐 XCP-ng Team

                                          Doc about XTF testing: https://docs.xcp-ng.org/project/development-process/tests/#test-the-xen-hypervisor-itself

                                            andyhhp Xen Guru @stormi

                                            @t-chamberlain I've got a fix from Intel, and @stormi has packaged it.

                                            yum update microcode_ctl --enablerepo=xcp-ng-testing should get you microcode_ctl-2.1-26.xs29.2.xcpng8.2 which has the fixed microcode for this issue in it.
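
After the update, a quick check that the fixed build is the one installed might look like this. `is_fixed_build` is a hypothetical helper, and it matches only the exact build named above, so a later build would need the string adjusted.

```shell
#!/bin/sh
# Build named above as containing the fixed microcode.
FIXED_NVR="microcode_ctl-2.1-26.xs29.2.xcpng8.2"

# Return 0 if $1 (output of `rpm -q microcode_ctl`) is the fixed build,
# with or without a trailing architecture such as ".x86_64".
is_fixed_build() {
    case "$1" in
        "$FIXED_NVR"|"$FIXED_NVR".*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage on a host:
#   yum update microcode_ctl --enablerepo=xcp-ng-testing
#   is_fixed_build "$(rpm -q microcode_ctl)" && echo "fixed microcode package installed"
```

A reboot is presumably still needed afterwards so that the host loads the new microcode.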
