XCP-ng

    Slow boot on Rocky Linux 10 with the latest kernel

    • olivierlambertO Online
      olivierlambert Vates 🪐 Co-Founder CEO
      last edited by

      I can reproduce on Debian 13 with stock kernel and a Ryzen 5 7600.

      • olivierlambertO Online
        olivierlambert Vates 🪐 Co-Founder CEO
        last edited by

        I'm currently bisecting various kernel builds. I think I'm close to finding the culprit.

        • olivierlambertO Online
          olivierlambert Vates 🪐 Co-Founder CEO
          last edited by

          Luckily I have enough cores to build a kernel in 2 minutes, because I've already had to build a LOT of them to find the culprit 😅

          • olivierlambertO Online
            olivierlambert Vates 🪐 Co-Founder CEO
            last edited by olivierlambert

            • 6.12.90 -> bad
            • 6.12.45 -> bad
            • 6.12.22 -> bad
            • 6.12.11 -> bad
            • 6.12.5 -> bad

            BUT 6.12.2 is GOOD! 😓 Almost there!

            edit: now 6.12.3 is also good. Getting close…

            edit: 6.12.4 is good too, so the regression was introduced in 6.12.5. Investigating now.
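
The narrowing-down above is a binary search over point releases. A minimal sketch of that idea, not taken from the original post: `first_bad` and `is_bad` are hypothetical names, and `is_bad` simply replays the results reported in this thread instead of building and booting a kernel.

```shell
#!/bin/sh
# Sketch only: binary search for the first "bad" release in an ordered list.

# Hypothetical stand-in for "build this kernel, boot the VM, time the boot".
# It replays the thread's results: 6.12.2 to 6.12.4 good, 6.12.5 and later bad.
is_bad() {
    case "$1" in
        6.12.2|6.12.3|6.12.4) return 1 ;;  # good: normal boot time
        *) return 0 ;;                      # bad: slow boot
    esac
}

# first_bad VERSION...
# Versions must be sorted oldest to newest, with the first one known good
# and the last one known bad.
first_bad() {
    lo=1     # index of a known-good version
    hi=$#    # index of a known-bad version
    while [ $((hi - lo)) -gt 1 ]; do
        mid=$(( (lo + hi) / 2 ))
        eval "candidate=\${$mid}"   # fetch the mid-th argument
        if is_bad "$candidate"; then
            hi=$mid
        else
            lo=$mid
        fi
    done
    eval "echo \${$hi}"
}

first_bad 6.12.2 6.12.3 6.12.4 6.12.5 6.12.11 6.12.22 6.12.45 6.12.90
# prints 6.12.5
```

With eight candidate releases this needs three test boots. Narrowing further, from a release down to a single commit, is the kind of job `git bisect` automates in a kernel git tree.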

            • olivierlambertO Online
              olivierlambert Vates 🪐 Co-Founder CEO
              last edited by olivierlambert

              Found the culprit, made & tested a patch that works.

              Recap at https://notes.vates.tech/share/v5jtq0iytw/p/slow-hvm-boot-on-linux-6-12-5wrvOvZKJ7

              I will let the rest of my team try to get it fixed upstream.

              • M Online
                MajorP93 @olivierlambert
                last edited by

                @olivierlambert Awesome work on tracking down the issue!
                Very nice, detailed, technical writeup.

                • olivierlambertO Online
                  olivierlambert Vates 🪐 Co-Founder CEO
                  last edited by

                  Ping @Team-Hypervisor-Kernel for reference.

                  • TeddyAstieT Offline
                    TeddyAstie Vates 🪐 XCP-ng Team Xen Guru
                    last edited by

                    @majorp93 @henri9813 @acebmxer
                    Do you observe the same behavior after setting this for the VM?

                    xe vm-param-add uuid=$UUID param-name=platform tsc_mode=2
                    xe vm-param-add uuid=$UUID param-name=platform nomigrate=true
                    

                    (Beware: you lose live migration support while these are set. You can undo the changes with the matching vm-param-remove, e.g. xe vm-param-remove uuid=$UUID param-name=platform param-key=nomigrate)
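
For convenience, the apply and revert commands can be kept side by side. This is a dry-run sketch under a few assumptions: `workaround_cmds` is a hypothetical helper name (not part of `xe`), it only prints the commands so they can be reviewed before being run on the host, and the `tsc_mode` removal is assumed to mirror the `nomigrate` example above.

```shell
#!/bin/sh
# Dry-run sketch: prints the xe commands instead of executing them.

workaround_cmds() {
    action=$1   # "apply" or "revert"
    uuid=$2     # UUID of the affected VM
    case "$action" in
        apply)
            echo "xe vm-param-add uuid=$uuid param-name=platform tsc_mode=2"
            echo "xe vm-param-add uuid=$uuid param-name=platform nomigrate=true"
            ;;
        revert)
            # Assumed to mirror the vm-param-remove example from the post.
            echo "xe vm-param-remove uuid=$uuid param-name=platform param-key=tsc_mode"
            echo "xe vm-param-remove uuid=$uuid param-name=platform param-key=nomigrate"
            ;;
        *)
            echo "usage: workaround_cmds apply|revert VM_UUID" >&2
            return 1
            ;;
    esac
}

# Falls back to a placeholder when UUID is not set in the environment.
workaround_cmds apply "${UUID:-<vm-uuid>}"
workaround_cmds revert "${UUID:-<vm-uuid>}"
```

Running `xe vm-param-get uuid=$UUID param-name=platform` afterwards shows the platform keys currently set on the VM.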

                    • M Online
                      MajorP93 @TeddyAstie
                      last edited by MajorP93

                      @TeddyAstie said:

                      param-name=platform nomigrate=true

                      Hi @teddyastie, thanks for working on this.

                      Per policy, I am not allowed to test these parameters in production, so I created a small test setup to try your settings.

                      I deployed a Debian 13 VM via Cloud-Init on an XCP-ng test host using the official Debian 13 cloud image.

                      After deploying the VM, I hit the slow boot issue.

                      After shutting the VM down, applying the settings you sent, and starting it again, I can say that you are on the right track!

                      In my case the boot time is completely normal now, on par with Debian 13 VMs that boot via BIOS instead of UEFI.

                      Since this workaround disables live migration, it is not an option for production environments, but it is good to have one available anyway!

                      Do you think it is possible to fix this on hypervisor level while still having live migration etc. enabled or do we have to wait for an upstream fix within Linux kernel tree?

                      • TeddyAstieT Offline
                        TeddyAstie Vates 🪐 XCP-ng Team Xen Guru @MajorP93
                        last edited by

                        @MajorP93 said:
                        Do you think it is possible to fix this on hypervisor level while still having live migration etc. enabled or do we have to wait for an upstream fix within Linux kernel tree?

                        Yes, it's possible to fix it at the hypervisor level (by exposing an invariant TSC to the guest), but that still needs quite a bit of work. Hopefully, a Linux upstream fix for the underlying bug will land at some point.
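
To see what a guest currently gets, one can check the CPU flags from inside the VM. A rough sketch, assuming the guest kernel lists the `constant_tsc` and `nonstop_tsc` flags in `/proc/cpuinfo` when the virtual CPU advertises an invariant TSC; `has_invariant_tsc` is a hypothetical helper name, and this is not a check suggested in the thread.

```shell
#!/bin/sh
# Sketch: report whether the guest kernel sees an invariant TSC.

has_invariant_tsc() {
    cpuinfo=${1:-/proc/cpuinfo}
    # -w matches whole words only, so a bare "tsc" flag does not count.
    grep -qw nonstop_tsc "$cpuinfo" 2>/dev/null &&
        grep -qw constant_tsc "$cpuinfo" 2>/dev/null
}

if has_invariant_tsc; then
    echo "invariant TSC visible in guest"
else
    echo "invariant TSC not visible in guest"
fi

# Clocksource the guest kernel ended up using, if sysfs exposes it.
cs=/sys/devices/system/clocksource/clocksource0/current_clocksource
if [ -r "$cs" ]; then
    cat "$cs"
fi
```

Comparing the output before and after a platform change gives a quick view of what the guest actually sees.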

                        • olivierlambertO Online
                          olivierlambert Vates 🪐 Co-Founder CEO
                          last edited by

                          We are pursuing two paths in parallel:

                          1. We are doing our best to get the fix upstream in Linux; it's a regression, after all. We know how to fix it, so hopefully this will be resolved quickly. Then we'll have to wait for the kernel update to reach the main distros.
                          2. Invariant TSC in Xen is another way to fix it, and something we want to improve anyway. But as Teddy said, it's more work and will take more time.
