Slow boot on Rocky Linux 10 with the latest kernel
-
I can reproduce on Debian 13 with stock kernel and a Ryzen 5 7600.
-
I'm currently bisecting various kernel builds, and I think I'm close to finding the culprit.
-
Luckily I have enough cores to build a kernel in 2 minutes, because I've already had to build a LOT of them to narrow this down.

-
- 6.12.90 -> bad
- 6.12.45 -> bad
- 6.12.22 -> bad
- 6.12.11 -> bad
- 6.12.5 -> bad
BUT 6.12.2 is GOOD!
Almost there!
edit: now 6.12.3 is also good. Getting close…
edit: since 6.12.4 is good, the issue was introduced in 6.12.5. Investigating now.
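For readers who want to repeat this kind of narrowing, here is a minimal bash sketch of a binary search over an ordered list of kernel versions. It is not the script used above: `first_bad` and the check command passed to it are hypothetical names, and the sketch assumes that the last version in the list is bad and that every version after the first bad one is bad too.

```shell
#!/usr/bin/env bash
# Hypothetical helper, for illustration only.
# Usage: first_bad CHECK_CMD VERSION...
# CHECK_CMD is called with one version and must exit 0 if that version
# is "bad" (slow boot) and non-zero if it is "good".
first_bad() {
  local check="$1"; shift
  local versions=("$@")
  local lo=0 hi=$(( ${#versions[@]} - 1 )) mid
  # Assumption: versions[hi] is bad; everything below lo is known good.
  while (( lo < hi )); do
    mid=$(( (lo + hi) / 2 ))
    if "$check" "${versions[mid]}"; then
      hi=$mid             # bad: the first bad version is at mid or earlier
    else
      lo=$(( mid + 1 ))   # good: the first bad version is later
    fi
  done
  echo "${versions[lo]}"
}
```

In practice the check command would build and boot the given kernel and time the boot. Once the regression is pinned to a single stable release, running `git bisect start v6.12.5 v6.12.4` in a linux-stable checkout (bad tag first, then good) is the usual way to get down to one commit.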
-
Found the culprit, made & tested a patch that works.
Recap at https://notes.vates.tech/share/v5jtq0iytw/p/slow-hvm-boot-on-linux-6-12-5wrvOvZKJ7
I will let the rest of my team try to get it fixed upstream.
-
@olivierlambert Awesome work on tracking down the issue!
Very nice, detailed, technical writeup.
-
Ping @Team-Hypervisor-Kernel for reference.
-
@majorp93 @henri9813 @acebmxer
Do you observe the same behavior after setting this for the VM?
xe vm-param-add uuid=$UUID param-name=platform tsc_mode=2
xe vm-param-add uuid=$UUID param-name=platform nomigrate=true
(Beware: you lose live migration support by doing this. You can undo these changes with the matching vm-param-remove, for example
xe vm-param-remove uuid=$UUID param-name=platform param-key=nomigrate)
-
Hi @teddyastie , thanks for working on this.
As per policy, I am not allowed to test these parameters in production, which is why I had to create a small test setup to try your settings.
I deployed a Debian 13 VM via Cloud-Init on an XCP-ng test host using the official Debian 13 cloud image.
After deploying the VM, I saw the slow boot issue.
After shutting the VM down, applying the settings that you just sent and starting it again, I can say that you are on the right track!
In my case the boot time is completely normal now and on par with Debian 13 VMs that use BIOS instead of UEFI (for booting).
As this is a workaround and disables live migration, it is not an option for production environments, but it is good to have a workaround available anyway, for sure!
Do you think it is possible to fix this on hypervisor level while still having live migration etc. enabled or do we have to wait for an upstream fix within Linux kernel tree?
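
For anyone repeating this test, the steps above (shut down, apply the two platform keys, start again) could be scripted roughly as follows. This is only a sketch: the function names and the `XE` variable are made up for illustration, removing `tsc_mode` with `param-key=tsc_mode` is assumed by analogy with the `nomigrate` example in this thread, and `xe vm-shutdown` / `xe vm-start` are the standard xe commands for stopping and starting a VM.

```shell
#!/usr/bin/env bash
# XE is a hypothetical override: set XE="echo xe" to print the commands
# instead of running them (dry run).
XE="${XE:-xe}"

# Apply the workaround from this thread. This disables live migration.
apply_tsc_workaround() {
  local uuid="$1"
  $XE vm-param-add uuid="$uuid" param-name=platform tsc_mode=2
  $XE vm-param-add uuid="$uuid" param-name=platform nomigrate=true
}

# Undo the workaround. The tsc_mode removal is assumed by analogy with
# the nomigrate removal quoted in the thread.
revert_tsc_workaround() {
  local uuid="$1"
  $XE vm-param-remove uuid="$uuid" param-name=platform param-key=tsc_mode
  $XE vm-param-remove uuid="$uuid" param-name=platform param-key=nomigrate
}

# Shut the VM down, apply the workaround, and start it again.
retest_with_workaround() {
  local uuid="$1"
  $XE vm-shutdown uuid="$uuid"
  apply_tsc_workaround "$uuid"
  $XE vm-start uuid="$uuid"
}
```

Running `XE="echo xe" retest_with_workaround $UUID` first shows exactly which commands would be executed, which is a cheap sanity check before touching a real VM.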
-
@MajorP93 said:
Do you think it is possible to fix this on hypervisor level while still having live migration etc. enabled or do we have to wait for an upstream fix within Linux kernel tree?
Yes, it's possible to fix it on the hypervisor level (Invariant TSC in guest), but it's quite a bit of work that still needs to be done. A Linux upstream fix for the underlying bug should come at some point, hopefully.
-
We are pursuing two paths in parallel:
- We are doing our best to get the fix accepted upstream in Linux; it's a regression, after all. We know how to fix it, so hopefully this will be fixed quickly. Then we'll have to wait for a Linux kernel update in the main distros.
- Invariant TSC in Xen is also a way to fix it, and we want to improve that anyway. But as Teddy said, it's more work and it will take more time.