We see the exact same issue with CloudLinux OS 8, seemingly at random after live migration. This has been going on for years. It seems to happen far less often now with shared storage.

My theory is that the kernel and/or PVE module somehow doesn't handle the freeze during live migration well: the longer the freeze, the higher the risk of this happening.

VMs crash a random amount of time after live migration, never immediately. It can be hours or even days, which makes it hard to diagnose. There is no crash dump, nothing, just 100% CPU on all cores and a frozen console.

One consistent thing we see, almost every time, is that top and other tools stop working: they freeze in a state where no CPU load etc. is reported, even though there is load on the server.
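In case it helps anyone catch this early, here is a rough sketch of a check for that symptom. It assumes (not confirmed on our side) that the frozen top output corresponds to the aggregate `cpu` counters in `/proc/stat` no longer advancing inside the guest. The function names and the 5-second interval are made up for illustration, and if the guest's clock itself is stuck, the sleep may not return at all.

```python
import time


def read_cpu_counters(stat_text):
    """Return the aggregate 'cpu' jiffies counters from /proc/stat content."""
    for line in stat_text.splitlines():
        fields = line.split()
        # The aggregate line starts with exactly "cpu"; per-core lines are "cpu0", "cpu1", ...
        if fields and fields[0] == "cpu":
            return [int(value) for value in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def counters_advanced(before, after):
    """True if the total jiffies count grew between two samples."""
    return sum(after) > sum(before)


def sample(path="/proc/stat"):
    """Read the current aggregate CPU counters from the given stat file."""
    with open(path) as stat_file:
        return read_cpu_counters(stat_file.read())


if __name__ == "__main__":
    first = sample()
    time.sleep(5)  # hypothetical interval; any few seconds should do
    second = sample()
    if counters_advanced(first, second):
        print("CPU accounting is advancing")
    else:
        # Assumption: this matches the state where top shows no load
        print("CPU accounting looks frozen")
```

Run from cron or a monitoring agent inside the guest, this might flag an affected VM before it locks up completely, but treat it as a starting point, not a proven detector.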

We've been going back and forth with CloudLinux support, and they made some changes to the tuned profile regarding disk buffers/cache. That made things a bit more stable, but the problem is not 100% gone.

We don't see the same error in AlmaLinux 9 and CloudLinux OS 9.

The busier the VM, the higher the chance of this happening. Uptime may be a factor, too.