If it's possible to over-subscribe vCPUs, what do vCPUs really represent? A weighted share of the CPU cycles that will be allocated to the guest?
e.g., if a host has eight CPU cores with hyperthreading enabled and four guests are each allocated 16 vCPUs, what benefit will those four guests see over what they would experience if they had each been allocated 2 vCPUs?
olivierlambert Vates 🪐 Co-Founder🦸 CEO 🧑💼
From Xen's point of view, a VM is like a process. Yes, like a regular process in your "regular" OS. So there's no problem having more vCPUs in total than existing CPUs (including threads). There are some "rules", however:
- one VM can't have more vCPUs than the physical machine (mostly for security reasons)
- when you have more vCPUs than actual CPUs, Xen will schedule each vCPU just like a regular OS schedules a process/thread.
Let's imagine you have 8 vCPUs on each of 2 different VMs (= 16 vCPUs in total) while having only 8 cores on your physical host. Let's also imagine those 2 VMs are running all their vCPUs at 100%.
The end result: the load is split equally between your VMs, so each one gets 50% of the physical CPU time. It's as simple as that. Obviously it's just an example; in the real world you also have dom0 vCPU usage, but you get the concept.
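The arithmetic above can be sketched in a few lines. This is a simplified model (equal scheduler weights, no dom0, all VMs fully busy), not the real credit-scheduler logic:

```python
# Simplified fair-share model: when every VM is fully busy and all
# weights are equal, each VM's share of physical CPU time is just
# its vCPU count over the total vCPU count on the host.

def cpu_share(vm_vcpus: int, total_vcpus: int) -> float:
    """Fraction of total physical CPU time one fully-busy VM receives."""
    return vm_vcpus / total_vcpus

# The example from the post: 2 VMs x 8 vCPUs on an 8-core host.
share = cpu_share(8, 16)
print(share)  # 0.5 -> each VM gets 50% of the physical CPU time
```

With equal weights the number of physical cores cancels out: it only determines how much absolute CPU time that 50% represents.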
Also, in real life it's not common for all your VMs to be using 100% CPU, so over-provisioning is OK. For most loads, two vCPUs per physical CPU works fine, but YMMV. If you truly want dedicated CPU time, allocate fewer vCPUs per CPU.
edit: it's a good idea to tune that after a few weeks of production and monitoring. In the Xen Orchestra dashboard, you can easily see the total number of vCPUs per pool versus the number of physical cores. If you are at 2:1 but you don't have crazy VM load, it's fine to go even further.
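The pool-level check described above is just a ratio. A minimal sketch, with made-up numbers standing in for what the dashboard would report:

```python
# Hypothetical over-provisioning check, mirroring the pool-level view
# described above. The vCPU and core counts here are invented examples,
# not values read from any real API.

def overprovision_ratio(total_vcpus: int, physical_cores: int) -> float:
    """vCPU:pCPU ratio for a pool; 2.0 matches the 2:1 rule of thumb."""
    return total_vcpus / physical_cores

ratio = overprovision_ratio(32, 16)
print(ratio)  # 2.0 -> at the common 2:1 rule of thumb
```

If the ratio sits at 2:1 but monitoring shows low actual CPU usage, the advice above is that you still have headroom to add VMs.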
tjkreidl Ambassador 📣
@epretorious I would add that you have to be careful about over-provisioning once NUMA/vNUMA kicks in, i.e. when you allocate more vCPUs (and associated memory) to a VM than one bank of physical CPUs and its directly attached memory can provide. Assume, for the sake of argument, you have two banks of physical CPUs, each with one of two banks of memory directly accessible to it: things get inefficient, because a CPU may have to reach across to the other bank of memory to access data, and there is additional overhead involved. See for example this article and the two preceding it:
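The NUMA concern above boils down to whether a VM fits inside a single node. A toy illustration, assuming two hypothetical nodes of 8 cores each (real topologies vary; check your host with a tool like `xl info -n` on Xen):

```python
# Toy NUMA-fit check: a VM whose vCPU allocation exceeds one node's
# core count must span nodes, so some memory accesses become remote
# (cross-bank) and carry extra latency. Node sizes are assumptions
# for illustration only.

CORES_PER_NODE = 8  # hypothetical: two banks of 8 cores each

def fits_on_one_node(vm_vcpus: int, cores_per_node: int = CORES_PER_NODE) -> bool:
    """True if the VM's vCPUs can all stay on a single NUMA node."""
    return vm_vcpus <= cores_per_node

print(fits_on_one_node(8))   # True  -> can stay node-local
print(fits_on_one_node(10))  # False -> must span both banks
```

Keeping VMs at or below one node's size is one way to sidestep the cross-bank overhead tjkreidl describes.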