XCP-ng

    Question on CPU masking with qemu and xen

    • brodiecyber

      Hello all,

      I have a question, but I don't know enough about the Xen hypervisor to answer it myself.
      It's simple: why can't XCP-ng mask the type of CPU that the VMs are running on?
      Take KVM on Proxmox, for example: you can set host pass-through to expose all CPU functions to the VM, or mask it as an Intel Xeon Scalable or AMD EPYC.
      I see that host pass-through is what XCP-ng does, but I was wondering why.

      Doesn't masking the CPU make the VM more portable? The profile could be set to, for example, QEMU-xen or something similar, making it possible to, say, live migrate a VM from AMD to Intel in different pools or in a heterogeneous cluster.

      One last thing: is there no way to emulate other architectures supported by QEMU, such as arm64, or does the Xen hypervisor still need more features built out on Arm before that is possible? I'm asking because my use case is that I can't afford a Raspberry Pi but have a lot of compute on x86, so why not emulate arm64 on the hypervisor rather than in a VM?

      Just asking; I would appreciate a response. Thanks.

      • TeddyAstie Vates 🪐 XCP-ng Team Xen Guru @brodiecyber

        @brodiecyber said in Question on CPU masking with qemu and xen:

        Hello all,

        I have a question, but I don't know enough about the Xen hypervisor to answer it myself.
        It's simple: why can't XCP-ng mask the type of CPU that the VMs are running on?
        Take KVM on Proxmox, for example: you can set host pass-through to expose all CPU functions to the VM, or mask it as an Intel Xeon Scalable or AMD EPYC.
        I see that host pass-through is what XCP-ng does, but I was wondering why.

        It doesn't exactly work this way.
        With XCP-ng, this works at the pool level: it basically takes the lowest common CPU feature level across the hosts in the pool and sets all VMs to use it. That way you should always be able to migrate to another machine in the same pool, without having to manually set the feature levels to improve performance.
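To make the pool-level idea concrete, here is a minimal Python sketch. It treats each host's CPU features as a plain set of strings and takes the intersection; the host names, feature names, and the `pool_level` helper are all made up for illustration, and this is not how XAPI is actually implemented (the real calculation is not a pure set intersection).

```python
from functools import reduce


def pool_level(host_features):
    """Return the features common to every host in the pool.

    host_features maps a (hypothetical) host name to its set of
    CPU feature names. An empty pool yields an empty feature set.
    """
    sets = [frozenset(features) for features in host_features.values()]
    if not sets:
        return frozenset()
    return reduce(frozenset.intersection, sets)


# Hypothetical three-host pool; feature lists are illustrative only.
hosts = {
    "host-a": {"sse4_2", "avx", "avx2"},
    "host-b": {"sse4_2", "avx", "avx2", "avx512f"},
    "host-c": {"sse4_2", "avx"},
}

level = pool_level(hosts)
print(sorted(level))  # ['avx', 'sse4_2']
```

In this toy pool, every VM would be started with only `sse4_2` and `avx`, because `host-c` offers nothing more.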

        Doesn't masking the CPU make the VM more portable? The profile could be set to, for example, QEMU-xen or something similar, making it possible to, say, live migrate a VM from AMD to Intel in different pools or in a heterogeneous cluster.

        In practice, you can't reliably live migrate between Intel and AMD CPUs (even with QEMU) due to various subtle architectural differences. It may somewhat work, with luck, if the guest is using an ancient processing model (like 32-bit), but that's not really something we support.

        One last thing: is there no way to emulate other architectures supported by QEMU, such as arm64, or does the Xen hypervisor still need more features built out on Arm before that is possible? I'm asking because my use case is that I can't afford a Raspberry Pi but have a lot of compute on x86, so why not emulate arm64 on the hypervisor rather than in a VM?

        Emulating a completely different CPU architecture, while technically possible, is a completely different story.

        • olivierlambert Vates 🪐 Co-Founder CEO

          For migration between vendors (AMD/Intel), I suggest the warm migration feature of XOA, which solves the problem with only a few minutes of interruption.

          • cg

            As I can't find it anywhere:
            What exactly happens on join/leave?
            Does the masking change at runtime and affect every newly started VM, or does the pool need to be rebooted after an older server has been removed, to "unlock" all features?

            I have a pool with HPE DL380 Gen 9 and Gen 10 servers, where the Gen 9 will be replaced with Gen 12.

            So I assume it makes sense to remove the Gen 9 first, then add the Gen 12.
            But how exactly does the masking process work?
            Should I reboot the Gen 10 first (shutting down all VMs), or add the Gen 12 and move VMs over... or just shut down and start all currently running VMs?

            In the early days (~XenServer 6) it had to be done manually by getting the CPU features, calculating the common ones, and passing the result as a parameter to the join command.
            What I could not find, though, is how exactly it works on removal and what the best practice is.

            (also: this should be added to the documentation)

            • olivierlambert Vates 🪐 Co-Founder CEO

              It cannot be done at runtime, because if it were, VMs would crash when losing some CPU features. A CPU mask is applied on domain creation.

              Shut down and then start all VMs (not just reboot).

              Could be confirmed by @andyhhp ideally 😄

              • andyhhp Xen Guru @cg

                @cg said in Question on CPU masking with qemu and xen:

                In the early days (~XenServer 6) it had to be done manually

                Yes, and I rewrote it entirely in XenServer 7 because doing it manually was absurd.

                tl;dr, for your case:

                1. Add the Gen 12s to the pool
                2. Migrate remaining VMs off the Gen 9s
                  2a. Any VMs which can't migrate for feature reasons, reboot first then migrate
                3. Remove the Gen 9s from the pool
                4. Reboot all VMs

                The longer answer:

                When Xen boots, it calculates what it can offer to guests, feature-wise. This takes into account the CPU, firmware settings, errata, command line parameters, etc. This feature information is made available to the toolstack/xapi to work with. On a per-VM basis, Xen knows the features that the guest was given. Different VMs can have different configurations, even if they're running on the same host.

                An individual VM's features are fixed during its uptime (including across migration). The only point at which the features can safely change is when the VM reboots. All the migration safety checks are performed as "is the featureset this VM saw at boot compatible with the destination host it's trying to run on".
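As a rough illustration of that safety check, the sketch below models it as "every feature the VM saw at boot must be offered by the destination". The `check_migration` helper and the feature names are hypothetical, and the real check in Xen/XAPI is more nuanced than a plain set comparison.

```python
def check_migration(vm_boot_features, dest_host_features):
    """Return (ok, missing).

    ok is True when the destination host offers every feature the VM
    was given at boot; missing lists the features it lacks.
    """
    missing = set(vm_boot_features) - set(dest_host_features)
    return (len(missing) == 0, sorted(missing))


# Illustrative feature sets only, not real CPUID data.
vm_features = {"sse4_2", "avx", "avx2"}
older_host = {"sse4_2", "avx"}
newer_host = {"sse4_2", "avx", "avx2", "avx512f"}

print(check_migration(vm_features, newer_host))  # (True, [])
print(check_migration(vm_features, older_host))  # (False, ['avx2'])
```

Note that the check uses the features recorded at boot, not whatever the current host happens to offer, which is why a running VM never silently gains or loses features.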

                At a pool level, Xapi always dynamically calculates the "pool level", i.e. the common subset[*] of features that will allow a VM to migrate to anywhere in the pool. Importantly, this is recalculated as pool members join and leave the pool, including a pool member rebooting (where it leaves temporarily, then rejoins; feature information may change after the reboot, e.g. from changing a firmware or command line setting).

                When a VM boots, it gets given the "pool level" by default, meaning that it should be able to migrate anywhere in the pool as the pool existed at the point of booting the VM. If you subsequently add a new host to the pool, the pool level may drop and already-running VMs will be unable to migrate to this new host, but will be able to migrate to other pool members.

                As you remove members from the pool, the pool level may rise, e.g. if you removed the only host that was lacking a certain feature. The final reboot in your case is to allow the VMs to start using the Gen 10 feature baseline, now that it's not "levelled down" for compatibility with the Gen 9s.
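The Gen 9 / Gen 10 / Gen 12 case can be walked through with a small toy model. Everything here is an assumption made for illustration: the `Pool` and `VM` classes, the placeholder feature names, and the use of a plain set intersection for the pool level (which, as noted in the footnote, is a simplification of what actually happens).

```python
class Pool:
    """Toy pool: the level is the intersection of member feature sets."""

    def __init__(self):
        self.hosts = {}

    def join(self, name, features):
        self.hosts[name] = frozenset(features)

    def leave(self, name):
        del self.hosts[name]

    @property
    def level(self):
        # Recomputed on every access, so it follows joins and leaves.
        sets = list(self.hosts.values())
        return frozenset.intersection(*sets) if sets else frozenset()


class VM:
    """Toy VM: features are fixed at boot and only change on the next boot."""

    def __init__(self, name):
        self.name = name
        self.features = frozenset()

    def boot(self, pool):
        self.features = pool.level

    def can_run_on(self, pool, host):
        return self.features <= pool.hosts[host]


# Placeholder feature names; not real CPUID flags.
pool = Pool()
pool.join("gen9", {"base"})
pool.join("gen10", {"base", "gen10_extra"})

vm = VM("vm1")
vm.boot(pool)                        # booted at the Gen 9 level

pool.join("gen12", {"base", "gen10_extra", "gen12_extra"})
print(vm.can_run_on(pool, "gen12"))  # True: can migrate off the Gen 9

pool.leave("gen9")
print(sorted(pool.level))            # ['base', 'gen10_extra']
print(sorted(vm.features))           # ['base'] (unchanged while running)

vm.boot(pool)                        # restart picks up the new level
print(sorted(vm.features))           # ['base', 'gen10_extra']
```

The model shows why the last step matters: removing the Gen 9 raises the pool level immediately, but a VM only sees the higher level the next time it is started.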

                ~Andrew

                [*] While subset is the intuitive way to think of this operation, it's not actually a subset in the mathematical sense. Some features behave differently to maintain safety for the VM.
