    XCP-ng and NVIDIA GPUs

    • MajorTom @jcpt928

      jcpt928 👍 🙂
    • apayne @olivierlambert

        olivierlambert I did a bit of light digging.

        General consensus is that Dell's servers are not ready for this kind of stuff, but then again I've seen crowds get things wrong before:
        https://www.reddit.com/r/homelab/comments/6mafcg/can_i_install_a_gpu_in_a_dell_power_edge_r810/

        This is the method described for KVM:
        http://mathiashueber.com/fighting-error-43-how-to-use-nvidia-gpu-in-a-virtual-machine/

        Additional KVM docs (plus a small description of the vendor ID problem):
        https://wiki.archlinux.org/index.php/PCI_passthrough_via_OVMF#"Error_43:_Driver_failed_to_load"_on_Nvidia_GPUs_passed_to_Windows_VMs
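        For reference, the workaround that Arch wiki page describes amounts to lying to the driver about the hypervisor. In libvirt domain XML (so KVM-specific, not directly usable on XCP-ng) the relevant fragment looks roughly like:

        ```xml
        <features>
          <hyperv>
            <!-- Spoof the Hyper-V vendor signature the driver probes for -->
            <vendor_id state='on' value='whatever'/>
          </hyperv>
          <kvm>
            <!-- Hide the KVM hypervisor signature from CPUID -->
            <hidden state='on'/>
          </kvm>
        </features>
        ```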

        An updated methodology for Ryzen on-chip GPU:
        http://mathiashueber.com/ryzen-based-virtual-machine-passthrough-setup-ubuntu-18-04/

        This is the method described for VMware:
        http://codefromabove.com/2019/02/the-hyperconverged-homelab-windows-vm-gaming/

        Hyper-V documentation is a bit sparser, but this hints that Microsoft may simply have worked around the issue (à la vendor license agreements), at least when using RemoteFX:
        http://techgenix.com/enabling-physical-gpus-hyper/

        (Optional) Get CUDA working for cheap-o cards:
        https://medium.com/@samnco/using-the-nvidia-gt-1030-for-cuda-workloads-on-ubuntu-16-04-4eee72d56791
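        On Ubuntu, that last link's "CUDA first, then driver" ordering might look something like the sketch below. The exact package names and driver version are assumptions tied to the Ubuntu release you run; adjust accordingly:

        ```shell
        # Sketch only: install the CUDA toolkit BEFORE any NVIDIA driver,
        # so the "unsupported" card's driver doesn't get pulled in first.
        sudo apt-get update
        sudo apt-get install -y nvidia-cuda-toolkit   # toolkit first
        sudo apt-get install -y nvidia-driver-390     # then the driver (version is an assumption)
        ```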

        So it looks like the common factors are:

        • The GPU device must be isolated on the host by the vfio kernel driver. To ensure this, the vfio driver must load first, before any vendor or open-source driver.
        • GPU must be connected to the guest VM via PCI pass-through. No surprise.
        • The CPU must not identify itself as virtual; it must report some other identity when probed. This appears to be the key to preventing the dreaded NVIDIA Error 43. It suggests the driver is just examining the CPU assigned to it, although some documentation mentions a "vendor" setting. The work-around is to change the identity to a string the driver doesn't match against, and it just works; even a setting of "unknown" is shown to work. I don't know if there is a way to tell an XCP-ng guest "please don't identify yourself as virtual".
        • For cards that are CUDA-capable but "unsupported" by NVIDIA, you install the software in a different sequence (CUDA first, then driver).
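        The vendor-string trick in the third bullet can be illustrated with a toy check. The signature strings below are the real CPUID leaf 0x40000000 hypervisor IDs; the function itself is purely illustrative, not anything the driver actually exports:

        ```python
        # Known CPUID hypervisor vendor signatures (leaf 0x40000000).
        KNOWN_HYPERVISORS = {"KVMKVMKVM", "Microsoft Hv", "XenVMMXenVMM", "VMwareVMware"}

        def driver_would_balk(hypervisor_signature: str) -> bool:
            """Toy model of the driver check: refuse to load (Error 43)
            when the reported signature matches a known hypervisor."""
            return hypervisor_signature in KNOWN_HYPERVISORS

        # The default KVM signature trips the check...
        print(driver_would_balk("KVMKVMKVM"))   # True -> Error 43
        # ...while an arbitrary spoofed vendor string (even "unknown") passes.
        print(driver_would_balk("unknown"))     # False -> driver loads
        ```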

        Disclaimer: I'm just compiling a list to get an idea about what to do; I haven't done the actual install, nor do I have the hardware. Hopefully this helps.
