XCP-ng

    nVidia Tesla P4 for vgpu and Plex encoding

    Solved | Compute | vgpu
    63 Posts, 14 Posters, 18.3k Views
    • tjkreidlT Offline
      tjkreidl Ambassador @wyatt-made
      last edited by

      wyatt-made Yeah, you not only need licenses for the hosts and any VMs running on them, but you also have to run a custom NVIDIA license manager.

      1 Reply Last reply Reply Quote 0
      • olivierlambertO Offline
        olivierlambert Vates 🪐 Co-Founder CEO
        last edited by

        Mediated devices will be a game changer… I'm eager to show our results with DPUs; that will be the start of it. Some reading on the potential: https://arccompute.com/blog/libvfio-commodity-gpu-multiplexing/

        1 Reply Last reply Reply Quote 0
        • splastunovS Offline
          splastunov
          last edited by splastunov

          No luck yet...

          I found two config locations:

          1. /usr/share/nvidia/vgpu/vgpuConfig.xml
            It seems that nvidia-vgpud.service uses this config at startup to generate the vGPU types (a quick way to check the service is sketched after this list).
            You can get the list of vGPU types with the command:
          nvidia-smi vgpu -s
          

          My output is

          GPU 00000000:81:00.0
              GRID T4-1B
              GRID T4-2B
              GRID T4-2B4
              GRID T4-1Q
              GRID T4-2Q
              GRID T4-4Q
              GRID T4-8Q
              GRID T4-16Q
              GRID T4-1A
              GRID T4-2A
              GRID T4-4A
              GRID T4-8A
              GRID T4-16A
              GRID T4-1B4
          
          GPU 00000000:C1:00.0
              GRID T4-1B
              GRID T4-2B
              GRID T4-2B4
              GRID T4-1Q
              GRID T4-2Q
              GRID T4-4Q
              GRID T4-8Q
              GRID T4-16Q
              GRID T4-1A
              GRID T4-2A
              GRID T4-4A
              GRID T4-8A
              GRID T4-16A
              GRID T4-1B4
          
          2. A set of configs located in /usr/share/nvidia/vgx
            Here each type has its own individual config file:
           # ls -la | grep "grid_t4"
          -r--r--r-- 1 root root   530 Oct 20 09:11 grid_t4-16a.conf
          -r--r--r-- 1 root root   556 Oct 20 09:11 grid_t4-16q.conf
          -r--r--r-- 1 root root   529 Oct 20 09:11 grid_t4-1a.conf
          -r--r--r-- 1 root root   529 Oct 20 09:11 grid_t4-1b4.conf
          -r--r--r-- 1 root root   529 Oct 20 09:11 grid_t4-1b.conf
          -r--r--r-- 1 root root   555 Oct 20 09:11 grid_t4-1q.conf
          -r--r--r-- 1 root root   528 Oct 20 09:11 grid_t4-2a.conf
          -r--r--r-- 1 root root   528 Oct 20 09:11 grid_t4-2b4.conf
          -r--r--r-- 1 root root   528 Oct 20 09:11 grid_t4-2b.conf
          -r--r--r-- 1 root root   554 Dec  9 15:45 grid_t4-2q.conf
          -r--r--r-- 1 root root   529 Oct 20 09:11 grid_t4-4a.conf
          -r--r--r-- 1 root root   555 Oct 20 09:11 grid_t4-4q.conf
          -r--r--r-- 1 root root   530 Oct 20 09:11 grid_t4-8a.conf
          -r--r--r-- 1 root root   556 Oct 20 09:11 grid_t4-8q.conf
          

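          If it helps, here is a rough way to check what nvidia-vgpud did with that XML at startup and to peek at one of the per-type configs. This is only a sketch: it assumes the service names shipped by the NVIDIA host driver (nvidia-vgpud and nvidia-vgpu-mgr) and picks grid_t4-8q.conf purely as an example from the listing above.

          # Check the vGPU host services and how vgpuConfig.xml was parsed at this boot
          systemctl status nvidia-vgpud nvidia-vgpu-mgr
          journalctl -u nvidia-vgpud -b
          # Peek at one per-type config (file name taken from the listing above)
          cat /usr/share/nvidia/vgx/grid_t4-8q.conf
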
          Now I'm trying to change the pci_id of the vGPU to make the guest OS "think" that the vGPU is a Quadro RTX 5000 (based on the same TU104 chip).

          I played around with the configs, but without success.
          Any change in the configs leads to the VM no longer starting, because XCP-ng cannot create the vGPU.

          In the guest OS, I tried to install different drivers manually, but the device does not start.

          So I have two questions:

          1. Is there any option to make a change in some "raw" VM config (or something like this) to change the vGPU pci_id?
            I have tried to export the metadata to XVA format, edit it and import it back.
            But after the VM starts, it changes all the IDs back...

          2. Is it possible to create custom vgpu_types? (A CLI sketch for inspecting them follows this list.)
            xe vgpu-type-list shows that all types are RO (read-only).
            It seems that they are generated when XCP-ng boots.
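
          For reference, this is roughly how those read-only types can be inspected from dom0 (just a sketch; the model-name filter is only an example taken from the nvidia-smi output above):

          # List all vGPU types the host generated at boot (all parameters are read-only)
          xe vgpu-type-list
          # Show every parameter of a single type, e.g. the T4-8Q profile
          xe vgpu-type-list model-name="GRID T4-8Q" params=all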

          tjkreidlT 1 Reply Last reply Reply Quote 0
          • tjkreidlT Offline
            tjkreidl Ambassador @splastunov
            last edited by

            splastunov My guess is that it's not going to be possible to do any customizing, since the GPU configuration types are managed by the NVIDIA drivers and applications, which incorporate the specific types associated with each GPU model. As newer releases appear, this sometimes changes (such as with the introduction of the "B" configurations some years ago).
            About the only close equivalent to a "raw" designation for a VM would be to do a passthrough to that VM, but even then, you are still going to be restricted to defining some standard GPU type.
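
            For completeness, full passthrough would look roughly like this from dom0 (only a sketch: the VM name-label is a placeholder, and the PCI address 0000:81:00.0 is just the example address from earlier in this thread):

            # Find the VM's UUID (the name-label is a placeholder)
            VM=$(xe vm-list name-label="win2019-test" params=uuid --minimal)
            # Hand the whole physical GPU at 0000:81:00.0 to that one VM
            xe vm-param-set uuid=$VM other-config:pci=0/0000:81:00.0
            # The setting takes effect the next time the VM is started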

            splastunovS 1 Reply Last reply Reply Quote 0
            • splastunovS Offline
              splastunov @tjkreidl
              last edited by splastunov

              Haha, it finally works!!! 🙂

              The problem was with the template I used to deploy the VM.
              I first deployed a VM from the default Windows 2019 template and it was not possible to install the GPU drivers.

              After that I tried deploying the VM from the "other installation media" template and now I can install any drivers.
              To make it work with different benchmarks, I installed the Quadro RTX 5000 driver (from the consumer site).
              The result is in the screenshot: about 60 FPS on average.
              I think it is limited by the driver on the host.
              As you can see, FurMark detected the GPU correctly as a T4-8Q.
              be4b2d9f-cb0d-4170-b442-234531528393-image.png

              Result from an Ethereum Classic miner.
              The miner detected the GPU as a T4 too.
              11a56093-bacf-4277-9072-a169a0be46a1-image.png

              No licenses are required : )

              tjkreidlT H 2 Replies Last reply Reply Quote 0
              • tjkreidlT Offline
                tjkreidl Ambassador @splastunov
                last edited by

                splastunov Hmmm, not sure that will work for long for a P4 or T4 without an NVIDIA license. At some point, it will likely throttle down to a maximum of 3 FPS after the grace period expires.

                Then again, you might get lucky.

                1 Reply Last reply Reply Quote 0
                • olivierlambertO Offline
                  olivierlambert Vates 🪐 Co-Founder CEO
                  last edited by

                  Weird, so something in the Windows template is making it fail? That's interesting information 🤔

                  1 Reply Last reply Reply Quote 0
                  • splastunovS splastunov referenced this topic
                  • D Offline
                    Dani
                    last edited by

                    Hi everyone,
                    I'm also interested in using NVIDIA GRID on XCP-ng, because we have a cluster with 3 XCP-ng servers and now a new one with an NVIDIA A100 GPU. It would be great if I could use it in a new XCP-ng pool, because it's an excellent tool and we already have the knowledge.
                    Our plan is to virtualize the A100 80 GB GPU so we can use it in various virtual machines, with "slices" of 10/20 GB, for compute tasks (AI, deep learning, etc.).
                    So I have two questions:

                    1. Could the trick of copying this vgpu executable be dangerous when updating the XCP-ng server? Maybe it gets overwritten, deleted or something.
                    2. Do you have plans to support NVIDIA vGPU soon? We can still use QEMU on Ubuntu or another Linux with these drivers and everything works OK, but XCP-ng is more professional than QEMU IMHO.

                    You are doing a great great job at Vates. Keep going!
                    Dani

                    wyatt-madeW 1 Reply Last reply Reply Quote 1
                    • wyatt-madeW Offline
                      wyatt-made @Dani
                      last edited by

                      Dani I made a post yesterday about NVIDIA MiG support. vGPU support on XCP-ng is tricky because the proprietary code that makes vGPU work can't be freely distributed. On the other hand, MiG (which is supported by many Ampere cards like the A100) doesn't require licensing like vGPU and seemingly just creates PCI addresses for the card which could, in theory, be passed through to VMs.
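
                      For anyone who wants to experiment, partitioning the card with MiG looks roughly like this with plain nvidia-smi on the host (only a sketch; the profile names depend on the card, so list them first rather than trusting the example below):

                      # Enable MiG mode on GPU 0 (may require a GPU reset or host reboot)
                      nvidia-smi -i 0 -mig 1
                      # List the GPU instance profiles this card offers (e.g. 1g.10gb, 2g.20gb on an A100 80GB)
                      nvidia-smi mig -lgip
                      # Create two 2g.20gb GPU instances plus their default compute instances
                      nvidia-smi mig -cgi 2g.20gb,2g.20gb -C
                      # Show the resulting MiG devices
                      nvidia-smi -L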

                      CC: olivierlambert since we briefly talked about this yesterday in my thread.

                      D 1 Reply Last reply Reply Quote 0
                      • D Offline
                        Dani @wyatt-made
                        last edited by Dani

                        wyatt-made Thanks a lot. What a quick response!!
                        I'll check your post.
                        My plan is to install XCP-ng on the A100 server next week and test it. I will post the results in the forum, and maybe they will help olivierlambert and the rest of the community.

                        Dani

                        1 Reply Last reply Reply Quote 0
                        • olivierlambertO Offline
                          olivierlambert Vates 🪐 Co-Founder CEO
                          last edited by

                          I would be interested to understand how MiG works and how far we are from having a solution for it 🙂

                          1 Reply Last reply Reply Quote 0
                          • H Offline
                            hani @splastunov
                            last edited by

                            splastunov Is it still not asking for a license?

                            splastunovS 1 Reply Last reply Reply Quote 0
                            • splastunovS Offline
                              splastunov @hani
                              last edited by

                              hani It began asking for a license after one day, but without throttling.

                              I have switched to AMD GPUs

                              H A 2 Replies Last reply Reply Quote 0
                              • H Offline
                                hani @splastunov
                                last edited by

                                splastunov Thanks, that's expected 🙂

                                1 Reply Last reply Reply Quote 0
                                • msupportM Offline
                                  msupport @splastunov
                                  last edited by

                                  splastunov
                                  Thanks a lot.
                                   This also works with the NVIDIA M10 and NVIDIA A16 graphics cards.
                                  Driver Version: NVIDIA-vGPU-CitrixHypervisor-8.2-550.54.10.x86_64 (Version 17.0)

                                  1 Reply Last reply Reply Quote 0
                                  • olivierlambertO Offline
                                    olivierlambert Vates 🪐 Co-Founder CEO
                                    last edited by

                                    msupport can you detail a bit what you did? thanks!

                                    msupportM 1 Reply Last reply Reply Quote 0
                                    • msupportM Offline
                                      msupport @olivierlambert
                                      last edited by msupport

                                      olivierlambert

                                      1. Install XCP-ng version 8.2.1 (8.3 did not work)
                                      2. Install all updates: yum update
                                      3. Reboot
                                      4. Download the NVIDIA vGPU drivers for XenServer 8.2 from the NVIDIA site. Version NVIDIA-vGPU-CitrixHypervisor-8.2-550.54.10.x86_64 (version 17.0)
                                      5. Unzip and install the rpm from Host-Drivers
                                      6. Reboot again
                                      7. Download the free CitrixHypervisor-8.2.0-install-cd.iso from the Citrix site
                                      8. Open CitrixHypervisor-8.2.0-install-cd.iso with 7-zip, then extract the vgpu binary from Packages->vgpu....rpm->vgpu....cpio->.->usr->lib64->xen->bin (an extraction sketch follows this list)
                                      9. Upload vgpu to the XCP-ng host at /usr/lib64/xen/bin and make it executable: chmod +x /usr/lib64/xen/bin/vgpu
                                      10. Deploy a VM with a vGPU; it started without any problems
                                      11. Copy the license file (*.tok) from the NVIDIA License Portal to C:\Program Files\NVIDIA Corporation\vGPU Licensing\ClientConfigToken
                                      12. Install the Windows NVIDIA driver on the Windows 10 VM (it needs a connection to api.dis.licensing.nvidia.com on TCP port 443; use the NVIDIA Control Panel to set the hostname and port for licensing the card)
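
                                      If you prefer doing the extraction on a Linux box instead of with 7-zip, something like this should work (a sketch only; the exact vgpu RPM name on the ISO will differ, and xcp-host is a placeholder for your host):

                                      # Mount the Citrix ISO and pull the vgpu binary out of the RPM
                                      mount -o loop CitrixHypervisor-8.2.0-install-cd.iso /mnt
                                      rpm2cpio /mnt/Packages/vgpu-*.rpm | cpio -idmv "./usr/lib64/xen/bin/vgpu"
                                      # Copy it to the XCP-ng host and make it executable
                                      scp ./usr/lib64/xen/bin/vgpu root@xcp-host:/usr/lib64/xen/bin/vgpu
                                      ssh root@xcp-host chmod 755 /usr/lib64/xen/bin/vgpu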

                                      Works fine for me


                                      If you want to install the NVIDIA driver on XCP-ng 8.3: edit /etc/xensource-inventory (change the PRODUCT_VERSION line from 8.3.0 to 8.2.0) during the installation, then run xe-install-supplemental-pack NVIDIA-vGPU-CitrixHypervisor-8.2-550.54.16.x86_64.iso.
                                      After the installation, change PRODUCT_VERSION back to 8.3.0.
                                      The driver then also works on XCP-ng 8.3.
                                      Do not forget to copy the vgpu file to /usr/lib64/xen/bin/vgpu (and chmod it to 755).
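
                                      In shell terms, the workaround above is roughly this (a sketch; it assumes the key='value' quoting used in /etc/xensource-inventory, so check the file before and after):

                                      # Pretend to be 8.2.0 just for the supplemental pack installation
                                      sed -i "s/PRODUCT_VERSION='8.3.0'/PRODUCT_VERSION='8.2.0'/" /etc/xensource-inventory
                                      xe-install-supplemental-pack NVIDIA-vGPU-CitrixHypervisor-8.2-550.54.16.x86_64.iso
                                      # Restore the real version string afterwards
                                      sed -i "s/PRODUCT_VERSION='8.2.0'/PRODUCT_VERSION='8.3.0'/" /etc/xensource-inventory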


                                      NVIDIA vGPU (M10 | A16) on XCP-ng 8.3 Beta 2 only works without XCP-ng updates. After updating, the error message "An emulator required to run this VM failed to start" appears. It must be due to one of the 76 updates that can be installed. I am trying to find out which update is causing this problem.
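
                                      If it helps narrow that down, yum's transaction history can be used to bisect (a sketch; <ID> stands for whatever transaction number yum reports for that update run):

                                      yum history list all      # find the transaction that installed the updates
                                      yum history info <ID>     # see which packages it changed
                                      yum history undo <ID>     # roll that transaction back to confirm the culprit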


                                      22.07.2024 [NEW]

                                      Installation of XCP-ng 8.3 RC1 with NVIDIA 17.1 vGPU

                                      1. Install XCP-ng 8.3 RC1
                                      2. Download the XenServer NVIDIA 17.1 driver (NVIDIA-GRID-XenServer-8-550.54.16-550.54.15-551.78)
                                      3. Unzip the driver and copy the host driver (NVIDIA-vGPU-xenserver-8-550.54.16.x86_64.iso); I used WinSCP to copy it to the tmp directory
                                      4. Download the XenServer ISO file (https://www.xenserver.com/downloads | XenServer8_2024-06-03.iso)
                                      5. Copy the file vgpu-7.4.13-1.xs8.x86_64.rpm from its packages directory! Do not use the vgpu-7.4.8-1.x86_64 file from CitrixHypervisor-8.2.0-install-cd
                                      6. Unpack vgpu-7.4.13-1.xs8.x86_64
                                      7. Copy the file \usr\lib64\xen\bin\vgpu (size 129 KB) to /usr/lib64/xen/bin/ on your XCP-ng host (chmod 755)
                                      8. (putty) /tmp/ xe-install-supplemental-pack NVIDIA-vGPU-xenserver-8-550.54.16.x86_64.iso
                                      9. Reboot
                                      10. Install the guest driver on the VM client (551.78_grid_win10_win11_server2022_dch_64bit_international.exe)
                                      11. Copy the token file from NVIDIA to C:\Program Files\NVIDIA Corporation\vGPU Licensing\ClientConfigToken\*.tok

                                      NVIDIA drivers 17.2 and 17.3 do not work yet (the guest driver crashes); tested with Windows 11 23H2
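
                                      For reference, the vGPU can also be attached to a VM from the host CLI instead of the UI, roughly like this (a sketch only; the VM name-label and the model-name are placeholders, and the real model names for an M10/A16 differ from the T4 examples earlier in this thread):

                                      # Placeholders: adjust the VM name-label and the vGPU model-name to your setup
                                      VM=$(xe vm-list name-label="win11-vgpu" params=uuid --minimal)
                                      GROUP=$(xe gpu-group-list params=uuid --minimal | cut -d, -f1)
                                      TYPE=$(xe vgpu-type-list model-name="GRID M10-2Q" params=uuid --minimal)
                                      xe vgpu-create vm-uuid=$VM gpu-group-uuid=$GROUP vgpu-type-uuid=$TYPE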


                                      My environment:
                                      16x Hosts HPE DL380
                                      6x Hosts HPE DL380 with vGPU Nvidia M10 and A16
                                      5x HPE 3PAR Storage and 1x HPE MSA 2050 Storage
                                      2x 96 port fibre channel switch

                                      I have migrated from VMware to XCP-ng with XOA.

                                      1 Reply Last reply Reply Quote 2
                                      • olivierlambertO Offline
                                        olivierlambert Vates 🪐 Co-Founder CEO
                                        last edited by

                                        Okay so the only binary you need to get from Citrix is vgpu, right?

                                        msupportM 1 Reply Last reply Reply Quote 0
                                        • A Offline
                                          austinw
                                          last edited by

                                          Out of curiosity, what are you using this for? What are these VMs doing?

                                          msupportM 1 Reply Last reply Reply Quote 0
                                          • msupportM Offline
                                            msupport @olivierlambert
                                            last edited by

                                            olivierlambert
                                            yes, download and extract. Don't forget the permission.

                                            Download free CitrixHypervisor-8.2.0-install-cd.iso from Citrix site

                                            Open CitrixHypervisor-8.2.0-install-cd.iso with 7-zip, then unzip vgpu binary file from Packages->vgpu....rpm->vgpu....cpio->.->usr->lib64->xen->bin

                                            Upload vgpu to the XCP-ng host at /usr/lib64/xen/bin and make it executable: chmod +x /usr/lib64/xen/bin/vgpu

                                            1 Reply Last reply Reply Quote 0