
    GPU support and Nvidia Grid vGPU

    • msupportM Offline
      msupport @EspenU
      last edited by olivierlambert

      @EspenU
      #*** Here are my insights for the nvidia drivers and xcp-ng 8.3
      #*** Install xcp-ng beta2
      #*** then start updates

      yum update kernel device-mapper guest-templates-json guest-templates-json-data-linux intel-microcode openssh python2-scapy amd-microcode cisco* libblkid libcgroup libcgroup-tools libcurl libmount libuuid curl util-linux edk2 forkexecd fuse-libs gdisk guest-templates-json-data-other guest-templates-json-data-windows intel-ice python-fasteners python2-defusedxml python2-xapi-storage nss-sysinit nss-tools nspr nss nss-softokn nss-softokn-freebl nss-util kernel-livepatch logrotate mellanox-mlnxen message-switch openssl* openvswitch swtpm* tzdata microsemi-smartpqi vendor-drivers sudo newt qlogic-qla2xxx qlogic-fastlinq gpumon sm vhd-tool vcputune
      

      *** The Nvidia driver must be installed before the operations below, because the driver installation is not compatible with python3 ***

      yum remove xcp-python-libs-2.3.5-1.1.xcpng8.3.noarch
      yum update ncurses-compat-libs python3-fasteners python3-pyudev python3-scapy python3-xcp-libs python36-future    
      yum update net-snmp net-snmp-agent-libs net-snmp-libs
      yum update xapi-storage-script
      yum update xs-openssl-libs
      yum update xapi-nbd
      yum remove net-snmp
      yum install net-snmp-libs net-snmp-agent-libs net-snmp
      yum update xcp-ng-plymouth-theme
      yum update xen-crashdump-analyser
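Since the order matters here (host driver first, python3 packages afterwards), a small guard can refuse to continue when no Nvidia host package is installed. This is only a sketch: the function name `has_nvidia_host_driver` is hypothetical, and matching on `nvidia-vgpu` assumes the installed host package keeps a name similar to the driver ISO mentioned in this thread (NVIDIA-vGPU-xenserver-8-...).

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the package list on stdin contains an
# Nvidia vGPU host package. The name pattern is an assumption, not a
# documented package name.
has_nvidia_host_driver() {
    grep -qi '^nvidia-vgpu'
}

# Usage on the host (shown as comments, not run here):
#   if rpm -qa | has_nvidia_host_driver; then
#       yum update python3-fasteners python3-pyudev python3-xcp-libs
#   else
#       echo "Install the Nvidia host driver first" >&2
#   fi
```

Reading the package list from stdin keeps the check easy to try against a saved `rpm -qa` output before touching the host.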
      

      *** The driver stops working after one of these 40 updates, which depend on each other:

      Name	Description	Version	Release	Size	
      blktap	blktap user space utilities	3.54.9	1.1.xcpng8.3	305.47 KiB	
      kexec-tools	kexec/kdump userspace tools	2.0.15	20.xcpng8.3	67.8 KiB	
      ncurses	Ncurses support utilities	6.4	3.xcpng8.3	394.83 KiB	
      ncurses-base	Descriptions of common terminals	6.4	3.xcpng8.3	57.81 KiB	
      ncurses-libs	Ncurses libraries	6.4	3.xcpng8.3	312.17 KiB	
      qemu	qemu-dm device model	4.2.1	5.2.9.xcpng8.3	15.57 MiB	
      rrdd-plugins	RRDD metrics plugin	24.16.0	1.2.xcpng8.3	4.29 MiB	
      setup	A set of system configuration and setup files	2.8.71	9.1.xcpng8.3	169.24 KiB	
      sm-cli	CLI for xapi toolstack storage managers	24.16.0	1.2.xcpng8.3	1.53 MiB	
      squeezed	Memory ballooning daemon for the xapi toolstack	24.16.0	1.2.xcpng8.3	1.54 MiB	
      varstored	EFI Variable Storage Daemon	1.2.0	2.3.xcpng8.3	46.55 KiB	
      varstored-guard	Deprivileged XAPI socket Daemon for EFI variable storage	24.16.0	1.2.xcpng8.3	4.3 MiB	
      varstored-tools	Tools for manipulating a guest's EFI variables offline	1.2.0	2.3.xcpng8.3	58.66 KiB	
      vncterm	vncterm tty to vnc utility	10.2.1	2.xcpng8.3	43.94 KiB	
      wsproxy	Websockets proxy for VNC traffic	24.16.0	1.2.xcpng8.3	932.78 KiB	
      xapi-core	The xapi toolstack	24.16.0	1.2.xcpng8.3	24.55 MiB	
      xapi-rrd2csv	A tool to output RRD values in CSV format	24.16.0	1.2.xcpng8.3	2.61 MiB	
      xapi-tests	Toolstack test programs	24.16.0	1.2.xcpng8.3	6.25 MiB	
      xapi-xe	The xapi toolstack CLI	24.16.0	1.2.xcpng8.3	1.13 MiB	
      xcp-clipboardd	Daemon to share a virtualized Windows clipboard	1.0.3	8.xcpng8.3	22.53 KiB	
      xcp-featured	XCP-ng feature daemon	1.1.7	2.xcpng8.3	1.25 MiB	
      xcp-networkd	Simple host network management service for the xapi toolstack	24.16.0	1.2.xcpng8.3	4.15 MiB	
      xcp-ng-release	XCP-ng release file	8.3.0	24	112.56 KiB	
      xcp-ng-release-config	XCP-ng configuration	8.3.0	24	49.93 KiB	
      xcp-ng-release-presets	XCP-ng presets file	8.3.0	24	18.44 KiB	
      xcp-ng-xapi-plugins	XAPI additional plugins for XCP-ng	1.10.0	1.xcpng8.3	46.17 KiB	
      xcp-rrdd	Statistics gathering daemon for the xapi toolstack	24.16.0	1.2.xcpng8.3	3.14 MiB	
      xen-dom0-libs	Xen Hypervisor Domain 0 libraries	4.17.4	3.xcpng8.3	691.85 KiB	
      xen-dom0-tools	Xen Hypervisor Domain 0 tools	4.17.4	3.xcpng8.3	1.9 MiB	
      xen-hypervisor	The Xen Hypervisor	4.17.4	3.xcpng8.3	2.34 MiB	
      xen-libs	Xen Hypervisor general libraries	4.17.4	3.xcpng8.3	54.05 KiB	
      xen-livepatch	Live patches for Xen	2.0	1.xcpng8.3	2.91 KiB	
      xen-tools	Xen Hypervisor general tools	4.17.4	3.xcpng8.3	35.66 KiB	
      xenopsd	Simple VM manager	24.16.0	1.2.xcpng8.3	1.17 MiB	
      xenopsd-cli	CLI for xenopsd, the xapi toolstack domain manager	24.16.0	1.2.xcpng8.3	1.61 MiB	
      xenopsd-xc	Xenopsd using xc	24.16.0	1.2.xcpng8.3	4.61 MiB	
      xenserver-hwdata	Additional hardware identification and configuration data	20240411	1.xcpng8.3	284.41 KiB	
      xenserver-status-report	A program that generates status reports for a XenServer host	2.0.3	1.xcpng8.3	33.24 KiB	
      xo-lite	Xen Orchestra Lite	0.2.3	1.xcpng8.3	816.19 KiB	
      xsconsole	XCP-ng Host Configuration Console	11.0.2	1.1.xcpng8.3	304.44 KiB
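One way to experiment with holding back packages from the table above is yum's `--exclude` option. The sketch below only builds and prints such a command; it does not run yum. The helper name `build_exclude_args` and the choice of four package names are illustrative assumptions, and because these updates depend on each other, yum may refuse or skip related packages when some are excluded.

```shell
#!/bin/sh
# Build "--exclude=<name>" arguments for yum from package names on stdin
# (one name per line, blank lines ignored). Helper name is hypothetical.
build_exclude_args() {
    while IFS= read -r pkg; do
        if [ -n "$pkg" ]; then
            printf '%s ' "--exclude=$pkg"
        fi
    done
    echo
}

# Print (do not run) an update command that skips a few packages
# taken from the table; the selection is only an example.
held_back='xapi-core
xenopsd
xen-hypervisor
qemu'
echo "yum update $(printf '%s\n' "$held_back" | build_exclude_args)"
```

Printing the command first makes it possible to review the exclusion list before running anything on a production host.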
      
      • olivierlambertO Offline
        olivierlambert Vates 🪐 Co-Founder CEO
        last edited by

        Have you taken the drivers for XS8, which is the equivalent of XCP-ng 8.3?

        • E Offline
          EspenU @olivierlambert
          last edited by

          @olivierlambert Would that mean that the vgpu binary should be taken from XS8 as well when using XCP-ng 8.3?
          During testing I tried using that binary in XCP-ng 8.2, and it didn't work (VMs would not boot). I had to use the one from Citrix Hypervisor 8.2.

          • msupportM Offline
            msupport @olivierlambert
            last edited by msupport

            @olivierlambert
            I have tested the Nvidia XenServer version 17.0.

            • olivierlambertO Offline
              olivierlambert Vates 🪐 Co-Founder CEO
              last edited by

              I don't know all the versions, but I can tell you that:

              • XCP-ng 8.2 == XS 8.2
              • XCP-ng 8.3 == XS 8

              So be sure to use the right/matching binary first 🙂
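The mapping above can be written down as a tiny lookup, which is handy in a provisioning script that has to pick the matching driver and vgpu binary. This is a sketch only; the function name `matching_xenserver_release` is hypothetical, and the two entries are exactly the ones listed in this post.

```shell
#!/bin/sh
# Map an XCP-ng version to the XenServer release whose Nvidia host driver
# and vgpu binary should be used, per the list above.
# Function name is hypothetical.
matching_xenserver_release() {
    case "$1" in
        8.2|8.2.*) echo "XS 8.2" ;;
        8.3|8.3.*) echo "XS 8" ;;
        *) echo "unknown"; return 1 ;;
    esac
}

matching_xenserver_release 8.3   # prints "XS 8"
```

Any version not covered by the list returns a non-zero status instead of guessing.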

              • msupportM Offline
                msupport @olivierlambert
                last edited by

                @olivierlambert
                I have found the solution. I will test the whole thing again tomorrow with a clean installation with rc1.

                • olivierlambertO Offline
                  olivierlambert Vates 🪐 Co-Founder CEO
                  last edited by

                  Oh great! Keep us posted!!

                  • tjkreidlT Offline
                    tjkreidl Ambassador @msupport
                    last edited by

                    @msupport Please write up all the steps involved, as this would be very useful documentation for anyone else wanting to accomplish this. Many have delayed switching to XCP-ng because of not being able to make use of NVIDIA GPUs.

                    • olivierlambertO Offline
                      olivierlambert Vates 🪐 Co-Founder CEO
                      last edited by

                      It works on 8.2 already even if it's not official at all 😉

                      • msupportM Offline
                        msupport
                        last edited by

                        Installation instructions: XCP-ng 8.3 (RC1) with Nvidia M10 | A16 GPU

                        1. Install XCP-ng 8.3 RC1.
                        2. Download the Nvidia XenServer driver 17.1 (NVIDIA-GRID-XenServer-8-550.54.16-550.54.15-551.78).
                        3. Unzip the driver package and copy the host driver (NVIDIA-vGPU-xenserver-8-550.54.16.x86_64.iso) to the host. I used WinSCP to copy it to the /tmp directory.
                        4. Download the XenServer ISO file (https://www.xenserver.com/downloads | XenServer8_2024-06-03.iso).
                        5. Copy the file vgpu-7.4.13-1.xs8.x86_64.rpm from the packages directory of that ISO. Do not use the file vgpu-7.4.8-1.x86_64 from the CitrixHypervisor-8.2.0-install-cd.
                        6. Unpack the file vgpu-7.4.13-1.xs8.x86_64.rpm.
                        7. Copy the file /usr/lib64/xen/bin/vgpu (size 129 KB) from the unpacked RPM to /usr/lib64/xen/bin/ on your XCP-ng host (chmod 755).
                        8. Over SSH (PuTTY), in /tmp/, run: xe-install-supplemental-pack NVIDIA-vGPU-xenserver-8-550.54.16.x86_64.iso
                        9. Reboot.
                        10. Install the guest driver in the VM (551.78_grid_win10_win11_server2022_dch_64bit_international.exe).
                        11. Install the token file from Nvidia (C:\Program Files\Nvidia Corporation\vGPU Licensing\ClientConfigToken*.tok).
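For anyone who prefers to do steps 6 to 8 directly on the host, here is a rough sketch. It is not the method used above (the files there were unpacked and copied from a Windows machine); it assumes the RPM and the driver ISO were already copied to /tmp and that `rpm2cpio` and `cpio` are available in dom0. The helper names `run` and `install_vgpu_stack` and the work directory are hypothetical. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch of host-side steps 6-8. DRY_RUN=1 (the default) only prints the
# commands; set DRY_RUN=0 to really execute them on the host.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

install_vgpu_stack() {
    rpm_file="$1"    # absolute path, e.g. /tmp/vgpu-7.4.13-1.xs8.x86_64.rpm
    driver_iso="$2"  # e.g. /tmp/NVIDIA-vGPU-xenserver-8-550.54.16.x86_64.iso
    workdir="${3:-/tmp/vgpu-extract}"  # assumed scratch directory

    run mkdir -p "$workdir"
    # rpm2cpio | cpio unpacks the RPM payload under $workdir without installing it.
    run sh -c 'cd "$1" && rpm2cpio "$2" | cpio -idm' sh "$workdir" "$rpm_file"
    run cp "$workdir/usr/lib64/xen/bin/vgpu" /usr/lib64/xen/bin/vgpu
    run chmod 755 /usr/lib64/xen/bin/vgpu
    run xe-install-supplemental-pack "$driver_iso"
    echo "Reboot the host afterwards (step 9)."
}

install_vgpu_stack /tmp/vgpu-7.4.13-1.xs8.x86_64.rpm \
    /tmp/NVIDIA-vGPU-xenserver-8-550.54.16.x86_64.iso
```

Keeping the dry run as the default means the script can be reviewed line by line before anything under /usr/lib64/xen/bin is overwritten.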

                        Nvidia drivers 17.2 and 17.3 do not work yet (Guest driver crashes)
                        I will stay tuned and inform you about new findings

                        Have fun

                        • msupportM Offline
                          msupport @olivierlambert
                          last edited by

                          @olivierlambert
                          Thanks for the hint, that helped me a lot

                          • olivierlambertO Offline
                            olivierlambert Vates 🪐 Co-Founder CEO
                            last edited by

                            Thank you very much!

                            • tjkreidlT Offline
                              tjkreidl Ambassador @msupport
                              last edited by

                              @msupport Many thanks for your write-up! Have you experienced any issues communicating with the NVIDIA license server?

                              • F Offline
                                fatek
                                last edited by

                                I also used instructions from @msupport

                                https://xcp-ng.org/forum/topic/8987/vgpu-nvidia-tesla-p4-xcp-ng-8-3-beta-2?_=1721015408249

                                • msupportM Offline
                                  msupport @tjkreidl
                                  last edited by

                                  @tjkreidl
                                  Nvidia licence server works perfectly so far

                                  • Tristis OrisT Offline
                                    Tristis Oris Top contributor
                                    last edited by Tristis Oris

                                    I can't find where to get this new Nvidia driver. Tesla V100.

                                    Update:
                                    It looks like the only way is the license portal, https://nvid.nvidia.com/. Sad.

                                    • msupportM Offline
                                      msupport @Tristis Oris
                                      last edited by

                                      @Tristis-Oris
                                      The driver version 17.1 worked for me; 17.2 and 17.3 crashed the Windows drivers.
                                      https://we.tl/t-VozEeV8TFB
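To confirm which host driver actually ended up on the machine, the reported version can be compared with the one that worked in this thread (550.54.16, from the 17.1 package). A minimal sketch; the function name `check_host_driver` is hypothetical, and the `nvidia-smi` query in the comment assumes the host driver is installed and loaded.

```shell
#!/bin/sh
# Host driver version reported to work in this thread (vGPU release 17.1).
KNOWN_GOOD_HOST_DRIVER="550.54.16"

# Compare a driver version string against the known-good one.
check_host_driver() {
    if [ "$1" = "$KNOWN_GOOD_HOST_DRIVER" ]; then
        echo "host driver $1 matches the 17.1 release reported to work"
    else
        echo "host driver $1 differs from $KNOWN_GOOD_HOST_DRIVER (17.1)"
        return 1
    fi
}

# On the host (not run here):
#   check_host_driver "$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
check_host_driver 550.54.16
```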

                                      • Tristis OrisT Offline
                                        Tristis Oris Top contributor @msupport
                                        last edited by

                                        @msupport thank you. Will try to play with it.

                                        • msupportM Offline
                                          msupport @Tristis Oris
                                          last edited by

                                          @Tristis-Oris
                                          The download will be available for 3 days...

                                          • M Offline
                                            mgformula1
                                            last edited by olivierlambert

                                            I have followed the documentation; however, for some reason the VMs won't power on with a vGPU profile attached. We are testing with Nvidia M10 GPUs. I'm using XCP-ng 8.3 and Nvidia host driver 17.1; I also tried 17.2 and 17.3.

                                            This is the error

                                            {
                                              "id": "0m369opsm",
                                              "properties": {
                                                "method": "vm.start",
                                                "params": {
                                                  "id": "b8c94655-5801-21ee-7eb0-788a58b57736",
                                                  "bypassMacAddressesCheck": false,
                                                  "force": false
                                                },
                                                "name": "API call: vm.start",
                                                "userId": "2c8c735d-5369-4a91-8433-b9f94e6eb394",
                                                "type": "api.call"
                                              },
                                              "start": 1730921023894,
                                              "status": "failure",
                                              "updatedAt": 1730921066138,
                                              "end": 1730921066138,
                                              "result": {
                                                "code": "FAILED_TO_START_EMULATOR",
                                                "params": [
                                                  "OpaqueRef:f3f7d9f6-9dc7-772e-ddfe-c1ed19f1aeff",
                                                  "vgpu",
                                                  "Device.Dm.start_vgpu: emulator failed to start for domain 1"
                                                ],
                                                "call": {
                                                  "method": "VM.start",
                                                  "params": [
                                                    "OpaqueRef:f3f7d9f6-9dc7-772e-ddfe-c1ed19f1aeff",
                                                    false,
                                                    false
                                                  ]
                                                },
                                                "message": "FAILED_TO_START_EMULATOR(OpaqueRef:f3f7d9f6-9dc7-772e-ddfe-c1ed19f1aeff, vgpu, Device.Dm.start_vgpu: emulator failed to start for domain 1)",
                                                "name": "XapiError",
                                                "stack": "XapiError: FAILED_TO_START_EMULATOR(OpaqueRef:f3f7d9f6-9dc7-772e-ddfe-c1ed19f1aeff, vgpu, Device.Dm.start_vgpu: emulator failed to start for domain 1)\n    at Function.wrap (file:///usr/local/lib/node_modules/xo-server/node_modules/xen-api/_XapiError.mjs:16:12)\n    at file:///usr/local/lib/node_modules/xo-server/node_modules/xen-api/transports/json-rpc.mjs:38:21\n    at runNextTicks (node:internal/process/task_queues:60:5)\n    at processImmediate (node:internal/timers:447:9)\n    at process.callbackTrampoline (node:internal/async_hooks:128:17)"
                                              }
                                            }

                                            host output

                                            360af704-caaa-42d5-8e14-60e047d31103-image.png
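When comparing several failed starts, it helps to pull just the XAPI error code out of a Xen Orchestra log like the one above. The sketch below does that with plain `sed`, so no JSON tool is needed on the host; the helper name `xapi_error_code` is hypothetical and the sample input is a trimmed stand-in for the real log.

```shell
#!/bin/sh
# Print the first "code" value found in a Xen Orchestra task log on stdin.
# Helper name is hypothetical; this is line-based matching, not a JSON parser.
xapi_error_code() {
    sed -n 's/^[[:space:]]*"code":[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Trimmed sample in the same shape as the log above.
sample='{
  "status": "failure",
  "result": {
    "code": "FAILED_TO_START_EMULATOR",
    "name": "XapiError"
  }
}'
printf '%s\n' "$sample" | xapi_error_code   # prints FAILED_TO_START_EMULATOR
```

Since the failing emulator here is `vgpu`, one thing worth checking against the earlier posts is whether /usr/lib64/xen/bin/vgpu comes from the matching XenServer release; a mismatched binary was reported above to keep VMs from booting. The XAPI log on the host (/var/log/xensource.log) may also show more detail around the failed start.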
