GPU support and Nvidia Grid vGPU
-
@EspenU
#*** Here are my findings on the NVIDIA drivers and XCP-ng 8.3
#*** Install XCP-ng 8.3 beta 2
#*** Then start the updates:
yum update kernel device-mapper guest-templates-json guest-templates-json-data-linux intel-microcode openssh python2-scapy amd-microcode cisco* libblkid libcgroup libcgroup-tools libcurl libmount libuuid curl util-linux edk2 forkexecd fuse-libs gdisk guest-templates-json-data-other guest-templates-json-data-windows intel-ice python-fasteners python2-defusedxml python2-xapi-storage nss-sysinit nss-tools nspr nss nss-softokn nss-softokn-freebl nss-tools nss-util kernel-livepatch logrotate mellanox-mlnxen message-switch openssl* openvswitch swtpm* tzdata microsemi-smartpqi vendor-drivers sudo newt qlogic-qla2xxx qlogic-fastlinq gpumon sm vhd-tool vcputune
*** The NVIDIA driver must be installed before the operations below, because the driver installation is not compatible with Python 3 ***
yum remove xcp-python-libs-2.3.5-1.1.xcpng8.3.noarch
yum update ncurses-compat-libs python3-fasteners python3-pyudev python3-scapy python3-xcp-libs python36-future
yum update net-snmp
yum update net-snmp-agent-libs net-snmp-libs
yum update xapi-storage-script
yum update xs-openssl-libs
yum update xapi-nbd
yum remove net-snmp
yum net-snmp-libs net-snmp-agent-libs net-snmp
yum update xcp-ng-plymouth-theme
yum update xen-crashdump-analyser
*** The driver stops working after one of these 40 updates, which cannot be installed separately because they depend on one another:
| Name | Description | Version | Release | Size |
| --- | --- | --- | --- | --- |
| blktap | blktap user space utilities | 3.54.9 | 1.1.xcpng8.3 | 305.47 KiB |
| kexec-tools | kexec/kdump userspace tools | 2.0.15 | 20.xcpng8.3 | 67.8 KiB |
| ncurses | Ncurses support utilities | 6.4 | 3.xcpng8.3 | 394.83 KiB |
| ncurses-base | Descriptions of common terminals | 6.4 | 3.xcpng8.3 | 57.81 KiB |
| ncurses-libs | Ncurses libraries | 6.4 | 3.xcpng8.3 | 312.17 KiB |
| qemu | qemu-dm device model | 4.2.1 | 5.2.9.xcpng8.3 | 15.57 MiB |
| rrdd-plugins | RRDD metrics plugin | 24.16.0 | 1.2.xcpng8.3 | 4.29 MiB |
| setup | A set of system configuration and setup files | 2.8.71 | 9.1.xcpng8.3 | 169.24 KiB |
| sm-cli | CLI for xapi toolstack storage managers | 24.16.0 | 1.2.xcpng8.3 | 1.53 MiB |
| squeezed | Memory ballooning daemon for the xapi toolstack | 24.16.0 | 1.2.xcpng8.3 | 1.54 MiB |
| varstored | EFI Variable Storage Daemon | 1.2.0 | 2.3.xcpng8.3 | 46.55 KiB |
| varstored-guard | Deprivileged XAPI socket Daemon for EFI variable storage | 24.16.0 | 1.2.xcpng8.3 | 4.3 MiB |
| varstored-tools | Tools for manipulating a guest's EFI variables offline | 1.2.0 | 2.3.xcpng8.3 | 58.66 KiB |
| vncterm | vncterm tty to vnc utility | 10.2.1 | 2.xcpng8.3 | 43.94 KiB |
| wsproxy | Websockets proxy for VNC traffic | 24.16.0 | 1.2.xcpng8.3 | 932.78 KiB |
| xapi-core | The xapi toolstack | 24.16.0 | 1.2.xcpng8.3 | 24.55 MiB |
| xapi-rrd2csv | A tool to output RRD values in CSV format | 24.16.0 | 1.2.xcpng8.3 | 2.61 MiB |
| xapi-tests | Toolstack test programs | 24.16.0 | 1.2.xcpng8.3 | 6.25 MiB |
| xapi-xe | The xapi toolstack CLI | 24.16.0 | 1.2.xcpng8.3 | 1.13 MiB |
| xcp-clipboardd | Daemon to share a virtualized Windows clipboard | 1.0.3 | 8.xcpng8.3 | 22.53 KiB |
| xcp-featured | XCP-ng feature daemon | 1.1.7 | 2.xcpng8.3 | 1.25 MiB |
| xcp-networkd | Simple host network management service for the xapi toolstack | 24.16.0 | 1.2.xcpng8.3 | 4.15 MiB |
| xcp-ng-release | XCP-ng release file | 8.3.0 | 24 | 112.56 KiB |
| xcp-ng-release-config | XCP-ng configuration | 8.3.0 | 24 | 49.93 KiB |
| xcp-ng-release-presets | XCP-ng presets file | 8.3.0 | 24 | 18.44 KiB |
| xcp-ng-xapi-plugins | XAPI additional plugins for XCP-ng | 1.10.0 | 1.xcpng8.3 | 46.17 KiB |
| xcp-rrdd | Statistics gathering daemon for the xapi toolstack | 24.16.0 | 1.2.xcpng8.3 | 3.14 MiB |
| xen-dom0-libs | Xen Hypervisor Domain 0 libraries | 4.17.4 | 3.xcpng8.3 | 691.85 KiB |
| xen-dom0-tools | Xen Hypervisor Domain 0 tools | 4.17.4 | 3.xcpng8.3 | 1.9 MiB |
| xen-hypervisor | The Xen Hypervisor | 4.17.4 | 3.xcpng8.3 | 2.34 MiB |
| xen-libs | Xen Hypervisor general libraries | 4.17.4 | 3.xcpng8.3 | 54.05 KiB |
| xen-livepatch | Live patches for Xen | 2.0 | 1.xcpng8.3 | 2.91 KiB |
| xen-tools | Xen Hypervisor general tools | 4.17.4 | 3.xcpng8.3 | 35.66 KiB |
| xenopsd | Simple VM manager | 24.16.0 | 1.2.xcpng8.3 | 1.17 MiB |
| xenopsd-cli | CLI for xenopsd, the xapi toolstack domain manager | 24.16.0 | 1.2.xcpng8.3 | 1.61 MiB |
| xenopsd-xc | Xenopsd using xc | 24.16.0 | 1.2.xcpng8.3 | 4.61 MiB |
| xenserver-hwdata | Additional hardware identification and configuration data | 20240411 | 1.xcpng8.3 | 284.41 KiB |
| xenserver-status-report | A program that generates status reports for a XenServer host | 2.0.3 | 1.xcpng8.3 | 33.24 KiB |
| xo-lite | Xen Orchestra Lite | 0.2.3 | 1.xcpng8.3 | 816.19 KiB |
| xsconsole | XCP-ng Host Configuration Console | 11.0.2 | 1.1.xcpng8.3 | 304.44 KiB |
-
Did you take the drivers for XS8, which is the equivalent of XCP-ng 8.3?
-
@olivierlambert Would that mean that the vgpu binary should be taken from XS8 as well when using XCP-ng 8.3?
During testing I tried using that binary in XCP-ng 8.2, and it didn't work (VMs would not boot). I had to use the one from Citrix Hypervisor 8.2.
-
@olivierlambert
I tested with the NVIDIA XenServer driver, version 17.0.
-
I don't know all the versions, but I can tell that:
- XCP-ng 8.2 == XS 8.2
- XCP-ng 8.3 == XS 8
So be sure to use the matching binary first.
-
@olivierlambert
I have found the solution. I will test the whole thing again tomorrow with a clean installation of RC1.
-
Oh great! Keep us posted!!
-
@msupport Please write up all the steps involved, as this would be very useful documentation for anyone else wanting to accomplish this. Many have delayed switching to XCP-ng because of not being able to make use of NVIDIA GPUs.
-
It already works on 8.2, even though it's not official at all.
-
Installation instructions: XCP-ng 8.3 (RC1) with NVIDIA M10 | A16 GPU
- Install XCP-ng 8.3 RC1.
- Download the NVIDIA driver package 17.1 for XenServer (NVIDIA-GRID-XenServer-8-550.54.16-550.54.15-551.78).
- Unzip the package and copy the host driver (NVIDIA-vGPU-xenserver-8-550.54.16.x86_64.iso) to the host. I used WinSCP to copy it to the /tmp directory.
- Download the XenServer ISO file (https://www.xenserver.com/downloads | XenServer8_2024-06-03.iso).
- Copy the file vgpu-7.4.13-1.xs8.x86_64.rpm out of the packages directory of that ISO. Do not use the file vgpu-7.4.8-1.x86_64 from CitrixHypervisor-8.2.0-install-cd!
- Unpack the file vgpu-7.4.13-1.xs8.x86_64.rpm.
- Copy the unpacked file usr/lib64/xen/bin/vgpu (size 129 KB) to /usr/lib64/xen/bin/ on your XCP-ng host and make it executable (chmod 755).
- Over SSH (I used PuTTY), in /tmp/, run: xe-install-supplemental-pack NVIDIA-vGPU-xenserver-8-550.54.16.x86_64.iso
- Reboot the host.
- Install the guest driver in the VM (551.78_grid_win10_win11_server2022_dch_64bit_international.exe).
- Add the token file from NVIDIA in the VM (C:\Program Files\Nvidia Corporation\vGPU Licensing\ClientConfigToken*.tok).
NVIDIA drivers 17.2 and 17.3 do not work yet (the guest driver crashes).
I will keep you posted on new findings.
Have fun
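For anyone who prefers to do the host-side steps from a shell on the XCP-ng host instead of unpacking the RPM on a workstation, here is a minimal sketch. It assumes the RPM and the driver ISO were both uploaded to /tmp, and that `rpm2cpio` and `cpio` are available in dom0 (I have not confirmed that for every build). The function names, the `DRY_RUN` switch and the `vgpu-extract` directory are my own and purely illustrative. By default it only prints what it would do.

```shell
#!/bin/sh
# Sketch of the host-side steps from the instructions above.
# Assumptions (not from the original post): both files sit in $WORKDIR,
# and rpm2cpio/cpio exist on the host. DRY_RUN=1 only prints commands.
set -eu

DRY_RUN="${DRY_RUN:-1}"
WORKDIR="${WORKDIR:-/tmp}"
VGPU_RPM="vgpu-7.4.13-1.xs8.x86_64.rpm"
DRIVER_ISO="NVIDIA-vGPU-xenserver-8-550.54.16.x86_64.iso"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Unpack an RPM into a directory without installing it.
extract_rpm() {
    rpm_path="$1"
    dest="$2"
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ (cd $dest && rpm2cpio $rpm_path | cpio -idm)"
    else
        mkdir -p "$dest"
        (cd "$dest" && rpm2cpio "$rpm_path" | cpio -idm)
    fi
}

install_vgpu_host() {
    # 1. Unpack the vgpu RPM taken from the XenServer 8 ISO.
    extract_rpm "$WORKDIR/$VGPU_RPM" "$WORKDIR/vgpu-extract"
    # 2. Copy the vgpu binary into place with mode 755.
    run install -m 755 "$WORKDIR/vgpu-extract/usr/lib64/xen/bin/vgpu" /usr/lib64/xen/bin/vgpu
    # 3. Install the NVIDIA host driver supplemental pack.
    run xe-install-supplemental-pack "$WORKDIR/$DRIVER_ISO"
    echo "Now reboot the host, then install the guest driver in the VM."
}

install_vgpu_host
```

Run it once as-is to review the printed commands, then with `DRY_RUN=0` as root to actually apply them. The reboot is left as a manual step on purpose.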
-
@olivierlambert
Thanks for the hint, that helped me a lot.
-
Thank you very much!
-
@msupport Many thanks for your write-up! Have you experienced any issues communicating with the NVIDIA license server?
-
I also used instructions from @msupport
https://xcp-ng.org/forum/topic/8987/vgpu-nvidia-tesla-p4-xcp-ng-8-3-beta-2?_=1721015408249
-
@tjkreidl
The NVIDIA license server works perfectly so far.
-
I can't find where to get this new NVIDIA driver (for a Tesla V100).
Update: it looks like the only way is the license portal, https://nvid.nvidia.com/. Sad.
-
@Tristis-Oris
Driver version 17.1 worked for me; 17.2 and 17.3 crashed the Windows drivers.
https://we.tl/t-VozEeV8TFB
-
@msupport thank you. Will try to play with it.
-
@Tristis-Oris
The download will be available for 3 days...
-
I have followed the documentation, but for some reason the VMs won't power on with a vGPU profile attached. We are testing with NVIDIA M10 GPUs. I'm using XCP-ng 8.3 and NVIDIA host driver 17.1, and I also tried 17.2 and 17.3.
This is the error:
{
  "id": "0m369opsm",
  "properties": {
    "method": "vm.start",
    "params": {
      "id": "b8c94655-5801-21ee-7eb0-788a58b57736",
      "bypassMacAddressesCheck": false,
      "force": false
    },
    "name": "API call: vm.start",
    "userId": "2c8c735d-5369-4a91-8433-b9f94e6eb394",
    "type": "api.call"
  },
  "start": 1730921023894,
  "status": "failure",
  "updatedAt": 1730921066138,
  "end": 1730921066138,
  "result": {
    "code": "FAILED_TO_START_EMULATOR",
    "params": [
      "OpaqueRef:f3f7d9f6-9dc7-772e-ddfe-c1ed19f1aeff",
      "vgpu",
      "Device.Dm.start_vgpu: emulator failed to start for domain 1"
    ],
    "call": {
      "method": "VM.start",
      "params": [
        "OpaqueRef:f3f7d9f6-9dc7-772e-ddfe-c1ed19f1aeff",
        false,
        false
      ]
    },
    "message": "FAILED_TO_START_EMULATOR(OpaqueRef:f3f7d9f6-9dc7-772e-ddfe-c1ed19f1aeff, vgpu, Device.Dm.start_vgpu: emulator failed to start for domain 1)",
    "name": "XapiError",
    "stack": "XapiError: FAILED_TO_START_EMULATOR(OpaqueRef:f3f7d9f6-9dc7-772e-ddfe-c1ed19f1aeff, vgpu, Device.Dm.start_vgpu: emulator failed to start for domain 1)\n at Function.wrap (file:///usr/local/lib/node_modules/xo-server/node_modules/xen-api/_XapiError.mjs:16:12)\n at file:///usr/local/lib/node_modules/xo-server/node_modules/xen-api/transports/json-rpc.mjs:38:21\n at runNextTicks (node:internal/process/task_queues:60:5)\n at processImmediate (node:internal/timers:447:9)\n at process.callbackTrampoline (node:internal/async_hooks:128:17)"
host output