How to enable vgpu on Nvidia T4 after the nvidia drivers and vgpu binary from xenserver are installed?
-
Re: nVidia Tesla P4 for vgpu and Plex encoding
Thanks to various guides, I was able to install the driver on xcp-ng 8.3.
I used NVIDIA-vGPU-CitrixHypervisor-8.2-570.124.03.x86_64.iso
and vgpu-7.4.16-1.xs8.x86_64.rpm (from XenServer).
The driver seems to work and the T4 is detected:
# nvidia-smi
Thu Apr 3 17:18:25 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.124.03             Driver Version: 570.124.03     CUDA Version: N/A      |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  Tesla T4                       Off |   00000000:02:00.0 Off |                    0 |
| N/A   86C    P0             42W /   70W |      13MiB /  15360MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
Unfortunately the card doesn't seem to be in vGPU mode:
# nvidia-smi vgpu -q
No supported devices in vGPU mode
I'm not sure whether "Addressing Mode: Unknown Error" is anything to be concerned about; I cannot find anything specific about it.
# nvidia-smi -q
==============NVSMI LOG==============
Timestamp                                 : Thu Apr 3 16:53:35 2025
Driver Version                            : 570.124.03
CUDA Version                              : Not Found
Attached GPUs                             : 1
GPU 00000000:02:00.0
    Product Name                          : Tesla T4
    Product Brand                         : NVIDIA
    Product Architecture                  : Turing
    Display Mode                          : Enabled
    Display Active                        : Disabled
    Persistence Mode                      : Disabled
    Addressing Mode                       : Unknown Error
    ...
    GPU Virtualization Mode
        Virtualization Mode               : None
        Host VGPU Mode                    : N/A
        vGPU Heterogeneous Mode           : N/A
    ...
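For anyone who wants to check just the relevant field instead of reading the whole log, here is a minimal sketch of a shell helper that pulls the "Virtualization Mode" value out of `nvidia-smi -q` output. The function name `virt_mode` is my own (hypothetical), and it assumes the field layout matches the log above.

```shell
# Hypothetical helper: reads `nvidia-smi -q` output on stdin and prints
# the value of the "Virtualization Mode" field (first match only).
# The anchored pattern skips the "GPU Virtualization Mode" section header
# and the "Host VGPU Mode" line.
virt_mode() {
    awk -F' *: *' '/^[ \t]*Virtualization Mode/ {
        sub(/[ \t]+$/, "", $2)   # drop trailing whitespace, if any
        print $2
        exit
    }'
}

# Usage on the host (not run here):
#   nvidia-smi -q | virt_mode
```

On this host it would print `None`, matching the log above.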
I also see that vGPU / SR-IOV is (at least theoretically) supported:
# lspci -v -s 02:00.0
02:00.0 3D controller: NVIDIA Corporation TU104GL [Tesla T4] (rev a1)
    Subsystem: NVIDIA Corporation Device 12a2
    Physical Slot: 6
    Flags: bus master, fast devsel, latency 0, IRQ 32
    Memory at f8000000 (32-bit, non-prefetchable) [size=16M]
    Memory at 383fc0000000 (64-bit, prefetchable) [size=256M]
    Memory at 383ff0000000 (64-bit, prefetchable) [size=32M]
    Capabilities: [60] Power Management version 3
    Capabilities: [68] #00 [0080]
    Capabilities: [78] Express Endpoint, MSI 00
    Capabilities: [c8] MSI-X: Enable- Count=6 Masked-
    Capabilities: [100] Virtual Channel
    Capabilities: [258] L1 PM Substates
    Capabilities: [128] Power Budgeting <?>
    Capabilities: [420] Advanced Error Reporting
    Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
    Capabilities: [900] #19
    Capabilities: [bb0] #15
    Capabilities: [bcc] Single Root I/O Virtualization (SR-IOV)
    Capabilities: [c14] Alternative Routing-ID Interpretation (ARI)
    Kernel driver in use: nvidia
    Kernel modules: nvidia
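The same check can be scripted. This is only a sketch: `has_sriov` is a name I made up, and it merely tests whether the SR-IOV capability line appears in `lspci -v` output. That the device advertises the capability does not by itself mean it is enabled or that vGPU is active.

```shell
# Hypothetical helper: reads `lspci -v` output on stdin and succeeds
# (exit status 0) if an SR-IOV capability line is present.
has_sriov() {
    grep -q 'Single Root I/O Virtualization'
}

# Usage on the host (not run here):
#   lspci -v -s 02:00.0 | has_sriov && echo "SR-IOV capability advertised"
```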
I do have SR-IOV enabled in BIOS, but having it disabled didn't seem to change anything.
Those with nvidia vGPUs, how have you ended up getting xcp-ng to enable the feature?
-
Ok, found a solution.
It seems there was nothing I could do via the CLI or xo-ce to see the virtual GPUs; I had to install XCP-ng Center, specifically from:
https://github.com/xcp-ng/xenadmin/files/13800584/XCP-NG-Center-2023-Release.zip
Using XCP-ng Center, I was able to assign the GPU to the VM, and everything worked well from then on.
Now to figure out licensing....