No free virtual function found vGPU S7150
-
I recently installed a pair of single-slot Radeon S7150 GPUs in a couple of HPE DL360 G9 servers.
The cards showed up as passthrough-capable in XCP-ng Center.
I then installed the host drivers (mxgpu-2.0.0.amd) and rebooted, after which MxGPU (AMD vGPU) was enabled.
I assigned a vGPU to a Windows VM, and when I tried to start the VM I got the error below in XCP-ng:
"Internal Error: No free virtual function found"
Xen Orchestra produced the same error. I attempted the solution outlined in this thread: https://xcp-ng.org/forum/topic/3404/firepro-s7150x2-sr-iov-errors
After a reboot, the XCP-ng server screen never appeared and the server no longer connected to the network.
I had to reboot and go through the options in the GRUB boot screen until I found one that booted back into XCP-ng. vGPU still does not work. Send help.
-
@erfant could you upload the output of lspci -k and dmesg? Also, have you checked/tried:
- SR-IOV enabled in BIOS;
- On the vmlinuz entry, boot with pci=realloc;
- On the vmlinuz entry, boot with pci=realloc pci=assign-busses
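For anyone following along, here is a minimal sketch of how the requested outputs could be collected into files for attaching to a post. It assumes a root shell on the XCP-ng host; the function name collect_diag and the default directory /tmp/mxgpu-diag are made up for illustration. It also saves /proc/cmdline, which shows whether options such as pci=realloc were actually applied to the running kernel.

```shell
# Hypothetical helper: gather the outputs asked for above into one directory.
# Assumption: run as root on the host (dom0); the directory name is arbitrary.
collect_diag() {
    outdir="${1:-/tmp/mxgpu-diag}"
    mkdir -p "$outdir" || return 1

    # PCI devices and the kernel driver bound to each one.
    # '|| true' lets the function continue if a tool is missing or not permitted;
    # any error message ends up in the corresponding file.
    lspci -k > "$outdir/lspci-k.txt" 2>&1 || true

    # Kernel ring buffer, including MxGPU host driver messages.
    dmesg > "$outdir/dmesg.txt" 2>&1 || true

    # Options the running kernel was booted with (e.g. pci=realloc, if set).
    cat /proc/cmdline > "$outdir/cmdline.txt" 2>&1 || true

    # Print the directory so the caller knows where the files are.
    echo "$outdir"
}
```

Calling collect_diag with no argument writes the three files under /tmp/mxgpu-diag. As far as I know, dom0 kernel options on XCP-ng are usually changed with /opt/xensource/libexec/xen-cmdline --set-dom0 rather than by hand-editing the GRUB entry, but please confirm that against the XCP-ng documentation before changing anything.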
-
@tuxen I've attached both outputs; they're quite long.
I can also confirm SR-IOV is definitely enabled in the BIOS (double-checked). I'm not sure what you mean by steps 2 and 3.
-
@erfant after seeing your uploaded dmesg, the boot options in steps 2 and 3 can be put aside for a while, because the error isn't the same as in the other topics. The log shows MxGPU driver probe/initialization errors. After some digging, this could be a case of the GPU firmware being incompatible with UEFI. Do you have a spare server for testing an XCP-ng boot in legacy/BIOS mode with this GPU?
[ 119.418930] gim error:(gim_probe:123) gim_probe(08:00.0)
[ 121.145663] gim error:(wait_cmd_complete:2387) wait_cmd_complete -- time out after 0.003044131 sec
[ 121.145719] gim error:(wait_cmd_complete:2390) Cmd = 0x17, Status = 0x0, cmd_Complete=0
[ 121.145984] gim error:(init_register_init_state:4643) Failed to INIT PF for initial register 'init-state'
Edited for clarification.
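
Since the full dmesg is long, a small filter makes the relevant lines easier to spot. This is only a sketch: the function name gim_errors is made up, and it assumes the MxGPU host driver tags its errors with the literal text "gim error:" as in the lines quoted above.

```shell
# Hypothetical helper: print only the MxGPU host driver ("gim") error lines
# from a kernel log supplied on standard input.
gim_errors() {
    # -F matches the fixed string; '|| true' avoids a failure status
    # when the log contains no gim errors at all.
    grep -F 'gim error:' || true
}
```

On the host it can be used as dmesg | gim_errors, or against a saved log with gim_errors < dmesg.txt.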
-
@tuxen
That's very bizarre.
I don't have any spare servers for testing at the moment.
I do have an NVMe drive plugged into one of the PCIe slots. Could that be causing an issue?
-
Unrelated: just wanted to chime in and thank @tuxen for your great community help!
-
@erfant probably not, because the nvme driver is loaded and there are no nvme errors in the logs.
@olivierlambert thank you and your team for this great project and community! It's a nice place to share knowledge and learn new stuff. I learn a lot here!
-