GPU passthrough on ASRock Rack B650D4U3-2L2Q will not work
-
Pinging @Teddy-Astie
-
Hello @steff22,
Tested with another PC with XCP-ng 8.3 and everything worked immediately. So I think it must be related to the ASRock Rack B650D4U3-2L2Q BIOS, or to IPMI trying to use the same GPU.
What are the specs of the hardware where it works?
-
@Teddy-Astie It's an ASUS Prime H370F with an old 8-core Intel Core i7-9700K.
-
I have GPU PCIe passthrough working on ASRock Rack WRX80D8-2T (Threadripper) and X470D4U2 (Ryzen) boards, passing through multiple Radeon Pro 7000-series GPUs.
Rodney
-
@ravenet Do you remember if you only enabled SVM and IOMMU? Or did you enable the whole bunch: SR-IOV, ACS, AER, ARI support, and Resizable BAR support?
-
@steff22
As far as I know, SVM, IOMMU, SR-IOV and POSSIBLY Resizable BAR were enabled.
Can't check the Threadripper system until the weekend as it runs production loads. I'll reboot one of the Ryzen boxes tonight to verify what I enabled there, but it was basically defaults other than making sure the 3 or 4 items above were enabled.
-
@ravenet OK, thanks.
Didn't work here.
But there may be some error in the BIOS, since it is a beta and the first BIOS that supports the AMD Ryzen 9000 series. The strange thing is that it works stably with Proxmox without errors of any kind.
-
It's not strange, those are two VERY different systems in how they work.
-
@olivierlambert There is a slight software difference, yes, but both are hypervisors.
So the fact that one of them works means that it should be possible to make it happen with my hardware.
There isn't much you have to do with Proxmox other than enable IOMMU in GRUB, blacklist the NVIDIA drivers, and set disable_vga for the GPU card.
Is it possible to blacklist the NVIDIA drivers with XCP-ng?
And is there no difference between enabling PCI passthrough in Xen Orchestra and on the command line? I think I have tried both, but I'm not sure whether I had enabled everything in the BIOS at the time.
-
No, it's not a slight difference: it's a completely different design. In XCP-ng, you first boot Xen, a kind of microkernel. Only then do you boot a specific VM, the Dom0, which is a PV guest. Then, from this guest, you have the API etc. But even though the Dom0 has access to the hardware for I/O (NICs, disks, GPUs…), it's still a VM, and Dom0 doesn't have access to all CPUs and memory of the physical machine.
This means Xen will always be "in the middle" to control who takes what (on cores and RAM). This is a great design from a security perspective because you always have a small piece of software (Xen) that is controlling what's going on.
In KVM, it's vastly different. You boot a full Linux kernel, and then you load a small module (KVM). The "host" is really the host, accessing all the memory and CPUs. There's a lot less isolation (and therefore security); however, the plus is that there's nobody in the middle to deal with.
I can assure you that PCI passthrough is really complex and also depends on many factors, even how the BIOS is configured and how many things are done by the motherboard manufacturer.
So knowing it works with Proxmox is like saying "it works on Windows"; it's that different. It doesn't mean it's not possible in Xen, but if it is, how it's done will be vastly different.
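For anyone who wants to see this split on their own host, here is a minimal sketch (not from the post above) that compares the CPU count Xen reports for the whole machine with the count Dom0 itself sees. It assumes it is run as root in Dom0 with the standard xl tool available; compare_cpus is a hypothetical helper name.

```shell
# Hypothetical helper: compare the physical CPUs known to Xen
# with the vCPUs visible inside Dom0.
compare_cpus() {
  # "xl info" prints a line like "nr_cpus : 16" for the whole physical host.
  host_cpus=$(xl info | awk -F: '/^nr_cpus/ {gsub(/ /, "", $2); print $2}')
  # nproc only counts the vCPUs that Xen assigned to this domain (Dom0).
  dom0_cpus=$(nproc)
  echo "host=${host_cpus} dom0=${dom0_cpus}"
}
```

Calling compare_cpus on a host where Dom0 was given fewer vCPUs than the machine has cores should print two different numbers, which reflects the "Xen in the middle" design.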
-
@olivierlambert OK, I didn't realize there was such a big difference. I actually like everything better with XCP-ng and Xen Orchestra.
All testing has been done with UEFI BIOS. Is it just a waste of time to try legacy BIOS on XCP-ng and the VM OS?
Does it still make sense to try to blacklist the NVIDIA drivers in the XCP-ng Dom0 to isolate the GPU even more?
But is XCP-ng more like VMware ESXi in the way everything is handled? It doesn't work with VMware ESXi either.
-
Yes, you could say XCP-ng is a lot closer to VMware than to KVM (even if VMware is more advanced in some aspects, the architecture is roughly similar).
Anyway, it doesn't mean that it will never work. However, it might be less straightforward than with KVM until we find exactly what's causing the problem.
-
@steff22 Assuming xen-pciback.hide was previously set, could you try this workaround (no guarantee that it'll work, since each motherboard and BIOS has its quirks):
/opt/xensource/libexec/xen-cmdline --set-dom0 pci=realloc
reboot
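In case the hiding step is unclear, here is a minimal sketch of how the xen-pciback.hide argument is put together before applying the workaround. The PCI addresses are placeholders (take the real ones from lspci), and pciback_hide_arg is a hypothetical helper, not something shipped with XCP-ng.

```shell
# Hypothetical helper: build the Dom0 kernel argument that hides the
# given PCI devices (full domain:bus:device.function addresses) from Dom0.
pciback_hide_arg() {
  arg="xen-pciback.hide="
  for bdf in "$@"; do
    # Each device is appended in its own pair of parentheses.
    arg="${arg}(${bdf})"
  done
  printf '%s\n' "$arg"
}

# Example with placeholder addresses for a GPU and its audio function:
#   /opt/xensource/libexec/xen-cmdline --set-dom0 "$(pciback_hide_arg 0000:01:00.0 0000:01:00.1)"
#   /opt/xensource/libexec/xen-cmdline --set-dom0 pci=realloc
#   reboot
```

The example lines are left as comments on purpose, since they change the boot configuration and the addresses have to be adapted first.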
-
@tuxen Didn't work, unfortunately. I used Xen Orchestra for pciback.hide the last time, but I tried the old-fashioned way first, with the command line.
-
I got GPU passthrough working with VMware ESXi 8.0.3 now.
There was an option to connect the card as a dynamic PCI device.
Isn't this an indication that the error lies in how XCP-ng handles PCI passthrough?
Both VMware ESXi 8.0.3 and Proxmox have two different ways to connect PCI devices to a VM; only one of them works, and the other method produces error 43 in Windows.
But as I said, I don't know anything about what the different choices do.
-
Do you have more information on that VMware option?
-
@olivierlambert No, unfortunately, nothing but this picture
-
@Teddy-Astie Do you have any idea what it could be and why it makes the device work?
-
@olivierlambert Found this article, which explains a bit about dynamic PCI devices. From what I understood, dynamic PCI devices are newer and more accurate, and they use DRS.
-
@steff22 What's the output of the following two commands?
lspci -k
xl pci-assignable-list
Also, the system logs covering GPU and IOMMU initialization would be very useful:
egrep -i '(nvidia|vga|video|pciback)' /var/log/kern.log
xl dmesg
Tux
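To make it easier to post all of that in one go, here is a minimal sketch that gathers the requested outputs into a single file. It assumes it is run as root in Dom0. collect_gpu_diag, the default output file name and the KERN_LOG variable are hypothetical names introduced for this example, and grep -Ei is used as the equivalent of egrep -i.

```shell
# Hypothetical helper: collect the requested diagnostics into one file.
# Usage: collect_gpu_diag [output-file]
collect_gpu_diag() {
  out=${1:-gpu-passthrough-diag.txt}
  # Kernel log path; can be overridden if the log lives elsewhere.
  log=${KERN_LOG:-/var/log/kern.log}
  {
    echo "== lspci -k =="
    lspci -k
    echo "== xl pci-assignable-list =="
    xl pci-assignable-list
    echo "== kernel log (GPU / pciback lines) =="
    grep -Ei '(nvidia|vga|video|pciback)' "$log"
    echo "== xl dmesg =="
    xl dmesg
  } > "$out" 2>&1
}
```

Running collect_gpu_diag gpu-diag.txt should leave one text file that can be attached to a reply; any errors from the individual commands end up in the same file.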