Gpu passthrough on Asrock rack B650D4U3-2L2Q will not work
-
Hello @steff22 ,
Tested with another PC with XCP-ng 8.3 and everything worked immediately. So I think it must be related to the ASRock Rack B650D4U3-2L2Q BIOS, or to IPMI trying to use the same GPU.
What are the specs of the hardware where it works?
-
@Teddy-Astie It's an ASUS Prime H370F with an old 8-core Intel Core i7-9700K.
-
I have GPU PCIe passthrough working on the ASRock Rack WRX80D8-2T (Threadripper) and X470D4U2 (Ryzen), passing through multiple Radeon Pro 7000-series GPUs.
Rodney
-
@ravenet Do you remember if you only enabled SVM and IOMMU? Or did you enable the whole bunch: SR-IOV, ACS, AER, ARI support, and Resizable BAR support?
-
@steff22
As far as I know, SVM, IOMMU, SR-IOV and POSSIBLY Resizable BAR were enabled.
I can't check the Threadripper system until the weekend as it runs production loads. I'll reboot one of the Ryzen boxes tonight to verify what I enabled there, but it was basically defaults other than making sure the 3 or 4 items above were enabled.
-
@ravenet OK, thanks.
It didn't work here.
But there may be an error in the BIOS, since it is a beta and the first BIOS that supports the AMD Ryzen 9000 series. The strange thing is that it works stably with Proxmox, without errors of any kind.
-
It's not strange: those are two VERY different systems in how they work.
-
@olivierlambert There is a slight software difference, yes, but both are hypervisors.
So the fact that one of them works means it should be possible to make it work with my hardware.
There isn't much to do with Proxmox beyond enabling IOMMU in GRUB, blacklisting the NVIDIA drivers, and setting disable_vga for the GPU.
Is it possible to blacklist the NVIDIA drivers in XCP-ng?
And is there no difference between enabling PCI passthrough in Xen Orchestra and doing it from the command line? I think I have tried both, but I'm not sure whether I had everything enabled in the BIOS at the time.
-
No, it's not a slight difference: it's a completely different design. In XCP-ng, you first boot Xen, a kind of microkernel. Only then do you boot a specific VM, the Dom0, which is a PV guest. From this guest, you get the API etc. But even though the Dom0 has access to the hardware for I/O (NICs, disks, GPUs…), it's still a VM, and it doesn't have access to all the CPUs and memory of the physical machine.
This means Xen will always be "in the middle" to control who gets what (cores and RAM). This is a great design from a security perspective because you always have a small piece of software (Xen) that is controlling what's going on.
In KVM, it's vastly different. You boot a full Linux kernel, and then you load a small module (KVM). The "host" is really the host, accessing all the memory and CPUs. There's a lot less isolation (and therefore security); the plus is that there's nobody in the middle to deal with.
I can assure you that PCI passthrough is really complex and depends on many factors, including how the BIOS is configured and how many things are handled by the motherboard manufacturer.
So knowing it works with Proxmox is like saying "it works on Windows": it's that different. It doesn't mean it's not possible in Xen, but if it is, how it's done will be vastly different.
-
@olivierlambert OK, I didn't realize there was such a big difference. I actually like everything better with XCP-ng and Xen Orchestra.
All testing has been done with UEFI BIOS. Is it just a waste of time to try legacy BIOS for XCP-ng and the VM OS?
Does it still make sense to blacklist the NVIDIA drivers in the XCP-ng Dom0 to isolate the GPU even more?
And is XCP-ng more like VMware ESXi in the way everything is handled? It doesn't work with VMware ESXi either.
-
Yes, you could say XCP-ng is a lot closer to VMware than to KVM (even if VMware is more advanced in some aspects, the architecture is roughly similar).
Anyway, it doesn't mean it will never work. However, it might be less straightforward than with KVM until we find exactly what's causing the problem.
-
@steff22 Assuming xen-pciback.hide was previously set, could you try this workaround (no guarantee that it'll work, since each motherboard and BIOS has its quirks):
/opt/xensource/libexec/xen-cmdline --set-dom0 pci=realloc
reboot
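For reference, the hide-then-realloc sequence can be sketched as a dry run. This assumes the GPU and its audio function sit at 0000:01:00.0 and 0000:01:00.1 (the addresses reported later in this thread). `build_hide_arg` and `print_plan` are hypothetical helper names, and the script only prints the commands instead of running them:

```shell
#!/bin/sh
# Dry-run sketch: prints the dom0 boot-parameter commands, does not execute them.
XEN_CMDLINE=/opt/xensource/libexec/xen-cmdline

# Build the xen-pciback.hide value from a list of PCI addresses, e.g.
# "0000:01:00.0 0000:01:00.1" -> "xen-pciback.hide=(0000:01:00.0)(0000:01:00.1)".
build_hide_arg() {
    arg="xen-pciback.hide="
    for bdf in "$@"; do
        arg="${arg}(${bdf})"
    done
    printf '%s\n' "$arg"
}

# Print, in order, the commands one would run on the XCP-ng host.
print_plan() {
    printf '%s --set-dom0 "%s"\n' "$XEN_CMDLINE" "$(build_hide_arg "$@")"
    printf '%s --set-dom0 pci=realloc\n' "$XEN_CMDLINE"
    printf 'reboot\n'
}

# Assumed addresses: the GTX 1070 Ti and its HDMI audio function.
print_plan 0000:01:00.0 0000:01:00.1
```

Printing the plan first makes it easy to review the exact boot parameters before touching the host, since a wrong address would hide the wrong device from Dom0.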
-
@tuxen That didn't work, unfortunately. I used Xen Orchestra for pciback.hide the last time, but I tried the old-fashioned way with the command line first.
-
I got GPU passthrough working with VMware ESXi 8.0.3 now.
There was an option to connect the card as a dynamic PCI device.
Isn't this an indication that the error lies in how XCP-ng handles PCI passthrough?
Both VMware ESXi 8.0.3 and Proxmox have two different ways to connect PCI devices to a VM. Only one of them works; the other causes error 43 in Windows.
But as I said, I don't know what the different options actually do.
-
Do you have more information on that VMware option?
-
@olivierlambert No, unfortunately, nothing but this picture
-
@Teddy-Astie Do you have any idea what it could be and why it makes the device work?
-
@olivierlambert I found this article, which explains a bit about dynamic PCI devices. From what I understood, dynamic PCI devices are newer and more accurate, and they use DRS.
-
@steff22 What's the output of the following commands?
lspci -k
xl pci-assignable-list
Also, the system log output regarding GPU and IOMMU initialization would be very useful:
egrep -i '(nvidia|vga|video|pciback)' /var/log/kern.log
xl dmesg
Tux
-
@tuxen said in Gpu passthrough on Asrock rack B650D4U3-2L2Q will not work:
lspci -k and xl pci-assignable-list
xcp-ng-Asrock ~]# lspci -k
00:00.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 14d8
        Subsystem: Advanced Micro Devices, Inc. [AMD] Device 14d8
00:00.2 IOMMU: Advanced Micro Devices, Inc. [AMD] Device 14d9
        Subsystem: Advanced Micro Devices, Inc. [AMD] Device 14d9
00:01.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 14da
00:01.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 14db
        Kernel driver in use: pcieport
00:01.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 14db
        Kernel driver in use: pcieport
00:02.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 14da
00:02.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 14db
        Kernel driver in use: pcieport
00:02.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 14db
        Kernel driver in use: pcieport
00:03.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 14da
00:04.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 14da
00:08.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 14da
00:08.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 14dd
        Kernel driver in use: pcieport
00:08.3 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 14dd
        Kernel driver in use: pcieport
00:14.0 SMBus: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller (rev 71)
        Subsystem: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller
        Kernel driver in use: piix4_smbus
        Kernel modules: i2c_piix4
00:14.3 ISA bridge: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge (rev 51)
        Subsystem: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge
00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 14e0
00:18.1 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 14e1
00:18.2 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 14e2
00:18.3 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 14e3
00:18.4 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 14e4
00:18.5 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 14e5
00:18.6 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 14e6
00:18.7 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 14e7
01:00.0 VGA compatible controller: NVIDIA Corporation GP104 [GeForce GTX 1070 Ti] (rev a1)
        Subsystem: ZOTAC International (MCO) Ltd. Device 2445
        Kernel driver in use: pciback
01:00.1 Audio device: NVIDIA Corporation GP104 High Definition Audio Controller (rev a1)
        Subsystem: ZOTAC International (MCO) Ltd. Device 2445
        Kernel driver in use: pciback
02:00.0 Ethernet controller: Broadcom Inc. and subsidiaries BCM57502 NetXtreme-E 10Gb/25Gb/40Gb/50Gb Ethernet (rev 12)
        Subsystem: ASRock Incorporation Device 1752
        Kernel driver in use: bnxt_en
        Kernel modules: bnxt_en
02:00.1 Ethernet controller: Broadcom Inc. and subsidiaries BCM57502 NetXtreme-E 10Gb/25Gb/40Gb/50Gb Ethernet (rev 12)
        Subsystem: ASRock Incorporation Device 1752
        Kernel driver in use: bnxt_en
        Kernel modules: bnxt_en
03:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] 600 Series Chipset PCIe Switch Upstream Port (rev 01)
        Kernel driver in use: pcieport
04:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] 600 Series Chipset PCIe Switch Downstream Port (rev 01)
        Kernel driver in use: pcieport
04:01.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] 600 Series Chipset PCIe Switch Downstream Port (rev 01)
        Kernel driver in use: pcieport
04:02.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] 600 Series Chipset PCIe Switch Downstream Port (rev 01)
        Kernel driver in use: pcieport
04:03.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] 600 Series Chipset PCIe Switch Downstream Port (rev 01)
        Kernel driver in use: pcieport
04:04.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] 600 Series Chipset PCIe Switch Downstream Port (rev 01)
        Kernel driver in use: pcieport
04:08.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] 600 Series Chipset PCIe Switch Downstream Port (rev 01)
        Kernel driver in use: pcieport
04:0c.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] 600 Series Chipset PCIe Switch Downstream Port (rev 01)
        Kernel driver in use: pcieport
04:0d.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] 600 Series Chipset PCIe Switch Downstream Port (rev 01)
        Kernel driver in use: pcieport
05:00.0 USB controller: Renesas Technology Corp. uPD720201 USB 3.0 Host Controller (rev 03)
        Subsystem: Renesas Technology Corp. uPD720201 USB 3.0 Host Controller
        Kernel driver in use: xhci_hcd
        Kernel modules: xhci_pci
06:00.0 Ethernet controller: Intel Corporation I210 Gigabit Network Connection (rev 03)
        Subsystem: ASRock Incorporation Device 1533
        Kernel driver in use: igb
        Kernel modules: igb
07:00.0 Ethernet controller: Intel Corporation I210 Gigabit Network Connection (rev 03)
        Subsystem: ASRock Incorporation Device 1533
        Kernel driver in use: igb
        Kernel modules: igb
08:00.0 PCI bridge: ASRock Incorporation Device 1150 (rev 06)
09:00.0 VGA compatible controller: ASPEED Technology, Inc. ASPEED Graphics Family (rev 52)
        Subsystem: ASRock Incorporation Device 2000
0c:00.0 USB controller: Advanced Micro Devices, Inc. [AMD] 600 Series Chipset USB 3.2 Controller (rev 01)
        Subsystem: ASMedia Technology Inc. Device 1142
        Kernel driver in use: xhci_hcd
        Kernel modules: xhci_pci
0d:00.0 SATA controller: Advanced Micro Devices, Inc. [AMD] 600 Series Chipset SATA Controller (rev 01)
        Subsystem: ASMedia Technology Inc. Device 1062
        Kernel driver in use: ahci
        Kernel modules: ahci
0e:00.0 Non-Volatile memory controller: Kingston Technology Company, Inc. FURY Renegade NVMe SSD with heatsink (rev 01)
        Subsystem: Kingston Technology Company, Inc. FURY Renegade NVMe SSD with heatsink
        Kernel driver in use: nvme
        Kernel modules: nvme
0f:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Device 13c0 (rev c1)
        Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Device 13c0
0f:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Rembrandt Radeon High Definition Audio Controller
        Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Rembrandt Radeon High Definition Audio Controller
0f:00.2 Encryption controller: Advanced Micro Devices, Inc. [AMD] Family 19h PSP/CCP
        Subsystem: Advanced Micro Devices, Inc. [AMD] Family 19h PSP/CCP
0f:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Device 15b6
        Subsystem: Advanced Micro Devices, Inc. [AMD] Device 15b6
        Kernel driver in use: xhci_hcd
        Kernel modules: xhci_pci
0f:00.4 USB controller: Advanced Micro Devices, Inc. [AMD] Device 15b7
        Subsystem: Advanced Micro Devices, Inc. [AMD] Device 15b6
        Kernel driver in use: xhci_hcd
        Kernel modules: xhci_pci
0f:00.5 Multimedia controller: Advanced Micro Devices, Inc. [AMD] ACP/ACP3X/ACP6x Audio Coprocessor (rev 62)
        Subsystem: Advanced Micro Devices, Inc. [AMD] ACP/ACP3X/ACP6x Audio Coprocessor
0f:00.6 Audio device: Advanced Micro Devices, Inc. [AMD] Family 17h/19h HD Audio Controller
        Subsystem: Advanced Micro Devices, Inc. [AMD] Device d601
10:00.0 USB controller: Advanced Micro Devices, Inc. [AMD] Device 15b8
        Subsystem: Advanced Micro Devices, Inc. [AMD] Device 15b6
        Kernel driver in use: xhci_hcd
        Kernel modules: xhci_pci
[17:29 xcp-ng-Asrock ~]# xl pci-assignable-list
0000:01:00.0
0000:01:00.1
[17:29 xcp-ng-Asrock ~]#
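One way to read this output is to check that every device bound to pciback in `lspci -k` also appears in `xl pci-assignable-list`. Below is a minimal sketch, fed with an excerpt of the output posted here rather than live commands. `pciback_devices` is a hypothetical helper name, and the `0000:` prefix assumes all devices are in PCI domain 0000, which `lspci` omits by default on single-domain systems:

```shell
#!/bin/sh
# Read lspci -k style output on stdin and print the address of every
# device whose "Kernel driver in use" is pciback.
pciback_devices() {
    awk '
        # Device header line, e.g. "01:00.0 VGA compatible controller: ..."
        /^[0-9a-f][0-9a-f]:[0-9a-f][0-9a-f]\.[0-7] / { bdf = $1 }
        # Driver line belonging to the most recent header; assume domain 0000.
        /Kernel driver in use: pciback/ { print "0000:" bdf }
    '
}

# Excerpt of the lspci -k output from this post.
sample='01:00.0 VGA compatible controller: NVIDIA Corporation GP104 [GeForce GTX 1070 Ti] (rev a1)
        Subsystem: ZOTAC International (MCO) Ltd. Device 2445
        Kernel driver in use: pciback
01:00.1 Audio device: NVIDIA Corporation GP104 High Definition Audio Controller (rev a1)
        Subsystem: ZOTAC International (MCO) Ltd. Device 2445
        Kernel driver in use: pciback
02:00.0 Ethernet controller: Broadcom Inc. and subsidiaries BCM57502 NetXtreme-E 10Gb/25Gb/40Gb/50Gb Ethernet (rev 12)
        Subsystem: ASRock Incorporation Device 1752
        Kernel driver in use: bnxt_en
        Kernel modules: bnxt_en'

printf '%s\n' "$sample" | pciback_devices
```

For this excerpt the helper prints 0000:01:00.0 and 0000:01:00.1, which matches the `xl pci-assignable-list` output, so the GPU and its audio function both appear to be correctly hidden from Dom0. On a live host, the same helper could be fed with `lspci -k | pciback_devices`.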