Hello community.
I am struggling to get PCI passthrough on my DIY homelab/NAS server to work.
I use the latest version (8.2.1-20231130) of XCP-ng.
Hardware:
CPU: AMD Ryzen 9 5950x
Motherboard: ASRock B550M Pro4 (BIOS version 3.42, provided by ASRock support; will go back to 3.40)
RAM: Kingston Fury Beast 128GB 3200MHz DDR4
GPU: Sapphire Nitro+ RX 580 (in PCIE1 for setup and troubleshooting; will be swapped for an RTX 4000 Ada Generation.)
Storage: Samsung 870 EVO SATA III (connected to SATA_1 on the mobo; main drive for the hypervisor and VMs)
Samsung 980 PRO NVMe 1TB (in M.2_1 on the mobo; planned as cache for ZFS)
5x SATA HDDs connected to SATA_2 through SATA_6 on the mobo
NIC: ULANSeN Dual PCIe 3.1 2.5GBase-T Network Adapter with Intel I225-V (currently in PCIE3 due to space limitations with the GPU; will be moved to PCIE2 after swapping the GPU)
10Gb SFP+ NIC (planned for slot PCIE3, but due to hardware incompatibility between the motherboard and NICs with the Aquantia AQC100S chipset, I'm still searching for a compatible one. I should be able to get my hands on a NIC with the Intel X520-DA1 chipset, which ASRock support has tested and confirmed as compatible)
Accelerator: Coral M.2 Edge TPU, A+E key (in the M.2_3 slot on the mobo meant for Wi-Fi cards)
Following the documentation for Compute and GPU passthrough does not work at all.
The commands, which in my case are
/opt/xensource/libexec/xen-cmdline --set-dom0 "xen-pciback.hide=(0000:07:00.0)(0000:08:00.0)(0000:0e:00.0)"
/opt/xensource/libexec/xen-cmdline --delete-dom0 xen-pciback.hide
do nothing.
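For reference, these are the sanity checks I would run after a reboot to see whether the parameter actually landed on the dom0 command line (just a sketch, using the stock paths on XCP-ng):
# After a reboot, check whether the hide parameter is on the running dom0 kernel command line
grep -o "xen-pciback.hide=[^ ]*" /proc/cmdline
# List the devices Xen considers assignable
xl pci-assignable-list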
I was able to manually unbind the dual 2.5Gb NIC from dom0 using the following commands:
echo "0000:07:00.0" > /sys/bus/pci/devices/0000:07:00.0/driver/unbind
echo "0000:08:00.0" > /sys/bus/pci/devices/0000:08:00.0/driver/unbind
echo "0000:07:00.0" > /sys/bus/pci/drivers/pciback/new_slot
echo "0000:07:00.0" > /sys/bus/pci/drivers/pciback/bind
echo "0000:08:00.0" > /sys/bus/pci/drivers/pciback/new_slot
echo "0000:08:00.0" > /sys/bus/pci/drivers/pciback/bind
but this method does not work for the TPU (0000:0e:00.0).
When running “xl pci-assignable-list”, the NICs show up after unbinding them manually, but the TPU does not.
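My assumption (not verified) is that the unbind step fails for the TPU because nothing in dom0 has claimed it, so the .../driver/unbind path does not even exist. This is the check I would do, and the pciback-only variant I would try:
# Show which dom0 driver (if any) currently claims the TPU
lspci -nnk -s 0e:00.0
# If no driver is bound, skip the unbind and only register the slot with pciback
echo "0000:0e:00.0" > /sys/bus/pci/drivers/pciback/new_slot
echo "0000:0e:00.0" > /sys/bus/pci/drivers/pciback/bind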
After attaching the NICs to the desired VM, the VM refuses to start.
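The other route I have seen in the XCP-ng docs is attaching the devices through xe other-config:pci instead of xl. A sketch of what I would try (the UUID is a placeholder):
# Find the target VM's UUID
xe vm-list
# Attach both NIC functions to the VM (UUID below is a placeholder)
xe vm-param-set uuid=<vm-uuid> other-config:pci=0/0000:07:00.0,0/0000:08:00.0
# If it still refuses to start, look for the PCI error in the toolstack log
grep -i pci /var/log/xensource.log | tail -n 20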
I don’t mind running into issues and having to tweak things to solve them, but my knowledge has limits. I’m a newbie with virtualization (seven months) and have been working with Linux for only 1.5 years. Even so, I’m no stranger to SSH and know how to use the CLI. I am familiar with Docker, docker-compose and Docker Swarm.
I have been successfully running XCP-ng on a network appliance with an Intel 1240P CPU and dual 10Gb SFP+ ports for half a year now, as a test server for what Wendell from Level1Techs calls the forbidden router. I am currently running that server to test the setup, but it will go into production pretty soon.
Due to a CPU fan failure on the appliance after only four months, I shifted the plan for the homelab server. The replacement fan took six weeks to arrive.
I don’t really trust the network appliance to be reliable anymore.
Infrastructure:
• Server1/Network appliance (KingnovyPC 13th Gen Firewall Micro appliance, 2x10G SFP, 6xi226 2.5G LAN)
o VM for Xen Orchestra
o VM for primary pfSense; SFP+ ports passed through to isolate dom0 from the WAN interface.
o VM for Docker (masternode_1, worker_1 docker swarm)
▪ Portainer
▪ Pi-hole
▪ Unbound
▪ LAN Cache
▪ UniFi Network appliance server
▪ …
• Server2/DIY Homelab
o VM for Xen Orchestra failover
o VM for secondary pfSense (HA config); dual 2.5G NIC passed through to properly isolate dom0 from the WAN interface.
o VM for Docker (masternode_2, worker_2 docker swarm)
o VM for NAS (Unraid or TrueNAS; not decided yet). 10G NIC, TPU, GPU, NVMe and HDDs will be passed through to this VM
▪ Plex
▪ NVR
▪ ownCloud
▪ Tinkering with ML/AI
▪ …
• Server3 Raspberry Pi 4B (masternode_3 docker swarm)
What I have found out so far is that the mobo does some strange things with IOMMU groups. I found a post where someone had to tweak grub.cfg and add “amd_iommu=on”, “iommu=pt” and “pcie_acs_override=downstream,multifunction” to make PCI passthrough work on Proxmox.
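Here is a sketch of what I plan to check from dom0 to see whether Xen itself reports the IOMMU as active, and how the devices hang off the PCI topology:
# Check whether Xen reports the AMD IOMMU (AMD-Vi) as enabled at boot
xl dmesg | grep -i -e iommu -e amd-vi
# Show the PCI topology, to see which bridge the NICs and the TPU sit behind
lspci -tv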
Any suggestions on what I have to do to make my setup work? I would also need to know how.
I really want to use XCP-ng for my setup!
BIOS Configuration:
• Advanced => PCI Configuration
o Above 4G Decoding: Enable
o Re-Size BAR Support: Enable
o SR-IOV Support: Enable
• Advanced => AMD CBS => CPU Common Options
o SVM: Enable
• Advanced => AMD CBS => NBIO Common Options
o IOMMU: Enable
o DMA Protection: Enable
o DMAr Support: Auto
o ACS Enable: Enable
o PCIe ARI Support: Enable
o PCIe ARI Enumeration: Enable
o PCIe AER Cap: Enable
o SRIS: Enable
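If it helps, this is the quick sanity check I can run from dom0 to confirm the BIOS settings are actually visible to Xen (a sketch; my understanding is that virt_caps should list hvm_directio when SVM and the IOMMU are usable for passthrough):
# virt_caps should include hvm (SVM) and hvm_directio (IOMMU) when both are active
xl info | grep -e virt_caps -e xen_commandline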