NVIDIA GPU passthrough on XCP-ng 8.3 fails after reboot — UUID/PCI ID changes
-
I’m trying to use passthrough for an NVIDIA GPU on XCP-ng 8.3.
The host detects the GPU (lspci shows VGA + Audio), and IOMMU is enabled. However, whenever I apply xen-pciback.hide and reboot the host, XCP-ng generates a new internal UUID and new PCI ID for the GPU.
As a result, I cannot even assign the GPU to a VM, because the device “changes” its reference after each reboot.
Additional important context:
I previously had a different GPU doing passthrough in a VM.
That GPU was removed.
The new GPU was installed in a different slot.
I suspect the issues may be related to leftover metadata from the previous GPU, which prevents the new GPU from being recognized consistently for passthrough.
Question:
Has anyone encountered this problem? Is there a way to completely clear the old passthrough state on the host, or to ensure that the new GPU is recognized consistently for passthrough?
-
Hi,
Remove the boot parameter entirely with
/opt/xensource/libexec/xen-cmdline --delete-dom0 xen-pciback.hide
then reboot and re-assign the device.
-
Thank you for your reply.
I followed your instructions; however, when I pass the NVIDIA GPU through in Xen Orchestra, the same error occurs after I reboot.
The PCI ID and UUID of the graphics card are no longer the ones it had before rebooting.
-
Can you tell us on what platform you are doing this? It looks like a buggy PCI reset.
-
I have XCP-ng 8.3 installed.
Xen Orchestra, commit bcee5.
When I add a device manually (the audio function, for example), it asks for a reboot, but then, when I restart the server, the device has a different PCI ID and a different UUID.
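One way to confirm whether the IDs really change across a reboot is to save the output of `xe pci-list` before and after, then compare the two captures. The sketch below is only an illustration: the function names are made up for this example, the field layout is assumed to match the usual `key ( RO): value` format printed by `xe`, and the "after" UUID is a placeholder, not real data from this host.

```python
import re

# Matches xe fields such as "uuid ( RO) : <value>" or "pci-id ( RO): <value>".
FIELD = re.compile(r"^\s*([\w-]+)\s*\(\s*[MS]?R[OW]\s*\)\s*:\s*(.*?)\s*$")


def parse_pci_list(text):
    """Return {pci-id: uuid} from a saved `xe pci-list` capture."""
    mapping = {}
    current_uuid = None
    for line in text.splitlines():
        match = FIELD.match(line)
        if not match:
            continue
        key, value = match.groups()
        if key == "uuid":
            current_uuid = value
        elif key == "pci-id" and current_uuid is not None:
            mapping[value] = current_uuid
    return mapping


def diff_pci_lists(before, after):
    """Compare two captures: changed UUIDs, vanished and new PCI addresses."""
    old, new = parse_pci_list(before), parse_pci_list(after)
    return {
        "changed": {a: (old[a], new[a]) for a in sorted(old.keys() & new.keys())
                    if old[a] != new[a]},
        "gone": sorted(old.keys() - new.keys()),
        "new": sorted(new.keys() - old.keys()),
    }


# "BEFORE" uses the GPU record posted in this thread; "AFTER" swaps in a
# placeholder UUID purely to show what a changed entry would look like.
BEFORE = """\
uuid ( RO)             : 73708288-55ec-b17f-ba73-6d2c116b3bbc
    vendor-name ( RO): NVIDIA Corporation
    device-name ( RO): GB202GL [RTX PRO 6000 Blackwell Max-Q Workstation Edition]
         pci-id ( RO): 0000:03:00.0
"""
AFTER = BEFORE.replace("73708288-55ec-b17f-ba73-6d2c116b3bbc",
                       "00000000-0000-0000-0000-000000000000")

if __name__ == "__main__":
    print(diff_pci_lists(BEFORE, AFTER))
```

If "changed" is empty and "gone"/"new" are empty too, the host is reporting the same UUID and address as before, and the problem is more likely on the assignment side than in device enumeration.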

-
Hmm, I'm not sure I follow. But let me add @Team-Hypervisor-Kernel in case that rings a bell.
-
Hi,
@samuelolavo can you run the following command on the dom0 and paste its output, please:
grep -A5 "'XCP-ng'" /etc/grub.cfg
Then also run
lspci -vvv
xe pci-list
xl pci-assignable-list
-
@yannsionneau Hi,
Sorry for the delay...

menuentry 'XCP-ng' {
    search --label --set root root-zrxcsq
    multiboot2 /boot/xen.gz dom0_mem=8192M,max:8192M watchdog ucode=scan dom0_max_vcpus=1-16 crashkernel=256M,below=4G console=vga vga=mode-0x0311
    module2 /boot/vmlinuz-4.19-xen root=LABEL=root-zrxcsq ro nolvm hpet=disable console=hvc0 console=tty0 quiet vga=785 splash plymouth.ignore-serial-consoles xen-pciback.hide=(0000:03:00.0)
    module2 /boot/initrd-4.19-xen.img
}

03:00.0 VGA compatible controller: NVIDIA Corporation GB202GL [RTX PRO 6000 Blackwell Max-Q Workstation Edition] (rev a1) (prog-if 00 [VGA controller])
    Subsystem: NVIDIA Corporation Device 204c
    Physical Slot: 10
    Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Interrupt: pin A routed to IRQ 89
    Region 0: Memory at f4000000 (32-bit, non-prefetchable) [disabled] [size=64M]
    Region 1: Memory at 70060000000 (64-bit, prefetchable) [disabled] [size=256M]
    Region 3: Memory at 70070000000 (64-bit, prefetchable) [disabled] [size=32M]
    Region 5: I/O ports at 1000 [disabled] [size=128]
    Expansion ROM at f8000000 [disabled] [size=512K]
    Capabilities: [40] Power Management version 3
        Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold-)
        Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [48] MSI: Enable- Count=1/16 Maskable+ 64bit+
        Address: 0000000000000000 Data: 0000
        Masking: 00000000 Pending: 00000000
    Capabilities: [60] Express (v2) Legacy Endpoint, MSI 00
        DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <64ns, L1 unlimited ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+
        DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported- RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset- MaxPayload 256 bytes, MaxReadReq 512 bytes
        DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
        LnkCap: Port #0, Speed unknown, Width x16, ASPM L1, Exit Latency L0s unlimited, L1 unlimited ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
        LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+ ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        LnkSta: Speed unknown, Width x4, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        DevCap2: Completion Timeout: Range AB, TimeoutDis+, LTR+, OBFF Via message
        DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR+, OBFF Via WAKE#
        LnkCtl2: Target Link Speed: Unknown, EnterCompliance- SpeedDis- Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS- Compliance De-emphasis: -6dB
        LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+, EqualizationPhase1+ EqualizationPhase2+, EqualizationPhase3+, LinkEqualizationRequest-
    Capabilities: [9c] Vendor Specific Information: Len=14 <?>
    Capabilities: [100 v1] #19
    Capabilities: [12c v1] Latency Tolerance Reporting
        Max snoop latency: 1048576ns
        Max no snoop latency: 1048576ns
    Capabilities: [134 v1] #15
    Capabilities: [14c v1] #25
    Capabilities: [158 v1] #26
    Capabilities: [188 v1] #2a
    Capabilities: [1b8 v2] Advanced Error Reporting
        UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
        UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO+ CmpltAbrt- UnxCmplt+ RxOF+ MalfTLP+ ECRC+ UnsupReq- ACSViol-
        CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
        CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
        AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
    Capabilities: [200 v1] #27
    Capabilities: [248 v1] Alternative Routing-ID Interpretation (ARI)
        ARICap: MFVC- ACS-, Next Function: 1
        ARICtl: MFVC- ACS-, Function Group: 0
    Capabilities: [2a4 v1] Vendor Specific Information: ID=0001 Rev=1 Len=014 <?>
    Capabilities: [2bc v1] Power Budgeting <?>
    Capabilities: [2f4 v1] Device Serial Number 18-a6-fe-7f-8f-2d-b0-48
    Kernel driver in use: pciback

03:00.1 Audio device: NVIDIA Corporation Device 22e8 (rev a1)
    Subsystem: NVIDIA Corporation Device 0000
    Physical Slot: 10
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0, Cache Line Size: 64 bytes
    Interrupt: pin B routed to IRQ 10
    Region 0: Memory at f8080000 (32-bit, non-prefetchable) [size=16K]
    Capabilities: [40] Power Management version 3
        Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [48] MSI: Enable- Count=1/1 Maskable+ 64bit+
        Address: 0000000000000000 Data: 0000
        Masking: 00000000 Pending: 00000000
    Capabilities: [60] Express (v2) Endpoint, MSI 00
        DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <64ns, L1 unlimited ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 75.000W
        DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported- RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ MaxPayload 256 bytes, MaxReadReq 512 bytes
        DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
        LnkCap: Port #0, Speed unknown, Width x16, ASPM L1, Exit Latency L0s unlimited, L1 unlimited ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
        LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+ ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        LnkSta: Speed unknown, Width x4, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        DevCap2: Completion Timeout: Range AB, TimeoutDis+, LTR+, OBFF Via message
        DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
        LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1- EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
    Capabilities: [9c] Vendor Specific Information: Len=14 <?>
    Capabilities: [100 v1] #25
    Capabilities: [10c v2] Advanced Error Reporting
        UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
        UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
        UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO+ CmpltAbrt- UnxCmplt+ RxOF+ MalfTLP+ ECRC+ UnsupReq- ACSViol-
        CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
        CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
        AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
    Capabilities: [154 v1] Alternative Routing-ID Interpretation (ARI)
        ARICap: MFVC- ACS-, Next Function: 0
        ARICtl: MFVC- ACS-, Function Group: 0

09:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Genoa/Bergamo Dummy Function (rev 01)
    Subsystem: Advanced Micro Devices, Inc. [AMD] Genoa/Bergamo Dummy Function
    Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Capabilities: [48] Vendor Specific Information: Len=08 <?>
    Capabilities: [50] Power Management version 3
        Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [64] Express (v2) Endpoint, MSI 00
        DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0.000W
        DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported- RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset- MaxPayload 128 bytes, MaxReadReq 512 bytes
        DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
        LnkCap: Port #0, Speed unknown, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
        LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+ ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        LnkSta: Speed unknown, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR-, OBFF Not Supported
        DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
        LnkCtl2: Target Link Speed: Unknown, EnterCompliance- SpeedDis- Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS- Compliance De-emphasis: -6dB
        LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1- EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
    Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
    Capabilities: [270 v1] #19
    Capabilities: [328 v1] Alternative Routing-ID Interpretation (ARI)
        ARICap: MFVC- ACS-, Next Function: 1
        ARICtl: MFVC- ACS-, Function Group: 0
    Capabilities: [410 v1] #26
    Capabilities: [450 v1] #27
    Capabilities: [500 v1] #2a

xe pci-list
uuid ( RO) : 73708288-55ec-b17f-ba73-6d2c116b3bbc
    vendor-name ( RO): NVIDIA Corporation
    device-name ( RO): GB202GL [RTX PRO 6000 Blackwell Max-Q Workstation Edition]
    pci-id ( RO): 0000:03:00.0

uuid ( RO) : c94f0327-8c86-3aa8-dd7c-9389ae1123f5
    vendor-name ( RO): Intel Corporation
    device-name ( RO): Ethernet Controller X550
    pci-id ( RO): 0000:81:00.1

uuid ( RO) : 09e0f3b1-18bb-8a6e-97d8-0209a8e4a97c
    vendor-name ( RO): Advanced Micro Devices, Inc. [AMD]
    device-name ( RO): FCH SATA Controller [AHCI mode]
    pci-id ( RO): 0000:0a:00.1

uuid ( RO) : cc5feb5c-b8aa-a975-de76-513d309f8e73
    vendor-name ( RO): Intel Corporation
    device-name ( RO): Ethernet Controller X550
    pci-id ( RO): 0000:41:00.0

uuid ( RO) : 72baa4c2-13b3-22fb-cc87-b10033ccb025
    vendor-name ( RO): Broadcom / LSI
    device-name ( RO): MegaRAID 12GSAS/PCIe Secure SAS39xx
    pci-id ( RO): 0000:c1:00.0

uuid ( RO) : d2d40d25-4b69-7f6a-a8e2-3101ad80fcb6
    vendor-name ( RO): Intel Corporation
    device-name ( RO): Ethernet Controller X550
    pci-id ( RO): 0000:41:00.1

uuid ( RO) : 630bdaee-0e03-b8a1-c726-4f34230e89f7
    vendor-name ( RO): Intel Corporation
    device-name ( RO): Ethernet Controller X550
    pci-id ( RO): 0000:81:00.0

uuid ( RO) : d4023077-83fe-a0b7-5f3f-516204c2c1d1
    vendor-name ( RO): Advanced Micro Devices, Inc. [AMD]
    device-name ( RO): FCH SATA Controller [AHCI mode]
    pci-id ( RO): 0000:ce:00.1

uuid ( RO) : c67855cd-4908-0711-7424-a6db2eb011f0
    vendor-name ( RO): Advanced Micro Devices, Inc. [AMD]
    device-name ( RO): FCH SATA Controller [AHCI mode]
    pci-id ( RO): 0000:0a:00.0

uuid ( RO) : ef1d93e4-e82b-c33c-a488-8f7a9129eb8a
    vendor-name ( RO): NVIDIA Corporation
    device-name ( RO): Device 22e8
    pci-id ( RO): 0000:03:00.1

uuid ( RO) : b6235c56-4070-dc4d-9db2-5e361f38d2b2
    vendor-name ( RO): Advanced Micro Devices, Inc. [AMD]
    device-name ( RO): FCH SATA Controller [AHCI mode]
    pci-id ( RO): 0000:ce:00.0

uuid ( RO) : 5c7258ae-504b-9ac4-8b16-7129b8d8455d
    vendor-name ( RO): ASPEED Technology, Inc.
    device-name ( RO): ASPEED Graphics Family
    pci-id ( RO): 0000:cc:00.0

xl pci-assignable-list
0000:03:00.0

Model: Supermicro AS-2015CS-TNR
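One thing the output above makes visible: the boot line hides only 0000:03:00.0, while the same card also exposes an audio function at 0000:03:00.1, and only 0000:03:00.0 shows up as assignable. Whether that is related to the changing IDs is not established in this thread. The sketch below is just a way to spot such a mismatch automatically; the helper names are hypothetical, and the inputs are the strings posted above.

```python
import re


def hidden_devices(cmdline):
    """Return the PCI addresses listed in xen-pciback.hide=(...)(...)."""
    match = re.search(r"xen-pciback\.hide=((?:\([^)]*\))+)", cmdline)
    if not match:
        return []
    return re.findall(r"\(([^)]*)\)", match.group(1))


def unhidden_siblings(cmdline, known_pci_ids):
    """List functions that share a slot with a hidden device but are not hidden."""
    hidden = set(hidden_devices(cmdline))
    # "0000:03:00.0" -> "0000:03:00" (domain:bus:device, without the function)
    slots = {addr.rsplit(".", 1)[0] for addr in hidden}
    return sorted(
        addr for addr in known_pci_ids
        if addr.rsplit(".", 1)[0] in slots and addr not in hidden
    )


# Kernel line taken from the grub.cfg output above (shortened).
CMDLINE = ("module2 /boot/vmlinuz-4.19-xen root=LABEL=root-zrxcsq ro nolvm "
           "xen-pciback.hide=(0000:03:00.0)")
# A few of the pci-id values from the xe pci-list output above.
PCI_IDS = ["0000:03:00.0", "0000:03:00.1", "0000:81:00.1", "0000:cc:00.0"]

if __name__ == "__main__":
    print(unhidden_siblings(CMDLINE, PCI_IDS))
```

With the values from this thread, the only address reported is the GPU's audio function, 0000:03:00.1.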