
    AMD 'Barcelo' passthrough issues - any success stories?

      DustyArmstrong

      I am trying to pass through my AMD GPU to a Debian VM. This partly succeeds, but I am getting error messages about a "BIOS ROM". I have looked online and found some resources suggesting how to provide one, but I am unsure about the process and cannot locate a ROM specific to my GPU (AMD Ryzen 7 5825U with Radeon Graphics, I think Vega 8).

      The guest shows that it has at least tried to associate the kernel module with the device:

      lspci -nnk -s 00:08.0
      00:08.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Barcelo [1002:15e7] (rev c1)
      	Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Device [1002:1636]
      	Kernel modules: amdgpu
      

      But dmesg output states it cannot locate a ROM.

      [    4.655776] amdgpu 0000:00:08.0: amdgpu: Unable to locate a BIOS ROM
      [    4.655797] amdgpu 0000:00:08.0: amdgpu: Fatal error during GPU init
      [    4.655812] amdgpu 0000:00:08.0: amdgpu: amdgpu: finishing device.
      [    4.656681] amdgpu 0000:00:08.0: probe with driver amdgpu failed with error -22
      

      Is there a trick to this? Has anyone had success with this kind of AMD GPU? On my old hosts, enabling passthrough was enough for it to just kind of work (Intel HD 530). The host machine outputs to a display normally when the card is in use by the host. My understanding is that the ROM is part of the motherboard/GPU. There is some suggestion it can be dumped from the host side, but I'm unsure about this.

      Can anyone shed some light on how this should work, or whether it's just something in my setup that isn't working? I have IOMMU and SR-IOV enabled in the BIOS, but I don't know if it's working.

      EDIT: It looks like I may just have a fake BIOS? The settings to enable all the relevant components (IOMMU, DMAr support etc) don't actually seem to do anything, they might just be for show - dmesg | grep -i iommu returns nothing, dmesg | grep -i -e dmar -e vfio -e pciback only shows pciback info, and cat /proc/cmdline contains nothing about IOMMU. Oddly, XO is still reporting that IOMMU is enabled:

      [Screenshot: XO reporting IOMMU as enabled for the host]

      The output of xe host-param-get uuid={uuid} param-name=chipset-info also returns iommu: true, but the output of xl info | grep "iommu" returns nothing.

        TeddyAstie Vates 🪐 XCP-ng Team Xen Guru @DustyArmstrong

        @DustyArmstrong

        @DustyArmstrong said:

        EDIT: It looks like I may just have a fake BIOS? The settings to enable all the relevant components (IOMMU, DMAr support etc) don't actually seem to do anything, they might just be for show - dmesg | grep -i iommu returns nothing, dmesg | grep -i -e dmar -e vfio -e pciback only shows pciback info, and cat /proc/cmdline contains nothing about IOMMU. Oddly, XO is still reporting that IOMMU is enabled:

        dmesg in Dom0 will not report the information you're looking for.
        To know whether PCI passthrough is supported (i.e. the IOMMU is enabled), check xl info | grep virt_caps and look for hvm_directio. You can also look for IOMMU-related messages in xl dmesg.
        As you managed to pass the device through (even if it is not working in the guest), I don't see an issue there.
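The check described above can be sketched as a small shell helper. This is a minimal sketch, not an official XCP-ng tool: `has_directio` is a hypothetical function name, and it only assumes that `xl info` prints a `virt_caps` line listing the capabilities separated by spaces.

```shell
#!/bin/sh
# has_directio: read `xl info` output on stdin and succeed when the
# virt_caps line lists hvm_directio (HVM PCI passthrough available).
# The function name is made up for this example.
has_directio() {
    grep '^virt_caps' | grep -qw 'hvm_directio'
}

# Typical use in Dom0 (as root):
#   xl info | has_directio && echo "passthrough supported"
#   xl dmesg | grep -i iommu    # hypervisor log, not the Dom0 kernel log
```

The key point is that the IOMMU is owned by the Xen hypervisor, so its messages show up in `xl dmesg` rather than in the Dom0 kernel's `dmesg`.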

        @DustyArmstrong said:

        [ 4.655776] amdgpu 0000:00:08.0: amdgpu: Unable to locate a BIOS ROM
        [ 4.655797] amdgpu 0000:00:08.0: amdgpu: Fatal error during GPU init
        [ 4.655812] amdgpu 0000:00:08.0: amdgpu: amdgpu: finishing device.
        [ 4.656681] amdgpu 0000:00:08.0: probe with driver amdgpu failed with error -22

        Is there a trick to this, has anyone had success with this kind of AMD GPU? On my old hosts, enabling pass through was enough for it to just kind of work (Intel HD 530). The host machine outputs to a display normally when the card is in-use by the host. I am of the understanding the ROM is just part of the motherboard/GPU, there is some suggestion it can be dumped from the host-side, but I'm unsure on this.

        It looks like the GPU ROMBAR is missing in the guest. That is fine for many devices, but many others (like this GPU) will fail to work without it.
        To me, there is something missing in the PCI passthrough logic. I have just raised the topic internally to see what we can do.
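One way to confirm a missing ROM BAR from inside the guest is to look for the "Expansion ROM" line that `lspci -vv` prints for devices exposing one. The helper below is only a sketch: `has_expansion_rom` is a hypothetical name, and the device address 00:08.0 is the guest-side address quoted earlier in this thread.

```shell
#!/bin/sh
# has_expansion_rom: read `lspci -vv` output for a single device on stdin
# and succeed if it contains an "Expansion ROM at ..." line.
# The function name is made up for this example.
has_expansion_rom() {
    grep -q 'Expansion ROM at'
}

# Typical use in the guest (as root, so lspci can read the full config space):
#   lspci -vv -s 00:08.0 | has_expansion_rom || echo "no ROM BAR exposed"
# The sysfs file below is only created when the device has a ROM resource:
#   ls /sys/bus/pci/devices/0000:00:08.0/rom
```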

          DustyArmstrong @TeddyAstie

          @TeddyAstie Thanks for that - I was going by other, older XCP hosts I have run that did show IOMMU info there, so I figured it wasn't working on these particular machines. I also had the Intel GPU working on those hosts, which further supported my theory. I do have a USB controller passed through to the same VM and that works perfectly, with basically native functionality, but my research suggested USB controllers are far less bothered by imperfect passthrough.

          I was able to find a ROM for my particular GPU, but I don't know what - if anything - I can do with it in XCP. I've found lots of info for QEMU and Proxmox, but I'm not sure they directly translate.

          If there's nothing I can do for the moment then I can live without the GPU, running decode on the CPU isn't really taxing it that much. Appreciate you looking into it.

            TeddyAstie Vates 🪐 XCP-ng Team Xen Guru @DustyArmstrong

            @DustyArmstrong
            Can you give the result of:

            lspci -vvv -s 00:08.0
            

            (inside Dom0)

            Another question: which guest were you trying?
            Can you try with a recent Linux kernel? Some changes were made recently regarding the video BIOS requirement. The latest Fedora should have a recent enough kernel for testing. That might help work around the issue in the meantime (and show whether there are more issues), with no guarantee.

              DustyArmstrong @TeddyAstie

              @TeddyAstie Sure no problem.

              lspci -vvv -s 00:08.0
              
              00:08.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir PCIe Dummy Host Bridge
              	Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
              	Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
              

              The VM is Debian with a slightly older kernel, so that may be why? Happy to try a different VM distro if you think it might help.

              Linux cctv 6.12.69+deb13-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.12.69-1 (2026-02-08) x86_64 GNU/Linux

                TeddyAstie Vates 🪐 XCP-ng Team Xen Guru @DustyArmstrong

                @DustyArmstrong said:

                lspci -vvv -s 00:08.0

                00:08.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir PCIe Dummy Host Bridge
                Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
                Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-

                Ah that's not the one I'm looking for.

                Can you run lspci -vvv (without the -s ...) and post the part related to the GPU?
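Picking the GPU section out of a full `lspci -vvv` dump can be done with awk in paragraph mode, since lspci separates devices with blank lines. This is a sketch under that assumption; `gpu_blocks` is a hypothetical helper name, and the class names it matches are the usual lspci labels for graphics devices.

```shell
#!/bin/sh
# gpu_blocks: read full `lspci -vvv` output on stdin and print only the
# device blocks whose text names a graphics device class.
# RS= puts awk in paragraph mode, so each blank-line-separated block is
# one record. The function name is made up for this example.
gpu_blocks() {
    awk -v RS= -v ORS='\n\n' \
        '/VGA compatible controller|3D controller|Display controller/'
}

# Typical use in Dom0 (as root):
#   lspci -vvv | gpu_blocks
```

This avoids guessing the bus address, which can differ between the guest and Dom0.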

                  DustyArmstrong @TeddyAstie

                  @TeddyAstie yarp.

                  My bad, the VM has it as 00:08.0 but on the host it's actually 06:00.0. I just didn't think about the specifics of your request!

                  06:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Barcelo (rev c1) (prog-if 00 [VGA controller])
                  	Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Device 1636
                  	Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
                  	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
                  	Interrupt: pin A routed to IRQ 38
                  	Region 0: Memory at d0000000 (64-bit, prefetchable) [size=256M]
                  	Region 2: Memory at e0000000 (64-bit, prefetchable) [size=2M]
                  	Region 4: I/O ports at d000 [size=256]
                  	Region 5: Memory at fca00000 (32-bit, non-prefetchable) [size=512K]
                  	Capabilities: [48] Vendor Specific Information: Len=08 <?>
                  	Capabilities: [50] Power Management version 3
                  		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1+,D2+,D3hot+,D3cold+)
                  		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
                  	Capabilities: [64] Express (v2) Legacy Endpoint, MSI 00
                  		DevCap:	MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
                  			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
                  		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                  			RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
                  			MaxPayload 256 bytes, MaxReadReq 512 bytes
                  		DevSta:	CorrErr- UncorrErr+ FatalErr- UnsuppReq+ AuxPwr- TransPend-
                  		LnkCap:	Port #0, Speed 8GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
                  			ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
                  		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- CommClk+
                  			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                  		LnkSta:	Speed 8GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                  		DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR-, OBFF Not Supported
                  		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
                  		LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
                  			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                  			 Compliance De-emphasis: -6dB
                  		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete+, EqualizationPhase1+
                  			 EqualizationPhase2+, EqualizationPhase3+, LinkEqualizationRequest-
                  	Capabilities: [a0] MSI: Enable- Count=1/4 Maskable- 64bit+
                  		Address: 0000000000000000  Data: 0000
                  	Capabilities: [c0] MSI-X: Enable- Count=4 Masked-
                  		Vector table: BAR=5 offset=00042000
                  		PBA: BAR=5 offset=00043000
                  	Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
                  	Capabilities: [270 v1] #19
                  	Capabilities: [2a0 v1] Access Control Services
                  		ACSCap:	SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
                  		ACSCtl:	SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
                  	Capabilities: [2b0 v1] Address Translation Service (ATS)
                  		ATSCap:	Invalidate Queue Depth: 00
                  		ATSCtl:	Enable-, Smallest Translation Unit: 00
                  	Capabilities: [2c0 v1] Page Request Interface (PRI)
                  		PRICtl: Enable- Reset-
                  		PRISta: RF- UPRGI- Stopped+
                  		Page Request Capacity: 00000100, Page Request Allocation: 00000000
                  	Capabilities: [2d0 v1] Process Address Space ID (PASID)
                  		PASIDCap: Exec+ Priv+, Max PASID Width: 10
                  		PASIDCtl: Enable- Exec- Priv-
                  	Capabilities: [400 v1] #25
                  	Capabilities: [410 v1] #26
                  	Capabilities: [440 v1] #27
                  	Kernel driver in use: pciback
                  
                  
                    TeddyAstie Vates 🪐 XCP-ng Team Xen Guru @DustyArmstrong

                    Thanks.

                    So basically, there is a more annoying issue: the device doesn't even have a ROMBAR. In this case, the VBIOS is likely in the VFCT ACPI table of the host (which the guest can't see), and it needs to be injected as a "fake" ROMBAR for the guest to behave properly.

                    That is doable on its own, but it's quite tricky to integrate (and you would, e.g., need to extract the VBIOS from VFCT using external tools).
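For illustration, the extraction step could look roughly like the sketch below. It is not an XCP-ng feature and does not solve the injection part. The layout is an assumption taken from the Linux amdgpu driver's VFCT structures: a 32-bit little-endian `VBIOSImageOffset` at byte 0x34 of the table, then a 28-byte image header whose last 32-bit field is `ImageLength`, followed by the VBIOS bytes. The function names, the output file name, and the idea that the table is readable at `/sys/firmware/acpi/tables/VFCT` are also assumptions. Only the first image in the table is copied, without matching it against the PCI address.

```shell
#!/bin/sh
# read_u32_le FILE OFFSET: print the little-endian 32-bit value at OFFSET.
read_u32_le() {
    # After `set --`, $1..$4 hold the four bytes as decimal numbers.
    set -- $(od -An -v -tu1 -j "$2" -N 4 "$1")
    [ $# -eq 4 ] || return 1
    echo $(( $1 + ($2 << 8) + ($3 << 16) + ($4 << 24) ))
}

# extract_vfct_vbios TABLE OUT: copy the first VBIOS image in TABLE to OUT.
# Assumed layout (from the amdgpu driver headers, not verified here):
#   table + 0x34 : u32 offset of the first image header
#   header + 24  : u32 image length, header is 28 bytes long
extract_vfct_vbios() {
    table=$1
    out=$2
    img=$(read_u32_le "$table" 52) || return 1
    len=$(read_u32_le "$table" $((img + 24))) || return 1
    [ "$len" -gt 0 ] || return 1
    # tail -c +N is 1-indexed, hence the extra +1 after skipping the header.
    tail -c +$((img + 28 + 1)) "$table" | head -c "$len" > "$out"
}

# Possible use on the host (as root), assuming the table is exposed:
#   extract_vfct_vbios /sys/firmware/acpi/tables/VFCT vbios.rom
```

A valid PCI option ROM image normally starts with the bytes 55 aa, which gives a quick sanity check on the result.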

                    I just discussed this with Xen/AMD people, and there are known issues regarding PCI passthrough of integrated AMD GPUs (not specific to Xen, AFAIU). There are projects exploring alternative approaches to bring AMD GPUs to VMs (virtio-gpu native context), which are the current focus.
