XCP-ng
    DustyArmstrong
    • Topics 11
    • Posts 62
    • Groups 0

    Posts

    • RE: AMD 'Barcelo' passthrough issues - any success stories?

      @timemaster5 Ah very cool, thank you for looking into this and for the comprehensive write-up, great work. I went on a small journey trying various workarounds but didn't get as far as you and others, particularly as my use case is only a "nice-to-have". Almost everything I found related to Proxmox and others. Little did I know, when I thought I was helping you over on GitHub, that you would actually end up helping me!

      As I understand from your notes, the full procedure would be:

      1. Enable passthrough for the GPU on the host
      2. Apply the patched wrapper to enable custom model args and respect platform vga
      3. Disable emulated VGA via platform:vga=none
      4. Apply the config to the VM via -device loader options
      5. Reboot the host
      6. Mount the GPU to the VM and boot it up
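A rough dry-run sketch of steps 1, 3 and 6 above, which only prints commands for review and changes nothing. It assumes the standard XCP-ng passthrough commands (`xen-cmdline --set-dom0` and `other-config:pci`), the host-side address 0000:06:00.0, and a placeholder `VM_UUID`. The function name `print_passthrough_plan` is made up for illustration. Steps 2 and 4 depend on the patched wrapper and its -device loader options, so they are deliberately left out rather than guessed.

```shell
#!/usr/bin/env bash
# Dry run only: prints candidate commands for steps 1, 3 and 6 so they can be
# reviewed before anything is changed. Steps 2 and 4 (patched wrapper and
# -device loader options) are specific to the write-up and are not guessed here.

print_passthrough_plan() {
    local bdf=$1       # host-side PCI address, e.g. 0000:06:00.0 (assumption)
    local vm_uuid=$2   # UUID of the target VM (placeholder)

    # Step 1: hide the GPU from dom0 so it can be assigned (host reboot follows in step 5).
    printf '%s\n' "/opt/xensource/libexec/xen-cmdline --set-dom0 \"xen-pciback.hide=($bdf)\""
    # Step 3: disable the emulated VGA adapter (per the notes, needs the patched wrapper).
    printf '%s\n' "xe vm-param-set uuid=$vm_uuid platform:vga=none"
    # Step 6: attach the device to the VM, then start it.
    printf '%s\n' "xe vm-param-set uuid=$vm_uuid other-config:pci=0/$bdf"
    printf '%s\n' "xe vm-start uuid=$vm_uuid"
}

print_passthrough_plan "${1:-0000:06:00.0}" "${2:-VM_UUID}"
```

Printing rather than executing keeps the ordering visible and makes it easy to slot the wrapper-specific steps in by hand.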
      posted in Hardware
      DustyArmstrongD
      DustyArmstrong
    • RE: AMD 'Barcelo' passthrough issues - any success stories?

      @TeddyAstie Thanks for the update.

      I do actually have a VBIOS for that GPU, but I wasn't entirely sure what to do with it - is there a process to inject it? I've found resources for Proxmox and others, but I really like XCP and equally I don't want to migrate my entire setup just for that.

      If it's really tricky then I'm not too worried about it, as I say the VM actually runs the cameras perfectly fine on CPU alone, it's negligible.

      Edit: Looks like this post might answer my question: https://github.com/xcp-ng/xcp/issues/786

      "Even when specifying romfile and rombar properties on the xen-pci-passthrough device in QEMU, the ROM region is not mapped into guest memory."

      Issue #786 in xcp-ng/xcp (open, created by timemaster5): PCI ROM BAR not exposed to guest when using xen-pci-passthrough

      posted in Hardware
    • RE: AMD 'Barcelo' passthrough issues - any success stories?

      @TeddyAstie yarp.

      My bad, the VM has it as 00:08.0 but on the host it's actually 06:00.0; I just didn't think about the specifics of your request!

      06:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Barcelo (rev c1) (prog-if 00 [VGA controller])
      	Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Device 1636
      	Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
      	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
      	Interrupt: pin A routed to IRQ 38
      	Region 0: Memory at d0000000 (64-bit, prefetchable) [size=256M]
      	Region 2: Memory at e0000000 (64-bit, prefetchable) [size=2M]
      	Region 4: I/O ports at d000 [size=256]
      	Region 5: Memory at fca00000 (32-bit, non-prefetchable) [size=512K]
      	Capabilities: [48] Vendor Specific Information: Len=08 <?>
      	Capabilities: [50] Power Management version 3
      		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1+,D2+,D3hot+,D3cold+)
      		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
      	Capabilities: [64] Express (v2) Legacy Endpoint, MSI 00
      		DevCap:	MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
      			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
      		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
      			RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
      			MaxPayload 256 bytes, MaxReadReq 512 bytes
      		DevSta:	CorrErr- UncorrErr+ FatalErr- UnsuppReq+ AuxPwr- TransPend-
      		LnkCap:	Port #0, Speed 8GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
      			ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
      		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- CommClk+
      			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
      		LnkSta:	Speed 8GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
      		DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR-, OBFF Not Supported
      		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
      		LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
      			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
      			 Compliance De-emphasis: -6dB
      		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete+, EqualizationPhase1+
      			 EqualizationPhase2+, EqualizationPhase3+, LinkEqualizationRequest-
      	Capabilities: [a0] MSI: Enable- Count=1/4 Maskable- 64bit+
      		Address: 0000000000000000  Data: 0000
      	Capabilities: [c0] MSI-X: Enable- Count=4 Masked-
      		Vector table: BAR=5 offset=00042000
      		PBA: BAR=5 offset=00043000
      	Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
      	Capabilities: [270 v1] #19
      	Capabilities: [2a0 v1] Access Control Services
      		ACSCap:	SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
      		ACSCtl:	SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
      	Capabilities: [2b0 v1] Address Translation Service (ATS)
      		ATSCap:	Invalidate Queue Depth: 00
      		ATSCtl:	Enable-, Smallest Translation Unit: 00
      	Capabilities: [2c0 v1] Page Request Interface (PRI)
      		PRICtl: Enable- Reset-
      		PRISta: RF- UPRGI- Stopped+
      		Page Request Capacity: 00000100, Page Request Allocation: 00000000
      	Capabilities: [2d0 v1] Process Address Space ID (PASID)
      		PASIDCap: Exec+ Priv+, Max PASID Width: 10
      		PASIDCtl: Enable- Exec- Priv-
      	Capabilities: [400 v1] #25
      	Capabilities: [410 v1] #26
      	Capabilities: [440 v1] #27
      	Kernel driver in use: pciback
      
      
      posted in Hardware
    • RE: Issue to load gpu passthrough "Invalid PCI ROM header signature: expecting 0xaa55, got 0x4556"

      @Greg_E Thanks, I've got another thread up and it's potentially being addressed!

      posted in Hardware
    • RE: AMD 'Barcelo' passthrough issues - any success stories?

      @TeddyAstie Sure no problem.

      lspci -vvv -s 00:08.0
      
      00:08.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir PCIe Dummy Host Bridge
      	Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
      	Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
      

      The VM is Debian with a slightly older kernel, so that may be why? Happy to try a different VM distro if you think it might help.

      Linux cctv 6.12.69+deb13-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.12.69-1 (2026-02-08) x86_64 GNU/Linux

      posted in Hardware
    • RE: Issue to load gpu passthrough "Invalid PCI ROM header signature: expecting 0xaa55, got 0x4556"

      @itnok fair enough and understandable, thanks for taking the time to reply.

      posted in Hardware
    • RE: AMD 'Barcelo' passthrough issues - any success stories?

      @TeddyAstie Thanks for that - I was going off other old XCP hosts I have had running that did show IOMMU info there, so I figured it wasn't working on these particular machines. I also had the Intel GPU working on those hosts, which further supported my theory. I do have a USB controller passed through to the same VM and that works perfectly, basically native functionality, but the research I was doing suggested that USB controllers are far less bothered by imperfect passthrough.

      I was able to find a ROM for my particular GPU, but I don't know what - if anything - I can do with it in XCP. I've found lots of info for QEMU and Proxmox, but I'm not sure they directly translate.
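Before trying to do anything with a downloaded ROM, it may be worth confirming that the file at least starts with the PCI expansion ROM signature (bytes 0x55 0xAA, i.e. the 0xaa55 word mentioned in the related thread title). A minimal sketch, where the function name `rom_signature_ok` and the file path argument are placeholders of mine:

```shell
#!/usr/bin/env bash
# Sketch: check that a ROM image starts with the PCI expansion ROM signature
# (bytes 0x55 0xAA, which read as the little-endian 16-bit word 0xaa55).

rom_signature_ok() {
    local rom_file=$1 sig
    # Read the first two bytes as lowercase hex and strip all whitespace.
    sig=$(od -An -tx1 -N2 "$rom_file" | tr -d '[:space:]')
    [ "$sig" = "55aa" ]
}

# Optional: pass a ROM file path as the first argument to check it.
if [ -n "${1:-}" ]; then
    if rom_signature_ok "$1"; then
        echo "$1: signature looks like a PCI option ROM"
    else
        echo "$1: unexpected signature" >&2
    fi
fi
```

This only validates the first two bytes. It says nothing about whether the image matches this particular GPU or whether the guest will ever see it.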

      If there's nothing I can do for the moment then I can live without the GPU, running decode on the CPU isn't really taxing it that much. Appreciate you looking into it.

      posted in Hardware
    • RE: Issue to load gpu passthrough "Invalid PCI ROM header signature: expecting 0xaa55, got 0x4556"

      @itnok hey, did you ever get this working? I am in a similar position now with the ROM header.

      posted in Hardware
    • AMD 'Barcelo' passthrough issues - any success stories?

      I am trying to pass through my AMD GPU to a Debian VM. This sort of succeeds, but I am getting error messages about a "BIOS ROM". I have had a look online and found some resources suggesting how to provide one, but I am unsure about that and cannot locate a specific one for my GPU (AMD Ryzen 7 5825U with Radeon Graphics, I think Vega 8).

      I am showing it has at least tried to load/associate the kernel module:

      lspci -nnk -s 00:08.0
      00:08.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Barcelo [1002:15e7] (rev c1)
      	Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Device [1002:1636]
      	Kernel modules: amdgpu
      

      But dmesg output states it cannot locate a ROM.

      [    4.655776] amdgpu 0000:00:08.0: amdgpu: Unable to locate a BIOS ROM
      [    4.655797] amdgpu 0000:00:08.0: amdgpu: Fatal error during GPU init
      [    4.655812] amdgpu 0000:00:08.0: amdgpu: amdgpu: finishing device.
      [    4.656681] amdgpu 0000:00:08.0: probe with driver amdgpu failed with error -22
      

      Is there a trick to this? Has anyone had success with this kind of AMD GPU? On my old hosts, enabling passthrough was enough for it to just kind of work (Intel HD 530). The host machine outputs to a display normally when the card is in use by the host. My understanding is that the ROM is just part of the motherboard/GPU; there is some suggestion it can be dumped from the host side, but I'm unsure about this.

      Can anyone shed some light on how this should work, or whether it's just something in my setup that isn't working? I have IOMMU and SR-IOV enabled in the BIOS, but I don't know if it's working.

      EDIT: It looks like I may just have a fake BIOS? The settings to enable all the relevant components (IOMMU, DMAR support, etc.) don't actually seem to do anything, so they might just be for show: dmesg | grep -i iommu returns nothing, dmesg | grep -i -e dmar -e vfio -e pciback only shows pciback info, and cat /proc/cmdline contains nothing about IOMMU. Oddly, XO still reports that IOMMU is enabled:

      0e1c311b-7e3f-48d7-b840-58ec85aef7b6-image.png

      The output of xe host-param-get uuid={uuid} param-name=chipset-info also returns iommu: true, but the output of xl info | grep "iommu" returns nothing.
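One possible explanation for the empty greps, offered as a sketch rather than a diagnosis: on a Xen host the hypervisor owns the IOMMU, so dom0's dmesg and /proc/cmdline may show nothing even when it is active. As far as I know, `xl info` reports it through `hvm_directio` / `pv_directio` in the `virt_caps` line rather than the word "iommu", and the hypervisor boot log is read with `xl dmesg`. The helper name `has_directio` is made up for this example.

```shell
#!/usr/bin/env bash
# Sketch: look for IOMMU evidence on the Xen side instead of in dom0's kernel log.

# Reads `xl info` output on stdin; succeeds if virt_caps lists a directio capability.
has_directio() {
    grep -Eq '^virt_caps[[:space:]]*:.*directio'
}

# Only runs on a machine where xl exists (i.e. the XCP-ng host itself).
if command -v xl >/dev/null 2>&1; then
    if xl info | has_directio; then
        echo "Xen reports directio: IOMMU appears to be in use"
    else
        echo "No directio capability in virt_caps"
    fi
    # The hypervisor's own boot messages, separate from dom0's dmesg.
    xl dmesg | grep -i iommu || echo "No IOMMU lines in the Xen boot log"
fi
```

If `virt_caps` lists a directio capability, that would be consistent with XO and chipset-info reporting iommu: true despite the empty dom0 output.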

      posted in Hardware
    • RE: Detached VM Snapshots after Warm Migration

      @florent No problem, just thought it would be fun.

      Thanks for your work anyway!

      posted in Backup
    • RE: Detached VM Snapshots after Warm Migration

      @florent I'd be interested in giving it a shot if you accept PRs, but if you already have it planned and would rather do it yourselves at a later date (and it's too low a priority to review a PR anyway) then that's fair enough. I'll be content enough knowing it's a thing to be aware of under the hood.

      posted in Backup
    • RE: Detached VM Snapshots after Warm Migration

      @florent Thanks, had to put the DFIR hat on.

      May as well ask as I thought about a PR for this - would it be feasible/practical/desirable to allow this to be done from XO's UI? I don't know how much of an edge case this was for me, but being able to remove "other-config" data following a migration (e.g. you do what I did and want the VMs to start over independently on a new host) might be beneficial to others.

      Obviously it would be quite destructive I imagine, if used inappropriately. Even just reporting those ghostly associations would be nice - again not sure of your overall design ethos so there may be good reasons why it's not a solid idea.

      posted in Backup
    • RE: Detached VM Snapshots after Warm Migration

      @Pilow Time for a drink, I think.

      posted in Backup
    • RE: Detached VM Snapshots after Warm Migration

      Well, scratch the help, I have fixed it.

      Bad VM:

      xe vm-param-list uuid={uuid} | grep "other-config"

      other-config (MRW): xo:backup:schedule: one-time; xo:backup:vm: {uuid}; xo:backup:datetime: 20260214T18:13:03Z; xo:backup:job: 97665b6d-1aff-43ba-8afb-11c7455c16ff; xo:backup:sr: {uuid}; auto_poweron: true; base_template_name: Debian Buster 10;

      Clean VM:

      other-config (MRW): auto_poweron: true; base_template_name: Debian Buster 10; [...]

      Stale references to my old backup job were causing the VM to disassociate from the VDI chain. Removing the extra config resolved it. Simple when you know where to look.

      xe vm-param-remove uuid={uuid} param-name=other-config param-key=xo:backup:job
      xe vm-param-remove uuid={uuid} param-name=other-config param-key=xo:backup:sr
      xe vm-param-remove uuid={uuid} param-name=other-config param-key=xo:backup:vm
      xe vm-param-remove uuid={uuid} param-name=other-config param-key=xo:backup:schedule
      xe vm-param-remove uuid={uuid} param-name=other-config param-key=xo:backup:datetime
      

      VMs now snapshot clean.
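For anyone with several affected VMs, the same cleanup can be sketched as a small helper that finds the xo:backup:* keys in an other-config string and prints the matching removal commands for review. The function names `xo_backup_keys` and `print_remove_commands` are mine, and the script prints commands instead of running them, since removing the wrong keys would be destructive.

```shell
#!/usr/bin/env bash
# Sketch: list xo:backup:* keys found in a VM's other-config and print
# (not run) one removal command per key.

# Reads an other-config string ("key: value; key: value; ...") on stdin,
# with or without the "other-config (MRW): " prefix, and prints the key names.
xo_backup_keys() {
    sed 's/^[^;]*(MRW):[[:space:]]*//' \
        | tr ';' '\n' \
        | sed -n 's/^[[:space:]]*\(xo:backup:[^:]*\):.*/\1/p'
}

# Reads key names on stdin and prints one xe command per key for the given VM.
print_remove_commands() {
    local vm_uuid=$1 key
    while IFS= read -r key; do
        printf 'xe vm-param-remove uuid=%s param-name=other-config param-key=%s\n' \
            "$vm_uuid" "$key"
    done
}

# Example, on a host where xe is available (replace <vm-uuid>):
#   xe vm-param-get uuid=<vm-uuid> param-name=other-config \
#       | xo_backup_keys | print_remove_commands <vm-uuid>
```

Reviewing the printed commands before pasting them keeps unrelated keys such as auto_poweron and base_template_name untouched.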

      posted in Backup
    • RE: Detached VM Snapshots after Warm Migration

      Looks like I unfortunately blew my chance to do this cleanly.

      Decided to plug in my old hosts again, just to see if XO felt the VM - despite being on a new host - was still associated with or on the old host. Turns out that was the right assumption, as when I snapshotted a problem VM, it did not have the health warning. Took that to mean if I removed the old VMs, the references would go with it, but alas, I should've snapshotted all of them before doing that.

      Back to square 1, but now the one VM I did snapshot no longer has the health warning. Completely wiped XO, still getting the same problem. Kind of at a loss now.

      I would still really appreciate any assistance.

      posted in Backup
    • RE: Detached VM Snapshots after Warm Migration

      @acebmxer My container is built from the sources, only fairly recently. I may try rebuilding it with the latest commit and deploy completely new, if that would work. I'm just trying to avoid a situation where I end up with even more confusion in the database.

      In the end, my goal is just to re-associate the VMs with XO and the wider XAPI database. Any method that achieves that cleanly should work here, as I'm mostly starting from scratch anyway. I'm just unsure how best to do it: moving XO clearly caused me issues, so I need to understand how to rebuild the VM associations.

      posted in Backup
    • RE: Detached VM Snapshots after Warm Migration

      @Pilow Yeah I don't particularly want to, but in the absence of any alternatives! I'll give it a day in case anyone responds, otherwise I'll just wipe it - assuming it leaves the hosts untouched. I just need to do whatever needs to be done to re-associate the VM UUIDs and would hope a total rebuild of XO would do that.

      posted in Backup
    • RE: Detached VM Snapshots after Warm Migration

      @Pilow Yes one pool, 2 servers (a master and a secondary/slave).

      2e9ca12d-21b4-4bf9-a7ac-e256b6e15652-image.png

      c042186d-ecbb-476b-ad6d-3f83a265bfd4-image.png

      I think I've realised per my last post update, when I warm migrated I didn't select "delete source VM", which has probably broken something since I retired the old hosts afterwards.

      I mainly just want to know the best method to wipe XO and start over so it can rebuild the database.

      posted in Backup
    • RE: Detached VM Snapshots after Warm Migration

      Still not working. I blew away my old backups and deleted the job, but I'm still getting detached snapshots. Interestingly, of my two hosts, the pool slave still isn't recording the snapshot properly. I tested "revert snapshot" from XO, which works properly - everything works properly except the detached snapshot warning. Probably also worth noting that I did not select the "delete source VM" option; I deleted them manually later.

      Snapshotting a VM on XC2, slave in a pool. Don't know if this is expected or not for a pool slave.

      Pool master (Displays the IP of XO):

      HTTPS 123.456.789.6->|Async.VM.snapshot R:da375f8192ab|audit] ('trackid=2f566a5e238297cab4efbd4023dba8da' 'LOCAL_SUPERUSER' 'root' 'ALLOWED' 'OK' 'API' 'VM.snapshot' (('vm' 'VM_Name0' 'e0f0 29ec-f0b8-1c14-c654-80c6ecec582c' 'OpaqueRef:f0fae2b6-7729-b539-066f-16a795b0b22f')

      Pool slave (displays IP of the XCP host itself):

      HTTPS 123.456.789.2->:::80|Async.VM.snapshot R:da375f8192ab|audit] ('trackid=f49a8b9396c939d4cd5335542cba3848' 'LOCAL_SUPERUSER' '' 'ALLOWED' 'OK' 'API' 'VM.snapshot' (('vm' '' '' 'OpaqueRef:f0fae2b6-7729-b539-066f-16a795b0b22f')

      I don't know if this means anything. When I snapshot a VM on XC1 (warm migrated) it also shows in the logs of the pool master, but with the IP of XO (still detached):

      HTTPS 123.456.789.6->|Async.VM.snapshot R:cbd3f663c5d7|audit] ('trackid=2f566a5e238297cab4efbd4023dba8da' 'LOCAL_SUPERUSER' 'root' 'ALLOWED' 'OK' 'API' 'VM.snapshot' (('vm' 'VM_Name1' 'c7c63201-e25a-a7d5-7e39-394636538866' 'OpaqueRef:72f934a8-bfd6-f01c-5917-234cacdd49d5')

      When I snapshot a VM I created natively (no warm migration), it looks exactly the same and is not detached:

      HTTPS 123.456.789.6->|Async.VM.snapshot R:3e4f5c3dd9f2|audit] ('trackid=2f566a5e238297cab4efbd4023dba8da' 'LOCAL_SUPERUSER' 'root' 'ALLOWED' 'OK' 'API' 'VM.snapshot' (('vm' 'VM_Name2' 'cd95df02-a907-bdaa-2c0e-ca503656460b' 'OpaqueRef:00c31a47-69d9-753d-5481-d5a10e881e13')
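To compare these entries side by side, a small filter can pull out just the VM name, UUID and OpaqueRef from each VM.snapshot line. This is a sketch based only on the line format pasted above; the function name `vm_fields` is made up, and the log file path is whatever your host uses (passed as the first argument).

```shell
#!/usr/bin/env bash
# Sketch: extract the VM name, UUID and OpaqueRef from audit lines shaped like
# the ones above, so empty fields (as on the pool slave) stand out.

# Reads audit log lines on stdin; prints one summary line per matching entry.
vm_fields() {
    sed -n "s/.*(('vm' '\([^']*\)' '\([^']*\)' '\(OpaqueRef:[^']*\)').*/name=[\1] uuid=[\2] ref=[\3]/p"
}

# Optional: pass the audit log path as the first argument.
if [ -n "${1:-}" ]; then
    grep -F "VM.snapshot" "$1" | vm_fields
fi
```

On the pool slave entry this prints empty brackets for name and uuid, which makes the difference from the master's entries obvious at a glance.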

      Following another thread, I ran xl list, which shows a mixture of VMs: some tagged with [XO warm migration Warm migration] and some not. My VM that wasn't migrated shows its normal name (matching the output of xe vm-list), and this one snapshots fine. Notably, another VM that was migrated doesn't show the migration tags, but doesn't snapshot properly anyway. I renamed the VMs after migrating. Renaming again doesn't update the output of xl list but does update the output of xe vm-list.

      If anyone in the know can help me understand what's going on I'd be most appreciative, as I doubt I'll be able to back up my VMs until it's resolved. Alternatively, can anyone confirm what steps I should take to completely rebuild XO so that the data is held correctly and this stops happening? To reiterate, this only seems to happen on warm migrated VMs.

      posted in Backup
    • RE: Detached VM Snapshots after Warm Migration

      @Pilow They would have the same IP address, the new XO is just a new Docker container on the same physical host but with a new database (different version of Redis on ARM so couldn't like-for-like re-use). There is only 1 XO running, 100% certain. The snapshots in the audit log now reflect XO as having initiated them, where before it was the host itself (fallback) - this may all be magically resolved now, but I'm not home to look yet.

      No, I made a backup of all my VMs before the warm migration, but the backup was made on the new XO instance, successfully, backing up the VMs on the old hosts (XCP1 & XCP2).

      So the full process I took was:

      1. 2 weeks ago, downed old XO
      2. Brought up brand new XO (fresh Redis DB), imported config, all working
      3. This weekend, spun up 2 new XCP hosts (lot of drama but we will ignore that) XC1 & XC2
      4. Created new pool containing the new XCP hosts (XC1 & XC2)
      5. Initiated a backup of all VMs on old host pool (XCP1 & XCP2) using an existing scheduled backup - manually triggered - backup succeeds
      6. Warm migrate VMs from pool XCP1/XCP2 to XC1/XC2 - success
      7. Disable old host pool XCP1/XCP2, VMs working as expected
      8. Snapshot a warm migrated VM - detached snapshot - snapshotting a VM created natively on XC1/XC2 does not have this issue
      9. Removed old pool XCP1/XCP2 entirely from XO - this solved the audit logs side of things (I think this is XAPI)
      10. Logs now show VM info correctly, but frontend still displays a detached snapshot

      My guess is that it may think the snapshots were happening on the old pool/old VM, not the warm migrated copy, or something like that.

      I will update the thread if I manage to resolve everything, I'll keep an eye out anyway in case someone from Vates knows any more on this!

      posted in Backup