XCP-ng
    PCI Passthrough Missing Capabilities in Guest

      redorangefreak

      Hi All!
      I am attempting PCI passthrough of an LSI 9670-24i Tri-Mode RAID card to a RHEL8 guest. When the guest boots, dmesg shows "osintfc_mrioc_security_status: PCI_EXT_CAP_ID_DSN is not supported". This means the driver cannot find the PCI "Device Serial Number" capability, and the kernel then shuts down the PCI device. I first tried different driver versions and changed every BIOS setting I thought could be related. After inspecting the driver source code, the issue seems to be tied directly to the missing PCI capability. XCP-ng is 8.3 with the latest updates as of a week or so ago.

      When running lspci -vvv in XCP-ng I get the following:

              Subsystem: Broadcom / LSI MegaRAID 9670-24i Tri-Mode Storage Adapter
              Physical Slot: 5
              Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
              Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
              Latency: 0, Cache Line Size: 32 bytes
              Region 0: Memory at 38bfffe00000 (64-bit, prefetchable) [size=16K]
              Capabilities: [40] Power Management version 3
                      Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
                      Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
              Capabilities: [48] MSI: Enable- Count=1/32 Maskable+ 64bit+
                      Address: 0000000000000000  Data: 0000
                      Masking: 00000000  Pending: 00000000
              Capabilities: [68] Express (v2) Endpoint, MSI 00
                      DevCap: MaxPayload 1024 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
                              ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 75.000W
                      DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported-
                              RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset-
                              MaxPayload 512 bytes, MaxReadReq 4096 bytes
                      DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
                      LnkCap: Port #0, Speed 16GT/s, Width x8, ASPM L1, Exit Latency L0s <4us, L1 <32us
                              ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
                      LnkCtl: ASPM L1 Enabled; RCB 64 bytes Disabled- CommClk+
                              ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                      LnkSta: Speed 8GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                      DevCap2: Completion Timeout: Range ABCD, TimeoutDis-, LTR-, OBFF Not Supported
                      DevCtl2: Completion Timeout: 260ms to 900ms, TimeoutDis-, LTR-, OBFF Disabled
                      LnkCtl2: Target Link Speed: 16GT/s, EnterCompliance- SpeedDis-
                               Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                               Compliance De-emphasis: -6dB
                      LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+, EqualizationPhase1+
                               EqualizationPhase2+, EqualizationPhase3+, LinkEqualizationRequest-
              Capabilities: [a4] MSI-X: Enable- Count=128 Masked-
                      Vector table: BAR=0 offset=00002000
                      PBA: BAR=0 offset=00003e00
              Capabilities: [b0] Vital Product Data
                      Product Name: Broadcom MegaRAID 9670-24i Tri-Mode Storage Adapter
                      Read-only fields:
                              [PN] Part number: 03-50123-00
                              [EC] Engineering changes: 002
                              [SN] Serial number: SKD3218062
                              [V0] Vendor specific: Broadcom Inc.
                              [V1] Vendor specific: SAS4124
                              [V2] Vendor specific: 500062B219DD1BC0
                              [RV] Reserved: checksum good, 0 byte(s) reserved
                      End
              Capabilities: [100 v1] Device Serial Number 00-80-5e-49-ae-38-8b-18
              Capabilities: [fb4 v1] Advanced Error Reporting
                      UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
                      UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
                      UESvrt: DLP+ SDES- TLP+ FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                      CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
                      CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
                      AERCap: First Error Pointer: 1f, GenCap+ CGenEn- ChkCap+ ChkEn-
              Capabilities: [138 v1] Power Budgeting <?>
              Capabilities: [db4 v1] #19
              Capabilities: [af4 v1] #25
              Capabilities: [d00 v1] #26
              Capabilities: [d40 v1] #27
              Capabilities: [160 v1] #16
              Kernel driver in use: pciback
              Kernel modules: mpi3mr
      

      And when running the same in the VM I get:

              Subsystem: Broadcom / LSI MegaRAID 9670-24i Tri-Mode Storage Adapter
              Physical Slot: 8
              Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
              Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
              Latency: 0
              Region 0: Memory at f1800000 (64-bit, prefetchable) [size=16K]
              Capabilities: [40] Power Management version 3
                      Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                      Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
              Capabilities: [48] MSI: Enable- Count=1/1 Maskable- 64bit+
                      Address: 0000000000000000  Data: 0000
              Capabilities: [68] Express (v2) Endpoint, MSI 00
                      DevCap: MaxPayload 1024 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
                              ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 75.000W
                      DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
                              RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
                              MaxPayload 128 bytes, MaxReadReq 512 bytes
                      DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
                      LnkCap: Port #0, Speed 16GT/s, Width x8, ASPM L1, Exit Latency L1 <32us
                              ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
                      LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk-
                              ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                      LnkSta: Speed 8GT/s (downgraded), Width x8 (ok)
                              TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                      DevCap2: Completion Timeout: Range ABCD, TimeoutDis- NROPrPrP- LTR-
                               10BitTagComp+ 10BitTagReq- OBFF Not Supported, ExtFmt+ EETLPPrefix-
                               EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
                               FRS- TPHComp- ExtTPHComp-
                               AtomicOpsCap: 32bit- 64bit- 128bitCAS-
                      DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- OBFF Disabled,
                               AtomicOpsCtl: ReqEn+
                      LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+ EqualizationPhase1+
                               EqualizationPhase2+ EqualizationPhase3+ LinkEqualizationRequest-
                               Retimer- 2Retimers- CrosslinkRes: Upstream Port
              Capabilities: [a4] MSI-X: Enable- Count=128 Masked-
                      Vector table: BAR=0 offset=00002000
                      PBA: BAR=0 offset=00003e00
              Capabilities: [b0] Vital Product Data
                      Product Name: Broadcom MegaRAID 9670-24i Tri-Mode Storage Adapter
                      Read-only fields:
                              [PN] Part number: 03-50123-00
                              [EC] Engineering changes: 002
                              [SN] Serial number: SKD3218062
                              [V0] Vendor specific: Broadcom Inc.
                              [V1] Vendor specific: SAS4124
                              [V2] Vendor specific: 500062B219DD1BC0
                              [RV] Reserved: checksum good, 0 byte(s) reserved
                      End
              Kernel driver in use: mpi3mr
              Kernel modules: mpi3mr
      

      The Capability that seems to be required by the mpi3mr driver is:
      Capabilities: [100 v1] Device Serial Number 00-80-5e-49-ae-38-8b-18
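
      The difference between the two listings can also be pulled out mechanically. Below is a minimal Python sketch that diffs the `Capabilities:` lines of two `lspci -vvv` dumps. The function names and the shortened sample strings are illustrative assumptions; only the capability lines themselves come from the output above.

```python
import re

# Matches lines such as "Capabilities: [100 v1] Device Serial Number 00-80-..."
CAP_RE = re.compile(r"Capabilities: \[([0-9a-f]+)(?: v\d+)?\] (.+)")


def capability_offsets(lspci_text):
    """Map capability offset (int) to its description, from `lspci -vvv` text."""
    caps = {}
    for line in lspci_text.splitlines():
        match = CAP_RE.search(line)
        if match:
            caps[int(match.group(1), 16)] = match.group(2).strip()
    return caps


def missing_in_guest(host_text, guest_text):
    """Return (offset, description) pairs listed on the host but not in the guest."""
    host = capability_offsets(host_text)
    guest = capability_offsets(guest_text)
    return [(off, host[off]) for off in sorted(host) if off not in guest]


# Shortened samples (hypothetical excerpts of the two dumps above).
host_sample = """
        Capabilities: [40] Power Management version 3
        Capabilities: [b0] Vital Product Data
        Capabilities: [100 v1] Device Serial Number 00-80-5e-49-ae-38-8b-18
        Capabilities: [fb4 v1] Advanced Error Reporting
"""
guest_sample = """
        Capabilities: [40] Power Management version 3
        Capabilities: [b0] Vital Product Data
"""

for offset, name in missing_in_guest(host_sample, guest_sample):
    print(f"missing in guest: [{offset:#x}] {name}")
```

      With the full dumps, every capability the guest lacks sits at offset 0x100 or above, while everything below 0x100 is present on both sides.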

      Does anyone know of a way to unhide or pass through additional capabilities to the guest?

        olivierlambert Vates 🪐 Co-Founder CEO

        Ping @Team-Hypervisor-Kernel

          redorangefreak

          Just a quick update, I spun up a RHEL9 guest and am seeing the same behavior.

            andSmv Vates 🪐 XCP-ng Team Xen Guru

            Hello,

            Yes, unfortunately. This is a PCIe device, and the capability in question is a PCIe capability reported in the PCI extended configuration space (offset 0x100), which is not reachable through the standard PCI configuration access method. And evidently the driver NEEDS this capability to make the device work.
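
            To illustrate why the capability disappears, here is a minimal Python sketch that walks the extended capability list in a raw configuration-space dump. It assumes the standard PCIe header layout (capability ID in bits 15:0, version in bits 19:16, next offset in bits 31:20) and the Device Serial Number ID 0x0003. The function names and the synthetic 4 KiB buffer are hypothetical; the offsets 0x100 and 0xfb4 are taken from the host listing, but the real chain order on the card is not known.

```python
import struct

PCI_CFG_SPACE_SIZE = 256      # legacy PCI configuration space
PCI_EXT_CAP_START = 0x100     # first extended capability header
PCI_EXT_CAP_ID_AER = 0x0001   # Advanced Error Reporting
PCI_EXT_CAP_ID_DSN = 0x0003   # Device Serial Number


def extended_capabilities(config):
    """Yield (offset, cap_id, version) for each extended capability in a dump."""
    if len(config) <= PCI_CFG_SPACE_SIZE:
        return  # only the legacy 256 bytes are visible: nothing to walk
    offset = PCI_EXT_CAP_START
    seen = set()  # guards against a malformed, looping chain
    while (offset >= PCI_EXT_CAP_START
           and offset + 4 <= len(config)
           and offset not in seen):
        seen.add(offset)
        (header,) = struct.unpack_from("<I", config, offset)
        if header in (0, 0xFFFFFFFF):
            break  # empty list or unreadable register
        yield offset, header & 0xFFFF, (header >> 16) & 0xF
        offset = (header >> 20) & 0xFFC  # low two bits are reserved


def has_dsn(config):
    """True if the dump exposes the Device Serial Number capability."""
    return any(cap_id == PCI_EXT_CAP_ID_DSN
               for _, cap_id, _ in extended_capabilities(config))


# Synthetic 4 KiB dump: DSN at 0x100 chaining to AER at 0xfb4 (assumed order).
full = bytearray(4096)
struct.pack_into("<I", full, 0x100, PCI_EXT_CAP_ID_DSN | (1 << 16) | (0xFB4 << 20))
struct.pack_into("<I", full, 0xFB4, PCI_EXT_CAP_ID_AER | (1 << 16))

legacy_only = bytes(full[:PCI_CFG_SPACE_SIZE])  # what a 256-byte view exposes

print("4 KiB view has DSN:", has_dsn(bytes(full)))
print("256-byte view has DSN:", has_dsn(legacy_only))
```

            On a Linux system the dump would normally be read as root from /sys/bus/pci/devices/<address>/config. If that file is only 256 bytes long in the guest, the extended capabilities are out of reach regardless of what the physical device implements.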

            There is actually work in progress (very close to completion) that gives HVM guests a QEMU-emulated Q35 chipset instead of the currently emulated i440fx chipset. Among other things, this chipset provides the guest with an emulated PCIe bus, which can host PCIe devices and also gives access to the PCI extended configuration space.

            When this work is done, we will be able to pass through PCIe devices and give the guest access to all PCIe capabilities, so normally no driver should complain about a missing one.

            AFAIK, the most common PCIe capabilities are emulated in this upcoming patchset, but it is still possible that some exotic ones are not.

            For now, I am pinging @ThierryEscande to see if he can provide you with a beta version of these patches, to check whether it solves your problem, if you agree to do some testing at the same time 🙂

              redorangefreak @andSmv

              @andSmv Thanks for the good info! Sure, if you guys can provide the beta, I would be more than happy to test it. Unfortunately, I only have access to this hardware for another week before I need to ship it out. If you can get me the beta today or tomorrow, I can get it in place and start exercising it over the weekend. Thanks!
