XCP-ng
    redorangefreak

    Latest posts made by redorangefreak

    • How many guest IRQs is too many?

      Ran into an interesting issue today. We do a lot of hardware passthrough in some of our XCP-ng configurations. One of our use cases is a guest running a networking tool we built, to which we pass through NICs over PCI. Typically we use about 6 to 8 NICs, but today we put together a configuration that required passing through 17 NICs.

      Because we do a lot of passthrough configs, our standard config is to set the "extra_guest_irqs=128" parameter on all of our installs. Long story short, I ran into some MSI-X interrupt errors on the guest with the 17 passthrough NICs, and the RHEL guest couldn't initialize some of the NICs. I took a look at /proc/interrupts: with the extra 128 IRQ setting I was at 192 IRQs, and they all seemed to be allocated to the NICs that were functional. It looked like each NIC needed 9 IRQs, so I calculated I needed at least 40 more and bumped "extra_guest_irqs" to 192 for a total of 256. That solved my issues and everything is working great now.
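
      For anyone wanting to repeat the sizing, here is a rough Python sketch of the arithmetic above. The base of 64 IRQs is only inferred from the numbers in this post (192 observed with the setting at 128); it is not taken from Xen documentation. The function names and the sample /proc/interrupts text are made up for illustration, and real handler names depend on the NIC driver.

```python
from collections import Counter

# Hypothetical /proc/interrupts excerpt; real handler names vary by driver.
SAMPLE = """\
           CPU0       CPU1
 24:          0          0   PCI-MSI 524288-edge      eth0
 25:          5          1   PCI-MSI 524289-edge      eth0-TxRx-0
 26:          7          2   PCI-MSI 524290-edge      eth0-TxRx-1
NMI:          0          0   Non-maskable interrupts
"""

# Assumption: inferred from the post (192 total - 128 extra), not documented.
INFERRED_BASE_IRQS = 64


def count_irqs_per_device(interrupts_text):
    """Count numbered IRQ lines per device in /proc/interrupts-style text."""
    counts = Counter()
    for line in interrupts_text.splitlines():
        fields = line.split()
        # Skip the CPU header and named rows such as "NMI:".
        if not fields or not fields[0].rstrip(":").isdigit():
            continue
        # The last field is the handler name; drop queue suffixes such as
        # "-TxRx-0" so every vector of one NIC is grouped together.
        device = fields[-1].split("-")[0]
        counts[device] += 1
    return counts


def total_guest_irqs(extra_guest_irqs, base_irqs=INFERRED_BASE_IRQS):
    """Total IRQs available to the guest under the inferred base."""
    return base_irqs + extra_guest_irqs


def irqs_for_nics(nic_count, irqs_per_nic):
    """Vectors needed by the passthrough NICs alone."""
    return nic_count * irqs_per_nic


if __name__ == "__main__":
    print(dict(count_irqs_per_device(SAMPLE)))
    print("NIC vectors needed:", irqs_for_nics(17, 9))
    print("Total with extra_guest_irqs=128:", total_guest_irqs(128))
    print("Total with extra_guest_irqs=192:", total_guest_irqs(192))
```

      In the guest, the text from /proc/interrupts could be passed to count_irqs_per_device() in place of SAMPLE to confirm how many vectors each working NIC actually took.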

      So my question is: what is the downside of allocating too many guest IRQs, and is there an upper limit before I start seeing issues? Also, that is the only guest VM on this system with that many passthrough devices. Is there a way to set the IRQ count on a per-guest basis?

      There isn't a whole lot of documentation on how best to size guest IRQs or how to determine what settings I may need when passing through multiple PCI devices. If anyone has any insights or could point me to some documentation, I would be very appreciative! Thanks!

      posted in Compute
      redorangefreak
    • RE: PCI Passthrough Missing Capabilities in Guest

      @andSmv Thanks for the good info! Sure, if you guys provide the beta I would be more than happy to test. Unfortunately, I only have access to this hardware for another week before I need to ship it out. If you can get me the beta today or tomorrow, I can get it in place and start exercising it over the weekend. Thanks!

      posted in Hardware
      redorangefreak
    • RE: PCI Passthrough Missing Capabilities in Guest

      Just a quick update: I spun up a RHEL9 guest and am seeing the same behavior.

      posted in Hardware
      redorangefreak
    • PCI Passthrough Missing Capabilities in Guest

      Hi All!
      I am attempting a PCI passthrough of an LSI 9670-24i Tri-Mode RAID card to a RHEL8 guest. Upon booting the guest I get "osintfc_mrioc_security_status: PCI_EXT_CAP_ID_DSN is not supported" in dmesg, which basically means the driver cannot find the PCI capability "Device Serial Number", and then the kernel basically shuts down the PCI device. Originally I tried different versions of the driver and changed every BIOS setting I could think of that might have anything to do with this. After inspecting the source code of the driver, the issue seems to be directly tied to a missing PCI capability. XCP-ng is 8.3 with the latest updates as of a week or so ago.

      When running lspci -vvv in XCP-ng I get the following:

              Subsystem: Broadcom / LSI MegaRAID 9670-24i Tri-Mode Storage Adapter
              Physical Slot: 5
              Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
              Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
              Latency: 0, Cache Line Size: 32 bytes
              Region 0: Memory at 38bfffe00000 (64-bit, prefetchable) [size=16K]
              Capabilities: [40] Power Management version 3
                      Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
                      Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
              Capabilities: [48] MSI: Enable- Count=1/32 Maskable+ 64bit+
                      Address: 0000000000000000  Data: 0000
                      Masking: 00000000  Pending: 00000000
              Capabilities: [68] Express (v2) Endpoint, MSI 00
                      DevCap: MaxPayload 1024 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
                              ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 75.000W
                      DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported-
                              RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset-
                              MaxPayload 512 bytes, MaxReadReq 4096 bytes
                      DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
                      LnkCap: Port #0, Speed 16GT/s, Width x8, ASPM L1, Exit Latency L0s <4us, L1 <32us
                              ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
                      LnkCtl: ASPM L1 Enabled; RCB 64 bytes Disabled- CommClk+
                              ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                      LnkSta: Speed 8GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                      DevCap2: Completion Timeout: Range ABCD, TimeoutDis-, LTR-, OBFF Not Supported
                      DevCtl2: Completion Timeout: 260ms to 900ms, TimeoutDis-, LTR-, OBFF Disabled
                      LnkCtl2: Target Link Speed: 16GT/s, EnterCompliance- SpeedDis-
                               Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                               Compliance De-emphasis: -6dB
                      LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+, EqualizationPhase1+
                               EqualizationPhase2+, EqualizationPhase3+, LinkEqualizationRequest-
              Capabilities: [a4] MSI-X: Enable- Count=128 Masked-
                      Vector table: BAR=0 offset=00002000
                      PBA: BAR=0 offset=00003e00
              Capabilities: [b0] Vital Product Data
                      Product Name: Broadcom MegaRAID 9670-24i Tri-Mode Storage Adapter
                      Read-only fields:
                              [PN] Part number: 03-50123-00
                              [EC] Engineering changes: 002
                              [SN] Serial number: SKD3218062
                              [V0] Vendor specific: Broadcom Inc.
                              [V1] Vendor specific: SAS4124
                              [V2] Vendor specific: 500062B219DD1BC0
                              [RV] Reserved: checksum good, 0 byte(s) reserved
                      End
              Capabilities: [100 v1] Device Serial Number 00-80-5e-49-ae-38-8b-18
              Capabilities: [fb4 v1] Advanced Error Reporting
                      UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
                      UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
                      UESvrt: DLP+ SDES- TLP+ FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                      CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
                      CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
                      AERCap: First Error Pointer: 1f, GenCap+ CGenEn- ChkCap+ ChkEn-
              Capabilities: [138 v1] Power Budgeting <?>
              Capabilities: [db4 v1] #19
              Capabilities: [af4 v1] #25
              Capabilities: [d00 v1] #26
              Capabilities: [d40 v1] #27
              Capabilities: [160 v1] #16
              Kernel driver in use: pciback
              Kernel modules: mpi3mr
      

      And when running the same in the VM I get:

              Subsystem: Broadcom / LSI MegaRAID 9670-24i Tri-Mode Storage Adapter
              Physical Slot: 8
              Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
              Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
              Latency: 0
              Region 0: Memory at f1800000 (64-bit, prefetchable) [size=16K]
              Capabilities: [40] Power Management version 3
                      Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                      Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
              Capabilities: [48] MSI: Enable- Count=1/1 Maskable- 64bit+
                      Address: 0000000000000000  Data: 0000
              Capabilities: [68] Express (v2) Endpoint, MSI 00
                      DevCap: MaxPayload 1024 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
                              ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 75.000W
                      DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
                              RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
                              MaxPayload 128 bytes, MaxReadReq 512 bytes
                      DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
                      LnkCap: Port #0, Speed 16GT/s, Width x8, ASPM L1, Exit Latency L1 <32us
                              ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
                      LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk-
                              ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                      LnkSta: Speed 8GT/s (downgraded), Width x8 (ok)
                              TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                      DevCap2: Completion Timeout: Range ABCD, TimeoutDis- NROPrPrP- LTR-
                               10BitTagComp+ 10BitTagReq- OBFF Not Supported, ExtFmt+ EETLPPrefix-
                               EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
                               FRS- TPHComp- ExtTPHComp-
                               AtomicOpsCap: 32bit- 64bit- 128bitCAS-
                      DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- OBFF Disabled,
                               AtomicOpsCtl: ReqEn+
                      LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+ EqualizationPhase1+
                               EqualizationPhase2+ EqualizationPhase3+ LinkEqualizationRequest-
                               Retimer- 2Retimers- CrosslinkRes: Upstream Port
              Capabilities: [a4] MSI-X: Enable- Count=128 Masked-
                      Vector table: BAR=0 offset=00002000
                      PBA: BAR=0 offset=00003e00
              Capabilities: [b0] Vital Product Data
                      Product Name: Broadcom MegaRAID 9670-24i Tri-Mode Storage Adapter
                      Read-only fields:
                              [PN] Part number: 03-50123-00
                              [EC] Engineering changes: 002
                              [SN] Serial number: SKD3218062
                              [V0] Vendor specific: Broadcom Inc.
                              [V1] Vendor specific: SAS4124
                              [V2] Vendor specific: 500062B219DD1BC0
                              [RV] Reserved: checksum good, 0 byte(s) reserved
                      End
              Kernel driver in use: mpi3mr
              Kernel modules: mpi3mr
      

      The capability that seems to be required by the mpi3mr driver is:
      Capabilities: [100 v1] Device Serial Number 00-80-5e-49-ae-38-8b-18

      Does anyone know of a way to unhide or pass through additional capabilities to the guest?
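
      To make the difference between the two listings easier to see, here is a small Python sketch that diffs the "Capabilities:" lines of two lspci -vvv dumps by config-space offset. The function names are made up for illustration, and the sample strings are shortened from the two outputs above. Run against the full outputs, every capability missing in the guest sits at offset 0x100 or higher, which is the PCIe extended configuration space.

```python
import re

# Matches lines such as "Capabilities: [100 v1] Device Serial Number ...".
CAP_LINE = re.compile(r"^\s*Capabilities: \[([0-9a-f]+)(?: v\d+)?\]\s*(.*)$")

# Shortened excerpts of the host and guest outputs quoted in this post.
HOST = """\
        Capabilities: [40] Power Management version 3
        Capabilities: [a4] MSI-X: Enable- Count=128 Masked-
        Capabilities: [100 v1] Device Serial Number 00-80-5e-49-ae-38-8b-18
        Capabilities: [fb4 v1] Advanced Error Reporting
"""
GUEST = """\
        Capabilities: [40] Power Management version 3
        Capabilities: [a4] MSI-X: Enable- Count=128 Masked-
"""


def capabilities_by_offset(lspci_text):
    """Map config-space offset (int) to the capability description."""
    caps = {}
    for line in lspci_text.splitlines():
        match = CAP_LINE.match(line)
        if match:
            caps[int(match.group(1), 16)] = match.group(2)
    return caps


def missing_in_guest(host_text, guest_text):
    """Capabilities present on the host but absent in the guest."""
    host = capabilities_by_offset(host_text)
    guest = capabilities_by_offset(guest_text)
    return {off: name for off, name in host.items() if off not in guest}


if __name__ == "__main__":
    for offset, name in sorted(missing_in_guest(HOST, GUEST).items()):
        area = "extended" if offset >= 0x100 else "standard"
        print(f"[{offset:03x}] {name} ({area} config space)")
```

      Comparing by offset rather than by description avoids false differences where the same capability reports different details on each side, such as the MSI line showing Count=1/32 on the host and Count=1/1 in the guest.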

      posted in Hardware
      redorangefreak