I'm having a weird issue with PCIe pass-through for our KIOXIA CM7 drives. We also have a bunch of KIOXIA CX6 drives, and those are passed through without any issue.
The first thing I noticed is that the device ID is not properly recognized: instead of showing the CM7 name, the device is only displayed as "Device 0013".
I've tried raising the IRQ limit as explained in the docs:
/opt/xensource/libexec/xen-cmdline --set-xen "extra_guest_irqs=128"
Here are some logs from hypervisor.log, captured when the VM crashes during boot:
[2025-10-08 19:15:52] (XEN) [ 93.430177] d[IDLE]v14: Unsupported MSI delivery mode 7 for Dom1
[2025-10-08 19:15:52] (XEN) [ 93.436563] d[IDLE]v14: Unsupported MSI delivery mode 7 for Dom1
[2025-10-08 19:15:52] (XEN) [ 93.439733] d1v0: Unsupported MSI delivery mode 7 for Dom1
[2025-10-08 19:15:52] (XEN) [ 93.448323] d1v0: Unsupported MSI delivery mode 7 for Dom1
[2025-10-08 19:15:52] (XEN) [ 93.448801] d1v0: Unsupported MSI delivery mode 7 for Dom1
[2025-10-08 19:15:52] (XEN) [ 93.457235] d1v0: Unsupported MSI delivery mode 7 for Dom1
[2025-10-08 19:27:23] (XEN) [ 784.468669] domain_crash called from svm_vmexit_handler+0x129f/0x1480
[2025-10-08 19:27:23] (XEN) [ 784.468671] Domain 3 (vcpu#1) crashed on cpu#1:
[2025-10-08 19:27:23] (XEN) [ 784.468673] ----[ Xen-4.17.5-15 x86_64 debug=n Not tainted ]----
[2025-10-08 19:27:23] (XEN) [ 784.468674] CPU: 1
[2025-10-08 19:27:23] (XEN) [ 784.468674] RIP: 0010:[<ffffffffa751b5d0>]
[2025-10-08 19:27:23] (XEN) [ 784.468675] RFLAGS: 0000000000000286 CONTEXT: hvm guest (d3v1)
[2025-10-08 19:27:23] (XEN) [ 784.468676] rax: ffffb90bc0071200 rbx: ffffb90bc007fba4 rcx: 0000000000000000
[2025-10-08 19:27:23] (XEN) [ 784.468677] rdx: 00000000fee97000 rsi: 0000000000000000 rdi: 0000000000000000
[2025-10-08 19:27:23] (XEN) [ 784.468678] rbp: ffff944cba5c9a80 rsp: ffffb90bc007fb58 r8: 0000000000000000
[2025-10-08 19:27:23] (XEN) [ 784.468678] r9: 0000000000000000 r10: ffffb90bc007fb18 r11: 0000000000000000
[2025-10-08 19:27:23] (XEN) [ 784.468679] r12: 0000000000000197 r13: ffff944c80c830c8 r14: 0000000000000011
[2025-10-08 19:27:23] (XEN) [ 784.468679] r15: 0000000000000001 cr0: 0000000080050033 cr4: 0000000000770ef0
[2025-10-08 19:27:23] (XEN) [ 784.468680] cr3: 00000001105dc006 cr2: 0000000000000000
[2025-10-08 19:27:23] (XEN) [ 784.468680] fsb: 0000000000000000 gsb: ffff944d97480000 gss: 0000000000000000
[2025-10-08 19:27:23] (XEN) [ 784.468681] ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0018 cs: 0010
The "Unsupported MSI delivery mode" messages seem to come from our CX6 drives, but those drives seem to be working fine.
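To double-check which domain each message belongs to, here is a small Python sketch that tallies the "Unsupported MSI delivery mode" lines per domain and lists the domains that crashed. The function name `summarize` and the regular expressions are my own and are only matched against the line shapes in the excerpt above; other Xen versions may format these lines differently.

```python
import re
from collections import Counter

# Patterns written against the excerpt above; assumed, not an official log format.
MSI_RE = re.compile(r"Unsupported MSI delivery mode (\d+) for Dom(\d+)")
CRASH_RE = re.compile(r"Domain (\d+) \(vcpu#(\d+)\) crashed")


def summarize(lines):
    """Return (MSI message count per domain ID, list of crashed domain IDs)."""
    msi_by_domain = Counter()
    crashed_domains = []
    for line in lines:
        msi = MSI_RE.search(line)
        if msi:
            msi_by_domain[int(msi.group(2))] += 1
            continue
        crash = CRASH_RE.search(line)
        if crash:
            crashed_domains.append(int(crash.group(1)))
    return msi_by_domain, crashed_domains


# Three lines copied from the hypervisor.log excerpt.
sample = [
    "[2025-10-08 19:15:52] (XEN) [ 93.430177] d[IDLE]v14: Unsupported MSI delivery mode 7 for Dom1",
    "[2025-10-08 19:15:52] (XEN) [ 93.439733] d1v0: Unsupported MSI delivery mode 7 for Dom1",
    "[2025-10-08 19:27:23] (XEN) [ 784.468671] Domain 3 (vcpu#1) crashed on cpu#1:",
]

msi_by_domain, crashed_domains = summarize(sample)
print("MSI messages per domain:", dict(msi_by_domain))
print("Crashed domains:", crashed_domains)
```

In the excerpt, every MSI message targets Dom1 while the crash is reported for Domain 3 about eleven minutes later, which fits the idea that the MSI messages are not tied to the crashing VM.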
Here is the lspci output for one of the drives:
lspci -s e1:00.0 -vv
e1:00.0 Non-Volatile memory controller: KIOXIA Corporation Device 0013 (rev 01) (prog-if 02 [NVM Express])
Subsystem: KIOXIA Corporation Device 0043
Physical Slot: 0-2
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 182
Region 0: Memory at f2810000 (64-bit, non-prefetchable) [size=64K]
Expansion ROM at f2800000 [disabled] [size=64K]
Capabilities: [40] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [70] Express (v2) Endpoint, MSI 00
DevCap: MaxPayload 1024 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 75.000W
DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported-
RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset-
MaxPayload 512 bytes, MaxReadReq 512 bytes
DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
LnkCap: Port #0, Speed unknown, Width x4, ASPM not supported, Exit Latency L0s <2us, L1 <64us
ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 16GT/s, Width x4, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR-, OBFF Not Supported
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
LnkCtl2: Target Link Speed: Unknown, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete+, EqualizationPhase1+
EqualizationPhase2+, EqualizationPhase3+, LinkEqualizationRequest-
Capabilities: [b0] MSI-X: Enable- Count=129 Masked-
Vector table: BAR=0 offset=00005200
PBA: BAR=0 offset=0000d600
Capabilities: [d0] Vital Product Data
Product Name: KIOXIA ESSD
Read-only fields:
[PN] Part number: KIOXIA KCMYXRUG15T3
[EC] Engineering changes: 0001
[SN] Serial number: 3DH0A00A0LP1
[MN] Manufacture ID: 31 45 30 46
[RV] Reserved: checksum good, 26 byte(s) reserved
End
Capabilities: [100 v2] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO+ CmpltAbrt- UnxCmplt+ RxOF+ MalfTLP+ ECRC+ UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
AERCap: First Error Pointer: 00, GenCap+ CGenEn+ ChkCap+ ChkEn+
Capabilities: [148 v1] Device Serial Number 8c-e3-8e-e3-00-32-1f-01
Capabilities: [168 v1] Alternative Routing-ID Interpretation (ARI)
ARICap: MFVC- ACS-, Next Function: 0
ARICtl: MFVC- ACS-, Function Group: 0
Capabilities: [178 v1] #19
Capabilities: [198 v1] #26
Capabilities: [1c0 v1] #27
Capabilities: [1e8 v1] #2a
Capabilities: [210 v1] Single Root I/O Virtualization (SR-IOV)
IOVCap: Migration-, Interrupt Message Number: 000
IOVCtl: Enable- Migration- Interrupt- MSE- ARIHierarchy+
IOVSta: Migration-
Initial VFs: 32, Total VFs: 32, Number of VFs: 0, Function Dependency Link: 00
VF offset: 1, stride: 1, Device ID: 0013
Supported Page Size: 00000553, System Page Size: 00000001
Region 0: Memory at 00000000f2600000 (64-bit, non-prefetchable)
VF Migration: offset: 00000000, BIR: 0
Kernel driver in use: pciback
Kernel modules: nvme
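One detail in this output that may or may not matter: the MSI-X capability reports Count=129, which is one more than the extra_guest_irqs=128 value I set. Xen documents extra_guest_irqs as the number of PIRQs available to guests, and I'm assuming here that each MSI-X vector the guest driver actually enables uses one of them; I haven't verified how many vectors the guest's NVMe driver requests. The sketch below only pulls the count out of the lspci text and compares it with the configured value. The helper name `msix_count` is hypothetical.

```python
import re


def msix_count(lspci_text):
    """Return the MSI-X table size reported by `lspci -vv`, or None if absent."""
    match = re.search(r"MSI-X: Enable[+-] Count=(\d+)", lspci_text)
    return int(match.group(1)) if match else None


# Line copied from the lspci output above.
lspci_excerpt = "Capabilities: [b0] MSI-X: Enable- Count=129 Masked-"

# Value passed to xen-cmdline earlier in this post.
extra_guest_irqs = 128

count = msix_count(lspci_excerpt)
# Assumption: one PIRQ per enabled vector, so a count above the limit could matter.
exceeds = count is not None and count > extra_guest_irqs
print("MSI-X vectors advertised:", count)
print("Exceeds extra_guest_irqs:", exceeds)
```

If the assumption holds, trying a value above 129 would be a cheap experiment, though I can't say it is the cause of the crash.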
The only thing I haven't tried yet is enabling SR-IOV, but I don't think that would really change anything.
Thanks in advance to anyone chiming in!