Issue with VM network dropping in and out
-
@glenlewis09 Does the XCP host report the ethernet link going down?
dmesg
-
I don't see any logs in Xen Orchestra.
Home -> Hosts -> GLS-XENHOST08 -> Logs
I see no logs here. Also, when I go to Network, the status always shows as connected.
-
@glenlewis09 You need to ssh into the XCP host.
-
I have found the command to create the log over ssh, using xen-bugtool.
I am not a Linux guru, so I don't know how to transfer the log file from the host to the Windows client via ssh. Nor do I know how to access the host files outside of ssh.
[09:14 GLS-XENHOST08 bug-report]# xen-bugtool --yestoall
Warning: '--yestoall' argument provided, will not prompt for individual files.
This application will collate the Xen dmesg output, details of the
hardware configuration of your machine, information about the build of
Xen that you are using, plus, if you allow it, various logs.
The collated information will be saved as a .tar.bz2 for archiving or
sending to a Technical Support Representative.
The logs may contain private information, and if you are at all
worried about that, you should exit now, or you should explicitly
exclude those logs from the archive.
Omitting /dev/shm/metrics/xcp-rrdd-xenpm, size constraint of xcp-rrdd-plugins exceeded
Omitting /dev/shm/metrics/xcp-rrdd-squeezed, size constraint of xcp-rrdd-plugins exceeded
Omitting /dev/shm/metrics/xcp-rrdd-mem_vms, size constraint of xcp-rrdd-plugins exceeded
Omitting /dev/shm/metrics/xcp-rrdd-mem_host, size constraint of xcp-rrdd-plugins exceeded
[01/31/24 09:14:52 CST] Creating output file
[01/31/24 09:14:52 CST] Running commands to collect data
Writing tarball /var/opt/xen/bug-report/bug-report-20240131091452.tar.bz2 successful.
[09:15 GLS-XENHOST08 bug-report]#
This is the output from the ssh session. But again, I am not sure how to copy or even open the log data.
Sorry for the lack of experience here.
-
@glenlewis09 You can use WinSCP to connect to your host via SSH and browse the filesystem. WinSCP is a file manager and works in a similar way to Windows Explorer. Bonus: there is also a portable version.
-
@gskger said in Issue with VM network dropping in and out:
@glenlewis09 You can use WinSCP to connect to your host via SSH and browse the filesystem. WinSCP is a file manager and works in a similar way to Windows Explorer. Bonus: there is also a portable version.
@gskger @glenlewis09 You can also use scp directly in PowerShell if you are on an up-to-date Windows 10 or Windows 11. You will need the SSH Client feature installed (enabled), though, and PowerShell 5 (or later).
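For illustration, here is a minimal sketch of what such a copy could look like. The host address `192.0.2.10` is a placeholder and the `root` login is an assumption (neither appears in this thread); the remote path is the one printed by xen-bugtool above. The script only builds and prints the command so it can be checked first.

```shell
#!/bin/sh
# Sketch: build the scp command that copies the bug report off the host.
# XCP_HOST is a placeholder address (assumption), replace it with your host.
XCP_HOST="192.0.2.10"
# Tarball path as printed by xen-bugtool earlier in this thread.
REMOTE_FILE="/var/opt/xen/bug-report/bug-report-20240131091452.tar.bz2"
# Log in as root (assumption) and copy into the current directory (".").
SCP_CMD="scp root@${XCP_HOST}:${REMOTE_FILE} ."
# Print the command for review instead of running it.
echo "$SCP_CMD"
```

The printed line is a plain scp invocation, so it should work the same when typed into PowerShell with the OpenSSH client enabled.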
-
Thank you so much, that is a great insight. Now that I have the logs, there are 100+ log files. Which file do you want me to look at?
-
@glenlewis09 Check out the troubleshooting guide on logs in the documentation. @Andrew was talking about the kernel message logs. At the CLI, you could also type
dmesg
to show the kernel messages since the last boot. You could narrow the result down further with
dmesg | grep -i eth
-
@glenlewis09 I'm running Windows 10 and 11 and it works correctly.
Are you only having problems with Windows 2019 and 2022?
-
@glenlewis09 I don't have the same system as you (also XCP 8.3), but I do have r8125 cards and a 2.5G switch. I installed Windows Server 2022 and have no problems keeping a Remote Desktop connection open and watching YouTube videos (with audio)...
I'll have to try it on my AMD system with the r8125 next.
-
@gskger said in Issue with VM network dropping in and out:
dmesg | grep -i eth
Last login: Wed Jan 31 09:11:37 2024 from 192.168.20.168
[13:59 GLS-XENHOST08 ~]# dmesg | grep -i eth
[ 1.042384] xen_netfront: Initialising Xen virtual ethernet driver
[ 2.735978] r8125 Ethernet controller driver 9.012.03-NAPI-PTP-RSS loaded
[ 3.823280] ACPI Error: Method parse/execution failed \_SB.UBTC._DSM, AE_NOT_FOUND (20180810/psparse-516)
[ 3.860193] r8125 0000:02:00.0 side-2697-eth0: renamed from eth0
[ 5.989957] r8125 0000:02:00.0 eth0: renamed from side-2697-eth0
[ 7.086183] eth0: 0xffffc90040540000, 38:f7:cd:c6:d5:82, IRQ 177
[ 7.114826] r8125 0000:02:00.0 eth0: registered PHC device on eth0
[ 7.114829] r8125 0000:02:00.0 eth0: reset PHC clock
[ 7.148004] device eth0 entered promiscuous mode
[ 9.927534] r8125: eth0: link up
[133076.658774] NETDEV WATCHDOG: eth0 (r8125): transmit queue 1 timed out
[133076.685550] r8125 0000:02:00.0 eth0: reset PHC clock
[133076.706192] r8125: eth0: link down
[133079.932108] r8125: eth0: link up
[13:59 GLS-XENHOST08 ~]#
These are the logs for the NIC
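The interesting lines in this output are the transmit-queue watchdog timeout and the link down/up pair that follows it. As a rough sketch, the kernel log could be narrowed to just those events with a small filter; `link_events` is a hypothetical helper name, and the sample lines are copied from the output above.

```shell
#!/bin/sh
# Keep only lines that mention a TX watchdog timeout or a link state change.
link_events() {
    grep -iE 'NETDEV WATCHDOG|link (up|down)'
}
# On the host, feed the kernel log through the filter:
#   dmesg | link_events
# Self-contained demo using four lines copied from the dmesg output above:
SAMPLE='[ 9.927534] r8125: eth0: link up
[133076.658774] NETDEV WATCHDOG: eth0 (r8125): transmit queue 1 timed out
[133076.685550] r8125 0000:02:00.0 eth0: reset PHC clock
[133076.706192] r8125: eth0: link down'
printf '%s\n' "$SAMPLE" | link_events
```

Counting the watchdog lines over a few days of uptime would show how often the NIC resets.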
-
It happens across all Windows versions: Windows 11 and Server 2016/2019/2022.
One of the VMs, however, never has the issue.
And if I move the VMs back to the other host with the 1 Gb NIC, all of them behave correctly.
Thank you.
-
@glenlewis09 Just to get more information on the NIC in your system, can you first identify the NICs with
lspci | grep Ethernet
(returns the ID 00:1f.6 on my system)
# lspci | grep Ethernet
00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (11) I219-LM
and then get more details with
lspci -s 00:1f.6 -vv
using the ID of your NIC
# lspci -s 00:1f.6 -vv
00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (11) I219-LM
    Subsystem: Hewlett-Packard Company Device 8715
    Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0
    Interrupt: pin A routed to IRQ 212
    Region 0: Memory at e1200000 (32-bit, non-prefetchable) [size=128K]
    Capabilities: [c8] Power Management version 3
        Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
        Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=1 PME-
    Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Address: 00000000fee00f98 Data: 0000
    Kernel driver in use: e1000e
    Kernel modules: e1000e
While this does not address your issue, it gives more insight into your setup.
Edit: some typos
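The two steps above (find the ID, then query it) can be sketched as a single loop. `nic_ids` is a hypothetical helper name; the Ethernet line in the demo is the one quoted in this post, and the audio line is made up only to show that non-NIC devices are skipped.

```shell
#!/bin/sh
# Read `lspci` output on stdin and print the PCI ID (first field) of every
# line that describes an Ethernet controller.
nic_ids() {
    grep 'Ethernet controller' | cut -d' ' -f1
}
# On the host (not run here), this would print the details of every NIC:
#   for id in $(lspci | nic_ids); do lspci -s "$id" -vv; done
# Demo: one made-up non-NIC line plus the Ethernet line from this post.
SAMPLE='00:1f.3 Audio device: Example audio controller (made-up line)
00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (11) I219-LM'
printf '%s\n' "$SAMPLE" | nic_ids
```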
-
@gskger
[14:39 GLS-XENHOST08 ~]# lspci | grep Ethernet
02:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE Controller (rev 05)
[14:39 GLS-XENHOST08 ~]# ^C
[14:39 GLS-XENHOST08 ~]# lspci -s 02:00.0 -vvv
02:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE Controller (rev 05)
    Subsystem: Realtek Semiconductor Co., Ltd. Device 0123
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0, Cache Line Size: 64 bytes
    Interrupt: pin A routed to IRQ 36
    Region 0: I/O ports at f000 [size=256]
    Region 2: Memory at fce00000 (64-bit, non-prefetchable) [size=64K]
    Region 4: Memory at fce10000 (64-bit, non-prefetchable) [size=16K]
    Capabilities: [40] Power Management version 3
        Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=375mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
        Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
        Address: 0000000000000000 Data: 0000
        Masking: 00000000 Pending: 00000000
    Capabilities: [70] Express (v2) Endpoint, MSI 01
        DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
            ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 75.000W
        DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
            RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop-
            MaxPayload 256 bytes, MaxReadReq 2048 bytes
        DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr+ TransPend-
        LnkCap: Port #0, Speed 5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s unlimited, L1 <64us
            ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
        LnkCtl: ASPM L1 Enabled; RCB 64 bytes Disabled- CommClk+
            ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
        LnkSta: Speed 5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR+, OBFF Via message/WAKE#
        DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR+, OBFF Disabled
        LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
            Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
            Compliance De-emphasis: -6dB
        LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
            EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
    Capabilities: [b0] MSI-X: Enable+ Count=32 Masked-
        Vector table: BAR=4 offset=00000000
        PBA: BAR=4 offset=00000800
    Capabilities: [d0] Vital Product Data
pcilib: sysfs_read_vpd: read failed: Input/output error
        Not readable
    Capabilities: [100 v2] Advanced Error Reporting
        UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
        UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
        CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
        CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
        AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
    Capabilities: [148 v1] Virtual Channel
        Caps: LPEVC=0 RefClk=100ns PATEntryBits=1
        Arb: Fixed- WRR32- WRR64- WRR128-
        Ctrl: ArbSelect=Fixed
        Status: InProgress-
        VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
            Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
            Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=01
            Status: NegoPending- InProgress-
    Capabilities: [168 v1] Device Serial Number 01-00-00-00-68-4c-e0-00
    Capabilities: [178 v1] Transaction Processing Hints
        No steering table available
    Capabilities: [204 v1] Latency Tolerance Reporting
        Max snoop latency: 1048576ns
        Max no snoop latency: 1048576ns
    Capabilities: [20c v1] L1 PM Substates
        L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
            PortCommonModeRestoreTime=150us PortTPowerOnTime=150us
    Capabilities: [21c v1] Vendor Specific Information: ID=0002 Rev=4 Len=100 <?>
    Kernel driver in use: r8125
    Kernel modules: r8125
[14:40 GLS-XENHOST08 ~]#
-
@glenlewis09 Can you please edit your post and format the output as code (insert ``` before and after the output)? This improves readability.
-
@gskger done, thank you for the correction.
-
@glenlewis09 Again, just to double check: is your XCP-ng 8.2.1 fully up to date? Running
yum update
should return
No packages marked for update
The refreshed 8.2.1 ISO from December 2023 contained updated drivers contributed by @Andrew, including the r8125 driver.
-
@gskger said in Issue with VM network dropping in and out:
yum update
[15:25 GLS-XENHOST08 ~]# yum update
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
Excluding mirror: updates.xcp-ng.org
 * xcp-ng-base: mirrors.xcp-ng.org
Excluding mirror: updates.xcp-ng.org
 * xcp-ng-updates: mirrors.xcp-ng.org
No packages marked for update
[15:25 GLS-XENHOST08 ~]#
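If the same check needs to be repeated on several hosts, a non-interactive variant might help. This sketch assumes `yum check-update`, which, per the yum man page, exits with 0 when nothing is pending, 100 when updates are available and 1 on error; `describe_yum_status` is a hypothetical helper name.

```shell
#!/bin/sh
# Translate the exit status of `yum check-update` into a short message.
describe_yum_status() {
    case "$1" in
        0)   echo "up to date" ;;
        100) echo "updates available" ;;
        *)   echo "yum reported an error (status $1)" ;;
    esac
}
# On the host (not run here):
#   yum check-update >/dev/null 2>&1; describe_yum_status "$?"
# Demo of the mapping for the two normal outcomes:
describe_yum_status 0
describe_yum_status 100
```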
-
@glenlewis09 The only thing that really stands out is this error message:
pcilib: sysfs_read_vpd: read failed: Input/output error Not readable
Can you please try
dmesg | grep VPD
and report the output (if any)?
-
@gskger said in Issue with VM network dropping in and out:
dmesg | grep VPD
[15:36 GLS-XENHOST08 ~]# dmesg | grep VPD
[ 5.967152] r8125 0000:02:00.0: invalid short VPD tag 00 at offset 1
[15:36 GLS-XENHOST08 ~]#