Best posts made by vegarnilsen
-
RE: Can't figure out how to configure a separate NFS network on the hosts via XO
Latest posts made by vegarnilsen
-
RE: XCP-ng 8.1 host loses network when running gateway/firewall VMs
@r1 Yup, see https://gist.github.com/vegarnilsen/dce2b5c17cf188f1fa2c7615dc6fefc4 for the modinfo and lsmod output.
@tuxen Since we're not using FibreChannel, I disabled fcoe before the latest test, see the gist above for info.
-
RE: XCP-ng 8.1 host loses network when running gateway/firewall VMs
@r1 We're not using the bnxt_en driver, we're using the bnx2x driver. But given your request I looked for and installed the alternate qlogic driver:
[10:29 oslo5pool3h03 etc]$ rpm -qa | grep qlogic
qlogic-qla2xxx-firmware-8.03.02-1.xcpng8.1.x86_64
qlogic-netxtreme2-4.19.0+1-modules-7.14.53-1.1.xcpng8.1.x86_64
qlogic-qla2xxx-10.01.00.54.80.0_k-1.xcpng8.1.x86_64
qlogic-fastlinq-8.37.30.0-3.xcpng8.1.x86_64
qlogic-netxtreme2-7.14.53-1.1.xcpng8.1.x86_64
[10:29 oslo5pool3h03 etc]$ rpm -qil qlogic-netxtreme2-4.19.0+1-modules-7.14.53-1.1.xcpng8.1.x86_64
Name        : qlogic-netxtreme2-4.19.0+1-modules
Version     : 7.14.53
Release     : 1.1.xcpng8.1
Architecture: x86_64
Install Date: Tue 22 Sep 2020 06:04:01 PM CEST
Group       : System Environment/Kernel
Size        : 3048296
License     : GPL
Signature   : RSA/SHA1, Wed 12 Feb 2020 01:27:25 PM CET, Key ID cd75783a3fd3ac9e
Source RPM  : qlogic-netxtreme2-7.14.53-1.1.xcpng8.1.src.rpm
Build Date  : Wed 12 Feb 2020 01:13:59 PM CET
Build Host  : koji.xcp-ng.org
Relocations : (not relocatable)
Packager    : XCP-ng
Vendor      : XCP-ng
Summary     : Qlogic netxtreme2 device drivers
Description :
Qlogic netxtreme2 device drivers for the Linux Kernel version 4.19.0+1.
/etc/modprobe.d/qlogic-netxtreme2.conf
/lib/modules/4.19.0+1/updates/bnx2.ko
/lib/modules/4.19.0+1/updates/bnx2fc.ko
/lib/modules/4.19.0+1/updates/bnx2i.ko
/lib/modules/4.19.0+1/updates/bnx2x.ko
/lib/modules/4.19.0+1/updates/cnic.ko
[10:29 oslo5pool3h03 etc]$ yum search qlogic
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
Excluding mirror: updates.xcp-ng.org
 * xcp-ng-base: mirrors.xcp-ng.org
Excluding mirror: updates.xcp-ng.org
 * xcp-ng-updates: mirrors.xcp-ng.org
====================================================== N/S matched: qlogic =======================================================
qlogic-fastlinq.x86_64 : Qlogic fastlinq device drivers
qlogic-fastlinq-debuginfo.x86_64 : Debug information for package qlogic-fastlinq
qlogic-netxtreme2.x86_64 : Qlogic NetXtreme II iSCSI, 1-Gigabit and 10-Gigabit ethernet drivers
qlogic-netxtreme2-4.19.0+1-modules.x86_64 : Qlogic netxtreme2 device drivers
qlogic-netxtreme2-alt.x86_64 : Qlogic NetXtreme II iSCSI, 1-Gigabit and 10-Gigabit ethernet drivers
qlogic-netxtreme2-alt-4.19.0+1-modules.x86_64 : Qlogic netxtreme2 device drivers
qlogic-netxtreme2-alt-debuginfo.x86_64 : Debug information for package qlogic-netxtreme2-alt
qlogic-netxtreme2-debuginfo.x86_64 : Debug information for package qlogic-netxtreme2
qlogic-qla2xxx.x86_64 : Qlogic qla2xxx device drivers
qlogic-qla2xxx-debuginfo.x86_64 : Debug information for package qlogic-qla2xxx
qlogic-qla2xxx-firmware.x86_64 : Qlogic qla2xxx firmware
qlogic-qla2xxx-firmware-debuginfo.x86_64 : Debug information for package qlogic-qla2xxx-firmware

  Name and summary matches only, use "search all" for everything.
[10:30 oslo5pool3h03 etc]$ yum info qlogic-netxtreme2-alt-4.19.0+1-modules.x86_64
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
Excluding mirror: updates.xcp-ng.org
 * xcp-ng-base: mirrors.xcp-ng.org
Excluding mirror: updates.xcp-ng.org
 * xcp-ng-updates: mirrors.xcp-ng.org
Available Packages
Name        : qlogic-netxtreme2-alt-4.19.0+1-modules
Arch        : x86_64
Version     : 7.14.63
Release     : 2.xcpng8.1
Size        : 1.2 M
Repo        : xcp-ng-base
Summary     : Qlogic netxtreme2 device drivers
License     : GPL
Description : Qlogic netxtreme2 device drivers for the Linux Kernel
            : version 4.19.0+1.
[10:30 oslo5pool3h03 etc]$ sudo yum install qlogic-netxtreme2-alt-4.19.0+1-modules.x86_64
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
Excluding mirror: updates.xcp-ng.org
 * xcp-ng-base: mirrors.xcp-ng.org
Excluding mirror: updates.xcp-ng.org
 * xcp-ng-updates: mirrors.xcp-ng.org
Resolving Dependencies
--> Running transaction check
---> Package qlogic-netxtreme2-alt-4.19.0+1-modules.x86_64 0:7.14.63-2.xcpng8.1 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

==================================================================================================================================
 Package                                   Arch      Version               Repository     Size
==================================================================================================================================
Installing:
 qlogic-netxtreme2-alt-4.19.0+1-modules    x86_64    7.14.63-2.xcpng8.1    xcp-ng-base    1.2 M

Transaction Summary
==================================================================================================================================
Install  1 Package

Total download size: 1.2 M
Installed size: 2.9 M
Is this ok [y/d/N]: y
Downloading packages:
qlogic-netxtreme2-alt-4.19.0+1-modules-7.14.63-2.xcpng8.1.x86_64.rpm | 1.2 MB 00:00:00
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Installing : qlogic-netxtreme2-alt-4.19.0+1-modules-7.14.63-2.xcpng8.1.x86_64 1/1
  Verifying  : qlogic-netxtreme2-alt-4.19.0+1-modules-7.14.63-2.xcpng8.1.x86_64 1/1

Installed:
  qlogic-netxtreme2-alt-4.19.0+1-modules.x86_64 0:7.14.63-2.xcpng8.1

Complete!
[10:32 oslo5pool3h03 etc]$
I rebooted the server, and booted up a couple of the VMs I'm having issues with, and then I ran ping from one of the internal servers to an external site:
64 bytes from www.vg.no (195.88.54.16): icmp_seq=1338 ttl=248 time=2.88 ms
64 bytes from www.vg.no (195.88.54.16): icmp_seq=1339 ttl=248 time=3.04 ms
64 bytes from www.vg.no (195.88.54.16): icmp_seq=1340 ttl=248 time=3.17 ms
64 bytes from www.vg.no (195.88.54.16): icmp_seq=1341 ttl=248 time=2.91 ms
client_loop: send disconnect: Broken pipe
client_loop: send disconnect: Broken pipe
However, as you can see, this crashed the host after a while and resulted in a host with no network.
-
XCP-ng 8.1 host loses network when running gateway/firewall VMs
We are in the process of migrating our VMs from a XenServer 6.5 pool to a new pool running XCP-ng 8.1. After we migrated some VMs that act as gateways / firewalls for internal networks, the hosts those VMs run on lose network within a few minutes, at times within seconds, of the VM booting up. (The host is still running, and if I log in on the console everything except the network is working.)
The new pool is running on HP BL460c Gen8 blades, with 10Gb Flexfabric NICs, using the bnx2x driver.
When the host loses network these messages appear in kern.log:
[09:34 oslo5pool3h03 log]$ sudo grep bnx2x kern.log | grep timeout
Nov 9 10:24:52 oslo5pool3h03 kernel: [ 1537.425714] bnx2x: [bnx2x_stats_comp:211(eth0)]timeout waiting for stats finished
Nov 9 10:24:54 oslo5pool3h03 kernel: [ 1538.584055] bnx2x: [bnx2x_stats_comp:211(eth0)]timeout waiting for stats finished
Nov 9 10:25:24 oslo5pool3h03 kernel: [ 1568.785236] bnx2x: [bnx2x_stats_comp:211(eth0)]timeout waiting for stats finished
Nov 9 10:25:25 oslo5pool3h03 kernel: [ 1569.940934] bnx2x: [bnx2x_stats_comp:211(eth0)]timeout waiting for stats finished
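To see at a glance which interfaces are affected and how often, the grep above could be extended into a small per-interface counter. This is only a sketch: the function name count_bnx2x_timeouts is made up for illustration, and it assumes the log lines keep exactly the format shown above.

```shell
# Count bnx2x "timeout waiting for stats finished" messages per interface.
# Reads kernel log lines on stdin and prints "<interface> <count>" per line.
# Assumes the message format shown in the kern.log excerpt above.
count_bnx2x_timeouts() {
    sed -n 's/.*bnx2x_stats_comp:[0-9]*(\([^)]*\))]timeout waiting for stats finished.*/\1/p' \
        | sort | uniq -c | awk '{ print $2, $1 }'
}

# Usage on the host, run from the log directory as in the session above:
#   sudo cat kern.log | count_bnx2x_timeouts
```

If more than one NIC shows up in the result, that would suggest the problem is not tied to a single port.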
Some pages I found through Google hint at IO-MMU being the problem. I tried disabling IO-MMU through grub parameters to the kernel, but when I did that the host rebooted immediately when the test VM triggered the problem.
The NICs seem to be on the XenServer HCL, and since these are blade servers I can't swap the NICs for a different chipset: all HPE NICs for this blade generation use that same chipset.
"Regular" VMs are working fine, but VMs with multiple virtual NICs where there's traffic going from one interface to another seem to reliably crash the host.
I've applied all available updates to XCP-ng, but this didn't make any difference.
Since we're not using Fibre Channel, I tried blacklisting the bnx2fc module, and I also tried disabling some offloading:
[09:40 oslo5pool3h03 log]$ cat /etc/modprobe.d/qlogic-netxtreme2.conf
options bnx2x num_vfs=0
options bnx2x disable_tpa=1
[09:40 oslo5pool3h03 log]$ cat /etc/modprobe.d/blacklist-fc.conf
blacklist bnx2fc
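One way to check whether offloads are really off after a reboot is to filter the output of ethtool -k for features still reported as "on". The helper below is a sketch with a made-up name (enabled_offloads); it assumes the usual "feature-name: on/off" lines that ethtool -k prints, and eth0 is taken from the log excerpt earlier in this post.

```shell
# List offload features that are still enabled.
# Reads `ethtool -k <iface>` output on stdin and prints one feature name per line.
enabled_offloads() {
    awk -F': ' '/offload/ && $2 ~ /^on/ { sub(/^[ \t]+/, "", $1); print $1 }'
}

# Usage on the host (eth0 as seen in kern.log):
#   ethtool -k eth0 | enabled_offloads
```

Anything that still shows up here was not turned off by the module options alone.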
Neither of these made any difference.
-
RE: "Fast" and "slow" pool members?
@gn_ro Will there be any CPU masking if they all have the same feature set though?
In my scenario there's only a speed difference between the CPUs, nothing else.
-
"Fast" and "slow" pool members?
I'm planning a new XCP-ng pool, where we plan on using HP Gen8 blades. These are available with a bunch of different CPU options, so I'd like to know if it would be reasonable to have e.g. half the blades with a CPU model that has fewer cores but higher clock speed, and half with a CPU with more cores and lower clock speed. All of the CPUs would be Xeon E5 26xx v2, so they should all have the same CPU features.
Would I be able to live-migrate guests between the fast and slow hosts, or would only cold-migrate be possible in this situation?
With such a setup, could I designate the slow hosts as the default for new guests? I would prefer to reserve the fast hosts for guests that actually need the higher clock speed, typically for single-thread workloads.
Cheers, Vegar
-
RE: Can't figure out how to configure a separate NFS network on the hosts via XO
Aha, that helped. Thanks.
-
Can't figure out how to configure a separate NFS network on the hosts via XO
I've set up a new XCP-ng 8 pool with two hosts and a shared NFS server on a different VLAN from the management network. I can't figure out how to give each host an IP address on this NFS network via XO, and as far as I can tell there's nothing about it in the documentation.
I ended up booting Windows and installing the latest XCP-ng Center, where it's easy: Open each host's Network tab, choose configure in the management network section, and add a new IP address on the correct VLAN.
Once I had added the IP addresses via XCP-ng Center, they showed up in the network list in XO, where I can edit the address and network mask if I want to.
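For reference, the same thing can probably be done without XCP-ng Center by using the xe CLI on each host. The sketch below only prints the commands so they can be reviewed first. The function name, the PIF UUID placeholder, the VLAN tag 20 and the 192.0.2.x address are all hypothetical, and I'm assuming a static address is wanted.

```shell
# Print (not run) the xe commands that would give one PIF a static address.
# All argument values are placeholders supplied by the caller.
print_storage_ip_commands() {
    pif_uuid="$1"; ip="$2"; netmask="$3"
    echo "xe pif-reconfigure-ip uuid=${pif_uuid} mode=static IP=${ip} netmask=${netmask}"
    # Keep the interface from being unplugged while it carries storage traffic.
    echo "xe pif-param-set uuid=${pif_uuid} disallow-unplug=true"
}

# Find the PIF on the NFS VLAN first, e.g. (VLAN tag 20 is hypothetical):
#   xe pif-list VLAN=20 params=uuid,host-name-label,device
# Then review the generated commands before running them on the host:
print_storage_ip_commands "<pif-uuid>" "192.0.2.11" "255.255.255.0"
```

Each host has its own PIF for the VLAN, so this would need to be repeated per host with that host's PIF UUID and address.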
This is with XOA, "Current version: 5.43.2".
Thanks,
Vegar