Thank you for reporting this fix. After I removed the "Number of weekly backups kept" entry, my backups are working again. Last night all our backups failed with the error "entries must be sorted in asc order 2025-01 2025-52".
Posts
-
RE: Long-term retention of backups
-
RE: High Fan Speed Issue on Lenovo ThinkSystem Servers
XenServer has a fix in their latest release for this - https://docs.xenserver.com/en-us/xenserver/8/whats-new/normal
These updates fix the following issues: The fan on Lenovo AMD systems is always at full speed and DIMM temperatures are not reported correctly by the BMC.
Does this mean the same fix can be implemented in XCP-ng 8.3?
-
RE: XCP-ng 8.3 betas and RCs feedback 🚀
Our main pool upgraded from the ISO with no issues. As the posters above noted, though, the master doesn't ask for network config but the other hosts in the pool do. Thankfully all the bonds and networking came up without issue on them.
Will attempt updating our other locations after hours, but I don't foresee any problems.
-
RE: XCP-ng 8.3 betas and RCs feedback 🚀
Seems to be running fine on our 12 hosts, iSCSI came up fine too with all our paths.
-
RE: High Fan Speed Issue on Lenovo ThinkSystem Servers
@ThierryEscande I kept all the files from the acpidump from both the new and old fw. I've run that on both sets of ACPI dumps, which produced quite a few dsl files (one per SSDT), so I've just zipped both folders for you here:
-
RE: High Fan Speed Issue on Lenovo ThinkSystem Servers
Thanks for having a look! I've dumped the ACPI before and after the firmware updates and linked them below for you.
-
RE: High Fan Speed Issue on Lenovo ThinkSystem Servers
I had some time to test today, so I upgraded the FW on a server and disabled IPMI in the kernel as suggested above, and the fans remained spun up.
I disabled ACPI in grub (acpi=off) and the fans didn't spin up, but then dom0 failed to fully load, so that isn't great lol
Is it just a matter of the ACPI kernel driver being outdated? I'm not sure how to check that.
-
RE: High Fan Speed Issue on Lenovo ThinkSystem Servers
@LennertvdBerg The above message is indicating there is no driver written for IPMI BMC KCS for the LM_Sensors application, not that a kernel module is missing in XCPNG.
You can read more about this on the lm_sensors github issue https://github.com/lm-sensors/lm-sensors/issues/69
If you wish to view sensor data in XCP-ng you can still do so through IPMI using ipmitool
ipmitool sensor
This will list the sensors in the server.
DIMM 1 | 0x0 | discrete | 0x0080| na | na | na | na | na | na
DIMM 1 Temp | na | degrees C | na | na | na | na | 85.000 | 87.000 | 91.000
DIMM 2 | 0x0 | discrete | 0x4080| na | na | na | na | na | na
DIMM 2 Temp | 23.000 | degrees C | ok | na | na | na | 85.000 | 87.000 | 91.000
DIMM 3 | 0x0 | discrete | 0x0080| na | na | na | na | na | na
DIMM 3 Temp | na | degrees C | na | na | na | na | 85.000 | 87.000 | 91.000
DIMM 4 | 0x0 | discrete | 0x4080| na | na | na | na | na | na
DIMM 4 Temp | 25.000 | degrees C | ok | na | na | na | 85.000 | 87.000 | 91.000
DIMM 5 | 0x0 | discrete | 0x4080| na | na | na | na | na | na
DIMM 5 Temp | 26.000 | degrees C | ok | na | na | na | 85.000 | 87.000 | 91.000
DIMM 6 | 0x0 | discrete | 0x4080| na | na | na | na | na | na
DIMM 6 Temp | 26.000 | degrees C | ok | na | na | na | 85.000 | 87.000 | 91.000
DIMM 7 | 0x0 | discrete | 0x4080| na | na | na | na | na | na
DIMM 7 Temp | 26.000 | degrees C | ok | na | na | na | 85.000 | 87.000 | 91.000
DIMM 8 | 0x0 | discrete | 0x4080| na | na | na | na | na | na
DIMM 8 Temp | 26.000 | degrees C | ok | na | na | na | 85.000 | 87.000 | 91.000
DIMM 9 | 0x0 | discrete | 0x4080| na | na | na | na | na | na
DIMM 9 Temp | 25.000 | degrees C | ok | na | na | na | 85.000 | 87.000 | 91.000
DIMM 10 | 0x0 | discrete | 0x0080| na | na | na | na | na | na
DIMM 10 Temp | na | degrees C | na | na | na | na | 85.000 | 87.000 | 91.000
DIMM 11 | 0x0 | discrete | 0x4080| na | na | na | na | na | na
DIMM 11 Temp | 24.000 | degrees C | ok | na | na | na | 85.000 | 87.000 | 91.000
The issue is that after upgrading the UEFI, the sensor data starts reading NA for the RAM modules, which spins up the fans on the server. I don't know how to determine what on the OS is causing that, but it sounds like something is trying to read that information and is locking up the sensor.
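If you want to watch for this without eyeballing the whole list, here is a rough Python sketch that picks out temperature sensors reading "na" from text in the pipe-separated layout shown above. The function name and the SAMPLE text are just made up for illustration, and it assumes the column order is name | reading | unit | status like in my output; I haven't checked it against other BMCs.

```python
from typing import List


def unreadable_temps(sensor_output: str) -> List[str]:
    """Return the names of temperature sensors whose reading is 'na'.

    Assumes the pipe-separated layout shown above:
    name | reading | unit | status | thresholds...
    """
    missing = []
    for line in sensor_output.splitlines():
        fields = [f.strip() for f in line.split("|")]
        if len(fields) < 4:
            continue  # not a sensor row
        name, reading, unit = fields[0], fields[1], fields[2]
        # Only temperature rows matter here; discrete rows are skipped.
        if unit == "degrees C" and reading.lower() == "na":
            missing.append(name)
    return missing


# Hypothetical sample, copied from a few rows of the output above.
SAMPLE = """\
DIMM 1 | 0x0 | discrete | 0x0080| na | na | na | na | na | na
DIMM 1 Temp | na | degrees C | na | na | na | na | 85.000 | 87.000 | 91.000
DIMM 2 | 0x0 | discrete | 0x4080| na | na | na | na | na | na
DIMM 2 Temp | 23.000 | degrees C | ok | na | na | na | 85.000 | 87.000 | 91.000
"""
```

Feeding it the captured text of ipmitool sensor (for example via subprocess) would give the list of DIMM temp sensors that the BMC cannot read at that moment.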
-
RE: High Fan Speed Issue on Lenovo ThinkSystem Servers
@Gheppy Our SR635v3 are running factory installed Lenovo hardware, although I did test swapping out the Broadcom 57504 OCP NIC with Intel X710 but the fan issue persisted. Thanks for the suggestion!
-
RE: High Fan Speed Issue on Lenovo ThinkSystem Servers
@RIX_IT I tried kernel-alt before rolling back the UEFI and it didn't help. I also tried the new Xen beta stormi posted (4.19 was it?) in case something was added, but that didn't fix it either.
Thanks for trying the new Lenovo fw - I saw it come out and was going to test when I had time but now I don't need to!
-
RE: High Fan Speed Issue on Lenovo ThinkSystem Servers
@RIX_IT I rolled my XCC and UEFI back to last May and the fans finally quieted down.
BMC Version
2.10 (Build ID: KAX318R)
UEFI Version
1.41 (Build ID: KAE110K)
I noticed that when the fans spun up, the XCC page was showing multiple DIMM temps as NA, and when I checked
ipmitool sensor
in the XCP-ng terminal, the DIMM # Temp was also 0 when the fans were up. I'm assuming the old kernel XCP-ng runs is causing some havoc with the UEFI Lenovo has on these servers and is preventing that sensor from being read, so XCC freaks out and spins the fans up because it assumes the temp on that sensor is really high. No idea why the old XCC/UEFI seem to work though.
-
RE: High Fan Speed Issue on Lenovo ThinkSystem Servers
@bleader My older UEFI fix didn't work for long, as the fans were spun up again this morning. The servers are nice and quiet when booting to AlmaLinux or, like RIX_IT mentioned, Windows Server, so could this be an issue with kernel 4.19 on XCP-ng not playing well with new EPYC chipsets?
-
RE: High Fan Speed Issue on Lenovo ThinkSystem Servers
@bleader yeah I'm not sure how the OS could affect that either and certainly haven't seen that in the past...very odd.
I am on 8.3 beta 2. I haven't tested with 8.2
-
RE: High Fan Speed Issue on Lenovo ThinkSystem Servers
Let me know if you figure this out, I've been fighting with this issue for the last couple of days as well. We have SR635v3 servers with AMD EPYC 9354P processors. After GRUB the fans spin down and the kernel begins to load. Once I see "efi: EFI_MEMMAP is not enabled" the fans spin back up.
In the interim the fix has been to roll the UEFI firmware back to 1.43, which loses some microcode updates, but at least our fans aren't screaming at 16k RPM in an office environment.
-
RE: Xen 4.17 on XCP-ng 8.3!
Can you run two different kernels in one pool? Can I test one host with 4.17 and then roll it back to 4.13 after testing completes without messing up my pool?
-
RE: XCP-ng 8.3 betas and RCs feedback 🚀
Not sure if this is a known issue; I couldn't find it on GitHub, but maybe I was looking in the wrong spot. Is anyone else's network bond speed showing 0? I thought I had seen a fix for this in a release a month ago, but now I can't find that changelog.
-
RE: XCP-ng 8.3 betas and RCs feedback 🚀
@Tristis-Oris Try a
yum clean all
and then redo your
yum update
XO Update seems to be working well for us on 3 hosts!
-
RE: XCP-ng 8.3 betas and RCs feedback 🚀
Updated to latest XCP 8.3 build, however I'm on XOA Premium and that hasn't updated yet so I can't confirm stats restored.
However, my issue with NFS speed remains (currently only getting 50MB/s write speed inside VMs or when migrating a VM, but when I ssh onto the host I can write to the mounted NFS share at 480MB/s).
My Bond0 speed is still not registering. xe pif-param-list shows speed 0, and I'm still seeing errors in my xensource.log
[09:44 xcpng-core-3 ~]# xe pif-param-list uuid=13924cb4-288a-b340-e9ac-0034870c2bb5
uuid ( RO) : 13924cb4-288a-b340-e9ac-0034870c2bb5
device ( RO): bond0
MAC ( RO): fe:ff:ff:ff:ff:ff
physical ( RO): false
managed ( RO): true
currently-attached ( RO): true
MTU ( RO): 9000
VLAN ( RO): 500
vlan-master-of ( RO): 0f0b3661-03aa-79a3-9592-21c6512e2aff
vlan-slave-of ( RO):
bond-master-of ( RO):
bond-slave-of ( RO): <not in database>
sriov-physical-PIF-of ( RO):
sriov-logical-PIF-of ( RO):
tunnel-access-PIF-of ( RO):
tunnel-transport-PIF-of ( RO):
management ( RO): false
network-uuid ( RO): 2f450b68-035d-77d6-3478-69779015cc4e
network-name-label ( RO): Storage
host-uuid ( RO): 17cde123-69de-4e13-9306-35ac7d1e25bd
host-name-label ( RO): xcpng-core-1
IP-configuration-mode ( RO): Static
IP ( RO): 10.10.30.30
netmask ( RO): 255.255.255.0
gateway ( RO):
IPv6-configuration-mode ( RO): Static
IPv6 ( RO):
IPv6-gateway ( RO):
primary-address-type ( RO): IPv4
DNS ( RO):
properties (MRO):
capabilities (SRO):
io_read_kbs ( RO): 51259.102
io_write_kbs ( RO): 52415.578
carrier ( RO): false
vendor-id ( RO):
vendor-name ( RO):
device-id ( RO):
device-name ( RO):
speed ( RO): 0 Mbit/s
duplex ( RO): unknown
disallow-unplug ( RW): false
pci-bus-path ( RO): N/A
other-config (MRW):
igmp-snooping-status ( RO): unknown
Feb 23 09:47:00 xcpng-core-3 xcp-networkd: [error||3 ||network_utils] Error in read one line of file: /sys/class/net/bond0/carrier, exception Unix.Unix_error(Unix.ENOENT, "open", "/sys/class/net/bond0/carrier")\x0ARaised by primitive operation at Xapi_stdext_unix__Unixext.with_file in file "lib/xapi-stdext-unix/unixext.ml", line 90, characters 11-40\x0ACalled from Xapi_stdext_unix__Unixext.buffer_of_file in file "lib/xapi-stdext-unix/unixext.ml" (inlined), line 177, characters 31-83\x0ACalled from Xapi_stdext_unix__Unixext.string_of_file in file "lib/xapi-stdext-unix/unixext.ml", line 179, characters 47-73\x0ACalled from Network_utils.Sysfs.read_one_line in file "ocaml/networkd/lib/network_utils.ml", line 156, characters 6-33\x0A
Feb 23 09:47:00 xcpng-core-3 xcp-networkd: [error||3 ||network_utils] Error in read one line of file: /sys/class/net/bond0/device/device, exception Unix.Unix_error(Unix.ENOENT, "open", "/sys/class/net/bond0/device/device")\x0ARaised by primitive operation at Xapi_stdext_unix__Unixext.with_file in file "lib/xapi-stdext-unix/unixext.ml", line 90, characters 11-40\x0ACalled from Xapi_stdext_unix__Unixext.buffer_of_file in file "lib/xapi-stdext-unix/unixext.ml" (inlined), line 177, characters 31-83\x0ACalled from Xapi_stdext_unix__Unixext.string_of_file in file "lib/xapi-stdext-unix/unixext.ml", line 179, characters 47-73\x0ACalled from Network_utils.Sysfs.read_one_line in file "ocaml/networkd/lib/network_utils.ml", line 156, characters 6-33\x0A
Feb 23 09:47:00 xcpng-core-3 xcp-networkd: [error||3 ||network_utils] Error in read one line of file: /sys/class/net/bond0/device/vendor, exception Unix.Unix_error(Unix.ENOENT, "open", "/sys/class/net/bond0/device/vendor")\x0ARaised by primitive operation at Xapi_stdext_unix__Unixext.with_file in file "lib/xapi-stdext-unix/unixext.ml", line 90, characters 11-40\x0ACalled from Xapi_stdext_unix__Unixext.buffer_of_file in file "lib/xapi-stdext-unix/unixext.ml" (inlined), line 177, characters 31-83\x0ACalled from Xapi_stdext_unix__Unixext.string_of_file in file "lib/xapi-stdext-unix/unixext.ml", line 179, characters 47-73\x0ACalled from Network_utils.Sysfs.read_one_line in file "ocaml/networkd/lib/network_utils.ml", line 156, characters 6-33\x0A
Feb 23 09:47:05 xcpng-core-3 xcp-networkd: [error||3 ||network_utils] Error in read one line of file: /sys/class/net/bond0/carrier, exception Unix.Unix_error(Unix.ENOENT, "open", "/sys/class/net/bond0/carrier")\x0ARaised by primitive operation at Xapi_stdext_unix__Unixext.with_file in file "lib/xapi-stdext-unix/unixext.ml", line 90, characters 11-40\x0ACalled from Xapi_stdext_unix__Unixext.buffer_of_file in file "lib/xapi-stdext-unix/unixext.ml" (inlined), line 177, characters 31-83\x0ACalled from Xapi_stdext_unix__Unixext.string_of_file in file "lib/xapi-stdext-unix/unixext.ml", line 179, characters 47-73\x0ACalled from Network_utils.Sysfs.read_one_line in file "ocaml/networkd/lib/network_utils.ml", line 156, characters 6-33\x0A
Feb 23 09:47:05 xcpng-core-3 xcp-networkd: [error||3 ||network_utils] Error in read one line of file: /sys/class/net/bond0/device/device, exception Unix.Unix_error(Unix.ENOENT, "open", "/sys/class/net/bond0/device/device")\x0ARaised by primitive operation at Xapi_stdext_unix__Unixext.with_file in file "lib/xapi-stdext-unix/unixext.ml", line 90, characters 11-40\x0ACalled from Xapi_stdext_unix__Unixext.buffer_of_file in file "lib/xapi-stdext-unix/unixext.ml" (inlined), line 177, characters 31-83\x0ACalled from Xapi_stdext_unix__Unixext.string_of_file in file "lib/xapi-stdext-unix/unixext.ml", line 179, characters 47-73\x0ACalled from Network_utils.Sysfs.read_one_line in file "ocaml/networkd/lib/network_utils.ml", line 156, characters 6-33\x0A
Feb 23 09:47:05 xcpng-core-3 xcp-networkd: [error||3 ||network_utils] Error in read one line of file: /sys/class/net/bond0/device/vendor, exception Unix.Unix_error(Unix.ENOENT, "open", "/sys/class/net/bond0/device/vendor")\x0ARaised by primitive operation at Xapi_stdext_unix__Unixext.with_file in file "lib/xapi-stdext-unix/unixext.ml", line 90, characters 11-40\x0ACalled from Xapi_stdext_unix__Unixext.buffer_of_file in file "lib/xapi-stdext-unix/unixext.ml" (inlined), line 177, characters 31-83\x0ACalled from Xapi_stdext_unix__Unixext.string_of_file in file "lib/xapi-stdext-unix/unixext.ml", line 179, characters 47-73\x0ACalled from Network_utils.Sysfs.read_one_line in file "ocaml/networkd/lib/network_utils.ml", line 156, characters 6-33\x0A
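The stack traces bury the useful part, which is just the path that could not be opened. As a rough sketch, this Python snippet counts the missing paths in log text like the above. The regex is inferred only from the ENOENT lines quoted here, and the function name and SAMPLE_LOG are my own, so treat it as an assumption rather than a description of every xcp-networkd error format.

```python
import re
from collections import Counter

# Pattern inferred from the quoted log lines; other error shapes are not handled.
_ENOENT = re.compile(r'Unix\.Unix_error\(Unix\.ENOENT, "open", "([^"]+)"\)')


def missing_paths(log_text):
    """Count how often each path appears in an ENOENT 'open' error."""
    return Counter(_ENOENT.findall(log_text))


# Hypothetical sample: shortened copies of the entries above (traces trimmed).
SAMPLE_LOG = r'''
Feb 23 09:47:00 xcpng-core-3 xcp-networkd: [error||3 ||network_utils] Error in read one line of file: /sys/class/net/bond0/carrier, exception Unix.Unix_error(Unix.ENOENT, "open", "/sys/class/net/bond0/carrier")\x0ARaised by primitive operation
Feb 23 09:47:00 xcpng-core-3 xcp-networkd: [error||3 ||network_utils] Error in read one line of file: /sys/class/net/bond0/device/vendor, exception Unix.Unix_error(Unix.ENOENT, "open", "/sys/class/net/bond0/device/vendor")\x0ARaised by primitive operation
Feb 23 09:47:05 xcpng-core-3 xcp-networkd: [error||3 ||network_utils] Error in read one line of file: /sys/class/net/bond0/carrier, exception Unix.Unix_error(Unix.ENOENT, "open", "/sys/class/net/bond0/carrier")\x0ARaised by primitive operation
'''
```

Calling missing_paths on the contents of the log file gives a per-path tally, which makes it easy to see that every failure is under /sys/class/net/bond0.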
--Edit--
It appears that the /sys/class/net folder doesn't exist for Bond0, which is why it can't be read. My physical NIC speeds are correct, but those folders do exist.
[09:51 xcpng-core-1 2d00931d-0af4-1ec8-4bca-08d822c6c335]# ls /sys/class/net/
eth0 eth1 eth2 eth3 lo ovs-system vif1.0 vif10.0 vif11.0 vif12.0 vif13.0 vif14.0 vif15.0 vif16.0 vif17.0 vif18.0 vif19.0 vif2.0 vif20.0 vif3.0 vif4.0 vif5.0 vif6.0 vif7.0 vif8.0 vif9.0 xapi0 xapi1 xapi2 xapi4 xapi5 xapi6
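To compare interfaces quickly, here is a small Python sketch that reads the speed file for each name under /sys/class/net and returns None when the folder or file is missing or unreadable. The function name and the sysfs_root parameter are made up for illustration (the parameter is only there so it can be pointed at a test directory); it is not what xcp-networkd does internally.

```python
import os


def link_speeds(names, sysfs_root="/sys/class/net"):
    """Map each interface name to its sysfs 'speed' value (Mbit/s), or None.

    None means the speed file was missing or could not be read or parsed,
    which is the situation described above for bond0.
    """
    result = {}
    for name in names:
        path = os.path.join(sysfs_root, name, "speed")
        try:
            with open(path) as fh:
                result[name] = int(fh.read().strip())
        except (OSError, ValueError):
            result[name] = None
    return result
```

On my hosts I would expect something like link_speeds(["eth0", "bond0"]) to return a number for eth0 and None for bond0, since there is no bond0 folder in the listing above.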
--Edit2--
I migrated some more VMs onto NFS and the migrations themselves seem to be hard capped at 50MB/s, however I am able to get around 250MB/s when testing INSIDE the VM. So I'm not sure if migrations are CPU capped or what's happening there.
I also noticed that bond0 doesn't show up in ifconfig either; what appears instead are the xapi networks, and each numbered xapi network is a VLAN on my Bond0 interface. Looks like xcp-networkd is just looking in the wrong location.
xapi4: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 9000
        inet 10.10.30.30 netmask 255.255.255.0 broadcast 10.10.30.255
        ether 00:62:0b:bf:8f:44 txqueuelen 1000 (Ethernet)
        RX packets 22114210 bytes 138926062482 (129.3 GiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 15131116 bytes 94882708122 (88.3 GiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
-
RE: Is it possible yet to select a specific network adapter for Backups?
On the Pool > Advanced tab you can select a backup network, so if you created a backup network for a specific NIC you could set it there.
-
RE: XCP-ng 8.3 betas and RCs feedback 🚀
Same here (ticket is in).
Anyone having issues with NFS SR performance too? Since the update my VMs are transferring about 140MB/s vs before the update we were seeing speeds around 440MB/s.
I'm using "dd if=/dev/zero of=tmp bs=1G oflag=dsync count=1" to test. If I SSH onto the host I can get full speed to the NFS mount, but from within a VM it is much slower. This means backups are also crawling, since XOA can't read the VHD very quickly.
iSCSI performance seems unaffected.
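If anyone wants to repeat the write test somewhere dd isn't handy, here is a rough Python sketch of the same idea: write zeros with O_DSYNC and report MB/s. The function name and default sizes are my own choices. Note that it syncs after every block (1 MiB by default) rather than doing one 1G write like the dd command above, so the numbers are not directly comparable, and os.O_DSYNC is only available on Unix-like systems.

```python
import os
import time


def sync_write_mb_per_s(path, total_bytes=64 * 1024 * 1024, block_size=1024 * 1024):
    """Write total_bytes of zeros to path with O_DSYNC; return MB/s (10^6 bytes)."""
    block = b"\0" * block_size
    # O_DSYNC makes each write wait until the data has reached the storage.
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_DSYNC
    fd = os.open(path, flags, 0o600)
    written = 0
    start = time.perf_counter()
    try:
        while written < total_bytes:
            # os.write may write fewer bytes than asked, so track the real count.
            written += os.write(fd, block[: total_bytes - written])
    finally:
        os.close(fd)
    elapsed = time.perf_counter() - start
    return written / elapsed / 1e6
```

Running it once against a file on the NFS mount from the host and once from inside a VM should show the same kind of gap I'm describing, assuming the problem is in the write path and not specific to dd.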