@stormi I’ll be back on Wednesday (I’m on a short holiday now). I’ll try your advice and see how it works.
Posts made by LennertvdBerg
-
RE: ISO modification with additional RPM for NIC
-
RE: ISO modification with additional RPM for NIC
@stormi I thought it would be convenient to have everything in one ISO, as that makes installation easy. But I can check this option as well. So you recommend extracting the ISO to a separate USB drive and loading the drivers from there?
-
RE: High Fan Speed Issue on Lenovo ThinkSystem Servers
@olivierlambert would there be a way after GRUB to walk step by step through the boot and see where it goes wrong?
-
RE: ISO modification with additional RPM for NIC
@stormi Hi, some help would be welcome. I still haven’t found a solution.
-
RE: High Fan Speed Issue on Lenovo ThinkSystem Servers
@ThierryEscande I've updated to Xen 4.17 and it seems the upgrade went fine:
host : xcp-ng-test1
release : 4.19.0+1
version : #1 SMP Wed Jan 24 17:19:11 CET 2024
machine : x86_64
nr_cpus : 64
max_cpu_id : 63
nr_nodes : 1
cores_per_socket : 32
threads_per_core : 2
cpu_mhz : 3245.126
hw_caps : 178bf3ff:7efa320b:2e500800:244037ff:0000000f:f1bf97a9:00405fce:00000780
virt_caps : pv hvm hvm_directio pv_directio hap gnttab-v1 gnttab-v2
total_memory : 130850
free_memory : 121721
sharing_freed_memory : 0
sharing_used_memory : 0
outstanding_claims : 0
free_cpus : 0
xen_major : 4
xen_minor : 17
xen_extra : .3-3
xen_version : 4.17.3-3
xen_caps : xen-3.0-x86_64 hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
xen_scheduler : credit
xen_pagesize : 4096
platform_params : virt_start=0xffff800000000000
xen_changeset : $Format:%H$, pq ???
xen_commandline : dom0_mem=7568M,max:7568M watchdog ucode=scan dom0_max_vcpus=1-16 crashkernel=256M,below=4G console=vga vga=mode-0x0311
cc_compiler : gcc (GCC) 11.2.1 20210728 (Red Hat 11.2.1-1)
cc_compile_by : mockbuild
cc_compile_domain : [unknown]
cc_compile_date : Wed Feb 28 10:12:19 CET 2024
build_id : 9a011a28e29a21a7643376b36aec959253587d42
xend_config_format : 4
However, the issues with the fan speeds and missing memory temperature readings still persist.
-
RE: High Fan Speed Issue on Lenovo ThinkSystem Servers
@ThierryEscande I'm having difficulty installing the kernel-alt package on my system. I'm currently running XCP-ng 8.3.0-beta2. It appears that there might be a problem with updategrub.py. Any guidance on how to resolve this would be greatly appreciated.
I've updated xcp-ng.repo, and this is the output of yum --enablerepo=xcp-ng-tescande install kernel-alt:
[20:12 xcp-ng-test1 ~]# yum --enablerepo=xcp-ng-tescande install kernel-alt
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
Excluding mirror: updates.xcp-ng.org
 * xcp-ng-base: mirrors.xcp-ng.org
Excluding mirror: updates.xcp-ng.org
 * xcp-ng-updates: mirrors.xcp-ng.org
Resolving Dependencies
--> Running transaction check
---> Package kernel-alt.x86_64 0:4.19.309-1.0.lenovotest.2.xcpng8.3 will be installed
--> Finished Dependency Resolution
Dependencies Resolved
================================================================================
 Package      Arch     Version                              Repository        Size
================================================================================
Installing:
 kernel-alt   x86_64   4.19.309-1.0.lenovotest.2.xcpng8.3   xcp-ng-tescande   30 M
Transaction Summary
================================================================================
Install 1 Package
Total download size: 30 M
Installed size: 154 M
Is this ok [y/d/N]: y
Downloading packages:
kernel-alt-4.19.309-1.0.lenovotest.2.xcpng8.3.x86_64.rpm | 30 MB 00:00:01
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Installing : kernel-alt-4.19.309-1.0.lenovotest.2.xcpng8.3.x86_64 1/1
/var/tmp/rpm-tmp.l9rbzO: line 9: /opt/xensource/bin/updategrub.py: No such file or directory
warning: %post(kernel-alt-4.19.309-1.0.lenovotest.2.xcpng8.3.x86_64) scriptlet failed, exit status 127
Non-fatal POSTIN scriptlet failure in rpm package kernel-alt-4.19.309-1.0.lenovotest.2.xcpng8.3.x86_64
  Verifying : kernel-alt-4.19.309-1.0.lenovotest.2.xcpng8.3.x86_64 1/1
Installed:
  kernel-alt.x86_64 0:4.19.309-1.0.lenovotest.2.xcpng8.3
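The failing %post scriptlet calls /opt/xensource/bin/updategrub.py, which the output above reports as missing. Below is a minimal sketch of how one might confirm that before retrying the install. The function name missing_helpers is hypothetical, and the commented rpm and yum invocations are ordinary usage of those tools that I am assuming work the same way on this host:

```shell
# Hypothetical helper: print each given path that is not an executable file.
missing_helpers() {
    for path in "$@"; do
        [ -x "$path" ] || printf '%s\n' "$path"
    done
}

# Usage on the XCP-ng host (path taken from the error message above):
#   missing_helpers /opt/xensource/bin/updategrub.py
#
# To see what the package scriptlets run, and which package (if any)
# would provide the missing script:
#   rpm -q --scripts kernel-alt
#   yum provides /opt/xensource/bin/updategrub.py
```

If the helper prints the path, the scriptlet will fail again on a reinstall, so the boot entry for kernel-alt would presumably still have to be added some other way.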
-
RE: High Fan Speed Issue on Lenovo ThinkSystem Servers
@ThierryEscande When I run lsmod | grep ipmi, I get the following results:
ipmi_si 65536 0
ipmi_devintf 20480 0
ipmi_msghandler 61440 2 ipmi_devintf,ipmi_si
So, I created the file with vi /etc/modprobe.d/blacklist-ipmi.conf and added the following:
blacklist ipmi_si
blacklist ipmi_devintf
blacklist ipmi_msghandler
I saved the file and rebooted the system using shutdown -r now. However, I still don't see the memory temperatures in Xclarity, and the server's fans are still running at over 13,000 RPM. The system is running XCP-NG 8.3 beta 2 with kernel 4.19.0+1.
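After the reboot it is worth checking whether the blacklist actually kept the modules from loading. As far as I understand modprobe.d, a blacklist line only makes modprobe ignore a module's built-in aliases, so a module can still end up loaded if something requests it by name or if it is loaded from the initrd. A minimal sketch of such a check; the function name loaded_ipmi_modules is hypothetical:

```shell
# Hypothetical helper: read lsmod-style text on stdin and print the names
# of any modules whose name starts with "ipmi_".
loaded_ipmi_modules() {
    awk '$1 ~ /^ipmi_/ { print $1 }'
}

# Usage on the host after the reboot (prints nothing if none are loaded):
#   lsmod | loaded_ipmi_modules
```

If this still lists ipmi_si and the others, the blacklist did not take effect and the test says nothing yet about whether the IPMI driver is responsible for the fan behaviour.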
-
RE: ISO modification with additional RPM for NIC
@stormi could you maybe advise what I'm doing wrong?
-
RE: ISO modification with additional RPM for NIC
@Danp, UPDATED: I tried booting with an alternate kernel in XCP-NG 8.2.1 and XCP-NG 8.3 beta 2, but it didn't load the Mellanox ConnectX-6 Lx 10/25GbE drivers.
Yes, I've read the documentation about creating a custom ISO and have detailed my procedure above. The only part I'm unsure about is this:
"you need to add new RPMs not just replace existing ones, they need to be pulled by another existing RPM as dependencies. If there's none suitable, you can add the dependency to the xcp-ng-deps RPM."
I couldn’t work out how to carry out this step.
-
ISO modification with additional RPM for NIC
I'm fairly new to XCP-NG and would like to build a custom XCP-NG ISO that includes an additional RPM for a Mellanox ConnectX-6 Lx 10/25GbE SFP28. The problem is that I don't have any other NICs installed, and I can't install XCP-NG 8.2.1 because the installer detects that there's no NIC in the system. I can install XCP-NG 8.3 beta 2, as the drivers are included there. So, I would like to include the Mellanox drivers in the ISO so that the installer detects the NIC automatically and I can run the installation.
In xcp-ng-8.3.0-beta2, there's an additional mellanox-mlnxen-5.4_1.0.3.0-4.xcpng8.3.x86_64.rpm in the Packages/ directory. In xcp-ng-8.2.1-20231130, there is no mellanox-mlnxen*.rpm at all. I found two Mellanox RPMs on Koji:
- mellanox-mlnxen-alt-5.4_1.0.3.0-1.xcpng8.2.x86_64.rpm (https://koji.xcp-ng.org/buildinfo?buildID=2620)
- mellanox-mlnxen-alt-5.9_0.5.5.0-1.1.xcpng8.2.x86_64.rpm (https://koji.xcp-ng.org/buildinfo?buildID=2868)
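Note that the two Koji builds carry a different package name (mellanox-mlnxen-alt) from the RPM on the 8.3 ISO (mellanox-mlnxen). A small sketch that splits an RPM file name into its parts makes this easy to compare. It assumes the usual name-version-release.arch.rpm naming, and the function name split_rpm_filename is hypothetical:

```shell
# Hypothetical helper: split "name-version-release.arch.rpm" into its parts.
# Assumes the conventional RPM file naming; prints "name version release arch".
# (Variables are global, as plain POSIX sh has no "local".)
split_rpm_filename() {
    base=${1##*/}        # drop any leading directory
    base=${base%.rpm}    # drop the .rpm suffix
    arch=${base##*.}     # text after the last dot
    rest=${base%.*}
    release=${rest##*-}  # text after the last dash
    rest=${rest%-*}
    version=${rest##*-}  # text after the next-to-last dash
    name=${rest%-*}      # everything before the version
    printf '%s %s %s %s\n' "$name" "$version" "$release" "$arch"
}

# Usage:
#   split_rpm_filename mellanox-mlnxen-alt-5.4_1.0.3.0-1.xcpng8.2.x86_64.rpm
```

Because the names differ, these RPMs are additions to the 8.2.1 package set rather than replacements of an existing package.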
I tried following the instructions in the XCP-NG ISO modification documentation.
First, I extracted the ISO using the following commands:
mkdir tmpmountdir/
mount -o loop filename.iso tmpmountdir/ # as root
cp -a tmpmountdir/. iso
umount tmpmountdir/ # as root
chmod a+w iso/ -R
Then, I used wget to download the RPMs into the Packages/ directory. After this, I updated the repodata/ using the following commands (remember to install createrepo-c first):
sudo apt install createrepo-c
rm repodata/ -rf
createrepo_c . -o .
Finally, I built the ISO using the instructions given in the XCP-NG documentation:
#OUTPUT=/path/to/destination/iso/file # change me
OUTPUT=/home/xcp-ng/new_iso/xcp-ng-8.2.1-20231130-mod.iso
VERSION=8.2 # change me
genisoimage -o $OUTPUT -v -r -J --joliet-long -V "XCP-ng $VERSION" -c boot/isolinux/boot.cat -b boot/isolinux/isolinux.bin \
-no-emul-boot -boot-load-size 4 -boot-info-table -eltorito-alt-boot -e boot/efiboot.img -no-emul-boot .
isohybrid --uefi $OUTPUT
However, when I use this ISO, the Mellanox ConnectX-6 Lx drivers do not load during installation.
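One thing that can be ruled out before rebuilding is that the RPM or the regenerated index never made it into the ISO tree. A minimal pre-flight sketch follows. The function name check_iso_tree is hypothetical, and it assumes the extracted tree lives in iso/ with createrepo_c run at its top level, as in the steps above:

```shell
# Hypothetical pre-flight check, to run before genisoimage.
# $1: extracted ISO directory, $2: glob pattern for the added RPM.
check_iso_tree() {
    iso_dir=$1
    pattern=$2
    status=0
    # At least one file in Packages/ must match the pattern; if nothing
    # matches, the shell leaves the pattern unexpanded and -e fails.
    set -- "$iso_dir"/Packages/$pattern
    if [ ! -e "$1" ]; then
        echo "no RPM matching $pattern in $iso_dir/Packages" >&2
        status=1
    fi
    # createrepo_c writes its top-level index to repodata/repomd.xml.
    if [ ! -f "$iso_dir/repodata/repomd.xml" ]; then
        echo "missing $iso_dir/repodata/repomd.xml" >&2
        status=1
    fi
    return $status
}

# Usage:
#   check_iso_tree iso 'mellanox-mlnxen*.rpm' && echo "tree looks complete"
```

Passing this check only shows that the files are on the ISO. It does not show that the installer will install the package: according to the documentation quoted in this thread, a newly added RPM also has to be pulled in as a dependency of an existing RPM, for example xcp-ng-deps.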
Also, I have seen on the Nvidia website that new drivers for the ConnectX-6 Lx are available for Citrix XenServer Host 8.2, in version mlnx-en-23.10-2.1.3.1-xenserver8.2-x86_64.
So my questions are:
- What am I doing wrong with building the ISO and including the RPMs?
- Is it possible to include the mlnx-en-23.10-2.1.3.1-xenserver8.2-x86_64 for XCP-NG 8.2?
- What steps do I need to take, and how?
-
RE: High Fan Speed Issue on Lenovo ThinkSystem Servers
@rmaclachlan Thanks. I'm also unsure how we can determine what in the OS is causing this issue. Are there other installations or modifications we could try to help isolate the problem, such as another Linux distribution with the same kernel, to see if it's a kernel-related issue? @gduperrey or @olivierlambert any suggestions how we can help the team with identifying this?
-
RE: High Fan Speed Issue on Lenovo ThinkSystem Servers
@olivierlambert can you help us with providing the module for the xen kernel, which @Gheppy is talking about?
-
RE: High Fan Speed Issue on Lenovo ThinkSystem Servers
@Gheppy I've just reinstalled xcp-ng-8.3.0-beta2 after my Ubuntu experiment and installed lm_sensors. The output is indeed:
Driver `to-be-written':
  * ISA bus, address 0xcc0
    Chip `IPMI BMC KCS' (confidence: 8)
Note: there is no driver for IPMI BMC KCS yet.
Check http://www.lm-sensors.org/wiki/Devices for updates.
No modules to load, skipping modules configuration.
Unloading i2c-dev... OK
Unloading cpuid... OK
The complete output is:
What would be the solution for this?
-
RE: High Fan Speed Issue on Lenovo ThinkSystem Servers
@Gheppy I just installed Ubuntu 22.04.4 LTS with kernel 5.15.0-102-generic to test whether there could be something like a 'vendor lock'. With Ubuntu I can see my memory temperatures, and all my fan speeds are around 6000 rpm. So it really seems to be something with XCP and Lenovo.
-
RE: High Fan Speed Issue on Lenovo ThinkSystem Servers
@gduperrey said in High Fan Speed Issue on Lenovo ThinkSystem Servers:
https://docs.xcp-ng.org/installation/hardware/#-alternate-kernel
I tried installing the kernel-alt using
yum install kernel-alt
However, I receive the following error:
python: can't open file '/opt/xensource/bin/updategrub.py': [Errno 2] No such file or directory
warning: %postun(kernel-alt-4.19.227-5.xcpng8.3.x86_64) scriptlet failed, exit status 2
-
RE: High Fan Speed Issue on Lenovo ThinkSystem Servers
@bleader do you have instructions somewhere on how to carry out this procedure?
-
RE: High Fan Speed Issue on Lenovo ThinkSystem Servers
@RIX_IT said in High Fan Speed Issue on Lenovo ThinkSystem Servers:
XClarity Controller Firmware 2.40 KAX326G
We are also running XClarity Controller Firmware 2.40 KAX326G on our system and are having similar issues.
-
RE: High Fan Speed Issue on Lenovo ThinkSystem Servers
@olivierlambert since it's a clean install, I could give you access to the server with Xclarity Controller 2, so that you have full access to a system for testing. Would that help?
-
RE: High Fan Speed Issue on Lenovo ThinkSystem Servers
We have exactly the same issue with a Lenovo ThinkSystem SR665 V3 (Model #7D9AA01SEA) running a single AMD EPYC 9354 32C. We are currently running the following latest firmware versions on our system:
We installed xcp-ng-8.3.0-beta2 (because the Mellanox ConnectX-6 Lx 10/25GbE SFP28 2-port OCP Ethernet Adapter is not supported by default in XCP 8.2.1) and see the same issues. When the server is turned on but has not yet booted into XCP, we get the following temperature readings in Xclarity Controller 2 (with 2x 64 GB RAM MTC40F2046S1RC48BR):
However, when XCP is booted, we get the following temperature readings:
As a result, the fans go to maximum speed because of the missing readings. I haven't tried downgrading the firmware yet, but it would be nice if this could be solved in XCP. We have another identical system in stock which we still need to install. I'm planning to try the installation on that server without upgrading the factory-default firmware and see if that works better.