XCP-ng
    Posts

    • RE: Epyc VM to VM networking slow

      @olivierlambert Is it already known in which update or release this problem will be fixed?

      posted in Compute
      LennertvdBerg
    • RE: High Fan Speed Issue on Lenovo ThinkSystem Servers

      @bleader I've done exactly the same on my ThinkSystem SR665 V3, BMC Version 3.20 (Build ID: KAX334O), UEFI Version 4.20 (Build ID: KAE120J), LXPM Version 4.12 (Build ID: GNL114G), and it worked:

      # Find which i2c driver init function to blacklist at kernel init
      [10:10 xcp-ng-host1 ~]# fgrep i2c /boot/System.map-4.19.0+1 | grep init
      ffffffff815172f0 T drm_i2c_encoder_init
      ffffffff815574c0 T __regmap_init_i2c
      ffffffff81557510 T __devm_regmap_init_i2c
      ffffffff815d0700 t i2c_dw_init_master
      ffffffff815d14f0 t i2c_dw_init_slave
      ffffffff81ea9b40 r __ksymtab_drm_i2c_encoder_init
      ffffffff81eb0570 r __ksymtab___devm_regmap_init_i2c
      ffffffff81eb0838 r __ksymtab___regmap_init_i2c
      ffffffff81edc1a9 r __kstrtab_drm_i2c_encoder_init
      ffffffff81edffcd r __kstrtab___devm_regmap_init_i2c
      ffffffff81edffe4 r __kstrtab___regmap_init_i2c
      ffffffff8248156d t i2c_init
      ffffffff82481b65 t dw_i2c_init_driver
      ffffffff8255fe48 t __initcall_i2c_init2
      ffffffff8255ffa8 t __initcall_dw_i2c_init_driver4
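
The two __initcall_ entries at the end of that listing are the relevant ones: initcall_blacklist= expects the name of the init function, which is the symbol without the __initcall_ prefix and the trailing level digit. A minimal sketch for pulling those names out of a System.map follows. The helper name list_initcalls is hypothetical, and it assumes the __initcall_<function><level> naming seen in this 4.19 map; entries with an "early" or "rootfs" suffix are not handled.

```shell
# list_initcalls PATTERN [SYSTEM_MAP]   (hypothetical helper, not part of XCP-ng)
# Print the init function names whose __initcall_ symbol matches PATTERN.
list_initcalls() {
    pattern=$1
    map=${2:-/boot/System.map-$(uname -r)}
    # Column 3 of System.map is the symbol name; keep only __initcall_ entries,
    # then strip the prefix and the trailing initcall level (0-7, optional "s").
    awk '{ print $3 }' "$map" \
        | grep '^__initcall_' \
        | grep -- "$pattern" \
        | sed -E 's/^__initcall_//; s/[0-7]s?$//'
}
```

Run as list_initcalls i2c against the map above, this prints i2c_init and dw_i2c_init_driver; the second is the name used with initcall_blacklist=.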
      

      In /etc/grub-efi.cfg I added initcall_blacklist=dw_i2c_init_driver to the module2 /boot/vmlinuz line:

      terminal_input serial console
      terminal_output serial console
      set default=0
      set timeout=5
      menuentry 'XCP-ng' {
              search --label --set root root-pxdcvt
              multiboot2 /boot/xen.gz dom0_mem=7568M,max:7568M watchdog ucode=scan dom0_max_vcpus=1-16 crashkernel=256M,below=4G console=vga vga=mode-0x0311
              module2 /boot/vmlinuz-4.19-xen root=LABEL=root-pxdcvt ro nolvm hpet=disable rd.auto console=hvc0 console=tty0 quiet vga=785 splash plymouth.ignore-serial-consoles initcall_blacklist=dw_i2c_init_driver
              module2 /boot/initrd-4.19-xen.img
      }
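
For anyone who prefers to script this edit instead of doing it by hand, here is a minimal sketch. The helper name add_kernel_param is hypothetical; it assumes the kernel line starts with module2 and contains vmlinuz, as in the config above, and it uses a simple substring check, so it will not add the parameter twice. It writes through the existing file rather than replacing it, which matters if the config path is a symlink.

```shell
# add_kernel_param PARAM CFG   (hypothetical helper)
# Append PARAM to every "module2 ...vmlinuz..." line in CFG unless already there.
add_kernel_param() {
    param=$1
    cfg=$2
    # Backup of the file as it was before this run (overwritten on each run).
    cp -p "$cfg" "$cfg.bak" || return 1
    awk -v p="$param" '
        # Only touch the dom0 kernel line, not the xen.gz or initrd lines.
        $1 == "module2" && $2 ~ /vmlinuz/ && index($0, " " p) == 0 { $0 = $0 " " p }
        { print }
    ' "$cfg.bak" > "$cfg"
}
```

Usage would be add_kernel_param initcall_blacklist=dw_i2c_init_driver /etc/grub-efi.cfg, run as root; the previous version stays in /etc/grub-efi.cfg.bak.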
      

      I then ran grub-mkconfig and rebooted:

      grub-mkconfig
      

      I first checked the current system temperatures and fan speeds (fans configured at a fixed RPM):

      [10:17 xcp-ng-host1 ~]# ipmitool sdr | grep -i temp
      Ambient Temp     | 30 degrees C      | ok
      Exhaust Temp     | 34 degrees C      | ok
      CPU 1 Temp       | 39 degrees C      | ok
      CPU 2 Temp       | no reading        | ns
      DIMM 1 Temp      | no reading        | ns
      DIMM 2 Temp      | no reading        | ns
      DIMM 3 Temp      | no reading        | ns
      DIMM 4 Temp      | no reading        | ns
      DIMM 5 Temp      | no reading        | ns
      DIMM 6 Temp      | 0 degrees C       | ok
      DIMM 7 Temp      | 0 degrees C       | ok
      DIMM 8 Temp      | no reading        | ns
      DIMM 9 Temp      | no reading        | ns
      DIMM 10 Temp     | no reading        | ns
      DIMM 11 Temp     | no reading        | ns
      DIMM 12 Temp     | no reading        | ns
      DIMM 13 Temp     | no reading        | ns
      DIMM 14 Temp     | no reading        | ns
      DIMM 15 Temp     | no reading        | ns
      DIMM 16 Temp     | no reading        | ns
      DIMM 17 Temp     | no reading        | ns
      DIMM 18 Temp     | no reading        | ns
      DIMM 19 Temp     | no reading        | ns
      DIMM 20 Temp     | no reading        | ns
      DIMM 21 Temp     | no reading        | ns
      DIMM 22 Temp     | no reading        | ns
      DIMM 23 Temp     | no reading        | ns
      DIMM 24 Temp     | no reading        | ns
      PCIe 1 OverTemp  | 0x00              | ok
      PCIe 2 OverTemp  | 0x00              | ok
      PCIe 3 OverTemp  | 0x00              | ok
      OCP OverTemp     | 0x00              | ok
      [10:18 xcp-ng-host1 ~]# ipmitool sdr | grep -i fan
      Fan Mismatch     | 0x00              | ok
      Fan 1 Front Tach | 6642 RPM          | ok
      Fan 2 Front Tach | 6642 RPM          | ok
      Fan 3 Front Tach | 6724 RPM          | ok
      Fan 4 Front Tach | 6560 RPM          | ok
      Fan 5 Front Tach | 6642 RPM          | ok
      Fan 6 Tach       | 0 RPM             | ok
      Fan 1 Rear Tach  | 6225 RPM          | ok
      Fan 2 Rear Tach  | 6150 RPM          | ok
      Fan 3 Rear Tach  | 6300 RPM          | ok
      Fan 4 Rear Tach  | 6150 RPM          | ok
      Fan 5 Rear Tach  | 6300 RPM          | ok
      Sys Fan Pwr      | 18 Watts          | ok
      

      I then rebooted the XClarity BMC controller, and the new readings are:

      [10:23 xcp-ng-host1 ~]# ipmitool sdr | grep -i temp
      Ambient Temp     | 30 degrees C      | ok
      Exhaust Temp     | 34 degrees C      | ok
      CPU 1 Temp       | 40 degrees C      | ok
      CPU 2 Temp       | no reading        | ns
      DIMM 1 Temp      | no reading        | ns
      DIMM 2 Temp      | no reading        | ns
      DIMM 3 Temp      | no reading        | ns
      DIMM 4 Temp      | no reading        | ns
      DIMM 5 Temp      | no reading        | ns
      DIMM 6 Temp      | 38 degrees C      | ok
      DIMM 7 Temp      | 37 degrees C      | ok
      DIMM 8 Temp      | no reading        | ns
      DIMM 9 Temp      | no reading        | ns
      DIMM 10 Temp     | no reading        | ns
      DIMM 11 Temp     | no reading        | ns
      DIMM 12 Temp     | no reading        | ns
      DIMM 13 Temp     | no reading        | ns
      DIMM 14 Temp     | no reading        | ns
      DIMM 15 Temp     | no reading        | ns
      DIMM 16 Temp     | no reading        | ns
      DIMM 17 Temp     | no reading        | ns
      DIMM 18 Temp     | no reading        | ns
      DIMM 19 Temp     | no reading        | ns
      DIMM 20 Temp     | no reading        | ns
      DIMM 21 Temp     | no reading        | ns
      DIMM 22 Temp     | no reading        | ns
      DIMM 23 Temp     | no reading        | ns
      DIMM 24 Temp     | no reading        | ns
      PCIe 1 OverTemp  | 0x00              | ok
      PCIe 2 OverTemp  | 0x00              | ok
      PCIe 3 OverTemp  | 0x00              | ok
      OCP OverTemp     | 0x00              | ok
      [10:26 xcp-ng-host1 ~]# ipmitool sdr | grep -i fan
      Fan Mismatch     | 0x00              | ok
      Fan 1 Front Tach | 8528 RPM          | ok
      Fan 2 Front Tach | 8446 RPM          | ok
      Fan 3 Front Tach | 8446 RPM          | ok
      Fan 4 Front Tach | 8610 RPM          | ok
      Fan 5 Front Tach | 8446 RPM          | ok
      Fan 6 Tach       | 0 RPM             | ok
      Fan 1 Rear Tach  | 7950 RPM          | ok
      Fan 2 Rear Tach  | 7950 RPM          | ok
      Fan 3 Rear Tach  | 8025 RPM          | ok
      Fan 4 Rear Tach  | 7950 RPM          | ok
      Fan 5 Rear Tach  | 7875 RPM          | ok
      Sys Fan Pwr      | 24 Watts          | ok
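
To compare the two sets of fan readings without eyeballing every line, a small awk filter can summarise them. This is only a sketch: the function name fan_summary is hypothetical, and it assumes the "name | value | status" layout that ipmitool sdr prints above, counting only Tach sensors that report a non-zero RPM.

```shell
# fan_summary   (hypothetical helper)
# Read `ipmitool sdr` output on stdin; print how many fans spin and their mean speed.
fan_summary() {
    awk -F'|' '
        $1 ~ /Tach/ && $2 ~ /RPM/ {
            rpm = $2 + 0              # numeric prefix of "  6642 RPM  " is 6642
            if (rpm > 0) { sum += rpm; n++ }
        }
        END { if (n) printf "%d fans, mean %d RPM\n", n, sum / n }
    '
}
```

Usage: ipmitool sdr | fan_summary. For the listings above, the ten spinning fans average about 6433 RPM before the BMC reboot and about 8222 RPM after it.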
      
      posted in Hardware
      LennertvdBerg
    • RE: High Fan Speed Issue on Lenovo ThinkSystem Servers

      @LennertvdBerg Lenovo didn't want to provide any support. However, they just published a new UEFI/BIOS. I'm not sure whether it will fix things:
      Screenshot 2024-07-07 at 23.57.02.png

      posted in Hardware
      LennertvdBerg
    • RE: High Fan Speed Issue on Lenovo ThinkSystem Servers

      @RIX_IT I opened a ticket today as well, hoping they realise it would benefit all parties if they helped solve this.

      posted in Hardware
      LennertvdBerg
    • RE: High Fan Speed Issue on Lenovo ThinkSystem Servers

      @ThierryEscande Has anyone made any progress on this? @Riven do you have contact details for someone at Lenovo we could approach about this?

      posted in Hardware
      LennertvdBerg
    • RE: ISO modification with additional RPM for NIC

      @stormi I’ll be back on Wednesday (I'm on a short holiday now). I’ll try your advice then and see how it works.

      posted in Hardware
      LennertvdBerg
    • RE: ISO modification with additional RPM for NIC

      @stormi I thought it would be convenient to have everything in one ISO, as that makes installation easy. But I can look into this option as well. So you recommend extracting the ISO to a separate USB drive and loading the drivers from there?

      posted in Hardware
      LennertvdBerg
    • RE: High Fan Speed Issue on Lenovo ThinkSystem Servers

      @olivierlambert would there be a way after GRUB to walk step by step through the boot and see where it goes wrong?

      posted in Hardware
      LennertvdBerg
    • RE: ISO modification with additional RPM for NIC

      @stormi Hi, some help would be welcome 🙂 I still haven’t found a solution.

      posted in Hardware
      LennertvdBerg
    • RE: High Fan Speed Issue on Lenovo ThinkSystem Servers

      @ThierryEscande I've updated to Xen 4.17 and it seems the upgrade went fine:

      host                   : xcp-ng-test1
      release                : 4.19.0+1
      version                : #1 SMP Wed Jan 24 17:19:11 CET 2024
      machine                : x86_64
      nr_cpus                : 64
      max_cpu_id             : 63
      nr_nodes               : 1
      cores_per_socket       : 32
      threads_per_core       : 2
      cpu_mhz                : 3245.126
      hw_caps                : 178bf3ff:7efa320b:2e500800:244037ff:0000000f:f1bf97a9:00405fce:00000780
      virt_caps              : pv hvm hvm_directio pv_directio hap gnttab-v1 gnttab-v2
      total_memory           : 130850
      free_memory            : 121721
      sharing_freed_memory   : 0
      sharing_used_memory    : 0
      outstanding_claims     : 0
      free_cpus              : 0
      xen_major              : 4
      xen_minor              : 17
      xen_extra              : .3-3
      xen_version            : 4.17.3-3
      xen_caps               : xen-3.0-x86_64 hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64 
      xen_scheduler          : credit
      xen_pagesize           : 4096
      platform_params        : virt_start=0xffff800000000000
      xen_changeset          : $Format:%H$, pq ???
      xen_commandline        : dom0_mem=7568M,max:7568M watchdog ucode=scan dom0_max_vcpus=1-16 crashkernel=256M,below=4G console=vga vga=mode-0x0311
      cc_compiler            : gcc (GCC) 11.2.1 20210728 (Red Hat 11.2.1-1)
      cc_compile_by          : mockbuild
      cc_compile_domain      : [unknown]
      cc_compile_date        : Wed Feb 28 10:12:19 CET 2024
      build_id               : 9a011a28e29a21a7643376b36aec959253587d42
      xend_config_format     : 4
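
The listing above is xl info output. When only one value is needed, for example to confirm the Xen version after an upgrade, a one-line filter is enough. The function name xl_field is hypothetical, and the sketch assumes the "key : value" layout shown above; a value that itself contains ": " would be cut at that point.

```shell
# xl_field NAME   (hypothetical helper)
# Read `xl info` output on stdin and print the value of the field NAME.
xl_field() {
    awk -F' *: ' -v key="$1" '$1 == key { print $2; exit }'
}
```

Usage: xl info | xl_field xen_version, which on this host would print 4.17.3-3.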
      

      However, the issues with the fan speeds and missing memory temperature readings still persist. 😕

      posted in Hardware
      LennertvdBerg
    • RE: High Fan Speed Issue on Lenovo ThinkSystem Servers

      @ThierryEscande I'm experiencing difficulties with installing the kernel-alt package on my system. Currently, I am using XCP-ng version 8.3.0-beta2. It appears that there might be a problem with updategrub.py. Any guidance on how to resolve this would be greatly appreciated.

      I've updated xcp-ng.repo, and this is the output of yum --enablerepo=xcp-ng-tescande install kernel-alt:

      [20:12 xcp-ng-test1 ~]# yum --enablerepo=xcp-ng-tescande install kernel-alt
      Loaded plugins: fastestmirror
      Loading mirror speeds from cached hostfile
      Excluding mirror: updates.xcp-ng.org
       * xcp-ng-base: mirrors.xcp-ng.org
      Excluding mirror: updates.xcp-ng.org
       * xcp-ng-updates: mirrors.xcp-ng.org
      Resolving Dependencies
      --> Running transaction check
      ---> Package kernel-alt.x86_64 0:4.19.309-1.0.lenovotest.2.xcpng8.3 will be installed
      --> Finished Dependency Resolution
      
      Dependencies Resolved
      
      ===========================================================================================================================================================================================================================================================================================
       Package                                                       Arch                                                      Version                                                                                  Repository                                                          Size
      ===========================================================================================================================================================================================================================================================================================
      Installing:
       kernel-alt                                                    x86_64                                                    4.19.309-1.0.lenovotest.2.xcpng8.3                                                       xcp-ng-tescande                                                     30 M
      
      Transaction Summary
      ===========================================================================================================================================================================================================================================================================================
      Install  1 Package
      
      Total download size: 30 M
      Installed size: 154 M
      Is this ok [y/d/N]: y
      Downloading packages:
      kernel-alt-4.19.309-1.0.lenovotest.2.xcpng8.3.x86_64.rpm                                                                                                                                                                                                            |  30 MB  00:00:01     
      Running transaction check
      Running transaction test
      Transaction test succeeded
      Running transaction
        Installing : kernel-alt-4.19.309-1.0.lenovotest.2.xcpng8.3.x86_64                                                                                                                                                                                                                    1/1 
      /var/tmp/rpm-tmp.l9rbzO: line 9: /opt/xensource/bin/updategrub.py: No such file or directory
      warning: %post(kernel-alt-4.19.309-1.0.lenovotest.2.xcpng8.3.x86_64) scriptlet failed, exit status 127
      Non-fatal POSTIN scriptlet failure in rpm package kernel-alt-4.19.309-1.0.lenovotest.2.xcpng8.3.x86_64
        Verifying  : kernel-alt-4.19.309-1.0.lenovotest.2.xcpng8.3.x86_64                                                                                                                                                                                                                    1/1 
      
      Installed:
        kernel-alt.x86_64 0:4.19.309-1.0.lenovotest.2.xcpng8.3                    
      
      posted in Hardware
      LennertvdBerg
    • RE: ISO modification with additional RPM for NIC

      @olivierlambert Thanks.

      posted in Hardware
      LennertvdBerg
    • RE: High Fan Speed Issue on Lenovo ThinkSystem Servers

      @ThierryEscande When I run lsmod | grep ipmi I get the following results:

      ipmi_si                65536  0 
      ipmi_devintf           20480  0 
      ipmi_msghandler        61440  2 ipmi_devintf,ipmi_si
      

      So, I created the file with vi /etc/modprobe.d/blacklist-ipmi.conf and added the following:

      blacklist ipmi_si
      blacklist ipmi_devintf
      blacklist ipmi_msghandler
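
One thing worth checking after the reboot: a blacklist line in modprobe.d only stops modprobe from loading a module automatically through its aliases. The module can still be loaded explicitly or pulled in as a dependency of another module, and a built-in driver is not affected at all. A minimal sketch for listing blacklisted modules that are loaded anyway follows; the helper name still_loaded is hypothetical, and the second argument exists only so the check can be pointed at a saved copy of /proc/modules.

```shell
# still_loaded CONF [MODULES_FILE]   (hypothetical helper)
# Print every module blacklisted in CONF that is nevertheless listed as loaded.
still_loaded() {
    conf=$1
    modules=${2:-/proc/modules}
    awk '$1 == "blacklist" { print $2 }' "$conf" | while read -r mod; do
        # Each /proc/modules line starts with the module name followed by a space.
        if grep -q "^$mod " "$modules"; then
            echo "$mod"
        fi
    done
}
```

Usage: still_loaded /etc/modprobe.d/blacklist-ipmi.conf. Empty output means none of the three ipmi modules is loaded.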
      

      I saved the file and rebooted the system using shutdown -r now. However, I still don't see the memory temperatures in XClarity, and the server's fans are still running at over 13,000 RPM. The system is running XCP-ng 8.3 beta 2 with kernel 4.19.0+1.

      posted in Hardware
      LennertvdBerg
    • RE: ISO modification with additional RPM for NIC

      @stormi could you maybe advise what I'm doing wrong?

      posted in Hardware
      LennertvdBerg
    • RE: ISO modification with additional RPM for NIC

      @Danp, UPDATED: I tried booting with an alternate kernel in XCP-NG 8.2.1 and XCP-NG 8.3 beta 2, but it didn't load the Mellanox ConnectX-6 Lx 10/25GbE drivers.

      Yes, I've read the documentation about creating a custom ISO and have detailed my procedure above. The only part I'm unsure about is this:
      "you need to add new RPMs not just replace existing ones, they need to be pulled by another existing RPM as dependencies. If there's none suitable, you can add the dependency to the xcp-ng-deps RPM."
      I couldn’t work out how to carry out this step.

      posted in Hardware
      LennertvdBerg
    • ISO modification with additional RPM for NIC

      I'm fairly new to XCP-ng and would like to build a custom XCP-ng ISO that includes an additional RPM for a Mellanox ConnectX-6 Lx 10/25GbE SFP28 NIC. I have no other NICs installed, and I can't install XCP-ng 8.2.1 because the installer detects no NIC in the system. I can install XCP-ng 8.3 beta 2, as the drivers are included there. So I would like to include the Mellanox drivers in the 8.2.1 ISO so that the installer detects the NIC and the installation can proceed.

      In xcp-ng-8.3.0-beta2, there's an additional mellanox-mlnxen-5.4_1.0.3.0-4.xcpng8.3.x86_64.rpm in the Packages/ directory. In xcp-ng-8.2.1-20231130, there is no mellanox-mlnxen*.rpm at all. I found two Mellanox RPMs on Koji:

      • mellanox-mlnxen-alt-5.4_1.0.3.0-1.xcpng8.2.x86_64.rpm (https://koji.xcp-ng.org/buildinfo?buildID=2620)
      • mellanox-mlnxen-alt-5.9_0.5.5.0-1.1.xcpng8.2.x86_64.rpm (https://koji.xcp-ng.org/buildinfo?buildID=2868)

      I tried following the instructions in the XCP-ng ISO modification documentation.

      First, I extracted the ISO using the following commands:

      mkdir tmpmountdir/
      mount -o loop filename.iso tmpmountdir/ # as root
      cp -a tmpmountdir/. iso
      umount tmpmountdir/ # as root
      chmod a+w iso/ -R
      

      Then I used wget to download the RPMs into the Packages/ directory. After this, I updated repodata/ using the following commands (remember to install createrepo-c first):

      sudo apt install createrepo-c
      rm repodata/ -rf
      createrepo_c . -o .
      

      Finally, I built the ISO using the instructions given in the XCP-NG documentation:

      #OUTPUT=/path/to/destination/iso/file # change me
      OUTPUT=/home/xcp-ng/new_iso/xcp-ng-8.2.1-20231130-mod.iso
      VERSION=8.2 # change me
      genisoimage -o $OUTPUT -v -r -J --joliet-long -V "XCP-ng $VERSION" -c boot/isolinux/boot.cat -b boot/isolinux/isolinux.bin \
                  -no-emul-boot -boot-load-size 4 -boot-info-table -eltorito-alt-boot -e boot/efiboot.img -no-emul-boot .
      isohybrid --uefi $OUTPUT
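
Before building, it is possible to check that the regenerated metadata actually lists the added RPM. The sketch below is an assumption-laden helper, not something from the XCP-ng documentation: the name repo_has_package is hypothetical, and it assumes createrepo_c wrote a gzip-compressed repodata/*primary.xml.gz (some versions use a different compression). It only confirms that the package is indexed; it does not show that the installer will install it or load the driver.

```shell
# repo_has_package REPO_DIR NAME   (hypothetical helper)
# Succeed if REPO_DIR/repodata/*primary.xml.gz contains a <name>NAME</name> entry.
repo_has_package() {
    repo=$1
    name=$2
    gzip -dc "$repo"/repodata/*primary.xml.gz 2>/dev/null \
        | grep -q "<name>$name</name>"
}
```

Usage from the extracted ISO directory: repo_has_package . mellanox-mlnxen-alt && echo indexed.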
      

      However, when I use this ISO, the Mellanox ConnectX-6 Lx drivers do not load during installation.

      Also, I have seen on the Nvidia website that new drivers for the ConnectX-6 Lx are available for Citrix XenServer Host 8.2 in version mlnx-en-23.10-2.1.3.1-xenserver8.2-x86_64.

      So my questions are:

      • What am I doing wrong with building the ISO and including the RPMs?
      • Is it possible to include the mlnx-en-23.10-2.1.3.1-xenserver8.2-x86_64 for XCP-NG 8.2?
      • What steps do I need to take, and how?
      posted in Hardware
      LennertvdBerg
    • RE: High Fan Speed Issue on Lenovo ThinkSystem Servers

      @rmaclachlan Thanks. I'm also unsure how we can determine what in the OS is causing this issue. Are there other installations or modifications we could try to help isolate the problem, such as another Linux distribution with the same kernel, to see if it's a kernel-related issue? @gduperrey or @olivierlambert any suggestions how we can help the team with identifying this?

      posted in Hardware
      LennertvdBerg
    • RE: High Fan Speed Issue on Lenovo ThinkSystem Servers

      @olivierlambert can you help us with providing the module for the xen kernel, which @Gheppy is talking about?

      posted in Hardware
      LennertvdBerg
    • RE: High Fan Speed Issue on Lenovo ThinkSystem Servers

      @Gheppy I've just reinstalled xcp-ng-8.3.0-beta2 after my Ubuntu experiment and installed lm_sensors. The output is indeed:

      Driver `to-be-written':
        * ISA bus, address 0xcc0
          Chip `IPMI BMC KCS' (confidence: 8)
      
      Note: there is no driver for IPMI BMC KCS yet.
      Check http://www.lm-sensors.org/wiki/Devices for updates.
      
      No modules to load, skipping modules configuration.
      
      Unloading i2c-dev... OK
      Unloading cpuid... OK
      

      The complete output is:
      Screenshot 2024-04-09 at 11.57.06.png
      Screenshot 2024-04-09 at 11.57.21.png

      What would be the solution for this?

      posted in Hardware
      LennertvdBerg
    • RE: High Fan Speed Issue on Lenovo ThinkSystem Servers

      @Gheppy I just installed Ubuntu 22.04.4 LTS with kernel 5.15.0-102-generic to test whether there could be anything like a 'vendor lock'. On Ubuntu I can see my memory temperatures, and all my fan speeds are around 6000 RPM. So it really seems to be something between XCP-ng and Lenovo.

      posted in Hardware
      LennertvdBerg