XCP-ng

    LennertvdBerg

    @LennertvdBerg

    Reputation: 2
    Profile views: 4
    Posts: 25
    Followers: 0
    Following: 0

    Best posts made by LennertvdBerg

    • RE: High Fan Speed Issue on Lenovo ThinkSystem Servers

      @bleader I've done exactly the same on my ThinkSystem SR665 V3 (BMC Version 3.20, Build ID: KAX334O; UEFI Version 4.20, Build ID: KAE120J; LXPM Version 4.12, Build ID: GNL114G) and it worked:

      # Find which driver's init function to blacklist at kernel init
      [10:10 xcp-ng-host1 ~]# fgrep i2c /boot/System.map-4.19.0+1 | grep init
      ffffffff815172f0 T drm_i2c_encoder_init
      ffffffff815574c0 T __regmap_init_i2c
      ffffffff81557510 T __devm_regmap_init_i2c
      ffffffff815d0700 t i2c_dw_init_master
      ffffffff815d14f0 t i2c_dw_init_slave
      ffffffff81ea9b40 r __ksymtab_drm_i2c_encoder_init
      ffffffff81eb0570 r __ksymtab___devm_regmap_init_i2c
      ffffffff81eb0838 r __ksymtab___regmap_init_i2c
      ffffffff81edc1a9 r __kstrtab_drm_i2c_encoder_init
      ffffffff81edffcd r __kstrtab___devm_regmap_init_i2c
      ffffffff81edffe4 r __kstrtab___regmap_init_i2c
      ffffffff8248156d t i2c_init
      ffffffff82481b65 t dw_i2c_init_driver
      ffffffff8255fe48 t __initcall_i2c_init2
      ffffffff8255ffa8 t __initcall_dw_i2c_init_driver4
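
As a rough illustration of how the blacklist candidate can be read off that output, here is a minimal Python sketch that pulls function names out of the `__initcall_<function><level>` entries. The helper name `initcall_functions` and the regular expression are my own; the suffix pattern (a level digit 0-7, optionally followed by `s`, or `rootfs`/`early`) is an assumption about how a 4.19 kernel names these symbols and is only confirmed here for the two entries shown.

```python
import re

# A few lines copied from the System.map output above.
SYSTEM_MAP_SAMPLE = """\
ffffffff8248156d t i2c_init
ffffffff82481b65 t dw_i2c_init_driver
ffffffff8255fe48 t __initcall_i2c_init2
ffffffff8255ffa8 t __initcall_dw_i2c_init_driver4
"""

# Assumed symbol shape: __initcall_<function><level>. The lazy match on the
# function name leaves the trailing level suffix to the second group.
INITCALL_RE = re.compile(r"^__initcall_(?P<func>\w+?)(?P<level>[0-7]s?|rootfs|early)$")


def initcall_functions(system_map_text, keyword):
    """Return initcall function names that contain the given keyword."""
    found = []
    for line in system_map_text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip anything that is not "address type symbol"
        match = INITCALL_RE.match(parts[2])
        if match and keyword in match.group("func"):
            found.append(match.group("func"))
    return found


candidates = initcall_functions(SYSTEM_MAP_SAMPLE, "i2c")
print(candidates)
```

Each name returned is a possible value for `initcall_blacklist=`; which one is the right one to block still has to be decided by hand.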
      

      In /etc/grub-efi.cfg I added initcall_blacklist=dw_i2c_init_driver to the end of the dom0 kernel line (module2 /boot/vmlinuz-4.19-xen):

      terminal_input serial console
      terminal_output serial console
      set default=0
      set timeout=5
      menuentry 'XCP-ng' {
              search --label --set root root-pxdcvt
              multiboot2 /boot/xen.gz dom0_mem=7568M,max:7568M watchdog ucode=scan dom0_max_vcpus=1-16 crashkernel=256M,below=4G console=vga vga=mode-0x0311
              module2 /boot/vmlinuz-4.19-xen root=LABEL=root-pxdcvt ro nolvm hpet=disable rd.auto console=hvc0 console=tty0 quiet vga=785 splash plymouth.ignore-serial-consoles initcall_blacklist=dw_i2c_init_driver
              module2 /boot/initrd-4.19-xen.img
      }
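
For anyone who would rather script that edit than do it by hand, the following is a minimal Python sketch that appends a parameter to the `module2 /boot/vmlinuz...` line of a GRUB config held in a string. The function name `add_kernel_param` is hypothetical, the sample config is a shortened copy of the one above, and the sketch deliberately works on text only; reading and writing the real file (and backing it up first) is left out.

```python
def add_kernel_param(grub_cfg_text, param):
    """Append param to each 'module2 /boot/vmlinuz...' line that lacks it."""
    out = []
    for line in grub_cfg_text.splitlines():
        words = line.split()
        is_kernel_line = (
            len(words) >= 2
            and words[0] == "module2"
            and words[1].startswith("/boot/vmlinuz")
        )
        # Only touch the dom0 kernel line, and only if the parameter is missing.
        if is_kernel_line and param not in words:
            line = line.rstrip() + " " + param
        out.append(line)
    return "\n".join(out) + "\n"


# Shortened sample; the real entry carries more options on each line.
SAMPLE = """\
menuentry 'XCP-ng' {
        multiboot2 /boot/xen.gz dom0_mem=7568M,max:7568M watchdog
        module2 /boot/vmlinuz-4.19-xen root=LABEL=root-pxdcvt ro nolvm quiet
        module2 /boot/initrd-4.19-xen.img
}
"""

patched = add_kernel_param(SAMPLE, "initcall_blacklist=dw_i2c_init_driver")
print(patched)
```

Running the function a second time on its own output changes nothing, so it is safe to re-apply.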
      

      I then ran grub-mkconfig and rebooted:

      grub-mkconfig
      

      I first checked the current system temperatures and fan speeds (configured at a fixed RPM):

      [10:17 xcp-ng-host1 ~]# ipmitool sdr | grep -i temp
      Ambient Temp     | 30 degrees C      | ok
      Exhaust Temp     | 34 degrees C      | ok
      CPU 1 Temp       | 39 degrees C      | ok
      CPU 2 Temp       | no reading        | ns
      DIMM 1 Temp      | no reading        | ns
      DIMM 2 Temp      | no reading        | ns
      DIMM 3 Temp      | no reading        | ns
      DIMM 4 Temp      | no reading        | ns
      DIMM 5 Temp      | no reading        | ns
      DIMM 6 Temp      | 0 degrees C       | ok
      DIMM 7 Temp      | 0 degrees C       | ok
      DIMM 8 Temp      | no reading        | ns
      DIMM 9 Temp      | no reading        | ns
      DIMM 10 Temp     | no reading        | ns
      DIMM 11 Temp     | no reading        | ns
      DIMM 12 Temp     | no reading        | ns
      DIMM 13 Temp     | no reading        | ns
      DIMM 14 Temp     | no reading        | ns
      DIMM 15 Temp     | no reading        | ns
      DIMM 16 Temp     | no reading        | ns
      DIMM 17 Temp     | no reading        | ns
      DIMM 18 Temp     | no reading        | ns
      DIMM 19 Temp     | no reading        | ns
      DIMM 20 Temp     | no reading        | ns
      DIMM 21 Temp     | no reading        | ns
      DIMM 22 Temp     | no reading        | ns
      DIMM 23 Temp     | no reading        | ns
      DIMM 24 Temp     | no reading        | ns
      PCIe 1 OverTemp  | 0x00              | ok
      PCIe 2 OverTemp  | 0x00              | ok
      PCIe 3 OverTemp  | 0x00              | ok
      OCP OverTemp     | 0x00              | ok
      [10:18 xcp-ng-host1 ~]# ipmitool sdr | grep -i fan
      Fan Mismatch     | 0x00              | ok
      Fan 1 Front Tach | 6642 RPM          | ok
      Fan 2 Front Tach | 6642 RPM          | ok
      Fan 3 Front Tach | 6724 RPM          | ok
      Fan 4 Front Tach | 6560 RPM          | ok
      Fan 5 Front Tach | 6642 RPM          | ok
      Fan 6 Tach       | 0 RPM             | ok
      Fan 1 Rear Tach  | 6225 RPM          | ok
      Fan 2 Rear Tach  | 6150 RPM          | ok
      Fan 3 Rear Tach  | 6300 RPM          | ok
      Fan 4 Rear Tach  | 6150 RPM          | ok
      Fan 5 Rear Tach  | 6300 RPM          | ok
      Sys Fan Pwr      | 18 Watts          | ok
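
In the readings above, DIMM 6 and DIMM 7 report 0 degrees C with status ok, which looks implausible next to a 30 degrees C ambient reading. A minimal Python sketch for spotting such values in `ipmitool sdr` output could look like the following; the function names are hypothetical, and treating an exact 0 degrees C as a failed read is my assumption, not something the BMC states.

```python
def parse_sdr(sdr_text):
    """Parse 'name | reading | status' lines into {name: (reading, status)}."""
    sensors = {}
    for line in sdr_text.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) == 3:
            name, reading, status = fields
            sensors[name] = (reading, status)
    return sensors


def suspicious_temps(sensors):
    """Sensors that claim exactly 0 degrees C while reporting status ok."""
    return [
        name
        for name, (reading, status) in sensors.items()
        if reading == "0 degrees C" and status == "ok"
    ]


# A few lines copied from the output above.
SAMPLE = """\
Ambient Temp     | 30 degrees C      | ok
CPU 1 Temp       | 39 degrees C      | ok
CPU 2 Temp       | no reading        | ns
DIMM 6 Temp      | 0 degrees C       | ok
DIMM 7 Temp      | 0 degrees C       | ok
"""

flagged = suspicious_temps(parse_sdr(SAMPLE))
print(flagged)
```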
      

      I then rebooted the XClarity Controller (BMC), and the new readings are:

      [10:23 xcp-ng-host1 ~]# ipmitool sdr | grep -i temp
      Ambient Temp     | 30 degrees C      | ok
      Exhaust Temp     | 34 degrees C      | ok
      CPU 1 Temp       | 40 degrees C      | ok
      CPU 2 Temp       | no reading        | ns
      DIMM 1 Temp      | no reading        | ns
      DIMM 2 Temp      | no reading        | ns
      DIMM 3 Temp      | no reading        | ns
      DIMM 4 Temp      | no reading        | ns
      DIMM 5 Temp      | no reading        | ns
      DIMM 6 Temp      | 38 degrees C      | ok
      DIMM 7 Temp      | 37 degrees C      | ok
      DIMM 8 Temp      | no reading        | ns
      DIMM 9 Temp      | no reading        | ns
      DIMM 10 Temp     | no reading        | ns
      DIMM 11 Temp     | no reading        | ns
      DIMM 12 Temp     | no reading        | ns
      DIMM 13 Temp     | no reading        | ns
      DIMM 14 Temp     | no reading        | ns
      DIMM 15 Temp     | no reading        | ns
      DIMM 16 Temp     | no reading        | ns
      DIMM 17 Temp     | no reading        | ns
      DIMM 18 Temp     | no reading        | ns
      DIMM 19 Temp     | no reading        | ns
      DIMM 20 Temp     | no reading        | ns
      DIMM 21 Temp     | no reading        | ns
      DIMM 22 Temp     | no reading        | ns
      DIMM 23 Temp     | no reading        | ns
      DIMM 24 Temp     | no reading        | ns
      PCIe 1 OverTemp  | 0x00              | ok
      PCIe 2 OverTemp  | 0x00              | ok
      PCIe 3 OverTemp  | 0x00              | ok
      OCP OverTemp     | 0x00              | ok
      [10:26 xcp-ng-host1 ~]# ipmitool sdr | grep -i fan
      Fan Mismatch     | 0x00              | ok
      Fan 1 Front Tach | 8528 RPM          | ok
      Fan 2 Front Tach | 8446 RPM          | ok
      Fan 3 Front Tach | 8446 RPM          | ok
      Fan 4 Front Tach | 8610 RPM          | ok
      Fan 5 Front Tach | 8446 RPM          | ok
      Fan 6 Tach       | 0 RPM             | ok
      Fan 1 Rear Tach  | 7950 RPM          | ok
      Fan 2 Rear Tach  | 7950 RPM          | ok
      Fan 3 Rear Tach  | 8025 RPM          | ok
      Fan 4 Rear Tach  | 7950 RPM          | ok
      Fan 5 Rear Tach  | 7875 RPM          | ok
      Sys Fan Pwr      | 24 Watts          | ok
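
To put a number on the change between the two sets of fan readings, here is a minimal Python sketch that extracts the RPM values and compares them. Only the front fans and the empty Fan 6 slot are included in the sample, the helper names are hypothetical, and fans reporting 0 RPM are left out of the mean on the assumption that they are not populated.

```python
def fan_rpms(sdr_text):
    """Return {sensor name: RPM} for every line whose reading is in RPM."""
    rpms = {}
    for line in sdr_text.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) == 3 and fields[1].endswith(" RPM"):
            rpms[fields[0]] = int(fields[1].split()[0])
    return rpms


def mean_rpm(rpms):
    """Mean speed over fans that are actually spinning."""
    spinning = [value for value in rpms.values() if value > 0]
    return sum(spinning) / len(spinning)


# Front fan readings copied from the two outputs above.
BEFORE = """\
Fan 1 Front Tach | 6642 RPM          | ok
Fan 2 Front Tach | 6642 RPM          | ok
Fan 3 Front Tach | 6724 RPM          | ok
Fan 4 Front Tach | 6560 RPM          | ok
Fan 5 Front Tach | 6642 RPM          | ok
Fan 6 Tach       | 0 RPM             | ok
"""
AFTER = """\
Fan 1 Front Tach | 8528 RPM          | ok
Fan 2 Front Tach | 8446 RPM          | ok
Fan 3 Front Tach | 8446 RPM          | ok
Fan 4 Front Tach | 8610 RPM          | ok
Fan 5 Front Tach | 8446 RPM          | ok
Fan 6 Tach       | 0 RPM             | ok
"""

before, after = fan_rpms(BEFORE), fan_rpms(AFTER)
delta = {name: after[name] - before[name] for name in before}
print(mean_rpm(before), mean_rpm(after))
print(delta)
```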
      
      posted in Hardware
      LennertvdBerg
    • RE: High Fan Speed Issue on Lenovo ThinkSystem Servers

      @RIX_IT I opened a ticket today as well, hoping they will realise it would benefit all parties if they helped solve this.

      posted in Hardware
      LennertvdBerg

    Latest posts made by LennertvdBerg

    • RE: Epyc VM to VM networking slow

      @olivierlambert Is it already known which update/release will fix this problem?

      posted in Compute
      LennertvdBerg
    • RE: High Fan Speed Issue on Lenovo ThinkSystem Servers

      @LennertvdBerg Lenovo didn't want to provide any support. However, they just published a new UEFI/BIOS. I'm not sure whether it is going to fix things:
      Screenshot 2024-07-07 at 23.57.02.png

      posted in Hardware
      LennertvdBerg
    • RE: High Fan Speed Issue on Lenovo ThinkSystem Servers

      @ThierryEscande Has anyone made any progress on this? @Riven do you have contact details for someone at Lenovo we could reach out to about this?

      posted in Hardware
      LennertvdBerg
    • RE: ISO modification with additional RPM for NIC

      @stormi I’ll be back on Wednesday (on a short holiday right now). I’ll try your advice then and see how it works.

      posted in Hardware
      LennertvdBerg
    • RE: ISO modification with additional RPM for NIC

      @stormi I thought it would be convenient to have everything in one image, as that makes installation easy, but I can check this option as well. So you recommend extracting the ISO to a separate USB drive and loading the drivers from there?

      posted in Hardware
      LennertvdBerg
    • RE: High Fan Speed Issue on Lenovo ThinkSystem Servers

      @olivierlambert Is there a way to step through the boot process after GRUB and see where it goes wrong?

      posted in Hardware
      LennertvdBerg
    • RE: ISO modification with additional RPM for NIC

      @stormi Hi, some help would be welcome 🙂 I still haven’t found a solution.

      posted in Hardware
      LennertvdBerg
    • RE: High Fan Speed Issue on Lenovo ThinkSystem Servers

      @ThierryEscande I've updated to Xen 4.17 and it seems the upgrade went fine:

      host                   : xcp-ng-test1
      release                : 4.19.0+1
      version                : #1 SMP Wed Jan 24 17:19:11 CET 2024
      machine                : x86_64
      nr_cpus                : 64
      max_cpu_id             : 63
      nr_nodes               : 1
      cores_per_socket       : 32
      threads_per_core       : 2
      cpu_mhz                : 3245.126
      hw_caps                : 178bf3ff:7efa320b:2e500800:244037ff:0000000f:f1bf97a9:00405fce:00000780
      virt_caps              : pv hvm hvm_directio pv_directio hap gnttab-v1 gnttab-v2
      total_memory           : 130850
      free_memory            : 121721
      sharing_freed_memory   : 0
      sharing_used_memory    : 0
      outstanding_claims     : 0
      free_cpus              : 0
      xen_major              : 4
      xen_minor              : 17
      xen_extra              : .3-3
      xen_version            : 4.17.3-3
      xen_caps               : xen-3.0-x86_64 hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64 
      xen_scheduler          : credit
      xen_pagesize           : 4096
      platform_params        : virt_start=0xffff800000000000
      xen_changeset          : $Format:%H$, pq ???
      xen_commandline        : dom0_mem=7568M,max:7568M watchdog ucode=scan dom0_max_vcpus=1-16 crashkernel=256M,below=4G console=vga vga=mode-0x0311
      cc_compiler            : gcc (GCC) 11.2.1 20210728 (Red Hat 11.2.1-1)
      cc_compile_by          : mockbuild
      cc_compile_domain      : [unknown]
      cc_compile_date        : Wed Feb 28 10:12:19 CET 2024
      build_id               : 9a011a28e29a21a7643376b36aec959253587d42
      xend_config_format     : 4
      

      However, the issues with the fan speeds and missing memory temperature readings still persist. 😕
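
The version fields in the output above can also be checked programmatically. Below is a minimal Python sketch that parses `key : value` lines into a dictionary and confirms the hypervisor is at least 4.17. The output has the shape of `xl info`, but the post does not name the command, so that is an assumption; the function name `parse_info` is hypothetical and the sample is a shortened copy of the listing.

```python
def parse_info(text):
    """Parse 'key : value' lines, splitting on the first colon only."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


# A few lines copied from the output above (command line shortened).
SAMPLE = """\
release                : 4.19.0+1
xen_major              : 4
xen_minor              : 17
xen_extra              : .3-3
xen_version            : 4.17.3-3
xen_commandline        : dom0_mem=7568M,max:7568M watchdog ucode=scan
"""

info = parse_info(SAMPLE)
xen = (int(info["xen_major"]), int(info["xen_minor"]))
print(xen >= (4, 17))
```

Splitting on the first colon only matters here because values such as `dom0_mem=7568M,max:7568M` contain colons themselves.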

      posted in Hardware
      LennertvdBerg