XCP-ng

    delaf

    @delaf

    Best posts made by delaf

    • RE: Alert: Control Domain Memory Usage

      @stormi
      It seems to be good here!

      Screenshot 2021-03-09 at 08.36.50.png

      posted in Compute
      delaf
    • RE: Alert: Control Domain Memory Usage

      PS: I'm using these two scripts to list the driver version of every network interface across our servers:

      $ cat get_network_drivers_info.sh
      #!/bin/bash

      # Table layout shared with the template script: one row per interface.
      format="| %-13.13s | %-20.20s | %-20.20s | %-10.10s | %-7.7s | %-10.10s | %-30.30s | %-s \n"
      printf "${format}" "date" "hostname" "OS" "interface" "driver" "version" "firmware" "yum"
      printf "${format}" "----------------------------" "----------------------------" "----------------------------" "----------------------------" "----------------------------" "----------------------------" "----------------------------" "----------------------------"

      # Hosts come from the command line, or from host.json minus one excluded address.
      if [ $# -gt 0 ]; then
          servers=("$@")
      else
          servers=($(jq -r '.[] | .address' host.json | grep -Ev '^192\.168\.124\.9$'))
      fi

      for host in "${servers[@]}"; do
          # Copy the template to the host, then run it there; -n keeps ssh off our stdin.
          scp get_network_drivers_info.sh.tpl "${host}:/tmp/get_network_drivers_info.sh" > /dev/null 2>&1
          if ! ssh -n "${host}" bash /tmp/get_network_drivers_info.sh 2> /dev/null; then
              echo "${host} fail" >&2
          fi
      done
      
      $ cat get_network_drivers_info.sh.tpl
      #!/bin/bash

      format="| %-13.13s | %-20.20s | %-20.20s | %-10.10s | %-7.7s | %-10.10s | %-30.30s | %-s \n"
      d=$(date '+%Y%m%d-%H%M')
      name=$(hostname)

      # Host-wide values are computed once instead of once per interface.
      if command -v yum > /dev/null 2>&1; then
          packages=$(yum list installed | awk '/ixgbe/ {print $1"@"$2}' | tr '\n' ',')
      else
          packages="NA"
      fi
      os_version=$(lsb_release -d | awk '{$1=""} 1' | sed 's/XenServer/XS/; s/ (xenenterprise)//; s/release //')

      cd /sys/class/net/ || exit 1
      # Only PCI-backed (physical) interfaces have "/pci" in their symlink target.
      for interface in $(ls -l /sys/class/net/ | awk '/\/pci/ {print $9}'); do
          version=$(ethtool -i "${interface}" | awk '/^version:/ {$1=""; print}')
          firmware=$(ethtool -i "${interface}" | awk '/^firmware-version:/ {$1=""; print}')
          driver=$(ethtool -i "${interface}" | awk '/^driver:/ {$1=""; print}')
          printf "${format}" "${d}" "${name}" "${os_version}" "${interface}" "${driver}" "${version}" "${firmware}" "${packages}"
      done
      

      PS: the host.json file is generated with: xo-cli --list-objects type=host
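
      For readers who want to see what the awk filters in the template extract, here is a minimal sketch run against canned `ethtool -i` output. The sample text and the `get_field` helper are illustrative assumptions, not part of the scripts above, and real output varies by driver.

```shell
#!/bin/sh
# Hypothetical sample of `ethtool -i <interface>` output (values are made up).
sample_output='driver: ixgbe
version: 5.5.2
firmware-version: 0x800003e7
bus-info: 0000:03:00.0'

# get_field KEY: print the value that follows "KEY:" in the sample output.
# Matching on the whole first field keeps "version:" from also matching
# "firmware-version:"; sub() drops the key and the space after it.
get_field() {
    printf '%s\n' "${sample_output}" |
        awk -v key="$1:" '$1 == key { sub(/^[^ ]+ +/, ""); print }'
}

driver=$(get_field driver)
version=$(get_field version)
firmware=$(get_field firmware-version)
printf '%s | %s | %s\n' "${driver}" "${version}" "${firmware}"
# prints: ixgbe | 5.5.2 | 0x800003e7
```

      Unlike blanking `$1` and printing the line, this variant does not leave a leading space in front of the value.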

      posted in Compute
      delaf
    • RE: Alert: Control Domain Memory Usage

      @stormi Hello, some weeks later, I can confirm that the problem is solved here by using intel-ixgbe.x86_64@5.5.2-2.1.xcpng8.1 or intel-ixgbe.x86_64@5.5.2-2.1.xcpng8.2.

      posted in Compute
      delaf
    • RE: Alert: Control Domain Memory Usage

      @stormi I have installed intel-ixgbe 5.5.2-2.1.xcpng8.2 on my server s0267. Let's wait a few days to check whether this patch fixes the memleak.

      posted in Compute
      delaf
    • RE: Alert: Control Domain Memory Usage

      @stormi
      Screenshot 2021-03-04 at 21.21.16.png
      The 2 servers have been reinstalled with an up-to-date 8.2. Each hosts 2 VMs that are doing the same thing (~100Mb/s of netdata stream).

      The right one has 5.9.4-1.xcpng8.2, the left one has 5.5.2-2.xcpng8.2.

      The patch seems to be OK for me.

      posted in Compute
      delaf
    • RE: Alert: Control Domain Memory Usage

      @stormi

      • server 266 with alt-kernel: still no problem.
        Screen Shot 2020-12-02 at 10.08.47.png

      • server 268 with 4.19.19-6.0.10.1.xcpng8.1: the problem began a few days ago, after several stable days.
        Screen Shot 2020-12-02 at 10.03.57.png

      • server 272 with 4.19.19-6.0.11.1.0.1.patch53disabled.xcpng8.1:
        Screen Shot 2020-12-02 at 10.05.47.png

      • server 273 with 4.19.19-6.0.11.1.0.1.patch62disabled.xcpng8.1:
        Screen Shot 2020-12-02 at 10.05.50.png

      It seems that 4.19.19-6.0.11.1.0.1.patch62disabled.xcpng8.1 is more stable than 4.19.19-6.0.11.1.0.1.patch53disabled.xcpng8.1, but it is a bit early to be sure.

      posted in Compute
      delaf
    • RE: Alert: Control Domain Memory Usage

      @stormi For the kernel-4.19.19-6.0.10.1.xcpng8.1 test, I'm not sure it solves the problem, because I still see a small memory increase. We have to wait a bit more 😕

      posted in Compute
      delaf
    • RE: Alert: Control Domain Memory Usage

      @stormi I have installed the two kernels:

      272 ~]# yum list installed kernel | grep kernel
      kernel.x86_64                   4.19.19-6.0.11.1.0.1.patch53disabled.xcpng8.1
      
      273 ~]# yum list installed kernel | grep kernel
      kernel.x86_64                   4.19.19-6.0.11.1.0.1.patch62disabled.xcpng8.1
      

      I have removed the modification in /etc/modprobe.d/dist.conf on server 273.

      We have to wait a little bit now 😉

      posted in Compute
      delaf
    • RE: Alert: Control Domain Memory Usage

      @stormi @r1
      Four days later, I get:

      • one server (266) with alt-kernel: still no problem
      • one server (268) with 4.19.19-6.0.10.1.xcpng8.1: no more problem!
      • one server (272) with kmemleak kernel: no memleak detected, but the problem is present
      • one server (273) with "search extra built-in weak-updates override updates": problem still present
      posted in Compute
      delaf
    • RE: Alert: Control Domain Memory Usage

      @danp Thank you. I have downgraded one server.

      @stormi So I have:

      • one server with 4.19.19-6.0.10.1.xcpng8.1
      • one server with kmemleak kernel
      • one server with "search extra built-in weak-updates override updates"
      posted in Compute
      delaf

    Latest posts made by delaf

    • RE: Alert: Control Domain Memory Usage

      @stormi Oh, I did not know that, as I have never used it; I only know that it exists 😉

      posted in Compute
      delaf
    • RE: Alert: Control Domain Memory Usage

      PS: we are not using the netdata config from "Advanced telemetry": we are installing our own netdata config.

      posted in Compute
      delaf
    • RE: Alert: Control Domain Memory Usage

      @jcastang we are using a netdata/prometheus/grafana stack.

      @olivierlambert You can change the retention method and keep much more data in netdata. There is also (since netdata 1.18, I think) a dbengine that allows you to store data on disk.
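
      As a rough sketch of what that could look like in netdata.conf: the option names below are the ones used by netdata releases from around that time (newer releases renamed them), and the values are placeholder assumptions, so check the documentation for your version.

```ini
[global]
    # In-memory modes: number of per-second points kept per chart.
    history = 86400

    # Alternatively, store metrics on disk with the dbengine.
    # Sizes are in MiB; uncomment to use.
    # memory mode = dbengine
    # page cache size = 32
    # dbengine disk space = 1024
```

      Raising `history` costs RAM for every chart, which is why the disk-backed dbengine is usually the better fit for long retention.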

      posted in Compute
      delaf
    • RE: Alert: Control Domain Memory Usage

      I have installed intel-ixgbe-alt-5.9.4-1.xcpng8.1.x86_64 on my server (268).
      I'll check in a few days whether I still have the problem.

      Thank you guys!

      posted in Compute
      delaf
    • RE: Alert: Control Domain Memory Usage

      @stormi @r1 server 273 with 4.19.19-6.0.11.1.0.1.patch62disabled.xcpng8.1 is still stable and 272 has the memory problem.

      • 272
        Screen Shot 2020-12-15 at 14.50.31.png

      • 273
        Screen Shot 2020-12-15 at 14.50.40.png

      posted in Compute
      delaf