XCP-ng
    laszlobortel

    @laszlobortel


    Best posts made by laszlobortel

    • RE: CPU pegged at 100% in several Rocky Linux 8 VMs without workload in guest

      We have checked our kernel versions, some are very old:

           10  4.18.0-553.109.1.el8_10.x86_64
            3  4.18.0-553.117.1.el8_10.x86_64
            2  4.18.0-553.120.1.el8_10.x86_64
            4  4.18.0-553.16.1.el8_10.x86_64
            2  4.18.0-553.22.1.el8_10.x86_64
           12  4.18.0-553.30.1.el8_10.x86_64
            2  4.18.0-553.34.1.el8_10.x86_64
            4  4.18.0-553.36.1.el8_10.x86_64
            1  4.18.0-553.40.1.el8_10.x86_64
            2  4.18.0-553.47.1.el8_10.x86_64
            3  4.18.0-553.51.1.el8_10.x86_64
            2  4.18.0-553.54.1.el8_10.x86_64
            3  4.18.0-553.56.1.el8_10.x86_64
            1  4.18.0-553.58.1.el8_10.x86_64
            4  4.18.0-553.62.1.el8_10.x86_64
            2  4.18.0-553.63.1.el8_10.x86_64
            3  4.18.0-553.69.1.el8_10.x86_64
            2  4.18.0-553.74.1.el8_10.x86_64
            1  4.18.0-553.81.1.el8_10.x86_64
            1  4.18.0-553.83.1.el8_10.x86_64
            2  4.18.0-553.87.1.el8_10.x86_64
           16  4.18.0-553.89.1.el8_10.x86_64
            4  4.18.0-553.94.1.el8_10.x86_64
      

      The OS team will do the kernel upgrade and I will come back with the result. Versions .109, .117 and .120 have not failed yet in our environment. We have high hopes!
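      Note that the listing above is sorted as text, so .109 appears before .16. A minimal Python sketch, assuming the listing is available as plain text in the same "count kernel" layout, that orders it numerically and counts how many VMs sit below a chosen update number. The names `parse_listing` and `split_at` are illustrative only, and the sample contains just three lines from the listing:

```python
import re

# Matches lines like "   10  4.18.0-553.109.1.el8_10.x86_64" and captures
# the VM count and the update number that follows "553.".
KERNEL_RE = re.compile(r"^\s*(\d+)\s+4\.18\.0-553\.(\d+)\.1\.el8_10\.x86_64\s*$")


def parse_listing(text):
    """Return {update_number: vm_count} for every line that matches."""
    counts = {}
    for line in text.splitlines():
        m = KERNEL_RE.match(line)
        if m:
            update = int(m.group(2))
            counts[update] = counts.get(update, 0) + int(m.group(1))
    return counts


def split_at(counts, threshold):
    """Return (vms_below, vms_at_or_above) relative to an update number."""
    below = sum(n for update, n in counts.items() if update < threshold)
    at_or_above = sum(n for update, n in counts.items() if update >= threshold)
    return below, at_or_above


# Three lines copied from the listing above, as a small sample.
SAMPLE = """\
     10  4.18.0-553.109.1.el8_10.x86_64
      4  4.18.0-553.16.1.el8_10.x86_64
     16  4.18.0-553.89.1.el8_10.x86_64
"""

counts = parse_listing(SAMPLE)
for update in sorted(counts):  # numeric order: 16, 89, 109
    print(f"553.{update}: {counts[update]} VMs")

# 109 is used as the threshold only because .109 is the oldest version
# that has not failed so far in our environment.
print(split_at(counts, 109))
```

      Run against the full listing, this would show how many VMs still need the upgrade before the result can be judged.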

      posted in Compute
      laszlobortel
    • V2V migration disk transfer speed

      Is there any specific reason to limit the nbdkit server to threads=1?
      spanwNbdKitProcess
      According to our tests, a significant improvement (up to 4 times) could be achieved by increasing the thread count.
      Is there any plan to make it a user-configurable parameter (in xo-cli, for example)?

      posted in Advanced features v2v migration disk transfer speed
      laszlobortel

    Latest posts made by laszlobortel

    • RE: CPU pegged at 100% in several Rocky Linux 8 VMs without workload in guest

      @jgrafton I am a bit confused about the role of lvmohba storage in triggering this problem, because @aflons stated above (back in 2024) that "Seems to happen far less now with shared storage."
      It is not clear to me whether shared storage helps to solve the problem or makes it worse. Or is lvmohba a special kind of "bad" shared storage in this respect?
      In any case, lvmohba is a fixed point in our architecture that we cannot replace. I am just curious whether we should experiment with another type of storage to rule out or confirm the contribution of lvmohba to this problem.

      posted in Compute
      laszlobortel
    • RE: CPU pegged at 100% in several Rocky Linux 8 VMs without workload in guest

      @DustinB I wrote "Upgrading to Rocky 9 on the short term is not an option." Please let me explain why! We are a telco with a layered operating model: our team is responsible for virtualisation (VMware/Broadcom, Hyper-V, XCP-ng), and another team is responsible for OS operation. The IaaS team is tasked with the VMware exit, which means that we must migrate hundreds of VMs from VMware to XCP-ng as quickly as possible this year, using the "lift-and-shift" method. It is a requirement that a VM which runs on VMware should run on XCP-ng, preferably unchanged. Even a simple kernel upgrade causes some delay in our migration plan. We can propose to the OS team that they migrate to Rocky9, and they might consider and schedule it, but it will not happen immediately.
      Apart from this organisational reason, my experience tells me that, while upgrading to Rocky9 would most probably solve this issue, it would raise others (probably in the docker/kubernetes layer or in the application layer).

      posted in Compute
      laszlobortel
    • RE: CPU pegged at 100% in several Rocky Linux 8 VMs without workload in guest

      @aflons @jgrafton First of all, I would like to thank both of you very much for replying so quickly to this old thread!
      Our failure rate is roughly 1 frozen VM / 90 Rocky8 VMs / day, which is not tolerable. We have hundreds more Rocky8 VMs on VMware, waiting for migration to XCP-ng.
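      To put that rate in perspective, here is a rough Python sketch of what it implies. It assumes the freezes occur at a constant rate that scales linearly with the number of VMs, which is an assumption and not something we have measured; the fleet size of 300 is a hypothetical stand-in for "hundreds", and the function names are illustrative only:

```python
def per_vm_daily_rate(freezes_per_day, fleet_size):
    """Average number of freezes per VM per day."""
    return freezes_per_day / fleet_size


def expected_freezes(rate, fleet_size, days):
    """Expected freezes, assuming the per-VM rate stays constant."""
    return rate * fleet_size * days


# Observed: roughly 1 frozen VM per day across about 90 Rocky8 VMs.
rate = per_vm_daily_rate(1, 90)
print(f"{rate:.4f} freezes per VM per day")

# Current fleet over a 30-day month.
print(f"{expected_freezes(rate, 90, 30):.1f} freezes per month with 90 VMs")

# Hypothetical fleet of 300 VMs after further migration (assumed number).
print(f"{expected_freezes(rate, 300, 30):.1f} freezes per month with 300 VMs")
```

      Under these assumptions each VM freezes on average about once every 90 days.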
      I tried to summarise our options:

      • Our kernels are pretty fresh, but we can try the very latest available for Rocky 8.
      • Upgrading to Rocky 9 on the short term is not an option. We have to migrate Rocky 8 from VMware to XCP-ng first, then we can think about switching to Rocky 9 later.
      • VMware tools are removed during migration as part of the migration procedure.
      • We are already on shared lvmohba storage, which is a production-grade Hitachi Vantara all-SSD array, the same as under VMware, so I see no room for change or improvement here.
      • As a last resort we can try disabling the load-balancing plugin and rebooting monthly during our maintenance window, but this would be an ugly workaround.

      Is there anything I forgot?

      @jgrafton Was there any useful suggestion or conclusion in your Vates support ticket #7726289? I am afraid that we are facing a tricky interworking issue between the Xen hypervisor and the 4.18.0 kernel, and both components are independent of XCP-ng and Vates.

      posted in Compute
      laszlobortel
    • RE: CPU pegged at 100% in several Rocky Linux 8 VMs without workload in guest

      I am afraid that we have the same problem: ~90 Rocky8 VMs migrated from VMware, very often pegging one CPU. We have suspended further migration to XCP-ng due to this issue.
      Has the root cause been identified since 2024? Is there a solution or workaround (apart from upgrading to Rocky9)?

      posted in Compute
      laszlobortel
    • RE: V2V migration disk transfer speed

      @florent Thanks for your reply! We have started to migrate thousands of VMs, so disk transfer speed is important for us. We will also run detailed tests soon with different threads settings and publish the results here.
      I think threads=1 is a good and logical default, but not an efficient one. Others might complain if you set it to a higher value. A configuration option would be a really good solution.

      posted in Advanced features
      laszlobortel