XCP-ng

    Short VM freeze when migrating to another host

    • nikade Top contributor

      I can't even notice it. I tried moving the mouse around, with Task Manager open in a Windows VM and top running in a Linux VM, and I can't really notice a freeze when I migrate my VMs around.

      Maybe it depends on how much RAM the VM has? My test VMs only have 2-8 GB of RAM.

      • zmk @nikade

        It depends on how much RAM has not yet been copied to the new VM-server at the time of the freeze.

        If a test virtual machine does virtually nothing, then there are not many changes in its memory.

        • arc1

          @nikade @planedrop @zmk Thank you all for answering.
          We did the test with Rocky Linux, CentOS 7, Ubuntu 22.04 and Windows Server 2022.
          On Windows Server we only lose a few pings (10 pings in the testing environment); on Linux we see logs about a VM freeze too.
          The Windows VM isn't busy at all, it is only a test VM, but we still lose about 10 pings.

          Vates support said that "depending on the load and the Ram size you can have some freeze of the VM during migration, unfortunately at the moment there is not a lot that can be done about that".

          I'm just curious why @nikade and @planedrop don't get any freeze.

          • rfx77 @arc1

            @arc1 Same situation here. We also had dmesg entries when doing live migration, but the VM did not have any issues besides that.

            • zmk

              What could be the algorithm for copying the RAM of a running virtual machine to another host?

              1. Copy the RAM of the running VM to another host.
              2. While the copying was in progress, the RAM of the running VM has already changed.
              3. Copy the changes.
              4. While the copying was in progress, the RAM of the running VM has already changed.
              5. Copy the changes.

              Eventually we see that this would be an infinite loop, so:

              6. Freeze the running virtual machine. Its RAM no longer changes.
              7. Copy the remaining changed RAM of the frozen virtual machine.
              8. Once that copy is done, the RAM of the frozen VM on the old host matches the RAM of the VM on the new host.
              9. Unfreeze the VM on the new host.

              The more uncopied changes there are at the moment of freezing, the longer the freeze lasts.

              Copying the remaining changes after the freeze cannot happen instantly.
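
The loop described in this post can be sketched as a small simulation. This is only a toy model of iterative pre-copy, not Xen's actual implementation: the function name, the stop threshold, the round limit, and the idea of a constant dirty rate are all assumptions made for illustration.

```python
def simulate_precopy(ram_mib, dirty_rate_mib_s, link_mib_s,
                     stop_threshold_mib=64, max_rounds=30):
    """Toy model of pre-copy migration; returns (rounds, freeze_seconds).

    stop_threshold_mib and max_rounds are hypothetical tuning values,
    not numbers taken from Xen or XCP-ng.
    """
    to_send = ram_mib  # the first round sends all of the VM's RAM
    rounds = 0
    while to_send > stop_threshold_mib and rounds < max_rounds:
        copy_time = to_send / link_mib_s
        # Memory dirtied while this round was on the wire must be resent;
        # it can never exceed the VM's total RAM.
        to_send = min(ram_mib, dirty_rate_mib_s * copy_time)
        rounds += 1
    # The VM is frozen here; whatever is left is copied while it is paused.
    freeze_seconds = to_send / link_mib_s
    return rounds, freeze_seconds


# Mostly idle 16 GiB VM on a fast link: converges quickly, tiny freeze.
idle = simulate_precopy(16384, dirty_rate_mib_s=10, link_mib_s=1000)
# VM that dirties memory faster than the link can carry it: never converges.
busy = simulate_precopy(16384, dirty_rate_mib_s=1200, link_mib_s=1000)
print("idle:", idle)
print("busy:", busy)
```

In this model the freeze stays short as long as the dirty rate is well below the link throughput, and it grows to the time needed to copy all RAM once the dirty rate exceeds it.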

              • rfx77 @zmk

                @zmk We only had the dmesg entries on Xen, not on VMware and not on Hyper-V.

                • nikade Top contributor @arc1

                  @arc1 How much RAM/CPU/disk do your VMs have?
                  It seems like something is taking too long in the last phase of the migration, when the source and destination VMs are synchronized.

                  • zmk

                    The problem may be in the transfer speed between hosts.

                    • nikade Top contributor @zmk

                      @zmk Yeah, maybe. We're connected to the network with 2x10G on each host, and while doing a migration (without storage migration) between 2 hosts in the pool I can see it spike at 6-7 Gbit/s.

                      • arc1 @nikade

                        @nikade 4 CPUs, 16 GB of RAM and roughly 200 GB of disk.
                        The 10-ping downtime was in the test environment with slower speeds between hosts, which explains the longer freeze.
                        But in production, with 2x25 Gb LACP, there is still a noticeable freeze on VMs with more sensitive software (keepalived/etcd). Nothing too terrible; we were just curious whether this is normal behaviour.

                        • nikade Top contributor @arc1

                          @arc1 So if you go to XOA and open the console of the VM, what happens then?
                          Is the VM frozen for the duration of the 10 pings? Open Task Manager to see if there is any CPU activity.

                          • arc1 @nikade

                            @nikade Yes, the VM is frozen without CPU activity.

                            • nikade Top contributor @arc1

                              @arc1 said in Short VM freeze when migrating to another host:

                              @nikade Yes, the VM is frozen without CPU activity.

                              So the VM is actually frozen in the console?
                              Because if it wasn't, I'd suggest adjusting the MAC aging time on your switches, since the VM's MAC address will stay bound to the physical host's switch port for a period of time after migrating.
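
One way to tell a real guest freeze apart from a network-side outage (such as a stale MAC table entry) is to log timestamps inside the guest during the migration. The sketch below is a hypothetical helper, not an XCP-ng tool. It assumes the guest's wall clock jumps forward after the pause, which depends on how the guest keeps time, so treat the result as a hint only.

```python
import time


def find_gaps(timestamps, threshold_s=1.0):
    """Return (start, gap_length) for each pair of consecutive samples
    that are further apart than threshold_s."""
    gaps = []
    for prev, cur in zip(timestamps, timestamps[1:]):
        if cur - prev > threshold_s:
            gaps.append((prev, cur - prev))
    return gaps


def watch(interval_s=0.1, threshold_s=1.0):
    """Run inside the guest: print a line whenever the loop stalls.

    Loops forever; stop it with Ctrl-C once the migration is done.
    """
    prev = time.time()
    while True:
        time.sleep(interval_s)
        now = time.time()
        if now - prev > threshold_s:
            print(f"stall of {now - prev:.1f}s ending at {time.ctime(now)}")
        prev = now


# Offline check on recorded samples (seconds): one 10 s gap after t=2.
print(find_gaps([0, 1, 2, 12, 13], threshold_s=5))
```

Calling `watch()` in the guest while pinging it from outside gives two views: if pings are lost but no stall is printed, the guest kept running and the network path is the more likely culprit.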

                              • andrewperry

                                We're seeing this issue when trying to migrate a Debian VM with 16GB of RAM.

                                It is a worker node in a Kubernetes cluster, so it is likely that its RAM changes a fair bit. It is not uncommon for the migration to fail because the freeze hits a 30-second time limit.

                                A Windows 10 Pro VM with 16GB of RAM migrates fine, I expect because not much is changing in its RAM.

                                Following along for recommendations! Our hosts sound very similar to @arc1 except our network speed is slower, which is one thing we are working on.
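
As a rough back-of-the-envelope check, the amount of dirty memory that can still be copied inside a 30-second freeze is bounded by the link throughput. The sketch below is plain arithmetic under stated assumptions: the function name and the 90% efficiency factor are hypothetical, and real migrations may skip or compress pages, so the numbers are only an upper-bound estimate.

```python
def max_flushable_gib(link_gbit_s, time_limit_s, efficiency=0.9):
    """Upper bound on the dirty memory (GiB) that fits through the link
    within time_limit_s. efficiency is an assumed protocol overhead factor."""
    bytes_per_s = link_gbit_s * 1e9 / 8 * efficiency
    return bytes_per_s * time_limit_s / 2**30


# Link speeds mentioned in this thread: 1, 10 and 25 Gbit/s.
for link in (1, 10, 25):
    print(f"{link:>2} Gbit/s: up to {max_flushable_gib(link, 30):.1f} GiB in 30 s")
```

On a 1 Gbit/s link this leaves only a few GiB of headroom, so a busy 16 GB VM could plausibly run into the limit, while faster links leave far more room.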

                                • nikade Top contributor

                                  And you guys aren't using any kind of dynamic memory?
                                  Can you post a screen dump of the Advanced tab where it shows the memory configuration?

                                  We have VMs with 128 GB of RAM that migrate just fine; when migrating one between hosts the network peaks at 7.6 Gbit/s and the migration takes about 20 seconds.
                                  Smaller VMs with 8, 16 or even 32 GB of RAM are migrated almost instantly.

                                  • robyt @nikade

                                    @nikade [screenshot: 2568c5bf-5336-4461-8f1f-60cf093f93a2-immagine.png]
                                    Inside the VM (Linux), free shows 94 GB of total memory.

                                    • andrewperry @nikade

                                      @nikade thanks for the ideas of where to look!

                                      In my case we're testing and just have a 1Gb link between these hosts, which is what I was putting it down to.

                                      This particular VM is a freshly migrated PV from Debian Xen with:

                                      Memory limits (min/max)
                                      Static: 16 MiB/16 GiB
                                      Dynamic: 8 GiB/16 GiB

                                      Could that Dynamic setting be the problem? As I recall, it reduces the VM to 8 GiB on migrate, so perhaps 8 GiB isn't enough for the VM during the migration.

                                      I will try changing it to 16/16 and see if that has any noticeable impact. Thanks!

                                      • olivierlambert Vates 🪐 Co-Founder CEO

                                        Yes that's very likely.

                                        • nikade Top contributor @andrewperry

                                          @andrewperry Yeah, try setting it to 16/16 GB instead, it will probably do some magic 🙂
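
For anyone who prefers the CLI over the Advanced tab, the 16/16 setting could be applied with something like the sketch below. It only builds and prints the command line rather than running it. It assumes the `xe vm-memory-limits-set` command with `static-min`, `dynamic-min`, `dynamic-max` and `static-max` parameters in bytes, so check this against the xe help on your own host. The helper name and the `<vm-uuid>` placeholder are hypothetical.

```python
GIB = 2**30
MIB = 2**20


def memory_limits_cmd(vm_uuid, static_min, size):
    """Build an argv for `xe vm-memory-limits-set` that pins the dynamic
    minimum and maximum (and static maximum) to the same size in bytes.

    Assumption: parameter names follow the xe CLI as commonly documented.
    """
    return [
        "xe", "vm-memory-limits-set",
        f"uuid={vm_uuid}",
        f"static-min={static_min}",
        f"dynamic-min={size}",
        f"dynamic-max={size}",
        f"static-max={size}",
    ]


# Static min stays at 16 MiB, as in the post above; dynamic becomes 16/16 GiB.
cmd = memory_limits_cmd("<vm-uuid>", 16 * MIB, 16 * GIB)
print(" ".join(cmd))
```

With dynamic min equal to dynamic max, there is no lower target left for the VM to be reduced to during migration, which is the effect the 16/16 suggestion is after.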

                                          • robyt @olivierlambert

                                            @olivierlambert Hi, today I upgraded my host.
                                            The big VM was frozen for ~7 minutes. It is a big VM (96 GB of RAM and 32 CPUs), but 7 minutes is a very long time (for the customer!).
                                            I have set 96/96 in dynamic: is that a normal time?
