XCP-ng

    Memory in vm half as fast after migration of vm.

    • olivierlambert (Vates, Co-Founder and CEO)

      It's a very long story. The impact in real-world usage isn't that big, and it depends on so many factors that it's hard to tell at any given time whether you are actually affected.

      The core issue is related to the TSC clock. Time/tick regularity on hardware is a REAL mess, even on the same hardware. Xen's default mode tries to use the TSC without emulation for your VMs, but sometimes the TSC does weird things, and Xen can preserve consistent behavior in the guest by emulating it.

      This emulation costs performance, and that is already the case on the very same hardware. Now imagine live migrating to another machine, with another CPU and motherboard, even of the exact same model. The TSC frequency can't be exactly the same, so there's some variation.
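
To make the effect of a small frequency mismatch concrete, here is a minimal arithmetic sketch. It assumes a guest that keeps converting TSC ticks to time using the frequency it calibrated on the old host, while the counter on the new host actually ticks at a slightly different rate. The function names and the example frequencies are illustrative assumptions, not values from this thread.

```python
def drift_ppm(calibrated_hz: float, actual_hz: float) -> float:
    """Relative frequency mismatch in parts per million."""
    return (actual_hz - calibrated_hz) / calibrated_hz * 1e6


def apparent_elapsed_seconds(real_seconds: float,
                             calibrated_hz: float,
                             actual_hz: float) -> float:
    """Time the guest believes has passed when it divides the tick count
    by the frequency it calibrated earlier (calibrated_hz), while the
    counter really advances at actual_hz."""
    ticks = real_seconds * actual_hz
    return ticks / calibrated_hz


if __name__ == "__main__":
    old_host_hz = 2_400_000_000.0   # hypothetical: frequency seen at boot
    new_host_hz = 2_400_120_000.0   # hypothetical: slightly faster host
    day = 86_400.0
    error = apparent_elapsed_seconds(day, old_host_hz, new_host_hz) - day
    print(f"mismatch: {drift_ppm(old_host_hz, new_host_hz):.1f} ppm")
    print(f"clock error after one day without correction: {error:.2f} s")
```

With these made-up numbers the mismatch is 50 ppm, which adds up to a few seconds per day if nothing (emulation, NTP, or the guest kernel) corrects for it.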

      To keep a perfectly constant/consistent clock in the VM, Xen's default TSC mode (0) detects those changes and "hides" them from the guest with some emulation (if needed).

      Mode 2 is "no emulation whatsoever" (and mode 1 is "always emulate"). I'm not sure about the risk of switching to mode 2 in production. If you want to test it and check your chrony/ntpd logs, I'm interested in the results 🙂
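
The three behaviours described above can be summarised as a small decision function. This is a deliberately simplified model of the description in this post, not Xen's actual logic (which also considers hardware features and other conditions). The mode names are borrowed from the `tsc_mode` values in Xen's xl.cfg; the function and its two boolean inputs are assumptions made for illustration.

```python
from enum import Enum


class TscMode(Enum):
    DEFAULT = "default"                # emulate only when needed
    ALWAYS_EMULATE = "always_emulate"  # guest TSC reads are always emulated
    NATIVE = "native"                  # no emulation whatsoever


def tsc_is_emulated(mode: TscMode,
                    host_tsc_reliable: bool,
                    migrated_to_different_tsc: bool) -> bool:
    """Simplified model: does the guest pay the emulation cost?"""
    if mode is TscMode.ALWAYS_EMULATE:
        return True
    if mode is TscMode.NATIVE:
        return False
    # Default mode: native while the TSC looks safe, emulated once the
    # hypervisor has something to hide (unreliable TSC or a migration
    # to a host whose TSC differs).
    return (not host_tsc_reliable) or migrated_to_different_tsc


if __name__ == "__main__":
    for migrated in (False, True):
        emulated = tsc_is_emulated(TscMode.DEFAULT, True, migrated)
        print(f"default mode, migrated={migrated}: emulated={emulated}")
```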

      • olivierlambert (Vates, Co-Founder and CEO)

        Here is an old paper from VMware that gives a good recap of the various timers and how complex the topic is: https://nextcloud.vates.fr/index.php/s/WHk64gHTK4iaJAP

        • Forza, replying to @olivierlambert

          @olivierlambert said in Memory in vm half as fast after migration of vm.:

          Mode 2 is "no emulation whatsoever" (and mode 1 is "always emulate"). I'm not sure about the risk of switching to mode 2 in production. If you want to test it and check your chrony/ntpd logs, I'm interested in the results

          We use local NTP servers, so we could use chrony or the like to sync (we already do). But is the TSC only about time sync, or do other things, such as Linux kernel internals, depend on it being stable?
          I found this article, https://superuser.com/questions/393969/what-does-clocksource-tsc-unstable-mean, which discusses the TSC a little. It seems we can have an unstable TSC on multicore systems, and kernels should handle it anyway?
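
One way to see which clocksource a Linux guest has actually settled on (for example, whether the kernel moved away from `tsc`) is to read the standard sysfs entries. The sketch below assumes a Linux guest exposing `/sys/devices/system/clocksource/clocksource0`; the helper name `read_clocksources` is made up for this example.

```python
from pathlib import Path

# Standard sysfs location on Linux; absent on other operating systems.
CLOCKSOURCE_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksources(base: Path = CLOCKSOURCE_DIR) -> dict:
    """Return the clocksource in use and the ones the kernel could use."""
    current = (base / "current_clocksource").read_text().strip()
    available = (base / "available_clocksource").read_text().split()
    return {"current": current, "available": available}


if __name__ == "__main__":
    try:
        info = read_clocksources()
        print(f"current:   {info['current']}")
        print(f"available: {', '.join(info['available'])}")
    except OSError as exc:
        print(f"clocksource information not available: {exc}")
```

Running it before and after a migration, together with a look at the kernel log for clocksource messages, gives a first hint of whether the guest noticed anything.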

          • olivierlambert (Vates, Co-Founder and CEO)

            Frankly, I wouldn't speculate on the potential risk, I prefer to answer that I don't know 😄 I might read stuff later when I can to get a better idea.

            • Forza, replying to @olivierlambert

              @olivierlambert said in Memory in vm half as fast after migration of vm.:

              Frankly, I wouldn't speculate on the potential risk, I prefer to answer that I don't know 😄 I might read stuff later when I can to get a better idea.

              And why does the emulation make it slower after a move? Is this a bug in the emulation?

              • olivierlambert (Vates, Co-Founder and CEO)

                No, it's not a bug. Emulating a TSC clock takes resources. Emulation in general is bad for performance.
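
A rough way to check whether clock reads got more expensive inside a guest is to time a large number of them before and after a migration. The micro-benchmark below is only a sketch: it measures Python's `time.perf_counter_ns`, so the result includes interpreter overhead and depends on the guest's clocksource. Only the relative change between two runs on the same VM is meaningful, and the function name and sample count are arbitrary choices.

```python
import time


def mean_clock_read_ns(samples: int = 200_000) -> float:
    """Average wall time, in nanoseconds, of one clock read (plus loop overhead)."""
    read = time.perf_counter_ns
    start = read()
    for _ in range(samples):
        read()
    end = read()
    return (end - start) / samples


if __name__ == "__main__":
    # Take the best of a few runs to reduce noise from scheduling.
    best = min(mean_clock_read_ns() for _ in range(5))
    print(f"approx. cost per clock read: {best:.1f} ns")
```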

                • Forza, replying to @olivierlambert

                  @olivierlambert said in Memory in vm half as fast after migration of vm.:

                  No it's not a bug. Emulating a TSC clock is taking resources. Emulation in general is bad for performance.

                  So what happens is that no emulation is used when the VM is booted, but it gets activated when migrating to another host?

                  And not enabling TSC emulation would mean that TSC behaviour/properties might change when migrating, which in turn has a possible (what?) effect on the guest VM?

                  • olivierlambert (Vates, Co-Founder and CEO)

                    1. That's correct, because of TSC variation between different platforms.
                    2. Correct. It means the guest will see the change; what I don't know is the real effect/risk for the guest 🙂
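
For anyone who wants to test this, the guest's clock offset can be tracked over time, for instance from the output of `chronyc tracking`. The parser below is a minimal sketch that assumes the usual "System time" line format printed by chrony; the function name and the sign convention (negative means the system clock is behind NTP time) are choices made for this example.

```python
import re
from typing import Optional

# Matches e.g. "System time     : 0.000006523 seconds slow of NTP time"
_SYSTEM_TIME = re.compile(
    r"^System time\s*:\s*([0-9.]+) seconds (slow|fast) of NTP time",
    re.MULTILINE,
)


def system_offset_seconds(tracking_output: str) -> Optional[float]:
    """Extract the system clock offset from `chronyc tracking` text.

    Returns a negative value when the clock is slow, a positive value
    when it is fast, and None when the line is not found.
    """
    match = _SYSTEM_TIME.search(tracking_output)
    if match is None:
        return None
    value = float(match.group(1))
    return -value if match.group(2) == "slow" else value


if __name__ == "__main__":
    sample = (
        "Reference ID    : CB00710F (foo.example.net)\n"
        "System time     : 0.000006523 seconds slow of NTP time\n"
        "Last offset     : -0.000006747 seconds\n"
    )
    print(system_offset_seconds(sample))
```

Logging this value at regular intervals before and after a migration, once per TSC mode, would show whether the guest clock jumps or drifts when emulation is disabled.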

                    • Forza, replying to @olivierlambert

                      @olivierlambert said in Memory in vm half as fast after migration of vm.:

                      1. That's correct. Because of TSC variation between different platforms
                      2. Correct. It means the guest will see that, and the unknown to me is the real effect/risk on the guest 🙂

                      Thanks for helping clarify the problem. Definitely something worth testing fully before making changes.

                      • olivierlambert (Vates, Co-Founder and CEO)

                        We won't change the default behavior before having a LARGE user base running it without any problems 😄 And while I can only wonder about the risk on Linux, I have zero knowledge about Windows-based OSes.
