XCP-ng

    Memory in vm half as fast after migration of vm.

    • olivierlambert Vates 🪐 Co-Founder CEO

      That's weird, given that your VM is set to static memory, right? (Same dynamic min, dynamic max and static max?)

      edit: is it the only diff?

      • Andreas @olivierlambert

        @olivierlambert
        No, this also changed, from
        memory-actual ( RO): 6442455040
        to
        memory-actual ( RO): 6442450944

        The fields below changed as well, but that probably has no significance:
        start-time
        console-uuids
        dom-id
        VCPUs-utilisation
        guest-metrics-last-updated
        b2471d1e-2d28-44c6-af0e-c555e9ecf100-image.png
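
For scale, the two memory-actual values above differ by a single 4 KiB page, and the second value is exactly 6 GiB. A quick shell check of that arithmetic (the variable names are only illustrative):

```shell
# memory-actual values copied from the post above
first=6442455040
second=6442450944
diff=$((first - second))               # difference in bytes
gib=$((second / 1024 / 1024 / 1024))   # second value expressed in GiB
echo "difference: ${diff} bytes, second value: ${gib} GiB"
```

A difference this small is consistent with the guess that it has no significance for the slowdown.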

        • olivierlambert Vates 🪐 Co-Founder CEO

          That's weird. We'll see if we can reproduce this. @Darkbeldin will try when he can (probably in January)

          • Andreas @olivierlambert

            @olivierlambert
            Okay thanks
            and happy New Year 🙂

            • olivierlambert Vates 🪐 Co-Founder CEO

              You too!

              • Andreas @olivierlambert

                @olivierlambert
                Sorry to disturb you.
                Just to verify that nothing was wrong with the physical servers,
                I took 2 identical PCs and did a clean install of XCP-ng 8.2 on each.
                Then I installed a virtual machine with 4 GB of static memory and the guest tools,
                installed Redis and ran the test.
                Then I migrated the VM to the other PC and ran the test again, and the speed was half.
                I saved the results from before and after the migration; the files are attached.

                1.Before migration.txt
                2.After migration.txt

                • Darkbeldin Vates 🪐 Pro Support Team @Andreas

                  @andreas Hi Andreas,

                  After testing it on my side, I can confirm that I reproduce the issue.
                  I will discuss it with the dev team and get back to you.

                  • Andreas @Darkbeldin

                    @darkbeldin
                    Okay Thanks

                    • Forza @Darkbeldin

                      @darkbeldin said in Memory in vm half as fast after migration of vm.:

                      @andreas Hi Andreas,

                      After testing it on my side, I can confirm that I reproduce the issue.
                      I will discuss it with the dev team and get back to you.

                      This seems like quite an important find. Please let us know how this goes.

                      • Darkbeldin Vates 🪐 Pro Support Team @Andreas

                        @andreas

                        So I was doing some testing before reporting to the dev team, and I see a behavior that I would like you to check whether you can reproduce.
                        My clean VM reports this:

                        yachy@ubuntuyachy:~$ redis-benchmark -r 1000000 -n 2000000 -t get,set,lpush,lpop -P 16 -q
                        SET: 156152.41 requests per second
                        GET: 168180.28 requests per second
                        LPUSH: 156421.08 requests per second
                        LPOP: 159757.17 requests per second
                        

                        That's my reference. When I migrate to another host, it reports this:

                        yachy@ubuntuyachy:~$ redis-benchmark -r 1000000 -n 2000000 -t get,set,lpush,lpop -P 16 -q
                        SET: 55718.07 requests per second
                        GET: 58683.72 requests per second
                        LPUSH: 55742.91 requests per second
                        LPOP: 54775.01 requests per second
                        

                        If I reboot, it goes back to the original numbers, but if I migrate back to the original host without rebooting, it reports this:

                        redis-benchmark -r 1000000 -n 2000000 -t get,set,lpush,lpop -P 16 -q
                        SET: 138092.94 requests per second
                        GET: 153151.08 requests per second
                        LPUSH: 147004.78 requests per second
                        LPOP: 148115.23 requests per second
                        

                        So not quite as good as the reference, but way better than after the first migration.
                        As I want to be thorough before reporting, could you check whether you reproduce that?
                        So:

                        • migrate to another host
                        • run the test
                        • migrate back to the original host
                        • run the test

                        Thanks for your help.

                        • Andreas @Darkbeldin

                          @darkbeldin
                          Hello
                          I did a clean install of XCP-ng 8.2 on 2 identical PCs named host1 and host2, then updated both to the latest packages with "yum update".
                          Then I installed an Ubuntu 20.04 virtual machine with 4 GB of static memory and the guest tools,
                          and installed redis-server.
                          Then I ran the test
                          on host1:
                          root@ramtest:/home/andreas# redis-benchmark -r 1000000 -n 2000000 -t get,set,lpush,lpop -P 16 -q
                          SET: 243368.20 requests per second
                          GET: 261917.23 requests per second
                          LPUSH: 257499.67 requests per second
                          LPOP: 264830.50 requests per second

                          Then I migrated to host2 and got lower speed:
                          root@ramtest:/home/andreas# redis-benchmark -r 1000000 -n 2000000 -t get,set,lpush,lpop -P 16 -q
                          SET: 92055.60 requests per second
                          GET: 95297.09 requests per second
                          LPUSH: 95570.31 requests per second
                          LPOP: 95401.64 requests per second

                          Then I migrated back to host1 and got almost the same speed as at the start:
                          root@ramtest:/home/andreas# redis-benchmark -r 1000000 -n 2000000 -t get,set,lpush,lpop -P 16 -q
                          SET: 238010.23 requests per second
                          GET: 253100.48 requests per second
                          LPUSH: 259100.92 requests per second
                          LPOP: 259134.50 requests per second
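
To put a number on the drop, a minimal sketch that compares two saved result files. It assumes each file contains the quiet-mode lines exactly as pasted above ("SET: 243368.20 requests per second"); the function name and the file names are hypothetical.

```shell
# Print, for each redis-benchmark test, the throughput of the second run
# as a percentage of the first run.
# Usage: compare_runs before.txt after.txt   (hypothetical file names)
compare_runs() {
    LC_ALL=C awk '
        # First file: remember the baseline requests/second per test name.
        NR == FNR { base[$1] = $2; next }
        # Second file: print the ratio for every test that has a baseline.
        ($1 in base) && base[$1] > 0 {
            printf "%s %.1f%%\n", $1, 100 * $2 / base[$1]
        }
    ' "$1" "$2"
}
```

With the host1 and host2 numbers above, this gives roughly 36 to 38 percent of the original throughput on host2, so the drop is somewhat worse than half.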

                          • Darkbeldin Vates 🪐 Pro Support Team @Andreas

                            @andreas OK, so migrating back to the original host gives a small perf drop, but clearly not what we see when we migrate to another host.
                            I will report it like that. Thanks for the test, Andreas 😉

                            • Andreas @Darkbeldin

                              @darkbeldin
                              Okay
                              I did more tests:
                              Started on host1: normal speed
                              Migrated to host2
                              and ran the test: lower speed
                              Restarted the VM
                              and ran the test on host2:
                              normal speed
                              Migrated to host1
                              and ran the test: lower speed
                              Migrated to host2: normal speed

                              So it seems to be something that happens after the first migration to another host.

                              I have a third, identical PC. I should install XCP-ng on it
                              and see what happens if I move the VM to host3 after moving it to host2,
                              but I will have to do that tomorrow; I do not have time now.

                              • Darkbeldin Vates 🪐 Pro Support Team @Andreas

                                @andreas Yes, I tested it, so no need to do it: migrating to a third host results in half the performance, same as the first migration.

                                • Darkbeldin Vates 🪐 Pro Support Team @Andreas

                                  @andreas OK, so after discussing it with the dev team, the issue has been identified.
                                  The trouble is linked to TSC management in the VM.
                                  You can work around the issue by setting this on the VM:

                                  xe vm-param-set uuid=<VM_UUID> platform:tsc_mode=2
                                  

                                  But be aware that we cannot recommend this setting for a production VM.
                                  The TSC won't be emulated at all if you enable this setting, so you might see some weird time behavior during migration.
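
To check whether the override is set on a VM, or to undo it later, something like the following should work from the host's command line. This is only a sketch: the function names are made up for illustration, while `vm-param-get` and `vm-param-remove` with `param-name`/`param-key` are the standard xe commands for map parameters such as `platform`. It is an assumption here that a change to `platform` only takes effect once the VM has been fully shut down and started again.

```shell
# Show the TSC mode override of a VM.
# xe reports an error if the tsc_mode key is not set on the VM.
tsc_mode_show() {
    xe vm-param-get uuid="$1" param-name=platform param-key=tsc_mode
}

# Remove the override so the VM falls back to the default TSC mode.
tsc_mode_reset() {
    xe vm-param-remove uuid="$1" param-name=platform param-key=tsc_mode
}

# Usage (replace <VM_UUID> with the real UUID):
#   tsc_mode_show <VM_UUID>
#   tsc_mode_reset <VM_UUID>
```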

                                  • Andreas @Darkbeldin

                                    @darkbeldin Okay thanks
                                    I did test this and it worked.

                                    • TheNorthernLight @Darkbeldin

                                      @darkbeldin Hello. So, for those of us in production, does this problem affect rolling pool upgrades?

                                      If so, how do we fix this and update our pools without needing to explicitly shut down the VMs in the pool?

                                      • Andrew Top contributor @Darkbeldin

                                        @darkbeldin Is memory access actually slower, or is it a timing issue with the statistics?

                                        • Darkbeldin Vates 🪐 Pro Support Team @Andrew

                                          @andrew Sorry guys, I'm not sure I understand the issue well enough to answer. @olivierlambert has a much better understanding of it, so I think it's better to ask him 😉

                                          • olivierlambert Vates 🪐 Co-Founder CEO

                                            It's a very long story. The impact isn't that big in real usage, and it depends on so many factors that it's hard to know at any given time whether you are really affected or not.

                                            The core issue is related to the TSC clock. Time/tick regularity on hardware is a REAL mess, even on the same hardware. Xen's default mode tries to use the TSC without emulation for your VMs, but sometimes the TSC does weird things, and Xen is able to preserve consistent behavior in the guest by emulating it.

                                            This emulation costs performance. And this is already the case on the very same hardware. Now imagine live migrating to another machine, with another CPU and motherboard, even of the exact same model. The TSC frequency can't be exactly the same, so there's some variation.

                                            To keep a perfectly constant/consistent clock in the VM, Xen's default TSC mode (0) detects those changes and "hides" them from the guest with some emulation (if needed).

                                            Mode 2 is "no emulation whatsoever" (and mode 1 is "always emulate"). I'm not exactly sure about the risk of switching to mode 2 in production. If you want to test it and check the chrony/ntpd logs, I'm interested in the results 🙂
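
For anyone who wants to run that test, a minimal sketch of what could be checked inside a Linux guest before and after a live migration. It assumes the standard Linux sysfs clocksource path and that chrony is installed in the guest; the function name is hypothetical.

```shell
# Print the clocksource the guest kernel is currently using.
# The optional argument overrides the sysfs path (only useful for testing).
current_clocksource() {
    cat "${1:-/sys/devices/system/clocksource/clocksource0/current_clocksource}"
}

# Typical use, run once before and once after the migration:
#   current_clocksource
#   chronyc tracking      # clock offset and frequency error as seen by chrony
```

Comparing the `chronyc tracking` output from before and after the migration is one way to see whether the guest clock jumped or started drifting.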
