XCP-ng

    VM's going really slow after 3 - 4 weeks

      Berrick

      Evening All,

      Would be grateful if anyone can offer guidance on how to fault-find this issue. Or solve it :). The Google-fu I have used thus far hasn't really helped.

      Everything appears to be working great. However, over a period of about 3 - 4 weeks the VMs slow down. For example, the Windows 10 VM becomes noticeably slower.

      Rebooting the physical server instantly brings the VMs back to their zippy selves.

      This is pretty much an out-of-the-box installation.
      I haven't found anything obviously wrong.
      The only thing I have seen is that some of the CPUs get very busy when the issue presents, whereas when all appears OK, all CPUs sit at around 12% usage.

      I have just applied the following pool patches:

      • Linux firmware 20190314
      • microcode_ctl 2.1
      • xen-dom0-libs 4.13.4
      • xen-dom0-tools 4.13.4
      • xen-hypervisor 4.13.4
      • xen-libs 4.13.4
      • xen-tools 4.13.4

      Below is the spec of the physical and XCP setup

      **Physical Server**
      CPU model	Intel(R) Xeon(R) CPU E5-2690 0 @ 2.90GHz
      GPUs	MGA G200EH
      Core (socket)	32 (2)
      Hyper-threading (SMT)	Enabled
      Manufacturer info	HP (ProLiant DL380p Gen8)
      BIOS info	HP (P70)
      
      **Network**
      HP Ethernet 1Gb 4-port 366FLR Adapter
      
      **Storage**
      Smart Array P420i Controller
      Raid 5
      4 x ST9900805SS
      
      **XCP-ng**
      Build Date 2022-02-11
      Version: 8.2
      DBV: 0.0.1
      
      17.4 GB RAM available (48.0 GB total)
      
      Local storage 	Ext	No	36% (893.9 GB used)	2.4 TB	2.9 TB
      
      **Networking**
      Dedicated NIC 0 for management
      LACP for all VMs using NIC 2 & 3
      
      **VMs**
      CheckMK - Ubuntu Bionic Beaver 18.04 (1): using 3.0 GB
      graylog: using 4.0 GB
      Windows Server 2012 R2 (64-bit) (1): using 3.0 GB
      Windows 10 (64-bit) : using 5.0 GB
      Exchange: using 4.0 GB
      DC: using 4.0 GB
      XO Ubuntu Focal Fossa 20.04: using 3.0 GB
      
      Vendor: GenuineIntel
      Model: Intel(R) Xeon(R) CPU E5-2690 0 @ 2.90GHz
      Speed: 2892 MHz
      

      Many thanks

        Berrick

        Hi, it has only been about 2 weeks since the last server reboot and already the VMs are noticeably slower.

        Still struggling to make any headway on how to fault-find this issue...

        Is Dom0 the most likely culprit, or could it be storage?

          olivierlambert Vates 🪐 Co-Founder CEO

          Hi,

          Why do you think it's dom0's fault? Have you taken a look at the logs?

            Anonabhar @Berrick

            @Berrick Have your virtual machines started to use swap space?

              Berrick

              Thanks for the replies.

              @olivierlambert I have looked at the logs but, as mentioned, I can see nothing which to me indicates an issue. Then again, to my peers I could be talking rubbish.

              As all VMs appear to suffer and rebooting the physical server corrects the "issue" in the short term, I am thinking about what is common to all of them, hence Dom0.

              As a testament to XenServer/XCP-ng, it has pretty much just worked, and until now I have been able to find the answer to any issues that occurred. So I haven't really learnt a whole lot about troubleshooting.

              @Anonabhar No, I don't believe so. See below:

              # cat /proc/swaps
              Filename                                Type            Size    Used    Priority
              /dev/sda6                               partition       1048572 0       -2
              
              # grep Swap /proc/meminfo
              SwapCached:            0 kB
              SwapTotal:       1048572 kB
              SwapFree:        1048572 kB
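
A minimal sketch of the same swap check in script form, in case it is useful to log the figure over the weeks between reboots. It only parses `/proc/meminfo`-style text; the function name `swap_usage_kb` is made up for illustration, and the sample is the output pasted above.

```python
def swap_usage_kb(meminfo_text):
    """Return (total_kb, used_kb) parsed from /proc/meminfo-style text."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Key: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[key.strip()] = int(parts[0])
    total = fields.get("SwapTotal", 0)
    free = fields.get("SwapFree", 0)
    return total, total - free


# The values pasted in the post above.
sample = """SwapCached:            0 kB
SwapTotal:       1048572 kB
SwapFree:        1048572 kB"""

total_kb, used_kb = swap_usage_kb(sample)
print(f"swap used: {used_kb} kB of {total_kb} kB")
```

On a live Linux system the text could be read from `/proc/meminfo` instead of the sample string. The same check would need to be run inside each Linux guest to answer the question for the VMs themselves.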
              
                mjtbrady

                Have you tried using xentop to see what the VMs are doing?
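
For comparing a healthy host against a slow one, it may be easier to capture xentop in batch mode (`xentop -b -i 2`) and rank the domains by CPU than to compare screenshots. The sketch below is only an illustration: it assumes each data row starts with the name, a six-character state field such as `-----r`, then CPU(sec) and CPU(%). The sample text and its numbers are invented, and `busiest_domains` is a hypothetical helper.

```python
import re

# NAME, then a six-character state field such as "--b---" or "-----r",
# then CPU(sec) and CPU(%). Later columns are ignored, so values like
# "no limit" further along the row do not matter.
ROW = re.compile(
    r"^\s*(?P<name>.+?)\s+(?P<state>[-dsbcpr]{6})\s+"
    r"(?P<cpu_sec>\d+)\s+(?P<cpu_pct>\d+(?:\.\d+)?)"
)


def busiest_domains(xentop_text):
    """Return [(name, cpu_percent), ...] sorted with the busiest domain first."""
    rows = []
    for line in xentop_text.splitlines():
        m = ROW.match(line)
        if m:  # header and blank lines do not match
            rows.append((m.group("name"), float(m.group("cpu_pct"))))
    return sorted(rows, key=lambda r: r[1], reverse=True)


# Invented sample, shaped like batch output; not taken from this thread.
sample = """\
      NAME  STATE   CPU(sec) CPU(%)     MEM(k) MEM(%)  MAXMEM(k) MAXMEM(%) VCPUS
  Domain-0 -----r      51234   12.5    4194304    8.3   no limit       n/a    16
   graylog -----r     987654  310.0    4194304    8.3    4194304       8.3     4
Windows 10 --b---      23456    3.2    5242880   10.4    5242880      10.4     4
"""

for name, pct in busiest_domains(sample):
    print(f"{name}: {pct}%")
```

Saving one such capture right after a reboot and another once the VMs feel slow would give two rankings to compare side by side.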

                  Berrick @mjtbrady

                  @mjtbrady

                  We ran xentop after you suggested it, but I don't believe it shows anything.

                  XentopAfterReboot.jpg

                  Just recently we had notification of another lot of patches, which have been applied.

                  So now we wait!

                    Danp Pro Support Team

                    This doesn't look normal to me --
                    05f778de-d7f5-40ac-a489-388327c8aff4-image.png

                    Can you tell us more about this VM?

                      Gheppy @Danp

                      @Danp
                      I have the same strange values on my server too:
                      7b9159c9-30e6-40e6-b982-54a5aab158cf-image.png

                        olivierlambert Vates 🪐 Co-Founder CEO

                        100% = 1 vCPU full.

                          Gheppy

                          This is from one server with one VM:
                          6687a557-58b5-447a-8cd1-cd564c3387e8-image.png

                            Gheppy @olivierlambert

                            @olivierlambert
                            So this means that in the last case I have 6 vCPUs at 100% load?

                              olivierlambert Vates 🪐 Co-Founder CEO

                              How many vCPUs have you assigned to this VM?

                                Gheppy @olivierlambert

                                @olivierlambert
                                The server has 24 cores and the VM has 20 vCPUs.

                                  olivierlambert Vates 🪐 Co-Founder CEO

                                  So (roughly and IIRC), you are using the equivalent of 6.23 vCPUs at 100%.
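
The arithmetic behind that figure, as a rough sketch: with 100% meaning one full vCPU, the xentop CPU(%) value divided by 100 gives the number of fully busy vCPU equivalents. The 623% reading is an assumption inferred from the 6.23 figure above (the screenshot value is not in the text), the 20 vCPUs come from Gheppy's reply, and both function names are made up for illustration.

```python
def busy_vcpus(cpu_percent):
    """xentop CPU(%) expressed as fully busy vCPU equivalents (100% = 1 vCPU)."""
    return cpu_percent / 100.0


def share_of_assigned(cpu_percent, assigned_vcpus):
    """Fraction of the VM's assigned vCPU capacity that is in use."""
    return cpu_percent / (100.0 * assigned_vcpus)


reading = 623.0  # assumed reading, inferred from the 6.23 figure in the thread
vcpus = 20       # vCPUs assigned to the VM, per Gheppy's reply

print(f"busy vCPU equivalents: {busy_vcpus(reading):.2f}")
print(f"share of assigned capacity: {share_of_assigned(reading, vcpus):.1%}")
```

Read this way, a value well above 100% is not an error in itself; it only means the VM is keeping several of its vCPUs busy at once.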

                                    Danp Pro Support Team @Gheppy

                                    @Gheppy I was surprised to see a graylog VM using so much CPU, but maybe it's normal. 🤷

                                      Gheppy

                                      Yes, it is a VM with a heavily used database.
                                      At the moment I am trying to convince them to buy an XO license for this server.
                                      I work for a public service and I have no say when it comes to money.

                                        Berrick @Danp

                                        @Danp said in VM's going really slow after 3 - 4 weeks:

                                        @Gheppy I was surprised to see a graylog VM using so much CPU, but maybe it's normal. 🤷

                                        Not sure if Gheppy's DB is graylog. Mine was, and when I searched for an answer as to why the CPU utilization looked strange, I came up with the same answer Olivier supplied.

                                        I would like to point out, as I didn't earlier, that the graylog CPU utilization in the xentop image I uploaded has since been fixed, so CPU utilization is much lower now.

                                        However, the CPU utilization of that VM was also at those high levels straight after a server reboot, so I don't think it's the answer to why all VMs gradually slow down.

                                          olivierlambert Vates 🪐 Co-Founder CEO

                                          What's the hardware behind it by the way?

                                            Gheppy

                                            My server is an HP DL380 G9, CPU 2 x E5-2620 v3 @ 2.40GHz, with 4 x SSD and 16 x 2.5" HDD and 128 GB RAM.
                                            The system (XCP-ng and OS) is on 4 x SSD RAID 10; the DB is on 14 x HDD RAID 10.
