XCP-ng

    VM's going really slow after 3 - 4 weeks

    • Danp (Pro Support Team) @Gheppy

      @Gheppy I was surprised to see a graylog VM using so much CPU, but maybe it's normal. 🤷

      • Gheppy

        Yes, it is a VM with a heavily used database.
        At the moment I am trying to convince them to buy an XO license for this server.
        I work for a public service and I have no say when it comes to money.

        • Berrick @Danp

          @Danp said in VM's going really slow after 3 - 4 weeks:

          @Gheppy I was surprised to see a graylog VM using so much CPU, but maybe it's normal. 🤷

          Not sure if Gheppy's DB is Graylog. Mine was, and when I searched for why the CPU utilization looked so strange, I came up with the same answer Olivier supplied.

          I would like to point out, as I didn't earlier, that the Graylog CPU utilization shown in the xentop image I uploaded has since been fixed, so CPU utilization is much lower now.

          However, that VM's CPU utilization was also at those high levels right after a server reboot, so I don't think it explains why all the VMs gradually slow down.

          • olivierlambert (Vates 🪐 Co-Founder CEO)

            What's the hardware behind it by the way?

            • Gheppy

              My server is an HP DL380 G9 with 2 x E5-2620 v3 CPUs @ 2.40GHz, 4 x SSD, 16 x 2.5" HDD, and 128 GB RAM.
              The system (XCP-ng and OS) is on 4 x SSD in RAID 10; the DB is on 14 x HDD in RAID 10.

              • Berrick @Gheppy

                @Gheppy What is free disk space like on this VM?

                This is what caused the CPUs in our Graylog server to go nuts: it had almost run out of free disk space. The only thing I did was increase free disk space by clearing out some of the older logs.

                72b922e6-3c6a-457e-984a-271e5ff95961-image.png

                • Gheppy

                  There is free space. On the physical disks I have:

                  • 4 x 400 GB SSD = 800 GB in RAID 10, holding XCP-ng plus the VM's first virtual disk with the operating system. That disk has 350 GB allocated, split into two partitions: 150 GB for the OS and 200 GB for the SQL log (for some speed). Used space is 65 GB for the OS and 120 GB for the SQL log.

                  • 14 x 1200 GB HDD = 8.4 TB in RAID 10, holding the rest of the database, spread over 3 virtual disks of 2 TB each with software RAID inside the guest OS (which is Windows). The configuration is like this because of the XCP-ng limit of 2 TB per virtual disk. The database is currently 3.7 TB.
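For anyone checking the capacity figures in this post, here is a minimal Python sketch of the arithmetic. It assumes plain RAID 10 (half of the raw capacity is usable) and uses only the numbers quoted above; the function name `raid10_usable_gb` is made up for illustration, and dom0's own footprint on the SSD array is not stated in the post, so it is not subtracted.

```python
def raid10_usable_gb(disk_count: int, disk_size_gb: float) -> float:
    """Usable capacity of a RAID 10 array: half the raw capacity,
    because every disk has one mirror."""
    if disk_count < 4 or disk_count % 2 != 0:
        raise ValueError("this sketch expects an even number of disks, at least 4")
    return disk_count * disk_size_gb / 2


# Array sizes from the post.
ssd_array_gb = raid10_usable_gb(4, 400)      # SSD array
hdd_array_gb = raid10_usable_gb(14, 1200)    # HDD array

# SSD array space not taken by the 350 GB VM disk.
ssd_left_gb = ssd_array_gb - 350

# Free space inside the guest partitions (allocated minus used).
os_free_gb = 150 - 65
sql_log_free_gb = 200 - 120

print(f"SSD array: {ssd_array_gb:.0f} GB, HDD array: {hdd_array_gb / 1000:.1f} TB")
print(f"SSD array left after VM disk: {ssd_left_gb:.0f} GB")
print(f"Guest free space: OS {os_free_gb} GB, SQL log {sql_log_free_gb} GB")
```

The results match the post: 800 GB and 8.4 TB of usable array space, with 85 GB and 80 GB free on the two guest partitions.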

                  • tjkreidl (Ambassador) @Gheppy

                    @Gheppy If you run top, how is the dom0 CPU load and how is the swap space? If dom0 is saturated it may need more memory allocated to it.

                    • Gheppy

                      Here is a top screenshot. Swap seems to be OK, but only 103 MB of the 7.4 GB of memory is free.
                      I will increase the memory for Xen. Thank you.
                      a4628398-2b29-4965-ae22-76500a7c151c-image.png
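The same check can be scripted instead of read off a screenshot. Below is a minimal Python sketch that parses `/proc/meminfo`-style text and reports available memory and swap in use. The sample text is invented for illustration (only the 103 MB free and roughly 7.4 GB total loosely mirror the post), and the helper names are hypothetical. Note that a low "free" figure alone is not necessarily a problem, since Linux uses spare memory for cache; `MemAvailable` is usually the more telling number.

```python
# Illustrative /proc/meminfo excerpt; the values are made up.
SAMPLE_MEMINFO = """\
MemTotal:        7759872 kB
MemFree:          105472 kB
MemAvailable:    2097152 kB
SwapTotal:       1048576 kB
SwapFree:        1048576 kB
"""


def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style text into {field_name: value_in_kB}."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def memory_summary(info: dict) -> dict:
    """Return total, available and swap-used memory in MiB."""
    # MemAvailable estimates memory usable without swapping; fall back
    # to MemFree if the kernel does not report it.
    available_kb = info.get("MemAvailable", info.get("MemFree", 0))
    swap_used_kb = info.get("SwapTotal", 0) - info.get("SwapFree", 0)
    return {
        "total_mib": info.get("MemTotal", 0) / 1024,
        "available_mib": available_kb / 1024,
        "swap_used_mib": swap_used_kb / 1024,
    }


summary = memory_summary(parse_meminfo(SAMPLE_MEMINFO))
print(summary)
```

On a real dom0 you would pass the contents of `/proc/meminfo` (for example `open("/proc/meminfo").read()`) instead of the sample text.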

                      • Gheppy @tjkreidl

                        @tjkreidl
                        I increased the RAM for Xen to 10 GB and it is much better, with a maximum of 156% CPU in xentop.
                        Thank you

                        • tjkreidl (Ambassador) @Gheppy

                          @Gheppy You have to give dom0 enough CPU power for various reasons: 1) so that dom0 itself is not bogged down, 2) so that VMs can interact with dom0 reasonably fast, 3) to provide enough resources to deal with storage and network I/O. Showing essentially no swap space in use is good!

                          • Berrick @Berrick

                            Season's greetings to all,

                            So the slowness is creeping back. I still can't find anything obvious.
                            Latest xentop:

                            4ccf84eb-8d2e-4434-bafc-cd38c43007f2-image.png

                            Due to business pressures we have had to reboot the physical server.
                            So now we have to wait for the issue to recur.

                            There were 5 more patches, which have been applied.

                            If anyone has any other suggestions of what to check, specific logs to search, etc., they would be gratefully received.

                            Kind regards

                            • olivierlambert (Vates 🪐 Co-Founder CEO)

                              Adding @fohdeesha in the loop

                              • Tristis Oris (Top contributor)

                                I can't say that time since reboot is the cause of the problem, but sometimes I get that feeling. Even an SSH connection takes far too long, like 10 seconds on the local network. Same for HTTP speed on small services without any load.

                                • tjkreidl (Ambassador) @Berrick

                                  @Berrick Run iostat and see if anything shows up as slow or saturated with your storage I/O.
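If iostat is not at hand, or you want to log the figure over several weeks, device utilization can be approximated from two snapshots of `/proc/diskstats`, which is where iostat takes its `%util` from. The Python sketch below is only an illustration: the two sample snapshots are invented, the function names are hypothetical, and it relies on the standard Linux layout in which the 10th counter after the device name is the time spent doing I/O in milliseconds.

```python
# Two invented /proc/diskstats lines for one device, taken 1000 ms apart.
BEFORE = "   8       0 sda 1000 10 80000 500 2000 20 160000 1500 0 1800 2000\n"
AFTER = "   8       0 sda 1100 10 88000 600 2400 20 192000 2400 0 2750 3000\n"


def parse_diskstats(text: str) -> dict:
    """Map device name -> milliseconds spent doing I/O."""
    ticks = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 14:
            # fields[0..2] are major, minor, device name; fields[12] is
            # the 10th counter: total time spent doing I/O, in ms.
            ticks[fields[2]] = int(fields[12])
    return ticks


def utilization_percent(before: dict, after: dict, elapsed_ms: float) -> dict:
    """Share of the interval during which each device was busy."""
    result = {}
    if elapsed_ms <= 0:
        return result
    for name, end_ticks in after.items():
        if name in before:
            result[name] = 100.0 * (end_ticks - before[name]) / elapsed_ms
    return result


util = utilization_percent(parse_diskstats(BEFORE), parse_diskstats(AFTER), 1000)
print(util)
```

A device that stays near 100% for long stretches is a candidate for the saturation described above. On a real host you would read `/proc/diskstats` twice with a known delay in between rather than use the sample strings.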

                                  • EddieCh08666741

                                    Hi Berrick, I believe most issues like this are due to I/O. I've been using XCP-ng for many years and I have about 100 VMs. I love XCP-ng and it's the best out there. There have been a few instances where my VMs slowed to a crawl and became unusable:

                                    1. One of the disks in the RAID failed, causing the VMs on that particular server to crawl.
                                    2. Too little memory provisioned. Heavy usage.

                                    Hope the above helps.

                                    • tjkreidl (Ambassador) @EddieCh08666741

                                      @EddieCh08666741 Indeed. Run top and make sure you have adequate resources allocated for dom0 for both memory and CPU. There should be no swapping to speak of and the CPU should be nowhere close to 100% in use.

                                      • Berrick @tjkreidl

                                        Thanks for all the replies.
                                        I will be checking when the issue starts again in a few weeks and report back.

                                        From previous xentop output I don't believe Dom0 is showing high CPU or Memory.

                                        I also don't think there is a problem with the physical disks.

                                        • Berrick @Berrick

                                          @Berrick
                                          So it's been a while since my last post on this topic, but finally I think we have bottomed this issue out! 🙂

                                          So, the long and the short of it:
                                          Nothing we tried or checked, and no diagnostics we ran up to this point, showed any issue.
                                          Then one Monday morning there was one alert from the iLO about "Corrected Memory Error threshold exceeded".

                                          As this server was due a memory upgrade to LV memory soon, and we would need to reboot to clear the alert anyway, we brought the upgrade forward.

                                          Since the new memory was installed, tempting fate here, it has been OK.

                                          Thanks to all that offered advice.

                                          Kind regards

                                          • olivierlambert (Vates 🪐 Co-Founder CEO)

                                            Oh wow. So something was wrong with the hardware, specifically the memory, with potentially more and more errors and the error correction trying to keep up, slowing everything down. Is that a good recap?
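For readers who want to watch for corrected memory errors from the OS side, here is a minimal Python sketch that reads the Linux EDAC counters under `/sys/devices/system/edac/mc`. This is an assumption-heavy illustration, not what was done in this thread: the alert here came from the iLO, the EDAC driver may not be loaded or may not expose counters in an XCP-ng dom0, and the function name `read_edac_counts` is made up.

```python
import glob
import os


def read_edac_counts(root: str = "/sys/devices/system/edac/mc") -> dict:
    """Return {controller: {"ce": corrected, "ue": uncorrected}}.

    A counter is None when its file is missing or unreadable; the
    result is empty when no memory controller directory exists.
    """
    counts = {}
    for mc_dir in sorted(glob.glob(os.path.join(root, "mc*"))):
        entry = {}
        for key, filename in (("ce", "ce_count"), ("ue", "ue_count")):
            try:
                with open(os.path.join(mc_dir, filename)) as f:
                    entry[key] = int(f.read().strip())
            except (OSError, ValueError):
                entry[key] = None
        counts[os.path.basename(mc_dir)] = entry
    return counts


# Prints an empty dict on hosts without EDAC counters.
print(read_edac_counts())
```

A corrected-error count that keeps climbing between runs would point in the same direction as the iLO alert. The out-of-band management log remains the more reliable source on this kind of hardware.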
