XCP-NG VMs extremely slow

• Andi79 @Forza

      • What kind of storage do you use for your VMs?

Both machines have two RAID 1 arrays: 2x Seagate Guardian BarraCuda ST4000LM024 (HDD) and 2x Intel Solid-State Drive D3-S4610 Series 960 GB (SSD).

There is no problem on the host, only in the VMs (and the problem occurs both on VMs on the SSDs and on VMs on the Seagate HDDs).

      • When you do ls, what path do you do ls for?
It's not a problem specific to the "ls" command; it happens on every command (even on commands that do not exist) and on other disk operations. If I repeat the command it gets faster, but it is still too slow.

      • What is the output of mount?

      on the host system:

      sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
      proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
      devtmpfs on /dev type devtmpfs (rw,nosuid,size=2091400k,nr_inodes=522850,mode=755)
      securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)
      tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
      devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)
      tmpfs on /run type tmpfs (rw,nosuid,nodev,mode=755)
      tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755)
      cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd)
      pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime)
      cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
      cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory)
      cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)
      cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio)
      cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset)
      cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct)
      cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices)
      configfs on /sys/kernel/config type configfs (rw,relatime)
      /dev/md127p1 on / type ext3 (rw,relatime)
      mqueue on /dev/mqueue type mqueue (rw,relatime)
      debugfs on /sys/kernel/debug type debugfs (rw,relatime)
      xenfs on /proc/xen type xenfs (rw,relatime)
      xenstore on /var/lib/xenstored type tmpfs (rw,relatime,mode=755)
      /dev/md127p5 on /var/log type ext3 (rw,relatime)
      sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw,relatime)
      tmpfs on /run/user/0 type tmpfs (rw,nosuid,nodev,relatime,size=420744k,mode=700)

• Andi79 @Andi79

        @Andi79 xoa.png
This is the current state of the PC that hosts only 2 VMs. One VM runs BackupPC, and its backups are extremely slow (BackupPC does many file accesses). CPU usage is very low. I really suspect an IO problem, but I have no idea how to prove or solve it.
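A quick way to back up that suspicion from inside the guest is a direct-IO throughput test that bypasses the page cache, so the result reflects the virtual disk rather than RAM. This is only a sketch; the test file path /root/ddtest is arbitrary:

    # sequential write, bypassing the guest page cache
    dd if=/dev/zero of=/root/ddtest bs=1M count=512 oflag=direct
    # uncached sequential read of the same file
    dd if=/root/ddtest of=/dev/null bs=1M iflag=direct
    rm /root/ddtest

An SSD-backed VM should normally report hundreds of MB/s here; single-digit MB/s or multi-second stalls would point at the IO path.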

• Forza @Andi79

          @Andi79

What SR do you have set up for storing your VMs? Your mount output does not list /dev/mapper/XSLocalEXT-xxxx, as would be normal for a local SR, or an NFS mount used for shared storage. What do you have in /run/sr-mount/?

          Is this a plain xcp-ng installation or did you build from source?

          It is unusual that ls would be slow. It sounds like a networking issue, perhaps DNS timeout or something similar that is affecting things.
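For reference, the SR layout can also be checked from the dom0 command line instead of XOA; a minimal sketch using the standard xe tools:

    # list all storage repositories with their driver type
    xe sr-list params=uuid,name-label,type,content-type
    # file-based SRs (ext, nfs) are mounted here; LVM-based SRs are not
    ls /run/sr-mount/
    # LVM-based SRs show up as VG_XenStorage-* volume groups instead
    vgs

Note that an LVM SR never appears in the mount output, which would explain the missing /dev/mapper/XSLocalEXT-xxxx entry.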

• Andi79 @Andi79

Here is a video of the problem:

            https://we.tl/t-A8JF4EAWht

As you can see, everything is OK at the beginning; then it starts to stall when I run the apt update (not a network problem). When I press Ctrl+C on the wget it hangs, and also briefly on the rm. But this can also happen on an ls or when I enter a command that does not exist.

It's a plain XCP-ng installation with default values; I only added the second RAID with mdadm and attached it as an SR.
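For context, attaching an extra mdadm array as a local SR is normally done with xe sr-create. This is only an illustrative sketch; the name-label is made up, and type=lvm is assumed because the volume groups shown later in the thread are LVM-based:

    HOST_UUID=$(xe host-list --minimal)
    # create an LVM-backed local SR on the second RAID 1 array
    xe sr-create host-uuid=$HOST_UUID type=lvm content-type=user \
        name-label="Local HDD RAID1" device-config:device=/dev/md126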

• Forza @Andi79

@Andi79 said in XCP-NG VMs extremely slow:

              here is a Video of the problem
              https://we.tl/t-A8JF4EAWht

I can't download from there. But in any case, can you provide details of exactly how things are configured?

• Andi79 @Forza

@Forza I uploaded it again as a direct link:

                http://pzka.de/xen.mp4

                "It sounds like a networking issue"

How could network issues cause a problem like a freeze after an ls?

                "Can you provide the details exactly how things are configured?"

What exactly do you need to know? It's a standard installation: the SSDs were attached during installation as a RAID 1 on md127, and the HDDs were attached afterwards as md126. The VMs were migrated from other hosts, where they did not cause these problems.
The host system took 4 GB RAM by default.
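As a side note, dom0 memory pressure can itself slow down guest IO, because the storage datapath (tapdisk/blkback) runs in the control domain. A quick check, run in dom0 (a sketch, not specific to this setup):

    # memory assigned to the control domain
    xe vm-list is-control-domain=true params=name-label,memory-static-max,memory-dynamic-max
    # free memory and swap usage inside dom0
    free -m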

• Forza @Andi79

You did not show any SR details. You can see those in XOA -> Hosts -> (your host) -> Storage tab.

                  Example:
                  0a8d0788-7efc-4dbd-ab11-76f34529f36a-image.png

Also, for your VM: what storage does it use?
                  Example:
                  5ff7b548-176d-4ce7-9e6b-73ced2a2beec-image.png
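If screenshots are awkward to share, the same VM-to-SR mapping can be pulled from the dom0 CLI; a sketch where "myvm" is a placeholder VM name and the VDI UUID comes from the first command's output:

    # VBDs/VDIs attached to a given VM
    xe vm-disk-list vm=myvm
    # SR that holds a given VDI
    xe vdi-list uuid=<vdi-uuid> params=name-label,sr-name-label,virtual-size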

• Andi79

                    storage.png

Here is the storage list.

• hoerup @Andi79

                      @Andi79
Is the problem occurring both when the VMs are on HDD and on SSD storage, or only on HDD?

• Andi79 @hoerup

@hoerup On both. I already thought this could be caused by some idle behaviour of the HDDs, but the same problem occurs on the SSDs, so that cannot be the cause.

• Forza @Andi79

                          Can you try ext/thin instead of LVM? Not that this should matter that much.
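For reference, an ext (thin-provisioned, file-based) SR is created the same way as an LVM one, just with a different type. A sketch only, since the array would first have to be detached from its existing SR; the device node and name-label here are placeholders:

    HOST_UUID=$(xe host-list --minimal)
    # file-based SR: VDIs are stored as sparse VHD files under /run/sr-mount/<sr-uuid>
    xe sr-create host-uuid=$HOST_UUID type=ext content-type=user \
        name-label="Local ext SR" device-config:device=/dev/md126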

• Forza @Forza

                            Looking at the video it seems the writes get queued up (Dirty in /proc/meminfo). I wonder why writes are so slow.

                            What filesystems do you use on the guest?
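To watch that queueing live inside the guest, the dirty and writeback counters can be polled; a minimal sketch:

    # Dirty = data waiting to be written, Writeback = data currently being flushed
    watch -n 1 'grep -E "^(Dirty|Writeback):" /proc/meminfo'

If Dirty keeps growing while Writeback barely moves, the guest is generating writes faster than the virtual disk accepts them.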

• Andi79 @Forza

                              @Forza

This was the default from XCP-ng during installation. The problem is that there are already VMs on this machine and I can't reformat it without causing problems (there are no other hosts in this data center where I could use the same IPs).

The guest system uses ext3.

• Andi79 @Andi79

Some more info:

                                #cat /proc/mdstat  
                                Personalities : [raid1] 
                                md126 : active raid1 sdb[1] sda[0]
                                      3906886464 blocks super 1.2 [2/2] [UU]
                                      bitmap: 3/30 pages [12KB], 65536KB chunk
                                
                                md127 : active raid1 sdd[1] sdc[0]
                                      937692352 blocks super 1.0 [2/2] [UU]
                                      bitmap: 1/7 pages [4KB], 65536KB chunk
                                
                                
                                #mdadm --detail /dev/md126
                                /dev/md126:
                                           Version : 1.2
                                     Creation Time : Sat Jun  4 12:08:56 2022
                                        Raid Level : raid1
                                        Array Size : 3906886464 (3725.90 GiB 4000.65 GB)
                                     Used Dev Size : 3906886464 (3725.90 GiB 4000.65 GB)
                                      Raid Devices : 2
                                     Total Devices : 2
                                       Persistence : Superblock is persistent
                                
                                     Intent Bitmap : Internal
                                
                                       Update Time : Tue Jun 14 19:55:36 2022
                                             State : clean 
                                    Active Devices : 2
                                   Working Devices : 2
                                    Failed Devices : 0
                                     Spare Devices : 0
                                
                                Consistency Policy : bitmap
                                
                                              Name : server2-neu:md126  (local to host server2)
                                              UUID : 784c25d6:18f3a0c2:ca8fe399:d16ec0e2
                                            Events : 35383
                                
                                    Number   Major   Minor   RaidDevice State
                                       0       8        0        0      active sync   /dev/sda
                                       1       8       16        1      active sync   /dev/sdb
                                
                                #mdadm --detail /dev/md127
                                /dev/md127:
                                           Version : 1.0
                                     Creation Time : Sat Jun  4 09:56:54 2022
                                        Raid Level : raid1
                                        Array Size : 937692352 (894.25 GiB 960.20 GB)
                                     Used Dev Size : 937692352 (894.25 GiB 960.20 GB)
                                      Raid Devices : 2
                                     Total Devices : 2
                                       Persistence : Superblock is persistent
                                
                                     Intent Bitmap : Internal
                                
                                       Update Time : Tue Jun 14 19:56:15 2022
                                             State : clean 
                                    Active Devices : 2
                                   Working Devices : 2
                                    Failed Devices : 0
                                     Spare Devices : 0
                                
                                Consistency Policy : bitmap
                                
                                              Name : localhost:127
                                              UUID : b5ab10b2:b89109af:9f4a274a:d7af50b3
                                            Events : 4450
                                
                                    Number   Major   Minor   RaidDevice State
                                       0       8       32        0      active sync   /dev/sdc
                                       1       8       48        1      active sync   /dev/sdd
                                
                                #vgdisplay
                                  Device read short 82432 bytes remaining
                                  Device read short 65536 bytes remaining
                                  --- Volume group ---
                                  VG Name               VG_XenStorage-cd8f9061-df06-757c-efb6-4ada0927a984
                                  System ID             
                                  Format                lvm2
                                  Metadata Areas        1
                                  Metadata Sequence No  93
                                  VG Access             read/write
                                  VG Status             resizable
                                  MAX LV                0
                                  Cur LV                5
                                  Open LV               3
                                  Max PV                0
                                  Cur PV                1
                                  Act PV                1
                                  VG Size               <3,64 TiB
                                  PE Size               4,00 MiB
                                  Total PE              953826
                                  Alloc PE / Size       507879 / <1,94 TiB
                                  Free  PE / Size       445947 / 1,70 TiB
                                  VG UUID               yqsyV9-h2Gl-5lMf-486M-BI7f-r3Ar-Todeh9
                                   
                                  --- Volume group ---
                                  VG Name               VG_XenStorage-c29b2189-edf2-8349-d964-381431c48be1
                                  System ID             
                                  Format                lvm2
                                  Metadata Areas        1
                                  Metadata Sequence No  25
                                  VG Access             read/write
                                  VG Status             resizable
                                  MAX LV                0
                                  Cur LV                1
                                  Open LV               0
                                  Max PV                0
                                  Cur PV                1
                                  Act PV                1
                                  VG Size               <852,74 GiB
                                  PE Size               4,00 MiB
                                  Total PE              218301
                                  Alloc PE / Size       1 / 4,00 MiB
                                  Free  PE / Size       218300 / 852,73 GiB
                                  VG UUID               4MFkwD-1JW1-zVE3-QFKf-XmOX-QsSf-60oCKZ
                                
• Andi79 @Andi79

                                  io.png

OK... it really seems to be an IO problem. Any ideas what could cause this?
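One way to narrow it down is to measure per-device latency and utilisation in dom0 while a VM is slow (iostat is part of sysstat, which dom0 should normally have); a sketch:

    # extended statistics every 2 seconds:
    # watch await/w_await and %util on sda-sdd and on the md devices
    iostat -x 2

High await on the physical disks would point at the drives or the RAID layer; low await there while the VM stalls would point at the tapdisk/blkback path in dom0.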

• olivierlambert (Vates 🪐 Co-Founder CEO)

@fohdeesha does this ring a bell?

• Andi79 @Forza

                                      @Forza

Now I noticed that the jbd2 and kworker processes have gone... but the system is still extremely slow.

                                      https://www.pzka.de/xen2.mp4

In this video I try to install Munin to collect data for analysing the problem. As you can see, absolutely nothing happens (this can take many minutes now). There is an rsync "running" at a very slow speed, but as you can see CPU usage is very low and top also reports no system load.

                                      I think it must be a problem with xcp-ng, but I have no idea what it could be.
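Since the guest itself shows no load, the next place to look is dom0, where the actual disk work is done by the tapdisk processes. A sketch of what could be checked there:

    # per-domain CPU time and VBD read/write counters, refreshed live
    xentop
    # tapdisk processes serving the VM disks; constant high CPU or D state here is suspicious
    top -c -p $(pgrep -d, tapdisk)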

• olivierlambert (Vates 🪐 Co-Founder CEO)

Can you run dmesg, and also smartctl -a /dev/sdb and smartctl -a /dev/sda?

• Andi79 @olivierlambert

                                          @olivierlambert

Output of dmesg (on the VM):

                                          [132514.270681] systemd[1]: Starting Journal Service...
                                          [132604.475796] systemd[1]: systemd-journald.service: start operation timed out. Terminating.
                                          [132694.707996] systemd[1]: systemd-journald.service: State 'stop-sigterm' timed out. Killing.
                                          [132694.708039] systemd[1]: systemd-journald.service: Killing process 60936 (systemd-journal) with signal SIGKILL.
                                          [132784.940258] systemd[1]: systemd-journald.service: Processes still around after SIGKILL. Ignoring.
                                          [132797.289939] systemd[1]: systemd-journald.service: Main process exited, code=killed, status=9/KILL
                                          [132797.289947] systemd[1]: systemd-journald.service: Failed with result 'timeout'.
                                          [132797.290322] systemd[1]: Failed to start Journal Service.
                                          [132797.291608] systemd[1]: systemd-journald.service: Scheduled restart job, restart counter is at 18.
                                          [132797.291833] systemd[1]: Stopped Journal Service.
                                          [132797.324750] systemd[1]: Starting Journal Service...
                                          [132808.130140] systemd-journald[61002]: File /var/log/journal/eb029b8cf0534f998db52d5afecd252b/system.journal corrupted or uncleanly shut down, renaming and replacing.
                                          [132817.090264] systemd[1]: Started Journal Service.
                                          [158702.774840] VFS: busy inodes on changed media sr0
                                          
                                          

I haven't installed smartmontools on the VMs, but I can do that. Since the video, the apt install of Munin has now reached 4%. It takes very, very long.

• olivierlambert (Vates 🪐 Co-Founder CEO)

                                            Not in the VM, in the Dom0 please 🙂
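For completeness, the requested checks run in dom0 would look roughly like this (assuming smartmontools is present in dom0, which it normally is):

    # kernel messages in the control domain; look for ata, md or IO errors
    dmesg | less
    # full SMART report for each member disk of the arrays
    smartctl -a /dev/sda
    smartctl -a /dev/sdb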
