XCP-ng


    XCP-NG vm's extremly slow

      Forza @Andi79 last edited by

      @Andi79

      • What kind of storage do you use for your VMs?
      • When you run ls, which path are you listing?
      • What is the output of mount?
        Andi79 @Forza last edited by

        • What kind of storage do you use for your VMs?

        Both machines have two RAID1 arrays, with 2x Seagate Guardian BarraCuda ST4000LM024 and 2x Intel Solid-State Drive D3-S4610 Series 960GB.

        There is no problem on the host, only in the VMs (and the problem occurs both in VMs on the SSDs and in others on the Seagate drives).

        • When you run ls, which path are you listing?
          It's not specific to the "ls" command; it happens with every command (even commands that do not exist) and with other disk operations. If I repeat the command it gets faster, but it is still too slow.

        • What is the output of mount?

        on the host system:

        sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
        proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
        devtmpfs on /dev type devtmpfs (rw,nosuid,size=2091400k,nr_inodes=522850,mode=755)
        securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)
        tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
        devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)
        tmpfs on /run type tmpfs (rw,nosuid,nodev,mode=755)
        tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755)
        cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd)
        pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime)
        cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
        cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory)
        cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)
        cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio)
        cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset)
        cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct)
        cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices)
        configfs on /sys/kernel/config type configfs (rw,relatime)
        /dev/md127p1 on / type ext3 (rw,relatime)
        mqueue on /dev/mqueue type mqueue (rw,relatime)
        debugfs on /sys/kernel/debug type debugfs (rw,relatime)
        xenfs on /proc/xen type xenfs (rw,relatime)
        xenstore on /var/lib/xenstored type tmpfs (rw,relatime,mode=755)
        /dev/md127p5 on /var/log type ext3 (rw,relatime)
        sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw,relatime)
        tmpfs on /run/user/0 type tmpfs (rw,nosuid,nodev,relatime,size=420744k,mode=700)

          Andi79 @Andi79 last edited by

          @Andi79 xoa.png
          This is the current view from the PC that only hosts 2 VMs. One VM runs BackupPC, which is currently running (extremely slow backups; BackupPC does many file accesses). CPU usage is very low. I really suspect an IO problem, but I have no idea how to prove or solve it.
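
          A quick way to look for IO wait inside the guest could be vmstat (usually installed by default); a sketch:

          # "wa" is CPU time spent waiting on IO, "b" is the number of processes blocked on IO
          vmstat 5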

            Forza @Andi79 last edited by

            @Andi79

            What SR do you have set up for storing your VMs? Your mount output lists neither /dev/mapper/XSLocalEXT-xxxx, as would be normal for a local SR, nor an NFS mount used for shared storage. What do you have in /run/sr-mount/?
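
            From the host console something like this would show the SRs and what is actually mounted (a rough sketch; names and uuids will differ):

            # list the storage repositories known to this host
            xe sr-list params=uuid,name-label,type,content-type
            # file-based SRs (ext, NFS) are mounted under /run/sr-mount/<sr-uuid>
            ls /run/sr-mount/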

            Is this a plain xcp-ng installation or did you build from source?

            It is unusual that ls would be slow. It sounds like a networking issue, perhaps DNS timeout or something similar that is affecting things.
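
            A quick way to rule out DNS timeouts in the guest would be to time a name lookup against a command that does no network IO at all (the hostname is just an example):

            # a DNS problem would make the lookup hang; a general IO problem also slows the local command
            time getent hosts ftp.debian.org
            time /bin/true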

              Andi79 @Andi79 last edited by

              Here is a video of the problem:

              https://we.tl/t-A8JF4EAWht

              As you can see, in the beginning everything is OK, then it starts to stall when I run the apt update (not a network problem). When I press Ctrl+C on the wget it hangs, and also for a short moment on the rm. But this could just as well happen on an ls or when I enter a command that does not exist.

              It's a plain XCP-ng installation with default values; I only added the second RAID with mdadm and attached it as an SR.
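
              For reference, attaching the second array as a local LVM SR is roughly done like this (a sketch; the host uuid is a placeholder, /dev/md126 is the HDD array from this setup):

              # create a local LVM SR on the second RAID1 array
              xe sr-create host-uuid=<host-uuid> name-label="Local HDD RAID1" \
                  type=lvm content-type=user device-config:device=/dev/md126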

                Forza @Andi79 last edited by

                @Andi79 said in XCP-NG vm's extremly slow:

                here is a Video of the problem
                https://we.tl/t-A8JF4EAWht

                I can't download from there. But in any case, can you provide details of exactly how things are configured?

                  Andi79 @Forza last edited by

                  @Forza I uploaded it here again as a direct link:

                  http://pzka.de/xen.mp4

                  "It sounds like a networking issue"

                  How could network issues cause a problem like a freeze after an ls?

                  "Can you provide the details exactly how things are configured?"

                  What exactly do you need to know? It's a standard installation: the SSDs were set up during installation as a RAID1 on md127, and the HDDs were added afterwards as md126. The VMs were migrated from other hosts, where they did not cause this trouble.
                  The host system took 4 GB RAM by default.
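
                  To check whether the 4 GB control domain is short on memory, something like this on the host could help (a sketch):

                  # free memory inside dom0
                  free -m
                  # memory still available to the hypervisor
                  xl info | grep -E 'total_memory|free_memory'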

                    Forza @Andi79 last edited by

                    You did not show any SR details. You can see those in XOA -> Hosts -> (your host) -> Storage tab.

                    Example:
                    0a8d0788-7efc-4dbd-ab11-76f34529f36a-image.png

                    Also, your VM: what storage does it use?
                    Example:
                    5ff7b548-176d-4ce7-9e6b-73ced2a2beec-image.png
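
                    Roughly the same details can also be pulled on the host CLI (a sketch; the VM name is a placeholder):

                    # SR types and how full they are
                    xe sr-list params=uuid,name-label,type,physical-size,physical-utilisation
                    # the disks of a given VM and the SR each one lives on
                    xe vm-disk-list vm="<your-vm-name>"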

                      Andi79 last edited by

                      storage.png

                      here is the storage list

                        hoerup @Andi79 last edited by hoerup

                        @Andi79
                        Is the problem occurring both when the VMs are on HDD and on SSD storage, or only on HDD?

                          Andi79 @hoerup last edited by

                          @hoerup On both. I already thought it could be some idle behaviour of the HDDs, but the same problem occurs on the SSDs, so that cannot be the cause.

                            Forza @Andi79 last edited by

                            Can you try ext/thin instead of LVM? Not that this should matter that much.
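
                            For reference, a file-based ext SR (thin-provisioned VHD files) would be created roughly like this (a sketch; host uuid is a placeholder, and it needs a free device or the existing LVM SR would have to be removed first):

                            # thin-provisioned, file-based SR instead of LVM
                            xe sr-create host-uuid=<host-uuid> name-label="Local EXT" \
                                type=ext content-type=user device-config:device=/dev/md126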

                              Forza @Forza last edited by

                              Looking at the video it seems the writes get queued up (Dirty in /proc/meminfo). I wonder why writes are so slow.
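
                              A simple way to watch that in the guest while it is slow (a sketch):

                              # Dirty growing while Writeback stays near zero means data is not being flushed to disk
                              watch -n1 "grep -E '^(Dirty|Writeback):' /proc/meminfo"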

                              What filesystems do you use on the guest?

                                Andi79 @Forza last edited by

                                @Forza

                                This was the default from XCP-ng during installation. The problem is that there already are VMs on this machine and I can't reformat it without causing problems (there are no other hosts in this data centre where I could use the same IPs).

                                The guest system uses ext3.

                                  Andi79 @Andi79 last edited by olivierlambert

                                  Some more info:

                                  #cat /proc/mdstat  
                                  Personalities : [raid1] 
                                  md126 : active raid1 sdb[1] sda[0]
                                        3906886464 blocks super 1.2 [2/2] [UU]
                                        bitmap: 3/30 pages [12KB], 65536KB chunk
                                  
                                  md127 : active raid1 sdd[1] sdc[0]
                                        937692352 blocks super 1.0 [2/2] [UU]
                                        bitmap: 1/7 pages [4KB], 65536KB chunk
                                  
                                  
                                  #mdadm --detail /dev/md126
                                  /dev/md126:
                                             Version : 1.2
                                       Creation Time : Sat Jun  4 12:08:56 2022
                                          Raid Level : raid1
                                          Array Size : 3906886464 (3725.90 GiB 4000.65 GB)
                                       Used Dev Size : 3906886464 (3725.90 GiB 4000.65 GB)
                                        Raid Devices : 2
                                       Total Devices : 2
                                         Persistence : Superblock is persistent
                                  
                                       Intent Bitmap : Internal
                                  
                                         Update Time : Tue Jun 14 19:55:36 2022
                                               State : clean 
                                      Active Devices : 2
                                     Working Devices : 2
                                      Failed Devices : 0
                                       Spare Devices : 0
                                  
                                  Consistency Policy : bitmap
                                  
                                                Name : server2-neu:md126  (local to host server2)
                                                UUID : 784c25d6:18f3a0c2:ca8fe399:d16ec0e2
                                              Events : 35383
                                  
                                      Number   Major   Minor   RaidDevice State
                                         0       8        0        0      active sync   /dev/sda
                                         1       8       16        1      active sync   /dev/sdb
                                  
                                  #mdadm --detail /dev/md127
                                  /dev/md127:
                                             Version : 1.0
                                       Creation Time : Sat Jun  4 09:56:54 2022
                                          Raid Level : raid1
                                          Array Size : 937692352 (894.25 GiB 960.20 GB)
                                       Used Dev Size : 937692352 (894.25 GiB 960.20 GB)
                                        Raid Devices : 2
                                       Total Devices : 2
                                         Persistence : Superblock is persistent
                                  
                                       Intent Bitmap : Internal
                                  
                                         Update Time : Tue Jun 14 19:56:15 2022
                                               State : clean 
                                      Active Devices : 2
                                     Working Devices : 2
                                      Failed Devices : 0
                                       Spare Devices : 0
                                  
                                  Consistency Policy : bitmap
                                  
                                                Name : localhost:127
                                                UUID : b5ab10b2:b89109af:9f4a274a:d7af50b3
                                              Events : 4450
                                  
                                      Number   Major   Minor   RaidDevice State
                                         0       8       32        0      active sync   /dev/sdc
                                         1       8       48        1      active sync   /dev/sdd
                                  
                                  #vgdisplay
                                    Device read short 82432 bytes remaining
                                    Device read short 65536 bytes remaining
                                    --- Volume group ---
                                    VG Name               VG_XenStorage-cd8f9061-df06-757c-efb6-4ada0927a984
                                    System ID             
                                    Format                lvm2
                                    Metadata Areas        1
                                    Metadata Sequence No  93
                                    VG Access             read/write
                                    VG Status             resizable
                                    MAX LV                0
                                    Cur LV                5
                                    Open LV               3
                                    Max PV                0
                                    Cur PV                1
                                    Act PV                1
                                    VG Size               <3,64 TiB
                                    PE Size               4,00 MiB
                                    Total PE              953826
                                    Alloc PE / Size       507879 / <1,94 TiB
                                    Free  PE / Size       445947 / 1,70 TiB
                                    VG UUID               yqsyV9-h2Gl-5lMf-486M-BI7f-r3Ar-Todeh9
                                     
                                    --- Volume group ---
                                    VG Name               VG_XenStorage-c29b2189-edf2-8349-d964-381431c48be1
                                    System ID             
                                    Format                lvm2
                                    Metadata Areas        1
                                    Metadata Sequence No  25
                                    VG Access             read/write
                                    VG Status             resizable
                                    MAX LV                0
                                    Cur LV                1
                                    Open LV               0
                                    Max PV                0
                                    Cur PV                1
                                    Act PV                1
                                    VG Size               <852,74 GiB
                                    PE Size               4,00 MiB
                                    Total PE              218301
                                    Alloc PE / Size       1 / 4,00 MiB
                                    Free  PE / Size       218300 / 852,73 GiB
                                    VG UUID               4MFkwD-1JW1-zVE3-QFKf-XmOX-QsSf-60oCKZ
                                  
                                    Andi79 @Andi79 last edited by

                                    io.png

                                    OK... it really seems to be an IO problem. Any ideas what could cause this?
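
                                    To narrow down whether the physical arrays or the virtualisation layer are slow, something like this on the host might help (a sketch; iostat comes from the sysstat package and may need to be installed in dom0 first):

                                    # per-device latency and utilisation while a VM is slow;
                                    # high await/%util on md126/md127 or the underlying sd* devices points at the disks
                                    iostat -x 5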

                                      olivierlambert Vates 🪐 Co-Founder🦸 CEO 🧑‍💼 last edited by

                                      @fohdeesha does it ring a bell?

                                        Andi79 @Forza last edited by

                                        @Forza

                                        Now I noticed that the jbd2 and kworker processes have gone, but the system is still extremely slow.

                                        https://www.pzka.de/xen2.mp4

                                        In this video I try to install Munin to collect data for analysing the problem. As you can see, absolutely nothing happens (this can take many minutes now). There is an rsync "running" at a very slow speed, but as you can see CPU usage is ultra low and top also reports no system load.
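
                                        While it hangs it could also be worth checking whether processes are stuck in uninterruptible IO wait (a sketch, run in the guest):

                                        # processes in state "D" are blocked on IO; wchan shows what they are waiting on
                                        ps -eo state,pid,wchan,comm | awk '$1 == "D"'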

                                        I think it must be a problem with xcp-ng, but I have no idea what it could be.

                                          olivierlambert Vates 🪐 Co-Founder🦸 CEO 🧑‍💼 last edited by

                                          Can you do a dmesg and also a smartctl -a /dev/sdb and smartctl -a /dev/sda?

                                            Andi79 @olivierlambert last edited by Andi79

                                            @olivierlambert

                                            Output of dmesg (on the VM):

                                            [132514.270681] systemd[1]: Starting Journal Service...
                                            [132604.475796] systemd[1]: systemd-journald.service: start operation timed out. Terminating.
                                            [132694.707996] systemd[1]: systemd-journald.service: State 'stop-sigterm' timed out. Killing.
                                            [132694.708039] systemd[1]: systemd-journald.service: Killing process 60936 (systemd-journal) with signal SIGKILL.
                                            [132784.940258] systemd[1]: systemd-journald.service: Processes still around after SIGKILL. Ignoring.
                                            [132797.289939] systemd[1]: systemd-journald.service: Main process exited, code=killed, status=9/KILL
                                            [132797.289947] systemd[1]: systemd-journald.service: Failed with result 'timeout'.
                                            [132797.290322] systemd[1]: Failed to start Journal Service.
                                            [132797.291608] systemd[1]: systemd-journald.service: Scheduled restart job, restart counter is at 18.
                                            [132797.291833] systemd[1]: Stopped Journal Service.
                                            [132797.324750] systemd[1]: Starting Journal Service...
                                            [132808.130140] systemd-journald[61002]: File /var/log/journal/eb029b8cf0534f998db52d5afecd252b/system.journal corrupted or uncleanly shut down, renaming and replacing.
                                            [132817.090264] systemd[1]: Started Journal Service.
                                            [158702.774840] VFS: busy inodes on changed media sr0
                                            
                                            

                                            I haven't installed smartmontools on the VMs, but I can do that. From the video until now, the apt run for installing Munin has reached 4%. It is taking very, very long.
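
                                            Note that SMART data is only visible on the host, where the physical disks are; the VMs only see virtual block devices. A sketch for dom0 (assuming smartctl is not already present there):

                                            # install and read SMART data on the host, not in the guests
                                            yum install -y smartmontools
                                            smartctl -a /dev/sda
                                            smartctl -a /dev/sdb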
