XCP-ng

    sr iso disconnect and crashed my hosts

    • Luiz Avelino

      I have 50 hosts running XCP-ng, and all of them use an SMB/CIFS ISO SR to access the ISOs for new installations when needed. Two days ago, all of the servers lost access to the share and the ISO SR became locked. I could not create, shut down, or start any VM on any host, and I could not detach the SR through either XCP-ng Center or the CLI. So I ran umount -l /run/sr-mount/UUID followed by xe-toolstack-restart, after which I was able to detach the SR and create, shut down, and start VMs again. The problem now is that the CPU load is very high, and I don't want to have to restart all 50 servers. Is there anything I can do to avoid rebooting them?

      • olivierlambert, Vates 🪐 Co-Founder CEO

        Hi,

        If you use remote storage for VM disks or ISOs, you must make sure the connection is stable. Otherwise, running into issues is to be expected. It's like physically cutting the SATA cable to your DVD/CD drive. You need to understand why you lost the connection to your SMB/CIFS share in the first place.

        • Luiz Avelino @olivierlambert

          @olivierlambert

          OK. Is there anything I can do about the hosts? They are all showing high CPU consumption after this problem.

          • Luiz Avelino @olivierlambert

            @olivierlambert

            Even after things returned to normal, these processes still exist and I can't kill them.

            3874 ?        Ds     0:00 /usr/bin/python /opt/xensource/sm/ISOSR <methodCall><methodName>sr_scan</methodName><params><param><value><struct><member><name>host_ref</name><value>OpaqueRef:e64c8226-4c5d-431b-b7d7-e47b0d657348</value></member><member><name>command</name><value>sr_scan</value></member><member><name>args</name><value><array><data/></array></value></member><member><name>device_config</name><value><struct><member><name>SRmaster</name><value>true</value></member><member><name>username</name><value>fmcudi\luiz.avelino</value></member><member><name>vers</name><value>3.0</value></member><member><name>cifspassword_secret</name><value>83ff228e-4d7c-12d5-e685-fae07066b3ac</value></member><member><name>iso_path</name><value>/ISO</value></member><member><name>location</name><value>//10.40.2.235/repositorio</value></member><member><name>type</name><value>cifs</value></member></struct></value></member><member><name>session_ref</name><value>OpaqueRef:77a8540e-8149-4889-8148-ff5ea9a2d7f1</value></member><member><name>sr_ref</name><value>OpaqueRef:72cda44d-eccb-4b11-8400-1cc9ad6e400b</value></member><member><name>sr_uuid</name><value>6a09db76-744c-4123-18ef-7e423d0bcad6</value></member><member><name>subtask_of</name><value>DummyRef:|32004b14-6e20-4a66-bfa5-f2dd6b3472d9|SR.scan</value></member></struct></value></param></params></methodCall>
            14091 ?        D      0:00      \_ df -h
            14722 ?        D      0:00      \_ mount -o remount /run/sr-mount/6a09db76-744c-4123-18ef-7e423d0bcad6
            22275 ?        Ds     0:00 /usr/bin/python /opt/xensource/sm/ISOSR <methodCall><methodName>sr_scan</methodName><params><param><value><struct><member><name>host_ref</name><value>OpaqueRef:e64c8226-4c5d-431b-b7d7-e47b0d657348</value></member><member><name>command</name><value>sr_scan</value></member><member><name>args</name><value><array><data/></array></value></member><member><name>device_config</name><value><struct><member><name>SRmaster</name><value>true</value></member><member><name>username</name><value>fmcudi\luiz.avelino</value></member><member><name>vers</name><value>3.0</value></member><member><name>cifspassword_secret</name><value>83ff228e-4d7c-12d5-e685-fae07066b3ac</value></member><member><name>iso_path</name><value>/ISO</value></member><member><name>location</name><value>//10.40.2.235/repositorio</value></member><member><name>type</name><value>cifs</value></member></struct></value></member><member><name>session_ref</name><value>OpaqueRef:5bf06125-0518-43cc-99dc-8d0c235a0e84</value></member><member><name>sr_ref</name><value>OpaqueRef:72cda44d-eccb-4b11-8400-1cc9ad6e400b</value></member><member><name>sr_uuid</name><value>6a09db76-744c-4123-18ef-7e423d0bcad6</value></member><member><name>subtask_of</name><value>DummyRef:|73124cc8-678a-406d-8bca-4a22ad1178a4|SR.scan</value></member></struct></value></param></params></methodCall>
            25031 ?        D      0:00  \_ mount.cifs //10.40.2.235/repositorio /var/run/sr-mount/6a09db76-744c-4123-18ef-7e423d0bcad6 -o cache=none,vers=3.0,domain=fmcudi
            27964 ?        D      0:00  \_ mount.cifs //10.40.2.235/repositorio /var/run/sr-mount/6a09db76-744c-4123-18ef-7e423d0bcad6 -o cache=none,vers=3.0,domain=fmcudi
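
A note on the listing above: the `D` in the STAT column means uninterruptible sleep. The task is blocked inside the kernel (here, apparently on the CIFS mount) and generally does not act on signals until the blocked call returns, which would explain why the processes cannot be killed. As a rough sketch for finding such tasks on a host, the Python 3 snippet below walks `/proc` and prints every process in state D together with its kernel wait channel. The helper names `parse_stat` and `uninterruptible_tasks` are made up for this illustration, and it only looks at each process's main thread, not at individual threads.

```python
import os


def parse_stat(stat_text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    left = stat_text.index("(")
    right = stat_text.rindex(")")
    pid = int(stat_text[:left].strip())
    comm = stat_text[left + 1:right]
    state = stat_text[right + 1:].split()[0]
    return pid, comm, state


def uninterruptible_tasks(proc_root="/proc"):
    """Yield (pid, comm, wchan) for every process currently in state D."""
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "meminfo"
        base = os.path.join(proc_root, entry)
        try:
            with open(os.path.join(base, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError):
            continue  # process exited while we were reading it
        if state != "D":
            continue
        try:
            with open(os.path.join(base, "wchan")) as f:
                wchan = f.read().strip() or "-"
        except OSError:
            wchan = "-"  # wait channel not readable; report it as unknown
        yield pid, comm, wchan


if __name__ == "__main__":
    for pid, comm, wchan in uninterruptible_tasks():
        print("{:>7} {:<16} {}".format(pid, comm, wchan))
```

Run as root on an affected host, this should list the same kind of entries as the `ps` output above (ISOSR, mount.cifs, and so on) without the long XML-RPC arguments.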
            
            • olivierlambert, Vates 🪐 Co-Founder CEO

              Double-check that your hosts are fully up to date (which version? You haven't provided much useful information about your environment in your first post 😢).

              • Luiz Avelino @olivierlambert

                @olivierlambert

                They are all on XCP-ng 8.2 and are not up to date.

                • olivierlambert, Vates 🪐 Co-Founder CEO

                  You should really start by getting all your hosts up to date: apply the updates, reboot, and see whether it happens again, while also fixing the connectivity problem with your network shares.

                  • Luiz Avelino @olivierlambert

                    @olivierlambert

                    The connection problem is already fixed. I didn't want to have to update or restart the hosts at this time.

                    • olivierlambert, Vates 🪐 Co-Founder CEO

                      Then reboot in the next maintenance window 🙂

                      • Luiz Avelino @olivierlambert

                        @olivierlambert

                        The connectivity problem with the ISO SR was fixed, but the server load is still high and there are several processes stuck in sleep.

                        top - 12:29:57 up 273 days, 13:48,  2 users,  load average: 166.86, 166.80, 166.56
                        Tasks: 763 total,   2 running, 618 sleeping,   0 stopped,   0 zombie
                        %Cpu(s):  1.1 us,  1.2 sy,  0.0 ni, 96.1 id,  0.2 wa,  0.0 hi,  0.1 si,  1.2 st
                        KiB Mem :  8110880 total,  6877372 free,   783152 used,   450356 buff/cache
                        
                        
                        
                        
                           1 22275 root     Ds    0.0 ISOSR           -
                        22691 22701 root     Ds    0.0 sadc            smb2_reconnect
                        23099 23104 root     Ds    0.0 sadc            smb2_reconnect
                        23232 23243 root     Ds    0.0 sadc            smb2_reconnect
                        23259 23264 root     Ds    0.0 sadc            smb2_reconnect
                        23284 23292 root     Ds    0.0 sadc            smb2_reconnect
                        23381 23386 root     Ds    0.0 sadc            smb2_reconnect
                        23977 23983 root     Ds    0.0 sadc            smb2_reconnect
                        24062 24074 root     Ds    0.0 sadc            smb2_reconnect
                        24065 24077 root     Ds    0.0 sadc            smb2_reconnect
                        24089 24094 root     Ds    0.0 sadc            smb2_reconnect
                        24120 24125 root     Ds    0.0 sadc            smb2_reconnect
                        24792 24797 root     Ds    0.0 sadc            smb2_reconnect
                        24959 24970 root     Ds    0.0 sadc            smb2_reconnect
                        25018 25028 root     Ds    0.0 sadc            smb2_reconnect
                            1 25031 root     D     0.0 mount.cifs      cifs_get_smb_ses
                        25195 25200 root     Ds    0.0 sadc            smb2_reconnect
                        25279 25289 root     Ds    0.0 sadc            smb2_reconnect
                        25742 25747 root     Ds    0.0 sadc            smb2_reconnect
                        26013 26018 root     Ds    0.0 sadc            smb2_reconnect
                        26137 26142 root     Ds    0.0 sadc            smb2_reconnect
                        26266 26276 root     Ds    0.0 sadc            smb2_reconnect
                        26494 26505 root     Ds    0.0 sadc            smb2_reconnect
                        26519 26524 root     Ds    0.0 sadc            smb2_reconnect
                        26975 26981 root     Ds    0.0 sadc            smb2_reconnect
                        27006 27014 root     Ds    0.0 sadc            smb2_reconnect
                        27636 27641 root     Ds    0.0 sadc            smb2_reconnect
                            1 27964 root     D     0.0 mount.cifs      cifs_get_smb_ses
                        27966 27983 root     Ds    0.0 sadc            smb2_reconnect
                        28127 28139 root     Ds    0.0 sadc            smb2_reconnect
                        28138 28143 root     Ds    0.0 sadc            smb2_reconnect
                        28182 28187 root     Ds    0.0 sadc            smb2_reconnect
                        28339 28350 root     Ds    0.0 sadc            smb2_reconnect
                        28937 28942 root     Ds    0.0 sadc            smb2_reconnect
                        29031 29036 root     Ds    0.0 sadc            smb2_reconnect
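
A note on the numbers above: on Linux, the load average counts tasks in uninterruptible sleep (state D) as well as runnable ones, which is consistent with a load near 166 while top reports the CPU as about 96% idle. As a small sketch, the Python snippet below tallies the blocked tasks in a listing like this one by command and kernel wait channel. The column order (PPID, PID, USER, STAT, %CPU, COMMAND, WCHAN) is an assumption, since the header row is missing from the paste, and `blocked_by_wchan` is a hypothetical helper name.

```python
from collections import Counter

# Three rows copied from the listing above, used as sample input.
SAMPLE = """\
    1 22275 root     Ds    0.0 ISOSR           -
22691 22701 root     Ds    0.0 sadc            smb2_reconnect
    1 25031 root     D     0.0 mount.cifs      cifs_get_smb_ses
"""


def blocked_by_wchan(listing):
    """Count D-state rows per (command, wait channel) pair."""
    counts = Counter()
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) != 7:
            continue  # skip blank lines and anything not in the assumed layout
        _ppid, _pid, _user, stat, _cpu, comm, wchan = fields
        if stat.startswith("D"):  # "D" or "Ds": uninterruptible sleep
            counts[(comm, wchan)] += 1
    return counts


if __name__ == "__main__":
    for (comm, wchan), n in blocked_by_wchan(SAMPLE).most_common():
        print("{:>4}  {:<12} {}".format(n, comm, wchan))
```

Fed the full listing, a tally like this makes it easy to see that nearly all of the blocked tasks are `sadc` instances waiting in `smb2_reconnect`, that is, on the CIFS client rather than on the CPU.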
                        