XCP-ng

    I cannot save due to VDI Chain error

    • Nephilim

      One of our 5 servers went down, and now our VMs can't be saved (backed up) because of a VDI chain error, and VMs occasionally restart.
      (The VMs were not stored on the server that failed.)
      There is also one VM that is stuck and that I can't delete; I assume it is related.
      I read on the forum that copying and migrating the VM works as a temporary fix, but after a few days it fails again.
      We are now going to put a freshly installed server in place of the one that went down. I've had this problem for a few weeks and haven't found a solution. Can anyone help me? 🙂

      [Attached screenshots: tasks.png, vdi_chain.png, orphan.png]

      • Danp (Pro Support Team)

        Look under Dashboard > Health tab in XO to see if you have VDIs listed in the Unhealthy VDIs section. You may have a storage issue that is preventing the coalesce process from completing successfully.
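
For a quick look from the command line as well, one hedged option is to search the storage manager log for coalesce activity. This is only a sketch: it assumes the log is at /var/log/SMlog on the host and that coalesce-related messages contain the word "coalesce". The helper name `recent_coalesce` is made up for illustration.

```shell
# Sketch: print the last N log lines that mention "coalesce".
# recent_coalesce is a hypothetical helper, not an XCP-ng or XO tool.
recent_coalesce() {
    log="${1:-/var/log/SMlog}"   # assumed default log location
    n="${2:-20}"                 # how many matching lines to show
    [ -r "$log" ] || { echo "cannot read $log" >&2; return 2; }
    # Case-insensitive match; "|| true" keeps "no match" from counting as an error
    { grep -i coalesce "$log" || true; } | tail -n "$n"
    return 0
}
```

Running `recent_coalesce` with no arguments on a host would show the 20 most recent matching lines, which may help when comparing against what the Health tab lists.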

        • Nephilim

          @Danp
          Sorry for the late reply. This is what the interface shows. I also tried "Rescan all disks".

          [Screenshot: 84ab07e1-f503-4370-aaf4-c766ad557440-kép.png]

          • Nephilim

            If this helps: Current version: 5.95.2

            • Nephilim

              I also restarted the toolstack on each of our servers.

              • Danp (Pro Support Team)

                @Nephilim You should check the contents of SMlog on the pool master. You can use the following command to locate any exceptions:

                grep -A 5 -B 5 -i exception /var/log/SMlog
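
As a hedged variation on the command above, the sketch below wraps the same grep in a small shell function so that it also reports how many lines matched and can be pointed at another log file. The function name `sm_exceptions` is hypothetical, and the rotated file name in the usage note is an assumption about how the host rotates its logs.

```shell
# Sketch: show "exception" matches in an SM log with 5 lines of context,
# after printing how many lines matched. sm_exceptions is a hypothetical helper.
sm_exceptions() {
    log="${1:-/var/log/SMlog}"   # default path taken from the command above
    if [ ! -r "$log" ]; then
        echo "cannot read $log" >&2
        return 2
    fi
    # grep -c prints the number of matching lines (0 when there are none)
    count=$(grep -c -i exception "$log" || true)
    echo "$count matching line(s) in $log"
    # Same search as the original command: 5 lines of context on each side
    grep -A 5 -B 5 -i exception "$log" || true
    return 0
}
```

For example, `sm_exceptions /var/log/SMlog.1` would check an older log, if a file with that name exists on the host. A count of 0 simply means the word "exception" does not appear in that file.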

                • Nephilim

                  [Screenshot: 08f12efb-e9f9-4ab7-bcfc-79c30724fb0b-kép.png]

                  [Screenshot: 1500eb19-23fe-45a3-876d-636e587cbdb2-kép.png]

                  It suddenly fixed itself. I'll write back if I find anything in the log.

                  • Nephilim

                    The number of disks was also over 200 (and it has decreased to the number shown above).

                    • olivierlambert (Vates, Co-Founder, CEO)

                      So everything is OK now?

                      • Nephilim

                        @olivierlambert Yes 🙂

                        • Nephilim

                          @Danp The grep command does not return anything.
                          [Screenshot: 4a1967ff-6f47-4eb0-8d32-96fcf479ce78-kép.png]

                          [Screenshot: 509ba24d-0670-4bb9-ad8f-7dab09cd1854-kép.png]

                          [Screenshot: 8e6561ce-430e-4108-bdab-ca3cadffd4a3-kép.png]

                          • Nephilim

                            But this is still wrong. What can I do with this?

                            [Screenshot: 70e98265-06fa-4347-af8a-0ff3beaf62db-kép.png]

                            • Danp (Pro Support Team)

                              @Nephilim said in I cannot save due to VDI Chain error:

                              But this is still wrong. What can I do with this?

                              There isn't anything wrong on that screen. It simply is showing a list of XO tasks that (mostly) haven't succeeded.

                              • Nephilim

                                These entries were repeated in yesterday's log around the time the problem fixed itself, but I'd like to know what happened.

                                Sep 30 19:15:17 xcpng-Companyxyz SM: [4116] lock: released /var/lock/sm/xyID1/vdi
                                Sep 30 19:15:18 xcpng-Companyxyz SM: [4160] nfs-on-slave.check(/var/run/sr-mount/xyID2/xyID3.vhd)
                                Sep 30 19:15:20 xcpng-Companyxyz SM: [4184] nfs-on-slave.check(/var/run/sr-mount/xyID2/xyID4.vhd)
                                Sep 30 19:15:21 xcpng-Companyxyz SM: [4191] nfs-on-slave.check(/var/run/sr-mount/xyID2/xyID5.vhd)
                                Sep 30 19:15:21 xcpng-Companyxyz SM: [4200] nfs-on-slave.check(/var/run/sr-mount/xyID2/xyID6.vhd)
                                Sep 30 19:15:22 xcpng-Companyxyz SM: [4207] nfs-on-slave.check(/var/run/sr-mount/xyID2/xyID7.vhd)
                                Sep 30 19:15:23 xcpng-Companyxyz SM: [4220] nfs-on-slave.check(/var/run/sr-mount/xyID2/xyID8.vhd)
                                Sep 30 19:15:24 xcpng-Companyxyz SM: [4243] nfs-on-slave.check(/var/run/sr-mount/xyID2/xyID9.vhd)
                                Sep 30 19:15:25 xcpng-Companyxyz SM: [4250] nfs-on-slave.check(/var/run/sr-mount/xyID2/xyID10.vhd)
                                Sep 30 19:15:25 xcpng-Companyxyz SM: [4257] nfs-on-slave.check(/var/run/sr-mount/xyID2/xyID11.vhd)
                                Sep 30 19:15:26 xcpng-Companyxyz SM: [4264] nfs-on-slave.check(/var/run/sr-mount/xyID2/xyID12.vhd)
                                Sep 30 19:15:27 xcpng-Companyxyz SM: [4271] nfs-on-slave.check(/var/run/sr-mount/xyID2/xyID13.vhd)
                                Sep 30 19:15:27 xcpng-Companyxyz SM: [4287] nfs-on-slave.check(/var/run/sr-mount/xyID2/xyID14.vhd)
                                Sep 30 19:15:28 xcpng-Companyxyz SM: [4309] nfs-on-slave.check(/var/run/sr-mount/xyID2/xyID15.vhd)
                                Sep 30 19:15:29 xcpng-Companyxyz SM: [4323] nfs-on-slave.check(/var/run/sr-mount/xyID2/xyID16.vhd)
                                Sep 30 19:15:30 xcpng-Companyxyz SM: [4330] nfs-on-slave.check(/var/run/sr-mount/xyID2/xyID17.vhd)
                                Sep 30 19:15:30 xcpng-Companyxyz SM: [4339] nfs-on-slave.check(/var/run/sr-mount/xyID2/xyID18.vhd)
                                Sep 30 19:15:31 xcpng-Companyxyz SM: [4346] nfs-on-slave.check(/var/run/sr-mount/xyID2/xyID19.vhd)
                                Sep 30 19:15:32 xcpng-Companyxyz SM: [4353] nfs-on-slave.check(/var/run/sr-mount/xyID2/xyID20.vhd)
                                Sep 30 19:15:32 xcpng-Companyxyz SM: [4360] nfs-on-slave.check(/var/run/sr-mount/xyID2/xyID21.vhd)
                                Sep 30 19:15:33 xcpng-Companyxyz SM: [4373] nfs-on-slave.check(/var/run/sr-mount/xyID2/xyID22.vhd)
                                Sep 30 19:15:33 xcpng-Companyxyz SM: [4396] nfs-on-slave.check(/var/run/sr-mount/xyID2/xyID23.vhd)
                                Sep 30 19:19:56 xcpng-Companyxyz SM: [5749] lock: opening lock file /var/lock/sm/xyID24/vdi
                                Sep 30 19:19:56 xcpng-Companyxyz SM: [5749] lock: acquired /var/lock/sm/xyID25/vdi
                                Sep 30 19:19:56 xcpng-Companyxyz SM: [5749] Pause for xyID26
                                Sep 30 19:19:56 xcpng-Companyxyz SM: [5749] Calling tap pause with minor 8
                                Sep 30 19:19:56 xcpng-Companyxyz SM: [5749] ['/usr/sbin/tap-ctl', 'pause', '-p', '26306', '-m', '8']
                                Sep 30 19:19:56 xcpng-Companyxyz SM: [5749]  = 0
                                Sep 30 19:19:56 xcpng-Companyxyz SM: [5749] lock: released /var/lock/sm/xyID27/vdi
                                Sep 30 19:19:56 xcpng-Companyxyz SM: [5749] lock: acquired /var/lock/sm/xyID27/vdi
                                Sep 30 19:19:56 xcpng-Companyxyz SM: [5749] Unpause for xyID27
                                Sep 30 19:19:56 xcpng-Companyxyz SM: [5749] Realpath: /var/run/sr-mount/xyID2/xyID27.vhd
                                Sep 30 19:19:57 xcpng-Companyxyz SM: [5749] lock: opening lock file /var/lock/sm/xyID2/sr
                                Sep 30 19:19:57 xcpng-Companyxyz SM: [5749] ['/usr/sbin/td-util', 'query', 'vhd', '-vpfb', '/var/run/sr-mount/xyID2/xyID27.vhd']
                                Sep 30 19:19:57 xcpng-Companyxyz SM: [5749]   pread SUCCESS
                                Sep 30 19:19:57 xcpng-Companyxyz SM: [5749] Calling tap unpause with minor 8
                                Sep 30 19:19:57 xcpng-Companyxyz SM: [5749] ['/usr/sbin/tap-ctl', 'unpause', '-p', '26306', '-m', '8', '-a', 'vhd:/var/run/sr-mount/xyID2/xyID27.vhd']
                                Sep 30 19:19:57 xcpng-Companyxyz SM: [5749]  = 0
                                Sep 30 19:19:57 xcpng-Companyxyz SM: [5749] lock: released /var/lock/sm/xyID27/vdi
                                
                                Sep 30 19:43:51 xcpng-Companyxyz SM: [13851] lock: released /var/lock/sm/xyID28/vdi
                                Sep 30 19:43:51 xcpng-Companyxyz SM: [13851] lock: acquired /var/lock/sm/xyID28/vdi
                                Sep 30 19:43:51 xcpng-Companyxyz SM: [13851] Unpause for xyID28
                                Sep 30 19:43:51 xcpng-Companyxyz SM: [13851] Realpath: /var/run/sr-mount/xyID2/xyID28.vhd
                                Sep 30 19:43:51 xcpng-Companyxyz SM: [13851] lock: opening lock file /var/lock/sm/xyID2/sr
                                Sep 30 19:43:51 xcpng-Companyxyz SM: [13851] ['/usr/sbin/td-util', 'query', 'vhd', '-vpfb', '/var/run/sr-mount/xyID2/xyID28.vhd']
                                Sep 30 19:43:51 xcpng-Companyxyz SM: [13851]   pread SUCCESS
                                Sep 30 19:43:51 xcpng-Companyxyz SM: [13851] Calling tap unpause with minor 7
                                Sep 30 19:43:51 xcpng-Companyxyz SM: [13851] ['/usr/sbin/tap-ctl', 'unpause', '-p', '13585', '-m', '7', '-a', 'vhd:/var/run/sr-mount/xyID2/xyID28.vhd']
                                Sep 30 19:43:51 xcpng-Companyxyz SM: [13851]  = 0
                                
                                Sep 30 20:16:26 xcpng-Companyxyz SM: [24489] lock: released /var/lock/sm/IDxz1/vdi
                                Sep 30 20:16:27 xcpng-Companyxyz SM: [24522] nfs-on-slave.check(/var/run/sr-mount/xyID2/OLD_IDxz1.vhd)
                                Sep 30 20:16:59 xcpng-Companyxyz SM: [24695] nfs-on-slave.check(/var/run/sr-mount/xyID2/IDxz2.vhd)
                                Sep 30 20:17:02 xcpng-Companyxyz SM: [24706] nfs-on-slave.check(/var/run/sr-mount/xyID2/OLD_IDxz3.vhd)
                                Sep 30 20:29:07 xcpng-Companyxyz SM: [28403] nfs-on-slave.check(/var/run/sr-mount/xyID2/IDxz4.vhd)
                                Sep 30 20:29:15 xcpng-Companyxyz SM: [28437] nfs-on-slave.check(/var/run/sr-mount/xyID2/IDxz.vhd)
                                Sep 30 20:29:17 xcpng-Companyxyz SM: [28466] nfs-on-slave.check(/var/run/sr-mount/xyID2/OLD_IDxz6.vhd)
                                Sep 30 20:29:19 xcpng-Companyxyz SM: [28473] lock: opening lock file /var/lock/sm/xyID27/vdi
                                Sep 30 20:29:19 xcpng-Companyxyz SM: [28473] lock: acquired /var/lock/sm/xyID27/vdi
                                Sep 30 20:29:19 xcpng-Companyxyz SM: [28473] Pause for xyID27
                                Sep 30 20:29:19 xcpng-Companyxyz SM: [28473] Calling tap pause with minor 8
                                Sep 30 20:29:19 xcpng-Companyxyz SM: [28473] ['/usr/sbin/tap-ctl', 'pause', '-p', '26306', '-m', '8']
                                

                                And they are repeated in today's log:

                                Oct  1 09:50:10 xcpng-xycompany SM: [17232] lock: released /var/lock/sm/xyzID1/vdi
                                Oct  1 09:50:11 xcpng-xycompany SM: [17241] lock: opening lock file /var/lock/sm/xyzID1/vdi
                                Oct  1 09:50:11 xcpng-xycompany SM: [17241] lock: acquired /var/lock/sm/xyzID1/vdi
                                Oct  1 09:50:11 xcpng-xycompany SM: [17241] Unpause for xyzID1
                                Oct  1 09:50:11 xcpng-xycompany SM: [17241] Realpath: /var/run/sr-mount/xyzID2/xyzID1.vhd
                                Oct  1 09:50:11 xcpng-xycompany SM: [17241] lock: opening lock file /var/lock/sm/xyzID3/sr
                                Oct  1 09:50:11 xcpng-xycompany SM: [17241] ['/usr/sbin/td-util', 'query', 'vhd', '-vpfb', '/var/run/sr-mount/xyzID3/xyzID1.vhd']
                                Oct  1 09:50:11 xcpng-xycompany SM: [17241]   pread SUCCESS
                                Oct  1 09:50:11 xcpng-xycompany SM: [17241] Calling tap unpause with minor 8
                                Oct  1 09:50:11 xcpng-xycompany SM: [17241] ['/usr/sbin/tap-ctl', 'unpause', '-p', '26306', '-m', '8', '-a', 'vhd:/var/run/sr-mount/xyzID3/xyzID1.vhd']
                                Oct  1 09:50:11 xcpng-xycompany SM: [17241]  = 0
                                
                                • Danp (Pro Support Team)

                                  @Nephilim Nothing shown here would indicate a problem AFAICS.

                                  P.S. Added proper markup to your post
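
One way to double-check a long excerpt like the one above is to count the lines that would look like a failed command. The sketch below rests on two assumptions drawn only from reading the excerpt: that a line ending in ` = <number>` reports the exit status of the preceding tap-ctl call, and that a failure would appear as a non-zero status or as a line containing "FAILED". The helper name `sm_failures` is hypothetical.

```shell
# Sketch: summarise lines in an SM log that look like command results.
# sm_failures is a hypothetical helper; the patterns are assumptions.
sm_failures() {
    log="${1:-/var/log/SMlog}"
    [ -r "$log" ] || { echo "cannot read $log" >&2; return 2; }
    awk '
        # Assumed status line, e.g. "... SM: [5749]  = 0"
        /\] += -?[0-9]+ *$/ { total++; if ($NF != 0) nonzero++ }
        # Assumed failure marker anywhere in the line
        /FAILED/            { failed++ }
        END {
            printf "status lines: %d, non-zero: %d, FAILED lines: %d\n",
                   total, nonzero, failed
        }
    ' "$log"
}
```

In the excerpts quoted in this thread, every status line shown is ` = 0` and every `pread` line reports SUCCESS, which is consistent with the reply that nothing there indicates a problem.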
