XCP-ng
    • Categories
    • Recent
    • Tags
    • Popular
    • Users
    • Groups
    • Register
    • Login

    Unable to copy a machine on the same host.

    Scheduled Pinned Locked Moved Xen Orchestra
    9 Posts 3 Posters 2.2k Views 1 Watching
    Loading More Posts
    • Oldest to Newest
    • Newest to Oldest
    • Most Votes
    Reply
    • Reply as topic
    Log in to reply
    This topic has been deleted. Only users with topic management privileges can see it.
    • P Offline
      pnunn
      last edited by pnunn

      Hi Folks,
      I'm having a world of problems trying to copy a machine on a test setup I'm evaluating. I have a linux box running on the machine and using XO have selected copy. The snapshot happens and tasks are started.. but nothing more happens. They sit at 0% completed until they eventually vanish.

      I have started two today and here is the screen shot of where they sit several hours later.

      Screenshot from 2019-04-03 12-23-24.png

      This is XO running from source and has the latest patches. XCP-NG is patched on the machine and is the latest version. The SR is a shared nfs storage pool.

      I have another host running in another DC and have managed to start a machine there based on a backup just fine.

      I have asked on the XO forum and @olivierlambert has tested this on his setup and its working, so has suggested it might be an issue with XCP itself.

      I also just ran xe-toolstack-restart which removed the hung tasks, but restarting the copy produced the same thing (I also tried creating a machine from a snapshot with the same result).

      I have also just discovered that there are two machines with that name, one halted and one running (is the second a snapshot?). Neither should be named that if it is the failed copy.

      [root@xcp-oldyella log]# xe vm-list  uuid=548bf883-7e2b-511a-1847-eecde0b16ab3
      uuid ( RO)           : 548bf883-7e2b-511a-1847-eecde0b16ab3
           name-label ( RW): DarlingDownsPPDB
          power-state ( RO): halted
      
      
      [root@xcp-oldyella log]# xe vm-list  uuid=5f90dcd2-f92a-bd55-634a-84d1872736d0
      uuid ( RO)           : 5f90dcd2-f92a-bd55-634a-84d1872736d0
           name-label ( RW): DarlingDownsPPDB
          power-state ( RO): running
      
      

      The command line produces the same result... sigh

      xe vm-copy uuid=5f90dcd2-f92a-bd55-634a-84d1872736d0 sr-uuid=b7288376-5c37-5fcb-a3cc-fef74364bfef new-name-label=DarlingDownsDB
      

      More tasks in the task list with 0% against them, and the command never returns.

      I'm not sure where to go from here, but this is a bit of a show stopper.

      Ideas?

      Peter.

      1 Reply Last reply Reply Quote 0
      • olivierlambertO Offline
        olivierlambert Vates 🪐 Co-Founder CEO
        last edited by

        If it's triggered also by xe, it's indeed not a XO issue.

        Try to take a look at the logs when you start the command to see if you spot any error message.

        1 Reply Last reply Reply Quote 0
        • P Offline
          pnunn
          last edited by

          OK, tried to run the command line version again and this time got the following in xensource.log

          Apr  3 22:15:58 xcp-oldyella xapi: [ info|xcp-oldyella|27595 UNIX /var/lib/xcp/xapi||cli] xe vm-copy uuid=5f90dcd2-f92a-bd55-634a-84d1872736d0 sr-uuid=b7288376-5c37-5fcb-a3cc-fef74364bfef new-name-label=DarlingDownsDB username=root password=(omitted)
          Apr  3 22:15:58 xcp-oldyella xapi: [ info|xcp-oldyella|27595 UNIX /var/lib/xcp/xapi|session.login_with_password D:c816a2aa768d|xapi] Session.create trackid=e11966f780034f77cdd9c07de291314f pool=false uname=root originator=cli is_local_superuser=true auth_user_sid= parent=trackid=9834f5af41c964e225f24279aefe4e49
          Apr  3 22:15:58 xcp-oldyella xapi: [debug|xcp-oldyella|27595 UNIX /var/lib/xcp/xapi|session.login_with_password D:c816a2aa768d|mscgen] xapi=>xapi [label="pool.get_all"];
          Apr  3 22:15:58 xcp-oldyella xapi: [debug|xcp-oldyella|27596 UNIX /var/lib/xcp/xapi||dummytaskhelper] task dispatch:pool.get_all D:918652c94061 created by task D:c816a2aa768d
          Apr  3 22:15:58 xcp-oldyella xapi: [debug|xcp-oldyella|27595 UNIX /var/lib/xcp/xapi|VM.copy R:72edf37079f1|audit] VM.copy: VM = '5f90dcd2-f92a-bd55-634a-84d1872736d0 (DarlingDownsPPDB)'; new_name = 'DarlingDownsDB'; SR = 'b7288376-5c37-5fcb-a3cc-fef74364bfef (TheBeast1)'
          Apr  3 22:15:58 xcp-oldyella xapi: [debug|xcp-oldyella|27595 UNIX /var/lib/xcp/xapi|VM.copy R:72edf37079f1|xapi] Copying disks to SR: b7288376-5c37-5fcb-a3cc-fef74364bfef
          Apr  3 22:15:58 xcp-oldyella xapi: [debug|xcp-oldyella|27595 UNIX /var/lib/xcp/xapi|VM.copy R:72edf37079f1|mscgen] xapi=>xapi [label="session.slave_login"];
          Apr  3 22:15:58 xcp-oldyella xapi: [debug|xcp-oldyella|27597 UNIX /var/lib/xcp/xapi||dummytaskhelper] task dispatch:session.slave_login D:92adcd309180 created by task R:72edf37079f1
          Apr  3 22:15:58 xcp-oldyella xapi: [ info|xcp-oldyella|27597 UNIX /var/lib/xcp/xapi|session.slave_login D:0bc07e2bf822|xapi] Session.create trackid=feedcfb9da2e88d058dfa9003edadc77 pool=true uname= originator=xapi is_local_superuser=true auth_user_sid= parent=trackid=9834f5af41c964e225f24279aefe4e49
          Apr  3 22:15:58 xcp-oldyella xapi: [debug|xcp-oldyella|27597 UNIX /var/lib/xcp/xapi|session.slave_login D:0bc07e2bf822|mscgen] xapi=>xapi [label="pool.get_all"];
          Apr  3 22:15:58 xcp-oldyella xapi: [debug|xcp-oldyella|27598 UNIX /var/lib/xcp/xapi||dummytaskhelper] task dispatch:pool.get_all D:de732102bd57 created by task D:0bc07e2bf822
          Apr  3 22:15:58 xcp-oldyella xapi: [debug|xcp-oldyella|27595 UNIX /var/lib/xcp/xapi|VM.copy R:72edf37079f1|mscgen] xapi=>xapi [label="VBD.get_record"];
          Apr  3 22:15:58 xcp-oldyella xapi: [debug|xcp-oldyella|27599 UNIX /var/lib/xcp/xapi||dummytaskhelper] task dispatch:VBD.get_record D:de8a36833970 created by task R:72edf37079f1
          Apr  3 22:15:58 xcp-oldyella xapi: [debug|xcp-oldyella|27595 UNIX /var/lib/xcp/xapi|VM.copy R:72edf37079f1|mscgen] xapi=>xapi [label="VBD.get_record"];
          Apr  3 22:15:58 xcp-oldyella xapi: [debug|xcp-oldyella|27600 UNIX /var/lib/xcp/xapi||dummytaskhelper] task dispatch:VBD.get_record D:749b6a23bce2 created by task R:72edf37079f1
          Apr  3 22:15:58 xcp-oldyella xapi: [debug|xcp-oldyella|27595 UNIX /var/lib/xcp/xapi|VM.copy R:72edf37079f1|mscgen] xapi=>xapi [label="Async.VDI.copy"];
          Apr  3 22:15:58 xcp-oldyella xapi: [debug|xcp-oldyella|27601 UNIX /var/lib/xcp/xapi||dummytaskhelper] task dispatch:VDI.copy D:c6a64ceda36f created by task R:72edf37079f1
          Apr  3 22:15:58 xcp-oldyella xapi: [ info|xcp-oldyella|27601 UNIX /var/lib/xcp/xapi|dispatch:VDI.copy D:c6a64ceda36f|taskhelper] task Async.VDI.copy R:99082f096367 (uuid:adb3644f-fa2e-eaa4-c5ce-24504241e114) created (trackid=feedcfb9da2e88d058dfa9003edadc77) by task R:72edf37079f1
          Apr  3 22:15:58 xcp-oldyella xapi: [ info|xcp-oldyella|27602 |Async.VDI.copy R:99082f096367|dispatcher] spawning a new thread to handle the current task (trackid=feedcfb9da2e88d058dfa9003edadc77)
          Apr  3 22:15:58 xcp-oldyella xapi: [debug|xcp-oldyella|27602 |Async.VDI.copy R:99082f096367|audit] VDI.copy: VDI = 'ffc7a041-ff8d-42e1-8194-75327de3a184'; SR = 'b7288376-5c37-5fcb-a3cc-fef74364bfef (TheBeast1)'; base_vdi = 'invalid'; into_vdi = 'invalid'
          Apr  3 22:15:58 xcp-oldyella xapi: [debug|xcp-oldyella|27602 |Async.VDI.copy R:99082f096367|xapi] Caught exception while marking VDI for VDI.copy in message forwarder: OTHER_OPERATION_IN_PROGRESS: [ VDI; OpaqueRef:4fb03903-4d95-4249-87cd-b927f71f086b ]
          Apr  3 22:15:58 xcp-oldyella xapi: [debug|xcp-oldyella|27602 |Async.VDI.copy R:99082f096367|helpers] VDI.copy locking failed: caught transient failure OTHER_OPERATION_IN_PROGRESS: [ VDI; OpaqueRef:4fb03903-4d95-4249-87cd-b927f71f086b ]
          Apr  3 22:15:58 xcp-oldyella xapi: [debug|xcp-oldyella|27602 |Async.VDI.copy R:99082f096367|helpers] Waiting for up to 6.735385 seconds before retrying...
          Apr  3 22:15:58 xcp-oldyella xapi: [debug|xcp-oldyella|27595 UNIX /var/lib/xcp/xapi|VM.copy R:72edf37079f1|mscgen] xapi=>xapi [label="session.slave_login"];
          Apr  3 22:15:58 xcp-oldyella xapi: [debug|xcp-oldyella|27603 UNIX /var/lib/xcp/xapi||dummytaskhelper] task dispatch:session.slave_login D:55e483ef9b42 created by task R:72edf37079f1
          Apr  3 22:15:58 xcp-oldyella xapi: [ info|xcp-oldyella|27603 UNIX /var/lib/xcp/xapi|session.slave_login D:4d715d4d32cf|xapi] Session.create trackid=54457b3cb62c69c750d1baa6510f1616 pool=true uname= originator=xapi is_local_superuser=true auth_user_sid= parent=trackid=9834f5af41c964e225f24279aefe4e49
          Apr  3 22:15:58 xcp-oldyella xapi: [debug|xcp-oldyella|27603 UNIX /var/lib/xcp/xapi|session.slave_login D:4d715d4d32cf|mscgen] xapi=>xapi [label="pool.get_all"];
          Apr  3 22:15:58 xcp-oldyella xapi: [debug|xcp-oldyella|27604 UNIX /var/lib/xcp/xapi||dummytaskhelper] task dispatch:pool.get_all D:28f7d18ac003 created by task D:4d715d4d32cf
          Apr  3 22:15:58 xcp-oldyella xapi: [debug|xcp-oldyella|27595 UNIX /var/lib/xcp/xapi|VM.copy R:72edf37079f1|mscgen] xapi=>xapi [label="task.get_record"];
          Apr  3 22:15:58 xcp-oldyella xapi: [debug|xcp-oldyella|27605 UNIX /var/lib/xcp/xapi||dummytaskhelper] task dispatch:task.get_record D:0f2f88a646d2 created by task R:72edf37079f1
          Apr  3 22:15:58 xcp-oldyella xapi: [debug|xcp-oldyella|27595 UNIX /var/lib/xcp/xapi|VM.copy R:72edf37079f1|mscgen] xapi=>xapi [label="task.get_record"];
          Apr  3 22:15:58 xcp-oldyella xapi: [debug|xcp-oldyella|27606 UNIX /var/lib/xcp/xapi||dummytaskhelper] task dispatch:task.get_record D:62a20eb48fe0 created by task R:72edf37079f1
          Apr  3 22:15:58 xcp-oldyella xapi: [debug|xcp-oldyella|27595 UNIX /var/lib/xcp/xapi|VM.copy R:72edf37079f1|mscgen] xapi=>xapi [label="event.from"];
          Apr  3 22:15:58 xcp-oldyella xapi: [debug|xcp-oldyella|27607 UNIX /var/lib/xcp/xapi||dummytaskhelper] task dispatch:event.from D:d549c967696c created by task R:72edf37079f1
          Apr  3 22:15:58 xcp-oldyella xapi: [debug|xcp-oldyella|27595 UNIX /var/lib/xcp/xapi|VM.copy R:72edf37079f1|mscgen] xapi=>xapi [label="event.from"];
          Apr  3 22:15:58 xcp-oldyella xapi: [debug|xcp-oldyella|27608 UNIX /var/lib/xcp/xapi||dummytaskhelper] task dispatch:event.from D:3ac54939abd2 created by task R:72edf37079f1
          Apr  3 22:16:04 xcp-oldyella xapi: [debug|xcp-oldyella|27120 INET :::80||dummytaskhelper] task dispatch:event.from D:99c1c67946fb created by task D:ed4ac73754eb
          Apr  3 22:16:04 xcp-oldyella xapi: [debug|xcp-oldyella|27602 |Async.VDI.copy R:99082f096367|xapi] Caught exception while marking VDI for VDI.copy in message forwarder: OTHER_OPERATION_IN_PROGRESS: [ VDI; OpaqueRef:4fb03903-4d95-4249-87cd-b927f71f086b ]
          Apr  3 22:16:04 xcp-oldyella xapi: [debug|xcp-oldyella|27602 |Async.VDI.copy R:99082f096367|helpers] VDI.copy locking failed: caught transient failure OTHER_OPERATION_IN_PROGRESS: [ VDI; OpaqueRef:4fb03903-4d95-4249-87cd-b927f71f086b ]
          Apr  3 22:16:04 xcp-oldyella xapi: [debug|xcp-oldyella|27602 |Async.VDI.copy R:99082f096367|helpers] Waiting for up to 14.295223 seconds before retrying...
          
          

          It looks as though there is already a task running somewhere... sigh... this is now really getting nasty.
          P.

          DanpD 1 Reply Last reply Reply Quote 0
          • DanpD Online
            Danp Pro Support Team @pnunn
            last edited by

            @pnunn said in Unable to copy a machine on the same host.:

            base_vdi = 'invalid'; into_vdi = 'invalid'

            Not sure, but this doesn't look right. Are you having issues cloning all VMs or just a single one?

            1 Reply Last reply Reply Quote 0
            • P Offline
              pnunn
              last edited by

              It seems to be a problem with all of the vm's (atleast the ones in the backup rotation, I haven't tried any of the test ones yet but will).

              Peter.

              P 1 Reply Last reply Reply Quote 0
              • P Offline
                pnunn @pnunn
                last edited by

                Hi Guys,

                OK, I finally managed to clone a machine using fast clone with the machine off. Will continue digging into this to try and work out why a simple copy wouldn't work.

                1 Reply Last reply Reply Quote 0
                • olivierlambertO Offline
                  olivierlambert Vates 🪐 Co-Founder CEO
                  last edited by

                  It seems it could be related to something weird happening on your SR.

                  1 Reply Last reply Reply Quote 0
                  • P Offline
                    pnunn
                    last edited by

                    Yeh but unfortunate "something weird" could well be a show stopper. We are in the process of trying to work out weather to go with XCP-NG using our NFS shared storage or go with VMWare on the same storage. My strong preference is XCP, but I have to convince some others that its better than VMWare which is what they are used to. Something weird events are really not helping.

                    I'm not for a second saying its anything other than my lack of knowledge, but I don't want to paint us into a corner that I will cause me pain once I have production hardware and loads going.

                    1 Reply Last reply Reply Quote 0
                    • olivierlambertO Offline
                      olivierlambert Vates 🪐 Co-Founder CEO
                      last edited by

                      All I can tell is that's not an usual issue. Never heard of problem of this kind before… So yeah, there is something weird. Could be related to NFS permissions maybe, or something between XCP-ng and your NFS host. It's not XCP-ng code directly, because otherwise everyone else using NFS will endure the same problems. So it's more a "combo" between XCP-ng and your NFS within your infrastructure.

                      Please read the wiki on where to check logs (especially SMlog) to spot potential hints on where's the issue.

                      1 Reply Last reply Reply Quote 0
                      • First post
                        Last post