XCP-ng

    Detached VM Snapshots after Warm Migration

    Backup
    13 Posts 3 Posters 90 Views 3 Watching
    • DustyArmstrong
      last edited by

      I recently migrated all my VMs to new hosts in a new pool via warm migration, which worked great, but I am now seeing "detached snapshots" for the VMs that I moved. I have tested another VM that was created natively on one of the new hosts, and it does not have the same problem.

      It says it has a "Missing VM":

      ff5b9719-c951-4ceb-8b02-048737513911-image.png

      I also recently migrated from a Xen Orchestra container to another (x86 -> ARM), but unsure if that has anything to do with it - I restored my XO config from a metadata backup. The natively created VM was made after this move.

      Is there something I need to do or re-configure to prevent this? The snapshots appear to work, but I have not yet tried to revert to one.

      • DustyArmstrong @DustyArmstrong
        last edited by DustyArmstrong

        Having looked at some other posts, it seems this probably does have something to do with my old XO. That instance is no longer active - can I re-associate the VMs with my current XO instance? I imported my config from a backup, so I thought that would cover it; my new XO instance spun up exactly like the old one.

        Broken snapshot (logs show it was initiated from the pool master's IP):

        5505 HTTPS 123.456.789.2->:::80|Async.VM.snapshot

        ALLOWED' 'OK' 'API' 'VM.snapshot' (('vm' '' '' 'OpaqueRef:cce8c856-3310-a551-7cf3-f3ec6d68903d')

        Working snapshot (logs show it was initiated from my XO's IP):

        62327 HTTPS 123.456.789.6->|Async.VM.snapshot

        ALLOWED' 'OK' 'API' 'VM.snapshot' (('vm' 'CCTV' 'cd95df02-a907-bdaa-2c0e-ca503656460b' 'OpaqueRef:00c31a47-69d9-753d-5481-d5a10e881e13')

        So it seems that the broken VMs are being initiated from the host they run on, not XO.
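        The two entries above differ in the 'vm' tuple: the working one carries the VM's name-label and uuid, while the broken one has both fields empty. If you want to see how widespread that is, the pattern can be counted straight out of the audit log. A minimal sketch, assuming the audit log sits at the usual /var/log/audit.log on the pool master (the log format is taken from the lines quoted above; adjust the path for your install):

```shell
# Count Async.VM.snapshot audit entries whose vm name-label and uuid
# fields are empty ('' '') -- the "broken" shape quoted above --
# versus all snapshot calls. The log path is an assumption; pass a
# different one as the first argument.
log=${1:-/var/log/audit.log}
total=$(grep -c "Async.VM.snapshot" "$log")
broken=$(grep "Async.VM.snapshot" "$log" | grep -c "('vm' '' ''")
echo "VM.snapshot calls: $total total, $broken without a resolved VM"
```

        On the two example lines quoted above, this counts one call of each kind.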

        • DustyArmstrong @DustyArmstrong
          last edited by

          OK, I still had my old pool master registered as a server in XO (disabled). Removing it and restarting XO now shows the VMs properly in the audit logs when snapshotting, initiated from my XO's IP, but the snapshots are still seen as detached ("Missing VM").

          Maybe it will just resolve itself eventually if XAPI is holding onto stale references? I still have my old backups ("Detached backups") waiting to be removed (the job itself is already gone) - maybe that's also impacting things. I will remove the backed-up VHDs once my NAS has copied them to an external drive (very slow!).

          I'm not opposed to blowing XO away and starting it over, I need to start a new backup job anyway - if there's a procedure/best practice to go by in order to re-create all the metadata I'd be interested.

          • Pilow @DustyArmstrong
            last edited by

            @DustyArmstrong I have a corner case where I see these detached snapshots, related to backup.

            I have a (remote) pool whose master is attached via an XO Proxy on the main XOA.
            All snapshots done by the remote XOA are seen as "detached" snapshots on the main XOA, where the pool is a distant one.

            The main XOA does not own the backup jobs, so it is not aware of the snapshots and marks them as detached.

            I think this is your current situation: you have 2 XO servers, one doing snapshots, the other seeing them as detached.

            • DustyArmstrong @Pilow
              last edited by

              @Pilow Thanks for your reply.

              I only have 1 XO currently; the old one is offline. They're both on the same physical box, just different Docker containers. The old XO has been down for a while, and it never interacted with this new pool either. I downed it and restored my XO config to the new XO, which all worked fine. Steps were:

              1. Down old XO
              2. Up new XO and restore XO config (same IP), all good
              3. Install XCP on new hosts (hardware refresh)
              4. Create pool from new hosts on new XO
              5. Warm migrate VMs from old hosts to new hosts, all good

              My old XO server hasn't interacted with any of this - it can't, as it would clash with the new one by using the same ports. I haven't checked yet today, but I'm hoping it just resolves itself now that the audit logs show the correct VM info. I feel like it's just a sticky frontend cache of some sort.

              • Pilow @DustyArmstrong
                last edited by

                @DustyArmstrong Ah okay, so the old XO is downed.

                You could have 2 XO instances connected to the same pools, as long as they do not have the same IP address.

                Beware, I am not telling you to do that - it is not best practice at all. If you are ever in this situation, you must understand what could happen (detached snaps, licence problems, ...).

                But I thought this was the case.

                In your case, did you do snaps/backups BEFORE restoring the old XO's config onto the new XO?
                As far as I understand, that could have given you detached snapshots (snapshots existing on VMs but not initiated by the restored version of your current online XO).

                Or am I overthinking this?!

                Keep us informed if you manage to clear the situation, curious about it.

                • DustyArmstrong @Pilow
                  last edited by

                  @Pilow They would have the same IP address - the new XO is just a new Docker container on the same physical host, but with a new database (a different version of Redis on ARM, so I couldn't re-use it like-for-like). There is only 1 XO running, 100% certain. The snapshots in the audit log now reflect XO as having initiated them, where before it was the host itself (fallback) - this may all be magically resolved now, but I'm not home to look yet.

                  No, I made a backup of all my VMs before the warm migration, but the backup was made on the new XO instance, successfully, backing up the VMs on the old hosts (XCP1 & XCP2).

                  So the full process I took was:

                  1. 2 weeks ago, downed old XO
                  2. Brought up brand new XO (fresh Redis DB), imported config, all working
                  3. This weekend, spun up 2 new XCP-ng hosts (lots of drama, but we will ignore that), XC1 & XC2
                  4. Created new pool containing the new XCP hosts (XC1 & XC2)
                  5. Initiated a backup of all VMs on old host pool (XCP1 & XCP2) using an existing scheduled backup - manually triggered - backup succeeds
                  6. Warm migrate VMs from pool XCP1/XCP2 to XC1/XC2 - success
                  7. Disable old host pool XCP1/XCP2, VMs working as expected
                  8. Snapshot a warm migrated VM - detached snapshot - snapshotting a VM created natively on XC1/XC2 does not have this issue
                  9. Removed old pool XCP1/XCP2 entirely from XO - this solved the audit logs side of things (I think this is XAPI)
                  10. Logs now show VM info correctly, but frontend still displays a detached snapshot

                  My guess is that it may think the snapshots were happening on the old pool/old VM, not the warm migrated copy, or something like that.
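                  One way to test that guess with the stock xe CLI: every XAPI snapshot object carries a snapshot-of field pointing at its parent VM, so you can walk the snapshots and flag any whose parent uuid no longer resolves in the pool. A hedged sketch, to be run on the pool master (check_detached_snaps is just an illustrative name):

```shell
# Walk all snapshots XAPI knows about and report any whose
# snapshot-of parent VM uuid no longer resolves in this pool --
# a dangling reference like that could explain a "Missing VM".
check_detached_snaps() {
  for snap in $(xe snapshot-list params=uuid --minimal | tr ',' ' '); do
    parent=$(xe snapshot-param-get uuid="$snap" param-name=snapshot-of)
    if [ -z "$(xe vm-list uuid="$parent" --minimal)" ]; then
      echo "snapshot $snap -> missing VM $parent"
    fi
  done
}
```

                  If check_detached_snaps prints nothing, XAPI's own references are intact and the "detached" flag lives purely in XO's database, which would point back at the restored config rather than the hosts.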

                  I will update the thread if I manage to resolve everything, I'll keep an eye out anyway in case someone from Vates knows any more on this!

                  • DustyArmstrong @DustyArmstrong
                    last edited by DustyArmstrong

                    Still not working - I blew away my old backups and deleted the job, but I'm still getting detached snapshots. Interestingly, of my two hosts, the pool slave still isn't recording the snapshot properly. I tested "revert snapshot" from XO, which works - everything works except the detached snapshot warning. Probably also worth noting that I did not select "delete source VM" during the migration; I deleted the source VMs manually later.

                    Snapshotting a VM on XC2, slave in a pool. Don't know if this is expected or not for a pool slave.

                    Pool master (Displays the IP of XO):

                    HTTPS 123.456.789.6->|Async.VM.snapshot R:da375f8192ab|audit] ('trackid=2f566a5e238297cab4efbd4023dba8da' 'LOCAL_SUPERUSER' 'root' 'ALLOWED' 'OK' 'API' 'VM.snapshot' (('vm' 'VM_Name0' 'e0f029ec-f0b8-1c14-c654-80c6ecec582c' 'OpaqueRef:f0fae2b6-7729-b539-066f-16a795b0b22f')

                    Pool slave (displays IP of the XCP host itself):

                    HTTPS 123.456.789.2->:::80|Async.VM.snapshot R:da375f8192ab|audit] ('trackid=f49a8b9396c939d4cd5335542cba3848' 'LOCAL_SUPERUSER' '' 'ALLOWED' 'OK' 'API' 'VM.snapshot' (('vm' '' '' 'OpaqueRef:f0fae2b6-7729-b539-066f-16a795b0b22f')

                    I don't know if this means anything. When I snapshot a VM on XC1 (warm migrated) it also shows in the logs of the pool master, but with the IP of XO (still detached):

                    HTTPS 123.456.789.6->|Async.VM.snapshot R:cbd3f663c5d7|audit] ('trackid=2f566a5e238297cab4efbd4023dba8da' 'LOCAL_SUPERUSER' 'root' 'ALLOWED' 'OK' 'API' 'VM.snapshot' (('vm' 'VM_Name1' 'c7c63201-e25a-a7d5-7e39-394636538866' 'OpaqueRef:72f934a8-bfd6-f01c-5917-234cacdd49d5')

                    When I snapshot a VM I created natively (no warm migration), it looks exactly the same and is not detached:

                    HTTPS 123.456.789.6->|Async.VM.snapshot R:3e4f5c3dd9f2|audit] ('trackid=2f566a5e238297cab4efbd4023dba8da' 'LOCAL_SUPERUSER' 'root' 'ALLOWED' 'OK' 'API' 'VM.snapshot' (('vm' 'VM_Name2' 'cd95df02-a907-bdaa-2c0e-ca503656460b' 'OpaqueRef:00c31a47-69d9-753d-5481-d5a10e881e13')

                    Looking at another thread: running xl list shows a mixture of VMs, some tagged with [XO warm migration Warm migration] and some not. The VM that wasn't migrated shows its normal name (matching the output of xe vm-list), and this one snapshots fine. Notably, another VM that was migrated doesn't show the migration tags, yet still doesn't snapshot properly. I renamed the VMs after migrating; renaming again doesn't update the output of xl list, but does update the output of xe vm-list.
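                    For reference, that xl/xe name mismatch can be laid side by side: xl reports the name a domain was started with, while xe reports XAPI's current name-label, so a renamed (or warm-migrated) VM that hasn't been power-cycled is expected to diverge. A rough sketch, to be run on the host itself (the temp-file paths are arbitrary, and name-labels containing commas or spaces would confuse the naive splitting - this is a quick check, not a robust tool):

```shell
# List running guest names as the Xen toolstack sees them (minus
# Domain-0) next to XAPI's current name-labels; any difference flags
# a domain whose xl-side name is stale until it is power-cycled.
xl list | awk 'NR>1 && $1!="Domain-0" {print $1}' | sort > /tmp/xl-names.txt
xe vm-list power-state=running is-control-domain=false \
   params=name-label --minimal | tr ',' '\n' | sort > /tmp/xe-names.txt
diff /tmp/xl-names.txt /tmp/xe-names.txt
```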

                    If anyone in the know can help me understand what's going on, I'd be most appreciative - I doubt I'll be able to back up my VMs until it's resolved. Alternatively, I'd welcome confirmation of the steps to completely rebuild XO so that the data is held correctly and this stops happening. To re-iterate, this only seems to happen on warm migrated VMs.

                    • Pilow @DustyArmstrong
                      last edited by

                      @DustyArmstrong Hmm, what do you mean by pool master and pool slave?

                      You have one pool with 2 hosts - one master and one slave?
                      Could you screenshot the HOME/Hosts page?
                      And the SETTINGS/SERVERS page?

                      • DustyArmstrong @Pilow
                        last edited by

                        @Pilow Yes one pool, 2 servers (a master and a secondary/slave).

                        2e9ca12d-21b4-4bf9-a7ac-e256b6e15652-image.png

                        c042186d-ecbb-476b-ad6d-3f83a265bfd4-image.png

                        I think I've realised, per my last post update: when I warm migrated I didn't select "delete source VM", which has probably broken something, since I retired the old hosts afterwards.

                        I mainly just want to know the best method to wipe XO and start over so it can rebuild the database.

                        • Pilow @DustyArmstrong
                          last edited by

                          @DustyArmstrong said in Detached VM Snapshots after Warm Migration:

                          I mainly just want to know the best method to wipe XO and start over so it can rebuild the database.

                          Ha, to wipe XO 😲 - check the troubleshooting section of the documentation here:
                          https://docs.xen-orchestra.com/troubleshooting#reset-configuration

                          It is a destructive command for your XO database!
                          
                          • DustyArmstrong @Pilow
                            last edited by DustyArmstrong

                            @Pilow Yeah, I don't particularly want to, but in the absence of any alternatives! I'll give it a day in case anyone responds; otherwise I'll just wipe it - assuming it leaves the hosts untouched. I just need to do whatever is needed to re-associate the VM UUIDs, and I'd hope a total rebuild of XO would do that.

                            • acebmxer @DustyArmstrong
                              last edited by acebmxer

                              @DustyArmstrong

                              Try deploying XO from sources, connect the hosts to that, and see if the problem persists.

                              I use this script for install - https://github.com/ronivay/XenOrchestraInstallerUpdater

                              Just don't do what I did and connect an SR to separate pools 🙂
