XCP-ng

    Backup fails only on master with VM_NO_SUSPEND_SR

    Xen Orchestra

    • CJ

      I recently changed my backups from Normal to With Memory. I have two hosts in my pool. The master host gives me the VM_NO_SUSPEND_SR error when backing up any VMs on it. The other host successfully performs backups.

      I do not have a Suspend SR set on the pool, nor do any of the VMs have one set. If I migrate a VM from the master to my other host, I can back it up successfully. However, attempting to migrate the VM back to the master host gives me the cryptic log message "operation failed".

      I can't find any settings differences between the two hosts. Hardware-wise, the working host's CPU is one generation newer, but I'm not sure why that would let it suspend a VM.

      Based on this post https://xcp-ng.org/forum/topic/5475/backup-failed/5 my understanding is that, because I don't have a Suspend SR configured anywhere, all of my backups should fail. Instead, they only fail on the master host.

      • Danp Pro Support Team

        Check each host to see what the following command returns --

        xe host-list params=suspend-image-sr-uuid

        • CJ @Danp

          @Danp

          They both returned the same result.

          suspend-image-sr-uuid ( RW)    : <not in database>

          suspend-image-sr-uuid ( RW)    : UUID

          The UUID that gets returned shows up as the local storage on the working host when I paste it into the Storage filter box.

          • Danp Pro Support Team @CJ

            @CJ Those seem to be different results to me. I'm guessing the host that returns a UUID is the one that successfully performs the backup, correct?

            • CJ @Danp

              @Danp Both hosts return the exact same text from the command. They both show two entries.

              Yes, the host that works is the one whose local storage matches the UUID.

              • olivierlambert Vates 🪐 Co-Founder CEO

                When the value is "<not in database>", it means you have no suspend SR defined, and that's your problem 🙂

                • CJ @olivierlambert

                  @olivierlambert Well, yes. But how is there one suspend-sr defined? I never set any as I wasn't even aware that I needed to and there isn't anything in the XO UI that shows it.

                  How does one host have a suspend-sr and the other not, if it's defined at the pool level? Does it need to be networked storage, or can each host just use its local storage?

                  • olivierlambert Vates 🪐 Co-Founder CEO

                    In the pool view/advanced tab/Suspend SR.

                    It's mandatory to be able to suspend/restore snapshots with memory 🙂

                    edit: it's a pool-wide setting, not a per-host one. It's not set by default; it's up to you to choose which SR to use for suspend.
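
The pool-level value can also be read from the CLI rather than the XO UI. The following is only a minimal sketch: it assumes the pool parameter is named suspend-image-SR (the host-level parameter used elsewhere in this thread is suspend-image-sr-uuid) and that it runs on a pool member. show_pool_suspend_sr is a hypothetical helper name, not part of xe.

```shell
# Hypothetical helper: print the pool-wide suspend SR, if one is set.
# Assumption: the pool-level parameter is called "suspend-image-SR".
show_pool_suspend_sr() {
    # With a single pool, --minimal prints just its UUID
    pool_uuid=$(xe pool-list --minimal) || return 1
    xe pool-param-get uuid="$pool_uuid" param-name=suspend-image-SR
}
```

Calling show_pool_suspend_sr on a host should print either an SR UUID or a placeholder such as "<not in database>" when nothing is set.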

                    • CJ @olivierlambert

                      @olivierlambert I understand that a Suspend SR is required. What I'm struggling with is that I have no Suspend SR defined anywhere in XO. The advanced tab for my Pool shows None, as does the advanced tab for all of my VMs.

                      Running the command @Danp provided shows the two entries, which I'm additionally confused about if the setting is pool wide. Both hosts return a "not in database" result as well as the UUID result.

                      The reason I asked if it needs to be networked is that I currently use local storage for my VMs. I therefore assumed I would need a Suspend SR for each host, but as it's pool wide, I believe I would need to make a network share that they could both access.

                      • olivierlambert Vates 🪐 Co-Founder CEO

                        Okay, I understand. It's meant to work pool-wide, but if you are using a local SR (which is not really the best practice in a pool with multiple hosts), then you can set up one per host.

                        You can specify one per host with the following CLI:
                        xe host-param-set uuid=<host UUID> suspend-image-sr-uuid=<SR UUID>
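
To make the per-host step repeatable, the command can be wrapped so that it also reads the value back. This is a hedged sketch, not an official procedure: set_host_suspend_sr is a hypothetical helper name, both UUIDs must be supplied by you (for example from xe host-list and xe sr-list), and it assumes host-param-get echoes the stored SR UUID back unchanged.

```shell
# Hypothetical helper: set a host's suspend SR and confirm it was stored.
# Usage: set_host_suspend_sr <host UUID> <SR UUID>
set_host_suspend_sr() {
    host_uuid=$1
    sr_uuid=$2
    if [ -z "$host_uuid" ] || [ -z "$sr_uuid" ]; then
        echo "usage: set_host_suspend_sr <host UUID> <SR UUID>" >&2
        return 2
    fi
    # host-param-set needs uuid= to select which host to modify
    xe host-param-set uuid="$host_uuid" suspend-image-sr-uuid="$sr_uuid" || return 1
    # Read the value back; succeed only if it matches what was requested
    current=$(xe host-param-get uuid="$host_uuid" param-name=suspend-image-sr-uuid)
    [ "$current" = "$sr_uuid" ]
}
```

With local storage, this would be run once per host, each time passing that host's own local SR UUID.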

                        • Danp Pro Support Team @CJ

                          @CJ said in Backup fails only on master with VM_NO_SUSPEND_SR:

                          Both hosts return a "not in database" result as well as the UUID result.

                          Maybe this will make it more understandable --

                          xe host-list params=uuid,suspend-image-sr-uuid

                          • CJ @olivierlambert

                            @olivierlambert Any ideas how it would have gotten set in the first place and why it doesn't show in XO? I didn't even know it was a thing previously so I didn't set it.

                            • CJ @Danp

                              @Danp That still returns the same result from each host. But the included UUIDs map the "not in database" to my master host.

                              • Danp Pro Support Team @CJ

                                @CJ Right... each time you run the command it returns the result for all hosts. By adding the UUID param, you can more easily identify which result goes with which host.
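
An alternative that avoids matching up multi-record output by eye is to query each host separately. This is a minimal sketch under some assumptions: list_suspend_srs is a hypothetical helper name, it must run on a pool member, and an unset value is assumed to come back as a placeholder string such as "<not in database>".

```shell
# Hypothetical helper: print "<host UUID> <suspend SR value>", one line per host.
list_suspend_srs() {
    # --minimal prints the host UUIDs as a single comma-separated line
    host_uuids=$(xe host-list --minimal) || return 1
    for host in $(printf '%s' "$host_uuids" | tr ',' ' '); do
        sr=$(xe host-param-get uuid="$host" param-name=suspend-image-sr-uuid)
        printf '%s %s\n' "$host" "$sr"
    done
}
```

Any line whose second field is not an SR UUID points at a host that has no suspend SR of its own.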

                                • CJ @Danp

                                  @Danp Your initial post seemed to suggest that I should see different results on each host.

                                  Any ideas how the one got set?

                                  • Danp Pro Support Team

                                    Yes... I currently have only a single host for testing, and I didn't realize that the result from xe would contain the setting for all hosts.

                                    • CJ @olivierlambert

                                      @olivierlambert I set the suspend-sr for the master host and now my backup completes successfully. Can XO be changed to expose this setting for each host?

                                      I do want to clarify two things though in order to make sure I understand things correctly.

                                      1. If I set a Suspend SR in XO for the pool, will this replace the existing values for each host?

                                      2. If my VMs are on local SR but the Suspend SR is a network share, does this mean that the backup process will send the memory to the networked SR during the snapshot and then send the data from the local SR and the networked SR to XO for transfer to the backup remote?

                                      • olivierlambert Vates 🪐 Co-Founder CEO

                                        It's not meant to be configured per host, except in cases where you need a local SR; that's why it's not visible in XO. In those specific cases, the CLI does the trick 🙂

                                        If you set it for the pool, I'm not sure whether it will erase the per-host values. I never tried to modify it at both pool and host level myself, so I don't know.

                                        The suspend SR is only used to store the memory, nothing else. So yes, in your case the memory disk will only be sent to the networked SR and that's it. XO will fetch all the disks, but in any case this will go through the hosts themselves and then to XOA.

                                        • CJ @olivierlambert

                                          My VM backups have been working since I set the Suspend SR for the other host. However, my metadata backup failed with the following error.

                                          EBUSY: resource busy or locked, unlink '/run/xo-server/mounts/UUID/xo-pool-metadata-backups/UUID/UUID/DATETIME/.nfs000000000000000e00000060'
                                          

                                          I restarted the XO VM and was able to successfully complete a metadata backup. Hopefully this isn't a consequence of backing XO up with memory.

                                          • olivierlambert Vates 🪐 Co-Founder CEO

                                            It's completely unrelated 🙂
