XCP-ng

    Incremental Backups Periodically Result In EBUSY File Lock Error

    • planedrop (Top contributor)

      I'm trying to figure out the root cause of this. I think it's caused by something I'm doing, but maybe there should be a safeguard in place to prevent it.

      In a production environment we run nightly incremental backups for some pretty large VMs (1-2 TiB). It works perfectly fine 99% of the time, but once in a while I get an error on just one VM (not always the same one): something like EBUSY: followed by a message about the file being locked. Along with this we usually get failed merges saying the parent VHD is missing. I've included more error details below so as not to fill up the top of the post.

      Anyway, I finally started to notice a pattern: it commonly happens on Patch Tuesday. On Patch Tuesday I snapshot all our VMs before doing the updates in case something breaks, and I start this around the start time of our backups.

      Is there any way that taking a snapshot of a VM while its backup is running could result in a missing parent VHD?

      The only fix I've found so far is to wipe the entire backup directory for that VM's UUID and then restart the backup (it then always works until the problem happens again).

      ceaa6349-77b3-4565-9739-450defddb83c-image.png

      d1550146-f8b5-49de-bc19-c1d42e2c40d9-image.png

      For clarity, in both of these the child is in the TrueNAS directory, whereas the parent is the local VM VHD.

      The only other thing I can think of is the backup/snapshot of the TrueNAS dataset that holds these backups. That job starts about an hour before our VM backups to the TrueNAS, and it takes a snapshot first, which should avoid interfering with the data being written even if those cloud backups run longer than the VM backups to the NAS.

      • olivierlambert (Vates 🪐 Co-Founder CEO)

        Hi,

        Can you switch to block backup and see if you have a similar problem?

        • planedrop (Top contributor)

          @olivierlambert Apologies, do you mean NBD? Or is there another setting and I'm just forgetting where it is?

          I'll also try to set this up in my lab and see if I can reproduce it. It's not a huge deal, since I could easily just avoid taking snapshots while a backup is running, but if that does create an issue, IMO the UI should prevent users from doing it.
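
Until the cause is confirmed, the "don't snapshot while a backup is running" workaround can be checked by a script before taking a manual snapshot. The following is only a minimal sketch: the function name, the schedule values, and the idea of a fixed backup duration are all assumptions for illustration, and nothing here calls Xen Orchestra or XCP-ng.

```python
from datetime import datetime, timedelta


def overlaps_backup_window(snapshot_time, backup_start, backup_duration):
    """Return True if snapshot_time falls inside the backup window.

    The window is treated as half-open: [backup_start, backup_start + backup_duration).
    """
    backup_end = backup_start + backup_duration
    return backup_start <= snapshot_time < backup_end


# Hypothetical schedule: nightly backup starting at 22:00, assumed to take up to 4 hours.
backup_start = datetime(2024, 1, 9, 22, 0)
backup_duration = timedelta(hours=4)

# A snapshot at 22:15 would collide with the backup; one at 20:00 would not.
risky = overlaps_backup_window(datetime(2024, 1, 9, 22, 15), backup_start, backup_duration)
safe = overlaps_backup_window(datetime(2024, 1, 9, 20, 0), backup_start, backup_duration)
```

In practice the backup duration varies from night to night, so a fixed window like this is conservative at best; it is a reminder, not a lock.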

          Thanks as always for the help and excellent work!

          • olivierlambert (Vates 🪐 Co-Founder CEO)

            No, I mean block mode, which stores the VHD as 2 MiB block files instead of one big flat VHD. It is set up directly in the "remote" section 🙂
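
To picture the difference between one flat file and many 2 MiB block files, here is a rough sketch that splits a file into fixed-size chunks, one file per chunk. It illustrates the general idea only: the function name, the `.blk` naming scheme, and the directory layout are made up for this example and are not Xen Orchestra's actual on-disk format.

```python
import os

BLOCK_SIZE = 2 * 1024 * 1024  # 2 MiB, the block size mentioned above


def split_into_blocks(src_path, dest_dir, block_size=BLOCK_SIZE):
    """Write each block_size chunk of src_path to its own numbered file.

    Returns the list of block file paths in order. The last block may be
    shorter than block_size. File names are hypothetical.
    """
    os.makedirs(dest_dir, exist_ok=True)
    paths = []
    with open(src_path, "rb") as src:
        index = 0
        while True:
            chunk = src.read(block_size)
            if not chunk:
                break  # end of the source file
            block_path = os.path.join(dest_dir, f"{index:08d}.blk")
            with open(block_path, "wb") as dst:
                dst.write(chunk)
            paths.append(block_path)
            index += 1
    return paths
```

With a layout like this, a 5 MiB source file would end up as three block files (2 MiB, 2 MiB, 1 MiB), and an operation that changes one block only needs to touch that block's file rather than one large flat file.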

            • planedrop (Top contributor)

              @olivierlambert Ah gotcha, forgot that was an option in remotes haha! I'll give this a shot and see if things are any better.

              Is it safe to change a remote that isn't set up in block mode to block mode, or should I create a new remote/redo the backups?

              • olivierlambert (Vates 🪐 Co-Founder CEO)

                Good question, let me bring you @florent here 🙂

                • florent (Vates 🪐 XO Team)

                  @planedrop Yes, but the first backup afterwards will be a key (complete) backup for all the VMs in the job.

                  • planedrop (Top contributor)

                    @florent OK, good to know, thank you! I'll do what I can to replicate this issue in my lab and then see if changing to block backups fixes it. I just want to avoid changing things too much in the prod environment.

                    Appreciate the help!
