XCP-ng

    New and exciting backup errors

    Xen Orchestra
    • andrewreid

      Running c0d58 from source.

      I can't work out what suddenly happened, because I haven't changed anything that I can think of.

      All of a sudden, some delta backups are failing. It seems to be related to timestamp formats and the VHD cleaning process, with two of my seven VMs failing to back up to the S3 remote (Backblaze). In the GUI, the errors look like:

      1. Expected values to be strictly equal: + actual - expected + 4294960416 - 4294959235 ^
      2. Clean VM directory, missing or broken alias target, some metadata VHDs are missing
      3. Invalid RFC-7231 date-time value

      Here's the error log if it helps: https://gist.github.com/andrewreid/4a8e7ac8da8d7f381884d4732a03d94f

      Any ideas what I've done, or what might be happening?

      Cheers,

      Andrew

      • olivierlambert (Vates, Co-Founder CEO)

        Hi,

        The first reflex should be to update to the latest commit on master and see whether the problem is still there 🙂

        • andrewreid @olivierlambert

          @olivierlambert Ta – you're right, I should have checked that before posting, but the issue persists on 667d0. Slightly different words but the same smell:

          https://gist.github.com/andrewreid/cf4f7299b2ae7e52c61e31471675740f

          Is this the best spot to discuss this, or is a GitHub issue the better forum?

          • olivierlambert (Vates, Co-Founder CEO)

            Let's start investigating here 🙂

            Invoking @florent

            • florent (Vates, XO Team) @andrewreid

              @andrewreid Hi Andrew, I think it's fine to discuss it here.

              We added a lot of reporting to the clean phase so that we don't miss whatever causes a full VHD backup to be made when a delta would have seemed fine.

              1. means a merge was interrupted and could not be restarted safely
              2. means an alias (a link) to a VHD is missing

              These errors point to one or more interrupted backups, with XO detecting this and recovering: removing the broken files and probably making a full backup afterwards.

              3. is interesting since it comes from the AWS SDK; I will try to find out where it originates. Also, you reproduced it on master. Do you use a specific configuration on your bucket (like object locking)?
              • andrewreid @florent

                @florent Thank you for your reply!

                No, bog standard configuration with no object locking changes. This bucket has been receiving backups for months without fault, and no configuration has changed.

                The other remote is an NFS share and that's working perfectly well.

                Is your hypothesis that the S3 backup has become corrupt? Thus, would the solution be to simply abandon these backups and create new ones?

                - Andrew

                • Andrew (Top contributor) @andrewreid

                  @andrewreid Not that this helps with your error, but my nightly S3 delta backups to Wasabi have been working just fine. I'm using XO source and keep mostly up to date with master.

                  S3 and NFS are different backup formats so NFS working does not mean S3 will work.

                  Wasabi sometimes causes failures, but not recently, and never for more than a few VMs in one night. The next backup run (manual restart or next nightly) seems to work correctly.

                  I force a full backup every 3 months just to make sure I have a good checkpoint in case undetected corruption creeps into the delta data. Plus I have replication and other backups.

                  I did have a problem once (not exactly yours) with a VM and just deleted the S3 backup data which forced a full backup again. Sucks because you don't have the weeks of deltas anymore.

                  Error #3 (date-time) is strange. May be a Backblaze issue.
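
For reference, RFC 7231 is the HTTP specification that defines how dates appear in headers such as `Date` and `Last-Modified`; its preferred form is the fixed-length "IMF-fixdate", for example `Sun, 06 Nov 1994 08:49:37 GMT`. The Python sketch below only illustrates that format. It is not the AWS SDK's parser, the thread does not say which header or field triggered the error, and the names `is_imf_fixdate` and `to_imf_fixdate` are hypothetical helpers. RFC 7231 also accepts two obsolete date formats, which this check deliberately rejects.

```python
import re
from datetime import datetime, timezone
from email.utils import format_datetime

# Preferred RFC 7231 date form (IMF-fixdate), e.g. "Sun, 06 Nov 1994 08:49:37 GMT".
# Only the shape is checked here, not whether the weekday matches the date.
IMF_FIXDATE = re.compile(
    r"^(Mon|Tue|Wed|Thu|Fri|Sat|Sun), \d{2} "
    r"(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) "
    r"\d{4} \d{2}:\d{2}:\d{2} GMT$"
)


def is_imf_fixdate(value: str) -> bool:
    """Return True if value has the shape of an IMF-fixdate string."""
    return IMF_FIXDATE.match(value) is not None


def to_imf_fixdate(moment: datetime) -> str:
    """Format a timezone-aware datetime as an IMF-fixdate string in GMT."""
    return format_datetime(moment.astimezone(timezone.utc), usegmt=True)


if __name__ == "__main__":
    print(is_imf_fixdate("Sun, 06 Nov 1994 08:49:37 GMT"))
    print(is_imf_fixdate("1994-11-06T08:49:37Z"))
    print(to_imf_fixdate(datetime(1994, 11, 6, 8, 49, 37, tzinfo=timezone.utc)))
```

A value in another shape, such as an ISO 8601 timestamp, fails this check, which is the kind of mismatch the error message suggests; whether the value came from the storage provider or elsewhere is not established in this thread.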

                  You could create a new S3 remote to backblaze (new bucket or directory) and test one failing VM backup to the new remote and see if it works (and works again for delta). You could set the retention low to test merge too. I have a setup like this with a local MinIO server for S3 testing and local backups.
