XCP-ng

    XOA 5.107.2 Backup Failure via SMB and S3 (Backblaze)

    Backup
    26 Posts 4 Posters 1.2k Views 4 Watching
    • planedrop Top contributor @ravenet

      @ravenet @olivierlambert yeah going back to 5.106 seems to have resolved the issue. I want to give it one more day before saying 100% that it did, but all VMs in both my backup jobs last night finished properly.

      • olivierlambert Vates 🪐 Co-Founder CEO

        Okay, thanks. The backup code made a big leap in Latest, so there may be something fishy in there 🤔 @florent: this might be a lead for finding a potential bug in the new code.

        • planedrop Top contributor @olivierlambert

          @olivierlambert Happy to help in any way that I can as well!

          Notably, I am not seeing any issues doing backups to SMB or S3 with my lab at home which is on the latest. My lab is XCP-ng 8.3 though, rather than 8.2 like this production setup (which will be getting upgraded to 8.3 now that it's LTS), so maybe something specific with the new backup code and 8.2?

          • olivierlambert Vates 🪐 Co-Founder CEO

            It's more likely related to your XO backup code than to the XCP-ng version (my gut feeling ATM).

            • planedrop Top contributor @olivierlambert

              @olivierlambert Gotcha. I'll see if I can get this issue to replicate in my lab at all but so far my backups have been smooth over there.

              I'll try to re-create more similar backup jobs in the lab as well, maybe it's a specific setting or something on my jobs.

              • ravenet @olivierlambert

                @olivierlambert

                @florent already jumped in on our case and submitted a fix for "_removeUnusedSnapshots don t handle vdi related to multiple VMs" that we were seeing.
                We have a VDI that won't coalesce, so I need to reopen that case. I think the error above was triggered by this angry VDI and my previous attempt to fix it.

                It was also noted that 5.107 was ignoring our backup setting for concurrent backups and was running all 24 VMs at once instead of the 3 we had set. Reverting to 5.106.4 resolved this. I'm waiting for an update on what in 5.107 causes it to ignore this setting. Different timezone, so I assume I'll hear tonight.

                I'm on 8.3 as well, fully updated with the latest patches.

                • planedrop Top contributor @ravenet

                  @ravenet All of my errors seemed related to NBD access, so if the concurrency setting was being ignored, that might be the source of the issue I was seeing.

                  I'll watch my lab as well and see whether the concurrency is being respected on the latest from-sources build.

                  Glad to see you were on 8.3, so not related to me being on 8.2.

                  • ravenet @planedrop

                    @planedrop I was getting a lot of NBD errors as well, so I'm not sure whether it was ignoring the concurrency setting outright, or moving on to the next backup after an NBD communication error while leaving the previous backups marked as active. Either way, there's a bug if it leaves a backup 'active' and then starts another one beyond the configured concurrency limit.
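
To make the expected behaviour concrete, here is a minimal sketch of a concurrency limit that still holds when some jobs fail. It is not XO's backup code: `run_with_limit` and `fake_backup` are hypothetical names, and the failure is a simulated error standing in for the NBD errors described above. The point is only that a slot is released when a job really ends, whether it succeeded or failed, so the number of jobs in flight never exceeds the limit.

```python
import asyncio


async def run_with_limit(jobs, limit):
    """Run coroutine factories with at most `limit` of them in flight.

    Returns (results, peak): results[i] is the job's return value or the
    exception it raised; peak is the highest number of simultaneous jobs.
    """
    semaphore = asyncio.Semaphore(limit)
    in_flight = 0
    peak = 0

    async def guarded(job):
        nonlocal in_flight, peak
        async with semaphore:  # the slot is held until the job has ended
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await job()
            finally:
                in_flight -= 1  # runs on success and on failure alike

    results = await asyncio.gather(
        *(guarded(job) for job in jobs), return_exceptions=True
    )
    return results, peak


async def fake_backup(vm_id):
    # Hypothetical stand-in for one VM backup; every 4th one fails.
    await asyncio.sleep(0.01)
    if vm_id % 4 == 0:
        raise ConnectionError(f"simulated NBD error on VM {vm_id}")
    return vm_id


def demo():
    # 24 VMs with a limit of 3, mirroring the numbers quoted in this thread.
    jobs = [lambda i=i: fake_backup(i) for i in range(24)]
    return asyncio.run(run_with_limit(jobs, limit=3))
```

Calling `demo()` returns the per-VM results and the observed peak. With this pattern the peak stays at the limit even though a quarter of the simulated backups raise.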

                    • planedrop Top contributor @ravenet

                      @ravenet Yeah, another night of successful backups, so I think going back to Stable did fix the issue. That's 2 for 2 now.

                      • planedrop Top contributor @planedrop

                        Wanted to post a quick update: it's been over a week now and the backups have been 100% successful.

                        I figured as much, but thought it was worth coming back here to confirm.

                        • olivierlambert Vates 🪐 Co-Founder CEO

                          Thanks for keeping us posted, really appreciated here by all the team 🙂

                          • planedrop Top contributor @olivierlambert

                            @olivierlambert @florent Wanted to post an update here. I upgraded to 5.108.1 in this environment and my SMB backups are failing the same way, with the only error being footer1 !== footer2.

                            It seems about half of my backups fail, so it's not all VMs.

                            I will try to watch it tonight to see if it's possibly a concurrency issue, where it runs all of them at the same time instead of respecting my concurrency setting of 2.

                            Let me know if you want any additional info and I can also submit a ticket.
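
For context on the error text: in the VHD format, a dynamic disk stores a 512-byte footer at the end of the file and a copy of that footer at the very start. The sketch below compares the two copies. Whether the XO error comes from exactly this comparison is an assumption based on its wording, and `footers_match` is a hypothetical helper, not a function from XO.

```python
import io

FOOTER_SIZE = 512  # a VHD footer is 512 bytes


def footers_match(stream):
    """Return True if the leading footer copy equals the trailing footer.

    `stream` is any seekable binary file object holding a dynamic VHD.
    """
    size = stream.seek(0, io.SEEK_END)
    if size < 2 * FOOTER_SIZE:
        return False  # too small to hold both copies
    stream.seek(0)
    first = stream.read(FOOTER_SIZE)
    stream.seek(-FOOTER_SIZE, io.SEEK_END)
    last = stream.read(FOOTER_SIZE)
    return first == last


# Illustrative data only: a fake footer that starts with the VHD cookie.
footer = b"conectix" + bytes(FOOTER_SIZE - 8)
intact = io.BytesIO(footer + bytes(4096) + footer)
damaged = io.BytesIO(footer + bytes(4096) + footer[:-1] + b"\x01")
```

Here `footers_match(intact)` is true and `footers_match(damaged)` is false, since the last byte of the trailing footer differs.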

                            • olivierlambert Vates 🪐 Co-Founder CEO

                              I think I've read a recent PR about this:

                              https://github.com/vatesfr/xen-orchestra/pull/8882

                              Can you try that branch and report?

                              fbeauchamp opened this pull request in vatesfr/xen-orchestra

                              draft fix(backup) error footer1 !== footer2 #8882

                              • planedrop Top contributor @olivierlambert

                                @olivierlambert This is a production setup, rather than my lab, so I don't want to use anything that has a chance of being unstable.

                                Do we know what version this change may end up in? I'm happy to try the Latest channel if/when this fix is in place.

                                Apologies if I am misunderstanding anything.

                                • olivierlambert Vates 🪐 Co-Founder CEO

                                  I don't know, but I can ask @xo-team

                                  • Bastien Nollet Vates 🪐 XO Team @olivierlambert

                                    We'll try to make it available in 5.110, but we can't guarantee it for the moment (I think the PR Olivier linked is more of a temporary fix than a definitive solution).

                                    If you feel comfortable directly modifying the code of your XO, you can apply this fix just by commenting out two lines in a file and restarting xo-server. The lines to comment out are these: https://github.com/vatesfr/xen-orchestra/pull/8882/commits/51a543df25af1b0069e07f6d9ad2608ed3476e29


                                    • olivierlambert Vates 🪐 Co-Founder CEO

                                      Thanks @Bastien-Nollet !

                                      • planedrop Top contributor @Bastien Nollet

                                        @Bastien-Nollet Thanks for the suggestion, though I think I am going to avoid doing that in a production setup.

                                        I'll look forward to it being included in a newer version.

                                        Our backups run nightly and it's random which VMs fail with this error, but so far it's almost never been the same one each night, so at worst we are 24 hours behind on a single VM backup, which isn't a huge deal.

                                        Our other nightly backup job to S3 works just fine too so we're covered.

                                        Thanks!
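
For anyone living with intermittent failures like this, one way to check the "at worst 24 hours behind" assumption is to flag any VM whose newest successful backup is older than a chosen threshold. This is a minimal sketch only: `stale_vms`, the VM names, and the 36-hour threshold are all hypothetical, and the timestamps would have to come from your own backup reports.

```python
from datetime import datetime, timedelta


def stale_vms(last_success, now, max_age=timedelta(hours=36)):
    """Return the sorted names of VMs whose last good backup is too old.

    `last_success` maps a VM name to the time of its newest successful
    backup. With nightly jobs, the 36-hour default tolerates one missed
    night and flags a VM after a second failure in a row.
    """
    return sorted(vm for vm, ts in last_success.items() if now - ts > max_age)


# Illustrative data only.
now = datetime(2025, 1, 10, 8, 0)
last_success = {
    "vm-a": datetime(2025, 1, 10, 1, 0),  # backed up last night
    "vm-b": datetime(2025, 1, 9, 1, 0),   # missed one night
    "vm-c": datetime(2025, 1, 8, 1, 0),   # missed two nights in a row
}
```

With this data, `stale_vms(last_success, now)` flags only `vm-c`.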
