XOA 5.107.2 Backup Failure via SMB and S3 (Backblaze)
-
@ravenet @olivierlambert yeah going back to 5.106 seems to have resolved the issue. I want to give it one more day before saying 100% that it did, but all VMs in both my backup jobs last night finished properly.
-
Okay thanks, the backup code made a big leap in latest, so there's maybe something fishy in there. @florent: this might be a lead to find a potential bug in the new code.
-
@olivierlambert Happy to help in any way that I can as well!
Notably, I am not seeing any issues doing backups to SMB or S3 with my lab at home which is on the latest. My lab is XCP-ng 8.3 though, rather than 8.2 like this production setup (which will be getting upgraded to 8.3 now that it's LTS), so maybe something specific with the new backup code and 8.2?
-
It's more likely related to your XO backup code than XCP-ng version (my gut feeling ATM)
-
@olivierlambert Gotcha. I'll see if I can get this issue to replicate in my lab at all but so far my backups have been smooth over there.
I'll try to re-create more similar backup jobs in the lab as well, maybe it's a specific setting or something on my jobs.
-
@florent already jumped in on our case and submitted a fix for "_removeUnusedSnapshots don t handle vdi related to multiple VMs" that we were seeing.
We have a VDI that won't coalesce, so I need to reopen that case. I think the error above was triggered by this angry VDI and my previous attempt to fix it.
It was also noted that 5.107 was ignoring our backup setting for concurrent backups and was running all 24 VMs at once instead of the 3 we had set. Reverting to 5.106.4 resolved this. Waiting for an update on what's broken in 5.107 that makes it ignore this setting. Different timezone, so I assume I'll hear tonight.
I'm on 8.3 as well and fully updated with latest patches
-
@ravenet All of my errors seemed related to NBD access, so if the concurrency setting was being ignored, that might be the source of the issue I was seeing.
I'll watch my lab as well and see if the concurrency is being respected or not on the latest from the sources build.
Glad to see you were on 8.3, so not related to me being on 8.2.
-
@planedrop I was getting a lot of NBD errors as well, so I'm not sure whether it was simply ignoring the concurrency setting, or moving on to the next backup after an NBD communication error while leaving the previous backups marked as active. Either way, there's a bug if it leaves a backup 'active' and then starts another one beyond the set concurrency limit.
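To make the expected behaviour concrete, here is a rough sketch of what a concurrency limit should guarantee: a failed job gives its slot back, and the number of jobs in flight never exceeds the limit. This is not XO's actual code. The names `run_limited` and `demo`, the simulated NBD error, and the 24-job/limit-3 numbers are only illustrative assumptions borrowed from the setup described above.

```python
import asyncio


async def run_limited(jobs, limit):
    """Run job factories with at most `limit` in flight.

    Returns (results, peak), where peak is the highest number of
    jobs that were active at the same time.
    """
    sem = asyncio.Semaphore(limit)
    active = 0
    peak = 0
    results = {}

    async def worker(name, job):
        nonlocal active, peak
        async with sem:  # the slot is released even if the job raises
            active += 1
            peak = max(peak, active)
            try:
                results[name] = ("ok", await job())
            except Exception as exc:
                # A failed job is recorded as failed, not left "active".
                results[name] = ("failed", str(exc))
            finally:
                active -= 1

    await asyncio.gather(*(worker(n, j) for n, j in jobs.items()))
    return results, peak


def demo(vm_count=24, limit=3):
    """Simulate backups where every fifth VM hits a fake NBD error."""

    async def backup(i):
        await asyncio.sleep(0.001)  # stand-in for the real transfer
        if i % 5 == 0:
            raise ConnectionError("simulated NBD error")
        return f"vm-{i} done"

    jobs = {f"vm-{i}": (lambda i=i: backup(i)) for i in range(vm_count)}
    return asyncio.run(run_limited(jobs, limit))
```

If a scheduler moves on after an error without running the equivalent of the `finally` step, the failed job keeps counting as active while new ones start, which would look exactly like the limit being ignored.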
-
@ravenet Yeah another night of successful backups so I think going back to Stable did fix the issue. 2 for 2 on that now.
-
Wanted to post a quick update: it's been over a week now and the backups have been 100% successful.
Figured as much, but thought it was worth at least coming back here and confirming.
-
Thanks for keeping us posted, really appreciated here by all the team