VDI Chain on Deltas
-
Hi All!
We were having problems with our backup remotes (on-site Synology, off-site Wasabi) failing with VDI chain issues. We checked the logs per the referenced article and didn't find anything obvious. We noticed we were using an older encryption method on the encrypted remotes, so we decided to purge the remotes, set them up fresh, and off we go.
Fast forward a week. Full backups went OK on a forced full; we'll see this weekend whether the scheduled run works automatically. However, deltas continue to fail when run on schedule with the new remotes. I figured for sure that new backups, new snapshots, etc. wouldn't have a coalescing issue. What we've found so far, though, is that a "force run" results in a successful backup for the deltas too.
At a bit of a loss on troubleshooting. Anyone else seeing this? Both remotes are encrypted.
Thanks!
Nick
-
Hi,
Can you be more specific on what's going on exactly?
-
@olivierlambert Sure, I can try.
I can now confirm that both my full and my delta jobs fail on every single VM with the "Job canceled to protect the VDI chain" error.
If we do a standard restart, it fails the same way. If we use the "force restart" option, it works properly and the backups seem to finish without issue.
The remote configuration is brand new, with encrypted remotes and the multiple data block option selected. The backup job itself is not new; it's been in place for about a year. The job uses VM tags to determine which VMs to back up. The full is a weekly run with 6 retained backups and goes to both the external and local remotes. The delta only goes to the local Synology and is set with 14 retained backups.
The storage for the VMs is on a Synology NAS. The VMs live on one of 3 hosts with similar vintage hardware.
Per the backup troubleshooting article:
cat /var/log/SMlog | grep -i exception : no results
cat /var/log/SMlog | grep -i error : no results
grep -i coales /var/log/SMlog : lots of messages that say "UNDO LEAF-COALESCE"
The host I ran those commands on is the one which houses the Xen Orchestra VM (whose backup also fails).
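The three checks above can also be run in one pass to get a match count per pattern. This is only a minimal sketch: the helper name smlog_counts is hypothetical, and it assumes the log lives at /var/log/SMlog unless another path is passed.

```shell
# Count case-insensitive matches for each troubleshooting pattern in an SM log.
# smlog_counts is a hypothetical helper name, not part of XCP-ng or XO.
smlog_counts() {
    log="${1:-/var/log/SMlog}"   # assumed default log location
    for pattern in exception error coales; do
        # grep -c prints the number of matching lines; it exits non-zero
        # when there are none, so "|| true" keeps the loop going.
        count=$(grep -ci "$pattern" "$log" || true)
        printf '%s: %s\n' "$pattern" "$count"
    done
}
```

Running smlog_counts with no argument on the host prints one line per pattern, which makes it easier to compare hosts side by side.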
The Synology backup remote has 10TB assigned to it with 8.7TB free. The VDI disk volume has 5.4TB of 10TB free.
Status on the hosts patch-wise shows 6 patches are needed currently, though they were up-to-date last week.
XO is on commit 9ed55.
Other specifics I can provide?
Thanks!
Nick
-
@nvoss Hello, the "UNDO LEAF-COALESCE" message usually has a cause that is listed in the error above it. Could you share this part please?
-
@dthenot when I grep looking for coalesce I don't see any errors. Everything is the undo message.
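For anyone following along, the lines logged just before each undo can be pulled out with grep's -B (before-context) option. A minimal sketch, assuming the default log path /var/log/SMlog; the helper name undo_context and the 30-line window are arbitrary choices of mine, and the pattern matches only "UNDO LEAF" so it does not depend on the exact spelling of the rest of the message.

```shell
# Print every "UNDO LEAF" line together with the lines logged before it.
# undo_context is a hypothetical helper name; the defaults are assumptions.
undo_context() {
    log="${1:-/var/log/SMlog}"   # assumed default log location
    before="${2:-30}"            # how many preceding lines to show
    # -B N prints N lines of leading context before each match
    grep -B "$before" 'UNDO LEAF' "$log"
}
```

Calling undo_context /var/log/SMlog 50 widens the window if the cause was logged further back.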
Looking at the line labeled 3680769, which in this case corresponds to one of those undos, I see lock opens, a variety of what look like successful mounts, and subsequent snapshot activity, then the undo at the end. After the undo message I see something not super helpful.
Attached is that entire region. Below is an excerpt.
It's definitely confusing that a force on the job works when the regular run does not.
-
@nvoss Could you try running the following?
vhd-util check -n /var/run/sr-mount/f23aacc2-d566-7dc6-c9b0-bc56c749e056/3a3e915f-c903-4434-a2f0-cfc89bbe96bf.vhd
-
@dthenot sure, here you go!