Why does the backup use snapshots and not CBT
-
I wanted to share my first experiences so far. For a single-disk VM it works perfectly: CBT is created and kept only as metadata until the next backup.
For VMs with 2 disks I run into some strange behavior: after the backup job I see the snapshot of one of the 2 disks (at random) on the health page, in the orphan VDI section, as not connected to a VM. When I run the backup job again it gives the error "source and target are unrelated".
Not sure if this is an issue on the XCP side or a bug inside their software.
As I cannot seem to reproduce this issue on single-disk VMs, I feel like this is an issue with multiple disks. I am contacting their support to see if this is a bug on their end or an issue inside XCP itself.
I will keep you all posted on the progress of these tests.
-
@rtjdamen said in Why does the backup use snapshots and not CBT:
@olivierlambert I am currently looking at Quadric Software and Storware. Quadric Software is doing the CBT backup exactly as needed; I got their description of the workflow yesterday:
*When you run an ABD/hypervisor backup, the A3 will snapshot the VM for the duration of the backup data acquisition phase. After the data is collected, the snapshot is immediately deleted.
In the case of CBT backups, the snapshot is deleted with the special XAPI command vdi-data-destroy, which frees data associated with the snapshot, but retains metadata, including the block allocation for the disk. This allows us to perform a delta backup the next time, with extremely negligible storage overhead.*
This is what we need.
We did a PoC of it.
We still need a snapshot (or to stop the VM), but at least we won't keep it.
The hard part is that we need to change the way we store metadata: today it is stored at the VM snapshot level, but this snapshot is deleted after a CBT-enabled backup.
Hopefully it will be done this year.
-
@florent Quadric currently keeps the snapshot as well, but it only contains the CBT metadata. I believe that is very similar to how we currently run the backups with XOA.
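For readers following along, the per-disk cycle discussed here (snapshot, read the data, then drop the snapshot data but keep the CBT metadata) can be laid out as a list of xe command lines. This is only a sketch under assumptions: the subcommand names (`vdi-enable-cbt`, `vdi-snapshot`, `vdi-list-changed-blocks`, `vdi-data-destroy`) are the ones I remember from the XenServer CBT documentation, the function name `cbt_backup_plan` and the UUID placeholders are hypothetical, and none of this is taken from Alike or XO code. Nothing is executed; in practice the snapshot UUID only becomes known after the snapshot step has run.

```python
from typing import List, Optional


def cbt_backup_plan(
    vdi_uuid: str,
    snapshot_uuid: str,
    previous_snapshot_uuid: Optional[str] = None,
) -> List[List[str]]:
    """Build (but do not run) the xe commands for one CBT backup cycle of one VDI.

    All UUIDs are placeholders supplied by the caller (hypothetical values).
    """
    plan = [
        # CBT must be on for the disk, otherwise no block metadata is tracked.
        ["xe", "vdi-enable-cbt", f"uuid={vdi_uuid}"],
        # Point-in-time copy to read from; its UUID is snapshot_uuid.
        ["xe", "vdi-snapshot", f"uuid={vdi_uuid}"],
    ]
    if previous_snapshot_uuid is not None:
        # Delta run: ask which blocks changed since the previous snapshot.
        plan.append([
            "xe", "vdi-list-changed-blocks",
            f"vdi-from-uuid={previous_snapshot_uuid}",
            f"vdi-to-uuid={snapshot_uuid}",
        ])
    # (The actual data transfer would happen here; it is out of scope.)
    # Free the snapshot data but keep the CBT metadata for the next delta.
    plan.append(["xe", "vdi-data-destroy", f"uuid={snapshot_uuid}"])
    return plan


for command in cbt_backup_plan("VDI-UUID", "NEW-SNAP-UUID", "OLD-SNAP-UUID"):
    print(" ".join(command))
```

Without a previous snapshot the plan has no changed-blocks step, which corresponds to the full (baseline) backup.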
-
Yes, you can keep it but just remove the snap data while keeping the metadata (IIRC, it's been a while since I took a look).
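The metadata that is kept is what makes the next delta possible: as far as I know, the changed-blocks call returns a base64-encoded bitmap with one bit per block. Below is a minimal sketch of turning such a bitmap into byte offsets. The 64 KiB block size and the most-significant-bit-first ordering inside each byte are assumptions to verify against the XAPI documentation, and the function name is made up for illustration.

```python
import base64
from typing import List

BLOCK_SIZE = 64 * 1024  # assumed CBT granularity in bytes


def changed_block_offsets(bitmap_b64: str, block_size: int = BLOCK_SIZE) -> List[int]:
    """Return the byte offset of every block whose bit is set in the bitmap."""
    raw = base64.b64decode(bitmap_b64)
    offsets = []
    for byte_index, byte in enumerate(raw):
        for bit in range(8):
            # Assumption: bit 0 of the bitmap is the most significant bit
            # of the first byte.
            if byte & (0x80 >> bit):
                offsets.append((byte_index * 8 + bit) * block_size)
    return offsets


# Hand-made example bitmap: blocks 0, 2 and 15 marked as changed.
example = base64.b64encode(bytes([0b10100000, 0b00000001])).decode()
print(changed_block_offsets(example))
```

A delta backup then only needs to read those offsets from the new snapshot instead of the whole disk.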
-
@olivierlambert OK, is there a way we can keep an eye on this development? Shall I open a feature request for this with support, or what is the best way to proceed?
-
That might help, yes
-
@olivierlambert I have created a feature request for this via support.
In the meantime I will let you know the results with the software we currently run. If it works as required, it could be a good addition to the environment document as well.
-
Hi all,
Just an update on our progress: I have been doing backups of around 10 larger VMs on CBT now. It is working pretty well, but it is not the wonderful solution we had in mind.
First of all, the Alike backup software we now use to make CBT backups does have issues with multi-disk VMs: for some reason, now and then CBT gets disabled by Xen on one of the disks, so a new baseline has to be created the next day. This causes larger and longer backups. It is still workable, but it seems there are still issues with CBT in XCP-ng. I can't determine whether this is an issue in the Alike software or caused by XCP-ng itself. I currently have no other backup solution using CBT, so we can't compare the results.
Also, a new issue can occur: when you create a CBT snapshot during peak hours it can cause issues with coalesce as well. The snapshot is deleted and has to be coalesced into the base disk, but when the VM is under high I/O load this can turn into an endless coalesce process. This is basically a limitation of the Xen garbage collection process, but it causes issues with backups every now and then.
From what I understand, XOA is working on CBT as well. I believe some fundamental changes are needed to get this all working the same way as it currently does in VMware and other products. I think it would be much better and more reliable if these coalesces were done directly at the VM level and not limited to 1 task per SR (as I understand is currently the case).
I am curious whether anybody else is already testing this and has feedback that could help us with these problems.
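Since CBT being switched off on one disk only shows up as an unexpectedly large backup the next day, it may help to check the flag before each run. Here is a small sketch, assuming records shaped like what XenAPI's `VDI.get_all_records()` returns, with the fields `cbt_enabled`, `is_a_snapshot` and `name_label`. The helper name and the sample data are made up, and no connection to a host is made.

```python
from typing import Dict, List


def disks_without_cbt(vdi_records: Dict[str, dict]) -> List[str]:
    """Return the names of non-snapshot VDIs that do not have CBT enabled."""
    names = []
    for record in vdi_records.values():
        if record.get("is_a_snapshot", False):
            continue  # only live disks matter for the next backup
        if not record.get("cbt_enabled", False):
            names.append(record.get("name_label", "<unnamed>"))
    return sorted(names)


# Hypothetical records for a 2-disk VM where CBT was dropped on one disk.
sample = {
    "OpaqueRef:a": {"name_label": "vm1-disk0", "cbt_enabled": True, "is_a_snapshot": False},
    "OpaqueRef:b": {"name_label": "vm1-disk1", "cbt_enabled": False, "is_a_snapshot": False},
    "OpaqueRef:c": {"name_label": "vm1-disk1-snap", "cbt_enabled": False, "is_a_snapshot": True},
}
print(disks_without_cbt(sample))
```

Any disk listed here would need a new baseline on the next run, which matches the longer backups described above.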
-
@rtjdamen
All our coalescing issues were solved by changing some constants in this file: /opt/xensource/sm/cleanup.py
LIVE_LEAF_COALESCE_MAX_SIZE = 1024 * 1024 * 1024 # bytes
LIVE_LEAF_COALESCE_TIMEOUT = 300 # seconds
We did not experience any side-effects.
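To illustrate why a larger size threshold can help under load, here is a toy model. It is not the logic of cleanup.py, just a simplification under stated assumptions: each coalesce pass merges the current leaf at a fixed rate while the VM keeps writing into a fresh leaf, and the final live coalesce is only attempted once the leaf is at or below the threshold. All names and rates are hypothetical, and the 20 MiB figure is what I recall as the stock value, so treat it as an assumption.

```python
from typing import Optional

MIB = 1024 * 1024


def passes_until_live_coalesce(
    leaf_bytes: float,
    write_rate: float,      # bytes/s the VM writes during a pass (assumed constant)
    coalesce_rate: float,   # bytes/s the coalesce can merge (assumed constant)
    max_live_size: float,   # threshold below which the final live coalesce runs
    max_passes: int = 50,
) -> Optional[int]:
    """Count passes until the leaf fits the threshold, or None if it never does."""
    passes = 0
    while leaf_bytes > max_live_size:
        if passes == max_passes:
            return None  # not converging: the "endless coalesce" case
        duration = leaf_bytes / coalesce_rate
        # Everything written during this pass becomes the next leaf to merge.
        leaf_bytes = write_rate * duration
        passes += 1
    return passes


leaf = 10 * 1024 * MIB  # 10 GiB of snapshot data to merge
print(passes_until_live_coalesce(leaf, 25 * MIB, 100 * MIB, 20 * MIB))
print(passes_until_live_coalesce(leaf, 25 * MIB, 100 * MIB, 1024 * MIB))
print(passes_until_live_coalesce(leaf, 100 * MIB, 100 * MIB, 1024 * MIB))
```

In this model a higher threshold reduces the number of passes, but when the VM writes as fast as the coalesce can merge, no threshold below the leaf size ever gets reached. The trade-off is that the final live step handles more data, which is presumably why the timeout was raised together with the size.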
-
Hi all,
I have an update on this item: we have been running CBT-based backups for some larger VMs for a few weeks now. We found that in our setup we had issues with NFS-based storage. Snapshots created there on multi-disk VMs give issues with the CBT snapshot: Xen disables CBT on the snapshot. We believe this may be caused by latency in the tapdisk, which is not ready yet for the specific VDI. We have migrated these VMs to iSCSI-based storage and do not have issues there. That works like a charm!
@olivierlambert is it possible to add Quadric Software to the backup solutions overview as well?
And in the end, the best outcome would be CBT being enabled for XOA-based backups as well; that will help improve the backups in general.
-
We have not been contacted by Quadric yet. We are also working on testing CBT on our side; it may be available in July's release.
-
@olivierlambert in the end that will be the best, but I am impressed with their team's experience with Xen and XCP as well. I think they are a good addition to the already complete list. To be honest, so far their backup tool is doing better than the ones already on the list.
-
Thanks, it will be useful to test CBT, so it might not be enabled by default at first, but your feedback will be very useful when it lands!
-
@olivierlambert we will help test it, and as we are already using it for some VMs I can provide you with results and experiences from there as well. We will discuss that over a call later this week.
-
Same issue here. We have around 125 VMs and this is becoming a real problem: sometimes a replication that should use no more than 10 GB causes a big transfer and a replication delay. We just tried xo-cr-seed, and once we restarted the backup (2 days after the initial seed snapshot) the transfer caused a 12-hour backup delay, so this is almost useless for a remote DR site.
-
@vkeven Vates is currently working hard on making CBT backups available. We are currently using a product called Alike A3 for CBT-enabled backups, and it relies on XAPI for CBT as well. In general XCP-ng already supports it and it works as it should. I hope we will see some progress here soon!
-
We already have a branch started with CBT (we first had to store the backup metadata outside the snapshot itself, because we'll remove the snapshot with CBT). That work is now done and Florent is now working directly on CBT.