CBT: the thread to centralize your feedback
-
That's not what I'm talking about in my previous post.
-
Why are the snapshots of psrv30_base and psrv30_data01 kept after a full backup?
Here is what you wanted me to check:
-
- So first, you still have VDIs attached to the control domain. Outside the actual backup job/export, it's NOT normal to have those here.
- The snapshots you are showing are snapshots without any data, just the metadata. We can improve the UI to detect this and show that it's not a regular VDI snapshot. If you don't see the VM snapshot in the VM/snapshot view, it's OK.
-
-
-
Another Backup is running
-
So to summarize: if you do delta backups with CBT, you always get one snapshot (in the UI) which takes no space. These snapshots can be ignored. Is this right?
-
-
Yes. It's not taking any space and won't prevent coalesce immediately after data are purged.
-
Thanks.
Additional info: the VDI_IN_USE problem occurred because I always deleted the snapshots manually, thinking they were orphaned. The next backup then complained with this error.
EDIT: That does not seem to be the root cause. VDI_IN_USE seems to be a common problem when the backup job tries to delete the snapshot data and is not able to. There seems to be a deeper problem here.
-
I have another problem.
The size of some CBT snapshots is greater than 0 B after some hours, while others are 0 B, as is to be expected when "Purge snapshot data" is checked. All disks whose names begin with psrv3... are backed up by the same job.
Coalescing of snapshots works flawlessly, even with much bigger disks (> 300 GB). Subsequent deltas create snapshots and remove them, so the problem only exists for the snapshots taken by the first full backup. It seems that the problem is again VDI_IN_USE. The logs are at the bottom of the post.
Job config
{ "data": { "mode": "delta", "reportWhen": "failure" }, "id": "1721134380101", "jobId": "67da224e-ca75-47d3-87bf-ef30684b812e", "jobName": "psrv3x", "message": "backup", "scheduleId": "4c4eadfe-8517-47ba-9489-2e155d9d8f67", "start": 1721134380101, "status": "success", "infos": [ { "data": { "vms": [ "c72117ee-5b7e-2cc6-ddd3-5b5e4d2ded3f", "a1f2621a-b9a0-8b8f-a623-e8d1a465b9d1", "227439aa-f1c0-38de-e805-210ca4cfe4f1", "369fb26f-c45b-2911-b8ae-0b891162e55a", "a5a47785-992e-4322-1902-a06e87e1b89b", "b2101bf0-10b2-3296-bf46-9f6ab5417258", "b7cf3c63-f78f-8b13-a8cc-7ca7e99f7f2d", "3d15d157-6ca0-b13b-c6e4-2c49d2f90a82" ] }, "message": "vms" } ], "tasks": [ { "data": { "type": "VM", "id": "c72117ee-5b7e-2cc6-ddd3-5b5e4d2ded3f", "name_label": "psrv30" }, "id": "1721134381225", "message": "backup VM", "start": 1721134381225, "status": "success", "tasks": [ { "id": "1721134381253", "message": "clean-vm", "start": 1721134381253, "status": "success", "end": 1721134382311, "result": { "merge": false } }, { "id": "1721134382324", "message": "snapshot", "start": 1721134382324, "status": "success", "end": 1721134468381, "result": "7a671117-98ff-442c-8a25-0b518159dd91" }, { "data": { "id": "bfa75298-0f2d-4525-a4f4-c27c7b4443d3", "isFull": true, "type": "remote" }, "id": "1721134468381:0", "message": "export", "start": 1721134468381, "status": "success", "tasks": [ { "id": "1721134474475", "message": "transfer", "start": 1721134474475, "status": "success", "end": 1721134603873, "result": { "size": 5604999168 } }, { "id": "1721134610042", "message": "clean-vm", "start": 1721134610042, "status": "success", "end": 1721134612114, "result": { "merge": false } } ], "end": 1721134612115 } ], "infos": [ { "message": "Transfer data using NBD" }, { "message": "will delete snapshot data" }, { "data": { "vdiRef": "OpaqueRef:6257f019-dd31-8b16-732c-5d29911e0e51" }, "message": "Snapshot data has been deleted" } ], "warnings": [ { "data": { "error": { "code": "VDI_IN_USE", "params": [ 
"OpaqueRef:a6b4051c-6007-96d7-6053-77ff8c2bcdaa", "data_destroy" ], "call": { "method": "VDI.data_destroy", "params": [ "OpaqueRef:a6b4051c-6007-96d7-6053-77ff8c2bcdaa" ] } }, "vdiRef": "OpaqueRef:a6b4051c-6007-96d7-6053-77ff8c2bcdaa" }, "message": "Couldn't deleted snapshot data" } ], "end": 1721134612115 }, { "data": { "type": "VM", "id": "a1f2621a-b9a0-8b8f-a623-e8d1a465b9d1", "name_label": "psrv32" }, "id": "1721134381231", "message": "backup VM", "start": 1721134381231, "status": "success", "tasks": [ { "id": "1721134381329", "message": "clean-vm", "start": 1721134381329, "status": "success", "end": 1721134381498, "result": { "merge": false } }, { "id": "1721134381509", "message": "snapshot", "start": 1721134381509, "status": "success", "end": 1721134414736, "result": "46548732-64a4-9ec2-8f2f-fcd6695f6127" }, { "data": { "id": "bfa75298-0f2d-4525-a4f4-c27c7b4443d3", "isFull": true, "type": "remote" }, "id": "1721134414737", "message": "export", "start": 1721134414737, "status": "success", "tasks": [ { "id": "1721134422205", "message": "transfer", "start": 1721134422205, "status": "success", "end": 1721134749129, "result": { "size": 11214152704 } }, { "id": "1721134755579", "message": "clean-vm", "start": 1721134755579, "status": "success", "end": 1721134756596, "result": { "merge": false } } ], "end": 1721134756596 } ], "infos": [ { "message": "Transfer data using NBD" }, { "message": "will delete snapshot data" }, { "data": { "vdiRef": "OpaqueRef:11727e21-ce91-c93b-9381-bd741466aff5" }, "message": "Snapshot data has been deleted" } ], "warnings": [ { "data": { "error": { "code": "VDI_IN_USE", "params": [ "OpaqueRef:453add79-e2b9-f1a3-1940-2fe9fb6e9955", "data_destroy" ], "call": { "method": "VDI.data_destroy", "params": [ "OpaqueRef:453add79-e2b9-f1a3-1940-2fe9fb6e9955" ] } }, "vdiRef": "OpaqueRef:453add79-e2b9-f1a3-1940-2fe9fb6e9955" }, "message": "Couldn't deleted snapshot data" } ], "end": 1721134756597 }, { "data": { "type": "VM", "id": 
"369fb26f-c45b-2911-b8ae-0b891162e55a", "name_label": "psrv33" }, "id": "1721134756624", "message": "backup VM", "start": 1721134756624, "status": "success", "tasks": [ { "id": "1721134756646", "message": "clean-vm", "start": 1721134756646, "status": "success", "end": 1721134756699, "result": { "merge": false } }, { "id": "1721134756709", "message": "snapshot", "start": 1721134756709, "status": "success", "end": 1721134807757, "result": "3712099c-5138-f9d1-975b-75a8779e0e8c" }, { "data": { "id": "bfa75298-0f2d-4525-a4f4-c27c7b4443d3", "isFull": true, "type": "remote" }, "id": "1721134807758", "message": "export", "start": 1721134807758, "status": "success", "tasks": [ { "id": "1721134814098", "message": "transfer", "start": 1721134814098, "status": "success", "end": 1721136709333, "result": { "size": 42135879168 } }, { "id": "1721136715790", "message": "clean-vm", "start": 1721136715790, "status": "success", "end": 1721136717005, "result": { "merge": false } } ], "end": 1721136717005 } ], "infos": [ { "message": "Transfer data using NBD" }, { "message": "will delete snapshot data" }, { "data": { "vdiRef": "OpaqueRef:dfc7cdf5-70cc-7161-31d8-2ea8ef39a232" }, "message": "Snapshot data has been deleted" } ], "warnings": [ { "data": { "error": { "code": "VDI_IN_USE", "params": [ "OpaqueRef:98cb909a-79a2-3138-66b9-1ebe15701963", "data_destroy" ], "call": { "method": "VDI.data_destroy", "params": [ "OpaqueRef:98cb909a-79a2-3138-66b9-1ebe15701963" ] } }, "vdiRef": "OpaqueRef:98cb909a-79a2-3138-66b9-1ebe15701963" }, "message": "Couldn't deleted snapshot data" } ], "end": 1721136717005 }, { "data": { "type": "VM", "id": "a5a47785-992e-4322-1902-a06e87e1b89b", "name_label": "psrv38" }, "id": "1721136717009", "message": "backup VM", "start": 1721136717009, "status": "success", "tasks": [ { "id": "1721136717037", "message": "clean-vm", "start": 1721136717037, "status": "success", "end": 1721136717114, "result": { "merge": false } }, { "id": "1721136717126", "message": 
"snapshot", "start": 1721136717126, "status": "success", "end": 1721136756030, "result": "9161f706-3a4e-aa61-c485-f239bea58c1c" }, { "data": { "id": "bfa75298-0f2d-4525-a4f4-c27c7b4443d3", "isFull": true, "type": "remote" }, "id": "1721136756030:0", "message": "export", "start": 1721136756030, "status": "success", "tasks": [ { "id": "1721136762801", "message": "transfer", "start": 1721136762801, "status": "success", "end": 1721137081725, "result": { "size": 14236907008 } }, { "id": "1721137087610", "message": "clean-vm", "start": 1721137087610, "status": "success", "end": 1721137088697, "result": { "merge": false } } ], "end": 1721137088697 } ], "infos": [ { "message": "Transfer data using NBD" }, { "message": "will delete snapshot data" }, { "data": { "vdiRef": "OpaqueRef:8348309e-a904-0ef5-718d-5464aa73bfdf" }, "message": "Snapshot data has been deleted" } ], "warnings": [ { "data": { "error": { "code": "VDI_IN_USE", "params": [ "OpaqueRef:f577ba25-5f0e-79c8-c19b-8f67b9e70817", "data_destroy" ], "call": { "method": "VDI.data_destroy", "params": [ "OpaqueRef:f577ba25-5f0e-79c8-c19b-8f67b9e70817" ] } }, "vdiRef": "OpaqueRef:f577ba25-5f0e-79c8-c19b-8f67b9e70817" }, "message": "Couldn't deleted snapshot data" } ], "end": 1721137088697 }, { "data": { "type": "VM", "id": "227439aa-f1c0-38de-e805-210ca4cfe4f1", "name_label": "psrv31" }, "id": "1721134612122", "message": "backup VM", "start": 1721134612122, "status": "success", "tasks": [ { "id": "1721134612144", "message": "clean-vm", "start": 1721134612144, "status": "success", "end": 1721134612234, "result": { "merge": false } }, { "id": "1721134612261", "message": "snapshot", "start": 1721134612261, "status": "success", "end": 1721134637673, "result": "7fca43e7-50ce-c409-78f9-3f622990231b" }, { "data": { "id": "bfa75298-0f2d-4525-a4f4-c27c7b4443d3", "isFull": true, "type": "remote" }, "id": "1721134637674", "message": "export", "start": 1721134637674, "status": "success", "tasks": [ { "id": "1721134644637", 
"message": "transfer", "start": 1721134644637, "status": "success", "end": 1721137110501, "result": { "size": 67748438528 } }, { "id": "1721137116357", "message": "clean-vm", "start": 1721137116357, "status": "success", "end": 1721137117343, "result": { "merge": false } } ], "end": 1721137117343 } ], "infos": [ { "message": "Transfer data using NBD" }, { "message": "will delete snapshot data" }, { "data": { "vdiRef": "OpaqueRef:e5166784-8ccb-599c-2e26-14c1cb7cc455" }, "message": "Snapshot data has been deleted" } ], "warnings": [ { "data": { "error": { "code": "VDI_IN_USE", "params": [ "OpaqueRef:8b8c3e0e-d961-ac5f-807f-8871117ff4e8", "data_destroy" ], "call": { "method": "VDI.data_destroy", "params": [ "OpaqueRef:8b8c3e0e-d961-ac5f-807f-8871117ff4e8" ] } }, "vdiRef": "OpaqueRef:8b8c3e0e-d961-ac5f-807f-8871117ff4e8" }, "message": "Couldn't deleted snapshot data" } ], "end": 1721137117343 }, { "data": { "type": "VM", "id": "b2101bf0-10b2-3296-bf46-9f6ab5417258", "name_label": "psrv37" }, "id": "1721137088704", "message": "backup VM", "start": 1721137088704, "status": "success", "tasks": [ { "id": "1721137088727", "message": "clean-vm", "start": 1721137088727, "status": "success", "end": 1721137088789, "result": { "merge": false } }, { "id": "1721137088814", "message": "snapshot", "start": 1721137088814, "status": "success", "end": 1721137140639, "result": "ce015f67-e251-4719-04e4-3fc1f5a32e4d" }, { "data": { "id": "bfa75298-0f2d-4525-a4f4-c27c7b4443d3", "isFull": true, "type": "remote" }, "id": "1721137140639:0", "message": "export", "start": 1721137140639, "status": "success", "tasks": [ { "id": "1721137149516", "message": "transfer", "start": 1721137149516, "status": "success", "end": 1721137256573, "result": { "size": 5334400512 } }, { "id": "1721137263812", "message": "clean-vm", "start": 1721137263812, "status": "success", "end": 1721137265019, "result": { "merge": false } } ], "end": 1721137265020 } ], "infos": [ { "message": "Transfer data using NBD" }, { 
"message": "will delete snapshot data" }, { "data": { "vdiRef": "OpaqueRef:53a79aeb-5a64-4e2f-b016-bb2d52b913ba" }, "message": "Snapshot data has been deleted" }, { "data": { "vdiRef": "OpaqueRef:8164fe16-721e-e6c9-4a45-c7a435ab84e7" }, "message": "Snapshot data has been deleted" } ], "end": 1721137265020 }, { "data": { "type": "VM", "id": "b7cf3c63-f78f-8b13-a8cc-7ca7e99f7f2d", "name_label": "psrv36" }, "id": "1721137117347", "message": "backup VM", "start": 1721137117347, "status": "success", "tasks": [ { "id": "1721137117366", "message": "clean-vm", "start": 1721137117366, "status": "success", "end": 1721137117421, "result": { "merge": false } }, { "id": "1721137117433", "message": "snapshot", "start": 1721137117433, "status": "success", "end": 1721137182291, "result": "40eaa629-2c0d-cfe5-a969-59511d803d33" }, { "data": { "id": "bfa75298-0f2d-4525-a4f4-c27c7b4443d3", "isFull": true, "type": "remote" }, "id": "1721137182291:0", "message": "export", "start": 1721137182291, "status": "success", "tasks": [ { "id": "1721137188995", "message": "transfer", "start": 1721137188995, "status": "success", "end": 1721137354685, "result": { "size": 6817448960 } }, { "id": "1721137361886", "message": "clean-vm", "start": 1721137361886, "status": "success", "end": 1721137363289, "result": { "merge": false } } ], "end": 1721137363289 } ], "infos": [ { "message": "Transfer data using NBD" }, { "message": "will delete snapshot data" }, { "data": { "vdiRef": "OpaqueRef:a4556739-a7a7-0ad9-83d3-d982b7c6b634" }, "message": "Snapshot data has been deleted" }, { "data": { "vdiRef": "OpaqueRef:52e2e793-1301-f647-16cc-a49c49f54844" }, "message": "Snapshot data has been deleted" } ], "end": 1721137363289 }, { "data": { "type": "VM", "id": "3d15d157-6ca0-b13b-c6e4-2c49d2f90a82", "name_label": "psrv34" }, "id": "1721137265039", "message": "backup VM", "start": 1721137265039, "status": "success", "tasks": [ { "id": "1721137265065", "message": "clean-vm", "start": 1721137265065, "status": 
"success", "end": 1721137265129, "result": { "merge": false } }, { "id": "1721137265190", "message": "snapshot", "start": 1721137265190, "status": "success", "end": 1721137301489, "result": "27fac29e-86ea-bbf9-4232-fd69c8518562" }, { "data": { "id": "bfa75298-0f2d-4525-a4f4-c27c7b4443d3", "isFull": true, "type": "remote" }, "id": "1721137301490", "message": "export", "start": 1721137301490, "status": "success", "tasks": [ { "id": "1721137308542", "message": "transfer", "start": 1721137308542, "status": "success", "end": 1721137521612, "result": { "size": 8711639552 } }, { "id": "1721137527713", "message": "clean-vm", "start": 1721137527713, "status": "success", "end": 1721137528729, "result": { "merge": false } } ], "end": 1721137528729 } ], "infos": [ { "message": "Transfer data using NBD" }, { "message": "will delete snapshot data" }, { "data": { "vdiRef": "OpaqueRef:5bf6cd52-e947-fc28-1815-cc4d035f9c68" }, "message": "Snapshot data has been deleted" } ], "warnings": [ { "data": { "error": { "code": "VDI_IN_USE", "params": [ "OpaqueRef:47a39aa7-7dea-0956-00d7-02324dd0495d", "data_destroy" ], "call": { "method": "VDI.data_destroy", "params": [ "OpaqueRef:47a39aa7-7dea-0956-00d7-02324dd0495d" ] } }, "vdiRef": "OpaqueRef:47a39aa7-7dea-0956-00d7-02324dd0495d" }, "message": "Couldn't deleted snapshot data" } ], "end": 1721137528729 } ], "end": 1721137528729 }
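To see at a glance which VMs hit the error, the log can be scanned for warnings whose error code is VDI_IN_USE. In the log above, six of the eight VMs carry that warning (psrv30, psrv31, psrv32, psrv33, psrv34, psrv38), while psrv36 and psrv37 report "Snapshot data has been deleted" twice and have no warning. The following is only a minimal sketch: the function name `vdi_in_use_warnings` is made up for illustration, and the sample is a trimmed excerpt that keeps the same key layout as the log (`tasks[].warnings[].data.error.code`).

```python
import json


def vdi_in_use_warnings(job_log):
    """Return (vm_name, vdi_ref) pairs for each VDI_IN_USE warning in a job log dict.

    Hypothetical helper: it assumes the layout seen in the log above, where
    warnings sit directly on each per-VM "backup VM" task.
    """
    hits = []
    for vm_task in job_log.get("tasks", []):
        vm_name = vm_task.get("data", {}).get("name_label", "?")
        for warning in vm_task.get("warnings", []):
            data = warning.get("data", {})
            if data.get("error", {}).get("code") == "VDI_IN_USE":
                hits.append((vm_name, data.get("vdiRef")))
    return hits


# Trimmed excerpt of the log above: one VM with the warning, one without.
SAMPLE_LOG = json.loads("""
{
  "jobName": "psrv3x",
  "tasks": [
    {
      "data": {"type": "VM", "name_label": "psrv30"},
      "warnings": [
        {
          "data": {
            "error": {
              "code": "VDI_IN_USE",
              "params": ["OpaqueRef:a6b4051c-6007-96d7-6053-77ff8c2bcdaa", "data_destroy"]
            },
            "vdiRef": "OpaqueRef:a6b4051c-6007-96d7-6053-77ff8c2bcdaa"
          },
          "message": "Couldn't deleted snapshot data"
        }
      ]
    },
    {
      "data": {"type": "VM", "name_label": "psrv37"},
      "infos": [{"message": "will delete snapshot data"}]
    }
  ]
}
""")

for vm, ref in vdi_in_use_warnings(SAMPLE_LOG):
    print(vm, ref)
```

Run against the full log saved to a file, `vdi_in_use_warnings(json.load(open(path)))` would give the same kind of list for all eight VMs.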
-
xe vdi-data-destroy does not free the space either.
[root@host3 sm]# xe vdi-list params=uuid,name-label,virtual-size,physical-utilisation,type type="CBT metadata" name-label=psrv34_data01
uuid ( RO) : ceb5a540-7d53-4614-be3e-b329d8d7336b
name-label ( RW): psrv34_data01
virtual-size ( RO): 10737418240
physical-utilisation ( RO): 8388608
type ( RO): CBT metadata

[root@host3 sm]# xe vdi-data-destroy uuid=ceb5a540-7d53-4614-be3e-b329d8d7336b

[root@host3 sm]# xe vdi-list params=uuid,name-label,virtual-size,physical-utilisation,type type="CBT metadata" name-label=psrv34_data01
uuid ( RO) : ceb5a540-7d53-4614-be3e-b329d8d7336b
name-label ( RW): psrv34_data01
virtual-size ( RO): 10737418240
physical-utilisation ( RO): 8388608
type ( RO): CBT metadata
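For scale, the numbers above are 8388608 bytes (8 MiB) of physical utilisation on a 10737418240-byte (10 GiB) virtual disk, unchanged before and after `xe vdi-data-destroy`. Whether 8 MiB is simply the normal footprint of a metadata-only VDI on this SR type is not established in this thread. The sketch below parses `xe vdi-list` style output and converts the sizes; the parser and its name `parse_xe_records` are assumptions for illustration, not part of the `xe` tooling.

```python
import re

# Matches lines such as "physical-utilisation ( RO): 8388608".
# The optional S/M prefix covers set/map fields like "(SRO)" or "(MRW)".
FIELD = re.compile(r"^\s*([\w-]+)\s*\(\s*[SM]?R[OW]\)\s*:\s*(.*)$")


def parse_xe_records(text):
    """Split `xe ...-list` output into a list of dicts, one per record.

    Hypothetical helper: records are assumed to be separated by blank lines.
    """
    records, current = [], {}
    for line in text.splitlines():
        match = FIELD.match(line)
        if match:
            current[match.group(1)] = match.group(2).strip()
        elif not line.strip() and current:
            records.append(current)
            current = {}
    if current:
        records.append(current)
    return records


# Output copied from the transcript above.
XE_OUTPUT = """\
uuid ( RO) : ceb5a540-7d53-4614-be3e-b329d8d7336b
name-label ( RW): psrv34_data01
virtual-size ( RO): 10737418240
physical-utilisation ( RO): 8388608
type ( RO): CBT metadata
"""

for vdi in parse_xe_records(XE_OUTPUT):
    used_mib = int(vdi["physical-utilisation"]) / 2**20
    virtual_gib = int(vdi["virtual-size"]) / 2**30
    print(f'{vdi["name-label"]}: {used_mib:.0f} MiB used of {virtual_gib:.0f} GiB virtual')
```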
-
The current CBT backup is working, but it contains several bugs that need to be resolved. @florent is working on fixes, but from what I understand they are difficult to fix. Hope there will be progress soon!
-
@rtjdamen I noticed this today as well. A VM within a pool migrated from one host to another, after which I received:
Error: stream has ended with not enough data (actual: 449, expected: 512)
Retrying the backup resulted in a full backup being run. Hopefully XO can start to handle this; otherwise I'm not sure CBT backups are worth it over the old method for pools with multiple hosts. Would running a Rolling Pool Update or Rolling Pool Reboot guarantee that every VM fails and requires a full backup? Or is this not expected behaviour?
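Read literally, the message says that something expected a 512-byte block and the stream ended after 449 bytes. The sketch below only illustrates that general class of failure with an in-memory stream. It is not XO's actual code, and `read_exactly` is a made-up name.

```python
import io


def read_exactly(stream, size):
    """Read exactly `size` bytes from `stream`, or raise if it ends early.

    Illustrative only: the error text mimics the message quoted above.
    """
    chunks = []
    remaining = size
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:  # the stream ended before the block was complete
            actual = size - remaining
            raise EOFError(
                f"stream has ended with not enough data "
                f"(actual: {actual}, expected: {size})"
            )
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


# A stream that is cut off after 449 bytes while 512 are expected.
truncated = io.BytesIO(b"\x00" * 449)
try:
    read_exactly(truncated, 512)
except EOFError as err:
    print("Error:", err)
```

In other words, the error points at a data source that closed or ran dry early, rather than at corrupt content in the bytes that did arrive.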
-
@flakpyro No, this is a bug; we see this on VMs that are not migrated. In our other backup software, which uses CBT as well, we do not see this issue.
-
I cannot reproduce this in my own production here. The only problem is when I have a job spanning 2 different pools: the pool that doesn't have NBD enabled keeps re-doing a full. The NBD-enabled pool works perfectly though.
-
@olivierlambert From what I understand, this issue occurs when a snapshot is deleted at the wrong time; at least this is what florent told me. He created a fix for it, but that does not resolve the issue.
-
I do not reproduce that problem here, so it's clearly something subtle and/or configuration dependent. Any other feedback from the community with CBT? More feedback will be helpful to pinpoint the remaining issues.
-
@olivierlambert Maybe it was just a coincidence then that this happened after a migration. Backups with this VM had been working well for about a week until I migrated it to a different host yesterday.
-
@flakpyro We see this happening to a few VMs (2 or 3) at random on a large pool (400+), so I think it is a coincidence.
-
@flakpyro Moving to another SR or just live migrating?
-
@olivierlambert Just live migrating.
So as a test today, after the failure last night, I ran a full backup with CBT enabled on 2 VMs. I then migrated both from Host 1 to Host 2, using the same shared SR mounted on each host via NFS 4.1 (TrueNAS). Tonight the backup failed with the same error:
Error: stream has ended with not enough data (actual: 449, expected: 512)
If it would help at all, I could open a ticket referencing this thread and enable the remote support tunnel. We can always go back to the old backup method, but if I can help make CBT rock solid, I'm always willing to help!
-
@flakpyro I did the same test at our end: live migration of a VM, and no issue with the backup afterwards. I can't reproduce it. We use iSCSI, so maybe the difference is there; I have seen CBT issues on NFS as well with our third-party tooling. Will do some more testing tonight.