CBT: the thread to centralize your feedback
-
@olivierlambert I'm making progress getting to the bottom of this, thanks to some documentation from XenServer about using cbt-util.
You can use the cbt-util utility, which helps establish the chain relationship between VDI snapshots. If the snapshots are not linked by changed block metadata, you get errors like “SR_BACKEND_FAILURE_460”, “Failed to calculate changed blocks for given VDIs”, and “Source and target VDI are unrelated”. Example usage of cbt-util:

cbt-util get -c -n <name of cbt log file>

The -c option prints the child log file UUID.
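For anyone following along, this is roughly how you walk a chain with it (a sketch based on those docs; the filenames and the SR mount path here are placeholders, and the docs also list a -p option that prints the parent log UUID):

cd /run/sr-mount/<SR-UUID>                # cbtlog files sit next to the VDIs on the SR (path may differ on your setup)
cbt-util get -p -n <vdi-uuid>.cbtlog      # prints the parent CBT log UUID in the chain
cbt-util get -c -n <vdi-uuid>.cbtlog      # prints the child CBT log UUID in the chain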
I cleared all CBT snapshots from my test VMs and ran a full backup on each VM. Then I verified the CBT chain was consistent using cbt-util; the output was:
[14:22 xcpng-test-01 45e457aa-16f8-41e0-d03d-8201e69638be]# cbt-util get -c -n 867063fc-4d86-420a-9ad2-dfe1749ecbc1.cbtlog
1950d6a3-c6a9-4b0c-b79f-068dd44479cc
After the backup was complete, I migrated the VM to the second host in the pool and ran the same command from both hosts:
[14:26 xcpng-test-01 45e457aa-16f8-41e0-d03d-8201e69638be]# cbt-util get -c -n 867063fc-4d86-420a-9ad2-dfe1749ecbc1.cbtlog
00000000-0000-0000-0000-000000000000
And from the second host:
[14:26 xcpng-test-02 45e457aa-16f8-41e0-d03d-8201e69638be]# cbt-util get -c -n 867063fc-4d86-420a-9ad2-dfe1749ecbc1.cbtlog
00000000-0000-0000-0000-000000000000
That's clearly the problem right there; the question is, what is causing it to happen?
After running another full backup, the zeroed-out cbtlog file is removed and a new one is created, which works fine until the VM is migrated again:
[14:39 xcpng-test-01 45e457aa-16f8-41e0-d03d-8201e69638be]# cbt-util get -c -n 1eefb7bf-9dc3-4830-8352-441a77412576.cbtlog
1950d6a3-c6a9-4b0c-b79f-068dd44479cc
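In case it helps anyone else checking for this, here's a quick loop to flag broken chains (a sketch; it assumes a file-based SR mounted under /run/sr-mount, and treats an all-zero child UUID, as in the output above, as a severed link):

cd /run/sr-mount/<SR-UUID>
for f in *.cbtlog; do
    # A child UUID of all zeros means the chain link was lost
    child=$(cbt-util get -c -n "$f")
    if [ "$child" = "00000000-0000-0000-0000-000000000000" ]; then
        echo "broken chain: $f"
    fi
done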
-
@flakpyro I can't reproduce this on our end; after migration within the pool on the same storage pool, the CBT is preserved. When I migrate to a different storage pool, the CBT is reset.
-
@rtjdamen Interesting, is this with an iSCSI (block) SR or with an NFS SR?
-
@flakpyro both scenarios
-
@rtjdamen Hmm, very strange.
The only thing I can think of is that this may be due to the fact that these VMs were imported from VMware.
Next week I can try creating a brand-new NFSv3 SR (since NFSv4 has caused issues in the past) as well as a clean-install VM that was not imported from VMware, and see if the issue persists.
-
This is a completely different five-host pool backed by a Pure Storage array with SRs mounted via NFSv3; migrating a VM between hosts results in the same issue.
Before migration:
[01:41 xcpng-prd-03 b04d9910-8671-750f-050e-8b55c64fbede]# cbt-util get -c -n 83035854-b5a9-4f7e-869f-abe43ddc658d.cbtlog
e28065ff-342f-4eae-a910-b91842dd39ca

After migration:
[01:41 xcpng-prd-03 b04d9910-8671-750f-050e-8b55c64fbede]# cbt-util get -c -n 83035854-b5a9-4f7e-869f-abe43ddc658d.cbtlog
00000000-0000-0000-0000-000000000000
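Something like this captures the before/after in one shot (a sketch; the VM and host placeholders are hypothetical, and xe vm-migrate here is the standard intra-pool migration command):

# Record the child UUID, migrate, then compare
before=$(cbt-util get -c -n 83035854-b5a9-4f7e-869f-abe43ddc658d.cbtlog)
xe vm-migrate vm=<vm-uuid> host=<target-host>
after=$(cbt-util get -c -n 83035854-b5a9-4f7e-869f-abe43ddc658d.cbtlog)
echo "before=$before after=$after"    # "after" comes back all zeros here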
I don't think I have anything "custom" running that would cause this, so I have no idea why this is happening, but it's happening on multiple pools for us.
-
@flakpyro Is there any difference between migrating with the VM powered on versus powered off?
-
@flakpyro I have just tested both live and offline migration on our end; both kept the CBT alive. Tested on both iSCSI and NFS.
-
Looks like it does this if the VM is powered off as well. I'm really not sure what else to try, since this is happening on two different pools for us.
I may need to end up submitting a ticket with Vates for them to get to the bottom of it.
-
@flakpyro Are you running the latest XCP-ng version, 8.2 or 8.3?
-
@rtjdamen Both pools are on 8.3 with all the latest updates.
I did find this PR on GitHub and wonder if it may be related: https://github.com/vatesfr/xen-orchestra/pull/8127 but I'm not sure why it would only happen after a migration...
-
@flakpyro We are still on 8.2, so maybe there is some difference there.
-
Thanks for the feedback @flakpyro, and it shows it's not an XO issue. Something is resetting CBT in your case when it shouldn't, and I don't know why. But clearly, you have a way to test it easily, which is progress.