@gduperrey Installed on the same 2 hosts as the last batch of test updates released in December.
No issues to report so far; ran a backup job afterwards without issue.
@olivierlambert I wonder if it would be beneficial to show all snapshots older than 30 days, not just the ones not created by an automated process. For example, what happens if an XO backup job runs and creates a snapshot, but the process is interrupted and the job fails? Will the next run of the job clean up the previous failed job's snapshot, or is there a chance a snapshot could be left behind?
I have run into this numerous times. It's one of the reasons I have not switched to "Purge Snapshot data when using CBT" on all my jobs yet.
I hope the fixes in testing solve the issue; what has been fixing it for me in the meantime is editing /opt/xensource/sm/cleanup.py and modifying LIVE_LEAF_COALESCE_MAX_SIZE and LIVE_LEAF_COALESCE_TIMEOUT to the following values:
LIVE_LEAF_COALESCE_MAX_SIZE = 1024 * 1024 * 1024 # bytes
LIVE_LEAF_COALESCE_TIMEOUT = 300 # seconds
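For anyone applying the same workaround, a quick sanity check that the edit is in place (assuming the stock file location above) is:
# Confirm the coalesce values currently set in cleanup.py
grep -nE "LIVE_LEAF_COALESCE_(MAX_SIZE|TIMEOUT)" /opt/xensource/sm/cleanup.py
Keep in mind that an update to the sm package may replace this file, so the values are worth re-checking after patching.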
I'm not sure what the best criteria would be for listing snapshots on the health check page. Perhaps if the snapshot is over 30 days old? There is already "Too many snapshots" (not sure what counts as too many?) and "Orphaned VMs snapshot"; perhaps simply another section called "Old Snapshots"?
I have often thought this might be something worth having in the health check area. Coming from VMware, we used to frequently run a tool called "RVTools" that could show you old snapshots that may have been forgotten about, or snapshots created by a failed backup job that were never removed properly. It would be useful to be able to find snapshots like that and remove them before they become a problem.
As an update, we just spun up our DR pool yesterday, a fresh install of XCP-ng 8.3 on all hosts in the pool. Testing migrations and backups with CBT enabled shows the same behaviour we experience on the other pools. Removing the default migration network allows CBT to work properly, but specifying a default migration network causes CBT to be reset after a VM migration. So I think this is pretty reproducible, at least using a file-based SR like NFS.
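For anyone wanting to run the same check on their pool, what I'm doing boils down to roughly this (UUIDs are placeholders; I'm assuming the cbt-enabled VDI field is available and that the file-based SR is mounted under /run/sr-mount as usual):
# Find which VDIs currently have CBT enabled
xe vdi-list cbt-enabled=true params=uuid,name-label
# On the SR hosting the VDI, inspect the CBT log; all zeros means the chain was reset
cd /run/sr-mount/SR_UUID
cbt-util get -c -n VDI_UUID.cbtlog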
@olivierlambert Not to dig up an old thread, but was this ever added? I was looking around and wasn't able to find it anywhere.
@DustinB Are there any downsides to having two XOA instances pointing at the same pool? Since the config itself is stored at the pool level, I'm guessing there's no downside?
i.e. primary XOA running in the core DC and secondary XOA running at your DR site. Is it just a matter of adding the pool on the secondary XOA so it picks up the existing config, or did you need to do a full export / import?
@manilx When I did this I used the xe CLI; you can also use XCP-ng Center, but with 8.3 you'll need to download the beta version linked in the XCP-ng Center forum thread.
xe vm-migrate uuid=UUID_OF_XOA_VM remote-master=new_pool_master_IP remote-username=root remote-password=PASSWORD host-uuid=destination_host_uuid vdi:vdi_uuid=destination_sr_uuid vif:source_vif_uuid=destination_network_uuid
Docs: https://docs.xenserver.com/en-us/citrix-hypervisor/command-line-interface.html#vm-migrate
So in the case where CBT is being reset, the network of the VM is not actually being changed during migration. The VM is moving from Host A to Host B within the same pool, using NFS shared storage, which is also not changing. However, when "Default migration network" is set in the pool's Advanced tab, the CBT data is reset. When a default migration network is not set, the CBT data remains intact.
It seems like migrate_send will always reset CBT data during a migration, even if it is within the same pool on shared storage, and that this call is used when a default migration network is specified in XO's Pool - Advanced tab. Meanwhile, vm.pool_migrate will not reset CBT, but it is only used when a default migration network is NOT set in XO's Pool - Advanced tab. Not sure how we work around that, short of not using a dedicated migration network?
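If anyone wants to confirm which code path a given migration actually took, one way (assuming the default XCP-ng log location and that the call names are logged the way I expect) should be to grep the XAPI log on the pool master right after the migration:
# Show the most recent migration-related XAPI calls; log path is the XCP-ng default
grep -E "VM\.(pool_migrate|migrate_send)" /var/log/xensource.log | tail -n 20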
Thanks for the tip!
Looking at the output:
command name : vm-migrate
reqd params :
optional params : live, host, host-uuid, remote-master, remote-username, remote-password, remote-network, force, copy, compress, vif:, vdi:, <vm-selectors>
It does not appear there is a way for me to specify a migration network using the vm-migrate command?
It sounds to me like vm.migrate_send is causing CBT to be reset, while vm.pool_migrate is leaving it intact? The difference being a migration that is known to stay within a pool vs one that could potentially be migrating a VM anywhere?
I think we have a pretty good idea of the cause now: it seems to be related to having a migration network specified at the pool level.
I think we are closer than ever to having this worked out, and it should help a lot of us using a dedicated migration network (as was best practice in VMware land). What are the next steps we need to take?
@olivierlambert @MathieuRA Once you are able to provide the xe flag to specify a migration network, I will test this ASAP. I think we're really close to getting to the bottom of this issue!
@olivierlambert Makes sense! I would schedule the replication to occur a couple of hours after the backup runs are complete, to ensure it's a replica of all the data and not a partial replica!
Hi Everyone
So in our current environment we use XOA to back up our environment nightly to a large ZFS array of disks. I am in the process of standing up a new remote offsite backup, and I was wondering about the best way to send a copy of these backups over there.
I have the choice of using a nightly rsync job, ZFS replication, or XOA's own mirror backup function.
I am leaning towards rsync or ZFS replication, as then traffic flows directly from the onsite backup repo to the remote one and is completely transparent and separate from the XO environment, whereas XOA's mirror function sends the data through the XOA appliance itself. In the case of ZFS replication I can also keep multiple snapshots of the ZFS pool.
My understanding is that once the data lands at the remote site, I could simply add the remote server as a "remote" to an XOA appliance and the backups would then appear in its inventory, at which point I could restore or test them if I wanted.
Any pitfalls with doing that, as opposed to the XOA mirror backup function, that I'm missing?
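For concreteness, the kind of nightly job I have in mind looks something like this (host names, dataset names and paths are just placeholders, and the ZFS option assumes both ends run ZFS):
# Option A: rsync the backup repository straight to the offsite box
rsync -aH --delete /mnt/backup-repo/ backup-dr.example.com:/mnt/backup-repo/
# Option B: incremental ZFS replication, which also keeps snapshot history on both ends
zfs snapshot tank/xo-backups@$(date +%F)
zfs send -I tank/xo-backups@PREVIOUS_SNAPSHOT tank/xo-backups@$(date +%F) | ssh backup-dr.example.com zfs receive -F tank/xo-backups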
@olivierlambert
I'm on it! However, after searching the XCP-ng docs as well as the XenServer docs, I can't seem to find how to specify a migration network using xe from the CLI. Are you able to provide the flag I need to use?
@olivierlambert Glad we're getting to the bottom of this!
Out of curiosity, is having an isolated migration network that is only available to the XCP-ng hosts considered best practice with XCP-ng? It was with VMware, to keep vMotion traffic on its own subnet, and since the VLAN was already created on our switches I decided to keep that setup. Ideally we can get this fixed either way; I'm just curious whether I'm doing something considered strange?
@olivierlambert
We're making progress, I think!
Correct: letting the migration run with those settings results in all zeros when running the cbt-util check.
I tried removing the migration network and ran a migration with the following settings:
Before migration:
[14:27 xcpng-test-01 45e457aa-16f8-41e0-d03d-8201e69638be]# cbt-util get -c -n 7560326c-8b15-4c58-841f-6a8f962a7d28.cbtlog
fe6e3edd-4d63-4005-b0f3-932f5f34e036
And after migration:
[14:27 xcpng-test-01 45e457aa-16f8-41e0-d03d-8201e69638be]# cbt-util get -c -n 7560326c-8b15-4c58-841f-6a8f962a7d28.cbtlog
fe6e3edd-4d63-4005-b0f3-932f5f34e036
If i select a default migration network and run the same migration:
[14:31 xcpng-test-01 45e457aa-16f8-41e0-d03d-8201e69638be]# cbt-util get -c -n 7560326c-8b15-4c58-841f-6a8f962a7d28.cbtlog
00000000-0000-0000-0000-000000000000
I think we're getting somewhere now! I have the migration network on both the test and DR pools. This used to be our "vMotion" network back when we ran vSphere, and I decided to continue using it to keep migration traffic on an isolated, secure VLAN.
In fact, these Veeam VMs are not even being used anymore; they exist in our test lab as VMs to mess around with for things like this.
Here is a screenshot of how I am doing the migration in XOA: moving from host 2 to host 1, leaving the SR drop-down empty.
For sure, I ran:
xe vm-migrate uuid=a14f0ad0-854f-b7a8-de5c-88056100b6c6 host-uuid=c354a202-3b30-486b-9645-2fd713dee85f
To move the VM from host 1 to host 2....
Doing it this way, I noticed that checking the CBT log file does not result in all zeros being output.
[10:00 xcpng-test-01 45e457aa-16f8-41e0-d03d-8201e69638be]# cbt-util get -c -n 087ad136-f31b-4d7c-9271-7c926fd51089.cbtlog
fe6e3edd-4d63-4005-b0f3-932f5f34e036
For fun I then moved the VM back from host 2 to host 1 and, again, the cbtlog file seems to be intact:
[10:02 xcpng-test-01 45e457aa-16f8-41e0-d03d-8201e69638be]# cbt-util get -c -n 087ad136-f31b-4d7c-9271-7c926fd51089.cbtlog
fe6e3edd-4d63-4005-b0f3-932f5f34e036
After all this migrating I then ran a backup job, which completed fine and without any errors about not being able to do a delta.
So it seems like it works fine via the xe CLI.
Update:
After the backup ran properly and generated a new CBT log file, I moved the VM back and forth between hosts again using the CLI, and the cbtlog file stays intact when checked with cbt-util. When I do this with XOA, the result from cbt-util is all zeros.