XCP-ng

    CBT: the thread to centralize your feedback

    • olivierlambert Vates 🪐 Co-Founder CEO

      We are moving forward to implement CBT, and in this thread, we'll centralize everything you need to test and report issues.

      More instructions to come 🙂

      • Tristis Oris Top contributor

        I'll bump my ticket up :) https://github.com/vatesfr/xen-orchestra/issues/7771

        TristisOris created this issue in vatesfr/xen-orchestra

        open delta backup failed after last update - no alias references VHD #7771

        • manilx @Tristis Oris

          @olivierlambert Posted this: https://xcp-ng.org/forum/topic/9275/starting-getting-again-retry-the-vm-backup-due-to-an-error-error-vdi-must-be-free-or-attached-to-exactly-one-vm

          Just posting here because I found .cbtlog files again, with coalesces hanging, on the latest build, which was supposed to have removed all of this.

          • Anonabhar

            I think I may have a bit of a similar problem here. About a week ago, I did an update to the broken version of XO and it threw the same error as is in the subject line here. I reverted and everything was OK, but then I started to get unhealthy VDI warnings on my backups.

            I tried to rescan the SR and I would see in the SMLog that it believed another GC was running, so it would abort. Rebooting the host was the only way to force the coalesce to complete; however, as soon as the next inc-backup ran, it would hit the same problem (the GC thinking another is running and not doing any work).

            I then did a full power off of the host, rebooted, let all the VMs sit in a "powered off" state, rescanned the SR and let it coalesce. Once everything was idle, I deleted all snapshots and waited for the coalesce to finish. Only then did I restart the VMs. Now a few VMs have immediately come up as 'unhealthy' and once again the GC will not run, thinking there is another GC working.

            I'm kind of running out of ideas 8-) Does anyone know what might be stuck or what I need to look for to find out?


            Just a side note here. I noticed that all the VMs that I am having problems with have CBT enabled.

            I have a VM that is a snapshot-only VM, and even when the coalesce is stuck, I can delete snapshots off this non-CBT VM and the coalesce process runs (then gives an exception when it gets to the VMs that have CBT enabled).

            Is there a way to disable CBT?

            • florent Vates 🪐 XO Team

              Hello everybody,
              Thanks for your feedback. Here is a work branch with CBT enabled: https://github.com/vatesfr/xen-orchestra/pull/7792. The branch name is fix_cbt.

              It fixes:

              • snapshot retention with full backups

              • off by one error for retention length

              • parent locator error

              • can't destructure undefined error

              • VDIs are no longer left attached to the dom0 (no leak seen in our lab)

              • progress is back on the export task

              Please test it if you can, and don't hesitate to provide feedback.

              Regards,

              Florent

              fbeauchamp opened this pull request in vatesfr/xen-orchestra

              draft fix(backups): CBT omnibus #7792

              • Anonabhar

                For those who may be stuck, like I was: I have finally undone the coalesce nightmare the previous CBT build caused.

                For note: I am using XCP-ng 8.3 Beta fully patched.

                1. What I had to do was shut down every VM and delete every snapshot.
                2. Find every VDI that had CBT enabled and disable it. I did this with a simple bash loop (not the best, I know):
                # Disable CBT on every VDI that currently has it enabled.
                # --minimal prints the UUIDs as one comma-separated line.
                for uuid in $(xe vdi-list cbt-enabled=true params=uuid --minimal | tr ',' ' ')
                do
                     echo "$uuid"
                     xe vdi-disable-cbt uuid="$uuid"
                done
                
                3. Reboot the server.
                4. Create a snapshot on any VM and immediately delete it. (If you just do a rescan, it says that the GC is running when it is not, but for whatever reason, deleting a snapshot seems to kick in the GC regardless.)
                5. Keep an eye on the SMLog and look for exceptions. I tend to do something like this (it will sleep for 5 minutes, so don't get anxious):
                tail -f /var/log/SMLog | grep SMGC
                
                6. When it finishes, check XO to see if there are any remaining uncoalesced disks and repeat from step 4.

                It took about 5 iterations of the above to finally clean up all the stuck uncoalesced leaves, but it eventually did it. The key, for me, was making sure the VMs were not running and turning CBT off.
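When the `tail -f` output from the procedure above is too noisy, a small filter can help. This is only a sketch: the function names are made up for illustration, and the case-insensitive match on the word "exception" is an assumption about how the GC words its errors in SMLog, not something confirmed in this thread.

```shell
# Print only the garbage-collector (SMGC) lines that mention an exception.
# Assumption: GC errors contain the word "exception" in some letter case.
smgc_exceptions() {
    log_file="${1:-/var/log/SMLog}"
    grep 'SMGC' "$log_file" | grep -i 'exception'
}

# Count those lines; prints 0 when there are none.
smgc_exception_count() {
    log_file="${1:-/var/log/SMLog}"
    grep 'SMGC' "$log_file" | grep -ic 'exception' || true
}
```

Running `smgc_exception_count` after each iteration of steps 4 to 6 gives a quick signal of whether the GC is still hitting errors; `smgc_exceptions` shows the matching lines themselves.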

                • rtjdamen @florent

                  @florent Hi Florent, I would love to help you test this in our lab. I have XO from sources running there, but I have no CBT options. Do I need to download it in a specific way?

                  • Delgado @florent

                    @florent I'll be more than happy to help. I will get my homelab instance upgraded to that branch and report back with any issues.

                    • olivierlambert Vates 🪐 Co-Founder CEO

                      @rtjdamen You need to switch to the fix_cbt branch (e.g. git checkout fix_cbt) and rebuild.
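For an XO-from-sources install, the switch and rebuild might look like the sketch below. The function name and the checkout directory argument are hypothetical; the `yarn` and `yarn build` steps follow the usual from-sources update routine and are assumed to apply unchanged to this branch.

```shell
# Sketch: switch an existing XO-from-sources checkout to a test branch and rebuild.
# $1 = path to the xen-orchestra checkout (hypothetical, supplied by the caller)
# $2 = branch name, defaulting to fix_cbt
switch_xo_branch() {
    xo_dir="$1"
    branch="${2:-fix_cbt}"
    (
        # Work in a subshell so the caller's working directory is untouched.
        cd "$xo_dir" || exit 1
        git fetch origin &&
        git checkout "$branch" &&
        git pull --ff-only &&
        yarn &&
        yarn build
    )
}
```

After the build finishes, xo-server has to be restarted in whatever way the install normally runs it before the CBT options show up.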

                      • rtjdamen @olivierlambert

                        @olivierlambert Thank you, found it. I will run some backups with one or two VMs to start with and will report the results.

                        • rtjdamen @rtjdamen

                          This seems to be working fine. Once the backup is complete, we'll execute the vdi_data_destroy command, right? Currently, it doesn't appear obvious that this is a CBT metadata-only snapshot. Is there a way to make this more visible?

                          • olivierlambert Vates 🪐 Co-Founder CEO

                            You mean in the VM view/snapshot tab? You are seeing the VM snapshot, not the VDI snapshot, so I wonder if this VM snapshot can be reverted while being CBT metadata only, and if not, we must make it clear in the UI, yes!

                            • Delgado

                              I enabled CBT on the disks and NBD + CBT in my delta backup, and so far so good. I plan on letting another backup run overnight. I also ran a full backup and it removed the snapshot like it's supposed to.
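For anyone who prefers to flip the disk-level setting from the host CLI rather than from XO, a minimal sketch follows. The function name is hypothetical, and this only covers enabling CBT on the VDIs; the NBD + CBT option on the backup job itself is still set in XO.

```shell
# Sketch: enable CBT on every disk VDI attached to one VM, then print the state.
# $1 = VM UUID (supplied by the caller)
enable_cbt_for_vm() {
    vm_uuid="$1"
    # --minimal prints the VDI UUIDs as one comma-separated line.
    vdis=$(xe vbd-list vm-uuid="$vm_uuid" type=Disk params=vdi-uuid --minimal)
    for vdi in $(printf '%s' "$vdis" | tr ',' ' '); do
        xe vdi-enable-cbt uuid="$vdi"
        # Read the flag back so a failed enable is easy to spot.
        state=$(xe vdi-param-get uuid="$vdi" param-name=cbt-enabled)
        printf '%s cbt-enabled=%s\n' "$vdi" "$state"
    done
}
```

Filtering on `type=Disk` is there to skip CD drives, which have no VDI worth tracking.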

                              • rtjdamen @olivierlambert

                                @olivierlambert Yes indeed, this is currently shown like a normal snapshot. I think it should be shown as a metadata-only snapshot.

                                • rtjdamen

                                  @florent I have been watching the backup process, and at the end I only see vdi.destroy happening, not vdi.data_destroy. Is this correct? Are we handling this last step correctly, or does data remain on the snapshot at this time?

                                  • florent Vates 🪐 XO Team

                                    dataDestroy will be enable-able (not sure if it's really a word) today. In the meantime, the latest commits in the fix_cbt branch add an additional check on dom0 connect and more error handling.

                                    Please note that the metadata snapshot won't be visible in the UI, since it's not a VM snapshot but only the metadata of the VDI snapshots.
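Since these snapshots do not show up in the UI, one way to look for them is from the host CLI. The sketch below lists VDI snapshots that have CBT enabled, together with their `type` field; the function name is hypothetical, and the exact string `xe` prints for a metadata-only VDI is deliberately not assumed here.

```shell
# Sketch: list VDI snapshots with CBT enabled, showing uuid, name and type.
# $1 = optional SR UUID to restrict the listing to one storage repository
list_cbt_snapshots() {
    sr_uuid="${1:-}"
    if [ -n "$sr_uuid" ]; then
        xe vdi-list sr-uuid="$sr_uuid" is-a-snapshot=true cbt-enabled=true params=uuid,name-label,type
    else
        xe vdi-list is-a-snapshot=true cbt-enabled=true params=uuid,name-label,type
    fi
}
```

Comparing the `type` column before and after a backup run with dataDestroy enabled should show whether the snapshot data was dropped and only the metadata kept.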

                                    • rtjdamen @florent

                                      @florent OK, so currently the data remains? When do you think this addition will be ready for testing? I am interested because we saw some issues with this on NFS, and I am curious whether this code will make a difference.

                                      @olivierlambert I now understand that, in general, coalesce behaves no differently as long as the data destroy is not done. So you were right on that part, and it's safe to push it this way!

                                      • olivierlambert Vates 🪐 Co-Founder CEO

                                        Yes, that's why we'll be able to offer a safe route for people not using the data destroy, while letting people who want to explore it opt in 🙂

                                        • florent Vates 🪐 XO Team @rtjdamen

                                          @rtjdamen It's still fresh, but on the other hand, the worst that can happen is falling back to a full backup. So for now I would not use it on the bigger VMs (multiple terabytes).
                                          We are sure that it will be a game changer on thick provisioning (because a snapshot costs the full virtual size) or on fast-changing VMs, where coalescing an older snapshot is a major hurdle.

                                          If everything goes well, it will be on stable by the end of July, and we'll probably enable it by default on new backups in the near future.

                                          • Tristis Oris Top contributor

                                            can't commit, too small for ticket.

                                            typo

                                            preferNbdInformation:
                                                'A network accessible by XO or the proxy must have NBD enabled,. Storage must support Change Block Tracking (CBT) to ue it in a backup',
                                            

                                            enabled,.
                                            to ue
