    CBT: the thread to centralize your feedback

      Andrw0830 @flakpyro

      @flakpyro I usually don't keep snapshots on my VMs; I just let the NBD/CBT backups create snapshots and purge them afterwards. I tested creating a standard snapshot (without memory), deleting it, and then running a backup, and it was able to use the existing NBD and CBT chain without error.

      @rtjdamen I only have the migration network issue, so I thought you were referring to that; I'm not sure about the control domain part. Is there a way to check whether we are affected?

        rtjdamen @Andrw0830

        @Andrw0830 Yeah, sure. The dashboard has a Health tab where you can find VDIs attached to the control domain. We have random VDIs that stay attached. It seems to be random, so we are investigating whether this is an infrastructure thing or a bug in XAPI. We did not face this on 8.2.

        We do not have issues with cbt after creating and then removing snapshots.

          Andrw0830 @rtjdamen

          @rtjdamen As of writing this, I don't see any VM VDIs listed as being attached to the control domain. I have 10+ VMs on 3 XCP-ng 8.3 servers in one pool, so I'll let you know if we come across that.

            flakpyro @Andrw0830

            After testing through a few more rounds of updates, it appears I'm still having the same issue. If I have CBT with snapshot removal enabled and I take a manual snapshot of a VM (say, for running maintenance) and then remove it after some time, CBT is reset and a full backup runs during the next backup schedule. This is fine for local backups, where I have 10GbE between the ZFS backup server and the pool, but not ideal for offsite replication. I see there are some big changes coming with the backup code, which is great news, but I'd REALLY like to be able to use CBT with snapshot deletion enabled!

              FritzGerald

              Hi everyone,

              I have been using XCP-ng for a couple of years, but I am new to Xen Orchestra. First of all, thank you for this great software.

              I am considering moving partly away from my own backup scripts, so I am currently experimenting with the XOA backup routines, in particular the CBT-based delta backups. But first, the facts about my setup:

              Server: HP ProLiant DL380p Gen8
              XCP-ng: 8.3 - fully patched
              Xen Orchestra: self-built - commit 9ed55 (as of today)
              SR: 2 / both local ext / both RAID 10, disk-based, on HPE Smart Array Gen8 controllers
              Remote: NFS mounted (Synology - fully patched as of today)

              Everything I tested so far works great. However I cannot get CBT based delta backups to run. I always get full backups and the error message is:

              Can't do delta with this vdi, transfer will be a full
              Can't do delta, will try to get a full stream

              As said, I am using an NFS-based remote and all "normal" backups work like a charm. The dummy VM I am testing with is a newly set up Debian 12 (management tools installed) on the second, non-default SR. CBT is enabled on the disk as well as in the backup job ("Use NBD + CBT to transfer disk if available" is enabled, as is "Purge snapshot data when using CBT"). I also enabled "Merge backups synchronously".

              Unfortunately I cannot really get a handle on this problem, because I do not find any particular error messages and cannot find good docs on CBT. I made two observations:

              • There are 2 *.cbtlog files in the SR directory of the VM. One shows 00000000-0000-0000-0000-000000000000, the other e011e9dd-e14b-4f0a-b143-092ea8f1b6a3. Is that normal?
              • If I enable a snapshot mode that includes memory, each backup run creates one snapshot in the default SR of the host. These are not removed after the backup and remain orphaned. It looks to me like these snapshots contain the memory (maybe this is total BS, in which case please excuse it, but the observation might be helpful). However, this problem is gone if I switch to "offline snapshots".

              Maybe I am just missing some stupid setting. Does anyone have a suggestion on where to start troubleshooting?

                rtjdamen @FritzGerald

                @FritzGerald There are two things that I think could be causing this. First, check whether you have set a dedicated NBD network on the pool.

                Second, you need to remove all snapshots from the VMs and disable CBT on them. This removes the CBT files and makes sure there will be a clean CBT chain.

                  FritzGerald @rtjdamen

                  @rtjdamen Thanks a lot for the quick response. The second step I already tried. But I did not set up a dedicated nbd network yet. I will have to try that. Thanks!

                    FritzGerald @rtjdamen

                    @rtjdamen
                    Thank you for your hint! I indeed did not enable NBD on my management interface. Somehow I did not notice that in the docs.

                    Maybe it would be helpful for others if the docs mentioned that, for this to work, you need to enable NBD in: a) the network settings, b) the VM disk settings, and c) the actual backup job.

                    Thank you again for your quick help!!!
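
The three places listed above can be sketched as follows. This is only a minimal Python sketch that builds (and does not run) the commands for steps a) and b); step c) is the "Use NBD + CBT to transfer disk if available" option in the Xen Orchestra backup job and has no command here. It assumes the xe CLI accepts `network-param-add` with `param-name=purpose param-key=nbd` and `vdi-enable-cbt`, and it treats step b) as enabling CBT on the VDI, as described earlier in the thread. Verify both commands against the documentation for your XCP-ng version. The function name and the placeholder UUIDs are hypothetical.

```python
import shlex
from typing import List


def nbd_cbt_setup_commands(network_uuid: str, vdi_uuid: str) -> List[List[str]]:
    """Build, without executing, the xe commands for steps a) and b)."""
    return [
        # a) Mark the network as usable for NBD connections (assumed xe syntax).
        ["xe", "network-param-add", f"uuid={network_uuid}",
         "param-name=purpose", "param-key=nbd"],
        # b) Enable changed block tracking on the VM disk (assumed xe syntax).
        ["xe", "vdi-enable-cbt", f"uuid={vdi_uuid}"],
        # c) The backup job option is set in the Xen Orchestra UI, not via xe.
    ]


if __name__ == "__main__":
    # Placeholder UUIDs: replace with real values from your pool.
    for cmd in nbd_cbt_setup_commands("<network-uuid>", "<vdi-uuid>"):
        print(shlex.join(cmd))
```

Printing the commands instead of running them makes it easy to review them before pasting them into a shell on the pool master.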

                      flakpyro @FritzGerald

                      Not to dig up an old thread, but was the issue with CBT being reset to 00000000-0000-0000-0000-000000000000 after taking a manual snapshot on NFS storage ever confirmed or reproduced by Vates? I would love to start using CBT and purging snapshots after each backup run. Have there been enough changes / improvements that this is worth testing again?
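
For anyone retesting this, a small Python helper can flag the all-zero UUID described above. It is a minimal sketch that assumes the UUID string has already been obtained from the CBT metadata by some other means (how to read a .cbtlog file is not covered here), and it follows this thread's description of the all-zero value as the sign of a reset. The helper name is hypothetical.

```python
import uuid


def is_nil_uuid(value: str) -> bool:
    """Return True if value parses as the all-zero (nil) UUID."""
    return uuid.UUID(value.strip()).int == 0


if __name__ == "__main__":
    # The two values reported earlier in this thread.
    for candidate in (
        "00000000-0000-0000-0000-000000000000",
        "e011e9dd-e14b-4f0a-b143-092ea8f1b6a3",
    ):
        print(candidate, "-> nil" if is_nil_uuid(candidate) else "-> not nil")
```

Checking this before and after taking and removing a manual snapshot would show whether the reset still happens on a given setup.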

                        rtjdamen @flakpyro

                        @flakpyro As I understand it they did, but I'm not sure whether it is already fixed. It had to do with a dedicated migration network being selected. Maybe @olivierlambert is aware of the current status?

                          flakpyro @rtjdamen

                          @rtjdamen I do know the dedicated migration network was an issue: CBT data would be reset if you performed a migration using a dedicated migration network, and removing the dedicated network was a workaround. The next issue was that taking a manual snapshot and removing it would sometimes also reset CBT data. Perhaps I need to spin up a test VM and try again, as I know there have been a lot of updates.

                            rtjdamen

                            Hi all,

                            We recently upgraded our production pools to the latest XCP-ng 8.3 release. After some struggles during the upgrade (mostly around the pool master), everything seems to be running fine now in general.

                            However, since upgrading, we’re seeing much longer durations for certain XAPI-related tasks, especially:

                            • VDI.enable_cbt
                            • VDI.destroy
                            • VDI.list_changed_blocks (during backups)

                            In some cases, these tasks take up to 25 minutes to complete on specific VMs. Meanwhile, similar operations on other VMs are done in just a few minutes. The behavior is inconsistent but reproducible.

                            We’ve checked:

                            • Storage performance is normal (LVM over local SSD)
                            • No I/O bottlenecks on the hosts
                            • No VM performance impact during these tasks

                            It seems to affect CBT-enabled VMs more strongly, but we’re only seeing this behavior since the upgrade to 8.3 — especially after upgrading the pool master.

                            Has anyone else seen this since upgrading?
                            Is there a known issue with CBT or coalesce interaction in 8.3?
                            Would love to hear if others experience this or have suggestions for tuning.
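
For readers unfamiliar with what VDI.list_changed_blocks (listed above) hands back to a backup tool, here is a minimal Python sketch of decoding its result. It assumes the call returns a base64-encoded bitmap in which each bit stands for one 64 KiB block and the most significant bit of each byte comes first. Both points are assumptions here and should be checked against the XAPI documentation for your version. The function name is made up for illustration, and the sketch does not talk to XAPI at all.

```python
import base64
from typing import List

# Assumed CBT granularity: one bit per 64 KiB block.
BLOCK_SIZE = 64 * 1024


def changed_block_offsets(bitmap_b64: str, block_size: int = BLOCK_SIZE) -> List[int]:
    """Return the byte offsets of blocks whose bit is set in the bitmap.

    Assumes MSB-first bit order within each byte.
    """
    raw = base64.b64decode(bitmap_b64)
    offsets = []
    for byte_index, byte in enumerate(raw):
        for bit in range(8):
            if byte & (0x80 >> bit):
                offsets.append((byte_index * 8 + bit) * block_size)
    return offsets


if __name__ == "__main__":
    # Synthetic bitmap for illustration only, not real XAPI output.
    sample = base64.b64encode(b"\xa0\x01").decode()
    print(changed_block_offsets(sample))
```

A backup tool would then read only those offsets over NBD instead of the whole disk, which is why a slow or blocked list_changed_blocks call delays the whole delta backup.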

                              olivierlambert Vates 🪐 Co-Founder CEO

                              I'm not aware of such issues (at least now) 🤔 (doesn't mean it doesn't exist more broadly but it's the first report). We'll upgrade our own production to 8.3 relatively soon, so it will be the opportunity to also check internally.

                                rtjdamen @olivierlambert

                                @olivierlambert OK, maybe you will experience some differences as well. We have been on 8.3 since February and patched last Friday; since the patching I see a decrease in performance.

                                What I believe is happening is that GC processes are blocking other storage operations. Only once a coalesce has finished on an SR do I see multiple actions like destroy, enable CBT and changed block calculations being processed. As far as I know this was not the case before; these tasks could also take somewhat longer, but that was not related to the GC (or I never noticed it).

                                Maybe we can confirm if this behavior is by design or not?

                                  olivierlambert Vates 🪐 Co-Founder CEO

                                  I'm not sure there's a change related to that but I can ask @Team-Storage

                                    rtjdamen

                                    Anyone else experiencing this issue?
                                    https://github.com/vatesfr/xen-orchestra/issues/8713

                                    It's a long-standing bug that I believe is pretty easy to fix, and fixing it would make CBT backups more robust. It would be great if this could be implemented!

                                    Issue #8713 in vatesfr/xen-orchestra (open, created by rtjdamen): CBT backup fails repeatedly due to leftover invalid CBT snapshots

                                      olivierlambert Vates 🪐 Co-Founder CEO

                                      Pinging @florent about this

                                        florent Vates 🪐 XO Team @olivierlambert

                                        @olivierlambert Yes, the fix is in the pipeline (on the XAPI side):
                                        it won't migrate a snapshot with CBT enabled, and won't allow disabling CBT on a snapshot.

                                          olivierlambert Vates 🪐 Co-Founder CEO

                                          Thanks!

                                            rtjdamen @florent

                                            @florent So solving this inside XOA is not an option? That could be useful for the short term.
