XCP-ng

    CBT: the thread to centralize your feedback

      Rhodderz @rtjdamen

      @rtjdamen Just trying that now.
      However, it seems that if I disable CBT on the VM, the backup (I'm using a new backup job for this testing) just re-enables it.
      It seems that, depending on the job, I can have NBD+CBT or neither.
      Annoyingly, we would like NBD to run to speed up backups, as they take quite some time.

      EDIT:
      To add, the new test backup for the VM that failed before actually finished successfully.
      I'm just manually rerunning it on the main job now.
      If it works there, the temporary workaround could be to just disable CBT and let the backup job re-enable it.

      EDIT EDIT:
      Re-running the backup of the VM in the original job still failed with the same error.
      Testing with the new job, configured the same way (NBD connections set to 2, purge snapshots after), still passes fine.
      So I am guessing CBT is job-dependent and not VM-dependent?
      That would explain why a new job on the same VM to the same place works fine.

        flakpyro @Rhodderz

        For our production pool I have CBT + NBD enabled, but I have "Purge snapshot data when using CBT" disabled. This results in successful backups, but the snapshot is retained. I assume it then ends up using that snapshot for the following delta backups.

          Rhodderz @flakpyro

          @flakpyro Ah, I will try that once the proxy for that pool is back.
          We upgraded XOA from the stable channel to latest because we had another issue with NBD (causing some machines to go RO), which is apparently resolved in that release.
          Once that's fixed I will try again to see whether the above update and/or disabling "Purge snapshot" works as a workaround.

          We have purge enabled (and would like it left enabled) as we use iSCSI (Dell SC5020s), so everything is a little fat, especially with some clients.
          I shall update tomorrow on what happens.

            flakpyro @Rhodderz

            @Rhodderz I agree. We are using NFS, so snapshots are thin at least, but we would love to be able to delete the snapshots after a backup run as well. Hopefully in time we can get this working!

              Rhodderz

              To add an update and not leave this on a cliffhanger:
              We have since updated our XOA to the latest channel to attempt to fix an NBD issue.
              This move broke a proxy of ours, but all the backups are now going through the XOA, and since then they have not had an issue.
              So either the new NBD fixes, the backups running only on the XOA, or something somewhere else resolved this problem for now.

              We will be enabling the same in our other pool soon, so I will update if we have the same issues there.

                flakpyro @Rhodderz

                Sadly, the latest XOA release from today does not resolve my strange CBT issue:

                [08:32 xcpng-test-01 45e457aa-16f8-41e0-d03d-8201e69638be]#  cbt-util get -c -n 4d7f0341-bbce-4957-a4c4-d603725a807a.cbtlog 
                1950d6a3-c6a9-4b0c-b79f-068dd44479cc
                After Migration from Host 01 to Host 02 (Shared NFS SR):
                [08:33 xcpng-test-01 45e457aa-16f8-41e0-d03d-8201e69638be]#  cbt-util get -c -n 4d7f0341-bbce-4957-a4c4-d603725a807a.cbtlog 
                00000000-0000-0000-0000-000000000000
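
The check in the transcript above can be scripted. What follows is only a minimal sketch, not an official tool: it wraps the exact `cbt-util get -c -n <file>` call quoted in the post and flags the all-zero child UUID seen after the migration. The function names `read_child` and `child_is_reset` are made up for illustration, and the script assumes it runs on the host, in the SR directory that holds the `.cbtlog` file.

```python
import subprocess

# All-zero child UUID, as printed by cbt-util after the migration in the post above.
ZERO_UUID = "00000000-0000-0000-0000-000000000000"


def child_is_reset(cbt_util_output: str) -> bool:
    """Return True if the child UUID printed by cbt-util is all zeros."""
    return cbt_util_output.strip() == ZERO_UUID


def read_child(cbtlog_path: str) -> str:
    """Run `cbt-util get -c -n <path>` (the command quoted in the thread).

    Hypothetical helper: it only works on a host where cbt-util is installed.
    """
    result = subprocess.run(
        ["cbt-util", "get", "-c", "-n", cbtlog_path],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

Calling `child_is_reset(read_child("<vdi-uuid>.cbtlog"))` before and after a migration would show whether the child reference was lost in between.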
                
                  olivierlambert Vates 🪐 Co-Founder CEO

                  I don't think that's an XO issue, but more something weird in your XCP-ng setup that nobody can reproduce 😢 (but it doesn't mean we couldn't solve it)

                    flakpyro @olivierlambert

                    @olivierlambert Hmm, I'm really not sure what's unique about my two pools. One is AMD + TrueNAS, the other Intel + Pure Storage. If this is actually unique to me, perhaps I would be better off submitting a ticket to help get to the bottom of this?

                      olivierlambert Vates 🪐 Co-Founder CEO

                      You managed to find a CBT issue without using any XO command, which is great because we know it's not XO now 😄 I think @dthenot is already taking a look internally.

                        dthenot Vates 🪐 XCP-ng Team @olivierlambert

                        @olivierlambert I am 🙂

                          flakpyro @dthenot

                          @dthenot @olivierlambert Thanks guys, I'll hold off on submitting a ticket for now to keep the conversation centralized here, but if you need any more info, would like me to try anything, or would like a remote support tunnel opened, just let me know! 🙂

                            Tristis Oris Top contributor

                            Can't run a live migration to another pool because of VDI_CBT_ENABLED. Is that intended?

                              Tristis Oris Top contributor @Tristis Oris

                              @Tristis-Oris Even halted VMs can't migrate with a snapshot. I need to remove it.

                                olivierlambert Vates 🪐 Co-Founder CEO

                                That's weird, ping @florent

                                  Forza @Tristis Oris

                                  @Tristis-Oris We've had the same problem, so we are not using CBT for now.

                                    rtjdamen @Tristis Oris

                                    @Tristis-Oris Migration of a VDI between SRs is not supported with CBT enabled. You need to disable CBT first; XOA does this for you. Live migration of a VM between hosts is supported as long as the SR stays the same. This is by design in Xen.
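
The rule described in this reply can be written down as a small decision helper. This is a sketch of the rule as stated here, not of what XOA actually runs: `plan_vdi_move` and its step names are hypothetical. On the command line the "disable CBT" step would normally correspond to `xe vdi-disable-cbt uuid=<vdi-uuid>`, which is worth confirming against your XCP-ng version before relying on it.

```python
from typing import List


def plan_vdi_move(cbt_enabled: bool, source_sr: str, target_sr: str) -> List[str]:
    """Return the ordered steps for moving a VDI, following the rule described above."""
    steps: List[str] = []
    if source_sr != target_sr and cbt_enabled:
        # Crossing SRs with CBT still on is refused (VDI_CBT_ENABLED),
        # so CBT has to be turned off before the move.
        steps.append("disable CBT")
    steps.append("migrate")
    return steps


# Same SR, different host: CBT can stay enabled.
print(plan_vdi_move(True, "sr-a", "sr-a"))
# Different SR (which is always the case when moving to another pool): disable CBT first.
print(plan_vdi_move(True, "sr-a", "sr-b"))
```

The SR identifiers are placeholders. Note that the sketch does not cover the snapshot cleanup discussed in the following replies.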

                                      Tristis Oris Top contributor @rtjdamen

                                      @rtjdamen But I can't disable CBT globally? It was automatically applied to every VDI when it was implemented.
                                      Disabling CBT for each VDI is not required, because it happens automatically during migration. I only need to remove all snapshots.

                                        rtjdamen @Tristis Oris

                                        @Tristis-Oris Indeed, it seems that's a bug in XOA: it does not delete the snapshots.

                                          jon02

                                          @flakpyro
                                          I have the same problem.
                                          I'm on 8.2 as well and have a local ZFS SR.
                                          I'm going to upgrade to 8.3 and see if it helps.

                                            flakpyro @jon02

                                            Another interesting development. In our test environment this week I installed the latest HP Service Pack for ProLiant. Doing so required a server reboot, so I ran a rolling pool reboot from XOA. Later, when the test environment backup job kicked off, I noticed it was running a regular delta despite the migrations that must have occurred during the rolling pool reboot.

                                            SSHing onto a host and checking, I see that, sure enough, the cbtlog is reporting all zeros...

                                            [17:27 xcpng-test-01 45e457aa-16f8-41e0-d03d-8201e69638be]#  cbt-util get -c -n 73877c18-a5bf-43bb-aaf5-299f46710d7e.cbtlog 
                                            00000000-0000-0000-0000-000000000000
                                            
                                            

                                            However, the backup ran as a delta. Running the backup again manually, it once again runs as a delta.

                                            Checking after the manual backup, the result is not all zeros anymore:

                                            [17:28 xcpng-test-01 45e457aa-16f8-41e0-d03d-8201e69638be]#  cbt-util get -c -n 65d8656e-93e8-4e81-b1a8-0b0462f6fbb8.cbtlog 
                                            1950d6a3-c6a9-4b0c-b79f-068dd44479cc
                                            

                                            Now, just for fun, I decided to manually migrate a small VM to another host and then back to see what happens.

                                            After the migration, it was back to all zeros:

                                            [17:32 xcpng-test-01 45e457aa-16f8-41e0-d03d-8201e69638be]#  cbt-util get -c -n 65d8656e-93e8-4e81-b1a8-0b0462f6fbb8.cbtlog 
                                            00000000-0000-0000-0000-000000000000
                                            

                                            And running a backup manually resulted in the usual error:

                                            Can't do delta with this vdi, transfer will be a full
                                            Can't do delta, will try to get a full stream
                                            

                                            So this just makes the issue even more confusing: why does a rolling pool reboot not cause this behaviour when a manual migration does? Does the ID being all zeros not actually matter? I seem to be able to consistently reproduce this, too. I'll be curious to test whether a "rolling pool update" causes this behaviour the next time a batch of updates is released.
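
To make the comparison in this post easier to follow, the observations can be laid out as data. This is an illustrative sketch only: the tuple layout and the name `zero_child_outcomes` are assumptions, and the three entries are simply the checks reported above (`None` means no backup was run straight after that check).

```python
from typing import List, Optional, Tuple

ZERO_UUID = "00000000-0000-0000-0000-000000000000"

# (event before the check, child UUID reported by cbt-util, type of the next backup)
Observation = Tuple[str, str, Optional[str]]


def zero_child_outcomes(
    observations: List[Observation],
) -> List[Tuple[str, Optional[str]]]:
    """List (event, next backup type) for every check that returned the all-zero UUID."""
    return [
        (event, backup)
        for event, child, backup in observations
        if child == ZERO_UUID
    ]


observations = [
    ("rolling pool reboot", ZERO_UUID, "delta"),
    ("manual backup", "1950d6a3-c6a9-4b0c-b79f-068dd44479cc", None),
    ("manual migration", ZERO_UUID, "full"),
]

for event, backup in zero_child_outcomes(observations):
    print(f"{event}: child all zeros, next backup ran as {backup}")
```

Both zero-child checks show up, one followed by a delta and one by a full, which is exactly the inconsistency the post is asking about.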
