XCP-ng

    CBT: the thread to centralize your feedback

      StormMaster @rtjdamen

      @rtjdamen Seems to happen when you run a mixture of backup solutions.

        rtjdamen @StormMaster

        @StormMaster Thanks, that seems logical if you back up the same VM with two different solutions, but in our case we don't use a different backup tool to do so. We do use Alike for some smaller backups, but not for these specific VMs.

          StormMaster @rtjdamen

          @rtjdamen Sorry! Just to clarify: when I said a mixture of backup solutions, I was talking about the different backup types that XCP-ng backup provides, i.e. running a delta backup after running a continuous replication backup.

          When running a mixture of XCP-ng incremental backups, there appears to be a bug somewhere that causes the "fall back to a base" errors, along with a couple of other errors that break the backup process.

            StormMaster @olivierlambert

            @olivierlambert @florent If it helps to know: disabling "Use NBD + CBT to transfer disk if available" on the same backup jobs I used above works flawlessly, although on big backup jobs, not having NBD available adds about a quarter to the backup time.

              rtjdamen @StormMaster

              @StormMaster I understand. We are also not using this on these VMs, but it does make sense: something is breaking the CBT chain and causing a full backup. The question is whether this is caused by the backup job or by something else. Thanks for your input.

                andyh

                I was looking to do some updates on our TrueNAS Scale device, which provides an NFS share to my XCP-ng hosts (8.2.1). We have CBT enabled for backups.

                However when I try to move the Xen Orchestra VDI from TrueNAS to local storage I receive the following error:

                {
                  "id": "0m261vorl",
                  "properties": {
                    "method": "vdi.migrate",
                    "params": {
                      "id": "f91f81f2-308d-4de9-879e-c1fa84a37d27",
                      "sr_id": "49822b62-3367-7e7c-76ee-1cfc91a262e9"
                    },
                    "name": "API call: vdi.migrate",
                    "userId": "7b63bade-51f3-4916-9174-f969da17774a",
                    "type": "api.call"
                  },
                  "start": 1728731129889,
                  "status": "failure",
                  "updatedAt": 1728731132752,
                  "end": 1728731132752,
                  "result": {
                    "code": "VDI_CBT_ENABLED",
                    "params": [
                      "OpaqueRef:4f16cd0e-fbaf-48c3-aae4-092b9906b9e4"
                    ],
                    "task": {
                      "uuid": "7ce61fba-d6d3-12cb-2585-79d5b69d3857",
                      "name_label": "Async.VDI.pool_migrate",
                      "name_description": "",
                      "allowed_operations": [],
                      "current_operations": {},
                      "created": "20241012T11:05:31Z",
                      "finished": "20241012T11:05:32Z",
                      "status": "failure",
                      "resident_on": "OpaqueRef:fe0440a3-4a31-44d6-8317-a0e64d0ee01e",
                      "progress": 1,
                      "type": "<none/>",
                      "result": "",
                      "error_info": [
                        "VDI_CBT_ENABLED",
                        "OpaqueRef:4f16cd0e-fbaf-48c3-aae4-092b9906b9e4"
                      ],
                      "other_config": {},
                      "subtask_of": "OpaqueRef:NULL",
                      "subtasks": [],
                      "backtrace": "(((process xapi)(filename ocaml/xapi-client/client.ml)(line 7))((process xapi)(filename ocaml/xapi-client/client.ml)(line 19))((process xapi)(filename ocaml/xapi-client/client.ml)(line 12359))((process xapi)(filename lib/xapi-stdext-pervasives/pervasiveext.ml)(line 24))((process xapi)(filename lib/xapi-stdext-pervasives/pervasiveext.ml)(line 35))((process xapi)(filename ocaml/xapi/message_forwarding.ml)(line 134))((process xapi)(filename lib/xapi-stdext-pervasives/pervasiveext.ml)(line 24))((process xapi)(filename lib/xapi-stdext-pervasives/pervasiveext.ml)(line 35))((process xapi)(filename lib/xapi-stdext-pervasives/pervasiveext.ml)(line 24))((process xapi)(filename ocaml/xapi/rbac.ml)(line 205))((process xapi)(filename ocaml/xapi/server_helpers.ml)(line 95)))"
                    },
                    "message": "VDI_CBT_ENABLED(OpaqueRef:4f16cd0e-fbaf-48c3-aae4-092b9906b9e4)",
                    "name": "XapiError",
                    "stack": "XapiError: VDI_CBT_ENABLED(OpaqueRef:4f16cd0e-fbaf-48c3-aae4-092b9906b9e4)\n    at Function.wrap (file:///opt/xo/xo-builds/xen-orchestra-202410111017/packages/xen-api/_XapiError.mjs:16:12)\n    at default (file:///opt/xo/xo-builds/xen-orchestra-202410111017/packages/xen-api/_getTaskResult.mjs:13:29)\n    at Xapi._addRecordToCache (file:///opt/xo/xo-builds/xen-orchestra-202410111017/packages/xen-api/index.mjs:1041:24)\n    at file:///opt/xo/xo-builds/xen-orchestra-202410111017/packages/xen-api/index.mjs:1075:14\n    at Array.forEach (<anonymous>)\n    at Xapi._processEvents (file:///opt/xo/xo-builds/xen-orchestra-202410111017/packages/xen-api/index.mjs:1065:12)\n    at Xapi._watchEvents (file:///opt/xo/xo-builds/xen-orchestra-202410111017/packages/xen-api/index.mjs:1238:14)"
                  }
                }
                

                I can see a task disabling CBT on the disk and looking at the UI it shows as CBT disabled.

                I experience the same issue when attempting to migrate other VDIs too.
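
For anyone who wants to catch this failure in a script, here is a minimal Python sketch that pulls the XAPI error code out of a task log shaped like the one above. The function name `xapi_error_code` is made up for illustration, and the sample is a trimmed copy of the log in this post, not a documented Xen Orchestra schema.

```python
import json
from typing import Optional


def xapi_error_code(task_log: dict) -> Optional[str]:
    """Return the XAPI error code of a failed task log, or None.

    Assumes the layout seen in the log above: a top-level "status"
    and a "result" object carrying "code".
    """
    if task_log.get("status") != "failure":
        return None
    result = task_log.get("result") or {}
    return result.get("code")


# Trimmed copy of the log from this post (only the fields the sketch reads).
sample = json.loads("""
{
  "status": "failure",
  "result": {
    "code": "VDI_CBT_ENABLED",
    "params": ["OpaqueRef:4f16cd0e-fbaf-48c3-aae4-092b9906b9e4"]
  }
}
""")

if xapi_error_code(sample) == "VDI_CBT_ENABLED":
    print("Migration refused: CBT is still enabled on the VDI")
```

Checking the code rather than the message text keeps the test independent of the OpaqueRef, which differs for every VDI.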

                  rtjdamen

                  You need to remove all snapshots and disable CBT before migration. Storage migration is not supported when CBT is invalid. I believe XOA should do this automatically, however.
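
The advice above can be written as a rough pre-check, sketched here in Python. It assumes you already have the VDI record as a plain dict with `cbt_enabled` and `snapshots` keys (named after the XAPI VDI fields). `migration_blockers` is a hypothetical helper, not part of Xen Orchestra or XAPI, and it only reports problems without changing anything.

```python
from typing import Any, Dict, List


def migration_blockers(vdi: Dict[str, Any]) -> List[str]:
    """List the reasons a VDI is not ready for storage migration.

    `vdi` is an assumed plain-dict view of a VDI record; an empty
    result means neither condition from the post above applies.
    """
    blockers: List[str] = []
    if vdi.get("cbt_enabled"):
        blockers.append("CBT is enabled on the VDI")
    snapshots = vdi.get("snapshots") or []
    if snapshots:
        blockers.append(f"{len(snapshots)} snapshot(s) still exist")
    return blockers


# Illustrative record: CBT on, one snapshot left behind (made-up ref).
example = {"cbt_enabled": True, "snapshots": ["OpaqueRef:example-snapshot"]}
for reason in migration_blockers(example):
    print("blocked:", reason)
```

Returning every reason at once, instead of stopping at the first, shows both cleanup steps in a single pass.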

                    andyh @rtjdamen

                    @rtjdamen Thanks for the super fast response!

                    Just removed the existing snapshots and the task is proceeding.

                    Did you mean CBT is enabled, as opposed to CBT is invalid?

                      rtjdamen @andyh

                      @andyh No, CBT should be disabled; you can’t migrate a CBT-enabled VDI.

                        andyh @rtjdamen

                        @rtjdamen I understand now

                          flakpyro @andyh

                          After a number of XOA updates I decided to test CBT with snapshot delete again.

                          Instead of "can't create a stream from a metadata VDI, fall back to a base", I am seeing a more verbose error, but the issue remains the same. In a two-host pool with shared NFS storage, if I have CBT with snapshot delete enabled, then after a VM is migrated from host A to host B (remaining on the shared NFS SR) and a backup runs, the delta backup fails and a full runs. This time the error shows "Can't do delta with this vdi, transfer will be a full".

                          This is with XOA Latest: 5.100.0

                          I have attached the backup log if this helps.

                          2024-11-05T17_37_58.455Z - backup NG.json.txt

                            rtjdamen @flakpyro

                            @flakpyro In general this error means that the CBT is not valid. We have seen this as well, and I know Vates is looking into it. We saw this problem more on NFS than on iSCSI; I'm not sure whether it is NFS-related, but if you have access to an iSCSI target, try to reproduce it there.

                              flakpyro @rtjdamen

                              @rtjdamen This test pool is running on TrueNAS, so I could configure iSCSI. Our production is on NFS, so I tried to keep the test environment as close to it as possible as far as storage is concerned. We currently use NBD + CBT without the snapshot delete function in production, and it works well via NFS. I will keep testing as updates roll out and look forward to when this is completely stable. If there's anything anyone at Vates needs log- or test-wise, I'm happy to help!

                                rtjdamen @flakpyro

                                @flakpyro I think that would be a good test. Can you check how it behaves on iSCSI on your end?

                                  chr1st0ph9 @StormMaster

                                  @StormMaster Thank you, I experienced that too: two backup plans with the same VMs. The backup failed one time out of two with the error message "can't create a stream from a metadata VDI, fall back to a base"

                                    rtjdamen @chr1st0ph9

                                    @chr1st0ph9 I understand a fix is being made for this. @florent patched our proxy yesterday, and since then no more fulls so far!

                                      flakpyro @rtjdamen

                                      @rtjdamen That's great news. With both block- and file-based SRs?

                                        florent Vates 🪐 XO Team

                                        This branch (already deployed on @rtjdamen's systems) adds better handling of hosts that take too long to compute the changed block list:
                                        https://github.com/vatesfr/xen-orchestra/pull/8120

                                        It will be released in a patch this week.

                                        I am still investigating an error that occurs occasionally: XapiError: SR_BACKEND_FAILURE_460(, Failed to calculate changed blocks for given VDIs. [opterr=Source and target VDI are unrelated], )

                                        fbeauchamp opened this pull request in vatesfr/xen-orchestra (closed): fix(@xen-orchestra/xapi): use callAsync for VDI_listChangedBlocks #8120
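
When triaging errors like the SR_BACKEND_FAILURE_460 one quoted above, it can help to split the message into its error code and its `opterr` detail. A minimal Python sketch follows. The message layout is inferred only from the two error strings that appear in this thread, and `parse_xapi_error` is a hypothetical helper, not something Xen Orchestra or XAPI provides.

```python
import re
from typing import Optional, Tuple

# CODE(params...), optionally prefixed with "XapiError: " (assumed layout).
_ERROR_RE = re.compile(
    r"^(?:XapiError:\s*)?(?P<code>[A-Z][A-Z0-9_]*)\((?P<params>.*)\)\s*$"
)
# Optional "[opterr=...]" detail inside the parameter list.
_OPTERR_RE = re.compile(r"\[opterr=(?P<opterr>[^\]]*)\]")


def parse_xapi_error(message: str) -> Optional[Tuple[str, Optional[str]]]:
    """Return (code, opterr) for a single-line error message, or None."""
    match = _ERROR_RE.match(message.strip())
    if match is None:
        return None
    opterr = _OPTERR_RE.search(match.group("params"))
    return match.group("code"), (opterr.group("opterr") if opterr else None)


msg = (
    "XapiError: SR_BACKEND_FAILURE_460(, Failed to calculate changed blocks "
    "for given VDIs. [opterr=Source and target VDI are unrelated], )"
)
print(parse_xapi_error(msg))
```

Grouping log entries by the returned pair makes it easier to see whether the "unrelated" failures keep hitting the same VMs.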

                                          rtjdamen @florent

                                          @florent Thanks for letting me know. On our end this error seems to occur on the same VMs every time; it is just a handful. Could it be that these VMs are under higher load, which causes XAPI tasks to take longer than expected?

                                            flakpyro @rtjdamen

                                            @rtjdamen @florent Be very curious to test this once it hits XOA.

                                            The lab I am testing this in is a two-host pool, with each server having a 24-core Epyc CPU and 256 GB of RAM. There are only around 4 VMs running in this test environment currently, and all are low-load VMs.

                                            Is it still something that mostly occurs with file-based SRs? I never did get a chance to set up iSCSI and test with it instead of NFS.

                                            Once this becomes stable I definitely plan to switch to this backup method in production!
