XCP-ng

    Backups with qcow2 enabled

    Scheduled Pinned Locked Moved Backup
    28 Posts 3 Posters 1.2k Views 3 Watching
    • florentF Offline
      florent Vates 🪐 XO Team @acebmxer
      last edited by

      @acebmxer you need to enable NBD on the backup to ensure it works with qcow2

      we'll look into the progress bar

      • acebmxerA Online
        acebmxer @florent
        last edited by

        @florent

        It was already enabled. Just not "Purge snapshot data when using CBT".

        Screenshot_20260417_065841.png

        • florentF Offline
          florent Vates 🪐 XO Team @acebmxer
          last edited by

          @acebmxer did you enable NBD on the network (in the pool view)? Is it a network accessible by XO?

          • acebmxerA Online
            acebmxer @florent
            last edited by

            @florent
            Yes...

            Think I have solved it / fixed it... I ran the errors through AI and it suggested running a full backup to reset the chain. The full backup was successful without issues. Running a delta backup now as we speak, and it looks good so far.

            One mistake I made was enabling purge snapshot with CBT, and now VMs are falling back to full backups.

            • P Offline
              Pilow @acebmxer
              last edited by

              @acebmxer I have a new case where I managed to force the "fell back to full" error...
              I'll create a new topic for this

              in the meantime, if you can, do a toolstack restart on your pool when no task is ongoing
              your backups with NBD could be better (spoiler alert: iptables rules... 😃)

              • acebmxerA Online
                acebmxer @florent
                last edited by

                @florent

                Both full and delta passed...

                Full - 2026-04-17T11_32_42.906Z - backup NG.txt
                Delta - 2026-04-17T13_26_54.973Z - backup NG.txt

                @pilow

                I just restarted the toolstack on both the pool master and the second host, re-edited the backup job to remove purge snapshot with CBT, and re-ran a delta backup, which still fell back to a full backup.

                • P Offline
                  Pilow @acebmxer
                  last edited by

                  @acebmxer at the bottom of the POOL advanced tab, is BACKUP NETWORK set to the NBD-enabled network that is accessible by both hosts and XOA?

                  • acebmxerA Online
                    acebmxer @Pilow
                    last edited by acebmxer

                    @Pilow

                    Was already configured...

                    Screenshot 2026-04-17 102001.png

                    Screenshot 2026-04-17 102051.png

                    Think this is the issue....
                    Network tab under pool.
                    Screenshot 2026-04-17 102135.png

                    Will run another delta once the current one finishes.

                    Edit - update

                    Could a warning be shown if NBD is not enabled at the pool level, or could the error be made clearer?

                    I enabled NBD at the pool level and ran another delta - 2026-04-17T14_40_39.535Z - backup NG.txt

                    I then re-enabled purge snapshot in the backup job and ran another delta - 2026-04-17T15_00_40.358Z - backup NG.txt

                    Screenshot 2026-04-17 110321.png

                    Screenshot 2026-04-17 111349.png

                    • P Offline
                      Pilow @acebmxer
                      last edited by

                      @acebmxer so, NBD it was...

                      holy moly, you have some good network performance!
                      what kind of SR at source ? and remote at destination ?
                      what about the PIFs ?

                      • acebmxerA Online
                        acebmxer @Pilow
                        last edited by acebmxer

                        @Pilow

                        Just a little old TrueNAS running on an AMD 5900X with bonded 10 Gb NICs, and a UniFi aggregation switch with 10 Gb links to the hosts.

                        Storage: five 8 TB Toshiba drives and two 1 TB NVMe drives, one for cache and one for the log vdev.

                        Backup device is a DS1819+ with four 12 TB Seagate Exos drives and a 10 Gb link.

                        Screenshot 2026-04-17 122521.png

                        • P Offline
                          Pilow @acebmxer
                          last edited by Pilow

                          @acebmxer NFS remotes on the DS1819+?

                          we have an iSCSI SR (25Gb Mellanox 6 PIFs on hosts to 25Gb MSA2062 SAN dual controller)
                          our remotes are iSCSI OS-mounted volumes on MSA SANs, presented as S3 (MinIO VMs)
                          using XO PROXIES to offload backups from XOA

                          we max out at 150/200Mb/s during backups 😕

                          but we are on VHD VDIs, and I wonder whether the better backup performance you show could be due to the QCOW2 format on the source SR
                          will have to try VDIs on such an SR to see the difference

                          • acebmxerA Online
                            acebmxer @Pilow
                            last edited by acebmxer

                            @Pilow Yes, NFS on VM storage and on backup storage.

                            All VMs are now on qcow2 except for the Windows VM, which was on VHD. However, I just migrated it over to qcow2. The NICs in all systems are Intel 10 Gb, either X520 or X540.

                            Edit - Sorry, I missed your question about VHD versus qcow2 performance. I would say it's about equal. I didn't run any benchmarks for comparison (probably should have), but I haven't seen any major slowness other than GC issues. (See latest post)

                            • acebmxerA Online
                              acebmxer
                              last edited by acebmxer

                              So the progress bar seems to be working now on exporting. But I am also noticing that garbage collection runs so often that I feel it's slowing down the import for the health check.

                              Screenshot_20260418_104201.png

                              Screenshot_20260418_104514.png

                              • acebmxerA Online
                                acebmxer
                                last edited by acebmxer

                                The next thing I am noticing is that garbage collection is not able to coalesce VDIs of VMs that are running. GC keeps trying to run every 30 - 45 seconds, for about 30 seconds of run time each. The number of VDIs to coalesce keeps increasing unless a VM is powered off and given enough time for garbage collection to actually run.

                                Because garbage collection runs so often, importing a VDI for the health check takes longer.

                                • acebmxerA Online
                                  acebmxer
                                  last edited by acebmxer

                                  @florent

                                  I disabled CBT on the VMs and re-ran a delta backup. I noticed the following.

                                  While watching the backup task, when it shows [XO] Exporting it does not show the percentage of completion. In the backup task, a VM first shows as not using NBD but using XO, and then fails. The backup log then shows that the VM fell back to a full backup, and the task then shows that it is using NBD.

                                  Screenshot_20260420_002103.png

                                  Screenshot_20260420_002557.png

                                  Screenshot_20260420_002717.png

                                  Screenshot_20260420_003717.png

                                  Also, without CBT, GC is not constantly spamming once a VM has finished exporting and started its health check. It appears to actually be running and able to coalesce, as the VDIs are no longer showing in the dashboard / health...

                                  Delta job with CBT enabled - 2026-04-20T00_43_48.487Z - backup NG.txt

                                  Delta job with CBT disabled, two VMs fell back to full - 2026-04-20T03_19_02.894Z - backup NG.txt

                                  Next delta job with CBT disabled, one VM fell back to full - 2026-04-20T04_19_10.531Z - backup NG.txt

                                  Screenshot_20260420_001804.png

                                  Screenshot_20260420_010236.png

                                  • acebmxerA Online
                                    acebmxer @acebmxer
                                    last edited by

                                    OK, something is happening. I made no additional changes since last night. Now the backup job scheduled for 1 pm is still running 2 hours later.

                                    Screenshot 2026-04-20 145633.png

                                    Think I might revert back to VHD as this has caused so many issues.

                                    • florentF Offline
                                      florent Vates 🪐 XO Team @acebmxer
                                      last edited by

                                      @acebmxer do you have something in the xo logs (journalctl probably)?

                                      • acebmxerA Online
                                        acebmxer @florent
                                        last edited by acebmxer

                                        @florent said:

                                        @acebmxer do you have something in the xo logs (journalctl probably)?

                                        from host
                                        sudo journalctl -u xo-server -n 50
                                        -- No entries --

                                        From XO - Currently on fbba0 commit (4 behind)

                                        Apr 20 15:21:51 xo-ce xo-server[849]:   }
                                        Apr 20 15:21:51 xo-ce xo-server[849]: }
                                        Apr 20 15:22:35 xo-ce xo-server[849]: 2026-04-20T19:22:35.554Z xo:main INFO + Console proxy ( - 127.0.0.1)
                                        Apr 20 15:22:41 xo-ce xo-server[849]: 2026-04-20T19:22:41.365Z xo:main INFO - Console proxy ( - 127.0.0.1)
                                        Apr 20 15:22:55 xo-ce xo-server[849]: 2026-04-20T19:22:55.223Z xo:main INFO + Console proxy ( - 127.0.0.1)
                                        Apr 20 15:23:04 xo-ce xo-server[849]: 2026-04-20T19:23:04.562Z xo:main INFO - Console proxy ( - 127.0.0.1)
                                        Apr 20 16:11:49 xo-ce xo-server[849]: (node:849) MaxListenersExceededWarning: Possible EventEmitter memory leak detected. 11 update list>
                                        Apr 20 16:11:49 xo-ce xo-server[849]: 2026-04-20T20:11:49.190Z xo:xo-server WARN Node warning {
                                        Apr 20 16:11:49 xo-ce xo-server[849]:   error: MaxListenersExceededWarning: Possible EventEmitter memory leak detected. 11 update listen>
                                        Apr 20 16:11:49 xo-ce xo-server[849]:       at genericNodeError (node:internal/errors:985:15)
                                        Apr 20 16:11:49 xo-ce xo-server[849]:       at wrappedFn (node:internal/errors:539:14)
                                        Apr 20 16:11:49 xo-ce xo-server[849]:       at _addListener (node:events:590:17)
                                        Apr 20 16:11:49 xo-ce xo-server[849]:       at Tasks.addListener (node:events:608:10)
                                        Apr 20 16:11:49 xo-ce xo-server[849]:       at TaskController.getTasks (file:///opt/xen-orchestra/@xen-orchestra/rest-api/dist/tasks/tas>
                                        Apr 20 16:11:49 xo-ce xo-server[849]:       at ExpressTemplateService.buildPromise (/opt/xen-orchestra/node_modules/@tsoa/runtime/src/ro>
                                        Apr 20 16:11:49 xo-ce xo-server[849]:       at ExpressTemplateService.apiHandler (/opt/xen-orchestra/node_modules/@tsoa/runtime/src/rout>
                                        Apr 20 16:11:49 xo-ce xo-server[849]:       at TaskController_getTasks (file:///opt/xen-orchestra/@xen-orchestra/rest-api/dist/open-api/>
                                        Apr 20 16:11:49 xo-ce xo-server[849]:     emitter: Tasks {
                                        Apr 20 16:11:49 xo-ce xo-server[849]:       _events: [Object: null prototype],
                                        Apr 20 16:11:49 xo-ce xo-server[849]:       _eventsCount: 2,
                                        Apr 20 16:11:49 xo-ce xo-server[849]:       _maxListeners: undefined,
                                        Apr 20 16:11:49 xo-ce xo-server[849]:       Symbol(shapeMode): false,
                                        Apr 20 16:11:49 xo-ce xo-server[849]:       Symbol(kCapture): false
                                        Apr 20 16:11:49 xo-ce xo-server[849]:     },
                                        Apr 20 16:11:49 xo-ce xo-server[849]:     type: 'update',
                                        Apr 20 16:11:49 xo-ce xo-server[849]:     count: 11
                                        Apr 20 16:11:49 xo-ce xo-server[849]:   }
                                        Apr 20 16:11:49 xo-ce xo-server[849]: }
                                        Apr 20 16:11:49 xo-ce xo-server[849]: (node:849) MaxListenersExceededWarning: Possible EventEmitter memory leak detected. 11 remove list>
                                        Apr 20 16:11:49 xo-ce xo-server[849]: 2026-04-20T20:11:49.190Z xo:xo-server WARN Node warning {
                                        Apr 20 16:11:49 xo-ce xo-server[849]:   error: MaxListenersExceededWarning: Possible EventEmitter memory leak detected. 11 remove listen>
                                        Apr 20 16:11:49 xo-ce xo-server[849]:       at genericNodeError (node:internal/errors:985:15)
                                        Apr 20 16:11:49 xo-ce xo-server[849]:       at wrappedFn (node:internal/errors:539:14)
                                        Apr 20 16:11:49 xo-ce xo-server[849]:       at _addListener (node:events:590:17)
                                        Apr 20 16:11:49 xo-ce xo-server[849]:       at Tasks.addListener (node:events:608:10)
                                        Apr 20 16:11:49 xo-ce xo-server[849]:       at TaskController.getTasks (file:///opt/xen-orchestra/@xen-orchestra/rest-api/dist/tasks/tas>
                                        Apr 20 16:11:49 xo-ce xo-server[849]:       at ExpressTemplateService.buildPromise (/opt/xen-orchestra/node_modules/@tsoa/runtime/src/ro>
                                        Apr 20 16:11:49 xo-ce xo-server[849]:       at ExpressTemplateService.apiHandler (/opt/xen-orchestra/node_modules/@tsoa/runtime/src/rout>
                                        Apr 20 16:11:49 xo-ce xo-server[849]:       at TaskController_getTasks (file:///opt/xen-orchestra/@xen-orchestra/rest-api/dist/open-api/>
                                        Apr 20 16:11:49 xo-ce xo-server[849]:     emitter: Tasks {
                                        Apr 20 16:11:49 xo-ce xo-server[849]:       _events: [Object: null prototype],
                                        Apr 20 16:11:49 xo-ce xo-server[849]:       _eventsCount: 2,
                                        Apr 20 16:11:49 xo-ce xo-server[849]:       _maxListeners: undefined,
                                        Apr 20 16:11:49 xo-ce xo-server[849]:       Symbol(shapeMode): false,
                                        Apr 20 16:11:49 xo-ce xo-server[849]:       Symbol(kCapture): false
                                        Apr 20 16:11:49 xo-ce xo-server[849]:     },
                                        Apr 20 16:11:49 xo-ce xo-server[849]:     type: 'remove',
                                        Apr 20 16:11:49 xo-ce xo-server[849]:     count: 11
                                        Apr 20 16:11:49 xo-ce xo-server[849]:   }
                                        Apr 20 16:11:49 xo-ce xo-server[849]: }
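
Side note on the warning above: Node.js prints MaxListenersExceededWarning when more listeners than the limit are attached to one event of an EventEmitter. The limit is 10 by default, and the default applies here since the log shows `_maxListeners: undefined`. The warning is a hint about a possible leak, not an error by itself. A minimal sketch that reproduces the same count of 11, assuming nothing about XO's real code (the `tasks` emitter below is a made-up stand-in):

```javascript
const { EventEmitter } = require('node:events')

// Made-up stand-in for the "Tasks" emitter named in the log.
const tasks = new EventEmitter()

// No explicit limit was set, so the process-wide default applies.
const limit = tasks.getMaxListeners()

// Attach one listener more than the limit to the same event.
// Node.js reports this on stderr as a MaxListenersExceededWarning.
for (let i = 0; i < limit + 1; i++) {
  tasks.on('update', () => {})
}

console.log(`limit=${limit} listeners=${tasks.listenerCount('update')}`)
```

If the listeners are legitimate, the limit can be raised with `emitter.setMaxListeners(n)`. Whether that applies to XO here is for the XO team to judge.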
                                        
                                        • acebmxerA Online
                                          acebmxer
                                          last edited by

                                          @florent

                                          Killed the current tasks, updated XO to the latest commit and started the backups again... here are the current errors from XO...

                                           sudo journalctl -u xo-server -n 50
                                          Apr 20 17:29:54 xo-ce xo-server[3147]:   Symbol(undici.error.UND_ERR): true,
                                          Apr 20 17:29:54 xo-ce xo-server[3147]:   Symbol(undici.error.UND_ERR_SOCKET): true
                                          Apr 20 17:29:54 xo-ce xo-server[3147]: }
                                          Apr 20 17:30:29 xo-ce xo-server[3262]: 2026-04-20T21:30:29.390Z xo:xapi:vdi INFO  OpaqueRef:4efd6d02-6c4d-26f8-7ed5-1b9b34daa89d has been disconnected from dom0>
                                          Apr 20 17:30:29 xo-ce xo-server[3262]:   vdiRef: 'OpaqueRef:ce354818-4cf0-2ac3-8156-3d7118548ecd',
                                          Apr 20 17:30:29 xo-ce xo-server[3262]:   vbdRef: 'OpaqueRef:4efd6d02-6c4d-26f8-7ed5-1b9b34daa89d'
                                          Apr 20 17:30:29 xo-ce xo-server[3262]: }
                                          Apr 20 17:30:29 xo-ce xo-server[3262]: 2026-04-20T21:30:29.392Z xo:xapi:xapi-disks WARN Either transmit the source to the constructor or implement openSource an>
                                          Apr 20 17:30:29 xo-ce xo-server[3262]:   error: Error: Either transmit the source to the constructor or implement openSource and call init
                                          Apr 20 17:30:29 xo-ce xo-server[3262]:       at get source (file:///opt/xen-orchestra/@xen-orchestra/disk-transform/dist/DiskPassthrough.mjs:53:19)
                                          Apr 20 17:30:29 xo-ce xo-server[3262]:       at XapiQcow2StreamSource.close (file:///opt/xen-orchestra/@xen-orchestra/disk-transform/dist/DiskPassthrough.mjs:86>
                                          Apr 20 17:30:29 xo-ce xo-server[3262]:       at XapiQcow2StreamSource.close (file:///opt/xen-orchestra/@xen-orchestra/xapi/disks/XapiQcow2StreamSource.mjs:61:18)
                                          Apr 20 17:30:29 xo-ce xo-server[3262]:       at #openExportStream (file:///opt/xen-orchestra/@xen-orchestra/xapi/disks/Xapi.mjs:189:21)
                                          Apr 20 17:30:29 xo-ce xo-server[3262]:       at process.processTicksAndRejections (node:internal/process/task_queues:104:5)
                                          Apr 20 17:30:29 xo-ce xo-server[3262]:       at async #openNbdStream (file:///opt/xen-orchestra/@xen-orchestra/xapi/disks/Xapi.mjs:97:22)
                                          Apr 20 17:30:29 xo-ce xo-server[3262]:       at async XapiDiskSource.openSource (file:///opt/xen-orchestra/@xen-orchestra/xapi/disks/Xapi.mjs:258:18)
                                          Apr 20 17:30:29 xo-ce xo-server[3262]:       at async XapiDiskSource.init (file:///opt/xen-orchestra/@xen-orchestra/disk-transform/dist/DiskPassthrough.mjs:28:4>
                                          Apr 20 17:30:29 xo-ce xo-server[3262]:       at async file:///opt/xen-orchestra/@xen-orchestra/backups/_incrementalVm.mjs:66:5
                                          Apr 20 17:30:29 xo-ce xo-server[3262]:       at async Promise.all (index 0)
                                          Apr 20 17:30:29 xo-ce xo-server[3262]: }
                                          Apr 20 17:30:29 xo-ce xo-server[3262]: 2026-04-20T21:30:29.392Z xo:xapi:xapi-disks WARN can't compute delta OpaqueRef:ce354818-4cf0-2ac3-8156-3d7118548ecd from >
                                          Apr 20 17:30:29 xo-ce xo-server[3262]:   error: BodyTimeoutError: Body Timeout Error
                                          Apr 20 17:30:29 xo-ce xo-server[3262]:       at FastTimer.onParserTimeout [as _onTimeout] (/opt/xen-orchestra/node_modules/undici/lib/dispatcher/client-h1.js:64>
                                          Apr 20 17:30:29 xo-ce xo-server[3262]:       at Timeout.onTick [as _onTimeout] (/opt/xen-orchestra/node_modules/undici/lib/util/timers.js:162:13)
                                          Apr 20 17:30:29 xo-ce xo-server[3262]:       at listOnTimeout (node:internal/timers:605:17)
                                          Apr 20 17:30:29 xo-ce xo-server[3262]:       at process.processTimers (node:internal/timers:541:7) {
                                          Apr 20 17:30:29 xo-ce xo-server[3262]:     code: 'UND_ERR_BODY_TIMEOUT',
                                          Apr 20 17:30:29 xo-ce xo-server[3262]:     Symbol(undici.error.UND_ERR): true,
                                          Apr 20 17:30:29 xo-ce xo-server[3262]:     Symbol(undici.error.UND_ERR_BODY_TIMEOUT): true
                                          Apr 20 17:30:29 xo-ce xo-server[3262]:   }
                                          Apr 20 17:30:29 xo-ce xo-server[3262]: }
                                          Apr 20 17:30:29 xo-ce xo-server[3262]: 2026-04-20T21:30:29.395Z xo:xapi:xapi-disks INFO export through qcow2
                                          Apr 20 17:30:36 xo-ce xo-server[3262]: 2026-04-20T21:30:36.324Z xo:backups:worker ERROR unhandled error event {
                                          Apr 20 17:30:36 xo-ce xo-server[3262]:   error: RequestAbortedError [AbortError]: Request aborted
                                          Apr 20 17:30:36 xo-ce xo-server[3262]:       at BodyReadable.destroy (/opt/xen-orchestra/node_modules/undici/lib/api/readable.js:51:13)
                                          Apr 20 17:30:36 xo-ce xo-server[3262]:       at QcowStream.close (file:///opt/xen-orchestra/@xen-orchestra/qcow2/dist/disk/QcowStream.mjs:40:22)
                                          Apr 20 17:30:36 xo-ce xo-server[3262]:       at XapiQcow2StreamSource.close (file:///opt/xen-orchestra/@xen-orchestra/disk-transform/dist/DiskPassthrough.mjs:86>
                                          Apr 20 17:30:36 xo-ce xo-server[3262]:       at XapiQcow2StreamSource.close (file:///opt/xen-orchestra/@xen-orchestra/xapi/disks/XapiQcow2StreamSource.mjs:61:18)
                                          Apr 20 17:30:36 xo-ce xo-server[3262]:       at DiskLargerBlock.close (file:///opt/xen-orchestra/@xen-orchestra/disk-transform/dist/DiskLargerBlock.mjs:87:28)
                                          Apr 20 17:30:36 xo-ce xo-server[3262]:       at TimeoutDisk.close (file:///opt/xen-orchestra/@xen-orchestra/disk-transform/dist/DiskPassthrough.mjs:34:29)
                                          Apr 20 17:30:36 xo-ce xo-server[3262]:       at XapiStreamNbdSource.close (file:///opt/xen-orchestra/@xen-orchestra/disk-transform/dist/DiskPassthrough.mjs:34:2>
                                          Apr 20 17:30:36 xo-ce xo-server[3262]:       at XapiStreamNbdSource.init (file:///opt/xen-orchestra/@xen-orchestra/xapi/disks/XapiStreamNbd.mjs:66:17)
                                          Apr 20 17:30:36 xo-ce xo-server[3262]:       at process.processTicksAndRejections (node:internal/process/task_queues:104:5)
                                          Apr 20 17:30:36 xo-ce xo-server[3262]:       at async #openNbdStream (file:///opt/xen-orchestra/@xen-orchestra/xapi/disks/Xapi.mjs:108:7) {
                                          Apr 20 17:30:36 xo-ce xo-server[3262]:     code: 'UND_ERR_ABORTED',
                                          Apr 20 17:30:36 xo-ce xo-server[3262]:     Symbol(undici.error.UND_ERR): true,
                                          Apr 20 17:30:36 xo-ce xo-server[3262]:     Symbol(undici.error.UND_ERR_ABORT): true,
                                          Apr 20 17:30:36 xo-ce xo-server[3262]:     Symbol(undici.error.UND_ERR_ABORTED): true
                                          Apr 20 17:30:36 xo-ce xo-server[3262]:   }
                                          Apr 20 17:30:36 xo-ce xo-server[3262]: }
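
To get a quick overview of which undici error codes appear in a long journal excerpt like the one above, the `code: '...'` lines can be counted. A small sketch, assuming the journal output was saved as text; `countErrorCodes` is a helper invented for this example, and the two sample lines are copied from the log above:

```javascript
// Collect every `code: '...'` value from pasted journalctl output and count it.
function countErrorCodes(journalText) {
  const counts = new Map()
  const pattern = /\bcode: '([A-Z0-9_]+)'/g
  for (const match of journalText.matchAll(pattern)) {
    counts.set(match[1], (counts.get(match[1]) ?? 0) + 1)
  }
  return counts
}

// Two lines taken from the excerpt above.
const excerpt = [
  "Apr 20 17:30:29 xo-ce xo-server[3262]:     code: 'UND_ERR_BODY_TIMEOUT',",
  "Apr 20 17:30:36 xo-ce xo-server[3262]:     code: 'UND_ERR_ABORTED',",
].join('\n')

const counts = countErrorCodes(excerpt)
console.log(counts)
```

In this excerpt the body timeout is logged a few seconds before the abort, which helps when reading the stack traces in order.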
                                          
                                          • acebmxerA Online
                                            acebmxer
                                            last edited by acebmxer

                                            @florent

                                            After some digging this is what I have come up with. Please double check everything...
                                            I can PM you the whole chat session if you like.

                                            Bug Report: XO Backup Intermittent Failure (RequestAbortedError During NBD Stream Init)

                                            Environment:

                                            XCP-ng: 8.3.0 (build 20260408, xapi 26.1.3)
                                            xapi-nbd: 26.1.3-1.6.xcpng8.3
                                            xo-server: community edition (xen-orchestra from source)
                                            Pool: 2-node pool (host1 10.100.2.10, host2 10.100.2.11)
                                            Backup NFS target: 10.100.2.23:/volume1/backup

                                            Symptom:

                                            Scheduled backup jobs intermittently fail with RequestAbortedError: Request aborted during NBD stream initialization. The failure is transient: the same VMs back up successfully on subsequent runs.

                                            xo:backups:worker ERROR unhandled error event
                                            error: RequestAbortedError [AbortError]: Request aborted
                                                at BodyReadable.destroy (undici/lib/api/readable.js:51:13)
                                                at QcowStream.close (@xen-orchestra/qcow2/dist/disk/QcowStream.mjs:40:22)
                                                at XapiQcow2StreamSource.close (@xen-orchestra/disk-transform/dist/DiskPassthrough.mjs:86:28)
                                                at XapiQcow2StreamSource.close (@xen-orchestra/xapi/disks/XapiQcow2StreamSource.mjs:61:18)
                                                at DiskLargerBlock.close (@xen-orchestra/disk-transform/dist/DiskLargerBlock.mjs:87:28)
                                                at TimeoutDisk.close (@xen-orchestra/disk-transform/dist/DiskPassthrough.mjs:34:29)
                                                at XapiStreamNbdSource.close (@xen-orchestra/disk-transform/dist/DiskPassthrough.mjs:34:29)
                                                at XapiStreamNbdSource.init (@xen-orchestra/xapi/disks/XapiStreamNbd.mjs:66:17)
                                                at async #openNbdStream (@xen-orchestra/xapi/disks/Xapi.mjs:108:7)

                                            Root Cause Analysis:

                                            The error chain is misleading — QcowStream.close and BodyReadable.destroy are cleanup, not the cause. The actual failure is inside connectNbdClientIfPossible() called at XapiStreamNbd.mjs:66.

                                            The sequence in #openNbdStream (Xapi.mjs) is:

                                            1. #openExportStream(): opens a qcow2/VHD HTTP stream from XAPI (succeeds)
                                            2. new XapiStreamNbdSource(streamSource, ...): wraps it
                                            3. await source.init(): calls super.init() then connectNbdClientIfPossible()
                                            4. If connectNbdClientIfPossible() throws for any reason other than NO_NBD_AVAILABLE, execution goes to the catch block in #openNbdStream, which calls source?.close(). This closes the already-open qcow2 HTTP stream, producing the BodyReadable.destroy → AbortError cascade.

                                            The underlying NBD connection failure: MultiNbdClient.connect() opens nbdConcurrency (default 2) sequential connections. Each NbdClient.connect() failure causes the candidate host to be removed and retried with another candidate. With only 2 hosts in the pool and nbdConcurrency=2, a single transient TLS or TCP failure on one host during the NBD option negotiation can exhaust all candidates, causing MultiNbdClient to throw NO_NBD_AVAILABLE, but this error IS caught and falls back to stream export. So the failure here is something else: a connection that partially succeeds then aborts, throwing a non-NO_NBD_AVAILABLE error that propagates uncaught to #openNbdStream's catch block.

                                            Specific issue: When nbdClient.connect() throws with UND_ERR_ABORTED (an undici abort), the error code is not NO_NBD_AVAILABLE, so #openNbdStream re-throws it instead of falling back to stream export. The backup then fails entirely rather than gracefully degrading.

                                            Proposed Fix:

                                            In Xapi.mjs, the catch block in #openNbdStream should treat any NBD connection failure as fallback-eligible, not just NO_NBD_AVAILABLE:

                                            } catch (err) {
                                              // NBD is optional: for these codes, keep the already-open stream export
                                              if (err.code === 'NO_NBD_AVAILABLE' || err.code === 'UND_ERR_ABORTED') {
                                                warn("can't connect through NBD, fall back to stream export", { err })
                                                if (streamSource === undefined) {
                                                  throw new Error("Can't open stream source")
                                                }
                                                return streamSource
                                              }
                                              // any other error: clean up and fail as before
                                              await source?.close().catch(warn)
                                              throw err
                                            }
                                            Or more robustly, treat any NBD connection error as fallback-eligible rather than hardcoding error codes:

                                            } catch (err) {
                                              // any NBD failure falls back to the stream export
                                              warn("can't connect through NBD, fall back to stream export", { err })
                                              if (streamSource === undefined) {
                                                throw new Error("Can't open stream source")
                                              }
                                              return streamSource
                                            }
                                            This matches the intent of the existing NO_NBD_AVAILABLE fallback — NBD is opportunistic, and any failure to establish it should degrade gracefully to HTTP stream export rather than failing the entire backup job.
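
The difference between the two proposed catch blocks can be tried out in isolation. The following is a minimal, self-contained sketch and not XO's actual code: `isFallbackEligible`, `openDisk`, `failingConnect` and the `'stream-export'` placeholder are hypothetical stand-ins, and only the error codes NO_NBD_AVAILABLE and UND_ERR_ABORTED come from the report above.

```javascript
// Decide whether an NBD connection error may fall back to the stream export.
// fallbackCodes === null models "fall back on any error" (second proposal);
// an array models an allow-list of error codes (first proposal).
function isFallbackEligible(err, fallbackCodes = null) {
  return fallbackCodes === null || fallbackCodes.includes(err.code)
}

// Hypothetical stand-in for the NBD-then-stream logic described in the report.
async function openDisk(connectNbd, streamSource, fallbackCodes = null) {
  try {
    return await connectNbd()
  } catch (err) {
    if (!isFallbackEligible(err, fallbackCodes)) {
      throw err // the backup of this disk fails
    }
    if (streamSource === undefined) {
      throw new Error("Can't open stream source")
    }
    return streamSource // degrade to the already-open stream export
  }
}

// Build a connect function that always fails with the given error code.
const failingConnect = code => async () => {
  throw Object.assign(new Error('simulated NBD failure'), { code })
}

async function demo() {
  const connect = failingConnect('UND_ERR_ABORTED')
  // Allow-list containing only NO_NBD_AVAILABLE: the abort is re-thrown.
  const strict = await openDisk(connect, 'stream-export', ['NO_NBD_AVAILABLE']).catch(
    err => `failed: ${err.code}`
  )
  // No allow-list: any NBD error degrades to the stream export.
  const lenient = await openDisk(connect, 'stream-export')
  return { strict, lenient }
}

const demoResult = demo()
demoResult.then(result => console.log(result))
```

With the allow-list limited to NO_NBD_AVAILABLE the call rejects with UND_ERR_ABORTED, while without an allow-list it resolves to the stream source. One trade-off to keep in mind: falling back on every error can also hide a genuine NBD misconfiguration, such as the pool-level NBD setting earlier in this thread, so logging the original error still matters.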

                                            Observed Timeline:

                                            02:22:11 - xo-server opens VHD + qcow2 export streams
                                            02:22:12–15 - NBD connections attempted, fail mid-handshake
                                            02:22:15 - backup fails with UND_ERR_ABORTED, no fallback
                                            02:33:51 - retry attempt also fails in 5 seconds
                                            23:03 - same VMs back up successfully (transient condition resolved)

                                            Impact: Backup jobs fail entirely on transient NBD connectivity issues instead of falling back to HTTP stream export, which is already implemented and working.

                                            You can file this at the XO GitHub issues or the XCP-ng forum. The fix is straightforward and low-risk — the fallback path already exists and works, it's just not being reached for UND_ERR_ABORTED errors.
