XCP-ng

    Backups with qcow2 enabled

      Pilow @acebmxer
      last edited by Pilow

      @acebmxer NFS remotes on the DS1819+ ?

      we have iSCSI SRs (25Gb Mellanox, 6 PIFs on hosts, to a 25Gb MSA2062 SAN, dual controller)
      our remotes are OS-mounted iSCSI volumes on MSA SANs, presented as S3 (MinIO VMs)
      we use XO proxies to offload backups from XOA

      we max out at 150/200Mb/s during backups 😕

      but we are on VHD VDIs, so I'm wondering whether the better backup performance you report could be due to the QCOW2 format on the source SR?
      will have to try VDIs on such an SR to see the difference

        acebmxer @Pilow
        last edited by acebmxer

        @Pilow Yes, NFS on both the VM storage and the backup storage.

        All VMs are now on qcow2. The only exception was the Windows VM, which was on VHD, but I just migrated it over to qcow2 as well. The NICs in all systems are Intel 10Gb, either X520 or X540.

        Edit - Sorry, I missed your question about VHD versus qcow2 performance. I would say it's about equal. I didn't run any benchmarks for comparison (probably should have), but I haven't seen any major slowness other than the GC issues. (See latest post)

          acebmxer
          last edited by acebmxer

          So the progress bar seems to be working now when exporting. But I am also noticing that garbage collection seems to run so often that I feel it's slowing down the import for the health check.

          Screenshot_20260418_104201.png

          Screenshot_20260418_104514.png

            acebmxer
            last edited by acebmxer

            The next thing I am noticing is that garbage collection is not able to coalesce VDIs for VMs that are running. GC keeps trying to run every 30 - 45 seconds, for about 30 seconds each time. The number of VDIs to coalesce keeps increasing unless a VM is powered off and given enough time for garbage collection to actually run.

            Because garbage collection is kicking in so often, importing a VDI for the health check takes longer.

              acebmxer
              last edited by acebmxer

              @florent

              I disabled CBT on the VMs and re-ran a delta backup. I noticed the following.

              While the backup task shows [XO] Exporting, it does not show the percentage or completion. Watching the backup task, a VM will at first show that it is not using NBD but XO, and then fail. The backup log then shows that the VM fell back to a full backup, and the task then shows that it is using NBD.

              Screenshot_20260420_002103.png

              Screenshot_20260420_002557.png

              Screenshot_20260420_002717.png

              Screenshot_20260420_003717.png

              Also, without CBT, GC is not constantly spamming once a VM has finished exporting and started its health check. It appears to actually be running and able to coalesce, as the VDIs are no longer showing in the dashboard / health...

              delta job with CBT enabled - 2026-04-20T00_43_48.487Z - backup NG.txt

              CBT disabled delta job. two vms fell back to full - 2026-04-20T03_19_02.894Z - backup NG.txt

              Next delta job with CBT disabled 1 vm fell back to full - 2026-04-20T04_19_10.531Z - backup NG.txt

              Screenshot_20260420_001804.png

              Screenshot_20260420_010236.png

                acebmxer @acebmxer
                last edited by

                OK, something is happening. I made no additional changes since last night. Now the backup job scheduled for 1 pm is still running 2 hours later.

                Screenshot 2026-04-20 145633.png

                I think I might revert back to VHD, as this has caused so many issues.

                  florent Vates 🪐 XO Team @acebmxer
                  last edited by

                  @acebmxer do you have something in the xo logs (journalctl probably)?

                    acebmxer @florent
                    last edited by acebmxer

                    @florent said:

                    @acebmxer do you have something in the xo logs (journalctl probably)?

                    from host
                    sudo journalctl -u xo-server -n 50
                    -- No entries --

                    From XO - Currently on fbba0 commit (4 behind)

                    Apr 20 15:21:51 xo-ce xo-server[849]:   }
                    Apr 20 15:21:51 xo-ce xo-server[849]: }
                    Apr 20 15:22:35 xo-ce xo-server[849]: 2026-04-20T19:22:35.554Z xo:main INFO + Console proxy ( - 127.0.0.1)
                    Apr 20 15:22:41 xo-ce xo-server[849]: 2026-04-20T19:22:41.365Z xo:main INFO - Console proxy ( - 127.0.0.1)
                    Apr 20 15:22:55 xo-ce xo-server[849]: 2026-04-20T19:22:55.223Z xo:main INFO + Console proxy ( - 127.0.0.1)
                    Apr 20 15:23:04 xo-ce xo-server[849]: 2026-04-20T19:23:04.562Z xo:main INFO - Console proxy ( - 127.0.0.1)
                    Apr 20 16:11:49 xo-ce xo-server[849]: (node:849) MaxListenersExceededWarning: Possible EventEmitter memory leak detected. 11 update list>
                    Apr 20 16:11:49 xo-ce xo-server[849]: 2026-04-20T20:11:49.190Z xo:xo-server WARN Node warning {
                    Apr 20 16:11:49 xo-ce xo-server[849]:   error: MaxListenersExceededWarning: Possible EventEmitter memory leak detected. 11 update listen>
                    Apr 20 16:11:49 xo-ce xo-server[849]:       at genericNodeError (node:internal/errors:985:15)
                    Apr 20 16:11:49 xo-ce xo-server[849]:       at wrappedFn (node:internal/errors:539:14)
                    Apr 20 16:11:49 xo-ce xo-server[849]:       at _addListener (node:events:590:17)
                    Apr 20 16:11:49 xo-ce xo-server[849]:       at Tasks.addListener (node:events:608:10)
                    Apr 20 16:11:49 xo-ce xo-server[849]:       at TaskController.getTasks (file:///opt/xen-orchestra/@xen-orchestra/rest-api/dist/tasks/tas>
                    Apr 20 16:11:49 xo-ce xo-server[849]:       at ExpressTemplateService.buildPromise (/opt/xen-orchestra/node_modules/@tsoa/runtime/src/ro>
                    Apr 20 16:11:49 xo-ce xo-server[849]:       at ExpressTemplateService.apiHandler (/opt/xen-orchestra/node_modules/@tsoa/runtime/src/rout>
                    Apr 20 16:11:49 xo-ce xo-server[849]:       at TaskController_getTasks (file:///opt/xen-orchestra/@xen-orchestra/rest-api/dist/open-api/>
                    Apr 20 16:11:49 xo-ce xo-server[849]:     emitter: Tasks {
                    Apr 20 16:11:49 xo-ce xo-server[849]:       _events: [Object: null prototype],
                    Apr 20 16:11:49 xo-ce xo-server[849]:       _eventsCount: 2,
                    Apr 20 16:11:49 xo-ce xo-server[849]:       _maxListeners: undefined,
                    Apr 20 16:11:49 xo-ce xo-server[849]:       Symbol(shapeMode): false,
                    Apr 20 16:11:49 xo-ce xo-server[849]:       Symbol(kCapture): false
                    Apr 20 16:11:49 xo-ce xo-server[849]:     },
                    Apr 20 16:11:49 xo-ce xo-server[849]:     type: 'update',
                    Apr 20 16:11:49 xo-ce xo-server[849]:     count: 11
                    Apr 20 16:11:49 xo-ce xo-server[849]:   }
                    Apr 20 16:11:49 xo-ce xo-server[849]: }
                    Apr 20 16:11:49 xo-ce xo-server[849]: (node:849) MaxListenersExceededWarning: Possible EventEmitter memory leak detected. 11 remove list>
                    Apr 20 16:11:49 xo-ce xo-server[849]: 2026-04-20T20:11:49.190Z xo:xo-server WARN Node warning {
                    Apr 20 16:11:49 xo-ce xo-server[849]:   error: MaxListenersExceededWarning: Possible EventEmitter memory leak detected. 11 remove listen>
                    Apr 20 16:11:49 xo-ce xo-server[849]:       at genericNodeError (node:internal/errors:985:15)
                    Apr 20 16:11:49 xo-ce xo-server[849]:       at wrappedFn (node:internal/errors:539:14)
                    Apr 20 16:11:49 xo-ce xo-server[849]:       at _addListener (node:events:590:17)
                    Apr 20 16:11:49 xo-ce xo-server[849]:       at Tasks.addListener (node:events:608:10)
                    Apr 20 16:11:49 xo-ce xo-server[849]:       at TaskController.getTasks (file:///opt/xen-orchestra/@xen-orchestra/rest-api/dist/tasks/tas>
                    Apr 20 16:11:49 xo-ce xo-server[849]:       at ExpressTemplateService.buildPromise (/opt/xen-orchestra/node_modules/@tsoa/runtime/src/ro>
                    Apr 20 16:11:49 xo-ce xo-server[849]:       at ExpressTemplateService.apiHandler (/opt/xen-orchestra/node_modules/@tsoa/runtime/src/rout>
                    Apr 20 16:11:49 xo-ce xo-server[849]:       at TaskController_getTasks (file:///opt/xen-orchestra/@xen-orchestra/rest-api/dist/open-api/>
                    Apr 20 16:11:49 xo-ce xo-server[849]:     emitter: Tasks {
                    Apr 20 16:11:49 xo-ce xo-server[849]:       _events: [Object: null prototype],
                    Apr 20 16:11:49 xo-ce xo-server[849]:       _eventsCount: 2,
                    Apr 20 16:11:49 xo-ce xo-server[849]:       _maxListeners: undefined,
                    Apr 20 16:11:49 xo-ce xo-server[849]:       Symbol(shapeMode): false,
                    Apr 20 16:11:49 xo-ce xo-server[849]:       Symbol(kCapture): false
                    Apr 20 16:11:49 xo-ce xo-server[849]:     },
                    Apr 20 16:11:49 xo-ce xo-server[849]:     type: 'remove',
                    Apr 20 16:11:49 xo-ce xo-server[849]:     count: 11
                    Apr 20 16:11:49 xo-ce xo-server[849]:   }
                    Apr 20 16:11:49 xo-ce xo-server[849]: }
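
For context on the warning in the log above: Node prints MaxListenersExceededWarning when more than 10 listeners (the default limit) are attached to one event on the same emitter. It is a hint rather than an error, and on its own it does not explain a stalled backup. Below is a minimal sketch of what triggers it. It uses a plain EventEmitter as a hypothetical stand-in for the Tasks emitter named in the stack trace, and the subscribe helper is made up for illustration; it is not how the XO REST API is implemented.

```javascript
const { EventEmitter } = require('node:events')

// Hypothetical stand-in for the `Tasks` emitter named in the log.
const tasks = new EventEmitter()

// Illustrative helper: registers an 'update' listener and returns a
// function that removes that same listener again.
function subscribe(emitter, onUpdate) {
  emitter.on('update', onUpdate)
  return () => emitter.off('update', onUpdate)
}

// Eleven subscribers attached at the same time: one more than the default
// limit of 10, which is the point at which Node prints the warning.
const unsubscribers = []
for (let i = 0; i < 11; i++) {
  unsubscribers.push(subscribe(tasks, () => {}))
}
const attachedBeforeCleanup = tasks.listenerCount('update')

// Detaching each listener once its subscriber is done brings the count back down.
for (const unsubscribe of unsubscribers) unsubscribe()
const attachedAfterCleanup = tasks.listenerCount('update')
```

If eleven simultaneous subscribers are legitimate (for example, several open browser tabs), the limit can be raised with emitter.setMaxListeners(n); if the count keeps growing over time, listeners are probably not being removed.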
                    
                      acebmxer
                      last edited by

                      @florent

                      Killed the current tasks, updated XO to the latest commit and started the backups again... here are the current errors from XO...

                       sudo journalctl -u xo-server -n 50
                      Apr 20 17:29:54 xo-ce xo-server[3147]:   Symbol(undici.error.UND_ERR): true,
                      Apr 20 17:29:54 xo-ce xo-server[3147]:   Symbol(undici.error.UND_ERR_SOCKET): true
                      Apr 20 17:29:54 xo-ce xo-server[3147]: }
                      Apr 20 17:30:29 xo-ce xo-server[3262]: 2026-04-20T21:30:29.390Z xo:xapi:vdi INFO  OpaqueRef:4efd6d02-6c4d-26f8-7ed5-1b9b34daa89d has been disconnected from dom0>
                      Apr 20 17:30:29 xo-ce xo-server[3262]:   vdiRef: 'OpaqueRef:ce354818-4cf0-2ac3-8156-3d7118548ecd',
                      Apr 20 17:30:29 xo-ce xo-server[3262]:   vbdRef: 'OpaqueRef:4efd6d02-6c4d-26f8-7ed5-1b9b34daa89d'
                      Apr 20 17:30:29 xo-ce xo-server[3262]: }
                      Apr 20 17:30:29 xo-ce xo-server[3262]: 2026-04-20T21:30:29.392Z xo:xapi:xapi-disks WARN Either transmit the source to the constructor or implement openSource an>
                      Apr 20 17:30:29 xo-ce xo-server[3262]:   error: Error: Either transmit the source to the constructor or implement openSource and call init
                      Apr 20 17:30:29 xo-ce xo-server[3262]:       at get source (file:///opt/xen-orchestra/@xen-orchestra/disk-transform/dist/DiskPassthrough.mjs:53:19)
                      Apr 20 17:30:29 xo-ce xo-server[3262]:       at XapiQcow2StreamSource.close (file:///opt/xen-orchestra/@xen-orchestra/disk-transform/dist/DiskPassthrough.mjs:86>
                      Apr 20 17:30:29 xo-ce xo-server[3262]:       at XapiQcow2StreamSource.close (file:///opt/xen-orchestra/@xen-orchestra/xapi/disks/XapiQcow2StreamSource.mjs:61:18)
                      Apr 20 17:30:29 xo-ce xo-server[3262]:       at #openExportStream (file:///opt/xen-orchestra/@xen-orchestra/xapi/disks/Xapi.mjs:189:21)
                      Apr 20 17:30:29 xo-ce xo-server[3262]:       at process.processTicksAndRejections (node:internal/process/task_queues:104:5)
                      Apr 20 17:30:29 xo-ce xo-server[3262]:       at async #openNbdStream (file:///opt/xen-orchestra/@xen-orchestra/xapi/disks/Xapi.mjs:97:22)
                      Apr 20 17:30:29 xo-ce xo-server[3262]:       at async XapiDiskSource.openSource (file:///opt/xen-orchestra/@xen-orchestra/xapi/disks/Xapi.mjs:258:18)
                      Apr 20 17:30:29 xo-ce xo-server[3262]:       at async XapiDiskSource.init (file:///opt/xen-orchestra/@xen-orchestra/disk-transform/dist/DiskPassthrough.mjs:28:4>
                      Apr 20 17:30:29 xo-ce xo-server[3262]:       at async file:///opt/xen-orchestra/@xen-orchestra/backups/_incrementalVm.mjs:66:5
                      Apr 20 17:30:29 xo-ce xo-server[3262]:       at async Promise.all (index 0)
                      Apr 20 17:30:29 xo-ce xo-server[3262]: }
                      Apr 20 17:30:29 xo-ce xo-server[3262]: 2026-04-20T21:30:29.392Z xo:xapi:xapi-disks WARN can't compute delta OpaqueRef:ce354818-4cf0-2ac3-8156-3d7118548ecd from >
                      Apr 20 17:30:29 xo-ce xo-server[3262]:   error: BodyTimeoutError: Body Timeout Error
                      Apr 20 17:30:29 xo-ce xo-server[3262]:       at FastTimer.onParserTimeout [as _onTimeout] (/opt/xen-orchestra/node_modules/undici/lib/dispatcher/client-h1.js:64>
                      Apr 20 17:30:29 xo-ce xo-server[3262]:       at Timeout.onTick [as _onTimeout] (/opt/xen-orchestra/node_modules/undici/lib/util/timers.js:162:13)
                      Apr 20 17:30:29 xo-ce xo-server[3262]:       at listOnTimeout (node:internal/timers:605:17)
                      Apr 20 17:30:29 xo-ce xo-server[3262]:       at process.processTimers (node:internal/timers:541:7) {
                      Apr 20 17:30:29 xo-ce xo-server[3262]:     code: 'UND_ERR_BODY_TIMEOUT',
                      Apr 20 17:30:29 xo-ce xo-server[3262]:     Symbol(undici.error.UND_ERR): true,
                      Apr 20 17:30:29 xo-ce xo-server[3262]:     Symbol(undici.error.UND_ERR_BODY_TIMEOUT): true
                      Apr 20 17:30:29 xo-ce xo-server[3262]:   }
                      Apr 20 17:30:29 xo-ce xo-server[3262]: }
                      Apr 20 17:30:29 xo-ce xo-server[3262]: 2026-04-20T21:30:29.395Z xo:xapi:xapi-disks INFO export through qcow2
                      Apr 20 17:30:36 xo-ce xo-server[3262]: 2026-04-20T21:30:36.324Z xo:backups:worker ERROR unhandled error event {
                      Apr 20 17:30:36 xo-ce xo-server[3262]:   error: RequestAbortedError [AbortError]: Request aborted
                      Apr 20 17:30:36 xo-ce xo-server[3262]:       at BodyReadable.destroy (/opt/xen-orchestra/node_modules/undici/lib/api/readable.js:51:13)
                      Apr 20 17:30:36 xo-ce xo-server[3262]:       at QcowStream.close (file:///opt/xen-orchestra/@xen-orchestra/qcow2/dist/disk/QcowStream.mjs:40:22)
                      Apr 20 17:30:36 xo-ce xo-server[3262]:       at XapiQcow2StreamSource.close (file:///opt/xen-orchestra/@xen-orchestra/disk-transform/dist/DiskPassthrough.mjs:86>
                      Apr 20 17:30:36 xo-ce xo-server[3262]:       at XapiQcow2StreamSource.close (file:///opt/xen-orchestra/@xen-orchestra/xapi/disks/XapiQcow2StreamSource.mjs:61:18)
                      Apr 20 17:30:36 xo-ce xo-server[3262]:       at DiskLargerBlock.close (file:///opt/xen-orchestra/@xen-orchestra/disk-transform/dist/DiskLargerBlock.mjs:87:28)
                      Apr 20 17:30:36 xo-ce xo-server[3262]:       at TimeoutDisk.close (file:///opt/xen-orchestra/@xen-orchestra/disk-transform/dist/DiskPassthrough.mjs:34:29)
                      Apr 20 17:30:36 xo-ce xo-server[3262]:       at XapiStreamNbdSource.close (file:///opt/xen-orchestra/@xen-orchestra/disk-transform/dist/DiskPassthrough.mjs:34:2>
                      Apr 20 17:30:36 xo-ce xo-server[3262]:       at XapiStreamNbdSource.init (file:///opt/xen-orchestra/@xen-orchestra/xapi/disks/XapiStreamNbd.mjs:66:17)
                      Apr 20 17:30:36 xo-ce xo-server[3262]:       at process.processTicksAndRejections (node:internal/process/task_queues:104:5)
                      Apr 20 17:30:36 xo-ce xo-server[3262]:       at async #openNbdStream (file:///opt/xen-orchestra/@xen-orchestra/xapi/disks/Xapi.mjs:108:7) {
                      Apr 20 17:30:36 xo-ce xo-server[3262]:     code: 'UND_ERR_ABORTED',
                      Apr 20 17:30:36 xo-ce xo-server[3262]:     Symbol(undici.error.UND_ERR): true,
                      Apr 20 17:30:36 xo-ce xo-server[3262]:     Symbol(undici.error.UND_ERR_ABORT): true,
                      Apr 20 17:30:36 xo-ce xo-server[3262]:     Symbol(undici.error.UND_ERR_ABORTED): true
                      Apr 20 17:30:36 xo-ce xo-server[3262]:   }
                      Apr 20 17:30:36 xo-ce xo-server[3262]: }
                      
                        acebmxer
                        last edited by acebmxer

                        @florent

                        After some digging this is what I have come up with. Please double check everything...
                        I can PM you the whole chat session if you like.

                        Bug Report: XO Backup Intermittent Failure — RequestAbortedError During NBD Stream Init
                        Environment:

                        XCP-ng: 8.3.0 (build 20260408, xapi 26.1.3)
                        xapi-nbd: 26.1.3-1.6.xcpng8.3
                        xo-server: community edition (xen-orchestra from source)
                        Pool: 2-node pool (host1 10.100.2.10, host2 10.100.2.11)
                        Backup NFS target: 10.100.2.23:/volume1/backup

                        Symptom:

                        Scheduled backup jobs intermittently fail with RequestAbortedError: Request aborted during NBD stream initialization. The failure is transient — the same VMs back up successfully on subsequent runs.

                        xo:backups:worker ERROR unhandled error event
                        error: RequestAbortedError [AbortError]: Request aborted
                            at BodyReadable.destroy (undici/lib/api/readable.js:51:13)
                            at QcowStream.close (@xen-orchestra/qcow2/dist/disk/QcowStream.mjs:40:22)
                            at XapiQcow2StreamSource.close (@xen-orchestra/disk-transform/dist/DiskPassthrough.mjs:86:28)
                            at XapiQcow2StreamSource.close (@xen-orchestra/xapi/disks/XapiQcow2StreamSource.mjs:61:18)
                            at DiskLargerBlock.close (@xen-orchestra/disk-transform/dist/DiskLargerBlock.mjs:87:28)
                            at TimeoutDisk.close (@xen-orchestra/disk-transform/dist/DiskPassthrough.mjs:34:29)
                            at XapiStreamNbdSource.close (@xen-orchestra/disk-transform/dist/DiskPassthrough.mjs:34:29)
                            at XapiStreamNbdSource.init (@xen-orchestra/xapi/disks/XapiStreamNbd.mjs:66:17)
                            at async #openNbdStream (@xen-orchestra/xapi/disks/Xapi.mjs:108:7)

                        Root Cause Analysis:

                        The error chain is misleading — QcowStream.close and BodyReadable.destroy are cleanup, not the cause. The actual failure is inside connectNbdClientIfPossible() called at XapiStreamNbd.mjs:66.

                        The sequence in #openNbdStream (Xapi.mjs) is:

                        1. #openExportStream(): opens a qcow2/VHD HTTP stream from XAPI (succeeds)
                        2. new XapiStreamNbdSource(streamSource, ...): wraps it
                        3. await source.init(): calls super.init(), then connectNbdClientIfPossible()
                        4. If connectNbdClientIfPossible() throws for any reason other than NO_NBD_AVAILABLE, execution goes to the catch block in #openNbdStream, which calls source?.close(). This closes the already-open qcow2 HTTP stream, producing the BodyReadable.destroy → AbortError cascade

                        The underlying NBD connection failure: MultiNbdClient.connect() opens nbdConcurrency (default 2) sequential connections. Each NbdClient.connect() failure causes the candidate host to be removed and retried with another candidate. With only 2 hosts in the pool and nbdConcurrency=2, a single transient TLS or TCP failure on one host during the NBD option negotiation can exhaust all candidates, causing MultiNbdClient to throw NO_NBD_AVAILABLE — but this error IS caught and falls back to stream export. So the failure here is something else: a connection that partially succeeds then aborts, throwing a non-NO_NBD_AVAILABLE error that propagates uncaught to #openNbdStream's catch block.
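
To make the candidate-exhaustion argument concrete, here is a small model of it. This is only a sketch under assumptions: connectAll, the round-robin host selection and the connect callback are hypothetical stand-ins, not the actual MultiNbdClient code. The only things taken from the analysis above are that a failing host is dropped from the candidate list and that running out of candidates yields NO_NBD_AVAILABLE.

```javascript
// Open `concurrency` connections, one after another. A host whose connection
// attempt fails is removed from the candidate list and another one is tried.
// `connect` is a hypothetical async callback standing in for NbdClient.connect().
async function connectAll(hosts, concurrency, connect) {
  const candidates = [...hosts]
  const connections = []
  for (let slot = 0; slot < concurrency; slot++) {
    let connection
    while (connection === undefined) {
      if (candidates.length === 0) {
        // Every candidate has failed at least once: report it with the code
        // that the caller treats as "fall back to stream export".
        const error = new Error('no NBD candidate left')
        error.code = 'NO_NBD_AVAILABLE'
        throw error
      }
      // Assumption: hosts are picked round-robin.
      const host = candidates[slot % candidates.length]
      try {
        connection = await connect(host)
      } catch {
        candidates.splice(candidates.indexOf(host), 1)
      }
    }
    connections.push(connection)
  }
  return connections
}
```

In this model a single failing host out of two only shifts both connections onto the healthy host; NO_NBD_AVAILABLE appears only once every host has failed. That fits the conclusion above that the observed failure takes a different path.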

                        Specific issue: When nbdClient.connect() throws with UND_ERR_ABORTED (an undici abort), the error code is not NO_NBD_AVAILABLE, so #openNbdStream re-throws it instead of falling back to stream export. The backup then fails entirely rather than gracefully degrading.

                        Proposed Fix:

                        In Xapi.mjs, the catch block in #openNbdStream should treat any NBD connection failure as fallback-eligible, not just NO_NBD_AVAILABLE:


                            } catch (err) {
                              // also fall back when the NBD connection was aborted mid-handshake
                              if (err.code === 'NO_NBD_AVAILABLE' || err.code === 'UND_ERR_ABORTED') {
                                warn("can't connect through NBD, fall back to stream export", { err })
                                if (streamSource === undefined) {
                                  throw new Error("Can't open stream source")
                                }
                                return streamSource
                              }
                              // any other error: clean up and fail as before
                              await source?.close().catch(warn)
                              throw err
                            }

                        Or, more robustly, treat any NBD connection error as fallback-eligible rather than hardcoding error codes:

                            } catch (err) {
                              warn("can't connect through NBD, fall back to stream export", { err })
                              if (streamSource === undefined) {
                                throw new Error("Can't open stream source")
                              }
                              return streamSource
                            }

                        This matches the intent of the existing NO_NBD_AVAILABLE fallback — NBD is opportunistic, and any failure to establish it should degrade gracefully to HTTP stream export rather than failing the entire backup job.
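
The difference between the current behaviour and the two proposals can be tried out in isolation with the sketch below. Everything in it is hypothetical: openWithNbdFallback, the openStream and upgradeToNbd callbacks and the three policy functions are illustrative stand-ins for #openNbdStream and its collaborators, not the real xen-orchestra code.

```javascript
// Open the export stream, then try to upgrade it to NBD. When the upgrade
// fails, `isFallbackEligible(error)` decides between returning the plain
// stream (fallback) and closing it and re-throwing (the whole export fails).
async function openWithNbdFallback(openStream, upgradeToNbd, isFallbackEligible, warn = () => {}) {
  const streamSource = await openStream()
  try {
    return await upgradeToNbd(streamSource)
  } catch (error) {
    if (isFallbackEligible(error)) {
      warn("can't connect through NBD, fall back to stream export", { error })
      return streamSource // the stream is left open so it can still be read
    }
    await streamSource.close()
    throw error
  }
}

// Behaviour described in the analysis: only NO_NBD_AVAILABLE falls back.
const strictPolicy = error => error.code === 'NO_NBD_AVAILABLE'

// First proposal: also fall back when undici reports an aborted request.
const widenedPolicy = error =>
  error.code === 'NO_NBD_AVAILABLE' || error.code === 'UND_ERR_ABORTED'

// Second proposal: any failure to establish NBD falls back.
const permissivePolicy = () => true
```

With an upgradeToNbd that rejects with code UND_ERR_ABORTED, strictPolicy closes the stream and rejects, while the other two return the still-open stream. Note that the fallback can only work if nothing has already closed or partially consumed the export stream by the time the error is caught; whether that holds in the real code path is something this sketch does not establish.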

                        Observed Timeline:

                        02:22:11 — xo-server opens VHD + qcow2 export streams
                        02:22:12–15 — NBD connections attempted, fail mid-handshake
                        02:22:15 — backup fails with UND_ERR_ABORTED, no fallback
                        02:33:51 — retry attempt also fails in 5 seconds
                        23:03 — same VMs back up successfully (transient condition resolved)

                        Impact: Backup jobs fail entirely on transient NBD connectivity issues instead of falling back to HTTP stream export, which is already implemented and working.

                        You can file this at the XO GitHub issues or the XCP-ng forum. The fix is straightforward and low-risk — the fallback path already exists and works, it's just not being reached for UND_ERR_ABORTED errors.
