XCP-ng
    Posts by MajorP93

    • RE: Long backup times via NFS to Data Domain from Xen Orchestra

      @florent said in Long backup times via NFS to Data Domain from Xen Orchestra:

      @MajorP93 this setting exists (not in the UI).

      If you use a XOA, you can create a configuration file named /etc/xo-server/config.diskConcurrency.toml

      containing:

      [backups]
      diskPerVmConcurrency = 2
      
      

      Hey, does this also work for XO from sources users?
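
      If it does, here is a minimal sketch of what I would try on a from-sources install, assuming xo-server also picks up extra config.*.toml files from ~/.config/xo-server/ (both the directory and the multi-file behaviour are my assumptions, not confirmed above):

        # assumption: from-sources xo-server reads config files from ~/.config/xo-server/
        mkdir -p ~/.config/xo-server
        printf '%s\n' '[backups]' 'diskPerVmConcurrency = 2' > ~/.config/xo-server/config.diskConcurrency.toml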

      It would indeed be great if there was a UI option for this.

      Best regards

      posted in Backup
    • RE: Long backup times via NFS to Data Domain from Xen Orchestra

      @florent said in Long backup times via NFS to Data Domain from Xen Orchestra:

      @MajorP93
      Interesting. Note that this is orthogonal to NBD.
      I note that there is probably more work to do to improve the performance, and I will retest a VM with a lot of disks.
      Performance really depends on the underlying storage.
      Compression and encryption can't be done in "legacy mode", since we won't be able to merge blocks in place in that case.

      I see, thanks for the insights.
      The problem that we saw could also be solved if you added another config parameter to the delta backup job: disk concurrency per VM.
      That way it would be possible to back up only e.g. 2 out of 10 virtual disks at a time.

      posted in Backup
    • RE: Long backup times via NFS to Data Domain from Xen Orchestra

      @florent said in Long backup times via NFS to Data Domain from Xen Orchestra:

      Interesting.

      Can you try to do a performance test while using block storage?
      (screenshot of the block storage / data blocks remote setting)

      This will store backups as multiple small (typically 1 MB) files that are easy to deduplicate, and the merge process will move / delete files instead of modifying one big monolithic file per disk. It could sidestep the hydration process.
      This is the default mode on S3 / Azure, and it will probably become the default everywhere in the future, given its advantages.

      (Note for later: don't use XO encryption at rest if you need dedup, since even the same block encrypted twice will give different results)

      I am not sure it is a good idea to make that feature the default.

      We just switched away from it in our XCP-ng / XO environment as it tanked performance really hard.
      The issue is that with NBD, delta backups will open at least one data stream per virtual disk. Even with concurrency set to 1 in the backup job, there will be a "spam" of small files on the remote when a virtual machine has a lot of virtual disks attached to it (around 10 virtual disks).

      The data blocks feature resulted in transfer speeds to our NAS dropping to 300 Mbit/s for VMs that have many disks.
      After disabling the data blocks feature, transfer speed went up to 2.8 Gbit/s.

      Instead, I would like to see Brotli compression become the default for delta backups regardless of whether data blocks is turned on. Encryption for remotes that do not use data blocks would also be awesome. That way people could combine good performance with security.

      posted in Backup
    • RE: Potential bug with Windows VM backup: "Body Timeout Error"

      I worked around this issue by changing my full backup job to "delta backup" and enabling "force full backup" in the schedule options.

      Delta backup seems more reliable as of now.

      Looking forward to a fix as Zstd compression is an appealing feature of the full backup method.

      posted in Backup
    • RE: What Are You All Doing for Disaster Recovery of XOA Itself?

      @wezke said in What Are You All Doing for Disaster Recovery of XOA Itself?:

      @probain

      I would be interested in your Ansible role for deploying XO from sources 🙂

      I don't think there is a need to build an Ansible role for this purpose when the whole process has already been scripted in bash by ronivay 😅

      You can just download and execute the script via Ansible and you're good to go:

        tasks:
        # get_url downloads the script and sets the executable bit in a single task
        - name: Download ronivay XO from sources bash script
          ansible.builtin.get_url:
            url: "https://raw.githubusercontent.com/ronivay/XenOrchestraInstallerUpdater/refs/heads/master/xo-install.sh"
            dest: /tmp/xo-install.sh
            mode: "0755"

        - name: Run ronivay XO from sources bash script
          become: yes
          ansible.builtin.command: /tmp/xo-install.sh --install
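
      For completeness, running it is then just (the inventory and playbook file names below are placeholders):

        ansible-playbook -i inventory.ini deploy-xo.yml --ask-become-pass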
      
      
      posted in Xen Orchestra
    • RE: Potential bug with Windows VM backup: "Body Timeout Error"

      I can imagine that a fix could be to send "keepalive" packets alongside the XCP-ng VM export data stream so that the timeout on the XO side does not occur 🤔

      posted in Backup
    • RE: Potential bug with Windows VM backup: "Body Timeout Error"

      @olivierlambert said in Potential bug with Windows VM backup: "Body Timeout Error":

      @MajorP93 said in Potential bug with Windows VM backup: "Body Timeout Error":

      Why would there be long “no data” gaps? With full exports, XAPI compresses on the host side. When it encounters very large runs of zeros / sparse/unused regions, compression may yield almost nothing for long stretches. If the stream goes quiet longer than Undici’s bodyTimeout (default ~5 minutes), XO aborts. This explains why it only hits some big VMs and why delta (NBD) backups are fine.

      I think that's exactly the issue here (if I remember our internal discussion correctly).

      Do you think it could be a good idea to make the timeout configurable via Xen Orchestra?
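
      For instance, something along these lines in the xo-server TOML config would already help; the key name below is purely hypothetical, just to illustrate the idea:

        # hypothetical key, not an existing xo-server setting
        [backups]
        vmExportBodyTimeout = 3600  # seconds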

      posted in Backup
    • RE: Potential bug with Windows VM backup: "Body Timeout Error"

      Hey, I did some more digging.
      I found some log entries that were written at the moment the backup job threw the "body timeout error". (Those blocks of log entries appear for all VMs that show this problem).

      grep -i "VM.export" xensource.log | grep -i "error"

      Nov  4 08:02:33 dat-xcpng01 xapi: [error||141766 HTTPS 192.168.60.30->:::80|[XO] VM export R:edfb08f9b55b|xapi_compression] nice failed to compress: exit code 70
      Nov  4 08:02:33 dat-xcpng01 xapi: [error||141766 :::80|VM.export D:6f2ee2bc66b8|backtrace] [XO] VM export R:edfb08f9b55b failed with exception Server_error(CLIENT_ERROR, [ INTERNAL_ERROR: [ Unix.Unix_error(Unix.EPIPE, "write", "") ] ])
      Nov  4 08:02:33 dat-xcpng01 xapi: [error||141766 :::80|VM.export D:6f2ee2bc66b8|backtrace] Raised Server_error(CLIENT_ERROR, [ INTERNAL_ERROR: [ Unix.Unix_error(Unix.EPIPE, "write", "") ] ])
      Nov  4 08:02:33 dat-xcpng01 xapi: [error||141766 :::80|VM.export D:6f2ee2bc66b8|backtrace] 1/16 xapi Raised at file ocaml/xapi/stream_vdi.ml, line 127
      Nov  4 08:02:33 dat-xcpng01 xapi: [error||141766 :::80|VM.export D:6f2ee2bc66b8|backtrace] 2/16 xapi Called from file ocaml/xapi/stream_vdi.ml, line 307
      Nov  4 08:02:33 dat-xcpng01 xapi: [error||141766 :::80|VM.export D:6f2ee2bc66b8|backtrace] 3/16 xapi Called from file ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml, line 24
      Nov  4 08:02:33 dat-xcpng01 xapi: [error||141766 :::80|VM.export D:6f2ee2bc66b8|backtrace] 4/16 xapi Called from file ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml, line 39
      Nov  4 08:02:33 dat-xcpng01 xapi: [error||141766 :::80|VM.export D:6f2ee2bc66b8|backtrace] 5/16 xapi Called from file ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml, line 24
      Nov  4 08:02:33 dat-xcpng01 xapi: [error||141766 :::80|VM.export D:6f2ee2bc66b8|backtrace] 6/16 xapi Called from file ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml, line 39
      Nov  4 08:02:33 dat-xcpng01 xapi: [error||141766 :::80|VM.export D:6f2ee2bc66b8|backtrace] 7/16 xapi Called from file ocaml/xapi/stream_vdi.ml, line 263
      Nov  4 08:02:33 dat-xcpng01 xapi: [error||141766 :::80|VM.export D:6f2ee2bc66b8|backtrace] 8/16 xapi Called from file list.ml, line 110
      Nov  4 08:02:33 dat-xcpng01 xapi: [error||141766 :::80|VM.export D:6f2ee2bc66b8|backtrace] 9/16 xapi Called from file ocaml/xapi/export.ml, line 707
      Nov  4 08:02:33 dat-xcpng01 xapi: [error||141766 :::80|VM.export D:6f2ee2bc66b8|backtrace] 10/16 xapi Called from file ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml, line 24
      Nov  4 08:02:33 dat-xcpng01 xapi: [error||141766 :::80|VM.export D:6f2ee2bc66b8|backtrace] 11/16 xapi Called from file ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml, line 39
      Nov  4 08:02:33 dat-xcpng01 xapi: [error||141766 :::80|VM.export D:6f2ee2bc66b8|backtrace] 12/16 xapi Called from file ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml, line 24
      Nov  4 08:02:33 dat-xcpng01 xapi: [error||141766 :::80|VM.export D:6f2ee2bc66b8|backtrace] 13/16 xapi Called from file ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml, line 39
      Nov  4 08:02:33 dat-xcpng01 xapi: [error||141766 :::80|VM.export D:6f2ee2bc66b8|backtrace] 14/16 xapi Called from file ocaml/xapi/server_helpers.ml, line 75
      Nov  4 08:02:33 dat-xcpng01 xapi: [error||141766 :::80|VM.export D:6f2ee2bc66b8|backtrace] 15/16 xapi Called from file ocaml/xapi/server_helpers.ml, line 97
      Nov  4 08:02:33 dat-xcpng01 xapi: [error||141766 :::80|VM.export D:6f2ee2bc66b8|backtrace] 16/16 xapi Called from file ocaml/libs/log/debug.ml, line 250
      Nov  4 08:02:33 dat-xcpng01 xapi: [error||141766 :::80|VM.export D:6f2ee2bc66b8|backtrace]
      Nov  4 08:02:33 dat-xcpng01 xapi: [error||141766 :::80|handler:http/get_export D:b71f8095b88f|backtrace] VM.export D:6f2ee2bc66b8 failed with exception Server_error(CLIENT_ERROR, [ INTERNAL_ERROR: [ Unix.Unix_error(Unix.EPIPE, "write", "") ] ])
      
      

      daemon.log shows:

      Nov  4 08:02:31 dat-xcpng01 forkexecd: [error||0 ||forkexecd] 135416 (/bin/nice -n 19 /usr/bin/ionice -c 3 /usr/bin/zstd) exited with code 70
      

      Maybe this gives some insight 🤔
      To me it appears that the issue is caused by the compression step (I had zstd enabled during the backup run).
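
      As a quick sanity check (the command line is copied from the forkexecd entry above), one could verify on the host that zstd itself runs fine outside of the export:

        # should print 0 if the compression pipeline itself is healthy
        echo test | /bin/nice -n 19 /usr/bin/ionice -c 3 /usr/bin/zstd > /dev/null; echo $?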

      Best regards

      //EDIT: This is what ChatGPT thinks about this (I know AI responses have to be taken with a grain of salt):
      Why would there be long “no data” gaps? With full exports, XAPI compresses on the host side. When it encounters very large runs of zeros / sparse/unused regions, compression may yield almost nothing for long stretches. If the stream goes quiet longer than Undici’s bodyTimeout (default ~5 minutes), XO aborts. This explains why it only hits some big VMs and why delta (NBD) backups are fine.

      posted in Backup
    • RE: Potential bug with Windows VM backup: "Body Timeout Error"

      I did 2 more tests.

      1. using full backup with encryption disabled on the remote (had it enabled before) --> same issue
      2. switching from zstd to gzip --> same issue
      posted in Backup
    • RE: Potential bug with Windows VM backup: "Body Timeout Error"

      @olivierlambert said in Potential bug with Windows VM backup: "Body Timeout Error":

      I think we have a lead to explore, we'll keep you posted when we have a branch to test 🙂

      Sure! Thank you very much.
      When there is a branch available I will be happy to compile, test and provide any information / log needed.

      posted in Backup
    • RE: Potential bug with Windows VM backup: "Body Timeout Error"

      @olivierlambert said in Potential bug with Windows VM backup: "Body Timeout Error":

      I think we have a lead, I've seen a discussion between @florent and @dinhngtu recently about that topic

      Sounds good!
      So there is a fix currently being worked on?

      posted in Backup
    • RE: Potential bug with Windows VM backup: "Body Timeout Error"

      @nikade Hmm, I have a hard time understanding what might cause this issue in my case since all of our 5 XCP-ng hosts are on the same site. They can talk to each other on layer 2 and each has a 2x 50 Gbit/s LACP bond...
      The XO VM is running on the pool master itself.
      Some of the VMs that threw this error are even running on the pool master themselves.
      So I would expect that the traffic does not even have to exit the physical host in this case...
      Latency should be perfectly fine in this case...

      All XCP-ng hosts, XO VM and NAS (backup remote) can ping each other at below 1ms latency...

      Really weird.

      If anyone has an idea regarding what could possibly cause this I would be grateful.
      As I said before, I want to test Gzip instead of Zstd, but I have to wait until this backup job finishes.
      It has ~40 TB of data to back up in total 😅

      posted in Backup
    • RE: Potential bug with Windows VM backup: "Body Timeout Error"

      Hey,

      I am experiencing the same issue using XO from sources (commit 4d77b79ce920925691d84b55169ea3b70f7a52f6), Node version 22, Debian 13.

      I have multiple backup jobs, and only one of them, a full backup job, is giving me issues.

      Most VMs can be backed up by this full backup job just fine, but some error out with "body timeout error", e.g.:

                  {
                    "id": "1762017810483",
                    "message": "transfer",
                    "start": 1762017810483,
                    "status": "failure",
                    "end": 1762018134258,
                    "result": {
                      "name": "BodyTimeoutError",
                      "code": "UND_ERR_BODY_TIMEOUT",
                      "message": "Body Timeout Error",
                      "stack": "BodyTimeoutError: Body Timeout Error\n    at FastTimer.onParserTimeout [as _onTimeout] (/etc/xen-orchestra/node_modules/undici/lib/dispatcher/client-h1.js:646:28)\n    at Timeout.onTick [as _onTimeout] (/etc/xen-orchestra/node_modules/undici/lib/util/timers.js:162:13)\n    at listOnTimeout (node:internal/timers:588:17)\n    at process.processTimers (node:internal/timers:523:7)"
                    }
                  }
      

      XO from sources VM has 8 vCPU and 8GB RAM.
      Link speed of the XCP-ng hosts is 50 Gbit/s.
      XO VM can reach 20 Gbit/s to the NAS in iperf.
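
      (Measured with something along the lines of the following; the NAS hostname and flags are placeholders, not the exact command I used:)

        iperf3 -c nas01.example.lan -P 4 -t 30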

      Zstd is enabled for this backup job.
      It appears that only big VMs (as in disk size) have this issue.
      The VMs that have this issue on the full backup job can be backed up just fine via delta backup job.

      I read in another thread that this issue can be caused by dom0 hardware constraints, but dom0 has 16 vCPUs and is at ~40% CPU usage while backups are running.
      RAM usage sits at 2 GB out of 8 GB.

      I changed my full backup job to GZIP compression and will see if this helps.
      Will report back.
      I really need compression due to the large virtual disks of some VMs...

      Best regards
      MajorP

      posted in Backup
    • Xen Orchestra Node 24 compatibility

      Hey Vates team and XCP-ng community,

      Since version 24 just became the new LTS release of Node.js, I was wondering:

      Is Xen Orchestra expected to be fully compatible with this version?
      Can XO from sources users safely transition to the new version without sacrificing stability?

      Node 24 LTS blog announcement: https://nodejs.org/en/blog/release/v24.11.0

      In case somebody has already tested this or has any insights, I would be very grateful.
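
      In the meantime, a quick local check I can think of, assuming the engines field in packages/xo-server/package.json still reflects the supported Node range (that path and field are my assumption):

        # in a from-sources checkout of xen-orchestra
        grep -A2 '"engines"' packages/xo-server/package.json
        node --version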

      Thanks in advance!

      Best regards
      MajorP

      posted in Xen Orchestra
    • RE: "Backup fell back to a full" on delta backups

      Hey,
      we also have the problem of delta backups falling back to full very often using the latest XO from sources.
      We are also using CBT.
      I read in the documentation that the "CBT snapshot data purge" feature can be hit or miss.
      It would be awesome if that feature were more stable, though. I consider it "cleaner" not to have snapshots that can't be deleted.
      Best regards

      posted in Backup
    • RE: [VDDK V2V] Migration of VM that had more than 1 snapshot creates multiple VHDs

      @florent said in [VDDK V2V] Migration of VM that had more than 1 snapshot creates multiple VHDs:

      @MajorP93 the sizes are different between the disks; did you modify them since the snapshots?

      Would it be possible to take one new snapshot with the same disk structure?

      Sorry, it was indeed my bad.
      On the VMware side there are two VMs that have almost exactly the same name.
      When I checked the disk layout to verify this was an issue, I looked at the wrong VM. 🤦

      I checked again and can confirm that the VM in question has 1x 60GiB and 1x 25GiB VMDK.

      So this is not an issue. It is working as intended.

      Thread can be closed / deleted.
      Sorry again and thanks for the replies.

      Best regards
      MajorP

      posted in Xen Orchestra
    • RE: [VDDK V2V] Migration of VM that had more than 1 snapshot creates multiple VHDs

      @Pilow said in [VDDK V2V] Migration of VM that had more than 1 snapshot creates multiple VHDs:

      @MajorP93 in your screenshot, CA and CA_1 seem to be two disks on the original VM...?

      They're even different in size.

      Each is suffixed with -000001-delta, indicating one snapshot on each disk.

      Two VMDKs on the source give you two VDIs on the destination, as intended 😛

      Thanks for your reply.

      The VM only has one virtual disk (60 GB) on the VMware side.
      Otherwise the situation would be clear and I would not have asked the question 😅

      Best regards
      MajorP

      posted in Xen Orchestra
    • [VDDK V2V] Migration of VM that had more than 1 snapshot creates multiple VHDs

      Dear Vates team and community,

      I am currently in the process of migrating our whole internal VMWare infrastructure to XCP-ng / Xen Orchestra.

      I am using the new VDDK based V2V tool for this matter.

      I noticed that when a VM has more than one snapshot on the VMware side, the V2V tool will create multiple VHDs even though the VM only has one virtual disk on the VMware side.

      After migration the disk layout of the VM looks like this:

      (screenshot of the resulting VM disk layout after migration)

      My question is: is it possible to merge these into a single VHD so that the disk layout matches the one on the VMware side?

      And: is it possible to keep 2 snapshots per VM on the VMware side and still prevent this from happening, so that the resulting XCP-ng VM only has 1 VHD per VMware VMDK?

      Thanks a lot in advance!

      Best regards
      MajorP

      posted in Xen Orchestra
    • RE: Async.VM.pool_migrate stuck at 57%

      @wmazren I had a similar issue which cost me many hours to troubleshoot.

      I'd advise you to check "dmesg" output within the VM that is not able to get live migrated.

      XCP-ng / Xen behaves differently from VMware regarding live migration.

      XCP-ng will interact with the Linux kernel during live migration, and the kernel will try to freeze all processes before the migration is performed.

      In my case a "fuse" process blocked the graceful freezing of all processes, and my live migration task also got stuck in the task view, similar to your case.

      After solving the fuse process issue, and therefore making the system able to live migrate, the problem was gone.

      All of this can be seen in dmesg, as the kernel will tell you what is being done during a live migration via XCP-ng.
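
      For example, something like this inside the guest should surface the freeze attempts and the offending process (the exact kernel message wording may differ between kernel versions):

        # look for "Freezing" / "refusing to freeze" messages around the migration attempt
        dmesg -T | grep -iE 'freez'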

      //EDIT: another thing you might want to try is toggling "migration compression" in the pool settings, as well as making sure you have a dedicated connection / VLAN configured for live migration. Those two things also helped make my live migrations faster and more robust.

      posted in Management
    • RE: XOA 5.110 - Import from VMWare shows "vddk:" without ok or checkmark status

      @Andrw0830 I am using the paid version of ESXi / vSphere.
      The free version of ESXi has so many limitations that I consider it pretty much unusable, or only usable for very limited basic tests.
      It would not surprise me if your issue is due to the license limitations of ESXi.

      posted in Xen Orchestra