XCP-ng

    MajorP93

    @MajorP93

    Reputation: 5 · Profile views: 2 · Posts: 22 · Followers: 0 · Following: 0


    Best posts made by MajorP93

    • RE: [VDDK V2V] Migration of VM that had more than 1 snapshot creates multiple VHDs

      @florent said in [VDDK V2V] Migration of VM that had more than 1 snapshot creates multiple VHDs:

      @MajorP93 the sizes are different between the disks, did you modify them since the snapshots?

      Would it be possible to take one new snapshot with the same disk structure?

      Sorry, it was my bad indeed.
      On the VMware side there are 2 VMs that have almost exactly the same name.
      When I checked the disk layout to verify whether this was an issue, I looked at the wrong VM. 🤦

      I checked again and can confirm that the VM in question has 1x 60GiB and 1x 25GiB VMDK.

      So this is not an issue. It is working as intended.

      Thread can be closed / deleted.
      Sorry again and thanks for the replies.

      Best regards
      MajorP

      posted in Xen Orchestra
    • RE: Potential bug with Windows VM backup: "Body Timeout Error"

      I worked around this issue by changing my full backup job to "delta backup" and enabling "force full backup" in the schedule options.

      Delta backup seems more reliable as of now.

      Looking forward to a fix as Zstd compression is an appealing feature of the full backup method.

      posted in Backup
    • RE: Potential bug with Windows VM backup: "Body Timeout Error"

      I can imagine that a fix could be to send "keepalive" packets alongside the XCP-ng VM export data stream so that the timeout on the XO side does not occur 🤔

      posted in Backup
    • RE: Async.VM.pool_migrate stuck at 57%

      @wmazren I had a similar issue which cost me many hours to troubleshoot.

      I'd advise you to check the "dmesg" output within the VM that cannot be live migrated.

      XCP-ng / Xen behaves differently from VMware regarding live migration.

      XCP-ng will interact with the Linux kernel upon live migration, and the kernel will try to freeze all processes before performing the live migration.

      In my case a "fuse" process blocked the graceful freezing of all processes, and my live migration task was also stuck in the task view, similar to your case.

      After solving the fuse process issue, and therefore making the system able to live migrate, the issue was gone.

      All of this can be seen in dmesg, as the kernel will tell you what is being done during a live migration via XCP-ng.
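
      For reference, a quick way to watch this from inside the guest while the migration is attempted (a rough sketch; the exact kernel messages depend on your distribution and kernel version):

        # Follow kernel messages during the migration attempt.
        # A blocked freeze typically shows up as "Freezing of tasks failed",
        # naming the process that refused to freeze (a fuse process in my case).
        dmesg --follow | grep -iE 'freez|suspend'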

      //EDIT: another thing you might want to try is toggling "migration compression" in the pool settings, as well as making sure you have a dedicated connection / VLAN configured for live migration. Those 2 things also helped make my live migrations faster and more robust.

      posted in Management

    Latest posts made by MajorP93

    • RE: Long backup times via NFS to Data Domain from Xen Orchestra

      @florent said in Long backup times via NFS to Data Domain from Xen Orchestra:

      @MajorP93 this setting exists (not in the UI)

      you can create a configuration file named /etc/xo-server/config.diskConcurrency.toml if you use XOA

      containing

      [backups]
      diskPerVmConcurrency = 2
      
      

      Hey, does this also work for XO from sources users?

      It would indeed be great if there was a UI option for this.
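
      In case it does work for from-sources too, here is a rough sketch of what I would try (assuming xo-server reads its config from ~/.config/xo-server/config.toml on a from-sources install; the path and service name are assumptions, adjust them to your setup):

        # Append the [backups] disk concurrency setting to the xo-server config
        # (path is an assumption for an XO from sources install)
        printf '\n[backups]\ndiskPerVmConcurrency = 2\n' >> ~/.config/xo-server/config.toml

        # Restart xo-server so the new setting is picked up (service name may differ)
        sudo systemctl restart xo-server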

      best regards

      posted in Backup
    • RE: Long backup times via NFS to Data Domain from Xen Orchestra

      @florent said in Long backup times via NFS to Data Domain from Xen Orchestra:

      @MajorP93
      Interesting. Note that this is orthogonal to NBD.
      I note that there is probably more work to do to improve the performance and will retest a VM with a lot of disks.
      Performance really depends on the underlying storage.
      Compression and encryption can't be done in "legacy mode", since we won't be able to merge blocks in place in that case.

      I see, thanks for the insights.
      The problem that we saw could also be solved by adding another config parameter to the delta backup job: disk concurrency per VM.
      That way it would be possible to back up only e.g. 2 out of 10 virtual disks at a time.

      posted in Backup
    • RE: Long backup times via NFS to Data Domain from Xen Orchestra

      @florent said in Long backup times via NFS to Data Domain from Xen Orchestra:

      interesting

      can you try to do a performance test while using block storage?

      This will store backups as multiple small (typically 1 MB) files that are easy to deduplicate, and the merge process will move / delete files instead of modifying one big monolithic file per disk. It could sidestep the hydration process.
      This is the default mode on S3 / Azure, and it will probably become the default mode everywhere in the future, given its advantages.

      (Note for later: don't use XO encryption at rest if you need dedup, since even the same block encrypted twice will give different results)

      I am not sure it is a good idea to make that feature the default.

      We just switched away from it in our XCP-ng / XO environment as it tanked performance really hard.
      The issue is that with NBD, delta backups will open at least 1 data stream per virtual disk. Even with concurrency set to 1 in the backup job, there will be a "spam" of small files on the remote when a virtual machine has a lot of virtual disks attached to it (around 10 virtual disks).

      The data blocks feature resulted in transfer speeds to our NAS going down to 300 Mbit/s for VMs that have many disks.
      After disabling the data blocks feature, transfer speed went up to 2.8 Gbit/s.

      Instead I would like to see brotli compression become the default for delta backups, no matter whether data blocks is turned on or not. Also, encryption for remotes that do not use data blocks would be awesome. That way people could combine good performance with security.

      posted in Backup
    • RE: Potential bug with Windows VM backup: "Body Timeout Error"

      I worked around this issue by changing my full backup job to "delta backup" and enabling "force full backup" in the schedule options.

      Delta backup seems more reliable as of now.

      Looking forward to a fix as Zstd compression is an appealing feature of the full backup method.

      posted in Backup
    • RE: What Are You All Doing for Disaster Recovery of XOA Itself?

      @wezke said in What Are You All Doing for Disaster Recovery of XOA Itself?:

      @probain

      I would be interested in your Ansible role for deploying XO from sources 🙂

      I don't think there is a need to build an Ansible role for this purpose when the whole process has already been scripted in Bash by ronivay 😅

      You can just download and execute the script via Ansible and you're good to go:

        tasks:
        - name: Download ronivay XO from sources bash script
          ansible.builtin.get_url:
            url: "https://raw.githubusercontent.com/ronivay/XenOrchestraInstallerUpdater/refs/heads/master/xo-install.sh"
            dest: /tmp/xo-install.sh

        - name: Set permissions
          ansible.builtin.file:
            path: /tmp/xo-install.sh
            mode: '0755'

        - name: Run ronivay XO from sources bash script
          become: yes
          ansible.builtin.command: /tmp/xo-install.sh --install
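
      To run it, something along these lines should do (playbook and inventory file names here are just placeholders):

        # Run the playbook against the target host(s)
        ansible-playbook -i inventory.ini install-xo.yml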
      
      
      posted in Xen Orchestra
    • RE: Potential bug with Windows VM backup: "Body Timeout Error"

      I can imagine that a fix could be to send "keepalive" packets alongside the XCP-ng VM export data stream so that the timeout on the XO side does not occur 🤔

      posted in Backup
    • RE: Potential bug with Windows VM backup: "Body Timeout Error"

      @olivierlambert said in Potential bug with Windows VM backup: "Body Timeout Error":

      @MajorP93 said in Potential bug with Windows VM backup: "Body Timeout Error":

      Why would there be long “no data” gaps? With full exports, XAPI compresses on the host side. When it encounters very large runs of zeros / sparse/unused regions, compression may yield almost nothing for long stretches. If the stream goes quiet longer than Undici’s bodyTimeout (default ~5 minutes), XO aborts. This explains why it only hits some big VMs and why delta (NBD) backups are fine.

      I think that's exactly the issue here (if I remember our internal discussion).

      Do you think it could be a good idea to make the timeout configurable via Xen Orchestra?

      posted in Backup
    • RE: Potential bug with Windows VM backup: "Body Timeout Error"

      Hey, I did some more digging.
      I found some log entries that were written at the moment the backup job threw the "Body Timeout Error". (These blocks of log entries appear for all VMs that show this problem.)

      grep -i "VM.export" xensource.log | grep -i "error"

      Nov  4 08:02:33 dat-xcpng01 xapi: [error||141766 HTTPS 192.168.60.30->:::80|[XO] VM export R:edfb08f9b55b|xapi_compression] nice failed to compress: exit code 70
      Nov  4 08:02:33 dat-xcpng01 xapi: [error||141766 :::80|VM.export D:6f2ee2bc66b8|backtrace] [XO] VM export R:edfb08f9b55b failed with exception Server_error(CLIENT_ERROR, [ INTERNAL_ERROR: [ Unix.Unix_error(Unix.EPIPE, "write", "") ] ])
      Nov  4 08:02:33 dat-xcpng01 xapi: [error||141766 :::80|VM.export D:6f2ee2bc66b8|backtrace] Raised Server_error(CLIENT_ERROR, [ INTERNAL_ERROR: [ Unix.Unix_error(Unix.EPIPE, "write", "") ] ])
      Nov  4 08:02:33 dat-xcpng01 xapi: [error||141766 :::80|VM.export D:6f2ee2bc66b8|backtrace] 1/16 xapi Raised at file ocaml/xapi/stream_vdi.ml, line 127
      Nov  4 08:02:33 dat-xcpng01 xapi: [error||141766 :::80|VM.export D:6f2ee2bc66b8|backtrace] 2/16 xapi Called from file ocaml/xapi/stream_vdi.ml, line 307
      Nov  4 08:02:33 dat-xcpng01 xapi: [error||141766 :::80|VM.export D:6f2ee2bc66b8|backtrace] 3/16 xapi Called from file ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml, line 24
      Nov  4 08:02:33 dat-xcpng01 xapi: [error||141766 :::80|VM.export D:6f2ee2bc66b8|backtrace] 4/16 xapi Called from file ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml, line 39
      Nov  4 08:02:33 dat-xcpng01 xapi: [error||141766 :::80|VM.export D:6f2ee2bc66b8|backtrace] 5/16 xapi Called from file ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml, line 24
      Nov  4 08:02:33 dat-xcpng01 xapi: [error||141766 :::80|VM.export D:6f2ee2bc66b8|backtrace] 6/16 xapi Called from file ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml, line 39
      Nov  4 08:02:33 dat-xcpng01 xapi: [error||141766 :::80|VM.export D:6f2ee2bc66b8|backtrace] 7/16 xapi Called from file ocaml/xapi/stream_vdi.ml, line 263
      Nov  4 08:02:33 dat-xcpng01 xapi: [error||141766 :::80|VM.export D:6f2ee2bc66b8|backtrace] 8/16 xapi Called from file list.ml, line 110
      Nov  4 08:02:33 dat-xcpng01 xapi: [error||141766 :::80|VM.export D:6f2ee2bc66b8|backtrace] 9/16 xapi Called from file ocaml/xapi/export.ml, line 707
      Nov  4 08:02:33 dat-xcpng01 xapi: [error||141766 :::80|VM.export D:6f2ee2bc66b8|backtrace] 10/16 xapi Called from file ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml, line 24
      Nov  4 08:02:33 dat-xcpng01 xapi: [error||141766 :::80|VM.export D:6f2ee2bc66b8|backtrace] 11/16 xapi Called from file ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml, line 39
      Nov  4 08:02:33 dat-xcpng01 xapi: [error||141766 :::80|VM.export D:6f2ee2bc66b8|backtrace] 12/16 xapi Called from file ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml, line 24
      Nov  4 08:02:33 dat-xcpng01 xapi: [error||141766 :::80|VM.export D:6f2ee2bc66b8|backtrace] 13/16 xapi Called from file ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml, line 39
      Nov  4 08:02:33 dat-xcpng01 xapi: [error||141766 :::80|VM.export D:6f2ee2bc66b8|backtrace] 14/16 xapi Called from file ocaml/xapi/server_helpers.ml, line 75
      Nov  4 08:02:33 dat-xcpng01 xapi: [error||141766 :::80|VM.export D:6f2ee2bc66b8|backtrace] 15/16 xapi Called from file ocaml/xapi/server_helpers.ml, line 97
      Nov  4 08:02:33 dat-xcpng01 xapi: [error||141766 :::80|VM.export D:6f2ee2bc66b8|backtrace] 16/16 xapi Called from file ocaml/libs/log/debug.ml, line 250
      Nov  4 08:02:33 dat-xcpng01 xapi: [error||141766 :::80|VM.export D:6f2ee2bc66b8|backtrace]
      Nov  4 08:02:33 dat-xcpng01 xapi: [error||141766 :::80|handler:http/get_export D:b71f8095b88f|backtrace] VM.export D:6f2ee2bc66b8 failed with exception Server_error(CLIENT_ERROR, [ INTERNAL_ERROR: [ Unix.Unix_error(Unix.EPIPE, "write", "") ] ])
      
      

      daemon.log shows:

      Nov  4 08:02:31 dat-xcpng01 forkexecd: [error||0 ||forkexecd] 135416 (/bin/nice -n 19 /usr/bin/ionice -c 3 /usr/bin/zstd) exited with code 70
      

      Maybe this gives some insights 🤔
      To me it appears that the issue is caused by the compression (I had zstd enabled during the backup run).

      Best regards

      //EDIT: This is what ChatGPT thinks about this (I know AI responses have to be taken with a grain of salt):
      Why would there be long “no data” gaps? With full exports, XAPI compresses on the host side. When it encounters very large runs of zeros / sparse/unused regions, compression may yield almost nothing for long stretches. If the stream goes quiet longer than Undici’s bodyTimeout (default ~5 minutes), XO aborts. This explains why it only hits some big VMs and why delta (NBD) backups are fine.

      posted in Backup
    • RE: Potential bug with Windows VM backup: "Body Timeout Error"

      I did 2 more tests.

      1. using full backup with encryption disabled on the remote (had it enabled before) --> same issue
      2. switching from zstd to gzip --> same issue
      posted in Backup
    • RE: Potential bug with Windows VM backup: "Body Timeout Error"

      @olivierlambert said in Potential bug with Windows VM backup: "Body Timeout Error":

      I think we have a lead to explore, we'll keep you posted when we have a branch to test 🙂

      Sure! Thank you very much.
      When a branch is available I will be happy to compile it, test it, and provide any information / logs needed.

      posted in Backup