Potential bug with Windows VM backup: "Body Timeout Error"
-
We are facing a Xen Orchestra backup issue: full VM backups fail with "Body Timeout Error".
It happens under specific conditions. After some experimentation, we can reproduce it consistently on several hosts.
Findings below.
Here are log entries from:
Xen Orchestra (/var/log/syslog):
XCP-NG host (/var/log/xensource.log):
(Note that Xen Orchestra registers the error and sends reports much sooner, almost at the beginning of the VM backup task, while the XCP-NG host logs it only at the actual end of the task. I was monitoring the tasks in Xen Orchestra.)
This happens when both of the following conditions are met:
- Compression is on (Zstd or GZIP, it does not matter which)
- The VM's virtual disk has a large amount of free space (about 150GB of free space is enough for backups to start failing occasionally; with 1TB of free space they fail 99% of the time) OR the VM disk is uninitialized/unformatted (e.g. a freshly created and attached virtual disk)
Additional things we noticed:
- Windows VMs with large virtual disks (around 1TB) fail every time (tested several dozen times). Virtual disks with about 150GB of free space fail only sometimes, usually when backed up in parallel with other VMs. The OS version or type (we tested Windows Server 2022, Windows Server 2012 R2 and Windows 11), the partition table (MBR or GPT) and the filesystem (NTFS, ReFS, exFAT) make no difference.
- Backing up Linux VMs with large amounts of free space usually does NOT fail (roughly 99% of backups succeed), but backups run in parallel with other VMs sometimes fail. We tested on Debian 12.9 with a virtual disk with about 1TB of free space.
- Behavior does not change when changing the XCP-NG host or the storage provisioning type (thick or thin). The XCP-NG hosts (on LTS 8.2.1) and Xen Orchestra are up to date.
-
Ping @lsouai-vates
-
@Hex
Ditto
https://xcp-ng.org/forum/topic/10532/backup-failed-with-body-timeout-error/8
It happens to me on almost every large backup. I had to give up. For the VMs that would not back up, I moved to delta backups.
FWIW, here were my results:
- Regular backup to TrueNas, “compression” set to “Zstd”: backup fails.
- Regular backup to TrueNas, “compression” set to “disabled”: backup is successful.
- Regular backup to vanilla Ubuntu test VM, “compression” set to “Zstd”: backup is successful.
- Delta backup to TrueNas: backup is successful.
-
@Hex Hello, thanks for your bug report. I am informing the Xen Orchestra team about this and will keep you posted when I have answers that can help you.
Have a good day.
-
@Hex I have some answers from the XO team that I hope will help you:
"During a full backup, XO downloads an XVA file from the XCP-ng/XenServer host.
This error means that it did not get an answer within a configured delay (5 minutes by default).
There can be an XO issue somewhere, but the problem most likely lies on the host's side..."
"The timeout is not the bug, only the failsafe to avoid locking the process."
-
This means we probably need to check with the XCP-ng team. But before that, @Hex, can you try to reproduce the issue with the xe CLI? If you can, then it's clearly not XO's fault.
-
@olivierlambert
Since I'm having the same issue, can I give that suggestion a shot? If so, how do you do it from the command line (with the xe CLI)?
-
xe vm-export filename=export_filename compress=true
(for gzip; otherwise use zstd for Zstd compression)
Note: make sure to mount a share with enough space and do not export directly to the dom0 root, otherwise you'll fill it.
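As a rough illustration of the advice above, here is a small shell sketch that checks the free space on the target directory and only then prints the export command, without running it. The VM name, the mount point and the size threshold are hypothetical placeholders; the vm= selector and compress=zstd are assumed to match the xe build on your host.

```shell
#!/bin/sh
# Sketch only: VM name, mount point and threshold are hypothetical.

# Print the free space (in KiB) of the filesystem holding directory $1.
free_kib() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeed only if $1 is a directory with at least $2 KiB free.
check_target() {
    [ -d "$1" ] || { echo "$1 is not a directory" >&2; return 1; }
    [ "$(free_kib "$1")" -ge "$2" ] || { echo "not enough free space on $1" >&2; return 1; }
}

# Print (but do not run) the export command.
# $1 = VM name or UUID, $2 = target directory, $3 = gzip|zstd
build_export_cmd() {
    case "$3" in
        gzip) compress=true ;;   # compress=true selects gzip
        zstd) compress=zstd ;;
        *) echo "unknown compression: $3" >&2; return 1 ;;
    esac
    printf 'xe vm-export vm=%s filename=%s/%s.xva compress=%s\n' \
        "$1" "$2" "$1" "$compress"
}

# Example (hypothetical values, about 1TiB required):
#   check_target /mnt/export 1073741824 && build_export_cmd test-vm /mnt/export zstd
```

Printing the command instead of running it lets you review the target path before anything is written, which helps avoid filling the dom0 root by accident.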
-
@olivierlambert
I have tested several scenarios on a 1TB test VM.
I mounted a share from the same storage where the XO backups are stored.
- Export of the shut-down VM with zstd - succeeded
- Export of a snapshot while the VM is running with zstd - succeeded
- Export of a snapshot while the VM is running with gzip - succeeded
-
That's a very interesting result.
It means the problem is either an interaction between XO and XAPI, or on XO's side, but not simply an XCP-ng issue as we could have thought initially.
Can you check whether the XVA file works when importing it (in case xe fails silently)? Use xe vm-import.
-
@olivierlambert
I tested one of the previous exports, and the import with "xe vm-import" was successful. The Windows VM starts normally.
-
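For anyone repeating the import check discussed above, a minimal shell sketch follows. It only confirms that the export file exists and is not empty, then prints the import command without running it. The file path is a hypothetical placeholder, and a non-empty file does not prove the XVA is valid: a successful import followed by a normal boot, as reported above, is still the real test.

```shell
#!/bin/sh
# Sketch only: the export path is hypothetical.

# Succeed only if $1 is a regular, non-empty file.
xva_looks_present() {
    [ -f "$1" ] && [ -s "$1" ]
}

# Print (but do not run) the import command for export file $1.
build_import_cmd() {
    xva_looks_present "$1" || { echo "missing or empty export: $1" >&2; return 1; }
    printf 'xe vm-import filename=%s\n' "$1"
}

# Example (hypothetical path):
#   build_import_cmd /mnt/export/test-vm.xva
```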