Backup fails with "Body Timeout Error", "all targets have failed, step: writer.run()"
-
For the last week or so, the backup on one particular VM has been giving this error:
VM1 (xcp-ng-1)
Snapshot: Start: 2024-05-12 04:00, End: 2024-05-12 04:00
NFS2 transfer: Start: 2024-05-12 04:00, End: 2024-05-12 05:21, Duration: an hour, Error: Body Timeout Error
Start: 2024-05-12 04:00, End: 2024-05-12 05:23, Duration: an hour, Error: all targets have failed, step: writer.run(), Type: full
The host is running XCP-ng beta 3 (patched through 5-12-24), and XO is at "Xen Orchestra, commit 9b9c7".
This VM and numerous other VMs have been backing up the same way (to two different NFS targets) for the last five months.
I've tried running it in the middle of the day (when other backups are not running) but it didn't help.
"Number of retries if VM backup fails"=0 "Timeout " = <blank> "Compression"="zstd"
Any ideas?
-
@archw
FWIW... it happened again last night.
Start: 2024-05-15 04:00
End: 2024-05-15 07:07
Duration: 3 hours
Error: Body Timeout Error
Duration: 3 hours
Error: all targets have failed, step: writer.run()
Type: full
-
@archw did you find a solution for this? We've been experiencing this error on all of our backup jobs for the last two days.
Start: 2024-06-28 14:07
End: 2024-06-28 14:28
Duration: 21 minutes
Error: Body Timeout Error
Type: delta
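Since every job pointed at the same repositories started failing at the same time, the first thing we're checking is whether the backup remote itself has gotten slow or is stalling. Here is a rough sketch of the timing test (the path is a placeholder and an assumption, based on xo-server normally mounting remotes under /run/xo-server/mounts/ on the XO VM; adjust it to wherever your remote is actually mounted):

```ts
// Rough sanity check, not an XO tool: time a sequential write to the backup
// remote's mount point to see whether the NFS target has become slow enough
// to stall a transfer. The mount path below is an assumption; replace
// REMOTE-UUID (or the whole path) with your actual mount point.
import { createWriteStream } from "node:fs";
import { randomBytes } from "node:crypto";

const target = "/run/xo-server/mounts/REMOTE-UUID/speedtest.bin"; // hypothetical path
const chunk = randomBytes(4 * 1024 * 1024); // 4 MiB per write
const totalMiB = 1024;                      // write 1 GiB in total

const out = createWriteStream(target);
out.on("error", (err) => console.error("write failed:", err));

const start = Date.now();
let written = 0;

const writeNext = (): void => {
  while (written < totalMiB) {
    written += 4;
    if (!out.write(chunk)) {
      out.once("drain", writeNext); // respect back-pressure, resume on drain
      return;
    }
  }
  out.end(() => {
    const seconds = (Date.now() - start) / 1000;
    console.log(
      `${totalMiB} MiB in ${seconds.toFixed(1)} s = ${(totalMiB / seconds).toFixed(1)} MiB/s`
    );
  });
};

writeNext();
```

If the sustained rate is far below what the remote used to deliver (or the write stalls outright), the timeout is probably a symptom of the repository or the network path to it, not of the VMs being backed up.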
-
In the words of Ronald Reagan, "I don't recall the answer to that question."
If I remember correctly (subject to the after-effects of many happy hours since 5-12-24), I ended up rebooting the host that had that VM, and it hasn't done it since.
Arch
-
We realize this is an older issue, but we're experiencing something similar. Last week, I performed a rolling pool update, which involved rebooting all nodes and migrating VMs as part of the process.
Interestingly, the issue consistently affects the same VMs each time. These VMs have the necessary tools installed, and I can't pinpoint why only they are impacted.
We're encountering the same error across multiple pools. All pools use the same backup repositories, but out of approximately 100 VMs, only 3-4 are affected.
I know, even more happy hours since the last post, haha.
I could clone the VM, etc., but that seems a bit drastic.
-
Many, many happy hours have since transpired.
I ended up wiping out the XO VM that was running the process and making a new one. That seems to have fixed it.
With all that said, I got one again last night when backing up the same VM that has caused issues in the past. I just told the backup to restart, so let's see what happens.