xcp-ng: 8.2, latest patches applied
xen-orchestra: from sources on bare metal (Xen Orchestra, commit d796e; master, commit f1780; 38 commits behind master)
I added the workaround for the HTTP inactivity timeout in /etc/xo-server/config.httpInactivityTimeout.toml:
# Work-around HTTP timeout issue during backups
[xapiOptions]
httpInactivityTimeout = 1800000 # 30 mins
I have a full backup job with 9 VMs:
Concurrency: 10
Number of retries if VM backup fails: 5
Timeout: 72
Compression: zstd
The backup runs weekly, on Sunday at 12:15, when there is no load on xcp-ng or the VMs.
8 VMs finish the backup normally, but 1 VM fails on the first attempt and only finishes on the second attempt. That failure also marks the entire backup job as failed.
Retry the VM backup due to an error
  attempt: 1
  error: "VDI must be free or attached to exactly one VM"

Snapshot
  Start: 2024-09-01 12:15
  End: 2024-09-01 12:16
full
  transfer
    Start: 2024-09-01 12:16
    End: 2024-09-01 13:23
    Duration: an hour
    Error: Body Timeout Error
  Start: 2024-09-01 12:16
  End: 2024-09-01 13:23
  Duration: an hour
  Error: Body Timeout Error

Snapshot
  Start: 2024-09-01 22:16
  End: 2024-09-01 22:16
full
  transfer
    Start: 2024-09-01 22:16
    End: 2024-09-01 23:26
    Duration: an hour
    Size: 17.59 GiB
    Speed: 4.31 MiB/s
  Start: 2024-09-01 22:16
  End: 2024-09-01 23:26
  Duration: an hour

Start: 2024-09-01 12:15
End: 2024-09-01 23:26
Duration: 11 hours
Type: full
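For reference, the numbers in this log can be cross-checked with a little arithmetic. The sketch below is only an approximation: the UI timestamps have minute resolution, and the helper names are made up for the example.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M"
MIB_PER_GIB = 1024

def minutes_between(start: str, end: str) -> float:
    """Elapsed minutes between two timestamps as shown in the XO log."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds() / 60

def average_speed_mib_s(start: str, end: str, size_gib: float) -> float:
    """Average transfer speed in MiB/s for a transfer of size_gib."""
    return size_gib * MIB_PER_GIB / (minutes_between(start, end) * 60)

failed = minutes_between("2024-09-01 12:16", "2024-09-01 13:23")
succeeded = minutes_between("2024-09-01 22:16", "2024-09-01 23:26")
speed = average_speed_mib_s("2024-09-01 22:16", "2024-09-01 23:26", 17.59)

print(failed, succeeded)       # 67.0 70.0
print(f"{speed:.2f} MiB/s")    # 4.29 MiB/s
```

The computed 4.29 MiB/s is close to the reported 4.31 MiB/s, with the difference explained by minute rounding. The failed first transfer ran for 67 minutes before the Body Timeout Error, almost as long as the 70 minutes the successful second transfer needed.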
This has been happening for a couple of months now and persists after 2 XO updates from sources. The only thing that changed was the error message in the XO web interface.
The strange thing is that it's always the same VM. There are 2 similar VMs; both were freshly installed in the spring.
There are no errors on the xcp-ng server at that time.
The only errors I can find are on the XO server, in syslog:
root@backup:/var/log# grep "xo-server" syslog.1
2024-09-01T12:15:00.735286+02:00 backup xo-server[142373]: 2024-09-01T10:15:00.733Z xo:backups:worker INFO starting backup
2024-09-01T13:23:20.144978+02:00 backup xo-server[142373]: 2024-09-01T11:23:20.141Z xo:backups:worker WARN possibly unhandled rejection {
2024-09-01T13:23:20.145550+02:00 backup xo-server[142373]: error: BodyTimeoutError: Body Timeout Error
2024-09-01T13:23:20.145592+02:00 backup xo-server[142373]: at Timeout.onParserTimeout [as callback] (/opt/xen-orchestra/node_modules/undici/lib/dispatcher/client-h1.js:626:28)
2024-09-01T13:23:20.145617+02:00 backup xo-server[142373]: at Timeout.onTimeout [as _onTimeout] (/opt/xen-orchestra/node_modules/undici/lib/util/timers.js:22:13)
2024-09-01T13:23:20.145639+02:00 backup xo-server[142373]: at listOnTimeout (node:internal/timers:581:17)
2024-09-01T13:23:20.145660+02:00 backup xo-server[142373]: at process.processTimers (node:internal/timers:519:7) {
2024-09-01T13:23:20.145681+02:00 backup xo-server[142373]: code: 'UND_ERR_BODY_TIMEOUT'
2024-09-01T13:23:20.145703+02:00 backup xo-server[142373]: }
2024-09-01T13:23:20.145723+02:00 backup xo-server[142373]: }
2024-09-01T13:23:21.160056+02:00 backup xo-server[142373]: 2024-09-01T11:23:21.156Z xo:backups:AbstractVmRunner WARN writer step failed {
2024-09-01T13:23:21.160144+02:00 backup xo-server[142373]: error: BodyTimeoutError: Body Timeout Error
2024-09-01T13:23:21.160170+02:00 backup xo-server[142373]: at Timeout.onParserTimeout [as callback] (/opt/xen-orchestra/node_modules/undici/lib/dispatcher/client-h1.js:626:28)
2024-09-01T13:23:21.160191+02:00 backup xo-server[142373]: at Timeout.onTimeout [as _onTimeout] (/opt/xen-orchestra/node_modules/undici/lib/util/timers.js:22:13)
2024-09-01T13:23:21.160212+02:00 backup xo-server[142373]: at listOnTimeout (node:internal/timers:581:17)
2024-09-01T13:23:21.160232+02:00 backup xo-server[142373]: at process.processTimers (node:internal/timers:519:7) {
2024-09-01T13:23:21.160253+02:00 backup xo-server[142373]: code: 'UND_ERR_BODY_TIMEOUT'
2024-09-01T13:23:21.160285+02:00 backup xo-server[142373]: },
2024-09-01T13:23:21.160312+02:00 backup xo-server[142373]: step: 'writer.run()',
2024-09-01T13:23:21.160336+02:00 backup xo-server[142373]: writer: 'FullRemoteWriter'
2024-09-01T13:23:21.160357+02:00 backup xo-server[142373]: }
2024-09-01T23:26:54.251959+02:00 backup xo-server[142373]: 2024-09-01T21:26:54.250Z xo:backups:worker INFO backup has ended
2024-09-01T23:26:54.383030+02:00 backup xo-server[142373]: 2024-09-01T21:26:54.380Z xo:backups:worker INFO process will exit {
2024-09-01T23:26:54.383086+02:00 backup xo-server[142373]: duration: 40313647176,
2024-09-01T23:26:54.383115+02:00 backup xo-server[142373]: exitCode: 0,
2024-09-01T23:26:54.383154+02:00 backup xo-server[142373]: resourceUsage: {
2024-09-01T23:26:54.383175+02:00 backup xo-server[142373]: userCPUTime: 34734731925,
2024-09-01T23:26:54.383200+02:00 backup xo-server[142373]: systemCPUTime: 4723503045,
2024-09-01T23:26:54.383224+02:00 backup xo-server[142373]: maxRSS: 70968,
2024-09-01T23:26:54.383247+02:00 backup xo-server[142373]: sharedMemorySize: 0,
2024-09-01T23:26:54.383272+02:00 backup xo-server[142373]: unsharedDataSize: 0,
2024-09-01T23:26:54.383294+02:00 backup xo-server[142373]: unsharedStackSize: 0,
2024-09-01T23:26:54.383315+02:00 backup xo-server[142373]: minorPageFault: 2966203,
2024-09-01T23:26:54.383339+02:00 backup xo-server[142373]: majorPageFault: 0,
2024-09-01T23:26:54.383364+02:00 backup xo-server[142373]: swappedOut: 0,
2024-09-01T23:26:54.383384+02:00 backup xo-server[142373]: fsRead: 105920,
2024-09-01T23:26:54.383411+02:00 backup xo-server[142373]: fsWrite: 3910825184,
2024-09-01T23:26:54.383437+02:00 backup xo-server[142373]: ipcSent: 0,
2024-09-01T23:26:54.383466+02:00 backup xo-server[142373]: ipcReceived: 0,
2024-09-01T23:26:54.383487+02:00 backup xo-server[142373]: signalsCount: 0,
2024-09-01T23:26:54.383507+02:00 backup xo-server[142373]: voluntaryContextSwitches: 112085178,
2024-09-01T23:26:54.383532+02:00 backup xo-server[142373]: involuntaryContextSwitches: 79583
2024-09-01T23:26:54.383557+02:00 backup xo-server[142373]: },
2024-09-01T23:26:54.383587+02:00 backup xo-server[142373]: summary: { duration: '11h', cpuUsage: '98%', memoryUsage: '69.3 MiB' }
2024-09-01T23:26:54.383614+02:00 backup xo-server[142373]: }
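The raw counters in the "process will exit" entry can be checked against its `summary` line. This sketch assumes the CPU times are in microseconds and `maxRSS` is in kilobytes, as Node.js `process.resourceUsage()` reports them, and that `duration` is also in microseconds (an assumption inferred from the '11h' summary). The function name is hypothetical.

```python
# Raw values copied from the "process will exit" log entry above.
DURATION_US = 40_313_647_176    # assumed microseconds
USER_CPU_US = 34_734_731_925    # microseconds
SYSTEM_CPU_US = 4_723_503_045   # microseconds
MAX_RSS_KIB = 70_968            # kilobytes

def summarize(duration_us, user_us, system_us, max_rss_kib):
    """Convert the raw counters into the units used by the summary line."""
    cpu_us = user_us + system_us
    return {
        "duration_hours": duration_us / 1e6 / 3600,
        "cpu_time_hours": cpu_us / 1e6 / 3600,
        "cpu_usage_percent": 100 * cpu_us / duration_us,
        "memory_mib": max_rss_kib / 1024,
    }

for name, value in summarize(
    DURATION_US, USER_CPU_US, SYSTEM_CPU_US, MAX_RSS_KIB
).items():
    print(f"{name}: {value:.2f}")
```

Under those assumptions the values come out at roughly 11.2 hours of wall time, 10.96 hours of CPU time, 97.9% CPU usage and 69.3 MiB, which agrees with the `summary` line.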
The same VMs complete delta backups successfully.
What am I missing?