Delta backup stuck on "Clean VM Directory" for a long time
-
Hi,
I have a problem with delta backups, which take a long time to finish.
The remote is a local Minio instance, with buckets that have a retention policy set to get immutable backups.
It had been working fine for about 4 months, but after that it got bad. First of all, I'm not sure it's related to the long backup times, but I get some of these in the logs, and my guess is that it's because, with retention enabled, Minio keeps empty directories/prefixes until the deleted files exceed the retention period:
no alias references VHD
Transfer and Merge seem quite OK, and I think "Clean VM Directory" is the culprit; it can go on for hours.
At first I thought the long backup times could be related to performance on Minio but I can't see any S3 requests at that time.
During my troubleshooting I have started to think that the backup process is idle or waiting and only continues after a timeout, but I don't know how to debug this further.
There have been a few timeouts logged:
"message": "Connection timed out after 600000 ms", "stack": "TimeoutError: Connection timed out after 600000 ms\n at ClientRequest.<anonymous> (/opt/xen-orchestra/node_modules/@aws-sdk/node-http-handler/node_modules/@smithy/node-http-handler/dist-cjs/set-socket-timeout.js:7:30)\n at Object.onceWrapper (node:events:632:28)\n at ClientRequest.emit (node:events:518:28)\n at ClientRequest.patchedEmit [as emit] (/opt/xen-orchestra/@xen-orchestra/log/configure.js:52:17)\n at TLSSocket.emitRequestTimeout (node:_http_client:863:9)\n at Object.onceWrapper (node:events:632:28)\n at TLSSocket.emit (node:events:530:35)\n at TLSSocket.patchedEmit [as emit] (/opt/xen-orchestra/@xen-orchestra/log/configure.js:52:17)\n at Socket._onTimeout (node:net:604:8)\n at listOnTimeout (node:internal/timers:588:17)"
I have adjusted Concurrency both up and down. With a lower value the timeouts seem to be avoided, but even without these errors the backup times are just as long.
Some assistance would be much appreciated.
-
Hi,
Sadly, even if Minio is probably better than others (e.g. BackBlaze), they can all suffer from various issues. There's a reason why we only provide official pro support on AWS.
Would it be possible to test another provider to double check it's not XO related?
-
One thing I've noticed is that if I run a trace in Minio, I see no S3 requests for a while, but then it suddenly shows a couple of s3.list_objects_v2 calls which return status 499, each with a duration of just about 10 minutes. But this produces no timeout in the backup log.
Are you thinking about testing a public provider such as AWS?
I'm not sure we are able to do so. One thing I'm about to do is test a newer version of Minio in Docker on another server running Ubuntu 24.04.
As of now it's an older version running natively on an OmniOS server on top of ZFS storage. Is there a way to get more verbose logging from the backup job?
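For anyone who wants to check the listing behaviour outside of XO, here is a minimal sketch (not XO code, just an assumption about how one could test it) that walks a paginated ListObjectsV2 listing and records how long each page takes. A page that takes close to 600 seconds would line up with the 600000 ms timeout and the roughly 10 minute list_objects_v2 calls seen in the trace. The names time_listing and slow_pages are made up for this example, and the boto3 wiring in the comments uses placeholder endpoint, bucket and prefix values.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple

# A page fetcher takes an optional continuation token and returns one
# ListObjectsV2-style response dict ("Contents", "IsTruncated",
# "NextContinuationToken").
PageFetcher = Callable[[Optional[str]], Dict]


def time_listing(
    fetch_page: PageFetcher,
    clock: Callable[[], float] = time.monotonic,
) -> List[Tuple[int, int, float]]:
    """Walk a paginated listing; return (page_number, key_count, seconds) per page."""
    timings = []
    token = None
    page = 0
    while True:
        start = clock()
        response = fetch_page(token)
        elapsed = clock() - start
        page += 1
        timings.append((page, len(response.get("Contents", [])), elapsed))
        token = response.get("NextContinuationToken")
        # Stop on the last page, or if the server claims more pages but
        # gives no token (this avoids looping on the first page forever).
        if not response.get("IsTruncated") or token is None:
            break
    return timings


def slow_pages(timings, threshold_seconds: float):
    """Return only the pages that took at least threshold_seconds."""
    return [t for t in timings if t[2] >= threshold_seconds]


# Hypothetical wiring with boto3 (placeholder values, adjust to your setup):
#
#   import boto3
#   s3 = boto3.client("s3", endpoint_url="https://minio.example.local:9000")
#
#   def fetch_page(token):
#       kwargs = {"Bucket": "example-bucket", "Prefix": "example-prefix/"}
#       if token:
#           kwargs["ContinuationToken"] = token
#       return s3.list_objects_v2(**kwargs)
#
#   for page, keys, seconds in time_listing(fetch_page):
#       print(f"page {page}: {keys} keys in {seconds:.1f}s")
```

If the slow pages also show up here, the stall is probably on the storage side rather than in the backup job; if every page comes back quickly, that points back towards the XO process.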
-
Adding @florent in the loop