S3 backup broken
-
@Andrew the problem occurs only when resuming a merge.
Does your S3 provider apply any rate limits to queries? We issue a lot of copy/delete queries during a merge.
-
@florent For local testing I'm using MinIO and there is no limit or throttling enabled.
-
@Andrew I am testing with another user, using less concurrency during the merge (today 16 blocks are merged in parallel), and it seems to solve the problem.
I will make it configurable soon (today it's a hard-coded parameter).
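
For anyone following along, here is a minimal sketch of what "less concurrency during merge" means in practice. It is not the actual xen-orchestra code: mapWithConcurrency, fakeMergeBlock and the value 4 are hypothetical names and numbers used only for illustration.

```javascript
// Hypothetical helper: run `worker` over `items` with at most
// `concurrency` calls in flight at any time.
async function mapWithConcurrency(items, concurrency, worker) {
  if (!Number.isInteger(concurrency) || concurrency < 1) {
    throw new RangeError('concurrency must be a positive integer')
  }
  const results = new Array(items.length)
  let next = 0

  // Each runner pulls the next unprocessed index until none remain.
  async function runner() {
    while (next < items.length) {
      const index = next++
      results[index] = await worker(items[index], index)
    }
  }

  const runnerCount = Math.min(concurrency, items.length)
  await Promise.all(Array.from({ length: runnerCount }, () => runner()))
  return results
}

// Demo with a fake block merge that only tracks how many calls overlap.
async function demo() {
  let inFlight = 0
  let maxInFlight = 0
  const fakeMergeBlock = async blockId => {
    inFlight++
    maxInFlight = Math.max(maxInFlight, inFlight)
    await new Promise(resolve => setTimeout(resolve, 5)) // stands in for S3 copy + delete
    inFlight--
    return blockId
  }
  const blocks = Array.from({ length: 64 }, (_, i) => i)
  const merged = await mapWithConcurrency(blocks, 4, fakeMergeBlock)
  return { merged: merged.length, maxInFlight }
}

demo().then(result => console.log(result))
```

Making the second argument a setting instead of a constant is the "configurable" part; a lower value means fewer simultaneous copy/delete requests hitting the S3 backend.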
-
@florent I see the fix-s3-merge branch is gone. I updated to current master (commit d8e01) and delta backup S3 is not working (same problem as before).
-
@olivierlambert @florent The s3.js program has different code between the fix-s3-merge branch and master.
-
@Andrew back to the NoSuchKey error?
-
@florent 10 minute timeout...
{
  "data": {
    "mode": "delta",
    "reportWhen": "never"
  },
  "id": "1662038805540",
  "jobId": "d6c0a656-62c5-4c39-a57a-f246b39f1cef",
  "jobName": "minio-test",
  "message": "backup",
  "scheduleId": "bd4ef436-fd85-4f16-bf9e-71d1d0c8586f",
  "start": 1662038805540,
  "status": "failure",
  "infos": [
    {
      "data": {
        "vms": [
          "c45dd52b-fa92-df6f-800a-10853c183c23"
        ]
      },
      "message": "vms"
    }
  ],
  "tasks": [
    {
      "data": {
        "type": "VM",
        "id": "c45dd52b-fa92-df6f-800a-10853c183c23"
      },
      "id": "1662038806415",
      "message": "backup VM",
      "start": 1662038806415,
      "status": "failure",
      "tasks": [
        {
          "id": "1662038806845",
          "message": "clean-vm",
          "start": 1662038806845,
          "status": "success",
          "end": 1662038807088,
          "result": {
            "merge": false
          }
        },
        {
          "id": "1662038807286",
          "message": "snapshot",
          "start": 1662038807286,
          "status": "success",
          "end": 1662038808752,
          "result": "03c16279-9af3-e35a-a7a0-7724e62c9cc2"
        },
        {
          "data": {
            "id": "9890e0c4-ba3a-4810-8245-a49fdf16b16e",
            "isFull": false,
            "type": "remote"
          },
          "id": "1662038808752:0",
          "message": "export",
          "start": 1662038808752,
          "status": "failure",
          "tasks": [
            {
              "id": "1662038808802",
              "message": "transfer",
              "start": 1662038808802,
              "status": "success",
              "end": 1662038832144,
              "result": {
                "size": 1558597632
              }
            },
            {
              "id": "1662038832622",
              "message": "clean-vm",
              "start": 1662038832622,
              "status": "failure",
              "tasks": [
                {
                  "id": "1662038832809",
                  "message": "merge",
                  "start": 1662038832809,
                  "status": "failure",
                  "end": 1662039432907,
                  "result": {
                    "chain": [
                      "/xo-vm-backups/c45dd52b-fa92-df6f-800a-10853c183c23/vdis/d6c0a656-62c5-4c39-a57a-f246b39f1cef/ae8fffde-b2bd-4205-a596-9139ef59193f/20220901T124634Z.alias.vhd",
                      "/xo-vm-backups/c45dd52b-fa92-df6f-800a-10853c183c23/vdis/d6c0a656-62c5-4c39-a57a-f246b39f1cef/ae8fffde-b2bd-4205-a596-9139ef59193f/20220901T130323Z.alias.vhd"
                    ],
                    "message": "operation timed out",
                    "name": "TimeoutError",
                    "stack": "TimeoutError: operation timed out\n at Promise.timeout (/opt/xo/xo-builds/xen-orchestra-202209010921/node_modules/promise-toolbox/timeout.js:11:16)\n at S3Handler.rename (/opt/xo/xo-builds/xen-orchestra-202209010921/@xen-orchestra/fs/dist/abstract.js:338:37)\n at Queue.next (/opt/xo/xo-builds/xen-orchestra-202209010921/node_modules/limit-concurrency-decorator/dist/index.js:21:22)\n at process.processTicksAndRejections (node:internal/process/task_queues:95:5)"
                  }
                }
              ],
              "end": 1662039432908,
              "result": {
                "chain": [
                  "/xo-vm-backups/c45dd52b-fa92-df6f-800a-10853c183c23/vdis/d6c0a656-62c5-4c39-a57a-f246b39f1cef/ae8fffde-b2bd-4205-a596-9139ef59193f/20220901T124634Z.alias.vhd",
                  "/xo-vm-backups/c45dd52b-fa92-df6f-800a-10853c183c23/vdis/d6c0a656-62c5-4c39-a57a-f246b39f1cef/ae8fffde-b2bd-4205-a596-9139ef59193f/20220901T130323Z.alias.vhd"
                ],
                "message": "operation timed out",
                "name": "TimeoutError",
                "stack": "TimeoutError: operation timed out\n at Promise.timeout (/opt/xo/xo-builds/xen-orchestra-202209010921/node_modules/promise-toolbox/timeout.js:11:16)\n at S3Handler.rename (/opt/xo/xo-builds/xen-orchestra-202209010921/@xen-orchestra/fs/dist/abstract.js:338:37)\n at Queue.next (/opt/xo/xo-builds/xen-orchestra-202209010921/node_modules/limit-concurrency-decorator/dist/index.js:21:22)\n at process.processTicksAndRejections (node:internal/process/task_queues:95:5)"
              }
            }
          ],
          "end": 1662039433046
        }
      ],
      "end": 1662039433046
    }
  ],
  "end": 1662039433047
}
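
To make a log like this easier to read, here is a rough sketch that walks the nested tasks and reports the deepest failed one. findFailurePath and summarizeFailure are hypothetical helper names, not part of Xen Orchestra, and the log object is a hand-trimmed copy of the one above that keeps only the fields the helpers use.

```javascript
// Return the chain of failed tasks from the root down to the deepest failure,
// or null if the given task did not fail.
function findFailurePath(task, path = []) {
  if (task.status !== 'failure') return null
  const here = [...path, task]
  for (const child of task.tasks ?? []) {
    const deeper = findFailurePath(child, here)
    if (deeper !== null) return deeper
  }
  return here
}

// Condense the failure path into a one-line summary.
function summarizeFailure(log) {
  const path = findFailurePath(log)
  if (path === null) return null
  const leaf = path[path.length - 1]
  return {
    path: path.map(task => task.message).join(' > '),
    error: leaf.result?.name,
    durationMs: leaf.end !== undefined ? leaf.end - leaf.start : undefined,
  }
}

// Trimmed copy of the backup log (same statuses and timestamps).
const log = {
  message: 'backup',
  status: 'failure',
  tasks: [
    {
      message: 'backup VM',
      status: 'failure',
      tasks: [
        { message: 'clean-vm', status: 'success' },
        { message: 'snapshot', status: 'success' },
        {
          message: 'export',
          status: 'failure',
          tasks: [
            { message: 'transfer', status: 'success' },
            {
              message: 'clean-vm',
              status: 'failure',
              tasks: [
                {
                  message: 'merge',
                  status: 'failure',
                  start: 1662038832809,
                  end: 1662039432907,
                  result: { name: 'TimeoutError', message: 'operation timed out' },
                },
              ],
            },
          ],
        },
      ],
    },
  ],
}

console.log(summarizeFailure(log))
```

On this log the failure path ends at the merge task with a TimeoutError, and the duration works out to 600098 ms, which matches the 10 minute timeout.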
-
@Andrew yes. The timeouts are not fixed for now, only the NoSuchKey error.
The fix will at least let you set a custom concurrency limit, or maybe calculate a value in a smarter way.
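
For context on the TimeoutError itself: the stack trace shows the rename being wrapped in a timeout. The general pattern looks roughly like the sketch below. This is an illustration only, not the promise-toolbox implementation; withTimeout and fakeRename are hypothetical names.

```javascript
// Error type used by this sketch when the deadline passes first.
class TimeoutError extends Error {
  constructor(ms) {
    super(`operation timed out after ${ms} ms`)
    this.name = 'TimeoutError'
  }
}

// Reject with TimeoutError if `promise` has not settled within `ms`.
// Note: the underlying operation is NOT cancelled, it is only abandoned.
function withTimeout(promise, ms) {
  let timer
  const deadline = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new TimeoutError(ms)), ms)
  })
  return Promise.race([promise, deadline]).finally(() => clearTimeout(timer))
}

// Stand-in for a slow S3 rename (copy + delete) taking `ms` milliseconds.
function fakeRename(ms) {
  return new Promise(resolve => setTimeout(() => resolve('renamed'), ms))
}

withTimeout(fakeRename(5), 100).then(value => console.log('fast:', value))
withTimeout(fakeRename(50), 10).catch(err => console.log('slow:', err.name))
```

If many renames run in parallel against a slow backend, each one takes longer, so lowering the concurrency is one way to keep individual operations under the deadline.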
-
@florent Sure.... but delta backup merge S3 no longer works in master.
-
@Andrew yes, we will reduce the default concurrency while waiting for the parameterized version.
Here is the branch if you want to test it: https://github.com/vatesfr/xen-orchestra/pull/6400
Also, can you monitor the MinIO resource usage? I'm curious where the bottleneck is during a rename (CPU, RAM or disk usage).
-
@florent Limiting concurrency did not fix my S3 backup problem, but it's working again after updating the build. So I guess it's resolved.