Backups started to fail again (overall status: failure, but both snapshot and transfer return success)
-
Got these backup failures again. Usually it's only the "Docker" VM, but now all backups give the status mentioned in the topic. Below is one of the examples.
I have not updated Xen Orchestra in a "long" time; I'm on c8f9d81, which was current as of 3 July.
My hosts are fully updated, as well as the VM running XO.
The first non-Docker-VM failure appeared before I updated the hosts.
Anything you want to investigate, or should I just update XO and hope for these errors to stop?

```json
{
  "data": {
    "mode": "delta",
    "reportWhen": "failure"
  },
  "id": "1753140173983",
  "jobId": "38f0068f-c124-4876-85d3-83f1003db60c",
  "jobName": "HomeAssistant",
  "message": "backup",
  "scheduleId": "dcb1c759-76b8-441b-9dc0-595914e60608",
  "start": 1753140173983,
  "status": "failure",
  "infos": [
    {
      "data": {
        "vms": [
          "ed4758f3-de34-7a7e-a46b-dc007d52f5c3"
        ]
      },
      "message": "vms"
    }
  ],
  "tasks": [
    {
      "data": {
        "type": "VM",
        "id": "ed4758f3-de34-7a7e-a46b-dc007d52f5c3",
        "name_label": "HomeAssistant"
      },
      "id": "1753140251984",
      "message": "backup VM",
      "start": 1753140251984,
      "status": "failure",
      "tasks": [
        {
          "id": "1753140251993",
          "message": "clean-vm",
          "start": 1753140251993,
          "status": "success",
          "end": 1753140258038,
          "result": {
            "merge": false
          }
        },
        {
          "id": "1753140354122",
          "message": "snapshot",
          "start": 1753140354122,
          "status": "success",
          "end": 1753140356461,
          "result": "fc6d5d87-a2b5-cae9-8c2a-377ffff5febc"
        },
        {
          "data": {
            "id": "2b919467-704c-4e35-bac9-2d6a43118bda",
            "isFull": false,
            "type": "remote"
          },
          "id": "1753140356462",
          "message": "export",
          "start": 1753140356462,
          "status": "failure",
          "tasks": [
            {
              "id": "1753140359386",
              "message": "transfer",
              "start": 1753140359386,
              "status": "success",
              "end": 1753140753378,
              "result": {
                "size": 5630853120
              }
            },
            {
              "id": "1753140761602",
              "message": "clean-vm",
              "start": 1753140761602,
              "status": "failure",
              "end": 1753140775782,
              "result": {
                "name": "InternalError",
                "$fault": "client",
                "$metadata": {
                  "httpStatusCode": 500,
                  "requestId": "D98294C01B729C95",
                  "extendedRequestId": "RDk4Mjk0QzAxQjcyOUM5NUQ5ODI5NEMwMUI3MjlDOTVEOTgyOTRDMDFCNzI5Qzk1RDk4Mjk0QzAxQjcyOUM5NQ==",
                  "attempts": 3,
                  "totalRetryDelay": 112
                },
                "Code": "InternalError",
                "message": "Internal Error",
                "stack": "InternalError: Internal Error\n at throwDefaultError (/opt/xo/xo-builds/xen-orchestra-202507041243/node_modules/@smithy/smithy-client/dist-cjs/index.js:867:20)\n at /opt/xo/xo-builds/xen-orchestra-202507041243/node_modules/@smithy/smithy-client/dist-cjs/index.js:876:5\n at de_CommandError (/opt/xo/xo-builds/xen-orchestra-202507041243/node_modules/@aws-sdk/client-s3/dist-cjs/index.js:4952:14)\n at process.processTicksAndRejections (node:internal/process/task_queues:105:5)\n at async /opt/xo/xo-builds/xen-orchestra-202507041243/node_modules/@smithy/middleware-serde/dist-cjs/index.js:35:20\n at async /opt/xo/xo-builds/xen-orchestra-202507041243/node_modules/@aws-sdk/middleware-sdk-s3/dist-cjs/index.js:484:18\n at async /opt/xo/xo-builds/xen-orchestra-202507041243/node_modules/@smithy/middleware-retry/dist-cjs/index.js:320:38\n at async /opt/xo/xo-builds/xen-orchestra-202507041243/node_modules/@aws-sdk/middleware-sdk-s3/dist-cjs/index.js:110:22\n at async /opt/xo/xo-builds/xen-orchestra-202507041243/node_modules/@aws-sdk/middleware-sdk-s3/dist-cjs/index.js:137:14\n at async /opt/xo/xo-builds/xen-orchestra-202507041243/node_modules/@aws-sdk/middleware-logger/dist-cjs/index.js:33:22"
              }
            }
          ],
          "end": 1753140775783
        }
      ],
      "end": 1753140775783
    }
  ],
  "end": 1753140775784
}
```
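For what it's worth, the nested tasks show snapshot and transfer succeeding and the final clean-vm failing with an S3 InternalError (HTTP 500). To pull the failing steps out of a report like this without scrolling the raw JSON, I use something like the sketch below. It assumes the report above was saved as report.json next to the script; the file name and the helper itself are just illustrative, not something that ships with XO.

```ts
// list-failed-tasks.ts — walk an XO backup job report and print every failed (sub)task.
// Assumes the report JSON was saved as ./report.json (illustrative file name only).
import { readFileSync } from "node:fs";

interface Task {
  message?: string;
  status?: string;
  result?: unknown;
  tasks?: Task[];
}

function printFailures(task: Task, path: string[] = []): void {
  const here = [...path, task.message ?? "job"];
  if (task.status === "failure") {
    // Only leaf failures carry the interesting `result` (e.g. the S3 InternalError above).
    const detail = task.tasks?.length ? "" : ` -> ${JSON.stringify(task.result)}`;
    console.log(`${here.join(" / ")}: failure${detail}`);
  }
  for (const sub of task.tasks ?? []) printFailures(sub, here);
}

const report: Task = JSON.parse(readFileSync("./report.json", "utf8"));
printFailures(report);
```

For this report it prints the chain backup / backup VM / export / clean-vm, with the InternalError result attached to the last entry.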
-
Hi,
We can't help with outdated XO commits, it's simply not possible, see https://docs.xen-orchestra.com/community#report-a-bug. Please update to the latest commit and report back if you still have the issue, thanks!
edit: note that the error is likely not on the XO side here; we got an HTTP 500.
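Since the stack trace shows the 500 coming back through @aws-sdk/client-s3 during clean-vm, one way to confirm it's the storage backend rather than XO is to probe the same S3-compatible remote directly with the same SDK. A rough sketch, assuming an S3-compatible remote; the endpoint, bucket name and environment variable names are placeholders, not values from the report:

```ts
// probe-remote.ts — talk to the same S3-compatible remote directly, outside of XO,
// to see whether the backend itself answers simple calls with HTTP 500.
// Endpoint, bucket and credential variables below are placeholders for your own settings.
import { S3Client, HeadBucketCommand, ListObjectsV2Command } from "@aws-sdk/client-s3";

async function main() {
  const s3 = new S3Client({
    endpoint: "https://s3.example.lan", // your remote's endpoint (placeholder)
    region: "us-east-1",                // most S3-compatible stores accept any region value
    forcePathStyle: true,               // commonly required by self-hosted S3 backends
    credentials: {
      accessKeyId: process.env.S3_KEY ?? "",
      secretAccessKey: process.env.S3_SECRET ?? "",
    },
  });
  const bucket = "xo-backups"; // placeholder bucket name

  try {
    await s3.send(new HeadBucketCommand({ Bucket: bucket }));
    const listed = await s3.send(new ListObjectsV2Command({ Bucket: bucket, MaxKeys: 10 }));
    console.log("remote reachable, sample keys:", listed.Contents?.map(o => o.Key));
  } catch (err) {
    // An InternalError with httpStatusCode 500 here would point at the storage backend, not XO.
    console.error("remote probe failed:", err);
  }
}

main();
```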
-
@olivierlambert Thanks, I will update every machine and XO instance involved in the backup process, and possibly even the individual machines that fail. The first failure on vm-cleanup was on 15 July, a few days before I patched the hosts (which I did as part of troubleshooting and to prevent further failures). Still, these backups will (probably) be fully restorable, as I have verified with the always-failing Docker VM.
-
@peo said in Backups started to fail again (overall status: failure, but both snapshot and transfer return success):
@olivierlambert Thanks, I will update every machine and XO instance involved in the backup process, and possibly even the individual machines that fail. The first failure on vm-cleanup was on 15 July, a few days before I patched the hosts (which I did as part of troubleshooting and to prevent further failures). Still, these backups will (probably) be fully restorable, as I have verified with the always-failing Docker VM.
So you patch your hosts, but not the administrative tools for the hosts?
Seems a little cart before the horse there, no?
-
@DustinB said in Backups started to fail again (overall status: failure, but both snapshot and transfer return success):
@peo said in Backups started to fail again (overall status: failure, but both snapshot and transfer return success):
@olivierlambert Thanks, I will update every machine and XO instance involved in the backup process, and possibly even the individual machines that fail. The first failure on vm-cleanup was on 15 July, a few days before I patched the hosts (which I did as part of troubleshooting and to prevent further failures). Still, these backups will (probably) be fully restorable, as I have verified with the always-failing Docker VM.
So you patch your hosts, but not the administrative tools for the hosts?
Seems a little cart before the horse there, no?
That's part of the fault-finding procedure: don't patch everything at once (though now I have, after finding that patching the hosts did not solve the problem).