XCP-ng

    Backups started to fail again (overall status: failure, but both snapshot and transfer return success)

    • peo
      last edited by peo

      Got these backup failures again. Usually it is only the "Docker" VM, but now all backups give the status mentioned in the topic title. Below is one example.
      I have not updated Xen Orchestra in a "long" time; I'm on c8f9d81, which was current as of 3 July.
      My hosts are fully updated, as is the VM running XO.
      The first non-Docker-VM failure appeared before I updated the hosts.
      Is there anything you want to investigate, or should I just update XO and hope for these errors to stop?

      {
        "data": {
          "mode": "delta",
          "reportWhen": "failure"
        },
        "id": "1753140173983",
        "jobId": "38f0068f-c124-4876-85d3-83f1003db60c",
        "jobName": "HomeAssistant",
        "message": "backup",
        "scheduleId": "dcb1c759-76b8-441b-9dc0-595914e60608",
        "start": 1753140173983,
        "status": "failure",
        "infos": [
          {
            "data": {
              "vms": [
                "ed4758f3-de34-7a7e-a46b-dc007d52f5c3"
              ]
            },
            "message": "vms"
          }
        ],
        "tasks": [
          {
            "data": {
              "type": "VM",
              "id": "ed4758f3-de34-7a7e-a46b-dc007d52f5c3",
              "name_label": "HomeAssistant"
            },
            "id": "1753140251984",
            "message": "backup VM",
            "start": 1753140251984,
            "status": "failure",
            "tasks": [
              {
                "id": "1753140251993",
                "message": "clean-vm",
                "start": 1753140251993,
                "status": "success",
                "end": 1753140258038,
                "result": {
                  "merge": false
                }
              },
              {
                "id": "1753140354122",
                "message": "snapshot",
                "start": 1753140354122,
                "status": "success",
                "end": 1753140356461,
                "result": "fc6d5d87-a2b5-cae9-8c2a-377ffff5febc"
              },
              {
                "data": {
                  "id": "2b919467-704c-4e35-bac9-2d6a43118bda",
                  "isFull": false,
                  "type": "remote"
                },
                "id": "1753140356462",
                "message": "export",
                "start": 1753140356462,
                "status": "failure",
                "tasks": [
                  {
                    "id": "1753140359386",
                    "message": "transfer",
                    "start": 1753140359386,
                    "status": "success",
                    "end": 1753140753378,
                    "result": {
                      "size": 5630853120
                    }
                  },
                  {
                    "id": "1753140761602",
                    "message": "clean-vm",
                    "start": 1753140761602,
                    "status": "failure",
                    "end": 1753140775782,
                    "result": {
                      "name": "InternalError",
                      "$fault": "client",
                      "$metadata": {
                        "httpStatusCode": 500,
                        "requestId": "D98294C01B729C95",
                        "extendedRequestId": "RDk4Mjk0QzAxQjcyOUM5NUQ5ODI5NEMwMUI3MjlDOTVEOTgyOTRDMDFCNzI5Qzk1RDk4Mjk0QzAxQjcyOUM5NQ==",
                        "attempts": 3,
                        "totalRetryDelay": 112
                      },
                      "Code": "InternalError",
                      "message": "Internal Error",
                      "stack": "InternalError: Internal Error\n    at throwDefaultError (/opt/xo/xo-builds/xen-orchestra-202507041243/node_modules/@smithy/smithy-client/dist-cjs/index.js:867:20)\n    at /opt/xo/xo-builds/xen-orchestra-202507041243/node_modules/@smithy/smithy-client/dist-cjs/index.js:876:5\n    at de_CommandError (/opt/xo/xo-builds/xen-orchestra-202507041243/node_modules/@aws-sdk/client-s3/dist-cjs/index.js:4952:14)\n    at process.processTicksAndRejections (node:internal/process/task_queues:105:5)\n    at async /opt/xo/xo-builds/xen-orchestra-202507041243/node_modules/@smithy/middleware-serde/dist-cjs/index.js:35:20\n    at async /opt/xo/xo-builds/xen-orchestra-202507041243/node_modules/@aws-sdk/middleware-sdk-s3/dist-cjs/index.js:484:18\n    at async /opt/xo/xo-builds/xen-orchestra-202507041243/node_modules/@smithy/middleware-retry/dist-cjs/index.js:320:38\n    at async /opt/xo/xo-builds/xen-orchestra-202507041243/node_modules/@aws-sdk/middleware-sdk-s3/dist-cjs/index.js:110:22\n    at async /opt/xo/xo-builds/xen-orchestra-202507041243/node_modules/@aws-sdk/middleware-sdk-s3/dist-cjs/index.js:137:14\n    at async /opt/xo/xo-builds/xen-orchestra-202507041243/node_modules/@aws-sdk/middleware-logger/dist-cjs/index.js:33:22"
                    }
                  }
                ],
                "end": 1753140775783
              }
            ],
            "end": 1753140775783
          }
        ],
        "end": 1753140775784
      }
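
      For anyone skimming the log above: the job as a whole is marked "failure" because one leaf task (the second clean-vm run, against the S3 remote) failed, even though the snapshot and transfer subtasks succeeded. Below is a minimal sketch of how a report like this could be walked to surface the failing leaf; the types and helper name are illustrative only, inferred from the shape of this JSON, and are not part of XO.

      // Walk a backup report shaped like the JSON above and list failing leaf tasks.
      // The Task shape is inferred from this log only; it is not an XO API.
      interface Task {
        message: string;
        status: string;        // "success" | "failure" in this report
        tasks?: Task[];
      }

      function failingLeaves(task: Task, path: string[] = []): string[] {
        const here = [...path, task.message];
        if (!task.tasks || task.tasks.length === 0) {
          return task.status === "failure" ? [here.join(" > ")] : [];
        }
        return task.tasks.flatMap((t) => failingLeaves(t, here));
      }

      // Usage with the report above saved to a file:
      //   const report: Task = JSON.parse(require("node:fs").readFileSync("report.json", "utf8"));
      //   console.log(failingLeaves(report));
      //   -> [ "backup > backup VM > export > clean-vm" ]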
      
      • olivierlambert (Vates 🪐 Co-Founder & CEO)
        last edited by olivierlambert

        Hi,

        We can't help with outdated XO commits; see https://docs.xen-orchestra.com/community#report-a-bug. Please update to the latest commit and report back if you still have the issue, thanks!

        edit: note that the error is likely not on the XO side here; we got an HTTP 500.
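
        For context on that HTTP 500: the stack trace in the report comes from @aws-sdk/client-s3, so the request died inside the S3-compatible remote after three retries. A minimal sketch of how the remote could be probed independently of XO is shown below; the endpoint, bucket name and credentials are placeholders, not values from this thread.

        // Probe an S3-compatible remote with the same @aws-sdk/client-s3 library that
        // appears in the stack trace. Endpoint, bucket and credentials are placeholders.
        import { S3Client, HeadBucketCommand, ListObjectsV2Command } from "@aws-sdk/client-s3";

        const client = new S3Client({
          endpoint: "https://s3.example.local",  // your remote's endpoint
          region: "us-east-1",                   // many S3-compatible stores ignore this
          forcePathStyle: true,                  // often required for non-AWS backends
          credentials: {
            accessKeyId: process.env.S3_ACCESS_KEY_ID ?? "",
            secretAccessKey: process.env.S3_SECRET_ACCESS_KEY ?? "",
          },
        });

        async function probe(bucket: string): Promise<void> {
          // A 500 on either call would point at the storage backend rather than at XO.
          await client.send(new HeadBucketCommand({ Bucket: bucket }));
          const listed = await client.send(new ListObjectsV2Command({ Bucket: bucket, MaxKeys: 5 }));
          console.log("bucket reachable, sample keys:", listed.Contents?.map((o) => o.Key));
        }

        probe("my-xo-backups").catch((err) => console.error("probe failed:", err));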

        • peo @olivierlambert
          last edited by

          @olivierlambert Thanks, I will update every machine and XO instance involved in the backup process, and possibly even the individual machines that fail. The first failure on vm-cleanup was on 15 July, a few days before I patched the hosts (as part of troubleshooting and preventing further failures). Still, these backups will (probably) be fully restorable (as I have tested with the always-failing Docker VM).

          • DustinB @peo
            last edited by

            @peo said in Backups started to fail again (overall status: failure, but both snapshot and transfer return success):

            @olivierlambert Thanks, I will update every machine and XO instance involved in the backup process, and possibly even the individual machines that fail. The first failure on vm-cleanup was on 15 July, a few days before I patched the hosts (as part of troubleshooting and preventing further failures). Still, these backups will (probably) be fully restorable (as I have tested with the always-failing Docker VM).

            So you patch your hosts, but not the administrative tools for the hosts?

            Seems a little cart before the horse there, no?

            • peo @DustinB
              last edited by

              @DustinB said in Backups started to fail again (overall status: failure, but both snapshot and transfer return success):

              @peo said in Backups started to fail again (overall status: failure, but both snapshot and transfer return success):

              @olivierlambert Thanks, I will update every machine and XO instance involved in the backup process, and possibly even the individual machines that fail. The first failure on vm-cleanup was on 15 July, a few days before I patched the hosts (as part of troubleshooting and preventing further failures). Still, these backups will (probably) be fully restorable (as I have tested with the always-failing Docker VM).

              So you patch your hosts, but not the administrative tools for the hosts?

              Seems a little cart before the horse there, no?

              That's a fault-finding procedure: don't patch everything at once. (But now I have, after finding out that patching the hosts did not solve the problem.)
