XCP-ng

    Backup Suddenly Failing

    • olivierlambertO Offline
      olivierlambert Vates 🪐 Co-Founder CEO

      Hi,

      We need more details, because otherwise it's really hard to answer. Are you on XO source or XOA? Which commit/release?

      • JSylvia007J Online
        JSylvia007 @olivierlambert

        @olivierlambert - It's XO from sources. The about page says:

        Xen Orchestra, commit d1736
        Master, commit f2b19

        What's weird is that I have 6 other backups, all configured the same way, and all work perfectly fine. There's even another VM in that same backup, and that one works fine too.

        • olivierlambertO Offline
          olivierlambert Vates 🪐 Co-Founder CEO

          The first thing to do is upgrade to the latest available commit; otherwise we can't know whether this is a bug that has already been fixed.

          • florentF Offline
            florent Vates 🪐 XO Team @JSylvia007

            @JSylvia007 this means that one of the tasks coalescing one of the older backups failed.

            We are working on making this more observable, because for now it is a very opaque process. Even though these failures are rare, they can happen.

            I would advise you to enable the "merge synchronously" toggle in the backup job if your backup window allows it. At least you will get an error message earlier.

            • JSylvia007J Online
              JSylvia007 @olivierlambert

              @olivierlambert - You beat me to it. I did just that. I'm at the latest commit and just reran the backups. This is the ONLY VM that fails out of 6 VMs.

              Failed with the exact same error. The files referenced are different, but the error related to the stream is the exact same.

              The VM works fine. There's no indication of an issue with the virtual hard drives themselves.

              I've also tried with the VM running and powered off. Same issue.

              It happens to be a Windows 10 VM. I do have another Windows VM that backs up just fine (but it's a Windows Server VM).

              They all back up to the exact same remote.

              • JSylvia007J Online
                JSylvia007 @florent

                @florent - It was on, I toggled it off, re-ran the backup, still failed.

                • JSylvia007J Online
                  JSylvia007 @JSylvia007

                  @florent - Any additional information I can provide? The backup is still failing, and there's really no indication why.

                  • florentF Offline
                    florent Vates 🪐 XO Team @JSylvia007

                    @JSylvia007 Error: stream has ended with not enough data (actual: 397, expected: 2097152) is the root cause. Can you post the JSON here?

                    Remove the snapshot on the source to start a new backup chain.
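
The two numbers in this error can be sanity-checked. 2097152 bytes is exactly 2 MiB, which is the default block size of a dynamic VHD, and 397 happens to be the exact length of the HTTP 500 response recorded in the `text` field of the failed transfer task in the job log posted in this thread. The Python check below is only a sketch, and it assumes the response text is reproduced byte for byte. If that holds, it suggests the reader received the host's error page (carrying `VDI_IO_ERROR`) where it expected a block of disk data. That is one reading of the log, not something the XO team has confirmed here.

```python
# Assumption: this text is copied byte for byte from the `text` field of the
# failed "transfer" task in the job log posted in this thread.
RESPONSE = (
    "HTTP/1.1 500 Internal Error\r\n"
    "content-length: 266\r\n"
    "content-type: text/html\r\n"
    "connection: close\r\n"
    "cache-control: no-cache, no-store\r\n"
    "\r\n"
    "<html><body><h1>HTTP 500 internal server error</h1>"
    "An unexpected error occurred; please wait a while and try again. "
    "If the problem persists, please contact your support representative."
    "<h1> Additional information </h1>"
    "VDI_IO_ERROR: [ Device I/O errors ]</body></html>"
)

EXPECTED_BYTES = 2097152  # "expected" value from the error message
ACTUAL_BYTES = 397        # "actual" value from the error message

# Split the status line and headers from the HTML body.
headers, _, body = RESPONSE.partition("\r\n\r\n")
response_len = len(RESPONSE.encode("ascii"))
body_len = len(body.encode("ascii"))

print("expected chunk is 2 MiB:", EXPECTED_BYTES == 2 * 1024 * 1024)
print("body matches content-length:", body_len == 266)
print("whole response matches 'actual':", response_len == ACTUAL_BYTES)
```

All three checks print `True`, so the short read lines up with an error response from the host rather than with truncated disk data.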

                    • JSylvia007J Online
                      JSylvia007 @florent

                      @florent - Here is the JSON. Removing the Snapshots now and trying again with the merge synchronously toggled off.

                      Note the remote is a Synology using NFS, if that matters.

                      {
                        "data": {
                          "mode": "delta",
                          "reportWhen": "failure"
                        },
                        "id": "1774449668020",
                        "jobId": "7fc5396a-5383-4dab-91fe-6758eb8b7474",
                        "jobName": "ADMIN VMS",
                        "message": "backup",
                        "scheduleId": "d09acecc-cc98-4cfd-84a4-5bfd1575b20f",
                        "start": 1774449668020,
                        "status": "failure",
                        "infos": [
                          {
                            "data": {
                              "vms": [
                                "b827a2ad-361d-e44c-19ca-f9d632baacf8",
                                "afe4bee2-745d-da4a-0016-c74751856556"
                              ]
                            },
                            "message": "vms"
                          }
                        ],
                        "tasks": [
                          {
                            "data": {
                              "type": "VM",
                              "id": "b827a2ad-361d-e44c-19ca-f9d632baacf8",
                              "name_label": "ADMIN-VM01"
                            },
                            "id": "1774449670085",
                            "message": "backup VM",
                            "start": 1774449670085,
                            "status": "success",
                            "tasks": [
                              {
                                "id": "1774449670095",
                                "message": "clean-vm",
                                "start": 1774449670095,
                                "status": "success",
                                "end": 1774449670170,
                                "result": {
                                  "merge": false
                                }
                              },
                              {
                                "id": "1774449670451",
                                "message": "snapshot",
                                "start": 1774449670451,
                                "status": "success",
                                "end": 1774449672123,
                                "result": "dad1585e-4094-88aa-4894-d521fae5cb63"
                              },
                              {
                                "data": {
                                  "id": "9f2e49f9-4e87-444a-aa68-4cbf73f28e6d",
                                  "isFull": false,
                                  "type": "remote"
                                },
                                "id": "1774449672123:0",
                                "message": "export",
                                "start": 1774449672123,
                                "status": "success",
                                "tasks": [
                                  {
                                    "id": "1774449673924",
                                    "message": "transfer",
                                    "start": 1774449673924,
                                    "status": "success",
                                    "end": 1774449690670,
                                    "result": {
                                      "size": 283115520
                                    }
                                  },
                                  {
                                    "id": "1774449697186",
                                    "message": "clean-vm",
                                    "start": 1774449697186,
                                    "status": "success",
                                    "tasks": [
                                      {
                                        "id": "1774449698513",
                                        "message": "merge",
                                        "start": 1774449698513,
                                        "status": "success",
                                        "end": 1774449706694
                                      }
                                    ],
                                    "end": 1774449706704,
                                    "result": {
                                      "merge": true
                                    }
                                  }
                                ],
                                "end": 1774449706707
                              }
                            ],
                            "end": 1774449706707
                          },
                          {
                            "data": {
                              "type": "VM",
                              "id": "afe4bee2-745d-da4a-0016-c74751856556",
                              "name_label": "ADMIN-VM02"
                            },
                            "id": "1774449670088",
                            "message": "backup VM",
                            "start": 1774449670088,
                            "status": "failure",
                            "tasks": [
                              {
                                "id": "1774449670096",
                                "message": "clean-vm",
                                "start": 1774449670096,
                                "status": "success",
                                "end": 1774449670110,
                                "result": {
                                  "merge": false
                                }
                              },
                              {
                                "id": "1774449670452",
                                "message": "snapshot",
                                "start": 1774449670452,
                                "status": "success",
                                "end": 1774449673024,
                                "result": "77d9de45-e6b7-d202-9245-7db47b6fd9c9"
                              },
                              {
                                "data": {
                                  "id": "9f2e49f9-4e87-444a-aa68-4cbf73f28e6d",
                                  "isFull": true,
                                  "type": "remote"
                                },
                                "id": "1774449673024:0",
                                "message": "export",
                                "start": 1774449673024,
                                "status": "failure",
                                "tasks": [
                                  {
                                    "id": "1774449674094",
                                    "message": "transfer",
                                    "start": 1774449674094,
                                    "status": "failure",
                                    "end": 1774451157435,
                                    "result": {
                                      "text": "HTTP/1.1 500 Internal Error\r\ncontent-length: 266\r\ncontent-type: text/html\r\nconnection: close\r\ncache-control: no-cache, no-store\r\n\r\n<html><body><h1>HTTP 500 internal server error</h1>An unexpected error occurred; please wait a while and try again. If the problem persists, please contact your support representative.<h1> Additional information </h1>VDI_IO_ERROR: [ Device I/O errors ]</body></html>",
                                      "message": "stream has ended with not enough data (actual: 397, expected: 2097152)",
                                      "name": "Error",
                                      "stack": "Error: stream has ended with not enough data (actual: 397, expected: 2097152)\n    at readChunkStrict (/opt/xo/xo-builds/xen-orchestra-202603241416/@vates/read-chunk/index.js:88:19)\n    at process.processTicksAndRejections (node:internal/process/task_queues:104:5)\n    at async #read (file:///opt/xo/xo-builds/xen-orchestra-202603241416/@xen-orchestra/xapi/disks/XapiVhdStreamSource.mjs:98:65)\n    at async generator (file:///opt/xo/xo-builds/xen-orchestra-202603241416/@xen-orchestra/xapi/disks/XapiVhdStreamSource.mjs:199:22)\n    at async Timeout.next (file:///opt/xo/xo-builds/xen-orchestra-202603241416/@vates/generator-toolbox/dist/timeout.mjs:14:24)\n    at async generatorWithLength (file:///opt/xo/xo-builds/xen-orchestra-202603241416/@xen-orchestra/disk-transform/dist/Throttled.mjs:12:44)\n    at async Throttle.createThrottledGenerator (file:///opt/xo/xo-builds/xen-orchestra-202603241416/@vates/generator-toolbox/dist/throttle.mjs:53:30)\n    at async ThrottledDisk.diskBlocks (file:///opt/xo/xo-builds/xen-orchestra-202603241416/@xen-orchestra/disk-transform/dist/Disk.mjs:26:30)\n    at async Promise.all (index 0)\n    at async ForkedDisk.diskBlocks (file:///opt/xo/xo-builds/xen-orchestra-202603241416/@xen-orchestra/disk-transform/dist/SynchronizedDisk.mjs:18:30)"
                                    }
                                  },
                                  {
                                    "id": "1774451158098",
                                    "message": "clean-vm",
                                    "start": 1774451158098,
                                    "status": "success",
                                    "end": 1774451158157,
                                    "result": {
                                      "merge": false
                                    }
                                  }
                                ],
                                "end": 1774451158216
                              }
                            ],
                            "end": 1774451158218,
                            "result": {
                              "errno": -2,
                              "code": "ENOENT",
                              "syscall": "stat",
                              "path": "/opt/xo/mounts/9f2e49f9-4e87-444a-aa68-4cbf73f28e6d/xo-vm-backups/afe4bee2-745d-da4a-0016-c74751856556/vdis/7fc5396a-5383-4dab-91fe-6758eb8b7474/530abab7-9ea9-43d4-be6e-acb3fbf67065/20260325T144114Z.alias.vhd",
                              "message": "ENOENT: no such file or directory, stat '/opt/xo/mounts/9f2e49f9-4e87-444a-aa68-4cbf73f28e6d/xo-vm-backups/afe4bee2-745d-da4a-0016-c74751856556/vdis/7fc5396a-5383-4dab-91fe-6758eb8b7474/530abab7-9ea9-43d4-be6e-acb3fbf67065/20260325T144114Z.alias.vhd'",
                              "name": "Error",
                              "stack": "Error: ENOENT: no such file or directory, stat '/opt/xo/mounts/9f2e49f9-4e87-444a-aa68-4cbf73f28e6d/xo-vm-backups/afe4bee2-745d-da4a-0016-c74751856556/vdis/7fc5396a-5383-4dab-91fe-6758eb8b7474/530abab7-9ea9-43d4-be6e-acb3fbf67065/20260325T144114Z.alias.vhd'\nFrom:\n    at NfsHandler.addSyncStackTrace (/opt/xo/xo-builds/xen-orchestra-202603241416/@xen-orchestra/fs/dist/local.js:21:26)\n    at NfsHandler._getSize (/opt/xo/xo-builds/xen-orchestra-202603241416/@xen-orchestra/fs/dist/local.js:113:48)\n    at /opt/xo/xo-builds/xen-orchestra-202603241416/@xen-orchestra/fs/dist/utils.js:29:26\n    at new Promise (<anonymous>)\n    at NfsHandler.<anonymous> (/opt/xo/xo-builds/xen-orchestra-202603241416/@xen-orchestra/fs/dist/utils.js:24:12)\n    at loopResolver (/opt/xo/xo-builds/xen-orchestra-202603241416/node_modules/promise-toolbox/retry.js:83:46)\n    at new Promise (<anonymous>)\n    at loop (/opt/xo/xo-builds/xen-orchestra-202603241416/node_modules/promise-toolbox/retry.js:85:22)\n    at NfsHandler.retry (/opt/xo/xo-builds/xen-orchestra-202603241416/node_modules/promise-toolbox/retry.js:87:10)\n    at NfsHandler._getSize (/opt/xo/xo-builds/xen-orchestra-202603241416/node_modules/promise-toolbox/retry.js:103:18)"
                            }
                          }
                        ],
                        "end": 1774451158219
                      }
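
A log like this is easier to read when the failed tasks are pulled out of the nested `tasks` tree. The Python sketch below walks the tree and lists every task whose `status` is `failure`. The helper name `iter_failures` and the trimmed `SAMPLE` are illustrative only; `SAMPLE` keeps the structure, statuses, and the two key messages from the log above, with the ENOENT message shortened.

```python
from typing import Iterator, Optional, Tuple


def iter_failures(task: dict, path: Tuple[str, ...] = ()) -> Iterator[Tuple[Tuple[str, ...], Optional[str]]]:
    """Yield (task path, error message) for every failed task in the tree."""
    label = task.get("message", "?")
    name = task.get("data", {}).get("name_label")
    if name:
        label = f"{label} [{name}]"
    here = path + (label,)
    if task.get("status") == "failure":
        result = task.get("result")
        # Not every failed task carries a result object with a message.
        message = result.get("message") if isinstance(result, dict) else None
        yield here, message
    for sub in task.get("tasks", []):
        yield from iter_failures(sub, here)


# Trimmed, hand-written stand-in for the job log above (assumption: same shape).
SAMPLE = {
    "message": "backup",
    "status": "failure",
    "tasks": [
        {"message": "backup VM", "status": "success",
         "data": {"name_label": "ADMIN-VM01"}, "tasks": []},
        {"message": "backup VM", "status": "failure",
         "data": {"name_label": "ADMIN-VM02"},
         "result": {"code": "ENOENT", "message": "ENOENT: no such file or directory"},
         "tasks": [
             {"message": "snapshot", "status": "success"},
             {"message": "export", "status": "failure", "tasks": [
                 {"message": "transfer", "status": "failure",
                  "result": {"message": "stream has ended with not enough data "
                                        "(actual: 397, expected: 2097152)"}},
                 {"message": "clean-vm", "status": "success"},
             ]},
         ]},
    ],
}

failures = list(iter_failures(SAMPLE))
for task_path, message in failures:
    print(" > ".join(task_path), "->", message)

# The deepest failed task is usually the best place to start reading.
deepest = max(failures, key=lambda item: len(item[0]))
```

In this log the deepest failure is the `transfer` task. The ENOENT reported at the VM level appears to be a follow-on error, since the `.alias.vhd` file it looks for would only exist after a successful transfer.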
                      
                      • JSylvia007J Online
                        JSylvia007 @JSylvia007

                        @florent - Same. Still just this one failing.

                        • P Online
                          Pilow @JSylvia007

                          @JSylvia007 is the failing VM on the same host as the VMs that are not failing in this same job?

                          • JSylvia007J Online
                            JSylvia007 @Pilow

                            @Pilow - That's correct. It's the same host and same job and same remote.

                            • P Online
                              Pilow @JSylvia007

                              @JSylvia007 said:

                              That's correct. It's the same host and same job and same remote.

                              In your backup job, do you have NBD enabled AND a compliant, NBD-enabled network on the host (Network tab of the pool)?
                              Is your XO/XOA VM connected to this NBD-enabled network?

                              In the ADVANCED tab of your pool, at the bottom, are MIGRATION NETWORK and BACKUP NETWORK set to this NBD-enabled network?

                              Just some basic checks, sorry if these seem like dumb questions.

                              • JSylvia007J Online
                                JSylvia007 @Pilow

                                @Pilow - The issue is that I've changed nothing... And the job is suddenly failing. And all the other jobs and VMs are working just fine, so I don't think that would have anything to do with it.

                                • P Online
                                  Pilow @JSylvia007

                                  @JSylvia007 did you try creating a new job with just this VM?
                                  Is it still failing?

                                  A new job would have a separate UUID and create new folders with clean metadata on your NAS.
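
The path in the ENOENT error from the job log posted earlier in this thread shows why a new job gets fresh folders: the job ID is one of the directory levels. The Python sketch below splits that path into its parts. The layout is inferred from this single path, the helper name `split_backup_path` is made up for illustration, and reading the directory after the job ID as a per-disk (VDI) folder is an assumption.

```python
from pathlib import PurePosixPath

# Path copied from the ENOENT error in the job log posted in this thread.
PATH = (
    "/opt/xo/mounts/9f2e49f9-4e87-444a-aa68-4cbf73f28e6d/xo-vm-backups/"
    "afe4bee2-745d-da4a-0016-c74751856556/vdis/"
    "7fc5396a-5383-4dab-91fe-6758eb8b7474/"
    "530abab7-9ea9-43d4-be6e-acb3fbf67065/20260325T144114Z.alias.vhd"
)


def split_backup_path(path: str) -> dict:
    """Split a backup disk path into its ID components.

    Assumed layout (inferred from one path, not from XO documentation):
    <remote id>/xo-vm-backups/<VM UUID>/vdis/<job ID>/<disk dir>/<file>
    """
    parts = PurePosixPath(path).parts
    i = parts.index("xo-vm-backups")
    if parts[i + 2] != "vdis":
        raise ValueError("unexpected layout: no 'vdis' directory under the VM")
    return {
        "remote_id": parts[i - 1],
        "vm_uuid": parts[i + 1],
        "job_id": parts[i + 3],
        "disk_dir": parts[i + 4],  # presumably identifies the VDI (assumption)
        "file": parts[i + 5],
    }


info = split_backup_path(PATH)
for key, value in info.items():
    print(f"{key}: {value}")
```

The `job_id` component matches the `jobId` field of the log and `vm_uuid` matches the failing VM, so a job with a new UUID would write under a different `vdis/<job ID>` directory for the same VM.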

                                  • JSylvia007J Online
                                    JSylvia007 @Pilow

                                    @Pilow - I did just that. Fails in the exact same way.

                                    • P Online
                                      Pilow @JSylvia007

                                      @JSylvia007 this is a long shot, but if you have some space on the SR where the VM resides,
                                      shut the VM down and full-clone it.

                                      Try a backup of this clone.

                                      Report back?

                                      • JSylvia007J Online
                                        JSylvia007 @Pilow

                                        @Pilow - I can try this, but not until a bit later.

                                        • JSylvia007J Online
                                          JSylvia007 @JSylvia007

                                          @pilow & @florent - The plot thickens. I'm unable to full-clone the VM...

                                          {
                                            "id": "0mn6ds7ih",
                                            "properties": {
                                              "method": "vm.copy",
                                              "params": {
                                                "vm": "afe4bee2-745d-da4a-0016-c74751856556",
                                                "sr": "247ef8a6-9c10-e100-acd3-c9193f34ddc3",
                                                "name": "ADMIN-VM02_COPY"
                                              },
                                              "name": "API call: vm.copy",
                                              "userId": "b06e5d9f-a602-4b76-a7bb-b1c915712ca3",
                                              "type": "api.call"
                                            },
                                            "start": 1774463552009,
                                            "status": "failure",
                                            "updatedAt": 1774464678345,
                                            "end": 1774464678344,
                                            "result": {
                                              "code": "VDI_COPY_FAILED",
                                              "params": [
                                                "Fatal error: exception Unix.Unix_error(Unix.EIO, \"read\", \"\")\n"
                                              ],
                                              "task": {
                                                "uuid": "555f90cc-12b7-7c2c-a2df-0f29a16a007e",
                                                "name_label": "Async.VM.copy",
                                                "name_description": "",
                                                "allowed_operations": [],
                                                "current_operations": {},
                                                "created": "20260325T18:32:32Z",
                                                "finished": "20260325T18:51:18Z",
                                                "status": "failure",
                                                "resident_on": "OpaqueRef:22c5ddea-00c6-f412-4439-536c4bbdca63",
                                                "progress": 1,
                                                "type": "<none/>",
                                                "result": "",
                                                "error_info": [
                                                  "VDI_COPY_FAILED",
                                                  "Fatal error: exception Unix.Unix_error(Unix.EIO, \"read\", \"\")\n"
                                                ],
                                                "other_config": {},
                                                "subtask_of": "OpaqueRef:NULL",
                                                "subtasks": [
                                                  "OpaqueRef:655cc4e3-0205-ba7d-5831-4b191ecfba9e"
                                                ],
                                                "backtrace": "(((process xapi)(filename ocaml/xapi/xapi_vm_clone.ml)(line 77))((process xapi)(filename list.ml)(line 110))((process xapi)(filename ocaml/xapi/xapi_vm_clone.ml)(line 120))((process xapi)(filename ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml)(line 24))((process xapi)(filename ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml)(line 39))((process xapi)(filename ocaml/xapi/xapi_vm_clone.ml)(line 128))((process xapi)(filename ocaml/xapi/xapi_vm_clone.ml)(line 171))((process xapi)(filename ocaml/xapi/xapi_vm_clone.ml)(line 210))((process xapi)(filename ocaml/xapi/xapi_vm_clone.ml)(line 221))((process xapi)(filename list.ml)(line 121))((process xapi)(filename ocaml/xapi/xapi_vm_clone.ml)(line 223))((process xapi)(filename ocaml/xapi/xapi_vm_clone.ml)(line 461))((process xapi)(filename ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml)(line 24))((process xapi)(filename ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml)(line 39))((process xapi)(filename ocaml/xapi/xapi_vm.ml)(line 791))((process xapi)(filename ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml)(line 24))((process xapi)(filename ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml)(line 39))((process xapi)(filename ocaml/xapi/rbac.ml)(line 228))((process xapi)(filename ocaml/xapi/rbac.ml)(line 238))((process xapi)(filename ocaml/xapi/server_helpers.ml)(line 78)))"
                                              },
                                              "message": "VDI_COPY_FAILED(Fatal error: exception Unix.Unix_error(Unix.EIO, \"read\", \"\")\n)",
                                              "name": "XapiError",
                                              "stack": "XapiError: VDI_COPY_FAILED(Fatal error: exception Unix.Unix_error(Unix.EIO, \"read\", \"\")\n)\n    at XapiError.wrap (file:///opt/xo/xo-builds/xen-orchestra-202603241416/packages/xen-api/_XapiError.mjs:16:12)\n    at default (file:///opt/xo/xo-builds/xen-orchestra-202603241416/packages/xen-api/_getTaskResult.mjs:13:29)\n    at Xapi._addRecordToCache (file:///opt/xo/xo-builds/xen-orchestra-202603241416/packages/xen-api/index.mjs:1078:24)\n    at file:///opt/xo/xo-builds/xen-orchestra-202603241416/packages/xen-api/index.mjs:1112:14\n    at Array.forEach (<anonymous>)\n    at Xapi._processEvents (file:///opt/xo/xo-builds/xen-orchestra-202603241416/packages/xen-api/index.mjs:1102:12)\n    at Xapi._watchEvents (file:///opt/xo/xo-builds/xen-orchestra-202603241416/packages/xen-api/index.mjs:1275:14)"
                                            }
                                          }
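
`Unix.EIO` on `read` means a read system call returned an I/O error, which usually points at the storage under the disk rather than at the backup tool. A generic way to find where reads fail is to read a file or device end to end and record the offsets that raise `EIO`. The Python sketch below does that for any readable path on a Unix system. It is not an XCP-ng tool: which file or device actually backs this VM's disk depends on the SR type and is not shown in this thread, so the path is left to the caller, and the 2 MiB chunk size is an arbitrary choice.

```python
import errno
import os
import tempfile
from typing import List, Tuple


def find_read_errors(path: str, chunk_size: int = 2 * 1024 * 1024) -> List[Tuple[int, int]]:
    """Read `path` sequentially; return (offset, length) of chunks failing with EIO."""
    bad = []
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.lseek(fd, 0, os.SEEK_END)
        offset = 0
        while offset < size:
            length = min(chunk_size, size - offset)
            try:
                # pread reads at an explicit offset, so a failed chunk
                # does not stop the scan of the chunks after it.
                os.pread(fd, length, offset)
            except OSError as exc:
                if exc.errno != errno.EIO:
                    raise  # anything other than an I/O error is unexpected
                bad.append((offset, length))
            offset += length
    finally:
        os.close(fd)
    return bad


# Demo on a healthy temporary file (5 MiB of zeros): no bad chunks expected.
with tempfile.NamedTemporaryFile() as tmp:
    tmp.write(b"\0" * (5 * 1024 * 1024))
    tmp.flush()
    demo_result = find_read_errors(tmp.name)
    print(demo_result)
```

On a healthy file the result is an empty list. A non-empty list gives the byte ranges that could not be read, which can help tell a single damaged region from a wider storage problem.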
                                          
                                          • P Online
                                            Pilow @JSylvia007

                                            @JSylvia007

                                                  "Fatal error: exception Unix.Unix_error(Unix.EIO, \"read\", \"\")\n"
                                            

                                            Hmm, is the SR failing?
                                            Can you restore the last known good state of this VM (alongside the one in production) and try to back up this restored version?

