    S3 backup broken

      Andrew Top contributor @florent

      @florent I did a git update again (to commit aa261) and it works! Maybe I missed an update?

      {
        "data": {
          "mode": "delta",
          "reportWhen": "never"
        },
        "id": "1661987889299",
        "jobId": "d6c0a656-62c5-4c39-a57a-f246b39f1cef",
        "jobName": "minio-test",
        "message": "backup",
        "scheduleId": "bd4ef436-fd85-4f16-bf9e-71d1d0c8586f",
        "start": 1661987889299,
        "status": "success",
        "infos": [
          {
            "data": {
              "vms": [
                "c45dd52b-fa92-df6f-800a-10853c183c23"
              ]
            },
            "message": "vms"
          }
        ],
        "tasks": [
          {
            "data": {
              "type": "VM",
              "id": "c45dd52b-fa92-df6f-800a-10853c183c23"
            },
            "id": "1661987890202",
            "message": "backup VM",
            "start": 1661987890202,
            "status": "success",
            "tasks": [
              {
                "id": "1661987890615",
                "message": "clean-vm",
                "start": 1661987890615,
                "status": "success",
                "end": 1661987891797,
                "result": {
                  "merge": false
                }
              },
              {
                "id": "1661987891992",
                "message": "snapshot",
                "start": 1661987891992,
                "status": "success",
                "end": 1661987893470,
                "result": "ec858b82-4f64-c5fe-a258-887ba57c7458"
              },
              {
                "data": {
                  "id": "9890e0c4-ba3a-4810-8245-a49fdf16b16e",
                  "isFull": false,
                  "type": "remote"
                },
                "id": "1661987893471",
                "message": "export",
                "start": 1661987893471,
                "status": "success",
                "tasks": [
                  {
                    "id": "1661987893513",
                    "message": "transfer",
                    "start": 1661987893513,
                    "status": "success",
                    "end": 1661987897713,
                    "result": {
                      "size": 75549184
                    }
                  },
                  {
                    "id": "1661987898186",
                    "message": "clean-vm",
                    "start": 1661987898186,
                    "status": "success",
                    "tasks": [
                      {
                        "id": "1661987899091",
                        "message": "merge",
                        "start": 1661987899091,
                        "status": "success",
                        "end": 1661987902420
                      }
                    ],
                    "end": 1661987902474,
                    "result": {
                      "merge": true
                    }
                  }
                ],
                "end": 1661987902480
              }
            ],
            "end": 1661987902480
          }
        ],
        "end": 1661987902480
      }
      
        florent Vates 🪐 XO Team @Andrew

        @Andrew The problem occurs only when resuming a merge.

        Does your S3 provider apply rate limits to queries? We issue a lot of copy/delete queries during a merge.

          Andrew Top contributor @florent

          @florent For local testing I'm using MinIO and there is no limit or throttling enabled.

            florent Vates 🪐 XO Team @Andrew

            @Andrew I am testing with another user, using less concurrency during the merge (currently 16 blocks are merged in parallel), and it seems to solve the problem.

            I will make it configurable soon (for now it's a parameter in the code 😰).
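
For readers wondering what "less concurrency during merge" means in practice, here is a minimal sketch of capping the number of operations in flight. It is not Xen Orchestra's code: `mapWithConcurrency`, the `fakeMergeBlock` worker, and the limit of 4 are hypothetical names and values chosen only to illustrate a configurable limit in place of a hard-coded one.

```javascript
// Hypothetical helper (not from Xen Orchestra): run `worker` over `items`
// with at most `limit` calls in flight at any time.
async function mapWithConcurrency(items, limit, worker) {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError('limit must be a positive integer');
  }
  const results = new Array(items.length);
  let next = 0; // index of the next item nobody has picked up yet

  // Each runner pulls items one at a time until none are left.
  async function runner() {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }

  // Start `limit` runners (or fewer if there are fewer items).
  const runners = Array.from({ length: Math.min(limit, items.length) }, () => runner());
  await Promise.all(runners);
  return results;
}

// Illustration only: a stand-in for "merge one block" that just waits a little.
async function fakeMergeBlock(blockId) {
  await new Promise((resolve) => setTimeout(resolve, 5));
  return `merged block ${blockId}`;
}

// Merge 16 hypothetical blocks, but only 4 at a time instead of all 16 in parallel.
const demoBlocks = Array.from({ length: 16 }, (_, i) => i);
mapWithConcurrency(demoBlocks, 4, fakeMergeBlock).then((merged) => {
  console.log(`${merged.length} blocks merged`);
});
```

With a slower backend, lowering the limit trades total throughput for fewer simultaneous requests hitting the storage at once.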

              Andrew Top contributor @florent

              @florent I see the fix-s3-merge branch is gone. I updated to current master (commit d8e01) and delta backup S3 is not working (same problem as before).

                Andrew Top contributor @florent

                @olivierlambert @florent The s3.js program has different code between the fix-s3-merge branch and master.

                  florent Vates 🪐 XO Team @Andrew

                  @Andrew Back to the NoSuchKey error?

                    Andrew Top contributor @florent

                    @florent No, a 10-minute timeout...

                    {
                      "data": {
                        "mode": "delta",
                        "reportWhen": "never"
                      },
                      "id": "1662038805540",
                      "jobId": "d6c0a656-62c5-4c39-a57a-f246b39f1cef",
                      "jobName": "minio-test",
                      "message": "backup",
                      "scheduleId": "bd4ef436-fd85-4f16-bf9e-71d1d0c8586f",
                      "start": 1662038805540,
                      "status": "failure",
                      "infos": [
                        {
                          "data": {
                            "vms": [
                              "c45dd52b-fa92-df6f-800a-10853c183c23"
                            ]
                          },
                          "message": "vms"
                        }
                      ],
                      "tasks": [
                        {
                          "data": {
                            "type": "VM",
                            "id": "c45dd52b-fa92-df6f-800a-10853c183c23"
                          },
                          "id": "1662038806415",
                          "message": "backup VM",
                          "start": 1662038806415,
                          "status": "failure",
                          "tasks": [
                            {
                              "id": "1662038806845",
                              "message": "clean-vm",
                              "start": 1662038806845,
                              "status": "success",
                              "end": 1662038807088,
                              "result": {
                                "merge": false
                              }
                            },
                            {
                              "id": "1662038807286",
                              "message": "snapshot",
                              "start": 1662038807286,
                              "status": "success",
                              "end": 1662038808752,
                              "result": "03c16279-9af3-e35a-a7a0-7724e62c9cc2"
                            },
                            {
                              "data": {
                                "id": "9890e0c4-ba3a-4810-8245-a49fdf16b16e",
                                "isFull": false,
                                "type": "remote"
                              },
                              "id": "1662038808752:0",
                              "message": "export",
                              "start": 1662038808752,
                              "status": "failure",
                              "tasks": [
                                {
                                  "id": "1662038808802",
                                  "message": "transfer",
                                  "start": 1662038808802,
                                  "status": "success",
                                  "end": 1662038832144,
                                  "result": {
                                    "size": 1558597632
                                  }
                                },
                                {
                                  "id": "1662038832622",
                                  "message": "clean-vm",
                                  "start": 1662038832622,
                                  "status": "failure",
                                  "tasks": [
                                    {
                                      "id": "1662038832809",
                                      "message": "merge",
                                      "start": 1662038832809,
                                      "status": "failure",
                                      "end": 1662039432907,
                                      "result": {
                                        "chain": [
                                          "/xo-vm-backups/c45dd52b-fa92-df6f-800a-10853c183c23/vdis/d6c0a656-62c5-4c39-a57a-f246b39f1cef/ae8fffde-b2bd-4205-a596-9139ef59193f/20220901T124634Z.alias.vhd",
                                          "/xo-vm-backups/c45dd52b-fa92-df6f-800a-10853c183c23/vdis/d6c0a656-62c5-4c39-a57a-f246b39f1cef/ae8fffde-b2bd-4205-a596-9139ef59193f/20220901T130323Z.alias.vhd"
                                        ],
                                        "message": "operation timed out",
                                        "name": "TimeoutError",
                                        "stack": "TimeoutError: operation timed out\n    at Promise.timeout (/opt/xo/xo-builds/xen-orchestra-202209010921/node_modules/promise-toolbox/timeout.js:11:16)\n    at S3Handler.rename (/opt/xo/xo-builds/xen-orchestra-202209010921/@xen-orchestra/fs/dist/abstract.js:338:37)\n    at Queue.next (/opt/xo/xo-builds/xen-orchestra-202209010921/node_modules/limit-concurrency-decorator/dist/index.js:21:22)\n    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)"
                                      }
                                    }
                                  ],
                                  "end": 1662039432908,
                                  "result": {
                                    "chain": [
                                      "/xo-vm-backups/c45dd52b-fa92-df6f-800a-10853c183c23/vdis/d6c0a656-62c5-4c39-a57a-f246b39f1cef/ae8fffde-b2bd-4205-a596-9139ef59193f/20220901T124634Z.alias.vhd",
                                      "/xo-vm-backups/c45dd52b-fa92-df6f-800a-10853c183c23/vdis/d6c0a656-62c5-4c39-a57a-f246b39f1cef/ae8fffde-b2bd-4205-a596-9139ef59193f/20220901T130323Z.alias.vhd"
                                    ],
                                    "message": "operation timed out",
                                    "name": "TimeoutError",
                                    "stack": "TimeoutError: operation timed out\n    at Promise.timeout (/opt/xo/xo-builds/xen-orchestra-202209010921/node_modules/promise-toolbox/timeout.js:11:16)\n    at S3Handler.rename (/opt/xo/xo-builds/xen-orchestra-202209010921/@xen-orchestra/fs/dist/abstract.js:338:37)\n    at Queue.next (/opt/xo/xo-builds/xen-orchestra-202209010921/node_modules/limit-concurrency-decorator/dist/index.js:21:22)\n    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)"
                                  }
                                }
                              ],
                              "end": 1662039433046
                            }
                          ],
                          "end": 1662039433046
                        }
                      ],
                      "end": 1662039433047
                    }
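
The "10 minute" figure can be read directly from the log: the failed `merge` task ended 600,098 ms after it started. As a small sketch of how to pull per-task durations out of such a job log, the snippet below walks a reduced copy of the `export` task (only `message`, `start`, `end`, `status`, and `tasks` are kept). The helper name `collectDurations` is hypothetical and not part of Xen Orchestra.

```javascript
// Reduced copy of the failed "export" task from the log above.
// Timestamps are milliseconds since the Unix epoch.
const exportTask = {
  message: 'export',
  start: 1662038808752,
  end: 1662039433046,
  status: 'failure',
  tasks: [
    { message: 'transfer', start: 1662038808802, end: 1662038832144, status: 'success' },
    {
      message: 'clean-vm',
      start: 1662038832622,
      end: 1662039432908,
      status: 'failure',
      tasks: [
        { message: 'merge', start: 1662038832809, end: 1662039432907, status: 'failure' },
      ],
    },
  ],
};

// Hypothetical helper: recursively list every task with its duration in ms.
function collectDurations(task, parents = []) {
  const path = [...parents, task.message];
  const rows = [];
  if (typeof task.start === 'number' && typeof task.end === 'number') {
    rows.push({ path: path.join(' > '), status: task.status, ms: task.end - task.start });
  }
  for (const child of task.tasks || []) {
    rows.push(...collectDurations(child, path));
  }
  return rows;
}

for (const row of collectDurations(exportTask)) {
  console.log(`${row.path}: ${row.status}, ${(row.ms / 1000).toFixed(1)} s`);
}
```

Run against the full log, the same walk makes it easy to see that the transfer itself succeeded quickly and that nearly all of the elapsed time was spent in the merge.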
                    
                      florent Vates 🪐 XO Team @Andrew

                      @Andrew Yes, the timeouts are not fixed for now, only the NoSuchKey error.

                      The fix will at least let you set a custom concurrency limit, or maybe calculate a value in a smarter way.
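
The stack trace in the failed log shows the `TimeoutError` coming from a timeout applied inside `S3Handler.rename`. As a generic illustration of that pattern, here is a minimal timeout wrapper. It is not the promise-toolbox implementation; `withTimeout`, `slowRename`, and the durations used are hypothetical.

```javascript
// Error type mirroring the name seen in the log; defined here for the sketch.
class TimeoutError extends Error {
  constructor(message = 'operation timed out') {
    super(message);
    this.name = 'TimeoutError';
  }
}

// Hypothetical helper: reject with TimeoutError if `promise` has not settled
// within `ms` milliseconds. The underlying operation is NOT cancelled.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new TimeoutError()), ms);
  });
  // Whichever settles first wins; always clear the timer afterwards.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Illustration only: a stand-in for a slow backend operation.
function slowRename() {
  return new Promise((resolve) => setTimeout(() => resolve('renamed'), 100));
}

withTimeout(slowRename(), 20).catch((error) => {
  console.log(`${error.name}: ${error.message}`);
});
```

Note that in this sketch the slow operation keeps running after the timeout fires; the caller simply stops waiting for it.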

                        Andrew Top contributor @florent

                        @florent Sure... but the S3 delta backup merge no longer works in master.

                          florent Vates 🪐 XO Team @Andrew

                          @Andrew Yes, we will reduce the default concurrency while waiting for the parameterized version.

                          Here is the branch if you want to test it: https://github.com/vatesfr/xen-orchestra/pull/6400

                          Also, can you monitor the MinIO resource usage? I'm curious where the bottleneck is during a rename (CPU, RAM, or disk usage).

                          Linked pull request, opened by fbeauchamp in vatesfr/xen-orchestra: fix(vhd-lib/merge): reduce concurrency to protect slower backends #6400 (closed)

                            Andrew Top contributor @florent

                            @florent Limiting concurrency did not fix my S3 backup problem, but it's working again after updating the build. So I guess it's resolved.
