XCP-ng

    Backup Job HTTP connection abruptly closed

    Xen Orchestra
    • _danielgurgel

      We are getting a backup error on only one server in our pool. We've already swapped the NFS storage and made a full clone of the VM for testing, but the backup still fails (all other servers back up fine to the same NFS server).

      I have not found anything related to this error and the snapshot operations are working correctly. Any tips to solve this problem?

      transfer 
      Start: Jul 27, 2021, 08:50:02 AM
      End: Jul 27, 2021, 09:39:49 AM
      Duration: an hour
      Error: HTTP connection abruptly closed
      Start: Jul 27, 2021, 08:50:02 AM
      End: Jul 27, 2021, 09:39:49 AM
      Duration: an hour
      Error: HTTP connection abruptly closed
      Start: Jul 27, 2021, 08:49:33 AM
      End: Jul 27, 2021, 09:44:45 AM
      Duration: an hour
      Error: all targets have failed, step: writer.run()
      Type: full
      
      • _danielgurgel @_danielgurgel

        @_danielgurgel Here is complete log of the operation.

        vm.copy
        {
          "vm": "54676579-2328-d137-1002-0f32920eab23",
          "sr": "50c59b18-5b5c-2eed-8c82-b8f7fdc8e9b5",
          "name": "VM_NAME"
        }
        {
          "call": {
            "method": "VM.destroy",
            "params": [
              "OpaqueRef:fc032b38-d8d7-43ab-983c-f54bc9dc6f85"
            ]
          },
          "message": "operation timed out",
          "name": "TimeoutError",
          "stack": "TimeoutError: operation timed out
            at Promise.call (/opt/xen-orchestra/node_modules/promise-toolbox/timeout.js:13:16)
            at Xapi._call (/opt/xen-orchestra/packages/xen-api/src/index.js:644:37)
            at /opt/xen-orchestra/packages/xen-api/src/index.js:722:21
            at loopResolver (/opt/xen-orchestra/node_modules/promise-toolbox/retry.js:94:23)
            at Promise._execute (/opt/xen-orchestra/node_modules/bluebird/js/release/debuggability.js:384:9)
            at Promise._resolveFromExecutor (/opt/xen-orchestra/node_modules/bluebird/js/release/promise.js:518:18)
            at new Promise (/opt/xen-orchestra/node_modules/bluebird/js/release/promise.js:103:10)
            at loop (/opt/xen-orchestra/node_modules/promise-toolbox/retry.js:98:12)
            at retry (/opt/xen-orchestra/node_modules/promise-toolbox/retry.js:101:10)
            at Xapi._sessionCall (/opt/xen-orchestra/packages/xen-api/src/index.js:713:20)
            at Xapi.call (/opt/xen-orchestra/packages/xen-api/src/index.js:247:14)
            at loopResolver (/opt/xen-orchestra/node_modules/promise-toolbox/retry.js:94:23)
            at Promise._execute (/opt/xen-orchestra/node_modules/bluebird/js/release/debuggability.js:384:9)
            at Promise._resolveFromExecutor (/opt/xen-orchestra/node_modules/bluebird/js/release/promise.js:518:18)
            at new Promise (/opt/xen-orchestra/node_modules/bluebird/js/release/promise.js:103:10)
            at loop (/opt/xen-orchestra/node_modules/promise-toolbox/retry.js:98:12)
            at Xapi.retry (/opt/xen-orchestra/node_modules/promise-toolbox/retry.js:101:10)
            at Xapi.call (/opt/xen-orchestra/node_modules/promise-toolbox/retry.js:119:18)
            at Xapi.destroy (/opt/xen-orchestra/@xen-orchestra/xapi/src/vm.js:324:16)
            at Xapi._copyVm (file:///opt/xen-orchestra/packages/xo-server/src/xapi/index.mjs:322:9)
            at Xapi.copyVm (file:///opt/xen-orchestra/packages/xo-server/src/xapi/index.mjs:337:7)
            at Api.callApiMethod (file:///opt/xen-orchestra/packages/xo-server/src/xo-mixins/api.mjs:304:20)"
        }
        
        • olivierlambert Vates 🪐 Co-Founder CEO

          It means XO sent an order to your XAPI (of your pool) and it never answered, at least not before a timeout.
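
To illustrate the kind of timeout described above, here is a minimal sketch of a call that is raced against a timer. It is not the actual xen-api or promise-toolbox code; `withTimeout`, `TimeoutError`, and `neverAnswers` are hypothetical names used only for this example, and the 100 ms delay is arbitrary.

```javascript
// Hypothetical error type, modeled on the "TimeoutError: operation timed out" in the log.
class TimeoutError extends Error {
  constructor(message = 'operation timed out') {
    super(message)
    this.name = 'TimeoutError'
  }
}

// Hypothetical helper: reject if `promise` has not settled after `ms` milliseconds.
function withTimeout(promise, ms) {
  let timer
  const timeout = new Promise((_resolve, reject) => {
    timer = setTimeout(() => reject(new TimeoutError()), ms)
  })
  // Whichever settles first wins; the timer is cleared in both cases.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer))
}

// Stand-in for a XAPI call that never gets an answer (assumption for the demo).
const neverAnswers = () => new Promise(() => {})

withTimeout(neverAnswers(), 100).catch(error => {
  console.log(error.name, error.message)
})
```

The caller only learns that no answer arrived in time. It cannot tell whether the request was lost, is still running on the host, or failed silently, which is why the host-side logs matter here.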

          • _danielgurgel @olivierlambert

            @olivierlambert But is there any reason why this issue occurs only for this VM? Even after cloning the VM, the problem happens with the clone, and even after changing the NFS server, the problem persists. We'll try moving it to a new cluster.

            • olivierlambert Vates 🪐 Co-Founder CEO

              I can't guess without taking more time to investigate, ideally on the host directly.

              My guess is the issue is related to the host/pool connection with XO, not the storage.

              • _danielgurgel @olivierlambert

                @olivierlambert Is there any difference between "traditional Backup" and Export VM performed by Xen Orchestra?

                Even after moving the virtual server to another cluster, the problem still occurs. However, the Export operation works normally.

                • Forza @_danielgurgel

                  @_danielgurgel said in Backup Job HTTP connection abruptly closed:

                  Error: all targets have failed, step: writer.run()

                  I had a similar issue today too, but restarting the backup worked. Weird. I had another similar case a little while ago that I also opened a ticket for.

                  • olivierlambert Vates 🪐 Co-Founder CEO @_danielgurgel

                    @_danielgurgel If you mean basic backup, it's an XVA export in both cases. The only difference is that in the backup case, you are writing the file to a remote instead of sending it to your browser.
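
As a rough illustration of that point, the sketch below pipes the same export stream into an interchangeable sink and shows how a stream cut off mid-transfer surfaces as an error. This is not XO's implementation; `countingSink`, `transfer`, and `brokenExport` are hypothetical names, and the in-memory sink merely stands in for a file on a remote or a browser response.

```javascript
const { PassThrough, Writable } = require('stream')
const { pipeline } = require('stream/promises')

// Hypothetical sink that only counts bytes; it stands in for a file on a
// remote (backup) or an HTTP response to a browser (export).
function countingSink() {
  const sink = new Writable({
    write(chunk, _encoding, callback) {
      sink.bytes += chunk.length
      callback()
    },
  })
  sink.bytes = 0
  return sink
}

// The transfer code is the same whatever the destination is.
async function transfer(exportStream, sink) {
  await pipeline(exportStream, sink)
  return sink.bytes
}

// Simulated export whose connection is cut before the data is complete.
function brokenExport() {
  const stream = new PassThrough()
  stream.write('partial XVA data')
  setImmediate(() => stream.destroy(new Error('HTTP connection abruptly closed')))
  return stream
}

transfer(brokenExport(), countingSink()).catch(error => {
  console.log('transfer failed:', error.message)
})
```

If the source side is the same in both cases, a failure that shows up only in backups would point at what differs, such as the destination, the timing, or the load at the scheduled hour, rather than at the export itself.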

                    • lavamind

                      We've been having the same problem with our Delta backups for several weeks now. The job runs every day, and about one day in three we get failures like this. It seems to affect random VMs, but one or two seem to be affected more often.

                      We tried increasing the ring buffers on the physical network interfaces but it didn't help. Now we're going to try to pause GC during the backups to see if it helps.

                      We looked at SMlog and daemon.log and could not find any obvious problems on the host occurring at the time of the error. If it's a networking problem, how could we verify this?

                      • olivierlambert Vates 🪐 Co-Founder CEO

                        @lavamind Please triple-check that you are using XOA on latest or, if you run XO from the sources, that you are on master.

                        • lavamind @olivierlambert

                          @olivierlambert Yeah, that's definitely the next thing we'll try. For now we're using sources on release 5.59. If the problem persists, we'll upgrade to 5.63 next week.

                          Not too keen on following master, since we have had issues with it in the past (including bad backups)...

                            • lavamind @olivierlambert

                              "FYI, we do our best to ensure master is not broken but we only do the complete QA process just before an XOA release"

                              Is that still the case?

                              From https://github.com/vatesfr/xen-orchestra/issues/3784#issuecomment-447797895

                              jcharaoui created this issue in vatesfr/xen-orchestra

                              closed [Backup NG] Delta backup base VHDs missing after hitting retention limit #3784

                              • olivierlambert Vates 🪐 Co-Founder CEO

                                It's always the case 🙂

                                • _danielgurgel @olivierlambert

                                  @olivierlambert Even after updating the host from 8.0 to 8.2 (with the latest updates) and after the cluster and NFS migration, the problem persists.

                                  We updated the virtualization agent on the virtual server to the latest available version from Citrix and we were able to back it up for a few weeks...but the problem reoccurred, again only for the same server.

                                  Are there any logs I can paste to help identify this failure?

                                  • olivierlambert Vates 🪐 Co-Founder CEO

                                    This is not an easy question 🤔 This would require investigation on the host, I'm afraid.

                                    • lavamind

                                      For the record, since upgrading to 5.63 the issue hasn't re-occurred at all.
