XCP-ng

    worker exited with code 1 and signal null

    • utopianfish

      Hi, I have a 2-node cluster with local SSD storage. One host is the master and the other is used as a replication host; they are directly connected via 10 Gb NICs. It has been working perfectly for the last few days since I set it up, but as of last night I'm getting the above error. I have no idea how to troubleshoot this. I'm on Xen Orchestra, commit c3905... The hosts did 29 patch updates last night.

      Cheers

      • olivierlambert (Vates 🪐 Co-Founder CEO)

        Do you have the entire log?

        • utopianfish @olivierlambert

          @olivierlambert

          {
            "data": {
              "mode": "delta",
              "reportWhen": "always"
            },
            "id": "1756806348037",
            "jobId": "184411ba-b5db-41b4-8915-9b35d731ac77",
            "jobName": "Replication to HOST147",
            "message": "backup",
            "scheduleId": "28eb38ca-41b4-403e-b34e-92015d51207c",
            "start": 1756806348037,
            "status": "failure",
            "end": 1756806348278,
            "result": {
              "message": "worker exited with code 1 and signal null",
              "name": "Error",
              "stack": "Error: worker exited with code 1 and signal null\n    at ChildProcess.<anonymous> (file:///opt/xo/xo-builds/xen-orchestra-202509011824/@xen-orchestra/backups/runBackupWorker.mjs:24:48)\n    at ChildProcess.emit (node:events:519:28)\n    at ChildProcess.patchedEmit [as emit] (/opt/xo/xo-builds/xen-orchestra-202509011824/@xen-orchestra/log/configure.js:52:17)\n    at Process.ChildProcess._handle.onexit (node:internal/child_process:293:12)\n    at Process.callbackTrampoline (node:internal/async_hooks:130:17)"
            }
          }
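
One thing the log does show is how quickly the job ended. A minimal sketch in plain Node.js (no XO code involved; `entry` simply copies the `start`, `end` and `status` fields from the log above, and `durationMs` is a hypothetical helper) that turns the epoch-millisecond timestamps into a duration:

```javascript
// Fields copied from the job log above; timestamps are epoch milliseconds.
const entry = { start: 1756806348037, end: 1756806348278, status: 'failure' };

// Hypothetical helper: elapsed time between job start and job end, in milliseconds.
function durationMs(logEntry) {
  return logEntry.end - logEntry.start;
}

console.log(new Date(entry.start).toISOString()); // 2025-09-02T09:45:48.037Z
console.log(durationMs(entry)); // 241
```

A run of roughly a quarter of a second suggests the worker failed right at startup, before any data was transferred, although the log alone does not say why.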
          
          • olivierlambert (Vates 🪐 Co-Founder CEO)

            Do you have enough memory in your XO VM? It's hard to say because you are running XO from the sources.

            Double check the resources available.
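
One quick way to see what the XO VM actually has is to ask the OS from Node.js itself. A minimal sketch using only the built-in `os` module (`toGiB` is a hypothetical helper, nothing from XO):

```javascript
const os = require('node:os');

// Hypothetical helper: convert a byte count to GiB, rounded to two decimals.
function toGiB(bytes) {
  return Math.round((bytes / 1024 ** 3) * 100) / 100;
}

// Total and currently free memory as reported by the operating system.
console.log(`total: ${toGiB(os.totalmem())} GiB, free: ${toGiB(os.freemem())} GiB`);
```

Run on the VM around the time the job is scheduled, this gives a rough idea of the headroom left; it does not show how much a backup worker itself would need.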

            • utopianfish @olivierlambert

              @olivierlambert 4 GB of memory on XO, in a Debian VM. The hosts have 64 GB each. Is it worth bumping it up, rebooting and trying again? As I say, it was working great for the last few days.

              • olivierlambert (Vates 🪐 Co-Founder CEO)

                4 GiB for XO might be enough, but it seems a worker is crashing or failing. I'm not sure I've seen that message before myself. I will ask around.
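
For reference, the wording of the message matches what Node.js reports when a child process ends: an exit code and a signal name. A minimal sketch, not XO's actual code (`describeExit` and `runChild` are hypothetical helpers), showing how the two cases differ:

```javascript
const { spawnSync } = require('node:child_process');

// Hypothetical helper: classify how a child process ended.
// In Node.js, a child that exits on its own has a numeric code and a null signal;
// a child terminated by a signal has a null code and the signal name.
function describeExit(code, signal) {
  if (signal !== null) return `killed by ${signal}`;
  return code === 0 ? 'exited cleanly' : `exited on its own with code ${code}`;
}

// Hypothetical helper: run a short Node.js script in a child process
// and report how that child ended.
function runChild(script) {
  const { status, signal } = spawnSync(process.execPath, ['-e', script]);
  return describeExit(status, signal);
}

console.log(runChild('process.exit(1)')); // exited on its own with code 1
// On Linux, a child that receives SIGKILL reports the signal instead of a code.
console.log(runChild("process.kill(process.pid, 'SIGKILL')"));
```

Read that way, `code 1 and signal null` suggests the worker exited by itself with an error rather than being killed from outside; a process killed by the kernel's out-of-memory killer would normally show a `SIGKILL` signal instead. The actual cause would still have to come from the worker's own output.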

                • utopianfish @utopianfish

                  I ran it through ChatGPT and got this:

                  "Perfect 👌 Now we have the real error.

                  The crash is because libfuse.so.2 is missing. XO (specifically the fuse-native module) depends on the FUSE library, but Debian 13 only ships libfuse3, not the older libfuse2 that XO wants."

                  I also bumped the memory up to 8 GB, and the replication job is now running.

                  • olivierlambert (Vates 🪐 Co-Founder CEO)

                    Libfuse is not used at all during replication, so I'm afraid ChatGPT is hallucinating.

                    • john.c

                      As well as hallucinating, ChatGPT can sometimes respond with results based on old versions of the code, because its training data is not updated for the current code base.

                      So if a development or experimental branch in the past contained such code and it was mistakenly released, then even though the mistake was later found and fixed, it can have found its way into the training data set.

                      It can also be an example of training data set poisoning, which leads the model to give misleading or mistaken responses through hallucination.
