XCP-ng

    Unable to enable High Availability - INTERNAL_ERROR(Not_found)

      jmannik

      Good Morning all,

      I'm running into an issue with my pool: it won't let me enable HA, and I can't figure out why. It starts enabling HA, then just stops. The entry below shows up in the task log.

      {
        "id": "0mgsn7vq8",
        "properties": {
          "method": "pool.enableHa",
          "params": {
            "pool": "213186d2-e3ba-154f-d371-4122388deb83",
            "heartbeatSrs": [
              "381caeb2-5ad9-8924-365d-4b130c67c064"
            ],
            "configuration": {}
          },
          "name": "API call: pool.enableHa",
          "userId": "71d48027-d471-4b01-83f9-830df4279f7e",
          "type": "api.call"
        },
        "start": 1760572179296,
        "status": "failure",
        "updatedAt": 1760572219231,
        "end": 1760572219230,
        "result": {
          "code": "INTERNAL_ERROR",
          "params": [
            "Not_found"
          ],
          "call": {
            "duration": 39934,
            "method": "pool.enable_ha",
            "params": [
              "* session id *",
              [
                "OpaqueRef:a83a416f-c97d-1ed8-c7fc-213af89b8f86"
              ],
              {}
            ]
          },
          "message": "INTERNAL_ERROR(Not_found)",
          "name": "XapiError",
          "stack": "XapiError: INTERNAL_ERROR(Not_found)\n    at Function.wrap (file:///opt/xen-orchestra/packages/xen-api/_XapiError.mjs:16:12)\n    at file:///opt/xen-orchestra/packages/xen-api/transports/json-rpc.mjs:38:21\n    at runNextTicks (node:internal/process/task_queues:65:5)\n    at processImmediate (node:internal/timers:453:9)\n    at process.callbackTrampoline (node:internal/async_hooks:130:17)"
        }
      }
      
        olivierlambert Vates 🪐 Co-Founder CEO

        Hi,

        This is weird indeed. Do you have a shared SR available and connected? Can you try enabling HA with the xe CLI directly from the host?

          jmannik @olivierlambert

          @olivierlambert

          [18:15 vmhost13 ~]# xe pool-ha-enable heartbeat-sr-uuids=381caeb2-5ad9-8924-365d-4b130c67c064
          The server failed to handle your request, due to an internal error. The given message may give details useful for debugging the problem.
          message: Not_found

            olivierlambert Vates 🪐 Co-Founder CEO

            That's weird. Ping @Team-XAPI-Network and maybe directly @psafont

              andriy.sultanov Vates 🪐 XAPI & Network Team @jmannik

              @jmannik Please upload your /var/log/xensource.log from the time of the error; without it, it's hard to see what went wrong.
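If the full log is too large to share, one way to pull out just the part around the error is sketched below. This is only a sketch: it assumes a POSIX shell with grep, `log_excerpt` is a hypothetical helper name (not an XCP-ng tool), and the default of 50 context lines is an arbitrary choice.

```shell
#!/bin/sh
# Hypothetical helper (not part of XCP-ng): print every log line matching
# PATTERN, with line numbers and some surrounding context lines.
# Usage: log_excerpt PATTERN [LOGFILE] [CONTEXT_LINES]
log_excerpt() {
    pattern=$1
    logfile=${2:-/var/log/xensource.log}   # log path named in this thread
    context=${3:-50}                       # assumed default, adjust as needed
    # -n prefixes line numbers; -C prints CONTEXT_LINES before and after each match
    grep -n -C "$context" -- "$pattern" "$logfile"
}
```

For example, `log_excerpt 'Not_found' /var/log/xensource.log 100 > /tmp/ha-excerpt.log` would write the matching lines plus 100 lines of context on each side to a file that can be attached to a post.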

                psafont Vates 🪐 XAPI & Network Team @jmannik

                @jmannik said in Unable to enable High Availability - INTERNAL_ERROR(Not_found):

                @olivierlambert

                [18:15 vmhost13 ~]# xe pool-ha-enable heartbeat-sr-uuids=381caeb2-5ad9-8924-365d-4b130c67c064
                The server failed to handle your request, due to an internal error. The given message may give details useful for debugging the problem.
                message: Not_found

                That message comes from an OCaml exception, Not_found. It's commonly raised by List.find and List.assoc, and in this case the exception wasn't caught.

                It's usually difficult to find out which call raised it, since these functions are used frequently and the exception may be caught (or not) in any caller of the function that uses them.

                Could you provide the xensource.log, as Andriy has asked? Otherwise I don't think we'll be able to find the exact cause.

                  jmannik @andriy.sultanov

                  @andriy.sultanov @psafont
                  https://drive.google.com/file/d/1aJyCYSAuRIBb0X-23gJ6ORtrHSciYH8a/view?usp=sharing
                  Here is the log file

                    psafont Vates 🪐 XAPI & Network Team @jmannik

                    The condition that causes the exception isn't crystal clear, but I can see an unprotected exception being raised in the host.ha_join_liveset path when it tries to retrieve the host UUID and doesn't find it. I'll investigate.

                      psafont Vates 🪐 XAPI & Network Team @jmannik

                      @jmannik I have a test build that you can try. It will hopefully provide better error messages by raising an internal error with a reason.

                      The code is based on the newest builds, so I recommend updating to the latest version of XCP-ng beforehand:

                      yum update
                      reboot
                      

                      Once that is done, the test packages can be installed by creating the file /etc/yum.repos.d/xcp-test.repo:

                      [xcp-ng-psafont1]
                      name=xcp-ng-psafont1
                      baseurl=https://koji.xcp-ng.org/repos/user/8/8.3/psafont1/x86_64/
                      enabled=0
                      gpgcheck=1
                      repo_gpgcheck=1
                      metadata_expire=0
                      gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-xcpng
                      

                      then updating the host using the test developer repo:

                      yum update --enablerepo=xcp-ng-psafont1
                      

                      and finally restarting all the daemons:

                      xe-toolstack-restart
                      

                      Note: the repository will only be available for a limited amount of time, after which I will repurpose it and delete the instructions so it's not used anymore by accident.

                        tjkreidl Ambassador

                        Note also that if HA is turned on or off, the host must be restarted for that change to take effect, if I recall correctly.

                          jmannik @tjkreidl

                          @tjkreidl That hasn't been my experience so far: enabling HA has simply enabled it, with no reboot needed.

                          @psafont I am patching all my hosts now and will install the test packages above on Sunday night (it is Friday afternoon at the time of this post).

                            nikade Top contributor @jmannik

                            Correct, no reboot needed to enable/disable HA.
