XCP-ng

    Cannot join cluster after node upgrade

    • VGerris

      I have a simple home lab of two machines. I have had several issues running a cluster, mostly because I could not join a node properly with XO.
      So I joined it via the command line and it went fine.

      Then I tested upgrading to the latest beta via USB. For the non-master host the installer offered an upgrade, which I ran. Afterwards the host shows as offline.
      When trying to join it I get :

      Host Failed to Join the Pool
      ("'NoneType' object has no attribute 'xenapi'",)

      In XO it shows offline, although I can connect to the console.
      Strangely, the local storage for that node shows a full partition:
      location: /dev/xapi/block (local, unshared, 3.7 GB).

      What could have gone wrong here?
      Thanks

      • Danp (Pro Support Team)

        Did you upgrade the pool master first?

        You may want to provide some further details about your situation to avoid confusion, for example:

        • What do you mean by cluster?
        • What was the exact error message received in XO?
        • etc.
          • VGerris @Danp

          @Danp Thank you for the quick reply!
          I upgraded the master first yes.

          • By cluster I mean two machines with shared NFS storage (HA enabled) that are joined in the same pool.
          • In XO, the node shows as offline. When I try to start it, XO reports that the host is offline. I will attach the XO log.

          I think the solution is probably just to reinstall the node, but I have seen this reported multiple times and wanted to understand it and, ideally, learn how to fix it.
          It looks to me like XAPI is somehow not working, and I do not understand the storage shown in XO: when I run mount on the host, it does not show up.

          Attachment: 2023-08-20T14_41_50.142Z - XO.log.txt

            • VGerris @VGerris

            [16:47 xcp-ng-2 ~]# xe task-list
            The coordinator reports that it cannot talk back to the supporter on the supplied management IP address.
            ip: 192.168.68.54

            That is on the command line.

              • Danp (Pro Support Team)

              IDK. Usually HA requires a minimum of 3 hosts. Hopefully someone else will respond with some ideas for you.

                • VGerris @Danp

                @Danp Sure, three hosts are recommended, but is this necessarily the result of a quorum issue? I basically wonder whether the host can be brought back into the cluster. The issue and the errors are unclear to me: I do not understand what the problem is or how to prevent it.
                A third host is on its way, but I see no reason why the same thing would not happen then. I thought I would keep this issue open to help trace bugs. I was unable to join the host successfully from XO, by the way, but I guess I should address that in another post? I suppose I am looking for operational troubleshooting knowledge.

                Thank you, then let's see who else knows something.

                  • VGerris @VGerris

                  I attached two more log files that were created around the time of the upgrade.
                  It looks like the USB was mounted and that this disturbs XAPI somehow?
                  I restarted that as well and it shows OK. Going to keep looking :).

                  Attachments: 2023-08-19T23_42_06.565Z - XO.log.txt, 2023-08-19T23_41_59.940Z - XO.log.txt

                    • VGerris @VGerris

                    I didn't find a solution; it looked like a change on the host that affected networking. Rerunning the installer from USB with the latest beta kept offering an upgrade.
                    A network reset did not fix it, and the machine was laggy, just like when I tried to join from XO (via settings).

                    I just did a complete reinstall, disabled HA on the NFS SR, joined the pool (which worked), and re-enabled HA.

                    Seems ok now, thanks for the help.
                    By the way, is there a way to join a node to the main pool from XO?
                    Thanks!
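
For readers who want to repeat the sequence from the last post (disable HA, join the reinstalled host, re-enable HA), here is a minimal sketch using the `xe` CLI. It is not taken from the thread: `MASTER_ADDR`, `MASTER_USER`, `MASTER_PASS` and `HEARTBEAT_SR_UUID` are placeholders, and the `run` wrapper with its `DRY_RUN` switch is a hypothetical helper added so the commands can be reviewed before anything is executed. Whether HA strictly has to be disabled first is not settled in the thread; the sketch simply follows the order the poster reported as working.

```shell
#!/bin/sh
# Sketch only: mirrors the order reported in the thread
# (disable HA -> pool-join -> re-enable HA). All variables are placeholders.

# Default to printing commands instead of executing them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Step 1 (on a host that is already a pool member): turn HA off.
disable_ha() {
    run xe pool-ha-disable
}

# Step 2 (on the freshly reinstalled host): join the existing pool.
join_pool() {
    run xe pool-join master-address="$MASTER_ADDR" \
        master-username="$MASTER_USER" master-password="$MASTER_PASS"
}

# Step 3 (on a pool member, once the new host is up): turn HA back on,
# using the shared SR as heartbeat SR.
enable_ha() {
    run xe pool-ha-enable heartbeat-sr-uuids="$HEARTBEAT_SR_UUID"
}
```

With `DRY_RUN=1` (the default here) each function only prints the `xe` command it would run, so the sequence can be checked against your own pool's addresses and SR UUID before setting `DRY_RUN=0` on the appropriate host.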
