XCP-ng

    XOSTOR hyperconvergence preview

    446 Posts 47 Posters 481.3k Views 48 Watching
    • B Offline
      BHellman 3rd party vendor
      last edited by

      I'm not sure what the expected behavior is but....

      I have xcp1, xcp2, and xcp3 as hosts in my XOSTOR pool, using an XOSTOR repository. I had a VM running on xcp2, unplugged the power from it, and left it unplugged for about 5 minutes. The VM remained "running" according to XOA, but in reality it wasn't.

      What is the expected behavior when this happens and how do you go about recovering from a temporarily failed/powered off node?

      My expectation was that my VM would move to xcp1 (where there is a replica) and start, then mark xcp2 as outdated. I have "auto start" enabled under Advanced on the VM.

      L 1 Reply Last reply Reply Quote 0
      • L Offline
        limezest @BHellman
        last edited by

        @BHellman
        "auto start" means that when you power up the cluster or host node that VM will be automatically started.

        I think you're describing high availability, which needs to be enabled at the cluster level. Then you need to define an HA policy for the VM.

        ronan-aR 1 Reply Last reply Reply Quote 1
        • ronan-aR Offline
          ronan-a Vates 🪐 XCP-ng Team @limezest
          last edited by

          @limezest Exactly. The auto-start feature is only checked during host boot.

          @BHellman To automatically restart a VM in case of failure:

          xe vm-param-set uuid=<VM_UUID> ha-restart-priority=restart order=1 
          xe pool-ha-enable heartbeat-sr-uuids=<SR_UUID> 
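
          If you need them, the UUIDs can be looked up first (a quick sketch; <VM_NAME> and <SR_NAME> are just placeholders for your actual labels):

          xe vm-list name-label=<VM_NAME> params=uuid 
          xe sr-list name-label=<SR_NAME> params=uuid 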
          
          B 1 Reply Last reply Reply Quote 0
          • B Offline
            BHellman 3rd party vendor @ronan-a
            last edited by

            @ronan-a @limezest

            Thank you for the replies 🙂

            Sorry for all the newb questions - I'm diving into this when time permits. Appreciate the help and understanding.

            1 Reply Last reply Reply Quote 1
            • B Offline
              BHellman 3rd party vendor
              last edited by

              I ran those commands on xcp1 (the pool master), against the XOSTOR (LINSTOR) SR, and then powered off xcp2. At that point the pool disappeared.

              Now I'm getting the following on the xcp servers console:

              Broadcast message from systemd-journald@xcp3 (Thu 2024-02-08 14:03:12 EST):
              
              xapi-nbd[5580]: main: Failed to log in via xapi's Unix domain socket in 300.000000 seconds
              
              
              Broadcast message from systemd-journald@xcp3 (Thu 2024-02-08 14:03:12 EST):
              
              xapi-nbd[5580]: main: Caught unexpected exception: (Failure
              
              
              Broadcast message from systemd-journald@xcp3 (Thu 2024-02-08 14:03:12 EST):
              
              xapi-nbd[5580]: main:   "Failed to log in via xapi's Unix domain socket in 300.000000 seconds")
              
              

              After powering xcp2 back up, the pool never comes back in the XOA interface.

              I'm seeing this on xcp1:

              [14:04 xcp1 ~]# drbdadm status
              xcp-persistent-database role:Secondary
                disk:Diskless quorum:no
                xcp2 connection:Connecting
                xcp3 connection:Connecting
              
              

              On xcp2 and xcp3:

              [14:10 xcp2 ~]# drbdadm status
              # No currently configured DRBD found.
              

              Seems like I hosed this thing up really good. I assume this broke because XOSTOR isn't a shared disk technically.

              [14:15 xcp1 /]# xe sr-list
              The server could not join the liveset because the HA daemon could not access the heartbeat disk.
              

              Is HA + XOSTOR something that should work?
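
              An aside, since I haven't tried it in this exact state: if HA itself is what keeps the pool down, xe has a pool-wide command to disable it when xapi is responsive, plus an emergency variant meant for recovering a pool with a broken HA setup, run locally on each host:

              xe pool-ha-disable
              xe host-emergency-ha-disable --force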

              M olivierlambertO 2 Replies Last reply Reply Quote 0
              • J Offline
                Jonathon
                last edited by Jonathon

                Hello!

                I am attempting to update our hosts, starting with the pool controller. But I am getting a message that I wanted to ask about.

                The following happens when I attempt a yum update

                --> Processing Dependency: sm-linstor for package: xcp-ng-linstor-1.1-3.xcpng8.2.noarch
                --> Finished Dependency Resolution
                Error: Package: xcp-ng-linstor-1.1-3.xcpng8.2.noarch (xcp-ng-updates)
                           Requires: sm-linstor
                You could try using --skip-broken to work around the problem
                           You could try running: rpm -Va --nofiles --nodigest
                

                The only reference I'm finding is here: https://koji.xcp-ng.org/buildinfo?buildID=3044
                My best guess is that I need to do two updates, the first one with --skip-broken, but I wanted to ask to be sure so as not to put things in a weird state.

                Thanks in advance!

                stormiS 2 Replies Last reply Reply Quote 0
                • M Offline
                  Midget @BHellman
                  last edited by

                  @BHellman I have the EXACT same errors and scrolling logs now. I made a thread here...

                  1 Reply Last reply Reply Quote 0
                  • olivierlambertO Offline
                    olivierlambert Vates 🪐 Co-Founder CEO @BHellman
                    last edited by

                    @BHellman Yes it should. @ronan-a will take a look around when he can 🙂

                    1 Reply Last reply Reply Quote 0
                    • stormiS Offline
                      stormi Vates 🪐 XCP-ng Team @Jonathon
                      last edited by

                      @Jonathon Never use --skip-broken.

                      1 Reply Last reply Reply Quote 0
                      • stormiS Offline
                        stormi Vates 🪐 XCP-ng Team @Jonathon
                        last edited by

                        @Jonathon What's the output of yum repolist?

                        J 1 Reply Last reply Reply Quote 0
                        • J Offline
                          Jonathon @stormi
                          last edited by

                          @stormi said in XOSTOR hyperconvergence preview:

                          yum repolist

                          lol glad I checked then

                          # yum repolist
                          Loaded plugins: fastestmirror
                          Loading mirror speeds from cached hostfile
                          Excluding mirror: updates.xcp-ng.org
                           * xcp-ng-base: mirrors.xcp-ng.org
                          Excluding mirror: updates.xcp-ng.org
                           * xcp-ng-linstor: mirrors.xcp-ng.org
                          Excluding mirror: updates.xcp-ng.org
                           * xcp-ng-updates: mirrors.xcp-ng.org
                          repo id                         repo name                                            status
                          !xcp-ng-base                    XCP-ng Base Repository                                 2,161
                          !xcp-ng-linstor                 XCP-ng LINSTOR Repository                                142
                          !xcp-ng-updates                 XCP-ng Updates Repository                              1,408
                          !zabbix/x86_64                  Zabbix Official Repository - x86_64                       79
                          !zabbix-non-supported/x86_64    Zabbix Official Repository non-supported - x86_64          6
                          repolist: 3,796
                          
                          J 1 Reply Last reply Reply Quote 0
                          • V Offline
                            vaewyn
                            last edited by

                            Are there any rough estimates for a timeline on paid support being available? We're looking at ditching VMware, and my company requires professional support availability. For virtualization I can see it's available, but I also need storage support that is at least mostly at parity with the vSAN I have. Thanks to you all! Love these projects!

                            1 Reply Last reply Reply Quote 0
                            • olivierlambertO Offline
                              olivierlambert Vates 🪐 Co-Founder CEO
                              last edited by

                              We are working at full speed to get it available ASAP. There are still some bugs to fix, and LINBIT is working on them.

                              1 Reply Last reply Reply Quote 1
                              • V Offline
                                vaewyn
                                last edited by

                                With the integration you are doing, is there a provision to designate racks/sites/datacenters/etc. so that, at some level, replicas can be kept off hosts in the same physical risk space(s)?

                                1 Reply Last reply Reply Quote 0
                                • olivierlambertO Offline
                                  olivierlambert Vates 🪐 Co-Founder CEO
                                  last edited by

                                  XOSTOR works at the pool level. You can have all your hosts in the pool, or only some of them participating in the HCI (e.g. 4 hosts with disks used for HCI and the others just consuming it). Obviously, it means the hosts without disks will have to read and write "remotely" on the hosts with the disks. But it might be perfectly acceptable 🙂
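
                                  To make that concrete, SR creation looks roughly like this (a sketch only; the group name, redundancy value and provisioning mode below are illustrative and should match how the disks were prepared on your hosts):

                                  xe sr-create type=linstor name-label=XOSTOR host-uuid=<MASTER_UUID> shared=true \
                                      device-config:group-name=linstor_group/thin_device device-config:redundancy=3 device-config:provisioning=thin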

                                  V 1 Reply Last reply Reply Quote 0
                                  • V Offline
                                    vaewyn @olivierlambert
                                    last edited by

                                    @olivierlambert I've understood that part... what I am wondering is: if I have 3 hosts in one data center and 3 hosts in another, and I have asked for a redundancy of 3 copies, is there a way to ensure all three copies are never in the same data center at the same time?

                                    1 Reply Last reply Reply Quote 1
                                    • olivierlambertO Offline
                                      olivierlambert Vates 🪐 Co-Founder CEO
                                      last edited by

                                      So I imagine very low latency between the 2 DCs? One pool with 6 hosts total, 3 per DC, right?

                                      For now, there's no placement preference; we need to discuss topology with LINBIT.

                                      And if the 2x DCs are far from each other, I would advise getting 2x pools and using 2x XOSTOR in total.

                                      V B 2 Replies Last reply Reply Quote 0
                                      • V Offline
                                        vaewyn @olivierlambert
                                        last edited by

                                        @olivierlambert Correct... these DCs are across a campus on private fiber, so single-digit milliseconds worst case. We've historically had VMware keep 3 data copies and make sure at least one is in a separate DC... that way, when a DC is lost, the HA VMs can restart on the remaining host pool successfully because their storage is still available.

                                        1 Reply Last reply Reply Quote 0
                                        • olivierlambertO Offline
                                          olivierlambert Vates 🪐 Co-Founder CEO
                                          last edited by

                                          So you can create a pool across the 2x DCs, no problem. We'll take a deeper look at how to tell it where to replicate, to avoid having everything in the same place.

                                          1 Reply Last reply Reply Quote 0
                                          • B Offline
                                            BHellman 3rd party vendor @olivierlambert
                                            last edited by

                                            @olivierlambert said in XOSTOR hyperconvergence preview:

                                             So I imagine very low latency between the 2 DCs? One pool with 6 hosts total, 3 per DC, right?

                                             For now, there's no placement preference; we need to discuss topology with LINBIT.

                                             And if the 2x DCs are far from each other, I would advise getting 2x pools and using 2x XOSTOR in total.

                                             This can be done using placement policies, as outlined in the LINSTOR User's Guide. It will probably require a bit of extra work on XO to use those properties.
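
                                             Roughly, the approach sketched in the guide is to tag each node with an auxiliary property describing its location, then ask the resource group to spread replicas across different values of that property (the node names and the "site" key here are made up for illustration; <RESOURCE_GROUP> is whatever group backs the SR):

                                             linstor node set-property node-a1 Aux/site dc-a
                                             linstor node set-property node-b1 Aux/site dc-b
                                             linstor resource-group modify <RESOURCE_GROUP> --replicas-on-different site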

                                            1 Reply Last reply Reply Quote 0