    XOSTOR hyperconvergence preview

    • olivierlambert Vates 🪐 Co-Founder CEO

      Pinging @ronan-a

      • ronan-a Vates 🪐 XCP-ng Team @dumarjo

        @dumarjo Regarding your error during the attach call, could you send me the SMlog please?

        • dumarjo @ronan-a

          @ronan-a
          Here is some more info:

          (two screenshots attached)

          Mar 10 17:05:22 xcp-ng-04 SM: [28002] lock: opening lock file /var/lock/sm/bef191f3-e976-94ec-6bb7-d87529a72dbb/sr
          Mar 10 17:05:22 xcp-ng-04 SM: [28002] lock: acquired /var/lock/sm/bef191f3-e976-94ec-6bb7-d87529a72dbb/sr
          Mar 10 17:05:22 xcp-ng-04 SM: [28002] sr_attach {'sr_uuid': 'bef191f3-e976-94ec-6bb7-d87529a72dbb', 'subtask_of': 'DummyRef:|79eb31a6-806c-4883-8e8d-de59cde66469|SR.at$
          Mar 10 17:05:22 xcp-ng-04 SMGC: [28002] === SR bef191f3-e976-94ec-6bb7-d87529a72dbb: abort ===
          Mar 10 17:05:22 xcp-ng-04 SM: [28002] lock: opening lock file /var/lock/sm/bef191f3-e976-94ec-6bb7-d87529a72dbb/running
          Mar 10 17:05:22 xcp-ng-04 SM: [28002] lock: opening lock file /var/lock/sm/bef191f3-e976-94ec-6bb7-d87529a72dbb/gc_active
          Mar 10 17:05:22 xcp-ng-04 SM: [28002] lock: tried lock /var/lock/sm/bef191f3-e976-94ec-6bb7-d87529a72dbb/gc_active, acquired: True (exists: True)
          Mar 10 17:05:22 xcp-ng-04 SMGC: [28002] abort: releasing the process lock
          Mar 10 17:05:22 xcp-ng-04 SM: [28002] lock: released /var/lock/sm/bef191f3-e976-94ec-6bb7-d87529a72dbb/gc_active
          Mar 10 17:05:22 xcp-ng-04 SM: [28002] lock: acquired /var/lock/sm/bef191f3-e976-94ec-6bb7-d87529a72dbb/running
          Mar 10 17:05:22 xcp-ng-04 SM: [28002] RESET for SR bef191f3-e976-94ec-6bb7-d87529a72dbb (master: False)
          Mar 10 17:05:22 xcp-ng-04 SM: [28002] lock: released /var/lock/sm/bef191f3-e976-94ec-6bb7-d87529a72dbb/running
          Mar 10 17:05:23 xcp-ng-04 SM: [28002] Got exception: Error: Unable to connect to any of the given controller hosts: ['linstor://xcp-ng-02']. Retry number: 0
          Mar 10 17:05:27 xcp-ng-04 SM: [28002] Got exception: Error: Unable to connect to any of the given controller hosts: ['linstor://xcp-ng-02']. Retry number: 1
          Mar 10 17:05:30 xcp-ng-04 SM: [28002] Got exception: Error: Unable to connect to any of the given controller hosts: ['linstor://xcp-ng-02']. Retry number: 2
          Mar 10 17:05:33 xcp-ng-04 SM: [28002] Got exception: Error: Unable to connect to any of the given controller hosts: ['linstor://xcp-ng-02']. Retry number: 3
          Mar 10 17:05:37 xcp-ng-04 SM: [28002] Got exception: Error: Unable to connect to any of the given controller hosts: ['linstor://xcp-ng-02']. Retry number: 4
          Mar 10 17:05:40 xcp-ng-04 SM: [28002] Got exception: Error: Unable to connect to any of the given controller hosts: ['linstor://xcp-ng-02']. Retry number: 5
          Mar 10 17:05:43 xcp-ng-04 SM: [28002] Got exception: Error: Unable to connect to any of the given controller hosts: ['linstor://xcp-ng-02']. Retry number: 6
          Mar 10 17:05:47 xcp-ng-04 SM: [28002] Got exception: Error: Unable to connect to any of the given controller hosts: ['linstor://xcp-ng-02']. Retry number: 7
          Mar 10 17:05:50 xcp-ng-04 SM: [28002] Got exception: Error: Unable to connect to any of the given controller hosts: ['linstor://xcp-ng-02']. Retry number: 8
          Mar 10 17:05:53 xcp-ng-04 SM: [28002] Got exception: Error: Unable to connect to any of the given controller hosts: ['linstor://xcp-ng-02']. Retry number: 9
          Mar 10 17:05:54 xcp-ng-04 SM: [28002] Raising exception [47, The SR is not available [opterr=Error: Unable to connect to any of the given controller hosts: ['linstor:/$
          Mar 10 17:05:54 xcp-ng-04 SM: [28002] lock: released /var/lock/sm/bef191f3-e976-94ec-6bb7-d87529a72dbb/sr
          Mar 10 17:05:54 xcp-ng-04 SM: [28002] ***** generic exception: sr_attach: EXCEPTION <class 'SR.SROSError'>, The SR is not available [opterr=Error: Unable to connect to$
          Mar 10 17:05:54 xcp-ng-04 SM: [28002]   File "/opt/xensource/sm/SRCommand.py", line 110, in run
          Mar 10 17:05:54 xcp-ng-04 SM: [28002]     return self._run_locked(sr)
          Mar 10 17:05:54 xcp-ng-04 SM: [28002]   File "/opt/xensource/sm/SRCommand.py", line 159, in _run_locked
          Mar 10 17:05:54 xcp-ng-04 SM: [28002]     rv = self._run(sr, target)
          Mar 10 17:05:54 xcp-ng-04 SM: [28002]   File "/opt/xensource/sm/SRCommand.py", line 352, in _run
          Mar 10 17:05:54 xcp-ng-04 SM: [28002]     return sr.attach(sr_uuid)
          Mar 10 17:05:54 xcp-ng-04 SM: [28002]   File "/opt/xensource/sm/LinstorSR", line 489, in wrap
          Mar 10 17:05:54 xcp-ng-04 SM: [28002]     return load(self, *args, **kwargs)
          Mar 10 17:05:54 xcp-ng-04 SM: [28002]   File "/opt/xensource/sm/LinstorSR", line 415, in load
          Mar 10 17:05:54 xcp-ng-04 SM: [28002]     raise xs_errors.XenError('SRUnavailable', opterr=str(e))
          Mar 10 17:05:54 xcp-ng-04 SM: [28002]
          Mar 10 17:05:54 xcp-ng-04 SM: [28002] ***** LINSTOR resources on XCP-ng: EXCEPTION <class 'SR.SROSError'>, The SR is not available [opterr=Error: Unable to connect to $
          Mar 10 17:05:54 xcp-ng-04 SM: [28002]   File "/opt/xensource/sm/SRCommand.py", line 378, in run
          Mar 10 17:05:54 xcp-ng-04 SM: [28002]     ret = cmd.run(sr)
          Mar 10 17:05:54 xcp-ng-04 SM: [28002]   File "/opt/xensource/sm/SRCommand.py", line 110, in run
          Mar 10 17:05:54 xcp-ng-04 SM: [28002]     return self._run_locked(sr)
          Mar 10 17:05:54 xcp-ng-04 SM: [28002]   File "/opt/xensource/sm/SRCommand.py", line 159, in _run_locked
          Mar 10 17:05:54 xcp-ng-04 SM: [28002]     rv = self._run(sr, target)
          Mar 10 17:05:54 xcp-ng-04 SM: [28002]   File "/opt/xensource/sm/SRCommand.py", line 352, in _run
          Mar 10 17:05:54 xcp-ng-04 SM: [28002]     return sr.attach(sr_uuid)
          Mar 10 17:05:54 xcp-ng-04 SM: [28002]   File "/opt/xensource/sm/LinstorSR", line 489, in wrap
          Mar 10 17:05:54 xcp-ng-04 SM: [28002]     return load(self, *args, **kwargs)
          Mar 10 17:05:54 xcp-ng-04 SM: [28002]   File "/opt/xensource/sm/LinstorSR", line 415, in load
          Mar 10 17:05:54 xcp-ng-04 SM: [28002]     raise xs_errors.XenError('SRUnavailable', opterr=str(e))
          Mar 10 17:05:54 xcp-ng-04 SM: [28002]
          

          This is what I have in the log file. If you need more info, let me know.

          • Huxy

            Hi.

             I have a couple of questions I'm hoping you might be able to answer. I'm currently using Linstor with an encrypted, compressed ZFS zpool. This solves a lot of problems for me, including high availability, data redundancy and disk-wide encryption. Looking at the install script provided, it simply installs the software and sets up the LVM group.

            Does this mean the sr-create command is responsible for setting up the Linstor controllers and their respective storage pools? If so, presumably at this point it only supports LVM as the storage-pool creation command will specify LVM?

             If it's not possible at this time to use ZFS backing devices with the SR, are there any plans to do so in the future? As the Linstor controllers are responsible for managing the storage pools, I would imagine it wouldn't be too difficult to achieve. Maybe it's even as simple as changing the creation command to zfs, i.e. linstor storage-pool create zfs node data srtank?

            Cheers.

            • ronan-a Vates 🪐 XCP-ng Team @dumarjo

               @dumarjo Well, the new node (xcp-ng-04) is missing in the LINSTOR database. You must add it. 😉
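
               (A minimal sketch of what adding the node could look like, run from the current controller host; the new host's address below is an assumption, chosen by analogy with the addresses shown later in this thread:)

               linstor node create --node-type combined xcp-ng-04 192.168.2.224   # register the new satellite (assumed IP)
               linstor node list                                                  # the node should appear and go Online once reachable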

              • ronan-a Vates 🪐 XCP-ng Team @Huxy

                 @Huxy Yeah, for the moment we support only thin and thick LVM. We also want to support ZFS for several features: ZFS snapshots and lifting the restriction of the VHD format (maximum 2TB of data). Unfortunately we don't have a schedule for this yet. This will probably be done in SMAPIv3, not in the v1.

                • Huxy @ronan-a

                   @ronan-a no worries. Thanks for the quick response. I'll consider moving to LVM or staying with Proxmox for the time being. 🤔

                  • dumarjo @ronan-a

                    @ronan-a

                     Is my understanding correct? An XCP-ng host cannot use a shared SR (based on LINSTOR) if it's not part of the LINSTOR nodes?

                     How can this new host become part of the nodes if I don't want to add an HDD/SSD to it? Can it be done?

                     Again, I want to know the limitations before considering this promising new technology!

                    • ronan-a Vates 🪐 XCP-ng Team @dumarjo

                      @dumarjo

                       An XCP-ng host cannot use a shared SR (based on LINSTOR) if it's not part of the LINSTOR nodes?

                       Yeah, that's required in order to use diskless devices.

                       How can this new host become part of the nodes if I don't want to add an HDD/SSD to it? Can it be done?

                       Only small changes are needed to support a node without storage, so it's feasible with a modification in the driver. It should be supported, because diskless volumes are a DRBD feature for accessing data over the local network. 🙂
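
                       (Illustrative sketch only: once the driver supports it, a storage-less host would presumably still be registered as a LINSTOR node and then receive diskless resources, roughly like this; the node name, address and the --diskless flag are assumptions here:)

                       linstor node create --node-type combined xcp-ng-04 192.168.2.224         # register the host, no storage-pool needed
                       linstor resource create xcp-ng-04 xcp-persistent-database --diskless     # DRBD then serves the data over the network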

                      • dumarjo @ronan-a

                        @ronan-a
                        Hi,

                         I did some experimenting... and I can't find the missing piece of the puzzle to add my xcp-ng-04 host.

                         Since my last status, my new xcp-ng-04 host is part of the pool and I have installed all the LINSTOR tools.

                         I checked the satellite and controller services on xcp-ng-04 and they are not running. I have no idea whether I need to start something manually or not.
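
                         (For reference, a minimal sketch of how those services could be checked and started by hand, assuming the same unit names used later in this thread:)

                         systemctl status linstor-satellite.service minidrbdcluster.service
                         systemctl enable linstor-satellite.service minidrbdcluster.service
                         systemctl start linstor-satellite.service minidrbdcluster.service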

                         Here is my SMlog on a freshly booted xcp-ng-04:

                        Mar 22 11:54:14 xcp-ng-04 SM: [2685] sr_attach {'sr_uuid': '38c2baa3-bc8f-fbc5-ef5a-42461db92c51', 'subtask_of': 'DummyRef:|b4ea936f-80bb-4a6b-98b9-66a95f86b006|SR.attach', 'args': [],$
                        Mar 22 11:54:14 xcp-ng-04 SMGC: [2685] === SR 38c2baa3-bc8f-fbc5-ef5a-42461db92c51: abort ===
                        Mar 22 11:54:14 xcp-ng-04 SM: [2685] lock: opening lock file /var/lock/sm/38c2baa3-bc8f-fbc5-ef5a-42461db92c51/running
                        Mar 22 11:54:14 xcp-ng-04 SM: [2685] lock: opening lock file /var/lock/sm/38c2baa3-bc8f-fbc5-ef5a-42461db92c51/gc_active
                        Mar 22 11:54:14 xcp-ng-04 SM: [2685] lock: tried lock /var/lock/sm/38c2baa3-bc8f-fbc5-ef5a-42461db92c51/gc_active, acquired: True (exists: True)
                        Mar 22 11:54:14 xcp-ng-04 SMGC: [2685] abort: releasing the process lock
                        Mar 22 11:54:14 xcp-ng-04 SM: [2685] lock: released /var/lock/sm/38c2baa3-bc8f-fbc5-ef5a-42461db92c51/gc_active
                        Mar 22 11:54:14 xcp-ng-04 SM: [2685] lock: opening lock file /var/lock/sm/38c2baa3-bc8f-fbc5-ef5a-42461db92c51/sr
                        Mar 22 11:54:14 xcp-ng-04 SM: [2685] lock: acquired /var/lock/sm/38c2baa3-bc8f-fbc5-ef5a-42461db92c51/running
                        Mar 22 11:54:14 xcp-ng-04 SM: [2685] lock: acquired /var/lock/sm/38c2baa3-bc8f-fbc5-ef5a-42461db92c51/sr
                        Mar 22 11:54:14 xcp-ng-04 SM: [2685] RESET for SR 38c2baa3-bc8f-fbc5-ef5a-42461db92c51 (master: True)
                        Mar 22 11:54:14 xcp-ng-04 SM: [2685] lock: released /var/lock/sm/38c2baa3-bc8f-fbc5-ef5a-42461db92c51/sr
                        Mar 22 11:54:14 xcp-ng-04 SM: [2685] lock: released /var/lock/sm/38c2baa3-bc8f-fbc5-ef5a-42461db92c51/running
                        Mar 22 11:54:14 xcp-ng-04 SM: [2685] set_dirty 'OpaqueRef:ea31ae92-4207-4c8b-8db6-da901c6a00a8' succeeded
                        Mar 22 11:54:14 xcp-ng-04 SM: [2709] sr_update {'sr_uuid': '38c2baa3-bc8f-fbc5-ef5a-42461db92c51', 'subtask_of': 'DummyRef:|ae4505d5-b9fe-4f39-92dc-67ab9d0b579b|SR.stat', 'args': [], '$
                        Mar 22 11:54:15 xcp-ng-04 SM: [2725] lock: opening lock file /var/lock/sm/bef191f3-e976-94ec-6bb7-d87529a72dbb/sr
                        Mar 22 11:54:15 xcp-ng-04 SM: [2725] lock: acquired /var/lock/sm/bef191f3-e976-94ec-6bb7-d87529a72dbb/sr
                        Mar 22 11:54:15 xcp-ng-04 SM: [2725] sr_attach {'sr_uuid': 'bef191f3-e976-94ec-6bb7-d87529a72dbb', 'subtask_of': 'DummyRef:|ae624d96-e91a-4a12-afd6-35593be4ce51|SR.attach', 'args': [],$
                        Mar 22 11:54:15 xcp-ng-04 SMGC: [2725] === SR bef191f3-e976-94ec-6bb7-d87529a72dbb: abort ===
                        Mar 22 11:54:15 xcp-ng-04 SM: [2725] lock: opening lock file /var/lock/sm/bef191f3-e976-94ec-6bb7-d87529a72dbb/running
                        Mar 22 11:54:15 xcp-ng-04 SM: [2725] lock: opening lock file /var/lock/sm/bef191f3-e976-94ec-6bb7-d87529a72dbb/gc_active
                        Mar 22 11:54:15 xcp-ng-04 SM: [2725] lock: tried lock /var/lock/sm/bef191f3-e976-94ec-6bb7-d87529a72dbb/gc_active, acquired: True (exists: True)
                        Mar 22 11:54:15 xcp-ng-04 SMGC: [2725] abort: releasing the process lock
                        Mar 22 11:54:15 xcp-ng-04 SM: [2725] lock: released /var/lock/sm/bef191f3-e976-94ec-6bb7-d87529a72dbb/gc_active
                        Mar 22 11:54:15 xcp-ng-04 SM: [2725] lock: acquired /var/lock/sm/bef191f3-e976-94ec-6bb7-d87529a72dbb/running
                        Mar 22 11:54:15 xcp-ng-04 SM: [2725] RESET for SR bef191f3-e976-94ec-6bb7-d87529a72dbb (master: False)
                        Mar 22 11:54:15 xcp-ng-04 SM: [2725] lock: released /var/lock/sm/bef191f3-e976-94ec-6bb7-d87529a72dbb/running
                        Mar 22 11:54:16 xcp-ng-04 SM: [2725] Got exception: Error: Unable to connect to any of the given controller hosts: ['linstor://xcp-ng-02']. Retry number: 0
                        Mar 22 11:54:19 xcp-ng-04 SM: [2725] Got exception: Error: Unable to connect to any of the given controller hosts: ['linstor://xcp-ng-02']. Retry number: 1
                        
                        

                         On xcp-ng-02 (the LINSTOR controller):

                        [11:42 xcp-ng-02 ~]# linstor node list
                         ╭────────────────────────────────────────────────────────────╮
                         ┊ Node      ┊ NodeType ┊ Addresses                  ┊ State  ┊
                         ╞════════════════════════════════════════════════════════════╡
                         ┊ xcp-ng-01 ┊ COMBINED ┊ 192.168.2.221:3366 (PLAIN) ┊ Online ┊
                         ┊ xcp-ng-02 ┊ COMBINED ┊ 192.168.2.222:3366 (PLAIN) ┊ Online ┊
                         ┊ xcp-ng-03 ┊ COMBINED ┊ 192.168.2.223:3366 (PLAIN) ┊ Online ┊
                         ╰────────────────────────────────────────────────────────────╯
                        
                        

                         I listed all the LINSTOR PBDs on the hosts:

                        [11:49 xcp-ng-01 ~]# xe pbd-list | grep linstor -3
                        uuid ( RO)                  : e75fdc51-29a4-aa57-bc44-459f80a0d230
                                     host-uuid ( RO): ad95c6ca-612d-42af-8909-d4e9dc7645bb
                                       sr-uuid ( RO): bef191f3-e976-94ec-6bb7-d87529a72dbb
                                 device-config (MRO): provisioning: thin; redundancy: 2; group-name: linstor_group/thin_device; hosts: xcp-ng-01,xcp-ng-02,xcp-ng-03
                            currently-attached ( RO): true
                        
                        
                        --
                        uuid ( RO)                  : 063db650-55b1-a3e0-9d9a-e94ce938988d
                                     host-uuid ( RO): 5747f145-0dc2-4987-a6b9-b6c5a7ed0505
                                       sr-uuid ( RO): bef191f3-e976-94ec-6bb7-d87529a72dbb
                                 device-config (MRO): hosts: xcp-ng-01,xcp-ng-02,xcp-ng-03; group-name: linstor_group/thin_device; redundancy: 2; provisioning: thin
                            currently-attached ( RO): false
                        
                        
                        --
                        uuid ( RO)                  : a1f876b1-0568-71ac-9ffb-720e626cb4ab
                                     host-uuid ( RO): e286a04a-69bf-4d59-a0c8-e7338e8c1831
                                       sr-uuid ( RO): bef191f3-e976-94ec-6bb7-d87529a72dbb
                                 device-config (MRO): provisioning: thin; redundancy: 2; group-name: linstor_group/thin_device; hosts: xcp-ng-01,xcp-ng-02,xcp-ng-03
                            currently-attached ( RO): true
                        
                        
                        uuid ( RO)                  : 08564ab5-a518-f709-8527-f592c2592d14
                                     host-uuid ( RO): eb48f91d-9916-4542-9cf4-4a718abdc451
                                       sr-uuid ( RO): bef191f3-e976-94ec-6bb7-d87529a72dbb
                                 device-config (MRO): provisioning: thin; redundancy: 2; group-name: linstor_group/thin_device; hosts: xcp-ng-01,xcp-ng-02,xcp-ng-03
                            currently-attached ( RO): true
                        

                         After that, I removed the PBDs from all hosts so I could recreate them on every host, including the new xcp-ng-04:

                        xe pbd-create host-uuid=e286a04a-69bf-4d59-a0c8-e7338e8c1831 sr-uuid=bef191f3-e976-94ec-6bb7-d87529a72dbb device-config:provisioning=thin device-config:redundancy=2 device-config:group-name=linstor_group/thin_device device-config:hosts=xcp-ng-01,xcp-ng-02,xcp-ng-03,xcp-ng-04
                        xe pbd-create host-uuid=ad95c6ca-612d-42af-8909-d4e9dc7645bb sr-uuid=bef191f3-e976-94ec-6bb7-d87529a72dbb device-config:provisioning=thin device-config:redundancy=2 device-config:group-name=linstor_group/thin_device device-config:hosts=xcp-ng-01,xcp-ng-02,xcp-ng-03,xcp-ng-04
                        xe pbd-create host-uuid=eb48f91d-9916-4542-9cf4-4a718abdc451 sr-uuid=bef191f3-e976-94ec-6bb7-d87529a72dbb device-config:provisioning=thin device-config:redundancy=2 device-config:group-name=linstor_group/thin_device device-config:hosts=xcp-ng-01,xcp-ng-02,xcp-ng-03,xcp-ng-04
                        xe pbd-create host-uuid=5747f145-0dc2-4987-a6b9-b6c5a7ed0505 sr-uuid=bef191f3-e976-94ec-6bb7-d87529a72dbb device-config:provisioning=thin device-config:redundancy=2 device-config:group-name=linstor_group/thin_device device-config:hosts=xcp-ng-01,xcp-ng-02,xcp-ng-03,xcp-ng-04
                        

                         All succeeded, with no errors.

                         After that, I tried to plug all the PBDs into the SR:

                        [11:13 xcp-ng-01 ~]# xe pbd-plug uuid=7d588c37-a152-9666-175e-91b2d48c150f
                        [11:13 xcp-ng-01 ~]# xe pbd-plug uuid=99f76235-1b1a-e5fa-bb19-3883737fcc6d
                        [11:13 xcp-ng-01 ~]# xe pbd-plug uuid=df727345-f475-b929-ecc1-b506f0053361
                        [11:13 xcp-ng-01 ~]# xe pbd-plug uuid=8b4ddc6f-e25a-1942-a435-345ccc93551a
                        Error code: SR_BACKEND_FAILURE_47
                        Error parameters: , The SR is not available [opterr=Error: Unable to connect to any of the given controller hosts: ['linstor://xcp-ng-02']],
                        

                        From there, I'm a bit lost...

                         1- Do I need to add the LINSTOR satellite xcp-ng-04 before creating all the PBDs?
                         2- Should I start any of the services on xcp-ng-04 before doing all this?

                        Regards

                          • dumarjo @dumarjo

                          Any input appreciated

                          Regards

                            • olivierlambert Vates 🪐 Co-Founder CEO

                            Question for @ronan-a

                              • ronan-a Vates 🪐 XCP-ng Team @dumarjo

                               @dumarjo Could you open a ticket with a tunnel please? I can take a look. Also: I started a script this week to simplify the management of LINSTOR with add/remove commands. 🙂

                                • dumarjo @ronan-a

                                Hi,
                                @ronan-a said in XOSTOR hyperconvergence preview:

                                 @dumarjo Could you open a ticket with a tunnel please? I can take a look. Also: I started a script this week to simplify the management of LINSTOR with add/remove commands. 🙂

                                Ticket done on vates.

                                  • ronan-a Vates 🪐 XCP-ng Team @dumarjo

                                  So after analysis:

                                   • The iptables rules of the new host must be updated (in the case of an add/remove action) so that the new node can connect to the LINSTOR controller (the ports involved are sketched below). Otherwise we can get an "Auto-evict" error in the database.
                                   • PBDs must/can be updated after the creation of the new node.

                                   Thank you for the feedback, it's helpful to see the potential issues before automating LINSTOR management. 🙂
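
                                   (Sketch of the first point, reusing the firewall-port helper shown further down in this thread; the exact port list is copied from that later post and may evolve:)

                                   # LINSTOR controller/satellite ports and the DRBD port range, opened via the XAPI plugin:
                                   /etc/xapi.d/plugins/firewall-port open 3366
                                   /etc/xapi.d/plugins/firewall-port open 3370
                                   /etc/xapi.d/plugins/firewall-port open 3376
                                   /etc/xapi.d/plugins/firewall-port open 3377
                                   /etc/xapi.d/plugins/firewall-port open 8076
                                   /etc/xapi.d/plugins/firewall-port open 8077
                                   /etc/xapi.d/plugins/firewall-port open 7000:8000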

                                    • dumarjo

                                     To be able to recreate and reconnect all the PBDs, I modified the /etc/hosts file to manually add each host with its IP. I know that @ronan-a is working on fixing the hostname addressing in the driver, but at least I can continue to test the scalability.

                                     Looks promising!
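
                                     (Sketch of what those /etc/hosts entries might look like; the first three addresses come from the linstor node list output earlier in the thread, the fourth is an assumed address for the new host:)

                                     192.168.2.221  xcp-ng-01
                                     192.168.2.222  xcp-ng-02
                                     192.168.2.223  xcp-ng-03
                                     192.168.2.224  xcp-ng-04   # assumed address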

                                      • dumarjo

                                      Hi,

                                       OK, I figured out how to do it and got it working on 2 or more nodes. Here is the process:

                                      [xcp-ng-01 ~]#wget https://gist.githubusercontent.com/Wescoeur/7bb568c0e09e796710b0ea966882fcac/raw/26d1db55fafa4622af2d9ee29a48f6756b8b11a3/gistfile1.txt -O install && chmod +x install
                                      [xcp-ng-01 ~]# ./install --disks /dev/sdb --thin
                                      [xcp-ng-01 ~]# vgchange -a y linstor_group
                                      [xcp-ng-01 ~]# xe sr-create type=linstor name-label=XOSTOR host-uuid=71324aae-aff1-4323-bb0b-2c5f858b223e device-config:hosts=xcp-ng-01 device-config:group-name=linstor_group/thin_device device-config:redundancy=1 shared=true device-config:provisioning=thin
                                      

                                       Now the SR is available to create VMs. For simplicity, I won't create a VM now.

                                      [xcp-ng-02 ~]# wget https://gist.githubusercontent.com/Wescoeur/7bb568c0e09e796710b0ea966882fcac/raw/26d1db55fafa4622af2d9ee29a48f6756b8b11a3/gistfile1.txt -O install && chmod +x install
                                      [xcp-ng-02 ~]# ./install --disks /dev/sdb --thin
                                      [xcp-ng-02 ~]# vgchange -a y linstor_group
                                      

                                       On both hosts, I modified /etc/hosts to add both hosts with their IPs to work around the driver bug.

                                       Start the services on node 2:

                                      systemctl enable minidrbdcluster.service
                                      systemctl enable linstor-satellite.service
                                      systemctl start linstor-satellite.service
                                      systemctl start minidrbdcluster.service
                                      

                                       Open the iptables ports on node 2:

                                      /etc/xapi.d/plugins/firewall-port open 3366
                                      /etc/xapi.d/plugins/firewall-port open 3370
                                      /etc/xapi.d/plugins/firewall-port open 3376
                                      /etc/xapi.d/plugins/firewall-port open 3377
                                      /etc/xapi.d/plugins/firewall-port open 8076
                                      /etc/xapi.d/plugins/firewall-port open 8077
                                      /etc/xapi.d/plugins/firewall-port open 7000:8000
                                      
                                      [xcp-ng-02 ~]linstor --controllers=10.33.33.40 node create --node-type combined $HOSTNAME
                                      [xcp-ng-02 ~]linstor --controllers=10.33.33.40 storage-pool create lvmthin $HOSTNAME xcp-sr-linstor_group_thin_device linstor_group/thin_device
                                      [xcp-ng-02 ~]linstor --controllers=10.33.33.40 resource create $HOSTNAME xcp-persistent-database --storage-pool xcp-sr-linstor_group_thin_device
                                      

                                       After that, you will likely be in split-brain. I have no idea why; my knowledge isn't good enough to figure it out right now. But I know how to fix it.

                                      On node 2 run those commands:

                                      drbdadm secondary all
                                      drbdadm disconnect all
                                      drbdadm -- --discard-my-data connect all
                                      

                                      On node 1 run those commands:

                                      drbdadm primary all
                                      drbdadm disconnect all
                                      drbdadm connect all
                                      

                                       Now LINSTOR/DRBD are in good shape and should have all the resources.
                                       For the sake of fun, I changed the place count from 1 to 2 on the LINSTOR controller:

                                      [xcp-ng-01 ~]linstor rg modify --place-count 2 xcp-sr-linstor_group_thin_device
                                      

                                      Now the replication is working.

                                       Now on node 1, I unplug the PBD of xcp-ng-02, destroy it and create a new one with 2 hosts:
                                      [xcp-ng-01 ~]# xe pbd-unplug uuid=6295519d-1071-2127-4313-f14c9615f244
                                      [xcp-ng-01 ~]# xe pbd-destroy uuid=6295519d-1071-2127-4313-f14c9615f244
                                      [xcp-ng-01 ~]# xe pbd-create host-uuid=eb48f91d-9916-4542-9cf4-4a718abdc451  sr-uuid=505c1928-d39d-421c-1556-143f82770ff5  device-config:provisioning=thin device-config:redundancy=2 device-config:group-name=linstor_group/thin_device device-config:hosts=xcp-ng-01,xcp-ng-02
                                      [xcp-ng-01 ~]# xe pbd-plug uuid=774971b4-dd03-18c8-92e5-32cac9bdc1e3
                                      

                                       Do the same thing with the second PBD and everything is connected together.

                                       Not an easy task!

                                       Imagine if I have 30 VMs... a lot of resources to be created...

                                        • olivierlambert Vates 🪐 Co-Founder CEO

                                         I edited your post to use Markdown syntax. It's easier to read. Don't forget next time 😉

                                          • abufrejoval Top contributor @ronan-a

                                          @ronan-a

                                           Ah, so diskless nodes aren't supported at the XCP-ng storage API level yet?

                                           Because that was the next thing on my list of things to try, and I'm confident enough to do it at the DRBD level (even if the documentation is skimping on examples there). But if it still needs SR integration on the Xen hosts, then I can push that back onto the to-do stack.

                                           For background: for XCP-ng and oVirt I have HCI clusters running permanently on low-power machines. And then I have powerful (noisy and hungry) workstations which I turn off when I'm not running experiments (they also run all kinds of different operating systems).

                                          So these only occasionally connect to the clusters but need access to the HCI storage. That's very natural in GlusterFS and I need something similar in LINSTOR.

                                            • ronan-a Vates 🪐 XCP-ng Team @dumarjo

                                             @dumarjo FYI, this logic has been available for a few days in this commit: https://github.com/xcp-ng/sm/commit/ec3ffffced1bf63fc3a88e0681ecbf7e288828de. It is not yet merged in the current beta.

                                             Regarding the split-brain, it's probably because you use only two hosts; the ideal minimum is 3. With a replication count of two, the third node can be used as a quorum "tie breaker" diskless node.
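
                                             (Rough sketch of that layout, reusing names from this thread; the --diskless flag is an assumption, and recent LINSTOR versions can add such a tie-breaker automatically:)

                                             linstor rg modify --place-count 2 xcp-sr-linstor_group_thin_device      # data replicated on two nodes
                                             linstor resource create xcp-ng-03 xcp-persistent-database --diskless    # third node holds only a diskless quorum copy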

                                             Imagine if I have 30 VMs... a lot of resources to be created...

                                             I'm not sure I understand the link with the VMs? ^^"

                                             Wescoeur committed to xcp-ng/sm: feat(linstor-manager): add methods to add/remove host from LINSTOR SR (Signed-off-by: Ronan Abhamon <ronan.abhamon@vates.fr>)