XCP-ng

    XOSTOR hyperconvergence preview

    446 Posts 47 Posters 481.4k Views
    • ronan-a Vates 🪐 XCP-ng Team @Maelstrom96

      @Maelstrom96 said in XOSTOR hyperconvergence preview:

      What happened was that for some unknown reason, our /var/lib/linstor mount (xcp-persistent-database) became read only

      There is already a protection against that: https://github.com/xcp-ng/sm/commit/55779a64593df9407f861c3132ab85863b4f7e46 (2021-10-21)

      So I don't understand how this issue can occur again. Without the log files I can't determine its source; can you share them?

      Did you launch a controller manually before having this problem or not? There is a daemon to automatically mount and start a controller: minidrbdcluster. All actions related to the controllers must be executed by this program.

      Another idea, the problem can be related to: https://github.com/xcp-ng/sm/commit/a6385091370c6b358c7466944cc9b63f8c337c0d
      But this commit should be present in the last release.

      Wescoeur committed to xcp-ng/sm
      fix(var-lib-linstor.mount): ensure we always mount database with RW flags
      
      Sometimes systemd fallback to read only FS if the volume can't be mounted, we must
      forbid that. It's probably a DRBD error.
      
      Signed-off-by: Ronan Abhamon <ronan.abhamon@vates.fr>
      Wescoeur committed to xcp-ng/sm
      fix(minidrbdcluster): ensure SIGINT is handled correctly
      
      This patch is here to make sure no LINSTOR controller survives when
      systemd asks to minidrbdcluster to stop with `SIGINT`.
      
      - Remove `os.system`, it's totally unsafe, all signals are ignored with it.
      - Use `subprocess.Popen` instead and catch correctly signal exceptions, it works
        because `wait` call doesn't hide the signals.
      - Ensure `SIGINT` is only sent to the main process, not to the subprocesses.
      - Ensure `SIGKILL` is NEVER sent to minidrbdcluster.
      
      Signed-off-by: Ronan Abhamon <ronan.abhamon@vates.fr>
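      The os.system vs subprocess.Popen difference that the second commit describes can be sketched as follows; run_resource_agent is a hypothetical name for illustration, not the actual minidrbdcluster code:

```python
import subprocess
import sys

def run_resource_agent(cmd):
    """Run a child command without masking signals in the parent.

    os.system() makes the calling process ignore SIGINT while the child
    runs, so a SIGINT sent by systemd on 'stop' could be silently lost.
    Popen.wait() does not hide signals: a KeyboardInterrupt still reaches
    this frame and can be handled.
    """
    proc = subprocess.Popen(cmd)
    try:
        return proc.wait()
    except KeyboardInterrupt:
        # Forward a graceful termination to the child only; never SIGKILL.
        proc.terminate()
        proc.wait()
        raise

# Example: returns the child's exit code.
# run_resource_agent([sys.executable, "-c", "pass"])
```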
      • Maelstrom96 @ronan-a

        @ronan-a Is there a way to easily check whether the process is managed by the daemon rather than started manually? We might have restarted the controller manually at some point.

        Edit :

        โ— minidrbdcluster.service - Minimalistic high-availability cluster resource manager
           Loaded: loaded (/usr/lib/systemd/system/minidrbdcluster.service; enabled; vendor preset: disabled)
           Active: active (running) since Wed 2023-01-25 15:58:01 EST; 1 weeks 0 days ago
         Main PID: 2738 (python2)
           CGroup: /system.slice/minidrbdcluster.service
                   โ”œโ”€2738 python2 /opt/xensource/libexec/minidrbdcluster
                   โ”œโ”€2902 /usr/sbin/dmeventd
                   โ””โ”€2939 drbdsetup events2
        
        [11:58 ovbh-pprod-xen10 system]# systemctl status var-lib-linstor.service
        โ— var-lib-linstor.service - Mount filesystem for the LINSTOR controller
           Loaded: loaded (/etc/systemd/system/var-lib-linstor.service; static; vendor preset: disabled)
           Active: active (exited) since Wed 2023-01-25 15:58:03 EST; 1 weeks 0 days ago
          Process: 2947 ExecStart=/bin/mount -w /dev/drbd/by-res/xcp-persistent-database/0 /var/lib/linstor (code=exited, status=0/SUCCESS)
         Main PID: 2947 (code=exited, status=0/SUCCESS)
           CGroup: /system.slice/var-lib-linstor.service
        

        Also, what logs would you like to have?

        Edit2: Also, I don't believe that service would've actually caught what happened, since it was initially mounted RW, but it seems like DRBD had an issue while the mount was active and changed it to RO. The controller service was still healthy and active; only DB writes were impacted.
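        The remount-to-RO state described here can be detected by inspecting /proc/mounts; a minimal sketch (the function name and its usage are illustrative, not part of sm):

```python
def is_mounted_read_only(mount_point, mounts_text):
    """Return True if mount_point appears in mounts_text with the 'ro' option.

    mounts_text is the content of /proc/self/mounts. When DRBD forces the
    filesystem read-only after an I/O error, the options field flips from
    'rw,...' to 'ro,...' while the mount itself stays active.
    """
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            return "ro" in fields[3].split(",")
    return False  # not mounted at all

# On a live host:
# with open("/proc/self/mounts") as f:
#     flipped = is_mounted_read_only("/var/lib/linstor", f.read())
```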

        • ronan-a Vates 🪐 XCP-ng Team @Maelstrom96

          @Maelstrom96 It's fine to restart the controller on the same host where it was running. But if you want to move the controller to another host, just temporarily stop minidrbdcluster on the host where the controller is running. Then you can restart it.

          The danger is to start a controller on a host where the shared database is not mounted in /var/lib/linstor.

          To summarize: if the database is mounted (check using mountpoint /var/lib/linstor) and there is a running controller, there's no issue.
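          That check can also be done programmatically; a minimal sketch, assuming os.path.ismount is an acceptable stand-in for the mountpoint binary (the function name is illustrative):

```python
import os

def safe_to_start_controller(path="/var/lib/linstor"):
    """Equivalent of `mountpoint /var/lib/linstor`: only start a LINSTOR
    controller on a host where the shared database is mounted there."""
    return os.path.ismount(path)
```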

          Edit2: Also, I don't believe that service would've actually caught what happened, since it was initially mounted RW, but it seems like DRBD had an issue while the mount was active and changed it to RO. The controller service was still healthy and active; only DB writes were impacted.

          So if it's not related to a database mount, then yes, the system may have changed the mount point to read-only for some reason; it's clearly not impossible. 🙂

          Also, what logs would you like to have?

          daemon.log, SMlog, kern.log, (and also drbd-kern.log if present)

          • Maelstrom96 @ronan-a

            @ronan-a I will copy those logs soon. Do you have a way I can provide you the logs off-forum, since it's a production system?

            • Maelstrom96

              Not sure what we're doing wrong - Attempted to add a new host to the linstor SR and it's failing. I've run the install command with the disks we want on the host, but when running the "addHost" function, it fails.

              [13:25 ovbh-pprod-xen13 ~]# xe host-call-plugin host-uuid=6e845981-1c12-4e70-b0f7-54431959d630 plugin=linstor-manager fn=addHost args:groupName=linstor_group/thin_device
              There was a failure communicating with the plug-in.
              status: addHost
              stdout: Failure
              stderr: ['VDI_IN_USE', 'OpaqueRef:f25cd94b-c948-4c3a-a410-aa29a3749943']
              
              

              Edit: So it's not documented, but it looks like it's failing because the SR is in use? Does that mean we can't add or remove hosts from LINSTOR without unmounting all VDIs?

              • ronan-a Vates 🪐 XCP-ng Team @Maelstrom96

                @Maelstrom96 No, you can add a host with running VMs.
                I suppose there is a small issue here... Please send me your logs again (SMlog from each host). 😉

                • Maelstrom96

                  @ronan-a

                  We were able to finally add our new #4 host to the linstor SR after killing all VMs with attached VDIs. However, we've hit a new bug that we're not sure how to fix.

                  Once we added the new host, we were curious to see if a live migration to it would work - It did not. It actually just resulted in the VM being in a zombie state and we had to manually destroy the domains on both the source and destination servers, and reset the power state of the VM.

                  That first bug was most likely caused by our custom LINSTOR configuration, where we have set up another LINSTOR node interface on each node and changed their PrefNic. It wasn't applied to the new host, so the DRBD connection wouldn't have worked.

                  [16:51 ovbh-pprod-xen10 lib]# linstor --controllers=10.2.0.19,10.2.0.20,10.2.0.21 node interface list ovbh-pprod-xen12
                  โ•ญโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ•ฎ
                  โ”Š ovbh-pprod-xen12 โ”Š NetInterface โ”Š IP        โ”Š Port โ”Š EncryptionType โ”Š
                  โ•žโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•ก
                  โ”Š + StltCon        โ”Š default      โ”Š 10.2.0.21 โ”Š 3366 โ”Š PLAIN          โ”Š
                  โ”Š +                โ”Š stornet      โ”Š 10.2.4.12 โ”Š      โ”Š                โ”Š
                  โ•ฐโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ•ฏ
                  [16:41 ovbh-pprod-xen10 lib]# linstor --controllers=10.2.0.19,10.2.0.20,10.2.0.21 node list-properties ovbh-pprod-xen12
                  โ•ญโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ•ฎ
                  โ”Š Key             โ”Š Value            โ”Š
                  โ•žโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•ก
                  โ”Š Aux/xcp-ng.node โ”Š ovbh-pprod-xen12 โ”Š
                  โ”Š Aux/xcp-ng/node โ”Š ovbh-pprod-xen12 โ”Š
                  โ”Š CurStltConnName โ”Š default          โ”Š
                  โ”Š NodeUname       โ”Š ovbh-pprod-xen12 โ”Š
                  โ”Š PrefNic         โ”Š stornet          โ”Š
                  โ•ฐโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ•ฏ
                  
                  

                  However, once the VM was down and all the LINSTOR configuration was updated to match the rest of the cluster, I tried to manually start that VM on the new host, but it's not working. It seems like LINSTOR is not being called to add the volume to the host as a diskless volume, since the data is not on that host.

                  SMLog:

                  Feb 28 17:01:31 ovbh-pprod-xen13 SM: [25108] lock: opening lock file /var/lock/sm/a8b860a9-5246-0dd2-8b7f-4806604f219a/sr
                  Feb 28 17:01:31 ovbh-pprod-xen13 SM: [25108] lock: acquired /var/lock/sm/a8b860a9-5246-0dd2-8b7f-4806604f219a/sr
                  Feb 28 17:01:31 ovbh-pprod-xen13 SM: [25108] call-plugin on ff631fff-1947-4631-a35d-9352204f98d9 (linstor-manager:lockVdi with {'groupName': 'linstor_group/thin_device', 'srUuid': 'a8b860a9-5246-0dd2-8b7f-4806604f219a', 'vdiUuid': '02ca1b5b-fef4-47d4-8736-40908385739c', 'locked': 'True'}) returned: True
                  Feb 28 17:01:33 ovbh-pprod-xen13 SM: [25108] ['/usr/bin/vhd-util', 'query', '--debug', '-vsfp', '-n', '/dev/drbd/by-res/xcp-volume-fb565237-b169-434d-b694-4707e6f51f4c/0']
                  Feb 28 17:01:33 ovbh-pprod-xen13 SM: [25108] FAILED in util.pread: (rc 2) stdout: 'error opening /dev/drbd/by-res/xcp-volume-fb565237-b169-434d-b694-4707e6f51f4c/0: -2
                  Feb 28 17:01:33 ovbh-pprod-xen13 SM: [25108] ', stderr: ''
                  Feb 28 17:01:33 ovbh-pprod-xen13 SM: [25108] Got exception: No such file or directory. Retry number: 0
                  Feb 28 17:01:35 ovbh-pprod-xen13 SM: [25108] ['/usr/bin/vhd-util', 'query', '--debug', '-vsfp', '-n', '/dev/drbd/by-res/xcp-volume-fb565237-b169-434d-b694-4707e6f51f4c/0']
                  Feb 28 17:01:35 ovbh-pprod-xen13 SM: [25108] FAILED in util.pread: (rc 2) stdout: 'error opening /dev/drbd/by-res/xcp-volume-fb565237-b169-434d-b694-4707e6f51f4c/0: -2
                  Feb 28 17:01:35 ovbh-pprod-xen13 SM: [25108] ', stderr: ''
                  Feb 28 17:01:35 ovbh-pprod-xen13 SM: [25108] Got exception: No such file or directory. Retry number: 1
                  Feb 28 17:01:37 ovbh-pprod-xen13 SM: [25108] ['/usr/bin/vhd-util', 'query', '--debug', '-vsfp', '-n', '/dev/drbd/by-res/xcp-volume-fb565237-b169-434d-b694-4707e6f51f4c/0']
                  Feb 28 17:01:37 ovbh-pprod-xen13 SM: [25108] FAILED in util.pread: (rc 2) stdout: 'error opening /dev/drbd/by-res/xcp-volume-fb565237-b169-434d-b694-4707e6f51f4c/0: -2
                  Feb 28 17:01:37 ovbh-pprod-xen13 SM: [25108] ', stderr: ''
                  Feb 28 17:01:37 ovbh-pprod-xen13 SM: [25108] Got exception: No such file or directory. Retry number: 2
                  Feb 28 17:01:39 ovbh-pprod-xen13 SM: [25108] ['/usr/bin/vhd-util', 'query', '--debug', '-vsfp', '-n', '/dev/drbd/by-res/xcp-volume-fb565237-b169-434d-b694-4707e6f51f4c/0']
                  Feb 28 17:01:39 ovbh-pprod-xen13 SM: [25108] FAILED in util.pread: (rc 2) stdout: 'error opening /dev/drbd/by-res/xcp-volume-fb565237-b169-434d-b694-4707e6f51f4c/0: -2
                  Feb 28 17:01:39 ovbh-pprod-xen13 SM: [25108] ', stderr: ''
                  Feb 28 17:01:39 ovbh-pprod-xen13 SM: [25108] Got exception: No such file or directory. Retry number: 3
                  Feb 28 17:01:41 ovbh-pprod-xen13 SM: [25108] ['/usr/bin/vhd-util', 'query', '--debug', '-vsfp', '-n', '/dev/drbd/by-res/xcp-volume-fb565237-b169-434d-b694-4707e6f51f4c/0']
                  Feb 28 17:01:41 ovbh-pprod-xen13 SM: [25108] FAILED in util.pread: (rc 2) stdout: 'error opening /dev/drbd/by-res/xcp-volume-fb565237-b169-434d-b694-4707e6f51f4c/0: -2
                  Feb 28 17:01:41 ovbh-pprod-xen13 SM: [25108] ', stderr: ''
                  Feb 28 17:01:41 ovbh-pprod-xen13 SM: [25108] Got exception: No such file or directory. Retry number: 4
                  Feb 28 17:01:41 ovbh-pprod-xen13 SM: [25108] ['/usr/bin/vhd-util', 'query', '--debug', '-vsfp', '-n', '/dev/drbd/by-res/xcp-volume-fb565237-b169-434d-b694-4707e6f51f4c/0']
                  Feb 28 17:01:41 ovbh-pprod-xen13 SM: [25108] FAILED in util.pread: (rc 2) stdout: 'error opening /dev/drbd/by-res/xcp-volume-fb565237-b169-434d-b694-4707e6f51f4c/0: -2
                  Feb 28 17:01:41 ovbh-pprod-xen13 SM: [25108] ', stderr: ''
                  Feb 28 17:01:41 ovbh-pprod-xen13 SM: [25108] failed to execute locally vhd-util (sys 2)
                  Feb 28 17:01:42 ovbh-pprod-xen13 SM: [25108] call-plugin (getVHDInfo with {'devicePath': '/dev/drbd/by-res/xcp-volume-fb565237-b169-434d-b694-4707e6f51f4c/0', 'groupName': 'linstor_group/thin_device', 'includeParent': 'True'}) returned: {"uuid": "02ca1b5b-fef4-47d4-8736-40908385739c", "parentUuid": "1ad76dd3-14af-4636-bf5d-6822b81bfd0c", "sizeVirt": 53687091200, "sizePhys": 1700033024, "parentPath": "/dev/drbd/by-res/xcp-v$
                  Feb 28 17:01:42 ovbh-pprod-xen13 SM: [25108] VDI 02ca1b5b-fef4-47d4-8736-40908385739c loaded! (path=/dev/drbd/by-res/xcp-volume-fb565237-b169-434d-b694-4707e6f51f4c/0, hidden=0)
                  Feb 28 17:01:42 ovbh-pprod-xen13 SM: [25108] lock: released /var/lock/sm/a8b860a9-5246-0dd2-8b7f-4806604f219a/sr
                  Feb 28 17:01:42 ovbh-pprod-xen13 SM: [25108] vdi_epoch_begin {'sr_uuid': 'a8b860a9-5246-0dd2-8b7f-4806604f219a', 'subtask_of': 'DummyRef:|3f01e26c-0225-40e1-9683-bffe5bb69490|VDI.epoch_begin', 'vdi_ref': 'OpaqueRef:f25cd94b-c948-4c3a-a410-aa29a3749943', 'vdi_on_boot': 'persist', 'args': [], 'vdi_location': '02ca1b5b-fef4-47d4-8736-40908385739c', 'host_ref': 'OpaqueRef:3cd7e97c-4b79-473e-b925-c25f8cb393d8', 'session_ref': '$
                  Feb 28 17:01:42 ovbh-pprod-xen13 SM: [25108] call-plugin on ff631fff-1947-4631-a35d-9352204f98d9 (linstor-manager:lockVdi with {'groupName': 'linstor_group/thin_device', 'srUuid': 'a8b860a9-5246-0dd2-8b7f-4806604f219a', 'vdiUuid': '02ca1b5b-fef4-47d4-8736-40908385739c', 'locked': 'False'}) returned: True
                  Feb 28 17:01:42 ovbh-pprod-xen13 SM: [25278] lock: opening lock file /var/lock/sm/a8b860a9-5246-0dd2-8b7f-4806604f219a/sr
                  Feb 28 17:01:42 ovbh-pprod-xen13 SM: [25278] lock: acquired /var/lock/sm/a8b860a9-5246-0dd2-8b7f-4806604f219a/sr
                  Feb 28 17:01:43 ovbh-pprod-xen13 SM: [25278] call-plugin on ff631fff-1947-4631-a35d-9352204f98d9 (linstor-manager:lockVdi with {'groupName': 'linstor_group/thin_device', 'srUuid': 'a8b860a9-5246-0dd2-8b7f-4806604f219a', 'vdiUuid': '02ca1b5b-fef4-47d4-8736-40908385739c', 'locked': 'True'}) returned: True
                  Feb 28 17:01:44 ovbh-pprod-xen13 SM: [25278] ['/usr/bin/vhd-util', 'query', '--debug', '-vsfp', '-n', '/dev/drbd/by-res/xcp-volume-fb565237-b169-434d-b694-4707e6f51f4c/0']
                  Feb 28 17:01:44 ovbh-pprod-xen13 SM: [25278] FAILED in util.pread: (rc 2) stdout: 'error opening /dev/drbd/by-res/xcp-volume-fb565237-b169-434d-b694-4707e6f51f4c/0: -2
                  Feb 28 17:01:44 ovbh-pprod-xen13 SM: [25278] ', stderr: ''
                  Feb 28 17:01:44 ovbh-pprod-xen13 SM: [25278] Got exception: No such file or directory. Retry number: 0
                  Feb 28 17:01:46 ovbh-pprod-xen13 SM: [25278] ['/usr/bin/vhd-util', 'query', '--debug', '-vsfp', '-n', '/dev/drbd/by-res/xcp-volume-fb565237-b169-434d-b694-4707e6f51f4c/0']
                  Feb 28 17:01:46 ovbh-pprod-xen13 SM: [25278] FAILED in util.pread: (rc 2) stdout: 'error opening /dev/drbd/by-res/xcp-volume-fb565237-b169-434d-b694-4707e6f51f4c/0: -2
                  Feb 28 17:01:46 ovbh-pprod-xen13 SM: [25278] ', stderr: ''
                  [...]
                  

                  The folder /dev/drbd/by-res/ doesn't exist currently.
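                  The SMlog excerpt above shows the retry pattern sm uses: five attempts, two seconds apart, giving up only on errors other than ENOENT. A generic sketch of that loop (the names are illustrative, not the actual util.pread code):

```python
import errno
import time

def retry_on_enoent(fn, attempts=5, delay=2):
    """Re-run fn() while it raises ENOENT, mirroring the retry loop in the
    SMlog excerpt: the /dev/drbd/by-res/... node may only appear once the
    diskless resource has been created on this host."""
    for _ in range(attempts):
        try:
            return fn()
        except OSError as exc:
            if exc.errno != errno.ENOENT:
                raise
            time.sleep(delay)
    return fn()  # last attempt; let any remaining error propagate
```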

                  Also, not sure why, but it seems like when adding the new host, a new storage pool linstor_group_thin_device for its local storage wasn't provisioned automatically, though we can see that a diskless storage pool was provisioned.

                  [17:26 ovbh-pprod-xen10 lib]# linstor --controllers=10.2.0.19,10.2.0.20,10.2.0.21  storage-pool list
                  โ•ญโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ•ฎ
                  โ”Š StoragePool                      โ”Š Node                                     โ”Š Driver   โ”Š PoolName                  โ”Š FreeCapacity โ”Š TotalCapacity โ”Š CanSnapshots โ”Š State โ”Š SharedName โ”Š
                  โ•žโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•ก
                  โ”Š DfltDisklessStorPool             โ”Š ovbh-pprod-xen10                         โ”Š DISKLESS โ”Š                           โ”Š              โ”Š               โ”Š False        โ”Š Ok    โ”Š            โ”Š
                  โ”Š DfltDisklessStorPool             โ”Š ovbh-pprod-xen11                         โ”Š DISKLESS โ”Š                           โ”Š              โ”Š               โ”Š False        โ”Š Ok    โ”Š            โ”Š
                  โ”Š DfltDisklessStorPool             โ”Š ovbh-pprod-xen12                         โ”Š DISKLESS โ”Š                           โ”Š              โ”Š               โ”Š False        โ”Š Ok    โ”Š            โ”Š
                  โ”Š DfltDisklessStorPool             โ”Š ovbh-pprod-xen13                         โ”Š DISKLESS โ”Š                           โ”Š              โ”Š               โ”Š False        โ”Š Ok    โ”Š            โ”Š
                  โ”Š DfltDisklessStorPool             โ”Š ovbh-vprod-k8s04-worker01.floatplane.com โ”Š DISKLESS โ”Š                           โ”Š              โ”Š               โ”Š False        โ”Š Ok    โ”Š            โ”Š
                  โ”Š DfltDisklessStorPool             โ”Š ovbh-vprod-k8s04-worker02.floatplane.com โ”Š DISKLESS โ”Š                           โ”Š              โ”Š               โ”Š False        โ”Š Ok    โ”Š            โ”Š
                  โ”Š DfltDisklessStorPool             โ”Š ovbh-vprod-k8s04-worker03.floatplane.com โ”Š DISKLESS โ”Š                           โ”Š              โ”Š               โ”Š False        โ”Š Ok    โ”Š            โ”Š
                  โ”Š DfltDisklessStorPool             โ”Š ovbh-vtest-k8s02-worker01.floatplane.com โ”Š DISKLESS โ”Š                           โ”Š              โ”Š               โ”Š False        โ”Š Ok    โ”Š            โ”Š
                  โ”Š DfltDisklessStorPool             โ”Š ovbh-vtest-k8s02-worker02.floatplane.com โ”Š DISKLESS โ”Š                           โ”Š              โ”Š               โ”Š False        โ”Š Ok    โ”Š            โ”Š
                  โ”Š DfltDisklessStorPool             โ”Š ovbh-vtest-k8s02-worker03.floatplane.com โ”Š DISKLESS โ”Š                           โ”Š              โ”Š               โ”Š False        โ”Š Ok    โ”Š            โ”Š
                  โ”Š xcp-sr-linstor_group_thin_device โ”Š ovbh-pprod-xen10                         โ”Š LVM_THIN โ”Š linstor_group/thin_device โ”Š     3.00 TiB โ”Š      3.49 TiB โ”Š True         โ”Š Ok    โ”Š            โ”Š
                  โ”Š xcp-sr-linstor_group_thin_device โ”Š ovbh-pprod-xen11                         โ”Š LVM_THIN โ”Š linstor_group/thin_device โ”Š     3.03 TiB โ”Š      3.49 TiB โ”Š True         โ”Š Ok    โ”Š            โ”Š
                  โ”Š xcp-sr-linstor_group_thin_device โ”Š ovbh-pprod-xen12                         โ”Š LVM_THIN โ”Š linstor_group/thin_device โ”Š     3.06 TiB โ”Š      3.49 TiB โ”Š True         โ”Š Ok    โ”Š            โ”Š
                  โ•ฐโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ•ฏ
                  
                  
                  [17:32 ovbh-pprod-xen13 ~]# lsblk
                  NAME                                                                                                MAJ:MIN RM    SIZE RO TYPE  MOUNTPOINT
                  nvme0n1                                                                                             259:0    0    3.5T  0 disk
                  โ”œโ”€nvme0n1p1                                                                                         259:1    0      1T  0 part
                  โ”‚ โ””โ”€md128                                                                                             9:128  0 1023.9G  0 raid1
                  โ””โ”€nvme0n1p2                                                                                         259:2    0    2.5T  0 part
                    โ”œโ”€linstor_group-thin_device_tdata                                                                 252:1    0      5T  0 lvm
                    โ”‚ โ””โ”€linstor_group-thin_device                                                                     252:2    0      5T  0 lvm
                    โ””โ”€linstor_group-thin_device_tmeta                                                                 252:0    0     80M  0 lvm
                      โ””โ”€linstor_group-thin_device                                                                     252:2    0      5T  0 lvm
                  sdb                                                                                                   8:16   1  447.1G  0 disk
                  โ””โ”€md127                                                                                               9:127  0  447.1G  0 raid1
                    โ”œโ”€md127p5                                                                                         259:10   0      4G  0 md    /var/log
                    โ”œโ”€md127p3                                                                                         259:8    0  405.6G  0 md
                    โ”‚ โ””โ”€XSLocalEXT--ea64a6f6--9ef2--408a--039f--33b119fbd7e8-ea64a6f6--9ef2--408a--039f--33b119fbd7e8 252:3    0  405.6G  0 lvm   /run/sr-mount/ea64a6f6-9ef2-408a-039f-33b119fbd7e8
                    โ”œโ”€md127p1                                                                                         259:6    0     18G  0 md    /
                    โ”œโ”€md127p6                                                                                         259:11   0      1G  0 md    [SWAP]
                    โ”œโ”€md127p4                                                                                         259:9    0    512M  0 md    /boot/efi
                    โ””โ”€md127p2                                                                                         259:7    0     18G  0 md
                  nvme1n1                                                                                             259:3    0    3.5T  0 disk
                  โ”œโ”€nvme1n1p2                                                                                         259:5    0    2.5T  0 part
                  โ”‚ โ””โ”€linstor_group-thin_device_tdata                                                                 252:1    0      5T  0 lvm
                  โ”‚   โ””โ”€linstor_group-thin_device                                                                     252:2    0      5T  0 lvm
                  โ””โ”€nvme1n1p1                                                                                         259:4    0      1T  0 part
                    โ””โ”€md128                                                                                             9:128  0 1023.9G  0 raid1
                  sda                                                                                                   8:0    1  447.1G  0 disk
                  โ””โ”€md127                                                                                               9:127  0  447.1G  0 raid1
                    โ”œโ”€md127p5                                                                                         259:10   0      4G  0 md    /var/log
                    โ”œโ”€md127p3                                                                                         259:8    0  405.6G  0 md
                    โ”‚ โ””โ”€XSLocalEXT--ea64a6f6--9ef2--408a--039f--33b119fbd7e8-ea64a6f6--9ef2--408a--039f--33b119fbd7e8 252:3    0  405.6G  0 lvm   /run/sr-mount/ea64a6f6-9ef2-408a-039f-33b119fbd7e8
                    โ”œโ”€md127p1                                                                                         259:6    0     18G  0 md    /
                    โ”œโ”€md127p6                                                                                         259:11   0      1G  0 md    [SWAP]
                    โ”œโ”€md127p4                                                                                         259:9    0    512M  0 md    /boot/efi
                    โ””โ”€md127p2                                                                                         259:7    0     18G  0 md
                  
                  
                  • ronan-a Vates 🪐 XCP-ng Team @Maelstrom96

                    @Maelstrom96 said in XOSTOR hyperconvergence preview:

                    The folder /dev/drbd/by-res/ doesn't exist currently.

                    You're lucky, I pushed a fix yesterday for this kind of problem on pools with more than 3 machines: https://github.com/xcp-ng/sm/commit/f916647f44223206b24cf70d099637882c53fee8

                    Unfortunately, I can't release a new version right away, but I think this change can be applied to your pool.
                    In the worst case I'll see if I can release a new version without all the fixes in progress...

                    Wescoeur committed to xcp-ng/sm
                    fix(LinstorSR): ensure vhdutil calls are correctly executed on pools with > 3 hosts
                    
                    Signed-off-by: Ronan Abhamon <ronan.abhamon@vates.fr>
                    • Maelstrom96 @ronan-a

                      @ronan-a said in XOSTOR hyperconvergence preview:

                      You're lucky, I pushed a fix yesterday for this kind of problem on pools with more than 3 machines: https://github.com/xcp-ng/sm/commit/f916647f44223206b24cf70d099637882c53fee8

                      Unfortunately, I can't release a new version right away, but I think this change can be applied to your pool.
                      In the worst case I'll see if I can release a new version without all the fixes in progress...

                      Thanks, that does look like it would fix the missing drbd/by-res/ volumes.

                      Do you have an idea about the missing StoragePool for the new host that was added using linstor-manager.addHost? I've checked the code, and it seems like the SP might only be provisioned on sr.create?

                      Also, I'm not sure how feasible it would be for SM, but having a nightly-style build process for these cases seems like it would be really useful for hotfix testing.

                      • drfill

                        Hello guys,
                        Awesome work, it works like a charm! But security is at a low level: anyone who wants to break the disk cluster/HC storage can do it. After installing, I see the LINSTOR controller ports are open to the whole world. Is there any way to close the external port (when management is on a global IP) and communicate through the storage network instead?

                        1 Reply Last reply Reply Quote 0
                        • olivierlambertO Offline
                          olivierlambert Vates ๐Ÿช Co-Founder CEO
                          last edited by

Hmm, in theory I would say it should only listen on the management network (or the storage network, but not on every interface)
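In the meantime, a possible host-side workaround is to firewall the LINSTOR ports so that only the storage network can reach them. This is only a sketch under assumptions: the port list follows LINSTOR's documented defaults (3366/3367 for satellites, 3370/3371 for the REST API, 3376/3377 for the controller connector), and STORAGE_NET is a placeholder you must replace with your actual storage subnet.

```shell
# Restrict LINSTOR's default TCP ports to the storage network only.
# STORAGE_NET is an assumption: replace it with your storage subnet.
STORAGE_NET="10.0.0.0/24"
for PORT in 3366 3367 3370 3371 3376 3377; do
    # accept connections coming from the storage network
    iptables -I INPUT -p tcp --dport "$PORT" -s "$STORAGE_NET" -j ACCEPT
    # drop the same port for everyone else
    iptables -A INPUT -p tcp --dport "$PORT" -j DROP
done
# persist the rules on XCP-ng's CentOS 7 base
service iptables save
```

Be careful: dom0 firewall changes can lock you out of a remote host, so test this on a lab machine first.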

                          1 Reply Last reply Reply Quote 0
                          • Maelstrom96M Offline
                            Maelstrom96 @ronan-a
                            last edited by Maelstrom96

                            @ronan-a Any news on when the new version of linstor SM will be released? We're actually hard blocked by the behavior with 4 nodes right now so we can't move forward with a lot of other tests we want to do.

We also worked on a custom build of linstor-controller and linstor-satellite to support CentOS 7 despite its lack of setsid -w support, and we would like to see if we can get a satisfactory PR merged into linstor-server master so that people using XCP-ng can also use LINSTOR's built-in snapshot shipping. Since the K8s LINSTOR snapshotter uses that functionality to provide volume backups, using K8s with LINSTOR on XCP-ng is not really possible until this is fixed.

                            Would that be something that you guys could help us push to linstor?

                            ronan-aR 1 Reply Last reply Reply Quote 0
                            • ronan-aR Offline
                              ronan-a Vates ๐Ÿช XCP-ng Team @Maelstrom96
                              last edited by

                              @Maelstrom96

                              Do you have an idea about the missing StoragePool for the new host that was added using linstor-manager.addHost? I've checked the code and it seems like it might just provision the SP on sr.create?

If I remember correctly, this script only adds the PBDs of the new host and configures the services. If you want to add a new device, you must manually create a new LVM VG and add it via a linstor command.
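A sketch of those manual steps, assuming a spare disk at /dev/sdb and placeholder names (linstor_group, new-host, xcp-sr-linstor_group) that you should match to what `linstor storage-pool list` reports for the existing hosts:

```shell
# Create the LVM VG on the new host's disk (names are placeholders).
vgcreate linstor_group /dev/sdb

# Register that VG as a LINSTOR storage pool on the new node, using the
# same storage-pool name as the rest of the pool.
linstor storage-pool create lvm new-host xcp-sr-linstor_group linstor_group
```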

                              Also, I'm not sure how feasible it would be for SM but having a nightly-style build process for those cases seems like it would be really useful for hotfix testing.

                              The branch was in a bad state (many fixes to test, regressions, etc). I was able to clean all that up, it should be easier to do releases now.

                              Any news on when the new version of linstor SM will be released?

                              Today. ๐Ÿ˜‰

                              ronan-aR 1 Reply Last reply Reply Quote 0
                              • ronan-aR Offline
                                ronan-a Vates ๐Ÿช XCP-ng Team @ronan-a
                                last edited by ronan-a

WARNING: I just pushed new packages (they should be available in our repo in a few minutes) and I made an important change in the driver which requires manual intervention.

minidrbdcluster is no longer used to start the controller; instead we use drbd-reactor, which is more robust.
                                To update properly, you must:

                                1. Disable minidrbdcluster on each host: systemctl disable --now minidrbdcluster.
                                2. Install new LINSTOR packages using yum:
                                • blktap-3.37.4-1.0.1.0.linstor.1.xcpng8.2.x86_64.rpm
                                • xcp-ng-linstor-1.0-1.xcpng8.2.noarch.rpm
                                • xcp-ng-release-linstor-1.2-1.xcpng8.2.noarch.rpm
                                • http-nbd-transfer-1.2.0-1.xcpng8.2.x86_64.rpm
                                • sm-2.30.7-1.3.0.linstor.7.xcpng8.2.x86_64.rpm
3. On each host, create the folder /etc/drbd-reactor.d/ with mkdir if it does not exist, then edit /etc/drbd-reactor.d/sm-linstor.toml and add this content:
                                [[promoter]]
                                
                                [promoter.resources.xcp-persistent-database]
                                start = [ "var-lib-linstor.service", "linstor-controller.service" ]
                                
4. After that, you can manually start drbd-reactor on each machine: systemctl enable --now drbd-reactor.

                                You can reuse your SR again.
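A few sanity checks after the migration (a suggestion, not part of the official procedure; the service and resource names are the ones from the steps above):

```shell
# drbd-reactor should be active on every host
systemctl status drbd-reactor
# the promoter plugin should manage the xcp-persistent-database resource
drbd-reactorctl status
# linstor-controller should be active on exactly one host (the promoted one)
systemctl status linstor-controller
# from that host, all nodes should report ONLINE
linstor node list
```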

                                1 Reply Last reply Reply Quote 0
                                • SwenS Offline
                                  Swen
                                  last edited by

@ronan-a Just to be sure: if you install from scratch, you can still use the installation instructions from the top of this thread, correct?

                                  ronan-aR 1 Reply Last reply Reply Quote 0
                                  • ronan-aR Offline
                                    ronan-a Vates ๐Ÿช XCP-ng Team @Swen
                                    last edited by

                                    @Swen Yes you can still use the installation script. I just changed a line to install the new blktap, so redownload it if necessary.

                                    SwenS 1 Reply Last reply Reply Quote 0
                                    • SwenS Offline
                                      Swen @ronan-a
                                      last edited by

                                      @ronan-a perfect, thx! Is this the new release Olivier was talking about? Can you provide some information when to expect the first stable release?

                                      F ronan-aR 2 Replies Last reply Reply Quote 0
                                      • F Offline
                                        fred974 @Swen
                                        last edited by fred974

@ronan-a Just to be clear on what I have to do:

1. Disable minidrbdcluster on each host: systemctl disable --now minidrbdcluster. No issue here...

                                        2. Install new LINSTOR packages. How do we do that? Do we run the installer again by running:
                                          wget https://gist.githubusercontent.com/Wescoeur/7bb568c0e09e796710b0ea966882fcac/raw/1707fbcfac22e662c2b80c14762f2c7d937e677c/gistfile1.txt -O install && chmod +x install
                                          or
                                          ./install update

Or do I simply install the new RPMs without running the installer?

                                        wget --no-check-certificate blktap-3.37.4-1.0.1.0.linstor.1.xcpng8.2.x86_64.rpm
                                        wget --no-check-certificate xcp-ng-linstor-1.0-1.xcpng8.2.noarch.rpm
                                        wget --no-check-certificate xcp-ng-release-linstor-1.2-1.xcpng8.2.noarch.rpm
                                        wget --no-check-certificate http-nbd-transfer-1.2.0-1.xcpng8.2.x86_64.rpm
                                        wget --no-check-certificate sm-2.30.7-1.3.0.linstor.7.xcpng8.2.x86_64.rpm
                                        yum install *.rpm
                                        

Where do we get these files from? What is the URL?

3. On each host, edit /etc/drbd-reactor.d/sm-linstor.toml. No problem here...

Can you please confirm which procedure to use for step 2?

                                        Thank you.

                                        F 1 Reply Last reply Reply Quote 0
                                        • F Offline
                                          fred974 @fred974
                                          last edited by

                                          @ronan-a is that the correct URL?

                                          https://koji.xcp-ng.org/kojifiles/packages/blktap/3.37.4/1.0.1.0.linstor.1.xcpng8.2/x86_64/blktap-3.37.4-1.0.1.0.linstor.1.xcpng8.2.x86_64.rpm

                                          https://koji.xcp-ng.org/kojifiles/packages/xcp-ng-linstor/1.0/1.xcpng8.2/noarch/xcp-ng-linstor-1.0-1.xcpng8.2.noarch.rpm

                                          https://koji.xcp-ng.org/kojifiles/packages/xcp-ng-release-linstor/1.2/1.xcpng8.2/noarch/xcp-ng-release-linstor-1.2-1.xcpng8.2.noarch.rpm

                                          https://koji.xcp-ng.org/kojifiles/packages/http-nbd-transfer/1.2.0/1.xcpng8.2/x86_64/http-nbd-transfer-1.2.0-1.xcpng8.2.x86_64.rpm

                                          https://koji.xcp-ng.org/kojifiles/packages/sm/2.30.7/1.3.0.linstor.7.xcpng8.2/x86_64/sm-2.30.7-1.3.0.linstor.7.xcpng8.2.x86_64.rpm

                                          1 Reply Last reply Reply Quote 0
                                          • ronan-aR Offline
                                            ronan-a Vates ๐Ÿช XCP-ng Team @Swen
                                            last edited by ronan-a

                                            @Swen said in XOSTOR hyperconvergence preview:

                                            @ronan-a perfect, thx! Is this the new release Olivier was talking about? Can you provide some information when to expect the first stable release?

If we don't have a new critical bug, normally in a few weeks.

                                            @fred974

Or do I simply install the new RPMs without running the installer?

You can update the packages using yum alone if you already have the xcp-ng-linstor yum repo config. There is no need to manually download the packages from Koji.
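In other words, something like this (a sketch, assuming the repo was set up by the installation script; the package list mirrors the RPM names given earlier in the thread):

```shell
# Refresh metadata and pull the new builds from the configured repo.
yum clean metadata
yum update blktap http-nbd-transfer sm xcp-ng-linstor xcp-ng-release-linstor
```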

                                            F SwenS 2 Replies Last reply Reply Quote 0
                                            • First post
                                              Last post