XCP-ng

    XOSTOR hyperconvergence preview

    • Maelstrom96

      @ronan-a Since XOSTOR is supposed to be stable now, I figured I would try it out with a new setup of 3 newly installed 8.2 nodes.

      I used the CLI to deploy it. It all went well, and the SR was quickly ready. I was even able to migrate a disk to the Linstor SR and boot the VM. However, after rebooting the master, it seems like the SR doesn't want to allow any disk migration, and manual scans are failing. I've tried fully unmounting and remounting the SR and restarting the toolstack, but nothing seems to help. The disk that was on Linstor is still accessible and the VM is able to boot.

      Here is the error I'm getting:

      sr.scan
      {
        "id": "e1a9bf4d-26ad-3ef6-b4a5-db98d012e0d9"
      }
      {
        "code": "SR_BACKEND_FAILURE_47",
        "params": [
          "",
          "The SR is not available [opterr=Database is not mounted]",
          ""
        ],
        "task": {
          "uuid": "a467bd90-8d47-09cc-b8ac-afa35056ff25",
          "name_label": "Async.SR.scan",
          "name_description": "",
          "allowed_operations": [],
          "current_operations": {},
          "created": "20240502T21:40:00Z",
          "finished": "20240502T21:40:01Z",
          "status": "failure",
          "resident_on": "OpaqueRef:b3e2f390-f45f-4614-a150-1eee53f204e1",
          "progress": 1,
          "type": "<none/>",
          "result": "",
          "error_info": [
            "SR_BACKEND_FAILURE_47",
            "",
            "The SR is not available [opterr=Database is not mounted]",
            ""
          ],
          "other_config": {},
          "subtask_of": "OpaqueRef:NULL",
          "subtasks": [],
          "backtrace": "(((process xapi)(filename lib/backtrace.ml)(line 210))((process xapi)(filename ocaml/xapi/storage_access.ml)(line 32))((process xapi)(filename lib/xapi-stdext-pervasives/pervasiveext.ml)(line 24))((process xapi)(filename lib/xapi-stdext-pervasives/pervasiveext.ml)(line 35))((process xapi)(filename ocaml/xapi/message_forwarding.ml)(line 131))((process xapi)(filename lib/xapi-stdext-pervasives/pervasiveext.ml)(line 24))((process xapi)(filename ocaml/xapi/rbac.ml)(line 205))((process xapi)(filename ocaml/xapi/server_helpers.ml)(line 95)))"
        },
        "message": "SR_BACKEND_FAILURE_47(, The SR is not available [opterr=Database is not mounted], )",
        "name": "XapiError",
        "stack": "XapiError: SR_BACKEND_FAILURE_47(, The SR is not available [opterr=Database is not mounted], )
          at Function.wrap (file:///opt/xo/xo-builds/xen-orchestra-202404270302/packages/xen-api/_XapiError.mjs:16:12)
          at default (file:///opt/xo/xo-builds/xen-orchestra-202404270302/packages/xen-api/_getTaskResult.mjs:11:29)
          at Xapi._addRecordToCache (file:///opt/xo/xo-builds/xen-orchestra-202404270302/packages/xen-api/index.mjs:1029:24)
          at file:///opt/xo/xo-builds/xen-orchestra-202404270302/packages/xen-api/index.mjs:1063:14
          at Array.forEach (<anonymous>)
          at Xapi._processEvents (file:///opt/xo/xo-builds/xen-orchestra-202404270302/packages/xen-api/index.mjs:1053:12)
          at Xapi._watchEvents (file:///opt/xo/xo-builds/xen-orchestra-202404270302/packages/xen-api/index.mjs:1226:14)"
      }
      

      I quickly glanced over the source code and the SM logs to see if I could identify what was going on but it doesn't seem to be a simple thing.

      Logs from SM:

      May  2 13:22:02 xcp-ng-labs-host01 SM: [19242] LinstorSR.scan for e1a9bf4d-26ad-3ef6-b4a5-db98d012e0d9
      May  2 13:22:02 xcp-ng-labs-host01 SM: [19242] Raising exception [47, The SR is not available [opterr=Database is not mounted]]
      May  2 13:22:02 xcp-ng-labs-host01 SM: [19242] lock: released /var/lock/sm/e1a9bf4d-26ad-3ef6-b4a5-db98d012e0d9/sr
      May  2 13:22:02 xcp-ng-labs-host01 SM: [19242] ***** generic exception: sr_scan: EXCEPTION <class 'SR.SROSError'>, The SR is not available [opterr=Database is not mounted]
      May  2 13:22:02 xcp-ng-labs-host01 SM: [19242]   File "/opt/xensource/sm/SRCommand.py", line 110, in run
      May  2 13:22:02 xcp-ng-labs-host01 SM: [19242]     return self._run_locked(sr)
      May  2 13:22:02 xcp-ng-labs-host01 SM: [19242]   File "/opt/xensource/sm/SRCommand.py", line 159, in _run_locked
      May  2 13:22:02 xcp-ng-labs-host01 SM: [19242]     rv = self._run(sr, target)
      May  2 13:22:02 xcp-ng-labs-host01 SM: [19242]   File "/opt/xensource/sm/SRCommand.py", line 364, in _run
      May  2 13:22:02 xcp-ng-labs-host01 SM: [19242]     return sr.scan(self.params['sr_uuid'])
      May  2 13:22:02 xcp-ng-labs-host01 SM: [19242]   File "/opt/xensource/sm/LinstorSR", line 536, in wrap
      May  2 13:22:02 xcp-ng-labs-host01 SM: [19242]     return load(self, *args, **kwargs)
      May  2 13:22:02 xcp-ng-labs-host01 SM: [19242]   File "/opt/xensource/sm/LinstorSR", line 521, in load
      May  2 13:22:02 xcp-ng-labs-host01 SM: [19242]     return wrapped_method(self, *args, **kwargs)
      May  2 13:22:02 xcp-ng-labs-host01 SM: [19242]   File "/opt/xensource/sm/LinstorSR", line 381, in wrapped_method
      May  2 13:22:02 xcp-ng-labs-host01 SM: [19242]     return method(self, *args, **kwargs)
      May  2 13:22:02 xcp-ng-labs-host01 SM: [19242]   File "/opt/xensource/sm/LinstorSR", line 777, in scan
      May  2 13:22:02 xcp-ng-labs-host01 SM: [19242]     opterr='Database is not mounted'
      May  2 13:22:02 xcp-ng-labs-host01 SM: [19242]
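
When SMlog is busy, it can help to pull out only the raised exceptions. A minimal sketch, assuming the "Raising exception [code, message [opterr=...]]" layout seen in the log lines above; extract_sm_exceptions is a hypothetical helper and not part of the sm package.

```python
import re

# Matches lines such as:
#   ... SM: [19242] Raising exception [47, The SR is not available [opterr=Database is not mounted]]
# The layout is assumed from the excerpt in this thread, not from any SM specification.
RAISE_RE = re.compile(r"Raising exception \[(\d+), (.*?) \[opterr=(.*?)\]\]")


def extract_sm_exceptions(log_text):
    """Return a list of (code, message, opterr) tuples found in SMlog text."""
    found = []
    for line in log_text.splitlines():
        match = RAISE_RE.search(line)
        if match:
            found.append((int(match.group(1)), match.group(2), match.group(3)))
    return found


# Two lines copied from the log excerpt in this thread.
SAMPLE = (
    "May  2 13:22:02 xcp-ng-labs-host01 SM: [19242] LinstorSR.scan for e1a9bf4d-26ad-3ef6-b4a5-db98d012e0d9\n"
    "May  2 13:22:02 xcp-ng-labs-host01 SM: [19242] Raising exception [47, The SR is not available [opterr=Database is not mounted]]\n"
)

if __name__ == "__main__":
    for code, message, opterr in extract_sm_exceptions(SAMPLE):
        print(code, message, opterr)
```

On a host, the same function could be fed the contents of the SMlog file instead of SAMPLE.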
      
      • olivierlambert Vates 🪐 Co-Founder CEO

        Have you restarted the satellites?

        • ronan-a Vates 🪐 XCP-ng Team @Maelstrom96

          @Maelstrom96 said in XOSTOR hyperconvergence preview:

          However, after rebooting the master, it seems like the SR doesn't want to allow any disk migration, and manual scans are failing.

          What's the status of these commands on each host?

          systemctl status linstor-controller
          systemctl status linstor-satellite
          systemctl status drbd-reactor
          mountpoint /var/lib/linstor
          drbdsetup events2
          

          Also please share your SMlog files. 🙂

          • Maelstrom96 @ronan-a

            @ronan-a said in XOSTOR hyperconvergence preview:

            drbdsetup events2

            Host1:

            [09:49 xcp-ng-labs-host01 ~]# systemctl status linstor-controller
            ● linstor-controller.service - drbd-reactor controlled linstor-controller
               Loaded: loaded (/usr/lib/systemd/system/linstor-controller.service; disabled; vendor preset: disabled)
              Drop-In: /run/systemd/system/linstor-controller.service.d
                       └─reactor.conf
               Active: active (running) since Thu 2024-05-02 13:24:32 PDT; 20h ago
             Main PID: 21340 (java)
               CGroup: /system.slice/linstor-controller.service
                       └─21340 /usr/lib/jvm/jre-11/bin/java -Xms32M -classpath /usr/share/linstor-server/lib/conf:/usr/share/linstor-server/lib/* com.linbit.linstor.core.Controller --logs=/var/log/linstor-controller --config-directory=/etc/linstor
            [09:49 xcp-ng-labs-host01 ~]# systemctl status linstor-satellite
            ● linstor-satellite.service - LINSTOR Satellite Service
               Loaded: loaded (/usr/lib/systemd/system/linstor-satellite.service; enabled; vendor preset: disabled)
              Drop-In: /etc/systemd/system/linstor-satellite.service.d
                       └─override.conf
               Active: active (running) since Wed 2024-05-01 16:04:05 PDT; 1 day 17h ago
             Main PID: 1947 (java)
               CGroup: /system.slice/linstor-satellite.service
                       ├─1947 /usr/lib/jvm/jre-11/bin/java -Xms32M -classpath /usr/share/linstor-server/lib/conf:/usr/share/linstor-server/lib/* com.linbit.linstor.core.Satellite --logs=/var/log/linstor-satellite --config-directory=/etc/linstor
                       ├─2109 drbdsetup events2 all
                       └─2347 /usr/sbin/dmeventd
            [09:49 xcp-ng-labs-host01 ~]# systemctl status drbd-reactor
            ● drbd-reactor.service - DRBD-Reactor Service
               Loaded: loaded (/usr/lib/systemd/system/drbd-reactor.service; enabled; vendor preset: disabled)
              Drop-In: /etc/systemd/system/drbd-reactor.service.d
                       └─override.conf
               Active: active (running) since Wed 2024-05-01 16:04:11 PDT; 1 day 17h ago
                 Docs: man:drbd-reactor
                       man:drbd-reactorctl
                       man:drbd-reactor.toml
             Main PID: 1950 (drbd-reactor)
               CGroup: /system.slice/drbd-reactor.service
                       ├─1950 /usr/sbin/drbd-reactor
                       └─1976 drbdsetup events2 --full --poll
            [09:49 xcp-ng-labs-host01 ~]# mountpoint /var/lib/linstor
            /var/lib/linstor is a mountpoint
            [09:49 xcp-ng-labs-host01 ~]# drbdsetup events2
            exists resource name:xcp-persistent-database role:Primary suspended:no force-io-failures:no may_promote:no promotion_score:10103
            exists connection name:xcp-persistent-database peer-node-id:1 conn-name:xcp-ng-labs-host03 connection:Connected role:Secondary
            exists connection name:xcp-persistent-database peer-node-id:2 conn-name:xcp-ng-labs-host02 connection:Connected role:Secondary
            exists device name:xcp-persistent-database volume:0 minor:1000 backing_dev:/dev/linstor_group/xcp-persistent-database_00000 disk:UpToDate client:no quorum:yes
            exists peer-device name:xcp-persistent-database peer-node-id:1 conn-name:xcp-ng-labs-host03 volume:0 replication:Established peer-disk:UpToDate peer-client:no resync-suspended:no
            exists path name:xcp-persistent-database peer-node-id:1 conn-name:xcp-ng-labs-host03 local:ipv4:10.100.0.200:7000 peer:ipv4:10.100.0.202:7000 established:yes
            exists peer-device name:xcp-persistent-database peer-node-id:2 conn-name:xcp-ng-labs-host02 volume:0 replication:Established peer-disk:UpToDate peer-client:no resync-suspended:no
            exists path name:xcp-persistent-database peer-node-id:2 conn-name:xcp-ng-labs-host02 local:ipv4:10.100.0.200:7000 peer:ipv4:10.100.0.201:7000 established:yes
            exists resource name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 role:Secondary suspended:no force-io-failures:no may_promote:no promotion_score:10103
            exists connection name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:1 conn-name:xcp-ng-labs-host03 connection:Connected role:Secondary
            exists connection name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:0 conn-name:xcp-ng-labs-host02 connection:Connected role:Primary
            exists device name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 volume:0 minor:1001 backing_dev:/dev/linstor_group/xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0_00000 disk:UpToDate client:no quorum:yes
            exists peer-device name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:1 conn-name:xcp-ng-labs-host03 volume:0 replication:Established peer-disk:UpToDate peer-client:no resync-suspended:no
            exists path name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:1 conn-name:xcp-ng-labs-host03 local:ipv4:10.100.0.200:7001 peer:ipv4:10.100.0.202:7001 established:yes
            exists peer-device name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:0 conn-name:xcp-ng-labs-host02 volume:0 replication:Established peer-disk:UpToDate peer-client:no resync-suspended:no
            exists path name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:0 conn-name:xcp-ng-labs-host02 local:ipv4:10.100.0.200:7001 peer:ipv4:10.100.0.201:7001 established:yes
            exists -
            

            Host2:

            [09:51 xcp-ng-labs-host02 ~]# systemctl status linstor-controller
            ● linstor-controller.service - drbd-reactor controlled linstor-controller
               Loaded: loaded (/usr/lib/systemd/system/linstor-controller.service; disabled; vendor preset: disabled)
              Drop-In: /run/systemd/system/linstor-controller.service.d
                       └─reactor.conf
               Active: inactive (dead)
            [09:51 xcp-ng-labs-host02 ~]# systemctl status linstor-satellite
            ● linstor-satellite.service - LINSTOR Satellite Service
               Loaded: loaded (/usr/lib/systemd/system/linstor-satellite.service; enabled; vendor preset: disabled)
              Drop-In: /etc/systemd/system/linstor-satellite.service.d
                       └─override.conf
               Active: active (running) since Thu 2024-05-02 10:26:59 PDT; 23h ago
             Main PID: 1990 (java)
               CGroup: /system.slice/linstor-satellite.service
                       ├─1990 /usr/lib/jvm/jre-11/bin/java -Xms32M -classpath /usr/share/linstor-server/lib/conf:/usr/share/linstor-server/lib/* com.linbit.linstor.core.Satellite --logs=/var/log/linstor-satellite --config-directory=/etc/linstor
                       ├─2128 drbdsetup events2 all
                       └─2552 /usr/sbin/dmeventd
            [09:51 xcp-ng-labs-host02 ~]# systemctl status drbd-reactor
            ● drbd-reactor.service - DRBD-Reactor Service
               Loaded: loaded (/usr/lib/systemd/system/drbd-reactor.service; enabled; vendor preset: disabled)
              Drop-In: /etc/systemd/system/drbd-reactor.service.d
                       └─override.conf
               Active: active (running) since Thu 2024-05-02 10:27:07 PDT; 23h ago
                 Docs: man:drbd-reactor
                       man:drbd-reactorctl
                       man:drbd-reactor.toml
             Main PID: 1989 (drbd-reactor)
               CGroup: /system.slice/drbd-reactor.service
                       ├─1989 /usr/sbin/drbd-reactor
                       └─2035 drbdsetup events2 --full --poll
            [09:51 xcp-ng-labs-host02 ~]# mountpoint /var/lib/linstor
            /var/lib/linstor is not a mountpoint
            [09:51 xcp-ng-labs-host02 ~]# drbdsetup events2
            exists resource name:xcp-persistent-database role:Secondary suspended:no force-io-failures:no may_promote:no promotion_score:10103
            exists connection name:xcp-persistent-database peer-node-id:0 conn-name:xcp-ng-labs-host01 connection:Connected role:Primary
            exists connection name:xcp-persistent-database peer-node-id:1 conn-name:xcp-ng-labs-host03 connection:Connected role:Secondary
            exists device name:xcp-persistent-database volume:0 minor:1000 backing_dev:/dev/linstor_group/xcp-persistent-database_00000 disk:UpToDate client:no quorum:yes
            exists peer-device name:xcp-persistent-database peer-node-id:0 conn-name:xcp-ng-labs-host01 volume:0 replication:Established peer-disk:UpToDate peer-client:no resync-suspended:no
            exists path name:xcp-persistent-database peer-node-id:0 conn-name:xcp-ng-labs-host01 local:ipv4:10.100.0.201:7000 peer:ipv4:10.100.0.200:7000 established:yes
            exists peer-device name:xcp-persistent-database peer-node-id:1 conn-name:xcp-ng-labs-host03 volume:0 replication:Established peer-disk:UpToDate peer-client:no resync-suspended:no
            exists path name:xcp-persistent-database peer-node-id:1 conn-name:xcp-ng-labs-host03 local:ipv4:10.100.0.201:7000 peer:ipv4:10.100.0.202:7000 established:yes
            exists resource name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 role:Primary suspended:no force-io-failures:no may_promote:no promotion_score:10103
            exists connection name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:2 conn-name:xcp-ng-labs-host01 connection:Connected role:Secondary
            exists connection name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:1 conn-name:xcp-ng-labs-host03 connection:Connected role:Secondary
            exists device name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 volume:0 minor:1001 backing_dev:/dev/linstor_group/xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0_00000 disk:UpToDate client:no quorum:yes
            exists peer-device name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:2 conn-name:xcp-ng-labs-host01 volume:0 replication:Established peer-disk:UpToDate peer-client:no resync-suspended:no
            exists path name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:2 conn-name:xcp-ng-labs-host01 local:ipv4:10.100.0.201:7001 peer:ipv4:10.100.0.200:7001 established:yes
            exists peer-device name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:1 conn-name:xcp-ng-labs-host03 volume:0 replication:Established peer-disk:UpToDate peer-client:no resync-suspended:no
            exists path name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:1 conn-name:xcp-ng-labs-host03 local:ipv4:10.100.0.201:7001 peer:ipv4:10.100.0.202:7001 established:yes
            exists -
            

            Host3:

            [09:51 xcp-ng-labs-host03 ~]# systemctl status linstor-controller
            ● linstor-controller.service - drbd-reactor controlled linstor-controller
               Loaded: loaded (/usr/lib/systemd/system/linstor-controller.service; disabled; vendor preset: disabled)
              Drop-In: /run/systemd/system/linstor-controller.service.d
                       └─reactor.conf
               Active: inactive (dead)
            [09:52 xcp-ng-labs-host03 ~]# systemctl status linstor-satellite
            ● linstor-satellite.service - LINSTOR Satellite Service
               Loaded: loaded (/usr/lib/systemd/system/linstor-satellite.service; enabled; vendor preset: disabled)
              Drop-In: /etc/systemd/system/linstor-satellite.service.d
                       └─override.conf
               Active: active (running) since Thu 2024-05-02 10:10:16 PDT; 23h ago
             Main PID: 1937 (java)
               CGroup: /system.slice/linstor-satellite.service
                       ├─1937 /usr/lib/jvm/jre-11/bin/java -Xms32M -classpath /usr/share/linstor-server/lib/conf:/usr/share/linstor-server/lib/* com.linbit.linstor.core.Satellite --logs=/var/log/linstor-satellite --config-directory=/etc/linstor
                       ├─2151 drbdsetup events2 all
                       └─2435 /usr/sbin/dmeventd
            [09:52 xcp-ng-labs-host03 ~]# systemctl status drbd-reactor
            ● drbd-reactor.service - DRBD-Reactor Service
               Loaded: loaded (/usr/lib/systemd/system/drbd-reactor.service; enabled; vendor preset: disabled)
              Drop-In: /etc/systemd/system/drbd-reactor.service.d
                       └─override.conf
               Active: active (running) since Thu 2024-05-02 10:10:26 PDT; 23h ago
                 Docs: man:drbd-reactor
                       man:drbd-reactorctl
                       man:drbd-reactor.toml
             Main PID: 1939 (drbd-reactor)
               CGroup: /system.slice/drbd-reactor.service
                       ├─1939 /usr/sbin/drbd-reactor
                       └─1981 drbdsetup events2 --full --poll
            [09:52 xcp-ng-labs-host03 ~]# mountpoint /var/lib/linstor
            /var/lib/linstor is not a mountpoint
            [09:52 xcp-ng-labs-host03 ~]# drbdsetup events2
            exists resource name:xcp-persistent-database role:Secondary suspended:no force-io-failures:no may_promote:no promotion_score:10103
            exists connection name:xcp-persistent-database peer-node-id:0 conn-name:xcp-ng-labs-host01 connection:Connected role:Primary
            exists connection name:xcp-persistent-database peer-node-id:2 conn-name:xcp-ng-labs-host02 connection:Connected role:Secondary
            exists device name:xcp-persistent-database volume:0 minor:1000 backing_dev:/dev/linstor_group/xcp-persistent-database_00000 disk:UpToDate client:no quorum:yes
            exists peer-device name:xcp-persistent-database peer-node-id:0 conn-name:xcp-ng-labs-host01 volume:0 replication:Established peer-disk:UpToDate peer-client:no resync-suspended:no
            exists path name:xcp-persistent-database peer-node-id:0 conn-name:xcp-ng-labs-host01 local:ipv4:10.100.0.202:7000 peer:ipv4:10.100.0.200:7000 established:yes
            exists peer-device name:xcp-persistent-database peer-node-id:2 conn-name:xcp-ng-labs-host02 volume:0 replication:Established peer-disk:UpToDate peer-client:no resync-suspended:no
            exists path name:xcp-persistent-database peer-node-id:2 conn-name:xcp-ng-labs-host02 local:ipv4:10.100.0.202:7000 peer:ipv4:10.100.0.201:7000 established:yes
            exists resource name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 role:Secondary suspended:no force-io-failures:no may_promote:no promotion_score:10103
            exists connection name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:2 conn-name:xcp-ng-labs-host01 connection:Connected role:Secondary
            exists connection name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:0 conn-name:xcp-ng-labs-host02 connection:Connected role:Primary
            exists device name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 volume:0 minor:1001 backing_dev:/dev/linstor_group/xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0_00000 disk:UpToDate client:no quorum:yes
            exists peer-device name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:2 conn-name:xcp-ng-labs-host01 volume:0 replication:Established peer-disk:UpToDate peer-client:no resync-suspended:no
            exists path name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:2 conn-name:xcp-ng-labs-host01 local:ipv4:10.100.0.202:7001 peer:ipv4:10.100.0.200:7001 established:yes
            exists peer-device name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:0 conn-name:xcp-ng-labs-host02 volume:0 replication:Established peer-disk:UpToDate peer-client:no resync-suspended:no
            exists path name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:0 conn-name:xcp-ng-labs-host02 local:ipv4:10.100.0.202:7001 peer:ipv4:10.100.0.201:7001 established:yes
            exists -
            
            

            Will be sending the debug file as a DM.

            Edit: Just as a sanity check, I tried to reboot the master instead of just restarting the toolstack, and the linstor SR seems to be working as expected again. The XOSTOR tab in XOA now populates (it just errored out before) and the SR scan now goes through.

            Edit2: Was able to move a VDI, but then, the same exact error started to happen again. No idea why.

            • Theoi-Meteoroi

              You lost quorum.

              I would start by looking at DRBD: that is the underlying part that isn't working at the moment. When I deployed this I wanted to understand the parts. The key to the LINSTOR layer is that DRBD stores the cluster state and membership.

              I'd advise reading the DRBD docs as well as the LINSTOR docs to find the commands you need to stand this back up. I really don't advise using anything spinning for disk. SSD and NVMe are the ticket. You can make rust work but it's terminally slow. I found 3 TB disks were OK (~60 MB/sec), but 9.1 (10) TB disks were just awful, with 20-40 MB/sec the best I saw. I removed all the XOSTOR stuff this week to maybe reinstall on some 4 TB NVMe.

              The upside is that all the time spent learning DRBD and LINSTOR was helpful when I decided to use the Piraeus operator at the Kubernetes level. It's basically all the same bits, built from source on the nodes you deploy on, and it includes a CSI driver.

              • ronan-a Vates 🪐 XCP-ng Team @Theoi-Meteoroi

                @Theoi-Meteoroi said in XOSTOR hyperconvergence preview:

                You lost quorum.

                Not a quorum issue:

                exists device name:xcp-persistent-database volume:0 minor:1000 backing_dev:/dev/linstor_group/xcp-persistent-database_00000 disk:UpToDate client:no quorum:yes
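
The quorum flag quoted above can be checked mechanically across a full events2 dump. A minimal sketch, assuming the space-separated key:value layout shown in the drbdsetup events2 output in this thread; parse_events2_line and devices_without_quorum are hypothetical helper names, not part of DRBD or the sm driver.

```python
def parse_events2_line(line):
    """Split one `drbdsetup events2` line into (event, object_type, fields)."""
    tokens = line.split()
    if len(tokens) < 2:
        raise ValueError("not an events2 line: %r" % line)
    fields = {}
    for token in tokens[2:]:
        # Split on the first colon only, so values such as
        # "ipv4:10.100.0.200:7000" stay intact.
        key, sep, value = token.partition(":")
        if sep:
            fields[key] = value
    return tokens[0], tokens[1], fields


def devices_without_quorum(output):
    """Return the names of device entries whose quorum field is not 'yes'."""
    lost = []
    for line in output.splitlines():
        tokens = line.split()
        if len(tokens) < 2 or tokens[1] != "device":
            continue
        _, _, fields = parse_events2_line(line)
        if fields.get("quorum") != "yes":
            lost.append(fields.get("name"))
    return lost


# The device line quoted in the post above.
SAMPLE = (
    "exists device name:xcp-persistent-database volume:0 minor:1000 "
    "backing_dev:/dev/linstor_group/xcp-persistent-database_00000 "
    "disk:UpToDate client:no quorum:yes"
)

if __name__ == "__main__":
    print(devices_without_quorum(SAMPLE))
```

An empty list means every device line reported quorum:yes, which is the situation on all three hosts here.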

                • ronan-a Vates 🪐 XCP-ng Team @Maelstrom96

                  @Maelstrom96 Thank you for the logs, I'm trying to understand the issue.
                  For the moment I don't see a problem regarding the status of the services.
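
For readers comparing their own hosts against the output posted earlier, here is a rough sketch of the pattern visible there: one host runs linstor-controller and has /var/lib/linstor mounted, the others have neither. It assumes that pattern is the healthy one, which this thread suggests but does not state as a rule; check_cluster_state and the dictionary layout are hypothetical, and the values are transcribed by hand from the three hosts above.

```python
def check_cluster_state(hosts):
    """Return a list of problems; an empty list matches the pattern in this thread.

    hosts maps a host name to {"controller_active": bool, "db_mounted": bool}.
    """
    controllers = sorted(h for h, s in hosts.items() if s["controller_active"])
    mounted = sorted(h for h, s in hosts.items() if s["db_mounted"])
    problems = []
    if len(controllers) != 1:
        problems.append("expected one active controller, found %d" % len(controllers))
    if mounted != controllers:
        problems.append(
            "database mounted on %s but controller active on %s" % (mounted, controllers)
        )
    return problems


# State transcribed from the systemctl and mountpoint output posted above.
THREAD_STATE = {
    "xcp-ng-labs-host01": {"controller_active": True, "db_mounted": True},
    "xcp-ng-labs-host02": {"controller_active": False, "db_mounted": False},
    "xcp-ng-labs-host03": {"controller_active": False, "db_mounted": False},
}

if __name__ == "__main__":
    print(check_cluster_state(THREAD_STATE))
```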

                  • ronan-a Vates 🪐 XCP-ng Team @Maelstrom96

                    @Maelstrom96 It sounds like a race condition or a bad mount of the database. But I'm not sure, so I will add more logs for the next RPM. We plan to release it in a few weeks.

                    • Maelstrom96 @ronan-a

                      @ronan-a I will be testing my theory a little bit later today, but I believe it might be a mismatch between the node name LINSTOR expects and the hostname currently set on Dom0. We had the hostname of the node updated before the cluster was spun up, but I think it still had the previous name active when the Linstor SR was created.

                      This means that the node name doesn't match here:
                      https://github.com/xcp-ng/sm/blob/e951676098c80e6da6de4d4653f496b15f5a8cb9/drivers/linstorvolumemanager.py#L2641C21-L2641C41

                      I will try to revert the hostname and see if it fixes everything.

                      Edit: Just tested and reverted the hostname to the default one, which matches what's in LINSTOR, and it works again. So it seems that changing a hostname after the cluster is provisioned is a no-no.
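
The failure described here comes down to a name comparison, so it is cheap to check before touching anything else. A minimal sketch, assuming an exact string match and that the node names were copied by hand (for instance from linstor node list); find_node_name_mismatch is a hypothetical helper and is not the check performed in linstorvolumemanager.py.

```python
import socket


def find_node_name_mismatch(hostname, linstor_node_names):
    """Return None if hostname is a known LINSTOR node name, else a description.

    The comparison is an exact, case-sensitive match; that is an assumption
    made for this sketch.
    """
    if hostname in linstor_node_names:
        return None
    return "hostname %r is not among the LINSTOR nodes: %s" % (
        hostname,
        ", ".join(sorted(linstor_node_names)),
    )


if __name__ == "__main__":
    # Node names from the lab in this thread; replace them with your own.
    nodes = ["xcp-ng-labs-host01", "xcp-ng-labs-host02", "xcp-ng-labs-host03"]
    print(find_node_name_mismatch(socket.gethostname(), nodes))
```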

                      • ronan-a Vates 🪐 XCP-ng Team @Maelstrom96

                        @Maelstrom96 Oh! This explanation makes sense, thank you. 🙂 Yes, if the hostname changes, the LINSTOR node name must be changed as well, otherwise the path to the database resource will not be found.

                        • Maelstrom96 @ronan-a

                          @ronan-a Do you know of a way to update a node name in Linstor? I've tried to look in their documentation and checked through CLI commands but couldn't find a way.

                          • ronan-a Vates 🪐 XCP-ng Team @Maelstrom96

                            @Maelstrom96 Well there is no simple helper to do that using the CLI.

                            So you can create a new node using:

                            linstor node create --node-type Combined <NAME> <IP>
                            

                            Then you must evacuate the old node to preserve the replication count:

                            linstor node evacuate <OLD_NAME>
                            

                            Next, you can change your hostname and restart the services on each host:

                            systemctl stop linstor-controller
                            systemctl restart linstor-satellite
                            

                            Finally you can delete the node:

                            linstor node delete <OLD_NAME>
                            

                            After that you must recreate the diskless resources if necessary. Exec linstor advise r to see the commands to execute.
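
The steps above, collected in order so they can be reviewed before anything is run. A sketch only: it builds the command strings from this post and prints them, and it executes nothing. rename_plan is a hypothetical helper, the names and IP in the usage are placeholders, and the satellite unit is spelled linstor-satellite as in the systemctl status output earlier in the thread.

```python
def rename_plan(old_name, new_name, ip):
    """Return the rename procedure from this post as an ordered list of strings.

    Entries starting with '#' are manual steps, not commands.
    """
    return [
        "linstor node create --node-type Combined %s %s" % (new_name, ip),
        "linstor node evacuate %s" % old_name,
        "# change the Dom0 hostname to %s (manual step)" % new_name,
        "systemctl stop linstor-controller",
        "systemctl restart linstor-satellite",
        "linstor node delete %s" % old_name,
        "linstor advise r",
    ]


if __name__ == "__main__":
    # Placeholder values; substitute your own node names and storage IP.
    for step in rename_plan("old-host", "new-host", "192.0.2.10"):
        print(step)
```

Printing rather than executing is deliberate: as the next reply shows, evacuate can fail, and each step is worth checking by hand.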

                            • Maelstrom96 @ronan-a

                              @ronan-a Thanks a lot for that procedure.

                              Ended up needing to do a little bit more, since for some reason, "evacuate" failed. I deleted the node and then went and just manually recreated my resources using:

                              linstor resource create --auto-place +1 <resource_name>
                              

                              That didn't work at first because the new node didn't have a storage pool configured. Creating one required this command (NOTE: this is only valid if your SR was set up as thin):

                              linstor storage-pool create lvmthin <node_name> xcp-sr-linstor_group_thin_device linstor_group/thin_device
                              

                              Also, it's worth noting that before actually re-creating the resources, you might want to manually remove any lingering Logical Volumes that were left behind if evacuate failed.

                              Find volumes with:

                              lvdisplay
                              

                              and then delete them with:

                              lvremove <LV Path>
                              

                              example:

                              lvremove /dev/linstor_group/xcp-persistent-database_00000
                              
                              • flibbi

                                I've been using XCP-ng with NFS shared storage for some months now and I am happy with it so far.
                                I have some SSDs in a 3-server setup and I'd like to test XOSTOR. Before I set it up, I have some questions, as I am only familiar with Virtuozzo/Acronis storage and Ceph so far. Do the same restrictions that exist in Ceph, for example, apply to XOSTOR?

                                • only enterprise ssds because of power loss protection
                                • do not use RAID (especially RAID 0) if the controller is capable of HBA mode. I have a Dell H330 controller, and if RAID is not a problem, I'd like to set up the OS on hardware RAID 1 and XOSTOR on the remaining SSDs, each as a single-disk RAID 0 array. If HBA mode is preferred, I think I need to stick with software RAID 1. Software RAID 1 is working fine, but I've had some problems in the past when the boot drive of the mirror died...
                                • I've installed Xen Orchestra manually. Once XOSTOR is installed, will the XOSTOR button in Xen Orchestra work, or is it only available in the XOA appliance?

                                Thanks for your answers!

                                • olivierlambertO Offline
                                  olivierlambert Vates 🪐 Co-Founder CEO
                                  last edited by olivierlambert

                                  1. That shouldn't be a problem: if it goes bad on one disk on one host, it should be resynced. It's not one big shared filesystem, it's only block space split between the virtual disks that are created.
                                  2. That should work fine (@ronan-a will confirm, but I don't think there's any very low-level optimization that could be affected by a RAID card).
                                  3. The XOSTOR UI is only available in XOA, but you'll be able to manage it and use all the features from the CLI.
                                  • J Offline
                                    JeffBerntsen Top contributor @olivierlambert
                                    last edited by

                                    @flibbi, @olivierlambert RAID shouldn't be a problem for XOSTOR. During some of my testing shortly after the preview was released, I ran it on software RAID 10 arrays on each of my test servers. As long as the RAID isn't some sort of "fake RAID" and is done in hardware, it should work fine.

                                    • H Offline
                                      ha_tu_su
                                      last edited by

                                      I read on the blog that XOSTOR has been officially released and wanted to test it. I have installed v8.2.1 of XCP-ng on the server nodes. On a separate computer in the management network I have XO built from sources. I have updated the hosts to the latest packages.

                                      Then I started following the instructions from the first post in the thread. I am getting an error at the sr-create step.

                                      [15:11 xcp-ng-vh1 ~]# xe sr-create type=linstor name-label=XOSTOR host-uuid=382d49a5-7435-425e-8588-f56e7a7711f8  device-config:group-name=linstor_group/thin_device device-config:redundancy=2 shared=true device-config:provisioning=thin
                                      Error code: SR_BACKEND_FAILURE_202
                                      Error parameters: , General backend error [opterr=['XENAPI_PLUGIN_FAILURE', 'non-zero exit', '', 'Traceback (most recent call last):\n  File "/etc/xapi.d/plugins/linstor-manager", line 24, in <module>\n    from linstorjournaler import LinstorJournaler\n  File "/opt/xensource/sm/linstorjournaler.py", line 19, in <module>\n    from linstorvolumemanager import LinstorVolumeManager\n  File "/opt/xensource/sm/linstorvolumemanager.py", line 20, in <module>\n    import linstor\nImportError: No module named linstor\n']], 
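
Since the traceback ends in `ImportError: No module named linstor`, one quick check on each host is whether the Python interpreter can import that module at all. A minimal sketch; the helper name `check_py_module` is hypothetical, and it assumes the plugin runs under the system `python`, which the traceback does not confirm:

```shell
# Report whether a given Python interpreter can import a module.
# Prints "ok" or "missing"; a missing interpreter also counts as "missing".
check_py_module() {
    local interpreter="$1"
    local module="$2"
    if "$interpreter" -c "import $module" >/dev/null 2>&1; then
        echo "ok"
    else
        echo "missing"
    fi
}

# On each host (the interpreter name "python" is an assumption):
#   check_py_module python linstor
#   rpm -q python-linstor
```

Comparing the result on a failing host with the result on a working host may narrow down whether the package or the environment differs.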
                                      

                                      I tried to find possible causes on the forums, where it was mentioned that the linstor packages are not yet mature for the 8.3 release and that the different Python versions in XCP-ng 8.2 and 8.3 can cause issues. I am on the 8.2 branch though, so I'm not sure what I am missing here:

                                      [15:12 xcp-ng-vh1 ~]# cat /etc/os-release 
                                      NAME="XCP-ng"
                                      VERSION="8.2.1"
                                      ID="xenenterprise"
                                      ID_LIKE="centos rhel fedora"
                                      VERSION_ID="8.2.1"
                                      PRETTY_NAME="XCP-ng 8.2.1"
                                      ANSI_COLOR="0;31"
                                      HOME_URL="http://xcp-ng.org/"
                                      BUG_REPORT_URL="https://github.com/xcp-ng/xcp"
                                      

                                      Packages related to linstor on the system:

                                      [20:11 xcp-ng-vh1 ~]# yum list | grep linstor
                                      drbd.x86_64                        9.27.0-1.el7             @xcp-ng-linstor     
                                      drbd-bash-completion.x86_64        9.27.0-1.el7             @xcp-ng-linstor     
                                      drbd-pacemaker.x86_64              9.27.0-1.el7             @xcp-ng-linstor     
                                      drbd-reactor.x86_64                1.4.0-1                  @xcp-ng-linstor     
                                      drbd-udev.x86_64                   9.27.0-1.el7             @xcp-ng-linstor     
                                      drbd-utils.x86_64                  9.27.0-1.el7             @xcp-ng-linstor     
                                      drbd-xen.x86_64                    9.27.0-1.el7             @xcp-ng-linstor     
                                      kmod-drbd.x86_64                   9.2.8_4.19.0+1-1         @xcp-ng-linstor     
                                      linstor-client.noarch              1.21.1-1.xcpng8.2        @xcp-ng-linstor     
                                      linstor-common.noarch              1.26.1-1.el7             @xcp-ng-linstor     
                                      linstor-controller.noarch          1.26.1-1.el7             @xcp-ng-linstor     
                                      linstor-satellite.noarch           1.26.1-1.el7             @xcp-ng-linstor     
                                      python-linstor.noarch              1.21.1-1.xcpng8.2        @xcp-ng-linstor     
                                      sm.x86_64                          2.30.8-10.1.0.linstor.2.xcpng8.2
                                                                                                  @xcp-ng-linstor     
                                      sm-rawhba.x86_64                   2.30.8-10.1.0.linstor.2.xcpng8.2
                                                                                                  @xcp-ng-linstor     
                                      tzdata-java.noarch                 2023c-1.el7              @xcp-ng-linstor     
                                      xcp-ng-linstor.noarch              1.1-3.xcpng8.2           @xcp-ng-updates     
                                      xcp-ng-release-linstor.noarch      1.3-1.xcpng8.2           @xcp-ng-updates     
                                      drbd-debuginfo.x86_64              9.27.0-1.el7             xcp-ng-linstor      
                                      drbd-heartbeat.x86_64              9.27.0-1.el7             xcp-ng-linstor      
                                      sm-debuginfo.x86_64                2.30.8-10.1.0.linstor.2.xcpng8.2
                                                                                                  xcp-ng-linstor      
                                      sm-test-plugins.x86_64             2.30.8-10.1.0.linstor.2.xcpng8.2
                                                                                                  xcp-ng-linstor      
                                      sm-testresults.x86_64              2.30.8-10.1.0.linstor.2.xcpng8.2
                                                                                                  xcp-ng-linstor     
                                      

                                      Any help appreciated.

                                      Thanks.

                                      • H Offline
                                        ha_tu_su @ha_tu_su
                                        last edited by

                                        OK, I had 3 hosts in the pool, and I was getting the above error on 2 of them. To repeat the process from the first post cleanly, I tried the steps on the 3rd host, and SR creation was successful.

                                        Initially, on the 2 hosts, I had used the 'thick' version of the command to prepare the disks. I then deleted the LVM setup, ran wipefs on the disks, and redid the steps using the 'thin' version of the command. My guess is that the disks were not wiped completely, which is why I got the error during SR creation.
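
One way to test that guess is to list the signatures still present on a disk after wiping it: `wipefs` called with only a device name lists what it detects and changes nothing. A minimal sketch; the helper name `report_signatures` and the device `/dev/sdX` are placeholders, not taken from the posts above:

```shell
# Read the output of a plain "wipefs <device>" call on stdin and summarize it.
# wipefs prints nothing when it detects no signatures, so any non-blank
# line means something is still on the disk.
report_signatures() {
    if grep -q '[^[:space:]]'; then
        echo "leftover signatures found"
    else
        echo "clean"
    fi
}

# On a host (replace /dev/sdX with the real disk):
#   wipefs /dev/sdX | report_signatures
```

If signatures are still reported, `wipefs --all /dev/sdX` erases everything it detects, so double-check the device name first.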

                                        I am going to use gparted to wipe the disks properly and then redo the steps. If that doesn't work, I'll nuke the XCP-ng install, reinstall, and check again. Will update the post accordingly.

                                        Cheers.

                                        • H Offline
                                          ha_tu_su @ha_tu_su
                                          last edited by

                                          @ha_tu_su
                                          After using gparted to wipe all the disks, the sr-create command works as expected and creates the XOSTOR SR.

                                          • F Offline
                                            ferrao @ronan-a
                                            last edited by ferrao

                                            @ronan-a and @Maelstrom96 I didn't understand this hostname issue.

                                            Does XOSTOR need a fully functional DNS setup to work? Or was the failure local, caused by the local hostname change?

                                            I didn't understand whether the communication is done directly by IP address or whether DNS name resolution is needed.

                                            I'm particularly interested in this because with XOSTOR I'm considering virtualizing my pfSense firewalls directly and getting rid of the physical servers. In that scenario, if the entire pool reboots, I must guarantee that the two pfSense VMs come back up, with auto start after reboot enabled, so that I can access the entire infrastructure; otherwise I'll be locked out from the outside.
