XCP-ng

    Maelstrom96

    @Maelstrom96

    13 Reputation
    38 Profile views
    30 Posts
    1 Follower
    0 Following
    Location: Montreal, Quebec, Canada


    Best posts made by Maelstrom96

    • RE: XOSTOR hyperconvergence preview

      @gb-123 said in XOSTOR hyperconvergence preview:

      @ronan-a

      VMs would be using LUKS encryption.

      So if only the VDI is replicated and, hypothetically, I lose the master node or any other node actually hosting the VM, will I have to create the VM again using the replicated disk? Or would it be something like DRBD where there are actually 2 VMs running in Active/Passive mode and there is an automatic switchover? Or would it be that one VM is running and the second gets automatically started when the first is down?

      Sorry for the noob questions. I just wanted to be sure of the implementation.

      The VM metadata is stored at the pool level, meaning you wouldn't have to re-create the VM if its current host fails. However, memory isn't replicated in the cluster, except during a live migration, which temporarily copies the VM's memory to the new host so it can be moved.

      DRBD only replicates the VDI, i.e. the disk data, across the active Linstor members. If the VM is stopped or is terminated because of a host failure, you should be able to start it back up on another host in your pool, but by default this requires manual intervention, and you will need to enter your encryption password since it will be a cold boot.

      If you want the VM to automatically restart in case of failure, you can use the HA feature of XCP-ng. This wouldn't remove the need to enter your encryption password since, as explained earlier, the memory isn't replicated and the VM would cold boot from the replicated VDI. Also, keep in mind that enabling HA adds maintenance complexity and might not be worth it.
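
      For reference, a minimal sketch of what that looks like from the CLI, assuming you enable HA using the shared SR and mark the VM for automatic restart (UUIDs are placeholders for your own pool):

      # Enable HA on the pool, using the shared SR for the heartbeat/statefile
      xe pool-ha-enable heartbeat-sr-uuids=<sr-uuid>
      # Ask XAPI to restart this VM automatically if its host fails
      xe vm-param-set uuid=<vm-uuid> ha-restart-priority=restart order=1
      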

      posted in XOSTOR
    • RE: Three-node Networking for XOSTOR

      @T3CCH What you might be looking for: https://xcp-ng.org/docs/networking.html#full-mesh-network

      posted in XOSTOR
    • RE: XOSTOR hyperconvergence preview

      @ronan-a said in XOSTOR hyperconvergence preview:

      @Maelstrom96 We must update our documentation for that. This will probably require executing commands manually during an upgrade.

      Any news on that? We're still pretty much blocked until that's figured out.

      Also, any news on when it will be officially released?

      posted in XOSTOR
    • RE: XOSTOR hyperconvergence preview

      @ronan-a I've checked the commit history and saw that the breaking change seems to be related to the renaming of the KV store. I also just noticed that you renamed the volume namespace. Are there any other breaking changes that would require deleting the SR in order to update the sm package?

      I've written a Python script that copies all of the old KV data to the new KV name and renames the keys for the volume data, and I was wondering if that would be sufficient.

      Thanks,

      posted in XOSTOR
    • RE: XOSTOR hyperconvergence preview

      Hi @ronan-a ,

      So, as we said at some point, we're using a K8s cluster that connects to linstor directly. It's actually going surprisingly well, and we've even deployed it in production with contingency plans in case of failure; it's been rock solid so far.

      We're working on setting up Velero to automatically back up all of our K8s cluster metadata along with the PVs for easy disaster recovery, but we've hit an unfortunate blocker. Here is what we're getting from Velero when attempting the backup/snapshot:

      error:
          message: 'Failed to check and update snapshot content: failed to take snapshot
            of the volume pvc-3602bca1-5b92-4fc7-96af-ce77f35e802c: "rpc error: code = Internal
            desc = failed to create snapshot: error creating S3 backup: Message: ''LVM_THIN
            based backup shipping requires at least version 2.24 for setsid from util_linux''
            next error: Message: ''LVM_THIN based backup shipping requires support for thin_send_recv''
            next error: Message: ''Backup shipping of resource ''pvc-3602bca1-5b92-4fc7-96af-ce77f35e802c''
            cannot be started since there is no node available that supports backup shipping.''"'
      

      It looks like we can't actually run a backup when using thin volumes. We've checked, and the current version of setsid on XCP-ng is 2.23.2:

      [12:57 ovbh-pprod-xen12 ~]# setsid --v
      setsid from util-linux 2.23.2
      

      We know that updating a package directly is a pretty bad idea, so I'm wondering if you have any idea what we could do to solve this, or whether this will be addressed in a future XCP-ng update?
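
      For anyone wanting to check their own hosts, the two prerequisites from the error can be verified with something like this (I'm assuming the helper binaries are named thin_send/thin_recv, as shipped by LINBIT's thin-send-recv package):

      # The error above asks for util-linux >= 2.24
      setsid --version
      # LVM_THIN backup shipping also needs thin_send_recv installed
      command -v thin_send thin_recv || echo "thin_send_recv not installed"
      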

      Thanks in advance for your time!

      P.S.: We're working on a full write-up of how we deployed our K8s linstor CSI setup, in case anyone is interested.

      posted in XOSTOR
    • RE: XOSTOR hyperconvergence preview

      @olivierlambert
      I just checked the sm repository, and it looks like it wouldn't be that complicated to add a new sm-config and pass it down to the volume creation. Do you accept PRs/contributions on that repository? We're really interested in this feature and I think I can take the time to write the code to handle it.
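
      To make the idea concrete, the end result could look something like this from the CLI (the sm-config key name is purely hypothetical, nothing in sm supports it today):

      # Hypothetical per-VDI override of the replication count at creation time
      xe vdi-create sr-uuid=<linstor-sr-uuid> name-label=db-data type=user \
          virtual-size=50GiB sm-config:replication=1
      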

      posted in XOSTOR
    • RE: XOSTOR hyperconvergence preview

      After reading the sm LinstorSR file, I figured out that the host names need to exactly match the host names in the XCP-ng pool. I thought I had tried that and that it failed the same way, but after retrying with all valid host names, it set up the SR correctly.

      Something I've also noticed in the code is that there doesn't seem to be a way to deploy a secondary SR connected to the same linstor controller with a different replication factor. For some VMs that have built-in software replication/HA, like DBs, it might be preferred to have replication=1 set for the VDI.
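
      For context, at the linstor level this kind of policy is just a separate resource group, so under the hood I'd expect something along these lines (a sketch only; the group name is illustrative):

      # A resource group that keeps a single replica instead of mirroring
      linstor resource-group create xcp-sr-single-replica --place-count 1 --storage-pool xcp-sr-linstor_group_thin_device
      linstor volume-group create xcp-sr-single-replica
      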

      posted in XOSTOR

    Latest posts made by Maelstrom96

    • RE: Icon appears in the XOA interface

      @olivierlambert Thanks for the suggestion! The problem is I have no idea where I would start to build it for Slackware. I'll see if I can figure it out, but based on my research so far, I'm not sure I'll be able to.

      posted in Management
    • RE: Icon appears in the XOA interface

      @eurodrigolira Sorry to revive this topic, but do you have any pointers on how to build a Slackware package for xe-guest-utilities? I'm trying to add the VM guest tools to UnRAID and I'm not having much luck.
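
      From what I can gather, the generic Slackware flow would be to stage the prebuilt guest-tools binaries into a package tree and build it with makepkg; something like this, untested (binary names come from upstream xe-guest-utilities, paths and version are placeholders):

      # Stage the files the guest tools installer would normally put in place
      mkdir -p /tmp/pkg/usr/sbin /tmp/pkg/install
      cp xe-daemon xe-linux-distribution /tmp/pkg/usr/sbin/
      # add install/slack-desc and an rc script, then build the package
      cd /tmp/pkg && /sbin/makepkg -l y -c n /tmp/xe-guest-utilities-<version>-x86_64-1.tgz
      # then on the UnRAID host:
      installpkg /tmp/xe-guest-utilities-<version>-x86_64-1.tgz
      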

      posted in Management
    • RE: Three-node Networking for XOSTOR

      @T3CCH What you might be looking for: https://xcp-ng.org/docs/networking.html#full-mesh-network

      posted in XOSTOR
    • RE: XOSTOR hyperconvergence preview

      @ronan-a Thanks a lot for that procedure.

      I ended up needing to do a bit more, since for some reason "evacuate" failed. I deleted the node and then manually recreated my resources using:

      linstor resource create --auto-place +1 <resource_name>
      

      That didn't work at first because the new node didn't have a storage pool configured, which required this command (note: this is only valid if your SR was set up as thin):

      linstor storage-pool create lvmthin <node_name> xcp-sr-linstor_group_thin_device linstor_group/thin_device
      

      Also, it's worth noting that before re-creating the resources, you might want to manually clean up the lingering logical volumes that weren't cleaned up when evacuate failed.

      Find volumes with:

      lvdisplay
      

      and then delete them with:

      lvremove <LV Path>
      

      example:

      lvremove /dev/linstor_group/xcp-persistent-database_00000
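      

      After that, it's worth double-checking that everything landed where expected with the standard listing commands:

      linstor resource list
      linstor storage-pool list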
      
      posted in XOSTOR
    • RE: XOSTOR hyperconvergence preview

      @ronan-a Do you know of a way to update a node name in Linstor? I've looked through their documentation and the CLI commands but couldn't find a way.

      posted in XOSTOR
    • RE: XOSTOR hyperconvergence preview

      @ronan-a I will be testing my theory a bit later today, but I believe it might be a hostname mismatch between the node name linstor expects and what the hostname is now set to on dom0. We updated the hostname of the node before the cluster was spun up, but I think the previous name was still active when the linstor SR was created.

      This means that the node name doesn't match here:
      https://github.com/xcp-ng/sm/blob/e951676098c80e6da6de4d4653f496b15f5a8cb9/drivers/linstorvolumemanager.py#L2641C21-L2641C41

      I will try to revert the hostname and see if it fixes everything.

      Edit: I just tested reverting the hostname to the default one, which matches what's in linstor, and it works again. So it seems like changing a hostname after the cluster is provisioned is a no-no.
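
      For anyone else hitting this, the mismatch is easy to spot by comparing the two names directly; they must match exactly:

      # dom0's current hostname
      hostname
      # node names as registered in linstor
      linstor node list
      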

      posted in XOSTOR
    • RE: XOSTOR hyperconvergence preview

      @ronan-a said in XOSTOR hyperconvergence preview:

      drbdsetup events2

      Host1:

      [09:49 xcp-ng-labs-host01 ~]# systemctl status linstor-controller
      ● linstor-controller.service - drbd-reactor controlled linstor-controller
         Loaded: loaded (/usr/lib/systemd/system/linstor-controller.service; disabled; vendor preset: disabled)
        Drop-In: /run/systemd/system/linstor-controller.service.d
                 └─reactor.conf
         Active: active (running) since Thu 2024-05-02 13:24:32 PDT; 20h ago
       Main PID: 21340 (java)
         CGroup: /system.slice/linstor-controller.service
                 └─21340 /usr/lib/jvm/jre-11/bin/java -Xms32M -classpath /usr/share/linstor-server/lib/conf:/usr/share/linstor-server/lib/* com.linbit.linstor.core.Controller --logs=/var/log/linstor-controller --config-directory=/etc/linstor
      [09:49 xcp-ng-labs-host01 ~]# systemctl status linstor-satellite
      ● linstor-satellite.service - LINSTOR Satellite Service
         Loaded: loaded (/usr/lib/systemd/system/linstor-satellite.service; enabled; vendor preset: disabled)
        Drop-In: /etc/systemd/system/linstor-satellite.service.d
                 └─override.conf
         Active: active (running) since Wed 2024-05-01 16:04:05 PDT; 1 day 17h ago
       Main PID: 1947 (java)
         CGroup: /system.slice/linstor-satellite.service
                 ├─1947 /usr/lib/jvm/jre-11/bin/java -Xms32M -classpath /usr/share/linstor-server/lib/conf:/usr/share/linstor-server/lib/* com.linbit.linstor.core.Satellite --logs=/var/log/linstor-satellite --config-directory=/etc/linstor
                 ├─2109 drbdsetup events2 all
                 └─2347 /usr/sbin/dmeventd
      [09:49 xcp-ng-labs-host01 ~]# systemctl status drbd-reactor
      ● drbd-reactor.service - DRBD-Reactor Service
         Loaded: loaded (/usr/lib/systemd/system/drbd-reactor.service; enabled; vendor preset: disabled)
        Drop-In: /etc/systemd/system/drbd-reactor.service.d
                 └─override.conf
         Active: active (running) since Wed 2024-05-01 16:04:11 PDT; 1 day 17h ago
           Docs: man:drbd-reactor
                 man:drbd-reactorctl
                 man:drbd-reactor.toml
       Main PID: 1950 (drbd-reactor)
         CGroup: /system.slice/drbd-reactor.service
                 ├─1950 /usr/sbin/drbd-reactor
                 └─1976 drbdsetup events2 --full --poll
      [09:49 xcp-ng-labs-host01 ~]# mountpoint /var/lib/linstor
      /var/lib/linstor is a mountpoint
      [09:49 xcp-ng-labs-host01 ~]# drbdsetup events2
      exists resource name:xcp-persistent-database role:Primary suspended:no force-io-failures:no may_promote:no promotion_score:10103
      exists connection name:xcp-persistent-database peer-node-id:1 conn-name:xcp-ng-labs-host03 connection:Connected role:Secondary
      exists connection name:xcp-persistent-database peer-node-id:2 conn-name:xcp-ng-labs-host02 connection:Connected role:Secondary
      exists device name:xcp-persistent-database volume:0 minor:1000 backing_dev:/dev/linstor_group/xcp-persistent-database_00000 disk:UpToDate client:no quorum:yes
      exists peer-device name:xcp-persistent-database peer-node-id:1 conn-name:xcp-ng-labs-host03 volume:0 replication:Established peer-disk:UpToDate peer-client:no resync-suspended:no
      exists path name:xcp-persistent-database peer-node-id:1 conn-name:xcp-ng-labs-host03 local:ipv4:10.100.0.200:7000 peer:ipv4:10.100.0.202:7000 established:yes
      exists peer-device name:xcp-persistent-database peer-node-id:2 conn-name:xcp-ng-labs-host02 volume:0 replication:Established peer-disk:UpToDate peer-client:no resync-suspended:no
      exists path name:xcp-persistent-database peer-node-id:2 conn-name:xcp-ng-labs-host02 local:ipv4:10.100.0.200:7000 peer:ipv4:10.100.0.201:7000 established:yes
      exists resource name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 role:Secondary suspended:no force-io-failures:no may_promote:no promotion_score:10103
      exists connection name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:1 conn-name:xcp-ng-labs-host03 connection:Connected role:Secondary
      exists connection name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:0 conn-name:xcp-ng-labs-host02 connection:Connected role:Primary
      exists device name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 volume:0 minor:1001 backing_dev:/dev/linstor_group/xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0_00000 disk:UpToDate client:no quorum:yes
      exists peer-device name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:1 conn-name:xcp-ng-labs-host03 volume:0 replication:Established peer-disk:UpToDate peer-client:no resync-suspended:no
      exists path name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:1 conn-name:xcp-ng-labs-host03 local:ipv4:10.100.0.200:7001 peer:ipv4:10.100.0.202:7001 established:yes
      exists peer-device name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:0 conn-name:xcp-ng-labs-host02 volume:0 replication:Established peer-disk:UpToDate peer-client:no resync-suspended:no
      exists path name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:0 conn-name:xcp-ng-labs-host02 local:ipv4:10.100.0.200:7001 peer:ipv4:10.100.0.201:7001 established:yes
      exists -
      

      Host2:

      [09:51 xcp-ng-labs-host02 ~]# systemctl status linstor-controller
      ● linstor-controller.service - drbd-reactor controlled linstor-controller
         Loaded: loaded (/usr/lib/systemd/system/linstor-controller.service; disabled; vendor preset: disabled)
        Drop-In: /run/systemd/system/linstor-controller.service.d
                 └─reactor.conf
         Active: inactive (dead)
      [09:51 xcp-ng-labs-host02 ~]# systemctl status linstor-satellite
      ● linstor-satellite.service - LINSTOR Satellite Service
         Loaded: loaded (/usr/lib/systemd/system/linstor-satellite.service; enabled; vendor preset: disabled)
        Drop-In: /etc/systemd/system/linstor-satellite.service.d
                 └─override.conf
         Active: active (running) since Thu 2024-05-02 10:26:59 PDT; 23h ago
       Main PID: 1990 (java)
         CGroup: /system.slice/linstor-satellite.service
                 ├─1990 /usr/lib/jvm/jre-11/bin/java -Xms32M -classpath /usr/share/linstor-server/lib/conf:/usr/share/linstor-server/lib/* com.linbit.linstor.core.Satellite --logs=/var/log/linstor-satellite --config-directory=/etc/linstor
                 ├─2128 drbdsetup events2 all
                 └─2552 /usr/sbin/dmeventd
      [09:51 xcp-ng-labs-host02 ~]# systemctl status drbd-reactor
      ● drbd-reactor.service - DRBD-Reactor Service
         Loaded: loaded (/usr/lib/systemd/system/drbd-reactor.service; enabled; vendor preset: disabled)
        Drop-In: /etc/systemd/system/drbd-reactor.service.d
                 └─override.conf
         Active: active (running) since Thu 2024-05-02 10:27:07 PDT; 23h ago
           Docs: man:drbd-reactor
                 man:drbd-reactorctl
                 man:drbd-reactor.toml
       Main PID: 1989 (drbd-reactor)
         CGroup: /system.slice/drbd-reactor.service
                 ├─1989 /usr/sbin/drbd-reactor
                 └─2035 drbdsetup events2 --full --poll
      [09:51 xcp-ng-labs-host02 ~]# mountpoint /var/lib/linstor
      /var/lib/linstor is not a mountpoint
      [09:51 xcp-ng-labs-host02 ~]# drbdsetup events2
      exists resource name:xcp-persistent-database role:Secondary suspended:no force-io-failures:no may_promote:no promotion_score:10103
      exists connection name:xcp-persistent-database peer-node-id:0 conn-name:xcp-ng-labs-host01 connection:Connected role:Primary
      exists connection name:xcp-persistent-database peer-node-id:1 conn-name:xcp-ng-labs-host03 connection:Connected role:Secondary
      exists device name:xcp-persistent-database volume:0 minor:1000 backing_dev:/dev/linstor_group/xcp-persistent-database_00000 disk:UpToDate client:no quorum:yes
      exists peer-device name:xcp-persistent-database peer-node-id:0 conn-name:xcp-ng-labs-host01 volume:0 replication:Established peer-disk:UpToDate peer-client:no resync-suspended:no
      exists path name:xcp-persistent-database peer-node-id:0 conn-name:xcp-ng-labs-host01 local:ipv4:10.100.0.201:7000 peer:ipv4:10.100.0.200:7000 established:yes
      exists peer-device name:xcp-persistent-database peer-node-id:1 conn-name:xcp-ng-labs-host03 volume:0 replication:Established peer-disk:UpToDate peer-client:no resync-suspended:no
      exists path name:xcp-persistent-database peer-node-id:1 conn-name:xcp-ng-labs-host03 local:ipv4:10.100.0.201:7000 peer:ipv4:10.100.0.202:7000 established:yes
      exists resource name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 role:Primary suspended:no force-io-failures:no may_promote:no promotion_score:10103
      exists connection name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:2 conn-name:xcp-ng-labs-host01 connection:Connected role:Secondary
      exists connection name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:1 conn-name:xcp-ng-labs-host03 connection:Connected role:Secondary
      exists device name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 volume:0 minor:1001 backing_dev:/dev/linstor_group/xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0_00000 disk:UpToDate client:no quorum:yes
      exists peer-device name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:2 conn-name:xcp-ng-labs-host01 volume:0 replication:Established peer-disk:UpToDate peer-client:no resync-suspended:no
      exists path name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:2 conn-name:xcp-ng-labs-host01 local:ipv4:10.100.0.201:7001 peer:ipv4:10.100.0.200:7001 established:yes
      exists peer-device name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:1 conn-name:xcp-ng-labs-host03 volume:0 replication:Established peer-disk:UpToDate peer-client:no resync-suspended:no
      exists path name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:1 conn-name:xcp-ng-labs-host03 local:ipv4:10.100.0.201:7001 peer:ipv4:10.100.0.202:7001 established:yes
      exists -
      

      Host3:

      [09:51 xcp-ng-labs-host03 ~]# systemctl status linstor-controller
      ● linstor-controller.service - drbd-reactor controlled linstor-controller
         Loaded: loaded (/usr/lib/systemd/system/linstor-controller.service; disabled; vendor preset: disabled)
        Drop-In: /run/systemd/system/linstor-controller.service.d
                 └─reactor.conf
         Active: inactive (dead)
      [09:52 xcp-ng-labs-host03 ~]# systemctl status linstor-satellite
      ● linstor-satellite.service - LINSTOR Satellite Service
         Loaded: loaded (/usr/lib/systemd/system/linstor-satellite.service; enabled; vendor preset: disabled)
        Drop-In: /etc/systemd/system/linstor-satellite.service.d
                 └─override.conf
         Active: active (running) since Thu 2024-05-02 10:10:16 PDT; 23h ago
       Main PID: 1937 (java)
         CGroup: /system.slice/linstor-satellite.service
                 ├─1937 /usr/lib/jvm/jre-11/bin/java -Xms32M -classpath /usr/share/linstor-server/lib/conf:/usr/share/linstor-server/lib/* com.linbit.linstor.core.Satellite --logs=/var/log/linstor-satellite --config-directory=/etc/linstor
                 ├─2151 drbdsetup events2 all
                 └─2435 /usr/sbin/dmeventd
      [09:52 xcp-ng-labs-host03 ~]# systemctl status drbd-reactor
      ● drbd-reactor.service - DRBD-Reactor Service
         Loaded: loaded (/usr/lib/systemd/system/drbd-reactor.service; enabled; vendor preset: disabled)
        Drop-In: /etc/systemd/system/drbd-reactor.service.d
                 └─override.conf
         Active: active (running) since Thu 2024-05-02 10:10:26 PDT; 23h ago
           Docs: man:drbd-reactor
                 man:drbd-reactorctl
                 man:drbd-reactor.toml
       Main PID: 1939 (drbd-reactor)
         CGroup: /system.slice/drbd-reactor.service
                 ├─1939 /usr/sbin/drbd-reactor
                 └─1981 drbdsetup events2 --full --poll
      [09:52 xcp-ng-labs-host03 ~]# mountpoint /var/lib/linstor
      /var/lib/linstor is not a mountpoint
      [09:52 xcp-ng-labs-host03 ~]# drbdsetup events2
      exists resource name:xcp-persistent-database role:Secondary suspended:no force-io-failures:no may_promote:no promotion_score:10103
      exists connection name:xcp-persistent-database peer-node-id:0 conn-name:xcp-ng-labs-host01 connection:Connected role:Primary
      exists connection name:xcp-persistent-database peer-node-id:2 conn-name:xcp-ng-labs-host02 connection:Connected role:Secondary
      exists device name:xcp-persistent-database volume:0 minor:1000 backing_dev:/dev/linstor_group/xcp-persistent-database_00000 disk:UpToDate client:no quorum:yes
      exists peer-device name:xcp-persistent-database peer-node-id:0 conn-name:xcp-ng-labs-host01 volume:0 replication:Established peer-disk:UpToDate peer-client:no resync-suspended:no
      exists path name:xcp-persistent-database peer-node-id:0 conn-name:xcp-ng-labs-host01 local:ipv4:10.100.0.202:7000 peer:ipv4:10.100.0.200:7000 established:yes
      exists peer-device name:xcp-persistent-database peer-node-id:2 conn-name:xcp-ng-labs-host02 volume:0 replication:Established peer-disk:UpToDate peer-client:no resync-suspended:no
      exists path name:xcp-persistent-database peer-node-id:2 conn-name:xcp-ng-labs-host02 local:ipv4:10.100.0.202:7000 peer:ipv4:10.100.0.201:7000 established:yes
      exists resource name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 role:Secondary suspended:no force-io-failures:no may_promote:no promotion_score:10103
      exists connection name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:2 conn-name:xcp-ng-labs-host01 connection:Connected role:Secondary
      exists connection name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:0 conn-name:xcp-ng-labs-host02 connection:Connected role:Primary
      exists device name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 volume:0 minor:1001 backing_dev:/dev/linstor_group/xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0_00000 disk:UpToDate client:no quorum:yes
      exists peer-device name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:2 conn-name:xcp-ng-labs-host01 volume:0 replication:Established peer-disk:UpToDate peer-client:no resync-suspended:no
      exists path name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:2 conn-name:xcp-ng-labs-host01 local:ipv4:10.100.0.202:7001 peer:ipv4:10.100.0.200:7001 established:yes
      exists peer-device name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:0 conn-name:xcp-ng-labs-host02 volume:0 replication:Established peer-disk:UpToDate peer-client:no resync-suspended:no
      exists path name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:0 conn-name:xcp-ng-labs-host02 local:ipv4:10.100.0.202:7001 peer:ipv4:10.100.0.201:7001 established:yes
      exists -
      
      

      Will be sending the debug file as a DM.

      Edit: Just as a sanity check, I tried to reboot the master instead of just restarting the toolstack, and the linstor SR seems to be working as expected again. The XOSTOR tab in XOA now populates (it just errored out before) and the SR scan now goes through.

      Edit 2: I was able to move a VDI, but then the exact same error started happening again. No idea why.

      posted in XOSTOR
    • RE: XOSTOR hyperconvergence preview

      @ronan-a Since XOSTOR is supposed to be stable now, I figured I would try it out with a new setup of 3 newly installed 8.2 nodes.

      I used the CLI to deploy it. It all went well, and the SR was quickly ready. I was even able to migrate a disk to the Linstor SR and boot the VM. However, after rebooting the master, the SR doesn't want to allow any disk migration, and manual scans are failing. I've tried fully unmounting/remounting the SR and restarting the toolstack, but nothing seems to help. The disk that was on Linstor is still accessible and the VM is able to boot.

      Here is the error I'm getting:

      sr.scan
      {
        "id": "e1a9bf4d-26ad-3ef6-b4a5-db98d012e0d9"
      }
      {
        "code": "SR_BACKEND_FAILURE_47",
        "params": [
          "",
          "The SR is not available [opterr=Database is not mounted]",
          ""
        ],
        "task": {
          "uuid": "a467bd90-8d47-09cc-b8ac-afa35056ff25",
          "name_label": "Async.SR.scan",
          "name_description": "",
          "allowed_operations": [],
          "current_operations": {},
          "created": "20240502T21:40:00Z",
          "finished": "20240502T21:40:01Z",
          "status": "failure",
          "resident_on": "OpaqueRef:b3e2f390-f45f-4614-a150-1eee53f204e1",
          "progress": 1,
          "type": "<none/>",
          "result": "",
          "error_info": [
            "SR_BACKEND_FAILURE_47",
            "",
            "The SR is not available [opterr=Database is not mounted]",
            ""
          ],
          "other_config": {},
          "subtask_of": "OpaqueRef:NULL",
          "subtasks": [],
          "backtrace": "(((process xapi)(filename lib/backtrace.ml)(line 210))((process xapi)(filename ocaml/xapi/storage_access.ml)(line 32))((process xapi)(filename lib/xapi-stdext-pervasives/pervasiveext.ml)(line 24))((process xapi)(filename lib/xapi-stdext-pervasives/pervasiveext.ml)(line 35))((process xapi)(filename ocaml/xapi/message_forwarding.ml)(line 131))((process xapi)(filename lib/xapi-stdext-pervasives/pervasiveext.ml)(line 24))((process xapi)(filename ocaml/xapi/rbac.ml)(line 205))((process xapi)(filename ocaml/xapi/server_helpers.ml)(line 95)))"
        },
        "message": "SR_BACKEND_FAILURE_47(, The SR is not available [opterr=Database is not mounted], )",
        "name": "XapiError",
        "stack": "XapiError: SR_BACKEND_FAILURE_47(, The SR is not available [opterr=Database is not mounted], )
          at Function.wrap (file:///opt/xo/xo-builds/xen-orchestra-202404270302/packages/xen-api/_XapiError.mjs:16:12)
          at default (file:///opt/xo/xo-builds/xen-orchestra-202404270302/packages/xen-api/_getTaskResult.mjs:11:29)
          at Xapi._addRecordToCache (file:///opt/xo/xo-builds/xen-orchestra-202404270302/packages/xen-api/index.mjs:1029:24)
          at file:///opt/xo/xo-builds/xen-orchestra-202404270302/packages/xen-api/index.mjs:1063:14
          at Array.forEach (<anonymous>)
          at Xapi._processEvents (file:///opt/xo/xo-builds/xen-orchestra-202404270302/packages/xen-api/index.mjs:1053:12)
          at Xapi._watchEvents (file:///opt/xo/xo-builds/xen-orchestra-202404270302/packages/xen-api/index.mjs:1226:14)"
      }
      

      I quickly glanced over the source code and the SM logs to see if I could identify what was going on, but it doesn't seem to be a simple thing.
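
      For reference, the basic linstor-side health checks on each host are just the standard ones (the controller only runs on one host at a time under drbd-reactor):

      # is the shared controller database volume mounted here?
      mountpoint /var/lib/linstor
      # service state on this node
      systemctl status linstor-controller linstor-satellite drbd-reactor
      # DRBD view of the controller database resource
      drbdsetup events2 --now | grep xcp-persistent-database
      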

      Logs from SM:

      May  2 13:22:02 xcp-ng-labs-host01 SM: [19242] LinstorSR.scan for e1a9bf4d-26ad-3ef6-b4a5-db98d012e0d9
      May  2 13:22:02 xcp-ng-labs-host01 SM: [19242] Raising exception [47, The SR is not available [opterr=Database is not mounted]]
      May  2 13:22:02 xcp-ng-labs-host01 SM: [19242] lock: released /var/lock/sm/e1a9bf4d-26ad-3ef6-b4a5-db98d012e0d9/sr
      May  2 13:22:02 xcp-ng-labs-host01 SM: [19242] ***** generic exception: sr_scan: EXCEPTION <class 'SR.SROSError'>, The SR is not available [opterr=Database is not mounted]
      May  2 13:22:02 xcp-ng-labs-host01 SM: [19242]   File "/opt/xensource/sm/SRCommand.py", line 110, in run
      May  2 13:22:02 xcp-ng-labs-host01 SM: [19242]     return self._run_locked(sr)
      May  2 13:22:02 xcp-ng-labs-host01 SM: [19242]   File "/opt/xensource/sm/SRCommand.py", line 159, in _run_locked
      May  2 13:22:02 xcp-ng-labs-host01 SM: [19242]     rv = self._run(sr, target)
      May  2 13:22:02 xcp-ng-labs-host01 SM: [19242]   File "/opt/xensource/sm/SRCommand.py", line 364, in _run
      May  2 13:22:02 xcp-ng-labs-host01 SM: [19242]     return sr.scan(self.params['sr_uuid'])
      May  2 13:22:02 xcp-ng-labs-host01 SM: [19242]   File "/opt/xensource/sm/LinstorSR", line 536, in wrap
      May  2 13:22:02 xcp-ng-labs-host01 SM: [19242]     return load(self, *args, **kwargs)
      May  2 13:22:02 xcp-ng-labs-host01 SM: [19242]   File "/opt/xensource/sm/LinstorSR", line 521, in load
      May  2 13:22:02 xcp-ng-labs-host01 SM: [19242]     return wrapped_method(self, *args, **kwargs)
      May  2 13:22:02 xcp-ng-labs-host01 SM: [19242]   File "/opt/xensource/sm/LinstorSR", line 381, in wrapped_method
      May  2 13:22:02 xcp-ng-labs-host01 SM: [19242]     return method(self, *args, **kwargs)
      May  2 13:22:02 xcp-ng-labs-host01 SM: [19242]   File "/opt/xensource/sm/LinstorSR", line 777, in scan
      May  2 13:22:02 xcp-ng-labs-host01 SM: [19242]     opterr='Database is not mounted'
      May  2 13:22:02 xcp-ng-labs-host01 SM: [19242]
      
      posted in XOSTOR
    • RE: XOSTOR hyperconvergence preview

      @ronan-a said in XOSTOR hyperconvergence preview:

      @Maelstrom96 We must update our documentation for that. This will probably require executing commands manually during an upgrade.

      Any news on that? We're still pretty much blocked until that's figured out.

      Also, any news on when it will be officially released?

      posted in XOSTOR
    • RE: XOSTOR hyperconvergence preview

      @Maelstrom96 said in XOSTOR hyperconvergence preview:

      Is there a procedure for updating our current 8.2 XCP-ng cluster to 8.3? My understanding is that if I update the host using the ISO, it will effectively wipe all changes that were made to dom0, including the linstor/sm-linstor packages.

      Any input on this @ronan-a?

      posted in XOSTOR