-
@Theoi-Meteoroi said in XOSTOR hyperconvergence preview:
You lost quorum.
Not a quorum issue:
exists device name:xcp-persistent-database volume:0 minor:1000 backing_dev:/dev/linstor_group/xcp-persistent-database_00000 disk:UpToDate client:no quorum:yes
-
@Maelstrom96 Thank you for the logs, I'm trying to understand the issue.
For the moment I don't see a problem regarding the status of the services.
-
@Maelstrom96 It sounds like a race condition or a bad mount of the database. But I'm not sure, so I will add more logs for the next RPM. We plan to release it in a few weeks.
-
@ronan-a I will be testing my theory a little bit later today, but I believe it might be a hostname mismatch between the node name LINSTOR expects and what is currently set on Dom0. We had the hostname of the node updated before the cluster was spun up, but I think it still had the previous name active when the LINSTOR SR was created.
This means that the node name doesn't match here:
https://github.com/xcp-ng/sm/blob/e951676098c80e6da6de4d4653f496b15f5a8cb9/drivers/linstorvolumemanager.py#L2641C21-L2641C41
I will try to revert the hostname and see if it fixes everything.
Edit: Just tested and reverted the hostname to the default one, which matches what's in LINSTOR, and it works again. So it seems that changing a hostname after the cluster is provisioned is a no-no.
-
@Maelstrom96 Oh! This explanation makes sense, thank you. Yes, if the hostname changes, the LINSTOR node name must be changed as well; otherwise the path to the database resource will not be found.
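A quick way to spot this kind of mismatch is to compare the Dom0 hostname with the node names LINSTOR knows about. The snippet below is only a minimal sketch: `hostname_in_node_list` is a hypothetical helper name, and the plain whole-word match against the `linstor node list` output is an assumption for illustration, not what the driver itself does.

```shell
#!/bin/sh
# Hypothetical helper (not part of LINSTOR or the sm driver):
# reads `linstor node list` output on stdin and succeeds if the given
# hostname appears in it as a whole word.
hostname_in_node_list() {
    grep -qwF -- "$1"
}

# Only run the live check where the linstor client is installed.
if command -v linstor >/dev/null 2>&1; then
    if linstor node list | hostname_in_node_list "$(hostname)"; then
        echo "OK: $(hostname) is a known LINSTOR node name"
    else
        echo "MISMATCH: $(hostname) not found in 'linstor node list'"
    fi
fi
```

The whole-word match keeps a name like `xcp-ng-vh1` from matching `xcp-ng-vh10`, but it is still a text match, so treat the result as a hint rather than proof.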
-
@ronan-a Do you know of a way to update a node name in Linstor? I've tried to look in their documentation and checked through CLI commands but couldn't find a way.
-
@Maelstrom96 Well there is no simple helper to do that using the CLI.
So you can create a new node using:
linstor node create --node-type Combined <NAME> <IP>
Then you must evacuate the old node to preserve the replication count:
linstor node evacuate <OLD_NAME>
Next, you can change your hostname and restart the services on each host:
systemctl stop linstor-controller
systemctl restart linstor-satellite
Finally you can delete the node:
linstor node delete <OLD_NAME>
After that you must recreate the diskless resources if necessary. Exec
linstor advise r
to see the commands to execute.
-
@ronan-a Thanks a lot for that procedure.
Ended up needing to do a little bit more, since for some reason, "evacuate" failed. I deleted the node and then went and just manually recreated my resources using:
linstor resource create --auto-place +1 <resource_name>
Which didn't work at first because the new node didn't have a storage pool configured, which required this command to work (NOTE: this is only valid if your SR was set up as thin):
linstor storage-pool create lvmthin <node_name> xcp-sr-linstor_group_thin_device linstor_group/thin_device
Also, worth noting that before actually re-creating the resources, you might want to manually clean up the lingering Logical Volumes that weren't cleaned up if evacuate failed.
Find volumes with:
lvdisplay
and then delete them with:
lvremove <LV Path>
example:
lvremove /dev/linstor_group/xcp-persistent-database_00000
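To keep the order of operations straight, the steps from the last two posts can be collected into a dry-run sketch. It only prints the commands quoted in this thread and executes nothing; the function names and the argument order are hypothetical, and the hostname change itself is left as a comment because no command for it was given here.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the rename steps quoted in this
# thread for review. Nothing is executed.
print_rename_plan() {
    old="$1"; new="$2"; ip="$3"
    cat <<EOF
linstor node create --node-type Combined $new $ip
linstor node evacuate $old
# change the hostname on the host now, then on each host:
systemctl stop linstor-controller
systemctl restart linstor-satellite
linstor node delete $old
linstor advise r
EOF
}

# Hypothetical fallback plan if "evacuate" fails (thin SR only, as noted
# above): recreate the storage pool on the new node, then the resource.
print_recreate_plan() {
    node="$1"; res="$2"
    cat <<EOF
linstor storage-pool create lvmthin $node xcp-sr-linstor_group_thin_device linstor_group/thin_device
linstor resource create --auto-place +1 $res
EOF
}

print_rename_plan "${1:-<OLD_NAME>}" "${2:-<NAME>}" "${3:-<IP>}"
```

Printing instead of executing is deliberate: several of these steps are destructive, so it seems safer to review the list and run each command by hand.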
-
I've been using xcp-ng with NFS shared storage for some months now and I am happy with it so far.
I have some SSDs in a 3-server setup and I'd like to test xostor. Before I set up xostor, I have some questions about it, as I am only familiar with virtuozzo/acronis-storage and ceph so far. Do the same restrictions apply to xostor as to ceph, for example?
- only enterprise SSDs because of power loss protection
- do not use RAID (especially RAID 0) if the controller is capable of using HBA mode. I have a Dell H330 controller and, if RAID is no problem, I'd like to set up the OS with hardware RAID 1 and xostor on the rest of the SSDs, with one RAID 0 array per disk. If HBA mode is preferred, I need to stick with software RAID 1, I think. Software RAID 1 is working fine, but I've had some problems in the past when the boot drive of the mirror died...
- I've installed xen-orchestra manually. Once xostor is installed, will the xostor button in xen orchestra have a function, or is it only available within the XOA appliance?
Thanks for your answers!
-
- That shouldn't be a problem: if it's bad on one disk on one host, it should be resynced. It's not one big shared filesystem, only block space split between each created virtual disk.
- That should work fine (@ronan-a will confirm but I don't think there's very low level optimization that could be affected by a RAID card?)
- XOSTOR UI is only available in XOA, but you'll be able to manage and have all the features from the CLI
-
@flibbi, @olivierlambert RAID shouldn't be a problem for XOSTOR. During some of my testing shortly after the preview was released, I was running it on software RAID 10 arrays on each of my test servers. As long as the RAID isn't some sort of "fake RAID" and is done in hardware, it should work fine.
-
I read on the blog that XOSTOR has been officially released and wanted to test it. I have installed v8.2.1 of XCP-ng on the server nodes. On a separate computer in the management network I have XO built from sources. I have updated the hosts to the latest packages.
Then I started following the instructions from the first post in the thread. I am getting an error at the sr-create step.
[15:11 xcp-ng-vh1 ~]# xe sr-create type=linstor name-label=XOSTOR host-uuid=382d49a5-7435-425e-8588-f56e7a7711f8 device-config:group-name=linstor_group/thin_device device-config:redundancy=2 shared=true device-config:provisioning=thin
Error code: SR_BACKEND_FAILURE_202
Error parameters: , General backend error [opterr=['XENAPI_PLUGIN_FAILURE', 'non-zero exit', '', 'Traceback (most recent call last):\n File "/etc/xapi.d/plugins/linstor-manager", line 24, in <module>\n from linstorjournaler import LinstorJournaler\n File "/opt/xensource/sm/linstorjournaler.py", line 19, in <module>\n from linstorvolumemanager import LinstorVolumeManager\n File "/opt/xensource/sm/linstorvolumemanager.py", line 20, in <module>\n import linstor\nImportError: No module named linstor\n']],
I tried to find possible causes on the forums and it was mentioned that the linstor packages are not yet mature for 8.3 release and that python versions between 8.2 and 8.3 versions of xcp-ng can cause issues. I am using 8.2 branch though so not sure what I am missing here:
[15:12 xcp-ng-vh1 ~]# cat /etc/os-release
NAME="XCP-ng"
VERSION="8.2.1"
ID="xenenterprise"
ID_LIKE="centos rhel fedora"
VERSION_ID="8.2.1"
PRETTY_NAME="XCP-ng 8.2.1"
ANSI_COLOR="0;31"
HOME_URL="http://xcp-ng.org/"
BUG_REPORT_URL="https://github.com/xcp-ng/xcp"
Packages related to linstor on the system:
[20:11 xcp-ng-vh1 ~]# yum list | grep linstor
drbd.x86_64                      9.27.0-1.el7                        @xcp-ng-linstor
drbd-bash-completion.x86_64      9.27.0-1.el7                        @xcp-ng-linstor
drbd-pacemaker.x86_64            9.27.0-1.el7                        @xcp-ng-linstor
drbd-reactor.x86_64              1.4.0-1                             @xcp-ng-linstor
drbd-udev.x86_64                 9.27.0-1.el7                        @xcp-ng-linstor
drbd-utils.x86_64                9.27.0-1.el7                        @xcp-ng-linstor
drbd-xen.x86_64                  9.27.0-1.el7                        @xcp-ng-linstor
kmod-drbd.x86_64                 9.2.8_4.19.0+1-1                    @xcp-ng-linstor
linstor-client.noarch            1.21.1-1.xcpng8.2                   @xcp-ng-linstor
linstor-common.noarch            1.26.1-1.el7                        @xcp-ng-linstor
linstor-controller.noarch        1.26.1-1.el7                        @xcp-ng-linstor
linstor-satellite.noarch         1.26.1-1.el7                        @xcp-ng-linstor
python-linstor.noarch            1.21.1-1.xcpng8.2                   @xcp-ng-linstor
sm.x86_64                        2.30.8-10.1.0.linstor.2.xcpng8.2    @xcp-ng-linstor
sm-rawhba.x86_64                 2.30.8-10.1.0.linstor.2.xcpng8.2    @xcp-ng-linstor
tzdata-java.noarch               2023c-1.el7                         @xcp-ng-linstor
xcp-ng-linstor.noarch            1.1-3.xcpng8.2                      @xcp-ng-updates
xcp-ng-release-linstor.noarch    1.3-1.xcpng8.2                      @xcp-ng-updates
drbd-debuginfo.x86_64            9.27.0-1.el7                        xcp-ng-linstor
drbd-heartbeat.x86_64            9.27.0-1.el7                        xcp-ng-linstor
sm-debuginfo.x86_64              2.30.8-10.1.0.linstor.2.xcpng8.2    xcp-ng-linstor
sm-test-plugins.x86_64           2.30.8-10.1.0.linstor.2.xcpng8.2    xcp-ng-linstor
sm-testresults.x86_64            2.30.8-10.1.0.linstor.2.xcpng8.2    xcp-ng-linstor
Any help appreciated.
Thanks.
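The traceback above comes down to `import linstor` failing inside the plugin, even though `python-linstor` shows up in the package list. A minimal sketch for checking this directly on each host follows; `module_importable` is a hypothetical helper name, and it assumes the plugin runs under whatever interpreter `python` resolves to on that host.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given Python interpreter can
# import the given module.
module_importable() {
    "$1" -c "import $2" >/dev/null 2>&1
}

# Assumption: the plugin uses the interpreter found as `python` on PATH.
if command -v python >/dev/null 2>&1; then
    if module_importable python linstor; then
        echo "linstor module: importable"
    else
        echo "linstor module: NOT importable (check with: rpm -q python-linstor)"
    fi
fi
```

Running it on every host in the pool shows whether the failure is specific to some hosts, which would point at the local installation rather than the command itself.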
-
OK, I had 3 hosts in the pool, and I was getting the above error on 2 of them. To repeat the process from the first post cleanly, I tried the steps on the 3rd host, and SR creation was successful.
Initially, on the 2 hosts, I had used the 'thick' version of the command to prepare the disks. Then I deleted the LVM, ran wipefs on the disks, and redid the steps using the 'thin' version of the command. My guess is that the disks were not wiped completely, which led to the error during SR creation.
I am going to use gparted to wipe the disks properly and then redo the steps. If that doesn't work, I will nuke the XCP-ng install, reinstall, and check again. Will update the post accordingly.
Cheers.
-
@ha_tu_su
After using gparted to wipe all the disks, the sr-create command works as expected to create XOSTOR.
-
@ronan-a and @Maelstrom96 I didn't get this hostname issue.
Does XOSTOR need a fully functional DNS setup to work? Or was the failure local, due to the local change of the hostname?
I didn't understand if the communication is done by IP addresses directly or if DNS name resolution is needed.
I'm particularly interested in this because with XOSTOR I'm considering virtualizing my pfSense firewalls directly and getting rid of the physical servers. In this scenario, in case of an entire pool reboot, I must guarantee that the two pfSense VMs will be up and running, with the option to auto start after reboot, so I can access the entire infrastructure; otherwise I'll be locked out.
-
@ferrao said in XOSTOR hyperconvergence preview:
Does XOSTOR need a fully functional DNS setup to work? Or was the failure local, due to the local change of the hostname?
No. But your LINSTOR node name must match the hostname. We use IPs to communicate between nodes and in our driver.
-
@ronan-a thanks. I've deployed it already with the script on the first post. Seems to be working. I've opted to use redundancy=3 in a 3-host setup. It's a lot of 'wasted' resources but seems to be the best option for performance and reliability.
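To put a number on the 'wasted' space, here is a rough back-of-the-envelope sketch. It assumes every host contributes the same amount of storage and ignores thin provisioning and metadata overhead; `usable_gib` is a hypothetical helper and the 1000 GiB per host is a made-up figure.

```shell
#!/bin/sh
# Rough estimate only: usable space = total raw space / redundancy.
# Assumes identical hosts; ignores thin provisioning and metadata overhead.
usable_gib() {
    hosts="$1"; per_host_gib="$2"; redundancy="$3"
    echo $(( hosts * per_host_gib / redundancy ))
}

usable_gib 3 1000 3   # prints 1000
usable_gib 3 1000 2   # prints 1500
```

So with redundancy=3 on 3 hosts, the usable space is roughly what a single host provides, and dropping to redundancy=2 would give about half again as much.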
May I now ask a licensing question: if we upgrade to Vates VM, is the deployment mode from the first message considered supported, or will everything need to be redone from XOA?
Thanks.
-
@ferrao said in XOSTOR hyperconvergence preview:
May I now ask a licensing question: if we upgrade to Vates VM, is the deployment mode from the first message considered supported, or will everything need to be redone from XOA?
Regarding XOSTOR Support Licenses: In general, we prefer our users to use a trial license through XOA. And if they are interested, they can subscribe to a commercial license.
To be more precise: the manual steps in this thread are still valid to configure a LINSTOR SR; there is no difference from the XOA commands. However, if you wish to subscribe to a support license from a pool without XOA or a trial license, we are quite strict on the fact that the infrastructure must be in a stable state.
-
Anyone else getting a 301 error?
http://mirrors.xcp-ng.org/8/8.2/base/x86_64/repodata/repomd.xml: [Errno 14] HTTPS Error 301 - Moved Permanently
Trying other mirror.
-
@lover said in XOSTOR hyperconvergence preview:
Anyone else getting a 301 error?
http://mirrors.xcp-ng.org/8/8.2/base/x86_64/repodata/repomd.xml: [Errno 14] HTTPS Error 301 - Moved Permanently
Trying other mirror.
301 is not an error (as in a failure); it's a redirect. Here it redirects correctly to a nearby mirror. In my case: https://mirror.uepg.br/xcp-ng/8/8.2/base/x86_64/repodata/repomd.xml
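If you want to see where the redirect points for you, a small sketch like the one below can help. `redirect_target` is a hypothetical helper that only parses response headers fed to it on stdin; the live `curl` call is left as a comment since it needs network access.

```shell
#!/bin/sh
# Hypothetical helper: reads raw HTTP response headers on stdin and prints
# the status code followed by the Location target, if any.
redirect_target() {
    tr -d '\r' | awk '
        NR == 1 { status = $2 }
        tolower($1) == "location:" { loc = $2 }
        END { print status, loc }'
}

# Live usage (needs network access), left as a comment:
#   curl -sI http://mirrors.xcp-ng.org/8/8.2/base/x86_64/repodata/repomd.xml | redirect_target
```

A 301 status with a Location value means the mirror redirector is answering normally; the target you get will depend on where you are.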