-
@Jonathon Ok, here's the solution, and the explanation:
First, just update xcp-ng-release-linstor. This will update the repository file for linstor-related RPMs to make it point to a newer repository, located at a different place.
Then updating the rest of the system will become possible again.
-
Unfortunately, it does not look like that works, unless I am doing something wrong.
[12:26 ovbh-pprod-xen10 ~]# yum update xcp-ng-release-linstor
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
Excluding mirror: updates.xcp-ng.org
 * xcp-ng-base: mirrors.xcp-ng.org
Excluding mirror: updates.xcp-ng.org
 * xcp-ng-linstor: mirrors.xcp-ng.org
Excluding mirror: updates.xcp-ng.org
 * xcp-ng-updates: mirrors.xcp-ng.org
Resolving Dependencies
--> Running transaction check
---> Package xcp-ng-release-linstor.noarch 0:1.2-1.xcpng8.2 will be updated
---> Package xcp-ng-release-linstor.noarch 0:1.3-1.xcpng8.2 will be an update
--> Finished Dependency Resolution

Dependencies Resolved

================================================================================
 Package                  Arch     Version           Repository          Size
================================================================================
Updating:
 xcp-ng-release-linstor   noarch   1.3-1.xcpng8.2    xcp-ng-updates     4.0 k

Transaction Summary
================================================================================
Upgrade  1 Package

Total download size: 4.0 k
Is this ok [y/d/N]: y
Downloading packages:
Delta RPMs disabled because /usr/bin/applydeltarpm not installed.
xcp-ng-release-linstor-1.3-1.xcpng8.2.noarch.rpm | 4.0 kB 00:00:00
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Updating   : xcp-ng-release-linstor-1.3-1.xcpng8.2.noarch 1/2
  Cleanup    : xcp-ng-release-linstor-1.2-1.xcpng8.2.noarch 2/2
  Verifying  : xcp-ng-release-linstor-1.3-1.xcpng8.2.noarch 1/2
  Verifying  : xcp-ng-release-linstor-1.2-1.xcpng8.2.noarch 2/2

Updated:
  xcp-ng-release-linstor.noarch 0:1.3-1.xcpng8.2

Complete!
[12:26 ovbh-pprod-xen10 ~]# yum update
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
Excluding mirror: updates.xcp-ng.org
 * xcp-ng-base: mirrors.xcp-ng.org
Excluding mirror: updates.xcp-ng.org
 * xcp-ng-updates: mirrors.xcp-ng.org
Resolving Dependencies
--> Running transaction check
---> Package blktap.x86_64 0:3.37.4-1.0.1.0.linstor.1.xcpng8.2 will be updated
---> Package blktap.x86_64 0:3.37.4-2.1.xcpng8.2 will be an update
---> Package device-mapper-multipath.x86_64 0:0.4.9-119.xs+1.2.xcpng8.2 will be updated
---> Package device-mapper-multipath.x86_64 0:0.4.9-136.xcpng8.2 will be an update
---> Package device-mapper-multipath-libs.x86_64 0:0.4.9-119.xs+1.2.xcpng8.2 will be updated
---> Package device-mapper-multipath-libs.x86_64 0:0.4.9-136.xcpng8.2 will be an update
---> Package e2fsprogs.x86_64 0:1.42.9-12.el7_5 will be updated
---> Package e2fsprogs.x86_64 0:1.47.0-1.1.xcpng8.2 will be an update
--> Processing Dependency: libfuse.so.2(FUSE_2.5)(64bit) for package: e2fsprogs-1.47.0-1.1.xcpng8.2.x86_64
--> Processing Dependency: libfuse.so.2(FUSE_2.6)(64bit) for package: e2fsprogs-1.47.0-1.1.xcpng8.2.x86_64
--> Processing Dependency: libfuse.so.2(FUSE_2.8)(64bit) for package: e2fsprogs-1.47.0-1.1.xcpng8.2.x86_64
--> Processing Dependency: libfuse.so.2()(64bit) for package: e2fsprogs-1.47.0-1.1.xcpng8.2.x86_64
---> Package e2fsprogs-libs.x86_64 0:1.42.9-12.el7_5 will be updated
---> Package e2fsprogs-libs.x86_64 0:1.47.0-1.1.xcpng8.2 will be an update
---> Package forkexecd.x86_64 0:1.18.1-1.1.xcpng8.2 will be updated
---> Package forkexecd.x86_64 0:1.18.3-3.1.xcpng8.2 will be an update
---> Package gpumon.x86_64 0:0.18.0-4.2.xcpng8.2 will be updated
---> Package gpumon.x86_64 0:0.18.0-11.2.xcpng8.2 will be an update
---> Package grub.x86_64 1:2.02-3.1.0.xcpng8.2 will be updated
---> Package grub.x86_64 1:2.02-3.2.0.xcpng8.2 will be an update
---> Package grub-efi.x86_64 1:2.02-3.1.0.xcpng8.2 will be updated
---> Package grub-efi.x86_64 1:2.02-3.2.0.xcpng8.2 will be an update
---> Package grub-tools.x86_64 1:2.02-3.1.0.xcpng8.2 will be updated
---> Package grub-tools.x86_64 1:2.02-3.2.0.xcpng8.2 will be an update
---> Package guest-templates-json.noarch 0:1.9.6-1.2.xcpng8.2 will be updated
---> Package guest-templates-json.noarch 0:1.10.6-1.1.xcpng8.2 will be an update
---> Package guest-templates-json-data-linux.noarch 0:1.9.6-1.2.xcpng8.2 will be updated
---> Package guest-templates-json-data-linux.noarch 0:1.10.6-1.1.xcpng8.2 will be an update
---> Package guest-templates-json-data-other.noarch 0:1.9.6-1.2.xcpng8.2 will be updated
---> Package guest-templates-json-data-other.noarch 0:1.10.6-1.1.xcpng8.2 will be an update
---> Package guest-templates-json-data-windows.noarch 0:1.9.6-1.2.xcpng8.2 will be updated
---> Package guest-templates-json-data-windows.noarch 0:1.10.6-1.1.xcpng8.2 will be an update
---> Package http-nbd-transfer.x86_64 0:1.2.0-1.xcpng8.2 will be updated
---> Package http-nbd-transfer.x86_64 0:1.3.0-1.xcpng8.2 will be an update
---> Package irqbalance.x86_64 3:1.0.7-11.xcpng8.2 will be updated
---> Package irqbalance.x86_64 3:1.0.7-16.xcpng8.2 will be an update
---> Package kernel.x86_64 0:4.19.19-7.0.15.1.xcpng8.2 will be updated
---> Package kernel.x86_64 0:4.19.19-7.0.23.1.xcpng8.2 will be an update
---> Package kpartx.x86_64 0:0.4.9-119.xs+1.2.xcpng8.2 will be updated
---> Package kpartx.x86_64 0:0.4.9-136.xcpng8.2 will be an update
---> Package libcom_err.x86_64 0:1.42.9-12.el7_5 will be updated
---> Package libcom_err.x86_64 0:1.47.0-1.1.xcpng8.2 will be an update
---> Package libss.x86_64 0:1.42.9-12.el7_5 will be updated
---> Package libss.x86_64 0:1.47.0-1.1.xcpng8.2 will be an update
---> Package linux-firmware.noarch 0:20190314-5.1.xcpng8.2 will be updated
---> Package linux-firmware.noarch 0:20190314-10.2.xcpng8.2 will be an update
---> Package lldpad.x86_64 0:1.0.1-3.git036e314.xcpng8.2 will be updated
---> Package lldpad.x86_64 0:1.0.1-10.xcpng8.2 will be an update
---> Package message-switch.x86_64 0:1.23.2-3.2.xcpng8.2 will be updated
---> Package message-switch.x86_64 0:1.23.2-10.1.xcpng8.2 will be an update
---> Package microcode_ctl.x86_64 2:2.1-26.xs23.1.xcpng8.2 will be updated
---> Package microcode_ctl.x86_64 2:2.1-26.xs26.2.xcpng8.2 will be an update
---> Package nbd.x86_64 0:3.14-2.el7 will be updated
---> Package nbd.x86_64 0:3.24-1.xcpng8.2 will be an update
---> Package qemu.x86_64 2:4.2.1-4.6.2.1.xcpng8.2 will be updated
---> Package qemu.x86_64 2:4.2.1-4.6.3.1.xcpng8.2 will be an update
---> Package rrd2csv.x86_64 0:1.2.5-7.1.xcpng8.2 will be updated
---> Package rrd2csv.x86_64 0:1.2.6-8.1.xcpng8.2 will be an update
---> Package rrdd-plugins.x86_64 0:1.10.8-5.1.xcpng8.2 will be updated
---> Package rrdd-plugins.x86_64 0:1.10.9-5.1.xcpng8.2 will be an update
---> Package sm.x86_64 0:2.30.7-1.3.0.linstor.7.xcpng8.2 will be updated
---> Package sm.x86_64 0:2.30.8-7.1.xcpng8.2 will be an update
---> Package sm-cli.x86_64 0:0.23.0-7.xcpng8.2 will be updated
---> Package sm-cli.x86_64 0:0.23.0-54.1.xcpng8.2 will be an update
---> Package sm-rawhba.x86_64 0:2.30.7-1.3.0.linstor.7.xcpng8.2 will be updated
---> Package sm-rawhba.x86_64 0:2.30.8-7.1.xcpng8.2 will be an update
---> Package squeezed.x86_64 0:0.27.0-5.xcpng8.2 will be updated
---> Package squeezed.x86_64 0:0.27.0-11.1.xcpng8.2 will be an update
---> Package tzdata.noarch 0:2022a-1.el7 will be updated
---> Package tzdata.noarch 0:2023c-1.el7 will be an update
---> Package tzdata-java.noarch 0:2022a-1.el7 will be updated
---> Package tzdata-java.noarch 0:2023c-1.el7 will be an update
---> Package varstored-guard.x86_64 0:0.6.2-1.xcpng8.2 will be updated
---> Package varstored-guard.x86_64 0:0.6.2-8.xcpng8.2 will be an update
---> Package vendor-drivers.x86_64 0:1.0.2-1.3.xcpng8.2 will be updated
---> Package vendor-drivers.x86_64 0:1.0.2-1.6.xcpng8.2 will be an update
--> Processing Dependency: mpi3mr-module for package: vendor-drivers-1.0.2-1.6.xcpng8.2.x86_64
--> Processing Dependency: r8125-module for package: vendor-drivers-1.0.2-1.6.xcpng8.2.x86_64
--> Processing Dependency: igc-module for package: vendor-drivers-1.0.2-1.6.xcpng8.2.x86_64
---> Package vhd-tool.x86_64 0:0.43.0-4.1.xcpng8.2 will be updated
---> Package vhd-tool.x86_64 0:0.43.0-11.1.xcpng8.2 will be an update
---> Package wsproxy.x86_64 0:1.12.0-5.xcpng8.2 will be updated
---> Package wsproxy.x86_64 0:1.12.0-12.xcpng8.2 will be an update
---> Package xapi-core.x86_64 0:1.249.26-2.1.xcpng8.2 will be updated
---> Package xapi-core.x86_64 0:1.249.32-2.1.xcpng8.2 will be an update
---> Package xapi-nbd.x86_64 0:1.11.0-3.2.xcpng8.2 will be updated
---> Package xapi-nbd.x86_64 0:1.11.0-10.1.xcpng8.2 will be an update
---> Package xapi-storage.x86_64 0:11.19.0_sxm2-3.xcpng8.2 will be updated
---> Package xapi-storage.x86_64 0:11.19.0_sxm2-10.xcpng8.2 will be an update
---> Package xapi-storage-script.x86_64 0:0.34.1-2.1.xcpng8.2 will be updated
---> Package xapi-storage-script.x86_64 0:0.34.1-9.1.xcpng8.2 will be an update
---> Package xapi-tests.x86_64 0:1.249.26-2.1.xcpng8.2 will be updated
---> Package xapi-tests.x86_64 0:1.249.32-2.1.xcpng8.2 will be an update
---> Package xapi-xe.x86_64 0:1.249.26-2.1.xcpng8.2 will be updated
---> Package xapi-xe.x86_64 0:1.249.32-2.1.xcpng8.2 will be an update
---> Package xcp-networkd.x86_64 0:0.56.2-1.xcpng8.2 will be updated
---> Package xcp-networkd.x86_64 0:0.56.2-8.xcpng8.2 will be an update
---> Package xcp-ng-linstor.noarch 0:1.0-1.xcpng8.2 will be updated
---> Package xcp-ng-linstor.noarch 0:1.1-3.xcpng8.2 will be an update
--> Processing Dependency: sm-linstor for package: xcp-ng-linstor-1.1-3.xcpng8.2.noarch
---> Package xcp-ng-release.x86_64 0:8.2.1-6 will be updated
---> Package xcp-ng-release.x86_64 0:8.2.1-10 will be an update
---> Package xcp-ng-release-config.x86_64 0:8.2.1-6 will be updated
---> Package xcp-ng-release-config.x86_64 0:8.2.1-10 will be an update
---> Package xcp-ng-release-presets.x86_64 0:8.2.1-6 will be updated
---> Package xcp-ng-release-presets.x86_64 0:8.2.1-10 will be an update
---> Package xcp-ng-xapi-plugins.noarch 0:1.7.2-1.0.0.linstor.1.xcpng8.2 will be updated
---> Package xcp-ng-xapi-plugins.noarch 0:1.8.0-1.xcpng8.2 will be an update
---> Package xcp-rrdd.x86_64 0:1.33.0-6.1.xcpng8.2 will be updated
---> Package xcp-rrdd.x86_64 0:1.33.2-7.1.xcpng8.2 will be an update
---> Package xen-dom0-libs.x86_64 0:4.13.5-9.30.3.xcpng8.2 will be updated
---> Package xen-dom0-libs.x86_64 0:4.13.5-9.38.3.xcpng8.2 will be an update
---> Package xen-dom0-tools.x86_64 0:4.13.5-9.30.3.xcpng8.2 will be updated
---> Package xen-dom0-tools.x86_64 0:4.13.5-9.38.3.xcpng8.2 will be an update
---> Package xen-hypervisor.x86_64 0:4.13.5-9.30.3.xcpng8.2 will be updated
---> Package xen-hypervisor.x86_64 0:4.13.5-9.38.3.xcpng8.2 will be an update
---> Package xen-libs.x86_64 0:4.13.5-9.30.3.xcpng8.2 will be updated
---> Package xen-libs.x86_64 0:4.13.5-9.38.3.xcpng8.2 will be an update
---> Package xen-tools.x86_64 0:4.13.5-9.30.3.xcpng8.2 will be updated
---> Package xen-tools.x86_64 0:4.13.5-9.38.3.xcpng8.2 will be an update
---> Package xenopsd.x86_64 0:0.150.12-1.2.xcpng8.2 will be updated
---> Package xenopsd.x86_64 0:0.150.17-2.1.xcpng8.2 will be an update
---> Package xenopsd-cli.x86_64 0:0.150.12-1.2.xcpng8.2 will be updated
---> Package xenopsd-cli.x86_64 0:0.150.17-2.1.xcpng8.2 will be an update
---> Package xenopsd-xc.x86_64 0:0.150.12-1.2.xcpng8.2 will be updated
---> Package xenopsd-xc.x86_64 0:0.150.17-2.1.xcpng8.2 will be an update
---> Package xs-openssl-libs.x86_64 1:1.1.1k-6.1.xcpng8.2 will be updated
---> Package xs-openssl-libs.x86_64 1:1.1.1k-9.1.xcpng8.2 will be an update
---> Package zabbix-agent.x86_64 0:7.0.0-alpha3.release1.el7 will be updated
---> Package zabbix-agent.x86_64 0:7.0.0-beta1.release1.el7 will be an update
--> Running transaction check
---> Package fuse-libs.x86_64 0:2.9.2-10.xcpng8.2 will be installed
---> Package igc-module.x86_64 0:5.10.200-1.xcpng8.2 will be installed
---> Package mpi3mr-module.x86_64 0:8.6.1.0.0-1.xcpng8.2 will be installed
---> Package r8125-module.x86_64 0:9.012.03-1.xcpng8.2 will be installed
---> Package xcp-ng-linstor.noarch 0:1.1-3.xcpng8.2 will be an update
--> Processing Dependency: sm-linstor for package: xcp-ng-linstor-1.1-3.xcpng8.2.noarch
--> Finished Dependency Resolution
Error: Package: xcp-ng-linstor-1.1-3.xcpng8.2.noarch (xcp-ng-updates)
           Requires: sm-linstor
 You could try using --skip-broken to work around the problem
 You could try running: rpm -Va --nofiles --nodigest
-
Looks like the package is listed twice?
[12:32 ovbh-pprod-xen10 ~]# yum update xcp-ng-linstor
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
Excluding mirror: updates.xcp-ng.org
 * xcp-ng-base: mirrors.xcp-ng.org
Excluding mirror: updates.xcp-ng.org
 * xcp-ng-updates: mirrors.xcp-ng.org
Resolving Dependencies
--> Running transaction check
---> Package xcp-ng-linstor.noarch 0:1.0-1.xcpng8.2 will be updated
---> Package xcp-ng-linstor.noarch 0:1.1-3.xcpng8.2 will be an update
--> Processing Dependency: sm-linstor for package: xcp-ng-linstor-1.1-3.xcpng8.2.noarch
--> Finished Dependency Resolution
Error: Package: xcp-ng-linstor-1.1-3.xcpng8.2.noarch (xcp-ng-updates)
           Requires: sm-linstor
 You could try using --skip-broken to work around the problem
 You could try running: rpm -Va --nofiles --nodigest
[12:34 ovbh-pprod-xen10 ~]# yum list
[...]
xcp-networkd.x86_64                 0.56.2-8.xcpng8.2    xcp-ng-updates
xcp-networkd-debuginfo.x86_64       0.56.2-8.xcpng8.2    xcp-ng-updates
xcp-ng-generic-lib-devel.x86_64     1.1.1-3.xcpng8.2     xcp-ng-base
xcp-ng-linstor.noarch               1.1-3.xcpng8.2       xcp-ng-updates
xcp-ng-release.x86_64               8.2.1-10             xcp-ng-updates
xcp-ng-release-config.x86_64        8.2.1-10             xcp-ng-updates
[...]
Seems like the old package is just stuck there and is a duplicate version
[12:34 ovbh-pprod-xen10 ~]# yum remove xcp-ng-linstor.noarch 0:1.0-1
Loaded plugins: fastestmirror
No Match for argument: 0:1.0-1
Resolving Dependencies
--> Running transaction check
---> Package xcp-ng-linstor.noarch 0:1.0-1.xcpng8.2 will be erased
--> Finished Dependency Resolution

Dependencies Resolved

================================================================================
 Package           Arch      Version            Repository           Size
================================================================================
Removing:
 xcp-ng-linstor    noarch    1.0-1.xcpng8.2     @xcp-ng-linstor      0.0

Transaction Summary
================================================================================
Remove  1 Package

Installed size: 0
Is this ok [y/N]:
-
I don't see any duplicates in this output.
---> Package xcp-ng-linstor.noarch 0:1.0-1.xcpng8.2 will be updated
---> Package xcp-ng-linstor.noarch 0:1.1-3.xcpng8.2 will be an update
This lists the current package, then the one which will update it.
-
The problem was the yum cache. If I ran yum update right after yum update xcp-ng-release-linstor, it would still fail. To get it working right away, I did the following:
yum update xcp-ng-release-linstor
yum clean all
yum update
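For anyone who wants to script this on several hosts, the same three steps can be wrapped in a small shell function. This is only a sketch: the function name update_after_linstor_repo_move is made up for illustration, and it assumes it is run as root on the host. It deliberately does not pass -y, so yum still asks for confirmation.

```shell
# Hypothetical helper (the name is not part of XCP-ng): refresh the LINSTOR
# repo definition, drop cached metadata, then run the normal full update.
update_after_linstor_repo_move() {
    # 1. Update the release package so the repo file points at the new location.
    yum update xcp-ng-release-linstor || return 1
    # 2. Clear the yum cache, which may still describe the old repository.
    yum clean all || return 1
    # 3. Run the full update; yum reloads repository metadata from scratch.
    yum update
}
```

Call update_after_linstor_repo_move once per host; it stops early if the first or second step fails.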
-
Some questions about XOSTOR
Storage between the servers is not shared as a large file system like Gluster. Right?
So for example, each 4 hosts has 2TB storage then the max HD space is 2TB (widely)
Is the NIC speed of the storage network important? Is 2x40G on each server for this overkill?
What software raid on the NVME disks is recommended?
/Christian
-
@Chr57 I'm no XOSTOR expert, but AFAIK the total available space will depend on the replication factor.
-
Storage between the servers is not shared as a large file system like Gluster. Right?
So for example, each 4 hosts has 2TB storage then the max HD space is 2TB
As @stormi mentioned, this depends on your replication factor. It works like this for your example:
Total Space = (No. of Hosts x Storage per Host) / Replication Factor
(assuming you have the same storage on all nodes)
E.g. if the replication factor is 2, then:
Total Space = (4 x 2) / 2 = 4 TB
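To try other combinations quickly, the formula can be put in a tiny shell helper. This is just a sketch of the arithmetic above: the function name usable_tb is made up, it assumes identical storage on every node, and it ignores any overhead from LINSTOR, DRBD or LVM metadata.

```shell
# usable_tb HOSTS TB_PER_HOST REPLICATION
# Hypothetical helper: prints (HOSTS x TB_PER_HOST) / REPLICATION in TB,
# rounded to two decimals. Assumes every host contributes the same capacity.
usable_tb() {
    awk -v hosts="$1" -v per_host="$2" -v repl="$3" \
        'BEGIN { printf "%.2f\n", hosts * per_host / repl }'
}

usable_tb 4 2 2   # the example above: 4 hosts x 2 TB, replication factor 2
usable_tb 4 2 3   # same hardware with replication factor 3
```

The first call prints 4.00 and the second prints 2.67, which shows how quickly the usable space shrinks as the replication factor goes up.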
Note:
What you have to keep in mind, though, is that it also depends on each SSD you have. Say you put a 1.5 TB SSD and a 0.5 TB SSD in each node: although you have 2TB on each node, if you create a VM with 1TB of space, you will not be able to create another VM with 1TB, since there will not be enough contiguous space. In other words, XOSTOR will not split a VM disk across two separate drives in the JBOD case.
With RAID 0 at the BIOS level you may be able to get away with this, but RAID 0 is not recommended.
Is the NIC speed of the storage network important? Is 2x40G on each server for this overkill?
The question is generic, and the answer depends on your workload and SSD speed (Gen4, Gen5, or an old Gen1). At the outset, 2x40G seems to be more than enough for most applications. If you have an old Gen1 SSD or a SATA SSD, then practically speaking you might not even reach the full bandwidth of 2x40G.
What software raid on the NVME disks is recommended?
For going to production with NVMe SSDs, I would not recommend RAID at all! JBOD would work just fine (assuming you are using generic applications).
-
Hello to you,
I am new to the XOSTOR solution.
I followed the instructions to build an XOSTOR SR, but unfortunately I have been stuck for 2-3 weeks on an error when creating the SR. Below is my error:
Error code: SR_BACKEND_FAILURE_5006
Error parameters: , LINSTOR SR creation error [opterr=Could not create SP xcp-sr-linstor_group on node DEV-XCP02: The satellite does not support the device provider LVM],
Here is the command I run:
xe sr-create type=linstor name-label=XOSTOR host-uuid=52c3a2bb-50a8-4700-a232-6e535e24d759 device-config:group-name=linstor_group device-config:redundancy=2 shared=true device-config:provisioning=thick
Thank you in advance for your help, and for this great project.
-
Hi,
I am actually new to XOSTOR and I have very basic questions to begin with.
Does it support only pools? Can we attach such an SR to many independent XCP-ng hosts?
Thanks again for this incredible project.
-
Hi,
It works only at the pool level: that is the only way to have coordination between hosts and to know which host has the lock on which VM. This is essential to avoid booting the same VM/disk in 2 different places without knowing it, which would lead to data corruption.
-
@olivierlambert said in XOSTOR hyperconvergence preview:
- FINALLY YOU CAN CREATE THE SR:
Otherwise with thin provisioning:
xe sr-create type=linstor name-label=<SR_NAME> host-uuid=<MASTER_UUID> device-config:group-name=linstor_group/thin_device device-config:redundancy=<REDUNDANCY> shared=tru
Is this part of the command not needed?
device-config:hosts=XCP-01,XCP-02,XCP-xx
-
question for @ronan-a
-
@fatek No. I removed this param, it's useless now.
-
@ronan-a Since XOSTOR is supposed to be stable now, I figured I would try it out with a new setup of 3 newly installed 8.2 nodes.
I used the CLI to deploy it. It all went well, and the SR was quickly ready. I was even able to migrate a disk to the Linstor SR and boot the VM. However, after rebooting the master, it seems like the SR doesn't want to allow any disk migration, and manual scans are failing. I've tried fully unmounting/remounting the SR and restarting the toolstack, but nothing seems to help. The disk that was on Linstor is still accessible and the VM is able to boot.
Here is the error I'm getting:
sr.scan
{
  "id": "e1a9bf4d-26ad-3ef6-b4a5-db98d012e0d9"
}
{
  "code": "SR_BACKEND_FAILURE_47",
  "params": [
    "",
    "The SR is not available [opterr=Database is not mounted]",
    ""
  ],
  "task": {
    "uuid": "a467bd90-8d47-09cc-b8ac-afa35056ff25",
    "name_label": "Async.SR.scan",
    "name_description": "",
    "allowed_operations": [],
    "current_operations": {},
    "created": "20240502T21:40:00Z",
    "finished": "20240502T21:40:01Z",
    "status": "failure",
    "resident_on": "OpaqueRef:b3e2f390-f45f-4614-a150-1eee53f204e1",
    "progress": 1,
    "type": "<none/>",
    "result": "",
    "error_info": [
      "SR_BACKEND_FAILURE_47",
      "",
      "The SR is not available [opterr=Database is not mounted]",
      ""
    ],
    "other_config": {},
    "subtask_of": "OpaqueRef:NULL",
    "subtasks": [],
    "backtrace": "(((process xapi)(filename lib/backtrace.ml)(line 210))((process xapi)(filename ocaml/xapi/storage_access.ml)(line 32))((process xapi)(filename lib/xapi-stdext-pervasives/pervasiveext.ml)(line 24))((process xapi)(filename lib/xapi-stdext-pervasives/pervasiveext.ml)(line 35))((process xapi)(filename ocaml/xapi/message_forwarding.ml)(line 131))((process xapi)(filename lib/xapi-stdext-pervasives/pervasiveext.ml)(line 24))((process xapi)(filename ocaml/xapi/rbac.ml)(line 205))((process xapi)(filename ocaml/xapi/server_helpers.ml)(line 95)))"
  },
  "message": "SR_BACKEND_FAILURE_47(, The SR is not available [opterr=Database is not mounted], )",
  "name": "XapiError",
  "stack": "XapiError: SR_BACKEND_FAILURE_47(, The SR is not available [opterr=Database is not mounted], ) at Function.wrap (file:///opt/xo/xo-builds/xen-orchestra-202404270302/packages/xen-api/_XapiError.mjs:16:12) at default (file:///opt/xo/xo-builds/xen-orchestra-202404270302/packages/xen-api/_getTaskResult.mjs:11:29) at Xapi._addRecordToCache (file:///opt/xo/xo-builds/xen-orchestra-202404270302/packages/xen-api/index.mjs:1029:24) at file:///opt/xo/xo-builds/xen-orchestra-202404270302/packages/xen-api/index.mjs:1063:14 at Array.forEach (<anonymous>) at Xapi._processEvents (file:///opt/xo/xo-builds/xen-orchestra-202404270302/packages/xen-api/index.mjs:1053:12) at Xapi._watchEvents (file:///opt/xo/xo-builds/xen-orchestra-202404270302/packages/xen-api/index.mjs:1226:14)"
}
I quickly glanced over the source code and the SM logs to see if I could identify what was going on but it doesn't seem to be a simple thing.
Logs from SM:
May 2 13:22:02 xcp-ng-labs-host01 SM: [19242] LinstorSR.scan for e1a9bf4d-26ad-3ef6-b4a5-db98d012e0d9
May 2 13:22:02 xcp-ng-labs-host01 SM: [19242] Raising exception [47, The SR is not available [opterr=Database is not mounted]]
May 2 13:22:02 xcp-ng-labs-host01 SM: [19242] lock: released /var/lock/sm/e1a9bf4d-26ad-3ef6-b4a5-db98d012e0d9/sr
May 2 13:22:02 xcp-ng-labs-host01 SM: [19242] ***** generic exception: sr_scan: EXCEPTION <class 'SR.SROSError'>, The SR is not available [opterr=Database is not mounted]
May 2 13:22:02 xcp-ng-labs-host01 SM: [19242]   File "/opt/xensource/sm/SRCommand.py", line 110, in run
May 2 13:22:02 xcp-ng-labs-host01 SM: [19242]     return self._run_locked(sr)
May 2 13:22:02 xcp-ng-labs-host01 SM: [19242]   File "/opt/xensource/sm/SRCommand.py", line 159, in _run_locked
May 2 13:22:02 xcp-ng-labs-host01 SM: [19242]     rv = self._run(sr, target)
May 2 13:22:02 xcp-ng-labs-host01 SM: [19242]   File "/opt/xensource/sm/SRCommand.py", line 364, in _run
May 2 13:22:02 xcp-ng-labs-host01 SM: [19242]     return sr.scan(self.params['sr_uuid'])
May 2 13:22:02 xcp-ng-labs-host01 SM: [19242]   File "/opt/xensource/sm/LinstorSR", line 536, in wrap
May 2 13:22:02 xcp-ng-labs-host01 SM: [19242]     return load(self, *args, **kwargs)
May 2 13:22:02 xcp-ng-labs-host01 SM: [19242]   File "/opt/xensource/sm/LinstorSR", line 521, in load
May 2 13:22:02 xcp-ng-labs-host01 SM: [19242]     return wrapped_method(self, *args, **kwargs)
May 2 13:22:02 xcp-ng-labs-host01 SM: [19242]   File "/opt/xensource/sm/LinstorSR", line 381, in wrapped_method
May 2 13:22:02 xcp-ng-labs-host01 SM: [19242]     return method(self, *args, **kwargs)
May 2 13:22:02 xcp-ng-labs-host01 SM: [19242]   File "/opt/xensource/sm/LinstorSR", line 777, in scan
May 2 13:22:02 xcp-ng-labs-host01 SM: [19242]     opterr='Database is not mounted'
May 2 13:22:02 xcp-ng-labs-host01 SM: [19242]
-
Have you restarted the satellites?
-
@Maelstrom96 said in XOSTOR hyperconvergence preview:
However, after rebooting the master, it seems like the SR doesn't want to allow any disk migration, and manual Scan are failing.
What's the status of these commands on each host?
systemctl status linstor-controller
systemctl status linstor-satellite
systemctl status drbd-reactor
mountpoint /var/lib/linstor
drbdsetup events2
Also please share your SMlog files.
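If it helps to collect this in one go on each host, the checks above could be bundled into a small shell function. This is only a sketch and not an official tool: the name linstor_health is made up, it uses systemctl is-active for a one-word summary instead of the full status output, and it assumes the installed drbd-utils supports the --now option of drbdsetup events2 (print the current state once and exit).

```shell
# Hypothetical helper: summarise the LINSTOR/DRBD state of the local host.
linstor_health() {
    for unit in linstor-controller linstor-satellite drbd-reactor; do
        # is-active prints a one-word state such as "active" or "inactive".
        printf '%s: %s\n' "$unit" "$(systemctl is-active "$unit" 2>/dev/null)"
    done
    # Is the LINSTOR database directory currently a mountpoint on this host?
    mountpoint /var/lib/linstor
    # Assumption: --now dumps the current DRBD state and exits instead of
    # waiting for further events.
    drbdsetup events2 --now
}
```

Run linstor_health as root on every host of the pool and compare the results side by side; the full systemctl status output is still useful when a unit is not active.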
-
@ronan-a said in XOSTOR hyperconvergence preview:
drbdsetup events2
Host1:
[09:49 xcp-ng-labs-host01 ~]# systemctl status linstor-controller
● linstor-controller.service - drbd-reactor controlled linstor-controller
   Loaded: loaded (/usr/lib/systemd/system/linstor-controller.service; disabled; vendor preset: disabled)
  Drop-In: /run/systemd/system/linstor-controller.service.d
           └─reactor.conf
   Active: active (running) since Thu 2024-05-02 13:24:32 PDT; 20h ago
 Main PID: 21340 (java)
   CGroup: /system.slice/linstor-controller.service
           └─21340 /usr/lib/jvm/jre-11/bin/java -Xms32M -classpath /usr/share/linstor-server/lib/conf:/usr/share/linstor-server/lib/* com.linbit.linstor.core.Controller --logs=/var/log/linstor-controller --config-directory=/etc/linstor
[09:49 xcp-ng-labs-host01 ~]# systemctl status linstor-satellite
● linstor-satellite.service - LINSTOR Satellite Service
   Loaded: loaded (/usr/lib/systemd/system/linstor-satellite.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/linstor-satellite.service.d
           └─override.conf
   Active: active (running) since Wed 2024-05-01 16:04:05 PDT; 1 day 17h ago
 Main PID: 1947 (java)
   CGroup: /system.slice/linstor-satellite.service
           ├─1947 /usr/lib/jvm/jre-11/bin/java -Xms32M -classpath /usr/share/linstor-server/lib/conf:/usr/share/linstor-server/lib/* com.linbit.linstor.core.Satellite --logs=/var/log/linstor-satellite --config-directory=/etc/linstor
           ├─2109 drbdsetup events2 all
           └─2347 /usr/sbin/dmeventd
[09:49 xcp-ng-labs-host01 ~]# systemctl status drbd-reactor
● drbd-reactor.service - DRBD-Reactor Service
   Loaded: loaded (/usr/lib/systemd/system/drbd-reactor.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/drbd-reactor.service.d
           └─override.conf
   Active: active (running) since Wed 2024-05-01 16:04:11 PDT; 1 day 17h ago
     Docs: man:drbd-reactor
           man:drbd-reactorctl
           man:drbd-reactor.toml
 Main PID: 1950 (drbd-reactor)
   CGroup: /system.slice/drbd-reactor.service
           ├─1950 /usr/sbin/drbd-reactor
           └─1976 drbdsetup events2 --full --poll
[09:49 xcp-ng-labs-host01 ~]# mountpoint /var/lib/linstor
/var/lib/linstor is a mountpoint
[09:49 xcp-ng-labs-host01 ~]# drbdsetup events2
exists resource name:xcp-persistent-database role:Primary suspended:no force-io-failures:no may_promote:no promotion_score:10103
exists connection name:xcp-persistent-database peer-node-id:1 conn-name:xcp-ng-labs-host03 connection:Connected role:Secondary
exists connection name:xcp-persistent-database peer-node-id:2 conn-name:xcp-ng-labs-host02 connection:Connected role:Secondary
exists device name:xcp-persistent-database volume:0 minor:1000 backing_dev:/dev/linstor_group/xcp-persistent-database_00000 disk:UpToDate client:no quorum:yes
exists peer-device name:xcp-persistent-database peer-node-id:1 conn-name:xcp-ng-labs-host03 volume:0 replication:Established peer-disk:UpToDate peer-client:no resync-suspended:no
exists path name:xcp-persistent-database peer-node-id:1 conn-name:xcp-ng-labs-host03 local:ipv4:10.100.0.200:7000 peer:ipv4:10.100.0.202:7000 established:yes
exists peer-device name:xcp-persistent-database peer-node-id:2 conn-name:xcp-ng-labs-host02 volume:0 replication:Established peer-disk:UpToDate peer-client:no resync-suspended:no
exists path name:xcp-persistent-database peer-node-id:2 conn-name:xcp-ng-labs-host02 local:ipv4:10.100.0.200:7000 peer:ipv4:10.100.0.201:7000 established:yes
exists resource name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 role:Secondary suspended:no force-io-failures:no may_promote:no promotion_score:10103
exists connection name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:1 conn-name:xcp-ng-labs-host03 connection:Connected role:Secondary
exists connection name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:0 conn-name:xcp-ng-labs-host02 connection:Connected role:Primary
exists device name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 volume:0 minor:1001 backing_dev:/dev/linstor_group/xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0_00000 disk:UpToDate client:no quorum:yes
exists peer-device name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:1 conn-name:xcp-ng-labs-host03 volume:0 replication:Established peer-disk:UpToDate peer-client:no resync-suspended:no
exists path name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:1 conn-name:xcp-ng-labs-host03 local:ipv4:10.100.0.200:7001 peer:ipv4:10.100.0.202:7001 established:yes
exists peer-device name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:0 conn-name:xcp-ng-labs-host02 volume:0 replication:Established peer-disk:UpToDate peer-client:no resync-suspended:no
exists path name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:0 conn-name:xcp-ng-labs-host02 local:ipv4:10.100.0.200:7001 peer:ipv4:10.100.0.201:7001 established:yes
exists -
Host2:
[09:51 xcp-ng-labs-host02 ~]# systemctl status linstor-controller
● linstor-controller.service - drbd-reactor controlled linstor-controller
Loaded: loaded (/usr/lib/systemd/system/linstor-controller.service; disabled; vendor preset: disabled)
Drop-In: /run/systemd/system/linstor-controller.service.d
└─reactor.conf
Active: inactive (dead)
[09:51 xcp-ng-labs-host02 ~]# systemctl status linstor-satellite
● linstor-satellite.service - LINSTOR Satellite Service
Loaded: loaded (/usr/lib/systemd/system/linstor-satellite.service; enabled; vendor preset: disabled)
Drop-In: /etc/systemd/system/linstor-satellite.service.d
└─override.conf
Active: active (running) since Thu 2024-05-02 10:26:59 PDT; 23h ago
Main PID: 1990 (java)
CGroup: /system.slice/linstor-satellite.service
├─1990 /usr/lib/jvm/jre-11/bin/java -Xms32M -classpath /usr/share/linstor-server/lib/conf:/usr/share/linstor-server/lib/* com.linbit.linstor.core.Satellite --logs=/var/log/linstor-satellite --config-directory=/etc/linstor
├─2128 drbdsetup events2 all
└─2552 /usr/sbin/dmeventd
[09:51 xcp-ng-labs-host02 ~]# systemctl status drbd-reactor
● drbd-reactor.service - DRBD-Reactor Service
Loaded: loaded (/usr/lib/systemd/system/drbd-reactor.service; enabled; vendor preset: disabled)
Drop-In: /etc/systemd/system/drbd-reactor.service.d
└─override.conf
Active: active (running) since Thu 2024-05-02 10:27:07 PDT; 23h ago
Docs: man:drbd-reactor man:drbd-reactorctl man:drbd-reactor.toml
Main PID: 1989 (drbd-reactor)
CGroup: /system.slice/drbd-reactor.service
├─1989 /usr/sbin/drbd-reactor
└─2035 drbdsetup events2 --full --poll
[09:51 xcp-ng-labs-host02 ~]# mountpoint /var/lib/linstor
/var/lib/linstor is not a mountpoint
[09:51 xcp-ng-labs-host02 ~]# drbdsetup events2
exists resource name:xcp-persistent-database role:Secondary suspended:no force-io-failures:no may_promote:no promotion_score:10103
exists connection name:xcp-persistent-database peer-node-id:0 conn-name:xcp-ng-labs-host01 connection:Connected role:Primary
exists connection name:xcp-persistent-database peer-node-id:1 conn-name:xcp-ng-labs-host03 connection:Connected role:Secondary
exists device name:xcp-persistent-database volume:0 minor:1000 backing_dev:/dev/linstor_group/xcp-persistent-database_00000 disk:UpToDate client:no quorum:yes
exists peer-device name:xcp-persistent-database peer-node-id:0 conn-name:xcp-ng-labs-host01 volume:0 replication:Established peer-disk:UpToDate peer-client:no resync-suspended:no
exists path name:xcp-persistent-database peer-node-id:0 conn-name:xcp-ng-labs-host01 local:ipv4:10.100.0.201:7000 peer:ipv4:10.100.0.200:7000 established:yes
exists peer-device name:xcp-persistent-database peer-node-id:1 conn-name:xcp-ng-labs-host03 volume:0 replication:Established peer-disk:UpToDate peer-client:no resync-suspended:no
exists path name:xcp-persistent-database peer-node-id:1 conn-name:xcp-ng-labs-host03 local:ipv4:10.100.0.201:7000 peer:ipv4:10.100.0.202:7000 established:yes
exists resource name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 role:Primary suspended:no force-io-failures:no may_promote:no promotion_score:10103
exists connection name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:2 conn-name:xcp-ng-labs-host01 connection:Connected role:Secondary
exists connection name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:1 conn-name:xcp-ng-labs-host03 connection:Connected role:Secondary
exists device name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 volume:0 minor:1001 backing_dev:/dev/linstor_group/xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0_00000 disk:UpToDate client:no quorum:yes
exists peer-device name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:2 conn-name:xcp-ng-labs-host01 volume:0 replication:Established peer-disk:UpToDate peer-client:no resync-suspended:no
exists path name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:2 conn-name:xcp-ng-labs-host01 local:ipv4:10.100.0.201:7001 peer:ipv4:10.100.0.200:7001 established:yes
exists peer-device name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:1 conn-name:xcp-ng-labs-host03 volume:0 replication:Established peer-disk:UpToDate peer-client:no resync-suspended:no
exists path name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:1 conn-name:xcp-ng-labs-host03 local:ipv4:10.100.0.201:7001 peer:ipv4:10.100.0.202:7001 established:yes
exists -
Host3:
[09:51 xcp-ng-labs-host03 ~]# systemctl status linstor-controller
● linstor-controller.service - drbd-reactor controlled linstor-controller
Loaded: loaded (/usr/lib/systemd/system/linstor-controller.service; disabled; vendor preset: disabled)
Drop-In: /run/systemd/system/linstor-controller.service.d
└─reactor.conf
Active: inactive (dead)
[09:52 xcp-ng-labs-host03 ~]# systemctl status linstor-satellite
● linstor-satellite.service - LINSTOR Satellite Service
Loaded: loaded (/usr/lib/systemd/system/linstor-satellite.service; enabled; vendor preset: disabled)
Drop-In: /etc/systemd/system/linstor-satellite.service.d
└─override.conf
Active: active (running) since Thu 2024-05-02 10:10:16 PDT; 23h ago
Main PID: 1937 (java)
CGroup: /system.slice/linstor-satellite.service
├─1937 /usr/lib/jvm/jre-11/bin/java -Xms32M -classpath /usr/share/linstor-server/lib/conf:/usr/share/linstor-server/lib/* com.linbit.linstor.core.Satellite --logs=/var/log/linstor-satellite --config-directory=/etc/linstor
├─2151 drbdsetup events2 all
└─2435 /usr/sbin/dmeventd
[09:52 xcp-ng-labs-host03 ~]# systemctl status drbd-reactor
● drbd-reactor.service - DRBD-Reactor Service
Loaded: loaded (/usr/lib/systemd/system/drbd-reactor.service; enabled; vendor preset: disabled)
Drop-In: /etc/systemd/system/drbd-reactor.service.d
└─override.conf
Active: active (running) since Thu 2024-05-02 10:10:26 PDT; 23h ago
Docs: man:drbd-reactor man:drbd-reactorctl man:drbd-reactor.toml
Main PID: 1939 (drbd-reactor)
CGroup: /system.slice/drbd-reactor.service
├─1939 /usr/sbin/drbd-reactor
└─1981 drbdsetup events2 --full --poll
[09:52 xcp-ng-labs-host03 ~]# mountpoint /var/lib/linstor
/var/lib/linstor is not a mountpoint
[09:52 xcp-ng-labs-host03 ~]# drbdsetup events2
exists resource name:xcp-persistent-database role:Secondary suspended:no force-io-failures:no may_promote:no promotion_score:10103
exists connection name:xcp-persistent-database peer-node-id:0 conn-name:xcp-ng-labs-host01 connection:Connected role:Primary
exists connection name:xcp-persistent-database peer-node-id:2 conn-name:xcp-ng-labs-host02 connection:Connected role:Secondary
exists device name:xcp-persistent-database volume:0 minor:1000 backing_dev:/dev/linstor_group/xcp-persistent-database_00000 disk:UpToDate client:no quorum:yes
exists peer-device name:xcp-persistent-database peer-node-id:0 conn-name:xcp-ng-labs-host01 volume:0 replication:Established peer-disk:UpToDate peer-client:no resync-suspended:no
exists path name:xcp-persistent-database peer-node-id:0 conn-name:xcp-ng-labs-host01 local:ipv4:10.100.0.202:7000 peer:ipv4:10.100.0.200:7000 established:yes
exists peer-device name:xcp-persistent-database peer-node-id:2 conn-name:xcp-ng-labs-host02 volume:0 replication:Established peer-disk:UpToDate peer-client:no resync-suspended:no
exists path name:xcp-persistent-database peer-node-id:2 conn-name:xcp-ng-labs-host02 local:ipv4:10.100.0.202:7000 peer:ipv4:10.100.0.201:7000 established:yes
exists resource name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 role:Secondary suspended:no force-io-failures:no may_promote:no promotion_score:10103
exists connection name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:2 conn-name:xcp-ng-labs-host01 connection:Connected role:Secondary
exists connection name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:0 conn-name:xcp-ng-labs-host02 connection:Connected role:Primary
exists device name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 volume:0 minor:1001 backing_dev:/dev/linstor_group/xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0_00000 disk:UpToDate client:no quorum:yes
exists peer-device name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:2 conn-name:xcp-ng-labs-host01 volume:0 replication:Established peer-disk:UpToDate peer-client:no resync-suspended:no
exists path name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:2 conn-name:xcp-ng-labs-host01 local:ipv4:10.100.0.202:7001 peer:ipv4:10.100.0.200:7001 established:yes
exists peer-device name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:0 conn-name:xcp-ng-labs-host02 volume:0 replication:Established peer-disk:UpToDate peer-client:no resync-suspended:no
exists path name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:0 conn-name:xcp-ng-labs-host02 local:ipv4:10.100.0.202:7001 peer:ipv4:10.100.0.201:7001 established:yes
exists -
Will be sending the debug file as a DM.
Edit: Just as a sanity check, I tried to reboot the master instead of just restarting the toolstack, and the linstor SR seems to be working as expected again. The XOSTOR tab in XOA now populates (it just errored out before) and the SR scan now goes through.
Edit2: I was able to move a VDI, but then the exact same error started happening again. No idea why.
-
You lost quorum.
I would start by looking at DRBD, since that is the underlying part that isn't working at the moment. When I deployed this I wanted to understand the parts. The key to the Linstor layer is that DRBD stores the cluster state and membership.
I'd advise reading the DRBD docs as well as the Linstor docs to find the commands you need to stand this back up. I really don't advise using anything spinning for disk. SSD and NVMe are the ticket. You can make rust work, but it's terminally slow. I found 3TB disks were OK (~60MB/sec), but the 9.1 (10) TB ones were just awful, with 20-40MB/sec the best I saw. I removed all the XOSTOR stuff this week to maybe reinstall on some 4TB NVMe.
The upside is that all that time spent learning DRBD and Linstor was helpful when I decided to use the Piraeus operator at the Kubernetes level. It's basically all the same bits, built from source on the nodes you deploy on, and it includes a CSI driver.
-
@Theoi-Meteoroi said in XOSTOR hyperconvergence preview:
You lost quorum.
Not a quorum issue:
exists device name:xcp-persistent-database volume:0 minor:1000 backing_dev:/dev/linstor_group/xcp-persistent-database_00000 disk:UpToDate client:no quorum:yes
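For anyone who wants to check this across all hosts without reading the dumps by eye, here is a rough sketch that parses saved `drbdsetup events2` output and lists any device whose `quorum` field is not `yes`. It assumes the output is fed in one event per line, the way `drbdsetup` prints it on the terminal. The function names `parse_events2_line` and `devices_without_quorum` are made up for this example and are not part of any DRBD or LINSTOR tooling.

```python
def parse_events2_line(line):
    """Split one events2 line into (action, object_type, fields).

    Fields are the `key:value` tokens after the first two words. Only the
    first colon separates key from value, so `local:ipv4:10.100.0.201:7000`
    becomes {"local": "ipv4:10.100.0.201:7000"}.
    """
    tokens = line.split()
    if len(tokens) < 2:
        return None  # blank or truncated line
    action, obj_type = tokens[0], tokens[1]
    fields = {}
    for token in tokens[2:]:
        key, sep, value = token.partition(":")
        if sep:
            fields[key] = value
    return action, obj_type, fields


def devices_without_quorum(lines):
    """Return the names of `device` entries whose quorum field is not 'yes'."""
    flagged = []
    for line in lines:
        parsed = parse_events2_line(line)
        if parsed is None:
            continue
        _action, obj_type, fields = parsed
        if obj_type == "device" and fields.get("quorum") != "yes":
            flagged.append(fields.get("name"))
    return flagged
```

Passing the device line quoted above to `devices_without_quorum` gives an empty list, which matches the `quorum:yes` reading. Terminator lines such as `exists -` carry no fields and are simply ignored.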