-
I found one of the VMs I had been using to test XOSTOR was locked up this morning. I restarted it but it will not start up and gives an error about the VDI being missing.
>xe vm-list name-label=test04
uuid ( RO) : 8ec952b4-7229-7a30-81b6-1564a58f6343
name-label ( RW): test04
power-state ( RO): halted
>xe vm-disk-list vm=test04
Disk 0 VBD:
uuid ( RO) : e3c465b8-17a4-d147-6383-527bd9341a16
vm-name-label ( RO): test04
userdevice ( RW): 0
Disk 0 VDI:
uuid ( RO) : 735fc2d7-f1f0-4cc6-9d35-42a049d8ec6c
name-label ( RW): test04_xostor01_vdi
sr-name-label ( RO): XOSTOR01
virtual-size ( RO): 42949672960
>xe vm-start vm=test04
Error code: SR_BACKEND_FAILURE_46
Error parameters: , The VDI is not available [opterr=Could not load 735fc2d7-f1f0-4cc6-9d35-42a049d8ec6c because: ['XENAPI_PLUGIN_FAILURE', 'getVHDInfo', 'CommandException', 'No such file or directory']],
The logs for this are attached as file xostor issue 1.txt
-
@geoffbland Can you execute this command on the other hosts please?
ls -l /dev/drbd/by-res/xcp-volume-75c11231-0fb8-4b40-9e2e-a0665bb758c0/0
Also, I don't have all the info in your previous log. Can you send me the previous SMlog files? (By private message if you prefer.)
-
@ronan-a said in XOSTOR hyperconvergence preview:
Can you execute this command on the other hosts please?
As requested
XCPNG01 - Current linstor master
[10:59 XCPNG01 ~]# ls -l /dev/drbd/by-res/xcp-volume-75c11231-0fb8-4b40-9e2e-a0665bb758c0/0
lrwxrwxrwx 1 root root 17 May 22 19:24 /dev/drbd/by-res/xcp-volume-75c11231-0fb8-4b40-9e2e-a0665bb758c0/0 -> ../../../drbd1004
XCPNG02
[11:00 XCPNG02 ~]# ls -l /dev/drbd/by-res/xcp-volume-75c11231-0fb8-4b40-9e2e-a0665bb758c0/0
lrwxrwxrwx 1 root root 17 May 22 19:25 /dev/drbd/by-res/xcp-volume-75c11231-0fb8-4b40-9e2e-a0665bb758c0/0 -> ../../../drbd1004
XCPNG03
[07:31 XCPNG03 ~]# ls -l /dev/drbd/by-res/xcp-volume-75c11231-0fb8-4b40-9e2e-a0665bb758c0/0
lrwxrwxrwx 1 root root 17 May 22 19:24 /dev/drbd/by-res/xcp-volume-75c11231-0fb8-4b40-9e2e-a0665bb758c0/0 -> ../../../drbd1004
XCPNG04
[07:35 XCPNG04 ~]# ls -l /dev/drbd/by-res/xcp-volume-75c11231-0fb8-4b40-9e2e-a0665bb758c0/0
lrwxrwxrwx 1 root root 17 May 22 19:24 /dev/drbd/by-res/xcp-volume-75c11231-0fb8-4b40-9e2e-a0665bb758c0/0 -> ../../../drbd1004
XCPNG05
[10:49 XCPNG05 ~]# ls -l /dev/drbd/by-res/xcp-volume-75c11231-0fb8-4b40-9e2e-a0665bb758c0/0
lrwxrwxrwx 1 root root 17 May 22 19:24 /dev/drbd/by-res/xcp-volume-75c11231-0fb8-4b40-9e2e-a0665bb758c0/0 -> ../../../drbd1004
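To compare the per-host results at a glance, a small helper can confirm that every host resolves the resource to the same DRBD device. This is only a sketch: the function name check_same_target and the "host target" input format are my own assumptions, not something XOSTOR or LINSTOR provides.

```shell
#!/usr/bin/env bash
# Reads lines of "<host> <symlink target>" on stdin and reports whether all
# hosts point at the same target. The input format is an assumption made for
# this sketch; it is not produced by any XOSTOR tool.
check_same_target() {
    awk '
        NF < 2 { next }                        # skip blank or malformed lines
        { seen[$2]++; hosts[$2] = hosts[$2] " " $1 }
        END {
            n = 0
            for (t in seen) n++
            if (n == 1) {
                for (t in seen) print "OK: all hosts -> " t
                exit 0
            }
            for (t in seen) print "MISMATCH: " t " on" hosts[t]
            exit 1                             # also reached when there was no input
        }'
}
```

One way to feed it: on each host, print the host name followed by the output of readlink on the /dev/drbd/by-res/.../0 path, collect those lines into one file, and pipe the file into check_same_target.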
-
It seems like xe may be getting mixed up between where a host is running and where the XOSTOR storage is held.
Apologies if I have misunderstood and done something wrong here - but I think this migration should have worked.
I created a new VM on one of my hosts, XCPNG05, using XOSTOR as the VDI SR. I can see that the linstor volumes are on hosts XCPNG01, XCPNG03 and XCPNG05.
XCPNG05 is an Intel server, XCPNG01 and XCPNG03 are AMD. The VM is running on XCPNG05.
Now when I try to migrate the VM's VDI from XOSTOR onto a local disk on the same host the VM is currently running on, I get a warning about incompatible CPUs.
To replicate the issue:
Create new VM test05 on XOSTOR.
VM is created on host XCPNG05.
>xe vm-list name-label=test05
uuid ( RO) : d3f8c52d-be3c-3712-0ccc-a526dcc241a5
name-label ( RW): test05
power-state ( RO): running
>xe vm-disk-list vm=test05
Disk 0 VBD:
uuid ( RO) : a337cd1f-04cc-ce46-fbfb-d5d8e290dc03
vm-name-label ( RO): test05
userdevice ( RW): 0
Disk 0 VDI:
uuid ( RO) : f856680c-c00d-44af-ba3f-16d9952ccb2f
name-label ( RW): test05_vdi
sr-name-label ( RO): XOSTOR01
virtual-size ( RO): 34359738368
>xe sr-list name-label=XOSTOR01
uuid ( RO) : cf896912-cd71-d2b2-488a-xxxxxxxx7c87
name-label ( RW): XOSTOR01
name-description ( RW):
host ( RO): <shared>
type ( RO): linstor
content-type ( RO):
Migrate to local disk (SSD1) on same host (XCPNG05) - this fails migrating to the same host the VM is currently running on due to incompatible CPU.
>xe sr-list name-label=XCPNG05SSD1
uuid ( RO) : c0851501-3a1b-c661-70b9-54373e0d9847
name-label ( RW): XCPNG05SSD1
name-description ( RW):
host ( RO): XCPNG05
type ( RO): lvm
content-type ( RO): user
>xe vdi-pool-migrate uuid=f856680c-c00d-44af-ba3f-16d9952ccb2f sr-uuid=c0851501-3a1b-c661-70b9-54373e0d9847
The VM is incompatible with the CPU features of this host.
vm: d3f8c52d-be3c-3712-0ccc-a526dcc241a5 (test05)
host: 7bd62a77-71d6-4b51-9a86-850dd4ff4b60 (XCPNG05)
reason: VM last booted on a host which had a CPU from a different vendor.
-
@geoffbland Thank you, so ok the VDI is still here on all hosts.
You can try to check the status of the VHD the same way the SMAPI does, using:
/usr/bin/vhd-util query --debug -vsfp -n /dev/drbd/by-res/xcp-volume-75c11231-0fb8-4b40-9e2e-a0665bb758c0/0
If you have this problem on many hosts, I suspect a problem with DRBD, so maybe there is useful info in daemon.log and/or kern.log.
-
@ronan-a said in XOSTOR hyperconvergence preview:
You can try to check the status of the VHD the same way the SMAPI does, using:
/usr/bin/vhd-util query --debug -vsfp -n /dev/drbd/by-res/xcp-volume-75c11231-0fb8-4b40-9e2e-a0665bb758c0/0
useful info in daemon.log and/or kern.log.
This gives the following:
[10:59 XCPNG01 ~]# /usr/bin/vhd-util query --debug -vsfp -n /dev/drbd/by-res/xcp-volume-75c11231-0fb8-4b40-9e2e-a0665bb758c0/0
40960
2061447680
query failed
hidden: 0
I will send logs by direct mail.
-
@geoffbland Okay, so it's probably not related to the driver itself. I will take a look at the logs once I receive them.
-
Failure trying to revert a VM to a snapshot with XOSTOR.
Created a VM with main VDI on XOSTOR (24GB) and with 6 disks each also on XOSTOR (2GB each).
All is running OK.
Now create a snapshot of the VM - this takes quite a while but does eventually succeed.
Now using XO (from sources) click the "Revert VM to this snapshot". This errors and the VM stops.
vm.revert
{ "snapshot": "6032fc73-eb7f-cf64-2481-4346b7b57204" }
{
  "code": "VM_REVERT_FAILED",
  "params": [
    "OpaqueRef:1439fd0f-4e66-44c9-99af-1f8536e59378",
    "OpaqueRef:5ad4c51e-473e-4ab0-877d-2d0dbdb90add"
  ],
  "task": {
    "uuid": "4804fefd-0037-d7dd-9a7c-769230728483",
    "name_label": "Async.VM.revert",
    "name_description": "",
    "allowed_operations": [],
    "current_operations": {},
    "created": "20220527T15:01:42Z",
    "finished": "20220527T15:01:46Z",
    "status": "failure",
    "resident_on": "OpaqueRef:a1e9a8f3-0a79-4824-b29f-d81b3246d190",
    "progress": 1,
    "type": "<none/>",
    "result": "",
    "error_info": [
      "VM_REVERT_FAILED",
      "OpaqueRef:1439fd0f-4e66-44c9-99af-1f8536e59378",
      "OpaqueRef:5ad4c51e-473e-4ab0-877d-2d0dbdb90add"
    ],
    "other_config": {},
    "subtask_of": "OpaqueRef:NULL",
    "subtasks": [],
    "backtrace": "(((process xapi)(filename ocaml/xapi/xapi_vm_snapshot.ml)(line 492))((process xapi)(filename lib/xapi-stdext-pervasives/pervasiveext.ml)(line 24))((process xapi)(filename lib/xapi-stdext-pervasives/pervasiveext.ml)(line 35))((process xapi)(filename ocaml/xapi/message_forwarding.ml)(line 131))((process xapi)(filename lib/xapi-stdext-pervasives/pervasiveext.ml)(line 24))((process xapi)(filename lib/xapi-stdext-pervasives/pervasiveext.ml)(line 35))((process xapi)(filename lib/xapi-stdext-pervasives/pervasiveext.ml)(line 24))((process xapi)(filename ocaml/xapi/rbac.ml)(line 231))((process xapi)(filename ocaml/xapi/server_helpers.ml)(line 103)))"
  },
  "message": "VM_REVERT_FAILED(OpaqueRef:1439fd0f-4e66-44c9-99af-1f8536e59378, OpaqueRef:5ad4c51e-473e-4ab0-877d-2d0dbdb90add)",
  "name": "XapiError",
  "stack": "XapiError: VM_REVERT_FAILED(OpaqueRef:1439fd0f-4e66-44c9-99af-1f8536e59378, OpaqueRef:5ad4c51e-473e-4ab0-877d-2d0dbdb90add)
    at Function.wrap (/opt/xo/xo-builds/xen-orchestra-202204291839/packages/xen-api/src/_XapiError.js:16:12)
    at _default (/opt/xo/xo-builds/xen-orchestra-202204291839/packages/xen-api/src/_getTaskResult.js:11:29)
    at Xapi._addRecordToCache (/opt/xo/xo-builds/xen-orchestra-202204291839/packages/xen-api/src/index.js:949:24)
    at forEach (/opt/xo/xo-builds/xen-orchestra-202204291839/packages/xen-api/src/index.js:983:14)
    at Array.forEach (<anonymous>)
    at Xapi._processEvents (/opt/xo/xo-builds/xen-orchestra-202204291839/packages/xen-api/src/index.js:973:12)
    at Xapi._watchEvents (/opt/xo/xo-builds/xen-orchestra-202204291839/packages/xen-api/src/index.js:1139:14)"
}
Now viewing the VM with XO on the disks tab shows no attached disks - disk tab is blank.
But linstor appears to still have the disks and the snapshot disks too.
| XCPNG01 | xcp-volume-142cb89f-2850-4ac8-a47c-10bb2cfc4692 | xcp-sr-linstor_group | 0 | 1010 | /dev/drbd1010 | 2.02 GiB | Unused | UpToDate |
| XCPNG02 | xcp-volume-142cb89f-2850-4ac8-a47c-10bb2cfc4692 | xcp-sr-linstor_group | 0 | 1010 | /dev/drbd1010 | 2.02 GiB | Unused | UpToDate |
| XCPNG03 | xcp-volume-142cb89f-2850-4ac8-a47c-10bb2cfc4692 | DfltDisklessStorPool | 0 | 1010 | /dev/drbd1010 | | Unused | Diskless |
| XCPNG04 | xcp-volume-142cb89f-2850-4ac8-a47c-10bb2cfc4692 | xcp-sr-linstor_group | 0 | 1010 | /dev/drbd1010 | 2.02 GiB | Unused | UpToDate |
| XCPNG01 | xcp-volume-18fa145a-d36b-44bd-b1b5-af1e9424ea00 | xcp-sr-linstor_group | 0 | 1018 | /dev/drbd1018 | 2.02 GiB | Unused | UpToDate |
| XCPNG02 | xcp-volume-18fa145a-d36b-44bd-b1b5-af1e9424ea00 | xcp-sr-linstor_group | 0 | 1018 | /dev/drbd1018 | 2.02 GiB | Unused | UpToDate |
| XCPNG03 | xcp-volume-18fa145a-d36b-44bd-b1b5-af1e9424ea00 | DfltDisklessStorPool | 0 | 1018 | /dev/drbd1018 | | InUse | Diskless |
| XCPNG04 | xcp-volume-18fa145a-d36b-44bd-b1b5-af1e9424ea00 | DfltDisklessStorPool | 0 | 1018 | /dev/drbd1018 | | Unused | Diskless |
| XCPNG05 | xcp-volume-18fa145a-d36b-44bd-b1b5-af1e9424ea00 | xcp-sr-linstor_group | 0 | 1018 | /dev/drbd1018 | 2.02 GiB | Unused | UpToDate |
| XCPNG01 | xcp-volume-1a6c7272-f718-4c4d-a8b0-ca8419eab314 | DfltDisklessStorPool | 0 | 1024 | /dev/drbd1024 | | Unused | Diskless |
| XCPNG02 | xcp-volume-1a6c7272-f718-4c4d-a8b0-ca8419eab314 | xcp-sr-linstor_group | 0 | 1024 | /dev/drbd1024 | 2.02 GiB | Unused | UpToDate |
| XCPNG03 | xcp-volume-1a6c7272-f718-4c4d-a8b0-ca8419eab314 | xcp-sr-linstor_group | 0 | 1024 | /dev/drbd1024 | 2.02 GiB | Unused | UpToDate |
| XCPNG04 | xcp-volume-1a6c7272-f718-4c4d-a8b0-ca8419eab314 | xcp-sr-linstor_group | 0 | 1024 | /dev/drbd1024 | 2.02 GiB | Unused | UpToDate |
| XCPNG05 | xcp-volume-1a6c7272-f718-4c4d-a8b0-ca8419eab314 | DfltDisklessStorPool | 0 | 1024 | /dev/drbd1024 | | Unused | Diskless |
| XCPNG01 | xcp-volume-2cab6c2d-abf6-42c7-9094-d75351ed8ebb | xcp-sr-linstor_group | 0 | 1016 | /dev/drbd1016 | 2.02 GiB | Unused | UpToDate |
| XCPNG02 | xcp-volume-2cab6c2d-abf6-42c7-9094-d75351ed8ebb | xcp-sr-linstor_group | 0 | 1016 | /dev/drbd1016 | 2.02 GiB | Unused | UpToDate |
| XCPNG03 | xcp-volume-2cab6c2d-abf6-42c7-9094-d75351ed8ebb | xcp-sr-linstor_group | 0 | 1016 | /dev/drbd1016 | 2.02 GiB | Unused | UpToDate |
| XCPNG04 | xcp-volume-2cab6c2d-abf6-42c7-9094-d75351ed8ebb | DfltDisklessStorPool | 0 | 1016 | /dev/drbd1016 | | Unused | Diskless |
| XCPNG05 | xcp-volume-2cab6c2d-abf6-42c7-9094-d75351ed8ebb | DfltDisklessStorPool | 0 | 1016 | /dev/drbd1016 | | Unused | Diskless |
| XCPNG01 | xcp-volume-30bf014b-025d-4f3f-a068-f9a9bf34fab2 | xcp-sr-linstor_group | 0 | 1013 | /dev/drbd1013 | 2.02 GiB | Unused | UpToDate |
| XCPNG02 | xcp-volume-30bf014b-025d-4f3f-a068-f9a9bf34fab2 | xcp-sr-linstor_group | 0 | 1013 | /dev/drbd1013 | 2.02 GiB | Unused | UpToDate |
| XCPNG03 | xcp-volume-30bf014b-025d-4f3f-a068-f9a9bf34fab2 | xcp-sr-linstor_group | 0 | 1013 | /dev/drbd1013 | 2.02 GiB | Unused | UpToDate |
| XCPNG01 | xcp-volume-3bdb2b25-706c-4309-ab8f-df3190f57c43 | DfltDisklessStorPool | 0 | 1021 | /dev/drbd1021 | | Unused | Diskless |
| XCPNG02 | xcp-volume-3bdb2b25-706c-4309-ab8f-df3190f57c43 | xcp-sr-linstor_group | 0 | 1021 | /dev/drbd1021 | 2.02 GiB | Unused | UpToDate |
| XCPNG03 | xcp-volume-3bdb2b25-706c-4309-ab8f-df3190f57c43 | xcp-sr-linstor_group | 0 | 1021 | /dev/drbd1021 | 2.02 GiB | Unused | UpToDate |
| XCPNG04 | xcp-volume-3bdb2b25-706c-4309-ab8f-df3190f57c43 | DfltDisklessStorPool | 0 | 1021 | /dev/drbd1021 | | Unused | Diskless |
| XCPNG05 | xcp-volume-3bdb2b25-706c-4309-ab8f-df3190f57c43 | xcp-sr-linstor_group | 0 | 1021 | /dev/drbd1021 | 2.02 GiB | Unused | UpToDate |
| XCPNG01 | xcp-volume-450f65f7-7fcc-4ffd-893e-761a2f6ac366 | xcp-sr-linstor_group | 0 | 1020 | /dev/drbd1020 | 2.02 GiB | Unused | UpToDate |
| XCPNG02 | xcp-volume-450f65f7-7fcc-4ffd-893e-761a2f6ac366 | DfltDisklessStorPool | 0 | 1020 | /dev/drbd1020 | | Unused | Diskless |
| XCPNG03 | xcp-volume-450f65f7-7fcc-4ffd-893e-761a2f6ac366 | DfltDisklessStorPool | 0 | 1020 | /dev/drbd1020 | | Unused | Diskless |
| XCPNG04 | xcp-volume-450f65f7-7fcc-4ffd-893e-761a2f6ac366 | xcp-sr-linstor_group | 0 | 1020 | /dev/drbd1020 | 2.02 GiB | Unused | UpToDate |
| XCPNG05 | xcp-volume-450f65f7-7fcc-4ffd-893e-761a2f6ac366 | xcp-sr-linstor_group | 0 | 1020 | /dev/drbd1020 | 2.02 GiB | Unused | UpToDate |
| XCPNG01 | xcp-volume-466938db-11f1-4b59-8a90-ad08fa20e085 | xcp-sr-linstor_group | 0 | 1015 | /dev/drbd1015 | 2.02 GiB | Unused | UpToDate |
| XCPNG02 | xcp-volume-466938db-11f1-4b59-8a90-ad08fa20e085 | xcp-sr-linstor_group | 0 | 1015 | /dev/drbd1015 | 2.02 GiB | Unused | UpToDate |
| XCPNG03 | xcp-volume-466938db-11f1-4b59-8a90-ad08fa20e085 | DfltDisklessStorPool | 0 | 1015 | /dev/drbd1015 | | Unused | Diskless |
| XCPNG04 | xcp-volume-466938db-11f1-4b59-8a90-ad08fa20e085 | xcp-sr-linstor_group | 0 | 1015 | /dev/drbd1015 | 2.02 GiB | Unused | UpToDate |
| XCPNG05 | xcp-volume-466938db-11f1-4b59-8a90-ad08fa20e085 | DfltDisklessStorPool | 0 | 1015 | /dev/drbd1015 | | Unused | Diskless |
| XCPNG01 | xcp-volume-470dcf6f-d916-403d-8258-e012c065b8ec | xcp-sr-linstor_group | 0 | 1009 | /dev/drbd1009 | 2.02 GiB | Unused | UpToDate |
| XCPNG02 | xcp-volume-470dcf6f-d916-403d-8258-e012c065b8ec | xcp-sr-linstor_group | 0 | 1009 | /dev/drbd1009 | 2.02 GiB | Unused | UpToDate |
| XCPNG03 | xcp-volume-470dcf6f-d916-403d-8258-e012c065b8ec | DfltDisklessStorPool | 0 | 1009 | /dev/drbd1009 | | Unused | Diskless |
| XCPNG04 | xcp-volume-470dcf6f-d916-403d-8258-e012c065b8ec | xcp-sr-linstor_group | 0 | 1009 | /dev/drbd1009 | 2.02 GiB | Unused | UpToDate |
| XCPNG01 | xcp-volume-551db5b5-7772-407a-9e8c-e549db3a0e5f | xcp-sr-linstor_group | 0 | 1008 | /dev/drbd1008 | 2.02 GiB | Unused | UpToDate |
| XCPNG02 | xcp-volume-551db5b5-7772-407a-9e8c-e549db3a0e5f | xcp-sr-linstor_group | 0 | 1008 | /dev/drbd1008 | 2.02 GiB | Unused | UpToDate |
| XCPNG03 | xcp-volume-551db5b5-7772-407a-9e8c-e549db3a0e5f | DfltDisklessStorPool | 0 | 1008 | /dev/drbd1008 | | Unused | Diskless |
| XCPNG04 | xcp-volume-551db5b5-7772-407a-9e8c-e549db3a0e5f | xcp-sr-linstor_group | 0 | 1008 | /dev/drbd1008 | 2.02 GiB | Unused | UpToDate |
| XCPNG01 | xcp-volume-699871db-2319-4ddd-9a44-0514d2e7aee3 | xcp-sr-linstor_group | 0 | 1025 | /dev/drbd1025 | 2.02 GiB | Unused | UpToDate |
| XCPNG02 | xcp-volume-699871db-2319-4ddd-9a44-0514d2e7aee3 | DfltDisklessStorPool | 0 | 1025 | /dev/drbd1025 | | Unused | Diskless |
| XCPNG03 | xcp-volume-699871db-2319-4ddd-9a44-0514d2e7aee3 | DfltDisklessStorPool | 0 | 1025 | /dev/drbd1025 | | Unused | Diskless |
| XCPNG04 | xcp-volume-699871db-2319-4ddd-9a44-0514d2e7aee3 | xcp-sr-linstor_group | 0 | 1025 | /dev/drbd1025 | 2.02 GiB | Unused | UpToDate |
| XCPNG05 | xcp-volume-699871db-2319-4ddd-9a44-0514d2e7aee3 | xcp-sr-linstor_group | 0 | 1025 | /dev/drbd1025 | 2.02 GiB | Unused | UpToDate |
| XCPNG01 | xcp-volume-6c96822b-7ded-41dd-b4ff-690dc4795ee7 | xcp-sr-linstor_group | 0 | 1023 | /dev/drbd1023 | 2.02 GiB | Unused | UpToDate |
| XCPNG02 | xcp-volume-6c96822b-7ded-41dd-b4ff-690dc4795ee7 | DfltDisklessStorPool | 0 | 1023 | /dev/drbd1023 | | Unused | Diskless |
| XCPNG03 | xcp-volume-6c96822b-7ded-41dd-b4ff-690dc4795ee7 | xcp-sr-linstor_group | 0 | 1023 | /dev/drbd1023 | 2.02 GiB | Unused | UpToDate |
| XCPNG04 | xcp-volume-6c96822b-7ded-41dd-b4ff-690dc4795ee7 | DfltDisklessStorPool | 0 | 1023 | /dev/drbd1023 | | Unused | Diskless |
| XCPNG05 | xcp-volume-6c96822b-7ded-41dd-b4ff-690dc4795ee7 | xcp-sr-linstor_group | 0 | 1023 | /dev/drbd1023 | 2.02 GiB | Unused | UpToDate |
| XCPNG01 | xcp-volume-70004559-a2c4-480f-b7bc-b26dcb95bfba | xcp-sr-linstor_group | 0 | 1027 | /dev/drbd1027 | 24.06 GiB | Unused | UpToDate |
| XCPNG02 | xcp-volume-70004559-a2c4-480f-b7bc-b26dcb95bfba | xcp-sr-linstor_group | 0 | 1027 | /dev/drbd1027 | 24.06 GiB | Unused | UpToDate |
| XCPNG03 | xcp-volume-70004559-a2c4-480f-b7bc-b26dcb95bfba | DfltDisklessStorPool | 0 | 1027 | /dev/drbd1027 | | Unused | Diskless |
| XCPNG04 | xcp-volume-70004559-a2c4-480f-b7bc-b26dcb95bfba | xcp-sr-linstor_group | 0 | 1027 | /dev/drbd1027 | 24.06 GiB | Unused | UpToDate |
| XCPNG05 | xcp-volume-70004559-a2c4-480f-b7bc-b26dcb95bfba | DfltDisklessStorPool | 0 | 1027 | /dev/drbd1027 | | Unused | Diskless |
| XCPNG01 | xcp-volume-707a0158-ad31-4b4b-af2b-20d89e5717de | DfltDisklessStorPool | 0 | 1026 | /dev/drbd1026 | | Unused | Diskless |
| XCPNG02 | xcp-volume-707a0158-ad31-4b4b-af2b-20d89e5717de | xcp-sr-linstor_group | 0 | 1026 | /dev/drbd1026 | 24.06 GiB | Unused | UpToDate |
| XCPNG03 | xcp-volume-707a0158-ad31-4b4b-af2b-20d89e5717de | xcp-sr-linstor_group | 0 | 1026 | /dev/drbd1026 | 24.06 GiB | Unused | UpToDate |
| XCPNG04 | xcp-volume-707a0158-ad31-4b4b-af2b-20d89e5717de | DfltDisklessStorPool | 0 | 1026 | /dev/drbd1026 | | Unused | Diskless |
| XCPNG05 | xcp-volume-707a0158-ad31-4b4b-af2b-20d89e5717de | xcp-sr-linstor_group | 0 | 1026 | /dev/drbd1026 | 24.06 GiB | Unused | UpToDate |
| XCPNG02 | xcp-volume-7aaa7a6e-98c4-4a57-a4f1-4fea0a36b17a | xcp-sr-linstor_group | 0 | 1011 | /dev/drbd1011 | 2.02 GiB | Unused | UpToDate |
| XCPNG03 | xcp-volume-7aaa7a6e-98c4-4a57-a4f1-4fea0a36b17a | xcp-sr-linstor_group | 0 | 1011 | /dev/drbd1011 | 2.02 GiB | Unused | UpToDate |
| XCPNG04 | xcp-volume-7aaa7a6e-98c4-4a57-a4f1-4fea0a36b17a | xcp-sr-linstor_group | 0 | 1011 | /dev/drbd1011 | 2.02 GiB | Unused | UpToDate |
| XCPNG01 | xcp-volume-9320c158-489e-49e7-92b8-85c93c9e3eeb | xcp-sr-linstor_group | 0 | 1022 | /dev/drbd1022 | 2.02 GiB | Unused | UpToDate |
| XCPNG02 | xcp-volume-9320c158-489e-49e7-92b8-85c93c9e3eeb | xcp-sr-linstor_group | 0 | 1022 | /dev/drbd1022 | 2.02 GiB | Unused | UpToDate |
| XCPNG03 | xcp-volume-9320c158-489e-49e7-92b8-85c93c9e3eeb | DfltDisklessStorPool | 0 | 1022 | /dev/drbd1022 | | Unused | Diskless |
| XCPNG04 | xcp-volume-9320c158-489e-49e7-92b8-85c93c9e3eeb | xcp-sr-linstor_group | 0 | 1022 | /dev/drbd1022 | 2.02 GiB | Unused | UpToDate |
| XCPNG05 | xcp-volume-9320c158-489e-49e7-92b8-85c93c9e3eeb | DfltDisklessStorPool | 0 | 1022 | /dev/drbd1022 | | Unused | Diskless |
| XCPNG02 | xcp-volume-b341848b-01d1-4019-a62f-85c6108a53e3 | xcp-sr-linstor_group | 0 | 1006 | /dev/drbd1006 | 24.06 GiB | Unused | UpToDate |
| XCPNG03 | xcp-volume-b341848b-01d1-4019-a62f-85c6108a53e3 | xcp-sr-linstor_group | 0 | 1006 | /dev/drbd1006 | 24.06 GiB | Unused | UpToDate |
| XCPNG05 | xcp-volume-b341848b-01d1-4019-a62f-85c6108a53e3 | xcp-sr-linstor_group | 0 | 1006 | /dev/drbd1006 | 24.06 GiB | Unused | UpToDate |
| XCPNG01 | xcp-volume-bccefe12-9ff5-4317-b05c-515cb44a5710 | DfltDisklessStorPool | 0 | 1014 | /dev/drbd1014 | | Unused | Diskless |
| XCPNG02 | xcp-volume-bccefe12-9ff5-4317-b05c-515cb44a5710 | xcp-sr-linstor_group | 0 | 1014 | /dev/drbd1014 | 2.02 GiB | Unused | UpToDate |
| XCPNG03 | xcp-volume-bccefe12-9ff5-4317-b05c-515cb44a5710 | xcp-sr-linstor_group | 0 | 1014 | /dev/drbd1014 | 2.02 GiB | Unused | UpToDate |
| XCPNG04 | xcp-volume-bccefe12-9ff5-4317-b05c-515cb44a5710 | xcp-sr-linstor_group | 0 | 1014 | /dev/drbd1014 | 2.02 GiB | Unused | UpToDate |
| XCPNG05 | xcp-volume-bccefe12-9ff5-4317-b05c-515cb44a5710 | DfltDisklessStorPool | 0 | 1014 | /dev/drbd1014 | | Unused | Diskless |
| XCPNG01 | xcp-volume-cdc051ae-bc39-4012-9ce0-6e4f855a5063 | xcp-sr-linstor_group | 0 | 1012 | /dev/drbd1012 | 2.02 GiB | Unused | UpToDate |
| XCPNG02 | xcp-volume-cdc051ae-bc39-4012-9ce0-6e4f855a5063 | xcp-sr-linstor_group | 0 | 1012 | /dev/drbd1012 | 2.02 GiB | Unused | UpToDate |
| XCPNG03 | xcp-volume-cdc051ae-bc39-4012-9ce0-6e4f855a5063 | DfltDisklessStorPool | 0 | 1012 | /dev/drbd1012 | | Unused | Diskless |
| XCPNG04 | xcp-volume-cdc051ae-bc39-4012-9ce0-6e4f855a5063 | xcp-sr-linstor_group | 0 | 1012 | /dev/drbd1012 | 2.02 GiB | Unused | UpToDate |
| XCPNG01 | xcp-volume-d5a744ec-d1a1-4116-a576-38608b9dd790 | DfltDisklessStorPool | 0 | 1019 | /dev/drbd1019 | | Unused | Diskless |
| XCPNG02 | xcp-volume-d5a744ec-d1a1-4116-a576-38608b9dd790 | xcp-sr-linstor_group | 0 | 1019 | /dev/drbd1019 | 2.02 GiB | Unused | UpToDate |
| XCPNG03 | xcp-volume-d5a744ec-d1a1-4116-a576-38608b9dd790 | xcp-sr-linstor_group | 0 | 1019 | /dev/drbd1019 | 2.02 GiB | Unused | UpToDate |
| XCPNG04 | xcp-volume-d5a744ec-d1a1-4116-a576-38608b9dd790 | xcp-sr-linstor_group | 0 | 1019 | /dev/drbd1019 | 2.02 GiB | Unused | UpToDate |
| XCPNG05 | xcp-volume-d5a744ec-d1a1-4116-a576-38608b9dd790 | DfltDisklessStorPool | 0 | 1019 | /dev/drbd1019 | | Unused | Diskless |
| XCPNG01 | xcp-volume-f9cf9143-829d-4246-9051-9102f2c4709c | DfltDisklessStorPool | 0 | 1017 | /dev/drbd1017 | | Unused | Diskless |
| XCPNG02 | xcp-volume-f9cf9143-829d-4246-9051-9102f2c4709c | xcp-sr-linstor_group | 0 | 1017 | /dev/drbd1017 | 2.02 GiB | Unused | UpToDate |
| XCPNG03 | xcp-volume-f9cf9143-829d-4246-9051-9102f2c4709c | xcp-sr-linstor_group | 0 | 1017 | /dev/drbd1017 | 2.02 GiB | Unused | UpToDate |
| XCPNG04 | xcp-volume-f9cf9143-829d-4246-9051-9102f2c4709c | xcp-sr-linstor_group | 0 | 1017 | /dev/drbd1017 | 2.02 GiB | Unused | UpToDate |
| XCPNG05 | xcp-volume-f9cf9143-829d-4246-9051-9102f2c4709c | DfltDisklessStorPool | 0 | 1017 | /dev/drbd1017 | | Unused | Diskless |
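A listing this long is hard to read by eye, so here is a rough sketch that condenses it to one line per volume: how many UpToDate replicas it has and which node, if any, has it InUse. It assumes the rows have been saved with "|" as the column separator and in the column order of the listing above; the function name summarise_replicas is made up for this example.

```shell
#!/usr/bin/env bash
# Counts UpToDate (diskful) replicas per resource from pipe-separated rows
# read on stdin, and lists the nodes where each resource is InUse.
# Assumed column order (taken from the listing above, not from LINSTOR docs):
# node | resource | pool | volnr | minor | device | allocated | inuse | state
summarise_replicas() {
    awk -F'|' '
        function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
        NF < 10 { next }                       # not a data row
        {
            node = trim($2); res = trim($3)
            inuse = trim($9); state = trim($10)
            if (!(res in up)) { up[res] = 0; order[++n] = res }
            if (state == "UpToDate") up[res]++
            if (inuse == "InUse") used[res] = used[res] " " node
        }
        END {
            for (i = 1; i <= n; i++) {
                r = order[i]
                who = ((r in used) ? substr(used[r], 2) : "-")
                printf "%s uptodate=%d inuse=%s\n", r, up[r], who
            }
        }'
}
```

Volumes are printed in the order they first appear, so the output can be compared directly against the disks the VM is supposed to have.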
From the VM disks tab if I try to Attach the disks, two of the disks created on XOSTOR are missing (data1 and data4).
Finally if I go to storage and bring up the XOSTOR storage and then press "Rescan all disks" I get this error:
sr.scan
{ "id": "cf896912-cd71-d2b2-488a-5792b7147c87" }
{
  "code": "SR_BACKEND_FAILURE_46",
  "params": [
    "",
    "The VDI is not available [opterr=Could not load 735fc2d7-f1f0-4cc6-9d35-42a049d8ec6c because: ['XENAPI_PLUGIN_FAILURE', 'getVHDInfo', 'CommandException', 'No such file or directory']]",
    ""
  ],
  "task": {
    "uuid": "4dcac885-dfaa-784a-eb2d-02335efde0fb",
    "name_label": "Async.SR.scan",
    "name_description": "",
    "allowed_operations": [],
    "current_operations": {},
    "created": "20220527T16:27:36Z",
    "finished": "20220527T16:27:50Z",
    "status": "failure",
    "resident_on": "OpaqueRef:a1e9a8f3-0a79-4824-b29f-d81b3246d190",
    "progress": 1,
    "type": "<none/>",
    "result": "",
    "error_info": [
      "SR_BACKEND_FAILURE_46",
      "",
      "The VDI is not available [opterr=Could not load 735fc2d7-f1f0-4cc6-9d35-42a049d8ec6c because: ['XENAPI_PLUGIN_FAILURE', 'getVHDInfo', 'CommandException', 'No such file or directory']]",
      ""
    ],
    "other_config": {},
    "subtask_of": "OpaqueRef:NULL",
    "subtasks": [],
    "backtrace": "(((process xapi)(filename lib/backtrace.ml)(line 210))((process xapi)(filename ocaml/xapi/storage_access.ml)(line 32))((process xapi)(filename lib/xapi-stdext-pervasives/pervasiveext.ml)(line 24))((process xapi)(filename lib/xapi-stdext-pervasives/pervasiveext.ml)(line 35))((process xapi)(filename ocaml/xapi/message_forwarding.ml)(line 128))((process xapi)(filename lib/xapi-stdext-pervasives/pervasiveext.ml)(line 24))((process xapi)(filename ocaml/xapi/rbac.ml)(line 231))((process xapi)(filename ocaml/xapi/server_helpers.ml)(line 103)))"
  },
  "message": "SR_BACKEND_FAILURE_46(, The VDI is not available [opterr=Could not load 735fc2d7-f1f0-4cc6-9d35-42a049d8ec6c because: ['XENAPI_PLUGIN_FAILURE', 'getVHDInfo', 'CommandException', 'No such file or directory']], )",
  "name": "XapiError",
  "stack": "XapiError: SR_BACKEND_FAILURE_46(, The VDI is not available [opterr=Could not load 735fc2d7-f1f0-4cc6-9d35-42a049d8ec6c because: ['XENAPI_PLUGIN_FAILURE', 'getVHDInfo', 'CommandException', 'No such file or directory']], )
    at Function.wrap (/opt/xo/xo-builds/xen-orchestra-202204291839/packages/xen-api/src/_XapiError.js:16:12)
    at _default (/opt/xo/xo-builds/xen-orchestra-202204291839/packages/xen-api/src/_getTaskResult.js:11:29)
    at Xapi._addRecordToCache (/opt/xo/xo-builds/xen-orchestra-202204291839/packages/xen-api/src/index.js:949:24)
    at forEach (/opt/xo/xo-builds/xen-orchestra-202204291839/packages/xen-api/src/index.js:983:14)
    at Array.forEach (<anonymous>)
    at Xapi._processEvents (/opt/xo/xo-builds/xen-orchestra-202204291839/packages/xen-api/src/index.js:973:12)
    at Xapi._watchEvents (/opt/xo/xo-builds/xen-orchestra-202204291839/packages/xen-api/src/index.js:1139:14)"
}
-
@ronan-a said in XOSTOR hyperconvergence preview:
Okay, so it's probably not related to the driver itself. I will take a look at the logs once I receive them.
Did you get chance to look at the logs I sent?
-
@geoffbland So, I didn't notice useful info outside of:
FIXME drbd_a_xcp-volu[24302] op clear, bitmap locked for 'set_n_write sync_handshake' by drbd_r_xcp-volu[24231]
...
FIXME drbd_a_xcp-volu[24328] op clear, bitmap locked for 'demote' by drbd_w_xcp-volu[24188]
Like I said in my e-mail, maybe there are more details in another log file. I hope.
-
@ronan-a said in XOSTOR hyperconvergence preview:
Like I said in my e-mail, maybe there are more details in another log file. I hope.
In the end I realised I was trying to "use" XOSTOR whilst testing rather than properly test it. So I decided to rip it all down and start again, this time properly recording each step so any issues can be replicated. I will let you know how this goes.
-
@ronan-a Could you please take a look at this issue I raised elsewhere on the forums.
I am currently unable to create new VMs, getting a No such Tapdisk error - checking down the stack trace - it seems to be coming from a get() call in /opt/xensource/sm/blktap2.py and this code seems to have been changed in a XOSTOR release from 24th May.
-
@geoffbland Hi, I was away last week, I will take a look.
-
This is a supercool and amazing thing. Coming from Proxmox with Ceph I really feel this was a missing piece. I want to migrate my whole homelab to this!
I've been playing around with it a fair bit, and it works well, but when it came to enabling HA I ran into trouble. Is XOSTOR not a valid shared storage target to enable HA on?
Having multiple shared storages (NFS/CIFS etc) in production is a given to put backups and whatnot on, but I thought it was weird that I couldn't use XOSTOR storage to enable HA.
-
@yrrips said in XOSTOR hyperconvergence preview:
when it came to enabling HA I ran into trouble. Is XOSTOR not a valid shared storage target to enable HA on?
I'm also really hoping that XOSTOR works well and also feel like this is something XCP-NG really needs. I've tried other distributed storage solutions, notably GlusterFS but never found anything that really works 100% when outages occur.
Note I plan to use XOSTOR just for VM disks and any data they use - not for the HA share and not for backups. My logic is that:
- backups should not be anywhere near XCP-NG and should be isolated software and hardware - and hopefully location wise.
- HA share needs to survive an issue where XCP-NG HA fails; if XCP-NG quorum is not working then XOSTOR (Linstor) quorum may be affected in a similar way - so I keep my HA share on a reliable NFS share. Note that in testing I found that if the HA share is not available for a while, XCP-NG stays running OK (just don't make any configuration changes until the HA share is back).
-
@yrrips I fixed several problems with HA and the linstor driver. I don't know what your sm version is, but I updated it a few weeks ago (current: sm-2.30.6-1.2.0.linstor.1.xcpng8.2.x86_64). Could you give me more details?
-
I have been waiting for this for some time, after XOSAN didn't really work for me and neither did OVH's SAN offering.
Is there a documentation page to look at rather than going through the thread?
In the absence of that, is there a minimum recommended host count?
Would love to give this a try but would mean playing with production VMs!
-
Hello @markhewitt1978 !
- First post is a complete guide
- 3 hosts is fine, no less. No problem with more.
We are still investigating a bug we discovered recently, so I wouldn't play in production right now (unless you are very confident and you have plenty of backups).
-
Is it possible to use a separate network for the XOSTOR/Linstor disk replication from the "main" network used for XCP-ng servers?
If so when the SR is created with this command:
xe sr-create type=linstor name-label=XOSTOR host-uuid=bc3cd3af-3f09-48cf-ae55-515ba21930f5 device-config:hosts=host-a,host-b,host-c,host-d device-config:group-name=linstor_group/thin_device device-config:redundancy=4 shared=true device-config:provisioning=thin
Do the device-config:hosts need to be XCP-ng hosts - or can IP addresses on the "data-replication" network be provided here?
For example, my XCP-NG servers have dual NICs. I could use the second NIC on a private network/switch with a different subnet to the "main" hosts and use this solely for XOSTOR/Linstor disk replication. Is this possible?
-
Doing some more testing on XOSTOR and starting from scratch again. A brand-new XCP-ng installation made onto a new 3 server pool, each server with a blank 4TB disk.
All servers have xcpng patched up to date.
Then I installed XOSTOR onto each of these servers, Linstor installed OK and I have the linstor group on each.
[19:22 XCPNG30 ~]# vgs
VG #PV #LV #SN Attr VSize VFree
VG_XenStorage-a776b6b1-9a96-e179-ea12-f2419ae512b6 1 1 0 wz--n- <405.62g 405.61g
linstor_group 1 0 0 wz--n- <3.64t <3.64t
[19:22 XCPNG30 ~]# rpm -qa | grep -E "^(sm|xha)-.*linstor.*"
xha-10.1.0-2.2.0.linstor.2.xcpng8.2.x86_64
sm-2.30.7-1.3.0.linstor.1.xcpng8.2.x86_64
sm-rawhba-2.30.7-1.3.0.linstor.1.xcpng8.2.x86_64
[19:21 XCPNG31 ~]# vgs
VG #PV #LV #SN Attr VSize VFree
VG_XenStorage-f75785ef-df30-b54c-2af4-84d19c966453 1 1 0 wz--n- <405.62g 405.61g
linstor_group 1 0 0 wz--n- <3.64t <3.64t
[19:21 XCPNG31 ~]# rpm -qa | grep -E "^(sm|xha)-.*linstor.*"
xha-10.1.0-2.2.0.linstor.2.xcpng8.2.x86_64
sm-2.30.7-1.3.0.linstor.1.xcpng8.2.x86_64
sm-rawhba-2.30.7-1.3.0.linstor.1.xcpng8.2.x86_64
[19:23 XCPNG32 ~]# vgs
VG #PV #LV #SN Attr VSize VFree
VG_XenStorage-abaf8356-fc58-9124-a23b-c29e7e67c983 1 1 0 wz--n- <405.62g 405.61g
linstor_group 1 0 0 wz--n- <3.64t <3.64t
[19:23 XCPNG32 ~]# rpm -qa | grep -E "^(sm|xha)-.*linstor.*"
xha-10.1.0-2.2.0.linstor.2.xcpng8.2.x86_64
sm-2.30.7-1.3.0.linstor.1.xcpng8.2.x86_64
sm-rawhba-2.30.7-1.3.0.linstor.1.xcpng8.2.x86_64
[19:26 XCPNG31 ~]# xe host-list
uuid ( RO) : 7c3f2fae-0456-4155-a9ad-43790fcb4155
name-label ( RW): XCPNG32
name-description ( RW): Default install
uuid ( RO) : 2e48b46a-c420-4957-9233-3e029ea39305
name-label ( RW): XCPNG30
name-description ( RW): Default install
uuid ( RO) : 7aaaf4a5-0e43-442e-a9b1-38620c87fd69
name-label ( RW): XCPNG31
name-description ( RW): Default install
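For anyone repeating this check on several hosts, the package test can be wrapped in a small function. This is a sketch only: has_linstor_packages is a name I made up, and it simply looks for sm and xha packages whose version string contains "linstor", as in the rpm output above.

```shell
#!/usr/bin/env bash
# Reads `rpm -qa` output on stdin and succeeds only if both the sm and the
# xha package are present as linstor builds. The pattern mirrors the package
# names shown in this thread; other releases may be named differently.
has_linstor_packages() {
    local pkgs
    pkgs=$(cat)
    grep -Eq '^sm-[0-9].*linstor' <<<"$pkgs" &&
        grep -Eq '^xha-[0-9].*linstor' <<<"$pkgs"
}
```

On a host it would be run as: rpm -qa | has_linstor_packages && echo "linstor packages present"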
But I am not able to create the SR.
xe sr-create type=linstor name-label=XOSTOR01 host-uuid=7aaaf4a5-0e43-442e-a9b1-38620c87fd69 device-config:hosts=xcpng30,xcpng31,xcpng32 device-config:group-name=linstor_group device-config:redundancy=2 shared=true device-config:provisioning=thick
This gives the following error:
Error code: SR_BACKEND_FAILURE_5006 Error parameters: , LINSTOR SR creation error [opterr=Not enough online hosts],
Here's the error in the SMLog
Jul 15 19:29:22 XCPNG31 SM: [9747] sr_create {'sr_uuid': '14aa2b8b-430f-34e5-fb74-c37667cb18ec', 'subtask_of': 'DummyRef:|d39839f1-ee3a-4bfe-8a41-7a077f4f2640|SR.create', 'args': ['0'], 'host_ref': 'OpaqueRef:196f738d-24fa-4598-8e96-4a13390abc87', 'session_ref': 'OpaqueRef:e806b347-1e5f-4644-842f-26a7b06b2561', 'device_config': {'group-name': 'linstor_group', 'redundancy': '2', 'hosts': 'xcpng30,xcpng31,xcpng32', 'SRmaster': 'true', 'provisioning': 'thick'}, 'command': 'sr_create', 'sr_ref': 'OpaqueRef:7ded7feb-729f-47c3-9893-1b62db0b7e17'}
Jul 15 19:29:22 XCPNG31 SM: [9747] LinstorSR.create for 14aa2b8b-430f-34e5-fb74-c37667cb18ec
Jul 15 19:29:22 XCPNG31 SM: [9747] Raising exception [5006, LINSTOR SR creation error [opterr=Not enough online hosts]]
Jul 15 19:29:22 XCPNG31 SM: [9747] lock: released /var/lock/sm/14aa2b8b-430f-34e5-fb74-c37667cb18ec/sr
Jul 15 19:29:22 XCPNG31 SM: [9747] ***** generic exception: sr_create: EXCEPTION <class 'SR.SROSError'>, LINSTOR SR creation error [opterr=Not enough online hosts]
Jul 15 19:29:22 XCPNG31 SM: [9747] File "/opt/xensource/sm/SRCommand.py", line 110, in run
Jul 15 19:29:22 XCPNG31 SM: [9747] return self._run_locked(sr)
Jul 15 19:29:22 XCPNG31 SM: [9747] File "/opt/xensource/sm/SRCommand.py", line 159, in _run_locked
Jul 15 19:29:22 XCPNG31 SM: [9747] rv = self._run(sr, target)
Jul 15 19:29:22 XCPNG31 SM: [9747] File "/opt/xensource/sm/SRCommand.py", line 323, in _run
Jul 15 19:29:22 XCPNG31 SM: [9747] return sr.create(self.params['sr_uuid'], long(self.params['args'][0]))
Jul 15 19:29:22 XCPNG31 SM: [9747] File "/opt/xensource/sm/LinstorSR", line 612, in wrap
Jul 15 19:29:22 XCPNG31 SM: [9747] return load(self, *args, **kwargs)
Jul 15 19:29:22 XCPNG31 SM: [9747] File "/opt/xensource/sm/LinstorSR", line 597, in load
Jul 15 19:29:22 XCPNG31 SM: [9747] return wrapped_method(self, *args, **kwargs)
Jul 15 19:29:22 XCPNG31 SM: [9747] File "/opt/xensource/sm/LinstorSR", line 443, in wrapped_method
Jul 15 19:29:22 XCPNG31 SM: [9747] return method(self, *args, **kwargs)
Jul 15 19:29:22 XCPNG31 SM: [9747] File "/opt/xensource/sm/LinstorSR", line 688, in create
Jul 15 19:29:22 XCPNG31 SM: [9747] opterr='Not enough online hosts'
Jul 15 19:29:22 XCPNG31 SM: [9747]
Jul 15 19:29:22 XCPNG31 SM: [9747] ***** LINSTOR resources on XCP-ng: EXCEPTION <class 'SR.SROSError'>, LINSTOR SR creation error [opterr=Not enough online hosts]
Jul 15 19:29:22 XCPNG31 SM: [9747] File "/opt/xensource/sm/SRCommand.py", line 378, in run
Jul 15 19:29:22 XCPNG31 SM: [9747] ret = cmd.run(sr)
Jul 15 19:29:22 XCPNG31 SM: [9747] File "/opt/xensource/sm/SRCommand.py", line 110, in run
Jul 15 19:29:22 XCPNG31 SM: [9747] return self._run_locked(sr)
Jul 15 19:29:22 XCPNG31 SM: [9747] File "/opt/xensource/sm/SRCommand.py", line 159, in _run_locked
Jul 15 19:29:22 XCPNG31 SM: [9747] rv = self._run(sr, target)
Jul 15 19:29:22 XCPNG31 SM: [9747] File "/opt/xensource/sm/SRCommand.py", line 323, in _run
Jul 15 19:29:22 XCPNG31 SM: [9747] return sr.create(self.params['sr_uuid'], long(self.params['args'][0]))
Jul 15 19:29:22 XCPNG31 SM: [9747] File "/opt/xensource/sm/LinstorSR", line 612, in wrap
Jul 15 19:29:22 XCPNG31 SM: [9747] return load(self, *args, **kwargs)
Jul 15 19:29:22 XCPNG31 SM: [9747] File "/opt/xensource/sm/LinstorSR", line 597, in load
Jul 15 19:29:22 XCPNG31 SM: [9747] return wrapped_method(self, *args, **kwargs)
Jul 15 19:29:22 XCPNG31 SM: [9747] File "/opt/xensource/sm/LinstorSR", line 443, in wrapped_method
Jul 15 19:29:22 XCPNG31 SM: [9747] return method(self, *args, **kwargs)
Jul 15 19:29:22 XCPNG31 SM: [9747] File "/opt/xensource/sm/LinstorSR", line 688, in create
Jul 15 19:29:22 XCPNG31 SM: [9747] opterr='Not enough online hosts'
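Most of that trace is repetition; the useful part is the "Raising exception" line. A minimal sketch for pulling just those messages out of an SMlog, assuming the log has one entry per line as it does on disk (sm_exceptions is a hypothetical helper name):

```shell
#!/usr/bin/env bash
# Prints the distinct "Raising exception [...]" messages found in
# SMlog-style input on stdin, one per line.
sm_exceptions() {
    grep -o 'Raising exception \[.*\]' | sort -u
}
```

For example: sm_exceptions < /var/log/SMlog. For the trace above this would leave only the 5006 "Not enough online hosts" message.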
I have found the issue - the device-config:hosts list is case-sensitive, if the hosts are given in lower-case the above error occurs. Specifying the hosts in upper-case works.
Also using a fully-qualified name for the host fails - regardless of the case used.
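A pre-flight check along these lines would have caught it before running sr-create. This is a sketch under one assumption that the thread does not confirm: that the names in device-config:hosts must match the pool's host name-labels exactly (all that is established above is that upper-case worked while lower-case and fully-qualified names failed). The function name check_host_names is hypothetical.

```shell
#!/usr/bin/env bash
# Compares requested host names against the pool's name-labels (both given as
# comma-separated lists). Reports names that match only when case is ignored,
# or that do not match at all. Returns non-zero if any name is not exact.
check_host_names() {
    local requested=$1 labels=$2 name label exact near status=0
    local IFS=,                               # split both lists on commas
    for name in $requested; do
        exact=0; near=""
        for label in $labels; do
            if [ "$name" = "$label" ]; then exact=1; break; fi
            # ${var,,} lower-cases the value (needs bash 4 or newer)
            if [ "${name,,}" = "${label,,}" ]; then near=$label; fi
        done
        if [ "$exact" -eq 1 ]; then
            continue
        elif [ -n "$near" ]; then
            echo "case mismatch: $name should be $near"
        else
            echo "unknown host: $name"
        fi
        status=1
    done
    return "$status"
}
```

Usage would be something like: check_host_names xcpng30,xcpng31,xcpng32 "$(xe host-list params=name-label --minimal)", which prints a "case mismatch" line for each lower-case name and an "unknown host" line for a fully-qualified name.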